public class SimilarityESAlines extends SimilarityESA
cat.lump.ir.sim.ml.esa.esa.SimilarityESA
esaGen, objectA, objectB, overrideObjects, textsA, textsB
esaVectorsA, esaVectorsB, log
Constructor and Description |
---|
SimilarityESAlines(java.io.File docA, java.io.File docB, java.lang.String indexPath, java.lang.String language, java.lang.Boolean overrideObjects) Given two files with independent sentences, process line-wise similarities. |
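To make "line-wise similarities" concrete, the following is a minimal, self-contained sketch and not the library's implementation. It assumes that the i-th line of the first file is compared with the i-th line of the second, and it substitutes plain term-frequency vectors for real ESA vectors, which would require the Lucene index. The class and method names (`LineWiseSimilaritySketch`, `toVector`, `cosine`, `lineWise`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: term-frequency vectors stand in for ESA vectors.
class LineWiseSimilaritySketch {

    /** Builds a term-frequency vector from one line (whitespace tokenization). */
    static Map<String, Double> toVector(String line) {
        Map<String, Double> vector = new HashMap<>();
        for (String token : line.toLowerCase().trim().split("\\s+")) {
            if (!token.isEmpty()) {
                vector.merge(token, 1.0, Double::sum);
            }
        }
        return vector;
    }

    /** Cosine similarity between two sparse vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Assumption: line i of A is compared with line i of B, up to the shorter list. */
    static List<Double> lineWise(List<String> linesA, List<String> linesB) {
        int n = Math.min(linesA.size(), linesB.size());
        List<Double> similarities = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            similarities.add(cosine(toVector(linesA.get(i)), toVector(linesB.get(i))));
        }
        return similarities;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("the cat sat", "hello world");
        List<String> b = Arrays.asList("the cat sat", "goodbye moon");
        System.out.println(lineWise(a, b));
    }
}
```

In the real class the vectors would come from the ESA index named by indexPath; only the per-line pairing is illustrated here.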
Modifier and Type | Method and Description |
---|---|
protected EsaVectors | computeVectors(java.io.File file, java.lang.String object, java.lang.String set) Computes the vectors for the texts in the given set. |
protected java.lang.String | processLine(java.lang.String text) Method used to preprocess the input text. |
protected void | setDocumentsPath(java.io.File fileA, java.io.File fileB) A method that loads the texts in collections A and B. |
protected void | setObjects() Sets the name of the resulting vector objects. |
computeVectorsA, computeVectorsB, objectExists, saveObject
computePairwiseSimilarities, computeSimilarities, computeSimilarity, displaySimilarities, documentsExist, exitError, getPairwiseSimilarities, getSimilarities, getSimilaritiesMatrix, getSimilarity, getSimilarity
public SimilarityESAlines(java.io.File docA, java.io.File docB, java.lang.String indexPath, java.lang.String language, java.lang.Boolean overrideObjects)
docA - first text file
docB - second text file
indexPath - path to the Lucene index
overrideObjects - whether vector similarities should be computed from scratch
analyzer - the analyzer to consider (for various languages)
protected void setDocumentsPath(java.io.File fileA, java.io.File fileB)
setDocumentsPath in class SimilarityESA
protected void setObjects()
setObjects in class SimilarityESA
protected EsaVectors computeVectors(java.io.File file, java.lang.String object, java.lang.String set) throws java.lang.ClassNotFoundException, java.io.IOException
computeVectors in class SimilarityESA
file - path to the documents
object - name of the (previously generated) object
set - whether we are processing A or B
java.lang.ClassNotFoundException
java.io.IOException
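The object parameter, the overrideObjects flag, and the declared ClassNotFoundException and IOException suggest a load-or-compute pattern built on Java serialization. The sketch below illustrates that pattern under that assumption only; it is not taken from the library's source. `VectorCacheSketch`, `compute`, and `loadOrCompute` are hypothetical names, and a list of `double[]` stands in for EsaVectors.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of reusing a previously serialized vector object.
class VectorCacheSketch {

    /** Counts recomputations, for demonstration only. */
    static int computations = 0;

    /** Placeholder for the expensive vector computation (one dummy value per line). */
    static ArrayList<double[]> compute(List<String> lines) {
        computations++;
        ArrayList<double[]> vectors = new ArrayList<>();
        for (String line : lines) {
            vectors.add(new double[] { line.length() });
        }
        return vectors;
    }

    /**
     * Loads the vectors from the serialized file if it exists and override is
     * false; otherwise computes them from scratch and saves them to the file.
     */
    @SuppressWarnings("unchecked")
    static ArrayList<double[]> loadOrCompute(List<String> lines, File object, boolean override)
            throws IOException, ClassNotFoundException {
        if (object.exists() && !override) {
            try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(object))) {
                return (ArrayList<double[]>) in.readObject();
            }
        }
        ArrayList<double[]> vectors = compute(lines);
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(object))) {
            out.writeObject(vectors);
        }
        return vectors;
    }
}
```

With override set to false, a second call for the same file reads the stored object instead of recomputing; setting it to true forces a fresh computation and overwrites the stored object.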
protected java.lang.String processLine(java.lang.String text)
text -
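The documentation does not say which preprocessing processLine applies. As a purely hypothetical example of the kind of normalization such a hook could perform, the sketch below lowercases the text, replaces punctuation with spaces, and collapses whitespace. None of these steps is confirmed by the source, and `ProcessLineSketch` is an invented name.

```java
import java.util.Locale;

// Hypothetical illustration of a per-line preprocessing hook.
class ProcessLineSketch {

    /** Lowercases, replaces non-letter/non-digit characters with spaces, collapses whitespace. */
    static String processLine(String text) {
        if (text == null) {
            return "";
        }
        return text.toLowerCase(Locale.ROOT)
                   .replaceAll("[^\\p{L}\\p{Nd}\\s]", " ")
                   .trim()
                   .replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(processLine("  Hello,   World! "));
    }
}
```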