
A

Abstracter - Class in cat.lump.ir.index
Contains the basic operations for indexing and querying a document index.
Abstracter(Locale, File, RepresentationType[]) - Constructor for class cat.lump.ir.index.Abstracter
Calls the setters for language and representation type.
AbstractPreprocess - Class in cat.lump.aq.textextraction.wikipedia.prepro
 
AbstractPreprocess(TypePreprocess, String, Locale, int, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Creates a preprocess method able to preprocess pages from the Wikipedia dump identified by language and year.
accept(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexer.TextFilesFilter
 
accept(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT.TextFilesFilter
 
actualPair - Variable in class cat.lump.aq.basics.structure.ArticlePair
1 (?)
add(Vector) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
Adds a new vector to the matrix, in the last available slot
add(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Adds the contents of v to the contents of the vector.
This method does not modify the vector.
add(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Adds the contents of the vector to values and returns the result.
This method does not modify the vector.
add(Map<String, Double>, String) - Method in class cat.lump.ir.index.Index
Add a new document with the given weights to the index.
add(String, double) - Method in class cat.lump.ir.index.Ranking
Inserts a new document into the ranking.
add(int, int, double) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
Adds the value to the current value of the position given by row and col
addDocument(File, String) - Method in class cat.lump.ir.index.Indexer
Add a new document file to the index
addDocument(String, String) - Method in class cat.lump.ir.index.Indexer
Add a new document to the index.
addEquals(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Adds v2 to the vector and stores the result in the vector itself.
addEquals(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Adds the values to the vector, modifying its contents.
addPage(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Adds a page ID to the set of page IDs
addPage(Integer) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Adds a page to preprocess.
addPages(Collection<Integer>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Adds a collection of pages to the set of page IDs
addPages(Collection<Integer>) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Adds a set of pages to preprocess
addPair(int, File, int, File) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
Add an article pair to the list with similarities
addPair(int, File, int, File) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
Add an article pair to the list
addPairs(List<SimilarityPair>) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
Add a new collection of pairs
addPairs(List<SimilarityPair>) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
Add a new collection of pairs
addPreprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Adds a new preprocess to the available ones.
addScoredCategory(GroupOfCategories.ScoredCategory) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
Adds a new ScoredCategory to the group.
addString(String) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
Add the term to the dictionary (if it was not there yet).
addTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
Add a term into the collection.
addTerms(Collection<String>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Add new terms from a collection of words.
addTerms(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Adds new terms, or updates those already added, with the tokens obtained by preprocessing the given text.
addTerms(List<String>) - Method in class cat.lump.ir.weighting.TermFrequency
Add these terms into the collection.
addVector(Vector, String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
Add a new vector into the matrix.
admitShorterNgrams - Variable in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
Flag to admit texts shorter than n
analyzer - Static variable in class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
 
analyzer - Static variable in class cat.lump.ir.lucene.engine.AnalyzerFactory
 
analyzer - Variable in class cat.lump.ir.lucene.query.LuceneTokenizer
 
AnalyzerFactory - Class in cat.lump.ir.lucene.engine
Factory for obtaining a Lucene Analyzer with all the required preprocessing. TODO: this is duplicated in textextraction; maybe we should unify.
AnalyzerFactory() - Constructor for class cat.lump.ir.lucene.engine.AnalyzerFactory
 
AnalyzerFactoryLucene - Class in cat.lump.ie.textprocessing.word
Factory that allows for getting a Lucene Analyzer and a stemmer for the required language (if available)
AnalyzerFactoryLucene() - Constructor for class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
 
appendStringToFile(File, String, boolean) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
argmax() - Method in class cat.lump.aq.basics.algebra.vector.Vector
 
argmin() - Method in class cat.lump.aq.basics.algebra.vector.Vector
 
Article - Class in cat.lump.ir.retrievalmodels.similarity
An article object is the abstraction of a document to translate, together with its related information.
Article(String, RepresentationType) - Constructor for class cat.lump.ir.retrievalmodels.similarity.Article
Creates an undefined article together with its language and type of representation.
Article(String, String) - Constructor for class cat.lump.ir.retrievalmodels.similarity.Article
Creates an article with text and language.
Article(String, String, RepresentationType) - Constructor for class cat.lump.ir.retrievalmodels.similarity.Article
Creates an article with text, language and type of representation.
ArticlePair - Class in cat.lump.aq.basics.structure
This class stores a pair of Wikipedia articles covering the same topic in different languages.
ArticlePair() - Constructor for class cat.lump.aq.basics.structure.ArticlePair
 
ArticlePair(int, String, int, String) - Constructor for class cat.lump.aq.basics.structure.ArticlePair
TODO this is added for the alignment interface.
ArticleSelector - Class in cat.lump.aq.textextraction.wikipedia.categories
This class extracts all the articles that belong to a given category in Wikipedia
ArticleSelector(File, Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
Constructor.
ArticlesSimilarity - Class in cat.lump.aq.textextraction.wikipedia.fragments
 
ArticlesSimilarity(String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
ArticlesSimilarity - Class in cat.lump.ir.comparison.toCheck
 
ArticlesSimilarity(String, String) - Constructor for class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
ArticlesTFs - Class in cat.lump.aq.textextraction.wikipedia.utilities
A class to calculate the TFs associated to all terms in a document from an already extracted WP edition.
ArticlesTFs(String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTFs
Constructor for the class.
ArticlesTranslator - Class in cat.lump.aq.textextraction.wikipedia.utilities
A class to translate all the articles from L1 into L2 in a folder with the Wikicardi structure: path/plain/L1/index/id.L1.txt. The index files with the position and length of each article are required.
ArticlesTranslator(String[], String) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
 
ArticleTextExtractor - Class in cat.lump.aq.textextraction.wikipedia.categories
This class provides methods to load a list of Wikipedia article IDs and preprocess them.
ArticleTextExtractor(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Creates a preprocessor without any page to preprocess.
ArticleTextExtractor(Locale, int, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Creates a preprocessor with the pages listed in listOfPages.
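
Usage sketch for the Vector entries above. The index distinguishes add, which leaves the vector unmodified, from addEquals, which stores the sum in the vector itself. The class below is a hypothetical stand-in (VecSketch and its get() accessor are assumptions made for illustration, not part of cat.lump.aq.basics.algebra.vector.Vector), and it assumes that add returns the sum as a new array.

```java
import java.util.Arrays;

/** Hypothetical float-backed vector; NOT the library's Vector class. */
final class VecSketch {
    private final float[] values;

    VecSketch(float[] values) {
        // Defensive copy so callers cannot alter the internal state.
        this.values = Arrays.copyOf(values, values.length);
    }

    /** Returns the element-wise sum as a new array; this vector is left untouched. */
    float[] add(float[] other) {
        checkLength(other);
        float[] result = new float[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = values[i] + other[i];
        }
        return result;
    }

    /** Adds the given values element-wise, overwriting this vector's contents. */
    void addEquals(float[] other) {
        checkLength(other);
        for (int i = 0; i < values.length; i++) {
            values[i] += other[i];
        }
    }

    /** Returns a copy of the current contents (assumed accessor). */
    float[] get() {
        return Arrays.copyOf(values, values.length);
    }

    private void checkLength(float[] other) {
        if (other.length != values.length) {
            throw new IllegalArgumentException("length mismatch");
        }
    }
}
```

The non-mutating variant suits expressions whose operands are reused later; the in-place variant avoids allocating a new array when accumulating many vectors.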

B

BackArticleSelector - Class in cat.lump.aq.textextraction.wikipedia.categories
This class extracts all the articles belonging to a given category in Wikipedia
BackArticleSelector(File, WikipediaJwpl) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.BackArticleSelector
Constructor.

C

calculate(Similarity, File) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
calculate(Similarity, File, int) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
calculate(Similarity, File) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
calculate(Similarity, File, int) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
calculate(File, File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Calculates the matrix of similarities for the given files.
calculate(File, File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculatorLenFact
Calculates the matrix of similarities for the given files.
calculateInvIndex() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Generate the inverted index of the articles
calculateMatrix(Article, Article) - Method in class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
 
calculateMatrix(Article, Article) - Method in class cat.lump.ir.retrievalmodels.similarity.JaccardSimilarity
 
calculateMatrix(Article, Article) - Method in interface cat.lump.ir.retrievalmodels.similarity.SimilarityModel
 
calculateSimilarityMatrix() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Calculates the resulting matrix
calculateSimilarityMatrix() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculatorLenFact
 
calculateTFs(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTFs
Estimates the TF for all the terms in the given file and writes the resulting List<TermFrequencyTuple> to a file with the same name as the text file, but in folder "tfs" instead of "plain".
CAN_READ(File, String) - Static method in class cat.lump.aq.basics.check.F
Throws an error if the file cannot be read.
cat.lump.aq.basics.algebra.matrix - package cat.lump.aq.basics.algebra.matrix
 
cat.lump.aq.basics.algebra.vector - package cat.lump.aq.basics.algebra.vector
 
cat.lump.aq.basics.check - package cat.lump.aq.basics.check
 
cat.lump.aq.basics.io.files - package cat.lump.aq.basics.io.files
 
cat.lump.aq.basics.log - package cat.lump.aq.basics.log
 
cat.lump.aq.basics.structure - package cat.lump.aq.basics.structure
 
cat.lump.aq.basics.structure.ir - package cat.lump.aq.basics.structure.ir
 
cat.lump.aq.basics.structure.standard - package cat.lump.aq.basics.structure.standard
 
cat.lump.aq.textextraction.wikipedia - package cat.lump.aq.textextraction.wikipedia
 
cat.lump.aq.textextraction.wikipedia.categories - package cat.lump.aq.textextraction.wikipedia.categories
 
cat.lump.aq.textextraction.wikipedia.cli - package cat.lump.aq.textextraction.wikipedia.cli
 
cat.lump.aq.textextraction.wikipedia.experiments - package cat.lump.aq.textextraction.wikipedia.experiments
 
cat.lump.aq.textextraction.wikipedia.fragments - package cat.lump.aq.textextraction.wikipedia.fragments
 
cat.lump.aq.textextraction.wikipedia.io - package cat.lump.aq.textextraction.wikipedia.io
 
cat.lump.aq.textextraction.wikipedia.prepro - package cat.lump.aq.textextraction.wikipedia.prepro
 
cat.lump.aq.textextraction.wikipedia.utilities - package cat.lump.aq.textextraction.wikipedia.utilities
 
cat.lump.aq.wikilink - package cat.lump.aq.wikilink
 
cat.lump.aq.wikilink.config - package cat.lump.aq.wikilink.config
 
cat.lump.aq.wikilink.connexion - package cat.lump.aq.wikilink.connexion
 
cat.lump.aq.wikilink.jwpl - package cat.lump.aq.wikilink.jwpl
 
cat.lump.ie.textprocessing - package cat.lump.ie.textprocessing
 
cat.lump.ie.textprocessing.ner - package cat.lump.ie.textprocessing.ner
 
cat.lump.ie.textprocessing.ngram - package cat.lump.ie.textprocessing.ngram
 
cat.lump.ie.textprocessing.sentence - package cat.lump.ie.textprocessing.sentence
 
cat.lump.ie.textprocessing.stopwords - package cat.lump.ie.textprocessing.stopwords
 
cat.lump.ie.textprocessing.transform - package cat.lump.ie.textprocessing.transform
 
cat.lump.ie.textprocessing.word - package cat.lump.ie.textprocessing.word
 
cat.lump.ir.comparison - package cat.lump.ir.comparison
 
cat.lump.ir.comparison.toCheck - package cat.lump.ir.comparison.toCheck
 
cat.lump.ir.index - package cat.lump.ir.index
 
cat.lump.ir.lucene - package cat.lump.ir.lucene
 
cat.lump.ir.lucene.cli - package cat.lump.ir.lucene.cli
 
cat.lump.ir.lucene.engine - package cat.lump.ir.lucene.engine
 
cat.lump.ir.lucene.index - package cat.lump.ir.lucene.index
 
cat.lump.ir.lucene.index.analyzers - package cat.lump.ir.lucene.index.analyzers
 
cat.lump.ir.lucene.query - package cat.lump.ir.lucene.query
 
cat.lump.ir.retrievalmodels.document - package cat.lump.ir.retrievalmodels.document
 
cat.lump.ir.retrievalmodels.similarity - package cat.lump.ir.retrievalmodels.similarity
 
cat.lump.ir.sim - package cat.lump.ir.sim
 
cat.lump.ir.sim.cl.clesa - package cat.lump.ir.sim.cl.clesa
 
cat.lump.ir.sim.cl.len - package cat.lump.ir.sim.cl.len
 
cat.lump.ir.sim.ml.esa - package cat.lump.ir.sim.ml.esa
 
cat.lump.ir.weighting - package cat.lump.ir.weighting
 
CategoryDepth - Class in cat.lump.aq.textextraction.wikipedia.categories
Class that automatises the process of selecting how deep within the category tree one must go to extract articles from a given domain.
CategoryDepth(File, double, int, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
 
CategoryExplorer - Class in cat.lump.aq.textextraction.wikipedia.categories
The CategoryExplorer class is used to explore the categories of Wikipedia.
CategoryExplorer(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Creates a new explorer related to the Wikipedia dump defined by its language and year.
CategoryExtractor - Class in cat.lump.aq.textextraction.wikipedia.categories
This class extracts all the subcategories of a given category in Wikipedia. TODO: build junit.
CategoryExtractor(Locale, int, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
Default (non-verbose) invocation.
CategoryExtractor(Locale, int, boolean, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
Invocation in which verbosity is set.
CategoryNameStats - Class in cat.lump.aq.textextraction.wikipedia.categories
This class computes the percentage of categories that are claimed to belong to a concrete domain from a category tree.
CategoryNameStats(Locale) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
 
CategoryTreeNode - Class in cat.lump.aq.textextraction.wikipedia.categories
This class stores all the relevant information about a classified category.
CategoryTreeNode(Category, int, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Constructor.
changeFileSuffix(File, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
changeFileSuffix(String, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
Given a filename, whether relative or absolute, substitutes its suffix with newSuffix.
CharacterNgrams - Class in cat.lump.ie.textprocessing.ngram
 
CharacterNgrams(int) - Constructor for class cat.lump.ie.textprocessing.ngram.CharacterNgrams
 
CharacterNgrams(int, Boolean) - Constructor for class cat.lump.ie.textprocessing.ngram.CharacterNgrams
 
CHECK(boolean) - Static method in class cat.lump.aq.basics.check.CHK
Throws a CheckFailedError if false.
CHECK(boolean, String) - Static method in class cat.lump.aq.basics.check.CHK
Throws a CheckFailedError if false, displaying the required message.
CHECK_NOT_NULL(Object) - Static method in class cat.lump.aq.basics.check.CHK
Check that the given object is not null; throws a CheckFailedError if it is
checkAllTablesAvailable() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
Checks if all the tables needed are in the database.
checkAllTablesAvailable() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
Checks if all the tables needed are in the database.
CheckFailedError - Error in cat.lump.aq.basics.check
 
CHK - Class in cat.lump.aq.basics.check
A class that contains checking methods.
CHK() - Constructor for class cat.lump.aq.basics.check.CHK
 
close() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
Closes the connection
close() - Method in class cat.lump.ir.lucene.index.LuceneIndexer
Closes the Lucene index
close() - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
Closes the Lucene index
closeConnection() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
 
closeConnection() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
 
closeStatement() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
 
COLLECTION_FILE - Variable in class cat.lump.ir.index.Abstracter
Name for the output documents' object file
command - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
command - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
 
CommonArticlesFinder - Class in cat.lump.aq.textextraction.wikipedia.utilities
A class to identify the common articles across n languages in Wikipedia from files with the list of IDs for every language.
CommonArticlesFinder(String[], int, String[], File) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
Instantiates the object with the provided languages.
CommonCategoriesExtractor - Class in cat.lump.aq.textextraction.wikipedia.utilities
A class to identify the common categories across n languages in Wikipedia.
CommonCategoriesExtractor(String[], String, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.CommonCategoriesExtractor
Instantiates the object with the provided languages.
CommonNamespaceFinder - Class in cat.lump.aq.textextraction.wikipedia.utilities
A class to identify the common articles (namespace=0) and categories (namespace=14) across n languages in Wikipedia from files with the list of IDs for every language.
CommonNamespaceFinder(String[], int, String[], File) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
Instantiates the object with the provided languages.
compareTo(TermFrequencyTuple) - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
 
compareTo(Pair<S, T>) - Method in class cat.lump.aq.basics.structure.Pair
 
compareTo(SimilarityPair) - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
 
compute(Vector, Vector) - Method in interface cat.lump.ir.retrievalmodels.similarity.SimilarityMeasure
 
compute(Vector, Vector) - Method in class cat.lump.ir.retrievalmodels.similarity.VectorCosine
Computes the cosine similarity measure between two vectors: sim(v1,v2) = (v1 · v2) / (|v1||v2|)
computeCategoryStats(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
computeCategoryStats(int, int, String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
 
computeLengths(File) - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
Opens the file and computes lengths for every line within
computeNorm(String, FieldInvertState) - Method in class cat.lump.ir.lucene.index.TFSimilarity
 
computePairwiseSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
Compute only similarities for the matrix diagonal
computeRanking(File) - Method in class cat.lump.ir.index.Querier
Queries the index with a text file.
computeRanking(String) - Method in class cat.lump.ir.index.Querier
Queries the index with a text.
computeSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
Computes the ESA-based similarities between the previously loaded documents.
computeSimilarities() - Method in interface cat.lump.ir.sim.Similarity
Compute the similarity between all the texts in the collection
computeSimilarity(String, String) - Method in class cat.lump.ir.sim.ml.esa.Esa
Computes the similarity between two specific documents.
computeSimilarity(String, String) - Method in interface cat.lump.ir.sim.Similarity
Compute the similarity between two specific texts
computeStats() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
Calculates and returns the percentages.
computeTF() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Gets the term frequency tuples resulting from the treatment of a set of... TODO: this should be private.
computeTF(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
As computeTF() but including the title of the root category
computeVector(String) - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
Computes the ESA vector representation for the given text.
computeVectors(File, String, String) - Method in class cat.lump.ir.sim.cl.clesa.SimilarityCLESA
 
computeVectors(File, String, String) - Method in class cat.lump.ir.sim.cl.clesa.SimilarityCLESAdocs
 
computeVectors(File, String, String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
Computes the vectors for the texts in the given set.
computeVectors(File, String, String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAdocs
Computes the vectors for the texts in the given set.
computeVectors(File, String, String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
 
computeVectorsA() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
Compute the characteristic vectors for dataset A
computeVectorsA() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
 
computeVectorsB() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
Compute the characteristic vectors for dataset B
computeVectorsB() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
 
contains(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Checks if a term is contained in the vocabulary.
containsKey(String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
 
containsPunctuation(String) - Static method in class cat.lump.ie.textprocessing.sentence.Punctuation
 
coord(int, int) - Method in class cat.lump.ir.lucene.index.TFSimilarity
 
copy(File, File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
CorrelationsxCategory - Class in cat.lump.aq.textextraction.wikipedia.experiments
Wikiparable: Experiment 2 for evaluation.
CorrelationsxCategory(String, String, String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.experiments.CorrelationsxCategory
 
cosinePerFragment(Article, Article) - Method in class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
Compute the cosine similarity for all the fragments (e.g. sentences).
CosineSimilarity - Class in cat.lump.ir.retrievalmodels.similarity
 
CosineSimilarity() - Constructor for class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
 
createCategoryVocabulary(Category) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Creates the vocabulary related to the given category.
createConnection() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
Creates the connection to the database
createDir(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
createRepresentations() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Creates the representations of the source text and the target text
createScoredCategory(Category, Category, int) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
createScoredCategory(Category, Category, int, boolean) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
CroatianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
 
CroatianAnalyzer() - Constructor for class cat.lump.ir.lucene.index.analyzers.CroatianAnalyzer
 
csv2matrix(File) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
Reads a CSV file and returns its contents as a 2-dimensional array of Strings.
csv2matrix(File, String) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
Reads a CSV file.
csvFileToList(File) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
 
csvFileToList(File, String) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
 
CsvFoolReader - Class in cat.lump.aq.basics.io.files
A simple reader for CSV files.
CsvFoolReader() - Constructor for class cat.lump.aq.basics.io.files.CsvFoolReader
 
CsvFoolWriter - Class in cat.lump.aq.basics.io.files
A simple writer for CSV files.
CsvFoolWriter() - Constructor for class cat.lump.aq.basics.io.files.CsvFoolWriter
 
CsvFoolWriter(String) - Constructor for class cat.lump.aq.basics.io.files.CsvFoolWriter
 

D

Decomposition - Interface in cat.lump.ie.textprocessing
 
decrement() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
Decrements the number of occurrences of the term by one unit.
deleteDir(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
Deletes all files and subdirectories under "dir".
deleteFile(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
Diacritics - Class in cat.lump.ie.textprocessing.sentence
This class is in fact a link to ICU's normalizer.
Diacritics() - Constructor for class cat.lump.ie.textprocessing.sentence.Diacritics
Default invocation.
Diacritics(Boolean) - Constructor for class cat.lump.ie.textprocessing.sentence.Diacritics
At invocation time, it must be specified whether the texts are going to be case-folded.
Dictionary - Class in cat.lump.ir.retrievalmodels.document
 
Dictionary() - Constructor for class cat.lump.ir.retrievalmodels.document.Dictionary
 
dictionary - Variable in class cat.lump.ir.retrievalmodels.document.Document
Internal dictionary that links terms to their numerical representation
dimension - Variable in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
Dimension of the matrix.
dirCanBeRead(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
Check whether the directory exists and can be read.
display() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
 
display() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
Display only estimation
displaySimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
 
displaySimilarities() - Method in interface cat.lump.ir.sim.Similarity
Prints a matrix including all the similarities
displayVerbose() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
Display estimation, source and target sentences
div(int, int, double) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
Divides the current value of the position given by row and col by the indicated value
divide(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Divides the vector by a scalar and returns the resulting array.
divideEquals(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Divides the vector by a scalar and updates its value internally.
doc2WeightQuery(String) - Method in class cat.lump.ir.lucene.query.Document2Query
Generates a query in which the relevance of each token depends on its frequency.
docCollection - Variable in class cat.lump.ir.index.Abstracter
Internal collection of documents
Document - Class in cat.lump.ir.retrievalmodels.document
A frame to represent a text document and its fragments (in the form of sentences).
Document(String, Locale) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
 
Document(String[], Locale) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
 
Document(String, Locale, boolean, boolean, boolean, boolean) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
 
Document(String[], Locale, boolean, boolean, boolean, boolean) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
 
Document(String, Locale, boolean, boolean, boolean, boolean, int) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
 
Document(String[], Locale, boolean, boolean, boolean, boolean, int) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
 
Document2Query - Class in cat.lump.ir.lucene.query
The contents of a document are processed to be in the right format for Lucene querying.
Document2Query() - Constructor for class cat.lump.ir.lucene.query.Document2Query
 
Document2Query(Locale) - Constructor for class cat.lump.ir.lucene.query.Document2Query
 
documentNumber() - Method in class cat.lump.ir.index.Index
 
documentsExist(String, String) - Method in class cat.lump.ir.sim.ml.esa.Esa
Checks whether both documents exist already in the corresponding vector.
DomainKeywords - Class in cat.lump.aq.textextraction.wikipedia.categories
This class gets the most common terms in the articles belonging to, at least, one category of a given domain.
DomainKeywords(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
 
DomainVocabulary - Class in cat.lump.aq.textextraction.wikipedia.categories
A DomainVocabulary instance is used to store a set of terms with their frequencies.
DomainVocabulary(Locale) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Creates an empty vocabulary.
DomainVocabulary(Locale, Collection<TermFrequencyTuple>) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Creates a vocabulary which includes some initial terms with an initial frequency.
DomainVocabulary(DomainVocabulary) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Creates a new vocabulary identical to the given one.
dotProduct(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Computes the dot product between the vector and vector v2.
dotProduct(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Computes the dot product between the vector and an array of floats.
Dump - Class in cat.lump.aq.wikilink.config
The years for which we have dumps available
Dump(Locale, int) - Constructor for class cat.lump.aq.wikilink.config.Dump
Locale and year are set.
DynamicMatrix - Class in cat.lump.aq.basics.algebra.matrix
Abstract class for the creation of dynamic matrices of different types.
DynamicMatrix() - Constructor for class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
 
DynamicMatrixOfVectors - Class in cat.lump.aq.basics.algebra.matrix
A matrix that can grow as required.
DynamicMatrixOfVectors() - Constructor for class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
Initialises the matrix (with size=1) and sets the dimension, which cannot be changed afterwards
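
Worked sketch relating the dotProduct entries above to the formula listed for VectorCosine.compute, sim(v1,v2) = (v1 * v2) / (|v1||v2|). The helper below is hypothetical: the class name CosineSketch, the use of plain float[] arrays, and the choice to return 0 when either vector has zero norm are all assumptions made for illustration, not the behaviour of the library's classes.

```java
/** Hypothetical helper illustrating cosine similarity; NOT the library's VectorCosine. */
final class CosineSketch {
    private CosineSketch() {}

    /** Dot product of two equally sized arrays, accumulated in double precision. */
    static double dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    /** Cosine similarity: dot(a, b) divided by the product of the Euclidean norms. */
    static double cosine(float[] a, float[] b) {
        double norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
        // Assumption: a zero vector yields similarity 0 instead of a division by zero.
        return norms == 0.0 ? 0.0 : dot(a, b) / norms;
    }
}
```

Orthogonal vectors give a similarity of 0, and vectors pointing in the same direction give a value close to 1 regardless of their lengths.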

E

equals(Object) - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
 
equals(Object) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
 
error(String) - Method in class cat.lump.aq.basics.log.LumpLogger
 
errorEnd(String) - Method in class cat.lump.aq.basics.log.LumpLogger
Stops the program execution giving an error message.
Esa - Class in cat.lump.ir.retrievalmodels.document
 
Esa() - Constructor for class cat.lump.ir.retrievalmodels.document.Esa
 
Esa - Class in cat.lump.ir.sim.ml.esa
 
Esa() - Constructor for class cat.lump.ir.sim.ml.esa.Esa
 
esaGen - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
A generator of ESA vectorial representations
EsaGenerator - Class in cat.lump.ir.sim.ml.esa
A class that allows for passing from a text (collection) into its ESA vector representation.
EsaGenerator(File, String) - Constructor for class cat.lump.ir.sim.ml.esa.EsaGenerator
Invokes an instance of the EsaGenerator by loading the index and the analyzer for the required language

TODO whether the indexpath should be established by default depending on the language
EsaVectors - Class in cat.lump.ir.sim.ml.esa
Set of vector representations of a set of texts.
EsaVectors(int) - Constructor for class cat.lump.ir.sim.ml.esa.EsaVectors
At invocation time the index and initial matrix of vectors are generated
esaVectorsA - Variable in class cat.lump.ir.sim.ml.esa.Esa
Instance of an ESA vector for the set of documents A.
esaVectorsB - Variable in class cat.lump.ir.sim.ml.esa.Esa
Instance of an ESA vector for the set of documents B (see description for esaVectorsA).
estimate() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
 
estimateMatrix() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
Estimates the length factors of the sentences among them.
EstonianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
 
EstonianAnalyzer(Version) - Constructor for class cat.lump.ir.lucene.index.analyzers.EstonianAnalyzer
 
exist(String) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
 
existsFile(File, TypePreprocess, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
Checks if a file related to a page exists on the given root directory.
existTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
 
exitError(HelpFormatter, String) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Finish the process with the CLI help and an error message.
exitError(HelpFormatter, String) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Finish the process with the CLI help and an error message.
exitError(String) - Method in class cat.lump.ir.sim.ml.esa.Esa
 
exitHelp(HelpFormatter) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Exit displaying the CLI help
exitHelp(HelpFormatter) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Exit displaying the CLI help
extract(short) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
Extracts the articles associated with the indicated categories.
extract(short) - Method in class cat.lump.aq.textextraction.wikipedia.categories.BackArticleSelector
Extracts the articles associated with the indicated categories.
extractCategories(int, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
extractCategories(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
 
extractDomainKeywords(String, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
Obtains the vocabulary that represents the given category in the Wikipedia of the given language and year.
extractDomainKeywords(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
Obtains the vocabulary that represents the given category in the Wikipedia of the given language and year.
extractDomainKeywords(String, int, String, String) - Method in class cat.lump.ir.lucene.Xecutor
Obtains the vocabulary that represents the given category in the Wikipedia of the given language and year.
extractEntireWikipedia(Locale, int, File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Extracts the entire Wikipedia for the given language and year.
extractSpecificArticles(Locale, int, Integer[], File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
 
extractSpecificArticles(Locale, int, File, File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Extract only the articles specified in the pagesFile
extractTexts(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
extractTexts() - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
 

F

F - Class in cat.lump.aq.basics.check
A class that contains methods to check files and directories
F() - Constructor for class cat.lump.aq.basics.check.F
 
file2FlatQuery(String) - Method in class cat.lump.ir.lucene.query.Document2Query
Generates a query in which every token has the same relevance. TODO: why am I using the same tokenizer for every language?
fileCountLines(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
Counts the lines in a file. See http://stackoverflow.com/questions/453018/number-of-lines-in-a-file-in-java
FileIO - Class in cat.lump.aq.basics.io.files
 
FileIO() - Constructor for class cat.lump.aq.basics.io.files.FileIO
 
FileManager - Class in cat.lump.aq.textextraction.wikipedia.io
This class provides a set of methods to manage files that store preprocessed Wikipedia pages.
FileManager() - Constructor for class cat.lump.aq.textextraction.wikipedia.io.FileManager
 
fileToLines(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
Opens a file and returns the lines in it
fileToString(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
finalize() - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
TODO important: check when to call this process instead of trusting the garbage collector. Once the process is over, the index is closed.
findIntersection(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
Looks for the articles that appear in all the selected languages langs simultaneously.
findIntersection(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
Looks for the articles that appear in all the selected languages langs simultaneously.
findUnion(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
Looks for the articles that appear in any of the selected languages langs and builds a set with their union.
findUnion(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
Looks for the articles that appear in any of the selected languages langs and builds a set with their union.
flatQuery(String[]) - Static method in class cat.lump.ir.lucene.query.Document2Query
Creates a query considering all the tokens (i.e. some words could be repeated)
footer - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
footer - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
 
Fragment - Class in cat.lump.ir.retrievalmodels.document
A fragment of text with multiple representations, including plain text, bag of words, and character n-grams.
Fragment(Dictionary, String, Locale) - Constructor for class cat.lump.ir.retrievalmodels.document.Fragment
Default invocation where all the characterizations are computed
Fragment(Dictionary, String, Locale, boolean, boolean, boolean, boolean, int) - Constructor for class cat.lump.ir.retrievalmodels.document.Fragment
 
fragmentExists(int) - Method in class cat.lump.ir.retrievalmodels.document.Document
 
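
As an illustration of the fileCountLines(File) entry above, here is a minimal sketch of counting the lines of a text source with the standard library. It is not the FileIO implementation; the class name LineCountSketch and both method signatures are assumptions made for this example only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;

public class LineCountSketch {

    /** Counts the lines available from a reader by streaming them one at a time. */
    static long countLines(Reader reader) {
        // BufferedReader.lines() is lazy, so the content is never held in memory at once.
        return new BufferedReader(reader).lines().count();
    }

    /** Counts the lines in a file, closing it when done. */
    static long countLines(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            return countLines(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects the path of a text file as the first argument.
        System.out.println(countLines(Path.of(args[0])));
    }
}
```

Streaming the lines keeps memory use flat even for large files, at the cost of decoding every character of the input.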

G

generateCategoryTree(Category, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
Creates a tree of categories with the root category as root and all its subcategories allocated by levels of depth.
generateInvIndex() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
Loads the inverted index from the text as well as the corresponding magnitudes
generateSetsCommon(String, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
Looks for the index files in the given folder and generates temporary files (tokenised) to be translated, containing only the articles that also appear in the file commonArticles.
generateSetsFullFolder(HashSet<Integer>, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
Looks for the index files in the given folder and generates temporary files (tokenised) to be translated.
get(int) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
 
get(int) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
Get the vector from the specified position.
get() - Method in class cat.lump.aq.basics.algebra.vector.Vector
 
get(int) - Method in class cat.lump.aq.basics.algebra.vector.Vector
 
get(String) - Method in class cat.lump.ie.textprocessing.transform.Transliteratorr
 
get(String) - Method in class cat.lump.ir.index.Ranking
 
get(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Document
 
get(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
 
getAcronymLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getAll() - Method in class cat.lump.ir.weighting.TermFrequency
 
getAllPairsDBname() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
getAllVectors() - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
 
getAnalyzer() - Method in class cat.lump.ir.lucene.query.Document2Query
 
getArticles() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
 
getArticlesFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
 
getAvailablePairs() - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
Retrieve the set of language pairs for which default parameters are available.
getAvailablePreprocesses() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
 
getBibliographyLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getBibliographyLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getBottomK(int) - Method in class cat.lump.ir.index.Ranking
 
getByTag(String) - Static method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
Searches the entry identified by the given name.
getCategories() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
getCategory() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Returns the category.
getCategory() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
 
getCategory() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
getCategoryFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
 
getCategoryID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
 
getCategoryID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
 
getCategoryID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
 
getCategoryLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getCategoryLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getCategoryLabels() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
 
getCategoryName() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
 
getCategoryName() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
 
getCategoryTree(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
 
getCategoryTree() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
 
getDBprefix() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
The prefix for the Wikipedia SQL dumps containing page and langlinks tables.
getDepth() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Returns the depth where this category has been found
getDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
 
getDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
getDepthLinear() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
Getters
getDepthSplines() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
 
getDictFile() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getDirectory() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getDisambiguationLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getDisambiguationLabelFep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getDistance() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
 
getDocument(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexer
 
getDocument(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
 
getDocuments(String) - Method in class cat.lump.ir.index.Index
 
getDocuments() - Method in class cat.lump.ir.index.Index
 
getDomainCategories(List<GroupOfCategories>, double) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Obtains the categories of all the groups which compose a domain.
getEArticlesFileName() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
getEnd() - Method in class cat.lump.ie.textprocessing.Span
 
getExternalLinksLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getExternalLinksLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getFile(File, TypePreprocess, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
Creates the abstract path for a given page considering its ID, language and the type of preprocess.
getFile(File, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
Creates the abstract path for a given page considering its ID and language. The path will be constructed as follows: root/language/index/filename.txt where index is the result of dividing pageID by and filename is formed by concatenating the pageID and language, separated by dots.
getFileCommonArticles() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getFilenames(File, String, boolean) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
Retrieve the names of the preprocessed files in a language from the given directory.
getFilesExt(File, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
Gets all the files with a given extension ext
getFilesID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getFilesRecursively(File, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
getFilesRecursively(File, String, long, long) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
getFirstStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
getFirstStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getFirstStep() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getformatTitleGivenID(int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Retrieves the title of an article with ID id from table "page". The connection must be ready beforehand.
getformatWPtitle(ResultSet, String) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Given a ResultSet object, extracts the value of column colName and formats it as a Wikipedia title, as expected to be found in table page.
getFragment(int) - Method in class cat.lump.ir.retrievalmodels.document.Document
 
getFragmentSize(int) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
getFrequency() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
Returns the number of occurrences of the term
getFrequency(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Gives the frequency of the given term.
getFromFile(File, boolean) - Static method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
Loads a similarity matrix from a binary file.
getFurhterReadingLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getFurhterReadingLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getiCategory() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
getiCategory() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getiCategory1() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getiCategory2() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getId(String) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
 
getId(int) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
 
getIdentifier() - Method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
 
getImageLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getImageLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getImageLabels() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
 
getIn() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
Getters
getIn() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
Getters
getIndex(String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
 
getIndexDimension() - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
 
getIndexDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getIndexDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
Getters
getInDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
getInputDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getInstance(TypePreprocess, String, Locale, int, File) - Static method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Creates a preprocess method able to preprocess pages from the Wikipedia dump identified by language and year.
getInstance(String, String, Similarity) - Static method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Returns the suitable calculator according to the given similarity method.
getInstance(String, String, Similarity, int) - Static method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Returns the suitable calculator according to the given similarity method.
getInverseRank() - Method in class cat.lump.ir.index.Ranking
 
getInvIndex() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
getJwplDBprefix() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
getJwplDBprefix() - Static method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
 
getJwplLanguage(Locale) - Static method in class cat.lump.aq.wikilink.Languages
 
getJwplLanguage(String) - Static method in class cat.lump.aq.wikilink.Languages
 
getKey() - Method in class cat.lump.aq.basics.structure.Pair
 
getLang() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
 
getLangAll() - Static method in class cat.lump.aq.wikilink.Languages
 
getLanglinksTableName(String, int) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Generates the name of the table langlinks as stored in the database for a given language and year
getLanguage() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
 
getLanguage() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
getLanguage() - Method in class cat.lump.aq.wikilink.config.Dump
 
getLanguage() - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Getters
getLanguage() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
getLanguage2() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
getters
getLastStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
getLastStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getLastStep() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getLocale() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
 
getLuceneTokenizer(Locale) - Static method in class cat.lump.ir.lucene.query.TokenizerFactory
 
getMagnitude(String) - Method in class cat.lump.ir.index.Index
 
getMagnitude(int) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
The magnitude of the given sentence
getMatrix() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
 
getMatrix() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
 
getMatrix() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 
getmaxDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
 
getMaxDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
 
getMaxVocab() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
getMean(String, String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
Get the mean for the desired pair.
getMean(String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
Get the mean for the desired pair.
getMethod() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getminDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
 
getMinimumSize() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
 
getModel() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
getModel() - Method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
 
getModel() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
getMu() - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
 
getMulti_db() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
getMulti_table() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
getName() - Method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
 
getNCols() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 
getNextCoordenates() - Method in class cat.lump.ir.comparison.toCheck.SimilarityMatrixIterator
 
getNextCoordenates() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrixIterator
 
getnGrams() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
getNormalized(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Document
 
getNormalized(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
 
getNotesLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getNotesLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getNRows() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 
getNumberOfArticles() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Returns the number of articles that are categorized under this category.
getNumberOfChildren() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Returns the number of children (subcategories) of this category.
getNumberOfParents() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Returns the number of parents (supercategories) of this category.
getOriginalString() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
 
getOut() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
 
getOut() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
 
getOutDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
getOutputDir() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
 
getOutputDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
getOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
 
getOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
 
getOutputPath() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
 
getOverThreshold(double) - Method in class cat.lump.ir.index.Ranking
 
getPageID() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Returns the ID of the category in the Wikipedia in use.
getPageTableName(String, int) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Generates the name of the table page as stored in the database for a given language and year
getPairs() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
getPairs() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
getPairsDBname() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
getPairwiseSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
 
getParagraphsFromArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Obtains the paragraphs of a parsed article, given its id.
getParagraphsFromArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Obtains the paragraphs of a parsed article, given its title.
getParent() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
 
getParsedArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Parses a Wikipedia article
getParsedArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Parses a Wikipedia article
getParser() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
TODO determine whether this parser should be returned.
getPercentage() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
getters
getPercentage() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
 
getPercentage() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
getPlain() - Method in class cat.lump.ir.retrievalmodels.document.Fragment
 
getPrefixOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
getters
getPrefixOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
getters
getProperties(String) - Static method in class cat.lump.aq.textextraction.wikipedia.WikiProperties
Deprecated.
getProperties(String) - Static method in class cat.lump.aq.textextraction.wikipedia.WTConfig
Loads the wikiTailor.ini config file
getPropertyInt(String) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
getPropertyInt(String) - Method in class cat.lump.aq.textextraction.wikipedia.WTConfig
Gets a value in the config file given a key and returns it as an integer
getPropertyStr(String) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
getPropertyStr(String) - Method in class cat.lump.aq.textextraction.wikipedia.WTConfig
Gets a value in the config file given a key and returns it as a String
getRank() - Method in class cat.lump.ir.index.Ranking
 
getRedirectLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getRedirectLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getRedirectTableName(String, int) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Generates the name of the table redirect as stored in the database for a given language and year
getReferencesLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getReferencesLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getRepresentation() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
getRepresentation() - Method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
 
getRoot() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
getRootDirectory() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
 
getsCategory() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
Getters
getsCategory() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
Getters
getsCategory1() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getsCategory2() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getScore() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
getSD(String, String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
Get the standard deviation for the desired pair.
getSD(String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
Get the standard deviation for the desired pair.
getSectionsFromArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Obtains the sections of a parsed article
getSectionsFromArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Obtains the sections of a parsed article
getSeeAlsoLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
 
getSeeAlsoLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
Deprecated.
getSigma() - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
 
getSimilarities() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
getSimilarities() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
getSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
 
getSimilarities() - Method in interface cat.lump.ir.sim.Similarity
 
getSimilaritiesMatrix() - Method in class cat.lump.ir.sim.ml.esa.Esa
 
getSimilarity(int, int) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 
getSimilarity(String, String) - Method in class cat.lump.ir.sim.ml.esa.Esa
Obtains the similarity between texts id_A and id_B.
getSimilarity(String) - Method in class cat.lump.ir.sim.ml.esa.Esa
 
getSimilarity(String, String) - Method in interface cat.lump.ir.sim.Similarity
Get the (previously computed) similarity between the two ids
getSimilarityByName(String) - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
 
getSimilarityByRepr(RepresentationType) - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
TODO ask Josu about this
getSize() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 
getSource() - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
 
getSource() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
getSpans(String) - Method in interface cat.lump.ie.textprocessing.Decomposition
 
getSpans(String[]) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
 
getSpans(String) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
 
getSpans(String) - Method in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
 
getSpans(String) - Method in class cat.lump.ie.textprocessing.ngram.WordNgrams
 
getSpans(String) - Method in class cat.lump.ie.textprocessing.sentence.SentencesOpennlp
 
getSpans(String) - Method in class cat.lump.ie.textprocessing.word.WordDecompositionICU4J
 
getSpecificDirs(File, String, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
getSpecificFilesRecursively(File, String, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
getSpecificFilesRecursively(File, String, String, long, long) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
getSrcLang() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
getSrcLang() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
getStart() - Method in class cat.lump.ie.textprocessing.Span
 
getStatsFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
 
getStopWords() - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
 
getString() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
 
getString(int) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
 
getStrings(String) - Method in interface cat.lump.ie.textprocessing.Decomposition
 
getStrings(String[]) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
 
getStrings(String) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
 
getStrings(String) - Method in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
 
getStrings(String) - Method in class cat.lump.ie.textprocessing.ngram.WordNgrams
 
getStrings(String) - Method in class cat.lump.ie.textprocessing.sentence.SentencesOpennlp
 
getStrings(String) - Method in class cat.lump.ie.textprocessing.word.WordDecompositionICU4J
 
getSubSectionsFromArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Obtains the sub-sections of a parsed article
getSubSectionsFromArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Obtains the sub-sections of a parsed article
getSubstring(String) - Method in class cat.lump.ie.textprocessing.Span
String in the current span
getTarget() - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
 
getTarget() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
getTerm() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
Returns the associated term
getTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
 
getTerms() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
 
getTerms(String, int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.TermExtractor
Transforms the content of a String into a set of terms.
getTermTuples() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
 
getText() - Method in class cat.lump.ir.retrievalmodels.document.Document
 
getText() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
getTitle() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Returns the title of the category in Wikitext (without blanks).
getTokens() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
 
getTokenValues(String) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
getTop(float) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Extracts the most frequent terms of the vocabulary.
getTop(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Extracts the qtt most frequent terms of the vocabulary.
getTop() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
getTop() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
 
getTop() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
getTop(int, int) - Method in class cat.lump.ir.weighting.TermFrequency
Subset of terms with the highest tf, up to top% or up to max. Note that sometimes slightly more than the top% is returned.
getTopK(int) - Method in class cat.lump.ir.index.Ranking
 
getTopPlus(int, int, List<String>) - Method in class cat.lump.ir.weighting.TermFrequency
Note that sometimes slightly more than the top% is returned.
getTopTuples() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
 
getTopTuplesPlus(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
 
getTranslation() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
getTrgLang() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
getTrgLang() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
getType() - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
 
getType() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
getValue() - Method in class cat.lump.aq.basics.structure.Pair
 
getVector(String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
Obtain the vector corresponding to this id (null if it does not exist)
getVector(int) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
Obtain the vector corresponding to this slot
getVerbose() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
 
getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
 
getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
 
getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
 
getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
getVocabulary() - Method in class cat.lump.ir.index.Index
 
getVocabulary() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
getWeighted(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Document
 
getWeighted(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
 
getWikipediaConnector() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
 
getYear() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
 
getYear() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
getYear() - Method in class cat.lump.aq.wikilink.config.Dump
 
GroupOfCategories - Class in cat.lump.aq.textextraction.wikipedia.categories
A GroupOfCategories instance contains the scored Wikipedia categories that are related to another category, called the root category.
GroupOfCategories(Category) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
Creates an empty group of categories related to the given root category.
GroupOfCategories(File, WikipediaJwpl) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
GroupOfCategories.ScoredCategory - Class in cat.lump.aq.textextraction.wikipedia.categories
The ScoredCategory class enriches the de.tudarmstadt.ukp.wikipedia.api.Category objects providing the following information: Parent: The first category which allows access to this one.
GroupOfCategories.ScoredCategory(Category, Category, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
Creates a scored category which hasn't already been scored.
GroupOfCategories.ScoredCategory(Category, Category, int, boolean) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
Creates a scored category, defining whether it has already been scored.
GroupOfCategories.ScoredCategory(String, WikipediaJwpl) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
 
gZipToString(String) - Static method in class cat.lump.aq.basics.io.files.FileIO
Opens a gzipped file and returns the lines it contains

H

hasNext() - Method in class cat.lump.ir.comparison.toCheck.SimilarityMatrixIterator
 
hasNext() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrixIterator
 
hasPair(String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
Check whether parameters for the desired language pair are available
header - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
header - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
 

I

idf(int, int) - Method in class cat.lump.ir.lucene.index.TFSimilarity
 
idsToFile(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
Stores the IDs of the extracted articles into a file; one ID per line.
INCLUDE_BOW - Variable in class cat.lump.ir.index.Abstracter
Representations to include
INCLUDE_CNG - Variable in class cat.lump.ir.index.Abstracter
Representations to include
INCLUDE_COG - Variable in class cat.lump.ir.index.Abstracter
Representations to include
INCLUDE_WNG - Variable in class cat.lump.ir.index.Abstracter
Representations to include
increment() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
Increments the number of occurrences of the term by one unit.
index - Variable in class cat.lump.ir.index.Abstracter
Inverted index used to compute similarities
Index - Class in cat.lump.ir.index
 
Index() - Constructor for class cat.lump.ir.index.Index
 
index() - Method in class cat.lump.ir.lucene.index.LuceneIndexer
 
index(String, FileFilter) - Method in class cat.lump.ir.lucene.index.LuceneIndexer
Open an index and start file directory traversal
index() - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
 
index(String, FileFilter) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
Open an index and start file directory traversal
INDEX_FILE - Variable in class cat.lump.ir.index.Abstracter
Prefix and suffix for the index-related files
indexDir - Static variable in class cat.lump.ir.lucene.LuceneInterface
Directory where the Lucene index has to be stored
indexEdition(String, String) - Method in class cat.lump.ir.lucene.Xecutor
Indexes the Wikipedia edition in language "locale" available at inputDir, and outputs the indexes at indexDir.
Indexer - Class in cat.lump.ir.index
A class that indexes a set of Documents on the basis of a given representation.
Indexer(Locale, File, RepresentationType[]) - Constructor for class cat.lump.ir.index.Indexer
Invocation where the documents' language is given and the directory for the index is provided.
Indexer(File, RepresentationType[]) - Constructor for class cat.lump.ir.index.Indexer
Default invocation where files are in English
indexPath - Variable in class cat.lump.ir.index.Abstracter
Path to the index
INFINITE_DISTANCE - Static variable in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
 
info(String) - Method in class cat.lump.aq.basics.log.LumpLogger
Prints a log message
insertFromFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Inserts the terms stored in the given file into the domain vocabulary.
InvIndexContainer - Class in cat.lump.aq.basics.structure
A container for storing an inverted index.
InvIndexContainer() - Constructor for class cat.lump.aq.basics.structure.InvIndexContainer
 
isAcronym(Page) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Checks if the page is an acronym page
isAvailable(String) - Static method in class cat.lump.ir.lucene.index.LuceneLanguages
Checks whether we can index a given language.
isAvailablePreprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Checks whether the preprocess exists among the available preprocesses.
isAvailableSimilarity(Similarity) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
Checks if the given similarity is known for this class
isAvailableSimilarity(Similarity) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
Checks if the given similarity is known for this class
isDisambiguation(Page) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Checks if this is a disambiguation page
isDomain(Category, DomainVocabulary) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Checks if a category belongs to the domain defined by the given vocabulary.
isDomain(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
Checks if a category name "belongs" to the domain.
isDomain() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
 
isEmpty() - Method in class cat.lump.ir.index.Index
 
isIndexAvailable(String) - Static method in class cat.lump.ir.lucene.index.LuceneLanguages
Checks whether a given language is indexed and ready to compute similarities.
isLanguageAvailable(String) - Static method in class cat.lump.aq.wikilink.Languages
 
isRedirect(Page) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Checks if the page is a redirect page
isRedirect(int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Looks in the "page" table of the corresponding language and year if the id is a redirect.
isStopword(String) - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
 
iterator() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 

J

JaccardSimilarity - Class in cat.lump.ir.retrievalmodels.similarity
 
JaccardSimilarity() - Constructor for class cat.lump.ir.retrievalmodels.similarity.JaccardSimilarity
 

K

keySet() - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
 

L

LABEL - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
LABEL - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
 
lan - Variable in class cat.lump.ir.lucene.LuceneInterface
Language of the texts
langToString() - Static method in class cat.lump.ir.lucene.index.LuceneLanguages
 
language - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Language
LanguageConstants - Class in cat.lump.aq.wikilink.jwpl
Includes the constant identifiers for Wikipedia labels in different languages.
LanguageConstants() - Constructor for class cat.lump.aq.wikilink.jwpl.LanguageConstants
Opens the CONFIG_FILE and loads all the language constants for the available languages.
Languages - Class in cat.lump.aq.wikilink
A collection of all the available languages in Wikipedia up to 2015
Languages() - Constructor for class cat.lump.aq.wikilink.Languages
 
learn() - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
Estimates the mu and sigma parameters for the parallel corpus provided
length() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
 
length() - Method in class cat.lump.aq.basics.algebra.vector.Vector
 
length() - Method in class cat.lump.ie.textprocessing.Span
 
length() - Method in class cat.lump.ir.retrievalmodels.document.Document
 
length() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
length() - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
 
LengthFactors - Class in cat.lump.ir.sim.cl.len
Includes the default values for a number of language pairs, namely: cz-en, de-en, en-cz, en-de, en-es, en-fr, en-ru, es-en, fr-en, ru-en. The parameters were estimated by txell on different corpora, including: commoncrawl.wmt2013, CzEng.v1.0, el_periodico, europarl.v6, europarl.v7, FAUST_D4.2, French_treebank, newscommentary.v8, news.shuffled.en.conll.gz, news.shuffled.fr.conll.gz, patents, Romanian_treebank, UNdoc.2000, wiki-titles.ru-en, wmt10, wmt10.select
LengthFactors() - Constructor for class cat.lump.ir.sim.cl.len.LengthFactors
 
LengthModel - Class in cat.lump.ir.sim.cl.len
A class to estimate length models for a language pair.
LengthModel() - Constructor for class cat.lump.ir.sim.cl.len.LengthModel
 
LengthModelEstimate - Class in cat.lump.ir.sim.cl.len
A class to estimate the length factor between two texts according to previously learnt parameters.
LengthModelEstimate() - Constructor for class cat.lump.ir.sim.cl.len.LengthModelEstimate
 
LengthModelLearn - Class in cat.lump.ir.sim.cl.len
Class to learn the parameters of the length model from a parallel corpus (two files).
LengthModelLearn() - Constructor for class cat.lump.ir.sim.cl.len.LengthModelLearn
Invocation without setting the source and target files (this is done by the calling class).
LengthModelLearn(File, File) - Constructor for class cat.lump.ir.sim.cl.len.LengthModelLearn
Invocation with the source and target files
LithuanianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
 
LithuanianAnalyzer() - Constructor for class cat.lump.ir.lucene.index.analyzers.LithuanianAnalyzer
 
loadAnalyzer(Locale) - Static method in class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
Analyser from Lucene for different languages.
loadAnalyzer(Locale) - Static method in class cat.lump.ir.lucene.engine.AnalyzerFactory
Analyser from Lucene for different languages.
loadArticles(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Loads the articles of the given category ID.
loadArticles(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Loads the articles of the given category.
loadCategory(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Loads a category from Wikipedia by its title.
loadCategory(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Loads a category from Wikipedia by its page ID.
loadCategoryNames(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
Reads the file with the list of categories. TODO: a loader from an object.
loadDictionary(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
Reads the "Domain Key Words" dictionary
loadDictionary(HashSet<String>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
 
loadFile(File, TypePreprocess, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
Loads the file related to the given parameters in text format.
loadfromFile(File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Creates a DomainVocabulary instance by reading a binary file which contains a domain vocabulary.
loadIndex() - Method in class cat.lump.ir.index.Abstracter
Loads the index components (empty if new, with data if existed previously).
loadIndex(Locale) - Method in class cat.lump.ir.lucene.query.LuceneQuerier
Loads the Lucene index (previously created) with the reference corpus.
loadIndex(Locale) - Method in class cat.lump.ir.lucene.query.LuceneQuerierWT
Loads the Lucene index (previously created) with the reference corpus.
loadIndex() - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
Loads the Lucene index (previously created) with the reference corpus.
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
Loads additional options: category (numerical and string), percentage of words required and output file
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
Loads additional options: category (numerical and string), percentage of words required and output file
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
Loads additional options: category (numerical and string), percentage of words required and output file
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
Loads additional options: percentage of categories with keywords, input file, depth and category
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
Loads additional options: category (numerical and string), percentage of words required and output file
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
Loads additional options: category (numerical and string), percentage of words required and output file
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
Loads additional options: category (numerical and string), percentage of words required and output file
loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Load the options for language, year, and help
loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
Loads additional options: category (numerical and string), percentage of words required and output file
loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
Loads additional options
loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Load the options for input, output, language and help
loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
Loads additional options
loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
Loads additional options
loadPages(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Loads all the page IDs of the file list.
loadPages(Integer[]) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
 
loadPairs(File) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
Load the pairs of articles from a file
loadPairs(File) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
Load the pairs of articles from a file
loadStemFilter(Locale, Analyzer, String) - Static method in class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
Stemmer from Lucene for different languages.
loadStemFilter(Locale, Analyzer, String) - Static method in class cat.lump.ir.lucene.engine.AnalyzerFactory
Stemmer from Lucene for different languages.
loadStemmer(Locale) - Static method in class cat.lump.ie.textprocessing.word.StemmerFactory
 
loadWikipedia() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Sets the image and category labels for the working language.
locale - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
The Wikipedia language
locale - Variable in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Language of the Wikipedia dump
locale - Variable in class cat.lump.ir.index.Abstracter
Language of the index/query
log - Static variable in class cat.lump.ir.sim.ml.esa.Esa
 
log - Static variable in interface cat.lump.ir.sim.Similarity
Logger for the application
logger - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
 
logger - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Logs
logger - Variable in class cat.lump.ir.lucene.LuceneInterface
 
lookForMaximumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
Given the list of articles for every language (String[] filesID), returns the language from String[] langs with the most articles.
lookForMaximumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
Given the list of articles for every language (String[] filesID), returns the language from String[] langs with the most articles.
lookForMinimumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
Given the list of articles for every language (String[] filesID), returns the language from String[] langs with the fewest articles.
lookForMinimumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
Given the list of articles for every language (String[] filesID), returns the language from String[] langs with the fewest articles.
LuceneCliCategoriesXecutor - Class in cat.lump.ir.lucene.cli
CLI to access the Xecutor pipeline for the WikiTailor IR-based in-domain comparable corpora extraction.
LuceneCliCategoriesXecutor() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
LuceneCliIndexerWT - Class in cat.lump.ir.lucene.cli
 
LuceneCliIndexerWT() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
 
LuceneCliMinimum0 - Class in cat.lump.ir.lucene.cli
CLI to access the Lucene-related classes for the Wikiparable project
LuceneCliMinimum0() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Loads the logger and the available options (by calling loadOptions)
LuceneCliQuerierWT - Class in cat.lump.ir.lucene.cli
 
LuceneCliQuerierWT() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
 
LuceneCliWT2Query - Class in cat.lump.ir.lucene.cli
CLI for WikiTailor2Query
LuceneCliWT2Query() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliWT2Query
 
LuceneIndexer - Class in cat.lump.ir.lucene.index
An indexer based on Lucene in Action 2nd edition.
LuceneIndexer(String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexer
Default invocation for English
LuceneIndexer(Locale, String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexer
 
LuceneIndexer.TextFilesFilter - Class in cat.lump.ir.lucene.index
 
LuceneIndexer.TextFilesFilter() - Constructor for class cat.lump.ir.lucene.index.LuceneIndexer.TextFilesFilter
 
LuceneIndexerAbstract - Class in cat.lump.ir.lucene.index
An indexer based on Lucene in Action 2nd edition.
LuceneIndexerAbstract() - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerAbstract
 
LuceneIndexerWT - Class in cat.lump.ir.lucene.index
This is an adaptation of class LuceneIndexer to be used for Wikiparable
LuceneIndexerWT(String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerWT
Default invocation for English
LuceneIndexerWT(Locale, String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerWT
 
LuceneIndexerWT.TextFilesFilter - Class in cat.lump.ir.lucene.index
 
LuceneIndexerWT.TextFilesFilter() - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerWT.TextFilesFilter
 
LuceneInterface - Class in cat.lump.ir.lucene
An abstract class with the necessary data and methods to interact with Lucene's indexer and querier modules.
LuceneInterface(String) - Constructor for class cat.lump.ir.lucene.LuceneInterface
Set up
LuceneLanguages - Class in cat.lump.ir.lucene.index
A collection of static methods that contain the available languages for both ESA index construction and ESA-based text characterization. TODO: probably move to en_GB, en_US, en_CA, es_ES, es_MX
LuceneLanguages() - Constructor for class cat.lump.ir.lucene.index.LuceneLanguages
 
LuceneQuerier - Class in cat.lump.ir.lucene.query
 
LuceneQuerier(String) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerier
 
LuceneQuerierWT - Class in cat.lump.ir.lucene.query
Query into Lucene Indexes for WikiTailor.
LuceneQuerierWT(String, float) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
Default invocation for English
LuceneQuerierWT(String) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
 
LuceneQuerierWT(String, String, float) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
 
LuceneQuerierWT(String, String) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
 
LuceneQuerierWT(Locale, String, float) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
 
LuceneTokenizer - Class in cat.lump.ir.lucene.query
A simple interface to perform tokenization through the Lucene methods.
LuceneTokenizer() - Constructor for class cat.lump.ir.lucene.query.LuceneTokenizer
 
LuceneTokenizer(Locale) - Constructor for class cat.lump.ir.lucene.query.LuceneTokenizer
 
LumpLogger - Class in cat.lump.aq.basics.log
A link to the different log4j configurations.
LumpLogger(String) - Constructor for class cat.lump.aq.basics.log.LumpLogger
Initialise the logger with some label identifying the process

M

magnitude() - Method in class cat.lump.aq.basics.algebra.vector.Vector
Computes the magnitude of the vector (aka norm).
magnitudes - Variable in class cat.lump.ir.index.Index
 
main(String[]) - Static method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
 
main(String[]) - Static method in class cat.lump.aq.basics.log.LumpLogger
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Extract texts from Wikipedia articles and save them into text files, after some given preprocessing.
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.BackArticleSelector
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
Main method.
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
Main method.
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Main method.
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.experiments.CorrelationsxCategory
Main function to run the class, serves as example
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.experiments.MeanTFxCategory
Main function to run the class, serves as example
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.fragments.Xecutor
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
Example to get files with filename filter
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.prepro.TermExtractor
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTFs
Main function to run the class, serves as example
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
Example for using the class
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonCategoriesExtractor
 
main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
Example for using the class
main(String[]) - Static method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
 
main(String[]) - Static method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
In this example we run an instance of Wikipedia and display a couple of articles' contents.
main(String[]) - Static method in class cat.lump.ie.textprocessing.ner.NerOpennlp
 
main(String[]) - Static method in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
 
main(String[]) - Static method in class cat.lump.ie.textprocessing.ngram.WordNgrams
 
main(String[]) - Static method in class cat.lump.ie.textprocessing.TestTokenize
 
main(String[]) - Static method in class cat.lump.ie.textprocessing.transform.Transformation
 
main(String[]) - Static method in class cat.lump.ie.textprocessing.transform.Transliteratorr
 
main(String[]) - Static method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
main(String[]) - Static method in class cat.lump.ir.index.Indexer
 
main(String[]) - Static method in class cat.lump.ir.index.Querier
 
main(String[]) - Static method in class cat.lump.ir.lucene.index.LuceneIndexer
 
main(String[]) - Static method in class cat.lump.ir.lucene.index.LuceneIndexerWT
 
main(String[]) - Static method in class cat.lump.ir.lucene.query.LuceneQuerier
 
main(String[]) - Static method in class cat.lump.ir.lucene.query.LuceneQuerierWT
 
main(String[]) - Static method in class cat.lump.ir.lucene.query.WikiTailor2Query
 
main(String[]) - Static method in class cat.lump.ir.lucene.Xecutor
 
main(String[]) - Static method in class cat.lump.ir.sim.cl.len.LengthModel
Parses the input parameters and either learns a length model from a collection or estimates the corresponding values for a set of texts
MapUtil - Class in cat.lump.aq.basics.structure.standard
A class to sort a map according to its values.
MapUtil() - Constructor for class cat.lump.aq.basics.structure.standard.MapUtil
 
matrix2csv(String[][], File) - Method in class cat.lump.aq.basics.io.files.CsvFoolWriter
Writes a 2-dimensional array of Strings to a csv file.
matrix2csv(String[], String[][], File) - Method in class cat.lump.aq.basics.io.files.CsvFoolWriter
Writes a csv file.
max() - Method in class cat.lump.aq.basics.algebra.vector.Vector
 
maxSimilaritiesIndex - Variable in class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
Index to the maximum similarity value for each text fragment in A
maxSimilaritiesIndex - Variable in class cat.lump.ir.retrievalmodels.similarity.JaccardSimilarity
Index to the maximum similarity value for each text fragment in A
md5(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
md5(String) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
MeanTFxCategory - Class in cat.lump.aq.textextraction.wikipedia.experiments
Wikiparable: Experiment 1 for evaluation.
MeanTFxCategory(String, String, String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.experiments.MeanTFxCategory
 
min() - Method in class cat.lump.aq.basics.algebra.vector.Vector
 
mirrorCommonArticles2Langs(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
Method for converting the file with the articles in common L1.icat1.L2.icat2.method into the mirror file with L2.icat2.L1.icat1.method.
Model - Enum in cat.lump.ir.comparison
This enumeration contains the models of similarity implemented in this package.
model - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Model of similarity
modifyVector(Vector, String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
Very unlikely to happen, but a previously filled vector could be modified.
move(File, File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
mysql_pss - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
mysql_url_jwpl - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
mysql_usr - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
mysqlUrl() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
mysqlUrlJwpl() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
MySQLWikiConfiguration - Class in cat.lump.aq.wikilink.config
Class to handle the connection variables to the database.
MySQLWikiConfiguration() - Constructor for class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
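
The MapUtil entry above describes sorting a map according to its values. The following is a minimal sketch of one way to do that with the standard library; the class name MapSortSketch and the method bodies are assumptions for illustration and are not taken from the MapUtil source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the cat.lump packages.
class MapSortSketch {

    /** Returns a new map whose iteration order follows ascending values. */
    static <K, V extends Comparable<? super V>> Map<K, V> sortByValue(Map<K, V> map) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(map.entrySet());
        // List.sort is stable, so entries with equal values keep their input order.
        entries.sort(Map.Entry.comparingByValue());
        return toLinkedMap(entries);
    }

    /** Returns a new map whose iteration order follows descending values. */
    static <K, V extends Comparable<? super V>> Map<K, V> sortByValueInverse(Map<K, V> map) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Map.Entry.<K, V>comparingByValue().reversed());
        return toLinkedMap(entries);
    }

    // A LinkedHashMap preserves insertion order, which here is the sorted order.
    private static <K, V> Map<K, V> toLinkedMap(List<Map.Entry<K, V>> entries) {
        Map<K, V> sorted = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }
}
```

The input map is left untouched; callers iterate over the returned map to see the entries in value order.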

N

NerOpennlp - Class in cat.lump.ie.textprocessing.ner
A class to get the named entities from a given text based on OpenNLP.
NerOpennlp(Locale) - Constructor for class cat.lump.ie.textprocessing.ner.NerOpennlp
 
next() - Method in class cat.lump.ir.comparison.toCheck.SimilarityMatrixIterator
 
next() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrixIterator
 
nGrams - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
N value for the n-grams
ngramTokenize(String) - Static method in class cat.lump.ir.lucene.query.LuceneTokenizer
 
normalizeAndDeAposText(String) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Normalizes the text by collapsing runs of white space into a single space and substituting quotations, dashes and dots.
normalizeText(String) - Static method in class cat.lump.ie.textprocessing.TextPreprocessor
Normalizes the text by collapsing runs of white space into a single space and substituting quotations, dashes and dots.
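
The normalizeText entry above describes collapsing white space and substituting quotations, dashes and dots. A minimal sketch of that kind of normalization follows. The index does not say which characters TextPreprocessor actually substitutes, so the replacement table below (curly quotes, en and em dashes, the ellipsis character) is an assumption, as is the class name NormalizeSketch.

```java
// Hypothetical illustration class; not part of the cat.lump packages.
class NormalizeSketch {

    /**
     * Maps typographic quotes, dashes and the ellipsis character to ASCII
     * equivalents (assumed substitution set), then collapses every run of
     * white space into a single space.
     */
    static String normalizeText(String text) {
        String s = text
                .replaceAll("[\u2018\u2019]", "'")    // curly single quotes
                .replaceAll("[\u201C\u201D]", "\"")   // curly double quotes
                .replaceAll("[\u2013\u2014]", "-")    // en dash and em dash
                .replace("\u2026", "...");            // single-character ellipsis
        return s.replaceAll("\\s+", " ");
    }
}
```

Substituting first and collapsing last keeps the result free of doubled spaces even if a substitution were to introduce white space.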

O

objectA - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
Identifier for object A
objectB - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
Identifier for object B
objectExists(File, String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
Checks whether the vector-representation object exists.
onlyPunctuation(String) - Static method in class cat.lump.ie.textprocessing.sentence.Punctuation
 
options - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
The options for the given CLI
options - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
The options for the given CLI
overrideObjects - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
Whether previously computed semantic representations should be discarded (if they exist)

P

p - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Configuration file
p - Static variable in class cat.lump.aq.textextraction.wikipedia.WTConfig
Configuration file
Pair<S extends java.lang.Comparable<S>,T extends java.lang.Comparable<T>> - Class in cat.lump.aq.basics.structure
A class that contains a pair for storing data.
Pair(S, T) - Constructor for class cat.lump.aq.basics.structure.Pair
 
PairOperations - Class in cat.lump.aq.basics.structure
A class that allows for comparing a list of pairs according to their second value.
PairOperations() - Constructor for class cat.lump.aq.basics.structure.PairOperations
 
pairs_db - Static variable in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
DB with langlinks & articles pairs
pairs_db - Static variable in class cat.lump.aq.textextraction.wikipedia.utilities.CommonCategoriesExtractor
DB with langlinks & articles pairs
pairs_db - Static variable in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
DB with langlinks & articles pairs
pairs_db - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Method to parse the arguments received.
parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
 
parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
Parses the arguments received
parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Method to parse the arguments received.
parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
Parses the arguments received
parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
Parses the arguments received
parseLine(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Parses the arguments and generates the command line for further processing the parameters.
parseLine(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Parses the arguments and generates the command line for further processing the parameters.
PlainTextPreprocess - Class in cat.lump.aq.textextraction.wikipedia.prepro
Preprocess a Wikipedia page using JWPL API.
PlainTextPreprocess(TypePreprocess, String, Locale, int, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.prepro.PlainTextPreprocess
 
preprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Preprocess all the pages with the preprocess method identified by preprocess
preprocess(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Preprocess a text to extract its terms as defined in TermExtractor.getTerms()
preprocess(int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Function which implements the preprocessing procedure.
preprocess(int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.PlainTextPreprocess
 
preprocess(String) - Method in class cat.lump.ir.retrievalmodels.document.PseudoCognates
 
preprocessAll() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Applies all the available preprocessing steps to the pages.
PROCESS_START - Variable in class cat.lump.ir.lucene.LuceneInterface
 
processCategory(String, File) - Method in class cat.lump.ir.lucene.query.WikiTailor2Query
Extracts the top articles corresponding to the current model for a concrete category.
processLine(String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
Method used to do some preprocessing to the input text.
processLine(String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
Method used to do some preprocessing to the input text.
PseudoCognates - Class in cat.lump.ir.retrievalmodels.document
A document representation based on Simard's cognateness model.
PseudoCognates(Dictionary, Locale) - Constructor for class cat.lump.ir.retrievalmodels.document.PseudoCognates
 
Punctuation - Class in cat.lump.ie.textprocessing.sentence
Regular expression-based punctuation finder.
Punctuation() - Constructor for class cat.lump.ie.textprocessing.sentence.Punctuation
 
put(Vector, int) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
Store the vector in the specified position of the array.
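
The Pair and PairOperations entries above describe a two-element container and the ordering of a list of pairs by their second value. The sketch below illustrates the idea under stated assumptions: the generic bounds follow the Pair signature listed in this index, but the getters getKey and getValue, the wrapper class PairSketch, and the method bodies are hypothetical.

```java
import java.util.List;

// Hypothetical illustration class; not part of the cat.lump packages.
class PairSketch {

    /** Minimal pair with the same type bounds as the indexed Pair class. */
    static final class Pair<S extends Comparable<S>, T extends Comparable<T>> {
        private S key;
        private T value;

        Pair(S key, T value) {
            set(key, value);
        }

        void set(S key, T value) {
            this.key = key;
            this.value = value;
        }

        S getKey() { return key; }      // assumed accessor
        T getValue() { return value; }  // assumed accessor
    }

    /** Sorts the list in place, ascending by the second element of each pair. */
    static void sortByValue(List<Pair<Integer, Double>> pairs) {
        pairs.sort((a, b) -> a.getValue().compareTo(b.getValue()));
    }

    /** Sorts the list in place, descending by the second element of each pair. */
    static void sortByValueReverse(List<Pair<Integer, Double>> pairs) {
        pairs.sort((a, b) -> b.getValue().compareTo(a.getValue()));
    }
}
```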

Q

Querier - Class in cat.lump.ir.index
 
Querier(Locale, File, RepresentationType) - Constructor for class cat.lump.ir.index.Querier
Invocation where the documents' language is given and the directory for the index is provided.
Querier(File, RepresentationType) - Constructor for class cat.lump.ir.index.Querier
Default invocation; files are in English
query(String) - Method in class cat.lump.ir.lucene.query.LuceneQuerier
 
queryIDs(String) - Method in class cat.lump.ir.lucene.query.LuceneQuerierWT
Queries Lucene with the query text and returns a String with the IDs of those documents that have a score larger than max_score/percentage
queryNorm(float) - Method in class cat.lump.ir.lucene.index.TFSimilarity
 
queryScoreDoc(String) - Method in class cat.lump.ir.lucene.query.LuceneQuerierWT
Queries Lucene with the query text and returns a LinkedHashMap with the pairs (score, filename) for those documents that have a score larger than max_score/percentage
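
The queryIDs and queryScoreDoc entries above keep only the documents whose score is larger than max_score/percentage. The sketch below shows that filtering step in isolation, without Lucene. It reads the threshold literally as the maximum score divided by the percentage value; how LuceneQuerierWT interprets percentage is not stated in this index, so that reading, the map layout (file name to score) and the class name ScoreThresholdSketch are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the cat.lump packages.
class ScoreThresholdSketch {

    /**
     * Keeps the entries whose score is strictly larger than
     * maxScore / percentage, preserving the input iteration order.
     */
    static Map<String, Float> aboveThreshold(Map<String, Float> scores, float percentage) {
        // Find the best score among the retrieved documents.
        float maxScore = 0f;
        for (float score : scores.values()) {
            maxScore = Math.max(maxScore, score);
        }
        float threshold = maxScore / percentage;

        Map<String, Float> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Float> e : scores.entrySet()) {
            if (e.getValue() > threshold) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}
```

With scores 8, 5 and 3 and a percentage value of 2, the threshold is 4, so only the first two documents are kept.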

R

Ranking - Class in cat.lump.ir.index
A class that implements a ranking of identifiers (documents) and their relevance.
Ranking() - Constructor for class cat.lump.ir.index.Ranking
 
readFully(Reader) - Method in class cat.lump.ir.lucene.engine.WTAnalyzer
 
readObject(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
reconstructTradArticles(HashSet<Integer>, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
Given the original index files and the temporary files already translated, the method reconstructs the translation of every individual article and saves them in path/plain/L1/index/id.trad.L2.txt
reconstructTradArticlesCommon(String, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
Given the original index files and the temporary files already translated, the method reconstructs the translation of every individual article that was originally in the file of common articles and saves them in path/plain/L1/index/id.trad.L2.txt
remove(int) - Method in class cat.lump.ir.index.Index
Remove an existing document from the index.
remove(String) - Method in class cat.lump.ir.index.Ranking
Remove a document from the ranking
removeAccents(String) - Static method in class cat.lump.ie.textprocessing.transform.Transformation
 
removeDiacritics(String) - Static method in class cat.lump.ie.textprocessing.sentence.Diacritics
 
removeDiacritics() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Removes the diacritics from the string
removeDocument(int) - Method in class cat.lump.ir.index.Indexer
Remove document id from the index and documents' collection
removeEngStopwords() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Eliminates all the English stopwords from the string
removeMarks(String) - Static method in class cat.lump.ie.textprocessing.sentence.Diacritics
 
removeNonAlphabetic(int) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Remove any token which is not in [:alpha:] character class.
removeNonAlphaNumeric(int) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Remove any token which is not in the [:alnum:] character class.
removePage(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
 
removePage(Integer) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Removes a page to preprocess
removePreprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Removes the preprocess if it belongs to the set of available preprocesses.
removePunctuation() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Removes the punctuation marks from the string
removeScoredCategory(GroupOfCategories.ScoredCategory) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
removeSource(int) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
Remove the pair with the given source ID
removeSource(int) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
Remove the pair with the given source ID
removeStopwords(String) - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
Checks for the vocabulary in the text and returns a copy after discarding stopwords.
removeStopwords(List<String>) - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
 
removeStopwords() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Eliminates all the stopwords from the string
removeTarget(int) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
Remove the pair with the given target ID
removeTarget(int) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
Removes the pair with the given target ID
removeTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
Remove the given term (warning if the term does not exist)
removeTerms(List<String>) - Method in class cat.lump.ir.weighting.TermFrequency
Remove these terms from the collection.
RepresentationType - Enum in cat.lump.ir.retrievalmodels.document
This enumeration contains the types of data representation available in this package.
representText(int) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
Represents the article text according to the type of representation assigned.
repType - Variable in class cat.lump.ir.index.Abstracter
Set of representations to generate/load in the index
reset() - Method in class cat.lump.ir.index.Ranking
Start a new ranking
resolveIfRedirect(int, int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Looks for a resolvedId given the original id only in case it corresponds to a redirect.
resolveRedirect(int, int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
Looks for a resolvedId given an original redirect id.
reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.CroatianAnalyzer
 
reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.EstonianAnalyzer
 
reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.LithuanianAnalyzer
 
reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.SlovenianAnalyzer
 
rootDirectory - Variable in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Root directory wherein the output will be stored.
run() - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
This function is executed when the instance is treated as a thread.
runStatement(String) - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
 
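
The removeDiacritics entries above strip diacritics from a string. One common way to do this in Java, shown here as a sketch, is to decompose the text with java.text.Normalizer and then delete the combining marks. Whether the Diacritics and TextPreprocessor classes work this way is not stated in this index; the class name DiacriticsSketch is hypothetical.

```java
import java.text.Normalizer;

// Hypothetical illustration class; not part of the cat.lump packages.
class DiacriticsSketch {

    /**
     * Decomposes each accented character into a base character plus
     * combining marks (Unicode NFD), then removes the marks.
     */
    static String removeDiacritics(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches any Unicode combining mark.
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

Characters without a canonical decomposition, such as the Polish letter l with stroke, pass through this approach unchanged.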

S

sameCardinality(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Check if the vector and v2 have the same cardinality.
sameCardinality(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Check if the vector and v2 have the same cardinality.
saveIndex() - Method in class cat.lump.ir.index.Indexer
Saves the documents into the provided output directory
saveObject(File, EsaVectors) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
Saves a textual representation into an object file
savePage(File, TypePreprocess, String, int, StringBuffer) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
Saves the text contained in the buffer in the correct file considering the page ID, language and type of preprocess.
scoreDomain(DomainVocabulary, Category) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Scores all the groups which compose a domain.
scoreDomain(DomainVocabulary, Category, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
Scores the groups which compose a domain and are at most as far as the maxDistance parameter defines.
searchCategoryDepth(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
selectArticles(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
Select the articles that are supposed to appear in a category
selectArticles(File, short) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
Select the articles that are supposed to appear in a category
SentencesOpennlp - Class in cat.lump.ie.textprocessing.sentence
A class to get the sentences from a given text.
SentencesOpennlp(Locale) - Constructor for class cat.lump.ie.textprocessing.sentence.SentencesOpennlp
 
separator - Static variable in class cat.lump.aq.basics.io.files.FileIO
 
serialize(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Saves the vocabulary in a binary file.
serialize(File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
Saves this object in a file as binary raw data.
set(S, T) - Method in class cat.lump.aq.basics.structure.Pair
 
setAnalyzer() - Method in class cat.lump.ir.lucene.index.LuceneIndexer
 
setAnalyzer() - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
Sets the analyzer as a new instance of WTAnalyzer
setAnalyzer() - Method in class cat.lump.ir.lucene.query.LuceneQuerierWT
Sets the analyzer as a new instance of WTAnalyzer
setAnalyzer(Locale) - Method in class cat.lump.ir.lucene.query.LuceneTokenizer
TODO: This code is duplicated with cat.lump.ir.lucene.engine.loadAnalyzer
setAnalyzer() - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
Set the Lucene analyzer to use according to the given language
setDataDir(String) - Method in class cat.lump.ir.lucene.index.LuceneIndexer
 
setDataDir(String) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
 
setDB(String) - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
 
setDir(String) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.Xecutor
 
setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.cl.clesa.SimilarityCLESAdocs
 
setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
A method that loads the texts in collections A and B.
setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAdocs
 
setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
 
setDomain(boolean) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
Sets if this category belongs to the domain.
setFiles(File, File) - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
Set the input files to estimate the model from
setFiles(File, File) - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
Set the two input files
setIndexDir(String) - Method in class cat.lump.ir.lucene.LuceneInterface
 
setIndexPath(File) - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
Set the path to Lucene's index
setInvIndex(InvIndexContainer) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
setKey(S) - Method in class cat.lump.aq.basics.structure.Pair
 
setLang(Locale) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Sets the language.
setLanguage(String) - Method in class cat.lump.ir.lucene.LuceneInterface
 
setLanguage(Locale) - Method in class cat.lump.ir.lucene.LuceneInterface
 
setLanguage(Locale) - Method in class cat.lump.ir.retrievalmodels.document.Document
 
setLanguage(Locale) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
setMatrix(double[][]) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 
setMaxSize(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Defines the maximum number of terms that should be considered as domain terms
setMinimumSize(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Changes the minimum size of the tokens to be accepted as terms of the vocabulary.
setMinNumArticles(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Defines the minimum number of articles required to build the vocabulary
setModel(Model) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
setMuSigma(double, double) - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
Set values for mu and sigma
setnGrams(int) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
setNormalisation(Boolean) - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
Determines whether the ESA vectors are going to be normalised.
setObjects() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
Set the name of the resulting vector objects
setObjects() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
 
setObjects() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
 
setOutPath(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
setOutPath(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
 
setOutPath(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
 
setOutputDir(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
 
setOutputDir(String) - Method in class cat.lump.ir.lucene.query.WikiTailor2Query
 
setPairs(ArrayList<SimilarityPair>) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
Sets the list of pairs.
setPairs(ArrayList<SimilarityPair>) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
Sets the list of pairs.
setPercentage(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Defines the percentage of terms that should be considered as domain terms
setRepresentation(Document) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
setRootDirectory(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
Changes the root directory of the preprocessor.
setSimilarity(int, int, float) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
Changes the value of the position of the matrix given by row and col.
setSource(<any>) - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
 
setSource(Article) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
setSrcLang(String) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
setSrcLang(String) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
setString(String) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Stores a copy of the original string and generates a tokenized copy.
setStringTokens(String) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Stores a copy of the original string and generates a tokenized copy.
setTarget(<any>) - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
 
setTarget(Article) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
setTerms(Collection<TermFrequencyTuple>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Sets a collection of tuples formed by a term and its frequency as vocabulary.
setText(String) - Method in class cat.lump.ir.retrievalmodels.document.PseudoCognates
 
setText(String) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
 
setTrgLang(String) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
 
setTrgLang(String) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
 
setType(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
 
setValue(T) - Method in class cat.lump.aq.basics.structure.Pair
 
setVerbose(Boolean) - Method in class cat.lump.ir.lucene.LuceneInterface
 
setVerbose(Boolean) - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
Deprecated.
setYear(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
 
Similarity - Enum in cat.lump.ir.retrievalmodels.similarity
This enumeration contains the similarity models and characterizations available.
Similarity - Interface in cat.lump.ir.sim
Interface with the minimum required methods to code a similarity model.
SimilarityCalculator - Class in cat.lump.ir.retrievalmodels.similarity
 
SimilarityCalculator(String, String, RepresentationType, Model) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Creates a similarity calculator with the given arguments and n=1 for the n-grams methods.
SimilarityCalculator(String, String, RepresentationType, Model, int) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Creates a similarity calculator with the given arguments.
SimilarityCalculatorLenFact - Class in cat.lump.ir.retrievalmodels.similarity
 
SimilarityCalculatorLenFact(String, String, RepresentationType, Model) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculatorLenFact
 
SimilarityCLESA - Class in cat.lump.ir.sim.cl.clesa
Implementation of Cross-Language Explicit Semantic Analysis, an extension of explicit semantic analysis proposed by:
Potthast, Stein and Anderka.
SimilarityCLESA(String, String, File, File, String, String, boolean) - Constructor for class cat.lump.ir.sim.cl.clesa.SimilarityCLESA
 
SimilarityCLESAdocs - Class in cat.lump.ir.sim.cl.clesa
Implementation of Explicit Semantic Analysis as described in:
Gabrilovich, Evgeniy, and Shaul Markovitch.
SimilarityCLESAdocs(String, String, File, File, String, String, boolean) - Constructor for class cat.lump.ir.sim.cl.clesa.SimilarityCLESAdocs
 
SimilarityESA - Class in cat.lump.ir.sim.ml.esa
Implementation of Explicit Semantic Analysis as described in:
Gabrilovich, Evgeniy, and Shaul Markovitch.
SimilarityESA(String, String, Boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESA
Constructor.
SimilarityESAdocs - Class in cat.lump.ir.sim.ml.esa
Implementation of SimilarityESA, with two file collections to compute similarities against each other.
SimilarityESAdocs(File, File, String, String, boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESAdocs
Includes the path to the documents in A and B.
SimilarityESAlines - Class in cat.lump.ir.sim.ml.esa
Extension of SimilarityESA, with two files to compute similarities against each other.
SimilarityESAlines(File, File, String, String, Boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESAlines
Given two files with independent sentences, process line-wise similarities.
SimilarityESAsent - Class in cat.lump.ir.sim.ml.esa
Implementation of SimilarityESAlines that works over one single tab-separated document in which the left-side text-line has to be compared against the right-side one.
SimilarityESAsent(File, String, String, Boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESAsent
Invokes the superclass SimilarityESAlines, passing it the same document twice as if two separate documents existed.
SimilarityMatrix - Class in cat.lump.ir.retrievalmodels.similarity
This class represents a matrix of similarities.
SimilarityMatrix(int, int) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
Creates a matrix filled with zeros with the size given by the parameters.
SimilarityMatrixIterator - Class in cat.lump.ir.comparison.toCheck
 
SimilarityMatrixIterator(SimilarityMatrix) - Constructor for class cat.lump.ir.comparison.toCheck.SimilarityMatrixIterator
 
SimilarityMatrixIterator - Class in cat.lump.ir.retrievalmodels.similarity
 
SimilarityMatrixIterator(SimilarityMatrix) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrixIterator
 
SimilarityMeasure - Interface in cat.lump.ir.retrievalmodels.similarity
 
SimilarityModel - Interface in cat.lump.ir.retrievalmodels.similarity
 
SimilarityPair - Class in cat.lump.ir.comparison.toCheck
A similarity pair contains the pairs which define the input data for the similarity calculators.
SimilarityPair(int, File, int, File) - Constructor for class cat.lump.ir.comparison.toCheck.SimilarityPair
 
size - Variable in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
The size of the matrix (which changes depending on the inserted elements)
size() - Method in class cat.lump.ir.index.Ranking
 
size(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
 
size() - Method in class cat.lump.ir.weighting.TermFrequency
 
skipPage(int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Checks if the page must be skipped.
sloppyFreq(int) - Method in class cat.lump.ir.lucene.index.TFSimilarity
 
SlovenianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
 
SlovenianAnalyzer() - Constructor for class cat.lump.ir.lucene.index.analyzers.SlovenianAnalyzer
 
sortByValue(ArrayList<Pair<Integer, Double>>) - Static method in class cat.lump.aq.basics.structure.PairOperations
Sort a list of pairs according to their values
sortByValue(Map<K, V>) - Static method in class cat.lump.aq.basics.structure.standard.MapUtil
 
sortByValueInverse(Map<K, V>) - Static method in class cat.lump.aq.basics.structure.standard.MapUtil
 
sortByValueReverse(ArrayList<Pair<Integer, Double>>) - Static method in class cat.lump.aq.basics.structure.PairOperations
Sorts a list of pairs by value, in reverse order
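The two orderings can be illustrated with a minimal sketch. It is not the PairOperations implementation: Map.Entry<Integer, Double> stands in for the library's Pair<Integer, Double>, the class name SortPairsSketch is hypothetical, and both the ascending direction of sortByValue and the choice to return a sorted copy are assumptions.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SortPairsSketch {
    // Returns a copy sorted by value, smallest first (assumed direction).
    static List<Map.Entry<Integer, Double>> sortByValue(List<Map.Entry<Integer, Double>> pairs) {
        List<Map.Entry<Integer, Double>> sorted = new ArrayList<>(pairs);
        sorted.sort(Map.Entry.comparingByValue());
        return sorted;
    }

    // Returns a copy sorted by value, largest first.
    static List<Map.Entry<Integer, Double>> sortByValueReverse(List<Map.Entry<Integer, Double>> pairs) {
        List<Map.Entry<Integer, Double>> sorted = new ArrayList<>(pairs);
        sorted.sort(Map.Entry.<Integer, Double>comparingByValue().reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, Double>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>(1, 0.4));
        pairs.add(new SimpleEntry<>(2, 0.9));
        pairs.add(new SimpleEntry<>(3, 0.1));
        System.out.println(sortByValue(pairs));
        System.out.println(sortByValueReverse(pairs));
    }
}
```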
source - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Article in source language
Span - Class in cat.lump.ie.textprocessing
Copied from the aitools span definition.
Span(int, int) - Constructor for class cat.lump.ie.textprocessing.Span
 
splitDiacritics() - Method in class cat.lump.ie.textprocessing.sentence.Diacritics
 
splitText(String) - Method in class cat.lump.ir.retrievalmodels.document.Document
 
sqlPass() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
sqlUser() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
 
src_len - Variable in class cat.lump.aq.basics.structure.ArticlePair
Length of the source language article
srcID - Variable in class cat.lump.aq.basics.structure.ArticlePair
Identifier of the source language article
srcTitle - Variable in class cat.lump.aq.basics.structure.ArticlePair
Title of the source language article
standardTokenize(File) - Method in class cat.lump.ir.lucene.query.LuceneTokenizer
Tokenize the text from the given file using Lucene's StandardAnalyzer
standardTokenize(String) - Method in class cat.lump.ir.lucene.query.LuceneTokenizer
Tokenize the text using Lucene's StandardAnalyzer
stem() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
 
stemLucene() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
 
StemmerFactory - Class in cat.lump.ie.textprocessing.word
Factory that allows for getting a stemmer for the required language (if available)
StemmerFactory() - Constructor for class cat.lump.ie.textprocessing.word.StemmerFactory
 
STOP_LIST - Variable in class cat.lump.ie.textprocessing.stopwords.Stopwords
 
Stopwords - Class in cat.lump.ie.textprocessing.stopwords
Abstract class that gives the methods for stopwords acquisition and modification in different languages.
Stopwords(Locale) - Constructor for class cat.lump.ie.textprocessing.stopwords.Stopwords
 
str2FlatQuery(Analyzer, String) - Method in class cat.lump.ir.lucene.query.Document2Query
Generates a query in which every token has the same relevance
str2FlatQuery(String) - Method in class cat.lump.ir.lucene.query.Document2Query
Generates a query in which every token has the same relevance
stringToFile(File, String, boolean) - Static method in class cat.lump.aq.basics.io.files.FileIO
 

T

tableExists(String, String) - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
Checks if a table exists in a database
target - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Article in target language
tBox - Variable in class cat.lump.aq.basics.structure.InvIndexContainer
 
TermExtractor - Class in cat.lump.aq.textextraction.wikipedia.prepro
A class that extracts terms according to different definitions.
TermExtractor(Locale) - Constructor for class cat.lump.aq.textextraction.wikipedia.prepro.TermExtractor
Creates a new TermExtractor
TermFrequency - Class in cat.lump.ir.weighting
A class to compute and store a simple term frequency.
TermFrequency() - Constructor for class cat.lump.ir.weighting.TermFrequency
Invokes the class with an empty list of term tuples
TermFrequency(List<TermFrequencyTuple>) - Constructor for class cat.lump.ir.weighting.TermFrequency
Invokes the class with an existing list of term tuples
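As an illustration of what a simple term frequency amounts to, the sketch below counts how often each term occurs in a token array. It is not the TermFrequency class: the class name TermFrequencySketch and the count method are hypothetical, and a plain map replaces the list of TermFrequencyTuple objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TermFrequencySketch {
    // Maps every distinct term to its number of occurrences,
    // keeping the order in which terms first appear.
    static Map<String, Integer> count(String[] tokens) {
        Map<String, Integer> frequencies = new LinkedHashMap<>();
        for (String token : tokens) {
            frequencies.merge(token, 1, Integer::sum);
        }
        return frequencies;
    }

    public static void main(String[] args) {
        String[] tokens = {"wiki", "article", "wiki", "category", "wiki"};
        System.out.println(count(tokens));
    }
}
```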
TermFrequencyTuple - Class in cat.lump.aq.basics.structure.ir
This class provides a term frequency abstraction.
TermFrequencyTuple(String, int) - Constructor for class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
Constructor.
TermFrequencyTuple(String) - Constructor for class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
Constructor.
test() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
Tests the connection by displaying the databases
TestTokenize - Class in cat.lump.ie.textprocessing
 
TestTokenize() - Constructor for class cat.lump.ie.textprocessing.TestTokenize
 
text_magnitudes - Variable in class cat.lump.ir.retrievalmodels.similarity.Article
Map with the article's text magnitudes
TextPreprocessor - Class in cat.lump.ie.textprocessing
This class represents an "interface" to the different text processing tools available in this package.
TextPreprocessor(Locale) - Constructor for class cat.lump.ie.textprocessing.TextPreprocessor
 
textsA - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
Path to documents A
textsB - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
Path to documents B
TFSimilarity - Class in cat.lump.ir.lucene.index
An extension to Lucene's default similarity that intends to represent a document simply by its TFs.
TFSimilarity() - Constructor for class cat.lump.ir.lucene.index.TFSimilarity
 
times(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Multiplies the vector by a scalar and returns the result.
timesEquals(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
Multiplies the vector by a scalar and updates its internal value.
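The difference between the two methods (return a new result versus update in place) can be sketched as follows. ScalarProductSketch is a hypothetical stand-in, not the library's Vector class, and the void return type of timesEquals is an assumption.

```java
class ScalarProductSketch {
    private final float[] values;

    ScalarProductSketch(float[] values) {
        this.values = values.clone();
    }

    // Returns a new object holding the product; this object is left unchanged.
    ScalarProductSketch times(float scalar) {
        float[] result = new float[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = values[i] * scalar;
        }
        return new ScalarProductSketch(result);
    }

    // Overwrites the internal values with the product.
    void timesEquals(float scalar) {
        for (int i = 0; i < values.length; i++) {
            values[i] *= scalar;
        }
    }

    float get(int index) {
        return values[index];
    }

    public static void main(String[] args) {
        ScalarProductSketch v = new ScalarProductSketch(new float[] {1f, 2f});
        ScalarProductSketch doubled = v.times(2f);
        System.out.println(doubled.get(1) + " " + v.get(1));
        v.timesEquals(3f);
        System.out.println(v.get(1));
    }
}
```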
toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
TODO ???
toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.BackArticleSelector
TODO ???
toFile(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
Dump tree into a file.
toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
 
toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Saves the top list to a text file.
toFile(File, List<TermFrequencyTuple>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
Saves the given list of TermFrequencyTuples into a file
toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Exports the vocabulary to a textual file.
toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
 
toFile(File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
Writes the matrix as text in a given file
TokenizerFactory - Class in cat.lump.ir.lucene.query
Creates and returns a Lucene Tokenizer instance for the required language
TokenizerFactory() - Constructor for class cat.lump.ir.lucene.query.TokenizerFactory
 
tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.engine.WTAnalyzer
Pipeline for the analyser.
tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.CroatianAnalyzer
 
tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.EstonianAnalyzer
 
tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.LithuanianAnalyzer
 
tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.SlovenianAnalyzer
 
toList() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
Exports the vocabulary as a list of tuples which contains the term and its frequency.
toLowerCase() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
Converts the string to lowercase
toString() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
 
toString() - Method in class cat.lump.aq.basics.structure.Pair
 
toString() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
Transforms the object into a string
toString() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
Returns a string representation of the instance.
toString() - Method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
 
toString() - Method in class cat.lump.aq.wikilink.config.Dump
 
toString() - Method in class cat.lump.ir.index.Ranking
 
toString() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
 
Transformation - Class in cat.lump.ie.textprocessing.transform
 
Transformation() - Constructor for class cat.lump.ie.textprocessing.transform.Transformation
 
translateSets(String, int) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
Main method to call the decoder String translator for all the files generated by generateSetsFullFolder().
Transliteratorr - Class in cat.lump.ie.textprocessing.transform
A class to transliterate a text with ICU4J
Transliteratorr(Locale) - Constructor for class cat.lump.ie.textprocessing.transform.Transliteratorr
 
trg_id - Variable in class cat.lump.aq.basics.structure.ArticlePair
Identifier of the target language article
trg_len - Variable in class cat.lump.aq.basics.structure.ArticlePair
Length of the target language article
trgTitle - Variable in class cat.lump.aq.basics.structure.ArticlePair
Title of the target language article
type - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
Type of text representation
TypePreprocess - Enum in cat.lump.aq.textextraction.wikipedia.prepro
Enumeration of the different types of preprocess.

V

valueOf(String) - Static method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum cat.lump.ir.comparison.Model
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum cat.lump.ir.retrievalmodels.document.RepresentationType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
Returns the enum constant of this type with the specified name.
values() - Static method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum cat.lump.ir.comparison.Model
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum cat.lump.ir.retrievalmodels.document.RepresentationType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
Returns an array containing the constants of this enum type, in the order they are declared.
Vector - Class in cat.lump.aq.basics.algebra.vector
A vector of doubles that allows for a number of vector-vector and vector-scalar algebraic operations, including:
  • sum of vectors (--> vector)
  • product of vectors (--> scalar)
  • product by scalar (--> vector)
  • division by scalar (--> vector)
Properties of the vector (magnitude, max, min, argmax, and argmin) are available as well.
Vector(float[]) - Constructor for class cat.lump.aq.basics.algebra.vector.Vector
Initialisation with an array of doubles
VectorCosine - Class in cat.lump.ir.retrievalmodels.similarity
A class to compute the cosine similarity between two vectors.
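For reference, cosine similarity is the dot product of two vectors divided by the product of their magnitudes. The sketch below computes it on plain double arrays. It is not the VectorCosine implementation: the class and method names are hypothetical, and returning 0 when either vector has zero magnitude is an assumption.

```java
class CosineSketch {
    // cos(a, b) = (a . b) / (|a| * |b|)
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double squaredNormA = 0.0;
        double squaredNormB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            squaredNormA += a[i] * a[i];
            squaredNormB += b[i] * b[i];
        }
        if (squaredNormA == 0.0 || squaredNormB == 0.0) {
            return 0.0; // assumed convention for zero vectors
        }
        return dot / (Math.sqrt(squaredNormA) * Math.sqrt(squaredNormB));
    }

    public static void main(String[] args) {
        System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
        System.out.println(cosine(new double[] {1, 2}, new double[] {2, 4}));
    }
}
```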
VectorCosine() - Constructor for class cat.lump.ir.retrievalmodels.similarity.VectorCosine
 
verbose - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
Verbosity
verbose - Variable in class cat.lump.ir.lucene.index.LuceneIndexer
Directory where the Lucene index has to be stored
verbose - Variable in class cat.lump.ir.lucene.index.LuceneIndexerWT
 
verbose - Variable in class cat.lump.ir.lucene.LuceneInterface
 
vocabularySize() - Method in class cat.lump.ir.index.Index
 
vocQuery(String[]) - Static method in class cat.lump.ir.lucene.query.Document2Query
Creates a query considering only the vocabulary (i.e. types)

W

warn(String) - Method in class cat.lump.aq.basics.log.LumpLogger
 
weightQuery(String[]) - Static method in class cat.lump.ir.lucene.query.Document2Query
Creates a query where the relevance of a type depends on its frequency (e.g. if a token w appears 4 times, it will appear as w^4)
wiki - Variable in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Wikipedia JWPL connector
WikipediaCliArticleSelector - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI for direct access to the article selection step of the Xecutor pipeline for WikiTailor category-based in-domain comparable corpora extraction.
WikipediaCliArticleSelector() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
 
WikipediaCliArticleTextExtractor - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI for direct access to the article extraction step of the Xecutor pipeline for WikiTailor category-based in-domain comparable corpora extraction.
WikipediaCliArticleTextExtractor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
 
WikipediaCliCategoriesXecutor - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI to access the Xecutor pipeline for WikiTailor category-based in-domain comparable corpora extraction.
WikipediaCliCategoriesXecutor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
 
WikipediaCliCategoryDepth - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI for direct access to the category depth estimation step of the Xecutor pipeline for WikiTailor category-based in-domain comparable corpora extraction.
WikipediaCliCategoryDepth() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
 
WikipediaCliCategoryExtractor - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI for direct access to the category extraction step of the Xecutor pipeline for WikiTailor category-based in-domain comparable corpora extraction.
WikipediaCliCategoryExtractor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
 
WikipediaCliDomainKeywords - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI for direct access to the domain keyword extraction step of the Xecutor pipeline for WikiTailor category-based in-domain comparable corpora extraction.
WikipediaCliDomainKeywords() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
 
WikipediaCliFragmentsXecutor - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI to access the extraction of parallel sentences from the corpus obtained with WikiTailor textextraction.wikipedia.
WikipediaCliFragmentsXecutor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
 
WikipediaCliMinimum - Class in cat.lump.aq.textextraction.wikipedia.cli
CLI to access JWPL-WIKIPEDIA-related programs in this package.
WikipediaCliMinimum() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Loads the logger and the available options (by calling loadOptions)
WikipediaDBdata - Class in cat.lump.aq.wikilink
Utilities for querying the Wikipedia DB and dealing with its data.
WikipediaDBdata() - Constructor for class cat.lump.aq.wikilink.WikipediaDBdata
 
WikipediaDriverManager - Class in cat.lump.aq.wikilink.connexion
Adaptation of cat.talp.lump.co.db.DriverManagerClass from
WikipediaDriverManager() - Constructor for class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
 
WikipediaJwpl - Class in cat.lump.aq.wikilink.jwpl
This class provides methods for initialising a JWPL Wikipedia instance.
WikipediaJwpl(Locale, int) - Constructor for class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Creates its own database configuration according to language and year
WikipediaJwpl(DatabaseConfiguration) - Constructor for class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
Invokes the super class with the database configuration, sets the constants, and loads the JWPL Wikipedia instance.
WikiProperties - Class in cat.lump.aq.textextraction.wikipedia
Deprecated.
WikiProperties() - Constructor for class cat.lump.aq.textextraction.wikipedia.WikiProperties
Deprecated.
 
WikiTailor2Query - Class in cat.lump.ir.lucene.query
Query into Lucene indexes for WikiTailor.
WikiTailor2Query(Locale, String, String, String, float, int, String) - Constructor for class cat.lump.ir.lucene.query.WikiTailor2Query
Constructors
WikiTailor2Query(Locale, String, String, String, float, int, String, Boolean) - Constructor for class cat.lump.ir.lucene.query.WikiTailor2Query
 
WordDecompositionICU4J - Class in cat.lump.ie.textprocessing.word
A class based on the aitools WordDEcompositionICU4J class.
WordDecompositionICU4J(Locale) - Constructor for class cat.lump.ie.textprocessing.word.WordDecompositionICU4J
 
WordNgrams - Class in cat.lump.ie.textprocessing.ngram
This class allows for generating word-level n-grams from a text.
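Word-level n-grams are the contiguous runs of n words in a text. The sketch below illustrates the idea only: NgramSketch is a hypothetical class, and splitting on whitespace is an assumption that ignores whatever locale-aware tokenisation the WordNgrams(int, Locale) constructor implies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NgramSketch {
    // Slides a window of n words over the whitespace-separated tokens.
    static List<String> ngrams(String text, int n) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty() || n < 1) {
            return result;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int start = 0; start + n <= tokens.length; start++) {
            result.add(String.join(" ", Arrays.copyOfRange(tokens, start, start + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("the quick brown fox", 2));
    }
}
```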
WordNgrams(int, Locale) - Constructor for class cat.lump.ie.textprocessing.ngram.WordNgrams
 
writeObject(Object, File) - Static method in class cat.lump.aq.basics.io.files.FileIO
 
writePreprocessing(Collection<String>, File, int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
Writes the result of a preprocessing step to the given output file.
WTAnalyzer - Class in cat.lump.ir.lucene.engine
Modification of the standard Lucene Analyzer to mimic the preprocessing and term extraction used in the Wikiparable experiment
WTAnalyzer(Version, Locale) - Constructor for class cat.lump.ir.lucene.engine.WTAnalyzer
 
WTConfig - Class in cat.lump.aq.textextraction.wikipedia
A class to read the configuration file as a Properties object
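Reading a configuration into a java.util.Properties object typically looks like the sketch below. It is not the WTConfig code: ConfigSketch is a hypothetical class, the keys language and year are invented for illustration, and the text is read from a string rather than from the actual configuration file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ConfigSketch {
    // Parses key=value lines into a Properties object.
    static Properties load(String configText) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        // Hypothetical keys, not taken from the real configuration file.
        Properties config = load("language=en\nyear=2013\n");
        System.out.println(config.getProperty("language"));
        System.out.println(config.getProperty("missing", "fallback"));
    }
}
```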
WTConfig() - Constructor for class cat.lump.aq.textextraction.wikipedia.WTConfig
 

X

Xecutor - Class in cat.lump.aq.textextraction.wikipedia.categories
This class intends to join together all the necessary processes to extract the articles related to a given category.
Xecutor(Locale, int, double, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
Xecutor(Locale, int, double, int, int, int, int, int, String, String, String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
 
Xecutor - Class in cat.lump.aq.textextraction.wikipedia.fragments
This class intends to join together all the necessary processes to extract comparable fragments from articles belonging to a given category.
Xecutor(Locale, Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.fragments.Xecutor
 
Xecutor - Class in cat.lump.ir.lucene
This class intends to join together all the necessary processes to extract the articles related to a given category through the Lucene engine.
Xecutor(Locale, int, int, int, int, String) - Constructor for class cat.lump.ir.lucene.Xecutor
 
XecutorTheFirst - Class in cat.lump.aq.textextraction.wikipedia.categories
This class intends to join together all the necessary processes to extract the articles related to a given category.
XecutorTheFirst(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
 
XecutorTheSecond - Class in cat.lump.aq.textextraction.wikipedia.categories
This class intends to join together all the necessary processes to extract the articles related to a given category.
XecutorTheSecond(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
 

Y

year - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
Year of the Wikipedia edition