- Abstracter - Class in cat.lump.ir.index
-
Contains the basic operations for indexing and querying a document index.
- Abstracter(Locale, File, RepresentationType[]) - Constructor for class cat.lump.ir.index.Abstracter
-
Calls the setters for language and representation type.
- AbstractPreprocess - Class in cat.lump.aq.textextraction.wikipedia.prepro
-
- AbstractPreprocess(TypePreprocess, String, Locale, int, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Creates a preprocess method able to preprocess pages from the Wikipedia
dump identified by language and year.
- accept(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexer.TextFilesFilter
-
- accept(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT.TextFilesFilter
-
- actualPair - Variable in class cat.lump.aq.basics.structure.ArticlePair
-
1 (?)
- add(Vector) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
Adds a new vector to the matrix, in the last available slot
- add(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Computes the sum of the vector and v.
This method does not modify the vector.
- add(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Computes the sum of the vector and values, and returns the result.
This method does not modify the vector.
- add(Map<String, Double>, String) - Method in class cat.lump.ir.index.Index
-
Add a new document with the given weights to the index.
- add(String, double) - Method in class cat.lump.ir.index.Ranking
-
Inserts a new document into the ranking.
- add(int, int, double) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
Adds the value to the current value of the position given by
row and col
- addDocument(File, String) - Method in class cat.lump.ir.index.Indexer
-
Add a new document file to the index
- addDocument(String, String) - Method in class cat.lump.ir.index.Indexer
-
Add a new document to the index.
- addEquals(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Adds v2 to the vector and stores the result in the vector itself.
- addEquals(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Adds the values to the vector, modifying its contents.
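The difference between the add and addEquals entries above (returning a result versus updating the vector in place) can be illustrated with a minimal sketch. The class VectorOpsSketch and its get() accessor are hypothetical stand-ins written for this illustration; they are not the library's Vector class, and the return types are assumptions.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a float vector; not the library's Vector class. */
final class VectorOpsSketch {
    private final float[] values;

    VectorOpsSketch(float[] values) {
        this.values = values.clone();
    }

    /** Non-mutating sum: returns a new array and leaves this vector untouched. */
    float[] add(float[] other) {
        requireSameLength(other);
        float[] result = new float[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = values[i] + other[i];
        }
        return result;
    }

    /** In-place sum: overwrites this vector's contents with the result. */
    void addEquals(float[] other) {
        requireSameLength(other);
        for (int i = 0; i < values.length; i++) {
            values[i] += other[i];
        }
    }

    /** Returns a defensive copy of the current contents. */
    float[] get() {
        return values.clone();
    }

    private void requireSameLength(float[] other) {
        if (other.length != values.length) {
            throw new IllegalArgumentException("Length mismatch");
        }
    }

    public static void main(String[] args) {
        VectorOpsSketch v = new VectorOpsSketch(new float[] {1f, 2f});
        float[] sum = v.add(new float[] {3f, 4f});
        // After add, v is unchanged; only the returned array holds the sum.
        System.out.println(Arrays.toString(sum) + " " + Arrays.toString(v.get()));
        v.addEquals(new float[] {3f, 4f});
        // After addEquals, v itself holds the sum.
        System.out.println(Arrays.toString(v.get()));
    }
}
```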
- addPage(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Adds a page ID to the set of page IDs
- addPage(Integer) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Adds a page to preprocess.
- addPages(Collection<Integer>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Adds a collection of pages to the set of page IDs
- addPages(Collection<Integer>) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Adds a set of pages to preprocess
- addPair(int, File, int, File) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
Add an article pair to the list with similarities
- addPair(int, File, int, File) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
Add an article pair to the list
- addPairs(List<SimilarityPair>) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
Add a new collection of pairs
- addPairs(List<SimilarityPair>) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
Add a new collection of pairs
- addPreprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Adds a new preprocess to the available ones.
- addScoredCategory(GroupOfCategories.ScoredCategory) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
Adds a new ScoredCategory to the group.
- addString(String) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
-
Add the term to the dictionary (if it was not there yet).
- addTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
-
Add a term into the collection.
- addTerms(Collection<String>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Add new terms from a collection of words.
- addTerms(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Adds new terms, or updates those already added, with the tokens obtained by
preprocessing the given text.
- addTerms(List<String>) - Method in class cat.lump.ir.weighting.TermFrequency
-
Add these terms into the collection.
- addVector(Vector, String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
Add a new vector into the matrix.
- admitShorterNgrams - Variable in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
-
Flag to admit texts shorter than n
- analyzer - Static variable in class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
-
- analyzer - Static variable in class cat.lump.ir.lucene.engine.AnalyzerFactory
-
- analyzer - Variable in class cat.lump.ir.lucene.query.LuceneTokenizer
-
- AnalyzerFactory - Class in cat.lump.ir.lucene.engine
-
Factory for obtaining a Lucene Analyzer with all the required preprocessing.
TODO: this duplicates code in textextraction; maybe we should unify them.
- AnalyzerFactory() - Constructor for class cat.lump.ir.lucene.engine.AnalyzerFactory
-
- AnalyzerFactoryLucene - Class in cat.lump.ie.textprocessing.word
-
Factory that allows for getting a Lucene Analyzer and a stemmer for the
required language (if available)
- AnalyzerFactoryLucene() - Constructor for class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
-
- appendStringToFile(File, String, boolean) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- argmax() - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
- argmin() - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
- Article - Class in cat.lump.ir.retrievalmodels.similarity
-
An article object is the abstraction of a document to translate, together with its
related information.
- Article(String, RepresentationType) - Constructor for class cat.lump.ir.retrievalmodels.similarity.Article
-
Creates an undefined article together with its language and type of
representation.
- Article(String, String) - Constructor for class cat.lump.ir.retrievalmodels.similarity.Article
-
Creates an article with text and language.
- Article(String, String, RepresentationType) - Constructor for class cat.lump.ir.retrievalmodels.similarity.Article
-
Creates an article with text, language and type of representation.
- ArticlePair - Class in cat.lump.aq.basics.structure
-
This class stores a pair of Wikipedia articles covering the same topic in
different languages.
- ArticlePair() - Constructor for class cat.lump.aq.basics.structure.ArticlePair
-
- ArticlePair(int, String, int, String) - Constructor for class cat.lump.aq.basics.structure.ArticlePair
-
TODO this is added for the alignment interface.
- ArticleSelector - Class in cat.lump.aq.textextraction.wikipedia.categories
-
This class extracts all the articles that belong to a given category in
Wikipedia
- ArticleSelector(File, Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
-
Constructor.
- ArticlesSimilarity - Class in cat.lump.aq.textextraction.wikipedia.fragments
-
- ArticlesSimilarity(String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- ArticlesSimilarity - Class in cat.lump.ir.comparison.toCheck
-
- ArticlesSimilarity(String, String) - Constructor for class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- ArticlesTFs - Class in cat.lump.aq.textextraction.wikipedia.utilities
-
A class to calculate the term frequencies (TFs) of all terms in a document from an already
extracted Wikipedia (WP) edition.
- ArticlesTFs(String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTFs
-
Constructor for the class.
- ArticlesTranslator - Class in cat.lump.aq.textextraction.wikipedia.utilities
-
A class to translate all the articles from L1 into L2 in a folder with the structure
of Wikicardi: path/plain/L1/index/id.L1.txt
The index files with the positions of the articles and their lengths are required.
- ArticlesTranslator(String[], String) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
- ArticleTextExtractor - Class in cat.lump.aq.textextraction.wikipedia.categories
-
This class provides methods to load a list of Wikipedia article IDs and
preprocess them.
- ArticleTextExtractor(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Creates a preprocessor without any page to preprocess.
- ArticleTextExtractor(Locale, int, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Creates a preprocessor with the pages listed in listOfPages.
- calculate(Similarity, File) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- calculate(Similarity, File, int) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- calculate(Similarity, File) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- calculate(Similarity, File, int) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- calculate(File, File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Calculates the matrix of similarities for the given files.
- calculate(File, File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculatorLenFact
-
Calculates the matrix of similarities for the given files.
- calculateInvIndex() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Generate the inverted index of the articles
- calculateMatrix(Article, Article) - Method in class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
-
- calculateMatrix(Article, Article) - Method in class cat.lump.ir.retrievalmodels.similarity.JaccardSimilarity
-
- calculateMatrix(Article, Article) - Method in interface cat.lump.ir.retrievalmodels.similarity.SimilarityModel
-
- calculateSimilarityMatrix() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Calculates the resulting matrix
- calculateSimilarityMatrix() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculatorLenFact
-
- calculateTFs(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTFs
-
Estimates the TF for all the terms in the given file and writes the
resulting List<TermFrequencyTuple> to a file with the same name as
the text file, but in folder "tfs" instead of "plain".
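As a rough illustration of the calculateTFs entry above, the sketch below counts raw term frequencies and derives an output path by swapping the "plain" folder for "tfs". Everything here is an assumption made for the example: the class TermFrequencySketch, whitespace tokenisation, lowercasing, and the "/plain/" path segment. The library presumably applies its own preprocessing and serialisation, which this sketch does not attempt to reproduce.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper; not part of the indexed library. */
final class TermFrequencySketch {

    /** Counts how often each whitespace-separated, lowercased token occurs. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Maps a path under ".../plain/..." to the same path under ".../tfs/..." (assumed layout). */
    static String tfPath(String plainPath) {
        return plainPath.replace("/plain/", "/tfs/");
    }

    public static void main(String[] args) {
        System.out.println(termFrequencies("the cat and the dog"));
        System.out.println(tfPath("wiki/plain/en/12.en.txt"));
    }
}
```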
- CAN_READ(File, String) - Static method in class cat.lump.aq.basics.check.F
-
Throws an error if the file cannot be read.
- cat.lump.aq.basics.algebra.matrix - package cat.lump.aq.basics.algebra.matrix
-
- cat.lump.aq.basics.algebra.vector - package cat.lump.aq.basics.algebra.vector
-
- cat.lump.aq.basics.check - package cat.lump.aq.basics.check
-
- cat.lump.aq.basics.io.files - package cat.lump.aq.basics.io.files
-
- cat.lump.aq.basics.log - package cat.lump.aq.basics.log
-
- cat.lump.aq.basics.structure - package cat.lump.aq.basics.structure
-
- cat.lump.aq.basics.structure.ir - package cat.lump.aq.basics.structure.ir
-
- cat.lump.aq.basics.structure.standard - package cat.lump.aq.basics.structure.standard
-
- cat.lump.aq.textextraction.wikipedia - package cat.lump.aq.textextraction.wikipedia
-
- cat.lump.aq.textextraction.wikipedia.categories - package cat.lump.aq.textextraction.wikipedia.categories
-
- cat.lump.aq.textextraction.wikipedia.cli - package cat.lump.aq.textextraction.wikipedia.cli
-
- cat.lump.aq.textextraction.wikipedia.experiments - package cat.lump.aq.textextraction.wikipedia.experiments
-
- cat.lump.aq.textextraction.wikipedia.fragments - package cat.lump.aq.textextraction.wikipedia.fragments
-
- cat.lump.aq.textextraction.wikipedia.io - package cat.lump.aq.textextraction.wikipedia.io
-
- cat.lump.aq.textextraction.wikipedia.prepro - package cat.lump.aq.textextraction.wikipedia.prepro
-
- cat.lump.aq.textextraction.wikipedia.utilities - package cat.lump.aq.textextraction.wikipedia.utilities
-
- cat.lump.aq.wikilink - package cat.lump.aq.wikilink
-
- cat.lump.aq.wikilink.config - package cat.lump.aq.wikilink.config
-
- cat.lump.aq.wikilink.connexion - package cat.lump.aq.wikilink.connexion
-
- cat.lump.aq.wikilink.jwpl - package cat.lump.aq.wikilink.jwpl
-
- cat.lump.ie.textprocessing - package cat.lump.ie.textprocessing
-
- cat.lump.ie.textprocessing.ner - package cat.lump.ie.textprocessing.ner
-
- cat.lump.ie.textprocessing.ngram - package cat.lump.ie.textprocessing.ngram
-
- cat.lump.ie.textprocessing.sentence - package cat.lump.ie.textprocessing.sentence
-
- cat.lump.ie.textprocessing.stopwords - package cat.lump.ie.textprocessing.stopwords
-
- cat.lump.ie.textprocessing.transform - package cat.lump.ie.textprocessing.transform
-
- cat.lump.ie.textprocessing.word - package cat.lump.ie.textprocessing.word
-
- cat.lump.ir.comparison - package cat.lump.ir.comparison
-
- cat.lump.ir.comparison.toCheck - package cat.lump.ir.comparison.toCheck
-
- cat.lump.ir.index - package cat.lump.ir.index
-
- cat.lump.ir.lucene - package cat.lump.ir.lucene
-
- cat.lump.ir.lucene.cli - package cat.lump.ir.lucene.cli
-
- cat.lump.ir.lucene.engine - package cat.lump.ir.lucene.engine
-
- cat.lump.ir.lucene.index - package cat.lump.ir.lucene.index
-
- cat.lump.ir.lucene.index.analyzers - package cat.lump.ir.lucene.index.analyzers
-
- cat.lump.ir.lucene.query - package cat.lump.ir.lucene.query
-
- cat.lump.ir.retrievalmodels.document - package cat.lump.ir.retrievalmodels.document
-
- cat.lump.ir.retrievalmodels.similarity - package cat.lump.ir.retrievalmodels.similarity
-
- cat.lump.ir.sim - package cat.lump.ir.sim
-
- cat.lump.ir.sim.cl.clesa - package cat.lump.ir.sim.cl.clesa
-
- cat.lump.ir.sim.cl.len - package cat.lump.ir.sim.cl.len
-
- cat.lump.ir.sim.ml.esa - package cat.lump.ir.sim.ml.esa
-
- cat.lump.ir.weighting - package cat.lump.ir.weighting
-
- CategoryDepth - Class in cat.lump.aq.textextraction.wikipedia.categories
-
Class that automatises the process of selecting how deep within the category tree
one must go to extract articles from a given domain.
- CategoryDepth(File, double, int, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
-
- CategoryExplorer - Class in cat.lump.aq.textextraction.wikipedia.categories
-
The CategoryExplorer class is used to explore the categories of
Wikipedia.
- CategoryExplorer(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Creates a new explorer related to the Wikipedia dump defined by its
language and year.
- CategoryExtractor - Class in cat.lump.aq.textextraction.wikipedia.categories
-
This class extracts all the subcategories from an indicated category in
Wikipedia
TODO build junit
- CategoryExtractor(Locale, int, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
Default (non-verbose) invocation.
- CategoryExtractor(Locale, int, boolean, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
Invocation in which verbosity is set.
- CategoryNameStats - Class in cat.lump.aq.textextraction.wikipedia.categories
-
This class computes the percentage of categories that are claimed to belong
to a concrete domain from a category tree.
- CategoryNameStats(Locale) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
- CategoryTreeNode - Class in cat.lump.aq.textextraction.wikipedia.categories
-
This class stores all the relevant information about a classified
category.
- CategoryTreeNode(Category, int, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Constructor.
- changeFileSuffix(File, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- changeFileSuffix(String, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
Given a filename, whether relative or absolute, it substitutes the suffix
for a newSuffix.
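The suffix substitution described for changeFileSuffix(String, String) can be sketched as follows. This is an illustration only: the class SuffixSketch is hypothetical, and it assumes that newSuffix is given without a leading dot, that a name without a suffix simply gets one appended, and that a leading dot (as in a hidden file) does not count as a suffix separator. The actual FileIO behaviour in these edge cases is not documented in this index.

```java
/** Hypothetical helper; not the library's FileIO class. */
final class SuffixSketch {

    /** Replaces the text after the last dot of the file name with newSuffix. */
    static String changeSuffix(String fileName, String newSuffix) {
        // Position of the last path separator, so dots in directory names are ignored.
        int sep = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        int dot = fileName.lastIndexOf('.');
        // Only treat the dot as a suffix separator if it lies inside the file name proper.
        String stem = (dot > sep + 1) ? fileName.substring(0, dot) : fileName;
        return stem + "." + newSuffix;
    }

    public static void main(String[] args) {
        System.out.println(changeSuffix("/data/plain/12.en.txt", "tf"));
        System.out.println(changeSuffix("dir.v1/notes", "tf"));
    }
}
```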
- CharacterNgrams - Class in cat.lump.ie.textprocessing.ngram
-
- CharacterNgrams(int) - Constructor for class cat.lump.ie.textprocessing.ngram.CharacterNgrams
-
- CharacterNgrams(int, Boolean) - Constructor for class cat.lump.ie.textprocessing.ngram.CharacterNgrams
-
- CHECK(boolean) - Static method in class cat.lump.aq.basics.check.CHK
-
Throws CheckFailedError if false.
- CHECK(boolean, String) - Static method in class cat.lump.aq.basics.check.CHK
-
Throws CheckFailedError if false, displaying the required message.
- CHECK_NOT_NULL(Object) - Static method in class cat.lump.aq.basics.check.CHK
-
Check that the given object is not null; throws a CheckFailedError
if it is
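The checking pattern behind the CHECK and CHECK_NOT_NULL entries above can be sketched like this. The class CheckSketch, its nested CheckFailed error and the message texts are hypothetical names chosen for the illustration; only the idea of throwing an Error subclass when a condition fails comes from the descriptions above.

```java
/** Hypothetical illustration of a fail-fast checking utility; not the library's CHK class. */
final class CheckSketch {

    /** Stand-in for an unchecked error raised when a check fails. */
    static final class CheckFailed extends Error {
        private static final long serialVersionUID = 1L;

        CheckFailed(String message) {
            super(message);
        }
    }

    /** Throws CheckFailed with the given message when the condition is false. */
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new CheckFailed(message);
        }
    }

    /** Throws CheckFailed when the object is null. */
    static void checkNotNull(Object obj) {
        check(obj != null, "Unexpected null value");
    }

    public static void main(String[] args) {
        check(args.length >= 0, "never fails");
        try {
            checkNotNull(null);
        } catch (CheckFailed e) {
            System.out.println("Check failed: " + e.getMessage());
        }
    }
}
```

Extending Error rather than Exception means callers are not expected to catch the failure in normal operation, which suits checks for conditions that indicate a programming mistake.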
- checkAllTablesAvailable() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
Checks if all the tables needed are in the database.
- checkAllTablesAvailable() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
Checks if all the tables needed are in the database.
- CheckFailedError - Error in cat.lump.aq.basics.check
-
- CHK - Class in cat.lump.aq.basics.check
-
A class that contains checking methods.
- CHK() - Constructor for class cat.lump.aq.basics.check.CHK
-
- close() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
Closes the connection
- close() - Method in class cat.lump.ir.lucene.index.LuceneIndexer
-
Closes the Lucene index
- close() - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
Closes the Lucene index
- closeConnection() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
- closeConnection() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
- closeStatement() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
- COLLECTION_FILE - Variable in class cat.lump.ir.index.Abstracter
-
Name for the output documents' object file
- command - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
- command - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
- CommonArticlesFinder - Class in cat.lump.aq.textextraction.wikipedia.utilities
-
A class to identify the common articles across n languages in Wikipedia
from files with the list of IDs for every language.
- CommonArticlesFinder(String[], int, String[], File) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
Instantiates the object with the provided languages.
- CommonCategoriesExtractor - Class in cat.lump.aq.textextraction.wikipedia.utilities
-
A class to identify the common categories across n languages in Wikipedia.
- CommonCategoriesExtractor(String[], String, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.CommonCategoriesExtractor
-
Instantiates the object with the provided languages.
- CommonNamespaceFinder - Class in cat.lump.aq.textextraction.wikipedia.utilities
-
A class to identify the common articles (namespace=0) and categories (namespace=14)
across n languages in Wikipedia from files with the list of IDs for every language.
- CommonNamespaceFinder(String[], int, String[], File) - Constructor for class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
Instantiates the object with the provided languages.
- compareTo(TermFrequencyTuple) - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
- compareTo(Pair<S, T>) - Method in class cat.lump.aq.basics.structure.Pair
-
- compareTo(SimilarityPair) - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
-
- compute(Vector, Vector) - Method in interface cat.lump.ir.retrievalmodels.similarity.SimilarityMeasure
-
- compute(Vector, Vector) - Method in class cat.lump.ir.retrievalmodels.similarity.VectorCosine
-
Computes the cosine similarity measure between two vectors
sim(v1,v2) = (v1 * v2) / (|v1||v2|)
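The formula above, sim(v1,v2) = (v1 * v2) / (|v1||v2|), can be written out as a short sketch over plain float arrays. The class CosineSketch is hypothetical and does not use the library's Vector type; returning 0 for a zero-norm vector is an assumption of this sketch, since the index does not say how VectorCosine treats that case.

```java
/** Hypothetical illustration of cosine similarity; not the library's VectorCosine class. */
final class CosineSketch {

    /** Dot product of a and b divided by the product of their Euclidean norms. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        // Assumption: similarity with a zero vector is defined as 0 to avoid dividing by zero.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Parallel vectors give a value close to 1; orthogonal vectors give 0.
        System.out.println(cosine(new float[] {1f, 2f, 3f}, new float[] {2f, 4f, 6f}));
        System.out.println(cosine(new float[] {1f, 0f}, new float[] {0f, 1f}));
    }
}
```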
- computeCategoryStats(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
- computeCategoryStats(int, int, String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
-
- computeLengths(File) - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
-
Opens the file and computes lengths for every line within
- computeNorm(String, FieldInvertState) - Method in class cat.lump.ir.lucene.index.TFSimilarity
-
- computePairwiseSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
-
Compute only similarities for the matrix diagonal
- computeRanking(File) - Method in class cat.lump.ir.index.Querier
-
Queries the index with the contents of a text file.
- computeRanking(String) - Method in class cat.lump.ir.index.Querier
-
Queries the index with a text.
- computeSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
-
Computes the ESA-based similarities between the previously loaded
documents.
- computeSimilarities() - Method in interface cat.lump.ir.sim.Similarity
-
Compute the similarity between all the texts in the collection
- computeSimilarity(String, String) - Method in class cat.lump.ir.sim.ml.esa.Esa
-
Computes the similarity between two specific documents.
- computeSimilarity(String, String) - Method in interface cat.lump.ir.sim.Similarity
-
Compute the similarity between two specific texts
- computeStats() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
Calculates and returns the percentages.
- computeTF() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Gets the term frequency tuples resulting from the treatment of a set of
TODO this should be private!!
- computeTF(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
As computeTF() but including the title of the root category
- computeVector(String) - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
-
Computes the ESA vector representation for the given text.
- computeVectors(File, String, String) - Method in class cat.lump.ir.sim.cl.clesa.SimilarityCLESA
-
- computeVectors(File, String, String) - Method in class cat.lump.ir.sim.cl.clesa.SimilarityCLESAdocs
-
- computeVectors(File, String, String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Computes the vectors for the texts in the given set.
- computeVectors(File, String, String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAdocs
-
Computes the vectors for the texts in the given set.
- computeVectors(File, String, String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
-
- computeVectorsA() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Compute the characteristic vectors for dataset A
- computeVectorsA() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
-
- computeVectorsB() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Compute the characteristic vectors for dataset B
- computeVectorsB() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
-
- contains(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Checks if a term is contained in the vocabulary.
- containsKey(String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
- containsPunctuation(String) - Static method in class cat.lump.ie.textprocessing.sentence.Punctuation
-
- coord(int, int) - Method in class cat.lump.ir.lucene.index.TFSimilarity
-
- copy(File, File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- CorrelationsxCategory - Class in cat.lump.aq.textextraction.wikipedia.experiments
-
Wikiparable: Experiment 2 for evaluation.
- CorrelationsxCategory(String, String, String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.experiments.CorrelationsxCategory
-
- cosinePerFragment(Article, Article) - Method in class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
-
Compute the cosine similarity for all the fragments (e.g. sentences).
- CosineSimilarity - Class in cat.lump.ir.retrievalmodels.similarity
-
- CosineSimilarity() - Constructor for class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
-
- createCategoryVocabulary(Category) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Creates the vocabulary related to the given category.
- createConnection() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
Creates the connection to the database
- createDir(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- createRepresentations() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Creates the representations of the source text and the target text
- createScoredCategory(Category, Category, int) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- createScoredCategory(Category, Category, int, boolean) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- CroatianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
-
- CroatianAnalyzer() - Constructor for class cat.lump.ir.lucene.index.analyzers.CroatianAnalyzer
-
- csv2matrix(File) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
-
Reads a CSV file and returns its contents as a 2-dimensional array of
Strings.
- csv2matrix(File, String) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
-
Reads a CSV file.
- csvFileToList(File) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
-
- csvFileToList(File, String) - Static method in class cat.lump.aq.basics.io.files.CsvFoolReader
-
- CsvFoolReader - Class in cat.lump.aq.basics.io.files
-
A simple reader for CSV files.
- CsvFoolReader() - Constructor for class cat.lump.aq.basics.io.files.CsvFoolReader
-
- CsvFoolWriter - Class in cat.lump.aq.basics.io.files
-
A simple writer for CSV files.
- CsvFoolWriter() - Constructor for class cat.lump.aq.basics.io.files.CsvFoolWriter
-
- CsvFoolWriter(String) - Constructor for class cat.lump.aq.basics.io.files.CsvFoolWriter
-
- Decomposition - Interface in cat.lump.ie.textprocessing
-
- decrement() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
Decrements the number of occurrences of the term by one unit.
- deleteDir(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
Deletes all files and subdirectories under "dir".
- deleteFile(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- Diacritics - Class in cat.lump.ie.textprocessing.sentence
-
This class is in fact a link to ICU's normalizer.
- Diacritics() - Constructor for class cat.lump.ie.textprocessing.sentence.Diacritics
-
Default invocation.
- Diacritics(Boolean) - Constructor for class cat.lump.ie.textprocessing.sentence.Diacritics
-
Whether the texts are going to be case-folded must be defined
at invocation time.
- Dictionary - Class in cat.lump.ir.retrievalmodels.document
-
- Dictionary() - Constructor for class cat.lump.ir.retrievalmodels.document.Dictionary
-
- dictionary - Variable in class cat.lump.ir.retrievalmodels.document.Document
-
Internal dictionary that links terms to their numerical representation
- dimension - Variable in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
-
dimension of the matrix.
- dirCanBeRead(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
Check whether the directory exists and can be read.
- display() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
- display() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
Display only estimation
- displaySimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
-
- displaySimilarities() - Method in interface cat.lump.ir.sim.Similarity
-
Prints a matrix including all the similarities
- displayVerbose() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
Display estimation, source and target sentences
- div(int, int, double) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
Divides the current value of the position given by row and
col by the indicated value
- divide(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Divides the vector by a scalar and returns the resulting array.
- divideEquals(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Divides the vector by a scalar and updates its value internally.
- doc2WeightQuery(String) - Method in class cat.lump.ir.lucene.query.Document2Query
-
Generates a query in which the tokens' relevance depends on their frequency.
- docCollection - Variable in class cat.lump.ir.index.Abstracter
-
Internal collection of documents
- Document - Class in cat.lump.ir.retrievalmodels.document
-
A frame to represent a text document and its fragments (in the form of
sentences).
- Document(String, Locale) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
-
- Document(String[], Locale) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
-
- Document(String, Locale, boolean, boolean, boolean, boolean) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
-
- Document(String[], Locale, boolean, boolean, boolean, boolean) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
-
- Document(String, Locale, boolean, boolean, boolean, boolean, int) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
-
- Document(String[], Locale, boolean, boolean, boolean, boolean, int) - Constructor for class cat.lump.ir.retrievalmodels.document.Document
-
- Document2Query - Class in cat.lump.ir.lucene.query
-
The contents of a document are processed to be in the
right format for Lucene querying.
- Document2Query() - Constructor for class cat.lump.ir.lucene.query.Document2Query
-
- Document2Query(Locale) - Constructor for class cat.lump.ir.lucene.query.Document2Query
-
- documentNumber() - Method in class cat.lump.ir.index.Index
-
- documentsExist(String, String) - Method in class cat.lump.ir.sim.ml.esa.Esa
-
Checks whether both documents exist already in the corresponding vector.
- DomainKeywords - Class in cat.lump.aq.textextraction.wikipedia.categories
-
This class gets the most common terms in the articles belonging to, at
least, one category of a given domain.
- DomainKeywords(Locale, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
- DomainVocabulary - Class in cat.lump.aq.textextraction.wikipedia.categories
-
A DomainVocabulary instance is used to store a set of terms with their
frequencies.
- DomainVocabulary(Locale) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Creates an empty vocabulary.
- DomainVocabulary(Locale, Collection<TermFrequencyTuple>) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Creates a vocabulary which includes some initial terms with an initial
frequency.
- DomainVocabulary(DomainVocabulary) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Creates a new vocabulary identical to the given one.
- dotProduct(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Computes the dot product between the vector and vector v2.
- dotProduct(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Computes the dot product between the vector and an array of floats.
- Dump - Class in cat.lump.aq.wikilink.config
-
The years for which we have dumps available
- Dump(Locale, int) - Constructor for class cat.lump.aq.wikilink.config.Dump
-
Locale and year are set.
- DynamicMatrix - Class in cat.lump.aq.basics.algebra.matrix
-
Abstract class for the creation of dynamic matrices of different types.
- DynamicMatrix() - Constructor for class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
-
- DynamicMatrixOfVectors - Class in cat.lump.aq.basics.algebra.matrix
-
A matrix that can grow as required.
- DynamicMatrixOfVectors() - Constructor for class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
Initialises the matrix (with size=1) and sets
the dimension, which cannot be changed afterwards
- equals(Object) - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
- equals(Object) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
- error(String) - Method in class cat.lump.aq.basics.log.LumpLogger
-
- errorEnd(String) - Method in class cat.lump.aq.basics.log.LumpLogger
-
Stops the program execution giving an error message.
- Esa - Class in cat.lump.ir.retrievalmodels.document
-
- Esa() - Constructor for class cat.lump.ir.retrievalmodels.document.Esa
-
- Esa - Class in cat.lump.ir.sim.ml.esa
-
- Esa() - Constructor for class cat.lump.ir.sim.ml.esa.Esa
-
- esaGen - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
A generator of ESA vectorial representations
- EsaGenerator - Class in cat.lump.ir.sim.ml.esa
-
A class that converts a text (or a collection of texts)
into its ESA vector representation.
- EsaGenerator(File, String) - Constructor for class cat.lump.ir.sim.ml.esa.EsaGenerator
-
Creates an instance of the EsaGenerator by loading the
index and the analyzer for the required language.
TODO: decide whether the index path should be established by
default depending on the language.
- EsaVectors - Class in cat.lump.ir.sim.ml.esa
-
Set of vector representations of a set of texts.
- EsaVectors(int) - Constructor for class cat.lump.ir.sim.ml.esa.EsaVectors
-
At invocation time the index and the initial matrix of vectors are
generated.
- esaVectorsA - Variable in class cat.lump.ir.sim.ml.esa.Esa
-
Instance of an ESA vector for the set of documents A.
- esaVectorsB - Variable in class cat.lump.ir.sim.ml.esa.Esa
-
Instance of an ESA vector for the set of documents B (see the description
for esaVectorsA).
- estimate() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
- estimateMatrix() - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
Estimates the length factors of the sentences among them.
- EstonianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
-
- EstonianAnalyzer(Version) - Constructor for class cat.lump.ir.lucene.index.analyzers.EstonianAnalyzer
-
- exist(String) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
-
- existsFile(File, TypePreprocess, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
-
Checks if a file related to a page exists on the given root directory.
- existTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
-
- exitError(HelpFormatter, String) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
Finishes the process with the CLI help and an error message.
- exitError(HelpFormatter, String) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Finishes the process with the CLI help and an error message.
- exitError(String) - Method in class cat.lump.ir.sim.ml.esa.Esa
-
- exitHelp(HelpFormatter) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
Exit displaying the CLI help
- exitHelp(HelpFormatter) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Exit displaying the CLI help
- extract(short) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
-
Extracts the articles associated to the indicated categories.
- extract(short) - Method in class cat.lump.aq.textextraction.wikipedia.categories.BackArticleSelector
-
Extracts the articles associated to the indicated categories.
- extractCategories(int, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
- extractCategories(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
-
- extractDomainKeywords(String, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
Obtains the vocabulary that represents the given category in the
Wikipedia edition for the configured language and year.
- extractDomainKeywords(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
-
Obtains the vocabulary that represents the given category in the
Wikipedia edition for the configured language and year.
- extractDomainKeywords(String, int, String, String) - Method in class cat.lump.ir.lucene.Xecutor
-
Obtains the vocabulary that represents the given category in the
Wikipedia edition for the configured language and year.
- extractEntireWikipedia(Locale, int, File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Extracts the entire Wikipedia for the given language and year.
- extractSpecificArticles(Locale, int, Integer[], File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
- extractSpecificArticles(Locale, int, File, File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Extract only the articles specified in the pagesFile
- extractTexts(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
- extractTexts() - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
-
- generateCategoryTree(Category, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
Creates a tree of categories with the root category as root
and all its subcategories allocated by levels of depth.
- generateInvIndex() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
Loads the inverted index from the text as well as the corresponding
magnitudes
- generateSetsCommon(String, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
Looks for the index files in the given folder and generates temporary (tokenised) files
to be translated, containing only the articles that also appear in the file commonArticles.
- generateSetsFullFolder(HashSet<Integer>, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
Looks for the index files in the given folder and generates temporary (tokenised) files
to be translated.
- get(int) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
-
- get(int) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
Get the vector from the specified position.
- get() - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
- get(int) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
- get(String) - Method in class cat.lump.ie.textprocessing.transform.Transliteratorr
-
- get(String) - Method in class cat.lump.ir.index.Ranking
-
- get(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- get(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
-
- getAcronymLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getAll() - Method in class cat.lump.ir.weighting.TermFrequency
-
- getAllPairsDBname() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- getAllVectors() - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
- getAnalyzer() - Method in class cat.lump.ir.lucene.query.Document2Query
-
- getArticles() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
-
- getArticlesFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
-
- getAvailablePairs() - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
-
Retrieve the set of language pairs for which default parameters are
available.
- getAvailablePreprocesses() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
- getBibliographyLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getBibliographyLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getBottomK(int) - Method in class cat.lump.ir.index.Ranking
-
- getByTag(String) - Static method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
-
Searches the entry identified by the given name.
- getCategories() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- getCategory() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Returns the category.
- getCategory() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
- getCategory() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- getCategoryFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
-
- getCategoryID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
- getCategoryID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
- getCategoryID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
-
- getCategoryLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getCategoryLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getCategoryLabels() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
- getCategoryName() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
- getCategoryName() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
-
- getCategoryTree(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
- getCategoryTree() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
- getDBprefix() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
The prefix for the Wikipedia SQL dumps containing
page and langlinks tables.
- getDepth() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Returns the depth where this category has been found
- getDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
-
- getDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- getDepthLinear() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
-
Getters
- getDepthSplines() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
-
- getDictFile() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getDirectory() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getDisambiguationLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getDisambiguationLabelFep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getDistance() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
- getDocument(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexer
-
- getDocument(File) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
- getDocuments(String) - Method in class cat.lump.ir.index.Index
-
- getDocuments() - Method in class cat.lump.ir.index.Index
-
- getDomainCategories(List<GroupOfCategories>, double) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Obtains the categories of all the groups which compose a domain.
- getEArticlesFileName() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- getEnd() - Method in class cat.lump.ie.textprocessing.Span
-
- getExternalLinksLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getExternalLinksLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getFile(File, TypePreprocess, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
-
Creates the abstract path for a given page considering its ID, language
and the type of preprocess.
- getFile(File, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
-
Creates the abstract path for a given page considering its ID and
language
The path will be constructed as follows:
root/language/index/filename.txt where index is the
result of dividing pageID by and
filename is formed by concatenating the pageID and
language, separated by dots.
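The path layout described for getFile(File, String, int) can be sketched as follows. This is an illustration under stated assumptions, not FileManager's code: the divisor is missing from the description above, so it appears here as a hypothetical bucketSize parameter, and integer division is assumed. The class and method names are also invented for the example.

```java
import java.io.File;

// Hypothetical sketch of the root/language/index/filename.txt layout.
final class PagePathSketch {

    /**
     * Builds root/language/index/pageId.language.txt, where index is
     * pageId divided by bucketSize (an assumed, caller-supplied divisor).
     */
    static File pagePath(File root, String language, int pageId, int bucketSize) {
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucketSize must be positive");
        }
        int index = pageId / bucketSize; // integer division (assumption)
        String filename = pageId + "." + language + ".txt";
        File languageDir = new File(root, language);
        File indexDir = new File(languageDir, String.valueOf(index));
        return new File(indexDir, filename);
    }

    public static void main(String[] args) {
        // The bucket size of 1000 is chosen only for the demo.
        System.out.println(pagePath(new File("root"), "en", 12345, 1000));
    }
}
```

Grouping pages into index subdirectories keeps any single directory from holding every page file of a language.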
- getFileCommonArticles() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getFilenames(File, String, boolean) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
-
Retrieve the name of the preprocessed files in a language from the given
directory.
- getFilesExt(File, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
Gets all the files with a given extension ext
- getFilesID() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getFilesRecursively(File, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- getFilesRecursively(File, String, long, long) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- getFirstStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- getFirstStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getFirstStep() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getformatTitleGivenID(int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Retrieves the title of the article with ID id from table "page".
The connection must be ready beforehand.
- getformatWPtitle(ResultSet, String) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Given a ResultSet object, extracts the value in column colName and formats
it as a Wikipedia title, as expected to be found in table page.
- getFragment(int) - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- getFragmentSize(int) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- getFrequency() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
Returns the number of occurrences of the term
- getFrequency(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Gives the frequency of the given term.
- getFromFile(File, boolean) - Static method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
Loads a similarity matrix from a binary file.
- getFurhterReadingLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getFurhterReadingLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getiCategory() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- getiCategory() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getiCategory1() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getiCategory2() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getId(String) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
-
- getId(int) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
- getIdentifier() - Method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
-
- getImageLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getImageLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getImageLabels() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
- getIn() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
-
Getters
- getIn() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
-
Getters
- getIndex(String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
- getIndexDimension() - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
-
- getIndexDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getIndexDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
Getters
- getInDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- getInputDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getInstance(TypePreprocess, String, Locale, int, File) - Static method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Creates a preprocess method able to preprocess pages from the Wikipedia
dump identified by language and year.
- getInstance(String, String, Similarity) - Static method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Returns the suitable calculator according to the given similarity method.
- getInstance(String, String, Similarity, int) - Static method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Returns the suitable calculator according to the given similarity method.
- getInverseRank() - Method in class cat.lump.ir.index.Ranking
-
- getInvIndex() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- getJwplDBprefix() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- getJwplDBprefix() - Static method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
- getJwplLanguage(Locale) - Static method in class cat.lump.aq.wikilink.Languages
-
- getJwplLanguage(String) - Static method in class cat.lump.aq.wikilink.Languages
-
- getKey() - Method in class cat.lump.aq.basics.structure.Pair
-
- getLang() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
- getLangAll() - Static method in class cat.lump.aq.wikilink.Languages
-
- getLanglinksTableName(String, int) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Generates the name of the table langlinks as stored in the database for a
given language and year
- getLanguage() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
- getLanguage() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
- getLanguage() - Method in class cat.lump.aq.wikilink.config.Dump
-
- getLanguage() - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Getters
- getLanguage() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- getLanguage2() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
getters
- getLastStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- getLastStep() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getLastStep() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getLocale() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
- getLuceneTokenizer(Locale) - Static method in class cat.lump.ir.lucene.query.TokenizerFactory
-
- getMagnitude(String) - Method in class cat.lump.ir.index.Index
-
- getMagnitude(int) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
The magnitude of the given sentence
- getMatrix() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
-
- getMatrix() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
- getMatrix() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- getmaxDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
- getMaxDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
- getMaxVocab() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- getMean(String, String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
-
Get the mean for the desired pair.
- getMean(String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
-
Get the mean for the desired pair.
- getMethod() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getminDepth() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
- getMinimumSize() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
- getModel() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- getModel() - Method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
-
- getModel() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- getMu() - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
-
- getMulti_db() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- getMulti_table() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- getName() - Method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
-
- getNCols() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- getNextCoordenates() - Method in class cat.lump.ir.comparison.toCheck.SimilarityMatrixIterator
-
- getNextCoordenates() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrixIterator
-
- getnGrams() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- getNormalized(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- getNormalized(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
-
- getNotesLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getNotesLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getNRows() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- getNumberOfArticles() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Returns the number of articles that are categorized under this
category.
- getNumberOfChildren() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Returns the number of children (subcategories) of this category.
- getNumberOfParents() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Returns the number of parents (supercategories) of this category.
- getOriginalString() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
- getOut() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
-
- getOut() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
-
- getOutDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- getOutputDir() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
-
- getOutputDir() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- getOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
- getOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
-
- getOutputPath() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
-
- getOverThreshold(double) - Method in class cat.lump.ir.index.Ranking
-
- getPageID() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Returns the ID of the category in the Wikipedia edition being used.
- getPageTableName(String, int) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Generates the name of the table page as stored in the database for a
given language and year
- getPairs() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- getPairs() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- getPairsDBname() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- getPairwiseSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
-
- getParagraphsFromArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Obtains the paragraphs of a parsed article id
- getParagraphsFromArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Obtains the paragraphs of a parsed article title
- getParent() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
- getParsedArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Parses a Wikipedia article
- getParsedArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Parses a Wikipedia article
- getParser() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
TODO determine whether this parser should be returned.
- getPercentage() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
getters
- getPercentage() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
-
- getPercentage() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- getPlain() - Method in class cat.lump.ir.retrievalmodels.document.Fragment
-
- getPrefixOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
getters
- getPrefixOutputFile() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
getters
- getProperties(String) - Static method in class cat.lump.aq.textextraction.wikipedia.WikiProperties
-
Deprecated.
- getProperties(String) - Static method in class cat.lump.aq.textextraction.wikipedia.WTConfig
-
Loads the wikiTailor.ini config file
- getPropertyInt(String) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
- getPropertyInt(String) - Method in class cat.lump.aq.textextraction.wikipedia.WTConfig
-
Gets a value in the config file given a key and returns it as an integer
- getPropertyStr(String) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
- getPropertyStr(String) - Method in class cat.lump.aq.textextraction.wikipedia.WTConfig
-
Gets a value in the config file given a key and returns it as a String
- getRank() - Method in class cat.lump.ir.index.Ranking
-
- getRedirectLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getRedirectLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getRedirectTableName(String, int) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Generates the name of the table redirect as stored in the database for a
given language and year
- getReferencesLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getReferencesLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getRepresentation() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- getRepresentation() - Method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
-
- getRoot() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- getRootDirectory() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
- getsCategory() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
Getters
- getsCategory() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
Getters
- getsCategory1() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getsCategory2() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getScore() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- getSD(String, String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
-
Get the standard deviation for the desired pair.
- getSD(String) - Static method in class cat.lump.ir.sim.cl.len.LengthFactors
-
Get the standard deviation for the desired pair.
- getSectionsFromArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Obtains the sections of a parsed article
- getSectionsFromArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Obtains the sections of a parsed article
- getSeeAlsoLabel(Language) - Method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
- getSeeAlsoLabelDep(Language) - Static method in class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Deprecated.
- getSigma() - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
-
- getSimilarities() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- getSimilarities() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- getSimilarities() - Method in class cat.lump.ir.sim.ml.esa.Esa
-
- getSimilarities() - Method in interface cat.lump.ir.sim.Similarity
-
- getSimilaritiesMatrix() - Method in class cat.lump.ir.sim.ml.esa.Esa
-
- getSimilarity(int, int) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- getSimilarity(String, String) - Method in class cat.lump.ir.sim.ml.esa.Esa
-
Obtains the similarity between texts id_A and id_B.
- getSimilarity(String) - Method in class cat.lump.ir.sim.ml.esa.Esa
-
- getSimilarity(String, String) - Method in interface cat.lump.ir.sim.Similarity
-
Get the (previously computed) similarity between the two ids
- getSimilarityByName(String) - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
-
- getSimilarityByRepr(RepresentationType) - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
-
TODO ask Josu about this
- getSize() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- getSource() - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
-
- getSource() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- getSpans(String) - Method in interface cat.lump.ie.textprocessing.Decomposition
-
- getSpans(String[]) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
-
- getSpans(String) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
-
- getSpans(String) - Method in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
-
- getSpans(String) - Method in class cat.lump.ie.textprocessing.ngram.WordNgrams
-
- getSpans(String) - Method in class cat.lump.ie.textprocessing.sentence.SentencesOpennlp
-
- getSpans(String) - Method in class cat.lump.ie.textprocessing.word.WordDecompositionICU4J
-
- getSpecificDirs(File, String, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- getSpecificFilesRecursively(File, String, String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- getSpecificFilesRecursively(File, String, String, long, long) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- getSrcLang() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- getSrcLang() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- getStart() - Method in class cat.lump.ie.textprocessing.Span
-
- getStatsFile() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
- getStopWords() - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
-
- getString() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
- getString(int) - Method in class cat.lump.ir.retrievalmodels.document.Dictionary
-
- getStrings(String) - Method in interface cat.lump.ie.textprocessing.Decomposition
-
- getStrings(String[]) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
-
- getStrings(String) - Method in class cat.lump.ie.textprocessing.ner.NerOpennlp
-
- getStrings(String) - Method in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
-
- getStrings(String) - Method in class cat.lump.ie.textprocessing.ngram.WordNgrams
-
- getStrings(String) - Method in class cat.lump.ie.textprocessing.sentence.SentencesOpennlp
-
- getStrings(String) - Method in class cat.lump.ie.textprocessing.word.WordDecompositionICU4J
-
- getSubSectionsFromArticle(int) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Obtains the sub-sections of a parsed article
- getSubSectionsFromArticle(String) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Obtains the sub-sections of a parsed article
- getSubstring(String) - Method in class cat.lump.ie.textprocessing.Span
-
String in the current span
- getTarget() - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
-
- getTarget() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- getTerm() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
Returns the associated term
- getTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
-
- getTerms() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
- getTerms(String, int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.TermExtractor
-
Transforms the content of a String into a set of terms.
- getTermTuples() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
- getText() - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- getText() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- getTitle() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Returns the title of the category in Wikitext (without blanks).
- getTokens() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
- getTokenValues(String) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- getTop(float) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Extracts the most frequent terms of the vocabulary.
- getTop(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Extracts the qtt most frequent terms of the vocabulary.
- getTop() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- getTop() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
-
- getTop() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- getTop(int, int) - Method in class cat.lump.ir.weighting.TermFrequency
-
Subset of terms with the highest tf, up to top% or up to max.
Note that sometimes slightly more than the top% is returned.
- getTopK(int) - Method in class cat.lump.ir.index.Ranking
-
- getTopPlus(int, int, List<String>) - Method in class cat.lump.ir.weighting.TermFrequency
-
Note that sometimes slightly more than the top% is returned.
- getTopTuples() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
- getTopTuplesPlus(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
- getTranslation() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- getTrgLang() - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- getTrgLang() - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- getType() - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
- getType() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- getValue() - Method in class cat.lump.aq.basics.structure.Pair
-
- getVector(String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
Obtain the vector corresponding to this id
(null if it does not exist)
- getVector(int) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
Obtain the vector corresponding to this slot
- getVerbose() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
- getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
-
- getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
- getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
-
- getVerbosity() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- getVocabulary() - Method in class cat.lump.ir.index.Index
-
- getVocabulary() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- getWeighted(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- getWeighted(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
-
- getWikipediaConnector() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
- getYear() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
- getYear() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
- getYear() - Method in class cat.lump.aq.wikilink.config.Dump
-
- GroupOfCategories - Class in cat.lump.aq.textextraction.wikipedia.categories
-
A GroupOfCategories instance contains the scored categories from
Wikipedia which are related to another category, called the root category.
- GroupOfCategories(Category) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
Creates an empty group of categories related to the given root category.
- GroupOfCategories(File, WikipediaJwpl) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- GroupOfCategories.ScoredCategory - Class in cat.lump.aq.textextraction.wikipedia.categories
-
The ScoredCategory class enriches the
de.tudarmstadt.ukp.wikipedia.api.Category objects providing the
following information:
Parent: The first category which allows access to this one.
- GroupOfCategories.ScoredCategory(Category, Category, int) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
Creates a scored category which hasn't already been scored.
- GroupOfCategories.ScoredCategory(Category, Category, int, boolean) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
Creates a scored category defining if its score.
- GroupOfCategories.ScoredCategory(String, WikipediaJwpl) - Constructor for class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
- gZipToString(String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
Opens a gzipped file and returns the lines it contains
- idf(int, int) - Method in class cat.lump.ir.lucene.index.TFSimilarity
-
- idsToFile(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
-
Stores the IDs of the extracted articles into a file; one ID per line.
- INCLUDE_BOW - Variable in class cat.lump.ir.index.Abstracter
-
Representations to include
- INCLUDE_CNG - Variable in class cat.lump.ir.index.Abstracter
-
Representations to include
- INCLUDE_COG - Variable in class cat.lump.ir.index.Abstracter
-
Representations to include
- INCLUDE_WNG - Variable in class cat.lump.ir.index.Abstracter
-
Representations to include
- increment() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
Increments the number of occurrences of the term by one unit.
- index - Variable in class cat.lump.ir.index.Abstracter
-
Inverted index used to compute similarities
- Index - Class in cat.lump.ir.index
-
- Index() - Constructor for class cat.lump.ir.index.Index
-
- index() - Method in class cat.lump.ir.lucene.index.LuceneIndexer
-
- index(String, FileFilter) - Method in class cat.lump.ir.lucene.index.LuceneIndexer
-
Open an index and start file directory traversal
- index() - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
- index(String, FileFilter) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
Open an index and start file directory traversal
- INDEX_FILE - Variable in class cat.lump.ir.index.Abstracter
-
Prefix and suffix for the index-related files
- indexDir - Static variable in class cat.lump.ir.lucene.LuceneInterface
-
Directory where the Lucene index has to be stored
- indexEdition(String, String) - Method in class cat.lump.ir.lucene.Xecutor
-
Indexes the Wikipedia edition in language "locale" available at
inputDir, and outputs the indexes at indexDir.
- Indexer - Class in cat.lump.ir.index
-
A class that indexes a set of Documents on the basis of a given
representation.
- Indexer(Locale, File, RepresentationType[]) - Constructor for class cat.lump.ir.index.Indexer
-
Invocation where the documents' language is given and the directory
for the index is provided.
- Indexer(File, RepresentationType[]) - Constructor for class cat.lump.ir.index.Indexer
-
Default invocation where files are in English
- indexPath - Variable in class cat.lump.ir.index.Abstracter
-
Path to the index
- INFINITE_DISTANCE - Static variable in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
- info(String) - Method in class cat.lump.aq.basics.log.LumpLogger
-
Prints a log message
- insertFromFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Inserts the terms stored in the given file into the domain vocabulary.
- InvIndexContainer - Class in cat.lump.aq.basics.structure
-
A container for storing an inverted index.
- InvIndexContainer() - Constructor for class cat.lump.aq.basics.structure.InvIndexContainer
-
- isAcronym(Page) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Checks if the page is an acronym page
- isAvailable(String) - Static method in class cat.lump.ir.lucene.index.LuceneLanguages
-
Checks whether we can index a given language.
- isAvailablePreprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Checks whether the preprocess exists among the available preprocesses.
- isAvailableSimilarity(Similarity) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
Checks if the given similarity is known for this class
- isAvailableSimilarity(Similarity) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
Checks if the given similarity is known for this class
- isDisambiguation(Page) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Checks if this is a disambiguation page
- isDomain(Category, DomainVocabulary) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Checks if a category belongs to the domain defined by the given
vocabulary.
- isDomain(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
Checks if a category name "belongs" to the domain.
- isDomain() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
- isEmpty() - Method in class cat.lump.ir.index.Index
-
- isIndexAvailable(String) - Static method in class cat.lump.ir.lucene.index.LuceneLanguages
-
Checks whether a given language is indexed and ready to compute
similarities.
- isLanguageAvailable(String) - Static method in class cat.lump.aq.wikilink.Languages
-
- isRedirect(Page) - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Checks if the page is a redirect page
- isRedirect(int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Checks in the "page" table of the corresponding language and
year whether the id is a redirect.
- isStopword(String) - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
-
- iterator() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- LABEL - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
- LABEL - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
- lan - Variable in class cat.lump.ir.lucene.LuceneInterface
-
Language of the texts
- langToString() - Static method in class cat.lump.ir.lucene.index.LuceneLanguages
-
- language - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Language
- LanguageConstants - Class in cat.lump.aq.wikilink.jwpl
-
Includes the constant identifiers for Wikipedia labels in different languages.
- LanguageConstants() - Constructor for class cat.lump.aq.wikilink.jwpl.LanguageConstants
-
Opens the CONFIG_FILE and loads all the language constants for the
available languages.
- Languages - Class in cat.lump.aq.wikilink
-
A collection of all the available languages in Wikipedia up to 2015
- Languages() - Constructor for class cat.lump.aq.wikilink.Languages
-
- learn() - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
-
Estimates the mu and sigma parameters for the parallel corpus provided
- length() - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
- length() - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
- length() - Method in class cat.lump.ie.textprocessing.Span
-
- length() - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- length() - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- length() - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
- LengthFactors - Class in cat.lump.ir.sim.cl.len
-
It includes the default values for a bunch of language pairs, namely:
cz-en
de-en
en-cz
en-de
en-es
en-fr
en-ru
es-en
fr-en
ru-en
The parameters were estimated by txell on different corpora, including:
commoncrawl.wmt2013
CzEng.v1.0
el_periodico
europarl.v6
europarl.v7
FAUST_D4.2
French_treebank
newscommentary.v8
news.shuffled.en.conll.gz
news.shuffled.fr.conll.gz
patents
Romanian_treebank
UNdoc.2000
wiki-titles.ru-en
wmt10
wmt10.select
- LengthFactors() - Constructor for class cat.lump.ir.sim.cl.len.LengthFactors
-
- LengthModel - Class in cat.lump.ir.sim.cl.len
-
A class to estimate length models for a language pair.
- LengthModel() - Constructor for class cat.lump.ir.sim.cl.len.LengthModel
-
- LengthModelEstimate - Class in cat.lump.ir.sim.cl.len
-
A class to estimate the length factor between two texts according to
previously learnt parameters.
- LengthModelEstimate() - Constructor for class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
- LengthModelLearn - Class in cat.lump.ir.sim.cl.len
-
Class to learn the parameters of the length model from a parallel
corpus (two files).
- LengthModelLearn() - Constructor for class cat.lump.ir.sim.cl.len.LengthModelLearn
-
Invoke without setting the source and target files
(it's going to be done by the calling class)
- LengthModelLearn(File, File) - Constructor for class cat.lump.ir.sim.cl.len.LengthModelLearn
-
Invocation with the source and target files
- LithuanianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
-
- LithuanianAnalyzer() - Constructor for class cat.lump.ir.lucene.index.analyzers.LithuanianAnalyzer
-
- loadAnalyzer(Locale) - Static method in class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
-
Analyser from Lucene for different languages.
- loadAnalyzer(Locale) - Static method in class cat.lump.ir.lucene.engine.AnalyzerFactory
-
Analyser from Lucene for different languages.
- loadArticles(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Loads the articles of the given category ID.
- loadArticles(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Loads the articles of the given category.
- loadCategory(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Loads a category from Wikipedia by its title.
- loadCategory(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Loads a category from Wikipedia by its page ID.
- loadCategoryNames(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
Reads the file with the list of categories
TODO a loader from an object.
- loadDictionary(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
Reads the "Domain Key Words" dictionary
- loadDictionary(HashSet<String>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
- loadFile(File, TypePreprocess, String, int) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
-
Loads the file related to the given parameters in text format.
- loadfromFile(File) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Creates a DomainVocabulary instance by reading a binary file
which contains a domain vocabulary.
- loadIndex() - Method in class cat.lump.ir.index.Abstracter
-
Loads the index components (empty if new, with data if it existed
previously).
- loadIndex(Locale) - Method in class cat.lump.ir.lucene.query.LuceneQuerier
-
Loads the Lucene index (previously created) with the reference
corpus.
- loadIndex(Locale) - Method in class cat.lump.ir.lucene.query.LuceneQuerierWT
-
Loads the Lucene index (previously created) with the reference
corpus.
- loadIndex() - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
-
Loads the Lucene index (previously created) with the reference
corpus.
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
-
Loads additional options: category (numerical and string), percentage of
words required and output file
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
-
Loads additional options: category (numerical and string), percentage of
words required and output file
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
Loads additional options: category (numerical and string), percentage of
words required and output file
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
Loads additional options: percentage of categories with keywords, input file,
depth and category
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
Loads additional options: category (numerical and string), percentage of
words required and output file
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
-
Loads additional options: category (numerical and string), percentage of
words required and output file
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
Loads additional options: category (numerical and string), percentage of
words required and output file
- loadOptions() - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
Load the options for language, year, and help
- loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
Loads additional options: category (numerical and string), percentage of
words required and output file
- loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
-
Loads additional options
- loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Load the options for input, output, language and help
- loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
-
Loads additional options
- loadOptions() - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
Loads additional options
- loadPages(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Loads all the page IDs of the file list.
- loadPages(Integer[]) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
- loadPairs(File) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
Load the pairs of articles from a file
- loadPairs(File) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
Load the pairs of articles from a file
- loadStemFilter(Locale, Analyzer, String) - Static method in class cat.lump.ie.textprocessing.word.AnalyzerFactoryLucene
-
Stemmer from Lucene for different languages.
- loadStemFilter(Locale, Analyzer, String) - Static method in class cat.lump.ir.lucene.engine.AnalyzerFactory
-
Stemmer from Lucene for different languages.
- loadStemmer(Locale) - Static method in class cat.lump.ie.textprocessing.word.StemmerFactory
-
- loadWikipedia() - Method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Sets the image and category labels for the working language.
- locale - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
The Wikipedia language
- locale - Variable in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Language of the Wikipedia dump
- locale - Variable in class cat.lump.ir.index.Abstracter
-
Language of the index/query
- log - Static variable in class cat.lump.ir.sim.ml.esa.Esa
-
- log - Static variable in interface cat.lump.ir.sim.Similarity
-
Logger for the application
- logger - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
- logger - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Logs
- logger - Variable in class cat.lump.ir.lucene.LuceneInterface
-
- lookForMaximumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
Given the list of articles for every language String[] filesID, the
language from String[] langs with the most articles is returned.
- lookForMaximumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
Given the list of articles for every language String[] filesID, the
language from String[] langs with the most articles is returned.
- lookForMinimumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
Given the list of articles for every language String[] filesID, the
language from String[] langs with the fewest articles is returned.
- lookForMinimumNumber() - Method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
Given the list of articles for every language String[] filesID, the
language from String[] langs with the fewest articles is returned.
- LuceneCliCategoriesXecutor - Class in cat.lump.ir.lucene.cli
-
CLI to access the Xecutor pipeline for the WikiTailor IR-based
in-domain comparable corpora extraction.
- LuceneCliCategoriesXecutor() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- LuceneCliIndexerWT - Class in cat.lump.ir.lucene.cli
-
- LuceneCliIndexerWT() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
-
- LuceneCliMinimum0 - Class in cat.lump.ir.lucene.cli
-
CLI to access the Lucene-related classes for the Wikiparable project
- LuceneCliMinimum0() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Loads the logger and the available options (by calling loadOptions)
- LuceneCliQuerierWT - Class in cat.lump.ir.lucene.cli
-
- LuceneCliQuerierWT() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
-
- LuceneCliWT2Query - Class in cat.lump.ir.lucene.cli
-
CLI for WikiTailor2Query
- LuceneCliWT2Query() - Constructor for class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
- LuceneIndexer - Class in cat.lump.ir.lucene.index
-
An indexer based on Lucene in Action 2nd edition.
- LuceneIndexer(String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexer
-
Default invocation for English
- LuceneIndexer(Locale, String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexer
-
- LuceneIndexer.TextFilesFilter - Class in cat.lump.ir.lucene.index
-
- LuceneIndexer.TextFilesFilter() - Constructor for class cat.lump.ir.lucene.index.LuceneIndexer.TextFilesFilter
-
- LuceneIndexerAbstract - Class in cat.lump.ir.lucene.index
-
An indexer based on Lucene in Action 2nd edition.
- LuceneIndexerAbstract() - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerAbstract
-
- LuceneIndexerWT - Class in cat.lump.ir.lucene.index
-
This is an adaptation of class LuceneIndexer to be
used for Wikiparable
- LuceneIndexerWT(String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerWT
-
Default invocation for English
- LuceneIndexerWT(Locale, String, String) - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerWT
-
- LuceneIndexerWT.TextFilesFilter - Class in cat.lump.ir.lucene.index
-
- LuceneIndexerWT.TextFilesFilter() - Constructor for class cat.lump.ir.lucene.index.LuceneIndexerWT.TextFilesFilter
-
- LuceneInterface - Class in cat.lump.ir.lucene
-
An abstract class with the necessary data and methods to interact with
Lucene's indexer and querier modules.
- LuceneInterface(String) - Constructor for class cat.lump.ir.lucene.LuceneInterface
-
Set up
- LuceneLanguages - Class in cat.lump.ir.lucene.index
-
A collection of static methods that contain the available
languages for both ESA index construction and ESA-based
text characterization
TODO probably move to en_GB, en_US, en_CA, es_ES, es_MX
- LuceneLanguages() - Constructor for class cat.lump.ir.lucene.index.LuceneLanguages
-
- LuceneQuerier - Class in cat.lump.ir.lucene.query
-
- LuceneQuerier(String) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerier
-
- LuceneQuerierWT - Class in cat.lump.ir.lucene.query
-
Query into Lucene Indexes for WikiTailor.
- LuceneQuerierWT(String, float) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
-
Default invocation for English
- LuceneQuerierWT(String) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
-
- LuceneQuerierWT(String, String, float) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
-
- LuceneQuerierWT(String, String) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
-
- LuceneQuerierWT(Locale, String, float) - Constructor for class cat.lump.ir.lucene.query.LuceneQuerierWT
-
- LuceneTokenizer - Class in cat.lump.ir.lucene.query
-
A simple interface to perform tokenization through the Lucene
methods.
- LuceneTokenizer() - Constructor for class cat.lump.ir.lucene.query.LuceneTokenizer
-
- LuceneTokenizer(Locale) - Constructor for class cat.lump.ir.lucene.query.LuceneTokenizer
-
- LumpLogger - Class in cat.lump.aq.basics.log
-
A link to the different log4j configurations.
- LumpLogger(String) - Constructor for class cat.lump.aq.basics.log.LumpLogger
-
Initialise the logger with some label identifying the process
- magnitude() - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Computes the magnitude of the vector (aka norm).
- magnitudes - Variable in class cat.lump.ir.index.Index
-
- main(String[]) - Static method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
- main(String[]) - Static method in class cat.lump.aq.basics.log.LumpLogger
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Extract texts from Wikipedia articles and save them into text files, after
some given preprocessing.
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.BackArticleSelector
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryDepth
-
Main method.
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
Main method.
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Main method.
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.experiments.CorrelationsxCategory
-
Main function to run the class, serves as example
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.experiments.MeanTFxCategory
-
Main function to run the class, serves as example
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.fragments.Xecutor
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
-
Example to get files with filename filter
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.prepro.TermExtractor
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTFs
-
Main function to run the class, serves as example
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
Example for using the class
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonCategoriesExtractor
-
- main(String[]) - Static method in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
Example for using the class
- main(String[]) - Static method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
- main(String[]) - Static method in class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
In this example we run an instance of Wikipedia and display a couple of
articles' contents.
- main(String[]) - Static method in class cat.lump.ie.textprocessing.ner.NerOpennlp
-
- main(String[]) - Static method in class cat.lump.ie.textprocessing.ngram.CharacterNgrams
-
- main(String[]) - Static method in class cat.lump.ie.textprocessing.ngram.WordNgrams
-
- main(String[]) - Static method in class cat.lump.ie.textprocessing.TestTokenize
-
- main(String[]) - Static method in class cat.lump.ie.textprocessing.transform.Transformation
-
- main(String[]) - Static method in class cat.lump.ie.textprocessing.transform.Transliteratorr
-
- main(String[]) - Static method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- main(String[]) - Static method in class cat.lump.ir.index.Indexer
-
- main(String[]) - Static method in class cat.lump.ir.index.Querier
-
- main(String[]) - Static method in class cat.lump.ir.lucene.index.LuceneIndexer
-
- main(String[]) - Static method in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
- main(String[]) - Static method in class cat.lump.ir.lucene.query.LuceneQuerier
-
- main(String[]) - Static method in class cat.lump.ir.lucene.query.LuceneQuerierWT
-
- main(String[]) - Static method in class cat.lump.ir.lucene.query.WikiTailor2Query
-
- main(String[]) - Static method in class cat.lump.ir.lucene.Xecutor
-
- main(String[]) - Static method in class cat.lump.ir.sim.cl.len.LengthModel
-
Parses the input parameters and either learns a length model from a
collection or estimates the corresponding values for a set of texts
- MapUtil - Class in cat.lump.aq.basics.structure.standard
-
A class to sort a map according to its values.
- MapUtil() - Constructor for class cat.lump.aq.basics.structure.standard.MapUtil
-
- matrix2csv(String[][], File) - Method in class cat.lump.aq.basics.io.files.CsvFoolWriter
-
Reads a csv file and returns a 2-dimensional array of Strings
from it.
- matrix2csv(String[], String[][], File) - Method in class cat.lump.aq.basics.io.files.CsvFoolWriter
-
Reads a csv file.
- max() - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
- maxSimilaritiesIndex - Variable in class cat.lump.ir.retrievalmodels.similarity.CosineSimilarity
-
Index to the maximum similarity value for each text fragment in A
- maxSimilaritiesIndex - Variable in class cat.lump.ir.retrievalmodels.similarity.JaccardSimilarity
-
Index to the maximum similarity value for each text fragment in A
- md5(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- md5(String) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- MeanTFxCategory - Class in cat.lump.aq.textextraction.wikipedia.experiments
-
Wikiparable: Experiment 1 for evaluation.
- MeanTFxCategory(String, String, String, String) - Constructor for class cat.lump.aq.textextraction.wikipedia.experiments.MeanTFxCategory
-
- min() - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
- mirrorCommonArticles2Langs(String) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
Converts the file of common articles L1.icat1.L2.icat2.method
into its mirror file L2.icat2.L1.icat1.method.
- Model - Enum in cat.lump.ir.comparison
-
This enumeration contains the models of similarity implemented in this
package.
- model - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Model of similarity
- modifyVector(Vector, String) - Method in class cat.lump.ir.sim.ml.esa.EsaVectors
-
Very unlikely to happen, but a previously filled vector
could be modified.
- move(File, File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- mysql_pss - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- mysql_url_jwpl - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- mysql_usr - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- mysqlUrl() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- mysqlUrlJwpl() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- MySQLWikiConfiguration - Class in cat.lump.aq.wikilink.config
-
Class to handle the connection variables to the database.
- MySQLWikiConfiguration() - Constructor for class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- p - Variable in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
Configuration file
- p - Static variable in class cat.lump.aq.textextraction.wikipedia.WTConfig
-
Configuration file
- Pair<S extends java.lang.Comparable<S>,T extends java.lang.Comparable<T>> - Class in cat.lump.aq.basics.structure
-
A class that contains a pair for storing data.
- Pair(S, T) - Constructor for class cat.lump.aq.basics.structure.Pair
-
- PairOperations - Class in cat.lump.aq.basics.structure
-
A class that allows for comparing a list of pairs according to
their second value.
- PairOperations() - Constructor for class cat.lump.aq.basics.structure.PairOperations
-
- pairs_db - Static variable in class cat.lump.aq.textextraction.wikipedia.utilities.CommonArticlesFinder
-
DB with langlinks & articles pairs
- pairs_db - Static variable in class cat.lump.aq.textextraction.wikipedia.utilities.CommonCategoriesExtractor
-
DB with langlinks & articles pairs
- pairs_db - Static variable in class cat.lump.aq.textextraction.wikipedia.utilities.CommonNamespaceFinder
-
DB with langlinks & articles pairs
- pairs_db - Static variable in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- parseArguments(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
Method to parse the arguments received.
- parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliCategoriesXecutor
-
- parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliIndexerWT
-
Parses the arguments received
- parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Method to parse the arguments received.
- parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliQuerierWT
-
Parses the arguments received
- parseArguments(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliWT2Query
-
Parses the arguments received
- parseLine(String[]) - Method in class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
Parses the arguments and generates the command line for further
processing the parameters.
- parseLine(String[]) - Method in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Parses the arguments and generates the command line for further
processing the parameters.
- PlainTextPreprocess - Class in cat.lump.aq.textextraction.wikipedia.prepro
-
Preprocess a Wikipedia page using JWPL API.
- PlainTextPreprocess(TypePreprocess, String, Locale, int, File) - Constructor for class cat.lump.aq.textextraction.wikipedia.prepro.PlainTextPreprocess
-
- preprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Preprocess all the pages with the preprocess method identified by
preprocess
- preprocess(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Preprocess a text to extract its terms as defined in TermExtractor.getTerms().
- preprocess(int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Function which implements the preprocessing procedure.
- preprocess(int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.PlainTextPreprocess
-
- preprocess(String) - Method in class cat.lump.ir.retrievalmodels.document.PseudoCognates
-
- preprocessAll() - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Applies all the available preprocessing steps to the pages.
- PROCESS_START - Variable in class cat.lump.ir.lucene.LuceneInterface
-
- processCategory(String, File) - Method in class cat.lump.ir.lucene.query.WikiTailor2Query
-
Extracts the top articles corresponding to the current model for a concrete category.
- processLine(String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
-
Method used to do some preprocessing to the input text.
- processLine(String) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
-
Method used to do some preprocessing to the input text.
- PseudoCognates - Class in cat.lump.ir.retrievalmodels.document
-
A document representation based on Simard's cognateness model.
- PseudoCognates(Dictionary, Locale) - Constructor for class cat.lump.ir.retrievalmodels.document.PseudoCognates
-
- Punctuation - Class in cat.lump.ie.textprocessing.sentence
-
Regular expression-based punctuation finder.
- Punctuation() - Constructor for class cat.lump.ie.textprocessing.sentence.Punctuation
-
- put(Vector, int) - Method in class cat.lump.aq.basics.algebra.matrix.DynamicMatrixOfVectors
-
Store the vector in the specified position of the array.
- Ranking - Class in cat.lump.ir.index
-
A class that implements a ranking of identifiers (documents) and
their relevance.
- Ranking() - Constructor for class cat.lump.ir.index.Ranking
-
- readFully(Reader) - Method in class cat.lump.ir.lucene.engine.WTAnalyzer
-
- readObject(File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- reconstructTradArticles(HashSet<Integer>, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
Given the original index files and the temporary files already translated,
the method reconstructs the translation of every individual article and
saves them in path/plain/L1/index/id.trad.L2.txt
- reconstructTradArticlesCommon(String, boolean) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
Given the original index files and the temporary files already translated,
the method reconstructs the translation of every individual article that
was originally in the file of common articles and saves them in
path/plain/L1/index/id.trad.L2.txt
- remove(int) - Method in class cat.lump.ir.index.Index
-
Remove an existing document from the index.
- remove(String) - Method in class cat.lump.ir.index.Ranking
-
Remove a document from the ranking
- removeAccents(String) - Static method in class cat.lump.ie.textprocessing.transform.Transformation
-
- removeDiacritics(String) - Static method in class cat.lump.ie.textprocessing.sentence.Diacritics
-
- removeDiacritics() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Removes the diacritics from the string
- removeDocument(int) - Method in class cat.lump.ir.index.Indexer
-
Remove document id from the index and documents' collection
- removeEngStopwords() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Eliminates all the English stopwords from the string
- removeMarks(String) - Static method in class cat.lump.ie.textprocessing.sentence.Diacritics
-
- removeNonAlphabetic(int) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Remove any token which is not in the [:alpha:] character class.
- removeNonAlphaNumeric(int) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Remove any token which is not in the [:alnum:] character class.
- removePage(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
- removePage(Integer) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Removes a page to preprocess
- removePreprocess(TypePreprocess) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Removes the preprocess if it belongs to the set of available
preprocesses.
- removePunctuation() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Removes the punctuation marks from the string
- removeScoredCategory(GroupOfCategories.ScoredCategory) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- removeSource(int) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
Remove the pair with the given source ID
- removeSource(int) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
Remove the pair with the given source ID
- removeStopwords(String) - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
-
Checks for the vocabulary in the text and returns a copy
after discarding stopwords.
- removeStopwords(List<String>) - Method in class cat.lump.ie.textprocessing.stopwords.Stopwords
-
- removeStopwords() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Eliminates all the stopwords from the string
- removeTarget(int) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
Remove the pair with the given target ID
- removeTarget(int) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
Remove the pair with the given target ID
- removeTerm(String) - Method in class cat.lump.ir.weighting.TermFrequency
-
Remove the given term (warning if the term does not exist)
- removeTerms(List<String>) - Method in class cat.lump.ir.weighting.TermFrequency
-
Remove these terms from the collection.
- RepresentationType - Enum in cat.lump.ir.retrievalmodels.document
-
This enumeration contains the types of data representation available in this
package.
- representText(int) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
Represents the article text according to the type of representation
assigned.
- repType - Variable in class cat.lump.ir.index.Abstracter
-
Set of representations to generate/load in the index
- reset() - Method in class cat.lump.ir.index.Ranking
-
Start a new ranking
- resolveIfRedirect(int, int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Looks for a resolvedId given the original id only in case it
corresponds to a redirect.
- resolveRedirect(int, int, String, int, WikipediaDriverManager) - Static method in class cat.lump.aq.wikilink.WikipediaDBdata
-
Looks for a resolvedId given an original redirect id.
- reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.CroatianAnalyzer
-
- reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.EstonianAnalyzer
-
- reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.LithuanianAnalyzer
-
- reusableTokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.SlovenianAnalyzer
-
- rootDirectory - Variable in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Root directory wherein the output will be stored.
- run() - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
This function is executed when the instance is treated as a thread.
- runStatement(String) - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
- sameCardinality(Vector) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Check if the vector and v2 have the same cardinality.
- sameCardinality(float[]) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Check if the vector and v2 have the same cardinality.
- saveIndex() - Method in class cat.lump.ir.index.Indexer
-
Saves the documents into the provided output directory
- saveObject(File, EsaVectors) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Saves a textual representation into an object file
- savePage(File, TypePreprocess, String, int, StringBuffer) - Static method in class cat.lump.aq.textextraction.wikipedia.io.FileManager
-
Saves the text contained in the buffer in the correct file considering
the page ID, language and type of preprocess.
- scoreDomain(DomainVocabulary, Category) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Scores all the groups which compose a domain.
- scoreDomain(DomainVocabulary, Category, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExplorer
-
Scores the groups which compose a domain and are at most as far as the
maxDistance parameter defines.
- searchCategoryDepth(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
- selectArticles(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
Select the articles that are supposed to appear in a category
- selectArticles(File, short) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
-
Select the articles that are supposed to appear in a category
- SentencesOpennlp - Class in cat.lump.ie.textprocessing.sentence
-
A class to get the sentences from a given text.
- SentencesOpennlp(Locale) - Constructor for class cat.lump.ie.textprocessing.sentence.SentencesOpennlp
-
- separator - Static variable in class cat.lump.aq.basics.io.files.FileIO
-
- serialize(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Saves the vocabulary in a binary file.
- serialize(File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
Saves this object in a file as binary raw data.
- set(S, T) - Method in class cat.lump.aq.basics.structure.Pair
-
- setAnalyzer() - Method in class cat.lump.ir.lucene.index.LuceneIndexer
-
- setAnalyzer() - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
Sets the analyzer as a new instance of WTAnalyzer
- setAnalyzer() - Method in class cat.lump.ir.lucene.query.LuceneQuerierWT
-
Sets the analyzer as a new instance of WTAnalyzer
- setAnalyzer(Locale) - Method in class cat.lump.ir.lucene.query.LuceneTokenizer
-
TODO: This code is duplicated with cat.lump.ir.lucene.engine.loadAnalyzer
- setAnalyzer() - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
-
Set the Lucene analyzer to use according to the given language
- setDataDir(String) - Method in class cat.lump.ir.lucene.index.LuceneIndexer
-
- setDataDir(String) - Method in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
- setDB(String) - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
- setDir(String) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.Xecutor
-
- setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.cl.clesa.SimilarityCLESAdocs
-
- setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
A method that loads the texts in collections A and B.
- setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAdocs
-
- setDocumentsPath(File, File) - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
-
- setDomain(boolean) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
Sets if this category belongs to the domain.
- setFiles(File, File) - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
Set the input files to estimate the model from
- setFiles(File, File) - Method in class cat.lump.ir.sim.cl.len.LengthModelLearn
-
Set the two input files
- setIndexDir(String) - Method in class cat.lump.ir.lucene.LuceneInterface
-
- setIndexPath(File) - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
-
Set the path to Lucene's index
- setInvIndex(InvIndexContainer) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- setKey(S) - Method in class cat.lump.aq.basics.structure.Pair
-
- setLang(Locale) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Sets the language.
- setLanguage(String) - Method in class cat.lump.ir.lucene.LuceneInterface
-
- setLanguage(Locale) - Method in class cat.lump.ir.lucene.LuceneInterface
-
- setLanguage(Locale) - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- setLanguage(Locale) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- setMatrix(double[][]) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- setMaxSize(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Defines the maximum number of terms that should be considered as domain terms
- setMinimumSize(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Changes the minimum size of the tokens to be accepted as terms of the
vocabulary.
- setMinNumArticles(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Defines the minimum number of articles required to build the vocabulary
- setModel(Model) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- setMuSigma(double, double) - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
Set values for mu and sigma
- setnGrams(int) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- setNormalisation(Boolean) - Method in class cat.lump.ir.sim.ml.esa.EsaGenerator
-
Determines whether the ESA vectors are going to be normalised.
- setObjects() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Set the name of the resulting vector objects
- setObjects() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAlines
-
- setObjects() - Method in class cat.lump.ir.sim.ml.esa.SimilarityESAsent
-
- setOutPath(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.Xecutor
-
- setOutPath(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheFirst
-
- setOutPath(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.XecutorTheSecond
-
- setOutputDir(String) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
- setOutputDir(String) - Method in class cat.lump.ir.lucene.query.WikiTailor2Query
-
- setPairs(ArrayList<SimilarityPair>) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
Sets the list of pairs.
- setPairs(ArrayList<SimilarityPair>) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
Sets the list of pairs.
- setPercentage(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Defines the percentage of terms that should be considered as domain terms
- setRepresentation(Document) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- setRootDirectory(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleTextExtractor
-
Changes the root directory of the preprocessor.
- setSimilarity(int, int, float) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
Changes the value of the position of the matrix given by row and
col.
- setSource(<any>) - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
-
- setSource(Article) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- setSrcLang(String) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- setSrcLang(String) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- setString(String) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Stores a copy of the original string and generates a tokenized copy.
- setStringTokens(String) - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Stores a copy of the original string and generates a tokenized copy.
- setTarget(<any>) - Method in class cat.lump.ir.comparison.toCheck.SimilarityPair
-
- setTarget(Article) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- setTerms(Collection<TermFrequencyTuple>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Sets a collection of tuples formed by a term and its frequency as
vocabulary.
- setText(String) - Method in class cat.lump.ir.retrievalmodels.document.PseudoCognates
-
- setText(String) - Method in class cat.lump.ir.retrievalmodels.similarity.Article
-
- setTrgLang(String) - Method in class cat.lump.aq.textextraction.wikipedia.fragments.ArticlesSimilarity
-
- setTrgLang(String) - Method in class cat.lump.ir.comparison.toCheck.ArticlesSimilarity
-
- setType(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
- setValue(T) - Method in class cat.lump.aq.basics.structure.Pair
-
- setVerbose(Boolean) - Method in class cat.lump.ir.lucene.LuceneInterface
-
- setVerbose(Boolean) - Method in class cat.lump.ir.sim.cl.len.LengthModelEstimate
-
Deprecated.
- setYear(int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
- Similarity - Enum in cat.lump.ir.retrievalmodels.similarity
-
This enumeration contains the similarity models and characterizations
available.
- Similarity - Interface in cat.lump.ir.sim
-
Interface with the minimum required methods to code a similarity
model.
- SimilarityCalculator - Class in cat.lump.ir.retrievalmodels.similarity
-
- SimilarityCalculator(String, String, RepresentationType, Model) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Creates a similarity calculator with the given arguments and n=1
for the n-grams methods.
- SimilarityCalculator(String, String, RepresentationType, Model, int) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Creates a similarity calculator with the given arguments.
- SimilarityCalculatorLenFact - Class in cat.lump.ir.retrievalmodels.similarity
-
- SimilarityCalculatorLenFact(String, String, RepresentationType, Model) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculatorLenFact
-
- SimilarityCLESA - Class in cat.lump.ir.sim.cl.clesa
-
Implementation of Cross-Language Explicit Semantic Analysis,
an extension of explicit semantic analysis proposed by:
Potthast, Stein and Anderka.
- SimilarityCLESA(String, String, File, File, String, String, boolean) - Constructor for class cat.lump.ir.sim.cl.clesa.SimilarityCLESA
-
- SimilarityCLESAdocs - Class in cat.lump.ir.sim.cl.clesa
-
Implementation of Explicit Semantic Analysis as described in:
Gabrilovich, Evgeniy, and Shaul Markovitch.
- SimilarityCLESAdocs(String, String, File, File, String, String, boolean) - Constructor for class cat.lump.ir.sim.cl.clesa.SimilarityCLESAdocs
-
- SimilarityESA - Class in cat.lump.ir.sim.ml.esa
-
Implementation of Explicit Semantic Analysis as described in:
Gabrilovich, Evgeniy, and Shaul Markovitch.
- SimilarityESA(String, String, Boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Constructor.
- SimilarityESAdocs - Class in cat.lump.ir.sim.ml.esa
-
Implementation of SimilarityESA, with two file collections
to compute similarities against each other.
- SimilarityESAdocs(File, File, String, String, boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESAdocs
-
Includes the path to the documents in A and B.
- SimilarityESAlines - Class in cat.lump.ir.sim.ml.esa
-
Extension of SimilarityESA, with two files to compute similarities
against each other.
- SimilarityESAlines(File, File, String, String, Boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESAlines
-
Given two files with independent sentences, process line-wise
similarities.
- SimilarityESAsent - Class in cat.lump.ir.sim.ml.esa
-
Implementation of SimilarityESAlines that works over one single
tab-separated document in which the left-side text-line has
to be compared against the right-side one.
- SimilarityESAsent(File, String, String, Boolean) - Constructor for class cat.lump.ir.sim.ml.esa.SimilarityESAsent
-
It invokes the superclass SimilarityESAlines, but it deceives
it by claiming two docs exist which are indeed the same one.
- SimilarityMatrix - Class in cat.lump.ir.retrievalmodels.similarity
-
This class represents a matrix of similarities.
- SimilarityMatrix(int, int) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
Creates a matrix filled with zeros with the size given by the parameters.
- SimilarityMatrixIterator - Class in cat.lump.ir.comparison.toCheck
-
- SimilarityMatrixIterator(SimilarityMatrix) - Constructor for class cat.lump.ir.comparison.toCheck.SimilarityMatrixIterator
-
- SimilarityMatrixIterator - Class in cat.lump.ir.retrievalmodels.similarity
-
- SimilarityMatrixIterator(SimilarityMatrix) - Constructor for class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrixIterator
-
- SimilarityMeasure - Interface in cat.lump.ir.retrievalmodels.similarity
-
- SimilarityModel - Interface in cat.lump.ir.retrievalmodels.similarity
-
- SimilarityPair - Class in cat.lump.ir.comparison.toCheck
-
A similarity pair contains the pairs which define the input data for the
similarity calculators.
- SimilarityPair(int, File, int, File) - Constructor for class cat.lump.ir.comparison.toCheck.SimilarityPair
-
- size - Variable in class cat.lump.aq.basics.algebra.matrix.DynamicMatrix
-
The size of the matrix (which changes depending on
the inserted elements)
- size() - Method in class cat.lump.ir.index.Ranking
-
- size(RepresentationType) - Method in class cat.lump.ir.retrievalmodels.document.Fragment
-
- size() - Method in class cat.lump.ir.weighting.TermFrequency
-
- skipPage(int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Checks if the page must be skipped.
- sloppyFreq(int) - Method in class cat.lump.ir.lucene.index.TFSimilarity
-
- SlovenianAnalyzer - Class in cat.lump.ir.lucene.index.analyzers
-
- SlovenianAnalyzer() - Constructor for class cat.lump.ir.lucene.index.analyzers.SlovenianAnalyzer
-
- sortByValue(ArrayList<Pair<Integer, Double>>) - Static method in class cat.lump.aq.basics.structure.PairOperations
-
Sort a list of pairs according to their values
- sortByValue(Map<K, V>) - Static method in class cat.lump.aq.basics.structure.standard.MapUtil
-
- sortByValueInverse(Map<K, V>) - Static method in class cat.lump.aq.basics.structure.standard.MapUtil
-
- sortByValueReverse(ArrayList<Pair<Integer, Double>>) - Static method in class cat.lump.aq.basics.structure.PairOperations
-
Sort a list of pairs in reverse order according to their values
- source - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Article in source language
- Span - Class in cat.lump.ie.textprocessing
-
Copied from the aitools span definition.
- Span(int, int) - Constructor for class cat.lump.ie.textprocessing.Span
-
- splitDiacritics() - Method in class cat.lump.ie.textprocessing.sentence.Diacritics
-
- splitText(String) - Method in class cat.lump.ir.retrievalmodels.document.Document
-
- sqlPass() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- sqlUser() - Static method in class cat.lump.aq.wikilink.config.MySQLWikiConfiguration
-
- src_len - Variable in class cat.lump.aq.basics.structure.ArticlePair
-
Length of the source language article
- srcID - Variable in class cat.lump.aq.basics.structure.ArticlePair
-
Identifier of the source language article
- srcTitle - Variable in class cat.lump.aq.basics.structure.ArticlePair
-
Title of the source language article
- standardTokenize(File) - Method in class cat.lump.ir.lucene.query.LuceneTokenizer
-
Tokenize the text from the given file using
Lucene's StandardAnalyzer
- standardTokenize(String) - Method in class cat.lump.ir.lucene.query.LuceneTokenizer
-
Tokenize the text using Lucene's StandardAnalyzer
- stem() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
- stemLucene() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
- StemmerFactory - Class in cat.lump.ie.textprocessing.word
-
Factory that allows for getting a stemmer for the required
language (if available)
- StemmerFactory() - Constructor for class cat.lump.ie.textprocessing.word.StemmerFactory
-
- STOP_LIST - Variable in class cat.lump.ie.textprocessing.stopwords.Stopwords
-
- Stopwords - Class in cat.lump.ie.textprocessing.stopwords
-
Abstract class that gives the methods for stopwords acquisition
and modification in different languages.
- Stopwords(Locale) - Constructor for class cat.lump.ie.textprocessing.stopwords.Stopwords
-
- str2FlatQuery(Analyzer, String) - Method in class cat.lump.ir.lucene.query.Document2Query
-
Generates a query in which every token has the same relevance
- str2FlatQuery(String) - Method in class cat.lump.ir.lucene.query.Document2Query
-
Generates a query in which every token has the same relevance
- stringToFile(File, String, boolean) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- tableExists(String, String) - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
Checks if a table exists in a database
- target - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Article in target language
- tBox - Variable in class cat.lump.aq.basics.structure.InvIndexContainer
-
- TermExtractor - Class in cat.lump.aq.textextraction.wikipedia.prepro
-
A class that extracts terms according to different definitions.
- TermExtractor(Locale) - Constructor for class cat.lump.aq.textextraction.wikipedia.prepro.TermExtractor
-
Creates a new TermExtractor
- TermFrequency - Class in cat.lump.ir.weighting
-
A class to compute and store a simple term frequency.
- TermFrequency() - Constructor for class cat.lump.ir.weighting.TermFrequency
-
Invokes the class with an empty list of term tuples
- TermFrequency(List<TermFrequencyTuple>) - Constructor for class cat.lump.ir.weighting.TermFrequency
-
Invokes the class with an existing empty list of term tuples
- TermFrequencyTuple - Class in cat.lump.aq.basics.structure.ir
-
This class provides a term frequency abstraction.
- TermFrequencyTuple(String, int) - Constructor for class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
Constructor.
- TermFrequencyTuple(String) - Constructor for class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
Constructor.
- test() - Method in class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
Tests the connection by displaying the databases
- TestTokenize - Class in cat.lump.ie.textprocessing
-
- TestTokenize() - Constructor for class cat.lump.ie.textprocessing.TestTokenize
-
- text_magnitudes - Variable in class cat.lump.ir.retrievalmodels.similarity.Article
-
Map with the article's text magnitudes
- TextPreprocessor - Class in cat.lump.ie.textprocessing
-
This class represents an "interface" to the different text processing tools
available in this package.
- TextPreprocessor(Locale) - Constructor for class cat.lump.ie.textprocessing.TextPreprocessor
-
- textsA - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Path to documents A
- textsB - Variable in class cat.lump.ir.sim.ml.esa.SimilarityESA
-
Path to documents B
- TFSimilarity - Class in cat.lump.ir.lucene.index
-
An extension to Lucene's default similarity that intends to represent a
document simply by its TFs.
- TFSimilarity() - Constructor for class cat.lump.ir.lucene.index.TFSimilarity
-
- times(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Multiplies the vector by a scalar and returns the result.
- timesEquals(float) - Method in class cat.lump.aq.basics.algebra.vector.Vector
-
Multiplies the vector by a scalar and updates its internal value.
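The difference between the two scalar products (one returns a result, the other updates the vector in place) can be sketched as follows. The class ScalarProductSketch and its get method are hypothetical stand-ins, and the return types shown here are assumptions, since the index does not list them.

```java
// Hypothetical stand-in for the Vector class; return types are assumed.
class ScalarProductSketch {
    private final float[] values;

    ScalarProductSketch(float[] values) {
        this.values = values.clone();
    }

    /** Returns a new object holding values * scalar; this one is unchanged. */
    ScalarProductSketch times(float scalar) {
        float[] out = new float[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] * scalar;
        }
        return new ScalarProductSketch(out);
    }

    /** Multiplies the stored values by scalar in place. */
    void timesEquals(float scalar) {
        for (int i = 0; i < values.length; i++) {
            values[i] *= scalar;
        }
    }

    /** Returns a copy of the stored values. */
    float[] get() {
        return values.clone();
    }
}
```

Under these assumptions, the non-mutating form suits chained expressions, while the in-place form avoids allocating a new array.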
- toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.ArticleSelector
-
TODO ???
- toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.BackArticleSelector
-
TODO ???
- toFile(int, int) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryExtractor
-
Dump tree into a file.
- toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryNameStats
-
- toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Saves the top list to a text file.
- toFile(File, List<TermFrequencyTuple>) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainKeywords
-
Saves the given list of TermFrequencyTuples into a file
- toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Exports the vocabulary to a textual file.
- toFile(File) - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories
-
- toFile(File) - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
Writes the matrix as text in a given file
- TokenizerFactory - Class in cat.lump.ir.lucene.query
-
Creates and returns a Lucene Tokenizer instance for the required language
- TokenizerFactory() - Constructor for class cat.lump.ir.lucene.query.TokenizerFactory
-
- tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.engine.WTAnalyzer
-
Pipeline for the analyser.
- tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.CroatianAnalyzer
-
- tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.EstonianAnalyzer
-
- tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.LithuanianAnalyzer
-
- tokenStream(String, Reader) - Method in class cat.lump.ir.lucene.index.analyzers.SlovenianAnalyzer
-
- toList() - Method in class cat.lump.aq.textextraction.wikipedia.categories.DomainVocabulary
-
Exports the vocabulary as a list of tuples, each containing a term and
its frequency.
- toLowerCase() - Method in class cat.lump.ie.textprocessing.TextPreprocessor
-
Converts the string to lowercase
- toString() - Method in class cat.lump.aq.basics.structure.ir.TermFrequencyTuple
-
- toString() - Method in class cat.lump.aq.basics.structure.Pair
-
- toString() - Method in class cat.lump.aq.textextraction.wikipedia.categories.CategoryTreeNode
-
Transforms the object into a string
- toString() - Method in class cat.lump.aq.textextraction.wikipedia.categories.GroupOfCategories.ScoredCategory
-
Returns a string representation of the instance.
- toString() - Method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
-
- toString() - Method in class cat.lump.aq.wikilink.config.Dump
-
- toString() - Method in class cat.lump.ir.index.Ranking
-
- toString() - Method in class cat.lump.ir.retrievalmodels.similarity.SimilarityMatrix
-
- Transformation - Class in cat.lump.ie.textprocessing.transform
-
- Transformation() - Constructor for class cat.lump.ie.textprocessing.transform.Transformation
-
- translateSets(String, int) - Method in class cat.lump.aq.textextraction.wikipedia.utilities.ArticlesTranslator
-
Main method to call the decoder String translator for all the files generated
by generateSetsFullFolder().
- Transliteratorr - Class in cat.lump.ie.textprocessing.transform
-
A class to transliterate a text with ICU4J
- Transliteratorr(Locale) - Constructor for class cat.lump.ie.textprocessing.transform.Transliteratorr
-
- trg_id - Variable in class cat.lump.aq.basics.structure.ArticlePair
-
Identifier of the target language article
- trg_len - Variable in class cat.lump.aq.basics.structure.ArticlePair
-
Length of the target language article
- trgTitle - Variable in class cat.lump.aq.basics.structure.ArticlePair
-
Title of the target language article
- type - Variable in class cat.lump.ir.retrievalmodels.similarity.SimilarityCalculator
-
Type of text representation
- TypePreprocess - Enum in cat.lump.aq.textextraction.wikipedia.prepro
-
Enumeration of the different types of preprocess.
- valueOf(String) - Static method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum cat.lump.ir.comparison.Model
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum cat.lump.ir.retrievalmodels.document.RepresentationType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum cat.lump.aq.textextraction.wikipedia.prepro.TypePreprocess
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum cat.lump.ir.comparison.Model
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum cat.lump.ir.retrievalmodels.document.RepresentationType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum cat.lump.ir.retrievalmodels.similarity.Similarity
-
Returns an array containing the constants of this enum type, in
the order they are declared.
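The valueOf and values methods listed above are the standard ones the Java compiler generates for every enum. A minimal sketch of their behaviour follows; the enum Mode and the helper parseOrNull are hypothetical, since the constants of the indexed enums are not listed here.

```java
import java.util.Arrays;

class EnumLookupSketch {

    /** Hypothetical enum used only for illustration. */
    enum Mode { FAST, SLOW }

    /** Returns the constant with exactly this name, or null if none exists. */
    static Mode parseOrNull(String name) {
        try {
            return Mode.valueOf(name); // the match is case-sensitive
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // values() lists the constants in declaration order
        System.out.println(Arrays.toString(Mode.values()));
        System.out.println(parseOrNull("FAST"));
        System.out.println(parseOrNull("fast"));
    }
}
```

Because valueOf throws IllegalArgumentException for an unknown name, wrapping it as above may be convenient when the name comes from user input such as a CLI option.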
- Vector - Class in cat.lump.aq.basics.algebra.vector
-
A vector of doubles that allows for a number of vector-vector and
vector-scalar algebraic operations, including:
sum of vectors (--> vector)
product of vectors (--> scalar)
product by scalar (--> vector)
division by scalar (--> vector)
Properties of the vector (magnitude, max, min, argmax, and argmin)
are available as well.
- Vector(float[]) - Constructor for class cat.lump.aq.basics.algebra.vector.Vector
-
Initialisation with an array of doubles
- VectorCosine - Class in cat.lump.ir.retrievalmodels.similarity
-
A class to compute the cosine similarity between two vectors.
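For reference, cosine similarity is the dot product of two vectors divided by the product of their magnitudes. The sketch below shows that formula on plain float arrays. It is not the VectorCosine implementation: the class and method names are hypothetical, and returning 0 for a zero-length vector is an assumption made here to avoid a division by zero.

```java
// Hypothetical sketch of the cosine formula; not the VectorCosine source.
class CosineSketch {

    /** Returns dot(a, b) / (|a| * |b|), or 0 if either magnitude is 0. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have equal length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // assumed convention for empty or all-zero vectors
        }
        return dot / Math.sqrt(normA * normB);
    }
}
```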
- VectorCosine() - Constructor for class cat.lump.ir.retrievalmodels.similarity.VectorCosine
-
- verbose - Variable in class cat.lump.ir.lucene.cli.LuceneCliMinimum0
-
Verbosity
- verbose - Variable in class cat.lump.ir.lucene.index.LuceneIndexer
-
Verbosity
- verbose - Variable in class cat.lump.ir.lucene.index.LuceneIndexerWT
-
- verbose - Variable in class cat.lump.ir.lucene.LuceneInterface
-
- vocabularySize() - Method in class cat.lump.ir.index.Index
-
- vocQuery(String[]) - Static method in class cat.lump.ir.lucene.query.Document2Query
-
Creates a query considering only the vocabulary (i.e. types)
- warn(String) - Method in class cat.lump.aq.basics.log.LumpLogger
-
- weightQuery(String[]) - Static method in class cat.lump.ir.lucene.query.Document2Query
-
Creates a query where the relevance of a type depends on its
frequency (i.e. if a token w appears 4 times, it will appear
as w^4)
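The contrast between a vocabulary-only query and a frequency-weighted query can be sketched with plain strings. This is not the Document2Query code: the class name is hypothetical, the methods return a String rather than a Lucene query object, and emitting an explicit ^1 for tokens that occur once is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical sketch; the real methods live in Document2Query.
class WeightedQuerySketch {

    /** Keeps each type once, in first-seen order. */
    static String vocQuery(String[] tokens) {
        Set<String> types = new LinkedHashSet<>(Arrays.asList(tokens));
        return String.join(" ", types);
    }

    /** Appends ^count to each type, so a token seen 4 times becomes w^4. */
    static String weightQuery(String[] tokens) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        StringJoiner query = new StringJoiner(" ");
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            query.add(entry.getKey() + "^" + entry.getValue());
        }
        return query.toString();
    }
}
```

For the tokens w, x, w, w, w this sketch produces "w x" for the vocabulary query and "w^4 x^1" for the weighted one.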
- wiki - Variable in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Wikipedia JWPL connector
- WikipediaCliArticleSelector - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI for direct access to the article selection step of the Xecutor pipeline
for the WikiTailor category-based in-domain comparable corpora extraction.
- WikipediaCliArticleSelector() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleSelector
-
- WikipediaCliArticleTextExtractor - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI for direct access to the article extraction step of the Xecutor pipeline
for the WikiTailor category-based in-domain comparable corpora extraction.
- WikipediaCliArticleTextExtractor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliArticleTextExtractor
-
- WikipediaCliCategoriesXecutor - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI to access the Xecutor pipeline for the WikiTailor category-based
in-domain comparable corpora extraction.
- WikipediaCliCategoriesXecutor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoriesXecutor
-
- WikipediaCliCategoryDepth - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI for direct access to the category depth estimation step of the Xecutor pipeline
for the WikiTailor category-based in-domain comparable corpora extraction.
- WikipediaCliCategoryDepth() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryDepth
-
- WikipediaCliCategoryExtractor - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI for direct access to the category extraction step of the Xecutor pipeline
for the WikiTailor category-based in-domain comparable corpora extraction.
- WikipediaCliCategoryExtractor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliCategoryExtractor
-
- WikipediaCliDomainKeywords - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI for direct access to the domain keyword extraction step of the Xecutor pipeline
for the WikiTailor category-based in-domain comparable corpora extraction.
- WikipediaCliDomainKeywords() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliDomainKeywords
-
- WikipediaCliFragmentsXecutor - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI to access the extraction of parallel sentences from the corpus obtained
with WikiTailor textextraction.wikipedia.
- WikipediaCliFragmentsXecutor() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliFragmentsXecutor
-
- WikipediaCliMinimum - Class in cat.lump.aq.textextraction.wikipedia.cli
-
CLI to access JWPL-WIKIPEDIA-related programs in this package.
- WikipediaCliMinimum() - Constructor for class cat.lump.aq.textextraction.wikipedia.cli.WikipediaCliMinimum
-
Loads the logger and the available options (by calling loadOptions)
- WikipediaDBdata - Class in cat.lump.aq.wikilink
-
Utilities for querying the Wikipedia DB and dealing with its data.
- WikipediaDBdata() - Constructor for class cat.lump.aq.wikilink.WikipediaDBdata
-
- WikipediaDriverManager - Class in cat.lump.aq.wikilink.connexion
-
Adaptation of cat.talp.lump.co.db.DriverManagerClass from
- WikipediaDriverManager() - Constructor for class cat.lump.aq.wikilink.connexion.WikipediaDriverManager
-
- WikipediaJwpl - Class in cat.lump.aq.wikilink.jwpl
-
This class provides methods for initialising a jwpl Wikipedia instance.
- WikipediaJwpl(Locale, int) - Constructor for class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Creates its own database configuration according to language and year
- WikipediaJwpl(DatabaseConfiguration) - Constructor for class cat.lump.aq.wikilink.jwpl.WikipediaJwpl
-
Invokes the super class with the database configuration, sets the
constants, and loads the JWPL Wikipedia instance.
- WikiProperties - Class in cat.lump.aq.textextraction.wikipedia
-
Deprecated.
- WikiProperties() - Constructor for class cat.lump.aq.textextraction.wikipedia.WikiProperties
-
Deprecated.
- WikiTailor2Query - Class in cat.lump.ir.lucene.query
-
Query into Lucene indexes for WikiTailor.
- WikiTailor2Query(Locale, String, String, String, float, int, String) - Constructor for class cat.lump.ir.lucene.query.WikiTailor2Query
-
Constructors
- WikiTailor2Query(Locale, String, String, String, float, int, String, Boolean) - Constructor for class cat.lump.ir.lucene.query.WikiTailor2Query
-
- WordDecompositionICU4J - Class in cat.lump.ie.textprocessing.word
-
A class based on aitools WordDEcompositionICU4J class.
- WordDecompositionICU4J(Locale) - Constructor for class cat.lump.ie.textprocessing.word.WordDecompositionICU4J
-
- WordNgrams - Class in cat.lump.ie.textprocessing.ngram
-
This class allows for generating word-level n-grams from a text.
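A word-level n-gram is a run of n consecutive words taken with a sliding window. The sketch below illustrates the idea under simplifying assumptions: it splits on whitespace only and ignores the Locale that the real WordNgrams constructor takes. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; whitespace tokenisation is a simplifying assumption.
class WordNgramSketch {

    /** Returns every run of n consecutive words, joined by single spaces. */
    static List<String> ngrams(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        // Slide a window of n words over the text, one word at a time
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }
}
```

A text with fewer than n words yields an empty list in this sketch.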
- WordNgrams(int, Locale) - Constructor for class cat.lump.ie.textprocessing.ngram.WordNgrams
-
- writeObject(Object, File) - Static method in class cat.lump.aq.basics.io.files.FileIO
-
- writePreprocessing(Collection<String>, File, int) - Method in class cat.lump.aq.textextraction.wikipedia.prepro.AbstractPreprocess
-
Writes the result of a preprocessing in the given output file.
- WTAnalyzer - Class in cat.lump.ir.lucene.engine
-
Modification of the standard Lucene Analyzer to mimic the preprocess
and term extraction used in the Wikiparable experiment
- WTAnalyzer(Version, Locale) - Constructor for class cat.lump.ir.lucene.engine.WTAnalyzer
-
- WTConfig - Class in cat.lump.aq.textextraction.wikipedia
-
A class to read the configuration file as a Properties object
- WTConfig() - Constructor for class cat.lump.aq.textextraction.wikipedia.WTConfig
-