public class PseudoCognates
extends java.lang.Object
Reference: Michel Simard, George F. Foster and Pierre Isabelle. Using Cognates to Align Sentences in Bilingual Corpora.
The cognates are preprocessed as follows:

Modifier and Type | Field and Description |
---|---|
protected Dictionary | dictionary |
protected int | GLOBAL_COUNTER: Keeps track of the number of tokens considered |
protected java.util.Map<java.lang.Integer,java.lang.Double> | TERMS: The terms in the document together with their weights |

Modifier | Constructor and Description |
---|---|
protected | PseudoCognates(Dictionary dictionary, java.util.Locale language) |

Modifier and Type | Method and Description |
---|---|
protected java.lang.String | getDummyWord() |
java.util.Map<java.lang.String,java.lang.Double> | getNormalizedRepresentation() |
java.util.List<java.lang.String> | getRepresentation() |
protected java.util.List<java.lang.String> | getTokens(java.lang.String text) |
java.util.Map<java.lang.String,java.lang.Double> | getWeightedRepresentation() |
int | length() |
protected java.lang.String | preprocess(java.lang.String text) |
void | setText(java.lang.String text) |

protected java.util.Map<java.lang.Integer,java.lang.Double> TERMS
protected int GLOBAL_COUNTER
protected Dictionary dictionary
protected PseudoCognates(Dictionary dictionary, java.util.Locale language)
public void setText(java.lang.String text)
protected java.lang.String preprocess(java.lang.String text)
protected java.lang.String getDummyWord()
public int length()
protected java.util.List<java.lang.String> getTokens(java.lang.String text)
Parameters: text
public java.util.List<java.lang.String> getRepresentation()
public java.util.Map<java.lang.String,java.lang.Double> getWeightedRepresentation()
public java.util.Map<java.lang.String,java.lang.Double> getNormalizedRepresentation()
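
The page does not spell out how tokens become pseudo-cognates, so the following is only a rough sketch of the general idea associated with the cited paper: alphabetic words are reduced to a short prefix (four characters here), while tokens containing digits or punctuation are kept verbatim. The class `PseudoCognateSketch`, its methods, the whitespace tokenization, the lower-casing, and the handling of words shorter than four characters are all assumptions made for illustration; none of them is taken from `PseudoCognates` itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the documented API.
class PseudoCognateSketch {

    // Assumed prefix length for alphabetic tokens.
    static final int PREFIX_LENGTH = 4;

    // Maps one token to its assumed pseudo-cognate form.
    static String pseudoCognate(String token, Locale language) {
        boolean allLetters = !token.isEmpty()
                && token.chars().allMatch(Character::isLetter);
        if (!allLetters) {
            // Numbers, punctuation and mixed tokens are kept unchanged.
            return token;
        }
        String lower = token.toLowerCase(language);
        // Assumption: words no longer than the prefix are kept whole.
        return lower.length() <= PREFIX_LENGTH
                ? lower
                : lower.substring(0, PREFIX_LENGTH);
    }

    // Splits on whitespace (an assumption) and maps every token.
    static List<String> representation(String text, Locale language) {
        List<String> result = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(pseudoCognate(token, language));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [gove, spen, in, 1992]
        System.out.println(
                representation("Government spending in 1992", Locale.ENGLISH));
    }
}
```

Two texts in different languages could then be compared by the overlap of such token lists, which is roughly the role `getRepresentation()` appears to play here; the weighted and normalized variants would attach a score to each entry.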