Class | Description |
---|---|
LengthFactors | Provides default length-factor parameters for the following language pairs: cz-en, de-en, en-cz, en-de, en-es, en-fr, en-ru, es-en, fr-en, ru-en. The parameters were estimated by txell on several corpora, including: commoncrawl.wmt2013, CzEng.v1.0, el_periodico, europarl.v6, europarl.v7, FAUST_D4.2, French_treebank, newscommentary.v8, news.shuffled.en.conll.gz, news.shuffled.fr.conll.gz, patents, Romanian_treebank, UNdoc.2000, wiki-titles.ru-en, wmt10, wmt10.select. |
LengthModel | A class to estimate length models for a language pair. |
LengthModelEstimate | A class to estimate the length factor between two texts according to previously learnt parameters. |
LengthModelLearn | A class to learn the parameters of the length model from a parallel corpus (two files). |
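
As a rough illustration of how a learning step and an estimation step can fit together, the sketch below fits a Gaussian over target/source character-length ratios from aligned sentence pairs and then scores a new pair against it. The class `LengthModelSketch`, its methods `learn` and `lengthFactor`, and the Gaussian formulation itself are assumptions made for this example; they are not taken from this package, whose actual signatures and model may differ.

```java
import java.util.List;

/** Hypothetical sketch of a length model; NOT this package's actual API. */
final class LengthModelSketch {
    final double mean;   // mean of the target/source character-length ratio
    final double stdDev; // population standard deviation of that ratio

    LengthModelSketch(double mean, double stdDev) {
        this.mean = mean;
        this.stdDev = stdDev;
    }

    /** Learns mean and standard deviation from sentence-aligned texts. */
    static LengthModelSketch learn(List<String> source, List<String> target) {
        if (source.size() != target.size()) {
            throw new IllegalArgumentException("corpora must be aligned line by line");
        }
        double sum = 0.0;
        double sumSq = 0.0;
        int n = 0;
        for (int i = 0; i < source.size(); i++) {
            int srcLen = source.get(i).length();
            if (srcLen == 0) {
                continue; // skip empty source lines: the ratio is undefined
            }
            double ratio = (double) target.get(i).length() / srcLen;
            sum += ratio;
            sumSq += ratio * ratio;
            n++;
        }
        if (n == 0) {
            throw new IllegalArgumentException("no usable sentence pairs");
        }
        double mean = sum / n;
        // Clamp at zero to guard against tiny negative values from rounding.
        double variance = Math.max(0.0, sumSq / n - mean * mean);
        return new LengthModelSketch(mean, Math.sqrt(variance));
    }

    /** Returns a score in (0, 1]; 1 means the ratio equals the learnt mean. */
    double lengthFactor(String source, String target) {
        if (source.isEmpty() || stdDev == 0.0) {
            throw new IllegalArgumentException("empty source or degenerate model");
        }
        double ratio = (double) target.length() / source.length();
        double z = (ratio - mean) / stdDev;
        return Math.exp(-0.5 * z * z);
    }
}
```

In this sketch the learnt `mean` and `stdDev` play the role of the per-language-pair parameters, so a table of defaults would simply map a pair such as `en-es` to one such pair of values.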