public class LuceneTokenizer
extends java.lang.Object
| Modifier and Type | Field and Description |
|---|---|
| protected Analyzer | analyzer |
| Constructor and Description |
|---|
| LuceneTokenizer() |
| LuceneTokenizer(java.util.Locale lan) |
| Modifier and Type | Method and Description |
|---|---|
| static java.lang.String[] | ngramTokenize(java.lang.String file) |
| protected void | setAnalyzer(java.util.Locale lan): TODO: This code is duplicated with cat.lump.ir.lucene.engine.loadAnalyzer |
| java.lang.String[] | standardTokenize(java.io.File file): Tokenize the text from the given file using Lucene's StandardAnalyzer |
| java.lang.String[] | standardTokenize(java.lang.String text): Tokenize the text using Lucene's StandardAnalyzer |
public LuceneTokenizer()

public LuceneTokenizer(java.util.Locale lan)

protected void setAnalyzer(java.util.Locale lan)
Parameters: lan

public java.lang.String[] standardTokenize(java.io.File file) throws java.io.IOException
Parameters: file
Throws: java.io.IOException

public java.lang.String[] standardTokenize(java.lang.String text)
Parameters: text

public static java.lang.String[] ngramTokenize(java.lang.String file)
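
The contract of standardTokenize(java.lang.String text), a string in and an array of word tokens out, can be illustrated with a minimal sketch that uses only the JDK. This is not the implementation of LuceneTokenizer and it does not call Lucene's StandardAnalyzer: the class name TokenizeSketch and the method wordTokenize are hypothetical, and java.text.BreakIterator is swapped in as a rough stand-in for locale-aware word segmentation. Lowercasing and dropping punctuation are assumptions about typical analyzer behaviour, not something this page documents.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of the documented API.
final class TokenizeSketch {

    // Splits text at word boundaries for the given locale and keeps only
    // the pieces that start with a letter or digit, lowercased.
    static String[] wordTokenize(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);

        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String piece = text.substring(start, end);
            // Boundaries also delimit whitespace and punctuation; skip those.
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                tokens.add(piece.toLowerCase(locale));
            }
        }
        return tokens.toArray(new String[0]);
    }
}
```

A call such as TokenizeSketch.wordTokenize("Hello, World!", Locale.ENGLISH) yields the two tokens hello and world. The tokens produced by the real class depend on the Analyzer selected through setAnalyzer(java.util.Locale lan) and may differ from this approximation.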