Qizx/db 2.1 API
java.lang.Object
  com.qizx.api.util.text.SieveBase
    com.qizx.api.util.text.DefaultWordSieve
A basic word extractor suitable for most European languages.
All methods can be overridden in a subclass.
Field Summary

Fields inherited from class com.qizx.api.util.text.SieveBase
    parameters

Constructor Summary

DefaultWordSieve()
    Builds a case-insensitive and accent-insensitive sieve.

DefaultWordSieve(boolean caseSensitive, boolean accentSensitive)
    Builds a sieve specifying case and accent sensitivity.
Method Summary

char charAt(int ahead)
    Returns the source character at a given position.

Indexing.WordSieve copy()
    Creates a carbon copy of this object.

boolean isWordPart(char c)
    Returns true if the char can be part of a word.

boolean isWordStart(char c)
    Returns true if the char can be at start of a word.

char mapChar(char c)
    Normalizes a character.

char multiCharsWildcard()
    Returns the wildcard character which matches several characters.

char nextChar()
    Moves to next source character and returns it, returns 0 if at end.

char[] nextWord()
    Returns the next normalized word, or null if the end of the fragment to analyze is reached.

void setParameters(String[] parameters)
    Defines optional parameters for the sieve.

protected void setup(boolean caseSensitive, boolean accentSensitive)

char singleCharWildcard()
    Returns the wildcard character which matches a single character.

void start(char[] text, int length)
    Starts the analysis of a new text chunk.

void start(String text)
    Starts the analysis of a new text chunk.

int wordLength()
    Returns the original length of the last word returned by nextWord.

int wordOffset()
    Returns the offset of the last word returned by nextWord.
Methods inherited from class com.qizx.api.util.text.SieveBase
    addParameter, getParameters, toString, toString

Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface com.qizx.api.Indexing.Sieve
    getParameters
Constructor Detail

public DefaultWordSieve()
    Builds a case-insensitive and accent-insensitive sieve.

public DefaultWordSieve(boolean caseSensitive,
                        boolean accentSensitive)
    Builds a sieve specifying case and accent sensitivity.
    Parameters:
        caseSensitive - if false, uppercase and lowercase characters are equivalent.
        accentSensitive - if false, a letter with diacritics is equivalent to the same letter without them; for example, 'é' is equivalent to 'e'.

Method Detail
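public void start(char[] text,
                  int length)
    Description copied from interface: Indexing.WordSieve
    Starts the analysis of a new text chunk.
    Specified by:
        start in interface Indexing.WordSieve
    Parameters:
        text - characters to analyze, indexed from 0 to length - 1
        length - number of characters in the text array

public void start(String text)
    Description copied from interface: Indexing.WordSieve
    Starts the analysis of a new text chunk.
    Specified by:
        start in interface Indexing.WordSieve
    Parameters:
        text - fragment to analyze

public char[] nextWord()
    Description copied from interface: Indexing.WordSieve
    Returns the next normalized word, or null if the end of the fragment to analyze is reached.
    Specified by:
        nextWord in interface Indexing.WordSieve
    See Also:
        Indexing.WordSieve.nextWord()

public boolean isWordStart(char c)
    Description copied from interface: Indexing.WordSieve
    Returns true if the char can be at the start of a word.
    Specified by:
        isWordStart in interface Indexing.WordSieve
    Parameters:
        c - a source character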
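public boolean isWordPart(char c)
    Description copied from interface: Indexing.WordSieve
    Returns true if the char can be part of a word.
    Specified by:
        isWordPart in interface Indexing.WordSieve
    Parameters:
        c - a source character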
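public char multiCharsWildcard()
    Description copied from interface: Indexing.WordSieve
    Returns the wildcard character which matches several characters.
    Specified by:
        multiCharsWildcard in interface Indexing.WordSieve

public char singleCharWildcard()
    Description copied from interface: Indexing.WordSieve
    Returns the wildcard character which matches a single character.
    Specified by:
        singleCharWildcard in interface Indexing.WordSieve

public char mapChar(char c)
    Description copied from interface: Indexing.WordSieve
    Normalizes a character.
    Specified by:
        mapChar in interface Indexing.WordSieve
    Parameters:
        c - a source character, converted to a normalized value in the returned word (for example, converted to uppercase).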
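
The kind of normalization that mapChar and the two constructor flags describe can be illustrated with standard JDK calls. The sketch below is an assumption about one plausible approach, not the Qizx implementation: the class CharFolding and its fold method are hypothetical names, and folding to uppercase simply follows the example given for mapChar.

```java
import java.text.Normalizer;

/** Hypothetical helper, not part of Qizx: folds case and strips diacritics. */
final class CharFolding {
    private CharFolding() {}

    /**
     * Maps a character to a normalized form.
     * Uppercasing mirrors the example in the mapChar description;
     * the real sieve may normalize differently.
     */
    static char fold(char c, boolean caseSensitive, boolean accentSensitive) {
        char result = c;
        if (!accentSensitive) {
            // NFD decomposition splits an accented letter into its base
            // letter followed by one or more combining marks.
            String decomposed =
                Normalizer.normalize(String.valueOf(result), Normalizer.Form.NFD);
            boolean hasCombiningMark = decomposed.length() > 1
                && Character.getType(decomposed.charAt(1)) == Character.NON_SPACING_MARK;
            if (hasCombiningMark) {
                result = decomposed.charAt(0); // keep only the base letter
            }
        }
        if (!caseSensitive) {
            result = Character.toUpperCase(result);
        }
        return result;
    }

    public static void main(String[] args) {
        // U+00E9 is a lowercase e with an acute accent.
        System.out.println(fold('\u00E9', false, false));
        System.out.println(fold('\u00E9', true, false));
    }
}
```

With both flags false, an accented lowercase letter and its plain uppercase counterpart map to the same character, which is the equivalence the constructor parameters describe.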
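public char charAt(int ahead)
    Description copied from interface: Indexing.WordSieve
    Returns the source character at a given position.
    Specified by:
        charAt in interface Indexing.WordSieve
    Parameters:
        ahead - an offset relative to the current position of the sieve in the source text. If equal to 0, returns the character at the current position.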
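
The relative addressing used by charAt, together with the end-of-text convention of nextChar, can be pictured with a small cursor. This is a sketch under assumptions: LookaheadCursor is a hypothetical class, and returning 0 from charAt for an out-of-range position is a guess modeled on nextChar, which the documentation says returns 0 at the end.

```java
/** Hypothetical cursor illustrating relative lookahead; not Qizx code. */
final class LookaheadCursor {
    private final char[] text;
    private final int length;
    private int position = 0;

    LookaheadCursor(String source) {
        this.text = source.toCharArray();
        this.length = text.length;
    }

    /**
     * Returns the character located 'ahead' positions after the current one.
     * Returning 0 outside the text is an assumption of this sketch.
     */
    char charAt(int ahead) {
        int index = position + ahead;
        return (index >= 0 && index < length) ? text[index] : 0;
    }

    /** Moves to the next character and returns it, or 0 at the end. */
    char nextChar() {
        if (position < length) {
            position++;
        }
        return charAt(0);
    }

    public static void main(String[] args) {
        LookaheadCursor cursor = new LookaheadCursor("abc");
        System.out.println(cursor.charAt(0)); // character at the current position
        System.out.println(cursor.charAt(1)); // one character ahead
    }
}
```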
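public char nextChar()
    Description copied from interface: Indexing.WordSieve
    Moves to the next source character and returns it; returns 0 if at end.
    Specified by:
        nextChar in interface Indexing.WordSieve

public int wordOffset()
    Description copied from interface: Indexing.WordSieve
    Returns the offset of the last word returned by nextWord.
    Specified by:
        wordOffset in interface Indexing.WordSieve

public int wordLength()
    Description copied from interface: Indexing.WordSieve
    Returns the original length of the last word returned by nextWord.
    Specified by:
        wordLength in interface Indexing.WordSieve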
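
The calling protocol implied by start, nextWord, wordOffset and wordLength is: start a text chunk, call nextWord until it returns null, and read the offset and original length after each word. The stand-in below shows that loop without depending on the Qizx library. MiniSieve and its describe helper are hypothetical; treating letters and digits as word characters and mapping to uppercase are assumptions of this sketch, not documented behavior of DefaultWordSieve.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a word sieve; does not reproduce Qizx code. */
class MiniSieve {
    private char[] text = new char[0];
    private int length;
    private int position;
    private int lastOffset;
    private int lastLength;

    void start(String source) {
        start(source.toCharArray(), source.length());
    }

    void start(char[] chars, int count) {
        text = chars;
        length = count;
        position = 0;
        lastOffset = 0;
        lastLength = 0;
    }

    // Overridable hooks, mirroring the idea that all methods can be redefined.
    boolean isWordStart(char c) { return Character.isLetterOrDigit(c); }
    boolean isWordPart(char c)  { return Character.isLetterOrDigit(c); }
    char mapChar(char c)        { return Character.toUpperCase(c); }

    /** Returns the next normalized word, or null when the text is exhausted. */
    char[] nextWord() {
        // Skip separators until a character that can start a word.
        while (position < length && !isWordStart(text[position])) {
            position++;
        }
        if (position >= length) {
            return null;
        }
        lastOffset = position;
        StringBuilder word = new StringBuilder();
        word.append(mapChar(text[position++]));
        while (position < length && isWordPart(text[position])) {
            word.append(mapChar(text[position++]));
        }
        lastLength = position - lastOffset; // length in the source text
        return word.toString().toCharArray();
    }

    int wordOffset() { return lastOffset; }
    int wordLength() { return lastLength; }

    /** Runs the typical loop and reports each word as WORD@offset+length. */
    static List<String> describe(String source) {
        MiniSieve sieve = new MiniSieve();
        sieve.start(source);
        List<String> result = new ArrayList<>();
        for (char[] w = sieve.nextWord(); w != null; w = sieve.nextWord()) {
            result.add(new String(w) + "@" + sieve.wordOffset() + "+" + sieve.wordLength());
        }
        return result;
    }
}
```

For example, MiniSieve.describe("Hello, World 42") yields the entries HELLO@0+5, WORLD@7+5 and 42@13+2: the word text is normalized while the offset and length still refer to the original text.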
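protected void setup(boolean caseSensitive,
                     boolean accentSensitive)

public void setParameters(String[] parameters)
    Description copied from interface: Indexing.Sieve
    Defines optional parameters for the sieve.
    Specified by:
        setParameters in interface Indexing.Sieve
    Parameters:
        parameters - an array of even size containing alternately a parameter name and a parameter value.

public Indexing.WordSieve copy()
    Description copied from interface: Indexing.WordSieve
    Creates a carbon copy of this object.
    Specified by:
        copy in interface Indexing.WordSieve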
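
The array layout expected by setParameters (name, value, name, value, ...) can be decoded as shown below. This is only a sketch: SieveParameters is a hypothetical helper, the parameter names "language" and "minLength" are invented for illustration, and rejecting an odd-sized array is an assumption, since the documentation does not say how the sieve reacts to one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: decodes an alternating name/value array. */
final class SieveParameters {
    private SieveParameters() {}

    /** Converts {name0, value0, name1, value1, ...} into an ordered map. */
    static Map<String, String> toMap(String[] parameters) {
        Map<String, String> map = new LinkedHashMap<>();
        if (parameters == null) {
            return map;
        }
        if (parameters.length % 2 != 0) {
            // Assumption of this sketch: an odd-sized array is an error.
            throw new IllegalArgumentException("expected an even number of entries");
        }
        for (int i = 0; i < parameters.length; i += 2) {
            map.put(parameters[i], parameters[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        // Invented parameter names, for illustration only.
        String[] parameters = { "language", "fr", "minLength", "2" };
        System.out.println(toMap(parameters));
    }
}
```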
© 2008 Axyana Software