The Stemmer class is a convenient facade for the other stemmer-related classes. The core stemming algorithm and its implementation are taken verbatim from the Egothor project (www.egothor.org).

Although the stemmer tables supplied in the distribution package are built for the Polish language, nothing in the code itself is language-specific.

Example #1
 /// <summary>
 /// Create filter using the supplied stemming table.
 /// </summary>
 /// <param name="in">input token stream</param>
 /// <param name="stemmer">stemmer</param>
 /// <param name="minLength">For performance reasons words shorter than minLength 
 /// characters are not processed, but simply returned.</param>
 public StempelFilter(TokenStream @in, StempelStemmer stemmer, int minLength)
     : base(@in)
 {
     this.stemmer = stemmer;
     this.minLength = minLength;
     this.termAtt = AddAttribute<ICharTermAttribute>();
     this.keywordAtt = AddAttribute<IKeywordAttribute>();
 }
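
The `minLength` parameter documented above can be illustrated with a small standalone sketch. It does not reproduce the actual filter logic: the helper name `StemIfLongEnough`, the strict `<` comparison, the treatment of keyword-flagged terms as untouched, and the toy stemming function are all assumptions made for illustration, and no Lucene.NET types are used.

```csharp
using System;

// Hypothetical helper, not part of StempelFilter: illustrates the minLength rule only.
public static class MinLengthSketch
{
    // Returns the term unchanged when it is flagged as a keyword or is shorter
    // than minLength; otherwise returns whatever the stem function produces.
    public static string StemIfLongEnough(
        string term, bool isKeyword, int minLength, Func<string, string> stem)
    {
        // Assumption: "shorter than minLength" means a strict '<' comparison,
        // and keyword-flagged terms are never stemmed.
        if (isKeyword || term.Length < minLength)
        {
            return term;
        }

        // Assumption: a null result means "no stem found", so keep the original.
        string result = stem(term);
        return result ?? term;
    }

    public static void Main()
    {
        // Toy stand-in for a real stemmer: drops the last character.
        Func<string, string> toyStem = s => s.Substring(0, s.Length - 1);

        Console.WriteLine(StemIfLongEnough("ab", false, 3, toyStem));   // too short: "ab"
        Console.WriteLine(StemIfLongEnough("domy", false, 3, toyStem)); // stemmed: "dom"
        Console.WriteLine(StemIfLongEnough("domy", true, 3, toyStem));  // keyword: "domy"
    }
}
```

Passing the stemmer as a `Func<string, string>` keeps the sketch independent of any table format; a real filter would read and update the term through its attributes instead of returning strings.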
Example #2
 /// <summary>
 /// Create filter using the supplied stemming table and the default
 /// minimum term length (DEFAULT_MIN_LENGTH).
 /// </summary>
 /// <param name="in">input token stream</param>
 /// <param name="stemmer">stemmer</param>
 public StempelFilter(TokenStream @in, StempelStemmer stemmer)
     : this(@in, stemmer, DEFAULT_MIN_LENGTH)
 {
 }