/// <summary>
/// Creates a new <see cref="Lucene47WordDelimiterFilter"/>.
/// </summary>
/// <param name="in"> <see cref="TokenStream"/> to be filtered </param>
/// <param name="charTypeTable"> table containing character types </param>
/// <param name="configurationFlags"> flags configuring the filter </param>
/// <param name="protWords"> if not null, the set of tokens to protect from being delimited </param>
public Lucene47WordDelimiterFilter(TokenStream @in, byte[] charTypeTable, WordDelimiterFlags configurationFlags, CharArraySet protWords)
    : base(@in)
{
    // Register the token attributes this filter reads and writes.
    this.termAttribute = AddAttribute<ICharTermAttribute>();
    this.offsetAttribute = AddAttribute<IOffsetAttribute>();
    this.posIncAttribute = AddAttribute<IPositionIncrementAttribute>();
    this.typeAttribute = AddAttribute<ITypeAttribute>();

    // Concatenation helpers are bound to this filter instance.
    this.concat = new WordDelimiterConcatenation(this);
    this.concatAll = new WordDelimiterConcatenation(this);

    this.flags = configurationFlags;
    this.protWords = protWords;
    this.iterator = new WordDelimiterIterator(
        charTypeTable,
        Has(WordDelimiterFlags.SPLIT_ON_CASE_CHANGE),
        Has(WordDelimiterFlags.SPLIT_ON_NUMERICS),
        Has(WordDelimiterFlags.STEM_ENGLISH_POSSESSIVE));
}
/// <summary>
/// Creates a new WordDelimiterFilter
/// </summary>
/// <param name="matchVersion"> lucene compatibility version; must be 4.8 or later </param>
/// <param name="in"> TokenStream to be filtered </param>
/// <param name="charTypeTable"> table containing character types </param>
/// <param name="configurationFlags"> Flags configuring the filter </param>
/// <param name="protWords"> If not null is the set of tokens to protect from being delimited </param>
/// <exception cref="System.ArgumentException"> if <paramref name="matchVersion"/> is older than 4.8 </exception>
public WordDelimiterFilter(Version matchVersion, TokenStream @in, sbyte[] charTypeTable, int configurationFlags, CharArraySet protWords)
    : base(@in)
{
    // Converter-generated guard: run deferred field initialization exactly once.
    if (!InstanceFieldsInitialized)
    {
        InitializeInstanceFields();
        InstanceFieldsInitialized = true;
    }
    // Fixed Java-cased call: C# port uses OnOrAfter, not onOrAfter.
    if (!matchVersion.OnOrAfter(Version.LUCENE_48))
    {
        throw new System.ArgumentException("This class only works with Lucene 4.8+. To emulate the old (broken) behavior of WordDelimiterFilter, use Lucene47WordDelimiterFilter");
    }
    this.flags = configurationFlags;
    this.protWords = protWords;
    // Fixed Java-cased call: Has(...) matches the C# member naming used elsewhere in this file.
    this.iterator = new WordDelimiterIterator(charTypeTable, Has(SPLIT_ON_CASE_CHANGE), Has(SPLIT_ON_NUMERICS), Has(STEM_ENGLISH_POSSESSIVE));
}
/// <summary>
/// Creates a new WordDelimiterFilter
/// </summary>
/// <param name="in"> TokenStream to be filtered </param>
/// <param name="charTypeTable"> table containing character types </param>
/// <param name="configurationFlags"> Flags configuring the filter </param>
/// <param name="protWords"> If not null is the set of tokens to protect from being delimited </param>
public Lucene47WordDelimiterFilter(TokenStream @in, sbyte[] charTypeTable, int configurationFlags, CharArraySet protWords)
    : base(@in)
{
    // Register the token attributes this filter reads and writes.
    termAttribute = AddAttribute<ICharTermAttribute>();
    offsetAttribute = AddAttribute<IOffsetAttribute>();
    posIncAttribute = AddAttribute<IPositionIncrementAttribute>();
    typeAttribute = AddAttribute<ITypeAttribute>();
    // Converter-generated guard: run deferred field initialization exactly once.
    if (!InstanceFieldsInitialized)
    {
        InitializeInstanceFields();
        InstanceFieldsInitialized = true;
    }
    this.flags = configurationFlags;
    this.protWords = protWords;
    // Fixed Java-cased calls: Has(...) matches the C# member naming used by the
    // otherwise-identical Lucene47WordDelimiterFilter constructor in this file.
    this.iterator = new WordDelimiterIterator(charTypeTable, Has(SPLIT_ON_CASE_CHANGE), Has(SPLIT_ON_NUMERICS), Has(STEM_ENGLISH_POSSESSIVE));
}
/// <summary>
/// Creates a new WordDelimiterFilter.
/// </summary>
/// <param name="matchVersion"> lucene compatibility version </param>
/// <param name="in"> TokenStream to be filtered </param>
/// <param name="charTypeTable"> table containing character types </param>
/// <param name="configurationFlags"> flags configuring the filter </param>
/// <param name="protWords"> if not null, the set of tokens to protect from being delimited </param>
/// <exception cref="ArgumentException"> if <paramref name="matchVersion"/> is older than 4.8 </exception>
public WordDelimiterFilter(LuceneVersion matchVersion, TokenStream @in, byte[] charTypeTable, WordDelimiterFlags configurationFlags, CharArraySet protWords)
    : base(@in)
{
    // Register the token attributes this filter reads and writes.
    this.termAttribute = AddAttribute<ICharTermAttribute>();
    this.offsetAttribute = AddAttribute<IOffsetAttribute>();
    this.posIncAttribute = AddAttribute<IPositionIncrementAttribute>();
    this.typeAttribute = AddAttribute<ITypeAttribute>();

    // Per-instance helpers, each bound to this filter.
    this.concat = new WordDelimiterConcatenation(this);
    this.concatAll = new WordDelimiterConcatenation(this);
    this.sorter = new OffsetSorter(this);

    if (!matchVersion.OnOrAfter(LuceneVersion.LUCENE_48))
    {
        throw new ArgumentException("This class only works with Lucene 4.8+. To emulate the old (broken) behavior of WordDelimiterFilter, use Lucene47WordDelimiterFilter");
    }

    this.flags = configurationFlags;
    this.protWords = protWords;
    this.iterator = new WordDelimiterIterator(
        charTypeTable,
        Has(WordDelimiterFlags.SPLIT_ON_CASE_CHANGE),
        Has(WordDelimiterFlags.SPLIT_ON_NUMERICS),
        Has(WordDelimiterFlags.STEM_ENGLISH_POSSESSIVE));
}
// Parses a list of MappingCharFilter-style rules ("lhs => rhs") into a custom
// sbyte[] character-type table. Throws ArgumentException on any malformed rule.
private sbyte[] ParseTypes(IEnumerable<string> rules)
{
    // Sorted so the largest mapped character is the last key.
    IDictionary<char, sbyte> typeMap = new SortedDictionary<char, sbyte>();
    foreach (string rule in rules)
    {
        Match match = typePattern.Match(rule);
        if (!match.Success)
        {
            throw new System.ArgumentException("Invalid Mapping Rule : [" + rule + "]");
        }
        string lhs = ParseString(match.Groups[1].Value.Trim());
        sbyte rhs = ParseType(match.Groups[2].Value.Trim());
        if (lhs.Length != 1)
        {
            throw new System.ArgumentException("Invalid Mapping Rule : [" + rule + "]. Only a single character is allowed.");
        }
        if (rhs == WordDelimiterFilter.NOT_SET)
        {
            throw new System.ArgumentException("Invalid Mapping Rule : [" + rule + "]. Illegal type.");
        }
        typeMap[lhs[0]] = rhs;
    }

    // Ensure the table is always at least as big as DEFAULT_WORD_DELIM_TABLE for performance.
    int size = Math.Max(typeMap.Keys.LastOrDefault() + 1, WordDelimiterIterator.DEFAULT_WORD_DELIM_TABLE.Length);
    sbyte[] table = new sbyte[size];

    // Start from the default type for every character, then overlay the custom mappings.
    for (int i = 0; i < table.Length; i++)
    {
        table[i] = WordDelimiterIterator.GetType(i);
    }
    foreach (KeyValuePair<char, sbyte> mapping in typeMap)
    {
        table[mapping.Key] = mapping.Value;
    }
    return table;
}
// Parses a list of MappingCharFilter-style rules ("lhs => rhs") into a custom
// sbyte[] character-type table. Throws ArgumentException on any malformed rule.
// (Converted from the raw Java: Matcher/SortedMap/put/LastKey/EntrySet are not C#.)
private sbyte[] parseTypes(IList<string> rules)
{
    // Sorted so the largest mapped character is simply the last key visited.
    IDictionary<char, sbyte> typeMap = new SortedDictionary<char, sbyte>();
    foreach (string rule in rules)
    {
        // .NET regex: Match/Success replace Java's Matcher.find().
        Match m = typePattern.Match(rule);
        if (!m.Success)
        {
            throw new System.ArgumentException("Invalid Mapping Rule : [" + rule + "]");
        }
        string lhs = parseString(m.Groups[1].Value.Trim());
        sbyte? rhs = parseType(m.Groups[2].Value.Trim());
        if (lhs.Length != 1)
        {
            throw new System.ArgumentException("Invalid Mapping Rule : [" + rule + "]. Only a single character is allowed.");
        }
        if (rhs == null)
        {
            throw new System.ArgumentException("Invalid Mapping Rule : [" + rule + "]. Illegal type.");
        }
        typeMap[lhs[0]] = rhs.Value;
    }

    // ensure the table is always at least as big as DEFAULT_WORD_DELIM_TABLE for performance
    // (maxMapped stays -1 for an empty rule set, yielding the default-sized table
    // instead of Java's NoSuchElementException from lastKey()).
    int maxMapped = -1;
    foreach (char c in typeMap.Keys)
    {
        maxMapped = c; // keys ascend, so the last one visited is the maximum
    }
    sbyte[] types = new sbyte[Math.Max(maxMapped + 1, WordDelimiterIterator.DEFAULT_WORD_DELIM_TABLE.Length)];

    // Fill with defaults, then overlay the custom mappings.
    for (int i = 0; i < types.Length; i++)
    {
        types[i] = WordDelimiterIterator.getType(i);
    }
    foreach (KeyValuePair<char, sbyte> mapping in typeMap)
    {
        types[mapping.Key] = mapping.Value;
    }
    return types;
}
/// <summary>
/// Creates a new WordDelimiterFilter
/// </summary>
/// <param name="matchVersion"> lucene compatibility version; must be 4.8 or later </param>
/// <param name="in"> TokenStream to be filtered </param>
/// <param name="charTypeTable"> table containing character types </param>
/// <param name="configurationFlags"> Flags configuring the filter </param>
/// <param name="protWords"> If not null is the set of tokens to protect from being delimited </param>
/// <exception cref="System.ArgumentException"> if <paramref name="matchVersion"/> is older than 4.8 </exception>
public WordDelimiterFilter(LuceneVersion matchVersion, TokenStream @in, sbyte[] charTypeTable, int configurationFlags, CharArraySet protWords)
    : base(@in)
{
    // Converter-generated guard: run deferred field initialization exactly once.
    if (!InstanceFieldsInitialized)
    {
        InitializeInstanceFields();
        InstanceFieldsInitialized = true;
    }
    // This (fixed) implementation only supports 4.8+ semantics; older callers are
    // directed to Lucene47WordDelimiterFilter, which preserves the legacy behavior.
    if (!matchVersion.OnOrAfter(LuceneVersion.LUCENE_48))
    {
        throw new System.ArgumentException("This class only works with Lucene 4.8+. To emulate the old (broken) behavior of WordDelimiterFilter, use Lucene47WordDelimiterFilter");
    }
    this.flags = configurationFlags;
    this.protWords = protWords;
    // The iterator does the actual subword splitting, driven by the char-type table
    // and the three split-related flags.
    this.iterator = new WordDelimiterIterator(charTypeTable, Has(SPLIT_ON_CASE_CHANGE), Has(SPLIT_ON_NUMERICS), Has(STEM_ENGLISH_POSSESSIVE));
    // Register the token attributes this filter reads and writes.
    this.termAttribute = AddAttribute<ICharTermAttribute>();
    this.offsetAttribute = AddAttribute<IOffsetAttribute>();
    this.posIncAttribute = AddAttribute<IPositionIncrementAttribute>();
    this.typeAttribute = AddAttribute<ITypeAttribute>();
}
/// <summary>
/// Creates a new WordDelimiterFilter
/// </summary>
/// <param name="in"> TokenStream to be filtered </param>
/// <param name="charTypeTable"> table containing character types </param>
/// <param name="configurationFlags"> Flags configuring the filter </param>
/// <param name="protWords"> If not null is the set of tokens to protect from being delimited </param>
public Lucene47WordDelimiterFilter(TokenStream @in, sbyte[] charTypeTable, int configurationFlags, CharArraySet protWords)
    : base(@in)
{
    // Register the token attributes this filter reads and writes.
    termAttribute = AddAttribute<ICharTermAttribute>();
    offsetAttribute = AddAttribute<IOffsetAttribute>();
    posIncAttribute = AddAttribute<IPositionIncrementAttribute>();
    typeAttribute = AddAttribute<ITypeAttribute>();
    // Converter-generated guard: run deferred field initialization exactly once.
    if (!InstanceFieldsInitialized)
    {
        InitializeInstanceFields();
        InstanceFieldsInitialized = true;
    }
    this.flags = configurationFlags;
    this.protWords = protWords;
    // The iterator does the actual subword splitting, driven by the char-type table
    // and the three split-related flags.
    this.iterator = new WordDelimiterIterator(charTypeTable, Has(SPLIT_ON_CASE_CHANGE), Has(SPLIT_ON_NUMERICS), Has(STEM_ENGLISH_POSSESSIVE));
}