C# (CSharp) IStopWordFilter.IsStopWord Examples

Programming language: C# (CSharp)

Class/Type: IStopWordFilter

Method/Function: IsStopWord

Examples on hotexamples.com: 3

C# (CSharp) IStopWordFilter.IsStopWord - 3 examples found. These are the top-rated real-world C# (CSharp) examples of IStopWordFilter.IsStopWord, extracted from open source projects.

Frequently used methods

IsStopWord(3)

IsPunctuation(1)

Example #1

File: TextRankSummarizer.cs, Project: regme/Korzh.NLP

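        // Requires System.Collections.Generic and System.Linq (Concat, Distinct, ToList).
        // Builds word-count vectors for two tokenized sentences, skipping stop words,
        // and compares them with Utils.CosineSimilarity.
        private double SentanceSimilarity(IList<string> sentance1, IList<string> sentance2)
        {
            // Shared vocabulary: every distinct word that occurs in either sentence.
            var allWords = sentance1.Concat(sentance2).Distinct().ToList();

            // One component per vocabulary word.
            var v1 = new DenseVector(allWords.Count);
            var v2 = new DenseVector(allWords.Count);

            // Count the non-stop-word occurrences in the first sentence.
            foreach (var word in sentance1)
            {
                if (_stopWordFilter.IsStopWord(word))
                {
                    continue;
                }

                var index = allWords.IndexOf(word);
                v1[index] += 1;
            }

            // Count the non-stop-word occurrences in the second sentence.
            foreach (var word in sentance2)
            {
                if (_stopWordFilter.IsStopWord(word))
                {
                    continue;
                }

                var index = allWords.IndexOf(word);
                v2[index] += 1;
            }

            // Note: the result is 1 minus the value returned by Utils.CosineSimilarity.
            return 1 - Utils.CosineSimilarity(v1, v2);
        }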
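The method above depends on `DenseVector` and `Utils.CosineSimilarity`, neither of which is shown on this page. As a rough, dependency-free sketch of the same idea, the block below counts non-stop words in dictionaries and computes 1 minus the cosine similarity directly. It assumes `Utils.CosineSimilarity` follows the standard definition (dot product divided by the product of the norms). The `IStopWordFilter` interface shown here, `SetStopWordFilter`, and `SentenceDistance` are hypothetical names introduced for illustration, not the project's types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the project's interface; only IsStopWord appears in the example above.
public interface IStopWordFilter
{
    bool IsStopWord(string word);
}

// Hypothetical filter backed by a hash set.
public sealed class SetStopWordFilter : IStopWordFilter
{
    private readonly HashSet<string> _words;

    public SetStopWordFilter(IEnumerable<string> words)
    {
        // Case-insensitive matching is a choice made for this sketch.
        _words = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
    }

    public bool IsStopWord(string word) => _words.Contains(word);
}

public static class SentenceDistance
{
    // Returns 1 - cosine similarity of the stop-word-filtered word counts.
    public static double Compute(IList<string> s1, IList<string> s2, IStopWordFilter filter)
    {
        var c1 = Count(s1, filter);
        var c2 = Count(s2, filter);

        // Dot product over the words that occur in both sentences.
        double dot = 0;
        foreach (var pair in c1)
        {
            if (c2.TryGetValue(pair.Key, out var other))
            {
                dot += pair.Value * other;
            }
        }

        double norm1 = Math.Sqrt(c1.Values.Sum(v => (double)v * v));
        double norm2 = Math.Sqrt(c2.Values.Sum(v => (double)v * v));

        // Assumption of this sketch: a sentence made only of stop words is maximally distant.
        if (norm1 == 0 || norm2 == 0)
        {
            return 1.0;
        }

        return 1.0 - dot / (norm1 * norm2);
    }

    private static Dictionary<string, int> Count(IEnumerable<string> words, IStopWordFilter filter)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            if (filter.IsStopWord(word))
            {
                continue;
            }

            counts.TryGetValue(word, out var n);
            counts[word] = n + 1;
        }

        return counts;
    }
}
```

With a filter containing "the" and "a", the sentences `the cat sat` and `a cat sat` have identical content words, so the result is 0 up to floating-point rounding. Explicit handling of the all-stop-words case is a design choice of this sketch; how the original behaves there depends on `Utils.CosineSimilarity`.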

Example #2

File: KeywordExtractor.cs, Project: regme/Korzh.NLP

        public KeywordExtractor(INlpServiceProvider nlpServices, string lang)
        {
            // Resolve the language-specific stop word filter and stemmer.
            _stopWordFilter = nlpServices.GetStopWordFilter(lang);
            _wordStemmer = nlpServices.GetStemmer(lang);

            // Predicate: keep only words that are not stop words.
            _filter = word => !_stopWordFilter.IsStopWord(word);

            // Mapping: reduce each word to its stem.
            _mapper = word => _wordStemmer.Stem(word);
        }
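The constructor only stores the two delegates; the snippet does not show how they are consumed. One plausible use, sketched below, is a filter-then-map pipeline over a token sequence. The delegate types (`Func<string, bool>` and `Func<string, string>`), the `KeywordPipeline` class, and the toy stemmer that lower-cases a word and trims a trailing "s" are all assumptions made for illustration, not part of Korzh.NLP.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeywordPipeline
{
    // Drops tokens rejected by `filter`, then transforms each remaining token with `mapper`.
    public static List<string> Apply(
        IEnumerable<string> tokens,
        Func<string, bool> filter,
        Func<string, string> mapper)
    {
        return tokens.Where(filter).Select(mapper).ToList();
    }
}

public static class KeywordPipelineDemo
{
    public static List<string> Run()
    {
        // Hypothetical stop word list standing in for a real IStopWordFilter.
        var stopWords = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "the", "of", "and" };
        Func<string, bool> filter = word => !stopWords.Contains(word);

        // Toy stand-in for a stemmer: lower-case the word and trim trailing 's' characters.
        Func<string, string> mapper = word => word.ToLowerInvariant().TrimEnd('s');

        // "The" and "of" are dropped; the result is ["extraction", "keyword"].
        return KeywordPipeline.Apply(new[] { "The", "Extraction", "of", "Keywords" }, filter, mapper);
    }
}
```

Filtering before mapping means the stop word check sees the original surface form rather than the stem, which matches the order suggested by the field names in the constructor.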

Example #3

File: KeywordExtractor.cs, Project: ioncakephper/NRakePhraseExtractor

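        /// <summary>
        /// Splits a token sequence into candidate phrases at every punctuation mark or stop word.
        /// Note: this method has side effects. In addition to returning the array of phrases,
        /// it rebuilds the internal index of unique words.
        /// </summary>
        /// <param name="tokens">The tokens of the text, in document order.</param>
        /// <returns>The runs of consecutive non-stop-word tokens, each joined with single spaces.</returns>
        public string[] ToPhrases(string[] tokens)
        {
            _uniqueWords = new SortedSet<string>();
            List<string> phrases = new List<string>();

            // The phrase currently being accumulated.
            string current = string.Empty;

            foreach (string t in tokens)
            {
                if (_stopWords.IsPunctuation(t) || _stopWords.IsStopWord(t))
                {
                    // The delimiter itself is discarded; it only closes the current phrase.
                    if (current.Length > 0)
                    {
                        phrases.Add(current);
                        current = string.Empty;
                    }
                }
                else
                {
                    _uniqueWords.Add(t);
                    if (current.Length == 0)
                    {
                        current = t;
                    }
                    else
                    {
                        current += " " + t;
                    }
                }
            }

            // Emit the trailing phrase if the text did not end with a delimiter.
            if (current.Length > 0)
            {
                phrases.Add(current);
            }

            return phrases.ToArray();
        }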
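To make the splitting behaviour concrete, here is a standalone sketch of the same loop with a worked input. It is a simplified variant rather than the project's code: the delimiter test is passed in as a delegate instead of using `_stopWords`, the unique-word index is left out, and `PhraseSplitter` and the stop word list are hypothetical names and data chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PhraseSplitter
{
    // Splits `tokens` into phrases at every token for which `isDelimiter` returns true.
    public static string[] Split(IEnumerable<string> tokens, Func<string, bool> isDelimiter)
    {
        var phrases = new List<string>();
        var current = new List<string>();

        foreach (var token in tokens)
        {
            if (isDelimiter(token))
            {
                // The delimiter is dropped and closes the phrase collected so far.
                Flush(current, phrases);
            }
            else
            {
                current.Add(token);
            }
        }

        // Emit the trailing phrase, if any.
        Flush(current, phrases);
        return phrases.ToArray();
    }

    private static void Flush(List<string> current, List<string> phrases)
    {
        if (current.Count == 0)
        {
            return;
        }

        phrases.Add(string.Join(" ", current));
        current.Clear();
    }
}

public static class PhraseSplitterDemo
{
    public static string[] Run()
    {
        // Hypothetical stop word list; a comma is treated as punctuation.
        var stopWords = new HashSet<string> { "of", "and", "the" };
        var tokens = new[]
        {
            "compatibility", "of", "systems", "of", "linear", "constraints", ",", "and", "the", "set"
        };

        // Result: "compatibility", "systems", "linear constraints", "set"
        return PhraseSplitter.Split(tokens, t => stopWords.Contains(t) || t == ",");
    }
}
```

Consecutive delimiters (here `,`, `and`, `the`) produce no empty phrases, because a phrase is only emitted when at least one word has been collected.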