C# (CSharp) CrawlWave.ServerPlugins.WordExtraction WordsCache.AddStemmedWord Examples

Programming language: C# (CSharp)

Namespace/package name: CrawlWave.ServerPlugins.WordExtraction

Class/type: WordsCache

Method/function: AddStemmedWord

Examples at hotexamples.com: 1

C# (CSharp) CrawlWave.ServerPlugins.WordExtraction WordsCache.AddStemmedWord - 1 example found. This is the top-rated real-world C# (CSharp) example of CrawlWave.ServerPlugins.WordExtraction.WordsCache.AddStemmedWord, extracted from open source projects.

Example #1

File: WordExtractorPlugin.cs Project: tmzani/CrawlWave

 /// <summary>
 /// Extracts the words found in the contents of a document. Used by DBUpdater when
 /// a document is stored in the database in order to extract the words it contains
 /// and add them to the database at the same time.
 /// </summary>
 /// <param name="data">The <see cref="UrlCrawlData"/> to be processed.</param>
 public void ExtractWords(ref UrlCrawlData data)
 {
     //First try to extract the words from the document. If something goes wrong just
     //return, otherwise add the words to the cache, remove any old words related to
     //the url with this id from the database and store the new url-words.
     try
     {
         SortedList words = wordExtractor.ExtractWords(data.Data);
         if (words.Count == 0)
         {
             return;
         }
         //add all the words to the database if they don't exist already
         string word       = String.Empty;
         short  word_count = 0;
         int    word_id    = -1;
         foreach (DictionaryEntry de in words)
         {
             word = (string)de.Key;
             cache.AddStemmedWord(word);
         }
         //remove all the old words related to this url from the database
         RemoveUrlWords(data.ID);
         //now add relationships between the url and its words
         foreach (DictionaryEntry d in words)
         {
             word       = (string)d.Key;
             word_count = (short)d.Value;
             word_id    = cache[word];
             AddUrlWord(data.ID, word_id, word_count);
         }
         UpdateUrlDataLastProcess(data.ID);
     }
     catch (Exception e)
     {
         events.Enqueue(new EventLoggerEntry(CWLoggerEntryType.Warning, DateTime.Now, "WordExtractionPlugin failed to extract words from Url with ID " + data.ID.ToString() + ": " + e.ToString()));
     }
 }
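
The example above calls cache.AddStemmedWord(word) for every extracted word and only afterwards reads cache[word] to obtain a word id. The WordsCache class itself is not part of the excerpt, so the following is only a minimal sketch of that two-pass pattern under stated assumptions: SketchWordsCache, its in-memory dictionary, the sequential ids, and the -1 result for unknown words are all hypothetical stand-ins and are not CrawlWave's actual implementation, which presumably also stores words in the database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for CrawlWave's WordsCache. The real class is not shown
// in the excerpt; this version keeps everything in memory.
public sealed class SketchWordsCache
{
    private readonly Dictionary<string, int> ids = new Dictionary<string, int>();
    private int nextId = 1;

    // Registers a word if it is not cached yet; repeated calls change nothing.
    public void AddStemmedWord(string word)
    {
        if (string.IsNullOrEmpty(word))
        {
            return;
        }
        if (!ids.ContainsKey(word))
        {
            ids[word] = nextId++;
        }
    }

    // Returns the id of a cached word, or -1 if the word is unknown
    // (the -1 convention is an assumption made for this sketch).
    public int this[string word]
    {
        get
        {
            int id;
            return ids.TryGetValue(word, out id) ? id : -1;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var cache = new SketchWordsCache();
        // word -> occurrence count, similar in shape to what a word extractor returns
        var words = new SortedList<string, short> { { "crawl", 3 }, { "wave", 1 } };

        // Pass 1: make sure every word has an id.
        foreach (KeyValuePair<string, short> entry in words)
        {
            cache.AddStemmedWord(entry.Key);
        }

        // Pass 2: resolve the ids and print one (word, id, count) line per word.
        foreach (KeyValuePair<string, short> entry in words)
        {
            Console.WriteLine("{0} -> id {1}, count {2}", entry.Key, cache[entry.Key], entry.Value);
        }
    }
}
```

Splitting the work into two passes means every id lookup in the second loop can succeed, because all words were registered before any url-word relationship is written. The sketch uses the generic SortedList<string, short>, which avoids the casts from object that the non-generic SortedList in the original requires.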