public PosTaggerProcessor(
    PosTaggerProcessorConfig config,
    IMorphoModel morphoModel,
    MorphoAmbiguityResolverModel morphoAmbiguityModel)
{
    CheckConfig(config, morphoModel, morphoAmbiguityModel);

    _tokenizer = new Tokenizer(config.TokenizerConfig);
    _words = new List<Word>(DEFAULT_WORDSLIST_CAPACITY);
    _posTaggerScriber = PosTaggerScriber.Create(config.ModelFilename, config.TemplateFilename);
    _posTaggerPreMerging = new PosTaggerPreMerging(config.Model);
    _posTaggerMorphoAnalyzer = new PosTaggerMorphoAnalyzer(morphoModel, morphoAmbiguityModel);

    // Create the tokenizer callback delegates once and keep them in fields.
    _processSentCallback1Delegate = new Tokenizer.ProcessSentCallbackDelegate(ProcessSentCallback1);
    _processSentCallback2Delegate = new Tokenizer.ProcessSentCallbackDelegate(ProcessSentCallback2);
}

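public void SetWordMorphologyAsUndefined()
{
    var wma = morphoAmbiguityTuples[0];

    if (!wma.wordFormMorphology.IsEmpty())
    {
        // Reuse the morphology of the first ambiguity tuple when it is not empty.
        word.morphology = wma.wordFormMorphology;
    }
    else
    {
        // Otherwise build a morphology from the part of speech mapped from the
        // POS tagger output type (default value if there is no mapping).
        var partOfSpeech = PosTaggerMorphoAnalyzer.ToPartOfSpeech(word.posTaggerOutputType).GetValueOrDefault();
        word.morphology = new WordFormMorphology(partOfSpeech);
    }

    // Replace all ambiguity tuples with a single tuple for the chosen morphology.
    morphoAmbiguityTuples.Clear();
    morphoAmbiguityTuples.Add(new MorphoAmbiguityTuple(word, word.morphology, wma.punctuationType));
}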