private void Align(List<string> database, List<string> query, params int[] result)
{
    var aligner = new LongTextAligner(database, 1);
    int[] alignment = aligner.Align(query);
    Assert.IsTrue(Helper.Contains(Utilities.AsList(alignment), result));
}
public void TextAligner_SmallAlign()
{
    // Maps each query to the indices expected in its alignment (-1 = unaligned).
    var wordList = new Dictionary<string[], int[]>
    {
        { new[] { "foo", "foo" }, new[] { -1, -1 } },
        { new[] { "foo", "baz" }, new[] { 2, 3 } },
        { new[] { "foo", "bar", "foo", "bar", "baz", "42" }, new[] { 0, 1, 2, 4, 5, 6 } },
        { new[] { "foo", "bar", "foo", "baz", "bar" }, new[] { 0, 1, 2, 3, 4 } }
    };

    foreach (var item in wordList)
    {
        int[] alignment = _aligner.Align(item.Key.ToList());
        Assert.IsTrue(Helper.Contains(Utilities.AsList(alignment), item.Value));
    }
}
public List<WordResult> Align(FileInfo audioUrl, List<string> sentenceTranscript)
{
    var transcript = SentenceToWords(sentenceTranscript);
    var aligner = new LongTextAligner(transcript, TupleSize);
    var alignedWords = new Dictionary<int, WordResult>();

    // Parallel work queues: entry k of each holds one pending range, its words, and its time frame.
    var ranges = new LinkedList<Range>();
    //var texts = new ArrayDeque();
    //var timeFrames = new ArrayDeque();
    var texts = new LinkedList<List<string>>();
    var timeFrames = new LinkedList<TimeFrame>();

    ranges.AddLast(new Range(0, transcript.Count));
    texts.Offer(transcript);
    TimeFrame totalTimeFrame = TimeFrame.Infinite;
    timeFrames.Offer(totalTimeFrame);
    long lastFrame = TimeFrame.Infinite.End;

    // Passes 0-2 decode with the language model; pass 3 switches to the grammar-based aligner.
    for (int i = 0; i < 4; i++)
    {
        if (i == 3)
        {
            _context.SetLocalProperty("decoder->searchManager", "alignerSearchManager");
        }

        while (texts.Count != 0)
        {
            Debug.Assert(texts.Count == ranges.Count);
            Debug.Assert(texts.Count == timeFrames.Count);

            var text = texts.Poll();
            var frame = timeFrames.Poll();
            var range = ranges.Poll();

            // NOTE: this compares the number of texts still queued (not the length
            // of `text`) against MinLmAlignSize.
            if (i < 3 && texts.Count < MinLmAlignSize)
            {
                continue;
            }

            // Join the words explicitly; concatenating a List<string> would log only its type name.
            this.LogInfo("Aligning frame " + frame + " to text " + string.Join(" ", text) + " range " + range);

            if (i < 3)
            {
                _languageModel.SetText(text);
            }

            _recognizer.Allocate();

            if (i == 3)
            {
                _grammar.SetWords(text);
            }

            _context.SetSpeechSource(audioUrl.OpenRead(), frame);

            // Collect the timed words from every result the recognizer produces for this frame.
            var hypothesis = new List<WordResult>();
            Result speechResult;
            while (null != (speechResult = _recognizer.Recognize()))
            {
                hypothesis.AddRange(speechResult.GetTimedBestResult(false));
            }

            // The first pass covers the whole audio, so its last word marks the end of speech.
            if (i == 0 && hypothesis.Count > 0)
            {
                lastFrame = hypothesis[hypothesis.Count - 1].TimeFrame.End;
            }

            var words = new List<string>();
            foreach (WordResult wr in hypothesis)
            {
                words.Add(wr.Word.Spelling);
            }

            int[] alignment = aligner.Align(words, range);
            List<WordResult> results = hypothesis;
            this.LogInfo("Decoding result is " + string.Join(" ", results));

            // dumpAlignment(transcript, alignment, results);
            DumpAlignmentStats(transcript, alignment, results);

            // Record each hypothesis word that matched a transcript position (-1 = no match).
            for (int j = 0; j < alignment.Length; j++)
            {
                if (alignment[j] != -1)
                {
                    alignedWords.Add(alignment[j], hypothesis[j]);
                }
            }

            _recognizer.Deallocate();
        }

        ScheduleNextAlignment(transcript, alignedWords, ranges, texts, timeFrames, lastFrame);
    }

    return new List<WordResult>(alignedWords.Values);
}