// Adds one ListView row per document entry of the given term record.
// A document seen more than once is shown in red and cross-linked in the fourth column.
private void FillList(TermIndexRecord rec, ref int entryCounter, Color backColor)
{
    for (int i = 0; i < rec.DocsNumber; i++, entryCounter++)
    {
        Entry e = rec.GetEntryAt(i);
        ListViewItem item = new ListViewItem();
        item.BackColor = backColor;
        item.Text = entryCounter.ToString();
        item.SubItems.Add(e.DocIndex.ToString());
        item.SubItems.Add(e.Offsets.Length.ToString());

        if (hash.ContainsKey(e.DocIndex))
        {
            // hash stores the row number of the previous occurrence; list indices are one less.
            ListViewItem refItem = listEntries.Items[hash[e.DocIndex] - 1];
            item.ForeColor = refItem.ForeColor = Color.Red;
            ListViewItem.ListViewSubItem prev2 = refItem.SubItems[3];
            prev2.Text = prev2.Text + ">>" + entryCounter.ToString();
            item.SubItems.Add("<<" + hash[e.DocIndex].ToString());
        }
        else
        {
            item.SubItems.Add("");
        }
        hash[e.DocIndex] = entryCounter;

        if (!Core.TextIndexManager.IsDocumentInIndex(e.DocIndex))
        {
            item.Font = delFont;
        }
        listEntries.Items.Add(item);
    }
}

// Loads the term index named by the first argument, writes summary counts
// to lexdump.dat, then sorts that file into lexdump.srt with sort.exe.
public static void Main(string[] Arguments)
{
    TermIndexAccessor TermIndex = new TermIndexAccessor(Arguments[0]);
    StreamWriter writer = new StreamWriter("lexdump.dat", false, System.Text.Encoding.Default);
    int TotalDocEntries = 0, TotalInstances = 0;

    Console.WriteLine("loading...");
    TermIndex.Load();

    Console.WriteLine("dumping...");
    foreach (KeyPair pair in TermIndex.Keys)
    {
        TermIndexRecord record = TermIndex.GetRecordByHandle(pair._offset);
        int InstancesCount = 0;
        TotalDocEntries += record.DocsNumber;
        for (int j = 0; j < record.DocsNumber; j++)
        {
            // Fetch the entry once instead of once per counter.
            int count = record.GetEntryAt(j).Count;
            InstancesCount += count;
            TotalInstances += count;
        }
        /*
         * if( Arguments.Length == 1 || Arguments[ 1 ] == "bydoc" )
         *     writer.WriteLine( "{0,6} {1,8} {2}", record.DocsNumber, InstancesCount, record.Term );
         * else
         *     writer.WriteLine( "{1,8} {0,6} {2}", record.DocsNumber, InstancesCount, record.Term );
         */
    }
    writer.WriteLine("--- 1 Terms number: " + TermIndex.TermsNumber);
    writer.WriteLine("--- 2 Words number: " + TotalInstances);
    writer.WriteLine("--- 3 Entries number: " + TotalDocEntries);
    TermIndex.Close();
    writer.Close();

    Console.WriteLine("sorting...");
    Process process = new Process();
    process.StartInfo.FileName = "sort.exe";
    process.StartInfo.Arguments = " /R /L C lexdump.dat /O lexdump.srt";
    process.StartInfo.WorkingDirectory = ".";
    process.StartInfo.CreateNoWindow = true;
    process.StartInfo.UseShellExecute = false;
    process.Start();
    process.WaitForExit();
    Console.WriteLine("done...");
}

// Validates every entry of every term record; throws FormatException on the first bad entry.
protected void CrossIndexChecks(TermIndexAccessor termIndex)
{
    foreach (KeyPair pair in termIndex.Keys)
    {
        TermIndexRecord termRecord = termIndex.GetRecordByHandle(pair._offset);
        for (int j = 0; j < termRecord.DocsNumber; j++)
        {
            Entry entry = termRecord.GetEntryAt(j);
            // Note: this check accepts -1; only values below -1 are rejected.
            if (entry.DocIndex < -1)
            {
                throw new FormatException("DocIndex is less than -1 in the TermIndex record entry");
            }
            if (entry.Count <= 0)
            {
                throw new FormatException("Number of term instances is not positive in the TermIndex record");
            }
        }
    }
}

// Shows the index entries for the term typed into textTermID.
// The input is treated as a numeric term ID if it parses as one, otherwise as a word.
private void buttonShowContent_Click(object sender, System.EventArgs e)
{
    int termHC;
    int entryCounter = 1;
    listEntries.Items.Clear();
    hash.Clear();

    // TryParse replaces the original try/catch around Int32.Parse.
    if (!Int32.TryParse(textTermID.Text, out termHC))
    {
        // Not a number: normalize the word and look up its term ID.
        // (The original also looked up the raw lowercased text first, but that result was overwritten.)
        LexemeConstructor ctor = new LexemeConstructor(OMEnv.ScriptMorphoAnalyzer, OMEnv.DictionaryServer);
        string normForm = ctor.GetNormalizedToken(textTermID.Text);
        labelNormForm.Text = "Normalized Form: " + normForm;
        termHC = Word.GetTermId(normForm.ToLower());
    }

    if (termHC != -1)
    {
        labelHC.Text = termHC.ToString();
        recMain = (TermIndexRecord)FullTextIndexer.Instance.GetTermRecordMain(termHC);
        recMem = (TermIndexRecord)FullTextIndexer.Instance.GetTermRecordMem(termHC);
        if (recMain != null)
        {
            FillList(recMain, ref entryCounter, Color.LightSkyBlue);
        }
        if (recMem != null)
        {
            FillList(recMem, ref entryCounter, Color.LightYellow);
        }
    }
    else
    {
        MessageBox.Show("No such term in the index");
    }
}