public void ProcessDataThroughTree(
    DataPointManager dataPointMgr,
    GeneticAlgorithmRunResults results,
    IEnumerable<DataPoint> dataPoints)
{
    // Reset traverse counts.
    // Double check on traverse count.
    // TODO: clean this up to determine which one is failing.
    _nodes.Clear();

    // Walk the tree iteratively (depth-first) to reset per-node state.
    Stack<TreeNode> nodes_to_process = new Stack<TreeNode>();
    nodes_to_process.Push(_root);
    while (nodes_to_process.Count > 0)
    {
        TreeNode node = nodes_to_process.Pop();

        // TODO: determine why this line will fail at times... seems to be related to root nodes.
        _nodes.Add(node);
        node._traverseCount = 0;
        node.matrix = new ConfusionMatrix(dataPointMgr.classes.Length);

        foreach (var subNode in node._subNodes)
        {
            nodes_to_process.Push(subNode);
        }
    }

    // Run every data point through the tree.
    foreach (var dataPoint in dataPoints)
    {
        results.count_allData++;
        TraverseData(dataPoint, results);
    }

    // At this point we have the results for the run-through; determine a score.
    results.ProcessScoresAfterTraverse();
    results.tree_nodeCount = _nodes.Count;
    results._matrix = this._root.matrix;

    // Store the results for future use.
    _currentResults = results;
}

public static TreeTest TreeTestFactory(DataPointManager dataPointMgr, Random rando)
{
    // TODO: this should take a ga_mgr instead of the parts.
    // TODO: clean up this mess once the DataColumns quit using the TYPE part.
    TreeTest output;

    // Pick a random column to test against.
    var col_param = rando.Next(dataPointMgr._columns.Count);
    DataColumn column = dataPointMgr._columns[col_param];

    switch (column._type)
    {
        case DataColumn.DataValueTypes.NUMBER:
            LessThanEqualTreeTest test = new LessThanEqualTreeTest();
            test.param = col_param;
            test.valTest = column.GetTestValue(rando);
            output = test;
            break;

        case DataColumn.DataValueTypes.CATEGORY:
            // The category list and count are computed here but not used yet.
            var cat_column = column as CategoryDataColumn;
            var categories = cat_column._codebook.GetCategories();
            var category_count = categories.Count();

            // Toss a coin to decide on subsetter.
            EqualTreeTest test_eq = new EqualTreeTest();
            test_eq._param = col_param;
            test_eq._valTest = column.GetTestValue(rando);
            output = test_eq;
            break;

        default:
            throw new ArgumentOutOfRangeException();
    }

    output._testCol = column;
    return output;
}

public void CreateValues(DataPointManager dp_mgr)
{
    foreach (var baseValue in _baseColumn._values)
    {
        DataValue dv_new = new DataValue();

        switch (this._formula)
        {
            case FormulaOptions.LN:
                // The logarithm is only defined for strictly positive inputs.
                if (baseValue._value > 0)
                {
                    dv_new._value = Math.Log(baseValue._value);
                }
                else
                {
                    dv_new._isMissing = true;
                    this._hasMissingValues = true;
                }
                break;

            case FormulaOptions.TANH:
                dv_new._value = Math.Tanh(baseValue._value);
                break;

            case FormulaOptions.SQRT:
                // The square root is defined at zero, so only negatives are missing.
                if (baseValue._value >= 0)
                {
                    dv_new._value = Math.Sqrt(baseValue._value);
                }
                else
                {
                    dv_new._isMissing = true;
                    this._hasMissingValues = true;
                }
                break;

            case FormulaOptions.SQR:
                dv_new._value = Math.Pow(baseValue._value, 2);
                break;

            case FormulaOptions.INV:
                // Guard against division by zero.
                if (baseValue._value != 0.0)
                {
                    dv_new._value = 1 / baseValue._value;
                }
                else
                {
                    dv_new._isMissing = true;
                    this._hasMissingValues = true;
                }
                break;

            case FormulaOptions.NONE:
                dv_new._value = baseValue._value;
                break;

            default:
                throw new ArgumentOutOfRangeException();
        }

        dv_new._value *= _scaling;
        this._values.Add(dv_new);
    }

    // Pair each original DataPoint with its derived value and attach it.
    // Zip is lazy, so the side effect lives in the loop body rather than in the selector.
    var pairs = dp_mgr._dataPoints.Zip(this._values, (dp, dv) => new { dp, dv });
    foreach (var pair in pairs)
    {
        pair.dp._data.Add(pair.dv);
    }

    this.ProcessRanges();
}