C# (CSharp) Assets.Scripts.IAJ.Unity.DecisionMaking.MCTS Reward 예제들

프로그래밍 언어: C# (CSharp)

네임스페이스/패키지 이름: Assets.Scripts.IAJ.Unity.DecisionMaking.MCTS

클래스/타입: Reward

hotexamples.com에서의 예제들: 2

C# (CSharp) Assets.Scripts.IAJ.Unity.DecisionMaking.MCTS Reward - 2개의 예제가 발견되었습니다. 이것들은 오픈소스 프로젝트에서 추출된 C# (CSharp)의 Assets.Scripts.IAJ.Unity.DecisionMaking.MCTS.Reward에 대한 실세계 최고 등급의 예제들입니다. 예제들을 평가하여 예제의 품질 향상에 도움을 줄 수 있습니다.

자주 사용되는 메소드들

보기 숨기기

GetRewardForNode(3)

예제 #1

파일 보기

        protected override Reward Playout(WorldModel initialPlayoutState)
        {
            WorldModel        model = initialPlayoutState.GenerateChildWorldModel();
            List <GOB.Action> actions;
            List <GOB.Action> executableActions = new List <GOB.Action>();

            GOB.Action nextAction = null;
            Reward     reward     = new Reward();
            double     heuristicValue;
            double     accumulatedHeuristicValue;
            double     bestValue, minValue;
            SortedDictionary <double, GOB.Action> heuristicList = new SortedDictionary <double, GOB.Action>();

            actions = model.GetActions();

            while (!model.IsTerminal())
            {
                heuristicList.Clear();
                executableActions.Clear();
                heuristicValue            = 0;
                accumulatedHeuristicValue = 0;

                bestValue = -1;
                minValue  = float.MaxValue;

                if (actions.Count == 0)
                {
                    break;
                }

                foreach (GOB.Action action in actions)
                {
                    if (action.CanExecute(model))
                    {
                        accumulatedHeuristicValue += Math.Pow(Math.E, action.H(model));
                        executableActions.Add(action);
                    }
                }

                foreach (GOB.Action action in executableActions)
                {
                    heuristicValue = Math.Pow(Math.E, action.H(model)) / accumulatedHeuristicValue;

                    if (!heuristicList.ContainsKey(heuristicValue))
                    {
                        heuristicList.Add(heuristicValue, action);
                    }

                    if (heuristicValue > bestValue)
                    {
                        bestValue = heuristicValue;
                    }
                    if (heuristicValue < minValue)
                    {
                        minValue = heuristicValue;
                    }
                }

                double randomNumber = GetRandomNumber(minValue, bestValue);

                foreach (KeyValuePair <double, GOB.Action> actionHeuristic in heuristicList)
                {
                    if (actionHeuristic.Key >= randomNumber)
                    {
                        nextAction = actionHeuristic.Value;
                        break;
                    }
                }

                if (nextAction == null)
                {
                    break;
                }

                nextAction.ApplyActionEffects(model);
                model.CalculateNextPlayer();
            }

            reward.PlayerID = model.GetNextPlayer();
            reward.Value    = model.GetScore();
            return(reward);
        }

예제 #2

파일 보기

파일: MCTS.cs 프로젝트: hlferreira/IAJ

 protected virtual void Backpropagate(MCTSNode node, Reward reward)
 {
     //TODO: implement
     throw new NotImplementedException();
 }