C# (CSharp) IExplorationPolicy.ChooseAction Examples

Programming language: C# (CSharp)

Class/Type: IExplorationPolicy

Method/Function: ChooseAction

Examples at hotexamples.com: 6

C# (CSharp) IExplorationPolicy.ChooseAction - 6 examples found. These are the top-rated real-world C# (CSharp) examples of IExplorationPolicy.ChooseAction, extracted from open source projects.

Example #1

File: TabuSearchExploration.cs Project: Webreaper/Damselfly

        /// <summary>
        /// Choose an action.
        /// </summary>
        ///
        /// <param name="actionEstimates">Action estimates.</param>
        ///
        /// <returns>Returns selected action.</returns>
        ///
        /// <remarks>The method chooses an action depending on the provided estimates. The
        /// estimates can be any sort of estimate that values the usefulness of an action
        /// (expected summary reward, discounted reward, etc). The action is chosen from
        /// non-tabu actions only.</remarks>
        ///
        public int ChooseAction(double[] actionEstimates)
        {
            // count the non-tabu actions
            int nonTabuActions = actions;

            for (int i = 0; i < actions; i++)
            {
                if (tabuActions[i] != 0)
                {
                    nonTabuActions--;
                }
            }

            // allowed actions
            double[] allowedActionEstimates = new double[nonTabuActions];
            int[]    allowedActionMap       = new int[nonTabuActions];

            for (int i = 0, j = 0; i < actions; i++)
            {
                if (tabuActions[i] == 0)
                {
                    // allowed action
                    allowedActionEstimates[j] = actionEstimates[i];
                    allowedActionMap[j]       = i;
                    j++;
                }
                else
                {
                    // decrease tabu time of tabu action
                    tabuActions[i]--;
                }
            }

            return(allowedActionMap[basePolicy.ChooseAction(allowedActionEstimates)]);
        }
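
The filtering step in the example above can be tried in isolation. The following is a minimal sketch, not the project's code: the `IExplorationPolicy` interface here is a local stand-in with the same `ChooseAction(double[])` signature, and `GreedyPolicy` and `TabuSketch.ChooseNonTabu` are hypothetical names introduced for illustration. Unlike the example, the sketch does not decrement the tabu counters, and it throws when every action is tabu.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in for the interface used in the example above (assumption).
public interface IExplorationPolicy
{
    int ChooseAction(double[] actionEstimates);
}

// Hypothetical base policy: picks the highest estimate, first one wins on ties.
public sealed class GreedyPolicy : IExplorationPolicy
{
    public int ChooseAction(double[] actionEstimates)
    {
        int best = 0;
        for (int i = 1; i < actionEstimates.Length; i++)
        {
            if (actionEstimates[i] > actionEstimates[best])
            {
                best = i;
            }
        }
        return best;
    }
}

public static class TabuSketch
{
    // Chooses among actions whose tabu counter is zero and maps the
    // base policy's choice back to the original action index.
    public static int ChooseNonTabu(double[] estimates, int[] tabuTimes, IExplorationPolicy basePolicy)
    {
        var allowedEstimates = new List<double>();
        var allowedMap = new List<int>();

        for (int i = 0; i < estimates.Length; i++)
        {
            if (tabuTimes[i] == 0)
            {
                allowedEstimates.Add(estimates[i]);
                allowedMap.Add(i);
            }
        }

        if (allowedMap.Count == 0)
        {
            throw new InvalidOperationException("All actions are tabu.");
        }

        return allowedMap[basePolicy.ChooseAction(allowedEstimates.ToArray())];
    }
}
```

With estimates {0.9, 0.5, 0.7} and tabu times {2, 0, 0}, action 0 is excluded even though it has the highest estimate, and the greedy pick among the remaining actions is action 2.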

Example #2

File: DoubleQLearning.cs Project: HimmaTong/ToolGood.MazeGame

 /// <summary>
 /// Get next action from the specified state.
 /// </summary>
 ///
 /// <param name="state">Current state to get an action for.</param>
 ///
 /// <returns>Returns the action for the state.</returns>
 ///
 /// <remarks>The method returns an action according to the current
 /// <see cref="ExplorationPolicy">exploration policy</see>.</remarks>
 ///
 public int GetAction(int state)
 {
     double[] qs = new double[actions];
     for (int i = 0; i < actions; i++)
     {
         qs[i] = (qvalues[state][i] + qvalues2[state][i]) / 2;
     }
     return(explorationPolicy.ChooseAction(qs));
 }
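
The example above averages the two Q-tables element by element before handing the result to the exploration policy. A minimal sketch of just that averaging step, assuming two equally sized rows for one state; `DoubleQSketch` and `AverageEstimates` are hypothetical names, not part of the project:

```csharp
using System;

public static class DoubleQSketch
{
    // Element-wise average of two Q-value rows that belong to the same state.
    public static double[] AverageEstimates(double[] q1, double[] q2)
    {
        if (q1.Length != q2.Length)
        {
            throw new ArgumentException("Both rows must have the same number of actions.");
        }

        var result = new double[q1.Length];
        for (int i = 0; i < q1.Length; i++)
        {
            result[i] = (q1[i] + q2[i]) / 2;
        }
        return result;
    }
}
```

For rows {1.0, 3.0} and {3.0, 5.0} the combined estimates are {2.0, 4.0}, so a greedy policy would pick action 1.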

Example #3

        /// <summary>
        /// Get next action from the specified state.
        /// </summary>
        ///
        /// <param name="state">Current state to get an action for.</param>
        ///
        /// <returns>Returns the action for the state.</returns>
        ///
        /// <remarks>The method returns an action according to the current
        /// <see cref="ExplorationPolicy">exploration policy</see>.</remarks>
        ///
        public int GetAction(int state)
        {
            // The estimates for this state go straight to the exploration policy.
            // (The original also computed the maximum estimate but never used it.)
            double[] nextActionEstimations = qvalues[state];

            return(explorationPolicy.ChooseAction(nextActionEstimations));
        }

Example #4

 /// <summary>
 /// Get next action from the specified state.
 /// </summary>
 ///
 /// <param name="state">Current state to get an action for.</param>
 ///
 /// <returns>Returns the action for the state.</returns>
 ///
 /// <remarks>The method returns an action according to current
 /// <see cref="ExplorationPolicy">exploration policy</see>.</remarks>
 ///
 public int GetAction(int state)
 {
     return(explorationPolicy.ChooseAction(qvalues[state]));
 }
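
In examples like the one above, `GetAction` only forwards the state's estimates, so all of the selection logic lives in the policy object. The following is a minimal sketch of what such a policy might look like, assuming an epsilon-greedy rule that explores uniformly over all actions. The interface is a local stand-in with the same `ChooseAction(double[])` signature, and `EpsilonGreedySketch` is a hypothetical class, not the implementation used by any of the projects listed here.

```csharp
using System;

// Local stand-in for the interface the examples call (assumption).
public interface IExplorationPolicy
{
    int ChooseAction(double[] actionEstimates);
}

// Hypothetical policy: with probability epsilon pick a uniformly random
// action, otherwise pick the action with the highest estimate.
public sealed class EpsilonGreedySketch : IExplorationPolicy
{
    private readonly double epsilon;
    private readonly Random random;

    public EpsilonGreedySketch(double epsilon, Random random)
    {
        if (epsilon < 0.0 || epsilon > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(epsilon));
        }
        this.epsilon = epsilon;
        this.random = random;
    }

    public int ChooseAction(double[] actionEstimates)
    {
        // Explore: NextDouble() is in [0, 1), so epsilon = 0 never explores.
        if (random.NextDouble() < epsilon)
        {
            return random.Next(actionEstimates.Length);
        }

        // Exploit: index of the largest estimate, first one wins on ties.
        int best = 0;
        for (int i = 1; i < actionEstimates.Length; i++)
        {
            if (actionEstimates[i] > actionEstimates[best])
            {
                best = i;
            }
        }
        return best;
    }
}
```

Passing the `Random` instance in through the constructor keeps the sketch easy to test: with `epsilon` set to 0 the policy is fully greedy, so estimates {0.1, 0.8, 0.3} always yield action 1.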

Example #5

File: InfiniteQLearning.cs Project: Webreaper/Damselfly

 /// <summary>
 /// Get next action from the specified state.
 /// </summary>
 ///
 /// <param name="state">Current state to get an action for.</param>
 ///
 /// <returns>Returns the action for the state.</returns>
 ///
 /// <remarks>The method returns an action according to current
 /// <see cref="ExplorationPolicy">exploration policy</see>.</remarks>
 ///
 public int GetAction(int state)
 {
     return(explorationPolicy.ChooseAction(Q(state)));
 }

Example #6

File: QLearning_FDGS.cs Project: RitterRBC/framework

 public int GetAction(int state)
 {
     return(_explorationPolicy.ChooseAction(_rewardTable[state]));
 }