C# (CSharp) MarkovDecisionProcess State Examples

Programming language: C# (CSharp)

Namespace/package name: MarkovDecisionProcess

Class/type: State

Examples on hotexamples.com: 13

C# (CSharp) MarkovDecisionProcess State - 13 examples found. These are the top-rated real-world C# (CSharp) examples of MarkovDecisionProcess.State, extracted from open source projects.

Frequently used methods

Reward(4)

TransitionProbability(3)

Apply(2)

Successors(2)

Equals(1)

Example #1

File: Domain.cs Project: haozhuoran1991/recommend-2011

 public abstract bool IsGoalState(State s);

Example #2

File: RaceCarState.cs Project: haozhuoran1991/recommend-2011

 public override double TransitionProbability(Action a, State sTag)
 {
     // Outcome when the action succeeds
     RaceCarState sTagApply = new RaceCarState(this);
     sTagApply.Apply((VelocityAction)a, true);
     if (sTag.Equals(sTagApply))
         return RaceTrack.ACTION_SUCCESS_PROBABILITY;
     // Outcome when the action fails
     RaceCarState sTagNoApply = new RaceCarState(this);
     sTagNoApply.Apply((VelocityAction)a, false);
     if (sTag.Equals(sTagNoApply))
         return 1 - RaceTrack.ACTION_SUCCESS_PROBABILITY;
     // sTag is not reachable from this state under action a
     return 0.0;
 }
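
The method above encodes a two-outcome transition model. As a sketch in notation that is not part of the original project (here $p$ stands for `RaceTrack.ACTION_SUCCESS_PROBABILITY` and $\mathrm{apply}(s, a, b)$ for a copy of the state after `Apply(a, b)`), it can be written as:

```latex
P(s' \mid s, a) =
\begin{cases}
p     & \text{if } s' = \mathrm{apply}(s, a, \text{true}) \\
1 - p & \text{else if } s' = \mathrm{apply}(s, a, \text{false}) \\
0     & \text{otherwise}
\end{cases}
```

Because the branches are checked in order, if the success and failure outcomes happen to be the same state, the method returns $p$ for it rather than 1.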

Example #3

File: PolicyValueFunction.cs Project: haozhuoran1991/recommend-2011

 public double ValueAt(State s)
 {
     // Look up the current value estimate stored for state s
     return ViByS[s];
 }

Example #4

File: PolicyValueFunction.cs Project: haozhuoran1991/recommend-2011

        private double update(State s)
        {
            double maxV = ViByS[s];
            Action maxA = null;
            foreach (Action a in m_dDomain.Actions)
            {
                double sum = 0;
                foreach (State stag in s.Successors(a))
                    sum += s.TransitionProbability(a, stag) * ViByS[stag];
                double tmp = s.Reward(a) + (m_dDomain.DiscountFactor * sum);

                // keep the best action so far, ignoring actions for which s.Apply(a) equals s
                if ((tmp >= maxV) && (!s.Apply(a).Equals(s)))
                {
                    maxV = tmp;
                    maxA = a;
                }
            }
            if (maxA != null)
            {
                double delta = maxV - ViByS[s];
                ViByS[s] = maxV;
                ViBySActions[s] = maxA;
                return Math.Abs(delta);
            }
            return 0;
        }

Example #5

File: Policy.cs Project: haozhuoran1991/recommend-2011

 public abstract Action GetAction(State s);

Example #6

File: PolicyValueFunction.cs Project: haozhuoran1991/recommend-2011

 public override Action GetAction(State s)
 {
     // Look up the action currently stored for state s
     return ViBySActions[s];
 }

Example #7

File: State.cs Project: haozhuoran1991/recommend-2011

 public abstract double TransitionProbability(Action a, State sTag);

Example #8

File: RandomPolicy.cs Project: haozhuoran1991/recommend-2011

 public override Action GetAction(State s)
 {
     int idx = RandomGenerator.Next(m_lActions.Count);
     return m_lActions[idx];
 }

Example #9

File: ValueFunction.cs Project: haozhuoran1991/recommend-2011

 // Compute V_{i+1}(s) for state s and return the absolute change from V_i(s)
 private double updateValueIter(State s)
 {
     double maxV = Double.MinValue;
     Action maxA = null;
     foreach (Action a in m_dDomain.Actions)
     {
         // expected discounted value of taking action a in s
         double sum = 0;
         foreach (State stag in s.Successors(a))
             sum += s.TransitionProbability(a, stag) * ViByS[stag];
         double tmp = s.Reward(a) + m_dDomain.DiscountFactor * sum;
         // save max
         if (tmp >= maxV)
         {
             maxV = tmp;
             maxA = a;
         }
     }
     if (maxA != null)
     {
         Vi_1ByS[s] = maxV;
         ViBySActions[s] = maxA;
         return Math.Abs(Vi_1ByS[s] - ViByS[s]);
     }
     return 0;
 }
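
The update computed by `updateValueIter` can be summarized as a standard value-iteration backup. The symbols below are notation introduced here for illustration, mapped from the code as follows: $V_i$ is `ViByS`, $V_{i+1}$ is `Vi_1ByS`, $\gamma$ is `m_dDomain.DiscountFactor`, $R(s,a)$ is `s.Reward(a)`, $P(s' \mid s,a)$ is `s.TransitionProbability(a, stag)`, and $\mathrm{Succ}(s,a)$ is `s.Successors(a)`.

```latex
V_{i+1}(s) = \max_{a \in A} \Big[ R(s,a) + \gamma \sum_{s' \in \mathrm{Succ}(s,a)} P(s' \mid s,a)\, V_i(s') \Big],
\qquad
\delta(s) = \lvert V_{i+1}(s) - V_i(s) \rvert
```

The method stores the maximizing action in `ViBySActions[s]` and returns $\delta(s)$; a caller would typically use that value as a convergence measure, although the calling loop is not shown in this listing.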

Example #10

File: ValueFunction.cs Project: haozhuoran1991/recommend-2011

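 // Note: a random action is taken when the draw exceeds depsilon, so depsilon
 // acts here as the probability of the greedy choice, not of exploration.
 private Action epsilonGreedy(State s, double depsilon)
 {
     if (RandomGenerator.NextDouble() > depsilon)
         return m_dDomain.Actions.ElementAt(RandomGenerator.Next(m_dDomain.Actions.Count()));
     else
         return findMaxQA(s);
 }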
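
For comparison, the more common epsilon-greedy convention explores with probability epsilon and takes a greedy action otherwise. The following is a minimal, self-contained sketch under that assumption. The names `EpsilonGreedySketch` and `Choose`, and the use of a plain array of Q-values indexed by action, are hypothetical and not part of the recommend-2011 project.

```csharp
using System;
using System.Collections.Generic;

public static class EpsilonGreedySketch
{
    // Hypothetical helper: returns an index into qValues.
    // With probability epsilon it explores (uniform random index);
    // otherwise it exploits (an index with maximal Q-value, ties broken at random).
    public static int Choose(double[] qValues, double epsilon, Random rng)
    {
        // NextDouble() is in [0, 1), so epsilon = 0 never explores
        // and epsilon = 1 always explores.
        if (rng.NextDouble() < epsilon)
            return rng.Next(qValues.Length);

        // Find the largest Q-value.
        double best = double.NegativeInfinity;
        foreach (double q in qValues)
            best = Math.Max(best, q);

        // Collect every index that attains it, then pick one at random.
        var ties = new List<int>();
        for (int i = 0; i < qValues.Length; i++)
            if (qValues[i] == best)
                ties.Add(i);
        return ties[rng.Next(ties.Count)];
    }
}
```

With `epsilon` set to 0 the helper always returns a greedy index, which makes it easy to check in isolation; passing the `Random` instance in explicitly keeps runs reproducible when a fixed seed is used.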

Example #11

File: ValueFunction.cs Project: haozhuoran1991/recommend-2011

 private Action findMaxQA(State j)
 {
     List<Action> actions = new List<Action>();
     double maxQA = double.MinValue;
     // first pass: find the largest Q-value for state j
     foreach (Action a in m_dDomain.Actions)
         if (Q[j][a] > maxQA)
             maxQA = Q[j][a];
     // second pass: collect every action that attains it
     foreach (Action a in m_dDomain.Actions)
         if (Q[j][a] == maxQA)
             actions.Add(a);
     // break ties uniformly at random
     int idx = RandomGenerator.Next(actions.Count);
     return actions[idx];
 }

Example #12

File: ValueFunction.cs Project: haozhuoran1991/recommend-2011

 private double MaxR(State stag)
 {
     double maxR = double.MinValue;
     foreach (Action a in m_dDomain.Actions)
         maxR = Math.Max(maxR, Q[stag][a]);
     return maxR;
 }

Example #13

File: RaceTrack.cs Project: haozhuoran1991/recommend-2011

 public override bool IsGoalState(State s)
 {
     return IsGoalState((RaceCarState)s);
 }