/// <summary>
/// Returns the maximum Q-value obtainable from the given state (cell).
/// Final states have no outgoing moves, so their estimate is always 0.
/// </summary>
/// <param name="nextCell">Index of the cell whose best Q-value is requested.</param>
/// <returns>The Q-value of the best movement from <paramref name="nextCell"/>, or 0 for a final state.</returns>
private double CalculateEstimate(int nextCell)
{
    // A terminal cell contributes no future reward.
    if (LabryntRules.IsFinalState(nextCell))
    {
        return 0;
    }

    Movement bestMove = GetBestMovement(nextCell);
    return qMat.GetQValue(nextCell, bestMove);
}
/// <summary>
/// Unity per-frame callback. Accumulates elapsed time and, once at least one
/// step interval (the inverse of the speed selector) has passed, runs the
/// corresponding number of Q-learning steps: epsilon-greedy action choice,
/// reward observation, Q-value update, and episode/epsilon bookkeeping when a
/// final state is reached.
/// </summary>
void Update()
{
    if (paused)
    {
        return;
    }

    timer += Time.deltaTime;
    float dt = speedSelector.GetInverse();
    if (timer <= dt)
    {
        return;
    }

    // Calculate how many algorithm steps are due according to the speed selector.
    int loops = (int)(timer / dt);

    // BUG FIX: consume the time for EVERY step taken, not just one. The
    // original subtracted a single dt after running 'loops' steps, so the
    // leftover time was re-counted on the next frame and extra steps ran.
    timer -= loops * dt;

    while (loops > 0)
    {
        if (restart)
        {
            InitialiseAlgorithm();
            restart = false;
            newEpisode = true;
        }
        if (newEpisode)
        {
            SetValuesForNewEpisode();
            newEpisode = false;
        }

        // Choose action (a) based on the epsilon-greedy policy (p).
        Movement currentMove;
        if (ShouldExplore())
        {
            currentMove = GetRandomMovement(currentCell);
        }
        else
        {
            currentMove = GetBestMovement(currentCell);
        }

        // Observe reward (r).
        int reward = LabryntRules.GetReward(currentCell, currentMove);
        double oldQValue = qMat.GetQValue(currentCell, currentMove);

        // Estimate the best Q-value obtainable from the next state.
        int nextCell = LabryntRules.GetLandingCell(currentCell, currentMove);
        double estimation = CalculateEstimate(nextCell);

        // Recalculate the Q-value for the current cell based on the estimation.
        double updatedQValue = CalculateNewQValue(oldQValue, alpha, gamma, reward, estimation);
        qMat.SetQValue(currentCell, currentMove, updatedQValue);

        cumulativeReward += reward;
        currentCell = nextCell;

        // Update the display with the new player position.
        UpdatePlayerPositionInDisplay();

        // If the player landed in a final state cell, decay epsilon and start a new episode.
        if (LabryntRules.IsFinalState(currentCell))
        {
            // Shift from exploration to exploitation gradually between episodes.
            // BUG FIX: 'else' prevents both decays from applying in the same
            // step when epsilon crosses the 0.3 threshold (the original pair
            // of independent 'if's could double-decay), and it also handles
            // epsilon == 0.3 exactly, which previously decayed never.
            if (epsilon > 0.3)
            {
                epsilon *= epsilonDecay1;
            }
            else
            {
                epsilon *= epsilonDecay2;
            }
            newEpisode = true;
        }

        loops--;
    }
}