ExplorationPolicy.SelectAction examples in C# (CSharp)

Programming language: C# (CSharp)

Class / Type: ExplorationPolicy

Method / Function: SelectAction

Examples on hotexamples.com: 4

C# (CSharp) ExplorationPolicy.SelectAction - 4 examples found. These are the top-rated real-world C# (CSharp) examples of ExplorationPolicy.SelectAction, extracted from open source projects.

Example #1

        public void Step(double reward, int nextState)
        {
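            // Pick the next action from the Q-values of the next state.
            var nextAction = ExplorationPolicy.SelectAction(_q[nextState]);

            // Temporal-difference target and error for the (state, action) pair just taken.
            var target = reward + DiscountFactor * _q[nextState][nextAction];
            var delta  = target - _q[CurrentState][SelectedAction];

            _q[CurrentState][SelectedAction] += LearningRate * delta;

            // Note: the action is selected again here rather than reusing nextAction.
            CurrentState   = nextState;
            SelectedAction = ExplorationPolicy.SelectAction(_q[CurrentState]);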
        }

Example #2

File: DynaQ.cs Project: kapkapas/ReinforcementLearning

        public void Step(double reward, int nextState)
        {
            if (!_visited.ContainsKey(CurrentState))
            {
                var actions = new HashSet<int>();
                actions.Add(SelectedAction);
                _visited[CurrentState] = actions;
            }

            UpdateQ(reward, nextState);
            Plan();

            CurrentState   = nextState;
            SelectedAction = ExplorationPolicy.SelectAction(_q[CurrentState]);
        }

Example #3

        public void Step(double reward, int nextState)
        {
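            // Find the largest Q-value among the actions of the next state.
            var bestNext = _q[nextState][0];

            for (var i = 1; i < ActionCount; i++)
            {
                if (_q[nextState][i] > bestNext)
                {
                    bestNext = _q[nextState][i];
                }
            }

            // Temporal-difference update toward the greedy (max) target.
            var target = reward + DiscountFactor * bestNext;
            var delta  = target - _q[CurrentState][SelectedAction];

            _q[CurrentState][SelectedAction] += LearningRate * delta;

            // Move to the next state and let the exploration policy choose the action.
            CurrentState   = nextState;
            SelectedAction = ExplorationPolicy.SelectAction(_q[CurrentState]);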
        }

Example #4

        public void Begin(int state)
        {
            CurrentState   = state;
            SelectedAction = ExplorationPolicy.SelectAction(_q[CurrentState]);
        }
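
None of the examples above show the ExplorationPolicy type itself, only calls that pass it the Q-values of one state and get an action index back. As a rough illustration of what such a policy might look like, here is a minimal epsilon-greedy sketch. The class name EpsilonGreedyPolicy, its constructor parameters, and the assumption that the Q-values arrive as a double[] are all hypothetical and are not taken from the projects listed on this page.

```csharp
using System;

// Hypothetical sketch: the real ExplorationPolicy type is not shown in the examples,
// so the name, constructor, and double[] parameter here are assumptions.
public sealed class EpsilonGreedyPolicy
{
    private readonly double _epsilon;
    private readonly Random _random;

    public EpsilonGreedyPolicy(double epsilon, int seed)
    {
        if (epsilon < 0.0 || epsilon > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(epsilon));
        }

        _epsilon = epsilon;
        _random  = new Random(seed);
    }

    // Returns the index of the chosen action given the Q-values of one state.
    public int SelectAction(double[] actionValues)
    {
        if (actionValues == null || actionValues.Length == 0)
        {
            throw new ArgumentException("At least one action value is required.", nameof(actionValues));
        }

        // Explore: with probability epsilon, pick a uniformly random action.
        if (_random.NextDouble() < _epsilon)
        {
            return _random.Next(actionValues.Length);
        }

        // Exploit: otherwise pick the action with the highest estimate (first one on ties).
        var best = 0;

        for (var i = 1; i < actionValues.Length; i++)
        {
            if (actionValues[i] > actionValues[best])
            {
                best = i;
            }
        }

        return best;
    }
}
```

With epsilon set to 0 the sketch never explores, so `new EpsilonGreedyPolicy(0.0, 42).SelectAction(new[] { 0.1, 0.9, 0.3 })` returns 1, the index of the largest value. Larger epsilon values trade some of that greedy behavior for random exploration.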