- CartPole-v0 example: [https://gym.openai.com/docs/]
- What happens when stochastic approximation methods are applied to a reinforcement learning problem such as CartPole?
- Here are the results for three methods:
- Finite-difference stochastic approximation algorithm (FDSA)
- Simultaneous perturbation stochastic approximation algorithm (SPSA)
- Adaptive SPSA (2SPSA)
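
To make the difference between the first two methods concrete, here is a minimal NumPy sketch of the FDSA and SPSA gradient estimators with a plain stochastic approximation loop. It is an illustration under assumptions, not the code used for the results above: the function names, the gain constants `a`, `c`, `A`, and the `loss` callable are all hypothetical. For CartPole, `loss(theta)` would presumably be the negative episode return of a policy parameterized by `theta`. 2SPSA, which additionally estimates the Hessian, is not covered here.

```python
import numpy as np


def fdsa_gradient(loss, theta, c):
    """Two-sided finite-difference estimate: 2 * dim(theta) loss evaluations."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = c  # perturb one coordinate at a time
        grad[i] = (loss(theta + e) - loss(theta - e)) / (2.0 * c)
    return grad


def spsa_gradient(loss, theta, c, rng):
    """Simultaneous-perturbation estimate: 2 loss evaluations for any dimension."""
    # Random +/-1 (Rademacher) perturbation applied to all coordinates at once.
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = loss(theta + c * delta) - loss(theta - c * delta)
    return diff / (2.0 * c * delta)


def minimize(loss, theta0, grad_fn, n_iter=500, a=0.5, c=0.1, A=50.0,
             alpha=0.602, gamma=0.101):
    """Stochastic approximation loop with decaying gains.

    The gain constants are illustrative assumptions and usually need tuning.
    grad_fn is called as grad_fn(loss, theta, c_k).
    """
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha  # step size
        c_k = c / (k + 1) ** gamma      # perturbation size
        theta = theta - a_k * grad_fn(loss, theta, c_k)
    return theta
```

A possible way to call it, using a simple quadratic as a stand-in objective: `minimize(loss, np.zeros(4), fdsa_gradient)` for FDSA, and `minimize(loss, np.zeros(4), lambda f, th, c: spsa_gradient(f, th, c, rng))` with `rng = np.random.default_rng(0)` for SPSA. The practical trade-off is that FDSA needs two loss evaluations per parameter at every step, while SPSA needs only two in total, which matters when each evaluation is a full episode.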