Reward-based-learning-agents

Modified versions of the temporal-difference methods Q-learning and SARSA, together with modified versions of the action-selection methods softmax and ϵ-greedy.
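For orientation, the standard textbook forms of these four methods can be sketched as below. This is a minimal sketch, not the code in this repository: the specific modifications made here are not described in this README, and every function name, the list-of-lists Q-table layout, and the tie-breaking rule are assumptions made for illustration.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Assumption: ties are broken by taking the first maximal action.
    return q_values.index(max(q_values))


def softmax_probs(q_values, temperature):
    """Boltzmann probabilities over actions; subtracting the max avoids overflow."""
    m = max(q_values)
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_select(q_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_learning_update(Q, s, a, r, s_next, alpha, gamma):
    """Off-policy TD update: bootstrap from the best action in the next state."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
```

The only difference between the two updates is the bootstrap term: Q-learning uses the maximum Q-value of the next state, while SARSA uses the Q-value of the next action chosen by the behaviour policy (here, ϵ-greedy or softmax).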

The Z5443641.py file processes the raw data with each of these methods.

Files:
README.md
Z5443641.py
discuss.txt
initial_Q_values.txt
initial_Q_values_New_World.txt
random_numbers.txt
reward_Q_learning_EG.txt
reward_Q_learning_SM.txt
reward_SARSA_EG.txt
reward_SARSA_SM.txt
steps_Q_learning_EG.txt
steps_Q_learning_SM.txt
steps_SARSA_EG.txt
steps_SARSA_SM.txt

zhoupeng1225/Reward-based-learning-agents

Folders and files

Latest commit

History

Repository files navigation

Reward-based-learning-agents

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages