zhongzero/RL-KaiWu

Tencent Kaiwu Arena

This is the final project for CS3316 (Reinforcement Learning) at Shanghai Jiao Tong University.

Team Members

  • Yongshan Chen: Guided the research process, proposed the improvements to the PPO algorithm, and implemented the PSRO algorithm.
  • Lai Jiang: Wrote the Abstract, Introduction, and Conclusion of the paper. Proposed and polished the structure of the paper, and proposed possible experiments.
  • Yuhao Wang: Checked the feasibility of the Truly PPO mechanism and implemented the PPO-RB part of the code. Wrote the Truly PPO parts of the Introduction, Related Work, Methods, and Conclusion sections.
  • Linhao Zhong: Ran the experiments, refined the parameters, and evaluated the model. Wrote Sections 4.2 and 4.3 and part of the Introduction.
  • Binglin Zhou: Ran the experiments, refined the parameters, and analyzed the evaluation results. Wrote Sections 4.1 and 4.4.

Setup

First, upload the code to the Kaiwu Arena platform. Then run the experiment with the following command:

python3 train_test.py

Acknowledgement

We would like to thank the course instructor, Prof. Weinan Zhang, and the TA, Xialin He, as well as Kaiwu Arena for providing the platform for this project.
