Welcome to discuss this book here! #1
Replies: 32 comments 79 replies
-
In slides 9, page 22. I guess this is `exercise`.
-
Great book and course! It helped me reorganize some points of RL. While reading this book (ver. 2022.8), I ran into a few small confusions, probably clerical errors. Thanks!
-
Chapter 6, page 20: this does not seem to converge.
import random
w = 0  # initial estimate
g = lambda w: w**3 - 5
for i in range(100):
    print(i, w)
    # noisy update with step size 1/(i+10)
    w = w - 1/(i+10) * (g(w) + random.gauss(0, 1))
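For anyone who wants to check this empirically, here is a minimal sketch that wraps the same update in a function and measures the distance to the real root of w**3 - 5. The function name `robbins_monro` and its parameters (`offset`, `noise_std`, `seed`) are hypothetical names chosen for illustration; the step size 1/(i+10) and the Gaussian noise are taken from the snippet above. It makes no claim about what the book's page 20 shows.

```python
import random

def robbins_monro(g, w0=0.0, n_iters=100, noise_std=1.0, offset=10, seed=None):
    """Iterate w <- w - a_k * (g(w) + noise) with a_k = 1/(k + offset).

    Returns the list of all iterates, starting with w0.
    """
    rng = random.Random(seed)  # private generator so runs are repeatable
    w = w0
    history = [w]
    for k in range(n_iters):
        a_k = 1.0 / (k + offset)
        noisy_g = g(w) + rng.gauss(0.0, noise_std)
        w = w - a_k * noisy_g
        history.append(w)
    return history

g = lambda w: w**3 - 5
root = 5 ** (1 / 3)  # the real root of g

noiseless = robbins_monro(g, noise_std=0.0)
noisy = robbins_monro(g, noise_std=1.0, seed=0)

print("root:", root)
print("noiseless final error:", abs(noiseless[-1] - root))
print("noisy final error:", abs(noisy[-1] - root))
```

Comparing the two printed errors separates the effect of the noise from the effect of the step-size schedule, which may help pin down what "does not converge" refers to in a particular run.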
-
The proof of Dvoretzky's theorem only covers
-
6.2, page 107
-
On page 85, in the second paragraph of the subsection "A comprehensive example: Episode length and sparse reward", "See, for example, Figure 5.3(h)" should be "5.3(a)", because you mention that the episode length is 1.
-
Hello dear Professor Zhao, I think this "+" symbol should probably be a "=". This equation has "tortured" me for a month, hahahaha.
-
Thanks for the nice write-up! There is a small (but important) typo in Algorithm 4.1:
should read
-
Page 163. Thank you for your book.
-
Thank you so much for writing such a helpful book!
-
Page 172. Chapter 8. Algorithm 8.1.
-
I think a \gamma symbol is missing here. The same typo appears in the slides.
-
Page 207, Chapter 9. Here I think π(a|s,θ) should take values in the open interval (0,1), not [0,1], because of the softmax function.
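A quick numerical illustration of this point, as a minimal sketch: it assumes the policy is a softmax over finite per-action preferences, which is an assumption here since the book's exact parameterization is not quoted in this thread. The helper name `softmax` and the example preference values are made up for illustration.

```python
import math

def softmax(prefs):
    """Softmax over a list of finite preferences (max-shifted for stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]  # each term is > 0 mathematically
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical preferences for three actions in one state.
probs = softmax([2.0, 0.0, -3.0])
print(probs)
print(all(0.0 < p < 1.0 for p in probs))
```

Mathematically, every exponential is strictly positive, so with at least two actions and finite preferences each probability lies strictly between 0 and 1. In floating point, a very large gap between preferences can still underflow a probability to exactly 0.0, so the closed interval can show up numerically even though the open interval is the correct statement.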
-
Prof. Zhao, when will this book be published?
-
In Section 3.6, page 64, book ver. March 2024, it says:
-
Prof. Zhao, could you add more algorithms such as TRPO, PPO, SAC... to this book? You have already covered part of the underlying mathematics in some chapters. Thanks very much.
-
Hi there,
If you have any feedback about the book, you can leave a comment here. Thanks!