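Hello, I'm new to RL and very much a beginner.
My question is:
Are Monte Carlo methods mainly meant for the model-free RL setting? As I understand it, model-free means Pss' is unknown, and not knowing the state transition probabilities means not knowing the state transitions. But in your code, sampling calls Env.transform(), and that function uses the state transitions. Doesn't that make it model-based?
I'm not sure which part I've misunderstood, so I'd appreciate a correction! @zhuliquan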
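The Monte Carlo method is model-free, because the state transition matrix is never used when solving for V(S). Sampling calls Env.transform only so that the environment can tell the agent the reward for each state it reaches, which is used later in the evaluation. The environment never explicitly tells the agent its transition matrix.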
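To make the distinction concrete, here is a minimal sketch of first-visit Monte Carlo evaluation, not the code from this repository. `ToyEnv`, its 5-state chain, the 0.8 success probability, and the `(next_state, reward, done)` return shape of `transform` are all assumptions made up for illustration. The point is that the transition probabilities live only inside the environment, while the estimator sees nothing but sampled states and rewards.

```python
import random
from collections import defaultdict


class ToyEnv:
    """Hypothetical 1-D chain with states 0..4; state 4 is terminal."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.n_states = 5

    def transform(self, state, action):
        # The dynamics are hidden in here: the intended move succeeds with
        # probability 0.8, otherwise the agent moves the opposite way.
        # The agent never reads these numbers, it only sees the outcome.
        step = action if self._rng.random() < 0.8 else -action
        next_state = min(max(state + step, 0), self.n_states - 1)
        done = next_state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return next_state, reward, done


def sample_episode(env, policy, start_state=0, max_steps=100):
    """Roll out one episode and record (state, reward) pairs."""
    episode = []
    state = start_state
    for _ in range(max_steps):
        next_state, reward, done = env.transform(state, policy(state))
        episode.append((state, reward))
        state = next_state
        if done:
            break
    return episode


def mc_evaluate(env, policy, n_episodes=2000, gamma=0.9):
    """First-visit Monte Carlo estimate of V(s) from sampled returns only."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(n_episodes):
        episode = sample_episode(env, policy)
        g = 0.0
        first_visit_return = {}
        # Walk backwards so g is the discounted return from each step;
        # overwriting leaves the return of the earliest visit to each state.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_visit_return[state] = g
        for state, ret in first_visit_return.items():
            returns_sum[state] += ret
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


if __name__ == "__main__":
    values = mc_evaluate(ToyEnv(seed=0), policy=lambda s: 1)  # always try to move right
    for s in sorted(values):
        print(s, round(values[s], 3))
```

Note that `mc_evaluate` contains no transition matrix and no expectation over next states; it only averages returns. A model-based method such as dynamic programming would instead need the probabilities inside `transform` handed to it explicitly.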
Thanks a lot!