[ZH-CN] Translations
ilyaspiridonov committed Dec 19, 2023
1 parent 5e0c422 commit a747bbc
Showing 105 changed files with 10,873 additions and 784 deletions.
10 changes: 6 additions & 4 deletions site/zh-cn/agents/tutorials/0_intro_rl.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
"id": "I1JiGtmRbLVp"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down Expand Up @@ -100,7 +100,9 @@
"\n",
"Q-Learning 基于 Q 函数的概念。策略 $\\pi$, $Q^{\\pi}(s, a)$ 的 Q 函数(又称状态-操作值函数)用于衡量通过首先采取操作 $a$、随后采取策略 $\\pi$,从状态 $s$ 获得的预期回报或折扣奖励总和。我们将最优 Q 函数 $Q^*(s, a)$ 定义为从观测值 $s$ 开始,先采取操作 $a$,随后采取最优策略所能获得的最大回报。最优 Q 函数遵循以下*贝尔曼*最优性方程:\n",
"\n",
"```\n",
"$\\begin{equation}Q^\\ast(s, a) = \\mathbb{E}[ r + \\gamma \\max_{a'} Q^\\ast(s', a') ]\\end{equation}$\n",
"```\n",
"\n",
"这意味着,从状态 $s$ 和操作 $a$ 获得的最大回报等于即时奖励 $r$ 与通过遵循最优策略,随后直到片段结束所获得的回报(折扣因子为 $\\gamma$)的总和(即,来自下一个状态 $s'$ 的最高奖励)。期望是在即时奖励 $r$ 的分布以及可能的下一个状态 $s'$ 的基础上计算的。\n",
"\n",
Expand All @@ -110,17 +112,17 @@
"\n",
"对于大多数问题,将 $Q$ 函数表示为包含 $s$ 和 $a$ 每种组合的值的表是不切实际的。相反,我们训练一个函数逼近器(例如,带参数 $\\theta$ 的神经网络)来估算 Q 值,即 $Q(s, a; \\theta) \\approx Q^*(s, a)$。这可以通过在每个步骤 $i$ 使以下损失最小化来实现:\n",
"\n",
"$\\begin{equation}L_i(\\theta_i) = \\mathbb{E}*{s, a, r, s'\\sim \\rho(.)} \\left[ (y_i - Q(s, a; \\theta_i))^2 \\right]\\end{equation}$,其中 $y_i = r + \\gamma \\max*{a'} Q(s', a'; \\theta_{i-1})$\n",
"$\\begin{equation}L_i(\\theta_i) = \\mathbb{E}_{s, a, r, s'\\sim \\rho(.)} \\left[ (y_i - Q(s, a; \\theta_i))^2 \\right]\\end{equation}$,其中 $y_i = r + \\gamma \\max_{a'} Q(s', a'; \\theta_{i-1})$\n",
"\n",
"此处,$y_i$ 称为 TD(时间差分)目标,而 $y_i - Q$ 称为 TD 误差。$\\rho$ 表示行为分布,即从环境中收集的转换 ${s, a, r, s'}$ 的分布。\n",
"此处,$y_i$ 称为 TD(时间差分)目标,而 $y_i - Q$ 称为 TD 误差。$\\rho$ 表示行为分布,即从环境中收集的转换 ${s, a, r, s'}$ 的分布。\n",
"\n",
"注意,先前迭代 $\\theta_{i-1}$ 中的参数是固定的,不会更新。实际上,我们使用前几次迭代而不是最后一次迭代的网络参数快照。此副本称为*目标网络*。\n",
"\n",
"Q-Learning 是一种*离策略*算法,可在学习贪心策略 $a = \\arg\\max_{a} Q(s, a; \\theta)$ 的同时使用不同的行为策略在环境/收集数据过程中执行操作。此行为策略通常是一种 $\\epsilon$ 贪心策略,可选择概率为 $1-\\epsilon$ 的贪心操作和概率为 $\\epsilon$ 的随机操作,以确保良好覆盖状态-操作空间。\n",
"\n",
"### 经验回放\n",
"\n",
"为了避免计算 DQN 损失的全期望,我们可以使用随机梯度下降算法将其最小化。如果仅使用最后一个转换 ${s, a, r, s'}$ 来计算损失,那么这会简化为标准 Q-Learning。\n",
"为了避免计算 DQN 损失的全期望,我们可以使用随机梯度下降算法将其最小化。如果仅使用最后的转换 ${s, a, r, s'}$ 计算损失,这将简化为标准 Q-Learning。\n",
"\n",
"Atari DQN 工作引入了一种称为“经验回放”的技术,可使网络更新更加稳定。在数据收集的每个时间步骤,转换都会添加到称为*回放缓冲区*的循环缓冲区中。然后,在训练过程中,我们不是仅仅使用最新的转换来计算损失及其梯度,而是使用从回放缓冲区中采样的转换的 mini-batch 来计算它们。这样做有两个优点:通过在许多更新中重用每个转换来提高数据效率,以及在批次中使用不相关的转换来提高稳定性。\n"
]
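Reader's note on the notebook text in this hunk: the TD target, the $\epsilon$-greedy behavior policy, and the replay buffer it describes can be sketched roughly as below. This is a minimal illustration, not the TF-Agents implementation. It assumes a tabular `q` (a list of per-state action-value lists) in place of the neural network $Q(s, a; \theta)$, and the names `ReplayBuffer`, `epsilon_greedy`, `td_target`, `q_update`, and the `done` flag are hypothetical and do not appear in the notebook.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity circular buffer of (s, a, r, s_next, done) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen silently evicts the oldest transition when full.
        self._storage = deque(maxlen=capacity)

    def add(self, transition):
        self._storage.append(transition)

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement gives a decorrelated mini-batch.
        k = min(batch_size, len(self._storage))
        return rng.sample(list(self._storage), k)

    def __len__(self):
        return len(self._storage)


def epsilon_greedy(q_row, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def td_target(reward, next_q_row, gamma, done):
    """y = r + gamma * max_a' Q_target(s', a'); no bootstrapping at episode end."""
    return reward if done else reward + gamma * max(next_q_row)


def q_update(q, q_target, batch, gamma, lr):
    """One gradient-style step on the squared TD error per sampled transition.

    q_target plays the role of the frozen target network (theta_{i-1});
    the factor 2 from differentiating the square is absorbed into lr.
    """
    for s, a, r, s_next, done in batch:
        y = td_target(r, q_target[s_next], gamma, done)
        q[s][a] += lr * (y - q[s][a])
```

In a training loop under these assumptions, one would call `epsilon_greedy` to act, `add` the resulting transition to the buffer, `sample` a mini-batch, apply `q_update`, and periodically copy `q` into `q_target`.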
Expand Down
Expand Up @@ -6,7 +6,7 @@
"id": "W7rEsKyWcxmu"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors.\n"
"##### Copyright 2023 The TF-Agents Authors.\n"
]
},
{
Expand Down
11 changes: 4 additions & 7 deletions site/zh-cn/agents/tutorials/1_dqn_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "klGNgWREsvQv"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down Expand Up @@ -40,12 +40,9 @@
"# 使用 TF-Agents 训练深度 Q 网络\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/1_dqn_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/1_dqn_tutorial.ipynb\"> <img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\"> 在 Google Colab 中运行</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/1_dqn_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/1_dqn_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/1_dqn_tutorial.ipynb\"> <img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\"> 在 Google Colab 中运行</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/1_dqn_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a> </td>\n",
" <td> <a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/agents/tutorials/1_dqn_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\">下载笔记本</a> </td>\n",
"</table>"
]
Expand Down
3 changes: 1 addition & 2 deletions site/zh-cn/agents/tutorials/2_environments_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "Ma19Ks2CTDbZ"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down Expand Up @@ -95,7 +95,6 @@
},
"outputs": [],
"source": [
"!pip install \"gym>=0.21.0\"\n",
"!pip install tf-agents[reverb]\n"
]
},
Expand Down
11 changes: 4 additions & 7 deletions site/zh-cn/agents/tutorials/3_policies_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "1Pi_B2cvdBiW"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down Expand Up @@ -40,12 +40,9 @@
"# 策略\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/3_policies_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/3_policies_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\">在 Google Colab 运行</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/3_policies_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/3_policies_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/3_policies_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\">在 Google Colab 运行</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/3_policies_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a> </td>\n",
" <td> <a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/agents/tutorials/3_policies_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\">下载笔记本</a> </td>\n",
"</table>"
]
Expand Down
2 changes: 1 addition & 1 deletion site/zh-cn/agents/tutorials/4_drivers_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "beObUOFyuRjT"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down
Expand Up @@ -6,7 +6,7 @@
"id": "beObUOFyuRjT"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down
11 changes: 4 additions & 7 deletions site/zh-cn/agents/tutorials/6_reinforce_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "klGNgWREsvQv"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down Expand Up @@ -40,12 +40,9 @@
"# REINFORCE 代理\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/6_reinforce_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/6_reinforce_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\">在 Google Colab 运行</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/6_reinforce_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/6_reinforce_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/6_reinforce_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\">在 Google Colab 运行</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/6_reinforce_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a> </td>\n",
" <td> <a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/agents/tutorials/6_reinforce_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\">下载笔记本</a> </td>\n",
"</table>"
]
Expand Down
2 changes: 1 addition & 1 deletion site/zh-cn/agents/tutorials/7_SAC_minitaur_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "klGNgWREsvQv"
},
"source": [
"**Copyright 2021 The TF-Agents Authors.**"
"**Copyright 2023 The TF-Agents Authors.**"
]
},
{
Expand Down
2 changes: 1 addition & 1 deletion site/zh-cn/agents/tutorials/8_networks_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "1Pi_B2cvdBiW"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down
11 changes: 4 additions & 7 deletions site/zh-cn/agents/tutorials/9_c51_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "klGNgWREsvQv"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down Expand Up @@ -40,12 +40,9 @@
"# DQN C51/Rainbow\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/9_c51_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/9_c51_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\">在 Google Colab 运行</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/9_c51_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/9_c51_tutorial\"><img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\">在 TensorFlow.org 上查看</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/9_c51_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\">在 Google Colab 运行</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/9_c51_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\">在 Github 上查看源代码</a> </td>\n",
" <td> <a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/agents/tutorials/9_c51_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\">下载笔记本</a> </td>\n",
"</table>"
]
Expand Down
2 changes: 1 addition & 1 deletion site/zh-cn/agents/tutorials/bandits_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "klGNgWREsvQv"
},
"source": [
"##### Copyright 2020 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down
2 changes: 1 addition & 1 deletion site/zh-cn/agents/tutorials/intro_bandit.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "I1JiGtmRbLVp"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down
2 changes: 1 addition & 1 deletion site/zh-cn/agents/tutorials/per_arm_bandits_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "nPjtEgqN4SjA"
},
"source": [
"##### Copyright 2021 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down
14 changes: 5 additions & 9 deletions site/zh-cn/agents/tutorials/ranking_tutorial.ipynb
Expand Up @@ -6,7 +6,7 @@
"id": "6tzp2bPEiK_S"
},
"source": [
"##### Copyright 2022 The TF-Agents Authors."
"##### Copyright 2023 The TF-Agents Authors."
]
},
{
Expand Down Expand Up @@ -49,14 +49,10 @@
"### 开始\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/ranking_tutorial\"> <img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\"> 在 TensorFlow.org 上查看</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/ranking_tutorial.ipynb\"> <img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\"> 在 Google Colab 中运行</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/ranking_tutorial.ipynb\"> <img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\"> 在 GitHub 上查看源代码</a>\n",
"</td>\n",
" <td> <a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/agents/tutorials/ranking_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\">下载笔记本</a>\n",
"</td>\n",
" <td> <a target=\"_blank\" href=\"https://tensorflow.google.cn/agents/tutorials/ranking_tutorial\"> <img src=\"https://tensorflow.google.cn/images/tf_logo_32px.png\"> 在 TensorFlow.org 上查看</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/ranking_tutorial.ipynb\"> <img src=\"https://tensorflow.google.cn/images/colab_logo_32px.png\"> 在 Google Colab 中运行</a> </td>\n",
" <td> <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/zh-cn/agents/tutorials/ranking_tutorial.ipynb\"> <img src=\"https://tensorflow.google.cn/images/GitHub-Mark-32px.png\"> 在 GitHub 上查看源代码</a> </td>\n",
" <td> <a href=\"https://storage.googleapis.com/tensorflow_docs/docs-l10n/site/zh-cn/agents/tutorials/ranking_tutorial.ipynb\"><img src=\"https://tensorflow.google.cn/images/download_logo_32px.png\">下载笔记本</a> </td>\n",
"</table>\n"
]
},
Expand Down