@@ -102,7 +102,7 @@ <h1 class="title is-1 publication-title">Diffusion Models Are Real-Time Game Eng
<div class="column is-four-fifths" style="max-width:960px">
  <div class="publication-video">
    <iframe
-     src="https://www.youtube.com/embed/O3616ZFGpqw?autoplay=1&mute=1&loop=1&showinfo=0&list=PL3ZfMho22LwDvJSEKVBiwxNsVEqUTUmhJ"
+     src="https://www.youtube.com/embed/O3616ZFGpqw?autoplay=1&mute=1&loop=1&showinfo=0&list=PL3ZfMho22LwDvJSEKVBiwxNsVEqUTUmhJ"
      frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>
  </div>
</div>
@@ -126,16 +126,11 @@ <h2 class="title is-3">Abstract</h2>
<p>
  We present <i>GameNGen</i>, the first game engine powered entirely by a neural model
  that enables real-time interaction with a complex environment over long trajectories at high quality.
- <i>GameNGen</i> can interactively simulate the classic game DOOM at over 20 frames per second on a single
- TPU.
- <i>GameNGen</i> simulations do not suffer from accumulated deterioration even after long play sessions,
- achieving a PSNR of 29.4, comparable to lossy JPEG compression.
- Human raters are only slightly better than random chance at distinguishing short clips of the game
- from clips of the simulation.
- <i>GameNGen</i> is trained in two phases:
- (1) an RL-agent learns to play the game and the training sessions are recorded, and
- (2) a diffusion model is trained to produce the next frame, conditioned on the sequence of past frames and
- actions.
+ <i>GameNGen</i> can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU.
+ Next frame prediction achieves a PSNR of 29.4, comparable to lossy JPEG compression.
+ Human raters are only slightly better than random chance at distinguishing short clips of the game from clips of the simulation.
+ <i>GameNGen</i> is trained in two phases: (1) an RL-agent learns to play the game and the training sessions are recorded, and
+ (2) a diffusion model is trained to produce the next frame, conditioned on the sequence of past frames and actions.
  Conditioning augmentations enable stable auto-regressive generation over long trajectories.
</p>
</div>
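As context for the revised abstract, the two-phase recipe (record gameplay from an RL agent, then train a next-frame model conditioned on past frames and actions, with conditioning augmentations for stable auto-regressive rollouts) can be illustrated with a minimal data-pipeline sketch. This is only a hedged illustration, not the authors' code: `random_policy`, `toy_env_step`, `CONTEXT_LEN`, `NOISE_STD`, and the frame/action sizes are hypothetical placeholders, adding noise to context frames is just one plausible form of the "conditioning augmentations" the abstract mentions, and the diffusion model training itself is omitted.

```python
import numpy as np

CONTEXT_LEN = 4            # assumed number of past frames/actions used as conditioning
NOISE_STD = 0.1            # assumed strength of the conditioning augmentation (noise on context frames)
FRAME_SHAPE = (64, 64, 3)  # placeholder frame size, not the real game resolution
NUM_ACTIONS = 8            # placeholder action-space size

rng = np.random.default_rng(0)

def random_policy(rng):
    """Stand-in for the phase-1 RL agent: picks an action at random."""
    return int(rng.integers(NUM_ACTIONS))

def toy_env_step(action, rng):
    """Stand-in for the game engine: returns a random 'next frame'."""
    return rng.random(FRAME_SHAPE, dtype=np.float32)

# Phase 1: the agent plays and the sessions are recorded as (frame, action) pairs.
frames, actions = [], []
frame = rng.random(FRAME_SHAPE, dtype=np.float32)
for _ in range(100):
    action = random_policy(rng)
    frames.append(frame)
    actions.append(action)
    frame = toy_env_step(action, rng)

# Phase 2: turn the recording into next-frame-prediction examples.
# Each target frame is paired with the CONTEXT_LEN preceding frames and actions;
# the noise added to the context frames is one possible conditioning augmentation
# (an assumption here) aimed at keeping auto-regressive generation stable.
examples = []
for t in range(CONTEXT_LEN, len(frames)):
    past_frames = np.stack(frames[t - CONTEXT_LEN:t])
    past_frames = past_frames + NOISE_STD * rng.standard_normal(past_frames.shape)
    past_actions = np.array(actions[t - CONTEXT_LEN:t])
    examples.append((past_frames.astype(np.float32), past_actions, frames[t]))

print(f"built {len(examples)} conditioned next-frame training examples")
```

In this sketch a diffusion model would then be trained to denoise the target frame given the (augmented) context frames and actions, and at inference time its own outputs would be fed back as context to generate long trajectories.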