index.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>DeepDoom</title>

<style type="text/css">
body {
font-family: "Courier New",Courier,monospace;
font-weight: normal;
font-style: normal;
text-transform: none;
text-align: left;
background-color: #43e8ff;
}

.description {
	list-style-type: circle;
}

.files {
	list-style-type: square;
}

</style>
</head>
<body>
<h1 style="color: rgb(253, 5, 19);"><span>DeepDoom</span></h1>
<h2>About</h2>
<hr>
<p>Applying Deep Reinforcement Learning Techniques on the VizDoom environment to learn
navigation behaviors. Our goal is to train Deep Q-Learning Networks on simple navigational
tasks and combining them to solve more complex navigational tasks.</p>
<h4>Group Roles:</h4>
<h5>Rafael Zamora</h5>
<ul>
<li>Team Leader</li>
<li>Machine Learning Specialist/A.I. Designer</li>
</ul>
<h5>Lauren An</h5>
<ul>
<li>Support Programmer</li>
</ul>
<h5>Will Steele</h5>
<ul>
<li>Hardware Spectialist</li>
<li>Head of Documentation</li>
</ul>
<h5>Josh Hidayat</h5>
<ul>
<li>Head of Data Gathering</li>
</ul>
<h2>Scenarios</h2>
<hr>
<p>We designed a set of scenarios where the agent will learn
specific behaviors. These
scenarios where created using Doom Builder and VizDoom. The following
are descriptions of
the scenarios:</p>
<h4>Scenario 1: Rigid Turning</h4>
<h5>Description:</h5>
<ul class="description">
	<li>The purpose of this scenario is to train the AI on navigating through corridors with sharp 90&deg; turns.</li>
	<li>Map is a rigid S-shape consisting of two left and two right turns, with straight corridors connecting them, followed by a third left turn that is beyond the player exit.</li>
	<li>The player starts at one end of the corridor</li>
	<li>The textures of the walls, floors, and ceiling are randomly selected by an ACS script from a pool of predefined textures</li>
	<li>The player gets rewarded for progressing from one end of the 'S' to the other.</li>
	<li>The player gets penalized for bumping into walls and not moving.</li>
</ul>
<p><em>Available Actions</em>: [MOVE_FORWARD,
MOVE_BACKWARD, TURN_LEFT, TURN_RIGHT]</p>
<img src="https://raw.githubusercontent.com/Atlas-Soft/DeepDoom/master/webPhotos/rigid_turning.PNG" alt="rigid_turning.wad map" width="40%">
<h5>Goal Function:</h5>
<ul class="goal">
	<li><b>+50</b> reward checkpoints</li>
	<li><b>+100</b> level exit</li>
	<li><b>-10</b> hitting walls</li>
	<li><b>-1</b> living reward</li>
</ul>
<h5>Files:</h5>
<ul class"files">
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/blob/master/src/wads/rigid_turning.wad">rigid_turning.wad</a></li>
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/tree/master/src/configs/rigid_turning.cfg">rigid_turning.cfg</a></li>
</ul>

<h4>Scenario 2: Exit Finding</h4>
<h5>Description:</h5>
<ul class="description">
	<li>This scenario is designed to train the AI to spot an exit from a room, in the form of a hallway branching off this room, and move into that exit.</li>
	<li>Map is a square room where player starts in, with long 128-unit-wide corridor leading out of it.</li>
	<li>Player is randomly placed at a point inside the square starting room by a ZDoom ACS script that runs when player enters the map.</li>
	<li>The textures of the walls and floors are selected randomly by another ACS script from a pool of predefined textures.</li>
	<li>The player is rewarded for moving closer to the exit while looking at it (the exit is within a 21.6&deg; field of view relative to player's direction).</li>
	<li>The player does not receive any reward for moving towards the exit while not looking at it.</li>
	<li>The player is penalized for bumping into walls and not moving.</li>
</ul>
<p><em>Available Actions</em>: [MOVE_FORWARD,
MOVE_BACKWARD, TURN_LEFT, TURN_RIGHT]</p>
<img src="https://raw.githubusercontent.com/Atlas-Soft/DeepDoom/master/webPhotos/exit_finding.PNG" alt="exit_finding.wad map" width="40%">
<h5>Goal Function:</h5>
<ul class="goal">
	<li><b>+1</b> moving reward (changes player x,y position)</li>
	<li><b>+10 * (x)</b>  moving closer to goal while looking at it where x corresponds inversely to distance to the exit.  x decreases as distance increases</li>
	<li><b>+100</b> level exit</li>
	<li><b>-10</b> hitting walls</li>
	<li><b>-1</b> living reward</li>
</ul>
<h5>Files:</h5>
<ul class="files">
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/blob/master/src/wads/exit_finding.wad">exit_finding.wad</a></li>
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/tree/master/src/configs/exit_finding.cfg">exit_finding.cfg</a></li>
</ul>

<h4>Scenario 3: Switches (No longer needed)</h4><strike>
<h5>Description:</h5>
<ul class="description">
	<li>This scenario is designed to train the AI to spot a switch on a wall.</li>
	<li>Map is a square room where player starts in, with a button placed on the south wall.</li>
	<li>Player is randomly placed at a point inside the room, facing a random direction, by a ZDoom ACS script that runs when player enters the map.</li>
	<li>The textures of the walls, floor, ceiling, and button are selected randomly by another ACS script from a pool of predefined textures.</li>
	<li>The player is rewarded for moving closer to the switch while looking at it (the switch is within a 21.6&deg; field of view relative to player's direction).</li>
	<li>The player does not receive any reward for moving towards the switch while not looking at it.</li>
	<li>The player is penalized for bumping into walls and not moving.</li>
</ul>
<p><em>Available Actions</em>: [USE, MOVE_FORWARD,
MOVE_BACKWARD, TURN_LEFT, TURN_RIGHT]</p>
<img src="https://raw.githubusercontent.com/Atlas-Soft/DeepDoom/master/webPhotos/Switches.PNG" alt="Switches.wad map" width="40%">
<h5>Goal Function:</h5>
<ul class="goal">
	<li><b>+1</b> moving reward (changes player x,y position)</li>
	<li><b>+10 * (x)</b>  moving closer to goal while looking at it where x corresponds inversely to distance to the switch.  x decreases as distance increases</li>
	<li><b>+100</b> pressing the switch</li>
	<li><b>-1</b> living reward</li>
</ul>
<h5>Files:</h5>
<ul class="files">
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/blob/master/src/wads/Switches.wad">Switches.wad</a></li>
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/tree/master/src/configs/switches.cfg">switches.cfg</a></li>
</ul></strike>

<h4>Scenario 4: Doors</h4>
<h5>Description:</h5>
<ul class="description">
	<li>This scenario is designed to train the AI to recognize and open doors.</li>
	<li>Map is a straight rectangular corridor with 9 doors placed inside it.</li>
	<li>Player is placed at one end of this corridor and is expected to proceed straight towards the exit.</li>
	<li>The textures of the walls, floor, ceiling, and doors are selected randomly by an ACS script that runs when the player enters the map.</li>
	<li>The player is rewarded for advancing towards doors, for advancing through doors that are open, and for reaching the exit.</li>
	<li>The player is penalized for not moving.</li>
</ul>
<p><em>Available Actions</em>: [USE, MOVE_FORWARD]</p>
<img src="https://raw.githubusercontent.com/Atlas-Soft/DeepDoom/master/webPhotos/Doors.PNG" alt="Doors.wad map" width="40%">
<h5>Goal Function:</h5>
<ul class="goal">
	<li><b>+1</b> moving reward (changes player x,y position)</li>
	<li><b>+50</b> passing through an open door</li>
	<li><b>+10</b> moving towards the next door</li>
	<li><b>+20</b> reaching level exit</li>
	<li><b>-1</b> living reward</li>
</ul>
<h5>Files:</h5>
<ul class="files">
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/blob/master/src/wads/Doors.wad">Doors.wad</a></li>
	<li><a href="https://github.com/Atlas-Soft/DeepDoom/tree/master/src/configs/doors.cfg">doors.cfg</a></li>
</ul>

<h2>Results:</h2>
<p>In Progress;<br>
Update 3/23/17: 
Our AI now has been trained on all skills required for our composite level, and our higher level model is now training.
<br><img src="https://raw.githubusercontent.com/Atlas-Soft/DeepDoom/master/doc/figures/rigid_turning_total_reward.png">   <img src="https://raw.githubusercontent.com/Atlas-Soft/DeepDoom/master/doc/figures/rigid_turning_loss.png"><br>
Our program is hard at work learning how to navigate through the other scenarios. This process
will take awhile, giving us time to work on documentation.
Stay tuned for more!</p>
<iframe src="http://free.timeanddate.com/countdown/i5lrklfh/n4513/cf12/cm0/cu5/ct0/cs0/ca1/co1/cr0/ss0/cac000/cpc000/pct/tcfff/fs100/szw576/szh243/tatProject%20Due%20in%3A/tac000/tptProject%20was%20due%3A/tpc000/iso2017-03-31T23:59:59" allowTransparency="true" frameborder="0" width="269" height="89">some pig</iframe>
<h2 id="resources">Resources:</h2>
<ul>
	<li><a href="https://github.com/marqt/vizdoom">VizDoom</a></li>
	<li><a href="https://github.com/fchollet/keras">Keras</a></li>
	<li><a href="http://matplotlib.org/">Matplotlib</a></li>
	<li><a href="https://www.tensorflow.org/">TensorFlow</a></li>
</ul>
<p>Our Deep Q-Learning implementation is based on the following
sources:</p>
<ul>
	<li><a href="https://github.com/farizrahman4u/qlearning4k">Qlearning4k</a></li>
	<li><a href="https://www.cs.toronto.edu/%7Evmnih/docs/dqn.pdf">Playing Atari with Deep Reinforcement Learning</a></li>
	<li><a href="http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html">Human-level control through deep reinforcement learning</a></li>
</ul>
<h4>Setup and Installation:</h4>
<p>Download or clone repository and install required packages.</p>
<p>The <a href="https://github.com/Atlas-Soft/DeepDoom/blob/master/src">/src/</a>
folder includes all scripts used for this project.</p>
<p>The <a href="https://github.com/Atlas-Soft/DeepDoom/blob/master/src/wads">/src/wads/</a>
folder contains the wad files for the scenarios.</p>
<p>The <a href="https://github.com/Atlas-Soft/DeepDoom/blob/master/src/configs">/src/configs/</a>
folder contains the config files for the scenarios.</p>
<p></p>
<p><small>Last Edited 3/10/2017</small></p>
</body></html>