\documentclass{exam}
\usepackage{amsmath, amsfonts}
\usepackage{verbatim}
\usepackage{graphicx}
\usepackage[super]{nth}
\usepackage[hyperfootnotes=false]{hyperref}
\usepackage[usenames,dvipsnames]{color}
\newcommand{\note}[1]{
\noindent~\\
\vspace{0.25cm}
\fcolorbox{Red}{Orange}{\parbox{0.99\textwidth}{#1\\}}
%{\parbox{0.99\textwidth}{#1\\}}
\vspace{0.25cm}
}
%\input{../macros}
%\renewcommand{\hide}[1]{#1}
\qformat{\thequestion. \textbf{\thequestiontitle}\hfill}
\bonusqformat{\thequestion. \textbf{\thequestiontitle}\hfill}
\pagestyle{headandfoot}
%%%%%% MODIFY FOR EACH SHEET!!!! %%%%%%
\newcommand{\duedate}{12.01.2021 (15:00)}
\newcommand{\due}{{\bf This assignment is due on \duedate.} }
\firstpageheader{Reinforcement Learning assignment 9}{}{\due}
\runningheader{Due: \duedate}{\assignment{9}}{\semester}
%%%%%% MODIFY FOR EACH SHEET!!!! %%%%%%
\firstpagefooter{}{\thepage}{}
\runningfooter{}{\thepage}{}
\headrule
\pointsinrightmargin
\bracketedpoints
\marginpointname{pt.}
\begin{document}
\noindent Like last time, you will have no tests and no set guidelines. Your task will be to complete the code stub you will find in the \texttt{vacuum.py} file and build an environment to control an automatic vacuum cleaner.
Your environment should adhere to the OpenAI Gym format, though you don't need to render anything. Start by implementing the basic behavior the agent should learn (e.g. moving) before adding options for greater difficulty. These could include extra dirty spots in the room, different apartment layouts, breakable vacuum cleaners, or even additional functions like dusting.
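As a rough orientation, the following is a minimal sketch of what such an environment could look like. It is not the stub from \texttt{vacuum.py}: the class name, grid layout, action encoding, and reward values are all assumptions made for illustration. To stay free of dependencies it only mirrors the classic Gym interface, where \texttt{reset()} returns an observation and \texttt{step(action)} returns the tuple (observation, reward, done, info). A real solution would subclass \texttt{gym.Env} and declare \texttt{action\_space} and \texttt{observation\_space}.

\begin{verbatim}
```python
import random


class VacuumGridEnv:
    """Hypothetical grid-world vacuum environment (not the course stub)."""

    # Assumed action encoding: 0 = up, 1 = down, 2 = left, 3 = right, 4 = suck
    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}
    N_ACTIONS = 5

    def __init__(self, size=4, n_dirty=3, max_steps=50, seed=None):
        self.size = size
        self.n_dirty = n_dirty      # one possible difficulty knob
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def _obs(self):
        # Observation: robot position plus the sorted dirty cells.
        return (self.pos, tuple(sorted(self.dirty)))

    def reset(self):
        cells = [(r, c) for r in range(self.size) for c in range(self.size)]
        self.pos = (0, 0)
        self.dirty = set(self.rng.sample(cells, self.n_dirty))
        self.steps = 0
        return self._obs()

    def step(self, action):
        if action not in range(self.N_ACTIONS):
            raise ValueError(f"invalid action: {action}")
        self.steps += 1
        reward = -0.1  # assumed small cost per step
        if action == 4:
            if self.pos in self.dirty:
                self.dirty.remove(self.pos)
                reward = 1.0  # assumed bonus for cleaning a dirty cell
        else:
            dr, dc = self.MOVES[action]
            # Walls clamp the robot to the grid.
            r = min(max(self.pos[0] + dr, 0), self.size - 1)
            c = min(max(self.pos[1] + dc, 0), self.size - 1)
            self.pos = (r, c)
        # Episode ends when the room is clean or the step budget runs out.
        done = not self.dirty or self.steps >= self.max_steps
        return self._obs(), reward, done, {}
```
\end{verbatim}

In this sketch, difficulty options such as more dirty spots map onto constructor arguments, so the basic version and the harder variants can share one class.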
To test whether your design works as expected, run an agent on your environment from time to time.
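One way to set up such a check is sketched below. The helper \texttt{evaluate} and the stand-in \texttt{CorridorEnv} are hypothetical names, not part of the provided stub, and the two fixed policies are only baselines. A learning agent would be plugged in as the \texttt{policy} argument in the same way. The sketch again assumes the classic four-value return of \texttt{step}.

\begin{verbatim}
```python
import random


class CorridorEnv:
    """Hypothetical stand-in: reach the dirt at the right end of a corridor."""

    def __init__(self, length=5, max_steps=100):
        self.length = length        # longer corridor = harder task
        self.max_steps = max_steps

    def reset(self):
        self.pos = 0
        self.steps = 0
        return self.pos

    def step(self, action):
        # Action 1 moves right, anything else moves left; walls clamp.
        delta = 1 if action == 1 else -1
        self.pos = min(max(self.pos + delta, 0), self.length - 1)
        self.steps += 1
        done = self.pos == self.length - 1 or self.steps >= self.max_steps
        return self.pos, -1.0, done, {}


def evaluate(env, policy, n_episodes=20):
    """Return the undiscounted return of each episode under `policy`."""
    returns = []
    for _ in range(n_episodes):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns.append(total)
    return returns


rng = random.Random(0)


def random_policy(obs):
    # Baseline that ignores the observation entirely.
    return rng.choice([0, 1])


def always_right(obs):
    # Hand-written optimal policy for the corridor.
    return 1


for length in (3, 6):
    env = CorridorEnv(length=length)
    rand = evaluate(env, random_policy)
    best = evaluate(env, always_right, n_episodes=1)
    print(length, sum(rand) / len(rand), best[0])
```
\end{verbatim}

Comparing the returns across difficulty settings, and against a random baseline, gives a quick signal of whether a new option changed the task in the intended direction.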
Ideally, the agent will learn more slowly, or not at all, whenever you add a new difficulty. If your design decisions are flawed, they may even make the task easier or much too hard, so make sure to check. Good luck and have fun!
\end{document}