Expert-Guided ML for 2D car driving trained under intentionally poor demonstrations.
The goal is to use experts (most likely humans, although not necessarily) to train a neural network to drive a car without crashing, while providing only intentionally poor driving demonstrations.
This project was started as a possible final project for CS 5170 at Northeastern University in the Spring 2021 semester.
The PostgreSQL and Redis ports are left open rather than merely exposed, in case these databases need to be accessed from other servers. This can be changed if needed.
make dev
Then visit http://localhost:8080/ to see the website running.
Software:
- Docker
Python3 libraries:
- matplotlib
- numpy
- shapely
All commands are run starting from the root directory of this repository. Unless noted otherwise in this README, every Python script in reinforcement_learning describes its flags when run with --help.
- Generate the necessary matrices and data
Example:
cd reinforcement_learning/
python3 create_circuit.py --circuit circuits/five.json --output circuits/five_Q_matrix.json --show
To show all the options:
cd reinforcement_learning/
python3 create_circuit.py --help
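For readers curious how a script can document its own flags, here is a minimal argparse sketch. It is not the code in create_circuit.py: the flag names are taken from the example command above, but the help strings, the meaning attributed to --show, and the function name build_parser are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the three flags used in the example command."""
    parser = argparse.ArgumentParser(
        description="Hypothetical sketch of a circuit-generation CLI"
    )
    # Help strings below are assumptions, not the real script's wording.
    parser.add_argument("--circuit", required=True,
                        help="path to the input circuit JSON file")
    parser.add_argument("--output", required=True,
                        help="path where the generated JSON is written")
    parser.add_argument("--show", action="store_true",
                        help="display the result (assumed meaning)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run as-is.
args = build_parser().parse_args(
    ["--circuit", "circuits/five.json",
     "--output", "circuits/five_Q_matrix.json",
     "--show"]
)
print(args.circuit, args.output, args.show)

# argparse generates the --help text automatically from the definitions.
print(build_parser().format_help())
```

Because argparse derives the help text from the argument definitions, keeping those definitions accurate is enough to keep --help in sync with the script.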
- (Optional) Run an A* search to obtain a near-optimal baseline for comparison
Example:
cd reinforcement_learning/
python3 q_Astar_trainer.py --A-star-runs 50 --data circuits/five_Q_matrix.json --output demonstration_data/five_A_1.json
Note: A* as implemented here is not guaranteed to be optimal, since it uses the L1 norm (Manhattan distance) as its heuristic. However, it searches faster than breadth-first search while still producing good results.
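As an illustration of the idea, the following is a minimal sketch of A* on a small grid with the Manhattan distance as the heuristic. It is not the code in q_Astar_trainer.py: the grid encoding ('#' marks a wall), the four-direction unit-cost moves, and the names manhattan and a_star are assumptions made for this example.

```python
import heapq


def manhattan(a, b):
    """L1 norm between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid, start, goal):
    """Return a list of cells from start to goal, or None if unreachable.

    grid is a list of equal-length strings; '#' is a wall (assumed encoding).
    """
    rows, cols = len(grid), len(grid[0])
    # Heap entries are (f = g + h, g, cell).
    frontier = [(manhattan(start, goal), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if g > cost[cell]:
            continue  # stale entry: a cheaper route to this cell was found
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] == "#":
                continue
            new_cost = g + 1  # every move costs 1 in this sketch
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = cell
                heapq.heappush(
                    frontier, (new_cost + manhattan(nxt, goal), new_cost, nxt)
                )
    return None


path = a_star(["..#", "..#", "..."], (0, 0), (2, 2))
print(path)
```

In this toy setting, with only four-direction unit-cost moves, the Manhattan distance never overestimates the remaining cost. Whether that holds for the car's actual state and action space is a separate question.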
- Set up the Docker containers
Modify the .env file to change the credentials.
Copy the Q matrix circuit generated in step 1 (five_Q_matrix.json) into server/main_node/circuits as five.json (this is already done in the repository):
cat reinforcement_learning/circuits/five_Q_matrix.json > server/main_node/circuits/five.json
Note: sudo permission may be necessary.
make deploy
To bring the containers down:
make teardown
- Train
Create an account by going to the appropriate URL and signing up, then click on Driving RL. Perform as many positive- and negative-intent demonstrations as needed; here, only 20 of each were done.
Enter the main_node container (sudo may be needed), combine all positive- and negative-intent data, and export it:
# Enter the container
docker exec -it server_main_node_1 bash
# Combine the data
python3 data_combiner.py
# Exit the container and export the data
exit
docker cp server_main_node_1:/DARLMID/data/combined_negative.json reinforcement_learning/demonstration_data/combined_negative.json
docker cp server_main_node_1:/DARLMID/data/combined_positive.json reinforcement_learning/demonstration_data/combined_positive.json
- Q-learning
Note: reinforcement_learning/q_compare_1v1.py does not accept flags or arguments; any changes must be made in the file itself.
Run a simple Q-learning agent without any demonstrations:
Example:
cd reinforcement_learning/
python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
--discount-factor 0.3 --data circuits/five_Q_matrix.json \
--output results/five_output.json \
--show
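To make the three hyperparameters in the command above concrete, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not the code in q_learn.py: the corridor environment, the action names, the per-epoch step cap, and all function names are assumptions made for illustration. Only the numeric values (300 epochs, 0.15, 0.25, 0.3) come from the example command.

```python
import random

ACTIONS = ("left", "right")  # hypothetical action set
GOAL = 4                     # hypothetical corridor with states 0..4


def step(state, action):
    """Toy environment: move along a corridor; reaching GOAL pays 1."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q, state, explore_probability, rng):
    """Epsilon-greedy: explore with the given probability, else act greedily."""
    if rng.random() < explore_probability:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def q_update(q, state, action, reward, next_state,
             learning_rate, discount_factor):
    """Standard Q-learning update; returns the new Q(state, action)."""
    best_next = max(q[next_state].values())
    target = reward + discount_factor * best_next
    q[state][action] += learning_rate * (target - q[state][action])
    return q[state][action]


def train(epochs=300, explore_probability=0.15, learning_rate=0.25,
          discount_factor=0.3, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(GOAL + 1)}
    for _ in range(epochs):
        state = 0
        for _ in range(max_steps):  # cap episode length (assumption)
            action = choose_action(q, state, explore_probability, rng)
            next_state, reward = step(state, action)
            q_update(q, state, action, reward, next_state,
                     learning_rate, discount_factor)
            state = next_state
            if state == GOAL:
                break
    return q


q_table = train()
for state, row in q_table.items():
    print(state, row)
```

The update rule is Q(s, a) += learning_rate * (reward + discount_factor * max Q(s', ·) - Q(s, a)). A low discount factor such as 0.3 makes the agent weigh immediate rewards much more heavily than distant ones.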
Optional: Run a Q-learning agent using the A* demonstrations from step 2:
Example:
cd reinforcement_learning/
python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
--discount-factor 0.3 --data circuits/five_Q_matrix.json \
--positive-demonstration demonstration_data/five_A_1.json \
--output results/five_A_1_output.json \
--show
# Compare both models
# Update q_compare_1v1.py first if needed
vi q_compare_1v1.py
python3 q_compare_1v1.py
Run a Q-learning agent using the positive demonstrations from step 4:
Example:
cd reinforcement_learning/
python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
--discount-factor 0.3 --data circuits/five_Q_matrix.json \
--positive-demonstration demonstration_data/combined_positive.json \
--output results/five_positive_output.json \
--show
# Compare both models
# Update q_compare_1v1.py first if needed
vi q_compare_1v1.py
python3 q_compare_1v1.py
Run a Q-learning agent using negative demonstrations from step 4:
Example:
cd reinforcement_learning/
python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
--discount-factor 0.3 --data circuits/five_Q_matrix.json \
--negative-demonstration demonstration_data/combined_negative.json \
--output results/five_negative_output.json \
--show
# Compare both models
# Update q_compare_1v1.py first if needed
vi q_compare_1v1.py
python3 q_compare_1v1.py
Run a Q-learning agent using both positive and negative demonstrations from step 4:
Example:
cd reinforcement_learning/
python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
--discount-factor 0.20 --data circuits/five_Q_matrix.json \
--positive-demonstration demonstration_data/combined_positive.json \
--negative-demonstration demonstration_data/combined_negative.json \
--output results/five_positive_and_negative_output.json \
--show
# Compare both models
# Update q_compare_1v1.py first if needed
vi q_compare_1v1.py
python3 q_compare_1v1.py
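This README does not describe how q_learn.py uses the demonstration files. As one possible interpretation, shown purely as an assumption, demonstrations could bias the Q table before training: state-action pairs from positive demonstrations receive a bonus, and pairs from negative demonstrations receive a penalty. The data layout (lists of (state, action) pairs), the function name seed_q_table, and the bonus and penalty values are all hypothetical.

```python
def seed_q_table(q, positive=(), negative=(), bonus=1.0, penalty=1.0):
    """Bias a Q table with demonstrations (hypothetical scheme).

    q maps state -> {action: value}. positive and negative are iterables
    of (state, action) pairs; missing entries start at 0.0.
    """
    for state, action in positive:
        q.setdefault(state, {}).setdefault(action, 0.0)
        q[state][action] += bonus      # encourage demonstrated good moves
    for state, action in negative:
        q.setdefault(state, {}).setdefault(action, 0.0)
        q[state][action] -= penalty    # discourage demonstrated bad moves
    return q


q_table = {"s0": {"left": 0.0, "right": 0.0}}
seed_q_table(
    q_table,
    positive=[("s0", "right")],
    negative=[("s0", "left")],
)
greedy = max(q_table["s0"], key=q_table["s0"].get)
print(q_table, greedy)
```

Under this scheme, negative demonstrations only tell the agent what to avoid, so they are most useful alongside exploration or positive demonstrations that indicate what to do instead.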
Parts of this project utilize software and images which are licensed under different conditions. An overview of these materials, licenses, and conditions is provided in the licenses subdirectory.
- https://docs.aiohttp.org/en/stable/deployment.html
- https://stackoverflow.com/questions/52569051/aiohttp-and-nginx-running-in-docker
- https://docs.gunicorn.org/en/stable/install.html
- https://docs.docker.com/storage/volumes/
- https://docs.nginx.com/nginx/admin-guide/web-server/serving-static-content/
- http://nginx.org/en/docs/beginners_guide.html#static
- https://mkyong.com/html/html-tutorial-hello-world/
- https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Flexible_Box_Layout/Aligning_Items_in_a_Flex_Container
- https://www.w3schools.com/html/tryit.asp?filename=tryhtml_images_trulli
- https://www.digitalocean.com/community/tutorials/how-to-configure-nginx-to-use-custom-error-pages-on-ubuntu-14-04
- https://commons.wikimedia.org/wiki/File:Aft_(PSF).png
- https://hub.docker.com/_/postgres
- https://stackoverflow.com/questions/45128902/psycopg2-and-sql-injection-security
- https://docs.github.com/en/github/importing-your-projects-to-github/adding-an-existing-project-to-github-using-the-command-line
- https://pkgs.alpinelinux.org/package/edge/main/x86/postgresql-dev
- https://www.w3schools.com/css/css_howto.asp
- https://www.dummies.com/web-design-development/html5-and-css3/how-to-use-an-external-style-sheet-for-html5-and-css3-programming/
- https://www.w3schools.com/howto/tryit.asp?filename=tryhow_css_topnav
- https://stackoverflow.com/questions/8722163/how-to-assign-multiple-classes-to-an-html-container
- https://www.w3schools.com/colors/colors_names.asp
- https://www.w3schools.com/howto/howto_css_fixed_footer.asp
- https://stackoverflow.com/questions/45764517/how-to-return-redirect-response-from-aiohttp-web-server
- http://demos.aiohttp.org/en/latest/tutorial.html#middlewares
- https://www.w3schools.com/css/css3_gradients.asp
- https://stackoverflow.com/questions/29573489/nginx-failing-to-load-css-and-js-files-mime-type-error
- https://stackoverflow.com/questions/2242086/how-to-detect-the-screen-resolution-with-javascript
- https://stackoverflow.com/questions/15615552/get-div-height-with-plain-javascript
- https://stackoverflow.com/questions/19484544/set-height-of-div-to-height-of-another-div-through-css
- https://www.w3schools.com/js/js_functions.asp
- https://stackoverflow.com/questions/807878/how-to-make-javascript-execute-after-page-load
- https://stackoverflow.com/questions/34796085/how-to-stick-footer-to-bottom-not-fixed-even-with-scrolling/34796186
- https://stackoverflow.com/questions/19039628/how-to-calculate-height-of-viewable-area-i-e-window-height-minus-address-bo
- https://www.w3schools.com/jsref/event_onresize.asp
- https://freesvg.org/nemeth-flying-machine
- https://www.w3schools.com/css/tryit.asp?filename=trycss3_border-radius
- https://stackoverflow.com/questions/54845686/nginx-wont-serve-svg-files
- https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_table_test
- https://www.w3schools.com/html/tryit.asp?filename=tryhtml_form_submit
- https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/minlength
- https://stackoverflow.com/questions/1297449/change-image-size-with-javascript
- https://stackoverflow.com/questions/9686538/align-labels-in-form-next-to-input
- https://freesvg.org/international-space-station-vector-drawing
- https://www.w3schools.com/cssref/css_units.asp
- https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_form_submit
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length
- https://stackoverflow.com/questions/6199773/how-to-enable-disable-an-html-button-based-on-scenarios
- https://stackoverflow.com/questions/195951/how-can-i-change-an-elements-class-with-javascript
- https://stackoverflow.com/questions/3547035/javascript-getting-html-form-values
- https://stackoverflow.com/questions/32459646/removing-the-shadow-from-a-button
- https://stackoverflow.com/questions/15110484/javascript-how-to-append-div-in-begining-of-another-div
- https://developer.mozilla.org/en-US/docs/Web/API/ParentNode/prepend
- https://stackoverflow.com/questions/16584121/change-div-id-by-javascript
- https://stackoverflow.com/questions/596467/how-do-i-convert-a-float-number-to-a-whole-number-in-javascript
- https://stackoverflow.com/questions/11722400/programmatically-change-the-src-of-an-img-tag
- https://stackoverflow.com/questions/21727317/how-to-check-confirm-password-field-in-form-without-reloading-page
- https://stackoverflow.com/questions/39449739/aiohttp-how-to-retrieve-the-data-body-in-aiohttp-server-from-requests-get
- https://stackoverflow.com/questions/52246796/await-a-method-and-assign-a-variable-to-the-returned-value-with-asyncio
- https://stackoverflow.com/questions/46428889/keeping-pycache-out-of-my-repository-when-adding-committing-from-pythonany
- https://www.w3schools.com/css/tryit.asp?filename=trycss_table_align_center
- https://stackoverflow.com/questions/29775797/fetch-post-json-data
- https://github.com/ritua2/gib/blob/master/middle-layer/.env
- https://github.com/ritua2/gib/blob/master/middle-layer/docker-compose.yml
- https://hub.docker.com/_/redis
- https://www.psycopg.org/docs/module.html#psycopg2.connect
- https://www.postgresqltutorial.com/postgresql-create-table/
- https://stackoverflow.com/questions/50070877/postgres-psycopg2-create-table
- https://www.postgresql.org/docs/8.0/sql-createuser.html
- http://oliviertech.com/python/generate-SHA512-hash-from-a-String/
- https://stackoverflow.com/questions/4244896/dynamically-access-object-property-using-variable
- https://stackoverflow.com/questions/45018338/javascript-fetch-api-how-to-save-output-to-variable-as-an-object-not-the-prom/45018619
- https://tldrlegal.com/license/apache-license-2.0-(apache-2.0)
- http://www.apache.org/licenses/LICENSE-2.0.txt
- https://github.com/mozilla/bleach
- https://bleach.readthedocs.io/en/latest/clean.html
- https://github.com/aio-libs/aiohttp/blob/master/examples/web_cookies.py
- https://stackoverflow.com/questions/26745519/converting-dictionary-to-json
- https://github.com/js-cookie/js-cookie
- https://docs.aiohttp.org/en/stable/web_reference.html
- https://docs.python.org/3/library/sys.html
- https://docs.python.org/3/howto/argparse.html
- https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
- https://dillinger.io/
- https://stackoverflow.com/questions/9215658/plot-a-circle-with-pyplot
- https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/fill.html
- https://stackoverflow.com/questions/2849286/python-matplotlib-subplot-how-to-set-the-axis-range
- https://www.w3schools.com/python/ref_keyword_assert.asp
- https://stackoverflow.com/questions/26226816/argparse-making-required-flags
- https://shapely.readthedocs.io/en/stable/manual.html
- https://gis.stackexchange.com/questions/95670/creating-shapely-linestring-from-two-points
- https://docs.blender.org/manual/en/latest/getting_started/installing/linux.html
- https://www.w3schools.com/html/tryit.asp?filename=tryhtml_table
- https://commons.wikimedia.org/wiki/File:Car_in_Black_Rock_Desert.jpg
- https://smallbusiness.chron.com/crop-circle-out-picture-gimp-36366.html
- https://splidejs.com/getting-started/
- https://www.w3schools.com/tags/att_script_defer.asp
- https://splidejs.com/
- https://stackoverflow.com/questions/15121343/how-to-center-a-p-element-inside-a-div-container
- https://upload.wikimedia.org/wikipedia/commons/thumb/6/69/Storsj%C3%B6n_i_Vindelns_kommun.jpg/1280px-Storsj%C3%B6n_i_Vindelns_kommun.jpg
- https://web.dev/browser-level-image-lazy-loading/
- https://davidwalsh.name/lazyload-image-fade
- https://developer.mozilla.org/en-US/docs/Web/API/Element/removeAttribute
- Asked some of Carlos' friends for feedback on the front-end's look
- https://stackoverflow.com/questions/534839/how-to-create-a-guid-uuid-in-python
- https://aioredis.readthedocs.io/en/v1.3.0/examples.html
- https://aioredis.readthedocs.io/en/v1.3.0/mixins.html
- https://aioredis.readthedocs.io/en/v1.3.0/api_reference.html
- https://redis.io/commands/expire
- https://www.w3schools.com/tags/tryit.asp?filename=tryhtml5_script_async
- https://wiki.freecadweb.org/Topological_data_scripting
- https://json2html.com/
- https://json2html.com/examples/
- https://stackoverflow.com/questions/684672/how-do-i-loop-through-or-enumerate-a-javascript-object
- https://api.jquery.com/jQuery.isEmptyObject/
- https://code.jquery.com/
- https://www.quackit.com/html/howto/how_to_make_a_background_image_not_repeat.cfm
- https://stackoverflow.com/questions/1085801/get-selected-value-in-dropdown-list-using-javascript
- https://select2.org/getting-started/installation
- https://select2.org/getting-started/basic-usage
- https://www.w3schools.com/jsref/jsref_length_array.asp
- https://stackoverflow.com/questions/30650961/functional-way-to-iterate-over-range-es6-7
- https://stackoverflow.com/questions/10879045/how-to-set-opacity-in-parent-div-and-not-affect-in-child-div
- https://github.com/jonobr1/two.js/
- https://two.js.org/
- https://www.geeksforgeeks.org/python-os-path-isfile-method/
- https://www.w3schools.com/jsref/met_element_remove.asp
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors/Missing_colon_after_property_id
- https://jsonlint.com/
- https://www.w3schools.com/howto/tryit.asp?filename=tryhow_css_list_without_bullets
- https://www.w3schools.com/howto/howto_css_list_without_bullets.asp
- https://code.tutsplus.com/tutorials/drawing-with-twojs--net-32024
- jonobr1/two.js#144
- Previous and continuing coursework materials
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Default_parameters
- https://stackoverflow.com/questions/21227287/make-div-scrollable
- https://matplotlib.org/stable/gallery/color/named_colors.html
- https://en.wikiversity.org/wiki/Python_Programming/Classes
- https://en.wikipedia.org/wiki/Q-learning
- Previous Q-learning homework assignment, provided in CS 5100 (Northeastern University)
- https://docs.python.org/3/library/random.html
- Artificial Intelligence A Modern Approach Third Edition (Stuart Russell, Peter Norvig)
- https://stackoverflow.com/questions/927358/how-do-i-undo-the-most-recent-local-commits-in-git
- https://en.wikipedia.org/wiki/Taxicab_geometry
- https://docs.python.org/3/library/heapq.html
- https://wiki.python.org/moin/TimeComplexity
- https://stackoverflow.com/questions/33282368/plotting-a-2d-heatmap-with-matplotlib
- https://stackoverflow.com/questions/36343928/python-heatmap-plot-colorbar
- https://stackoverflow.com/questions/8396101/invert-image-displayed-by-imshow-in-matplotlib
- https://stackoverflow.com/questions/1527803/generating-random-whole-numbers-in-javascript-in-a-specific-range
- https://www.w3schools.com/js/js_classes.asp
- https://www.w3schools.com/js/js_comparisons.asp
- https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_concat
- https://stackoverflow.com/questions/3396754/onkeypress-vs-onkeyup-and-onkeydown
- https://stackoverflow.com/questions/24028225/addeventlistener-keypress-doesnt-register-key-presses
- https://css-tricks.com/snippets/javascript/javascript-keycodes/
- https://stackoverflow.com/questions/2647867/how-can-i-determine-if-a-variable-is-undefined-or-null
- https://stackoverflow.com/questions/31746182/docker-compose-wait-for-container-x-before-starting-y
- https://stackoverflow.com/questions/20895290/count-number-of-files-within-a-directory-in-linux