Commit

lint
nathanjzhao committed Aug 8, 2024
1 parent 9c99714 commit a236903
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions training.notes
@@ -0,0 +1,12 @@

# Current tests:
- Hidden layer size of 256 shows progress (loss is based on state.q[2])

- Setting std to zero makes the rewards NaN. Why? I wonder if there NEEDS to be randomization in the environment.

- ctrl cost is what's giving NaNs? Interesting.
- It is unrelated to randomization of the environment. I think it is gradient related.

- The first things to become NaN seem to be actor loss and scores. After that, everything becomes NaN.

- Fixed entropy epsilon. Hope this works now.
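
A minimal sketch of one way a zero std can produce NaNs in the actor loss, and how a small epsilon avoids it. This is an illustration in plain NumPy and not the training code from this repo. The `EPS` value and the helper names (`gaussian_log_prob`, `gaussian_entropy`, `first_non_finite`) are hypothetical, and it assumes the policy is a diagonal Gaussian.

```python
import numpy as np

EPS = 1e-6  # hypothetical floor, not the value used in this commit


def gaussian_log_prob(x, mean, std, eps=0.0):
    """Log-density of a diagonal Gaussian, summed over the last axis."""
    std = np.asarray(std, dtype=np.float64) + eps
    z = (np.asarray(x) - np.asarray(mean)) / std
    return np.sum(-0.5 * z**2 - np.log(std) - 0.5 * np.log(2.0 * np.pi), axis=-1)


def gaussian_entropy(std, eps=0.0):
    """Entropy of a diagonal Gaussian, summed over the last axis."""
    std = np.asarray(std, dtype=np.float64) + eps
    return np.sum(0.5 * np.log(2.0 * np.pi * np.e) + np.log(std), axis=-1)


def first_non_finite(metrics):
    """Return the name of the first metric holding a nan/inf, else None."""
    for name, value in metrics.items():
        if not np.all(np.isfinite(value)):
            return name
    return None


action = np.array([0.3, -0.1])
mean = np.array([0.3, 0.2])
zero_std = np.zeros(2)

# With std == 0 the log-prob mixes 0/0 and inf - inf, so it comes out as nan,
# and the entropy hits log(0) = -inf.
with np.errstate(divide="ignore", invalid="ignore"):
    raw_log_prob = gaussian_log_prob(action, mean, zero_std)
    raw_entropy = gaussian_entropy(zero_std)

# Adding a small epsilon to std keeps both quantities finite.
safe_log_prob = gaussian_log_prob(action, mean, zero_std, eps=EPS)
safe_entropy = gaussian_entropy(zero_std, eps=EPS)

# Checking metrics in the order they are computed shows which one breaks first.
culprit = first_non_finite({"actor_loss": -raw_log_prob, "entropy": safe_entropy})
print(raw_log_prob, raw_entropy, safe_log_prob, safe_entropy, culprit)
```

Running a check like `first_non_finite` on the per-step metrics is one way to confirm which quantity goes bad first, instead of inferring it from the logs after everything is already NaN.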
