resources/tutorials/example for creating custom reward function? #218

MicahJSherry · 2024-12-27T01:25:58Z

I am having trouble translating my ideas for reward functions into bsk_rl objects. Can someone point me to a good (preferably beginner-friendly) example or tutorial that would help me adapt them to my goals?

Answered by Mark2000

Dec 27, 2024

Mark2000 · 2024-12-27T16:43:08Z

Maintainer

There aren't any tutorials specifically, but my advice would be to look at how one of the reward functions is implemented relative to the base data class. If you have a specific reward model in mind, I can perhaps offer some pointers on how I would structure it.

3 replies

MicahJSherry Dec 27, 2024
Author

My goal is to have a satellite that matches the orbit of a target satellite, but I am unsure how to implement this with the Data and DataStore classes.

Mark2000 Dec 27, 2024
Maintainer

Here's what I would do:

Data: Contains the current difference between the satellite's orbit and the desired orbit (in classical orbital elements, Cartesian coordinates, or whatever representation you prefer). Addition would just produce a unit of data with the most recent measurement; you may need to track the measurement time in the data to do this.
DataStore: get_log_state would return the current OEs/coordinates. compare_log_states would use only its second argument (new_state) to generate an instance of your Data class containing the difference between the current state and the desired state.
Rewarder: calculate_reward receives a dictionary mapping satellite names to their newest data, and needs to return a dictionary mapping satellite names to rewards.

Hope this helps. Check out some of the implemented data types for examples.
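The three pieces described above can be sketched roughly as follows. This is a standalone illustration, not bsk_rl code: the classes do not inherit from the library's base classes, and the names OrbitErrorData, OrbitErrorStore, OrbitMatchRewarder, the get_state callable, and the negative-error-norm reward are all assumptions made for the example. Only the method names get_log_state, compare_log_states, and calculate_reward come from the answer. A real implementation would subclass the corresponding bsk_rl base classes and read the state from the satellite's dynamics.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

State = Tuple[float, Tuple[float, ...]]  # (measurement time, orbit elements)


@dataclass
class OrbitErrorData:
    """Hypothetical Data unit: the latest orbit error and when it was measured."""

    error: Optional[Tuple[float, ...]] = None
    time: float = -math.inf

    def __add__(self, other: "OrbitErrorData") -> "OrbitErrorData":
        # "Addition" keeps only the most recent measurement.
        return self if self.time >= other.time else other


class OrbitErrorStore:
    """Hypothetical DataStore: turns logged states into OrbitErrorData."""

    def __init__(self, target: Tuple[float, ...], get_state: Callable[[], State]):
        self.target = target        # desired elements or coordinates
        self.get_state = get_state  # assumed hook that reads the satellite state

    def get_log_state(self) -> State:
        return self.get_state()

    def compare_log_states(self, old_state: State, new_state: State) -> OrbitErrorData:
        # Only new_state is used: the error is relative to the target,
        # not to the previous state.
        time, elements = new_state
        error = tuple(e - t for e, t in zip(elements, self.target))
        return OrbitErrorData(error=error, time=time)


class OrbitMatchRewarder:
    """Hypothetical Rewarder: reward is the negative Euclidean norm of the error."""

    def calculate_reward(self, new_data_dict: dict) -> dict:
        rewards = {}
        for name, data in new_data_dict.items():
            if data.error is None:
                rewards[name] = 0.0  # no measurement yet
            else:
                rewards[name] = -math.sqrt(sum(d * d for d in data.error))
        return rewards


# Usage with made-up numbers: the error is (3, 4, 0), so the reward is -5.
store = OrbitErrorStore(
    target=(7000.0, 0.0, 0.0),
    get_state=lambda: (10.0, (7003.0, 4.0, 0.0)),
)
new_data = store.compare_log_states(None, store.get_log_state())
rewards = OrbitMatchRewarder().calculate_reward({"Sat-1": new_data})
```

A plain element-wise difference ignores angle wrapping and the very different scales of orbital elements (for example, semi-major axis in kilometers versus angles in radians), so some normalization would likely be needed before taking a norm.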

Answer selected by MicahJSherry

MicahJSherry Dec 27, 2024
Author

thanks for your help!
