Improve FPS #336

Merged
merged 15 commits into from
Aug 21, 2024

AdrianSosic
Collaborator

@AdrianSosic AdrianSosic commented Aug 7, 2024

While working on the 0.10.0 release, we noticed that optimization results can depend on the order in which the user provides substances when a SubstanceParameter is involved. That problem is still partly open (see #341), but the investigation also showed that the FPSRecommender can yield different results when given the same set of points in a different order.

This PR improves the FPS code in that it:

  • adds proper input validation
  • makes the algorithm agnostic to permutations of the input
  • adds a thorough hypothesis test to verify the FPS property and permutation invariance
  • properly handles edge cases involving duplicated points
  • improves the documentation
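
The permutation invariance and duplicate handling listed above can be illustrated with a minimal sketch. This is not the code from this PR: the function name `farthest_point_sampling`, the lexicographic pre-sort, and the choice of starting point are assumptions made purely for illustration.

```python
import numpy as np


def farthest_point_sampling(points, n_samples):
    """Hypothetical sketch: return indices of `n_samples` rows of `points`."""
    points = np.asarray(points, dtype=float)

    # Input validation (assumed checks, not necessarily those in the PR).
    if points.ndim != 2 or len(points) == 0:
        raise ValueError("Expected a non-empty 2D array of points.")
    if not 1 <= n_samples <= len(points):
        raise ValueError("n_samples must be between 1 and the number of points.")

    # Bring the rows into a canonical (lexicographic) order so that the
    # selected points do not depend on the order of the input.
    order = np.lexsort(points.T[::-1])
    pts = points[order]

    # Pairwise Euclidean distances between all canonically ordered points.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

    # Assumption: start from the first point in canonical order.
    selected = [0]
    min_dist = dist[0].copy()
    while len(selected) < n_samples:
        # Mark selected rows so they are never picked again, even when
        # duplicates leave several rows at distance zero.
        min_dist[selected] = -1.0
        # np.argmax returns the first maximum, which breaks ties
        # deterministically in canonical order.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])

    # Map positions in the canonical order back to the original row indices.
    return order[selected]
```

Because the greedy selection runs on the sorted copy, shuffling the input changes which indices come back but not which points they refer to. Pre-sorting is only one way to obtain this property; the PR may achieve it differently.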

@AdrianSosic AdrianSosic added the enhancement Expand / change existing functionality label Aug 7, 2024
@AdrianSosic AdrianSosic self-assigned this Aug 7, 2024
@AdrianSosic AdrianSosic marked this pull request as draft August 7, 2024 07:34
@AdrianSosic AdrianSosic marked this pull request as ready for review August 7, 2024 10:38
Collaborator

@AVHopp AVHopp left a comment

That was a very nice PR to review, and I highly appreciate all of the comments that you left in the code :)

Review threads (all resolved):
baybe/utils/sampling_algorithms.py (outdated)
baybe/utils/sampling_algorithms.py (outdated)
baybe/utils/sampling_algorithms.py
baybe/utils/sampling_algorithms.py
tests/utils/test_sampling_algorithms.py
@AdrianSosic AdrianSosic merged commit e49dc3d into main Aug 21, 2024
9 of 11 checks passed
@AdrianSosic AdrianSosic deleted the refactor/fps branch August 21, 2024 12:08
AdrianSosic added a commit that referenced this pull request Sep 3, 2024
This PR adds code to:
* sort the user-provided values before storing them as attributes in
discrete parameters (see also #336)
* sort the parameters stored in search spaces

If unsorted, this can cause problems with reproducibility in the sense
that the same parameter content provided in a different order can lead
to different optimization results. For instance, the `RandomRecommender`
randomly selects rows from the `comp_rep` dataframe for the discrete
subspace, which will be ordered differently if the parameter values come
in a different order. This can lead to rather surprising behavior, for
example when the parameter values come from a Python `set`, whose
iteration order for string values can depend on `PYTHONHASHSEED`.
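
The effect described in this commit message can be reproduced with a small standalone sketch. The helpers `sample_rows` and `canonicalize` are hypothetical names chosen for illustration and are not part of the library. The first mimics a seeded random row selection, the second mimics sorting values before they are stored.

```python
import random


def sample_rows(values, n, seed):
    """Pick `n` positions with a seeded RNG and return the values found there."""
    rng = random.Random(seed)
    positions = rng.sample(range(len(values)), n)
    return [values[i] for i in positions]


def canonicalize(values):
    """Return the values as a sorted tuple so that the input order is irrelevant."""
    return tuple(sorted(values))


# Same content, different order (for example, as produced by iterating a set).
a = ["water", "ethanol", "acetone"]
b = ["acetone", "water", "ethanol"]

# The same seed selects the same positions, which hold different values.
unsorted_a = sample_rows(a, 2, seed=1)
unsorted_b = sample_rows(b, 2, seed=1)

# After sorting, both inputs yield the identical selection.
sorted_a = sample_rows(canonicalize(a), 2, seed=1)
sorted_b = sample_rows(canonicalize(b), 2, seed=1)
```

Here `unsorted_a` and `unsorted_b` differ because no position holds the same value in both lists, while `sorted_a` and `sorted_b` are equal.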