Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Capture full serialized metric representation when loading with skipped runners and metrics, in properties #2888

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lena-kashtelyan
Copy link
Contributor

Summary:
Context:

  1. When an experiment is loaded with skip_runners_and_metrics flag enabled, runners are simply omitted, but metrics are not omitted, since their names and lower_is_better direction is needed in a few places during optimization (e.g. in formatting and looking up the data). Instead of simply omitting them, we "stub" them as base Metric objects, which still have the name and the lower_is_better of the original metric.
  2. Further, copies of experiment's metrics end up being saved on trials, unless Experiment.immutable_search_space_and_opt_config is true. This is bad: we now have copies of the stubbed metrics saved on trials.

What's done in this diff:

  1. We preserve the serialized representation of the original metric in the Metric.properties json blob (we don't need to decode it for this);
  2. With 1) in place, we are able to, during re-encoding of the metrics, inject their correct representation (not stubbed) back into their SQA representation and save the metric in a way that matches its original form.

Is this beautiful code... Most certainly not, but the problem is quite important and this does address it fully, functionality-wise. Addressing it more elegantly requires a full storage refactor (and a very clever one at that) or elimination of a) use of Metric-s throughout the optimization code or b) stopping storage of metric copies on GRs (which could be considered, but we corrently don't have another mechanism for keeping history of the opt. config).

But is this safe? Yes; the "full SQA row" is only in the properties of the associated metrics, and attempting to make a change to the stubbed metric as if it's the non-stubbed version, wouldn't have worked anyway.

Does this apply to JSON serialization? No, we don't do any stubbing there currently, although only because we don't ever need to reload from JSON with non-core registries. We can cross that bridge when we get there; the hope is that we can refactor storage before getting there at all.

Differential Revision: D64212275

…ed runners and metrics, in properties

Summary:
**Context:**
1. **When an experiment is loaded with `skip_runners_and_metrics`** flag enabled, runners are simply omitted, but **metrics are not omitted, since their names and `lower_is_better` direction is needed in a few places during optimization** (e.g. in formatting and looking up the data). Instead of simply omitting them, we "stub" them as base `Metric` objects, which still have the name and the `lower_is_better` of the original metric.
2. Further, copies of experiment's metrics end up being saved on trials, unless `Experiment.immutable_search_space_and_opt_config` is true. **This is bad: we now have copies of the stubbed metrics saved on trials.**

**What's done in this diff:** 
1. We preserve the serialized representation of the original metric in the `Metric.properties` json blob (we don't need to decode it for this);
2. With 1) in place, we are able to, during re-encoding of the metrics, inject their correct representation (not stubbed) back into their SQA representation and save the metric in a way that matches its original form.

**Is this beautiful code...** Most certainly not, but the problem is quite important and this does address it fully, functionality-wise. Addressing it more elegantly requires a full storage refactor (and a very clever one at that) or elimination of a) use of `Metric`-s throughout the optimization code or b) stopping storage of metric copies on GRs (which could be considered, but we corrently don't have another mechanism for keeping history  of the opt. config).

**But is this safe?** Yes; the "full SQA row" is only in the properties of the associated metrics, and attempting to make a change to the stubbed metric as if it's the non-stubbed version, wouldn't have worked anyway.

**Does this apply to JSON serialization?** No, we don't do any stubbing there currently, although only because we don't ever need to reload from JSON with non-core registries. We can cross that bridge when we get there; the hope is that we can refactor storage before getting there at all.

Differential Revision: D64212275
@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Oct 16, 2024
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64212275

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants