Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hparam: Add selectors to retrieve data written by the new hparams effects #6544

Merged
merged 11 commits into from
Aug 21, 2023

Conversation

rileyajones
Copy link
Contributor

@rileyajones rileyajones commented Aug 15, 2023

Motivation for features / changes

This is a continuation of #6540
Now that data is being retrieved from the hparams plugin and written to the store I am adding a pair of selectors to retrieve it and inject it into the the data retrieved from getRuns and getRunsForExperimentId.

The logic used to translate SessionGroups to Record<RunId/Group, Value> is very similar to that being used by the old code in runs_data_source. I could imagine this being refactored to retrieve the run/group relationships from another part of the store in the future in order to better integrate with our plans to stop relying on the metrics portion of the hparams payload.

Note For Reviewer(s)

A large majority of the lines changed in this PR are indent changes resulting from changing the method used to mock the state during tests.

I have opted not to add additional test to the runs or common selectors (though I did have to fix 16) as the intention is for this to be a drop in replacement with no new features added. The number of test failures which I had to fix left me feeling confident of the existing test coverage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to this file are noop. I simply change the method being used to construct the mock state (so that the hparams state would be properly mocked) and that resulted in a ton of indentation changes.

@rileyajones rileyajones marked this pull request as ready for review August 15, 2023 18:29
@rileyajones rileyajones requested a review from bmd3k August 15, 2023 20:02
Comment on lines 87 to 93
.map((metadata) => {
if (runsToHparamsAndMetrics[metadata.id]) {
metadata.hparams = runsToHparamsAndMetrics[metadata.id].hparams;
metadata.metrics = runsToHparamsAndMetrics[metadata.id].metrics;
}
return metadata;
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where the data from the new hparams_data_source is being injected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the changes to this file are related to changing the way the mock state is constructed

DebuggerState,
} from '../../plugins/debugger_v2/tf_debugger_v2_plugin/store/debugger_types';

interface FullState extends State {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be necessary but I would like all the aspects of the state to be mocked if possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you instead just add these to app_state's original definition of State?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I assumed that was going to stir up a bunch of issues but I managed to add them without any big issues.


export function buildMockState(overrides: Partial<State> = {}): State {
export function buildMockState(overrides: Partial<FullState> = {}): State {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this did not support overriding all the different kinds of state because the overrides were not being used. They still aren't all being used but making this fully capable seemed like a good thing to do.

.map((id) => state.runMetadata[id]);
.map((id) => ({...state.runMetadata[id]}))
.map((metadata) => {
if (runsToHparamsAndMetrics[metadata.id]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had hoped the changes in the data model would be reflected in the other layers as well - in how we model our selectors and how the components retrieve data. I had hoped that dashboard things would clearly rely on dashboard hparams data (HparamsState.dashboardSpec and dashboardSessionGroups) while experiment list things would rely on experiment list data (RunsState.runMetadata.hparams and metrics).

Instead we've quietly merged them in this layer, making it difficult to predict whether a runs table in the experiment list, for example, will be getting its data from one part of the data model or another.

More concretely, could we refactor hparams value retrieval into two selectors:

  • getExperimentHparamsAndMetrics, which accepts an experimentId as property and returns hparam and metric values from the old location (ie RunsState.runMetadata)
  • getDashboardHparamsAndMetrics, which requires no properties and returns hparam and metric values from the new location (ie HparamsState.dashboardSessionGroups).

The first selector would be called for runs tables in the experiment list, the second selector would be called for runs tables in the dashboard.

Similarly (but differently) could the runs table in the dashboard rely more heavily on the data we now store in HparamsState.dashboardSpecs? Is it correct to say that this current scheme will still rely on us (re)constructing the domain information using the run information retrieved here? Would it be clearer to instead take the data already returned to you by the server and use that as the domain information?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We spoke offline. The key insights:

  • getRuns is essentially getExperimentHpramsAndMetrics. It is only used by the runs tables in the experiment list.
  • getRunsFromExperimentIds is essentially getDashboardHparamsAndMetrics. It is only used by the new runs table and scalar card data tables in the dashboard.
  • we will be using HparamsState.dashboardSpecs to generate hparam filter domains. We don't see that happening in this PR because we don't have filters in the new runs table, yet.

Action items:

  • We agreed these functions need to be renamed. If we decide to rename in a later PR, it would at least be nice to put a comment in this PR.
  • We agreed that getRuns should only use hparams and metrics data from RunsState.
  • We agreed that getRunsFromExperimentIds should only use hparams and metrics data from HparamsState.

@rileyajones
Copy link
Contributor Author

I will rename getRunsByExperimentId to getRunsForDashboard or getRunsForTimeSeries.

getRuns will not return anything from the hparams state.

getRunsForDashboard will blank out any hparams data that may exist on the run.metadata object

@rileyajones
Copy link
Contributor Author

It seems like there are still a few places where getRuns is used in the timeseries dashboard. None of those places rely on hparam data but I am going to hold off on renaming and phasing it out until a future PR.

… should always return data from the hparams state and never return hparams data from the runs state
@rileyajones
Copy link
Contributor Author

I will rename getRunsByExperimentId to getRunsForDashboard or getRunsForTimeSeries.

getRuns will not return anything from the hparams state.

getRunsForDashboard will blank out any hparams data that may exist on the run.metadata object

I've made these changes. There are a couple of other places where getRuns is used in the timeseries dashboard, but the hparams data it supplies isn't being referenced so I would like to save that refactor for a later PR. Additionally I would like to remove the parameter supplied to getDashboardRuns but that turned out to be a larger change than expected so I would also like to do that in a later PR.

@rileyajones rileyajones requested a review from bmd3k August 17, 2023 22:34
hparams,
};

for (const metricValue of session.metricValues) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall in a recent discussion that we decided we didn't want metrics from the /session_groups response to appear in the Time Series runs table. You would instead be using the metric values from elsewhere in the data model - from the /timeSeries response perhaps?

We thought that we could therefore turn off metric calculation for /session_groups requests for the dashboard. We'd save a bunch of unneccesary processing and reduce latency.

With that in mind, is there any need for parsing the session metric values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an offline discussion:

  • We will stop relying on this portion of the response for data.
  • We should calculate these values if they exist in the store particularly considering this is essentially the same as the logic found in the runs data source.

};

for (const metricValue of session.metricValues) {
const runId = metricValue.name.group
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, without metric values from the /session_groups requests, you wouldn't be able to calculate runIds in this way - you'd have to find a different approach. Perhaps get a list of runs for each experiment (I believe you get this elsewhere, too) and then determine which session groups they belong to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an offline discussion:

  • We will calculate this mapping using existing data on the front end (probably from the /runs response) paired with the Session.name value.

);

for (const session of sessionGroup.sessions) {
runToHparamsAndMetrics[session.name] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably doesn't matter since this is all going to get rewritten, but in case you decide to carry over this logic: I don't think this is desirable. session.name does not guarantee even being the name of a run.

I don't think runs_data_source does this but I only took a quick look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially what runs_data_source does on lines 171-188
In practice Session.name seems to always be a run name. If no mapping between Session and run exists we cannot display hparams by runs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is in runs_data_source.ts:L171-188. Those lines of code only generate runToHparamsAndMetrics keys from combination of session.name + metricValue.name.group (with some special handling for when group is empty).

This statement here is generating keys from session.name alone.

To illustrate this, the test at runs_data_source_test.ts:L63 does not generate a runToHparamsAndMetrics entry with key 'eid/run_name_2'.

In practice Session.name seems to always be a run name.

It is not. Sometimes it is a run name. Sometimes it is a common prefix of a set of run names. Sometimes it is an experiment id. Sometimes it is empty.

Even though this will be refactored soon, it would be reassuring if we could agree on this before merging the PR, especially since it's possible I'll be OOO during the refactor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed offline and it seems like we are on the same page. We will not generate data for runs that do not exist - more specifically we do not generate data based solely on session name.

buildSessionGroup({
name: 'session_group_1',
hparams: {
hp1: 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use quotes for keys in 'hparams': 'hp1', 'hp2', etc...
You will otherwise get lint errors when you import. (The import would still suceed but might as well keep it as clean as we can when we see things in advance).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OSS linter doesn't like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The definition of 'hparams' is:

  hparams: {
    [hparamName: string]: boolean | number | string;
  };

It says the keys should be strings.

Perhaps there is confusion because I quoted 'hparams' as well. I perhaps should have used hparams instead to try to make it clearer that you should just quote the keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Step 1:
image

Step 2:
image

Step 3:
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Sorry, my bad. It's surprising to me but I guess makes sense when I realize that prettier is not the typescript compiler and probably doesn't know what has string keys and what does not.

I could have tried this out myself which would have saved some back and forth.

I looked briefly into prettier option "--quote-props=preserve", which would allow us to quote the property names. But perhaps its best to leave things alone and see how this quote issue plays out when prettier is adopted internally.

DebuggerState,
} from '../../plugins/debugger_v2/tf_debugger_v2_plugin/store/debugger_types';

interface FullState extends State {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you instead just add these to app_state's original definition of State?

[DEBUGGER_FEATURE_KEY]: DebuggerState;
[ALERT_FEATURE_KEY]: AlertState;
[PERSISTENT_SETTINGS_FEATURE_KEY]: PersistentSettingsState;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about HparamState?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hparams state is in here? On line 122.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I see now. My bad.

},
},
const state = buildMockState({
metrics: buildMetricsState({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildMockState is useful but it doesn't change the fact that we can continue to use appStateFromMetricsState, buildStateFromAppRoutingState, etc..

Instead you have changed to referring to METRICS_FEATURE_KEY, APP_ROUTING_FEATURE_KEY, etc by their bare constant values 'metrics', 'app_routing', etc... but that seems unnecessary and mildly controversial?

There could be some pros to do this (readability, for instance) but there are also clearly some cons (maintainability). It would be worth consulting the rest of the team about this. Maybe split those particular changes from this PR and we can discuss at the next FE team?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I will return to using the state builders but spread them into buildMockState then plan to bring the topic up at the FE meeting.

const state = {
...buildStateFromAppRoutingState(
const state = buildMockState({
app_routing:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned elsewhere, moving from using functions like ...buildStateFromAppRoutingState to using bare constant values like 'app_routing' could probably use some discussion with the rest of the FE team.

The changes to use buildMockState() in this file are fine but its still unrelated to the rest of the PR as far as I can tell. (Perhaps it was necessary previously when you had also updated getRuns() to rely on Hparams state).

@rileyajones rileyajones requested a review from bmd3k August 19, 2023 00:08
);

for (const session of sessionGroup.sessions) {
runToHparamsAndMetrics[session.name] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is in runs_data_source.ts:L171-188. Those lines of code only generate runToHparamsAndMetrics keys from combination of session.name + metricValue.name.group (with some special handling for when group is empty).

This statement here is generating keys from session.name alone.

To illustrate this, the test at runs_data_source_test.ts:L63 does not generate a runToHparamsAndMetrics entry with key 'eid/run_name_2'.

In practice Session.name seems to always be a run name.

It is not. Sometimes it is a run name. Sometimes it is a common prefix of a set of run names. Sometimes it is an experiment id. Sometimes it is empty.

Even though this will be refactored soon, it would be reassuring if we could agree on this before merging the PR, especially since it's possible I'll be OOO during the refactor.

buildSessionGroup({
name: 'session_group_1',
hparams: {
hp1: 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The definition of 'hparams' is:

  hparams: {
    [hparamName: string]: boolean | number | string;
  };

It says the keys should be strings.

Perhaps there is confusion because I quoted 'hparams' as well. I perhaps should have used hparams instead to try to make it clearer that you should just quote the keys.

[DEBUGGER_FEATURE_KEY]: DebuggerState;
[ALERT_FEATURE_KEY]: AlertState;
[PERSISTENT_SETTINGS_FEATURE_KEY]: PersistentSettingsState;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I see now. My bad.

);

for (const session of sessionGroup.sessions) {
runToHparamsAndMetrics[session.name] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed offline and it seems like we are on the same page. We will not generate data for runs that do not exist - more specifically we do not generate data based solely on session name.

@rileyajones rileyajones merged commit b408561 into tensorflow:master Aug 21, 2023
bileschi pushed a commit that referenced this pull request Aug 22, 2023
In #6544 we introduced a
dependency on AppState on DebuggerState. This unfortunately fails some
internal tests, where certain TensorBoard products do not want to
unnecessarily depend on plugin state that they don't use.

I've removed DebuggerState from AppState. Fortunately there are no tests
or features that require this dependency.
rileyajones added a commit that referenced this pull request Aug 22, 2023
…6553)

## Motivation for features / changes
Now that the filter dialog is merged #6493 we need to add support for
filtering by hparams. However, #6544 changed the way hparam values are
read. This change adds a new property to state in which to store
dashboard related hparam and metric filters.

See #6488 for the WIP integration with the runs_table_container +
tb_data_table -> common_selectors + hparam_selectors.
rileyajones added a commit that referenced this pull request Aug 23, 2023
## Motivation for features / changes
As per a discussion around #6544 we determined that the dashboard should
not be requesting runs per experiment but should instead be getting all
runs for all experiments shown.
This greatly simplifies a lot of the logic used to retrieve runs and run
related data.
yatbear pushed a commit to yatbear/tensorboard that referenced this pull request Aug 25, 2023
…ects (tensorflow#6544)

## Motivation for features / changes
This is a continuation of tensorflow#6540
Now that data is being retrieved from the hparams plugin and written to
the store I am adding a pair of selectors to retrieve it and inject it
into the the data retrieved from `getRuns` and `getRunsForExperimentId`.

The logic used to translate `SessionGroups` to `Record<RunId/Group,
Value>` is very similar to that being used by the old code in
`runs_data_source`. I could imagine this being refactored to retrieve
the run/group relationships from another part of the store in the future
in order to better integrate with our plans to stop relying on the
metrics portion of the hparams payload.

## Note For Reviewer(s)
A large majority of the lines changed in this PR are indent changes
resulting from changing the method used to mock the state during tests.

I have opted not to add additional test to the runs or common selectors
(though I did have to fix 16) as the intention is for this to be a drop
in replacement with no new features added. The number of test failures
which I had to fix left me feeling confident of the existing test
coverage.
yatbear pushed a commit to yatbear/tensorboard that referenced this pull request Aug 25, 2023
In tensorflow#6544 we introduced a
dependency on AppState on DebuggerState. This unfortunately fails some
internal tests, where certain TensorBoard products do not want to
unnecessarily depend on plugin state that they don't use.

I've removed DebuggerState from AppState. Fortunately there are no tests
or features that require this dependency.
yatbear pushed a commit to yatbear/tensorboard that referenced this pull request Aug 25, 2023
…ensorflow#6553)

## Motivation for features / changes
Now that the filter dialog is merged tensorflow#6493 we need to add support for
filtering by hparams. However, tensorflow#6544 changed the way hparam values are
read. This change adds a new property to state in which to store
dashboard related hparam and metric filters.

See tensorflow#6488 for the WIP integration with the runs_table_container +
tb_data_table -> common_selectors + hparam_selectors.
yatbear pushed a commit to yatbear/tensorboard that referenced this pull request Aug 25, 2023
…w#6555)

## Motivation for features / changes
As per a discussion around tensorflow#6544 we determined that the dashboard should
not be requesting runs per experiment but should instead be getting all
runs for all experiments shown.
This greatly simplifies a lot of the logic used to retrieve runs and run
related data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants