Sandersaarond/mysql model metrics #98

SandersAaronD · 2024-09-27T20:29:44Z

WIP

This PR is meant to move the writing and reading of model metrics from Loki to MySQL. The writer is implemented. The reader function exists but is not yet attached to an endpoint.

TODO for this:

make sure the reader function works and is served from an endpoint
have the UI query that endpoint instead of Loki
switch the Python client to write to this endpoint (the format was changed slightly to allow for debouncing, so that frequent logging jobs don't end up with too many HTTP connections open at once)

It is unclear whether this can be done as part of this PR, but I have a lingering concern about authentication: I can't tell whether tenantID will always be populated correctly in production. Our local setup just defaults it to 0, which makes this hard to test locally. Either we find a way to test it locally, or we won't know until we deploy.

DO NOT MERGE

annanay25

Left an initial round of comments, nice work so far!

annanay25 · 2024-10-02T00:19:45Z

ai-training-api/model/model_metrics.go

+    StackID     uint64   `json:"stack_id" gorm:"not null;primaryKey"`
+    ProcessID   uuid.UUID `json:"process_id" gorm:"type:char(36);not null;primaryKey;foreignKey:ProcessID;references:ID"` // Foreign key
+    MetricName  string   `json:"metric_name" gorm:"size:32;not null;primaryKey"`
+    StepName    string   `json:"step_name" gorm:"size:32;not null;primaryKey"`


It's not entirely obvious what StepName is and why we need it. We should add comments explaining the fields.

annanay25 · 2024-10-03T18:55:20Z

ai-training-api/model/model_metrics.go

+// Add a custom hook if necessary for additional logic.
+// Example: AfterCreate hook for custom logic
+func (m *ModelMetrics) AfterCreate(tx *gorm.DB) error {
+	// Custom logic after creating a metric entry
+	tx.Logger.Info(tx.Statement.Context, "AfterCreate hook called for ModelMetrics")
+	return nil
+}


I don't think this is being used?

annanay25 · 2024-10-03T18:59:36Z

ai-training-api/app/model_metrics_test.go

+type testApp struct {
+	App
+}
+
+func (a *testApp) db(ctx context.Context) *gorm.DB {
+	return a.App._db
+}
+
+func setupTestDB(t *testing.T) (*gorm.DB, func()) {
+	db, err := gorm.Open(sqlite.Open("file::memory:?cache=shared"), &gorm.Config{})
+	require.NoError(t, err)
+
+	err = db.AutoMigrate(&model.Process{}, &model.ModelMetrics{})
+	require.NoError(t, err)
+
+	return db, func() {
+		sqlDB, err := db.DB()
+		require.NoError(t, err)
+		sqlDB.Close()
+	}
+}


Can we reuse the code in api_test.go? I think most of this boilerplate is set up already.

annanay25 · 2024-10-03T19:05:23Z

ai-training-api/model/model_metrics.go

+)
+
+type ModelMetrics struct {
+    StackID     uint64   `json:"stack_id" gorm:"not null;primaryKey"`


Can we use TenantID as that's the consistent key we are using in other structs? Or can we update the other structs to use StackID as a uint64?

annanay25 · 2024-10-03T19:07:17Z

ai-training-api/app/model_metrics.go

+	}
+
+	// Iterate over the metrics and build the series data
+    var response GetModelMetricsResponse


I think there's an indentation mismatch in this file from this point on.

annanay25 · 2024-10-03T19:11:09Z

ai-training-api/app/model_metrics.go

+// Incoming format is an array of these
+type ModelMetricsSeries struct {
+	MetricName string `json:"metric_name"`
+	StepName   string `json:"step_name"`
+	Points     []struct {
+		Step  uint32 `json:"step"`
+		Value json.Number `json:"value"`
+	} `json:"points"`
+}


Why are we storing metrics as individual rows (one row for each step and value) in MySQL? Why not store an array, as in this struct? Is that not performant? Curious whether any benchmarking was done to choose the former.

csmarchbanks · 2024-10-08

Agreed. Storing one row per point would also keep us from compressing the points in the future, which could mean a lot more disk and network usage. Do we ever not pull back all of the points at once?
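
To make the trade-off in this thread concrete, here is a minimal sketch of the alternative being discussed: packing all points of one series into a single compressed value instead of writing one row per point. The `Point` type mirrors the anonymous struct inside `ModelMetricsSeries`; `packPoints` and `unpackPoints` are hypothetical helpers, and gzip over JSON is just one possible encoding, not something this PR uses.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
)

// Point mirrors the per-point struct in ModelMetricsSeries.
type Point struct {
	Step  uint32      `json:"step"`
	Value json.Number `json:"value"`
}

// packPoints serializes the points as JSON and gzips the result,
// producing one value that could be stored in a single blob column.
func packPoints(points []Point) ([]byte, error) {
	raw, err := json.Marshal(points)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data and the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// unpackPoints reverses packPoints.
func unpackPoints(blob []byte) ([]Point, error) {
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	raw, err := io.ReadAll(zr)
	if err != nil {
		return nil, err
	}
	var points []Point
	if err := json.Unmarshal(raw, &points); err != nil {
		return nil, err
	}
	return points, nil
}

func main() {
	points := make([]Point, 0, 1000)
	for i := 0; i < 1000; i++ {
		points = append(points, Point{Step: uint32(i), Value: json.Number("0.5")})
	}
	blob, err := packPoints(points)
	if err != nil {
		panic(err)
	}
	restored, err := unpackPoints(blob)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d points packed into %d bytes, %d restored\n", len(points), len(blob), len(restored))
}
```

The cost of this layout is that a series can only be read or rewritten as a whole, so appending a point means rewriting the blob. That is why the question of whether all points are always pulled back at once matters.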

annanay25 · 2024-10-03T19:12:44Z

ai-training-api/model/model_metrics.go

+    MetricValue string   `json:"metric_value" gorm:"size:64;not null"`
+
+    Process Process `gorm:"foreignKey:ProcessID;references:ID"` // Relationship definition
+}


Should we also track a timestamp for each step? I feel that might be helpful for tracking time between training runs, performance tracking, etc.


SandersAaronD added 9 commits September 19, 2024 15:33

add model metrics table

ee71829

Pin air version to continue supporting go 1.22

d2879e0

switch model_metrics endpoint to write to mysql

2b3326a

factor model metrics endpoint into its own file, add tests

dfd7ce8

First stab at a reader for model metrics from mysql

f26f957

add model metrics table

6c041ab

Merge branch 'main' into sandersaarond/mysql-model-metrics

858c354

DO NOT MERGE

Unbroken, almost there ...

34a59ce

Some cleanup

e068d1e

SandersAaronD force-pushed the sandersaarond/mysql-model-metrics branch from 53d8df6 to e068d1e on October 1, 2024 15:41

SandersAaronD added 4 commits October 1, 2024 14:38

working tests, bugfix

2ef0aca

Refactor some tests

0f7d9c8

Tidy up frontend and backend contracts to store model metrics

84f708a

Remove debug logging

c1e05bc

SandersAaronD marked this pull request as draft October 2, 2024 20:31

Merge branch 'main' into sandersaarond/mysql-model-metrics

d9c5ea3

annanay25 reviewed Oct 3, 2024


SandersAaronD added 6 commits October 7, 2024 15:50

Add getter for model metrics from backend service

33552fd

DO NOT MERGE

Committing a clean-ish WIP

530baea

Add config field

07046ff

Remove debug logging

e178512

Slight fix to error handler

be998de

Viz working somewhat finally

83dd594

