[back-37] jellyfish data migration #665

jh-bate · 2023-10-03T20:43:05Z

NOTE: not to merge until we have done test runs in various environments

process only jellyfish data that doesn't have a _depulicator.hash
jellyfish data is denoted by the lack of a _id of type objectId
only set the hash, do not modify the datum in any other way
only update the record id the _id and modifiedTime remain the same as when we started

darinkrauss

A number of comments...

darinkrauss · 2023-10-04T20:12:38Z

migrations/back_37/back_37.go

+)
+
+func main() {
+	ctx := context.Background()


I'll send you what I have for the interrupt/context code later this afternoon.

Okay, I looked into integrating cancelable contexts into the existing Runner interface and it is going to be a wider change than makes sense for this code. Furthermore, to "hack" it in to this code is going to duplicate much of runner.go (error channel, signal channel, select, etc.) so perhaps it is best we just defer the work. Worst case it won't have a completely clean exit when interrupted, but perhaps that isn't such a big deal. Thoughts?

Ok yeah based on our discussion it doesn't make sense sense for us to do this now. As you said we can restart the process as our query will just find all of those Jellyfish records that do not yet have a _depulicator.hash set.

darinkrauss · 2023-10-04T20:15:07Z

migrations/back_37/back_37.go

+
+func createDatumHash(bsonData bson.M) (string, error) {
+	identityFields := []string{}
+	if userID, err := getValidatedString(bsonData, "_userId"); err != nil {


[nit] Maybe, for the sake of consistency, userID -> datumUserID (and deviceID, deliveryType, subType, units, value` below).

migrations/back_37/back_37.go

darinkrauss · 2023-10-04T20:47:28Z

migrations/back_37/back_37.go

+		valueRaw, ok := bsonData["value"]
+		if !ok {
+			return "", errors.New("value is missing")
+		}
+		val, ok := valueRaw.(float64)
+		if !ok {
+			return "", errors.New("value is not of expected type")
+		}
+
+		if len(fmt.Sprintf("%v", valueRaw)) > 7 {
+			// NOTE: we need to ensure the same precision for the
+			// converted value as it is used to calculate the hash
+			mgdlVal := val*18.01559 + 0.5
+			mgdL := glucose.MgdL
+			val = *glucose.NormalizeValueForUnits(&mgdlVal, &mgdL)
+		}
+		strVal := strconv.FormatFloat(val, 'f', -1, 64)
+		identityFields = append(identityFields, strVal)


[nit] Consider collapsing into one block.

Suggested change

valueRaw, ok := bsonData["value"]

if !ok {

return "", errors.New("value is missing")

}

val, ok := valueRaw.(float64)

if !ok {

return "", errors.New("value is not of expected type")

}

if len(fmt.Sprintf("%v", valueRaw)) > 7 {

// NOTE: we need to ensure the same precision for the

// converted value as it is used to calculate the hash

mgdlVal := val*18.01559 + 0.5

mgdL := glucose.MgdL

val = *glucose.NormalizeValueForUnits(&mgdlVal, &mgdL)

}

strVal := strconv.FormatFloat(val, 'f', -1, 64)

identityFields = append(identityFields, strVal)

if valRaw, ok := bsonData["value"]; !ok {

return "", errors.New("value is missing")

} else if val, ok := valRaw.(float64); !ok {

return "", errors.New("value is not of expected type")

} else {

if len(fmt.Sprintf("%v", valueRaw)) > 7 {

// NOTE: we need to ensure the same precision for the

// converted value as it is used to calculate the hash

mgdlVal := val*18.01559 + 0.5

mgdL := glucose.MgdL

val = *glucose.NormalizeValueForUnits(&mgdlVal, &mgdL)

}

identityFields = append(identityFields, strconv.FormatFloat(val, 'f', -1, 64))

}

I don't think the check for length > 7 is going to work. What happens if the value is "12.34567". It's length is 8 (so > 7), but it shouldn't be doing any truncating. I suppose though, it would just be unnecessary work??? So, maybe not a big deal?

Is the + 0.5 necessary? If we have a large amount of decimal places, then I'm sure they came from simply dividing a mg/dL value by 18.01559. If we add 0.5 then this may throw off the calculations. What does jellyfish (or the Uploader code that uses jellyfish) do with mg/dL values before converting to mmol/L? I suppose in one sense we need to "undo" any conversion jellyfish did and then reapply the platform conversion, yes?

We should probably be paranoid and check the units before we multiply by 18.01559. If it is already mg/dL then this would be bad.

We should use the 18.01559 constant from glucose.go, specifically MmolLToMgdLConversionFactor. That way there is a single instance of this value in all of the platform code.

On a bigger picture, I'm wondering if we should just truncate what we have directly rather that convert to mg/dL and back to mmol/L.

Finally, looking at the code from platform in glucose.go in NormalizeValueForUnits it only does this "truncation of more that 6 digits past the decimal" if the incoming units are mg/dL. It leaves incoming mmol/L values as-is. I think what this means is that we'll need to ensure the new Uploader code that replaces jellyfish with platform will either 1) send mg/dL values, or 2) send mmol/L values and truncate to 6 digits in the Uploader before uploading. So, if the Uploader will upload mmol/L values NOT truncated to 6 digits for some devices, then this code here will need to change. My concern here is that we'd end up with a different deduplicate hash from the one after the Uploader is sending to platform.

ok - I have reworked that.

I think what this means is that we'll need to ensure the new Uploader code that replaces jellyfish with platform will either 1) send mg/dL values, or 2) send mmol/L values and truncate to 6 digits in the Uploader before uploading. So, if the Uploader will upload mmol/L values NOT truncated to 6 digits for some devices, then this code here will need to change. My concern here is that we'd end up with a different deduplicate hash from the one after the Uploader is sending to platform.

So wouldn't we just do the same as other devices the uploader ingests data from? From my understanding

it takes the units and value as whatever the devices gives us and then the platform normalize's that to save as mmol/L units and converts the value (if it wasn't mmol/L).

the platform truncation would occur that same as any other device that is uploaded

the hash is all good as both our JF hash and migration hash use a truncated value

Yes, all of the above is correct.

I think the problem is in the second bullet, "the platform truncation would occur that same as any other device that is uploaded". Specifically, from glucose.go in platform:

func NormalizeValueForUnits(value *float64, units *string) *float64 { if value != nil && units != nil { switch *units { case MgdL, Mgdl: intValue := int(*value/MmolLToMgdLConversionFactor*MmolLToMgdLPrecisionFactor + 0.5) floatValue := float64(intValue) / MmolLToMgdLPrecisionFactor return &floatValue } } return value }

Note that the truncation only occurs when converting from mg/dL to mmol/L and does not occur if the units uploaded is already mmol/L. Thus, for this migration to be consistent with any new uploader code that moves from Jellyfish to Platform uploads, the new Uploader code has to conform to either 1) send mg/dL values (and above normalization code will truncate), or 2) send mmol/L values already truncated to 6 digits using the same formula.

For example, with this migration code, all mmol/L values will be truncated. This will be the same iff the new Uploader code sends mg/dL or does the truncation itself. However, if the new Uploader code sends non-truncated mmol/L values, the platform will not do any truncation (see NormalizeValueForUnits above) and therefore the deduplicator hash generated by this migration code (which has truncated all values) will not match the deduplicator hash generated when the new Uploader code uploads non-truncated mmol/L values.

Does that make sense?

migrations/back_37/back_37.go

darinkrauss · 2023-10-04T21:00:41Z

migrations/back_37/back_37.go

+		hashUpdatedCount++
+	}
+	if err := jellyfishDocCursor.Err(); err != nil {
+		logger.WithError(err).Error("error while fetching data. Please re-run to complete the migration.")


[nit] Capitalize first letter of log message.

migrations/back_37/back_37.go

darinkrauss · 2023-10-05T23:06:11Z

I updated a couple of comments. I think this is mostly looking good, but may change when we figure out the downstream requirements (e.g. interleaved Jellyfish and Platform uploads).

jh-bate added 2 commits October 4, 2023 09:42

taketwo

8d81723

cleanup based on disscusion

315a2e8

jh-bate requested a review from darinkrauss October 4, 2023 02:32

jh-bate changed the title ~~taketwo~~ [back-37] jellyfish data migration Oct 4, 2023

jh-bate mentioned this pull request Oct 4, 2023

[BACK-37] Jellyfish data migration #660

Closed

darinkrauss requested changes Oct 4, 2023

View reviewed changes

review updates

68caf75

jh-bate requested a review from darinkrauss October 5, 2023 04:12

jh-bate added 2 commits November 8, 2023 12:46

Merge branch 'master' into jellyfish_data_migration.2

9c8dfa4

migrate jellyfish pumpSettings bolus if present

08387b4

jh-bate changed the base branch from master to jellyfish-migration-updates_0 November 8, 2023 21:58

jh-bate marked this pull request as ready for review November 8, 2023 21:59

jh-bate merged commit e8e07ff into jellyfish-migration-updates_0 Nov 8, 2023

jh-bate mentioned this pull request Nov 8, 2023

[BACK-37][BACK-2520][BACK-2728] Jellyfish migration updates required for platfrom #683

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[back-37] jellyfish data migration #665

[back-37] jellyfish data migration #665

jh-bate commented Oct 3, 2023 •

edited

Loading

darinkrauss left a comment

darinkrauss Oct 4, 2023

darinkrauss Oct 5, 2023

jh-bate Oct 9, 2023

darinkrauss Oct 4, 2023

darinkrauss Oct 4, 2023

jh-bate Oct 5, 2023

jh-bate Oct 5, 2023

darinkrauss Oct 5, 2023

darinkrauss Oct 4, 2023

darinkrauss commented Oct 5, 2023

[back-37] jellyfish data migration #665

[back-37] jellyfish data migration #665

Conversation

jh-bate commented Oct 3, 2023 • edited Loading

darinkrauss left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

darinkrauss commented Oct 5, 2023

jh-bate commented Oct 3, 2023 •

edited

Loading