From 2ecbebc4257ff948eac5b4805eb80d25e0ffc222 Mon Sep 17 00:00:00 2001 From: Schneems Date: Mon, 15 Jan 2024 12:46:52 -0500 Subject: [PATCH] v0.1.0 --- CHANGELOG.md | 2 ++ Cargo.toml | 2 ++ README.md | 33 ++++++++++++++++----------------- 3 files changed, 20 insertions(+), 17 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 910ec1b..e30d298 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,5 @@ ## Unreleased +## 0.1.0 - 2024/01/15 + - Created diff --git a/Cargo.toml b/Cargo.toml index ce3ff82..27bdac0 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -3,6 +3,8 @@ name = "magic_migrate" version = "0.1.0" edition = "2021" license = "MIT" +description = "Automagically load and migrate deserialized structs to the latest version" +keywords = ["serde", "version", "upgrade", "migrate", "transfer", "isomorphic"] repository = "https://github.com/schneems/magic_migrate" documentation = "https://docs.rs/magic_migrate" readme = "README.md" diff --git a/README.md b/README.md index eb08af3..58be0a4 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ ## Magic Migrate -Automagically load and migrate deserialized structs to the latest version +Automagically load and migrate deserialized structs to the latest version. > 🎵 If you believe in magic, come along with me > @@ -11,13 +11,13 @@ These docs are [intended to be read on docs.rs](https://docs.rs/magic_migrate/la ## What -Let's say that you made a struct that serializes to disk somehow, perhaps it uses toml. Now, let's say that you want to add a new field to that struct, but you don't want to lose older persisted data. What ever should you do? +Let's say that you made a struct that serializes to disk somehow; perhaps it uses toml. Now, let's say you want to add a new field to that struct but want to keep older persisted data. Whatever should you do? 
-You can define how to convert from one struct to another using either [`From`] or [`TryFrom`] then tell Rust how to migrate from one to the next via [`Migrate`] or [`TryMigrate`] traits. Now, when you try to load data into the current struct it will follow a chain of structs in reverse order to find the first one that successfully serializes. When that happens, it will convert that struct to the latest version for you. It's magic! (Actually it's mostly clever use of trait boundries, but whatever). +You can define how to convert from one struct to another using either [`From`] or [`TryFrom`], then tell Rust how to migrate from one to the next via the [`Migrate`] or [`TryMigrate`] traits. Now, when you try to load data into the current struct, it will follow a chain of structs in reverse order to find the first one that successfully deserializes. When that happens, it will convert that struct to the latest version for you. It's magic! (Actually, it's mostly clever use of trait bounds, but whatever). ## Docs -For additional docs see: +For additional docs, see: - Traits - [`Migrate`] trait @@ -53,7 +53,7 @@ struct PersonV2 { updated_at: DateTime } -// First define how to map from one struct to another +// First, define how to map from one struct to another impl From<PersonV1> for PersonV2 { fn from(value: PersonV1) -> Self { PersonV2 { @@ -82,39 +82,38 @@ assert_eq!(person.name, "Schneems".to_string()); This library was created to handle the case of serialized metadata stored in layers in a buildpack as toml. -In this use case, structs are serialized to disk when the Cloud Native Buildpack (CNB) is run. Usually these values represent application cache state and are important for cache invalidations. +In this use case, structs are serialized to disk when the Cloud Native Buildpack (CNB) is run. Usually, these values represent the application cache state and are important for cache invalidations. -The buildpack implementer has no control over how often the buildpack is run. 
That means there's no guarantee the end user will run it with sequential struct versions. One user might be running with the latest struct verison serialized and another user might be using a version from years ago. +The buildpack implementer has no control over how often the buildpack is run. That means there's no guarantee the end user will run it with sequential struct versions. One user might run with the latest struct version serialized, and another might use a version from years ago. This scenario happens in the wild with (a "classic" buildpack i.e. not CNB). -Instead of having to force the programmer to consider all possible cache states at all possible times a "migration" approach allows programmers to focus on a single cache state change at a time. This reduces programmer cognitive overhead and (hopefully) reduces bugs. - +Instead of forcing the programmer to consider all possible cache states at all times, a "migration" approach allows programmers to focus on a single cache state change at a time. This reduces programmer cognitive overhead and (hopefully) reduces bugs. ## What won't it do? (The ABA problem) -This library cannot ensure that if a `PersonV1` struct was serialized that it cannot be loaded into `PersonV2` without migration. I.e. it does not guarantee that the [`From`] or [`TryFrom`] code was run. +This library cannot ensure that if a `PersonV1` struct was serialized, it cannot be loaded into `PersonV2` without migration. I.e. it does not guarantee that the [`From`] or [`TryFrom`] code was run. -For example if `PersonV2` struct introduced an `Option` field, instead of `DateTime` then the string `"name = 'Richar'"` could be deserialized to either PersonV1 or PersonV2 without needing to call a migration. +For example, if the `PersonV2` struct introduced an `Option` field instead of `DateTime`, then the string `"name = 'Richard'"` could be deserialized to either `PersonV1` or `PersonV2` without needing to call a migration. 
- [Playground demonstration of the ABA problem](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e26033d3c8c3c34414fe594674f6d053) -There are more links in a related discussion in serde: +There are more links in a related discussion in Serde: - [serde-rs/serde issue trying to use `tag` and `deny_unknown_fields`](https://github.com/serde-rs/serde/issues/2666) ## What can you do to harden your code against this (ABA) issue? -- Use [deny_unknown_fields](https://serde.rs/container-attrs.html) from serde. This setting prevents silently dropping additional struct fields. This would handle the case where V1 has two fields and V2 has only one field [playground example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=75c6f06234e1d64aea7b37c448321abf). However, it will **not** protect the case where we've added a field that is optional, [playground example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=47dde9f52b0c5114ef28f35bb019969c). -- Add tests that ensure one struct cannot deserialize into a later one in the chain. This might be difficult if your structs have lots of optional fields and you want to generate permutations of all of them. -- Add a [version marker field](https://stackoverflow.com/a/77700752/147390). This works, but requires that you notice and keep the field name updated when creating a new struct (possible programmer error). And it will leak an implementation detail to anyone who might see your serialized data (which may or may not matter) to you. +- Use [deny_unknown_fields](https://serde.rs/container-attrs.html) from serde. This setting prevents silently dropping additional struct fields. This strategy would handle the case where V1 has two fields and V2 has only one field [playground example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=75c6f06234e1d64aea7b37c448321abf). 
However, it will **not** protect against the case where we've added an optional field, [playground example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=47dde9f52b0c5114ef28f35bb019969c). +- Add tests that ensure one struct cannot deserialize into a later one in the chain. Writing tests might be difficult if your structs have many optional fields and you want to generate permutations of all of them. +- Add a [version marker field](https://stackoverflow.com/a/77700752/147390). This strategy works, but you must notice and keep the field name updated when creating a new struct (possible programmer error). And it will leak an implementation detail to anyone who might see your serialized data (which may or may not matter to you). - Read these docs and understand the underlying reason why this happens. - If you have another suggestion to harden a codebase, open an issue. ## Other possible "migration" solutions and their differences -- Using serde's [container attributes from and try_from](https://serde.rs/container-attrs.html). This only works if you never want to store and deserialize the latest version in the chain. [playground example showing you when this fails](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b6ea1cd054bab5d7df62a04cbd7c6284). +- Using Serde's [container attributes from and try_from](https://serde.rs/container-attrs.html). This feature only works if you never want to store and deserialize the latest version in the chain. [playground example showing you when this fails](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b6ea1cd054bab5d7df62a04cbd7c6284). -In comparison to using serde's `from` and `try_from`, magic migrate will always try to convert to the target struct first, and then migrate using the latest possible struct in the chain. This allows for migrating through the entire chain or storing and using the latest value. 
+Compared to using Serde's `from` and `try_from` container attributes, magic migrate will always try to deserialize into the target struct first, and then migrate using the latest possible struct in the chain. This allows for migrating through the entire chain or for storing and using the latest value. - The [Serde version crate](https://docs.rs/serde-version/latest/serde_version/) seems to have overlapping goals. Differences are unclear. If you've tried it, update these docs.
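
Note appended after the patch, not part of the diff: the README text changed above describes loading into the latest struct first and walking the chain backwards until some version deserializes, then converting forward with `From`. The sketch below illustrates that idea using only the standard library. It is not the `magic_migrate` API: `parse_pairs`, `parse`, and `load` are hypothetical names, the hand-rolled `key = 'value'` parser is a stand-in for a real TOML deserializer, and `updated_at` is a plain `String` with a placeholder value rather than a real timestamp.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a deserializer: reads `key = 'value'` lines.
/// Returns `None` if any non-empty line has no `=`.
fn parse_pairs(input: &str) -> Option<HashMap<String, String>> {
    let mut pairs = HashMap::new();
    for line in input.lines().map(str::trim).filter(|l| !l.is_empty()) {
        let (key, value) = line.split_once('=')?;
        let value = value.trim().trim_matches('\'');
        pairs.insert(key.trim().to_string(), value.to_string());
    }
    Some(pairs)
}

#[derive(Debug, PartialEq)]
struct PersonV1 {
    name: String,
}

#[derive(Debug, PartialEq)]
struct PersonV2 {
    name: String,
    updated_at: String, // assumption: a String keeps the sketch std-only
}

// Define how to map from one struct version to the next.
impl From<PersonV1> for PersonV2 {
    fn from(value: PersonV1) -> Self {
        PersonV2 {
            name: value.name,
            updated_at: "unknown".to_string(), // placeholder, not a real timestamp
        }
    }
}

impl PersonV1 {
    /// Strict parse: exactly one key, `name`. Unknown keys are rejected.
    fn parse(input: &str) -> Option<Self> {
        let mut pairs = parse_pairs(input)?;
        let name = pairs.remove("name")?;
        pairs.is_empty().then(|| PersonV1 { name })
    }

    /// First link in the chain: there is nothing older to fall back to.
    fn load(input: &str) -> Option<Self> {
        Self::parse(input)
    }
}

impl PersonV2 {
    /// Strict parse: exactly `name` and `updated_at`.
    fn parse(input: &str) -> Option<Self> {
        let mut pairs = parse_pairs(input)?;
        let name = pairs.remove("name")?;
        let updated_at = pairs.remove("updated_at")?;
        pairs.is_empty().then(|| PersonV2 { name, updated_at })
    }

    /// Try the latest version first; otherwise load the previous version
    /// and migrate it forward with `From`.
    fn load(input: &str) -> Option<Self> {
        Self::parse(input).or_else(|| PersonV1::load(input).map(PersonV2::from))
    }
}
```

Calling `PersonV2::load("name = 'Schneems'")` falls back to `PersonV1` and converts forward, while input that already carries `updated_at` is parsed as `PersonV2` directly. Because both parsers here reject missing and unknown keys, no input satisfies two versions at once; making `updated_at` optional would reintroduce the ABA ambiguity the README describes.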