Local storage v3 migration #172

Merged
merged 13 commits into from
Aug 26, 2024

Conversation

grod220
Contributor

@grod220 grod220 commented Aug 20, 2024

Closes #166 & quite a bit more. We use chrome.storage.local for persisting some state for the extension. This includes things like wallets, grpc urls, frontend selection, etc. Given local storage is simply a key-value store, it does not come with any tooling to handle migrations. This has made this task a bit more complex than expected.

Our custom migration code for local storage was not quite robust enough to reliably handle the task of the issue. For that reason, I refactored some mechanics. Will add more color inline below.

Note: This PR is definitely within the "risky" territory. Migrating user state on the fly can have unintended consequences. This should be thoroughly tested before we ship another Prax version. For example, how does changing the rpc the user saved in local storage affect an existing block sync in progress? It's very likely it won't take effect until the service worker is restarted (even though it shows otherwise in the extension). Hence the reason why changing RPCs triggers an extension restart.

@grod220 grod220 requested a review from a team August 20, 2024 16:48
@TalDerei
Contributor

TalDerei commented Aug 21, 2024

side note; isn't it still the case that the recommended rpc entries in the prax registry contain outdated embedded frontends? I think we should remove any frontends that aren't using cron.

@grod220 grod220 removed the request for review from a team August 21, 2024 13:22
@grod220
Contributor Author

grod220 commented Aug 21, 2024

Environment:

  • Load up extension version 5e62cc6 (last release)
  • Start up and sync a bit
  • Set grpc/frontend state per test conditions
  • Load this PR's version
  • Check console storage via chrome.storage.local.get().then(console.log)
  • Ensure bumped to expected state

User flows to test:

  • Frontend url is in the registry. After ext update: no change made.
  • Frontend url is not in the registry. After ext update: changed to a random one in the registry.
  • Grpc endpoint is in the registry. After ext update: no change made.
  • Grpc endpoint is not in the registry. After ext update: changed to a random one in the registry. Can test with https://void.s9.gay/.
  • If connected to a non-mainnet chain id, after ext update: no changes made to grpc endpoint.
  • Go back to April extension version (before v2 migration). V1 -> v2 migration works for both frontend urls and grpc endpoints.

For all of these, confirm no corruption to the other keys (all get bumped to v3 once accessed, which is when the migration takes place).
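
The flows above can be sketched as a single pure function. This is only an illustration: the `Registry` and `SavedEndpoints` shapes, the `migrateEndpoints` name, and the example URLs used to exercise it are all assumptions, not code from this PR. It also assumes the frontend url is replaced regardless of chain id, which the test plan does not state.

```typescript
// Hypothetical shapes; the real registry and storage types are not shown in this thread.
interface Registry {
  mainnetChainId: string;
  rpcUrls: string[];
  frontendUrls: string[];
}

interface SavedEndpoints {
  chainId: string;
  grpcEndpoint?: string;
  frontendUrl?: string;
}

// `rand` is injectable so the choice can be made deterministic in tests.
function pickRandom(urls: string[], rand: () => number): string {
  if (urls.length === 0) throw new Error('registry list is empty');
  return urls[Math.floor(rand() * urls.length)] as string;
}

function migrateEndpoints(
  saved: SavedEndpoints,
  registry: Registry,
  rand: () => number = Math.random,
): SavedEndpoints {
  const next: SavedEndpoints = { ...saved };

  // Frontend: keep it if the registry still lists it, otherwise pick a replacement.
  if (saved.frontendUrl !== undefined && !registry.frontendUrls.includes(saved.frontendUrl)) {
    next.frontendUrl = pickRandom(registry.frontendUrls, rand);
  }

  // gRPC: same rule, but only on mainnet; other chains are left alone.
  const onMainnet = saved.chainId === registry.mainnetChainId;
  if (
    onMainnet &&
    saved.grpcEndpoint !== undefined &&
    !registry.rpcUrls.includes(saved.grpcEndpoint)
  ) {
    next.grpcEndpoint = pickRandom(registry.rpcUrls, rand);
  }

  return next;
}
```

Because the function returns a new object and never touches keys it does not know about, the "no corruption to the other keys" check reduces to comparing the untouched fields before and after.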

=====

Edit: Tested all flows. Working successfully. Two things to note:

  • Even after a grpc endpoint update, block syncing will still pull from previous grpc endpoint until service worker restarts. The UI will still display the old one.
  • Unfortunately, we can't rewind the clock all the way back to an April build to manually test v1 -> v3 upgrades. The build no longer compiles (I think because of Rust version updates).

@grod220
Contributor Author

grod220 commented Aug 21, 2024

> @TalDerei side note; isn't it still the case that the recommended rpc entries in the prax registry contain outdated embedded frontends?

RPCs will continually be behind for their embedded frontends. It's not something that can be updated without a point release (where we update it on core).

> I think we should remove any frontends that aren't using cron.

All in our frontend list except for one are using the cron. I've messaged UnityChaos about options in keeping their frontend up to date automatically.

@TalDerei
Contributor

The migration is designed for version bumps, but I'm wondering if there's a way to roll back the extension (V3 --> V2), which seems to corrupt the local state storage?

Screenshot 2024-08-21 at 9 58 31 AM

@grod220
Contributor Author

grod220 commented Aug 21, 2024

> but I'm wondering if there's a way to rollback the extension (V3 --> V2)

The version will never decrement as the user does not have a way to transition to older versions of the extension. After using V3 code, the reason you may see a few V2 keys is because those keys have not yet been accessed (the point at which those fields get migrated).

In dev mode, if you go back to a previous version with v3 fields, it should just see there is not a migration for those v3 fields and do nothing.
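
To make the behavior described above concrete, here is a rough sketch of lazy, per-key migration. The `StoredItem` shape, the `getItem` helper, and the use of a `Map` in place of chrome.storage.local are assumptions for illustration and not the code in base.ts.

```typescript
type Version = 'V1' | 'V2' | 'V3';

// Hypothetical per-field record: every key carries its own version tag.
interface StoredItem {
  version: Version;
  value: unknown;
}

interface Migration {
  to: Version; // assumed to always be a later version, so the loop below terminates
  transform: (old: unknown) => unknown;
}

// Keyed by the version a migration starts FROM.
type Migrations = Partial<Record<Version, Migration>>;

// A Map stands in for chrome.storage.local here.
function getItem(db: Map<string, StoredItem>, migrations: Migrations, key: string): unknown {
  let item = db.get(key);
  if (item === undefined) return undefined;

  // Apply migrations until none is registered for the item's current version.
  let step = migrations[item.version];
  while (step !== undefined) {
    item = { version: step.to, value: step.transform(item.value) };
    step = migrations[item.version];
  }

  db.set(key, item); // only the accessed key is rewritten
  return item.value;
}
```

Under these assumptions, keys that have not been read yet keep their old version tag, and older code that has no migration registered for `V3` leaves a `V3` item as it found it.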

@TalDerei TalDerei self-requested a review August 21, 2024 23:23
apps/extension/src/storage/base.ts
type Migrations<K extends string | number | symbol> = Partial<Record<K, Migration>>;
export type Migrations<T> = Partial<Record<Version, MigrationMap<any, T>>>;

export type VersionSteps = Record<Version, Version>;

export class ExtensionStorage<T> {
Contributor

@TalDerei TalDerei Aug 22, 2024

this is not a suggested code change, but rather a general statement for reference if we restructure our migration mechanism in the future.

I personally think migrations feel more transactional in nature, where migrating multiple fields should ideally be handled atomically.

one potential approach is a batch migration design, where you define migrations at the version level. It ensures atomicity by checking that all changes associated with a particular version are applied together in a single transaction – so either all migrations succeed or none do.

Contributor Author

This is an important comment. It's unfortunate that chrome.storage.local does not come with any support for versioning or migrations. For this reason, we had to create our own custom migration code. It was probably a bad design that each field has its own version key. A possibly better design is to save a global version number, check for needed migrations when someone attempts a get(), and if so, migrate all keys at once.
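
A minimal sketch of that global-version idea, written before the actual refactor and not reflecting its final shape: the `Db`, `Step`, `migrateAll`, and `makeStore` names are hypothetical, and an in-memory object stands in for chrome.storage.local.

```typescript
type Version = 'V1' | 'V2' | 'V3';
type State = Record<string, unknown>;

// One global version for the whole store, instead of one per field.
interface Db {
  dbVersion: Version;
  state: State;
}

// Keyed by the version a step starts from; `to` is assumed to be a later version.
interface Step {
  to: Version;
  migrate: (prev: State) => State;
}
type Steps = Partial<Record<Version, Step>>;

// Walks the chain of steps until no step is registered for the current version.
function migrateAll(db: Db, steps: Steps): Db {
  let current = db;
  let step = steps[current.dbVersion];
  while (step !== undefined) {
    current = { dbVersion: step.to, state: step.migrate(current.state) };
    step = steps[current.dbVersion];
  }
  return current;
}

function makeStore(initial: Db, steps: Steps) {
  let db = initial;
  return {
    // Every read first brings the whole store up to date (a no-op once current).
    get(key: string): unknown {
      db = migrateAll(db, steps);
      return db.state[key];
    },
    version(): Version {
      return db.dbVersion;
    },
  };
}
```

With a single version number there is no state in which some keys are at V2 and others at V3, which is the mixed state that came up in the rollback question earlier in this thread.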

Going to spend today trying this out. The hardest part of this will be migrating from the old data structure to the new.

@TalDerei
Contributor

tested some workflows – looks good to me

  • Grpc endpoint and frontend url are in the registry. After ext update: no change made.
Screenshot 2024-08-21 at 10 36 02 AM Screenshot 2024-08-21 at 10 40 39 AM
  • Frontend url is not in the registry. After ext update: changed to a random one in the registry.
Screenshot 2024-08-21 at 12 19 32 PM Screenshot 2024-08-21 at 12 21 40 PM
  • Grpc endpoint is not in the registry. After ext update: changed to a random one in the registry
Screenshot 2024-08-21 at 1 21 34 PM Screenshot 2024-08-21 at 1 25 32 PM
  • If connected to a non-mainnet chain id, after ext update: no changes made to grpc endpoint.
Screenshot 2024-08-21 at 1 34 16 PM Screenshot 2024-08-21 at 1 36 33 PM

@TalDerei TalDerei requested a review from a team August 22, 2024 00:38
@TalDerei
Contributor

TalDerei commented Aug 22, 2024

update: needs addressing

can you speak on partial storage object migrations? suppose we're migrating through the storage versions from V1 to V3, but the process partially fails during the migration from V2 to V3. what's the fallback here? will the extension state remain on V2, or will it retry the migration? can we simulate this?

I think we should think through what constitutes the possible failure domains.
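
One possible answer to the partial-failure question, sketched under assumptions and not necessarily what this PR does: run every step against a draft and write back once at the end, so a failure in the V2 to V3 step leaves the persisted state exactly where it started and the error reaches the caller. The `Persisted` wrapper and `upgrade` function are hypothetical names.

```typescript
type Version = 'V1' | 'V2' | 'V3';
type State = Record<string, unknown>;

interface Db {
  dbVersion: Version;
  state: State;
}

interface Step {
  to: Version; // assumed to be a later version than the key it is registered under
  migrate: (prev: State) => State; // expected to return a new object, not mutate `prev`
}
type Steps = Partial<Record<Version, Step>>;

// Stand-in for whatever is actually written to chrome.storage.local.
interface Persisted {
  current: Db;
}

function upgrade(persisted: Persisted, steps: Steps): void {
  // All steps run against a draft; nothing is persisted while they run.
  let draft: Db = persisted.current;
  let step = steps[draft.dbVersion];
  while (step !== undefined) {
    draft = { dbVersion: step.to, state: step.migrate(draft.state) }; // may throw
    step = steps[draft.dbVersion];
  }
  // Single commit point, reached only if every step succeeded.
  persisted.current = draft;
}
```

In this sketch a failed run can simply be retried on the next access, since the stored version never moved. Simulating the failure is a matter of registering a step that throws.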

@grod220
Contributor Author

grod220 commented Aug 22, 2024

Will pick up the missing code documentation tomorrow. Just pushed the update+tests that adopts the better global data structure for migrations in local storage. All is working on my end, but want to keep testing user flows given how sensitive of a change this is.

@TalDerei TalDerei self-requested a review August 22, 2024 21:29
@TalDerei
Contributor

tested some workflows – looks good to me

I still haven't retested these workflows post refactor

@grod220
Contributor Author

grod220 commented Aug 23, 2024

Tested on all user flows, things in my view are looking good ✅

@TalDerei TalDerei self-requested a review August 24, 2024 23:42
Contributor

@TalDerei TalDerei left a comment

left a minor comment; double-checked the locking system logic and that we're propagating the errors up the caller stack properly, and everything looks good to me.

Comment on lines 148 to 161
private async withDbLock<R>(fn: () => Promise<R>): Promise<R> {
  if (this.dbLock) {
    await this.dbLock;
  }

  this.dbLock = this.migrateOrInitializeIfNeeded();

  try {
    await this.dbLock;
    return await fn();
  } finally {
    this.dbLock = undefined;
  }
}
Contributor

@TalDerei TalDerei Aug 25, 2024

there's a subtle timing-related race condition here that may slip past the locking mechanism.

  1. this.dbLock = this.migrateOrInitializeIfNeeded() effectively sets the lock by assigning the promise returned by migrateOrInitializeIfNeeded. The migration process hasn't started yet.

-------------------------------------------- (small window of time) ---------------------------------------

  2. The migration technically starts when we reach the first await inside migrateOrInitializeIfNeeded.

there's a brief window there where the race condition can be exploited – we're prematurely setting the lock to a promise. If another operation runs during this window, it would see the lock as set and mistakenly assume that the migration has already started, even though it hasn’t. Instead, we should set the lock only after the migration process is actively in progress.

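private async withDbLock<R>(fn: () => Promise<R>): Promise<R> {
  // declared outside the branches so the try block below can reference it
  let migrationPromise: Promise<unknown> | undefined;

  if (this.dbLock) {
    await this.dbLock;
  } else {
    // ensure lock is only set after migration has started
    migrationPromise = this.migrateOrInitializeIfNeeded();
    this.dbLock = migrationPromise;
  }

  try {
    // undefined when we waited on an existing lock; awaiting it is then a no-op
    await migrationPromise;
    return await fn();
  } finally {
    this.dbLock = undefined;
  }
}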
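
For comparison, a general pattern for serializing async operations is to chain each one onto the previous one, so no operation starts until everything queued before it has settled. The sketch below is not the locking code in this PR; `SerialQueue` is a hypothetical name, and any migration or initialization would go inside the function passed to `run`.

```typescript
class SerialQueue {
  // Always resolves, never rejects, so one failure cannot block later operations.
  private tail: Promise<unknown> = Promise.resolve();

  run<R>(fn: () => Promise<R>): Promise<R> {
    // Start fn only after every previously queued operation has settled.
    const result = this.tail.then(() => fn());
    // The caller still sees a rejection via `result`; the chain itself swallows it.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

Each call to `run` extends the chain synchronously, before any `await`, so two callers that arrive in the same tick are still ordered one after the other.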

@TalDerei
Contributor

tested user workflows – looks good to me ✔️

  • Grpc endpoint and frontend url are in the registry. After ext update: no change made.
Screenshot 2024-08-24 at 3 16 54 PM Screenshot 2024-08-24 at 3 19 15 PM
  • Frontend url is not in the registry. After ext update: changed to a random one in the registry.
Screenshot 2024-08-24 at 3 32 33 PM Screenshot 2024-08-24 at 3 35 32 PM
  • Grpc endpoint is not in the registry. After ext update: changed to a random one in the registry
Screenshot 2024-08-24 at 6 27 43 PM Screenshot 2024-08-24 at 6 29 40 PM
  • If connected to a non-mainnet chain id, after ext update: no changes made to grpc endpoint.
Screenshot 2024-08-24 at 6 50 53 PM Screenshot 2024-08-24 at 6 53 38 PM

@grod220 grod220 merged commit c020aea into main Aug 26, 2024
3 checks passed
@grod220 grod220 deleted the local-migration-v3 branch August 26, 2024 08:44
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate outdated rpc/frontends
2 participants