fix: Correctly detect truly circular paths in JS objects, rather than objects with multiple references #1862

Open: georgecrawford wants to merge 1 commit into main from circular-reference-check-improvements

georgecrawford
Contributor

@georgecrawford georgecrawford commented Nov 4, 2024

This conversation uncovered an issue whereby multiple offline calls to track() could potentially share object references (for example, in the additional context object).

The previous containsCircularPaths() implementation wasn't really looking for circular paths, but rather multiple references to the same object. See the original PR and issue.
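To make the distinction concrete, here is a minimal sketch (the data and the `canStringify` helper are hypothetical, for illustration only, and are not part of o-tracking). A shared reference serializes without trouble; only a structure that can reach itself makes `JSON.stringify()` throw.

```javascript
// Hypothetical data, for illustration only.
const context = { edition: 'uk' };

// Two queued events share one context object: repeated, but not circular.
const sharedQueue = [
	{ action: 'view', context },
	{ action: 'click', context },
];

// A genuinely circular structure: the object is reachable from itself.
const circular = { action: 'view' };
circular.self = circular;

// Hypothetical helper: reports whether JSON.stringify() accepts the value.
function canStringify(value) {
	try {
		JSON.stringify(value);
		return true;
	} catch (error) {
		// JSON.stringify() throws a TypeError for cyclic structures.
		return false;
	}
}

console.log(canStringify(sharedQueue)); // true
console.log(canStringify(circular)); // false
```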

A common use case in an offline-first application such as FT App, queueing multiple events to be stored and sent when next online, is quite likely to include a reference shared across events at some point in the event tracking. This causes an "o-tracking does not support circular references in the analytics data" error, which discards the whole queue of events.

Describe your changes

This implementation is based on https://github.com/douglascrockford/JSON-js/blob/master/cycle.js, and draws on the utility of JSON.stringify()'s replacer parameter. It also makes use of the fact that JSON.stringify() will throw a TypeError if there's a circular reference, so a normal attempt to serialize is made first, before even looking for circular references.
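As a rough illustration of that flow (this is not the code in this PR; `stringifyWithRefs` and the `$ref` path notation are assumptions loosely modelled on cycle.js), the sketch below tries a plain `JSON.stringify()` first and only falls back to a replacer when a TypeError is thrown. The replacer relies on two documented behaviours: it is called with `this` set to the object holding the current key, and the root value is passed with an empty key.

```javascript
// Hypothetical sketch, not the PR's implementation.
function stringifyWithRefs(value) {
	try {
		// Fast path: succeeds for anything without a cycle, shared references included.
		return JSON.stringify(value);
	} catch (error) {
		if (!(error instanceof TypeError)) {
			throw error;
		}
	}

	// Maps each object to the path at which it was first seen.
	const paths = new WeakMap();

	// A regular function (not an arrow) so that `this` is the holder object.
	return JSON.stringify(value, function replacer(key, current) {
		if (typeof current !== 'object' || current === null) {
			return current;
		}
		const seenAt = paths.get(current);
		if (seenAt !== undefined) {
			// Seen before: emit a reference instead of descending again.
			return { $ref: seenAt };
		}
		const parentPath = paths.get(this);
		const path = parentPath === undefined
			? '$'
			: `${parentPath}[${JSON.stringify(key)}]`;
		paths.set(current, path);
		return current;
	});
}
```

For an object `a` with `a.self = a`, this returns `{"name":"a","self":{"$ref":"$"}}` when `a.name` is `'a'`. Note that once the fallback runs, any repeated object is replaced with a reference, whether or not it is part of the cycle.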

Note

Note that the output of jsonReplacer could potentially be used in future to send a payload to Spoor with the circular references removed. Since this is a major change in behaviour, I've left that discussion for another time.

Issue ticket number and link

Link to Figma designs

Checklist before requesting a review

  • I have applied percy label for o-[COMPONENT] or chromatic label for o3-[COMPONENT] on my PR before merging and after review. Find more details in CONTRIBUTING.md
  • If it is a new feature, I have added thorough tests.
  • I have updated relevant docs.
  • I have updated relevant env variables in Doppler.

@georgecrawford georgecrawford requested a review from a team as a code owner November 4, 2024 15:26
@notlee notlee temporarily deployed to origami-webs-circular-r-81hmci November 4, 2024 15:34 Inactive
@georgecrawford georgecrawford force-pushed the circular-reference-check-improvements branch from f4ffff9 to beccaec Compare November 5, 2024 10:21
@notlee notlee temporarily deployed to origami-webs-circular-r-81hmci November 5, 2024 10:21 Inactive
Contributor

@notlee notlee left a comment

This looks like a fun one, thanks for the PR! Just some test naming to flag and a little clarification to ensure my understanding. o-tracking is owned by the Data Platform team now but I'm happy to approve and release this on their behalf if they are

Resolved review threads:

  • libraries/o-tracking/test/utils.test.js (5 threads, outdated)
  • libraries/o-tracking/src/javascript/utils.js (1 thread)
@georgecrawford georgecrawford force-pushed the circular-reference-check-improvements branch from beccaec to cb97aca Compare November 5, 2024 12:01
@notlee notlee temporarily deployed to origami-webs-circular-r-81hmci November 5, 2024 12:01 Inactive
@georgecrawford
Contributor Author

@notlee Thanks for your review. I've addressed your issues, but I don't think I've helped matters by force-pushing. GitHub is confused, or I've missed a step. Could you check over my changes please? Thank you!

Contributor

@rowanbeentje rowanbeentje left a comment

I think this would work, but mostly because of the try/catch 😁 I've got a couple of nitpicks about the approach, and as we don't have proper circular detection anyway I'm wondering how good the resulting errors would be.

I'm still in favour of sending the events with some values replaced with error strings - I think that's nicer behaviour for a tracking library than dropping events?

@@ -222,99 +222,94 @@ function assignIfUndefined (subject, target) {
}

/**
* Used to find out all the paths which contain a circular reference.
* Used to find out all the paths which contain a circular reference.
Contributor

Nitpick: I think this particular line now only applies to the contained function ;)

*/
function findCircularPathsIn(rootObject) {
const traversedValues = new WeakSet();
function safelyStringifyJson(object) {
Contributor

Question: This function doesn't quite do this - should we change the name?

Right now it does three things:

  1. Try to JSON.stringify the object. If this succeeds, the object contains no circular references, and is returned.
  2. If the object failed to stringify, a safe-encoding routine is run, recording circular references.
  3. If circular references were found, an error is thrown.

So really the only reason for (2) is so that (3) can report the circular reference paths. It also means this function is not safely stringifying the JSON; instead, it's designed to error.

I think I'd be in favour of actually safely stringifying the data! I don't think we need to go as far as using the JSON schema's support, as it seems unlikely that we'd ever have a server try to reconstruct the original data. Instead I think we could replace it with a human-readable path reference, so the tracking data isn't lost; potentially also paired with a console.error - or perhaps instead an actual thrown error in a timeout, so that global error handlers can also catch/report?
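A minimal sketch of that suggestion, under stated assumptions: `stringifyReplacingCycles`, `deferThrow`, the placeholder string format and the `$`-rooted paths are all hypothetical, not existing o-tracking APIs. It replaces only values that are their own ancestor with a human-readable path string, and reports the problem through a callback that by default throws from a timer.

```javascript
// Hypothetical sketch: keep the event, replace circular values with a readable marker.
function stringifyReplacingCycles(root, onCircular = deferThrow) {
	// Objects on the branch currently being walked, mapped to their path.
	const ancestors = new Map();
	const circularPaths = [];

	function copy(value, path) {
		if (typeof value !== 'object' || value === null) {
			return value;
		}
		if (typeof value.toJSON === 'function') {
			// Simplification: leave Dates and similar to JSON.stringify().
			return value;
		}
		if (ancestors.has(value)) {
			circularPaths.push(path);
			return `[Circular reference to ${ancestors.get(value)}]`;
		}
		ancestors.set(value, path);
		let result;
		if (Array.isArray(value)) {
			result = value.map((item, index) => copy(item, `${path}[${index}]`));
		} else {
			result = {};
			for (const [key, child] of Object.entries(value)) {
				result[key] = copy(child, `${path}.${key}`);
			}
		}
		// Leaving this branch: the object may legitimately appear again elsewhere.
		ancestors.delete(value);
		return result;
	}

	const json = JSON.stringify(copy(root, '$'));
	if (circularPaths.length > 0) {
		onCircular(new Error(`Circular references replaced at: ${circularPaths.join(', ')}`));
	}
	return json;
}

// Throwing from a timer lets the caller finish while global error handlers still see the error.
function deferThrow(error) {
	setTimeout(() => {
		throw error;
	}, 0);
}
```

Passing a custom `onCircular` (for example one that calls `console.error`) would give the quieter variant. Whether a deferred throw is acceptable in a tracking library is a design decision this sketch does not settle.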

Comment on lines +258 to +260
oldPath = traversedValues.get(value);

if (oldPath !== undefined) {
Contributor

Thought: much like the old function, I think this isn't actually detecting only circular references, is it? It's detecting object values seen before, by putting every object as a key in a WeakMap and then using references if those objects are encountered again.

Your use of JSON.stringify() in a try/catch is hiding this from the tests I think 😁

(That said, I think the new function is a significant improvement over the old one because it has much better skipping of built-in types including null. But I think if we're relying on a list of circular references in the error, it won't be...)
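To illustrate the difference being described, here is a hedged sketch with two hypothetical helpers (neither is taken from the PR). The first remembers every object it has ever seen, so it also flags harmless shared references; the second tracks only the ancestors of the value currently being visited, so it flags true cycles alone.

```javascript
// Flags every object encountered more than once, circular or not.
function findRepeatedPaths(root) {
	const seen = new WeakSet();
	const flagged = [];
	(function walk(value, path) {
		if (typeof value !== 'object' || value === null) {
			return;
		}
		if (seen.has(value)) {
			flagged.push(path);
			return;
		}
		seen.add(value);
		for (const [key, child] of Object.entries(value)) {
			walk(child, `${path}.${key}`);
		}
	})(root, '$');
	return flagged;
}

// Flags only values that are their own ancestor, i.e. genuine cycles.
function findCircularPaths(root) {
	const ancestors = new Set();
	const flagged = [];
	(function walk(value, path) {
		if (typeof value !== 'object' || value === null) {
			return;
		}
		if (ancestors.has(value)) {
			flagged.push(path);
			return;
		}
		ancestors.add(value);
		for (const [key, child] of Object.entries(value)) {
			walk(child, `${path}.${key}`);
		}
		// Remove on the way back up so siblings may share this object.
		ancestors.delete(value);
	})(root, '$');
	return flagged;
}

// Hypothetical data: two events sharing one context object.
const sharedContext = { edition: 'uk' };
const queue = [{ context: sharedContext }, { context: sharedContext }];

console.log(findRepeatedPaths(queue)); // [ '$.1.context' ]
console.log(findCircularPaths(queue)); // []
```

One trade-off of the ancestor-only walk is that shared subtrees are visited once per reference, so heavily shared data costs more to scan.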

3 participants