Metric Accuracy #68
Comments
Define what you mean by "accurate"? This TTVC library will look for any visible change on the page, and wait for all network requests to finish (for 2s) before marking TTVC as complete (at the point the last visible change occurred). Unless there's an error somewhere in the logic, you could say it's completely accurate to that defined metric.

A Visually Complete metric from WPT or SpeedCurve that compares screenshots to determine when a visible change last occurred will often return completely different results than this one. If a single pixel on the page changes, TTVC will pick it up (whether or not you consider that a noteworthy visible change), where a screenshot comparison may not. It's all in how each metric is defined and what each tool is looking for. Since the two are measuring different things (and presumably have different thresholds), don't be surprised if they return different values.

The big advantage of this library is that you can run it on every page load for every user and get highly accurate (to the metric) logging of what your TTVC is. That's obviously not possible using a screenshot comparison tool.
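The difference between the two definitions can be made concrete with a toy comparison. This is illustrative only, not how either tool is actually implemented: screenshot-based tools generally apply some change threshold per captured frame, while a mutation-based approach counts any in-viewport change at all. The 1% threshold below is an arbitrary example value.

```ts
// Illustrative only: contrasts two ways of deciding whether a "visible change" happened.

type Frame = Uint8ClampedArray; // RGBA pixel data for one captured frame

// Fraction of pixels that differ between two equally sized frames.
function changedFraction(a: Frame, b: Frame): number {
  let changed = 0;
  for (let i = 0; i < a.length; i += 4) {
    if (a[i] !== b[i] || a[i + 1] !== b[i + 1] || a[i + 2] !== b[i + 2]) changed++;
  }
  return changed / (a.length / 4);
}

// Screenshot-style: only count a frame as a visible change if enough pixels moved.
const SCREENSHOT_THRESHOLD = 0.01; // example value; real tools choose their own

export function screenshotSawChange(prev: Frame, next: Frame): boolean {
  return changedFraction(prev, next) > SCREENSHOT_THRESHOLD;
}

// Mutation-style: any change at all counts, even a single pixel.
export function mutationSawChange(prev: Frame, next: Frame): boolean {
  return changedFraction(prev, next) > 0;
}
```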
Can you elaborate on how waiting 2s for all network requests to finish is related to visual completeness? My understanding was that it only waits for the download of images that are visible on screen; am I wrong about that?
Because it's impossible to tell whether those network requests will result in a visible change. Say a JS file was requested, or a JSON file was returned as the response to an API call; the code on the page could draw something in the viewable window in response.

So what the TTVC library does is record any time anything changes in the viewport (stuff drawn below the fold doesn't count) and save that as the last visible change, and then it waits for network and CPU idle. If two seconds elapse without any new files being requested or API calls being made, then the last visible change it recorded earlier is marked as the visually complete point. If something does change, the last visible change is bumped to the current time, and it goes back to waiting for network/CPU idle for 2 more seconds.

What this means is that if your website has a really dumb, chained request waterfall, TTVC keeps getting pushed out to whichever late response last produced a visible change, and is only finalized once the network stays quiet for 2 seconds.
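To make that description concrete, here is a minimal sketch in TypeScript of the "last visible change plus 2 seconds of quiet" logic described above. It is not the library's actual code, just the shape of the idea:

```ts
// Record the timestamp of the last in-viewport change, and only report it as TTVC
// once the network has stayed quiet for 2 seconds.

const QUIET_WINDOW_MS = 2000;

let lastVisibleChange = 0; // timestamp of the most recent in-viewport paint
let quietTimer: ReturnType<typeof setTimeout> | undefined;

function reportTTVC(value: number) {
  console.log('TTVC:', value);
}

// Called whenever a mutation/paint affects the visible viewport (not below the fold).
export function onViewportChange(timestamp: number) {
  lastVisibleChange = timestamp;
  restartQuietTimer();
}

// Called whenever a new network request starts or a response arrives.
export function onNetworkActivity() {
  restartQuietTimer();
}

function restartQuietTimer() {
  if (quietTimer !== undefined) clearTimeout(quietTimer);
  quietTimer = setTimeout(() => {
    // 2s have passed with no new requests: the last recorded visible change wins.
    reportTTVC(lastVisibleChange);
  }, QUIET_WINDOW_MS);
}
```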
Ahh. Looks like I misunderstood you: waiting for the network only counts toward detecting potential changes to the viewport, not toward the metric itself. That makes sense. Determining when to stop waiting is definitely a challenge; I often see this being a delicate balance when sending beacons: on one hand you want to collect as much data as possible, on the other, you don't want to lose too many beacons.

This does not, however, explain the numbers I see where TTVC is marked much higher than the last screenshot time. Is there an explanation for that? e.g., why would anything be marked as updated in the TTVC world if no pixel changes were registered by WPT?
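On the beacon trade-off mentioned above: a common, library-agnostic pattern is to keep the latest value in memory and flush it with `navigator.sendBeacon` when the page becomes hidden, so a late-arriving measurement isn't lost when the user navigates away. This is just a sketch; the `/collect` endpoint is hypothetical.

```ts
// Queue the latest measurement and flush it when the page is hidden.

let pendingTTVC: number | undefined;

export function queueTTVC(value: number) {
  pendingTTVC = value;
}

function flush() {
  if (pendingTTVC === undefined) return;
  // sendBeacon is designed to survive page unload better than fetch/XHR.
  navigator.sendBeacon('/collect', JSON.stringify({ ttvc: pendingTTVC }));
  pendingTTVC = undefined;
}

// 'visibilitychange' + 'pagehide' cover tab switches, navigations, and most unloads.
document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') flush();
});
window.addEventListener('pagehide', flush);
```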
Hey, thanks for your interest @sergeychernyshev! I think you're right to trust a screenshot-based analysis over this implementation. We are always going to be working within the confines of the APIs that browser developers expose to us. (I think it would be awesome if browser vendors decided to support this metric natively.) If I had to guess, the most likely reason that …

You can enable debug logging to verify exactly which visual changes we are picking up, using this init option:

```js
import { init } from '@dropbox/ttvc';

init({ debug: true });
```

To try to get this as accurate as possible, we have built out a set of test suites here: https://github.com/dropbox/ttvc/tree/main/test/e2e. Looking through them might give you a sense of which scenarios we cover. If you can identify any new problem scenarios and want to sketch out a new test or two for us, that would be most welcome!
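For lining the library's output up against a WebPageTest filmstrip, one approach is to capture the final measurement alongside the debug logs. A minimal sketch, assuming the package exposes an `onTTVC` subscription as shown in its README (worth double-checking against the current API):

```ts
import { init, onTTVC } from '@dropbox/ttvc';

// Enable debug logging so each detected visual change is printed to the console;
// those timestamps can then be compared against the WebPageTest filmstrip frames.
init({ debug: true });

// Record the final measurement once the library considers the page visually complete.
onTTVC((measurement) => {
  console.log('TTVC (ms):', measurement.duration);
});
```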
I am wondering how accurate this metric is?

My initial tests on a very simplistic prototype were OK, but now I am trying to test a real website using WebPageTest and am seeing significant discrepancies between `TTVC` measured using this library and the Visually Complete metric recorded by WebPageTest (using screen captures). The `TTVC` metric is usually larger, ranging from ~30% to ~300% higher than the timestamp recorded using screenshots in the few tests that I ran manually.

Is there an existing methodology that I can use to assess the accuracy of this technique?

Additionally, are there any known aspects of a page's implementation that are particularly prone to throwing this metric off?