
chore: download browser downloads in-memory #35909

Open · wants to merge 1 commit into main from in-memory-downloader
Conversation

mxschmitt (Member)
When installing browsers, we currently download them to disk first (as a zip file) and then extract them. This means we write N MB to disk and then read them back from disk. Since we vendored extract-zip, we can make this small modification, which saves around 300-400 MB of disk I/O per npx playwright install. Especially on slow disks this should be very noticeable.
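The change can be sketched, in spirit, as replacing a write-then-reread round-trip with direct use of the downloaded buffer. This is a minimal stand-in, not Playwright code: `extractedByteCount` is a placeholder for the real unzip work, and only the I/O pattern matters.

```javascript
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Placeholder for the real extraction; only the I/O pattern matters here.
function extractedByteCount(zipBytes) {
  return zipBytes.length;
}

// Old path: persist the downloaded bytes, then read them back before extracting.
function extractViaFile(downloaded) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'dl-'));
  const tmp = path.join(dir, 'browser.zip');
  fs.writeFileSync(tmp, downloaded);   // N MB written to disk
  const reread = fs.readFileSync(tmp); // N MB read back from disk
  fs.rmSync(dir, { recursive: true, force: true });
  return extractedByteCount(reread);
}

// New path: hand the buffer straight to the extractor, skipping the disk round-trip.
function extractInMemory(downloaded) {
  return extractedByteCount(downloaded);
}
```

Both paths see the same bytes; the in-memory path simply skips one full write and one full read of the archive.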

@@ -55,7 +56,7 @@ class Extractor {
     })
     this.zipfile.readEntry()

-    this.zipfile.on('close', () => {
+    this.zipfile.on('end', () => {

@@ -49,7 +49,7 @@ declare namespace extract {
 }

 declare function extract(
-  zipPath: string,
+  zipPath: Buffer,
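For comparison, a backward-compatible signature could accept `string | Buffer` and normalize at runtime. This is a sketch of an alternative design, not what the PR ships, and `toZipBuffer` is a hypothetical helper:

```javascript
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Hypothetical normalizer: lets legacy path-based callers coexist with
// in-memory callers. Not Playwright code, just an illustration.
function toZipBuffer(zipPathOrBuffer) {
  return Buffer.isBuffer(zipPathOrBuffer)
    ? zipPathOrBuffer                   // already in memory, no disk read
    : fs.readFileSync(zipPathOrBuffer); // legacy callers pass a file path
}
```

The PR instead changes the declared type outright, which is reasonable for a vendored, internal-only module with a single caller.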
Member

Do you have numbers on how much we save? I guess it'd be most beneficial on Windows, but by how much? Also, this means holding entire buffers in memory, which might hit us if we finally decide to download in parallel. So I'm a bit hesitant to change this.

mxschmitt (Member, Author) · May 11, 2025

On an M1 Pro Mac machine it's a 9.55 % improvement, and on a Windows DevBox it's an 11.71 % improvement. Both tests were run on fast machines with SSDs. I expect much higher improvements on slow (e.g. HDD) or busy disks.

Re the comment about parallel downloads: if we assume we hold all our downloads in memory at once, that would be 293 MB on Windows, which I'd say is still manageable. It aligns well with how much memory Chromium uses when navigating to websites; it's sometimes even higher on busy sites.

Used data

macOS M1 Pro

with file write:

test chromium-win64.zip via file attempt 1: 1.472s
test chromium-win64.zip via file attempt 2: 1.455s
test chromium-win64.zip via file attempt 3: 1.445s
test webkit-ubuntu-24.04.zip via file attempt 1: 968.385ms
test webkit-ubuntu-24.04.zip via file attempt 2: 1.017s
test webkit-ubuntu-24.04.zip via file attempt 3: 1.065s

with in-memory:

test chromium-win64.zip in memory attempt 1: 1.332s
test chromium-win64.zip in memory attempt 2: 1.391s
test chromium-win64.zip in memory attempt 3: 1.273s
test webkit-ubuntu-24.04.zip in memory attempt 1: 966.896ms
test webkit-ubuntu-24.04.zip in memory attempt 2: 906.812ms
test webkit-ubuntu-24.04.zip in memory attempt 3: 912.546ms
results

((1.472 + 1.455 + 1.445) / (1.332 + 1.391 + 1.273) - 1) * 100 = 9.41 % faster with chromium-win64.zip
((0.968 + 1.017 + 1.065) / (0.966 + 0.906 + 0.912) - 1) * 100 = 9.55 % faster with webkit-ubuntu-24.04.zip

Windows DevBox 64GB RAM

with file write:

PS C:\Users\maxschmitt\Developer\playwright> node test.mjs
test chromium-win64.zip via file attempt 1: 3.575s
test chromium-win64.zip via file attempt 2: 3.529s
test chromium-win64.zip via file attempt 3: 3.499s
test webkit-ubuntu-24.04.zip via file attempt 1: 2.423s
test webkit-ubuntu-24.04.zip via file attempt 2: 2.421s
test webkit-ubuntu-24.04.zip via file attempt 3: 2.310s

with in-memory:

PS C:\Users\maxschmitt\Developer\playwright> node test.mjs
test chromium-win64.zip in memory attempt 1: 3.234s
test chromium-win64.zip in memory attempt 2: 3.042s
test chromium-win64.zip in memory attempt 3: 3.240s
test webkit-ubuntu-24.04.zip in memory attempt 1: 2.051s
test webkit-ubuntu-24.04.zip in memory attempt 2: 2.140s
test webkit-ubuntu-24.04.zip in memory attempt 3: 2.213s

results

((3.575 + 3.529 + 3.499) / (3.234 + 3.042 + 3.240) - 1) * 100 = 11.42 % faster with chromium-win64.zip
((2.423 + 2.421 + 2.310) / (2.051 + 2.140 + 2.213) - 1) * 100 = 11.71 % faster with webkit-ubuntu-24.04.zip
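The percentage figures above come from comparing summed attempt times; as a formula:

```javascript
// improvement = (sum(fileTimes) / sum(memoryTimes) - 1) * 100
function improvementPercent(fileTimes, memoryTimes) {
  const sum = xs => xs.reduce((acc, x) => acc + x, 0);
  return (sum(fileTimes) / sum(memoryTimes) - 1) * 100;
}
```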
Script
import os from 'node:os';
import fs from 'node:fs';
import path from 'node:path';

import extractZip from './packages/playwright-core/bundles/zip/src/third_party/extract-zip.js';

const testcases = [
  'https://playwright.azureedge.net/dbazure/download/playwright/builds/chromium/1172/chromium-win64.zip',
  'https://playwright.azureedge.net/dbazure/download/playwright/builds/webkit/2169/webkit-ubuntu-24.04.zip',
]
const inMemory = true;

const testDir = path.join(os.tmpdir(), 'playwright-extract-zip-test');

for (const url of testcases) {
  const file = Buffer.from(await fetch(url).then(res => res.arrayBuffer()));
  for (let i = 0; i < 3; i++) {
    if (fs.existsSync(testDir))
      fs.rmSync(testDir, { recursive: true, force: true });
    fs.mkdirSync(testDir);
    const testTitle = `test ${path.basename(url)} ${inMemory ? 'in memory' : 'via file'} attempt ${i + 1}`;
    if (inMemory) {
      console.time(testTitle);
      await extractZip(file, { dir: testDir });
    } else {
      console.time(testTitle);
      const filePath = path.join(testDir, 'chromium.zip');
      // Intentionally include the writeFileSync in the time measurement,
      // since that file-write cost is exactly what the in-memory path avoids.
      fs.writeFileSync(filePath, file);
      await extractZip(filePath, { dir: testDir });
    }
    console.timeEnd(testTitle);
  }
}

Member

Sounds plausible! This is on the critical path, so we should place this behind an opt-out flag.
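A minimal sketch of such an opt-out follows; the environment variable name here is an assumption for illustration, not an actual Playwright flag:

```javascript
// Hypothetical opt-out: setting the env var restores the old
// write-to-disk-then-extract behavior. The name is illustrative only.
function shouldExtractInMemory(env = process.env) {
  return !env.PLAYWRIGHT_DOWNLOAD_VIA_FILE;
}
```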

@mxschmitt force-pushed the in-memory-downloader branch from f97bad0 to 20c4380 on May 9, 2025 at 15:41


@mxschmitt force-pushed the in-memory-downloader branch from 20c4380 to 8f83864 on May 11, 2025 at 18:56


Contributor

Test results for "tests 1"

1 failed
❌ [firefox-page] › page/page-evaluate.spec.ts:424:3 › should throw for too deep reference chain @firefox-ubuntu-22.04-node18

6 flaky ⚠️ [firefox-library] › library/inspector/cli-codegen-1.spec.ts:986:7 › cli codegen › should not throw csp directive violation errors @firefox-ubuntu-22.04-node18
⚠️ [chromium-library] › library/client-certificates.spec.ts:400:3 › browser › should not hang on tls errors during TLS 1.2 handshake @ubuntu-22.04-chromium-tip-of-tree
⚠️ [webkit-library] › library/ignorehttpserrors.spec.ts:30:3 › should isolate contexts @webkit-ubuntu-22.04-node18
⚠️ [webkit-page] › page/elementhandle-bounding-box.spec.ts:22:3 › should work @webkit-ubuntu-22.04-node18
⚠️ [webkit-page] › page/page-event-popup.spec.ts:28:3 › should work with window features @webkit-ubuntu-22.04-node18
⚠️ [playwright-test] › ui-mode-test-watch.spec.ts:145:5 › should watch all @windows-latest-node18-1

39159 passed, 803 skipped
✔️✔️✔️

Merge workflow run.

3 participants