Download event not caught and always times out #27

Open
imhashir opened this issue Jan 23, 2021 · 12 comments

@imhashir

Thanks a bunch for creating this awesome package.

I'm having an issue with the download event. It works when I run my code locally (serverless invoke local), but once I deploy with serverless deploy, waitForEvent('download') times out.

Here's the code:

  const { page, browser } = await openWebpage(URL);

  const [download] = await Promise.all([
    // Start waiting for the download
    page.waitForEvent('download'),
    // Perform the action that initiates download
    page.click(`#${BTN_ID}`),
  ]);

Here's the openWebpage function:

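// Assumed import; the original post does not show it.
import * as playwright from 'playwright-aws-lambda';

export async function openWebpage(url) {
  const browser = await playwright.launchChromium();
  // Opt in to downloads on this context.
  const context = await browser.newContext({
    acceptDownloads: true,
  });

  const page = await context.newPage();
  await page.goto(url);

  // Note: the `browser` key holds the context, not the browser instance.
  return { page, browser: context };
}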

A similar issue was posted in playwright's official repo here.
In that same issue, I've commented about my issue as well, here.

My guess is that this package is based on chrome-aws-lambda, which targets Puppeteer, and since Puppeteer does not support the download event, support for it was never included here either. That's only a guess, though. I'd love to help in any way to get this issue fixed.

Hope to hear from you soon.

@Madhu1512

Madhu1512 commented Jan 23, 2021

I am also seeing the same timeout error when running in Lambda.

TimeoutError: Timeout while waiting for event "download"
Note: use DEBUG=pw:api environment variable and rerun to capture Playwright logs.

"playwright-aws-lambda": "^0.6.0", "playwright-core": "^1.8.0",

@Madhu1512

I think the issue lies with the version of Chrome packaged with this library. For now, I have switched to Lambda's new Docker container support and can process downloads without any issue.

@imhashir
Author

> I think the issue lies with the version of Chrome packaged with this library. For now, I have switched to Lambda's new Docker container support and can process downloads without any issue.

Amazing. Are you doing it via Serverless or bare lambda? Can you guide me through the process or share some code snippet? Thank You.

@osmenia

osmenia commented Jan 24, 2021

> I think the issue lies with the version of Chrome packaged with this library. For now, I have switched to Lambda's new Docker container support and can process downloads without any issue.

Wow, I am also interested. Could you please guide us through the process or share a code snippet?
Thank you very much, and have a nice day.

@anupsunni

> I think the issue lies with the version of Chrome packaged with this library. For now, I have switched to Lambda's new Docker container support and can process downloads without any issue.

Awesome, it would be great if you could guide us here.

Thanking you in anticipation.

@Madhu1512

Here is an example I put together for running Playwright in a Lambda Docker container.

https://github.com/Madhu1512/playwright-lambda-demo

@imhashir
Author

Thanks a lot, @Madhu1512, for going to the effort of creating an example for us.
I'll have to look into Docker-based Lambda deployments to get that working, but I'll definitely try your solution. For now, I got it to work by downgrading playwright-core to 1.0.2, as suggested by @osmenia in microsoft/playwright#3726 (comment).

@osmenia

osmenia commented Jan 29, 2021

@austinkelleher

Can you please update Chromium? See microsoft/playwright#3726 (comment).

@CRSylar

CRSylar commented Sep 27, 2022

Hi all!

Any news? I'm stuck with this error.

This is on AWS Lambda, of course, with a Node.js > 16 runtime.

Here's my package.json:

"dependencies": { "playwright-aws-lambda": "^0.9.0", "playwright-core": "^1.26.0" }

I've tried downgrading playwright-core to the suggested 1.2.0, but then I would need to refactor all the code, since locators do not exist in such an old version.

Any suggestions?
Note that I've also tried to dispatch the click event "manually", without success.

What I need to achieve is to save the downloaded file to /tmp/ so I can parse it (it's a CSV) later on.

Finally, here's the code (it works flawlessly locally):

```js
const playwright = require('playwright-aws-lambda')

const extractData = async () => {
  const browser = await playwright.launchChromium()
  const cxt = await browser.newContext()
  const page = await cxt.newPage()

  await page.goto('https://<TheTargetSite>/auth/login');

  // Log in
  await page.locator('input[type="email"]').click();
  await page.locator('input[type="email"]').fill('[email protected]');
  await page.locator('input[type="password"]').click();
  await page.locator('input[type="password"]').fill('YYYYY');
  await page.locator('button:has-text("Log in")').click();

  // Open the rides list and filter it by status and date
  await page.locator('a:has-text("Rides")').click();
  await page.locator('text=ActiveStatus').click();
  await page.locator('text=Ended').click();
  await page.locator('[placeholder="Start date"]').click();
  await page.locator('[aria-label="September 19\\, 2022"]').click();
  await page.locator('[aria-label="September 19\\, 2022"]').click();
  await page.locator('[aria-label="Export rides"]').click();

  // Start waiting for the download before clicking the button that triggers it
  const [download] = await Promise.all([
    page.waitForEvent('download'),
    page.locator('button:has-text("Export")').click()
  ]);
  await download.saveAs('/tmp/rides.csv')

  await page.close()
  await cxt.close()
  await browser.close()
}

module.exports = { extractData }
```

@SamLoy

SamLoy commented Feb 9, 2023

I was able to work around this issue by setting a fixed folder under /tmp for Chrome to write its downloads to, then watching that directory for the PDF to arrive.

Obviously this isn't perfect for every situation, but it works well for us, since the PDF download is reliable and there is only ever one download per session.

For example:

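   import * as fs from "fs";
   import { v4 as uuid } from "uuid"; // assumed source of uuid(); not shown in the original post
   import * as playwright from "playwright-aws-lambda";

   // Unique downloads folder for this session
   const tmpFolder = "/tmp/pdfs/" + uuid();
   const browser = await playwright.launchChromium({downloadsPath: tmpFolder});
   const context = await browser.newContext();
   const page = await context.newPage();
 
  ...
        
   await page.getByText("Download PDF").click()
   let pdfFiles: string[] = [];

   // Poll the folder once per second until a file shows up (no upper time limit)
   while(!pdfFiles.length) {
        await page.waitForTimeout(1000);
        pdfFiles = fs.readdirSync(tmpFolder);
   }

   const pdfData = fs.readFileSync(`${tmpFolder}/${pdfFiles[0]}`);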
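
The polling loop above never gives up, and it reads the first file it sees even if the browser is still writing it. Here is a minimal sketch of a bounded variant, using only Node's built-in `fs` and `path` modules. `waitForDownloadedFile` and its `timeoutMs` / `intervalMs` options are hypothetical names for illustration and are not part of Playwright or playwright-aws-lambda. Treating "same size on two consecutive polls" as "finished" is a heuristic, not a guarantee.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not a Playwright API): polls `dir` until it contains a
// non-empty regular file whose size is unchanged between two consecutive polls,
// and rejects once `timeoutMs` has elapsed.
async function waitForDownloadedFile(dir, { timeoutMs = 30000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  const lastSizes = new Map(); // file name -> size seen on the previous poll

  while (Date.now() < deadline) {
    // The directory may not exist yet, so treat that case as "no files".
    const names = fs.existsSync(dir) ? fs.readdirSync(dir) : [];

    for (const name of names) {
      const fullPath = path.join(dir, name);
      const stats = fs.statSync(fullPath);
      if (!stats.isFile()) continue;

      // Same non-zero size as on the previous poll: assume the write has finished.
      if (stats.size > 0 && lastSizes.get(name) === stats.size) {
        return fullPath;
      }
      lastSizes.set(name, stats.size);
    }

    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`No completed download appeared in ${dir} within ${timeoutMs} ms`);
}
```

With the snippet above, the `while` loop could then be replaced by something like `const pdfPath = await waitForDownloadedFile(tmpFolder);` followed by `fs.readFileSync(pdfPath)`. Pick a `timeoutMs` that stays below the Lambda function's own timeout.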
       

@TheAPIguys

Hi SamLoy,

I have used the /tmp folder in Lambda functions before to store temporary files for other purposes. In this case, do you have to pre-create the folder on every run, or add it to your source code in AWS Lambda? Or is the snippet above enough on its own?

Many thanks for your suggestion and help.

Regards
Dan
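
The thread does not settle whether the folder has to exist beforehand. A defensive sketch is to create it yourself before launching the browser, which is harmless if it already exists. `createDownloadDir` and `removeDownloadDir` are hypothetical helper names. Node's built-in `crypto.randomUUID` is swapped in for the `uuid()` call used above, and `os.tmpdir()` is assumed to resolve to `/tmp` on Lambda.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { randomUUID } = require('crypto');

// Hypothetical helper: creates a unique download folder for one invocation.
function createDownloadDir(baseDir = path.join(os.tmpdir(), 'pdfs')) {
  const dir = path.join(baseDir, randomUUID());
  // recursive: true creates missing parent folders and does not throw
  // if the folder already exists.
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Hypothetical helper: removes the folder afterwards so that warm invocations
// do not accumulate files under the base directory.
function removeDownloadDir(dir) {
  fs.rmSync(dir, { recursive: true, force: true });
}
```

The returned path can be passed as `downloadsPath` when launching the browser, with `removeDownloadDir` called in a `finally` block once the file has been read.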

@zhw2590582

@SamLoy Very good idea. I successfully ran playwright-aws-lambda on Vercel and was then able to download the file.
