Getting text and attribute from a website #150

marcpre · 2021-09-12T13:07:09Z

I am using "@nesk/puphpeteer": "^2.0.0" and want to get the text and the href attribute from a link.

I tried the following:

<?php

require_once '../vendor/autoload.php';

use Nesk\Puphpeteer\Puppeteer;
use Nesk\Rialto\Data\JsFunction;

$debug = true;

$puppeteer = new Puppeteer([
    'read_timeout' => 100,
    'debug' => $debug,
]);
$browser = $puppeteer->launch([
    'headless' => !$debug,
    'ignoreHTTPSErrors' => true,
]);

$page = $browser->newPage();
$page->goto('http://example.python-scraping.com/');

//get text and link
$links = $page->querySelectorXPath('//*[@id="results"]/table/tbody/tr/td/div/a', JsFunction::createWithParameters(['node'])
    ->body('return node.textContent;'));

// get single text
$singleText = $page->querySelectorXPath('//*[@id="pagination"]/a', JsFunction::createWithParameters(['node'])
    ->body('return node.textContent;'));

$browser->close();

When I run the above script, I get the nodes from the page, but I cannot access their attributes or their text.

Any suggestions on how to do this?
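
One possible direction, as a minimal sketch rather than a confirmed answer: `querySelectorXPath` appears to map to Puppeteer's `page.$x`, which takes only the XPath expression, so the `JsFunction` passed as a second argument is probably ignored and the call just returns element handles. The sketch below assumes each returned handle exposes `evaluate()` as in Puppeteer's `ElementHandle`, and that the installed Puppeteer version supports it. The `$extract`, `$handles`, and `$links` names are illustrative only.

```php
<?php

require_once '../vendor/autoload.php';

use Nesk\Puphpeteer\Puppeteer;
use Nesk\Rialto\Data\JsFunction;

$puppeteer = new Puppeteer();
$browser = $puppeteer->launch();

$page = $browser->newPage();
$page->goto('http://example.python-scraping.com/');

// Runs in the browser against one element and returns plain, serializable data.
$extract = JsFunction::createWithParameters(['node'])
    ->body('return { text: node.textContent.trim(), href: node.getAttribute("href") };');

// Assumption: querySelectorXPath() returns an array of element handles.
$handles = $page->querySelectorXPath('//*[@id="results"]/table/tbody/tr/td/div/a');

$links = [];
foreach ($handles as $handle) {
    // Evaluate the function with the handle's DOM node as its argument.
    $links[] = $handle->evaluate($extract);
}

// Dump whatever came back instead of assuming its exact PHP shape.
print_r($links);

$browser->close();
```

The same pattern should apply to the single pagination link: take the first handle returned for `//*[@id="pagination"]/a` and call `evaluate()` on it.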
