
Puppeteer can’t find elements when Headless TRUE

I’m having a problem with Puppeteer: I want to extract a list of items, and it works when headless is FALSE but not when it is TRUE.

First things first, I want to get those elements before mapping over them.

Here’s my script. It is really basic, so you may be able to reproduce the problem.


JavaScript
JavaScript


Answer

For starters, I’d prefer page.waitForSelector(yourSelector) over page.waitForNetworkIdle(). In most cases, it’s a more direct guarantee that the data you want is on the page, whereas waiting for network idle can block on all sorts of requests that are irrelevant to the data you’re trying to scrape. Another option is page.waitForResponse(predicate), which waits for a response that matches your predicate.
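As a rough illustration of the waitForSelector approach, here is a minimal sketch. The helper name extractItems, the '.item' selector, the URL, and the 30-second timeout are all placeholders of mine, not taken from the question's script, so adjust them to your page.

```javascript
// Minimal sketch: wait for the target elements, then read their text.
// `page` is expected to be a Puppeteer Page; the selector and the
// timeout are placeholders, not values from the original question.
async function extractItems(page, selector, timeoutMs = 30000) {
  // Resolves once at least one matching element is in the DOM,
  // or rejects with a timeout error after `timeoutMs` milliseconds.
  await page.waitForSelector(selector, { timeout: timeoutMs });

  // The callback runs in the browser context over all matching elements.
  return page.$$eval(selector, (elements) =>
    elements.map((el) => el.textContent.trim())
  );
}

// Hypothetical usage (the URL and selector are made up):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch({ headless: true });
//   const page = await browser.newPage();
//   await page.goto('https://example.com/list');
//   console.log(await extractItems(page, '.item'));
//   await browser.close();
```

If waitForSelector times out in headless mode but not in headful mode, that usually suggests the site is serving different content to the headless browser, which is what the next suggestions address.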

Some websites check request headers to detect and block scrapers. You can try setting a user agent header, as described in the Puppeteer GitHub issue "Different behavior between { headless: false } and { headless: true }" (#665):

JavaScript
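The answer's original snippet did not survive on this page, so the following is only a sketch of one common way to do this, not necessarily the code the answer showed. It assumes the headless browser reports a user agent containing "HeadlessChrome", which has commonly been the case, and rewrites it to look like a regular Chrome. The helper names are hypothetical; browser.userAgent() and page.setUserAgent() are Puppeteer methods.

```javascript
// Pure helper: turn a headless user agent into a regular-looking one.
// If "HeadlessChrome" is not present, the string is returned unchanged.
function toHeadfulUserAgent(userAgent) {
  return userAgent.replace('HeadlessChrome', 'Chrome');
}

// Hypothetical helper: open a page whose user agent no longer
// advertises headless mode. `browser` is a Puppeteer Browser.
async function newPageWithHeadfulUserAgent(browser) {
  const page = await browser.newPage();
  const defaultUserAgent = await browser.userAgent();
  await page.setUserAgent(toHeadfulUserAgent(defaultUserAgent));
  return page;
}

// Hypothetical usage:
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch({ headless: true });
//   const page = await newPageWithHeadfulUserAgent(browser);
//   await page.goto('https://example.com/list');
```

Deriving the string from the browser's own user agent keeps the version numbers consistent with the Chrome build that is actually running, which a hard-coded string would not.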

Another option you can try is puppeteer-extra, as described in "Why does headless need to be false for Puppeteer to work?". It also anonymizes the user agent headers.
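A minimal sketch of the puppeteer-extra route follows. It assumes the puppeteer, puppeteer-extra, and puppeteer-extra-plugin-stealth packages are installed, and that the stealth plugin is the part the linked answer relies on. The helper name launchWithStealth is hypothetical; the modules are passed in as arguments only to keep the sketch easy to exercise in isolation.

```javascript
// Hypothetical helper: register the stealth plugin, then launch.
// `puppeteerExtra` is the puppeteer-extra module and
// `createStealthPlugin` is the factory exported by
// puppeteer-extra-plugin-stealth (both are assumptions about your setup).
function launchWithStealth(
  puppeteerExtra,
  createStealthPlugin,
  launchOptions = { headless: true }
) {
  // Plugins must be registered before the browser is launched.
  puppeteerExtra.use(createStealthPlugin());
  return puppeteerExtra.launch(launchOptions);
}

// Hypothetical usage:
//   const puppeteer = require('puppeteer-extra');
//   const StealthPlugin = require('puppeteer-extra-plugin-stealth');
//   const browser = await launchWithStealth(puppeteer, StealthPlugin);
//   const page = await browser.newPage();
```

The rest of the script can stay as it is, since the object returned by launch is used like a regular Puppeteer browser.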

User contributions licensed under: CC BY-SA