Puppeteer not retrieving JavaScript rendered page

I am trying to load the product page using Puppeteer, but it's not working.

JavaScript

Opening this URL in a browser loads only part of the page; the rest loads as you scroll down.

I also tried scrolling the page programmatically, but that did not work either.

The scroll function is as follows:

JavaScript

Answer

When I run this headfully, the page never fully loads the review content. The site appears to be detecting the bot and withholding the reviews regardless of the scroll.

Running headfully with puppeteer-extra-plugin-stealth avoids detection, but headless mode with stealth is still blocked. I’ll update if I can find a solution, but I figure this is at least a step forward.

JavaScript

In the future, if you see waitForSelector timeouts when running headlessly, it’s a good idea to add console.log(await page.content());, which will usually show that you’ve been blocked before you waste time on scrolling and other futile strategies.
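
As a rough sketch of that debugging tip, the wait can be wrapped so the page HTML is printed whenever the selector never shows up. The helper name waitForSelectorOrDump and the default timeout are made up for illustration; only page.waitForSelector and page.content are Puppeteer calls.

```javascript
// Hypothetical helper (not part of Puppeteer): wait for a selector and,
// if the wait fails, print the HTML the browser actually received so a
// block page or captcha is easy to spot.
async function waitForSelectorOrDump(page, selector, timeout = 10000) {
  try {
    return await page.waitForSelector(selector, { timeout });
  } catch (err) {
    // page.content() resolves to the full serialized HTML of the page.
    console.log(await page.content());
    throw err; // rethrow so the caller still sees the original timeout
  }
}
```

You would call it as await waitForSelectorOrDump(page, ".reviews"), where ".reviews" is a placeholder for whatever selector you are waiting on. If the dumped HTML is an access-denied or captcha page instead of the product markup, the problem is bot detection rather than scrolling.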

See also Why does headless need to be false for Puppeteer to work?

User contributions licensed under: CC BY-SA