Getting index.html content while trying to scrape a React website

When I try to scrape a React website using Node.js, I get only the content of the index.html file, not the tags that are rendered on the site. Here is what I have tried:

JavaScript

What should I do to get all of the tags that are used in the React website?

Also, can I scrape the HackerNoon website (just as an example), and is that legal?

Answer

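Cheerio only parses HTML that has already been rendered (e.g. static HTML), and it does not execute JavaScript. A React site typically serves a nearly empty index.html and builds the DOM in the browser, so that shell is all Cheerio sees. To get the markup that React renders, you need a headless browser controlled by a tool like Puppeteer.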
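
As a rough illustration of the Puppeteer approach, here is a minimal sketch, not a drop-in solution. It assumes `puppeteer` is installed via npm. The function name `scrapeRendered`, its `launch` parameter, and the URL and selector you pass in are all made up for this example, since the original code from the question is not shown.

```javascript
// Sketch: load the page in headless Chromium so React can run, then read the DOM.
// Assumption: `npm install puppeteer` has been run. `scrapeRendered` and its
// `launch` parameter are hypothetical names used only for this example.
async function scrapeRendered(url, selector, launch = () => require('puppeteer').launch()) {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so client-side rendering can finish.
    await page.goto(url, { waitUntil: 'networkidle0' });
    // Wait until at least one element matching the selector is in the DOM.
    await page.waitForSelector(selector);
    // The callback runs inside the page and collects the text of every match.
    const texts = await page.$$eval(selector, (els) =>
      els.map((el) => el.textContent.trim())
    );
    // Serialized HTML after rendering; this string can be passed to cheerio.load().
    const html = await page.content();
    return { html, texts };
  } finally {
    // Always close the browser, even if navigation or the selector wait fails.
    await browser.close();
  }
}

// Example call (placeholder URL and selector):
// scrapeRendered('https://example.com', 'h2').then(({ texts }) => console.log(texts));
```

If you prefer to keep your Cheerio code, you can feed the returned `html` string into `cheerio.load(html)` and query it as before. The difference is that the HTML now contains what React rendered.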