Thanks in advance to anyone willing to help a newbie.
Task: select the div with class client-state, then the nested div with id Here-Goes-Some-Div-ID, and return the JSON stored in its data-state attribute.
<div class="client-state">
  <div id="Here-Goes-Some-Div-ID" data-state='{"items":[{"action":"LAYOUT"}]'> </div>
</div>
I managed to select the client-state div like this:
import cheerio from 'cheerio';

const id = ["98772"]

async function GetDataFunction () {
  try {
    for (let i = 0; i < id.length; i++) {
      let HTMLresponse = await ProductSearchFunction(id[i]);
      const $ = cheerio.load(HTMLresponse);
      $('.client-state').each(function() {
        const ClientState = $(this).html()
        console.log(ClientState)
      })
    }
  } catch (err) {
    throw err
  }
};

GetDataFunction()
Unfortunately, I could not find out how to go one step further, select the "Here-Goes-Some-Div-ID" div, and read its data-state attribute. I would be grateful for a hint. Thank you in advance!
Answer
You can go with something like this (using Puppeteer):
await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))
What’s this?
- we use the page.$eval method to evaluate a specific DOM element
- the first parameter is the CSS selector for the desired element; you can use a descendant combinator (a single space) between the already identified class (.client-state) and the HTML id (#Here-Goes-Some-Div-ID); ids are written with a # in front of the id name
- the second parameter of the function is the so-called pageFunction, where you can do this: el => el.getAttribute('data-state'), using Element.getAttribute() on the data-state attribute
You can store the result in a variable and then parse it with JSON.parse(), or do whatever else you need with it.
Note: in your current example, '{"items":[{"action":"LAYOUT"}]' would be unparsable, as the wrapper object is not closed with a }!
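To see the difference, here is a minimal sketch that parses both the string from the question and a version with the closing } added. parseDataState is a hypothetical helper name, not something from Puppeteer or cheerio; it only wraps JSON.parse() so that a malformed attribute is reported instead of throwing.

```javascript
// Hypothetical helper: parse a data-state string and report failure
// as a result object instead of letting JSON.parse() throw.
function parseDataState(raw) {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// String exactly as it appears in the question (final } is missing).
const fromQuestion = '{"items":[{"action":"LAYOUT"}]';
// Same string with the wrapper object closed.
const corrected = '{"items":[{"action":"LAYOUT"}]}';

console.log(parseDataState(fromQuestion).ok);                  // false
console.log(parseDataState(corrected).value.items[0].action);  // LAYOUT
```

Returning a result object is just one option; letting the error propagate is equally reasonable if a broken attribute should stop the script.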
full example:
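const puppeteer = require('puppeteer')

async function main() {
  const browser = await puppeteer.launch({ headless: false })
  const page = await browser.newPage()
  // url is not defined in this snippet: set it to the address of the page you want to scrape
  await page.goto(url)
  // read the raw data-state attribute of the nested div
  const dataState = await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))
  console.log(dataState)
  // throws if the attribute does not contain valid JSON
  console.log(JSON.parse(dataState))
  await browser.close()
}

main()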