Thanks in advance to anyone who can help a newbie.
Task: access the div with class client-state, then the div with id Here-Goes-Some-Div-ID inside it, and return the JSON stored in its data-state attribute.
<div class="client-state">
<div id="Here-Goes-Some-Div-ID" data-state='{"items":[{"action":"LAYOUT"}]'>
</div>
</div>
I managed to select the client-state div like this:
import cheerio from 'cheerio';

// Product ids to look up
const id = ["98772"];

async function GetDataFunction() {
  try {
    for (let i = 0; i < id.length; i++) {
      // ProductSearchFunction (not shown here) is expected to return the page HTML
      let HTMLresponse = await ProductSearchFunction(id[i]);
      const $ = cheerio.load(HTMLresponse);
      // Print the inner HTML of every element with class client-state
      $('.client-state').each(function () {
        const ClientState = $(this).html();
        console.log(ClientState);
      });
    }
  } catch (err) {
    throw err;
  }
}

GetDataFunction();
Unfortunately, I could not find out how to go one step further, access the “Here-Goes-Some-Div-ID” div, and read its data-state attribute. I would be grateful for a hint. Thank you in advance!
Answer
With Puppeteer, you can go with something like this:
await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))
What’s this?
- we use the page.$eval method to evaluate a specific DOM element
- the first parameter is the CSS selector for the desired element; you can use a descendant combinator (a single space) between the already identified class (.client-state) and the HTML id (#Here-Goes-Some-Div-ID); ids use # in front of the id name
- the second parameter of the function is the so-called pageFunction, where you can do this: el => el.getAttribute('data-state'), using Element.getAttribute() on the data-state attribute.
You can store the result in a variable and then parse it with JSON.parse(), or do whatever else you want with it.
Note: in your current example, '{"items":[{"action":"LAYOUT"}]' cannot be parsed because the wrapper object is not closed with a }!
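To illustrate the note about the unclosed object, here is a minimal sketch in plain JavaScript. The helper name parseDataState and its return shape are hypothetical (they are not part of Puppeteer or cheerio); the idea is only to wrap JSON.parse() in a try/catch so that a malformed data-state value does not crash the script.

```javascript
// Hypothetical helper (an assumption, not part of any library):
// parse a data-state attribute value without letting a malformed string throw.
function parseDataState(raw) {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    // JSON.parse throws a SyntaxError on malformed input
    return { ok: false, error: err.message };
  }
}

// The attribute value exactly as written in the question (final "}" missing)
const broken = '{"items":[{"action":"LAYOUT"}]';
// The same value with the wrapper object closed
const fixed = '{"items":[{"action":"LAYOUT"}]}';

console.log(parseDataState(broken).ok);                   // false
console.log(parseDataState(fixed).value.items[0].action); // LAYOUT
```

If the real page serves valid JSON in data-state, only the second case applies; the first one is there to show what happens with the snippet as posted.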
Full example:
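const puppeteer = require('puppeteer')

async function main() {
  // headless: false opens a visible browser window
  const browser = await puppeteer.launch({
    headless: false
  })
  const page = await browser.newPage()
  // url: address of the page that contains the div (define it before calling main)
  await page.goto(url)
  // Read the raw data-state attribute of the nested div
  const dataState = await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))
  console.log(dataState)
  // Parsing only succeeds if the attribute holds valid JSON
  console.log(JSON.parse(dataState))
  await browser.close()
}

main()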