I'm new to this and would be grateful for any help.
Task: select the div with class client-state, then the div with id Here-Goes-Some-Div-ID inside it, and return the JSON stored in its data-state attribute.
JavaScript
<div class="client-state">
  <div id="Here-Goes-Some-Div-ID" data-state='{"items":[{"action":"LAYOUT"}]'>
  </div>
I managed to select the client-state div like this:
JavaScript
// Namespace import: works with cheerio versions that have no default export
import * as cheerio from 'cheerio';

const id = ['98772'];

async function GetDataFunction() {
  try {
    for (let i = 0; i < id.length; i++) {
      // ProductSearchFunction is not shown here; it is expected to resolve to the page HTML
      const HTMLresponse = await ProductSearchFunction(id[i]);
      const $ = cheerio.load(HTMLresponse);

      // Print the inner HTML of every element with class "client-state"
      $('.client-state').each(function () {
        const ClientState = $(this).html();
        console.log(ClientState);
      });
    }
  } catch (err) {
    throw err;
  }
}

GetDataFunction();
Unfortunately, I could not find out how to go one level further, select the “Here-Goes-Some-Div-ID” div, and read its data-state attribute. I would be grateful for a hint. Thank you in advance!
Answer
You can go with something like this:
JavaScript
await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))
What’s this?
- we use Puppeteer's page.$eval method to evaluate a specific DOM element
- the first parameter is the CSS selector for the desired element: you can use a descendant combinator (a single space) between the already identified class (.client-state) and the HTML id (#Here-Goes-Some-Div-ID); ids are written with # in front of the id name
- the second parameter of the function is the so-called pageFunction, where you can do this: el => el.getAttribute('data-state'), using Element.getAttribute() on the data-state attribute
You can store the result in a variable and then parse its content with JSON.parse(), or do whatever else you want with the result.
Note: in your current example, '{"items":[{"action":"LAYOUT"}]' would be unparsable because the wrapper object is not closed with a }!
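To make the note about the unclosed object concrete, here is a minimal sketch of how the parse failure could be handled. The helper name parseDataState is hypothetical (it is not part of the question or of any library), and the "repaired" string is only an assumption about what the attribute was meant to contain.

```javascript
// Hypothetical helper: returns the parsed value, or null when the text is not valid JSON.
function parseDataState(raw) {
  try {
    return JSON.parse(raw);
  } catch (err) {
    // JSON.parse signals malformed input with a SyntaxError
    if (err instanceof SyntaxError) {
      return null;
    }
    throw err;
  }
}

// The attribute value exactly as written in the question: the outer object is never closed
const fromQuestion = '{"items":[{"action":"LAYOUT"}]';
// Assumed intended value, with the closing brace added
const repaired = '{"items":[{"action":"LAYOUT"}]}';

console.log(parseDataState(fromQuestion));             // null
console.log(parseDataState(repaired).items[0].action); // LAYOUT
```

Returning null keeps a scraping loop running when one page carries a malformed attribute; rethrowing instead would be equally reasonable if a bad value should stop the run.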
Full example:
JavaScript
const puppeteer = require('puppeteer')

async function main() {
  const browser = await puppeteer.launch({
    headless: false
  })
  const page = await browser.newPage()

  // url must hold the address of the page you want to scrape
  await page.goto(url)
  // Read the raw data-state attribute from the nested div
  const dataState = await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))
  console.log(dataState)
  console.log(JSON.parse(dataState))

  await browser.close()
}
main()
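Since the question uses cheerio rather than Puppeteer, the same descendant selector can probably be applied there as well. The following is only a sketch: readDataState is a hypothetical helper, and its $ parameter is assumed to be the function returned by cheerio.load(html). It relies on cheerio's .attr(), which returns undefined when the element or attribute is missing.

```javascript
// Hypothetical helper; `$` is assumed to be the result of cheerio.load(html).
function readDataState($) {
  // Descendant combinator: the id must sit somewhere inside .client-state
  const raw = $('.client-state #Here-Goes-Some-Div-ID').attr('data-state');
  if (raw === undefined) {
    return null; // element or attribute not found
  }
  return raw; // still a string; pass it to JSON.parse() if it is valid JSON
}

// Usage inside the loop from the question (assumes cheerio is installed):
//   const $ = cheerio.load(HTMLresponse);
//   const dataState = readDataState($);
//   console.log(dataState);
```

Taking $ as a parameter keeps the helper independent of how the HTML was fetched, so it can be dropped into the existing loop without changing ProductSearchFunction.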