
How to access the data-state attribute of a div by id with cheerio (Node.js)

I am grateful to everyone who participates and will help a newbie.

Task: select the div with class client-state, then the div with id Here-Goes-Some-Div-ID inside it, and return the JSON stored in its data-state attribute.

<div class="client-state">
      <div id="Here-Goes-Some-Div-ID" data-state='{"items":[{"action":"LAYOUT"}]'>
      </div>
</div>

I managed to select the client-state div like this:

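// Namespace import; cheerio.load is a named export.
import * as cheerio from 'cheerio';

const id = ["98772"];

// ProductSearchFunction is defined elsewhere; it resolves to the page HTML for a given id.
async function GetDataFunction() {
    for (let i = 0; i < id.length; i++) {
        const HTMLresponse = await ProductSearchFunction(id[i]);
        const $ = cheerio.load(HTMLresponse);

        // Print the inner HTML of every element with class "client-state".
        $('.client-state').each(function () {
            const ClientState = $(this).html();
            console.log(ClientState);
        });
    }
}

GetDataFunction();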

Unfortunately, I could not find out how to go one level deeper, select the "Here-Goes-Some-Div-ID" div, and read its data-state attribute. I would be grateful for a hint. Thank you in advance!


Answer

With Puppeteer, you can use something like this:

await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))

What’s this?

  1. We use Puppeteer's page.$eval method, which runs a function against the first element that matches a selector.
  2. The first parameter is the CSS selector for the desired element. A descendant combinator (a single space) joins the class you already identified (.client-state) and the id (#Here-Goes-Some-Div-ID); ids are prefixed with #.
  3. The second parameter is the so-called pageFunction, which receives the matched element. Here it is el => el.getAttribute('data-state'), which uses Element.getAttribute() to read the data-state attribute.

You can store the result in a variable and then parse it with JSON.parse(), or process it in whatever way you need.

Note: in your current example, '{"items":[{"action":"LAYOUT"}]' cannot be parsed because the outer object is never closed with a }.
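
To make the parsing caveat concrete, here is a minimal sketch of a guarded parse. The helper name parseDataState is hypothetical (it is not part of Puppeteer or cheerio), and the two sample strings are the one from the question and the same string with the missing } added.

```javascript
// Hypothetical helper: returns the parsed value, or null when the
// string is not valid JSON (JSON.parse signals that with a SyntaxError).
function parseDataState(raw) {
  try {
    return JSON.parse(raw);
  } catch (err) {
    if (err instanceof SyntaxError) return null;
    throw err; // anything else is unexpected, so let it propagate
  }
}

// As written in the question: the outer object is never closed.
const truncated = '{"items":[{"action":"LAYOUT"}]';
// The same string with the closing brace added.
const complete = '{"items":[{"action":"LAYOUT"}]}';

console.log(parseDataState(truncated)); // null
console.log(parseDataState(complete).items[0].action); // LAYOUT
```

Returning null is only one possible design; rethrowing with a clearer message may suit you better if a malformed attribute should stop the scrape.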

Full example:

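const puppeteer = require('puppeteer')

// Placeholder: replace with the URL of the page you want to scrape.
const url = 'https://example.com'

async function main() {
  // headless: false opens a visible browser window, which helps while debugging
  const browser = await puppeteer.launch({
    headless: false
  })
  const page = await browser.newPage()

  await page.goto(url)
  // Read the raw attribute string from the nested div
  const dataState = await page.$eval('.client-state #Here-Goes-Some-Div-ID', el => el.getAttribute('data-state'))
  console.log(dataState)
  // Throws a SyntaxError if the attribute does not hold valid JSON
  console.log(JSON.parse(dataState))

  await browser.close()
}
main()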
User contributions licensed under: CC BY-SA