In VBA, How can we get data shown up with “Inspect Element”, but not with “View Page Source”?



I am trying to scrape a web page which includes multiple tabs. I want to get the quarterly data which is displayed when clicking on By-Quarter Tab, but my code keeps returning yearly data shown when clicking By-Year Tab. The problem is both types of data are on the same URL and when right-clicking “Inspect Element”, their IDs are also the same; you cannot distinguish the quarterly data element ID from yearly data data element ID. “Inspect Element” shows up both quarterlyand yealy data, but “View Page Source” shows up only yealy ones. Could anyone show me how to get the quarterly data please? Thank you very much.

   Sub Getquarterdata()

    Dim html As HTMLDocument
    Set html = New HTMLDocument
    
    URL = "https://s.cafef.vn/hose/VCB-ngan-hang-thuong-mai-co-phan-ngoai-thuong-viet-nam.chn"
 
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .SetRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        html.body.innerHTML = .responseText

    End With

        ' By "Inspect Element" pointing at Quarterly Data, I counted "td" and came with these lines of code, but they print yearly data.
          Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(23).innerText  '=> Print  9,091,070,000 (Year 2017 data)
          Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(24).innerText  '=> Print 14,605,578,000 (Year 2018 data)
          Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(25).innerText  '=> Print 18,510,898,000 (Year 2019 data)
          Debug.Print html.getElementById("divHoSoCongTyAjax").getElementsByTagName("td")(26).innerText  '=> Print 18,451,311,000 (Year 2020 data)
         ' The thing is that Quarterly Data shows up only with "Inspect Element", but not with "View Page Source"
    Set html = Nothing
 

End Sub


Links

  1. URL: https://s.cafef.vn/hose/VCB-ngan-hang-thuong-mai-co-phan-ngoai-thuong-viet-nam.chn

  2. Quaterly Data shown when clicking By-Quarter Tab https://drive.google.com/file/d/1oRtrBZxAoKgdE7gMSBsmkpSX_Ljv1c7L/view?usp=sharing

  3. Yearly Data shown when clicking By-Year Tab https://drive.google.com/file/d/1-tI5TU7IMOXFIhsfH8tGvsCRoB0O7Xl1/view?usp=sharing

  4. Inspect Quaterly Data: https://drive.google.com/file/d/1Xc5hRPTBIKFu7hQoLh4mStp92CxipNpU/view?usp=sharing

  5. Inspect yearly Data: https://drive.google.com/file/d/1LedAF3gvAYSIOKOKfZURR9A2rhK0SNgB/view?usp=sharing

Answer

One of the clues given is in the class where you see it says Ajax. This is dynamically added content. If you use the network tab of dev tools (F12), and manually select the quarterly tab, you will see the following request endpoint, which serves the data you are after:

https://s.cafef.vn/Ajax/Bank/BHoSoCongTy.aspx?symbol=VCB&Type=1&PageIndex=0&PageSize=4&donvi=1


Option Explicit

Public Sub GetQuarterlyTable()
    'required VBE (Alt+F11) > Tools > References > Microsoft HTML Object Library ;  Microsoft XML, v6 (your version may vary)

    Dim hTable As MSHTML.HTMLTable
    Dim xhr As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument
   
    Set xhr = New MSXML2.XMLHTTP60
    Set html = New MSHTML.HTMLDocument

    With xhr
        .Open "GET", "https://s.cafef.vn/Ajax/Bank/BHoSoCongTy.aspx?symbol=VCB&Type=1&PageIndex=0&PageSize=4&donvi=1", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set hTable = html.querySelector(".tab1child_content")
    
    'Do something with table
    Stop
End Sub


Source: stackoverflow