
Web scraping data displayed inside button with no name

I’m pretty new to web scraping, so apologies if the question sounds stupid. I’m trying to extract the values stored inside different buttons on a webpage. The button for each variant seems to have no name; each one just has the class “variant__box” and sits under a div with the class “variants”. As far as I can tell, the values in each variant__box are loaded by JavaScript.

This is the website I’m trying to access to get the data: https://www.honda.co.uk/motorcycles/range/adventure/crf1100l-africa-twin-adventure-sports/specifications-and-price.html#/

This is the code I’ve written so far:

Dim ie As Object
  Dim html As New HTMLDocument
  Dim address, str As String
  Dim jobDetailsList As Object
  Dim jobitem As Object
  
  Set ie = CreateObject("InternetExplorer.Application")

  ie.navigate address 'the one mentioned above
  ie.Visible = False

  While ie.Busy Or ie.readyState < 4
  DoEvents
  Wend
  
  Set html = ie.document
  Set jobDetailsList = html.getElementsByClassName("variants")
                    
      For Each jobitem In jobDetailsList
      jobitem.Click
      str = jobitem.innerText
      ActiveSheet.Cells(i, 5).Value = str
      i = i + 1
      Next jobitem
      
  Set html = Nothing
  ie.Quit
  Set ie = Nothing

So far it returns absolutely nothing, and I don’t know how to solve the problem. Any suggestions would be highly appreciated. Thank you.

Answer

If you want to use IE, you can use the following code. However, SIM’s suggestion is better because it avoids IE altogether.

Sub ScrapeMotorCycleData()
  Dim ie As Object
  Dim address As String
  Dim jobDetailsList As Object
  Dim jobitem As Object
  Dim i As Long
  
  i = 2
  address = "https://www.honda.co.uk/motorcycles/range/adventure/crf1100l-africa-twin-adventure-sports/specifications-and-price.html#/"
  Set ie = CreateObject("InternetExplorer.Application")
  ie.navigate address
  ie.Visible = False
  'The following line doesn't do what you want: it only waits for the
  'initial page load, not for content that JavaScript adds afterwards
  'While ie.Busy Or ie.readyState < 4: DoEvents: Wend
  
  'You need a loop here to wait for the dynamic content to load
  'Poll for the HTML part you want to scrape
  '(No timeout included here, but it can be programmed)
  Do
    DoEvents 'keep Excel responsive while polling
    Set jobDetailsList = ie.document.getElementsByClassName("variant__wrapper")
  Loop Until jobDetailsList.Length > 0
  
  For Each jobitem In jobDetailsList
    ActiveSheet.Cells(i, 5).Value = jobitem.innerText
    i = i + 1
  Next jobitem
  
  ie.Quit
  Set ie = Nothing
End Sub
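
The code above notes that a timeout can be programmed into the wait loop. One way to do that might look like the following sketch. The function name WaitForClass and its parameters are hypothetical and not part of the original answer; it assumes the same late-bound InternetExplorer.Application object as above.

```vba
' Hypothetical helper (not from the original answer): polls the page until
' elements with the given class name exist or the timeout expires.
' Returns the element collection, or Nothing if the timeout was reached.
Function WaitForClass(ie As Object, className As String, timeoutSeconds As Long) As Object
  Dim startTime As Date
  Dim found As Object

  startTime = Now
  Do
    DoEvents                      ' keep Excel responsive while polling
    Set found = Nothing
    On Error Resume Next          ' ie.document may not be available yet
    Set found = ie.document.getElementsByClassName(className)
    On Error GoTo 0

    If Not found Is Nothing Then
      If found.Length > 0 Then
        Set WaitForClass = found
        Exit Function
      End If
    End If
  Loop While DateDiff("s", startTime, Now) < timeoutSeconds

  Set WaitForClass = Nothing      ' timed out without finding anything
End Function

' Possible use inside ScrapeMotorCycleData, replacing the Do ... Loop:
'   Set jobDetailsList = WaitForClass(ie, "variant__wrapper", 15)
'   If jobDetailsList Is Nothing Then
'     ie.Quit
'     Exit Sub
'   End If
```

The 15-second limit is only an illustrative value; a slow connection may need more. Measuring elapsed time with Now and DateDiff rather than Timer avoids trouble if the wait happens to span midnight.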