I want to scrape a page that contains some news items. Here is a simplified HTML version of what I have: I want to get the author and body text of each news item, without the blockquote part. So I wrote this code: It works well, but I don’t know how to remove the blockquote from the “news-body” part,
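The markup and code from this question are not shown here, so the following is only a minimal sketch of one possible approach: inside the page context, clone each “news-body” element, remove any blockquote descendants from the clone, and read the remaining text. The helper name `extractNews` and the selectors `.news-item` and `.news-author` are assumptions made for illustration; only the “news-body” name comes from the question. `page` is assumed to be a Puppeteer Page that has already loaded the news page.

```javascript
// Hypothetical helper, not the asker's code. `page` is assumed to be a
// Puppeteer Page that has already navigated to the news page.
async function extractNews(page) {
  // '.news-item' and '.news-author' are assumed selectors; adapt them to
  // the real markup. The callback runs inside the browser page.
  return page.$$eval('.news-item', (items) =>
    items.map((item) => {
      const authorEl = item.querySelector('.news-author');
      const bodyEl = item.querySelector('.news-body');
      let text = null;
      if (bodyEl) {
        // Work on a deep copy so the live page is left untouched.
        const copy = bodyEl.cloneNode(true);
        // Drop every blockquote from the copy before reading its text.
        copy.querySelectorAll('blockquote').forEach((q) => q.remove());
        text = copy.textContent.trim();
      }
      return {
        author: authorEl ? authorEl.textContent.trim() : null,
        text,
      };
    })
  );
}
```

Cloning first means the blockquotes stay visible in the page itself, which matters if later steps still need them.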
Tag: puppeteer
Puppeteer Cannot read property getElementById of undefined
I’m trying to pass a value to the browser created by Puppeteer, but this error appears: Cannot read property ‘getElementById’ of undefined. The error is in this line: What am I doing wrong? Thanks. Answer: You can only use getElementById in the page context. Use page.evaluate, e.g.: That’ll take the innerHTML of that element, send it back to
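The answer's original snippet is not included in this excerpt. As a hedged sketch of the idea it describes, the helper below calls `document.getElementById` inside `page.evaluate`, so the lookup runs in the browser page rather than in Node.js. The name `getInnerHtmlById` is hypothetical, and `page` is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
async function getInnerHtmlById(page, id) {
  // The callback is executed inside the browser page, where `document`
  // exists. `id` is handed over as an extra argument because the callback
  // cannot see variables from the surrounding Node.js scope.
  return page.evaluate((elementId) => {
    const el = document.getElementById(elementId);
    // Return null instead of throwing when the element is missing.
    return el ? el.innerHTML : null;
  }, id);
}
```

Passing values as extra arguments to `page.evaluate` is also how a value from the Node.js side can be handed to the page.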
How to get the complete HTML AFTER JavaScript on RPi in a file
I have an RPi 4 and I want, via the terminal, to generate a website.html that contains the complete rendered HTML of a webpage. I want to do this, for example, in order to search the whole page for a string or pattern etc… I can do this using something like wget or curl, for example: wget -O website.html https://www.example.com The
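Since this question is filed under the puppeteer tag, here is a minimal sketch of how the rendered HTML could be written to a file with Puppeteer. The helper name `saveRenderedHtml` is hypothetical, `page` is assumed to be a Puppeteer Page, and waiting for `networkidle0` is only one possible heuristic for "JavaScript has finished"; some pages may need a different wait condition.

```javascript
const fs = require('fs');

// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
async function saveRenderedHtml(page, url, outPath) {
  // Assumption: the page counts as rendered once the network has gone idle.
  await page.goto(url, { waitUntil: 'networkidle0' });
  // page.content() returns the serialized DOM as it is now, after scripts ran.
  const html = await page.content();
  await fs.promises.writeFile(outPath, html, 'utf8');
  return html;
}
```

A script run from the terminal would create the page with `puppeteer.launch()` and `browser.newPage()`, call `saveRenderedHtml(page, 'https://www.example.com', 'website.html')`, and then close the browser.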
How to do web scraping using Puppeteer and publish it?
I would like to do some web scraping using Puppeteer. The goal is to obtain data from an external URL when the user clicks a button within my application. My application would have to visit an external URL, fill out a form, click a button, get the data returned, and display it to the user within my application. It is
Firebase Functions times out when using puppeteer’s browser.newPage()
I have seen others having relatively minor performance problems with puppeteer running on Firebase Functions. In my case, Firebase times out before I can do anything with puppeteer, even with the memory and timeoutSeconds cranked all the way up. Code: Here’s the resulting Firebase Functions log. It takes a few seconds to run puppeteer.launch(), and then browser.newPage() won’t finish at
Why can’t Puppeteer scrape an element from an iframe even if I add the selector
I have written a small web scraper using Puppeteer, but I can’t seem to properly extract the information I want. Could you please help me find the mistake? Background: I want to scrape a website that indicates how much of a premium the city allows a landlord to add to rent-controlled apartments (e.g. for a prime location). What I have
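The asker's code is not shown here, so the actual cause is unknown. One common reason is that page-level queries such as `page.$` only search the main document and do not reach into an iframe's own document. The sketch below, under that assumption, first resolves the iframe element to its Frame and then queries inside it. The helper name `textFromIframe` and both selector parameters are hypothetical.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
async function textFromIframe(page, iframeSelector, innerSelector) {
  // Wait for the <iframe> element itself in the main document.
  const iframeHandle = await page.waitForSelector(iframeSelector);
  // Resolve the element to the Frame that holds the embedded document.
  const frame = await iframeHandle.contentFrame();
  if (!frame) return null; // the element was not an iframe, or it is detached
  // Queries on `frame` run against the iframe's document, not the main page.
  await frame.waitForSelector(innerSelector);
  return frame.$eval(innerSelector, (el) => el.textContent.trim());
}
```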
Puppeteer .click hovers instead of clicking
I am using puppeteer to automatically restart my wifi (Linksys Velop) and I can’t seem to click an anchor tag to make the final dialog pop up. After the element is clicked, the anchor tag looks as if it’s being hovered over, with the blue underline. Here is the relevant markup: I have tried page.click() as well as page.$eval(), changing
Why does headless need to be false for Puppeteer to work?
I’m creating a web API that scrapes a given URL and sends the result back. I am using Puppeteer to do this. I asked this question: Puppeteer not behaving like in Developer Console and received an answer that suggested it would only work if headless was set to false. I don’t want to be constantly opening up a browser UI
How do you install and run Puppeteer for Firefox
Hi, I am doing some web automation. I am trying to open a URL and I am getting a data URL error in the Chrome console, so I am moving to the Firefox console to get around the issue of data URLs not opening in the Chrome console. The problem is that “npm install puppeteer-firefox” is not working to install Puppeteer for Firefox. How
Puppeteer: return the value that is selected in a dropdown
How can I grab the selected value from a dropdown (the value that is shown on the page)? I have the following code. What I get when I run this is: So that’s not working… UPDATE: Since the HTML page is very long, I’ve added it to a fiddle: jsfiddle.net/cad231/c14mnp6z The id of the select item is of which I
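The asker's code and the id of the select element are not visible in this excerpt, so the following is only a sketch of one way to read the currently selected option. The helper name `getSelectedOption` and the `selectSelector` parameter are hypothetical; `page` is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page and
// `selectSelector` a CSS selector matching the <select> element.
async function getSelectedOption(page, selectSelector) {
  // The callback runs in the page and receives the <select> element.
  return page.$eval(selectSelector, (select) => {
    const option = select.options[select.selectedIndex];
    // `value` is the option's value attribute; `label` is the text that
    // is shown on the page. selectedIndex is -1 when nothing is selected.
    return option ? { value: option.value, label: option.text } : null;
  });
}
```

Returning both fields covers the case where the visible text differs from the option's underlying value.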