
Puppeteer: delete a node inside an element

I want to scrape a page that contains some news items. Here is a simplified version of the HTML I have:

JavaScript

I want to get the author and body text of each news item, without the blockquote part. So I wrote this code:

JavaScript

It works well, but I don't know how to remove the blockquote from the "news-body" element before getting the text. How can I do this?

EDIT: Sometimes a blockquote exists, sometimes not.


Answer

You can use optional chaining with ChildNode.remove(), so the call is skipped when no blockquote is present. Also, you may find innerText more readable.
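Since the original markup is not reproduced here, the following is only a minimal sketch of that idea. The function name `extractNews` and the selectors `.news` and `.news-author` are hypothetical. Only the "news-body" part and the blockquote come from the question, and treating "news-body" as a class name is also an assumption.

```javascript
// Sketch only: ".news" and ".news-author" are assumed selectors, and
// "news-body" is assumed to be a class. Adjust them to the real markup.
function extractNews(doc = document) {
  return Array.from(doc.querySelectorAll(".news"), (item) => {
    const body = item.querySelector(".news-body");

    // Optional chaining: remove() is called only when a blockquote was found,
    // so items without a blockquote do not throw.
    body?.querySelector("blockquote")?.remove();

    return {
      author: item.querySelector(".news-author")?.innerText.trim() ?? null,
      text: body?.innerText.trim() ?? null,
    };
  });
}

// Possible usage with Puppeteer, assuming `page` is an already-loaded Page:
// const news = await page.evaluate(extractNews);
```

This removes only the first blockquote in each body, and it modifies the loaded page's DOM, which is usually acceptable for scraping. If a body can contain several blockquotes, looping over `querySelectorAll("blockquote")` and removing each one would be the natural variation.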

JavaScript
User contributions licensed under: CC BY-SA