Tag: scrapy

How to escape a sup tag in xpath selector

I want to extract the text content from the HTML tag below, but the <sup> tag is preventing me from getting the desired text. The text I want to extract is simply (4:6, 6:7). How can I extract this text while skipping the <sup> tag? I tried “//p/text()”, but I am only getting the part before
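
A minimal sketch of one possible cause, not the asker's actual code: “//p/text()” matches every text node directly under the <p>, so reading only the first match returns just the part before the <sup>. Joining all direct text nodes gives the full string. The markup in SAMPLE and the helper name direct_text are hypothetical, since the original HTML is not shown, and the standard-library ElementTree stands in for a Scrapy selector. In Scrapy, the analogous step would be joining the list returned by response.xpath("//p/text()").getall().

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the question's real HTML is not shown.
SAMPLE = "<div><p>(4:6, <sup>a</sup>6:7)</p></div>"


def direct_text(element):
    """Join the text nodes that are direct children of `element`,
    skipping any text nested inside child tags such as <sup>."""
    parts = [element.text or ""]
    for child in element:
        # child.tail is the text that follows the child's closing tag
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
p = root.find(".//p")
result = direct_text(p)
print(result)  # (4:6, 6:7)
```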

Parse property page URLs using xpath

javascript python scrapy web-scraping xpath

I am trying to parse the main property page https://www.realtyatlas.co.za/search?areas%5B0%5D%5Btown%5D=Bellville&status=For%20Sale. More precisely, I would like to extract the href from the element with the class shown here and follow the link. However, all the combinations I have tried result in None. I am also aware of the API (https://jf6e1ij07f.execute-api.eu-west-1.amazonaws.com/p/search); however, in the response I do not see the URLs of the properties,

How to parse JavaScript JSON into a Python dict efficiently

javascript json python scrapy

I am looking for a way to read JavaScript JSON data loaded into one of the script tags of this page. I have tried various regex patterns posted on Google and Stack Overflow but got nothing. The JSON Formatter reports the data as invalid (RFC 8259). Here is the code: The problem seems to be an invalid JSON format. The type of profile_json is string while

Scrapy + splash: can’t select element

javascript lua scrapy scrapy-splash web-scraping

I’m learning to use Scrapy with Splash. As an exercise, I’m trying to visit https://www.ubereats.com/stores/, click on the address text box, enter a location and then press the Enter button to move to the next page, which lists the restaurants available for that location. I have the following Lua code: When I click on “Render!” in the Splash API, I get the

Splash API/lua error: attempt to index local element (a nil value)

javascript lua scrapy scrapy-splash

I’m writing a Lua script that I want to use with Scrapy + Splash for a website. I want the script to enter some text and then click on a button. I have the following code: Right now I’m using the Splash API to test whether my code runs properly. When I click “Render!” I get the following