
Python Requests run JS file from GET

Goal

Log in to this website (https://www.reliant.com) using Python requests or similar. (I know this could be done with Selenium, PhantomJS or the like, but I would prefer not to.)

Problem

During the login process there are a couple of redirects where “session ID”-type params are passed. Most of these I can get, but there’s one called dtPC that appears to come from a cookie you get when first visiting the page. As far as I can tell, the cookie originates from this JS file (https://www.reliant.com/ruxitagentjs_ICA2QSVfhjqrux_10175190917092722.js). This URL is the next GET request the browser performs after the initial GET of the main URL. All the methods I’ve tried so far have failed to get me that cookie.

Code thus far

from requests_html import HTMLSession

url=r'https://www.reliant.com'
url2=r'https://www.reliant.com/ruxitagentjs_ICA2QSVfhjqrux_10175190917092722.js'
headers={
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Host': 'www.reliant.com',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.3'
}

headers2={
    'Referer': 'https://www.reliant.com',
    'Sec-Fetch-Mode': 'no-cors',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'
}

s=HTMLSession()
r=s.get(url,headers=headers)
js=s.get(url2,headers=headers2).text

r.html.render() #works but doesn't get the cookie
r.html.render(script=js) #fails on Network error


Answer

Alright, I figured this one out, despite it fighting me the whole way. I don’t know why dtPC wasn’t showing up in s.cookies like it should, but I wasn’t using the script keyword quite right. Apparently, whatever JS you pass it is executed after everything else has rendered, as if you had opened the console in your browser and pasted it in there. When I actually tried that in Chrome, I got some errors. Eventually I realized I could just run a simple JS script to return the cookies generated by the other JS.

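import urllib.parse
from requests_html import HTMLSession

# url and headers are the same as in the question
s=HTMLSession()
r=s.get(url,headers=headers)
print(r.status_code)

# render() returns the value of the script, which runs after the page's own JS
c=r.html.render(script='document.cookie')

# document.cookie looks like 'name1=value1; name2=value2'
c=[x.strip().split('=',1) for x in c.split(';') if '=' in x]
c={k:urllib.parse.unquote(v) for k,v in c}
print(c)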

At this point, c will be a dict with 'dtPC' as a key and the corresponding value.
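
For anyone who wants to check the parsing step without hitting the site, here is a minimal standalone sketch. The function name parse_cookie_string and the sample cookie values are made up for illustration; only the 'name=value; name=value' shape of document.cookie is taken from the answer above.

```python
from urllib.parse import unquote


def parse_cookie_string(raw):
    """Turn a document.cookie style string into a dict.

    Hypothetical helper, not part of requests_html.
    """
    cookies = {}
    for pair in raw.split(';'):
        pair = pair.strip()  # pairs are separated by '; ', so drop the space
        if not pair:
            continue
        # partition splits on the first '=' only, so '=' inside a value survives
        name, _, value = pair.partition('=')
        cookies[unquote(name)] = unquote(value)
    return cookies


# Made-up sample; a real dtPC value will look different.
sample = 'dtCookie=abc123; dtPC=2%24123456_789h1; flag'
parsed = parse_cookie_string(sample)
print(parsed['dtPC'])
```

If the later login requests go through the same session, one option is to copy the value over with s.cookies.set('dtPC', c['dtPC'], domain='www.reliant.com'). Whether the site accepts a cookie set this way is an assumption I haven't verified.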

User contributions licensed under: CC BY-SA