Any one has a idea on how to invoke a javascript function from puppeteer which is not inline but in an external .js file. If its inline within the html->head->script tag it works but not if the script tag points to an external .js file
Sample HTML File
<html> <head> <script type="text/javascript"> function inlineFunction() { window.location.replace('https://www.geeksforgeeks.org'); } </script> <script src="script.js" type="text/javascript"> </script> </head> <body> <p>Hello</p> <p>this is an online html</p> <p>Link with tag a <a href="https://www.geeksforgeeks.org" name="arivalink">Href Link</a></p> <p>Link with inline java script - <a href="#" onClick='inlineFunction();'>Inline JS link</a></p><!-- Works --> <p>Link with external JS file w/o tagname - <a href="#" onClick='fileFunction();'>Ext JS Link</a></p><!-- Does not work --> <p>Link with external JS file w/ tagname - <a href="#" onClick='fileFunction();' name="geeksLink">Ext JS Link</a></p><!-- Does not work --> </body> </html>
Sample Javascript file
/*----------------------------------------------------*/ /* External Javascript File */ /*----------------------------------------------------*/ function fileFunction() { window.location.replace('https://www.geeksforgeeks.org'); }
Puppeteer code sample
const puppeteer = require('puppeteer'); async function start() { const browser = await puppeteer.launch({ headless: false }); const page = await browser.newPage(); //Change the path of "url" to your local path for the html file const url = 'file:///Users/sam.gajjar/SG/Projects/headless-chrome/sample.html'; var link = '[name="link"]'; console.log("Main URL Called"); await page.goto(url); console.log("Link via HTML tag A called"); await page.click(link); await page.waitForTimeout(5000) // Wait 5 seconds .then(() => page.goBack()); console.log("Callng inline JS Function"); await page.evaluate(() => inlineFunction()); await page.waitForTimeout(5000) // Wait 5 seconds .then(() => page.goBack()); console.log("Callng extjs file Function"); await page.evaluate(() => fileFunction()); await page.waitForTimeout(5000) // Wait 5 seconds .then(() => page.goBack()); // console.log("Callng extjs file Function w/tag name"); // const element = await page.$$('[a href="#"]'); // await page.waitForTimeout(5000) // .then(() => page.goBack()); } start();
Advertisement
Answer
First of all, [name="link"]
should be [name="arivalink"]
to match your DOM. I assume that’s a typo.
As another aside, I recommend using the Promise.all
navigation pattern instead of waitForTimeout
which can cause race conditions (although this doesn’t appear to be related to the problem in this case).
As for the main issue, the external file is working just fine, so that’s a red herring. You can prove that by calling page.evaluate(() => fileFunction())
right after navigating to sample.html
.
The real problem is that when you navigate with window.location.replace('https://www.geeksforgeeks.org');
, Chromium isn’t pushing that action onto the history stack. It’s replacing the current URL, so page.goBack()
goes back to the original about:blank
rather than sample.html
as you expect. about:blank
doesn’t have fileFunction
in it, so Puppeteer throws.
Now, when you click [name="link"]
with Puppeteer, that does push the history stack, so goBack
works just fine.
You can reproduce this behavior by loading sample.html
in a browser and navigating it by hand without Puppeteer.
Long story short, if you’re calling a function in browser context using evaluate
that runs window.location.replace
, you can’t rely on page.goBack
. You’ll need to use page.goto
to get back to sample.html
.
There’s an interesting nuance: if you use page.click
to invoke JS that runs location.replace("...")
, Puppeteer will push the history stack and page.goBack
will behave as expected. If you invoke the same JS logic with page.evaluate(() => location.replace("..."));
, Puppeteer won’t push the current URL to the history stack and page.goBack
won’t work as you expect. The evaluate
behavior better aligns with “manual” browsing (i.e. as a human with a mouse and keyboard on a GUI).
Here’s code to demonstrate all of this. Everything goes in the same directory and node index.js
runs Puppeteer (I used Puppeteer 9.0.0).
script.js
const replaceLocation = () => location.replace("https://www.example.com"); const setLocation = () => location = "https://www.example.com";
sample.html
<!DOCTYPE html> <html lang="en"> <head> <title>sample</title> </head> <body> <div> <a href="https://www.example.com">normal link</a> | <a href="#" onclick="replaceLocation()">location.replace()</a> | <a href="#" onclick="setLocation()">location = ...</a> </div> <script src="script.js"></script> </body> </html>
index.js
const puppeteer = require("puppeteer"); const url = "file:///Users/sam.gajjar/SG/Projects/headless-chrome/sample.html"; const log = (() => { let logId = 0; return (...args) => console.log(logId++, ...args); })(); let browser; (async () => { browser = await puppeteer.launch({ headless: false, slowMo: 500, }); const [page] = await browser.pages(); await page.goto(url); // display the starting location log(page.url()); // 0 sample.html // click the normal link and pop the browser stack with goBack await Promise.all([ page.waitForNavigation(), page.click("a:nth-child(1)"), ]); log(page.url()); // 1 example.com await page.goBack(); log(page.url()); // 2 sample.html // fire location.replace with click await Promise.all([ page.waitForNavigation(), page.click("a:nth-child(2)"), // pushes history (!) ]); log(page.url()); // 3 example.com await page.goBack(); log(page.url()); // 4 sample.html // fire location.replace with evaluate await Promise.all([ page.waitForNavigation(), page.evaluate(() => replaceLocation()), // doesn't push history ]); log(page.url()); // 5 example.com await page.goBack(); log(page.url()); // 6 about:blank <--- here's your bug! await page.goto(url); // go to sample.html from about:blank <-- here's the fix log(page.url()); // 7 sample.html // use location = and see that goBack takes us to sample.html await Promise.all([ page.waitForNavigation(), page.evaluate(() => setLocation()), // same behavior as page.click ]); log(page.url()); // 8 example.com await page.goBack(); log(page.url()); // 9 sample.html })() .catch(err => console.error(err)) .finally(async () => await browser.close()) ;