Select text within an 'embed' for a PDF document

I am attempting to scrape information from a website that apparently uses an 'embed' to display a PDF window. The code is very simple, and I've found a method for sending the information.

If I press 'Ctrl + A' in the PDF window, it selects everything, and I can then send the information as needed through a message.

My problem is that I need a way to select everything and set the range automatically rather than manually.

The HTML is pretty simple:

<html>
<head>
</head>
<body class="pdf">
<embed name="0111111" style="position:absolute; left: 0; top: 0;" width="100%" height="100%" src="about:blank" type="application/pdf" internalid="0111111">
</body>
</html>

That really is all of it. The content is apparently inside the embed. As I said, I can press 'Ctrl + A' to highlight everything and get the info sent as needed, but I can't figure out how to select the text inside the embed automatically.

My current code is:

// Ask the PDF viewer for the currently selected text
document.querySelector('embed').postMessage({type: 'getSelectedText'}, '*');

// Listen for the viewer's reply and log the selected text
window.addEventListener("message", (event) => {
  console.log(event.data.selectedText);
}, false);

Any ideas?


Answer

document.querySelector('embed').postMessage({type: 'selectAll'});

This works perfectly. Put the following into the content script:

// Log the selected text when the PDF viewer replies
window.addEventListener("message", (event) => {
  console.log(event.data.selectedText);
}, false);

// Inject a script so the postMessage calls run in the page's own context:
// select everything in the PDF first, then request the selection
const script = document.createElement('script');

script.textContent = `(${() => {
  document.querySelector('embed').postMessage({type: 'selectAll'});
  document.querySelector('embed').postMessage({type: 'getSelectedText'}, '*');
}})()`;
document.documentElement.appendChild(script);
script.remove();
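
One thing to watch for: the listener above logs `event.data.selectedText` for every message the window receives, so unrelated messages will print `undefined` (or throw if `event.data` is null). A minimal sketch of a guard follows. It assumes only that the reply carries a string `selectedText` field, as in the code above. The helper name `readSelectedText` is hypothetical and not part of any browser API.

```javascript
// Hypothetical helper: returns the selected text carried by a message event,
// or null when the message does not look like a selected-text reply.
function readSelectedText(event) {
  const data = event && event.data;
  if (data === null || typeof data !== 'object') {
    return null; // missing payload, or a primitive such as a string
  }
  return typeof data.selectedText === 'string' ? data.selectedText : null;
}

// Register the listener only where a DOM exists (e.g. in the content script).
if (typeof window !== 'undefined') {
  window.addEventListener('message', (event) => {
    const text = readSelectedText(event);
    if (text !== null) {
      console.log(text);
    }
  }, false);
}
```

This would replace the plain `console.log` listener. An empty selection still comes through as an empty string, so you can tell "nothing selected" apart from "not a reply".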