Extracting a portion of a webpage?


What I have in mind is the following:

  • Load web page

  • Find two divs with specific class names

  • Extract everything in between, excluding the last div

I’m asking to extract everything in between because the most important div I need has no class name assigned.

EDIT: Here’s a generic example of what the page looks like:

<div class="text1">
    <p><b>Text 1.1</b><br>
    <b>Text 1.2</b></p>
</div>
<div>
    <p>Text without class which I also need.</p>
</div>
<div class="enddiv">
    [content of enddiv]
</div>

I need everything in between the divs text1 and enddiv, but not the contents of enddiv.

Answer

Welcome, DeBedenHasen. If I understood you correctly, you can do something like this:

// Select .text1 and every sibling element that follows it
const elements = document.querySelectorAll('.text1, .text1 ~ *')

let string = '' // extracted text accumulates here

// Append each element's text, stopping once .enddiv is reached
for (const e of elements) {
  // classList.contains also matches when the element carries other classes
  if (e.classList.contains('enddiv')) break
  string += e.textContent
}

// Print the content (this replaces the page body with the extracted text)
document.body.innerHTML = string

Here you can check your example: https://jsfiddle.net/s4mv5c1b/
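
If you also need the markup rather than only the text, one variation is to walk the siblings with nextElementSibling and collect each element's outerHTML. This is only a sketch: collectUntil is a made-up helper name, and it assumes .text1 and .enddiv are siblings under the same parent, as in your example.

```javascript
// Hypothetical helper: returns the outerHTML of `start` and of every
// following sibling element, stopping before the first sibling that
// has the class `endClass`.
function collectUntil(start, endClass) {
  const parts = []
  for (let el = start; el; el = el.nextElementSibling) {
    if (el.classList.contains(endClass)) break
    parts.push(el.outerHTML)
  }
  return parts
}

// Browser usage, assuming the page structure from the question
if (typeof document !== 'undefined') {
  const start = document.querySelector('.text1')
  if (start) console.log(collectUntil(start, 'enddiv').join('\n'))
}
```

If no element with the end class follows, the loop simply runs to the last sibling. The same function should also work on a document you parsed yourself, for example with DOMParser, since it only relies on standard element properties.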

Hope this helps 🙂



Source: stackoverflow