Using jQuery to gather all text nodes from a wrapped set, separated by spaces

I’m looking for a way to gather all of the text in a jQuery wrapped set, but I need to create spaces between sibling nodes that have no text nodes between them.

For example, consider this HTML:

<div>
  <ul>
    <li>List item #1.</li><li>List item #2.</li><li>List item #3.</li>
  </ul>
</div>

If I simply use jQuery’s text() method to gather the text content of the <div>, like so:

var $div = $('div'), text = $div.text().trim();

alert(text);

that produces the following text:

List item #1.List item #2.List item #3.

because there is no whitespace between each <li> element. What I’m actually looking for is this (note the single space between each sentence):

List item #1. List item #2. List item #3.

This suggests to me that I need to traverse the DOM nodes in the wrapped set, appending the text for each to a string, followed by a space. I tried the following code:

var $div = $('div'), text = '';

$div.find('*').each(function() {
  text += $(this).text().trim() + ' ';
});

alert(text);

but this produced the following text:

List item #1.List item #2.List item #3. List item #1. List item #2. List item #3.

I assume this is because I’m iterating through every descendant of <div> and appending the text, so I’m getting the text nodes within both <ul> and each of its <li> children, leading to duplicated text.

I think I could probably find/write a plain JavaScript function to recursively walk the DOM of the wrapped set, gathering and appending text nodes – but is there a simpler way to do this using jQuery? Cross-browser consistency is very important.

Thanks for any help!

Answer

jQuery deals mostly with elements; its text-node support is relatively weak. You can get a list of all child nodes with contents(), but you’d still have to walk it checking node types, so that’s really no different from just using the plain DOM childNodes. There is no method to recursively get text nodes, so you would have to write something yourself, e.g. something like:

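// Recursively append every text node under element to the texts array,
// in document order.
function collectTextNodes(element, texts) {
    for (var child = element.firstChild; child !== null; child = child.nextSibling) {
        if (child.nodeType === 3)        // 3 = text node
            texts.push(child);
        else if (child.nodeType === 1)   // 1 = element node
            collectTextNodes(child, texts);
    }
}

// Return the data of all text nodes under element, joined by single spaces.
function getTextWithSpaces(element) {
    var texts = [];
    collectTextNodes(element, texts);
    // Replace each text node with its string content.
    for (var i = 0; i < texts.length; i++)
        texts[i] = texts[i].data;
    return texts.join(' ');
}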
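
Note that getTextWithSpaces joins every text node, including the whitespace-only ones that come from the indentation between tags, so for the HTML in the question the result would carry extra whitespace around and between the sentences. A possible variant, sketched below under the assumption that trimming each text node is acceptable, skips whitespace-only nodes. The name getTrimmedTextWithSpaces is hypothetical and not part of jQuery or the DOM.

```javascript
// Hypothetical helper (an assumption, not a jQuery or DOM API): walks the
// subtree like collectTextNodes, but trims each text node and skips the
// ones that contain only whitespace.
function getTrimmedTextWithSpaces(element) {
    var pieces = [];

    function walk(node) {
        for (var child = node.firstChild; child; child = child.nextSibling) {
            if (child.nodeType === 3) {            // 3 = text node
                // Regex trim, for browsers without String.prototype.trim.
                var s = child.data.replace(/^\s+|\s+$/g, '');
                if (s !== '')
                    pieces.push(s);
            } else if (child.nodeType === 1) {     // 1 = element node
                walk(child);
            }
        }
    }

    walk(element);
    return pieces.join(' ');
}
```

In a browser it could be called on the question’s wrapped set as getTrimmedTextWithSpaces($('div')[0]), since indexing a jQuery object yields the underlying DOM element. Like the original, it puts a space between every pair of text nodes, so text split by inline elements in the middle of a word would also gain spaces.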