I'm new to JavaScript and would like to use the Cheerio library to do some web scraping. I came across the text below in the introduction to the library, and I'm not sure what the difference is between a selector, a context and a root.
Extract from documentation:
Cheerio’s selector implementation is nearly identical to jQuery’s, so the API is very similar.
$( selector, [context], [root] )
selector searches within the context scope which searches within the root scope. selector and context can be a string expression, DOM Element, array of DOM elements, or cheerio object. root is typically the HTML document string.
This selector method is the starting point for traversing and manipulating the document. Like jQuery, it’s the primary method for selecting elements in the document, but unlike jQuery it’s built on top of the CSSSelect library, which implements most of the Sizzle selectors.
Example API:
<ul id="fruits">
  <li class="apple">Apple</li>
  <li class="orange">Orange</li>
  <li class="pear">Pear</li>
</ul>
$('.apple', '#fruits').text() //=> Apple
$('ul .pear').attr('class') //=> pear
$('li[class=orange]').html() //=> Orange
In the first example, .apple is the selector, and #fruits is the context. That makes sense. In the second example, is ul the selector and .pear the context? If the selector is meant to search within the context, that’s strange given that .pear is nested in ul?
Answer
jQuery, and by extension Cheerio, uses something called "context", and it does have a special meaning.
The context is where jQuery will search for the given selector, so in plain JS the equivalent would be
document.querySelector('#fruits');
where document is the context, and #fruits is the selector.
The default context in Cheerio is always the loaded document (the root), unless another context is specifically given in the format
$(selector, context)
A context is only in play when two separate arguments are passed, separated by a comma, so something like this would still use the document as context
$('#fruits, .apple')
and it would search for both elements, not one inside the other, because it's just one string that happens to contain a comma (an ordinary CSS selector list), so it's not the same thing.
The first of your examples is the only one with a special context; the other two have the document as context and are regular CSS selectors.
$('.apple', '#fruits')
^ This has context, and would be the exact same as $('#fruits').find('.apple')
$('ul .pear')
^ This does not have a special context. The whole string is the selector, and the space is the CSS descendant combinator, so it just selects all .pear elements inside a UL
$('li[class=orange]')
^ This does not have a special context either. It selects all LI elements with a class attribute that exactly matches orange, i.e. has no other classes