I have an HTML file with content like this:
<ul> <li class="class_1">111</li> <li class="class_2"> <ul> <li class="class_3">222</li> <li class="class_4">333</li> </ul> </li> <li class="class_5">444</li> </ul>
After loading the HTML content with the cheerio module and searching for the immediate li children, I also get all the items from the child ul, like this:
this._$$ = cheerio.load(<htmlContent>, { xmlMode: true });
const liElements = this._$$(`ul > *`);
When I print liElements after converting it to HTML, I get output like this:
<li class="class_1">111</li> <li class="class_2"> <ol> <li class="class_3">222</li> <li class="class_4">333</li> </ol> </li> <li class="class_5">444</li> <li class="class_3">222</li> <li class="class_4">333</li>
You can see that the content from the child ul is repeated here. I tried a lot of options from the cheerio documentation, but no luck. Can anyone help me get only the immediate li children of the ul?
Many thanks in advance.
Answer
The issue is that the selector ul > * is too generic: it matches the direct children of every ul in the document, including the ul nested inside an li.
There are two ways to fix this:
1) Put a class name on the top-level ul:
<ul class="main-ul"> <li class="class_1">111</li> <li class="class_2"> <ul> <li class="class_3">222</li> <li class="class_4">333</li> </ul> </li> <li class="class_5">444</li> </ul>
The selector then becomes:
const liElements = this._$$(`.main-ul > li`);
2) Select the children of any ul nested inside an li, and remove them from the list of all children:
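// cheerio's filter callback receives (index, element), and a cheerio object has no .some(), so convert to an array first
const liWithLiParent = this._$$(`li > ul > *`).toArray();
const liElements = this._$$(`ul > *`).filter((index, li) => !liWithLiParent.includes(li));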