I want to extract only the path values from the text/HTML below. The real file contains 10k lines, so collecting all the path values manually would be very difficult. Is it possible to get only the path values with a regex, with Excel, or in some other way?
I want to grab just the path value from each href attribute.
<table> <tbody> <tr> <th>account</th> <th>size</th> <th>nodes</th> <th>props</th> <th></th> </tr> <tr> <td><a href=" /reports/?path=/root/en/products-services/course-products">course-products</a></td> <td class="number">955MB</td> <td class="number">80607</td> <td class="number">549393</td> <td width="100%"> <table style="border: none;" width="100%"> <tbody> <tr> <td style="border-width:1;width:58%" class="bar"></td> <td style="border: none; width:42%"><b>58%</b></td> </tr> </tbody> </table> </td> </tr> <tr> <td><a href="/reports/?path=/root/products-services/silverthorn-7e-info">silverthorn-7e-info</a></td> <td class="number">83.5MB</td> <td class="number">149</td> <td class="number">778</td> <td width="100%"> <table style="border: none;" width="100%"> <tbody> <tr> <td style="border-width:1;width:5%" class="bar"></td> <td style="border: none; width:95%"><b>5%</b></td> </tr> </tbody> </table> </td> </tr> <tr> <td><a href="/reports/?path =/root/products-services/sanders-2e-info">sanders-2e-info</a></td> <td class="number">45.5MB</td> <td class="number">9609</td> <td class="number">67184</td> <td width="100%"> <table style="border: none;" width="100%"> <tbody> <tr> <td style="border-width:1;width:3%" class="bar"></td> <td style="border: none; width:97%"><b>3%</b></td> </tr> </tbody> </table> </td> </tr> <tr> <td><a href="/reports/?path=/root/products-services/davidson-10e-info">davidson-10e-info</a></td> <td class="number">39MB</td> <td class="number">53</td> <td class="number">288</td> <td width="100%"> <table style="border: none;" width="100%"> <tbody> <tr> <td style="border-width:1;width:2%" class="bar"></td> <td style="border: none; width:98%"><b>2%</b></td> </tr> </tbody> </table> </td> </tr> <tr>
Answer
In JavaScript, with jQuery's .each, you can do something like this:
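$("tr a[href]").each(function () {
  // Keep only the part after "path=" (allowing stray spaces around "=")
  const match = $(this).attr("href").match(/path\s*=\s*(\S+)/);
  if (match) console.log(match[1]);
});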
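If jQuery is not available, or the HTML is only a saved file rather than a loaded page, a plain regex pass over the text may be enough. The sketch below assumes every link has the form href="...?path=..." with double quotes, as in the sample. `extractPaths` is a hypothetical helper name, not part of any library. It tolerates the stray spaces that appear around `path` and `=` in the sample markup.

```javascript
// Hypothetical helper: collect the value of the "path" query parameter
// from every double-quoted href attribute in an HTML string.
function extractPaths(html) {
  const hrefPattern = /href\s*=\s*"([^"]*)"/g;
  const paths = [];
  for (const match of html.matchAll(hrefPattern)) {
    // match[1] is the raw href value, e.g. "/reports/?path=/root/..."
    const pathMatch = match[1].match(/[?&]\s*path\s*=\s*([^&#\s]+)/);
    if (pathMatch) {
      paths.push(pathMatch[1]);
    }
  }
  return paths;
}

// Small sample taken from the markup in the question.
const sample =
  '<td><a href=" /reports/?path=/root/en/products-services/course-products">course-products</a></td>' +
  '<td><a href="/reports/?path =/root/products-services/sanders-2e-info">sanders-2e-info</a></td>';

console.log(extractPaths(sample).join("\n"));
```

In a browser console the same function could be called on `document.documentElement.outerHTML`. In Node.js the file contents could be read with `fs.readFileSync` and passed in as a string. Regex-based extraction is fragile on arbitrary HTML, so this is only a reasonable fit while the markup stays as regular as the sample.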