Get the hierarchy of an XML element with XPath

I am trying to get the ordered hierarchy of a given element in an "application/xml" response.data document that I parse with a DOM parser in JavaScript. The expression should return the list ['Grand Parent', 'Parent', 'Target'] for each A tag that has no A children. The result would therefore be a list of lists, where the last element of each inner list is the deepest (in terms of graph depth) value of <A-title>. Thanks to @Jack Fleeting I know I can get the targets using the XPath expression below:

xpath = '//*[local-name()="A"][not(.//*[local-name()="A"])]/*[local-name()="A-title"]'

but I am not sure how to adapt it to get the hierarchy list.

<A>
<A-title>Grand Parent</A-title>
   <A>
   <A-title>Parent</A-title>
      <A>
      <A-title>Target</A-title>
      </A>
   </A>
</A>

EDIT :

axios.get('WMS_URL').then((r) => {
      const parser = new DOMParser()
      const dom = parser.parseFromString(r.data, 'application/xml')
       let xpath = '//*[local-name()="A"][not(.//*[local-name()="A"])]/*[local-name()="A-title"]'
       let xpath2 = 'ancestor-or-self::A/A-title'
       var targets = dom.evaluate(xpath, dom, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null)
       var targets2 = dom.evaluate(xpath2, targets, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null)
       Array.from({ length: targets2.snapshotLength }, (_, index) => layerNames.push(targets2.snapshotItem(index).innerHTML))


Answer

The XPath //A[not(A)]/ancestor-or-self::A/A-title works in three steps: //A[not(A)] selects all A elements that have no A children, ancestor-or-self::A navigates from each of them to itself and all of its A ancestors, and A-title selects the A-title children of those elements. In XPath 1.0, however, a single expression cannot construct a list of lists of strings (or elements); it returns one flat node set. So you first need to select //A[not(A)] and then, from each selected element, evaluate ancestor-or-self::A/A-title.

Using XPath 3.1, for instance with Saxon JS 2 (https://www.saxonica.com/saxon-js/index.xml, https://www.saxonica.com/saxon-js/documentation/index.html), you could construct a sequence of arrays of strings directly, e.g.

//A[not(A)] ! array { ancestor-or-self::A/A-title/data() }

The JavaScript code to evaluate the XPath would be e.g.

let result = SaxonJS.XPath.evaluate('parse-xml($xml)//A[not(A)] ! array { ancestor-or-self::A/A-title/data() }', [], { params : { 'xml' : r.data }})

With DOM Level 3 XPath 1.0 I think you need a lot more lines of code:

let xmlDoc = new DOMParser().parseFromString(r.data, 'application/xml');

let leafAElements = xmlDoc.evaluate('//A[not(A)]', xmlDoc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);

let result = [];

for (let i = 0; i < leafAElements.snapshotLength; i++) { 
  let titleEls = xmlDoc.evaluate('ancestor-or-self::A/A-title', leafAElements.snapshotItem(i), null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
  let titles = []; 
  for (let j = 0; j < titleEls.snapshotLength; j++) { 
    titles.push(titleEls.snapshotItem(j).textContent); 
  } 
  result.push(titles); 
}
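
If the elements in the real response are in a namespace (the local-name() tests in the question suggest they might be), //A will not match them. As a hedged alternative, the same list of lists can be built without XPath by walking the tree and comparing localName. The sketch below is only an illustration: titlePath and leafElements are hypothetical helper names, it assumes each node exposes parentNode, localName, children and textContent as DOM elements do, and it treats an A as a leaf when it has no A descendants (the question's condition) rather than no A children.

```javascript
// Sketch: collect the A-title texts from the outermost A ancestor down to `leaf`.
// Assumes DOM-like nodes exposing parentNode, localName, children and textContent.
function titlePath(leaf, elementName = 'A', titleName = 'A-title') {
  const titles = [];
  for (let node = leaf; node; node = node.parentNode) {
    if (node.localName !== elementName) continue;
    // Look at direct children only, so titles of nested A elements are not picked up.
    const title = Array.from(node.children || []).find((c) => c.localName === titleName);
    // Walking upwards, so prepend to keep the outermost title first.
    if (title) titles.unshift(title.textContent.trim());
  }
  return titles;
}

// Sketch: find every `elementName` element without a nested `elementName` descendant.
function leafElements(root, elementName = 'A') {
  const leaves = [];
  // Returns true when `node` or any of its descendants is an `elementName` element.
  function visit(node) {
    let nestedFound = false;
    for (const child of Array.from(node.children || [])) {
      if (visit(child)) nestedFound = true;
    }
    const isTarget = node.localName === elementName;
    if (isTarget && !nestedFound) leaves.push(node);
    return isTarget || nestedFound;
  }
  visit(root);
  return leaves;
}
```

With the parsed document from above, leafElements(xmlDoc).map((leaf) => titlePath(leaf)) would give one array of titles per leaf A element.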
User contributions licensed under: CC BY-SA