At work, while working on a feature, we had to get a specific node from an XML document. The obvious choice was to use XPath.
Problem Statement
Ignoring the specifics of the XML, we wrote the XPath the way we normally would and found that no node was returned when we evaluated it.
I had to look up how to extract a deeply nested node with XPath in such a case.
Assessment
We observed that the elements are bound to namespaces. Each xmlns="..." declaration sets a default namespace that also applies to the descendants of the element declaring it. The direct, unprefixed XPath was addressing elements that are in no namespace at all, and thus nothing was retrieved.
Sample XML to work with:
<?xml version="1.0"?>
<SampleXmlRoot xmlns="http://www.samplesite.com/example/SampleXmlRoot">
  <SampleXmlSubRoot someValue="21" xmlns="http://www.samplesite.com/example/SampleXmlSubRoot">
    <SampleXmlLevel2Node id="101" xmlns="http://www.samplesite.com/example/SampleXmlLevel2Node">
      <SampleXmlNode2Extract num="2">
        <A1 type="money">
          <value>1234.0</value>
        </A1>
        <A2 type="money">
          <value>123.4</value>
        </A2>
        <A3 type="money">
          <value>12.3</value>
        </A3>
      </SampleXmlNode2Extract>
      <SampleXmlNodeOther num="2">
        <B1 type="money">
          <value>234.0</value>
        </B1>
        <B2 type="money">
          <value>23.4</value>
        </B2>
      </SampleXmlNodeOther>
    </SampleXmlLevel2Node>
  </SampleXmlSubRoot>
</SampleXmlRoot>
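To see the failure concretely, here is a minimal sketch that runs the unprefixed XPath against a trimmed copy of the sample XML. The class name UnprefixedXPathSketch and the method FindWithoutPrefixes are made up for illustration, and the XML keeps only one child of the target node to stay short.

```csharp
using System;
using System.Xml;

public static class UnprefixedXPathSketch
{
    // Trimmed copy of the sample XML (assumption: one child is enough to show the effect)
    public const string Xml = @"<?xml version=""1.0""?>
<SampleXmlRoot xmlns=""http://www.samplesite.com/example/SampleXmlRoot"">
  <SampleXmlSubRoot someValue=""21"" xmlns=""http://www.samplesite.com/example/SampleXmlSubRoot"">
    <SampleXmlLevel2Node id=""101"" xmlns=""http://www.samplesite.com/example/SampleXmlLevel2Node"">
      <SampleXmlNode2Extract num=""2"">
        <A1 type=""money""><value>1234.0</value></A1>
      </SampleXmlNode2Extract>
    </SampleXmlLevel2Node>
  </SampleXmlSubRoot>
</SampleXmlRoot>";

    // Hypothetical helper: evaluates the XPath without any namespace prefixes
    public static XmlNode FindWithoutPrefixes()
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(Xml);

        // Unprefixed names are looked up in "no namespace", so no step matches
        return xmlDoc.SelectSingleNode(
            "/SampleXmlRoot/SampleXmlSubRoot/SampleXmlLevel2Node/SampleXmlNode2Extract");
    }

    public static void Main()
    {
        XmlNode node = FindWithoutPrefixes();
        Console.WriteLine(node == null ? "no match" : node.Name); // prints "no match"
    }
}
```

SelectSingleNode does not throw here; it simply returns null, which is why the problem is easy to miss.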
Based on what we observed and a quick look at the reference, we learned that we need to register a prefix for each namespace and then use those prefixes in the XPath.
Using an XmlNamespaceManager object, we can provide a collection of prefix-to-namespace mappings that the .NET XPath engine uses to resolve the prefixes in the expression. The prefixes are our own choice; only the namespace URIs have to match the ones in the XML document.
Resolution
Step 1: Define XmlNamespaceManager
// xmlDoc is XmlDocument in which full XML is loaded
XmlNamespaceManager xmlnsManager = new XmlNamespaceManager(xmlDoc.NameTable);
// Add the namespaces used in XML to the XmlNamespaceManager
xmlnsManager.AddNamespace("sxr", "http://www.samplesite.com/example/SampleXmlRoot");
xmlnsManager.AddNamespace("sxsr", "http://www.samplesite.com/example/SampleXmlSubRoot");
xmlnsManager.AddNamespace("sxl2", "http://www.samplesite.com/example/SampleXmlLevel2Node");
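Step 2: Use the select method overload that takes an XmlNamespaceManager along with the XPath
In our case, we select the first XmlNode that matches the XPath expression; any prefixes found in the expression are resolved using the supplied XmlNamespaceManager. Note that SampleXmlNode2Extract declares no xmlns of its own, so it inherits the default namespace of its parent SampleXmlLevel2Node and needs the sxl2 prefix as well.
// Every step needs a prefix, including the last one (it inherits its parent's default namespace)
string xPath = "/sxr:SampleXmlRoot/sxsr:SampleXmlSubRoot/sxl2:SampleXmlLevel2Node/sxl2:SampleXmlNode2Extract";
XmlNode extractedNode = xmlDoc.SelectSingleNode(xPath, xmlnsManager);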
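The OuterXml of the extracted XmlNode will look something like this (the serialized root element also carries an xmlns attribute for the namespace it inherited):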
<SampleXmlNode2Extract num="2">
<A1 type="money">
<value>1234.0</value>
</A1>
<A2 type="money">
<value>123.4</value>
</A2>
<A3 type="money">
<value>12.3</value>
</A3>
</SampleXmlNode2Extract>
Yep, that’s the exact one we needed!
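Putting both steps together, the following is a minimal, self-contained sketch rather than our production code. The class name NamespaceXPathSketch and the method Extract are invented for illustration, and the XML is a trimmed copy of the sample. The last step of the path carries the sxl2 prefix because SampleXmlNode2Extract inherits the default namespace declared on its parent.

```csharp
using System;
using System.Xml;

public static class NamespaceXPathSketch
{
    // Trimmed copy of the sample XML (assumption: one child is enough for the demo)
    public const string Xml = @"<?xml version=""1.0""?>
<SampleXmlRoot xmlns=""http://www.samplesite.com/example/SampleXmlRoot"">
  <SampleXmlSubRoot someValue=""21"" xmlns=""http://www.samplesite.com/example/SampleXmlSubRoot"">
    <SampleXmlLevel2Node id=""101"" xmlns=""http://www.samplesite.com/example/SampleXmlLevel2Node"">
      <SampleXmlNode2Extract num=""2"">
        <A1 type=""money""><value>1234.0</value></A1>
      </SampleXmlNode2Extract>
    </SampleXmlLevel2Node>
  </SampleXmlSubRoot>
</SampleXmlRoot>";

    // Hypothetical helper: registers the prefixes and runs the prefixed XPath
    public static XmlNode Extract()
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(Xml);

        // Step 1: map a prefix of our choice to each namespace URI used in the XML
        var xmlnsManager = new XmlNamespaceManager(xmlDoc.NameTable);
        xmlnsManager.AddNamespace("sxr", "http://www.samplesite.com/example/SampleXmlRoot");
        xmlnsManager.AddNamespace("sxsr", "http://www.samplesite.com/example/SampleXmlSubRoot");
        xmlnsManager.AddNamespace("sxl2", "http://www.samplesite.com/example/SampleXmlLevel2Node");

        // Step 2: prefix every step, including the element that only inherits its namespace
        string xPath = "/sxr:SampleXmlRoot/sxsr:SampleXmlSubRoot"
                     + "/sxl2:SampleXmlLevel2Node/sxl2:SampleXmlNode2Extract";
        return xmlDoc.SelectSingleNode(xPath, xmlnsManager);
    }

    public static void Main()
    {
        XmlNode node = Extract();
        Console.WriteLine(node.LocalName);    // prints "SampleXmlNode2Extract"
        Console.WriteLine(node.NamespaceURI); // prints "http://www.samplesite.com/example/SampleXmlLevel2Node"
    }
}
```

Printing NamespaceURI is a quick way to check which prefix an element needs when a query unexpectedly comes back empty.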
Refer:
MSDN: XmlNamespaceManager Class
MSDN: XmlNode.SelectSingleNode Method (String, XmlNamespaceManager)
Conclusion
XPath is a convenient way to extract node(s), and .NET already provides methods such as SelectSingleNode to get them. We just need to know about them. XmlNamespaceManager supplies the prefix-to-namespace mappings that let us write an XPath against XML whose elements are bound to namespaces.
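When more than one node is needed, the same namespace manager can be passed to SelectNodes. The sketch below is only an illustration under a few assumptions: the class name SelectNodesSketch and the method ReadValues are invented, the XML is a trimmed copy of the sample, and only the sxl2 prefix is registered because the path uses no other namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class SelectNodesSketch
{
    // Trimmed copy of the sample XML (assumption: two children are enough for the demo)
    public const string Xml = @"<?xml version=""1.0""?>
<SampleXmlRoot xmlns=""http://www.samplesite.com/example/SampleXmlRoot"">
  <SampleXmlSubRoot someValue=""21"" xmlns=""http://www.samplesite.com/example/SampleXmlSubRoot"">
    <SampleXmlLevel2Node id=""101"" xmlns=""http://www.samplesite.com/example/SampleXmlLevel2Node"">
      <SampleXmlNode2Extract num=""2"">
        <A1 type=""money""><value>1234.0</value></A1>
        <A2 type=""money""><value>123.4</value></A2>
      </SampleXmlNode2Extract>
    </SampleXmlLevel2Node>
  </SampleXmlSubRoot>
</SampleXmlRoot>";

    // Hypothetical helper: collects the text of every <value> under the target node
    public static List<string> ReadValues()
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(Xml);

        var xmlnsManager = new XmlNamespaceManager(xmlDoc.NameTable);
        xmlnsManager.AddNamespace("sxl2", "http://www.samplesite.com/example/SampleXmlLevel2Node");

        // '//' searches the whole document and '*' matches an element of any name,
        // but every named step still needs its prefix
        XmlNodeList nodes = xmlDoc.SelectNodes(
            "//sxl2:SampleXmlNode2Extract/*/sxl2:value", xmlnsManager);

        var values = new List<string>();
        foreach (XmlNode valueNode in nodes)
        {
            values.Add(valueNode.InnerText);
        }
        return values;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ReadValues())); // prints "1234.0, 123.4"
    }
}
```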