Extract node from XML having defined namespace using XPath in C#

At work, while building a feature, we had to extract a specific node from an XML document. XPath was the obvious choice.

Problem Statement

Ignoring the specifics of the XML, we wrote the XPath the way we expected it to work and found that it matched no node when executed.
I had to look up how to extract a deeply nested node with XPath in such a case.

Assessment

We observed that the elements are bound to namespaces. An unprefixed name in an XPath expression addresses elements that are in no namespace at all, so our direct XPath could not match the namespaced elements and nothing was retrieved.
Sample XML to work with:

<?xml version="1.0"?>
<SampleXmlRoot xmlns="http://www.samplesite.com/example/SampleXmlRoot">
  <SampleXmlSubRoot someValue="21" xmlns="http://www.samplesite.com/example/SampleXmlSubRoot">
    <SampleXmlLevel2Node id="101" xmlns="http://www.samplesite.com/example/SampleXmlLevel2Node">
      <SampleXmlNode2Extract num="2">
        <A1 type="money">
          <value>1234.0</value>
        </A1>
        <A2 type="money">
          <value>123.4</value>
        </A2>
        <A3 type="money">
          <value>12.3</value>
        </A3>
      </SampleXmlNode2Extract>
      <SampleXmlNodeOther num="2">
        <B1 type="money">
          <value>234.0</value>
        </B1>
        <B2 type="money">
          <value>23.4</value>
        </B2>
      </SampleXmlNodeOther>
    </SampleXmlLevel2Node>
  </SampleXmlSubRoot>
</SampleXmlRoot>
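
To make the failure concrete, here is a minimal sketch of the unprefixed attempt. It assumes a condensed copy of the sample XML (same namespaces, fewer child elements); the class and method names are made up for illustration.

```csharp
using System;
using System.Xml;

public static class UnprefixedXPathDemo
{
    // Condensed copy of the sample XML: same namespaces, fewer children (assumption for brevity).
    private const string Xml =
        "<SampleXmlRoot xmlns=\"http://www.samplesite.com/example/SampleXmlRoot\">" +
        "<SampleXmlSubRoot someValue=\"21\" xmlns=\"http://www.samplesite.com/example/SampleXmlSubRoot\">" +
        "<SampleXmlLevel2Node id=\"101\" xmlns=\"http://www.samplesite.com/example/SampleXmlLevel2Node\">" +
        "<SampleXmlNode2Extract num=\"2\"><A1 type=\"money\"><value>1234.0</value></A1></SampleXmlNode2Extract>" +
        "</SampleXmlLevel2Node></SampleXmlSubRoot></SampleXmlRoot>";

    // Returns true when the unprefixed XPath finds the node (it does not).
    public static bool UnprefixedPathFindsNode()
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(Xml);

        // No prefixes: every step looks for an element in no namespace.
        XmlNode node = xmlDoc.SelectSingleNode(
            "/SampleXmlRoot/SampleXmlSubRoot/SampleXmlLevel2Node/SampleXmlNode2Extract");
        return node != null;
    }

    // Returns the namespace the root element is actually bound to.
    public static string RootNamespace()
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(Xml);
        return xmlDoc.DocumentElement.NamespaceURI;
    }

    public static void Main()
    {
        Console.WriteLine(UnprefixedPathFindsNode()); // False
        Console.WriteLine(RootNamespace());           // http://www.samplesite.com/example/SampleXmlRoot
    }
}
```

The root element reports a non-empty NamespaceURI, which is why a path step written as plain SampleXmlRoot never matches it.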


Based on what we observed and a quick reference check, we learned that we need to register each namespace under a prefix and then use those prefixes when writing the XPath.

Using an XmlNamespaceManager object, we can provide a collection of prefix-to-namespace mappings that the XPath engine uses to resolve the prefixed element names in the expression.

Resolution

Step 1: Define XmlNamespaceManager

// xmlDoc is XmlDocument in which full XML is loaded
XmlNamespaceManager xmlnsManager = new XmlNamespaceManager(xmlDoc.NameTable);

// Add the namespaces used in XML to the XmlNamespaceManager
xmlnsManager.AddNamespace("sxr", "http://www.samplesite.com/example/SampleXmlRoot");
xmlnsManager.AddNamespace("sxsr", "http://www.samplesite.com/example/SampleXmlSubRoot");
xmlnsManager.AddNamespace("sxl2", "http://www.samplesite.com/example/SampleXmlLevel2Node");
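
The prefixes are only labels chosen by us; they do not have to appear anywhere in the XML. As a small sketch (class and method names are hypothetical, and a standalone NameTable is used instead of the document's), the registered mappings can be inspected like this:

```csharp
using System;
using System.Xml;

public static class PrefixLookupDemo
{
    // Builds a manager with the same three mappings as Step 1.
    public static XmlNamespaceManager BuildManager()
    {
        var manager = new XmlNamespaceManager(new NameTable());
        manager.AddNamespace("sxr", "http://www.samplesite.com/example/SampleXmlRoot");
        manager.AddNamespace("sxsr", "http://www.samplesite.com/example/SampleXmlSubRoot");
        manager.AddNamespace("sxl2", "http://www.samplesite.com/example/SampleXmlLevel2Node");
        return manager;
    }

    public static void Main()
    {
        XmlNamespaceManager manager = BuildManager();

        // A registered prefix resolves to its namespace URI.
        Console.WriteLine(manager.LookupNamespace("sxl2"));
        // prints http://www.samplesite.com/example/SampleXmlLevel2Node

        // An unregistered prefix resolves to null.
        Console.WriteLine(manager.LookupNamespace("unknown") == null); // True
    }
}
```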


Step 2: Use the select method overload that takes an XmlNamespaceManager along with the XPath
In our case, we select the first XmlNode that matches the XPath expression; any prefixes found in the expression are resolved using the supplied XmlNamespaceManager.

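// SampleXmlNode2Extract declares no xmlns of its own, so it inherits the default
// namespace of SampleXmlLevel2Node and must carry the sxl2 prefix as well
string xPath = "/sxr:SampleXmlRoot/sxsr:SampleXmlSubRoot/sxl2:SampleXmlLevel2Node/sxl2:SampleXmlNode2Extract";
XmlNode extractedNode = xmlDoc.SelectSingleNode(xPath, xmlnsManager);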


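The OuterXml of the extracted XmlNode will look something like this (shown across multiple lines for readability; the actual string also carries the xmlns declaration inherited from SampleXmlLevel2Node on its first element):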

<SampleXmlNode2Extract num="2">
  <A1 type="money">
    <value>1234.0</value>
  </A1>
  <A2 type="money">
    <value>123.4</value>
  </A2>
  <A3 type="money">
    <value>12.3</value>
  </A3>
</SampleXmlNode2Extract>

Yep, that’s the exact one we needed!
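
For completeness, here is a self-contained sketch that puts both steps together. It assumes a condensed copy of the sample XML, and the class and method names are invented for the example. Note that SampleXmlNode2Extract and its children declare no xmlns of their own, so they inherit the default namespace of SampleXmlLevel2Node and are addressed with the sxl2 prefix.

```csharp
using System;
using System.Xml;

public static class ExtractNodeDemo
{
    // Condensed copy of the sample XML: same namespaces, fewer children (assumption for brevity).
    private const string Xml =
        "<SampleXmlRoot xmlns=\"http://www.samplesite.com/example/SampleXmlRoot\">" +
        "<SampleXmlSubRoot someValue=\"21\" xmlns=\"http://www.samplesite.com/example/SampleXmlSubRoot\">" +
        "<SampleXmlLevel2Node id=\"101\" xmlns=\"http://www.samplesite.com/example/SampleXmlLevel2Node\">" +
        "<SampleXmlNode2Extract num=\"2\">" +
        "<A1 type=\"money\"><value>1234.0</value></A1>" +
        "<A2 type=\"money\"><value>123.4</value></A2>" +
        "</SampleXmlNode2Extract>" +
        "</SampleXmlLevel2Node></SampleXmlSubRoot></SampleXmlRoot>";

    // Loads the XML, extracts the target node, and returns the text of A1/value.
    public static string ReadFirstValue()
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(Xml);

        // Step 1: register a prefix for every namespace used in the path.
        var xmlnsManager = new XmlNamespaceManager(xmlDoc.NameTable);
        xmlnsManager.AddNamespace("sxr", "http://www.samplesite.com/example/SampleXmlRoot");
        xmlnsManager.AddNamespace("sxsr", "http://www.samplesite.com/example/SampleXmlSubRoot");
        xmlnsManager.AddNamespace("sxl2", "http://www.samplesite.com/example/SampleXmlLevel2Node");

        // Step 2: select with the overload that accepts the manager.
        XmlNode extractedNode = xmlDoc.SelectSingleNode(
            "/sxr:SampleXmlRoot/sxsr:SampleXmlSubRoot/sxl2:SampleXmlLevel2Node/sxl2:SampleXmlNode2Extract",
            xmlnsManager);
        if (extractedNode == null)
        {
            throw new InvalidOperationException("Node not found");
        }

        // Relative paths evaluated from the extracted node need the prefix too.
        XmlNode value = extractedNode.SelectSingleNode("sxl2:A1/sxl2:value", xmlnsManager);
        return value.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(ReadFirstValue()); // 1234.0
    }
}
```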

Refer:
MSDN: XmlNamespaceManager Class
MSDN: XmlNode.SelectSingleNode Method (String, XmlNamespaceManager)

Conclusion

XPath is a wonderful way to extract nodes, and the framework already provides the methods to do it; we just need to know about them. XmlNamespaceManager maps prefixes to namespaces so that an XPath can address the namespaced elements of a given XML document.