In this article we’ll dive further into the processing of XML files. There are two types of XML we may encounter: #1 well-formed XML, and #2 unknown XML files.
#1 (well-formed XML) documents are XML files that are correct in their syntax and do not require any validation. For this type of file we know where it comes from, and we know its format.
#2 (unknown XML) documents are files whose origin we don’t know, and we can’t be sure they are syntactically correct and well formed.
We won’t worry about the second type, because this article is about reading well-formed files: the kind of files created by an automated process so that programs can communicate and share information. All of that is done using well-formed, well-known XML documents.
When we want to communicate with another program through an XML document we need to be able to write that document and to read that document.
In this type of exchange the XML format used in that document is a well known format.
We already saw how to write our XML document using the XMLGenerator class, which creates a well-formed document. Now we need to read a “received” document.
For reading an XML document a new class was recently added in C7: the XMLParser. The XMLParser reads an XML document and processes it. The document must already be valid, because the XMLParser does not validate it; that is why we call the XMLParser a non-validating XML parser.
Is a non-validating parser bad?
Well no, it is not bad; it is just that, a non-validating parser.
Because we will be reading a known file we don’t need to validate it, and we don’t need to lose any time doing that.
We are not creating a program that can read any random XML file, we only need to read a specific XML document.
For that kind of file we don’t need any validation, we know the format and it is what we are expecting.
The XMLParser included in C7 is very fast and its memory usage is very low. The parser does not store the XML data or build any structure from it; instead it calls the methods of an interface, passing the XML data to each one.
To use the parser, the IXmlNotify interface needs to be implemented.
IXmlNotify is an interface declared in the file QuickXMLParser.INC, the same file where the XMLParser is declared.
The question is then:
Why do I need to implement IXmlNotify to parse my XML file?
The XMLParser does all the work of parsing the file but it does not know what to do with the data that comes from the XML Document.
We need to “tell” the parser what to do with the data. Or in this case the parser will just inform us that it found some data and let us decide what to do with it.
So the parser just reads the data and passes it to us.
How is the data passed to us?
The parser calls the correct methods from the IXmlNotify interface based on the data that it reads.
A quick look into the IXmlNotify interface will show that the methods are very simple.
IXmlNotify      INTERFACE
FoundNode         PROCEDURE( STRING name, STRING attributes )
FoundElement      PROCEDURE( STRING name, STRING value, STRING attributes )
CloseElement      PROCEDURE( STRING name)
StartElement      PROCEDURE( STRING name, STRING value, STRING attributes )
EndElement        PROCEDURE( STRING name, STRING value, STRING attributes )
FoundComment      PROCEDURE( STRING Comment)
FoundHeader       PROCEDURE( STRING attributes)
CloseHeader       PROCEDURE()
FoundAttribute    PROCEDURE( STRING tagname, STRING name, STRING value )
                END
Not that many methods to implement, right?
Implementing IXmlNotify is easy. Just declare a class like this:
And implement the IXmlNotify methods like:
MyClass.IXmlNotify.FoundNode PROCEDURE( STRING name, STRING attributes )
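The two steps above can be sketched roughly as follows. This is only an illustration, not the original listing: the class name MyClass comes from the method prefix above, while the NodeCount property and the counting logic are hypothetical. Keep in mind that a class that implements an interface must provide every method of that interface, so the remaining IXmlNotify methods would need (possibly empty) implementations as well.

```clarion
  INCLUDE('QuickXMLParser.INC'),ONCE   ! declares IXmlNotify and the XMLParser

! Hypothetical class: counts the nodes reported by the parser
MyClass     CLASS,IMPLEMENTS(IXmlNotify)
NodeCount     LONG                     ! hypothetical property, not part of the library
            END

! Called by the parser each time it finds a node
MyClass.IXmlNotify.FoundNode PROCEDURE( STRING name, STRING attributes )
  CODE
  SELF.NodeCount += 1                  ! SELF is the MyClass instance

! Methods for data we are not interested in can stay empty
MyClass.IXmlNotify.FoundComment PROCEDURE( STRING Comment )
  CODE
```

The other interface methods would follow the same pattern as FoundComment, each with an empty CODE section unless you need the data it delivers.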
In each of the methods you implement you just write the code that is needed every time the parser finds some data that you are interested in.
Remember that this is not a general purpose parser so you will know exactly the data that you will be receiving.
After you have your class implemented, just call the parser like:
Note: you can declare your string at run time based on the size of the XML file, or you can receive the stream from a web server, etc.
Most of the time you will have to parse a FILE, and that is why we extended the parser class to support a FILE name.
You still need to tell the Parser what to do with the data that it is parsing, but now instead of implementing the IXmlNotify interface you just derive the VIRTUAL methods needed.
And after that you just call your class like this:
Only one method call and the XML Document was parsed and the data was processed as you indicated.
With the Virtual methods you only derive the methods that are called with the data you are interested in, so you don’t need to derive all the methods.
The following are all the virtual methods:
XmlNotifyFoundNode        PROCEDURE(STRING name, STRING attributes),VIRTUAL
XmlNotifyCloseNode        PROCEDURE(STRING name),VIRTUAL
XmlNotifyFoundElement     PROCEDURE(STRING name, STRING value, STRING attributes),VIRTUAL
XmlNotifyCloseElement     PROCEDURE(STRING name),VIRTUAL
XmlNotifyStartElement     PROCEDURE(STRING name, STRING value, STRING attributes),VIRTUAL
XmlNotifyEndElement       PROCEDURE(STRING name, STRING value, STRING attributes),VIRTUAL
XmlNotifyFoundComment     PROCEDURE(STRING Comment),VIRTUAL
XmlNotifyFoundHeader      PROCEDURE(STRING attributes),VIRTUAL
XmlNotifyCloseHeader      PROCEDURE(),VIRTUAL
XmlNotifyFoundAttribute   PROCEDURE(STRING tagname, STRING name, STRING value),VIRTUAL
As you can see, each one matches a method in IXmlNotify.
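As a rough sketch of deriving only what you need, a class that only cares about elements might look like the following. It assumes the file-aware parser class is XMLFileParser and that it is declared in QuickXMLParser.INC alongside the XMLParser; the names MyFileParser and ElementCount are hypothetical.

```clarion
  INCLUDE('QuickXMLParser.INC'),ONCE   ! assumed to declare XMLFileParser as well

! Hypothetical derived class: overrides only the one notification it needs
MyFileParser  CLASS(XMLFileParser)
ElementCount    LONG                   ! hypothetical property
XmlNotifyFoundElement PROCEDURE(STRING name, STRING value, STRING attributes),DERIVED
              END

! Called for each element found; the other virtual methods are left as inherited
MyFileParser.XmlNotifyFoundElement PROCEDURE(STRING name, STRING value, STRING attributes)
  CODE
  SELF.ElementCount += 1
```

The DERIVED attribute is used here instead of VIRTUAL so that the compiler checks the prototype against the parent class; a mistyped parameter list then becomes a compile error rather than a method that is silently never called.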
To summarize both articles: an XML file is just a formatted text file. We could read the data by hand using a simple ASCII FILE, but then we would need to parse the file ourselves to extract the content; to make that process much easier we use the XMLFileParser class. To create a new XML file, instead of just writing out the text we use the XMLGenerator class. It simplifies creating the tags and keeping track of which tags are open and closed, it ensures that we create a well-formed XML document, and it gives us the flexibility to create any type of XML document that we need in our program.
Attached is an example of how to create a class that derives the methods to load a tree with the parsed document.