New classes: QXmlStreamReader and QXmlStreamWriter

Published Wednesday February 28th, 2007
Posted in Qt

Snapshot users have seen them already: with QXmlStreamReader and QXmlStreamWriter, Qt 4.3 will feature two new classes for reading and writing XML.

QXmlStreamReader is a faster and more convenient replacement for Qt’s own SAX parser, and in some cases also for applications that would previously use a DOM tree.

The basic concept of a stream reader is to report an XML document as a stream of tokens, similar to SAX. The main difference between QXmlStreamReader and SAX is how these XML tokens are reported. With SAX, the application must provide handlers that receive so-called XML events from the parser at the parser’s convenience. With QXmlStreamReader, the application code itself drives the loop and pulls tokens from the reader one after another as it needs them. This is done by calling readNext(), which makes the reader read from the input stream until it has completed a new token, and then returns its tokenType(). Convenience functions such as isStartElement() and text() then let you examine the token and obtain information about what has been read. The big advantage of the pull approach is that it lets you build recursive descent parsers, meaning you can easily split your XML parsing code into different methods or classes. This makes it easy to keep track of the application’s own state when parsing XML.

The streaming concept is not new; other APIs use it as well, e.g. Java’s StAX or libxml’s TextReader approach. In fact, it’s pretty much the straightforward and intuitive approach to reading tokens. The weird thing was SAX, and honestly I can’t remember what drove us to implement SAX in Qt in the first place, other than lack of time to design something better. We had it on the list for a long time; now we finally got around to doing it.

The streambookmarks example in the Qt snapshots shows how to use both classes. Here’s the typical main loop:

    QXmlStreamReader xml;
    ...
    while (!xml.atEnd()) {
        xml.readNext();
        ... // do processing
    }
    if (xml.error()) {
        ... // do error handling
    }

The parser checks that documents are well-formed and also supports incremental parsing. It is also a lot faster than Qt’s SAX and DOM implementations. If things work out well, we might base the SAX parser in future Qt versions on QXmlStreamReader, which would make our SAX implementation both faster and more compliant. But even then, QXmlStreamReader is the fastest and most convenient way to read XML data in Qt.

6 comments

That’s great news. I’ve been a fan of pull parsers for a while now. Their speed is excellent, and writing a parser with them is rather straightforward. QtSvg should benefit too, right? It’s really nice to see work in the XML area. Makes me hopeful of what’s next. XPath, XML Schema, object relational mapping, XSLT, XQuery. Oh I guess when you start, there’s no end in sight.

“Makes me hopeful of what’s next. XPath, XML Schema, object relational mapping, XSLT, XQuery. Oh I guess when you start, there’s no end in sight.”

I’ll vote for XPath.

Trevor Clarke says:

I vote for XML Schema validation. In fact, the only reason I’m using xerces-c++ instead of Qt’s XML is the lack of schema validation.

Daniel Haas says:

Great to hear that!
In what way does that relate to the work Ariya Hidayat did for KOffice in the KoXmlReader? (See very interesting posts here http://ariya.blogspot.com/2006/11/memory-efficient-dom.html and here http://ariya.blogspot.com/2006/11/memory-efficient-dom-part-2.html)
Is this inspired by KoXmlReader, is it the same under a new name and for Qt, or is it something completely unrelated?
Some clarifications would be great!

Brad Hards says:

recursive descent. Typo is also in the API docs.

What would be really great is some kind of XML to C++/Qt binding, like XMLBeans offers for Java.

XMLBeans (xmlbeans.apache.org) lets you generate Java classes based on an XML Schema. The resulting classes can be instantiated by parsing an XML file and can also be used to generate XML (optionally validating as well). This kind of thing provides you with a kind of instantaneous internal model for an XML application.

The generated classes should of course be easy to integrate with the Models from the Model/View framework.
