XML is losing traction; it seems that every new API uses JSON or a similar format. Still, there are big data sets that are only available as XML. As I already had a great XML parser that is extremely fast, has a clean API, and provides an abstract syntax tree (AST) as its DOM, I had to add the last important feature that parser was missing. Until now it could only parse strings of about 10 to 40 megabytes.

I thought about it for a long time and started implementing it a few times: a parser that is still very fast, keeps all the features, and on top of that is able to parse through large files, files such as a Wikipedia dump or the OpenStreetMap world file. Now, on about the fifth try, I found a solution for handling streams. I solved it by making assumptions about the shape of large XML files: they usually consist of a root element containing a long list of items. The new parser provides these items one by one. It uses Node.js streams, which makes it possible to use a stream reader for compressed files and feed the plain-data stream into the XML parser.
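To give an idea of how the pieces fit together, here is a minimal sketch. The streaming entry point is shown as `txml.transformStream()` and the file name is made up; both are assumptions for illustration, so check the tXml README for the actual export and options:

```js
const fs = require('fs');
const zlib = require('zlib');
const txml = require('txml');

// Read a gzipped dump, decompress it, and feed the plain XML text into
// the streaming parser. Each 'data' event delivers one child element of
// the root as a parsed item.
fs.createReadStream('./planet.osm.gz')   // hypothetical dump file
  .pipe(zlib.createGunzip())             // plain-data stream
  .pipe(txml.transformStream())          // assumed streaming entry point
  .on('data', (item) => {
    console.log(item.tagName);           // each item is a regular tXml node
  })
  .on('end', () => console.log('done'));
```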

Each item provided from the XML is an AST on its own and can easily be simplified by the framework. That makes working with XML files not just much faster than ever before in JS, it is also much more convenient: developers get an API they are probably already familiar with.
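As a small illustration, here a parsed snippet stands in for one streamed item; the exact shape of the simplified result depends on your data, so treat the output comments as approximate:

```js
const txml = require('txml');

// One streamed item is a plain tXml AST node: { tagName, attributes, children }.
// Parsing a small snippet here stands in for receiving one item from the stream.
const [item] = txml.parse('<page><title>Hello</title></page>');
console.log(item.tagName); // 'page'

// simplify() turns an array of nodes into a nested plain object,
// roughly { page: { title: 'Hello' } } for this snippet.
const simple = txml.simplify([item]);
console.log(simple);
```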

If I were you and had to work with XML data, I would definitely use tXml!!! If you can really choose, choose JSON.

Contents