SAX


SAX (Simple API for XML) is a serial access parser API for XML. SAX provides a mechanism for reading data from an XML document. It is a popular alternative to the Document Object Model (DOM).

XML processing with SAX

A parser that implements SAX (i.e., a "SAX parser") functions as a stream parser with an event-driven API. The user defines a number of callback methods that are called when events occur during parsing. The SAX events include:

* XML Text nodes
* XML Element nodes
* XML Processing Instructions
* XML Comments

Events are fired when each of these XML features is encountered, and again when its end is encountered. XML attributes are provided as part of the data passed to element events.
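The callback model described above can be illustrated with a minimal sketch using Python's standard `xml.sax` module. The class name `ElementCounter`, the helper `count_elements`, and the sample document are assumptions made for illustration only; they are not part of SAX itself.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Counts element start events and collects the attributes passed with them."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.attributes_seen = []

    def startElement(self, name, attrs):
        # Called by the parser once for each opening tag.
        # Attributes arrive as part of the element event, not as separate events.
        self.counts[name] = self.counts.get(name, 0) + 1
        for attr_name in attrs.getNames():
            self.attributes_seen.append((name, attr_name, attrs.getValue(attr_name)))


def count_elements(xml_bytes):
    """Parse xml_bytes and return the handler holding the collected counts."""
    handler = ElementCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler


# Hypothetical sample document.
sample = b'<library><book id="1"/><book id="2"/></library>'
result = count_elements(sample)
print(result.counts)
```

The application never asks the parser for data; it only reacts to the calls the parser makes while reading the input from start to finish.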

SAX parsing is unidirectional; previously parsed data cannot be re-read without starting the parsing operation again.

Example

Given the following XML document:

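    <?xml version="1.0" encoding="UTF-8"?>
    <RootElement param="value">
        <FirstElement>Some Text</FirstElement>
        <SecondElement param2="something">Pre-Text <Inline>Inlined text</Inline> Post-text.</SecondElement>
    </RootElement>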

This XML document, when passed through a SAX parser, will generate a sequence of events like the following:

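* XML declaration, with "version" equal to "1.0" and "encoding" equal to "UTF-8" (strictly, the XML declaration is not a processing instruction, and SAX parsers do not report it through the processing-instruction callback)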
* XML Element start, named "RootElement", with an attribute "param" equal to "value"
* XML Element start, named "FirstElement"
* XML Text node, with data equal to "Some Text" (note: text processing, with regard to spaces, can be changed)
* XML Element end, named "FirstElement"
* XML Element start, named "SecondElement", with an attribute "param2" equal to "something"
* XML Text node, with data equal to "Pre-Text"
* XML Element start, named "Inline"
* XML Text node, with data equal to "Inlined text"
* XML Element end, named "Inline"
* XML Text node, with data equal to "Post-text."
* XML Element end, named "SecondElement"
* XML Element end, named "RootElement"

In fact, this may vary: the SAX specification deliberately states that a given section of text may be reported as multiple sequential text events. Thus in the example above, a SAX parser may generate a different series of events, part of which might include:

* XML Element start, named "FirstElement"
* XML Text node, with data equal to "Some"
* XML Text node, with data equal to "Text"
* XML Element end, named "FirstElement"
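Because text may arrive in several pieces, handlers commonly buffer character data and only act on it at the next element boundary. The following is a minimal sketch of that approach using Python's `xml.sax`. The document string is a reconstruction assumed from the event list above, and the names `EventRecorder` and `record_events` are hypothetical. The sketch records only element and text events and strips surrounding whitespace from text.

```python
import xml.sax

# Assumed reconstruction of the example document, based on the event list.
DOCUMENT = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<RootElement param="value">'
    b'<FirstElement>Some Text</FirstElement>'
    b'<SecondElement param2="something">'
    b'Pre-Text <Inline>Inlined text</Inline> Post-text.'
    b'</SecondElement>'
    b'</RootElement>'
)


class EventRecorder(xml.sax.ContentHandler):
    """Records start, text, and end events, merging adjacent text chunks."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush_text(self):
        # Join whatever chunks the parser delivered into one text event.
        text = "".join(self._text).strip()
        self._text = []
        if text:
            self.events.append(("text", text))

    def startElement(self, name, attrs):
        self._flush_text()
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self._flush_text()
        self.events.append(("end", name))

    def characters(self, content):
        # May be called several times for one run of text.
        self._text.append(content)


def record_events(xml_bytes):
    handler = EventRecorder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


for event in record_events(DOCUMENT):
    print(event)
```

With the buffering in place, the recorded sequence is the same whether the parser reports "Some Text" as one text event or as several.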

Definition

Unlike DOM, there is no formal specification for SAX. The Java implementation of SAX is generally considered normative, and implementations in other languages attempt to follow the rules laid down in that implementation, adjusting for language differences where necessary.

Benefits

SAX parsers have certain benefits over DOM-style parsers. The quantity of memory that a SAX parser must use in order to function is typically much smaller than that of a DOM parser. DOM parsers must have the entire tree in memory before any processing can begin, so the amount of memory used by a DOM parser depends entirely on the size of the input data. The memory footprint of a SAX parser, by contrast, is based only on the maximum depth of the XML tree and the maximum amount of data stored in the attributes of a single XML element. Both of these are always smaller than the size of the parsed tree itself.
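The dependence on depth rather than document size can be seen in the state an application typically keeps. The sketch below, using Python's `xml.sax`, holds only a stack of currently open elements. The names `DepthTracker` and `max_depth` and the sample input are assumptions for illustration, and the sketch shows the handler's state, not the parser's internal memory use.

```python
import xml.sax


class DepthTracker(xml.sax.ContentHandler):
    """Keeps a stack of open elements; its size is bounded by the nesting depth."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.max_depth = 0

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        self.max_depth = max(self.max_depth, len(self.open_elements))

    def endElement(self, name):
        # Closed elements are forgotten immediately.
        self.open_elements.pop()


def max_depth(xml_bytes):
    handler = DepthTracker()
    xml.sax.parseString(xml_bytes, handler)
    return handler.max_depth


# A wide but shallow document: 10,000 siblings under one root.
wide = b"<r>" + b"<item/>" * 10000 + b"</r>"
print(max_depth(wide))  # prints 2
```

However many sibling elements the document contains, the stack never holds more than two names here, whereas a tree representation would hold all 10,001 elements at once.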

Because of the event-driven nature of SAX, processing a document with a SAX parser is often faster than processing it with a DOM-style parser. Memory allocation takes time, so the larger memory footprint of the DOM is also a performance issue.

Due to the nature of DOM, streamed reading from disk is impossible. Processing XML documents that could never fit into memory is only possible through the use of a stream XML parser, such as a SAX parser.
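Streamed processing can be sketched with the incremental interface of Python's `xml.sax` parser, which accepts input piece by piece through `feed()`. The generator `generate_chunks` stands in for a file or network stream and, like `RecordCounter` and `count_records`, is a hypothetical name used only for this example.

```python
import xml.sax


class RecordCounter(xml.sax.ContentHandler):
    """Counts <record> elements without retaining any of them."""

    def __init__(self):
        super().__init__()
        self.records = 0

    def startElement(self, name, attrs):
        if name == "record":
            self.records += 1


def generate_chunks(n_records):
    """Yield a document piece by piece, as a large file would be read."""
    yield b"<records>"
    for i in range(n_records):
        yield b'<record n="%d"/>' % i
    yield b"</records>"


def count_records(chunks):
    parser = xml.sax.make_parser()
    handler = RecordCounter()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)  # events fire as soon as enough input has arrived
    parser.close()          # signals end of input
    return handler.records


print(count_records(generate_chunks(1000)))  # prints 1000
```

At no point does the complete document exist in memory; each chunk can be discarded once it has been fed to the parser.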

Drawbacks

The event-driven model of SAX is useful for XML parsing, but it does have certain drawbacks.

Certain kinds of XML validation require access to the document in full. For example, a DTD IDREF attribute requires that there be an element in the document that uses the given string as a DTD ID attribute. To validate this in a SAX parser, one would need to keep track of every previously encountered ID attribute and every previously encountered IDREF attribute, to see if any matches are made. Furthermore, if an IDREF does not match an ID, the user only discovers this after the document has been parsed; if this linkage was important to building functioning output, then time has been wasted in processing the entire document only to throw it away.
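The bookkeeping this requires might look like the following sketch in Python's `xml.sax`. Without a DTD the handler cannot know which attributes are of type ID or IDREF, so the attribute names `id` and `ref` are assumptions, as are the names `IdRefChecker` and `find_dangling_refs` and the sample document.

```python
import xml.sax


class IdRefChecker(xml.sax.ContentHandler):
    """Collects ID-like and IDREF-like attribute values for a final check."""

    def __init__(self, id_attr="id", ref_attr="ref"):
        super().__init__()
        self.id_attr = id_attr    # assumed name of the ID attribute
        self.ref_attr = ref_attr  # assumed name of the IDREF attribute
        self.ids = set()
        self.refs = set()
        self.dangling = set()

    def startElement(self, name, attrs):
        # Every value must be remembered: a reference may precede its target.
        if self.id_attr in attrs:
            self.ids.add(attrs[self.id_attr])
        if self.ref_attr in attrs:
            self.refs.add(attrs[self.ref_attr])

    def endDocument(self):
        # Only now is it known which references never found a target.
        self.dangling = self.refs - self.ids


def find_dangling_refs(xml_bytes):
    handler = IdRefChecker()
    xml.sax.parseString(xml_bytes, handler)
    return handler.dangling


doc = b'<doc><link ref="b"/><node id="a"/><node id="b"/><link ref="c"/></doc>'
print(find_dangling_refs(doc))  # prints {'c'}
```

The reference to "b" appears before the element that defines it, which is why the check has to wait for the end of the document.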

Additionally, some kinds of XML processing simply require having access to the entire document. XSLT and XPath, for example, need to be able to access any node at any time in the parsed XML tree. While a SAX parser could be used to construct such a tree, the DOM already does so by design.
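As a sketch of constructing such a tree from SAX events, the handler below forwards each event to `xml.etree.ElementTree.TreeBuilder` from Python's standard library. The result is an ElementTree element, not a W3C DOM tree, and the names `TreeBuildingHandler` and `build_tree` are hypothetical.

```python
import xml.sax
import xml.etree.ElementTree as ET


class TreeBuildingHandler(xml.sax.ContentHandler):
    """Turns the stream of SAX events back into an in-memory tree."""

    def __init__(self):
        super().__init__()
        self._builder = ET.TreeBuilder()
        self.root = None

    def startElement(self, name, attrs):
        self._builder.start(name, dict(attrs.items()))

    def endElement(self, name):
        self._builder.end(name)

    def characters(self, content):
        self._builder.data(content)

    def endDocument(self):
        # close() returns the root element of the finished tree.
        self.root = self._builder.close()


def build_tree(xml_bytes):
    handler = TreeBuildingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.root


root = build_tree(b'<a><b n="1">x</b><b n="2">y</b></a>')
print([b.text for b in root.findall("b")])  # prints ['x', 'y']
print(root.find("b[@n='2']").text)          # prints y
```

Once the tree exists, any node can be reached at any time, here through ElementTree's limited XPath support, at the cost of holding the whole document in memory.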

See also

Other XML processing technologies

*VTD-XML
*Document Object Model
*Streaming API for XML (StAX)
*XSL Transformations (XSLT)
*Streaming Transformations for XML (STX)
*System Integrated Automation parser
* [http://www.xom.nu/ XOM]
*CookXml

XML parsers and APIs supporting SAX

*Crimson XML
*Fusion XML SAX Parser
*JAXP: Java API for XML Processing
*Expat: C SAX implementation.
*LibXML
*MSXML
*Xerces

External links

* [http://www.saxproject.org/ SAX homepage]
* [http://xml.com/pub/a/2001/12/05/sax2.html Top Ten SAX2 Tips]
* [http://www.cafeconleche.org/books/xmljava/chapters/ch06.html SAX at Cafe Con Leche]

*Interfaces for ...
** [http://search.cpan.org/~grantm/XML-SAX-0.16/SAX.pm Perl]
** [http://www.python.org/doc/current/lib/module-xml.sax.html Python]
** [http://www.saxproject.org/?selected=quickstart Java]


Wikimedia Foundation. 2010.
