Why use a Native XML Database (NXDB)

This page is meant to help you decide whether a Native XML Database (NXDB) is appropriate for your projects. If you have already decided that it is, please proceed to the page describing the benefits of Qizx/db.

Whether to use an XML database is a vast topic that is only summarized here; for more depth, see the links at the bottom of this page.

XML?

XML has become a ubiquitous way of representing data. XML is an exchange format, but also a powerful and flexible way to structure information. There are many reasons to store information as XML. One reason may be that it is imposed from outside: data arrives as XML and has to be retrieved and exchanged as XML without loss of information. XML can also be used as a common representation for heterogeneous data sources.

Whether to use XML at all depends very much on the nature of the information to be handled:

Document-centric, data-centric, semi-structured?

  • Data-centric describes documents where XML is used only as an external representation for fine-grained data with a regular and simple structure, where order matters little. In short, such documents could fairly easily be modeled in a relational DBMS. They could also be called structured.

  • Document-centric describes human-readable documents, characterized by a more complex, irregular, possibly recursive structure, with mixed content, where order is significant. Such documents are difficult to store in an RDBMS.

  • Semi-structured describes an intermediate type of document that has many characteristics of data-centric data but can also contain document-centric parts (e.g. annotations or comments); have a highly variable structure (due to rapidly evolving specifications or to a variety of origins); have many data "fields" (hundreds), many of them optional ("null data" in RDBMS terms); and use unspecified properties (name/value pairs).
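
One concrete way to tell the categories apart is to look for mixed content, that is, elements holding both child elements and text. The sketch below is only an illustration using Python's standard xml.etree.ElementTree module; the two sample documents and the has_mixed_content helper are made up for this example and are not part of Qizx/db.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: a regular record and a human-readable paragraph.
DATA_CENTRIC = "<order><id>42</id><customer>ACME</customer><total>99.50</total></order>"
DOCUMENT_CENTRIC = (
    "<para>Order is <em>significant</em> in mixed content, "
    "as <code>text</code> and markup interleave.</para>"
)

def has_mixed_content(element):
    """Return True if any element holds both child elements and non-blank text."""
    for node in element.iter():
        children = list(node)
        if not children:
            continue
        # Text directly inside 'node' is its own .text plus the .tail of each child.
        texts = [node.text] + [child.tail for child in children]
        if any(t and t.strip() for t in texts):
            return True
    return False

data_is_mixed = has_mixed_content(ET.fromstring(DATA_CENTRIC))
doc_is_mixed = has_mixed_content(ET.fromstring(DOCUMENT_CENTRIC))
print(data_is_mixed, doc_is_mixed)  # False True
```

The first document maps naturally to a table row; the second does not, because the position of each text fragment relative to the markup carries meaning.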

What kind of database?

There are two main kinds of XML databases:

  • Relational databases with an XML layer "shred" XML documents into relational tables. Complex types, in the XML Schema sense, are mapped to tables, so an XML Schema is indispensable to such systems.

  • Native XML Databases (NXDB): though there is no official definition of what an NXDB is, these are the main characteristics:
    • In an NXDB, the basic unit of storage is the XML document. Document structure is preserved according to a data model such as the XML Infoset or the XQuery 1.0 and XPath 2.0 Data Model.
    • An NXDB accepts any well-formed document (a property sometimes called schema independence). A schema or DTD can of course be used to check the validity of documents, and also to accelerate query execution, but neither is indispensable.
    • An NXDB supports an XML-aware query language, nowadays typically XML Query (XQuery), but possibly XPath 1.0 or 2.0, XSLT, or other proprietary or research languages.

    Due to these characteristics, NXDBs are clearly better suited to document-centric and semi-structured data.
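
To make "shredding" concrete, here is a minimal sketch of the relational approach, assuming a hypothetical orders table and sample document; it uses Python's standard sqlite3 and xml.etree.ElementTree modules and does not reflect how any particular product implements the mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical data-centric document: every <order> has the same shape.
ORDERS_XML = """
<orders>
  <order id="1"><customer>ACME</customer><total>99.50</total></order>
  <order id="2"><customer>Globex</customer><total>12.00</total></order>
</orders>
""".strip()

def shred_orders(xml_text, connection):
    """Map each <order> element to one row of an 'orders' table."""
    # The table layout must be known in advance, which is why shredding
    # systems depend on a schema.
    connection.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        (int(order.get("id")), order.findtext("customer"), float(order.findtext("total")))
        for order in root.findall("order")
    ]
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return len(rows)

connection = sqlite3.connect(":memory:")
row_count = shred_orders(ORDERS_XML, connection)
customers = [name for (name,) in connection.execute("SELECT customer FROM orders ORDER BY id")]
print(row_count, customers)  # 2 ['ACME', 'Globex']
```

This works because the sample is regular. An element that appeared only in some orders, or a comment with inline markup, would have no column to go to unless the table definition were changed first.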

Where Native XML databases are really useful

  • When data is XML (obviously) and its structure must be preserved.
  • For document-centric and semi-structured documents: because an NXDB does not require a schema, much less work is needed when adding new or complex document types.
  • For rapidly evolving schemas: schema changes are a notoriously painful issue in relational systems. In contrast, an NXDB is naturally much more resilient to changes. Furthermore, XML Query can often make the migration seamless.
  • For data integration: a language like XML Query is a valuable help in solving such problems.
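
The resilience to schema changes can be illustrated with a path query that ignores where an element sits in the tree. The sketch below is an assumption-laden stand-in: it uses the limited XPath subset of Python's xml.etree.ElementTree rather than XML Query, and the two record layouts are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: the newer layout moved <email> under a new <contact> element.
OLD_RECORD = "<person><name>Ada</name><email>ada@example.org</email></person>"
NEW_RECORD = (
    "<person><name>Grace</name>"
    "<contact><email>grace@example.org</email><phone>555-0100</phone></contact>"
    "</person>"
)

def all_emails(documents):
    """Collect the text of every <email> descendant, at any depth, in each document."""
    found = []
    for text in documents:
        root = ET.fromstring(text)
        # './/email' matches descendants regardless of the intermediate structure.
        found.extend(node.text for node in root.findall(".//email"))
    return found

emails = all_emails([OLD_RECORD, NEW_RECORD])
print(emails)  # ['ada@example.org', 'grace@example.org']
```

Both generations of the document are stored unchanged and answered by the same query, whereas a shredded table would need a migration before the new layout could even be loaded.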

Desirable properties of a Native XML database

In addition to the characteristics mentioned above, and to the usual characteristics of any database system (transactions, concurrency, robustness), the following capabilities seem highly desirable:

  • Ability to run queries at full speed without manually specifying indexes. If introducing new data or new queries requires specifying or tuning indexes and then reindexing the database, an NXDB becomes much less attractive (this is very much the same problem as schema change).
  • Support for full-text search, preferably contextual (structure-aware).
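
As a rough illustration of what contextual full-text search means, the sketch below restricts a word search to elements of a given name. It is a naive in-memory scan written with Python's standard library, with an invented sample document and helper; a real engine would use indexes and richer matching.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical article used only for this example.
ARTICLE = (
    "<article>"
    "<title>Native XML storage</title>"
    "<body><para>Indexes are built automatically.</para>"
    "<para>The title is not searched here; storage is cheap.</para></body>"
    "</article>"
)

def search(root, word, context):
    """Return the text of each <context> element whose content contains 'word'."""
    hits = []
    for element in root.iter(context):
        text = "".join(element.itertext())
        # Tokenize on word characters so punctuation does not block a match.
        if word.lower() in re.findall(r"\w+", text.lower()):
            hits.append(text)
    return hits

root = ET.fromstring(ARTICLE)
hits_in_title = search(root, "storage", "title")
hits_in_para = search(root, "storage", "para")
print(hits_in_title)  # ['Native XML storage']
print(hits_in_para)   # ['The title is not searched here; storage is cheap.']
```

The same word yields different results depending on the element searched, which is what distinguishes a structure-aware search from a flat scan of the document text.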

 

► The XML and databases papers by Ronald Bourret

► XML Databases by Michael Kay (PDF)

 


© 2008 Pixware. Updated 2008/1/14 using Qizx/open.

Java and all Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.
Acrobat and PostScript are trademarks of Adobe Systems Incorporated.