Why use a Native XML Database (NXDB)

This page attempts to help you decide whether a Native XML
Database (NXDB) is appropriate for your projects. If you have already
decided that it is, please proceed to the page
describing the benefits of Qizx/db. Whether to use an XML database is a
vast topic that is only summarized here; to go deeper, you will find useful links at the bottom of this page.

XML?

XML has become a ubiquitous way of representing
data. XML is an exchange format, but also a powerful and flexible way to
structure information. There are many reasons to use an XML representation
to store information. A first reason could be that it is imposed from
outside: data comes in XML form, it has to be retrieved and exchanged in XML
without loss of information. XML can also be used as a common representation
for heterogeneous data sources. In fact, whether to use XML depends
very much on the nature of the information that has to be handled.

Document-centric, data-centric, semi-structured?

- The term data-centric relates to documents where XML is
used only as an external representation for fine-grained data, with a
regular and simple structure, where order has little significance. In
short, such documents could fairly easily be modeled in a relational
DBMS. They could also be called structured.
- Document-centric is a term used for human-readable
documents, characterized by a more complex, irregular, possibly
recursive structure, with mixed content, where order is significant.
Such documents are difficult to store in an RDBMS.
- Semi-structured is used for an intermediate type of
documents, which share many characteristics of data-centric documents, but can
also contain document-centric parts (e.g. annotations or comments);
which have a highly variable structure (due to rapidly evolving
specifications, or to a variety of origins); which have many data
"fields" (hundreds), many of them optional ("null data" in RDBMS
speak); which use unspecified properties (name/value pairs).
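The distinction above can be made concrete with a small check for mixed content, one of the traits listed for document-centric documents. This is only a rough sketch using Python's standard xml.etree.ElementTree module; the two sample documents and the helper name has_mixed_content are invented for illustration, and mixed content alone is not a complete classification test.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: a regular record versus running text with inline markup.
DATA_CENTRIC = "<order><id>17</id><item>pen</item><qty>2</qty></order>"
DOCUMENT_CENTRIC = "<p>Order <b>matters</b> in <i>running</i> text.</p>"


def has_mixed_content(root):
    """Return True if any element holds both child elements and non-blank text."""
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue  # leaf elements cannot have mixed content
        own_text = (elem.text or "").strip()
        tail_text = any((child.tail or "").strip() for child in children)
        if own_text or tail_text:
            return True
    return False


print(has_mixed_content(ET.fromstring(DATA_CENTRIC)))      # False
print(has_mixed_content(ET.fromstring(DOCUMENT_CENTRIC)))  # True
```

The first document maps naturally onto one table row; the second does not, because the text between the inline elements only makes sense in order.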
What kind of database?

There are two main kinds of XML databases:
- Relational databases with an XML layer "shred" XML
documents into relational tables. Complex types, in the sense of
XML Schema, are mapped to tables; therefore the use of XML Schema is
indispensable to such systems.
- Native XML Databases (NXDB): though there is no official
definition of what NXDBs are, here are their main characteristics:
  - In an NXDB, the basic unit of storage is the XML document.
  Document structure is preserved, according to a data
  model such as the XML Infoset or the XQuery 1.0 and XPath 2.0 Data
  Model.
  - An NXDB accepts any well-formed document (a property sometimes
  called schema independence). A schema or DTD can of course be
  used to check the validity of documents, and also to accelerate
  query execution, but is not indispensable.
  - NXDBs support an XML-aware query language, typically XML Query
  (XQuery) nowadays, but it can also be XPath 1 or 2, XSLT, or
  proprietary or research languages.

Due to these characteristics, NXDBs are clearly better suited
for document-centric and semi-structured data.
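To illustrate what "shredding" means in the first approach, here is a minimal sketch that maps a data-centric document onto one relational table. It uses Python's sqlite3 and xml.etree.ElementTree modules as stand-ins for a real relational database with an XML layer; the element names, the table layout and the function shred_orders are assumptions made for the example, not the mapping of any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical data-centric document: every <order> has the same three fields.
ORDERS = """
<orders>
  <order><id>1</id><item>pen</item><qty>2</qty></order>
  <order><id>2</id><item>ink</item><qty>1</qty></order>
</orders>
"""


def shred_orders(xml_text, conn):
    """Map each <order> element to one row; the table layout is fixed up front."""
    conn.execute("CREATE TABLE orders (id INTEGER, item TEXT, qty INTEGER)")
    for order in ET.fromstring(xml_text).findall("order"):
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (
                int(order.findtext("id")),
                order.findtext("item"),
                int(order.findtext("qty")),
            ),
        )


conn = sqlite3.connect(":memory:")
shred_orders(ORDERS, conn)
rows = conn.execute("SELECT item, qty FROM orders ORDER BY id").fetchall()
print(rows)  # [('pen', 2), ('ink', 1)]
```

The mapping works because the structure is known in advance. An element that the table does not anticipate would be silently dropped here, which is why such systems depend on a schema.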
Where Native XML databases are really useful

- When data is XML (obviously) and its structure must be
preserved.
- For document-centric and semi-structured documents: because an NXDB
does not require a schema, much less work is required when adding new or
complex document types.
- For rapidly evolving schemas: schema changes are a notoriously
painful issue in relational systems. In contrast, an NXDB is naturally
much more resilient to changes. Furthermore, XML Query can often make
the migration seamless.
- For data integration: a language like XML Query is a valuable aid in
solving such problems.
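The resilience to schema changes mentioned above can be sketched with a descendant path query that keeps working when the document layout evolves. The example uses the limited XPath subset of Python's xml.etree.ElementTree rather than XML Query or a real NXDB, and the two document versions and the function all_emails are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same document type stored side by side:
# the second one nests the e-mail address inside a new <contact> element.
COLLECTION = [
    "<person><name>Ann</name><email>ann@example.org</email></person>",
    "<person><name>Bob</name>"
    "<contact><email kind='work'>bob@example.org</email></contact></person>",
]


def all_emails(documents):
    """Collect every <email> value, wherever it sits in each document."""
    emails = []
    for text in documents:
        root = ET.fromstring(text)
        # './/email' matches <email> descendants at any depth.
        emails.extend(e.text for e in root.iterfind(".//email"))
    return emails


print(all_emails(COLLECTION))  # ['ann@example.org', 'bob@example.org']
```

Neither document had to be migrated and no table had to be altered; the query simply tolerates both layouts.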
Desirable properties of a Native XML database

In addition to the characteristics mentioned above, and to
the usual characteristics of any database system (transactions, concurrency,
robustness), the following capabilities seem highly desirable:
- Ability to perform queries at full speed without manual
specification of indexes. If introducing new data or new queries
imposes the burden of specifying or tuning indexes and reindexing the
database, an NXDB becomes much less attractive (this is very much the
same problem as schema change).
- Support for full-text search, preferably contextual (or
structure-aware).
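As a rough illustration of what contextual (structure-aware) full-text search means, the sketch below restricts a text match to a given element type. It is a naive linear scan with substring matching written with Python's xml.etree.ElementTree, not the indexed search a real NXDB would provide; the sample document and the function search_in are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which the word "storage" occurs in two contexts.
DOC = """
<article>
  <title>Native XML storage</title>
  <para>Indexes are built <em>automatically</em> in some systems.</para>
  <note>Storage costs are not discussed here.</note>
</article>
"""


def search_in(root, tag, word):
    """Return the full text of every <tag> element that contains word."""
    word = word.lower()
    hits = []
    for elem in root.iter(tag):
        # itertext() also gathers text inside inline children such as <em>.
        text = "".join(elem.itertext())
        if word in text.lower():
            hits.append(text)
    return hits


root = ET.fromstring(DOC)
print(search_in(root, "title", "storage"))  # ['Native XML storage']
print(search_in(root, "para", "storage"))   # []
```

A plain full-text search for "storage" would return both the title and the note; restricting the search to a context lets a query ask for titles only.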
Useful links

- The XML and databases papers by Ronald Bourret
- XML Databases by Michael Kay (PDF)

© 2008 Pixware. Updated 2008/1/14 using Qizx/open.