Qizx technical specifications

General Features

Query oriented

Qizx is designed for high-speed querying, retrieval and processing of indexed XML content.

No need for DTD or Schema

Works with well-formed XML documents. Does not require DTD or Schema.

Automatic indexing

By default, the full contents of documents are indexed: elements, attribute values (in text/numeric/date forms when applicable), simple element content, full-text.

As a result, queries run at full speed out of the box, without any index configuration.

Customizable indexing

For advanced needs, the indexing process in an XML Library can be configured through context-sensitive rules for elements and attributes, and/or pluggable value converters and full-text word tokenizers.

Hierarchical collections

A repository (called an XML Library) is organized as a hierarchical structure of Collections, which in turn contain Documents. This is similar to a file system, with Collections playing the role of directories and XML Documents the role of plain files.

Indexed Documents and Collections can be decorated with queryable metadata properties.
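The file-system analogy can be made concrete with a small in-memory model. This is only an illustrative sketch: `LibraryTreeSketch`, `Node`, `putDocument` and `find` are hypothetical names chosen for this example and are not part of the Qizx API. Paths are assumed to start with `/`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: these classes are NOT the Qizx API.
final class LibraryTreeSketch {

    /** A member of the library: either a Collection or a Document. */
    static final class Node {
        final String name;
        final boolean isCollection;
        // Child members, used by Collections only.
        final Map<String, Node> children = new LinkedHashMap<>();
        // Metadata properties, available on Collections and Documents alike.
        final Map<String, Object> properties = new LinkedHashMap<>();

        Node(String name, boolean isCollection) {
            this.name = name;
            this.isCollection = isCollection;
        }
    }

    final Node root = new Node("", true);

    /** Creates a Document at the given path, creating parent Collections as needed. */
    Node putDocument(String path) {
        String[] steps = path.substring(1).split("/");
        Node current = root;
        for (int i = 0; i < steps.length; i++) {
            boolean last = (i == steps.length - 1);
            Node next = current.children.get(steps[i]);
            if (next == null) {
                next = new Node(steps[i], !last); // only the last step is a Document
                current.children.put(steps[i], next);
            }
            current = next;
        }
        return current;
    }

    /** Returns the member at the given path, or null if there is none. */
    Node find(String path) {
        if (path.equals("/")) {
            return root;
        }
        Node current = root;
        for (String step : path.substring(1).split("/")) {
            current = current.children.get(step);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        LibraryTreeSketch library = new LibraryTreeSketch();
        Node doc = library.putDocument("/manuals/engine/chapter1.xml");
        doc.properties.put("status", "draft");
        System.out.println(library.find("/manuals/engine").isCollection);
    }
}
```

In this model a property lives next to the node rather than inside the XML content, which mirrors the idea that metadata decorates a Document or Collection without changing it.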

Native XML storage

To support XQuery, XML documents indexed in Qizx have an internal representation that complies with the XQuery/XPath 2 Data-Model: their logical structure is fully preserved and documents can be exported back into XML without loss of information (except for physical details like entity boundaries and CDATA sections).
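The loss of CDATA boundaries can be illustrated with the standard JAXP classes alone. The sketch below does not use Qizx: it parses a document into a DOM with coalescing enabled (an assumption made here to mimic a data model that has no CDATA nodes) and serializes it back. The class name `CdataRoundTripSketch` is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Standard JAXP only; this is not how Qizx stores documents internally.
final class CdataRoundTripSketch {

    /** Parses the XML text and serializes it again, dropping CDATA boundaries. */
    static String roundTrip(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Merge CDATA sections into ordinary text nodes, as a data model
        // without CDATA nodes would.
        factory.setCoalescing(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // An identity transformation serializes the tree back to text.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The text "a < b" survives, but it comes back escaped instead of
        // wrapped in a CDATA section.
        System.out.println(roundTrip("<p><![CDATA[a < b]]></p>"));
    }
}
```

The character content is unchanged by the round trip; only the physical markup used to write it differs.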

Compressed representation

Documents and indexes are compressed, which allows a significant reduction of disk space use and IO: an XML Library with all its indexes (including full-text) is often slightly smaller than the source XML.

Qizx does not currently support the storage of non-XML data, such as images. However, metadata properties associated with Documents, Collections or Libraries make it easy to keep links to data stored on other media.

Querying

XQuery as query language.

XML Query (or XQuery) is now a W3C standard.

XQuery is more than "the SQL of XML": it is a full-blown programming language with user-defined functions and extended processing capabilities.

The XQuery engine of Qizx is one of the most advanced, complete and efficient XQuery implementations available today. As of the end of 2008, it is one of only two XQuery implementations that pass more than 99.9% of the official XQuery Test Suite, and several research papers have noted its speed. It is one of the few engines that detect and optimize joins.

Since version 2.1, Qizx also supports the XQuery Update extension.

From version 3, Qizx will support most of the XQuery Full-Text extension.

Efficient querying

Qizx works with a cost-based query plan optimizer which exploits indexes automatically.

The optimizer detects and optimizes most types of joins.

See here for concrete speed measurements.

Query by metadata

Documents and Collections can bear user-defined properties. This allows metadata to be associated with documents and collections without modifying their contents. Property values can have basic types (string, number, date, etc.) or be XML fragments (for example, structured annotations).

Such properties can be queried efficiently through XQuery or through the API. This is a very powerful mechanism that can be used, for example, for:

  • Restricting general queries to only those documents that match certain property values (e.g., searching only documents whose modification date is more recent than a given date).
  • Managing custom indexes (for example, storing statistics computed from a document, such as a sum or an average, as properties, then using the properties to find the matching documents very quickly).
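The first use case can be sketched in plain Java. This is an illustration of the idea only, not the Qizx API: `PropertyFilterSketch`, `Doc`, `with` and `modifiedAfter` are hypothetical names, and the linear scan stands in for what an engine would do with an index.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: these classes are NOT the Qizx API.
final class PropertyFilterSketch {

    /** A document reduced to its path and its metadata properties. */
    static final class Doc {
        final String path;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Doc(String path) {
            this.path = path;
        }

        /** Sets a property and returns this document, to allow chaining. */
        Doc with(String name, Object value) {
            properties.put(name, value);
            return this;
        }
    }

    /** Returns the paths of documents whose date property is strictly after the given date. */
    static List<String> modifiedAfter(List<Doc> docs, String property, LocalDate date) {
        List<String> result = new ArrayList<>();
        for (Doc doc : docs) {
            Object value = doc.properties.get(property);
            // Documents without the property, or with a non-date value, are skipped.
            if (value instanceof LocalDate && ((LocalDate) value).isAfter(date)) {
                result.add(doc.path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("/reports/q1.xml").with("modified", LocalDate.of(2008, 3, 31)),
                new Doc("/reports/q3.xml").with("modified", LocalDate.of(2008, 9, 30)));
        System.out.println(modifiedAfter(docs, "modified", LocalDate.of(2008, 6, 1)));
    }
}
```

The filter never opens the document content: the decision is made from the properties alone, which is what makes property-based restriction cheap.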

Updating

Document-level updates

Qizx can update both at document level (import, replace or remove entire documents in an XML Library) and at node level (local modification within a document). Node-level updates are achieved through XQuery Update (see below). Metadata properties can be modified independently of documents.

A document is always updated as a whole. This is a deliberate design choice, allowing faster queries: Qizx is a query-oriented engine and does not aim at being a database capable of frequent updates of large documents.

This limitation is not a hindrance to most applications:

  • Updating documents of up to a few hundred kilobytes is completely feasible and efficient.
  • If larger documents need to be updated frequently, then it probably means that they could be split into smaller units updated independently of each other. For example a large Maintenance Manual can be split into chapters and sections. Reassembling a large document from a collection of smaller parts can be accomplished easily using the XQuery language.
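The reassembly step is described above as an XQuery task. Purely as an illustration of the same idea, the sketch below does it with the standard DOM API in Java instead: each part is parsed separately and its root element is copied under a new root. The class and method names are hypothetical, and the parts are passed as strings rather than read from an XML Library.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Standard DOM only; a swapped-in technique, not the XQuery approach itself.
final class ReassembleSketch {

    /** Builds one document whose root contains the root element of every part, in order. */
    static Document assemble(String rootName, List<String> parts) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document whole = builder.newDocument();
        Element root = whole.createElement(rootName);
        whole.appendChild(root);
        for (String part : parts) {
            Document partDoc = builder.parse(new InputSource(new StringReader(part)));
            // importNode makes a deep copy owned by the target document.
            root.appendChild(whole.importNode(partDoc.getDocumentElement(), true));
        }
        return whole;
    }

    public static void main(String[] args) throws Exception {
        Document manual = assemble("manual", Arrays.asList(
                "<chapter id='1'><title>Safety</title></chapter>",
                "<chapter id='2'><title>Maintenance</title></chapter>"));
        System.out.println(manual.getElementsByTagName("chapter").getLength());
    }
}
```

Because each chapter stays a separate document until assembly, a single chapter can be replaced without touching the others.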

Update APIs

Since version 2.1, Qizx fully supports the XQuery Update extension.

The Java API and the XQuery extension functions allow importing and updating documents efficiently.

Isolation and ACID transactions

Qizx is presented as an XQuery engine (or XML search engine), but it offers many features of a true XQuery database:

Update operations are performed in transactions, providing Atomicity, Consistency, Isolation and Durability.

Commits are one-phase. Concurrent transactions are supported, with short-term locking of Collections and Documents.

Sessions are isolated from concurrent transactions, so long queries can run to completion without being disturbed by concurrent updates.

Backups can be performed while the engine is running.

Modifications are journaled, allowing crash recovery.

API and extension modules

Java API

Java API for management of XML Libraries and querying.

This API is simple and open: extension points allow for custom filters, observers, indexing, data import/export, access control, XQuery module resolving, etc.

The querying part of the API is very similar to XQJ (XQuery for Java, JSR 225). Full support for XQJ will be provided when stable specifications are available.

XQuery extensions

Extended XQuery functions:

  • General extensions (serialization, XSLT, dynamic evaluation, error-catching, etc.).
  • XML Library handling in XQuery.
  • Java binding (automatic plugging of Java methods as XQuery functions).

Note: Full-text is now supported natively through the XQuery Full-Text standard.

Other extensions

Some extensions available in Qizx Milestone 1 are temporarily disabled: SQL connectivity and server mode. Regarded as essential features, both are under redesign and will be available as soon as possible. See roadmap.

XML standards

XQuery / XPath 2

Conformant with the W3C specifications: as of the end of 2007, passes 99.9% of the official XQuery Test Suite (v1.0.2).

Complete implementation, except for schema-related features.

Optional features:
  • Full-axis: yes.
  • Module import: yes.
  • Schema import and validation: no.
  • Static typing: partially.

XQuery Update

Fully supports the Candidate Recommendation of August 2008.

XQuery Full-text

Available from v3.0: nearly complete support of the Candidate Recommendation of May 2008.

XSLT

XSLT output through API and extension functions (JAXP based).
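Since the XSLT support is described as JAXP based, a transformation in plain JAXP gives an idea of what is involved. The sketch below uses only the standard `javax.xml.transform` classes and no Qizx class; the class name `JaxpXsltSketch` and the sample stylesheet are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Standard JAXP only; how Qizx wires XSLT into its API is not shown here.
final class JaxpXsltSketch {

    /** Applies the stylesheet to the XML input and returns the serialized result. */
    static String transform(String stylesheet, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // A stylesheet that outputs the number of <item> elements as plain text.
        String stylesheet =
                "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
              + "<xsl:output method='text'/>"
              + "<xsl:template match='/'><xsl:value-of select='count(//item)'/></xsl:template>"
              + "</xsl:stylesheet>";
        System.out.println(transform(stylesheet, "<list><item/><item/></list>"));
    }
}
```

A `Transformer` accepts any JAXP `Source` and `Result`, so the same pattern applies when the input is a DOM tree or a SAX stream rather than text.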

SAX2 and DOM

Through API: input (document import) and output.
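For readers unfamiliar with SAX2, the sketch below shows the kind of event stream such an input path consumes. It uses only the standard JAXP SAX parser and is not the Qizx import API; `SaxCountSketch` and `countElements` are hypothetical names. Note that no DTD or schema is needed: the input only has to be well-formed.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Standard SAX2 only; this is not the Qizx document import API.
final class SaxCountSketch {

    /** Counts element occurrences by local name while streaming through the document. */
    static Map<String, Integer> countElements(String xml) throws Exception {
        final Map<String, Integer> counts = new LinkedHashMap<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // ensures localName is populated
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                counts.merge(localName, 1, Integer::sum);
            }
        });
        return counts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<library><book/><book/><journal/></library>"));
    }
}
```

A SAX consumer sees the document as a sequence of events and never holds it all in memory, which is why this style of input suits bulk import of large documents.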

System Requirements and Support

Platform independence

Entirely written in Java.

Officially supported platforms:

  • Windows 2000, XP, Vista
  • Linux 2.4+
  • Mac OS X 10.4+

Java runtime

JRE 1.4.2+

Required third-party Software

None.

Keywords: XQuery, XQuery Update, XQuery Full-Text, XPath, XSLT, SAX, DOM, XML Schema | Java API, XQJ | Windows, Unix, cross-platform