Limitations and implementation specificities

The Convert step does not support the following MS-Word features.

By “does not support”, we mean that w2x does not generate anything useful for these features. We do not mean that using them in a DOCX file causes w2x to fail or to generate invalid XML documents.

Right to left scripts.

Enclose characters.

Asian layout.

Cover Page. Blank Page.

Text wrapping of tables and pictures other than the simplest one.

Picture formats other than GIF, PNG, JPEG, BMP, TIFF and WMF are not supported. EMF pictures are supported only on Windows.

Clip Art. Shapes. SmartArt. Chart.

Header. Footer. Page Number.

Japanese Greetings. Text Box. WordArt. Drop Cap.

Object.

All features related to Page Layout except (to a minimal extent) page and column breaks and end of sections.

All features related to Mailings.

All features related to Spelling & Grammar, except of course the various languages used in the document (i.e. lang attribute).

Comments.

All features related to Change Tracking, (document) Compare, (document) Protect.

Macros.

Controls.

The Convert step generates XHTML+CSS documents having the following specificities:

Tab stops are converted to <span class="role-tab"> </span>. See About tab stops.

MS-Word document properties having no standard meta equivalent are given names starting with “ms-”. Example:

<meta content="Hussein Shafie" name="ms-cp-lastModifiedBy" />
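As an illustration of how such properties could be read back from a converted document, here is a minimal Python sketch. The sample XHTML and the function name ms_properties are assumptions made for this example; only the “ms-” naming rule comes from the text above.

```python
import xml.dom.minidom

# Hypothetical, hand-written sample of w2x output (not produced by w2x).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Sample</title>
<meta content="Hussein Shafie" name="ms-cp-lastModifiedBy" />
<meta content="A sample document" name="description" />
</head>
<body/>
</html>"""


def ms_properties(xhtml):
    """Return a dict of the meta elements whose name starts with "ms-"."""
    doc = xml.dom.minidom.parseString(xhtml)
    props = {}
    for meta in doc.getElementsByTagName("meta"):
        name = meta.getAttribute("name")
        if name.startswith("ms-"):
            props[name] = meta.getAttribute("content")
    return props


print(ms_properties(SAMPLE))  # prints {'ms-cp-lastModifiedBy': 'Hussein Shafie'}
```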

MS-Word “styles” having no CSS equivalent are given a “-ms-” prefix. Example:

.p-Heading3 {

-ms-outlineLvl: 2;

color: #4F81BD;

font-family: Cambria;

...

Page breaks are translated to <?break-page?>. Column breaks are translated to <?break-column?>. End of sections are signaled by <?end-of-section?>.
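A post-processing tool may want to locate these processing instructions. The following Python sketch lists them in document order; the sample markup and the function name find_breaks are hypothetical, and only the three processing instruction names are taken from the text above.

```python
import xml.dom.minidom

# Processing instruction targets described above.
BREAK_TARGETS = ("break-page", "break-column", "end-of-section")

# Hypothetical, hand-written sample (not produced by w2x).
SAMPLE = (
    "<body><p>One</p><?break-page?><p>Two</p>"
    "<?break-column?><p>Three</p><?end-of-section?></body>"
)


def find_breaks(xhtml):
    """Return the break-related processing instruction targets, in document order."""
    doc = xml.dom.minidom.parseString(xhtml)
    found = []

    def walk(node):
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target in BREAK_TARGETS):
            found.append(node.target)
        for child in node.childNodes:
            walk(child)

    walk(doc)
    return found


print(find_breaks(SAMPLE))  # prints ['break-page', 'break-column', 'end-of-section']
```

xml.dom.minidom is used here because it keeps processing instructions as nodes, whereas the default xml.etree.ElementTree parser discards them.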

WMF pictures are converted to SVG.

OpenXML math is converted to MathML.

Conversion from OpenXML math to MathML is implemented by an XSLT 1.0 stylesheet called omml2mml.xsl, which comes from the open source project XSL stylesheets for TEI XML. If you have access to a better XSLT stylesheet than the open source omml2mml.xsl, you may use it by setting the environment variable (or Java™ system property) W2X_MATH_CONVERTER_XSLT. Example:

set W2X_MATH_CONVERTER_XSLT=C:\Users\john\My better omml2mml.xsl

All simple and most complex fields are converted to a <?field code?> processing instruction whose parent is a <span class="role-field">. Example:

<span class="role-field">

<?field DATE \@ "MMMM d, yyyy" \* MERGEFORMAT ?>

August 27, 2014

</span>
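To show how the field code and its displayed result could be recovered from such markup, here is a minimal Python sketch. It assumes that the displayed result is simply the text content of the parent span, which matches the example above but may not hold for every complex field; the function names are hypothetical.

```python
import xml.dom.minidom

# Sample modeled on the example above, wrapped in a p element.
SAMPLE = r'''<p><span class="role-field">
<?field DATE \@ "MMMM d, yyyy" \* MERGEFORMAT ?>
August 27, 2014
</span></p>'''


def text_of(node):
    """Concatenate the text nodes found under node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def field_codes(xhtml):
    """Return (field code, displayed text) pairs for each <?field ...?>."""
    doc = xml.dom.minidom.parseString(xhtml)
    fields = []

    def walk(node):
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "field"):
            # Assumption: the parent span holds the rendered field result.
            shown = " ".join(text_of(node.parentNode).split())
            fields.append((node.data.strip(), shown))
        for child in node.childNodes:
            walk(child)

    walk(doc)
    return fields


for code, shown in field_codes(SAMPLE):
    print(code, "->", shown)
```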

Smart tags are enclosed between <?begin-smartTag tag?> and <?end-smartTag tag?>. Example:

<?begin-smartTag {urn:schemas-microsoft-com:office:smarttags}PersonName#0?>

<?begin-smartTag {urn:schemas:contacts}GivenName#1?>

Bill

<?end-smartTag {urn:schemas:contacts}GivenName#1?>

<?begin-smartTag {urn:schemas:contacts}Sn#2?>

Gates

<?end-smartTag {urn:schemas:contacts}Sn#2?>

<?end-smartTag {urn:schemas-microsoft-com:office:smarttags}PersonName#0?>
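Because smart tags may be nested, as in the example above, pairing each begin marker with its end marker calls for a stack. The Python sketch below is one possible way to do this, not part of w2x; the function name smart_tags is hypothetical and the sample is the example above placed inside a p element.

```python
import xml.dom.minidom

SAMPLE = (
    "<p>"
    "<?begin-smartTag {urn:schemas-microsoft-com:office:smarttags}PersonName#0?>"
    "<?begin-smartTag {urn:schemas:contacts}GivenName#1?>Bill"
    "<?end-smartTag {urn:schemas:contacts}GivenName#1?> "
    "<?begin-smartTag {urn:schemas:contacts}Sn#2?>Gates"
    "<?end-smartTag {urn:schemas:contacts}Sn#2?>"
    "<?end-smartTag {urn:schemas-microsoft-com:office:smarttags}PersonName#0?>"
    "</p>"
)


def smart_tags(xhtml):
    """Return (tag, enclosed text) pairs, in the order the smart tags end."""
    doc = xml.dom.minidom.parseString(xhtml)
    open_tags = []  # stack of (tag, list of text fragments)
    result = []

    def walk(node):
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            tag = node.data.strip()
            if node.target == "begin-smartTag":
                open_tags.append((tag, []))
            elif node.target == "end-smartTag":
                if not open_tags or open_tags[-1][0] != tag:
                    raise ValueError("unbalanced smart tag: " + tag)
                _, parts = open_tags.pop()
                result.append((tag, " ".join("".join(parts).split())))
        elif node.nodeType == node.TEXT_NODE:
            # Text belongs to every smart tag that is currently open.
            for _, parts in open_tags:
                parts.append(node.data)
        for child in node.childNodes:
            walk(child)

    walk(doc)
    return result


for tag, text in smart_tags(SAMPLE):
    print(tag, "->", text)
```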

Controls are enclosed between <?begin-sdt control_id?> and <?end-sdt control_id?>. Example:

<?begin-sdt comboBox#6?>

<td class="tc-TableGrid--bb tc-TableGrid"

style="padding-bottom: 7.2pt; padding-left: 7.2pt;

padding-right: 7.2pt; padding-top: 7.2pt;">

<p class="tp-TableGrid p-Normal" lang="fr-FR">

<span class="c-PlaceholderText">Choose an item.</span>

</p>

</td>

<?end-sdt comboBox#6?>
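The controls found in a converted document can be inventoried in a similar way. The Python sketch below assumes that a control_id always has the form type#number, which is inferred from the single example above and is not stated by the documentation; the function name list_controls is hypothetical.

```python
import xml.dom.minidom

# Simplified version of the example above, wrapped in a tr element.
SAMPLE = (
    "<tr><?begin-sdt comboBox#6?>"
    '<td><p><span class="c-PlaceholderText">Choose an item.</span></p></td>'
    "<?end-sdt comboBox#6?></tr>"
)


def list_controls(xhtml):
    """Return (control type, control number) pairs for each <?begin-sdt ...?>."""
    doc = xml.dom.minidom.parseString(xhtml)
    controls = []

    def walk(node):
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "begin-sdt"):
            # Assumption: the id looks like "comboBox#6" (type, "#", number).
            kind, _, number = node.data.strip().partition("#")
            controls.append((kind, number))
        for child in node.childNodes:
            walk(child)

    walk(doc)
    return controls


print(list_controls(SAMPLE))  # prints [('comboBox', '6')]
```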