
# Change history

## 1.7 (April 02, 2019)

Enhancements:

• Command-line utility w2x and desktop application w2x-app now support plugins.

A plugin is simply a text file having a ".w2x_plugin" suffix, containing a number of w2x command-line arguments and starting with comment lines containing information about the plugin (for example, its name). Example, rss.w2x_plugin:

### plugin.name: rss
### plugin.outputExtension: xml
### plugin.multiFileOutput: no

-c
-e w2x:xed/main.xed

# Image files not useful here.
-step:com.xmlmind.w2x.processor.DeleteFilesStep:cleanUp
-p cleanUp.files "%{~pO}/%{~nO}_files"

This plugin converts DOCX to RSS. This process is partly implemented by XSLT 1.0 stylesheet rss.xslt which is part of this plugin. Stylesheet rss.xslt transforms its input, the semantic XHTML 1.0 Transitional file created by the Edit step (-e w2x:xed/main.xed), to RSS.

Besides XSLT 1.0 stylesheets, a plugin may also include XED scripts as well as ".jar" files containing custom conversion steps implemented in Java™.
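To make the file layout described above concrete, here is a minimal Python sketch that separates the "### plugin.*" header lines from the command-line arguments. It is not the parser used by w2x: the function name `parse_plugin` is hypothetical, and treating every other line starting with "#" as a comment is an assumption based on the rss.w2x_plugin example.

```python
import re

# Header lines look like "### plugin.name: rss" in the example above.
HEADER = re.compile(r"^###\s*plugin\.([\w-]+)\s*:\s*(.*)$")

def parse_plugin(text):
    """Split plugin text into (metadata dict, list of argument lines)."""
    meta, args = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            meta[match.group(1)] = match.group(2).strip()
        elif line.startswith("#"):
            continue  # assumption: any other "#" line is a plain comment
        elif line:
            args.append(line)
    return meta, args

SAMPLE = """### plugin.name: rss
### plugin.outputExtension: xml
### plugin.multiFileOutput: no

-c
-e w2x:xed/main.xed

# Image files not useful here.
-step:com.xmlmind.w2x.processor.DeleteFilesStep:cleanUp
-p cleanUp.files "%{~pO}/%{~nO}_files"
"""

meta, args = parse_plugin(SAMPLE)
```

Here `meta` holds the three plugin properties and `args` holds the four argument lines, in file order.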

A plugin is registered with w2x and w2x-app by copying all its files anywhere inside directory w2x_install_dir/plugin/. However, it's strongly recommended to group all the files comprising a plugin in a subdirectory of its own having the same name as the plugin (e.g. w2x_install_dir/plugin/rss/). Alternatively, a plugin may be installed anywhere you want provided that the directory containing the ".w2x_plugin" file is referenced in the W2X_PLUGIN_PATH environment variable. Example: set W2X_PLUGIN_PATH=C:\Users\John\w2x\rss;C:\temp\w2x_plugins.

Once registered with w2x, a plugin may be invoked as if it were a stock conversion, for example:

w2x -o rss my.docx my.xml

In w2x-app, you'll find the registered plugins in the "Convert to" combobox and in the "Output format" screen of the setup assistant.

• When a DOCX file contains revision info (i.e. "Track Changes"), w2x implements its own, automatic, very crude, interpretation of "Accept All Changes". For this reason, a warning is now issued advising the user to accept or reject the tracked changes manually in MS-Word before submitting the DOCX file to w2x.
• Upgraded XMLmind Web Help Compiler (whc for short) to version 2.1.3_04.
• XMLmind Word To XML, which passed all non-regression tests, is now officially supported on Java™ 12 platforms.

Bug fixes:

• w2x-app: a user converting a DOCX file to a multi-file output format (e.g. Web Help), choosing the default output directory (which is the directory containing the input DOCX file) and choosing by mistake to make the output directory empty before proceeding with the conversion ended up deleting the input DOCX file.
• The online help browser of w2x-app displayed a blank window when the computer running w2x-app was not connected to the Internet.

## 1.6 (December 21, 2018)

Enhancements:

• Index entries marking a page range (e.g. field XE "XML" \r "OpenXMLPageRange") in the DOCX document are now supported when generating DITA and DocBook documents.
• Upgraded XMLmind Web Help Compiler (whc for short) to version 2.1.3_02.
• XMLmind Word To XML, which passed all non-regression tests, is now officially supported on Java™ 11 platforms.
• All programs which are part of XMLmind Word To XML are now officially supported on macOS Mojave (version 10.14).

## 1.5.1 (August 31, 2018)

Enhancements:

• A new —very low-level— parameter -p edit.ids.automatic-ids regex_pattern lets the user specify which bookmarks automatically generated by MS-Word ("_GoBack", "_Toc123", etc) are to be preserved by the conversion process, and this, even when such bookmarks are not referenced anywhere in the generated document.
• Slightly improved the way DOCX metadata (e.g. author, publisher, etc) are translated to DITA.
• A warning is now reported when processing a DOCX index entry marking a page range (e.g. field XE "XML" \r "OpenXMLPageRange"). This kind of index entry is currently not supported by XMLmind Word To XML. Only simple index entries marking a single location in the DOCX document are currently supported.
• Upgraded XMLmind Web Help Compiler (whc for short) to version 2.1.3_01.

Bug fixes:

• Setting the paragraph indentation of stock MS-Word style "List Paragraph" to a number of Character Units (ch) caused XMLmind Word To XML to generate incorrect lists.

Note that we are not 100% sure that this bug is really fixed now. Unfortunately, the behavior of MS-Word when it comes to processing w:ind/@w:xxxChars attributes is not completely documented in the "ECMA-376, Office Open XML File Formats" specification.

• Defining a custom style named "Normal +" caused XMLmind Word To XML to enter an endless loop.
• XMLmind Word To XML always created DITA fn (footnote) elements having an id attribute. The bug is that these fn elements were never referenced by an <xref type="fn"> (footnote call). The consequence was that these footnotes were automatically discarded when converting to HTML, PDF, DOCX, etc, the DITA files created by w2x.

## 1.5 (April 25, 2018)

Enhancements:

• When generating semantic XHTML, the following inline element names are no longer “hard-wired”: sub, sup, small, big, s, u, tt, b, i. Alternate element names may be specified using the following parameters: inlines.sub-element, inlines.sup-element, inlines.small-element, inlines.big-element, inlines.s-element, inlines.u-element, inlines.tt-element, inlines.b-element, inlines.i-element. Example 1: generate code rather than tt elements: -p edit.inlines.tt-element "code". Example 2: do not generate small elements: -p edit.inlines.small-element "span style='font-size:x-small'" (notice how one or more attributes may be specified too).

This facility is useful only when generating semantic XHTML and all formats based on semantic XHTML. Using it when generating DITA or DocBook may give poor results.

• When generating semantic XML of any kind, all the XHTML meta elements but author, description, dcterms.* are automatically suppressed from the semantic XHTML 1.0 Transitional document generated by the Edit step and used as an input by the Transform step.

If you want to keep some or all the meta elements in this intermediate semantic XHTML 1.0 Transitional document, you may now specify -p edit.metas.keep regexp_matching_meta_name. Examples: -p edit.metas.keep '.*' keeps all metas; -p edit.metas.keep '^dc.' keeps all metas having a name starting with "dc." (e.g. <meta name="dc.subject" content="..." />).
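The effect of such a pattern can be tried out with a short Python sketch. This is only an illustration: w2x is a Java program, the helper name `kept_metas` is hypothetical, and it is an assumption that the pattern is searched for in the meta name rather than required to match it entirely (the '^dc.' example suggests so). Note that an unescaped "." matches any character.

```python
import re

def kept_metas(names, pattern):
    """Return the meta names in which the pattern is found."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]

names = ["author", "dc.subject", "dcterms.created", "generator"]

keep_all = kept_metas(names, r".*")       # every name
keep_loose = kept_metas(names, r"^dc.")   # "." matches any character
keep_strict = kept_metas(names, r"^dc\.") # literal dot only
```

With this sample list, the unescaped pattern also selects "dcterms.created", while the escaped one selects only "dc.subject".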

• XMLmind Word To XML can now generate DITA indexterm elements having index-sort-as children and DocBook indexterm/primary, secondary, tertiary elements having sortas attributes. For this to happen, the input DOCX file must contain XE (index entry) fields having \y "yomi" (first phonetic character for sorting indexes) field arguments.

Unlike MS-Word which considers \y "yomi" only for East Asian languages, w2x uses this XE field argument to sort the index entries whatever the language of the document. English examples: {XE "<span>" \y "span"}, {XE "Operation:+" \y ":Addition"}.

• XMLmind Word To XML, which passed all non-regression tests, is now officially supported on Java™ 10 platforms.
• Upgraded XMLmind Web Help Compiler (whc for short) to version 2.1.3.

Bug fixes:

• The language of DOCX files written in an East Asian language is not correctly detected.

Unfortunately, this will always be the case because w2x never examines the characters actually contained in a text span having <w:lang w:eastAsia="ja-JP" w:val="en-US"/> to determine whether this text span is written in ja-JP, in en-US, or in a mix of both languages.

However, a partial workaround for this limitation is to specify for example -p convert.set-lang ja-JP or -p convert.default-lang ja-JP. When parameter convert.set-lang or parameter convert.default-lang is set to a language code starting with ja, zh or ko, then it is attribute w:lang/@w:eastAsia which is used to determine the language of a text span and not attribute w:lang/@w:val.

Note that -p convert.default-lang ja-JP is just used as a hint to favor attribute w:lang/@w:eastAsia over attribute w:lang/@w:val. Given the way MS-Word sets these two attributes, using parameter -p convert.default-lang ja-JP will not cause a vastly incorrect detection of the language when converting a German DOCX file for example.
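The selection rule described above can be summarized by the following Python sketch. It is a simplified model, not w2x code: the function name `span_language` is hypothetical, and falling back to w:val when w:eastAsia is absent is an assumption.

```python
EAST_ASIAN_PREFIXES = ("ja", "zh", "ko")

def span_language(w_val, w_east_asia, hint=None):
    """Pick the language of a text span from its w:lang attributes.

    hint stands for the value of convert.set-lang or convert.default-lang.
    """
    prefers_east_asia = hint is not None and hint.lower().startswith(EAST_ASIAN_PREFIXES)
    if prefers_east_asia and w_east_asia:
        return w_east_asia
    return w_val  # assumption: w:val is used in every other case

# <w:lang w:eastAsia="ja-JP" w:val="en-US"/> without and with the hint.
without_hint = span_language("en-US", "ja-JP")
with_hint = span_language("en-US", "ja-JP", hint="ja-JP")
# A German span typically has no usable w:eastAsia value.
german = span_language("de-DE", None, hint="ja-JP")
```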

• The Convert step sometimes generated XHTML elements having attribute lang="x-NONE". Value "x-NONE" is invalid.

## 1.4.0_01 (February 24, 2018)

• Minor internal changes needed to make XMLmind Word To XML code compatible with XMLmind XML Editor v8.

## 1.4 (December 18, 2017)

• All semantic XHTML formats and all formats based on semantic XHTML (EPUB, Web Help, frameset): paragraphs having p-Title and p-Subtitle styles (to make it simple; see parameters edit.title.title-style-names and edit.title.subtitle-style-names) are now converted to equivalent semantic XHTML elements.

In the previous versions of XMLmind Word To XML, such titles were converted only to head/title and to head/meta name="description" which made them invisible to the user (though usable by programs such as the XSLT stylesheets generating DITA or DocBook).

This feature can be controlled by specifying the following new parameters:

• edit.title.keep-title. Default value when generating semantic XHTML: "yes". Default value when generating DITA and DocBook: "no".
• edit.title.title-container. Default value: "h1 class='role-document-title'". Ignored when edit.title.keep-title is "no".
• edit.title.subtitle-container. Default value: "p class='role-document-subtitle'". Ignored when edit.title.keep-title is "no".
• EPUB formats: added parameter epub.omit-toc-root (default value: "no"). Web Help formats: added parameter webhelp.omit-toc-root (default value: "no").

By default, the Table of Contents (TOC) generated for an EPUB or Web Help document has a single “root”. This single root always points to the page containing the title, subtitle, author, etc, of the document. Setting this parameter to "yes" prevents the generated TOC from having such a single root.

• XMLmind Word To XML, which passed all non-regression tests, is now officially supported on Java™ 9 platforms.
• Upgraded XMLmind Web Help Compiler (whc for short) to version 2.1. The new compiler uses window.sessionStorage rather than cookies to store the internal state of the Web Helps it generates.
• Changed the technology used to implement the context-sensitive online help from obsolete JavaHelp to a dedicated, embedded Web browser displaying Web Help.

If the Java™ runtime used to run w2x-app is older than version 1.8.0_40, the system Web browser rather than the dedicated, embedded Web browser is used to display the Web Help, which is much less convenient for the user.

## 1.3 (November 08, 2017)

Please do not use new Java 9 to run the programs which are part of XMLmind Word To XML. XMLmind Word To XML has not yet been tested against this version of Java.

Enhancements:

• Upgraded XMLmind Web Help Compiler (whc for short) to version 2.0, which supports 2 layouts for the generated Web Help: classic, the default layout and simple, a new layout. When generating Web Help, pass w2x option -p webhelp.wh-layout simple to give it a try.
• Setup assistant of w2x-app:
• Added a "Layout of the generated Web Help" combobox to the "Output format options" screen when the chosen output format is Web Help. This combobox makes it easy to choose between the classic and simple layouts.
• The dialog box used to add or modify an entry of the MS-Word style to XML element map now displays the localized name of a style (e.g. "Definition Char") next to the w2x name of this style (e.g. "c-DefinitionChar"). This is really needed when you give for example Japanese names to your custom MS-Word styles.
• Parameter edit.remove-styles.preserved-classes now accepts class patterns as well as class names. For example, specify -p edit.remove-styles.preserved-classes "^(t|(tr)|(tc)|(tp)|p|(pn)|n|c)-.+$" if you want to preserve in the semantic XHTML the class names corresponding to all the CSS styles generated during the Convert step.
• Hidden text runs (<w:vanish/>) are now converted to <span style="display:none">. When generating semantic XML, these invisible span elements are then discarded.
• “Word To XML” servlet: added an optional params servlet parameter which makes it possible to augment or to override some of the options of the conversion specified by the conv servlet parameter. Example:

curl -s -S -o manual.epub \
-F "docx=@manual.docx;type=application/vnd.openxmlformats-officedocument.wordprocessingml.document" \
-F "conv=epub" \
-F "params=-p epub.identifier urn:x-mlmind:w2x:manual -p epub.split-before-level 8" \
http://localhost:8080/w2x/convert

• XMLmind Word To XML is now available as a macOS X native .dmg distribution including a private Java™ 1.8.0_152 runtime.
• All programs which are part of XMLmind Word To XML are now officially supported on macOS High Sierra (version 10.13).

Bug fixes:

• When a table was inserted inside a sequence of paragraphs having the same border, the conversion to styled XHTML (and to all output formats based on styled XHTML, like EPUB) failed with the following error message: error in action "group": missing attribute "g:container" for element .../html:p[NN].
• When generating semantic XHTML, for some rare cases, class name role-bridgeheadI was added to li elements.
• Field codes like "XE" (index entry) were not normalized to upper-case. For example, this bug could cause some index entries to be missing in the generated semantic XML.
• It was not possible to use built-in image converter factory com.xmlmind.w2x_ext.emf2png.EMF2PNG to convert WMF to PNG despite the fact that this factory supports the WMF format in addition to the EMF format.
• Marking as being deleted all the text contained in a DOCX table caused w2x to generate an invalid XHTML table having no cells at all.
• w2x generated invalid DITA when a table or figure caption contained index terms.

## 1.2.3 (June 20, 2017)

Enhancements:

• Converting a DOCX file to Web Help now automatically creates an index.html file (if an index.html file does not already exist). This feature is controlled using parameter webhelp.add-index. The default value of this parameter is yes.
• Added option -liststeps to the w2x command-line utility. When this option is specified, w2x lists all the conversion steps to be executed and then exits. This option is useful to determine how to customize the conversion steps. Example:

$ w2x -o bookmap -liststeps
-step:com.xmlmind.w2x.processor.ConvertStep:convert
-p convert.create-mathml-object no
-p convert.set-column-number yes
-step:com.xmlmind.w2x.processor.EditStep:edit
-p edit.xed-url-or-file file:/opt/w2x/xed/main.xed
-step:com.xmlmind.w2x.processor.TransformStep:transform
-p transform.out-file %{~pnO}.dita
-p transform.single-topic no
-p transform.xslt-url-or-file file:/opt/w2x/xslt/topic.xslt
-step:com.xmlmind.w2x.processor.TransformStep:transform2
-p transform2.xslt-url-or-file file:/opt/w2x/xslt/bookmap.xslt
-p transform2.topic-type %{transform.topic-type}
-p transform2.output-path %{~po}
-step:com.xmlmind.w2x.processor.DeleteFilesStep:cleanUp
-p cleanUp.files %{~pnO}.dita
• Added parameter convert.resource-prefix which is useful when used in conjunction with convert.resource-directory and when several files generated by w2x share the same resource directory.
• Upgraded XMLmind Web Help Compiler (whc for short) to version 1.4.4, which contains an important bug fix.

Bug fixes:

• Specifying parameter convert.resource-directory as "." in order to create all image files in the same directory as the other automatically generated files did not work.
• The resource directory was always made empty if it already existed. This behavior was very dangerous and could delete important user files (dangerous, harmful, example: w2x doc.docx /home/john/doc.html). Now the resource directory is made empty if and only if it's the “automatic” output_file_basename_files/ folder, which is at the same time safe and convenient.
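The new rule can be pictured with the following Python sketch. It is not the w2x implementation: the function name `may_empty_resource_dir` is hypothetical, and locating the “automatic” folder next to the output file is an assumption.

```python
from pathlib import PurePosixPath

def may_empty_resource_dir(resource_dir, output_file):
    """True only for the automatic output_file_basename_files folder."""
    output_file = PurePosixPath(output_file)
    # e.g. /home/john/doc.html -> /home/john/doc_files
    automatic = output_file.with_name(output_file.stem + "_files")
    return PurePosixPath(resource_dir) == automatic

safe = may_empty_resource_dir("/home/john/doc_files", "/home/john/doc.html")
unsafe = may_empty_resource_dir("/home/john", "/home/john/doc.html")
```

Under this rule, a user directory such as /home/john is left untouched even when it is used as the resource directory.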

Other changes:

• Changed "Licensor" from "Pixware SARL" to "XMLmind Software" in all licenses.

## 1.2.2 (April 14, 2017)

Enhancements:

• Added parameter edit.ids.generate-section-ids. Setting this parameter to yes (default value is no) ensures that all the sections found in the semantic XHTML resulting from the conversion of a DOCX file have a unique ID.

When this ID is missing, it is computed using the content of the h1, h2, ..., h6 heading which is the first child of the section. Example:

<div class="role-section2" id="Title_of_this_section">
<h2>Title of this section</h2>
...

The maximum length of the automatically computed ID may be specified using parameter edit.ids.section-id-max-length. The default value of this parameter is 32.

Setting edit.ids.generate-section-ids to yes is especially useful when converting a DOCX file to a DITA map or bookmap. With this parameter, the filenames of the topics referenced by the generated map are guaranteed to have meaningful values (e.g. "Introduction.dita" rather than "d0e35.dita").
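As a rough illustration of how such IDs might be derived, here is a Python sketch. The exact algorithm used by w2x is not described here, so everything below is an assumption: the function name `section_id`, the replacement of non-word characters by underscores, and the numeric suffix used to keep IDs unique.

```python
import re

def section_id(heading_text, used_ids, max_length=32):
    """Derive a unique ID from a heading, in the spirit of the example above."""
    # Replace each run of non-word characters by "_", then truncate.
    base = re.sub(r"\W+", "_", heading_text.strip()).strip("_")[:max_length]
    base = base or "section"
    candidate, counter = base, 2
    while candidate in used_ids:
        # Assumption: duplicates get a numeric suffix (may exceed max_length).
        candidate = f"{base}_{counter}"
        counter += 1
    used_ids.add(candidate)
    return candidate

used = set()
first = section_id("Title of this section", used)
second = section_id("Title of this section", used)
long_id = section_id("A very long heading that goes on and on and on", used)
```

The first call reproduces the ID shown in the example above; the second call shows why a uniqueness check is needed when two sections share the same title.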

• Added XSLT parameter shortdesc-class-name to w2x_install_dir/xslt/topic.xslt, the XSLT stylesheet which is used to convert the intermediate semantic XHTML document to a DITA topic.

This parameter is used to specify the class name of the XHTML <p> which acts as a short description of the section. Examples: -p transform.shortdesc-class-name p-Shortdesc, -p transform.shortdesc-class-name p-Abstract.

When this parameter is not specified (or is specified as the empty string which is its default value), the following style mapping, created by the w2x-app wizard:

-p edit.blocks.convert "p-Shortdesc p class='p-Shortdesc'"
...
<xsl:template match="h:p[@class='p-Shortdesc']">
<shortdesc>
<xsl:call-template name="processCommonAttributes"/>
<xsl:apply-templates/>
</shortdesc>
</xsl:template>

causes DITA <shortdesc> elements to be generated inside topic bodies, which is invalid.

After specifying -p transform.shortdesc-class-name p-Shortdesc, this issue is fixed and DITA <shortdesc> elements are generated before topic bodies.

• Added an "Other parameters" screen to the w2x-app wizard. This new screen lets the user specify parameters which are not supported by the "Output format options" and "MS-Word style to XML element map" screens. For example, when generating a DITA document, the other screens do not let the user specify -p transform.pre-element-name codeblock (default value being pre).
• Upgraded XMLmind Web Help Compiler (whc for short) to version 1.4.2_03.

Bug fixes:

• For some DOCX paragraphs, significant whitespace was removed by XMLmind Word To XML. This gave incorrect results when these DOCX paragraphs were converted to DocBook programlisting, DITA pre, XHTML pre, etc.
• In the source DOCX file, fields having an empty code (that is, somewhat abnormal fields) caused XMLmind Word To XML to raise a StringIndexOutOfBoundsException.
• When generating semantic XHTML of any kind with parameter edit.convert-tabs.to-table set to no (the default value), attribute class="role-tabs-XXX" and elements <span class="role-tab"> were not discarded.

Not only is this markup not useful, but it also prevented some style mappings created by the w2x-app wizard from working. Example, the following style mapping of MS-Word paragraph style Note to a DITA element <note>:

-p edit.blocks.convert "p-Note p class='p-Note'"
...
<xsl:template match="h:p[@class='p-Note']">
<note>
<xsl:call-template name="processCommonAttributes"/>
<xsl:apply-templates/>
</note>
</xsl:template>

failed for the following paragraph (intermediate semantic XHTML preceding the transformation to DITA):

<p class="role-tabs-35.45-0-117 p-Note">Note:
<span class="role-tab"> </span>Body of the note here.</p>
• In rare cases, foot/end notes were numbered starting from 2 and not starting from 1 as expected.

Incompatibilities:

• w2x_all.jar, the self-contained JAR file, is no longer used by the following scripts: bin/w2x, w2x.bat, w2x-app, w2x-app-c.bat. Using it prevented advanced users from easily modifying the scripts found in subdirectories xed/ and xslt/. This self-contained JAR file is still available but its use should be reserved to embedding w2x in a third-party application.

## 1.2.1 (November 24, 2016)

Enhancements:

• Conversion of images found in the DOCX file (TIFF, WMF, EMF, etc) to standard formats (SVG, PNG, JPEG) may now be controlled using environment variable (or Java™ property) W2X_IMAGE_CONVERSIONS. The default value of this variable is (all specifications on a single line):
.wmf.svg java:com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory;
.tiff.png java:com.xmlmind.w2x.docx.image.ImageConverterFactoryImpl

On Windows, the default value of W2X_IMAGE_CONVERSIONS is (all specifications on a single line):

.wmf.svg java:com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory;
.emf.png java:com.xmlmind.w2x_ext.emf2png.EMF2PNG resolution 0;
.tiff.png java:com.xmlmind.w2x.docx.image.ImageConverterFactoryImpl
• Added two new image converters:
External image converter

This image converter executes an external program to perform the conversion.

Examples of W2X_IMAGE_CONVERSIONS specifications (see above): convert EMF to SVG using OpenOffice/LibreOffice:

.emf.svg soffice --headless --convert-to svg --outdir %~po %i

Convert EMF/WMF to PNG using ImageMagick:

.emf.png.wmf.png magick convert -density 288 "%I" -scale 25% "%O"

com.xmlmind.w2x_ext.emf2png.EMF2PNG

This image converter is available only on Windows. It leverages Windows own GDI+ to convert EMF (in fact, Windows metafiles of any kind, including WMF) to PNG.

This is not that great because, unlike com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory which converts WMF (Windows vector graphics format) to SVG (standard vector graphics format), EMF2PNG converts a vector graphics format to a raster image format. However, having EMF2PNG is better than nothing at all.

• Upgraded XMLmind Web Help Compiler (whc for short) to version 1.4.2, which leverages jQuery v3.1.1 and jQuery UI v1.12.1. This implies that the Web Help generated by w2x no longer supports Internet Explorer 8 and older versions.

Bug fixes:

• Images which were used to statically render objects embedded in the DOCX file (e.g. a PowerPoint slide) were ignored.

## 1.2 (August 01, 2016)

Enhancements:

• Desktop application w2x-app now has a setup assistant (AKA “wizard” style dialog box) making it quick and easy to create w2x option files. This new setup assistant has a screen which may be used to map MS-Word character and paragraph styles (e.g. p-CodeSample) to XML elements possibly having attributes (e.g. DITA pre outputclass="code-sample").
• New “semantic” output formats:
• Multi-page semantic XHTML 1.0 Strict (-o frameset_strict), XHTML 1.0 Transitional (-o frameset_loose), XHTML 1.1 (-o frameset1_1), XHTML 5 (-o frameset5).
• Web Help containing semantic XHTML 1.0 Strict (-o webhelp_strict), XHTML 1.0 Transitional (-o webhelp_loose), XHTML 1.1 (-o webhelp1_1), XHTML 5 (-o webhelp5).
• EPUB 2 containing semantic XHTML 1.1 (-o epub1_1).
• MS-Word math (that is, OpenXML math) is now automatically converted to MathML. However not all output formats may embed MathML. By default, MathML elements are added only to documents having the following formats: XHTML 5, EPUB (through the use of <ops:switch>), DITA and DocBook 5. When targeting any other format, XMLmind Word To XML generates external files containing MathML then adds elements pointing to these external ".mml" files. XHTML 1 example: <object data="doc_files/math-010.mml" type="application/mathml+xml"/>.

The parameters related to MathML support are: convert.create-mathml-object, edit.finish-styles.mathjax (MathJax support).

• Added a useful variant of parameter edit.blocks.convert called edit.blocks.convert-to-pre. This new parameter is best explained by comparing it to edit.blocks.convert.

When using MS-Word, there are two ways to represent code samples:

1. Use a sequence of paragraphs having the same style. Each paragraph contains one line of the code sample. Let's call the style of these paragraphs Code1.
2. Use a single paragraph containing the whole code sample, which means that this single paragraph contains significant whitespace and line breaks. Let's call the style of this paragraph Code2.

A sequence of Code1 paragraphs may be converted to an XHTML pre using:

-p edit.blocks.convert "p-Code1 span g:id='pre' g:container='pre'"

A Code2 paragraph may be converted to an XHTML pre using:

-p edit.blocks.convert-to-pre "p-Code2 pre"
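
The difference between the two representations can be modeled with a short Python sketch. It only mimics the grouping idea on a toy list of (class, text) pairs and says nothing about how the XED scripts of w2x actually work; the function name `merge_code_paragraphs` is hypothetical.

```python
from itertools import groupby

def merge_code_paragraphs(paragraphs, code_class="p-Code1"):
    """Merge each run of consecutive code paragraphs into a single pre."""
    result = []
    for cls, run in groupby(paragraphs, key=lambda item: item[0]):
        texts = [text for _, text in run]
        if cls == code_class:
            # One paragraph per line of code: join the lines into one block.
            result.append(("pre", "\n".join(texts)))
        else:
            result.extend((cls, text) for text in texts)
    return result

doc = [
    ("p-Normal", "Example:"),
    ("p-Code1", "int main() {"),
    ("p-Code1", "    return 0;"),
    ("p-Code1", "}"),
    ("p-Normal", "End of example."),
]
merged = merge_code_paragraphs(doc)
```

A Code2 paragraph needs no such grouping since it already contains the whole sample, line breaks included.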
• New parameter transform.pre-element-name may be used to specify to which DocBook or DITA element, an HTML pre element is to be converted. The default value of transform.pre-element-name is pre when generating DITA and literallayout when generating DocBook.
• When converting a DOCX file to semantic XHTML, new parameter remove-styles.preserved-classes may be used to preserve some of the classes (e.g. c-Code, p-Note, etc) used to style the elements found in the intermediate, automatically generated, styled XHTML document.

Moreover specifying both parameters prune.preserve and remove-styles.preserved-classes is currently the only way to keep in the generated semantic XHTML empty paragraphs having a given MS-Word style. For example, specifying -p prune.preserve p-PlaceHolder and -p remove-styles.preserved-classes p-PlaceHolder may be used to keep in the semantic XHTML output all empty paragraphs having the p-PlaceHolder style.

• The conversion to DITA may now generate some DITA 1.3 elements and attributes, for example: equation-block, equation-inline, mathml, line-through, entry/@rotate.

Bug fixes:

• DOCX to styled HTML: fixed a couple of bugs related to numbering.
• In some cases, option transform.generate-xref-text=yes (the default value) generated "???" (e.g. "See example ???.") rather than useful hyperlink text like "above" or "below" (e.g. "See example below.").
• Specifying parameters split.use-id-as-filename=true and webhelp.use-id-as-filename=true caused w2x to generate files having incorrect names when the input DOCX had duplicate bookmarks or when it had bookmarks containing the '.' character.
• In some cases, changing the style of the footnote number automatically created by MS-Word caused w2x to raise a NullPointerException.

## 1.1 (March 15, 2016)

It's now possible to convert a DOCX document to the following styled HTML formats (that is, XHTML+CSS):

Files generated this way look like the source DOCX document. Previously the only way to generate Web Help or EPUB was to first convert the source DOCX document to DITA or DocBook (semantic XML) and then to convert the intermediate DITA or DocBook files to Web Help or EPUB using external tools such as DITA Open Toolkit, XMLmind DITA Converter, DocBook XSL stylesheets. However, in such a case, the generated Web Help or EPUB does not look like the source DOCX document.

Note that a frameset is automatically generated along with the multi-page styled HTML pages. While an obsolete HTML feature, a frameset makes it easy to browse these HTML pages. Moreover the table of contents used as the left frame is a convenient way to programmatically list all the generated HTML pages. Example: excerpts from w2x_install_dir/doc/manual/manual-TOC.html:

...
<body>
<p class="toc-entry-0"><a href="manual-0.html" target="contentFrame">XMLmind Word To XML Manual</a></p>
<p class="toc-entry-1"><a href="manual-1.html" target="contentFrame">Contents</a></p>
<p class="toc-entry-1"><a href="intro.html" target="contentFrame">1 Introduction</a></p>
<p class="toc-entry-1"><a href="install.html" target="contentFrame">2 Installing w2x</a></p>
<p class="toc-entry-2"><a href="distribution.html" target="contentFrame">2.1 Contents of
the installation directory</a></p>
...

How does this work?

In order to generate these 3 new formats, we need to automatically split the source DOCX document into parts. A new part is created each time a paragraph having an outline level less than or equal to specified split-before-level parameter is found in the source. An outline level is an integer between 0 (e.g. style Heading 1) and 8 (e.g. style Heading 9). The default value of parameter split-before-level is 0, which means: for each Heading 1, create a new page starting with this Heading 1.

Example: for each Heading 1 and Heading 2, create a new page (out/manual-1.html, out/manual-2.html, ..., out/manual-N.html) starting with this Heading 1 or Heading 2:

w2x -p split.split-before-level 1 -o frameset manual.docx out/manual.html
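The splitting rule can be modeled with the following Python sketch, in which a document is reduced to a list of (outline level, text) pairs. This is a simplified model, not w2x code: the function name `split_into_parts` is hypothetical, body paragraphs are given the level None, and putting any content found before the first heading into the first part is an assumption.

```python
def split_into_parts(paragraphs, split_before_level=0):
    """Start a new part before each paragraph whose outline level is
    less than or equal to split_before_level."""
    parts = []
    for level, text in paragraphs:
        starts_part = level is not None and level <= split_before_level
        if starts_part or not parts:
            parts.append([])
        parts[-1].append(text)
    return parts

doc = [
    (0, "1 Introduction"),             # Heading 1
    (None, "Some text."),
    (1, "1.1 Scope"),                  # Heading 2
    (None, "More text."),
    (0, "2 Installing"),               # Heading 1
]
by_heading1 = split_into_parts(doc)                           # default: 0
by_heading2 = split_into_parts(doc, split_before_level=1)
```

With the default value, the sample yields two parts; with split-before-level set to 1, "1.1 Scope" starts a part of its own and there are three.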

Other enhancements:

• When a DOCX document is converted to styled HTML of any kind (as opposed to semantic XML), the generated processing instructions are now automatically removed and all the footnotes and endnotes are now automatically given a number. If you don't want this to happen, pass parameters -p edit.do.remove-pis "" and -p edit.do.number-footnotes "" to w2x.
• New parameter -p edit.finish-styles.custom-styles-url-or-file CSS_URL_OR_FILE makes it easy to customize the CSS styles used by the generated styled HTML pages. The custom CSS styles found in file CSS_URL_OR_FILE are simply appended to the automatically generated CSS styles.
• New parameter -p convert.lower-case-resource-names yes (default value: no) is needed to keep epubcheck quiet on platforms where filenames are case-sensitive (e.g. Linux). Not for general use.

Bug fixes:

• w2x-app: added a workaround for an Apple Java bug which caused any scrolled window to become garbled when scrolling quickly. This bug seems to be specific to Apple Java and to non-Retina Macs running El Capitan.

## 1.0.0_01 (December 4, 2015)

Bug fix: a span class=role-tabs having a negative X coordinate caused expand-tabs.js to loop forever.

## 1.0.0 (November 17, 2015)

First version of the commercial product.

Enhancements:

• Text runs aligned on tab stops are now processed as follows:
• When generating XHTML+CSS, some JavaScript™ code is added to the output file. This code computes and gives a width to all <span class="role-tab">. This makes it possible to decently emulate tab stops in any modern Web browser.

If you don't want this code to be added to the output file, pass option -p edit.do.expand-tabs "" to w2x.

• When generating semantic XHTML and all the other semantic XML formats (DocBook, DITA, etc), it's now possible to convert consecutive paragraphs containing text runs aligned on tab stops to a borderless table.

However because, in the general case, it's not possible to emulate tab stops using tables, this XED script is disabled by default. If you really want to emulate tab stops using tables, pass option -p edit.convert-tabs.to-table yes to w2x.

Note that the alignment of a tab stop (right, center, etc) is ignored. That is, the text run is always considered to be left aligned.

• DOCX files using the "Strict Open XML Document" format are now supported. DOCX files using this format conform to the Strict profile of the Open XML standard (ISO/IEC 29500). This profile of Open XML doesn't allow a set of features that are designed specifically for backward-compatibility with existing binary documents, as specified in Part 4 of ISO/IEC 29500.
• Tested XMLmind Word To XML against the DOCX files created using MS-Word 2016.
• Desktop application w2x-app now works fine on computers having very high resolution (HiDPI) screens. For example, it now works fine on a Mac having a Retina® screen and a Windows computer having a UHD (“4K”) screen. On Windows, all DPI scale factors (100%, 125%, 150%, 200%, etc) are supported.

On a Linux computer having a HiDPI screen, HiDPI is not automatically detected. You'll have to specify the display scaling factor you prefer using the -putpref command-line option. Example: w2x-app -putpref displayScaling 200.

## 1.0.0-beta04 (September 8, 2015)

Enhancements:

• The “Word To XML” servlet now provides the user with minimal work-in-progress feedback during the execution of a lengthy conversion.

Bug fixes:

• Added more DOCX files coming from different origins to the test suite of XMLmind Word To XML. Had to slightly modify the software to cope with some specificities of these DOCX files.
• XMLmind Word To XML add-on for XMLmind XML Editor: a user preferring to use the native file chooser on Windows or on the Mac forced the add-on to also use the native file chooser. Using the native file chooser in the context of the add-on is not convenient as this prevents the file filters specified by the add-on (DOCX, TXT, XML, DITA, etc) from working.

## 1.0.0-beta03 (July 13, 2015)

New “Word To XML” servlet is a Java™ Servlet (server-side standard component) which has the same functions as the w2x-app desktop application.

The “Word To XML” servlet comes in a software distribution of its own: w2x_servet-1_0_0_beta03.zip. This distribution contains a ready-to-deploy binary w2x.war, as well as the full Java™ source code of the servlet.

## 1.0.0-beta02 (May 6, 2015)

• New graphical application w2x-app should be easier to use than the w2x command-line utility.
• New application w2x-app is also available as an add-on for XMLmind XML Editor. This add-on adds an "Import DOCX" item to the File menu. The "Import DOCX" menu item displays a non modal dialog box almost identical to w2x-app. XML output files created using the "Import DOCX" dialog box are automatically opened in XMLmind XML Editor.

This add-on is compatible with XMLmind XML Editor v6.3+. In order to install it, please follow the instructions found in XMLmind Word To XML Manual, Installing the "Word To XML" add-on.

• Added parameter edit.headings.convert which makes it easy to convert paragraphs not having an outline level property to h1, h2, ..., h6 headings.

## 1.0.0-beta01 (March 30, 2015)

First public release.

© 2017-2019 XMLmind Software. Updated on 2019/5/25.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Acrobat and PostScript are trademarks of Adobe Systems Incorporated.