<?xml version="1.0" encoding="UTF-8"?><book xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="en-US"><info><title>XMLmind Word To XML Manual</title><author><personname>Hussein Shafie</personname></author><date remap="dcterms.created">2024-02-12T08:46:00Z</date><date remap="dcterms.modified">2025-11-11T09:29:00Z</date><publishername>XMLmind Software</publishername><abstract><para>Explains how to install and use XMLmind Word To XML (w2x for short), how to customize the output of w2x and how to embed a w2x processor in a Java™ application.</para></abstract></info><chapter><title/><para>Hussein Shafie, XMLmind Software, 35 rue Louis Leblanc, 78120 Rambouillet, France, Phone: +33 (0)9 52 80 80 37, Web: <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.xmlmind.com/w2x/">www.xmlmind.com/w2x/</link>, Email: <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="mailto:w2x-support@xmlmind.com">w2x-support@xmlmind.com</link> (public mailing list)</para></chapter><chapter xml:id="intro"><title>Introduction</title><para>Microsoft® Word is an amazingly popular writing tool. However, its main drawback is that, once your document is complete, you cannot do much with it: print it, convert it to PDF or send it as is by email.</para><para>XMLmind Word To XML aims at nothing less than removing this main drawback of Microsoft® Word.
This 100% Java™ software component lets you automate the publishing (in its widest sense) of content created using Microsoft® Word 2007+.</para><para>More precisely, XMLmind Word To XML (<abbrev>w2x</abbrev> for short) lets you automatically convert DOCX files to:</para><itemizedlist><listitem><para><emphasis role="bold">Clean, styled, valid XHTML+CSS, looking very much like the source DOCX files</emphasis>.</para><para>Because the generated XHTML+CSS file is clean and valid, you can easily restyle it, or extract metadata or an abstract from it, before publishing it.</para></listitem><listitem><para><emphasis role="bold">Unstyled, valid, semantic XML</emphasis> (DITA, DocBook, XHTML, your custom schema, etc.).</para><para>In this case, most styles are converted to semantic tags. For example, numbered paragraphs are converted to proper ordered lists.</para><para>Generating semantic XML out of DOCX files is useful for interchange reasons (e.g. to implement open data) or because you want to port your existing documentation to a structured document format where form and content are completely separated (e.g. to implement single source publishing).</para></listitem></itemizedlist><para>Of course, deploying w2x does not require installing MS-Word on the machines hosting the software.
Also note that w2x does not require the authors to change their habits while using MS-Word: no strict writing discipline, no specific styles, no specific document templates, no specific macros, etc.</para><para>This document explains:</para><itemizedlist><listitem><para>how to install and use w2x;</para></listitem><listitem><para>how to customize the output of w2x;</para></listitem><listitem><para>how to embed a w2x processor in a Java application, since w2x has been designed to be easily embedded in any Java application, whether desktop or server-side.</para></listitem></itemizedlist></chapter><chapter xml:id="install"><title>Installing w2x</title><para><emphasis role="bold">Requirements</emphasis></para><para>XMLmind Word To XML (<abbrev>w2x</abbrev> for short) requires a Java™ runtime 11+. However, w2x is officially supported by XMLmind only on Windows 8.1, 10 and 11; on macOS 15.x (Sequoia) and 26.x (Tahoe), with an Intel® or ARM® processor; and on Linux.</para><para>On Linux, make sure that the Java <literal>bin/</literal> directory is referenced in the <literal>$PATH</literal> and, at the same time, check that the Java runtime in the <literal>$PATH</literal> has the right version:</para><programlisting>$ <emphasis role="bold">java -version</emphasis>
openjdk version "25.0.1" 2025-10-21
OpenJDK Runtime Environment (build 25.0.1+8-27)
OpenJDK 64-Bit Server VM (build 25.0.1+8-27, mixed mode)
</programlisting><para>On Windows and on the Mac, this verification is in principle not needed as the <literal>java</literal> executable is automatically found in the <literal>$PATH</literal> when Java has been properly installed.</para><para><emphasis role="bold">Install on Windows</emphasis></para><orderedlist><listitem><para>Download the <literal>setup.exe</literal> distribution.</para></listitem><listitem><para>Double-click on the <literal>setup.exe</literal> file to launch the installer. </para></listitem><listitem><para>Follow the instructions of the installer.</para></listitem></orderedlist><note><para><emphasis role="bold">About Java on Windows</emphasis></para><para>The <literal>setup.exe</literal> distribution includes a very recent —generally the most recent— <emphasis>private</emphasis> <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://openjdk.java.net/">OpenJDK </link>  Java™ runtime. Therefore, you don't need to install Java on your computer. Moreover, if you have Java already installed on your computer, then your public Java runtime will be ignored by w2x. </para><para>If you prefer to run w2x using a different version of Java, you'll have to first delete folder <literal><replaceable>W2X_INSTALL_DIR</replaceable>\bin\jre64\</literal> in order to force w2x to use the version of Java installed on your computer. </para><para>Note that <literal><replaceable>W2X_INSTALL_DIR</replaceable>\bin\jre64\</literal> contains a 64-bit version of the Java runtime which cannot be used on a 32-bit version of Windows. 
This means that, on a 32-bit version of Windows, you'll still have to download and install a 32-bit Java™ 8+ runtime on your computer in order to use w2x.</para></note><para><emphasis role="bold">Install on the Mac</emphasis></para><orderedlist><listitem><para>Download the <literal>.dmg</literal> distribution.</para></listitem><listitem><para>Double-click the downloaded <literal>.dmg</literal> file to open it in the <emphasis role="bold">Finder</emphasis>.</para></listitem><listitem><para>Copy the <literal>WordToXML.app</literal> folder, an application bundle represented by icon <inlinemediaobject><imageobject><imagedata fileref="manual_dbk5_files/xmlmind_icon.png" contentwidth="16" contentdepth="16"/></imageobject></inlinemediaobject>, anywhere you want. For example, drag&amp;drop this icon to the <literal>/Applications</literal> folder or to your desktop.</para></listitem><listitem><para>Start <link linkend="w2x_app_alternative">the w2x-app desktop application</link> by double-clicking on the <inlinemediaobject><imageobject><imagedata fileref="manual_dbk5_files/xmlmind_icon.png" contentwidth="16" contentdepth="16"/></imageobject></inlinemediaobject> icon (or use the <emphasis role="bold">Launchpad</emphasis>).</para></listitem><listitem><para>The first time <literal>w2x-app</literal> is started, your Mac will generally ask you to confirm that you actually want to open an application downloaded from the Internet. Click <emphasis role="bold">Open</emphasis> to confirm.</para><para>Don't worry, <literal>w2x-app</literal> has been digitally signed using a certificate issued by Apple itself. 
This confirmation is required for any digitally signed application not coming from the App Store.</para></listitem><listitem><para>Move the downloaded <literal>.dmg</literal> file to the Trash.</para></listitem></orderedlist><note><para><emphasis role="bold">About Java on the Mac</emphasis></para><para>The <literal>.dmg</literal> distribution includes a very recent —generally  the most recent— <emphasis>private</emphasis> <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://openjdk.java.net/">OpenJDK </link>  Java™ runtime. Therefore, you don't need to install Java on your computer. Moreover, if you have Java already installed on your computer, then your public Java runtime will be ignored by w2x.</para><para>If you prefer to run w2x using a different version of Java, you'll have to first delete folder <literal>WordToXML.app/Contents/Resources/w2x/bin/jre/</literal> in order to force w2x to use the version of Java installed on your computer.</para></note><para><emphasis role="bold">Manual install on any Java 11+ platform (Windows, Mac, Linux, etc)</emphasis></para><para>Unzip the <literal>.zip</literal> distribution in any directory you want.</para><programlisting>C:\&gt; unzip w2x-1_14_0.zip

C:\&gt; cd w2x-1_14_0

C:\w2x-1_14_0&gt; dir 
... &lt;DIR&gt; bin
... &lt;DIR&gt; doc
... &lt;DIR&gt; legal
...
</programlisting><para>XMLmind Word To XML is intended to be used directly from the <literal>w2x-1_14_0/</literal> directory. That is, you can run the <literal>w2x</literal> command by simply executing (in a Command Prompt on Windows, or in a terminal on Linux):</para><programlisting>C:\w2x-1_14_0&gt; bin\w2x
Usage: w2x [-version] [-v|-vv|-vvv] [Options]
    in_docx_file out_file
    | -batch out_spec in_docx_file1 ... in_docx_fileN
    | -printenv
    | -liststeps

-version
    Print version number and exit.
…
Use '-?' to list options.
</programlisting><section xml:id="distribution"><title>Contents of the installation directory</title><note><para>If the <literal>.dmg</literal> distribution has been used to install XMLmind Word To XML on the Mac, the following subdirectories are found in <literal>WordToXML.app/Contents/Resources/w2x/</literal>.</para></note><variablelist><varlistentry><term><literal>bin/w2x</literal>, <literal>w2x.bat</literal></term><listitem><para>Scripts used to run XMLmind Word To XML (<abbrev>w2x</abbrev> for short). Use <literal>w2x</literal> on any Unix system. Use <literal>w2x.bat</literal> on Windows. </para></listitem></varlistentry><varlistentry><term><literal>bin/w2x-app.exe</literal>, <literal>w2x-app.jstart</literal></term><listitem><para>On Windows, file <literal>w2x-app.exe</literal> is used to start <literal>w2x-app</literal><indexterm><primary>w2x-app</primary></indexterm>, a graphical application easier to use than the <literal>w2x</literal> command-line utility. This <literal>.exe</literal> file is a home-made launcher parameterized by <literal>w2x-app.jstart</literal>, a UTF-8 encoded, plain text file.</para></listitem></varlistentry><varlistentry><term><literal>bin/w2x-app</literal>, <literal>w2x-app-c.bat</literal></term><listitem><para>Scripts used to run <literal>w2x-app</literal><indexterm><primary>w2x-app</primary></indexterm>, a graphical application easier to use than the <literal>w2x</literal> command-line utility. Use <literal>w2x-app</literal> on any Unix system. Use <literal>w2x-app-c.bat</literal> on Windows, but only when you need to start <literal>w2x-app</literal> with a console. On Windows, a console is needed to be able to see low-level error messages. </para></listitem></varlistentry><varlistentry><term><literal>doc/index.html</literal></term><listitem><para>Contains the documentation of w2x.
</para></listitem></varlistentry><varlistentry><term><literal>doc/manual/</literal></term><listitem><para>Contains <citetitle>XMLmind Word To XML Manual</citetitle>. This document is available in source DOCX format, in PDF format and in all the output formats supported by w2x.</para></listitem></varlistentry><varlistentry><term><literal>doc/manual/conv_manual.sh</literal>, <literal>conv_manual.bat</literal></term><listitem><para>Scripts used to convert <citetitle>XMLmind Word To XML Manual</citetitle> to all the output formats supported by w2x. The files generated by these scripts are found in <literal>doc/manual/out/</literal>.</para></listitem></varlistentry><varlistentry><term><literal>doc/xedscript/</literal></term><listitem><para>Contains <citetitle>The XED scripting language</citetitle>.</para></listitem></varlistentry><varlistentry><term><literal>doc/w2x_app_help/</literal></term><listitem><para>Contains the online help of <literal>w2x-app</literal>, a graphical application which is easier to use than the <literal>w2x</literal> command-line utility.</para></listitem></varlistentry><varlistentry><term><literal>doc/api/</literal></term><listitem><para>Contains the reference manual of the Java™ API of w2x (generated using <literal>javadoc</literal>).</para></listitem></varlistentry><varlistentry><term><literal>legal/</literal>, <literal>legal.txt</literal></term><listitem><para>Contains legal information about w2x and about third-party components used in w2x.
</para></listitem></varlistentry><varlistentry><term><literal>lib/</literal></term><listitem><para>All the (non-system) Java™ class libraries needed to run w2x: </para><para><literal>xmlresolver.jar</literal>: <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://xmlresolver.org/">an enhanced XML resolver</link> with XML Catalog support.</para><para><literal>saxon.jar</literal>: The <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://saxon.sourceforge.net/saxon6.5.5/">Saxon 6.5.5</link> XSLT 1.0 engine.</para><para><literal>w2x_all.jar</literal>: self-contained JAR containing everything needed to run <literal>w2x</literal>, that is, all the other JAR files and also all the scripts and the stylesheets found in subdirectories  <literal>xed/</literal> and <literal>xslt/</literal>.</para><para><literal>w2x.jar</literal>: contains the <literal>w2x</literal> engine.</para><para><literal>w2x_rt.jar</literal>: contains a runtime needed by the w2x engine. All these classes come from <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/xmleditor/">XMLmind XML Editor</link>.</para><para><literal>wmf2svg.jar</literal>: <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://wmf2svg.sourceforge.jp/">WMF to SVG Converting Tool &amp; Library</link>; needed to support the WMF picture format.</para><para><literal>wmf_converter.jar</literal>:  contains a picture format  plug-in  based on <literal>wmf2svg.jar</literal>.</para><para><literal>whc.jar</literal>: contains the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/ditac/whc.shtml">XMLmind Web Help Compiler</link> engine.</para><para><literal>snowball.jar</literal>: <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://snowball.tartarus.org/">Snowball</link> is used by XMLmind Web Help Compiler to implement <link xmlns:xlink="http://www.w3.org/1999/xlink" 
xlink:href="https://en.wikipedia.org/wiki/Stemming">stemming</link>.</para></listitem></varlistentry><varlistentry><term><literal>plugin/</literal></term><listitem><para>An empty directory where user <link linkend="w2x_plugin">plugins</link> are to be copied in order to be automatically registered with w2x<indexterm><primary>plugin</primary></indexterm>.</para></listitem></varlistentry><varlistentry><term><literal>sample_plugins/rss/</literal></term><term><literal>sample_plugins/wh5_zip/</literal></term><listitem><para>The two sample <link linkend="w2x_plugin">plugins</link> used as examples in this document<indexterm><primary>plugin</primary></indexterm>. The <literal>rss/src/</literal> subdirectory contains the Java™ source code of <literal>rss/date_util.jar</literal> (custom support code). The <literal>wh5_zip/src</literal>/ subdirectory contains the Java™ source code of <literal>wh5_zip/zip_step.jar</literal> (custom conversion step).</para></listitem></varlistentry><varlistentry><term><literal>xed/</literal></term><listitem><para>Contains the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/index.html">XED</link> scripts used to convert styles to semantic XHTML tags. </para></listitem></varlistentry><varlistentry><term><literal>xslt/</literal></term><listitem><para>Contains the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/1999/REC-xslt-19991116">XSLT 1.0</link> stylesheets used to generate semantic XML. </para></listitem></varlistentry></variablelist></section></chapter><chapter xml:id="alternatives_to_command_line"><title>Alternatives to using the w2x command-line utility</title><section xml:id="w2x_app_alternative"><title>The w2x-app graphical application</title><para>Graphical application <literal>w2x-app</literal><indexterm><primary>w2x-app</primary></indexterm> should be easier to use than the <literal>w2x</literal> command-line utility. 
This application is found in <literal><replaceable>w2x_install_dir</replaceable>/bin/</literal>. How to use it is explained in <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/index.html">w2x-app - Online Help</link>.</para><figure><title><literal>w2x-app</literal> window</title><mediaobject><imageobject><imagedata fileref="manual_dbk5_files/image2.png" contentwidth="413" contentdepth="510"/></imageobject></mediaobject></figure></section><section xml:id="xxe_addon_alternative"><title>The “Word To XML” add-on for XMLmind XML Editor</title><para>Graphical application <literal>w2x-app</literal> is also available as an add-on<indexterm><primary>XMLmind XML Editor add-on</primary></indexterm> for <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/xmleditor/">XMLmind XML Editor</link>. This add-on adds an "<emphasis role="bold">Import DOCX</emphasis>" item to the <emphasis role="bold">File</emphasis> menu. The "<emphasis role="bold">Import DOCX</emphasis>" menu item displays a non-modal dialog box almost identical to <literal>w2x-app</literal>. XML output files created using the "<emphasis role="bold">Import DOCX</emphasis>" dialog box are automatically opened in XMLmind XML Editor.</para><para>As of version 9.1, the “Word To XML” add-on is included in all the software distributions of XMLmind XML Editor. Therefore, following <link linkend="install_xxe_addon">the instructions below</link> is probably not needed. However, please note that, when part of XMLmind XML Editor <emphasis>Personal Edition</emphasis>, this add-on runs in “evaluation mode”, that is, it generates output in which random words are replaced by the string "<literal>[XMLmind]</literal>". </para><section xml:id="install_xxe_addon"><title>Installing the “Word To XML” add-on</title><para>This add-on is compatible with the latest version of XMLmind XML Editor.
In order to install it, please proceed as follows:</para><orderedlist><listitem><para>Start XMLmind XML Editor.</para></listitem><listitem><para>Select <emphasis role="bold">Options</emphasis>→<emphasis role="bold">Install Add-ons</emphasis>. This displays the “<emphasis role="bold">Install Add-ons</emphasis>” dialog box.</para></listitem><listitem><para>In the <emphasis role="bold">Install</emphasis> tab, click the checkbox found before the table row containing “Word To XML”.</para><para><inlinemediaobject><imageobject><imagedata fileref="manual_dbk5_files/check_word_toxml_addon.png" contentwidth="503" contentdepth="200"/></imageobject></inlinemediaobject></para></listitem><listitem><para>Click <emphasis role="bold">OK</emphasis> to download and install the “Word To XML”  add-on.</para></listitem><listitem><para>Restart XMLmind XML Editor as instructed.</para><para>Notice that the <emphasis role="bold">File</emphasis> menu has now an “<emphasis role="bold">Import DOCX</emphasis>” item.</para><para><inlinemediaobject><imageobject><imagedata fileref="manual_dbk5_files/import_doc_menu_item.java.png" contentwidth="367" contentdepth="137"/></imageobject></inlinemediaobject></para></listitem></orderedlist></section></section><section xml:id="w2x_servlet"><title>The “Word To XML” servlet</title><para>The “Word To XML” servlet<indexterm><primary>servlet</primary></indexterm> is a Java™ <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.oracle.com/technetwork/java/index-jsp-135475.html">Servlet</link>  (server-side standard component) which has the same functions as the <literal>w2x-app</literal> desktop application.</para><note><para>Because it’s a server-side component and not a desktop application, please do not attempt to deploy the “Word To XML” servlet if you are an end-user of “Word To XML”. 
Please ask your IT personnel to do that for you.</para></note><section xml:id="w2x_servlet_distribution"><title>Contents of the servlet software distribution</title><para>The “Word To XML” servlet comes in a software distribution of its own: <literal>w2x_servlet-1_14_0.zip</literal>. This distribution contains a ready-to-deploy binary <literal>w2x.war</literal>, as well as the full Java™ source code of the servlet.</para><variablelist><varlistentry><term><literal>w2x.war</literal></term><listitem><para>Ready-to-deploy <emphasis role="bold">W</emphasis>eb application <emphasis role="bold">AR</emphasis>chive (<abbrev>WAR</abbrev>) containing the servlet.</para></listitem></varlistentry><varlistentry><term><literal>src/</literal></term><term><literal>src/build.xml</literal></term><listitem><para>The Java™ source code of the servlet. Run <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ant.apache.org/">ant</link> in <literal>src/</literal> in order to use <literal>src/build.xml</literal> to rebuild <literal>w2x.war</literal>.</para></listitem></varlistentry><varlistentry><term><literal>w2x/</literal></term><listitem><para>Directory containing the unpacked contents of <literal>w2x.war</literal>. Needed to rebuild <literal>w2x.war</literal>.</para></listitem></varlistentry><varlistentry><term><literal>lib/</literal></term><listitem><para>Contains Java™ libraries needed to rebuild <literal>w2x.war</literal>.</para></listitem></varlistentry></variablelist></section><section xml:id="w2x_servlet_install"><title>Installing the servlet</title><para>File <literal>w2x.war</literal> may be easily installed in any servlet container implementing at least the Servlet 2.3 standard.
Examples of such servlet containers: <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tomcat.apache.org/">Apache Tomcat</link>, <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.eclipse.org/jetty/">Jetty</link>, <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://caucho.com/products/resin">Caucho Resin</link>.</para><note><para xml:id="install_in_tomcat_v10"><emphasis role="bold">About Apache Tomcat version 10 and above</emphasis></para><para>Beware that there is a <emphasis>major breaking change</emphasis> between the latest versions of <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tomcat.apache.org/">Apache Tomcat</link> (&gt;= 10) and older versions (&lt;= 9). This is documented in this <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tomcat.apache.org/migration-10.html">migration article</link>.</para><para>To make a long story short, if you need to deploy the “Word To XML” servlet on <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tomcat.apache.org/download-10.cgi">Tomcat version 10+</link>, then you must first create a <literal>webapps-javaee/</literal> folder next to <literal>TOMCAT_INSTALL_DIR/webapps/</literal> and then copy <literal>w2x.war</literal> to this <literal>TOMCAT_INSTALL_DIR/webapps-javaee/</literal> folder.</para></note><para>Though copying file <literal>w2x.war</literal> to the <literal>webapps/</literal> folder of the servlet container and then restarting the servlet container is generally sufficient to deploy the “Word To XML” servlet, please refer to the documentation of your servlet container to learn about the best deployment procedure.</para><note><para>On Windows, the <literal>.dll</literal> files contained in <literal><replaceable>w2x_servlet_deployment_dir</replaceable>\WEB-INF\lib\</literal> must be copied to a directory referenced by the <literal>PATH</literal> environment variable of the computer running the
servlet.</para></note></section><section xml:id="w2x_servlet_config"><title>Configuring the servlet</title><para>The “Word To XML” servlet is configured by specifying a number of <literal>init-param</literal> parameters.  These parameters are found in <literal>WEB-INF/web.xml</literal>, where folder <literal>WEB-INF/</literal> is contained in <literal>w2x.war</literal>.</para><para>All these <literal>init-param</literal> parameters are documented in <literal>web.xml</literal>. Example, parameter <literal>workDir</literal>:</para><programlisting>&lt;!-- workDir =============================================================
     Uploaded files and files generated during the conversion process 
     are stored in temporary subdirectories of this directory.
     If specified directory does not exist, it will be created.

     Value: this directory and its contents must be readable and writable
     by the operating system account used to run the Word To XML servlet.

     Default: dynamic; supplied by the Servlet Container.
====================================================================== --&gt;

&lt;init-param&gt;
  &lt;param-name&gt;workDir&lt;/param-name&gt;&lt;param-value&gt;&lt;/param-value&gt;
&lt;/init-param&gt;
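
&lt;!-- Illustrative variant only, not taken from the stock web.xml:
     /var/tmp/w2x is a hypothetical directory chosen for this sketch.
     It assumes that the operating system account used to run the
     servlet container can read and write that directory. --&gt;
&lt;init-param&gt;
  &lt;param-name&gt;workDir&lt;/param-name&gt;&lt;param-value&gt;/var/tmp/w2x&lt;/param-value&gt;
&lt;/init-param&gt;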
</programlisting></section><section xml:id="w2x_servlet_usage"><title>Using the servlet to convert DOCX files</title><para>Let’s suppose your servlet container runs on host <literal>localhost</literal> and uses <literal>8080</literal> as its port. In order to use the “Word To XML” servlet, please point your Web browser to <literal>http://localhost:8080/w2x/</literal>. This will cause the browser to display a page containing a simple DOCX conversion form.</para><figure><title>The Convert DOCX form (servlet container running on host <literal>192.168.1.202</literal> and using port <literal>8080</literal>)</title><mediaobject><imageobject><imagedata fileref="manual_dbk5_files/w2x_servlet_convert_form.png" contentwidth="573" contentdepth="351"/></imageobject></mediaobject></figure><para>In order to convert a DOCX file to another format:</para><orderedlist><listitem><para>Click “<emphasis role="bold">Choose File</emphasis>” to select the DOCX file to be converted.</para></listitem><listitem><para>Select the desired output format using the “<emphasis role="bold">Output format</emphasis>” combobox.</para></listitem><listitem><para>Click <emphasis role="bold">Convert</emphasis> to download a <literal>.zip</literal> (or <literal>.epub</literal>) archive containing the result of the conversion. Generating this <literal>.zip</literal> (or <literal>.epub</literal>) file may take several seconds to several minutes depending on the size of the DOCX input file.</para></listitem></orderedlist><note><para>If the name of the DOCX input file contains non-ASCII characters (e.g. accented characters), please make sure to use Zip extractor software supporting <literal>.zip</literal> files having UTF-8 encoded filenames.</para><para>Note that most Zip extractors do <emphasis>not</emphasis> support <literal>.zip</literal> files having UTF-8 encoded filenames<footnote xml:id="__FN1__"><para>However, “<literal>jar xvf <replaceable>converted.zip</replaceable></literal>” works fine.
<literal>jar</literal> is a command-line utility which comes with all Java Development Kits (<abbrev>JDK</abbrev>).</para></footnote>.  Such extractors will succeed in unpacking the <literal>.zip</literal> file, but will generate files having incorrect names.</para></note></section><section xml:id="w2x_servlet_api"><title>Non interactive requests </title><para>It’s also possible to use the conversion services of the “Word To XML” servlet by sending URL <literal>/w2x/convert</literal> an HTTP <literal>POST</literal> request having a <literal>multipart/form-data</literal> encoding. </para><para> <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://curl.haxx.se/">cURL</link><footnote xml:id="__FN2__"><para>curl is an open source command line tool and library for transferring data with URL syntax.</para></footnote> example: <indexterm><primary>servlet</primary><secondary>curl</secondary></indexterm><indexterm><primary>servlet</primary><secondary>POST</secondary></indexterm><indexterm><primary>servlet</primary><secondary>multipart/form-data</secondary></indexterm></para><programlisting>curl -s -S -o manual_docbook5.zip \
  -F "docx=@manual.docx;type=application/vnd.openxmlformats-officedocument.wordprocessingml.document" \
  -F "conv=docbook5" \
  http://localhost:8080/w2x/convert
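
# Hedged variant, not taken from the w2x documentation: -f (--fail) is a
# standard curl option. Assuming the servlet answers a failed conversion
# with an HTTP error status (not confirmed here), curl then exits with a
# non-zero status instead of saving the error page as manual_docbook5.zip.
curl -f -s -S -o manual_docbook5.zip \
  -F "docx=@manual.docx;type=application/vnd.openxmlformats-officedocument.wordprocessingml.document" \
  -F "conv=docbook5" \
  http://localhost:8080/w2x/convert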
</programlisting><para>Other example: </para><programlisting>curl -s -S -o manual.epub \
 -F "docx=@manual.docx;type=application/vnd.openxmlformats-officedocument.wordprocessingml.document" \
 -F "conv=epub" \
 -F "params=-p epub.identifier urn:x-mlmind:w2x:manual -p epub.split-before-level 8" \
 http://localhost:8080/w2x/convert
</programlisting><para>The conversion request has three emulated form fields:</para><variablelist><varlistentry><term><literal>docx</literal></term><listitem><para>Emulated <literal>&lt;input type="file"&gt;</literal> field. Required. Contains the DOCX input file.</para></listitem></varlistentry><varlistentry><term><literal>conv</literal></term><listitem><para>Emulated <literal>&lt;input type="text"&gt;</literal> field. Required. Contains the name of one of the <literal>conversion<replaceable>N</replaceable>.name</literal> <literal>init-param</literal> parameters defined in <literal>WEB-INF/web.xml</literal>.</para><para>The stock <literal>WEB-INF/web.xml</literal> defines the following conversions to <emphasis>styled HTML</emphasis>:</para><para><literal>xhtml_css</literal> (single-page styled HTML), <literal>frameset</literal> (multi-page styled HTML, split on <emphasis role="bold">Heading 1</emphasis>), <literal>frameset2</literal> (multi-page styled HTML, split on <emphasis role="bold">Heading 1</emphasis>, <emphasis role="bold">2</emphasis>), <literal>frameset3</literal> (multi-page styled HTML, split on <emphasis role="bold">Heading 1</emphasis>, <emphasis role="bold">2</emphasis>, <emphasis role="bold">3</emphasis>), <literal>webhelp</literal> (split on <emphasis role="bold">Heading 1</emphasis>), <literal>webhelp2</literal> (split on <emphasis role="bold">Heading 1</emphasis>, <emphasis role="bold">2</emphasis>), <literal>webhelp3</literal> (split on <emphasis role="bold">Heading 1</emphasis>, <emphasis role="bold">2</emphasis>, <emphasis role="bold">3</emphasis>), <literal>epub</literal> (split on <emphasis role="bold">Heading 1</emphasis>), <literal>epub2</literal> (split on <emphasis role="bold">Heading 1</emphasis>, <emphasis role="bold">2</emphasis>), <literal>epub3</literal> (split on <emphasis role="bold">Heading 1</emphasis>, <emphasis role="bold">2</emphasis>, <emphasis role="bold">3</emphasis>)</para><para>and also the following conversions to
<emphasis>“semantic” XML</emphasis>:</para><para><literal>docbook</literal>, <literal>docbook5</literal>, <literal>topic</literal>, <literal>map</literal>, <literal>bookmap</literal>, <literal>xhtml_strict</literal>, <literal>xhtml_loose</literal>, <literal>xhtml1_1</literal>, <literal>xhtml5</literal>.</para></listitem></varlistentry><varlistentry><term><literal>params</literal></term><listitem><para>Emulated <literal>&lt;input type="text"&gt;</literal> field. Optional. Contains some <literal>w2x</literal> command-line options, generally <link linkend="option_p">-p parameters</link>. These options are appended to the options of the conversion specified in the <literal>conv</literal> emulated form field.</para></listitem></varlistentry></variablelist><para>The response to a successful conversion request is a <literal>.zip</literal> (or <literal>.epub</literal>) archive containing the result of the conversion.</para></section></section></chapter><chapter xml:id="getting_started"><title>Getting started with w2x</title><note><para xml:id="about_evaluation_edition"><emphasis role="bold">About Evaluation Edition</emphasis></para><para>Note that Evaluation Edition is useless for any purpose other than evaluating XMLmind Word To XML. This edition generates output in which random words are replaced by the string "<emphasis role="bold">[XMLmind]</emphasis>". (Of course, this does not happen with Professional Edition!)</para><para><inlinemediaobject><imageobject><imagedata fileref="manual_dbk5_files/random_replaced_words.png" contentwidth="481" contentdepth="224"/></imageobject></inlinemediaobject></para></note><para>We’ll use this manual to explain the basic uses of the <literal>w2x</literal> command-line utility.
This manual is found in DOCX format in <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/</literal> and the w2x command-line utility is found in <literal><replaceable>w2x_install_dir</replaceable>/bin/</literal>.</para><programlisting>C:\w2x-1_14_0&gt; cd doc\manual
C:\w2x-1_14_0\doc\manual&gt; mkdir out
</programlisting><itemizedlist><listitem><para>Convert  <literal>manual.docx</literal> to <literal>out\manual.xhtml</literal>, containing clean, styled, valid XHTML+CSS<indexterm><primary>XHTML, output format</primary></indexterm>, looking very much like <literal>manual.docx</literal>:</para><programlisting>..\..\bin\w2x manual.docx out\manual<emphasis role="bold">.xhtml</emphasis>
</programlisting><para>If you want to generate XHTML which is treated by Web browsers  as if it were HTML, simply use a  <literal>.html</literal> file extension for the output file:</para><programlisting>..\..\bin\w2x manual.docx out\manual<emphasis role="bold">.html</emphasis>
</programlisting><para>Doing this automatically turns on an option<footnote xml:id="__FN3__"><para>This option is “<literal>-p convert.charset UTF-8</literal>”. See <link linkend="param_charset">charset parameter</link>.</para></footnote> which removes the XML declaration (<literal>&lt;?xml version="1.0" encoding="UTF-8"?&gt;</literal>) normally found at the top of an XHTML file and inserts a <literal>&lt;meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/&gt;</literal> into the <literal>html</literal>/<literal>head</literal> element of the output document.</para></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\frameset\manual.xhtml</literal>, containing <emphasis>multi-page</emphasis>, clean, styled, valid XHTML+CSS<indexterm><primary>frameset, output format</primary></indexterm>, looking very much like <literal>manual.docx</literal>:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o frameset</emphasis> manual.docx out\frameset\manual.xhtml
</programlisting><para>The above command generates multiple “<literal>.xhtml</literal>” files in the <literal>out\frameset</literal> directory, which is automatically created<footnote xml:id="__FN4__"><para>But not automatically made empty if the output directory already exists.</para></footnote> if needed.</para><para>Note that <literal>out\frameset\manual.xhtml</literal> contains a frameset. Although it is an obsolete HTML feature, a <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/html401/present/frames.html">frameset</link> makes it easy to browse the generated XHTML+CSS pages. Moreover, the table of contents used as the left frame, found in <literal>out\frameset\manual-TOC.xhtml</literal>, is a convenient way to programmatically list all the generated XHTML+CSS pages.</para></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\webhelp\manual.html</literal>, containing a Web Help<indexterm><primary>Web Help, output format</primary></indexterm> looking very much like <literal>manual.docx</literal>:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o webhelp</emphasis> manual.docx out\webhelp\manual.html
</programlisting><para>The above command generates multiple “<literal>.html</literal>” files in the <literal>out\webhelp</literal> directory, which is automatically created if needed.</para></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.epub</literal>, containing an <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://idpf.org/epub/201">EPUB 2</link><indexterm><primary>EPUB, output format</primary></indexterm> book looking very much like <literal>manual.docx</literal>:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o epub</emphasis> manual.docx out\manual.epub
</programlisting></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.xml</literal>, containing DocBook 4.5<indexterm><primary>DocBook 4, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm>.</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o docbook</emphasis> manual.docx out\manual.xml
</programlisting></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.xml</literal>, containing DocBook 5.0<indexterm><primary>DocBook 5, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm>.</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o docbook5</emphasis> manual.docx out\manual.xml
</programlisting><para>By default, the generated DocBook files contain HTML tables. If you prefer DocBook to contain CALS tables, please use the following options:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o docbook5</emphasis>¬
<emphasis role="bold"> -p convert.set-column-number yes -p transform.cals-tables yes</emphasis>¬
 manual.docx out\manual.xml
</programlisting></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.xml</literal>, containing a <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tdg.docbook.org/tdg/5.1/ch06.html">DocBook V5.1 assembly</link><indexterm><primary>DocBook V5.1 assembly, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm>.</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o assembly</emphasis> manual.docx out\manual.xml
</programlisting></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.dita</literal>, containing a DITA topic<indexterm><primary>DITA topic, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm>.</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o topic</emphasis> manual.docx out\manual.dita
</programlisting><para>Generating a task having “<literal>MyTask</literal>” as its ID is equally simple:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o topic</emphasis>¬
<emphasis role="bold"> -p transform.topic-type task -p transform.root-topic-id MyTask</emphasis>¬
 manual.docx out\manual.dita
</programlisting></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.ditamap</literal>, containing a DITA map<indexterm><primary>DITA map, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm>.</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o map</emphasis> manual.docx out\manual.ditamap
</programlisting></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.ditamap</literal>, containing a DITA bookmap<indexterm><primary>DITA bookmap, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm> possibly having chapter <literal>topicref</literal>s and nested <literal>topicref</literal>s acting as sections and subsections (but no sub-subsections).</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o bookmap -p transform2.section-depth 3</emphasis>¬
 manual.docx out\manual.ditamap
</programlisting></listitem><listitem><para>Convert <literal>manual.docx</literal> to <literal>out\manual.xhtml</literal>, containing “semantic”, unstyled XHTML5<indexterm><primary>XHTML 5.0, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm>.</para><programlisting>..\..\bin\w2x <emphasis role="bold">-o xhtml5</emphasis> manual.docx out\manual.xhtml
</programlisting><para>Use the following options to generate other versions of semantic XHTML<indexterm><primary>XHTML, output format</primary></indexterm><indexterm><primary>-o, option</primary></indexterm>:</para><informaltable><tgroup cols="2"><colspec colname="c1" colwidth="45*"/><colspec colname="c2" colwidth="55*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Option</emphasis></para></entry><entry align="center"><para><emphasis role="bold">XHTML Version</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>-o xhtml_strict</literal><indexterm><primary>-o, option</primary></indexterm></para></entry><entry><para>XHTML 1.0 Strict<indexterm><primary>XHTML 1.0 Strict, output format</primary></indexterm></para></entry></row><row><entry><para><literal>-o xhtml_loose</literal></para></entry><entry><para>XHTML 1.0 Transitional<indexterm><primary>XHTML 1.0 Transitional, output format</primary></indexterm></para></entry></row><row><entry><para><literal>-o xhtml1_1</literal></para></entry><entry><para>XHTML 1.1<indexterm><primary>XHTML 1.1, output format</primary></indexterm></para></entry></row><row><entry><para><literal>-o xhtml5</literal></para></entry><entry><para>XHTML 5.0<indexterm><primary>XHTML 5.0, output format</primary></indexterm></para></entry></row></tbody></tgroup></informaltable></listitem></itemizedlist><section xml:id="check_outline_levels"><title>How to generate useful multi-page HTML </title><para>In order to generate multi-page HTML, that is, frameset<indexterm><primary>frameset, output format</primary></indexterm>, Web Help<indexterm><primary>Web Help, output format</primary></indexterm>, EPUB<indexterm><primary>EPUB, output format</primary></indexterm>, we need to automatically split the source DOCX document into parts.</para><para>A new part is created each time a paragraph having an <emphasis>outline level</emphasis><indexterm><primary>Outline level</primary></indexterm> less than or 
equal to the specified <link linkend="split_split_before_level">split-before-level parameter</link><indexterm><primary>split-before-level, parameter</primary></indexterm> is found in the source. An outline level is an integer between 0 (e.g. style “<emphasis role="bold">Heading 1</emphasis>”) and 8 (e.g. style “<emphasis role="bold">Heading 9</emphasis>”). The default value of parameter <literal>split-before-level</literal> is 0, which means: for each “<emphasis role="bold">Heading 1</emphasis>”, create a new page starting with this “<emphasis role="bold">Heading 1</emphasis>”.</para><para>Frameset example: for each “<emphasis role="bold">Heading 1</emphasis>” and “<emphasis role="bold">Heading 2</emphasis>”, create a new page (<literal>out/frameset/manual-1.xhtml</literal>, <literal>out/frameset/manual-2.xhtml</literal>, ..., <literal>out/frameset/manual-N.xhtml</literal>) starting with this “<emphasis role="bold">Heading 1</emphasis>” or “<emphasis role="bold">Heading 2</emphasis>”:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-p split.split-before-level 1</emphasis>¬
  -o frameset manual.docx out\frameset\manual.xhtml
</programlisting><para>EPUB example:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-p epub.split-before-level 1</emphasis>¬
  -o epub manual.docx out\manual.epub
</programlisting><para>Web Help containing “semantic” XHTML 5 example:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-p webhelp.split-before-level 1</emphasis>¬
  -o webhelp5 manual.docx out\webhelp\manual.html
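</programlisting><para>As a further sketch of the same rule, and assuming that the headings of the source document use the standard “<emphasis role="bold">Heading 1</emphasis>” to “<emphasis role="bold">Heading 3</emphasis>” styles, raising the parameter to 2 should start a new page at each of these three heading levels, because their outline levels (0, 1 and 2) are all less than or equal to 2:</para><programlisting>..\..\bin\w2x -p split.split-before-level 2¬
  -o frameset manual.docx out\frameset\manual.xhtml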
</programlisting><note><para xml:id="check_outline_levels_tip"><emphasis role="bold">Important tip</emphasis></para><para>Generating any of the multi-page, styled HTML formats should work great if, for the DOCX document to be converted, you can use MS-Word's "<emphasis role="bold">References</emphasis> &gt; <emphasis role="bold">Table of Contents</emphasis>" button to automatically create a table of contents.</para><para>Note that the source DOCX document is not required to have a table of contents, but MS-Word should be able to automatically create a <emphasis>good</emphasis> one.</para><para>In other words, automatically creating a table of contents using MS-Word is the best way to check that your outline levels<indexterm><primary>Outline level</primary></indexterm> are OK.</para></note></section></chapter><chapter xml:id="going_further"><title>Going further with w2x</title><para>When you execute the following command:</para><programlisting>..\..\bin\w2x -o docbook5 manual.docx out\manual.xml
</programlisting><para>you are in fact executing a sequence of 3 <emphasis>conversion steps</emphasis>:</para><orderedlist><listitem><para>Convert the DOCX file to a styled, valid, XHTML 1.0 Transitional document, looking very much like the input DOCX file.</para></listitem><listitem><para>Apply a number of <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/index.html">XED scripts</link> to this document to convert CSS styles into semantic tags. For example, numbered paragraphs are converted to proper ordered lists.</para><para>The entry point of these “semantic” XED scripts is found in <literal><replaceable>w2x_install_dir</replaceable>/xed/main.xed</literal>.</para><para>The XED scripts edit the input XHTML document in place. Therefore, the result of this step is the same XHTML document, still valid, but this time containing no CSS styles whatsoever.</para></listitem><listitem><para>Apply an <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/1999/REC-xslt-19991116">XSLT 1.0</link> stylesheet to the unstyled, valid, XHTML 1.0 Transitional document in order to generate the desired semantic XML format.</para><para>The XSLT stylesheets are all found in <literal><replaceable>w2x_install_dir</replaceable>/xslt/</literal>. In the above case, we want to generate DocBook v5, therefore we use <literal><replaceable>w2x_install_dir</replaceable>/xslt/docbook5.xslt</literal>.</para></listitem></orderedlist><para>This sequence of conversion steps can be made visible in every detail by specifying the <literal>-vv</literal> option (very verbose)<indexterm><primary>-vv, option</primary></indexterm>:</para><programlisting>..\..\bin\w2x <emphasis role="bold">-vv</emphasis> -o docbook5 manual.docx out\manual.xml

VERBOSE: Converting "manual.docx" to XHTML...
DEBUG: convert.xhtml-file=C:\w2x-1_14_0\doc\manual\out\manual.xhtml

VERBOSE: Editing XHTML document using "C:\w2x-1_14_0\xed\main.xed"...
DEBUG: edit.xed-url-or-file=file:/C:/w2x-1_14_0/xed/main.xed
DEBUG: Loading script "file:/C:/w2x-1_14_0/xed/main.xed"...
DEBUG: Loading script "file:/C:/w2x-1_14_0/xed/after-translate.xed"...
[...]
DEBUG: Loading script "file:/C:/w2x-1_14_0/xed/before-save.xed"...

VERBOSE: Transforming document using "C:\w2x-1_14_0\xslt\docbook5.xslt" then saving it to "C:\w2x-1_14_0\doc\manual\out\manual.xml"...
DEBUG: transform.out-file=C:\w2x-1_14_0\doc\manual\out\manual.xml transform.xslt-url-or-file=file:/C:/w2x-1_14_0/xslt/docbook5.xslt
[...]
</programlisting><para>In fact, option <literal>-o docbook5</literal> is a shorthand for the following <link linkend="w2x_command">w2x command-line options</link>:</para><itemizedlist><listitem><para><literal>-c</literal></para><para>Execute a <link linkend="convert_step">Convert step</link> called “<literal>convert</literal>”. <indexterm><primary>-c, option</primary></indexterm></para></listitem><listitem><para><literal>-p convert.xhtml-file C:\w2x-1_14_0\doc\manual\out\manual.xhtml</literal><indexterm><primary>-p, parameter</primary></indexterm></para><para>Pass the above <literal>xhtml-file</literal> parameter to the conversion step called “<literal>convert</literal>”.</para></listitem><listitem><para><literal>-e</literal></para><para>Execute an <link linkend="edit_step">Edit step</link> called “<literal>edit</literal>”. <indexterm><primary>-e, option</primary></indexterm></para></listitem><listitem><para><literal>-p edit.xed-url-or-file file:/C:/w2x-1_14_0/xed/main.xed</literal></para><para>Pass the above <literal>xed-url-or-file</literal> parameter to the conversion step called “<literal>edit</literal>”.</para></listitem><listitem><para><literal>-t</literal></para><para>Execute a <link linkend="transform_step">Transform step</link> called “<literal>transform</literal>”. 
<indexterm><primary>-t, option</primary></indexterm></para></listitem><listitem><para><literal>-p transform.xslt-url-or-file file:/C:/w2x-1_14_0/xslt/docbook5.xslt</literal></para></listitem><listitem><para><literal>-p transform.out-file C:\w2x-1_14_0\doc\manual\out\manual.xml</literal></para><para>Pass the above <literal>xslt-url-or-file</literal> and <literal>out-file</literal> parameters to the conversion step called “<literal>transform</literal>”.</para></listitem></itemizedlist><note><para>If you need to learn about the details of the conversion steps to be executed, the simplest way is to use the <link linkend="option_liststeps">-liststeps</link><indexterm><primary>-liststeps, option</primary></indexterm> command-line option. Example: <literal>w2x -o docbook5 -liststeps</literal>.</para></note><para>The order of the <link linkend="option_c">-c</link>, <link linkend="option_e">-e</link> and <link linkend="option_t">-t</link> options is significant because it means: first convert, then edit and finally transform. The order of the <link linkend="option_p">-p</link> (and <link linkend="option_pu">-pu</link>) options is not important, as <emphasis>a parameter name must be prefixed by the name of the step to which it applies</emphasis>. <indexterm><primary>-p, parameter</primary></indexterm><indexterm><primary>-pu, parameter</primary></indexterm></para><para>The Convert, Edit and Transform steps are the most important steps. There are other conversion steps though, which are all documented in chapter <xref linkend="step_reference"/>. Moreover, a Java™ programmer may implement their own custom conversion steps<footnote xml:id="__FN5__"><para>A custom conversion step derives from abstract class <literal>com.xmlmind.w2x.processor.ProcessStep</literal>.</para></footnote> and instruct the <literal>w2x</literal> command-line utility to give them names (required to pass them parameters) and to execute them. See option <link linkend="option_step">-step</link>. 
<indexterm><primary>-step, option</primary></indexterm></para><para>A w2x processor executes a sequence of conversion steps whatever the output format. What varies with the desired output format is the conversion steps themselves, their order, number and parameters. This is depicted in the figure below.</para><figure><title>Anatomy of a w2x processor</title><mediaobject><imageobject><imagedata fileref="manual_dbk5_files/image8.svg" contentwidth="612" contentdepth="412"/></imageobject></mediaobject></figure><para>The first sequence in the above figure reads as follows: in order to convert a DOCX file to styled XHTML, first convert the DOCX file to an XHTML+CSS document, then “polish up” this document (e.g. process consecutive paragraphs having identical borders) using XED script <literal><replaceable>w2x_install_dir</replaceable>/xed/main-styled.xed</literal>, and finally save the possibly modified XHTML+CSS document to disk.</para><section xml:id="stock_xed_scripts"><title>Stock XED scripts</title><para>XMLmind Word To XML comes with two stock “main” XED scripts:</para><variablelist><varlistentry><term><literal><replaceable>w2x_install_dir</replaceable></literal><literal>/xed/main-styled.xed</literal></term><listitem><para>Invokes XED scripts used to “polish up” the styled XHTML 1.0 Transitional document created by the Convert step (e.g. process consecutive paragraphs having identical borders).</para></listitem></varlistentry><varlistentry><term><literal><replaceable>w2x_install_dir</replaceable></literal><literal>/xed/main.xed</literal></term><listitem><para>Invokes XED scripts used to prepare the generation of semantic XML of all kinds: XHTML, DocBook, DITA. These scripts leverage the CSS styles and classes found in the styled XHTML 1.0 Transitional document created by the Convert step. They translate these CSS styles and classes (e.g. numbered paragraph) into semantic tags (e.g. 
<literal>ol</literal>/<literal>li</literal>).</para></listitem></varlistentry></variablelist><para>Both the above “main” XED scripts are organized as sequences of simpler, short, XED scripts. Using <link linkend="option_p">-p</link> or <link linkend="option_pu">-pu</link> options, these short scripts may be replaced or removed and may be passed parameters. It’s also possible to insert custom scripts before or after any of these short scripts.</para><para>Excerpts from <literal><replaceable>w2x_install_dir</replaceable>/xed/main-styled.xed</literal>:</para><programlisting>script(defined("<emphasis role="bold">before.</emphasis>init-styles", ""));
script(defined("<emphasis role="bold">do.</emphasis>init-styles", "init-styles.xed"));
script(defined("<emphasis role="bold">after.</emphasis>init-styles", ""));

script(defined("<emphasis role="bold">before.</emphasis>title-styled", ""));
script(defined("<emphasis role="bold">do.</emphasis>title-styled", "title-styled.xed"));
script(defined("<emphasis role="bold">after.</emphasis>title-styled", ""));

script(defined("<emphasis role="bold">before.</emphasis>remove-pis", ""));
script(defined("<emphasis role="bold">do.</emphasis>remove-pis", "remove-pis.xed"));
script(defined("<emphasis role="bold">after.</emphasis>remove-pis", ""));

script(defined("<emphasis role="bold">before.</emphasis>expand-tabs", ""));
script(defined("<emphasis role="bold">do.</emphasis>expand-tabs", "expand-tabs.xed"));
script(defined("<emphasis role="bold">after.</emphasis>expand-tabs", ""));

script(defined("<emphasis role="bold">before.</emphasis>borders", ""));
script(defined("<emphasis role="bold">do.</emphasis>borders", "borders.xed"));
script(defined("<emphasis role="bold">after.</emphasis>borders", ""));

script(defined("<emphasis role="bold">before.</emphasis>number-footnotes", ""));
script(defined("<emphasis role="bold">do.</emphasis>number-footnotes", "number-footnotes.xed"));
script(defined("<emphasis role="bold">after.</emphasis>number-footnotes", ""));

script(defined("<emphasis role="bold">before.</emphasis>finish-styles", ""));
script(defined("<emphasis role="bold">do.</emphasis>finish-styles", "finish-styles.xed"));
script(defined("<emphasis role="bold">after.</emphasis>finish-styles", ""));
</programlisting><para>Examples:</para><itemizedlist><listitem><para>Remove script <literal>title-styled.xed</literal>:</para><programlisting>-p edit.do.title-styled ""
</programlisting></listitem><listitem><para>Replace script <literal>borders.xed</literal> by custom script “<literal>C:\Users\john\w2x tests\MyBorders.xed</literal>”:</para><programlisting>-pu edit.do.borders "C:\Users\john\w2x tests\MyBorders.xed"
</programlisting></listitem><listitem><para>Pass parameter <literal>finish-styles.css-uri</literal> to script <literal>finish-styles.xed</literal>:</para><programlisting>-p edit.finish-styles.css-uri css/manual.css
</programlisting><para>By convention (this is not strictly required), the name of a parameter which applies to a given XED script is prefixed with the basename of this script, without any file extension. Hence the full names of most parameters of Edit steps have the following syntax: <literal><replaceable>step_name</replaceable>.<replaceable>script_name</replaceable>.<replaceable>parameter_name</replaceable></literal>. Examples:</para><programlisting>-p edit.prune.preserve "p-ProgramListing"

-p edit.inlines.convert "c-Code code"
</programlisting></listitem><listitem><para>Execute script <literal>customize\patch_manual.xed</literal> before script <literal>finish-styles.xed</literal>:</para><programlisting>-pu edit.before.finish-styles customize\patch_manual.xed
</programlisting></listitem><listitem><para>Execute script <literal>customize\patch_manual.xed</literal> after script <literal>borders.xed</literal>:</para><programlisting>-pu edit.after.borders customize\patch_manual.xed
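</programlisting><para>In a complete command line, such an option precedes the input and output files. The following is only a sketch, assuming that it is run from <literal><replaceable>w2x_install_dir</replaceable>\doc\manual</literal> as in the previous chapter and that the default conversion to styled XHTML is wanted:</para><programlisting>..\..\bin\w2x -pu edit.after.borders customize\patch_manual.xed¬
 manual.docx out\manual.html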
</programlisting></listitem></itemizedlist></section></chapter><chapter xml:id="customize"><title>Customizing the output of w2x</title><section xml:id="customizing_styled_xhtml"><title>Customizing the XHTML+CSS files generated by w2x</title><section xml:id="modify_embedded_styles"><title>Using a XED script to modify the styles embedded in the XHTML+CSS file</title><para>By default, w2x adds a number of CSS rules to the /<literal>html</literal>/<literal>head</literal>/<literal>style</literal> element of the generated XHTML+CSS file. Example: excerpts from <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/manual.html</literal>:</para><programlisting>&lt;style type="text/css"&gt;
body {
    counter-reset: n-1-0 0 n-1-1 0 n-1-2 0 n-17-0 0 n-20-0 0;
    font-family: Calibri;
    font-size: 11pt;
}
...
&lt;/style&gt;
</programlisting><para>A <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/index.html">XED script</link> can modify not only the nodes of an XHTML document, but also its “CSS styles”. These “CSS styles” may be style properties contained in the <literal>style</literal> attribute of an element, class names found in the <literal>class</literal> attribute of an element, or the CSS rules of the document.</para><para>Therefore, when the desired customization is limited, it suffices to execute a XED script in order to modify the XHTML+CSS document created by the <link linkend="convert_step">Convert step</link>. Example:</para><programlisting>w2x <emphasis role="bold">-pu edit.before.finish-styles customize\patch_manual.xed</emphasis>¬
 manual.docx out\manual.html
</programlisting><para>where <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/customize/patch_manual.xed</literal> contains:</para><programlisting>set-rule(".p-ProgramListing", "white-space", "pre");
</programlisting><para>The above line adds CSS property “<literal>white-space: pre;</literal>” to the CSS rule having “<literal>.p-ProgramListing</literal>” as its selector. This CSS rule corresponds to the custom paragraph<footnote xml:id="__FN6__"><para>It’s a paragraph style because the CSS style name has a “<literal>p-</literal>” prefix.</para></footnote> style called “<literal>ProgramListing</literal>”.</para><para>Besides <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/set-rule.html">XED command set-rule</link>, the following commands may be used to edit the CSS styles contained in the XHTML+CSS document created by the Convert step: <literal>add-class</literal>, <literal>add-rule</literal>, <literal>remove-class</literal>, <literal>remove-rule</literal>, <literal>set-style</literal>.</para></section><section xml:id="appending_custom_styles"><title>Appending custom styles to the styles embedded in the XHTML+CSS file</title><para>XED script <literal><replaceable>w2x_install_dir</replaceable>/xed/finish-styles.xed</literal> has an optional <link linkend="finish_styles_custom_styles_url_or_file">custom-styles-url-or-file parameter</link> which makes it easy to customize the automatically generated CSS styles.</para><para>This parameter may be used to specify the location of a CSS file. The custom CSS styles found in the specified file are simply appended to the automatically generated CSS styles. Example:</para><programlisting>w2x <emphasis role="bold">-pu edit.finish-styles.custom-styles-url-or-file customize\custom.css</emphasis>¬
  manual.docx out\manual_restyled.html
</programlisting><para>where <literal>customize\custom.css</literal> contains:</para><programlisting>body {
    font-family: sans-serif;
}

.p-Heading1,
.p-Heading2,
.p-Heading3,
.p-Heading4,
.p-Heading5,
.p-Heading6 {
    font-family: serif;
    color: #17365D;
    padding: 1pt;
    border-bottom: 1pt solid #4F81BD;
    margin-bottom: 10pt;
    margin-left: 0pt;
    text-indent: 0pt;
}

.p-Heading1 {
    border-bottom-width: 2pt;
}

...

.c-FootnoteReference,
.c-EndnoteReference {
    font-size: smaller;
}
</programlisting></section><section xml:id="external_css_file"><title>Using an external CSS file rather than embedded CSS styles</title><para>XED script <literal><replaceable>w2x_install_dir</replaceable>/xed/finish-styles.xed</literal> has an optional <link linkend="finish_styles_css_uri">css-uri parameter</link> which specifies the CSS file where all CSS rules, whether automatically generated or custom, are to be saved.</para><para>Same example as <link linkend="appending_custom_styles">above</link> but using an external CSS file rather than embedded CSS styles:</para><programlisting>w2x <emphasis role="bold">-p edit.finish-styles.css-uri manual_restyled_css/manual.css</emphasis>¬
  -pu edit.finish-styles.custom-styles-url-or-file customize\custom.css¬
  manual.docx out\manual_restyled.html
</programlisting><para>All the CSS styles, whether automatically generated or the custom ones found in <literal>customize\custom.css</literal>, end up in <literal>manual_restyled_css\manual.css</literal>. Moreover, <literal>out\manual_restyled.html</literal> contains a link to <literal>manual_restyled_css\manual.css</literal>.</para><programlisting>&lt;link href="manual_restyled_css/manual.css"
      rel="stylesheet" type="text/css"/&gt;
</programlisting></section><section xml:id="combine_methods"><title>Combining all the above methods</title><para>It is of course possible to combine all the above methods. For example, the following <literal>w2x</literal> command is used to create <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/manual_restyled.html</literal>:</para><programlisting>w2x <emphasis role="bold">-pu edit.before.finish-styles customize\patch_manual_restyled.xed</emphasis>¬
<emphasis role="bold"> -p edit.finish-styles.css-uri manual_restyled_css/custom.css</emphasis>¬
 <emphasis role="bold">-pu edit.finish-styles.custom-styles-url-or-file customize\custom.css</emphasis>¬
 manual.docx out\manual_restyled.html
</programlisting><para>where <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/customize/patch_manual_restyled.xed</literal> contains:</para><programlisting>for-each /html/body/p[get-class("^p-Heading\d$")] {
    set-variable("class", get-class("^n-\d+-\d+$"));
    if $class != '' {
        set-variable("selector", concat(".", $class, ":after"));
        if find-rule($selector) &gt;= 0 {
            remove-rule($selector);

            set-variable("selector", concat(".", $class, ":before"));
            set-rule($selector, "float");
            set-rule($selector, "width");
            set-rule($selector, "content",
                     concat(get-rule($selector, "content"), ' " "'));
            set-rule($selector, "display", "inline");
        }
    }
}
</programlisting><para>The above <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/index.html">XED script</link>:</para><orderedlist><listitem><para>Deletes CSS rules like this one:</para><programlisting>.n-1-0:after {
    clear: both;
    content: "";
    display: block;
}
</programlisting></listitem><listitem><para>Modifies CSS rules like this one:</para><programlisting>.n-1-0:before {
    <emphasis role="bold">content: counter(n-1-0)</emphasis>;
    counter-increment: n-1-0;
    <emphasis role="bold">float: left</emphasis>;
    <emphasis role="bold">width: 21.6pt</emphasis>;
}
</programlisting><para>which becomes:</para><programlisting>.n-1-0:before {
    <emphasis role="bold">content: counter(n-1-0) " "</emphasis>;
    counter-increment: n-1-0;
    <emphasis role="bold">display: inline</emphasis>;
}
</programlisting></listitem></orderedlist><para>This script is useful because otherwise adding a bottom border to headings gives an ugly result. While the content of the heading is “underlined”, the CSS <literal>float</literal> containing the numbering value of the heading is not.</para><para>Besides <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/w2xfuncs.html#get-class">get-class</link>, the following XPath extension functions may be used to access the CSS styles contained in the XHTML+CSS document created by the <link linkend="convert_step">Convert step</link>: <literal>find-rule</literal>, <literal>font-size</literal>, <literal>get-rule</literal>, <literal>get-style</literal>, <literal>lookup-length</literal>, <literal>lookup-style</literal>, <literal>style-count</literal>.</para><note><para><emphasis role="bold">Why use XPath extension function </emphasis><link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.xmlmind.com/w2x/_distrib/doc/xedscript/w2xfuncs.html#get-class">get-class</link> <emphasis role="bold">and not </emphasis><literal>matches(@class,pattern)</literal><emphasis role="bold">?</emphasis></para><para>The answer is: because <emphasis>all </emphasis><literal>class</literal><emphasis> attributes have been removed</emphasis> by XED script <literal><replaceable>w2x_install_dir</replaceable>/xed/init-styles.xed</literal>.</para><para>This script “interns” the CSS rules found in the <literal>html</literal>/<literal>head</literal>/<literal>style</literal> element of the XHTML+CSS document, the CSS styles directly set on some elements and the CSS classes set on some elements.
</para><para>This operation is needed to allow an efficient implementation of the following XPath extension functions: <literal>find-rule</literal>, <literal>font-size</literal>, <literal>get-class</literal>, <literal>get-rule</literal>, <literal>get-style</literal>, <literal>lookup-length</literal>, <literal>lookup-style</literal>, <literal>style-count</literal>, and of the following editing commands: <literal>add-class</literal>, <literal>add-rule</literal>, <literal>remove-class</literal>, <literal>remove-rule</literal>, <literal>set-rule</literal>, <literal>set-style</literal>.</para><para>More information about “interned” CSS styles is found in <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/parse-styles.html">command parse-styles</link> (command invoked by <literal><replaceable>w2x_install_dir</replaceable>/xed/init-styles.xed</literal>) and in its inverse, <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/unparse-styles.html">command unparse-styles</link><literal> </literal>(command invoked by <literal><replaceable>w2x_install_dir</replaceable>/xed/finish-styles.xed</literal>).</para></note></section></section><section xml:id="customizing_semantic_xml"><title>Customizing the semantic XML files generated by w2x</title><section xml:id="convert_character_styles"><title>Converting custom character styles to semantic tags</title><para>Converting a custom character style to an XHTML element (possibly having specific attributes) is simple and does not require writing a XED script. It suffices to pass <link linkend="param_inlines_convert">parameter inlines.convert</link> to the <link linkend="edit_step">Edit step</link>.</para><para>Example 1: convert text spans having a “<literal>Code</literal>” character style to XHTML element <literal>code</literal>:</para><programlisting>-p edit.inlines.convert "c-Code code"
</programlisting><note><para>Notice that the name of a character style in the generated XHTML+CSS file is always prefixed by “<literal>c-</literal>”.</para></note><para>The syntax for the value of parameter <literal>inlines.convert</literal> is:</para><programlisting><emphasis>value</emphasis> →  <emphasis>conversion</emphasis> [ S ‘!’ S  <emphasis>conversion</emphasis> ]*
<emphasis>conversion</emphasis> → <emphasis>style_spec</emphasis> S <emphasis>XHTML_element_name</emphasis> [ S <emphasis>attribute</emphasis> ]*
<emphasis>style_spec</emphasis> →  <emphasis>style_name</emphasis> | <emphasis>style_pattern</emphasis>
<emphasis>style_pattern</emphasis>  → ‘/’ <emphasis>pattern</emphasis> ‘/’ | ‘^’ <emphasis>pattern</emphasis> ‘$’
<emphasis>attribute</emphasis> → <emphasis>attribute_name</emphasis> ‘=’ <emphasis>quoted_attribute_value</emphasis>
<emphasis>quoted_attribute_value</emphasis> →  “’” <emphasis>value</emphasis> “’” | ‘”’ <emphasis>value</emphasis> ‘”’
</programlisting><para>Example 2: in addition to what is done in example 1 above, convert text spans having an “<literal>Abbrev</literal>” character style to XHTML element <literal>abbr</literal> having a <literal>title='???'</literal> attribute:</para><programlisting>-p edit.inlines.convert "c-Code code ! c-Abbrev abbr title='???'"
</programlisting><para>What if the semantic XHTML created by the Edit step is then converted to DITA or DocBook by means of a <link linkend="transform_step">Transform step</link>?</para><para>In the case of XHTML elements <literal>code</literal> and <literal>abbr</literal>, there is nothing else to do because the stock XSLT stylesheets already support these elements:</para><itemizedlist><listitem><para><literal><replaceable>w2x_install_dir</replaceable>/xslt/topic.xslt</literal> converts XHTML <literal>code</literal> to DITA <literal>codeph</literal> and XHTML <literal>abbr</literal> to DITA <literal>keyword</literal>,</para></listitem><listitem><para><literal><replaceable>w2x_install_dir</replaceable>/xslt/docbook.xslt</literal> converts XHTML <literal>code</literal> to DocBook <literal>code</literal> and XHTML <literal>abbr</literal> to DocBook <literal>abbrev</literal>.</para></listitem></itemizedlist><para>The general case, which also requires using custom XSLT stylesheets, is explained in section <xref linkend="general_customize_semantic_xml"/>.</para></section><section xml:id="convert_paragraph_styles"><title>Converting custom paragraph styles to semantic tags</title><para>Converting a custom paragraph style to an XHTML element (possibly having specific attributes) is simple and does not require writing a XED script. It suffices to pass <link linkend="param_blocks_convert">parameter blocks.convert</link> to the <link linkend="edit_step">Edit step</link>.</para><para>Example 1.a: convert paragraphs having a “<literal>ProgramListing</literal>” paragraph style to XHTML element <literal>pre</literal>:</para><programlisting>-p edit.blocks.convert "p-ProgramListing pre"
</programlisting><note><para>Notice that the name of a paragraph style in the generated XHTML+CSS file is always prefixed by “<literal>p-</literal>”.</para></note><para>If you use the above <literal>blocks.convert</literal> specification, it will work fine, except that you’ll end up with several consecutive <literal>pre</literal> elements (one <literal>pre</literal> per line of program listing). This is clearly not what you want. You want consecutive <literal>pre</literal> elements to be merged into a single <literal>pre</literal> element. Fortunately, implementing this is also quite simple.</para><para>Example 1.b: convert paragraphs having a “<literal>ProgramListing</literal>” paragraph style to XHTML element <literal>span</literal> (having <emphasis>grouping attributes</emphasis>; more about this below):</para><programlisting>-p edit.blocks.convert "p-ProgramListing span g:id='pre' g:container='pre'"
</programlisting><para>When any of the target XHTML elements has <emphasis>grouping attributes</emphasis> (<literal>g:id='pre'</literal><footnote xml:id="__FN7__"><para>Any value would do (e.g. <literal>g:id='foo'</literal> would have worked as well). It suffices that all the consecutive elements to be grouped have the same <literal>g:id</literal> attribute.</para></footnote>, <literal>g:container='pre'</literal>, in the above example), then <literal><replaceable>w2x_install_dir</replaceable>/xed/blocks.xed</literal> automatically invokes <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/group.html">the group() command</link> at the end of the conversions. This has the effect of grouping consecutive <literal>&lt;span g:id='pre' g:container='pre'&gt;</literal> elements into a common <literal>pre</literal> parent element.</para><para>Given that XED command <literal>group()</literal> automatically removes grouping attributes when done and that <literal><replaceable>w2x_install_dir</replaceable>/xed/finish.xed</literal> discards all useless <literal>span</literal> elements, this leaves us with clean <literal>pre</literal> elements containing text<footnote xml:id="__FN8__"><para>Unless you specify:</para><programlisting>-p edit.prune.preserve "p-ProgramListing"
</programlisting><para>script <literal><replaceable>w2x_install_dir</replaceable>/xed/prune.xed</literal> will cause blank lines to be stripped from the generated <literal>pre</literal> element.</para></footnote>.</para><para>The syntax for the value of parameter <literal>blocks.convert</literal> is:</para><programlisting><emphasis>value</emphasis> →  <emphasis>conversion</emphasis> [ S ‘!’ S  <emphasis>conversion</emphasis> ]*
<emphasis>conversion</emphasis> → <emphasis>style_spec</emphasis> S <emphasis>XHTML_element_name</emphasis> [ S <emphasis>attribute</emphasis> ]*
<emphasis>style_spec</emphasis> →  <emphasis>style_name</emphasis> | <emphasis>style_pattern</emphasis>
<emphasis>style_pattern</emphasis>  → ‘/’ <emphasis>pattern</emphasis> ‘/’ | ‘^’ <emphasis>pattern</emphasis> ‘$’
<emphasis>attribute</emphasis> → <emphasis>attribute_name</emphasis> ‘=’ <emphasis>quoted_attribute_value</emphasis>
<emphasis>quoted_attribute_value</emphasis> →  “’” <emphasis>value</emphasis> “’” | ‘”’ <emphasis>value</emphasis> ‘”’
</programlisting><para>Example 3: in addition to what is done in example 1.b above, convert paragraphs having a “<literal>Term</literal>” paragraph style to XHTML element <literal>dt</literal>, convert paragraphs having a “<literal>Definition</literal>” paragraph style to XHTML element <literal>dd</literal> and group consecutive <literal>dt</literal> and <literal>dd</literal> elements into a common <literal>dl</literal> parent:</para><programlisting>-p edit.blocks.convert "p-Term dt g:id='dl' g:container='dl' !¬
 p-Definition dd g:id='dl' g:container='dl' !¬
 p-ProgramListing span g:id='pre' g:container='pre'"
</programlisting><para>What if the semantic XHTML created by the Edit step is then converted to DITA or DocBook by means of a <link linkend="transform_step">Transform step</link>?</para><para>In the case of XHTML elements <literal>pre</literal>, <literal>dt</literal>, <literal>dd</literal> and <literal>dl</literal>, there is nothing else to do because the stock XSLT stylesheets already support these elements.</para><para>The general case, which also requires using custom XSLT stylesheets, is explained in section <xref linkend="general_customize_semantic_xml"/>.</para></section><section xml:id="general_customize_semantic_xml"><title>The general case</title><para>In the general case, customizing the semantic XML files generated by w2x requires writing both a XED script and an XSLT stylesheet.</para><para>For example, let’s suppose we want to group all the paragraphs having a “<literal>Note</literal>” paragraph style and to generate DocBook and DITA <literal>note</literal> elements for such groups.</para><para>The following <link linkend="param_blocks_convert">blocks.convert parameter</link> would make it very easy to create the desired groups:</para><programlisting>-p edit.blocks.convert "p-Note p g:id='note_group_member'¬
 g:container='div class=\"role-note\"'"
</programlisting><para>However, this would leave us with two unsolved problems:</para><orderedlist numeration="loweralpha"><listitem><para>A paragraph having a “<literal>Note</literal>” paragraph style often starts with bold text “<literal>Note:</literal>”. We want to eliminate this redundant label.</para></listitem><listitem><para>The stock XSLT stylesheets will not convert XHTML element <literal>&lt;div class="role-note"&gt;</literal> to a DocBook or DITA <literal>note</literal> element.</para></listitem></orderedlist><para><emphasis role="bold">A custom XED script</emphasis></para><para>The first problem is solved by the following <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/customize/notes.xed</literal> script:</para><programlisting>namespace "http://www.w3.org/1999/xhtml";
namespace html = "http://www.w3.org/1999/xhtml";
namespace g = "urn:x-mlmind:namespace:group";

for-each /html/body//p[get-class("p-Note")] {
    delete-text("note:\s*", "i");
    if content-type() &lt;= 1 and not(@id) {
        delete();
    } else {
        remove-class("p-Note");
        set-attribute("g:id", "note_group_member");
        set-attribute("g:container", "div class='role-note'");
    }
}

group();
</programlisting><para>The “<literal>Note:</literal>” label, if any, is deleted using <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/delete-text.html">XED command delete-text</link>. If doing this leaves a useless empty paragraph (<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/w2xfuncs.html#content-type">content-type</link><literal>() &lt;= 1</literal>), then this paragraph is deleted using <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/delete.html">XED command delete</link>.</para><para>The above script is executed after stock script <literal><replaceable>w2x_install_dir</replaceable>/xed/blocks.xed</literal> by means of the following <literal>w2x</literal> command-line option:</para><programlisting>-pu edit.after.blocks customize\notes.xed
</programlisting><para><emphasis role="bold">A custom XSLT stylesheet</emphasis></para><para>The second problem is solved by the following <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/customize/custom_topic.xslt</literal> XSLT 1.0 stylesheet:</para><programlisting>&lt;xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="h"&gt;

<emphasis role="bold">&lt;xsl:import href="w2x:xslt/topic.xslt"/&gt;</emphasis>

&lt;xsl:template match="h:div[@class = 'role-note']"&gt;
  &lt;note&gt;
    &lt;xsl:call-template name="processCommonAttributes"/&gt;
    &lt;xsl:apply-templates/&gt;
  &lt;/note&gt;
&lt;/xsl:template&gt;
...
&lt;/xsl:stylesheet&gt;
</programlisting><para>This stylesheet, which imports stock <literal><replaceable>w2x_install_dir</replaceable>/xslt/topic.xslt</literal>, is used for the <literal>topic</literal>, <literal>map</literal> and <literal>bookmap</literal> output formats (see <link linkend="option_o">-o option</link>). Similar, very simple, stylesheets have been developed for the <literal>docbook</literal> and <literal>docbook5</literal> output formats.</para><note><para>Something like “<literal>w2x:xslt/topic.xslt</literal>” is an absolute URL supported by w2x. “<literal>w2x:</literal>” is a URL prefix (defined in the automatic XML catalog used by w2x) which specifies the location of the parent directory of both the <literal>xed/</literal> and <literal>xslt/</literal> subdirectories.</para></note><para>The above stylesheet replaces the stock one by means of the following <literal>w2x</literal> command-line option:</para><programlisting>-o topic <emphasis role="bold">-t customize\custom_topic.xslt</emphasis>
</programlisting><para>Do not forget to specify the <literal>-t</literal> option <emphasis>after</emphasis> the <literal>-o</literal> option, because it’s the <literal>-o</literal> option which implicitly invokes stock <literal><replaceable>w2x_install_dir</replaceable>/xslt/topic.xslt</literal> (this has been explained in chapter <xref linkend="going_further"/>) and we want to use <literal>-t</literal> to override the use of the stock XSLT stylesheet.</para><note><para><emphasis role="bold">Tip:</emphasis> You’ll find a template for custom XED scripts and several templates for custom XSLT stylesheets in <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/templates/</literal>.</para><para>For example, in order to create <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/customize/custom_topic.xslt</literal>, we started by copying template XSLT stylesheet <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/templates/template_topic.xslt</literal>.</para></note></section></section><section xml:id="generating_custom_xml"><title>Generating XML conforming to a custom schema</title><para>In order to use w2x to convert a DOCX input file to an XML output file conforming to your custom schema, all you have to do is write a custom <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/1999/REC-xslt-19991116">XSLT 1.0</link> stylesheet converting the “semantic” XHTML 1.0 Transitional generated by the <link linkend="edit_step">Edit step</link> to your custom schema.</para><para>Let’s call your custom XSLT 1.0 stylesheet “<literal>C:\Users\John\foo\xsl\xhtml_to_foo.xsl</literal>”. Command-line tool <literal>w2x</literal> must then be passed the following options:</para><itemizedlist><listitem><para><literal>-c</literal></para><para>Execute a <link linkend="convert_step">Convert step</link> called “<literal>convert</literal>”. 
<indexterm><primary>-c, option</primary></indexterm></para></listitem><listitem><para><literal>-e <replaceable>XED_URL_or_file</replaceable></literal></para><para>Execute an <link linkend="edit_step">Edit step</link> called “<literal>edit</literal>”. </para><para>Example: <literal>-e w2x:xed/main.xed</literal>. Pass this stock XED script (converting the styled XHTML 1.0 Transitional created by the <link linkend="convert_step">Convert step</link> to “semantic” XHTML) to the conversion step called “<literal>edit</literal>”.<indexterm><primary>-e, option</primary></indexterm></para></listitem><listitem><para><literal>-t <replaceable>XSLT_URL_or_file</replaceable></literal></para><para>Execute a <link linkend="transform_step">Transform step</link> called “<literal>transform</literal>”. </para><para>Example: <literal>-t "C:\Users\John\foo\xsl\xhtml_to_foo.xsl"</literal>.</para><para>Pass your custom XSLT 1.0 stylesheet to the conversion step called “<literal>transform</literal>”.<indexterm><primary>-t, option</primary></indexterm></para></listitem></itemizedlist><para>Stock XED script <literal>w2x:xed/main.xed</literal> creates a number of semantic XHTML elements having a <literal>class</literal> attribute starting with “<literal>role-</literal>”. Examples: <literal>&lt;div class="role-section1"&gt;</literal>, <literal>&lt;div class="role-section2"&gt;</literal>, <literal>&lt;div class="role-figure"&gt;</literal>, <literal>&lt;div class="role-figcaption"&gt;</literal>, <literal>&lt;a class="role-footnote-ref"&gt;</literal>, <literal>&lt;div class="role-footnote"&gt;</literal>, <literal>&lt;a class="role-xref"&gt;</literal>, <literal>&lt;span class="role-index-term"&gt;</literal>, etc. 
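For illustration, here is a minimal sketch of how a custom stylesheet might handle one of these elements, following the same pattern as the <literal>role-note</literal> template shown in section <xref linkend="general_customize_semantic_xml"/>. The output element <literal>figure</literal> is a hypothetical element of your own schema, not something defined by w2x, and the exact-match test on <literal>@class</literal> is an assumption borrowed from that earlier example:<programlisting>&lt;xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="h"&gt;

&lt;!-- Assumption: "figure" belongs to your own schema; rename as needed. --&gt;
&lt;xsl:template match="h:div[@class = 'role-figure']"&gt;
  &lt;figure&gt;
    &lt;!-- Process the children of the div using your other templates. --&gt;
    &lt;xsl:apply-templates/&gt;
  &lt;/figure&gt;
&lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
</programlisting>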
To learn how to process these elements, the simplest approach is to look at how this is done in a stock XSLT stylesheet such as “<literal><replaceable>w2x_install_dir</replaceable>/xslt/topic.xslt</literal>” or “<literal><replaceable>w2x_install_dir</replaceable>/xslt/docbook.xslt</literal>”.</para></section><section xml:id="w2x_plugin"><title>Packaging your customization as a w2x plugin</title><para><indexterm><primary>plugin</primary></indexterm>Command-line utility <literal>w2x</literal> and <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/index.html">desktop application w2x-app</link> support <emphasis>plugins</emphasis>.</para><para>Let’s suppose you have created a plugin called “<literal>rss</literal>” which may be used to convert DOCX to <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.rssboard.org/rss-specification">RSS</link>. Once registered with w2x, this plugin may be invoked as if it were a stock conversion, for example:</para><programlisting>w2x -o rss my.docx my.xml
</programlisting><para>Another example, using a plugin called “<literal>wh5_zip</literal>” (see description <link linkend="wh5_zip_plugin">below</link>):</para><programlisting>w2x -o wh5_zip -p zip.include-top-dir false my.docx my.zip
</programlisting><para>In <literal>w2x-app</literal>, you'll find the registered plugins in <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/converting_docx_to_xml.html">the "Convert to" combobox</link> and in <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/w2x_app_help/options_wizard_format_screen.html">the "Output format" screen of the setup assistant</link>.</para><section xml:id="w2x_plugin_format"><title>Anatomy of a plugin</title><para><indexterm><primary>plugin</primary><secondary>format</secondary></indexterm><indexterm><primary>w2x_plugin, file extension</primary><see>plugin</see></indexterm>A plugin is simply a plain text file, using the <literal>UTF-8</literal> character encoding, having a "<literal>.w2x_plugin</literal>" file suffix, containing a number of <literal>w2x</literal> command-line arguments and starting with comment lines containing information about the plugin (for example, its name). Example, <literal><replaceable>w2x_install_dir</replaceable>/sample_plugins/rss/rss.w2x_plugin</literal>:</para><programlisting>### plugin.name: rss
### plugin.outputDescription: RSS 2.0
### plugin.outputExtension: xml
### plugin.multiFileOutput: no

-c
-e w2x:xed/main.xed
-t rss.xslt

# Image files not useful here.
-step:com.xmlmind.w2x.processor.DeleteFilesStep:cleanUp
-p cleanUp.files "%{~pO}/%{~nO}_files"
</programlisting><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="31*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="44*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Field Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Default Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry align="center"><para><literal>plugin.name:</literal></para></entry><entry><para>Basename of the "<literal>.w2x_plugin</literal>" file without its extension.</para></entry><entry><para>The name of the plugin (a single word).</para></entry></row><row><entry align="center"><para><literal>plugin.outputDescription:</literal></para></entry><entry><para>The name of the plugin.</para></entry><entry><para>A short description (just a few words) of the output format of this plugin.</para></entry></row><row><entry align="center"><para><literal>plugin.outputExtension:</literal></para></entry><entry><para><literal>xml</literal></para></entry><entry><para>Preferred extension for the files created by this plugin.</para></entry></row><row><entry align="center"><para><literal>plugin.multiFileOutput:</literal></para></entry><entry><para><literal>no</literal></para></entry><entry><para>Whether this plugin creates multiple files or just a single one. A boolean: “<literal>true</literal>”, “<literal>yes</literal>”, “<literal>on</literal>”, “<literal>1</literal>” or “<literal>false</literal>”, “<literal>no</literal>”, “<literal>off</literal>”, “<literal>0</literal>”.</para></entry></row></tbody></tgroup></informaltable><para>The above <literal>rss</literal> plugin converts DOCX to <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.rssboard.org/rss-specification">RSS</link>. 
This process is partly implemented by XSLT 1.0 stylesheet <literal><replaceable>w2x_install_dir</replaceable>/sample_plugins/rss/rss.xslt</literal>, which is part of this plugin. Stylesheet <literal>rss.xslt</literal> transforms its input, the semantic XHTML 1.0 Transitional file created by <link linkend="edit_step">the Edit step</link> (invoked using <literal>-e w2x:xed/main.xed</literal>), to RSS.</para><para>Besides XSLT 1.0 stylesheets, a plugin may also include <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/index.html">XED scripts</link> as well as "<literal>.jar</literal>" files containing support code and/or custom conversion steps implemented in Java™. Example, <literal><replaceable>w2x_install_dir</replaceable>/sample_plugins/wh5_zip/wh5_zip.w2x_plugin</literal>:</para><programlisting>### plugin.outputDescription: Web Help ZIP containing "semantic" (X)HTML 5.0
### plugin.outputExtension: zip

-o webhelp5
-p webhelp.split-before-level 8
-p webhelp.use-id-as-filename yes
-p webhelp.omit-toc-root yes
-p webhelp.wh-layout simple

# Generate all HTML files in a subdirectory of the output directory 
# having the same basename as the ".zip" output file.

-p convert.xhtml-file "%{~pO}/%{~nO}/%{~nO}.xhtml"

-p transform.out-file "%{~pO}/%{~nO}/%{~nO}_tmp.xhtml"

-p webhelp.out-file "%{~pO}/%{~nO}/%{~nO}.html"

-p cleanUp.files "%{~pO}/%{~nO}/%{~nO}_tmp.xhtml"

-step:ZipStep:zip 
-p zip.out-file "%{O}"
</programlisting><para xml:id="wh5_zip_plugin">The above <literal>wh5_zip</literal> plugin specializes the stock conversion called <literal>webhelp5</literal> (Web Help containing XHTML 5.0) by giving specific values to some of its parameters (e.g. <literal>-p webhelp.wh-layout simple</literal>) and also by archiving all the output files in a single “<literal>.zip</literal>” file.</para><para>This last step, <literal>-step:ZipStep:zip</literal>, is implemented by a <link linkend="custom_convert_step">custom conversion step</link> found in <literal><replaceable>w2x_install_dir</replaceable>/sample_plugins/wh5_zip/src/ZipStep.java</literal>. This Java™ code is compiled and archived in <literal><replaceable>w2x_install_dir</replaceable>/sample_plugins/wh5_zip/zip_step.jar</literal> by means of <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ant.apache.org/">ant</link> build file <literal><replaceable>w2x_install_dir</replaceable>/sample_plugins/wh5_zip/src/build.xml</literal>.</para><para>Note that these "<literal>.jar</literal>" files, just like the "<literal>.w2x_plugin</literal>" files, are automatically discovered and loaded by <literal>w2x</literal> and <literal>w2x-app</literal> during their startup phase.</para></section><section xml:id="w2x_plugin_register"><title>Registering a plugin with w2x</title><para><indexterm><primary>plugin</primary><secondary>registry</secondary></indexterm>A plugin is registered with both <literal>w2x</literal> and <literal>w2x-app</literal> by copying all its files anywhere inside directory <literal><replaceable>w2x_install_dir</replaceable>/plugin/</literal>.</para><para>However, it is strongly recommended to group all the files comprising a plugin in a subdirectory of its own having the same name as the plugin (e.g. <literal><replaceable>w2x_install_dir</replaceable>/plugin/rss/</literal>). 
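For instance, assuming a Linux or macOS shell in which the hypothetical variable <varname>W2X_HOME</varname> (not something defined by w2x) holds the path of <literal><replaceable>w2x_install_dir</replaceable></literal>, the sample <literal>rss</literal> plugin could be registered along these lines:<programlisting># W2X_HOME is an assumption made for this sketch; set it to your w2x_install_dir.
mkdir -p "$W2X_HOME/plugin/rss"

# Copy the plugin descriptor and the stylesheet it references.
cp "$W2X_HOME/sample_plugins/rss/rss.w2x_plugin" \
   "$W2X_HOME/sample_plugins/rss/rss.xslt" \
   "$W2X_HOME/plugin/rss/"
</programlisting>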
</para><note><para>If the <literal>.dmg</literal> distribution has been used to install XMLmind Word To XML on the Mac, the plugin directory is <literal>WordToXML.app/Contents/Resources/w2x/plugin/</literal>.</para></note><para>Alternatively, this plugin may be installed anywhere you want provided that the directory containing the "<literal>.w2x_plugin</literal>" file is referenced in the <varname>W2X_PLUGIN_PATH</varname> environment variable<indexterm><primary>W2X_PLUGIN_PATH, environment variable</primary></indexterm>. Example: <literal>set W2X_PLUGIN_PATH=C:\Users\John\w2x\rss;C:\temp\w2x_plugins</literal>.</para><para>The <varname>W2X_PLUGIN_PATH</varname> environment variable (or, equivalently, the <literal>W2X_PLUGIN_PATH</literal> Java™ system property; e.g. <literal>-DW2X_PLUGIN_PATH=C:\Users\John\w2x\rss;C:\temp\w2x_plugins</literal>) may contain absolute or relative directory paths separated by semicolons ("<literal>;</literal>"). A relative path is relative to the current working directory.</para><para>The <varname>W2X_PLUGIN_PATH</varname> environment variable may also contain "<literal>+</literal>", which is a shorthand for <literal><replaceable>w2x_install_dir</replaceable>/plugin/</literal>. Windows example: <literal>set W2X_PLUGIN_PATH=..\sample_plugins;+</literal>. Linux/macOS example (the value is quoted because an unquoted semicolon would end the shell command): <literal>export W2X_PLUGIN_PATH="+;/home/john/w2x_plugins"</literal>.</para></section></section></chapter><chapter xml:id="w2x_command"><title>The w2x command-line utility</title><note><para xml:id="where_is_w2x_when_dmg_install">If the <literal>.dmg</literal> distribution has been used to install XMLmind Word To XML on the Mac, the <literal>w2x</literal> command-line utility is found in <literal>WordToXML.app/Contents/Resources/w2x/bin/</literal>.</para></note><para>Usage:</para><programlisting>w2x [-version] [-v|-vv|-vvv] [Options]
    in_docx_file out_file
    | -batch out_spec in_docx_file1 ... in_docx_fileN
    | -printenv
    | -liststeps
</programlisting><variablelist><varlistentry><term>-v<indexterm><primary>-v, option</primary></indexterm></term><term>-vv<indexterm><primary>-vv, option</primary></indexterm></term><term>-vvv<indexterm><primary>-vvv, option</primary></indexterm></term><listitem><para> Verbose. More Vs means more verbose.</para></listitem></varlistentry><varlistentry><term>-version<indexterm><primary>-version, option</primary></indexterm></term><listitem><para> Print version number and exit.</para></listitem></varlistentry><varlistentry><term>-batch<indexterm><primary>-batch, option</primary></indexterm> <emphasis>out_spec</emphasis> <emphasis>in_docx_file1</emphasis> … <emphasis>in_docx_fileN</emphasis></term><listitem><para>Convert all specified input DOCX files. <emphasis>out_spec</emphasis> specifies the absolute or relative path of the output files. It may contain the following variables: <varname>@{name}</varname>, basename of the input file without any extension, <varname>@{parent}</varname>, absolute path of the directory containing the input file. Example:</para><programlisting>C:\Users\jane&gt; w2x -o docbook5 -batch pub\@{name}.xml Documents\*.docx</programlisting><para>Same example but this time, convert all DOCX files in place:</para><programlisting>C:\Users\jane&gt; w2x -o docbook5 -batch @{parent}\@{name}.xml Documents\*.docx</programlisting></listitem></varlistentry><varlistentry><term xml:id="option_liststeps">-printenv<indexterm><primary>-printenv, option</primary></indexterm></term><listitem><para>Print supported environment variables/system properties and exit. Example:</para><programlisting>C:\&gt; w2x -printenv
W2X_TRACE=
(Supported values are: "image", "math" or "all".)

W2X_IMAGE_CONVERSIONS=
.wmf.svg java:com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory;
.emf.png.wmf.png java:com.xmlmind.w2x_ext.emf2png.EMF2PNG;
.bmp.jpg.bmp.jpeg.bmp.png.gif.jpg.gif.jpeg.gif.png
.jpeg.png.jpg.png.png.jpg.png.jpeg.tif.jpg.tif.jpeg
.tif.png.tiff.jpg.tiff.jpeg.tiff.png.wbmp.jpg.wbmp.jpeg
.wbmp.png java:com.xmlmind.w2x.docx.image.ImageConverterFactoryImpl</programlisting></listitem></varlistentry><varlistentry><term>-liststeps<indexterm><primary>-liststeps, option</primary></indexterm></term><listitem><para>List the conversion steps to be executed and exit. This option is useful to determine how to customize the conversion steps. Example:</para><programlisting>$ w2x -o bookmap -liststeps
-step:com.xmlmind.w2x.processor.ConvertStep:convert
-p convert.create-mathml-object no
-p convert.set-column-number yes
-step:com.xmlmind.w2x.processor.EditStep:edit
-p edit.xed-url-or-file file:/opt/w2x/xed/main.xed
-step:com.xmlmind.w2x.processor.TransformStep:transform
-p transform.out-file %{~pnO}.dita
-p transform.single-topic no
-p transform.xslt-url-or-file file:/opt/w2x/xslt/topic.xslt
-step:com.xmlmind.w2x.processor.TransformStep:transform2
-p transform2.xslt-url-or-file file:/opt/w2x/xslt/bookmap.xslt
-p transform2.topic-type %{transform.topic-type}
-p transform2.output-path %{~po}
-step:com.xmlmind.w2x.processor.DeleteFilesStep:cleanUp
-p cleanUp.files %{~pnO}.dita</programlisting><para><indexterm><primary>plugin</primary></indexterm>The <literal>-liststeps</literal> option is also useful when developing a <link linkend="w2x_plugin">plugin</link>. It may be used to learn how a stock conversion (e.g. bookmap) is implemented to get some inspiration when developing your own plugin.</para></listitem></varlistentry></variablelist><para>Other options are:</para><variablelist><varlistentry><term xml:id="option_o">-o<indexterm><primary>-o, option</primary></indexterm> <emphasis>format</emphasis></term><listitem><para>This option automatically adds all the steps needed to convert input DOCX file to an output file having specified format. </para><para>Possible formats: <literal>docbook</literal>, <literal>docbook5</literal>, <literal>assembly</literal> (<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tdg.docbook.org/tdg/5.1/ch06.html">DocBook V5.1 assembly</link>), <literal>topic</literal>, <literal>map</literal>, <literal>bookmap</literal>, <literal>xhtml_css</literal> (single-page styled HTML, that is, single-page XHTML+CSS), <literal>xhtml_strict</literal>, <literal>xhtml_loose</literal>, <literal>xhtml1_1</literal>, <literal>xhtml5</literal>, <literal>frameset</literal> (multi-page styled HTML), <literal>frameset_strict</literal> (multi-page XHTML 1.0 Strict), <literal>frameset_loose</literal> (multi-page XHTML 1.0 Transitional), <literal>frameset1_1</literal> (multi-page XHTML 1.1), <literal>frameset5</literal> (multi-page XHTML 5.0), <literal>webhelp</literal> (Web Help containing styled HTML), <literal>webhelp_strict</literal> (Web Help containing XHTML 1.0 Strict), <literal>webhelp_loose</literal> (Web Help containing XHTML 1.0 Transitional), <literal>webhelp1_1</literal> (Web Help containing XHTML 1.1), <literal>webhelp5</literal> (Web Help containing XHTML 5.0), <literal>epub</literal> (EPUB 2 containing styled XHTML 1.1), <literal>epub1_1</literal> (EPUB 2 containing 
semantic XHTML 1.1).</para><para>The default output format is: <literal>xhtml_css</literal> (single-page styled HTML, that is, single-page XHTML+CSS).</para></listitem></varlistentry><varlistentry><term xml:id="option_p">-p<indexterm><primary>-p, option</primary></indexterm> <emphasis>name</emphasis> <emphasis>value</emphasis></term><listitem><para>Set parameter <emphasis>name</emphasis> to <emphasis>value</emphasis>.</para><para>Use parameter <literal><replaceable>step_name</replaceable>.param_name</literal> to parameterize the step called <emphasis>step_name</emphasis>.</para><para>Because they are used to parameterize named steps, the order of the <literal>-p</literal> and <literal>-pu</literal> options relative to the options specifying conversion steps (<literal>-c</literal>, <literal>-e</literal>, <literal>-t</literal>, <literal>-step</literal>, etc.) is not significant. For example: “<literal>-p convert.charset UTF-8 -c</literal>” is equivalent to “<literal>-c -p convert.charset UTF-8</literal>”.</para></listitem></varlistentry><varlistentry><term xml:id="option_pu">-pu<indexterm><primary>-pu, option</primary></indexterm> <emphasis>name</emphasis> <emphasis>URL_or_file</emphasis></term><listitem><para> Same as <literal>-p</literal>, except that parameter value <emphasis>URL_or_file</emphasis> is first converted to a URL.</para><para> <emphasis>URL_or_file</emphasis> is an absolute or relative URL (relative to the current <literal>-f</literal> options file, if any, or to the current working directory otherwise) or the filename of an existing file or directory.</para></listitem></varlistentry><varlistentry><term xml:id="option_c">-c<indexterm><primary>-c, option</primary></indexterm></term><listitem><para>Add or replace “convert” step. This step converts the input DOCX file to an in-memory XHTML+CSS document.</para></listitem></varlistentry><varlistentry><term>-l<indexterm><primary>-l, option</primary></indexterm></term><listitem><para>Add or replace “load” step. 
This step, mainly used to test XED scripts, loads input XML file.</para></listitem></varlistentry><varlistentry><term xml:id="option_e">-e<indexterm><primary>-e, option</primary></indexterm> <emphasis>xed_URL_or_file</emphasis></term><listitem><para>Add or replace “edit” step. This step edits in place input XHTML document using XED script <emphasis>xed_URL_or_file</emphasis>.</para></listitem></varlistentry><varlistentry><term>-e2<indexterm><primary>-e2, option</primary></indexterm> <emphasis>xed_URL_or_file</emphasis></term><listitem><para>Add or replace “edit2” step. This step edits in place input XHTML document using XED script <emphasis>xed_URL_or_file</emphasis>.</para></listitem></varlistentry><varlistentry><term xml:id="option_t">-t<indexterm><primary>-t, option</primary></indexterm> <emphasis>xslt_URL_or_file</emphasis></term><listitem><para>Add or replace “transform” step. This step transforms input  XML document or file using XSLT stylesheet <emphasis>xslt_URL_or_file</emphasis>.</para><para> The output file is specified by parameter <literal>transform.out-file</literal>.</para></listitem></varlistentry><varlistentry><term>-t2<indexterm><primary>-t2, option</primary></indexterm> <emphasis>xslt_URL_or_file</emphasis></term><listitem><para> Add or replace “transform2” step. This step transforms input XML document or file using XSLT stylesheet <emphasis>xslt_URL_or_file</emphasis>.</para><para>The output file is specified by parameter <literal>transform2.out-file</literal>.</para></listitem></varlistentry><varlistentry><term>-s<indexterm><primary>-s, option</primary></indexterm></term><listitem><para>Add or replace “save” step. 
This step saves the input XHTML document to disk.</para><para>The output file is specified by parameter <literal>save.out-file</literal>.</para></listitem></varlistentry><varlistentry><term xml:id="option_step">-step:<emphasis>java_class_name</emphasis>:<emphasis>step_name</emphasis><indexterm><primary>-step, option</primary></indexterm></term><listitem><para> Add or replace the step called <emphasis>step_name</emphasis> with an instance of Java™ class <emphasis>java_class_name</emphasis> deriving from <literal>com.xmlmind.w2x.processor.ProcessStep</literal>.</para></listitem></varlistentry><varlistentry><term>-f<indexterm><primary>-f, option</primary></indexterm> <emphasis>options_URL_or_file</emphasis></term><listitem><para> Load one or more of the above options from <emphasis>options_URL_or_file</emphasis>, a plain UTF-8 text file.</para></listitem></varlistentry></variablelist><section xml:id="option_p_variables"><title>Variables substituted in the parameter values passed to the <literal>-p</literal> and <literal>-pu</literal> options</title><para>The following variables are substituted in the parameter values passed to the <link linkend="option_p">-p</link> and <link linkend="option_pu">-pu</link> options. 
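As an illustration, the <literal>DeleteFilesStep</literal> options shown earlier in this chapter rely on one of these variables: <literal>%{~pnO}</literal> applies the <literal>p</literal> and <literal>n</literal> modifiers described below to <literal>%{O}</literal>, so the file to be deleted is named after the output file, without its extension, followed by <literal>.dita</literal>:<programlisting>-step:com.xmlmind.w2x.processor.DeleteFilesStep:cleanUp
-p cleanUp.files %{~pnO}.dita</programlisting>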
</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="18*"/><colspec colname="c2" colwidth="36*"/><colspec colname="c3" colwidth="46*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Variable</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Example</emphasis></para></entry></row></thead><tbody valign="top"><row><entry align="center"><para><literal>%{I}</literal></para></entry><entry><para>Full path of the input DOCX file.</para></entry><entry><para><literal>C:\My Docs\report.docx</literal></para></entry></row><row><entry align="center"><para><literal>%{O}</literal></para></entry><entry><para>Full path of the output XML file.</para></entry><entry><para><literal>C:\My Docs\out\report.xml</literal></para></entry></row><row><entry align="center"><para><literal>%{i}</literal></para></entry><entry><para>Absolute URL of the input DOCX file.</para></entry><entry><para><literal>file:/C:/My%20Docs/report.docx</literal></para></entry></row><row><entry align="center"><para><literal>%{o}</literal></para></entry><entry><para>Absolute URL of the output XML file.</para></entry><entry><para><literal>file:/C:/My%20Docs/out/report.xml</literal></para></entry></row></tbody></tgroup></informaltable><para>Variables <literal>%{I}</literal>, <literal>%{O}</literal>, <literal>%{i}</literal> and <literal>%{o}</literal> may all contain one or more of the following modifiers. 
The first modifier must be preceded by the character “<literal>~</literal>”.</para><informaltable><tgroup cols="2"><colspec colname="c1" colwidth="23*"/><colspec colname="c2" colwidth="77*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Modifier</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry align="center"><para><literal>n</literal></para></entry><entry><para>The name of the file or URL without any extension.</para></entry></row><row><entry align="center"><para><literal>x</literal></para></entry><entry><para>The extension of the file or URL. Starts with “<literal>.</literal>”.</para></entry></row><row><entry align="center"><para><literal>p</literal></para></entry><entry><para>The full path of the parent directory of the file or URL.</para></entry></row></tbody></tgroup></informaltable><para>Note that combinations of modifiers other than “<literal>~nx</literal>”, “<literal>~pn</literal>” and “<literal>~pnx</literal>” do not make sense and that, for example, <literal>%{~pnxI}</literal> is equivalent to <literal>%{I}</literal>.</para><para>Examples: let’s suppose that command-line argument <literal><replaceable>in_docx_file</replaceable></literal> (see <link linkend="w2x_command">above</link>) is “<literal>C:\My Docs\report.docx</literal>” and that argument <literal><replaceable>out_file</replaceable></literal> is “<literal>C:\My Docs\out\report.xml</literal>”.</para><itemizedlist><listitem><para><literal>%{~nI}</literal> is replaced by “<literal>report</literal>”.</para></listitem><listitem><para><literal>%{~xI}</literal> is replaced by “<literal>.docx</literal>”.</para></listitem><listitem><para><literal>%{~pI}</literal> is replaced by “<literal>C:\My Docs</literal>”.</para></listitem><listitem><para><literal>%{~nxo}</literal> is replaced by 
“<literal>report.xml</literal>”.</para></listitem><listitem><para><literal>%{~pno}</literal> is replaced by “<literal>file:/C:/My%20Docs/out/report</literal>”.</para></listitem></itemizedlist><para>Other variables substituted in the parameter values passed to the <literal>-p</literal> and <literal>-pu</literal> options:</para><itemizedlist><listitem><para>The value of another parameter passed to w2x by means of the <literal>-p</literal> or <literal>-pu</literal> options. Example: when “<literal>w2x -o map -p transform.topic-type concept ...</literal>” is executed, <literal>%{transform.topic-type}</literal> is substituted with "<literal>concept</literal>".</para></listitem><listitem><para>Any Java™ system property. Example: <literal>%{file.separator}</literal> is substituted with "<literal>\</literal>" on Windows and with "<literal>/</literal>" on the other platforms.</para></listitem></itemizedlist><para>When a variable is not defined, its value is "", the empty string. Example: <literal>%{foo}</literal> is substituted with "".</para></section><section xml:id="default_steps"><title>Default conversion steps</title><para>If none of the options creating a step (<literal>-l</literal>, <literal>-c</literal>, <literal>-e</literal>, <literal>-e2</literal>, <literal>-t</literal>, <literal>-t2</literal>, <literal>-s</literal>, <literal>-step</literal>) has been specified, <literal>w2x</literal> automatically adds the equivalent of <literal>-o xhtml_css</literal>, which consists of the following conversion steps:</para><itemizedlist><listitem><para><literal>-c</literal><indexterm><primary>-c, option</primary></indexterm></para></listitem><listitem><para><literal>-e</literal><indexterm><primary>-e, option</primary></indexterm></para></listitem><listitem><para><literal>-p edit.xed-url-or-file w2x:xed/main-styled.xed</literal><indexterm><primary>-p, option</primary></indexterm></para></listitem><listitem><para><literal>-s</literal><indexterm><primary>-s, 
option</primary></indexterm></para></listitem></itemizedlist><para>The above options convert the input DOCX file to clean, styled, valid XHTML. The resulting output file is not indented.</para><note><para>Something like “<literal>w2x:xed/main-styled.xed</literal>” is an absolute URL supported by w2x. “<literal>w2x:</literal>” is a URL prefix (defined in the automatic XML catalog used by w2x) which specifies the location of the parent directory of both the <literal>xed/</literal> and <literal>xslt/</literal> subdirectories.</para></note></section><section xml:id="automatic_params"><title>Automatic conversion step parameters</title><para>If the first conversion step is a <link linkend="convert_step">Convert step</link>, the following parameters are automatically added by <literal>w2x</literal> (unless, of course, they have already been specified by the user):</para><itemizedlist><listitem><para>If the extension of <literal><replaceable>out_file</replaceable></literal> starts with “<literal>htm</literal>” or “<literal>shtm</literal>”,</para><para><literal>-p <replaceable>step_name</replaceable>.charset UTF-8</literal><indexterm><primary>charset, parameter</primary></indexterm></para><para>The <link linkend="param_charset">charset parameter</link> makes Web browsers treat the generated document as HTML rather than XHTML.</para></listitem><listitem><para><literal>-pu <replaceable>step_name</replaceable>.xhtml-file <replaceable>out_file_with_an_xhtml_extension</replaceable></literal><indexterm><primary>out-file, parameter</primary></indexterm></para></listitem></itemizedlist><para>If the last conversion step is a <link linkend="save_step">Save step</link>, <link linkend="transform_step">Transform step</link>, <link linkend="split_step">Split step</link>, <link linkend="webhelp_step">Web Help step</link> or <link linkend="epub_step">EPUB step</link>, the following parameters are automatically added by <literal>w2x</literal> (unless, of course, they have already 
been specified by the user):</para><itemizedlist><listitem><para><literal>-pu <replaceable>step_name</replaceable>.out-file <replaceable>out_file</replaceable></literal><indexterm><primary>out-file, parameter</primary></indexterm></para></listitem></itemizedlist></section></chapter><chapter xml:id="step_reference"><title>Conversion step reference</title><section xml:id="convert_step"><title>Convert step</title><para>Convert input DOCX file to a styled, valid, XHTML 1.0 Transitional document. The result of this step is this XHTML document. <indexterm><primary>Convert, step</primary></indexterm></para><note><para>For clarity, the “<literal>convert.</literal>” parameter name prefix is omitted here.</para><para>However when you’ll pass any of the following parameters to <literal>w2x</literal>, please do not forget this prefix. Example: <literal>-p convert.resource-directory images</literal>.</para></note><para xml:id="convert_step_params">Parameters:</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="24*"/><colspec colname="c2" colwidth="28*"/><colspec colname="c3" colwidth="49*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>automatic-ids</literal><indexterm><primary>automatic-ids, parameter</primary></indexterm></para></entry><entry><para>A regular expression pattern.</para><para>Default: "<literal>(^_?[a-zA-Z]{1,3}\\d+$)| (^(OLE_LINK|_ENREF_))| (^_GoBack$)</literal>".</para></entry><entry><para>Specifies the names of the bookmarks which are automatically generated by MS-Word. 
This parameter is used to favor user-specified bookmarks, which are expected to have long and descriptive names, over those automatically generated by MS-Word ("<literal>_GoBack</literal>", "<literal>_Toc123</literal>", "<literal>BM3</literal>", etc.).</para><para>If the specified regular expression pattern starts with "<literal>|</literal>", it is appended to the default one.</para><para>If the specified regular expression pattern ends with "<literal>|</literal>", it is prepended to the default one.</para></entry></row><row><entry><para xml:id="param_charset"><literal>charset</literal><indexterm><primary>charset, parameter</primary></indexterm></para></entry><entry><para>A valid character encoding (e.g. <literal>UTF-8</literal>, <literal>Windows-1252</literal>).</para><para>Default: no charset, add an XML declaration.</para></entry><entry><para>When a <literal>charset</literal> is specified, a <literal>meta</literal> element is added to the <literal>head</literal> element of the generated document:</para><itemizedlist><listitem><para><literal>&lt;meta charset="<replaceable>charset</replaceable>"/&gt;</literal> if parameter <literal>version</literal> is “<literal>5.0</literal>”,</para></listitem><listitem><para><literal>&lt;meta content="text/html; charset=<replaceable>charset</replaceable>" http-equiv="Content-Type" /&gt;</literal> otherwise.</para></listitem></itemizedlist><para>If the specified <literal>charset</literal> is “<literal>UTF-8</literal>”, then the XML declaration (<literal>&lt;?xml version="1.0" encoding="UTF-8"?&gt;</literal>) is <emphasis>not</emphasis> added to the generated document. 
This makes Web browsers treat the generated document as HTML rather than XHTML.</para></entry></row><row><entry><para><literal>converted-image-extensions</literal><indexterm><primary>converted-image-extensions, parameter</primary></indexterm></para></entry><entry><para>A list of image file extensions separated by space characters.</para><para>Default: “<literal>svg png jpeg</literal>”.</para></entry><entry><para>When the input DOCX file contains an image not having any of the file extensions specified in the <literal>converted-image-extensions</literal> list, attempt to convert this image to one of the formats of this list. </para><para>Each format is considered in turn, which is why w2x will attempt to convert a WMF image to SVG first, before considering PNG and JPEG.</para></entry></row><row><entry><para><literal>create-mathml-object</literal><indexterm><primary>create-mathml-object, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>” | “<literal>auto</literal>”</para><para>Default:  “<literal>auto</literal>”.</para></entry><entry><para>When converting MS-Word math (that is, OpenXML math) to <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/MathML2/">MathML</link><indexterm><primary>MathML</primary></indexterm>:</para><variablelist><varlistentry><term>yes</term><listitem><para>Generate an external file containing the converted MathML element and insert an <literal>object</literal> element pointing to the generated “<literal>.mml</literal>” file. 
Example: <literal>&lt;object data="doc_files/math-010.mml" type="application/mathml+xml"/&gt;</literal>.</para></listitem></varlistentry><varlistentry><term>no</term><listitem><para>Embed the converted MathML element in the XHTML document created by this step.</para></listitem></varlistentry><varlistentry><term>auto</term><listitem><para>Embed the converted MathML element in the XHTML document but only if <link linkend="param_version">parameter version</link> is set to <literal>5.0</literal><footnote xml:id="__FN9__"><para>Because only XHTML 5 documents may embed MathML. With any other version of XHTML, this would cause the document to become invalid.</para></footnote>.</para></listitem></varlistentry></variablelist></entry></row><row><entry><para xml:id="param_default_lang"><literal>default-lang</literal><indexterm><primary>default-lang, parameter</primary></indexterm></para></entry><entry><para>A valid language code (e.g. <literal>en</literal>, <literal>fr-CA</literal>).</para><para>No default.</para></entry><entry><para>If parameter <literal>set-lang</literal> is not specified and if the main language of the document cannot be determined by examining the contents of the input DOCX file, set the <literal>lang</literal> attribute of the <literal>html</literal> element to this value.</para><note><para><emphasis role="bold">About East Asian languages</emphasis><indexterm><primary sortas="East Asia">About East Asian languages</primary></indexterm><indexterm><primary>CJK</primary><see>About East Asian languages</see></indexterm></para><para>Due to a <link linkend="east_asia_lang_limitation">limitation</link>, it is recommended to specify, for example, <literal>-p convert.set-lang ja-JP</literal> or <literal>-p convert.default-lang ja-JP</literal> when converting a document written mainly in Japanese. 
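As a sketch only, in which <literal>manual_ja.docx</literal> and <literal>manual_ja.html</literal> are placeholder file names and the default conversion steps are assumed:<programlisting>w2x -p convert.default-lang ja-JP manual_ja.docx manual_ja.html</programlisting>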
</para><para>When parameter <literal>convert.set-lang</literal> or parameter <literal>convert.default-lang</literal> is set to a language code starting with <literal>ja</literal>, <literal>zh</literal> or <literal>ko</literal>, then it is attribute <literal>w:lang/@w:eastAsia</literal> which is used to determine the language of a text span and not attribute <literal>w:lang/@w:val</literal>.</para><para>Note that <literal>-p convert.default-lang ja-JP</literal> is just used as a <emphasis>hint</emphasis> to favor attribute <literal>w:lang/@w:eastAsia</literal> over attribute <literal>w:lang/@w:val</literal>. Given the way MS-Word sets these two attributes, using parameter <literal>-p convert.default-lang ja-JP</literal> will <emphasis>not</emphasis> cause a vastly incorrect detection of the language when converting a German DOCX file for example.</para></note></entry></row><row><entry><para><literal>lower-case-resource-names</literal><indexterm><primary>lower-case-resource-names, parameter</primary></indexterm></para></entry><entry><para>A boolean: <literal>true</literal> (same as:  <literal>yes</literal> | <literal>on</literal> | <literal>1</literal>) | <literal>false</literal> (same as:  <literal>no</literal> | <literal>off</literal> | <literal>0</literal>).</para><para>Default: <literal>false</literal>.</para></entry><entry><para>Not for general use. Specifying this parameter as <literal>true</literal> is needed to keep <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/idpf/epubcheck">epubcheck</link><indexterm><primary>EPUB, output format</primary></indexterm> quiet on platforms where filenames are case-sensitive (e.g. 
Linux).</para></entry></row><row><entry><para><literal>resource-directory</literal><indexterm><primary>resource-directory, parameter</primary></indexterm></para></entry><entry><para>A file path.</para><para>Default: if parameter <literal>xhtml-file</literal> is specified, the basename of <literal>xhtml-file</literal>, without an extension, but followed by “<literal>_files</literal>”; otherwise the absolute path of an automatically created temporary directory.</para></entry><entry><para>Specifies the file path of the directory which is to contain copies of the images referenced in the input DOCX file.</para><para>A relative file path is relative to the value of parameter  <literal>xhtml-file</literal>.</para><para>Note that, if it already exists, a resource directory specified this way is <emphasis>not</emphasis> automatically made empty by w2x before being used to store resources. Only the “automatic”, default, <literal>output_file_basename_files/</literal> folder is automatically made empty by w2x (if this “automatic”  folder already exists).</para></entry></row><row><entry><para><literal>resource-prefix</literal><indexterm><primary>resource-prefix, parameter</primary></indexterm></para></entry><entry><para>A non-empty string not containing the file separator character (“<literal>/</literal>” or “<literal>\</literal>”).</para><para>Default: none, no prefix.</para></entry><entry><para>Specifies a prefix to be prepended to the names of resource files created by w2x.</para><para>This prefix is useful when used in conjunction with  parameter <literal>resource-directory</literal> and when several files generated by w2x share the same resource directory.</para></entry></row><row><entry><para><literal>set-column-number</literal><indexterm><primary>set-column-number, parameter</primary></indexterm></para></entry><entry><para>A boolean: <literal>true</literal> (same as:  <literal>yes</literal> | <literal>on</literal> | <literal>1</literal>) | <literal>false</literal> (same as:  
<literal>no</literal> | <literal>off</literal> | <literal>0</literal>).</para><para>Default: <literal>false</literal>.</para></entry><entry><para>If specified as <literal>true</literal>, insert in each table cell a <literal>column-number</literal> processing-instruction containing the column number of this cell. First column is column #1.</para><para>Example:</para><programlisting>&lt;?column-number 1?&gt;
</programlisting><para>This processing-instruction greatly helps in generating CALS tables (DocBook, DITA) containing cells spanning several columns.</para></entry></row><row><entry><para xml:id="param_set_lang"><literal>set-lang</literal><indexterm><primary>set-lang, parameter</primary></indexterm></para></entry><entry><para>A valid language code (e.g. <literal>en</literal>, <literal>fr-CA</literal>).</para><para>No default: set the <literal>lang</literal> attribute of the <literal>html</literal> element after examining the contents of the input DOCX file.</para></entry><entry><para>If specified, set the <literal>lang</literal> attribute of the <literal>html</literal> element to this value.</para><note><para><emphasis role="bold">About East Asian languages</emphasis><indexterm><primary sortas="East Asia">About East Asian languages</primary></indexterm></para><para>Due to a <link linkend="east_asia_lang_limitation">limitation</link>, it is recommended to specify, for example, <literal>-p convert.set-lang ja-JP</literal> or <literal>-p convert.default-lang ja-JP</literal> when converting a document written mainly in Japanese.</para><para>When parameter <literal>convert.set-lang</literal> or parameter <literal>convert.default-lang</literal> is set to a language code starting with <literal>ja</literal>, <literal>zh</literal> or <literal>ko</literal>, then it is attribute <literal>w:lang/@w:eastAsia</literal> which is used to determine the language of a text span and not attribute <literal>w:lang/@w:val</literal>.</para></note></entry></row><row><entry><para xml:id="param_version"><literal>version</literal><indexterm><primary>version, parameter</primary></indexterm></para></entry><entry><para><literal>1.0_transitional</literal> (same as:  <literal>1.0_loose</literal> | <literal>1</literal>) | <literal>1.0_strict</literal> | <literal>1.1</literal> | <literal>5.0</literal> (same as: <literal>5</literal>) | “”.</para><para>Default:  
<literal>1.0_transitional</literal>.</para></entry><entry><para>Specifies which XHTML  version to generate, hence which <literal>&lt;!DOCTYPE&gt;</literal> to add to generated XHTML document.</para><para>Note that XHTML 5.0 has no DTD, hence no <literal>&lt;!DOCTYPE&gt;</literal> for this version.</para><para>The empty string “” means:  generate XHTML 1.0 Transitional , but  do not add a <literal>&lt;!DOCTYPE&gt;</literal>.</para></entry></row><row><entry><para xml:id="param_xhtml_file"><literal>xhtml-file</literal><indexterm><primary>xhtml-file, parameter</primary></indexterm></para></entry><entry><para>A file path.</para><para>No default .</para></entry><entry><para>If the generated XHTML document was saved to disk, this would be the path of its save file. </para><para>When specified (which is strongly recommended),  this file path  is used to give a base URL to the generated XHTML document.</para></entry></row></tbody></tgroup></informaltable></section><section xml:id="delete_files_step"><title>Delete files step</title><para>Delete files or directories having specified path or matching specified <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://en.wikipedia.org/wiki/Glob_%28programming%29">glob pattern</link>.  The input of this step is ignored.  The result of this step is thus equal to its input. <indexterm><primary>Delete files, step</primary></indexterm></para><para>This step is used for example when generating a DITA map or bookmap. 
It is used to delete the intermediate topic file created by the first Transform step.</para><para>Parameters (for clarity, the “<literal>cleanUp.</literal>” parameter name prefix is omitted here):</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>files</literal><indexterm><primary>files, parameter</primary></indexterm></para></entry><entry><para>A file path or glob pattern.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies which files or directories are to be deleted. A relative file path or glob pattern is relative to the current working directory.</para></entry></row></tbody></tgroup></informaltable></section><section xml:id="edit_step"><title>Edit step</title><para>Edit in place input XHTML document using a <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/index.html">XED script</link>. The result of this step is the same XHTML document, but modified by the script. <indexterm><primary>Edit, step</primary></indexterm></para><note><para>For clarity, the “<literal>edit.</literal>” parameter name prefix is omitted here.</para><para>However when you’ll pass any of the following parameters to <literal>w2x</literal>, please do not forget this prefix. 
Example: <literal>-p edit.ids.generate-section-ids yes</literal>.</para></note><para>Parameters:</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>xed-url-or-file</literal><indexterm><primary>xed-url-or-file, parameter</primary></indexterm></para></entry><entry><para>An absolute URL or the path of an existing file.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies which XED script should be used to edit the input XHTML document. A relative file path is relative to the current working directory.</para></entry></row></tbody></tgroup></informaltable><para> Any other parameter is passed to the XED script as a XED global variable.</para><para>XMLmind Word to XML (<abbrev>w2x</abbrev> for short) comes with two stock “main” XED scripts:</para><variablelist><varlistentry><term><literal>w2x:xed/main-styled.xed</literal></term><listitem><para>Invokes XED scripts used to “polish up” the styled XHTML 1.0 Transitional document created by the Convert step (e.g. process consecutive paragraphs having identical borders).</para></listitem></varlistentry><varlistentry><term><literal>w2x:xed/main.xed</literal></term><listitem><para>Invokes XED scripts used to prepare the generation of semantic XML of all kinds:  XHTML, DocBook, DITA.  These scripts leverage the CSS styles and classes found in the styled XHTML 1.0 Transitional document created by the Convert step.  They translate these CSS styles and classes (e.g. numbered paragraph) into semantic tags (e.g. 
<literal>ol</literal>/<literal>li</literal>).</para></listitem></varlistentry></variablelist><note><para>Something like “<literal>w2x:xed/main.xed</literal>” is an absolute URL supported by w2x. “<literal>w2x:</literal>” is an URL prefix (defined in the automatic XML catalog used by w2x) which specifies the location of the parent directory of both the <literal>xed/</literal> and <literal>xslt/</literal> subdirectories.</para></note><table><title>Parameters common to <literal>w2x:xed/main-styled.xed</literal> and <literal>w2x:xed/main.xed</literal></title><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para xml:id="finish_styles_css_uri"><literal>finish-styles.css-uri</literal><indexterm><primary>finish-styles.css-uri, parameter</primary></indexterm></para></entry><entry><para>An absolute or relative “<literal>file:</literal>”  URI.</para><para>Default: “”. “Interned” CSS styles, if any, are stored in a <literal>head</literal>/<literal>style</literal> element.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/finish-styles.xed</literal>.</para><para>Store “interned”  CSS styles, if any, in the CSS (UTF-8 encoded) file having this URI. 
A relative URI is relative to the URI specified by <link linkend="param_xhtml_file">parameter xhtml-file</link>.</para><para>More information about “interned”  CSS styles in <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/parse-styles.html">command parse-styles</link> (command invoked by <literal>w2x:xed/init-styles.xed</literal>) and inverse <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/unparse-styles.html">command unparse-styles</link> (command invoked by <literal>w2x:xed/finish-styles.xed</literal>).</para></entry></row><row><entry><para xml:id="finish_styles_custom_styles_url_or_file"><literal>finish-styles.</literal> <literal>custom-styles-url-or-file</literal></para></entry><entry><para>An absolute URL or a filename. A relative filename is relative to the current working directory.</para><para>Default: “” (no custom styles).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/finish-styles.xed</literal>.</para><para>Specifies the location of a CSS file. The custom CSS styles found in the specified file are simply appended to the automatically generated CSS styles.</para><para>Using this variable is the easiest way to customize the automatically generated CSS styles.</para><note><para><emphasis role="bold">When generating multi-page styled or semantic XHTML of any kind </emphasis><emphasis role="bold">(frameset, Web Help, EPUB)</emphasis></para><para>Please use <literal>finish-styles.custom-styles-url-or-file</literal> to specify custom CSS styles. 
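A minimal sketch, assuming that the Edit step keeps its usual name <literal>edit</literal> in this conversion and using placeholder file names (<literal>custom.css</literal>, <literal>doc.docx</literal>, <literal>out/doc.html</literal>):<programlisting>w2x -o frameset -p edit.finish-styles.custom-styles-url-or-file custom.css doc.docx out/doc.html</programlisting>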
</para><para>No need to specify <literal>finish-styles.css-uri</literal> as all the CSS styles are anyway stored into an external “<literal>.css</literal>” file having the same basename as the main output file.</para></note></entry></row><row><entry><para><literal>finish-styles.mathjax</literal><indexterm><primary>finish-styles.mathjax, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>” | “<literal>auto</literal>”</para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/finish-styles.xed</literal>.</para><para>Very few web browsers (Firefox) can natively render MathML<indexterm><primary>MathML</primary></indexterm>. Fortunately, there is <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.mathjax.org/">MathJax</link><indexterm><primary>MathML</primary><secondary>MathJax</secondary></indexterm>.</para><para>MathJax is a JavaScript display engine for mathematics that works in all browsers.</para><variablelist><varlistentry><term>yes</term><listitem><para>Add a <literal>&lt;script&gt;</literal> element loading MathJax to the <literal>&lt;html&gt;</literal>/<literal>&lt;head&gt;</literal> element of the generated XHTML file.</para></listitem></varlistentry><varlistentry><term>auto</term><listitem><para>Same as “<literal>yes</literal>”, but add <literal>&lt;script&gt;</literal> only when the generated XHTML file contains MathML.</para></listitem></varlistentry></variablelist></entry></row><row><entry><para><literal>finish-styles.mathjax-url</literal><indexterm><primary>finish-styles.mathjax-url, parameter</primary></indexterm></para></entry><entry><para>String.</para><para>Default value: the URL pointing to the MathJax CDN, as recommended in the MathJax documentation.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/finish-styles.xed</literal>.</para><para>The URL allowing to load the <link 
xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.mathjax.org/">MathJax</link> engine<indexterm><primary>MathML</primary><secondary>MathJax</secondary></indexterm> configured for rendering MathML.</para><para>Ignored unless parameter <literal>mathjax</literal> is set to “<literal>yes</literal>” or “<literal>auto</literal>”.</para></entry></row><row><entry><para xml:id="param_keep_title"><literal>title.keep-title</literal><indexterm><primary>title.keep-title, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”</para><para>Default: “<literal>yes</literal>” when generating styled or semantic XHTML of all kinds (single-page, EPUB, etc), “<literal>no</literal>” when generating any other format.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/title.xed</literal>.</para><para>Default value “<literal>no</literal>” specifies that paragraphs having “<literal>p-Title</literal>” and “<literal>p-Subtitle</literal>” styles (to make it simple; see also parameters <link linkend="param_title_style_names">title.title-style-names</link> and <link linkend="param_subtitle_style_names">title.subtitle-style-names</link>) are to be converted only to <literal>head</literal>/<literal>title</literal> and to <literal>head</literal>/<literal>meta name="description"</literal>. 
</para><para>This simple behavior makes these titles invisible to the user, though usable by programs such as the XSLT stylesheets generating DITA or DocBook.</para><para>Value “<literal>yes</literal>” may be used to specify that paragraphs having “<literal>p-Title</literal>” and “<literal>p-Subtitle</literal>” styles are <emphasis>additionally</emphasis> converted to equivalent, visible, XHTML elements.</para><para>These equivalent, visible, XHTML elements are specified by parameters <link linkend="param_title_container">title.title-container</link> and <link linkend="param_subtitle_container">title.subtitle-container</link>.</para></entry></row><row><entry><para xml:id="param_title_container"><literal>title.title-container</literal><indexterm><primary>title.title-container, parameter</primary></indexterm></para></entry><entry><para>An XHTML element name possibly followed by one or more attributes.</para><para>Default: “” when generating styled XHTML; otherwise “<literal>h1 class='role-document-title'</literal>” .</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/title.xed</literal>.</para><para>Specifies the XHTML element to which a paragraph having a “<literal>p-Title</literal>” style is to be converted. An empty string value is equivalent to “<literal>p</literal>”.</para><para>Ignored when <link linkend="param_keep_title">parameter title.keep-title</link> is “<literal>no</literal>”.</para></entry></row><row><entry><para xml:id="param_title_style_names"><literal>title.title-style-names</literal><indexterm><primary>title.title-style-names, parameter</primary></indexterm></para></entry><entry><para>List of user-defined style names separated by space characters.</para><para>Default: “” (empty list).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/title.xed</literal>.</para><para>Specifies which user-defined paragraph styles should be considered to be equivalent to standard style “<literal>p-Title</literal>”. 
</para><para>(Paragraph styles, whether user-defined or standard, are given a “<literal>p-</literal>” prefix by the Convert step.)</para></entry></row><row><entry><para xml:id="param_subtitle_container"><literal>title.subtitle-container</literal><indexterm><primary>title.subtitle-container, parameter</primary></indexterm></para></entry><entry><para>An XHTML element name possibly followed by one or more attributes.</para><para>Default: “” when generating styled XHTML; otherwise “<literal>p class='role-document-subtitle'</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/title.xed</literal>.</para><para>Specifies the XHTML element to which a paragraph having a “<literal>p-Subtitle</literal>” style is to be converted. An empty string value is equivalent to “<literal>p</literal>”.</para><para>Ignored when <link linkend="param_keep_title">parameter title.keep-title</link> is “<literal>no</literal>”.</para></entry></row><row><entry><para xml:id="param_subtitle_style_names"><literal>title.subtitle-style-names</literal><indexterm><primary>title.subtitle-style-names, parameter</primary></indexterm></para></entry><entry><para>List of user-defined style names separated by space characters.</para><para>Default: “” (empty list).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/title.xed</literal>.</para><para>Specifies which user-defined paragraph styles should be considered to be equivalent to standard style “<literal>p-Subtitle</literal>”. 
</para><para>(Paragraph styles, whether user-defined or standard, are given a “<literal>p-</literal>” prefix by the Convert step.)</para></entry></row></tbody></tgroup></table><table><title>Parameters which are specific to <literal>w2x:xed/main-styled.xed</literal></title><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>remove-pis.except</literal><indexterm><primary>remove-pis.except, parameter</primary></indexterm></para></entry><entry><para>One or more processing-instruction targets separated by space characters.</para><para>Default: “” (remove all processing-instructions).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/remove-pis.xed</literal>.</para><para>Specifies which processing-instructions should be kept in the styled HTML document.</para><para>By default, all processing-instructions are removed from the styled HTML document. 
Such processing-instructions are useful only when the styled HTML document created by the Convert step is used as an intermediate format in order to generate semantic XML.</para></entry></row></tbody></tgroup></table><table><title>Parameters which are specific to <literal>w2x:xed/main.xed</literal></title><tgroup cols="3"><colspec colname="c1" colwidth="24*"/><colspec colname="c2" colwidth="28*"/><colspec colname="c3" colwidth="49*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>before-save.allow-flow</literal><indexterm><primary>before-save.allow-flow, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/before-save.xed</literal>.</para><para>If “<literal>yes</literal>”, allow flow elements (e.g. <literal>li</literal>) to directly contain text and inline elements.</para><para>If “<literal>no</literal>”,  do not allow flow elements (e.g. <literal>li</literal>) to directly contain text and inline elements. Instead, “wrap” this text and these inline elements in <literal>&lt;p class="role-inline-wrapper"&gt;</literal> elements. </para><para>The “<literal>no</literal>” option greatly eases the generation of certain types of semantic XML (e.g. 
DocBook) during the Transform step.</para></entry></row><row><entry><para><literal>biblio.style-names</literal><indexterm><primary>biblio.style-names, parameter</primary></indexterm></para></entry><entry><para>List of user-defined style names separated by space characters.</para><para>Default: “” (empty list).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/biblio.xed</literal>.</para><para>Specifies which user-defined paragraph styles should be considered to be equivalent to standard style “<literal>p-Bibliography</literal>”. </para><para>(Paragraph styles, whether user-defined or standard, are given a “<literal>p-</literal>” prefix by the Convert step.)</para></entry></row><row><entry><para xml:id="param_blocks_convert"><literal>blocks.convert</literal><indexterm><primary>blocks.convert, parameter</primary></indexterm></para></entry><entry><para>A conversion specification. </para><para>Default: “”. No conversions other than those performed by <literal>w2x:xed/blocks.xed</literal>.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/blocks.xed</literal>.</para><para>Specified  paragraph styles are converted to specified XHTML elements. See <xref linkend="simple_convert_spec"/>.</para></entry></row><row><entry><para><literal>blocks.convert-to-pre</literal><indexterm><primary>blocks.convert-to-pre, parameter</primary></indexterm></para></entry><entry><para>A conversion specification. </para><para>Default: “”. </para></entry><entry><para>Global variable defined  in <literal>w2x:xed/blocks.xed</literal>.</para><para>Specified  paragraph styles are converted to specified XHTML elements. See <xref linkend="simple_convert_spec"/>.</para><para>When using MS-Word, there are two ways to represent code samples:</para><orderedlist><listitem><para>Use a sequence of paragraphs having the same style. Each paragraph contains one line of the code sample. 
Let’s call the style of these paragraphs <literal>Code1</literal>.</para></listitem><listitem><para>Use a single paragraph containing the whole code sample, which means that this single paragraph contains significant whitespace and line breaks. Let’s call the style of this paragraph <literal>Code2</literal>.</para></listitem></orderedlist><para>A sequence of <literal>Code1</literal> paragraphs may be converted to an XHTML <literal>pre</literal> using:</para><programlisting>-p edit.blocks.convert "p-Code1 span g:id='pre' g:container='pre'"
</programlisting><para>A <literal>Code2</literal> paragraph may be converted to an XHTML <literal>pre</literal> using:</para><programlisting>-p edit.blocks.convert-to-pre "p-Code2 pre"
</programlisting></entry></row><row><entry><para><literal>captions.style-names</literal><indexterm><primary>captions.style-names, parameter</primary></indexterm></para></entry><entry><para>List of user-defined style names separated by space characters.</para><para>Default: “” (empty list).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/captions.xed</literal>.</para><para>Specifies which user-defined paragraph styles should be considered to be equivalent to standard style “<literal>p-Caption</literal>”. </para><para>(Paragraph styles, whether user-defined or standard, are given a “<literal>p-</literal>” prefix by the Convert step.)</para></entry></row><row><entry><para><literal>convert-tabs.to-table</literal><indexterm><primary>convert-tabs.to-table, parameter</primary></indexterm><indexterm><primary>tab stops</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/convert-tabs.xed</literal>.</para><para>If set to “<literal>yes</literal>”, convert consecutive paragraphs containing text runs aligned on tab stops to a borderless table.</para><para>This option is turned off by default because, in the general case, it's not possible to emulate tab stops using tables.</para></entry></row><row><entry><para><literal>convert-tabs.unwrap-paragraphs</literal><indexterm><primary>convert-tabs.unwrap-paragraphs, parameter</primary></indexterm><indexterm><primary>tab stops</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/convert-tabs.xed</literal>.</para><para>If set to “<literal>yes</literal>”, the cells contained in the borderless table used to emulate tab stops directly contain text runs rather than paragraphs. 
</para></entry></row><row><entry><para><literal>headings.convert</literal><indexterm><primary>headings.convert, parameter</primary></indexterm></para></entry><entry><para>A conversion specification. </para><para>Default: “”. No conversions other than those performed by <literal>w2x:xed/headings.xed</literal>.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/headings.xed</literal>.</para><para>Specified  paragraph styles are converted to specified XHTML heading elements (<literal>h1</literal>, <literal>h2</literal>, …, <literal>h6</literal>). See <xref linkend="simple_convert_spec"/>.</para><para>Note that by default,  script <literal>headings.xed</literal>  automatically converts paragraphs having an outline level to <literal>h1</literal>, <literal>h2</literal>, …, <literal>h6</literal> headings.</para></entry></row><row><entry><para><literal>ids.generate-section-ids</literal><indexterm><primary>ids.generate-section-ids, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/ids.xed</literal>.</para><para>Ensure that all the sections found in the semantic XHTML resulting from the conversion of a DOCX file have a unique ID.</para><para>When this ID is missing, it is computed using the content of the <literal>h1</literal>, <literal>h2</literal>, ..., <literal>h6</literal> heading which is the first child of the section. Example: </para><programlisting>&lt;div class="role-section2"
     <emphasis role="bold">id="Title_of_this_section"</emphasis>&gt;
  &lt;h2&gt;<emphasis role="bold">Title of this section</emphasis>&lt;/h2&gt;
...
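</programlisting><para>As a sketch, and assuming the same “<literal>edit.</literal>” prefix as in the other command-line examples of this section, this behavior could be turned on using:</para><programlisting>-p edit.ids.generate-section-ids yes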
</programlisting><para>Setting <literal>ids.generate-section-ids</literal> to <literal>yes</literal> is especially useful when converting a DOCX file to a DITA map or bookmap. With this parameter, the filenames of the topics referenced by the generated map are guaranteed to have meaningful values (e.g. "<literal>Introduction.dita</literal>" rather than "<literal>d0e35.dita</literal>").</para></entry></row><row><entry><para><literal>ids.section-id-max-length</literal><indexterm><primary>ids.section-id-max-length, parameter</primary></indexterm></para></entry><entry><para>An integer greater than or equal to 1.</para><para>Default: 32.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/ids.xed</literal>.</para><para>Specifies the maximum length of the automatically computed ID when parameter <literal>ids.generate-section-ids</literal> is set to <literal>yes</literal>.</para></entry></row><row><entry><para><literal>index.index-term-separator</literal><indexterm><primary>index.index-term-separator, parameter</primary></indexterm></para></entry><entry><para>A string.</para><para>Default: "<literal>, </literal>".</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/index.xed</literal>.</para><para>Specifies the string used to join index terms when a redirection to another index entry is to be generated (example: “See Cat, Siamese, Seal point”).</para></entry></row><row><entry><para><literal>inlines.b-element</literal><indexterm><primary>inlines.b-element, parameter</primary></indexterm><literal>, inlines.big-element</literal><indexterm><primary>inlines.big-element, parameter</primary></indexterm><literal>,</literal> <literal>inlines.i-element</literal><indexterm><primary>inlines.i-element, parameter</primary></indexterm><literal>,</literal> <literal>inlines.s-element</literal><indexterm><primary>inlines.s-element, parameter</primary></indexterm><literal>,</literal> 
<literal>inlines.small-element</literal><indexterm><primary>inlines.small-element, parameter</primary></indexterm><literal>,</literal> <literal>inlines.sub-element</literal><indexterm><primary>inlines.sub-element, parameter</primary></indexterm><literal>,</literal> <literal>inlines.sup-element</literal><indexterm><primary>inlines.sup-element, parameter</primary></indexterm><literal>,</literal> <literal>inlines.tt-element, </literal><indexterm><primary>inlines.tt-element, parameter</primary></indexterm><literal> inlines.u-element</literal><indexterm><primary>inlines.u-element, parameter</primary></indexterm></para></entry><entry><para>An element name optionally followed by attributes.</para><para>Defaults: <literal>"b"</literal>, <literal>"big"</literal>, <literal>"i"</literal>, <literal>"s"</literal>, <literal>"small"</literal>, <literal>"sub"</literal>, <literal>"sup"</literal>, <literal>"tt"</literal>, <literal>"u"</literal>. </para></entry><entry><para>Global variables defined  in <literal>w2x:xed/inlines.xed</literal>.</para><para>By default, the <emphasis role="bold">Edit</emphasis> step converts a text <literal>span</literal> having <literal>style="font-weight:bold"</literal> (as generated by the <emphasis role="bold">Convert</emphasis> step) to XHTML element <literal>b</literal>. Specifying parameter <literal>-p edit.inlines.b-element "strong"</literal> replaces the default <literal>b</literal> element with a <literal>strong</literal> element.</para><para>Similarly, alternate element names may be specified using the following parameters: <literal>inlines.sub-element</literal>, <literal>inlines.sup-element</literal>, <literal>inlines.small-element</literal>, <literal>inlines.big-element</literal>, <literal>inlines.s-element</literal>, <literal>inlines.u-element</literal>, <literal>inlines.tt-element</literal>,  <literal>inlines.i-element</literal>. 
</para><para>Example 1: generate <literal>code</literal> rather than <literal>tt</literal> elements: <literal>-p edit.inlines.tt-element "code"</literal>. </para><para>Example 2: do not generate <literal>small</literal> elements: <literal>-p edit.inlines.small-element "span style='font-size:x-small'"</literal> (notice how one or more attributes may be specified too).</para><para>This facility is useful only when generating semantic XHTML and all formats based on semantic XHTML. Using it when generating DITA or DocBook may give poor results.</para></entry></row><row><entry><para xml:id="param_inlines_convert"><literal>inlines.convert</literal><indexterm><primary>inlines.convert, parameter</primary></indexterm></para></entry><entry><para>A conversion specification. </para><para>Default: “”. No conversions other than those performed by <literal>w2x:xed/inlines.xed</literal>.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/inlines.xed</literal>.</para><para>Specified character  styles are converted to specified XHTML elements . 
See <xref linkend="simple_convert_spec"/>.</para></entry></row><row><entry><para><literal>inlines.generate-big-small</literal><indexterm><primary>inlines.generate-big-small, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/inlines.xed</literal>.</para><para>Specifies whether spans having a bigger (respectively smaller) font size than their parent elements  should be converted to <literal>big</literal> (respectively <literal>small</literal>) elements.</para></entry></row><row><entry><para><literal>lists.alternate-ordered-list-grouping</literal><indexterm><primary>lists.alternate-ordered-list-grouping, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/lists.xed</literal>.</para><para>Set this parameter to "<literal>yes</literal>" if, using MS-Word, you often create on purpose consecutive “ordered lists” having the same numbering specification (e.g. the list items all start with "a)", "b)", "c)", etc, but some consecutive list items have different numbering IDs). This will cause <abbrev>w2x</abbrev> to create one semantic XML ordered list per MS-Word list, which is almost certainly what you'll want to get.</para><para>The default value of this parameter is "<literal>no</literal>" because MS-Word users generally create consecutive ordered lists having the same numbering specification by mistake. 
This default value causes <abbrev>w2x</abbrev> to create a single semantic XML ordered list for all consecutive MS-Word lists.</para></entry></row><row><entry><para><literal>metas.keep</literal><indexterm><primary>metas.keep, parameter</primary></indexterm></para></entry><entry><para>Regular expression matching part or all of the name of the XHTML <literal>meta</literal>.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/metas.xed</literal>.</para><para>When generating semantic XML of any kind, all the XHTML <literal>meta</literal> elements but <literal>author</literal>, <literal>description</literal>, <literal>dcterms.*</literal> are automatically removed from the semantic XHTML 1.0 Transitional document generated by the <emphasis role="bold">Edit</emphasis> step and used as an input by the <emphasis role="bold">Transform</emphasis> step.</para><para>If you want to keep some or all of the <literal>meta</literal> elements in this intermediate semantic XHTML 1.0 Transitional document, you may specify <literal>-p edit.metas.keep <replaceable>regexp</replaceable></literal>.</para><para>Examples: <literal>-p edit.metas.keep ".*"</literal> keeps all metas; <literal>-p edit.metas.keep "^dc\."</literal> keeps all metas having a name starting with "<literal>dc.</literal>" (e.g. 
<literal>&lt;meta name="dc.subject" content="..."/&gt;</literal>).</para></entry></row><row><entry><para><literal>prune.preserve</literal><indexterm><primary>prune.preserve, parameter</primary></indexterm></para></entry><entry><para>List of user-defined style names separated by space characters.</para><para>Default: “” (empty list).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/prune.xed</literal>.</para><para>Empty paragraphs having a user-defined style found in this list will not be deleted by <literal>w2x:xed/prune.xed</literal>.</para></entry></row><row><entry><para><literal>remove-styles.preserved-classes</literal><indexterm><primary>remove-styles.preserved-classes, parameter</primary></indexterm></para></entry><entry><para>List of user-defined style names separated by space characters.</para><para>Default: “” (empty list).</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/remove-styles.xed</literal>.</para><para>The CSS classes used to apply the user-defined styles specified in this list will not be removed by <literal>w2x:xed/remove-styles.xed</literal>.</para><para>Note that specifying both parameters <literal>prune.preserve</literal> and <literal>remove-styles.preserved-classes</literal> is currently the only way to keep in the generated semantic XHTML <emphasis>empty paragraphs</emphasis> having a given MS-Word style. For example, specifying <literal>-p edit.prune.preserve p-PlaceHolder</literal> and <literal>-p edit.remove-styles.preserved-classes p-PlaceHolder</literal> may be used to keep in the semantic XHTML output all empty paragraphs having the <literal>p-PlaceHolder</literal> style.</para></entry></row><row><entry><para><literal>sections.max-level</literal><indexterm><primary>sections.max-level, parameter</primary></indexterm></para></entry><entry><para>An integer greater than or equal to 1.</para><para>Default: -1. 
No maximum level.</para></entry><entry><para>Global variable defined  in <literal>w2x:xed/sections.xed</literal>.</para><para>Wraps sequences of elements starting with an <literal>h<replaceable>N</replaceable></literal> element (that is, <literal>h1</literal>, <literal>h2</literal>, <literal>h3</literal>, <literal>h4</literal>, <literal>h5</literal>, <literal>h6</literal>) into <literal>&lt;div class="role-section<replaceable>N</replaceable>"&gt;</literal> elements.</para><para>This parameter specifies the maximum level of nesting for such sections.</para></entry></row></tbody></tgroup></table><para xml:id="simple_convert_spec"><emphasis role="bold">Simple conversion specifications</emphasis></para><para>Parameter <literal>blocks.convert</literal> above (respectively <literal>inlines.convert</literal>) provides the user of w2x with a simple means to convert <literal>p</literal> (respectively <literal>span</literal>) elements having certain paragraph (respectively character) styles to XHTML elements, possibly having attributes.</para><para>The syntax of a simple conversion specification is:</para><programlisting><emphasis>spec</emphasis> → <emphasis>simple_spec</emphasis>  [ S ‘!’ S <emphasis>simple_spec</emphasis> ]*
<emphasis>simple_spec</emphasis> → <emphasis>style_spec </emphasis>S <emphasis>XHTML_element_qname</emphasis> [ S <emphasis>attribute_spec</emphasis> ]*
<emphasis>style_spec</emphasis> →  <emphasis>style_name</emphasis> | <emphasis>style_pattern</emphasis>
<emphasis>style_pattern</emphasis>  → ‘/’ <emphasis>pattern</emphasis> ‘/’ | ‘^’ <emphasis>pattern</emphasis> ‘$’
<emphasis>attribute_spec</emphasis> →  <emphasis>attribute_qname</emphasis> ‘=’  <emphasis>quoted_attribute_value</emphasis>
<emphasis>quoted_attribute_value</emphasis> →  “’” <emphasis>value</emphasis> “’” | ‘”’ <emphasis>value</emphasis> ‘”’
</programlisting><para>Note that when specifying an <varname>XHTML_element_qname</varname>, you must restrict yourself to XHTML 1.0 Transitional elements. Specifying, for example, XHTML 5.0 elements such as <literal>mark</literal>, <literal>aside</literal>, <literal>section</literal>, etc, will not give you the results you expect.</para><para>Examples: stock styled span conversions used by <literal>w2x:xed/inlines.xed</literal>:</para><programlisting>/Emphasis$/ em ! 
c-Strong strong ! 
c-BookTitle cite ! 
/((IntenseReference)|(SubtleReference)|(QuoteChar))$/ em !
/((itleChar)|(Heading\d+Char))$/ strong
</programlisting><para>Custom styled span conversions used to process this manual:</para><programlisting>c-Code code
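</programlisting><para>Such a specification is passed to the Edit step through parameter <literal>inlines.convert</literal>. A minimal sketch, assuming two hypothetical character styles called <literal>Keyboard</literal> and <literal>Variable</literal> (seen as <literal>c-Keyboard</literal> and <literal>c-Variable</literal>, following the naming used in the examples above):</para><programlisting>-p edit.inlines.convert "c-Keyboard kbd ! c-Variable var"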
</programlisting><para>Stock styled paragraph conversions used by <literal>w2x:xed/blocks.xed</literal>:</para><programlisting>/Quote$/ p g:id='blockquote' g:container='blockquote'
</programlisting><para>Custom styled paragraph conversions used to process this manual:</para><programlisting>p-Term dt g:id="dl" g:container="dl" ! 
p-Definition dd g:id="dl" g:container="dl" ! 
p-ProgramListing span g:id="pre" g:container="pre"
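</programlisting><para>On the command line, several paragraph conversions may be chained with “<literal>!</literal>” in a single <literal>blocks.convert</literal> value. A sketch reusing the <literal>p-Term</literal> and <literal>p-Definition</literal> styles above, with single-quoted attribute values so that the whole specification fits inside double quotes (the quoting is an assumption about the shell being used):</para><programlisting>-p edit.blocks.convert "p-Term dt g:id='dl' g:container='dl' ! p-Definition dd g:id='dl' g:container='dl'"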
</programlisting><para><emphasis role="bold">Automatic grouping of  the XHTML elements which are the results of the styled paragraph conversions</emphasis></para><para>In the above examples, attributes having names prefixed with “<literal>g:</literal>” are in the “<literal>urn:x-mlmind:namespace:group</literal>” namespace. These attributes are called <emphasis>grouping attributes</emphasis>.  Examples: <literal>g:id</literal>, <literal>g:container</literal>.</para><para>When parameter  <literal>blocks.convert</literal> is used to create XHTML elements having grouping attributes, <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/group.html">command group()</link> is automatically invoked at the end of all the styled paragraph conversions. To make it simple, this command groups consecutive XHTML elements having the same <literal>g:id</literal> attribute into a common parent element. The parent element is specified by attribute <literal>g:container</literal>.</para><para>In the above examples:</para><itemizedlist><listitem><para>Consecutive <literal>p</literal> elements having  grouping attributes <literal>g:id='blockquote'</literal> and <literal>g:container='blockquote'</literal> are grouped into a common <literal>blockquote</literal> parent element.</para></listitem><listitem><para>Consecutive <literal>dt</literal> and <literal>dd</literal> elements having  grouping attributes <literal>g:id="dl"</literal>  and <literal>g:container="dl"</literal> are grouped into a common <literal>dl</literal> parent element.</para></listitem><listitem><para>Consecutive <literal>span</literal> elements having  grouping attributes <literal>g:id="pre"</literal> and <literal>g:container="pre"</literal> are grouped into a common <literal>pre</literal> parent element.</para></listitem></itemizedlist></section><section xml:id="epub_step"><title>EPUB step</title><para>Splits input XHTML document, whether styled or semantic,  into 
several pages and packages these pages as an <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://idpf.org/epub/201">EPUB 2</link><indexterm><primary>EPUB, output format</primary></indexterm> book. The result of this step is the file containing the EPUB book.</para><note><para><emphasis role="bold">No tab expansion for EPUB 2</emphasis></para><para>By default, when generating styled HTML (that is, XHTML+CSS), some JavaScript™ code (<literal><replaceable>w2x_install_dir</replaceable>/xed/expand-tabs.js</literal>) is added to the output file. This code computes and gives a width to all <literal>&lt;span class="role-tab"&gt; &lt;/span&gt;</literal> elements. This makes it possible to decently emulate tab stops in any modern Web browser. More information in <xref linkend="about_tab_stops"/>.</para><para>However, this cannot work in the case of the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://idpf.org/epub/201">EPUB 2</link> output format<indexterm><primary>EPUB, output format</primary></indexterm> because scripting is disabled in the styled HTML pages comprising an EPUB book.</para></note><para>Same parameters as the <link linkend="split_step">Split step</link>, plus the following EPUB-specific parameters (for clarity, the “<literal>epub.</literal>” parameter name prefix is omitted here):</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>cover-image-url-or-file</literal><indexterm><primary>cover-image-url-or-file, parameter</primary></indexterm></para></entry><entry><para>An absolute URL or a filename. 
A relative filename is relative to the current working directory.</para><para>Default: none (no cover page).</para></entry><entry><para>Specifies an image file which is to be used as the cover page of the EPUB book. This image must be a PNG or JPEG image. Its size must not exceed 1000x1000 pixels.</para></entry></row><row><entry><para><literal>default-lang</literal><indexterm><primary>default-lang, parameter</primary></indexterm></para></entry><entry><para>A language code conforming to <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.ietf.org/rfc/rfc3066.txt">RFC 3066</link>. Examples: <literal>de</literal>, <literal>fr-CA</literal>. Default value: <literal>en</literal>.</para></entry><entry><para>Main language of the EPUB book. This parameter is used only when this language cannot be determined by examining the input styled XHTML document.</para></entry></row><row><entry><para><literal>identifier</literal><indexterm><primary>identifier, parameter</primary></indexterm></para></entry><entry><para>String.</para><para>Default: dynamically generated UUID URN.</para></entry><entry><para>A globally unique identifier for the generated EPUB book (typically the permanent URL of the EPUB book).</para></entry></row><row><entry><para><literal>omit-toc-root</literal><indexterm><primary>omit-toc-root, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” | “<literal>no</literal>” </para><para>Default: “<literal>no</literal>”.</para></entry><entry><para>By default, the TOC generated for an EPUB document has a single “root”. This single root always points to the page containing the title, subtitle, author, etc., of the document. 
Setting this parameter to “<literal>yes</literal>” prevents the generated TOC from having such a single root.</para></entry></row><row><entry><para><literal>out-file</literal><indexterm><primary>out-file, parameter</primary></indexterm></para></entry><entry><para>A file path.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies the path of the EPUB book. A relative file path is relative to the current working directory.</para></entry></row></tbody></tgroup></informaltable></section><section xml:id="load_step"><title>Load step</title><para>Loads an input XML file. The result of this step is the loaded XML document. <indexterm><primary>Load, step</primary></indexterm></para><para>This step is mainly useful to test <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/w2x/_distrib/doc/xedscript/index.html">XED scripts</link>. Example:</para><programlisting>w2x -l -e my_script.xed -s in.xhtml out.xhtml
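# Hedged variation, not part of the original example. It assumes that -p options
# can be combined with the step options above. It sets the Save step parameter
# "indent" so that the result of my_script.xed is easier to inspect while testing.
w2x -p save.indent yes -l -e my_script.xed -s in.xhtml out.xhtml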
</programlisting><para>Note that if the loaded file starts with a <literal>&lt;!DOCTYPE&gt;</literal> pointing to a DTD, then the document loader created by this step will <emphasis>not</emphasis> attempt to load this DTD. The document loader will behave as if the <literal>&lt;!DOCTYPE&gt;</literal> was absent.</para><para>No parameters.</para></section><section xml:id="save_step"><title>Save step</title><para>Saves the input <emphasis>XHTML</emphasis> document to disk. The result of this step is the save file. <indexterm><primary>Save, step</primary></indexterm></para><para>Parameters (for clarity, the “<literal>save.</literal>” parameter name prefix is omitted here):</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>encoding</literal><indexterm><primary>encoding, parameter</primary></indexterm></para></entry><entry><para>A valid character encoding (e.g. 
<literal>UTF-8</literal>, <literal>Windows-1252</literal>).</para><para>Default: “<literal>UTF-8</literal>”.</para></entry><entry><para>Specifies the character encoding of the save file.</para></entry></row><row><entry><para><literal>indent</literal><indexterm><primary>indent, parameter</primary></indexterm></para></entry><entry><para>A boolean: <literal>true</literal> (same as: <literal>yes</literal> | <literal>on</literal> | <literal>1</literal>) | <literal>false</literal> (same as: <literal>no</literal> | <literal>off</literal> | <literal>0</literal>).</para><para>Default: <literal>false</literal>.</para></entry><entry><para>Specifies whether the save file should be indented.</para><note><para><emphasis>Do not specify indent=”true” in production</emphasis>. </para><para>Because the XML indentation created this way is very simple, it may add whitespace inside elements where space characters are significant.</para></note></entry></row><row><entry><para><literal>out-file</literal><indexterm><primary>out-file, parameter</primary></indexterm></para></entry><entry><para>A file path.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies the path of the save file. A relative file path is relative to the current working directory.</para></entry></row></tbody></tgroup></informaltable></section><section xml:id="split_step"><title>Split step</title><para>Splits the input XHTML document, whether styled or semantic, into several pages and saves these pages to disk.</para><para>This step also generates a <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/html401/present/frames.html">frameset</link> <indexterm><primary>frameset, output format</primary></indexterm> and a table of contents used as the left frame of the frameset. While an obsolete HTML feature, a frameset makes it easy to browse the generated pages. 
Moreover, the table of contents used as the left frame is a convenient way to programmatically list all the generated pages.</para><para>The result of this step is the file containing the frameset.</para><note><para>For clarity, the “<literal>split.</literal>” parameter name prefix is omitted here. </para><para>However, when you pass any of the following parameters to <literal>w2x</literal>, please do not forget this prefix. Example: <literal>-p split.split-before-level 8</literal>.</para></note><para>Parameters:</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>allow-lonely-heading</literal><indexterm><primary>allow-lonely-heading, parameter</primary></indexterm></para></entry><entry><para>A boolean: <literal>true</literal> (same as: <literal>yes</literal> | <literal>on</literal> | <literal>1</literal>) | <literal>false</literal> (same as: <literal>no</literal> | <literal>off</literal> | <literal>0</literal>).</para><para>Default: <literal>false</literal>.</para></entry><entry><para>If specified as <literal>true</literal>, allows a page to contain just a heading and nothing else.</para></entry></row><row><entry><para><literal>indent</literal><indexterm><primary>indent, parameter</primary></indexterm></para></entry><entry><para>A boolean: <literal>true</literal> (same as: <literal>yes</literal> | <literal>on</literal> | <literal>1</literal>) | <literal>false</literal> (same as: <literal>no</literal> | <literal>off</literal> | <literal>0</literal>).</para><para>Default: <literal>false</literal>.</para></entry><entry><para>Specifies whether the save files should be 
indented.</para><note><para><emphasis>Do not specify indent=”true” in production</emphasis>. </para><para>Because the XML indentation created this way is very simple, it may add whitespace inside elements where space characters are significant.</para></note></entry></row><row><entry><para><literal>out-file</literal><indexterm><primary>out-file, parameter</primary></indexterm></para></entry><entry><para>A file path.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies the path of the file containing the frameset. A relative file path is relative to the current working directory.</para><para>This step always generates several files, all in the same directory as file <literal>out-file</literal>. </para><para>This output directory is created on the fly if needed. However, the output directory, if it already exists, is not automatically emptied.</para><itemizedlist><listitem><para>The file specified by <literal>out-file</literal> contains the frameset. Let’s suppose <literal>out-file</literal> is <literal>temp\foo.html</literal>.</para></listitem><listitem><para>The table of contents of the frameset, its left frame, is created in <literal>temp\foo-TOC.html</literal>.</para></listitem><listitem><para>Unless parameter <literal>use-id-as-filename</literal> has been specified as <literal>true</literal>, the styled HTML pages are created in <literal>temp\foo-0.html</literal>, <literal>temp\foo-1.html</literal>, <literal>temp\foo-2.html</literal>, …, <literal>temp\foo-<replaceable>N</replaceable>.html</literal>.</para></listitem></itemizedlist></entry></row><row><entry><para xml:id="split_split_before_level"><literal>split-before-level</literal><indexterm><primary>split-before-level, parameter</primary></indexterm></para></entry><entry><para>Outline level<indexterm><primary>Outline level</primary></indexterm> between 0 (e.g. style “<emphasis role="bold">Heading 1</emphasis>”) and 8 (e.g. 
style “<emphasis role="bold">Heading 9</emphasis>”).</para><para>Default: 0 (split at “<emphasis role="bold">Heading 1</emphasis>”).</para></entry><entry><para>In order to generate multi-page styled HTML, that is, frameset<indexterm><primary>frameset, output format</primary></indexterm>, Web Help<indexterm><primary>Web Help, output format</primary></indexterm>, EPUB<indexterm><primary>EPUB, output format</primary></indexterm>, we need to automatically split the input XHTML document into pages.</para><para>A new page is created each time a paragraph having an <emphasis>outline level</emphasis><indexterm><primary>Outline level</primary></indexterm> less than or equal to the specified <literal>split-before-level</literal> parameter<indexterm><primary>split-before-level, parameter</primary></indexterm> is found in the source. </para><para>An outline level is an integer between 0 (e.g. style “<emphasis role="bold">Heading 1</emphasis>”) and 8 (e.g. style “<emphasis role="bold">Heading 9</emphasis>”). </para><para>The default value of parameter <literal>split-before-level</literal> is 0, which means: for each “<emphasis role="bold">Heading 1</emphasis>”, create a new page starting with this “<emphasis role="bold">Heading 1</emphasis>”.</para><para>See also <link linkend="check_outline_levels_tip">Important tip</link>.</para></entry></row><row><entry><para><literal>use-id-as-filename</literal><indexterm><primary>use-id-as-filename, parameter</primary></indexterm></para></entry><entry><para>A boolean: <literal>true</literal> (same as: <literal>yes</literal> | <literal>on</literal> | <literal>1</literal>) | <literal>false</literal> (same as: <literal>no</literal> | <literal>off</literal> | <literal>0</literal>).</para><para>Default: <literal>false</literal>.</para></entry><entry><para>By default, the save files of the generated pages have the same basename as <literal>out-file</literal>, except that a number is appended to this basename. 
Example: <literal>out-file</literal> is <literal>temp\foo.html</literal>; the save files of the generated pages are thus: <literal>temp\foo-0.html</literal>, <literal>temp\foo-1.html</literal>, <literal>temp\foo-2.html</literal>, …, <literal>temp\foo-<replaceable>N</replaceable>.html</literal>.</para><para>In a MS-Word document, a heading is often given a bookmark. The Convert step translates this bookmark to an ID. When <literal>use-id-as-filename</literal> is specified as <literal>true</literal>, the save file of a page is given a basename corresponding to the ID of the heading used to start this page. When this heading ID is missing, the Split step falls back to the default behavior.</para></entry></row></tbody></tgroup></informaltable></section><section xml:id="transform_step"><title>Transform step</title><para>Transforms the input XML document or file using an <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/1999/REC-xslt-19991116">XSLT 1.0</link> stylesheet. The result of this step is the save file containing the transformed document. <indexterm><primary>Transform, step</primary></indexterm></para><para>Like the Load step, if the input XML file starts with a <literal>&lt;!DOCTYPE&gt;</literal> pointing to a DTD, then the document loader created by a Transform step will silently skip this DTD.</para><note><para>For clarity, the “<literal>transform.</literal>” or “<literal>transform2.</literal>” parameter name prefix is omitted here.</para><para>However, when you pass any of the following parameters to <literal>w2x</literal>, please do not forget this prefix. 
Example: <literal>-p transform.cals-tables yes</literal>.</para></note><para>Parameters:</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>xslt-url-or-file</literal><indexterm><primary>xslt-url-or-file, parameter</primary></indexterm></para></entry><entry><para>An absolute URL or the path of an existing file.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies which XSLT 1.0 stylesheet should be used to transform the input XML document. A relative file path is relative to the current working directory.</para></entry></row><row><entry><para><literal>out-file</literal><indexterm><primary>out-file, parameter</primary></indexterm></para></entry><entry><para>A file path.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies the path of the save file. A relative file path is relative to the current working directory.</para></entry></row></tbody></tgroup></informaltable><para>Any other parameter is passed to the XSLT stylesheet as an XSLT stylesheet parameter. 
Which XSLT stylesheet parameters are supported depends on the XSLT stylesheet being used.</para><table><title>Parameters of <literal>w2x:xslt/docbook.xslt</literal>, <literal>docbook5.xslt</literal>, which are used to convert input XHTML document to DocBook v4 or v5</title><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>docbook-version</literal><indexterm><primary>docbook-version, parameter</primary></indexterm></para></entry><entry><para>DocBook version (“<literal>4.5</literal>”, “<literal>5.0</literal>”, “<literal>5.1</literal>” or “<literal>5.2</literal>”).</para><para>Default: “<literal>4.5</literal>” for <literal>docbook.xslt</literal>, “<literal>5.0</literal>” for <literal>docbook5.xslt</literal>.</para></entry><entry><para>Specifies the version of DocBook. </para><para>This number is used to specify which <literal>&lt;!DOCTYPE&gt;</literal> to add to the generated file or, in the case of DocBook 5, the value of the <literal>version</literal> attribute of the root element of the generated file.</para><para>Please remember that versions of DocBook older than “<literal>4.3</literal>” do not support HTML tables. (HTML tables, not CALS tables, are generated by default. 
See below.)</para></entry></row><row><entry><para><literal>cals-tables</literal><indexterm><primary>cals-tables, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” | “<literal>no</literal>”.</para><para>Default: “<literal>no</literal>”.</para></entry><entry><para>If “<literal>yes</literal>”, generate CALS tables.</para><para>If “<literal>no</literal>”, generate HTML tables.</para><para>Note that <literal>cals-tables=”yes”</literal> requires specifying <link linkend="convert_step_params">Convert step parameter</link> <literal>set-column-number=”yes”</literal>.</para></entry></row><row><entry><para><literal>hierarchy-name</literal><indexterm><primary>hierarchy-name, parameter</primary></indexterm></para></entry><entry><para>“<literal>book</literal>” | “<literal>article</literal>” | “<literal>part</literal>” | “<literal>chapter</literal>” | “<literal>appendix</literal>” | “<literal>section</literal>” | “<literal>book-sect1</literal>” | “<literal>article-sect1</literal>” | “<literal>part-sect1</literal>” | “<literal>chapter-sect1</literal>” | “<literal>appendix-sect1</literal>” | “<literal>sect1</literal>” | “<literal>sect2</literal>” | “<literal>sect3</literal>” | “<literal>sect4</literal>” | “<literal>sect5</literal>”.</para><para>Default: “<literal>book</literal>”.</para></entry><entry><para>Specifies the root element name and type of sections of the DocBook document to be generated.</para></entry></row><row><entry><para><literal>media-alt</literal><indexterm><primary>media-alt, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” | “<literal>no</literal>”.</para><para>Default: “<literal>no</literal>”.</para></entry><entry><para>If “<literal>yes</literal>”, convert the <literal>alt</literal> attribute of XHTML element <literal>img</literal> to DocBook <literal>alt</literal> element.</para><para>If “<literal>no</literal>”, ignore the <literal>alt</literal> attribute of XHTML element 
<literal>img</literal>.</para></entry></row><row><entry><para><literal>pre-element-name</literal><indexterm><primary>pre-element-name, parameter</primary></indexterm></para></entry><entry><para>An element local name. Default: “<literal>literallayout</literal>”.</para></entry><entry><para>Specifies to which DocBook element, an HTML <literal>pre</literal> element is to be converted.</para></entry></row></tbody></tgroup></table><table xml:id="_Ref414612060"><title>Parameters of <literal>w2x:xslt/assembly.xslt</literal>, which are used to convert input DocBook V5.1 book to a DocBook V5.1 assembly</title><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>add-index</literal><indexterm><primary>add-index, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para>Ignored if the input book document does not contain any index term.</para><para>If “<literal>yes</literal>”,  add an index module at the end of the assembly.</para><para>If “<literal>no</literal>”,  do not add an index module at the end of the assembly.</para></entry></row><row><entry><para><literal>output-path</literal><indexterm><primary>output-path, parameter</primary></indexterm></para></entry><entry><para>An absolute or relative “<literal>file:</literal>”  URI.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies the URI of the directory which is to contain all generated files. 
A relative URI is relative to the current working directory.</para></entry></row><row><entry><para><literal>section-depth</literal><indexterm><primary>section-depth, parameter</primary></indexterm></para></entry><entry><para>“1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, “9”.</para><para>Default: “1”.</para></entry><entry><para>Specifies the module structure of the assembly (<emphasis>always acting as a book</emphasis>) <indexterm><primary>DocBook V5.1 assembly, output format</primary></indexterm> to be generated.</para><para>Example 1: an assembly generated using <literal>section-depth=”1”</literal> only contains chapter modules.</para><para>Example 2: an assembly generated using <literal>section-depth=”2”</literal> contains chapter modules, themselves possibly containing section modules. </para><para>Example 3: an assembly generated using <literal>section-depth=”3”</literal> contains chapter modules, themselves possibly containing section modules, themselves possibly containing section modules (acting as subsections).</para></entry></row><row><entry><para><literal>topic-path</literal><indexterm><primary>topic-path, parameter</primary></indexterm></para></entry><entry><para>An absolute or relative “<literal>file:</literal>” URI.</para><para>No default: generate topic files in <literal>output-path</literal>.</para></entry><entry><para>Specifies the URI of the subdirectory which is to contain all generated <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tdg.docbook.org/tdg/5.1/topic.html">DocBook V5.1 topic</link> files. 
A relative  URI is relative to <literal>output-path</literal>.</para></entry></row></tbody></tgroup></table><table><title>Parameters of <literal>w2x:xslt/topic.xslt</literal>, which is used to convert input XHTML document to a DITA topic</title><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>root-topic-id</literal><indexterm><primary>root-topic-id, parameter</primary></indexterm></para></entry><entry><para>An XML ID.</para><para>Default:  automatically generated  ID.</para></entry><entry><para>Specifies the ID of the root topic.</para></entry></row><row><entry><para><literal>single-topic</literal><indexterm><primary>single-topic, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>If “<literal>yes</literal>”,  convert input <literal>&lt;div class=”role-section<replaceable>N</replaceable>”&gt;</literal> to (non-nested) DITA section  elements.</para><para>If “<literal>no</literal>”,  convert input <literal>&lt;div class=”role-section<replaceable>N</replaceable>”&gt;</literal> to nested topics.</para></entry></row><row><entry><para><literal>topic-type</literal><indexterm><primary>topic-type, parameter</primary></indexterm></para></entry><entry><para>“<literal>topic</literal>” |“<literal>concept</literal>”  | “<literal>generalTask</literal>” | “<literal>task</literal>” (same as: “<literal>strictTask</literal>” )  | “<literal>reference</literal>”.</para><para>Default: 
“<literal>topic</literal>”.</para></entry><entry><para>Specifies the type of topics to be created by the XSLT stylesheet.</para></entry></row><row><entry><para><literal>pre-element-name</literal><indexterm><primary>pre-element-name, parameter</primary></indexterm></para></entry><entry><para>An element local name. Default: “<literal>pre</literal>”.</para></entry><entry><para>Specifies to which DITA element, an HTML <literal>pre</literal> element is to be converted.</para></entry></row><row><entry><para><literal>shortdesc-class-name</literal><indexterm><primary>shortdesc-class-name, parameter</primary></indexterm></para></entry><entry><para>A class name. Default:  “”. Examples: <literal>p-Shortdesc</literal>, <literal>p-Abstract</literal>.</para></entry><entry><para>Specifies the class name of the XHTML <literal>&lt;p&gt;</literal> which acts as a short description of the section. </para><para>When this parameter is not specified (or is specified as the empty string which is its default value), the following style mapping, created by the w2x-app wizard:</para><programlisting>-p edit.blocks.convert¬
"p-Shortdesc p class='p-Shortdesc'"
...
&lt;xsl:template 
  match="h:p[@class='p-Shortdesc']"&gt;
  &lt;shortdesc&gt;
    &lt;xsl:call-template 
      name="processCommonAttributes"/&gt;
    &lt;xsl:apply-templates/&gt;
  &lt;/shortdesc&gt;
&lt;/xsl:template&gt;
</programlisting><para>causes DITA <literal>&lt;shortdesc&gt;</literal> elements to be generated inside topic bodies, which is invalid.</para><para>After specifying </para><programlisting>-p transform.shortdesc-class-name¬
p-Shortdesc 
</programlisting><para>this issue is fixed and DITA <literal>&lt;shortdesc&gt;</literal> elements are generated before topic bodies.</para></entry></row></tbody></tgroup></table><table><title>Parameters of <literal>w2x:xslt/xhtml_strict.xslt</literal>, <literal>xhtml_loose.xslt</literal>, <literal>xhtml1_1.xslt</literal>, <literal>xhtml5.xslt</literal>, which are used to convert input XHTML 1.0 Transitional document to  XHTML having a different version</title><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>add-xml-lang</literal><indexterm><primary>add-xml-lang, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>” for <literal>xhtml_strict</literal>, <literal>xhtml_loose</literal>, <literal>xhtml1_1</literal>; “<literal>no</literal>” for <literal>xhtml5</literal>.</para></entry><entry><para>If “<literal>yes</literal>”,  add an <literal>xml:lang</literal> attribute to all XHTML elements having a <literal>lang</literal> attribute.</para></entry></row><row><entry><para><literal>discard-index-terms</literal><indexterm><primary>discard-index-terms, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para>If “<literal>yes</literal>”,  discard <literal>&lt;span class=”role-index-term”&gt;</literal> elements.</para><para>If “<literal>no</literal>”,  keep  <literal>&lt;span class=”role-index-term”&gt;</literal> 
elements.</para></entry></row><row><entry><para><literal>footnote-number-format</literal><indexterm><primary>footnote-number-format, parameter</primary></indexterm></para></entry><entry><para>A valid XSLT number format (value of attribute <literal>format</literal> of  element <literal>xsl:number</literal>).</para><para>Default: “<literal>[1]</literal>”.</para></entry><entry><para>When parameter <literal>number-footnotes</literal> is “<literal>yes</literal>”, specifies the format of the numeric label used for footnotes and footnote callouts.</para></entry></row><row><entry><para><literal>generate-xref-text</literal><indexterm><primary>generate-xref-text, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para>If “<literal>yes</literal>”,  add hyperlink text to <literal>a</literal> elements which are cross-references.</para><para>If “<literal>no</literal>”,  keep empty <literal>a</literal> elements which are cross-references.</para></entry></row><row><entry><para><literal>number-footnotes</literal><indexterm><primary>number-footnotes, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para>If “<literal>yes</literal>”,  add a numeric label to  footnotes and footnote callouts.</para><para>If “<literal>no</literal>”, do not add  a numeric label to footnotes and footnote callouts.</para></entry></row><row><entry><para><literal>style-with-class</literal><indexterm><primary>style-with-class, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>If “<literal>yes</literal>”,  add a <literal>class</literal> attribute to some elements to allow using a CSS stylesheet to style them. 
For example: convert <literal>&lt;center&gt;</literal> to <literal>&lt;div class=”center”&gt;</literal>.</para><para>If “<literal>no</literal>”,  add a direct style to some elements to style them. For example: convert <literal>&lt;center&gt;</literal> to <literal>&lt;div style=”text-align:center;”&gt;</literal>.</para></entry></row></tbody></tgroup></table><table><title>Parameters of <literal>w2x:xslt/map.xslt</literal>, <literal>bookmap.xslt</literal>, which are used to convert input DITA topic file to a map or bookmap</title><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>add-index</literal><indexterm><primary>add-index, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para> <literal>bookmap.xslt</literal> only. </para><para>Ignored if the input topic document does not contain any index term.</para><para>If “<literal>yes</literal>”,  add an <literal>indexlist</literal> element to the back matter of the bookmap . 
</para><para>If “<literal>no</literal>”,  do not add an <literal>indexlist</literal> element to the back matter of the bookmap.</para></entry></row><row><entry><para><literal>add-toc</literal><indexterm><primary>add-toc, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para><literal>bookmap.xslt</literal> only.</para><para>If “<literal>yes</literal>”,  add a <literal>toc</literal> element to the front matter of the bookmap.</para><para>If “<literal>no</literal>”,  do not add a <literal>toc</literal> element to the front matter of the bookmap.</para></entry></row><row><entry><para><literal>output-path</literal><indexterm><primary>output-path, parameter</primary></indexterm></para></entry><entry><para>An absolute or relative “<literal>file:</literal>”  URI.</para><para>No default (<emphasis>required</emphasis>).</para></entry><entry><para>Specifies the URI of the directory which is to contain all generated files. A relative URI is relative to the current working directory.</para></entry></row><row><entry><para><literal>section-depth</literal><indexterm><primary>section-depth, parameter</primary></indexterm></para></entry><entry><para>“1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, “9”.</para><para>Default: “1”.</para></entry><entry><para>Specifies the <literal>topicref</literal> structure of the DITA map to be generated.</para><para>Example 1: a bookmap generated using  <literal>section-depth=”1”</literal> only contains <literal>chapter</literal> <literal>topicref</literal>s.</para><para>Example 2: a bookmap generated using  <literal>section-depth=”2”</literal> contains <literal>chapter</literal> <literal>topicref</literal>s, themselves possibly containing plain <literal>topicref</literal>s (acting as sections). 
</para><para>Example 3: a bookmap generated using  <literal>section-depth=”3”</literal> contains <literal>chapter</literal> <literal>topicref</literal>s, themselves possibly containing plain <literal>topicref</literal>s (acting as sections), themselves possibly containing other plain <literal>topicref</literal>s (acting as subsections).</para></entry></row><row><entry><para><literal>topic-path</literal><indexterm><primary>topic-path, parameter</primary></indexterm></para></entry><entry><para>An absolute or relative “<literal>file:</literal>”  URI.</para><para>No default: generate topic files in <literal>output-path</literal>.</para></entry><entry><para>Specifies the URI of the subdirectory which is to contain all generated topic files. A relative URI is relative to <literal>output-path</literal>.</para></entry></row><row><entry><para><literal>topic-type</literal><indexterm><primary>topic-type, parameter</primary></indexterm></para></entry><entry><para>“<literal>topic</literal>” | “<literal>concept</literal>”  | “<literal>generalTask</literal>” | “<literal>task</literal>” (same as: “<literal>strictTask</literal>” )  | “<literal>reference</literal>”.</para><para>No default. See description.</para></entry><entry><para>Specifies the type of topics to be created by the <literal>topic.xslt</literal> XSLT stylesheet.  See <xref linkend="_Ref414612060"/>.</para><para>This parameter is used to distinguish between a strict task and a general task. In all other cases, this parameter may be omitted.</para></entry></row></tbody></tgroup></table></section><section xml:id="webhelp_step"><title>Web Help step</title><para>Splits input XHTML document, whether styled or semantic, into several pages and compiles these pages into a  Web Help<indexterm><primary>Web Help, output format</primary></indexterm>. 
The Web Help compiler used to do this is free, open source, <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/ditac/whc.shtml">XMLmind Web Help Compiler</link><indexterm><primary>XMLmind Web Help Compiler</primary></indexterm>.</para><para>This step always generates UTF-8 encoded, “<literal>.html</literal>” files, even when parameters specify other values.</para><para>Same parameters as the <link linkend="split_step">Split step</link>, plus the following Web Help specific parameters (for clarity, the “<literal>webhelp.</literal>” parameter name prefix is omitted here):</para><informaltable><tgroup cols="3"><colspec colname="c1" colwidth="25*"/><colspec colname="c2" colwidth="25*"/><colspec colname="c3" colwidth="50*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Name</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Value</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Description</emphasis></para></entry></row></thead><tbody valign="top"><row><entry><para><literal>add-index</literal><indexterm><primary>webhelp.add-index, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>”.</para><para>Default:  “<literal>yes</literal>”.</para></entry><entry><para> If “<literal>yes</literal>”,  automatically create an <literal>index.html</literal> file, unless an <literal>index.html</literal> file already exists.</para></entry></row><row><entry><para><literal>omit-toc-root</literal><indexterm><primary>omit-toc-root, parameter</primary></indexterm></para></entry><entry><para>“<literal>yes</literal>” |  “<literal>no</literal>” </para><para>Default:  “<literal>no</literal>”.</para></entry><entry><para>By default, the TOC generated for a Web Help document has a single “root”. This single root always points to the page containing the title, subtitle, author, etc, of the document. 
Setting this parameter to “<literal>yes</literal>” prevents the generated TOC from having such a single root.</para></entry></row><row><entry><para><literal>wh-*</literal> (<literal>wh-local-jquery</literal>, <literal>wh-layout</literal>, <literal>wh-collapse-toc</literal>, etc)<indexterm><primary>webhelp.wh-*, parameters</primary></indexterm></para></entry><entry><para>String.</para><para>No default.</para></entry><entry><para>All parameters starting with “<literal>wh-</literal>” are passed as is to XMLmind Web Help Compiler<indexterm><primary>XMLmind Web Help Compiler</primary></indexterm>.</para><para>Example: <literal>-p webhelp.wh-collapse-toc yes</literal>.</para><para>These parameters are all documented in <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.xmlmind.com/ditac/_whc/doc/manual/parameters.html">XMLmind Web Help Compiler, Parameters</link>.</para></entry></row></tbody></tgroup></informaltable></section></chapter><chapter xml:id="embed"><title>Embedding w2x in a Java™ application</title><para>Embedding w2x in a Java™ application is as simple as:</para><orderedlist><listitem><para>Create an instance of class <literal>Processor</literal>.</para></listitem><listitem><para>Configure it by passing an array of option strings identical to those of the <link linkend="w2x_command">w2x command line utility</link> to method <literal>Processor.configure</literal> or (low-level) by directly adding conversion steps and parameters to <literal>Processor.stepList</literal> and <literal>Processor.parameterMap</literal>. </para></listitem><listitem><para>Invoke the configured processor to convert the specified input file to the specified output file. 
This is done by invoking high-level method <literal>Processor.process</literal> or low-level method <literal>Processor.executeSteps</literal>.</para></listitem></orderedlist><note><para><emphasis role="bold">About thread-safety</emphasis></para><para>An instance of <literal>Processor</literal> cannot be shared by different threads.</para><para>It is strongly recommended not to reuse an instance of <literal>Processor</literal>. That is, please create one instance of <literal>Processor</literal> per conversion.</para></note><para>The reference manual (generated using <literal>javadoc</literal>) of the Java API of w2x is found in <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.xmlmind.com/w2x/_distrib/doc/api/overview-summary.html">XMLmind Word To XML Java™ API</link>.</para><para><emphasis role="bold">High-level example </emphasis><literal>w2x_install_dir/doc/manual/embed/Embed1.java</literal><emphasis role="bold">:</emphasis></para><programlisting>Processor processor = new Processor();

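// As the code below shows, configure() returns the index of the first
// argument which is not an option (here, the input file).
int l = processor.configure(args);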

File inFile = null;
File outFile = null;
if (l+2 == args.length) {
    inFile = new File(args[l]);
    outFile = new File(args[l+1]);
} else {
    System.exit(1);
}

processor.process(inFile, outFile, /*progress monitor*/ null);
</programlisting><itemizedlist><listitem><para>Compile <literal>Embed1.java</literal> by executing “<literal>ant</literal>”<footnote xml:id="__FN10__"><para><link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ant.apache.org/">Apache Ant</link> is a command-line utility for automating software build processes. By default, <literal>ant</literal> uses an XML file called <literal>build.xml</literal> to describe the build process and its dependencies. In the case of the two code samples of this chapter, this file is <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/embed/build.xml</literal>.</para></footnote> in <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/embed/</literal>.</para></listitem><listitem><para>Run “<literal>ant tembed1</literal>” in <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/embed/</literal>. This creates <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/embed/tembed1.dita</literal>.</para></listitem></itemizedlist><para><emphasis role="bold">Lower-level example </emphasis><literal>w2x_install_dir/doc/manual/embed/Embed2.java</literal><emphasis role="bold">:</emphasis></para><programlisting>Processor processor = new Processor();

ConvertStep convertStep = new ConvertStep("convert");
processor.stepList.add(convertStep);

EditStep editStep = new EditStep("edit");
processor.stepList.add(editStep);

processor.parameterMap.put("edit.xed-url-or-file", 
                           "w2x:xed/main-styled.xed");

SaveStep saveStep = new SaveStep("save");
processor.stepList.add(saveStep);

processor.parameterMap.put("save.indent", "yes");

processor.process(inFile, outFile, /*progress monitor*/ null);
</programlisting><itemizedlist><listitem><para>Compile <literal>Embed2.java</literal> by executing “<literal>ant</literal>” in <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/embed/</literal>.</para></listitem><listitem><para>Run “<literal>ant tembed2</literal>” in <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/embed/</literal>. This creates <literal><replaceable>w2x_install_dir</replaceable>/doc/manual/embed/tembed2.xhtml</literal>.</para></listitem></itemizedlist><section xml:id="extension_points"><title>Extension points</title><section xml:id="custom_convert_step"><title>Custom conversion step</title><para>The stock conversion steps are:  <literal>com.xmlmind.w2x.processor.ConvertStep</literal>, <literal>DeleteFilesStep</literal>, <literal>EditStep</literal>, <literal>LoadStep</literal>, <literal>SaveStep</literal>, <literal>TransformStep</literal>.</para><para>A custom conversion step may be implemented by deriving abstract class <literal>com.xmlmind.w2x.processor.ProcessStep</literal>. Doing so poses no technical difficulty: it suffices to implement a single method, <literal>ProcessStep.process</literal>.</para><para>See <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.xmlmind.com/w2x/_distrib/doc/api/com/xmlmind/w2x/processor/ProcessStep.html">reference of class com.xmlmind.w2x.processor.ProcessStep</link>.</para></section><section xml:id="custom_image_converter"><title>Custom image converters</title><para>Image converters are used to convert images having a format not supported by Web browsers (TIFF, WMF, EMF, etc) to a format supported by Web browsers (SVG, PNG, JPEG).</para><para>Image converters are specified by interface <literal>com.xmlmind.w2x.docx.image.ImageConverterFactory</literal>.  
XMLmind Word To XML ships with 4 classes implementing this interface:</para><variablelist><varlistentry><term><literal>com.xmlmind.w2x.docx.image.ImageConverterFactoryImpl</literal></term><listitem><para>Image converter factory used to convert TIFF images to PNG or JPEG.</para></listitem></varlistentry><varlistentry><term><literal>com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory</literal></term><listitem><para>Image converter factory used to convert WMF graphics to SVG.</para></listitem></varlistentry><varlistentry><term><literal>com.xmlmind.w2x_ext.emf2png.EMF2PNG</literal></term><listitem><para><emphasis>This image converter factory is available only on Windows</emphasis>. It leverages Windows' own <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://msdn.microsoft.com/en-us/library/windows/desktop/ms533797(v=vs.85).aspx">GDI+</link> to convert EMF (in fact, Windows metafiles of any kind, including WMF) to PNG.</para><para>This is not ideal because, unlike the above <literal>WMFConverterFactory</literal> which converts WMF (Windows vector graphics format) to SVG (standard vector graphics format), <literal>EMF2PNG</literal> converts a vector graphics format to a raster image format. However, having <literal>EMF2PNG</literal> is better than nothing at all.</para><para><literal>EMF2PNG</literal> has one parameter called <literal>resolution</literal>. Its value is a real number expressed in dots per inch (DPI). The default value of parameter <literal>resolution</literal> is <literal>-300</literal> (see below).</para><para>The <literal>resolution</literal> parameter specifies the resolution of the output PNG file. 
0 means: same resolution as the one found in the input EMF/WMF file; a positive number means: use this value to override the resolution found in the input EMF/WMF file; a negative number means: use the specified absolute value, but only if this absolute value is greater than the resolution found in the input EMF/WMF file.</para></listitem></varlistentry><varlistentry><term><literal>com.xmlmind.w2x.docx.image.ExternalImageConverter</literal></term><listitem><para>This image converter factory executes <emphasis>an external program</emphasis> to perform the conversion. See <xref linkend="external_image_converter"/>.</para></listitem></varlistentry></variablelist><para>If you want w2x to support more image formats, you’ll have to create your own <literal>ImageConverterFactory</literal> and register it with w2x using method <literal>ImageConverterFactories.register</literal>.</para><note><para><emphasis role="bold">About thread-safety</emphasis></para><para>A single instance of a class implementing <literal>ImageConverterFactory</literal> is used by all instances of <literal>com.xmlmind.w2x.processor.Processor</literal>. 
This implies that an implementation of <literal>ImageConverterFactory</literal> must be thread-safe.</para></note><para>See <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.xmlmind.com/w2x/_distrib/doc/api/com/xmlmind/w2x/docx/image/package-summary.html">reference of package com.xmlmind.w2x.docx.image</link>.</para><section xml:id="external_image_converter"><title>Specifying an external image converter</title><para>Examples of <literal>W2X_IMAGE_CONVERSIONS</literal> specifications (see <xref linkend="W2X_IMAGE_CONVERSIONS"/>):</para><itemizedlist><listitem><para>Convert EMF/WMF to SVG using <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.openoffice.org/">OpenOffice</link>/<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.libreoffice.org/">LibreOffice</link>:</para><programlisting>.emf.svg.wmf.svg <emphasis role="bold">soffice --headless --convert-to svg --outdir %~po %i</emphasis>
</programlisting><para>Or equivalently using <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/unoconv/unoconv">unoconv</link>:</para><programlisting>.emf.svg.wmf.svg <emphasis role="bold">unoconv -f svg -o %o %i</emphasis>
</programlisting></listitem><listitem><para>Convert EMF to SVG using <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://inkscape.org/">Inkscape</link>:</para><programlisting>.emf.svg <emphasis role="bold">inkscape -l -o %o %i</emphasis>
</programlisting></listitem></itemizedlist><para>The command executed by an external image converter may contain the following variables:</para><informaltable><tgroup cols="2"><colspec colname="c1" colwidth="20*"/><colspec colname="c2" colwidth="80*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Variable</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Definition</emphasis></para></entry></row></thead><tbody valign="top"><row><entry align="center"><para><literal>%I</literal></para></entry><entry><para>Absolute path of the input image file.</para></entry></row><row><entry align="center"><para><literal>%O</literal></para></entry><entry><para>Absolute path of the output image file.</para></entry></row><row><entry align="center"><para><literal>%i</literal></para></entry><entry><para>Same as <literal>%I</literal> but quoted, that is, equivalent to <literal>“%I”</literal>.</para></entry></row><row><entry align="center"><para><literal>%o</literal></para></entry><entry><para>Same as <literal>%O</literal> but quoted, that is, equivalent to <literal>“%O”</literal>.</para></entry></row><row><entry align="center"><para><literal>%S</literal></para></entry><entry><para>File separator: “<literal>\</literal>” on Windows, “<literal>/</literal>” on Mac/Linux.</para></entry></row></tbody></tgroup></informaltable><para>The following modifiers may be applied to the <literal>%I</literal>, <literal>%O</literal>, <literal>%i</literal>, <literal>%o</literal> variables:</para><informaltable><tgroup cols="2"><colspec colname="c1" colwidth="20*"/><colspec colname="c2" colwidth="80*"/><thead valign="top"><row><entry align="center"><para><emphasis role="bold">Modifier</emphasis></para></entry><entry align="center"><para><emphasis role="bold">Definition</emphasis></para></entry></row></thead><tbody valign="top"><row><entry align="center"><para><literal>~p</literal></para></entry><entry><para>Absolute path of the parent directory of the 
file. For example, if <literal>%I</literal> is “<literal>C:\temp\doc_files\logo.wmf</literal>”, then <literal>%~pI</literal> is “<literal>C:\temp\doc_files</literal>”.</para></entry></row><row><entry align="center"><para><literal>~n</literal></para></entry><entry><para>Basename of the file. For example, if <literal>%I</literal> is “<literal>C:\temp\doc_files\logo.wmf</literal>”, then <literal>%~nI</literal> is “<literal>logo.wmf</literal>”.</para></entry></row><row><entry align="center"><para><literal>~r</literal></para></entry><entry><para>Basename of the file without any extension. For example, if <literal>%I</literal> is “<literal>C:\temp\doc_files\logo.wmf</literal>”, then <literal>%~rI</literal> is “<literal>logo</literal>”.</para></entry></row><row><entry align="center"><para><literal>~e</literal></para></entry><entry><para>Extension of the file. For example, if <literal>%I</literal> is “<literal>C:\temp\doc_files\logo.wmf</literal>”, then <literal>%~eI</literal> is “<literal>wmf</literal>”.</para></entry></row></tbody></tgroup></informaltable><para>Also note that “<literal>%%</literal>” may be used to escape character “<literal>%</literal>”. More generally, just like in a URL, a <literal>%<replaceable>HH</replaceable></literal> UTF-8 sequence may be used to escape any character. Example: “<literal>%3B</literal>” is “<literal>;</literal>” (semicolon), “<literal>%C3%A9</literal>” is “<literal>é</literal>” (“e” with acute accent).</para></section><section xml:id="W2X_IMAGE_CONVERSIONS"><title>Controlling how image files found in the input DOCX file are converted to standard formats</title><para>Conversion of images found in the DOCX file (TIFF, WMF, EMF, etc) to standard formats (SVG, PNG, JPEG) may be controlled using environment variable (or Java™ property) <literal>W2X_IMAGE_CONVERSIONS</literal>. 
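</para><para>When w2x is embedded in a Java™ application, the Java™ property form can presumably be set programmatically. The following is a minimal sketch which is not taken from the w2x distribution: class <literal>ImageConversionsSetup</literal> is hypothetical, and the sketch assumes that w2x reads the system property after it has been set, which is why the property should be set before creating any <literal>Processor</literal>. The value used here is the <literal>inkscape</literal> specification given as an example later in this section.</para><programlisting>// Hypothetical helper; it only uses java.lang.System, not the w2x API.
class ImageConversionsSetup {
    static final String PROPERTY = "W2X_IMAGE_CONVERSIONS";

    // Custom EMF to SVG conversion, followed by the default
    // conversions (this is what the trailing "+" stands for).
    static final String SPECIFICATIONS = ".emf.svg inkscape -l -o %o %i;+";

    static void install() {
        // Assumption: w2x consults this system property after this call.
        System.setProperty(PROPERTY, SPECIFICATIONS);
    }

    public static void main(String[] args) {
        install();
        // Print the value which is now visible to the rest of the JVM.
        System.out.println(System.getProperty(PROPERTY));
    }
}
</programlisting><para>The same value may also be passed to the Java™ runtime on its command line by using the standard <literal>-D</literal> option.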
</para><para>The default value of this variable is (all specifications on a single line):</para><programlisting>.wmf.svg java:com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory;
.tiff.png java:com.xmlmind.w2x.docx.image.ImageConverterFactoryImpl
</programlisting><para>On Windows, the default value of <literal>W2X_IMAGE_CONVERSIONS</literal> is (all specifications on a single line):</para><programlisting>.wmf.svg java:com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory;
<emphasis role="bold">.emf.png.wmf.png java:com.xmlmind.w2x_ext.emf2png.EMF2PNG resolution -300;</emphasis>
.tiff.png java:com.xmlmind.w2x.docx.image.ImageConverterFactoryImpl
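</programlisting><para>The <literal>resolution -300</literal> setting found in the Windows default value follows the rule given in the description of <literal>EMF2PNG</literal>. As a minimal, self-contained sketch of this rule (this is not the actual <literal>EMF2PNG</literal> code: class <literal>ResolutionRule</literal> and its method are hypothetical, and keeping the resolution found in the input file when a negative value does not exceed it is an assumption):</para><programlisting>// Hypothetical illustration only; this class is not part of the w2x API.
class ResolutionRule {
    // requested: value of parameter "resolution", in DPI.
    // found: resolution found in the input EMF/WMF file, in DPI.
    static double effectiveResolution(double requested, double found) {
        if (requested == 0) {
            // 0: same resolution as the one found in the input file.
            return found;
        }
        if (requested > 0) {
            // Positive: always override the resolution found in the file.
            return requested;
        }
        // Negative: use the absolute value, but only if it is greater than
        // the resolution found in the file (assumed: otherwise keep it).
        return Math.max(-requested, found);
    }

    public static void main(String[] args) {
        // With the default -300, a 96 DPI metafile is rendered at 300 DPI.
        System.out.println(effectiveResolution(-300, 96));
        // A 600 DPI metafile keeps its own resolution.
        System.out.println(effectiveResolution(-300, 600));
    }
}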
</programlisting><para>The syntax of <literal>W2X_IMAGE_CONVERSIONS</literal> is:</para><programlisting>specifications -&gt; “<emphasis role="bold">-</emphasis>” | specification_list

specification_list -&gt; specification [ “<emphasis role="bold">;</emphasis>” specification ]+

specification -&gt; “<emphasis role="bold">+</emphasis>” | image_conversion

image_conversion -&gt; extensions <emphasis>S</emphasis> ( java_image_conversion | external_image_conversion )

extensions -&gt; [ “<emphasis role="bold">.</emphasis>” input_file_extension “<emphasis role="bold">.</emphasis>” output_file_extension ]+

java_image_conversion -&gt; “<emphasis role="bold">java:</emphasis>” fully_qualified_java_class_name parameters

parameters -&gt; [ <emphasis>S</emphasis> parameter_name <emphasis>S</emphasis> possibly_quoted_parameter_value ]*

external_image_conversion -&gt; command_line
</programlisting><para>About this syntax:</para><itemizedlist><listitem><para>“<literal>-</literal>” means: no specifications; hence no image conversions at all. </para></listitem><listitem><para>“<literal>+</literal>” means: insert default value of <literal>W2X_IMAGE_CONVERSIONS</literal> at this point. Example:</para><programlisting>set W2X_IMAGE_CONVERSIONS=.emf.svg inkscape -l -o %o %i<emphasis role="bold">;+</emphasis>
</programlisting><para>where default value of <literal>W2X_IMAGE_CONVERSIONS</literal> is (on Windows):</para><programlisting>.wmf.svg java:com.xmlmind.w2x_ext.wmf_converter.WMFConverterFactory;
.emf.png.wmf.png java:com.xmlmind.w2x_ext.emf2png.EMF2PNG resolution -300;
.tiff.png java:com.xmlmind.w2x.docx.image.ImageConverterFactoryImpl
</programlisting></listitem><listitem><para>Note that the image conversion specifications are considered in the order of their declarations in variable <literal>W2X_IMAGE_CONVERSIONS</literal>. In the case of the above example, it is the custom “<literal>inkscape -l -o %o %i</literal>” specification which is used to convert EMF files, and not the stock “<literal>java:com.xmlmind.w2x_ext.emf2png.EMF2PNG resolution -300</literal>” one.</para></listitem></itemizedlist></section></section></section></chapter><chapter xml:id="limitations"><title>Limitations and implementation specificities</title><para>The <link linkend="convert_step">Convert step</link> does not support the following MS-Word features. </para><para>By “does not support”, we mean that w2x will not generate anything useful corresponding to such features. We don’t mean that using such features in a DOCX file would cause w2x to fail or to generate invalid XML documents.</para><itemizedlist><listitem><para>Right to left scripts.</para></listitem><listitem><para>Enclose characters.</para></listitem><listitem><para>Asian layout. </para></listitem><listitem><para>Cover Page. Blank Page.</para></listitem><listitem><para>Text wrapping of tables and pictures other than the simplest one.</para></listitem><listitem><para>Picture formats other than GIF, PNG, JPEG, BMP, TIFF and WMF are not supported. <emphasis>EMF pictures are supported only on Windows</emphasis>.</para></listitem><listitem><para>Clip Art. <emphasis>Shapes</emphasis>. <emphasis>SmartArt</emphasis>. <emphasis>Chart</emphasis>.</para></listitem><listitem><para>Header. Footer. Page Number.</para></listitem><listitem><para>Japanese Greetings. Text Box.  WordArt. Drop Cap. 
</para></listitem><listitem><para>Object.</para></listitem><listitem><para>All features related to Page Layout except (to a minimal extent) page and column breaks and end of sections.</para></listitem><listitem><para>All features related to Mailings.</para></listitem><listitem><para>All features related to Spelling &amp; Grammar, except of course the various languages used in the document (i.e. <literal>lang</literal> attribute).</para></listitem><listitem><para>Comments.</para></listitem><listitem><para>All features related to Change Tracking.</para><para>When a DOCX file contains revision info (i.e. "<emphasis role="bold">Track Changes</emphasis>"), w2x implements its own, automatic, very crude, interpretation of "<emphasis role="bold">Accept All Changes</emphasis>". That is why a warning is issued, advising the user to manually accept or reject the tracked changes using MS-Word before submitting the DOCX file to w2x.</para></listitem><listitem><para>All features related to (document) Compare, (document) Protect.</para></listitem><listitem><para>Macros.</para></listitem><listitem><para>Controls.</para></listitem></itemizedlist><para>The <link linkend="convert_step">Convert step</link> generates XHTML+CSS documents having the following specificities:</para><itemizedlist><listitem><para>Tab stops<indexterm><primary>tab stops</primary></indexterm> are converted to <literal>&lt;span class="role-tab"&gt; &lt;/span&gt;</literal>. See <xref linkend="about_tab_stops"/>.</para></listitem><listitem><para>MS-Word document properties having no standard <literal>meta</literal> equivalent are given names starting with “<literal>ms-</literal>”. Example:</para><programlisting>&lt;meta content="Hussein Shafie" name="<emphasis role="bold">ms-cp-lastModifiedBy</emphasis>" /&gt;
</programlisting></listitem><listitem><para>MS-Word “styles” having no CSS equivalent are given a “<literal>-ms-</literal>” prefix. Example:</para><programlisting>.p-Heading3 {
    <emphasis role="bold">-ms-outlineLvl: 2</emphasis>;
    color: #4F81BD;
    font-family: Cambria;
    ...
</programlisting></listitem><listitem><para>Page breaks are translated to <literal>&lt;?break-page?&gt;</literal>. Column breaks are translated to <literal>&lt;?break-column?&gt;</literal>. End of sections are signaled by <literal>&lt;?end-of-section?&gt;</literal>.</para></listitem><listitem><para>WMF pictures are converted to <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/SVG/">SVG</link>.</para></listitem><listitem><para>OpenXML math, for example <inlineequation><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>x</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo>-</mml:mo><mml:mi>b</mml:mi><mml:mo>±</mml:mo><mml:mroot><mml:mrow><mml:msup><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>-</mml:mo><mml:mn>4</mml:mn><mml:mi>a</mml:mi><mml:mi>c</mml:mi></mml:mrow><mml:mrow/></mml:mroot></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>a</mml:mi></mml:mrow></mml:mfrac></mml:math></inlineequation>,  is converted to <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.w3.org/TR/MathML2/">MathML</link><indexterm><primary>MathML</primary></indexterm>.</para><para>Conversion from OpenXML math to MathML is implemented by an XSLT 1.0 stylesheet called <literal>omml2mml.xsl</literal> coming from open source project <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://tei-c.org/Tools/Stylesheets/">XSL stylesheets for TEI XML</link>. If you think you have access to a better XSLT stylesheet than open source <literal>omml2mml.xsl</literal>, then you may use it by specifying environment variable (or Java™ system property) <literal>W2X_MATH_CONVERTER_XSLT</literal>. Example:</para><programlisting>set W2X_MATH_CONVERTER_XSLT=C:\Users\john\My better omml2mml.xsl
</programlisting></listitem><listitem><para>All simple and most complex fields are converted to a <literal>&lt;?field <replaceable>code</replaceable>?&gt;</literal> having a <literal>&lt;span class="role-field"&gt;</literal> parent. Example:</para><programlisting>&lt;span class="role-field"&gt;
<emphasis role="bold">&lt;?field DATE  \@ "MMMM d, yyyy"  \* MERGEFORMAT ?&gt;</emphasis>
August 27, 2014
&lt;/span&gt;
</programlisting></listitem><listitem><para>Smart tags are enclosed between  <literal>&lt;?begin-smartTag <replaceable>tag</replaceable>?&gt;</literal> and <literal>&lt;?end-smartTag <replaceable>tag</replaceable>?&gt;</literal>. Example:</para><programlisting><emphasis role="bold">&lt;?begin-smartTag {urn:schemas-microsoft-com:office:smarttags}PersonName#0?&gt;</emphasis>
&lt;?begin-smartTag {urn:schemas:contacts}GivenName#1?&gt;
Bill
&lt;?end-smartTag {urn:schemas:contacts}GivenName#1?&gt;

&lt;?begin-smartTag {urn:schemas:contacts}Sn#2?&gt;
Gates
  &lt;?end-smartTag {urn:schemas:contacts}Sn#2?&gt;
<emphasis role="bold">&lt;?end-smartTag {urn:schemas-microsoft-com:office:smarttags}PersonName#0?&gt;</emphasis>
</programlisting></listitem><listitem><para>Controls are enclosed between  <literal>&lt;?begin-sdt <replaceable>control_id</replaceable>?&gt;</literal>  and <literal>&lt;?end-sdt <replaceable>control_id</replaceable>?&gt;.</literal> Example:</para><programlisting><emphasis role="bold">&lt;?begin-sdt comboBox#6?&gt;</emphasis>

&lt;td class="tc-TableGrid--bb tc-TableGrid"
    style="padding-bottom: 7.2pt; padding-left: 7.2pt; 
           padding-right: 7.2pt; padding-top: 7.2pt;"&gt;
  &lt;p class="tp-TableGrid p-Normal" lang="fr-FR"&gt;
    &lt;span class="c-PlaceholderText"&gt;Choose an item.&lt;/span&gt;
  &lt;/p&gt;
&lt;/td&gt;

<emphasis role="bold">&lt;?end-sdt comboBox#6?&gt; </emphasis>
</programlisting></listitem><listitem xml:id="east_asia_lang_limitation"><para>The language of DOCX files written in an East Asian language is not correctly detected.</para><para>Unfortunately, this will always be the case because w2x never examines the characters actually contained in a text span having <literal>&lt;w:lang w:eastAsia="ja-JP" w:val="en-US"/&gt;</literal> to determine whether this text span is written in <literal>ja-JP</literal>, in <literal>en-US</literal> or in a mix of both languages.</para><para>However, a partial workaround for this limitation is to specify for example <literal>-p convert.set-lang ja-JP</literal> or <literal>-p convert.default-lang ja-JP</literal>. When <link linkend="param_set_lang">parameter convert.set-lang</link><indexterm><primary>set-lang, parameter</primary></indexterm> or <link linkend="param_default_lang">parameter convert.default-lang</link><indexterm><primary>default-lang, parameter</primary></indexterm> is set to a language code starting with <literal>ja</literal>, <literal>zh</literal> or <literal>ko</literal>, then it is attribute <literal>w:lang/@w:eastAsia</literal> which is used to determine the language of a text span and not attribute <literal>w:lang/@w:val</literal>.</para><para>Note that <literal>-p convert.default-lang ja-JP</literal> is just used as a <emphasis>hint</emphasis> to favor attribute <literal>w:lang/@w:eastAsia</literal> over attribute <literal>w:lang/@w:val</literal>. 
Given the way MS-Word sets these two attributes, using parameter <literal>-p convert.default-lang ja-JP</literal> will <emphasis>not</emphasis> cause a vastly incorrect detection of the language when converting a German DOCX file for example.</para></listitem><listitem><para>w2x can generate DITA <literal>indexterm</literal> elements having <literal>index-sort-as</literal> children and DocBook <literal>indexterm</literal>/<literal>primary</literal>, <literal>secondary</literal>, <literal>tertiary</literal> elements having <literal>sortas</literal> attributes. For this to happen, the input DOCX file must contain <literal>XE</literal> (index entry) fields having <literal>\y "<replaceable>yomi</replaceable>"</literal> (first phonetic character for sorting indexes) field arguments.</para><para>Unlike MS-Word which considers <literal>\y "<replaceable>yomi</replaceable>"</literal> only for East Asian languages, w2x uses this <literal>XE</literal> field argument to sort the index entries <emphasis>whatever the language of the document</emphasis>. English examples: <literal>{XE "&lt;span&gt;" \y "span"}</literal>, <literal>{XE "Operation:+" \y ":Addition"}</literal>.</para></listitem></itemizedlist><section xml:id="about_tab_stops"><title>About tab stops</title><para>Tab stops<indexterm><primary>tab stops</primary></indexterm> are converted to <literal>&lt;span class="role-tab"&gt; &lt;/span&gt;</literal>. These <literal>span</literal> elements are processed as follows:</para><itemizedlist><listitem><para>When generating styled HTML (that is, XHTML+CSS), some JavaScript™ code (<literal><replaceable>w2x_install_dir</replaceable>/xed/expand-tabs.js</literal>) is added to the output file. This code computes and gives a width to all <literal>&lt;span class="role-tab"&gt; &lt;/span&gt;</literal>. 
This makes it possible to emulate tab stops decently in any modern Web browser.</para><para>If you don't want this code to be added to the output file, pass option <literal>-p edit.do.expand-tabs ""</literal> to w2x.</para></listitem><listitem><para>When generating semantic XHTML and all the other semantic XML formats (DocBook, DITA, etc), it's possible to convert consecutive paragraphs containing text runs aligned on tab stops to a borderless table.</para><para>However, because it is not possible in the general case to emulate tab stops using tables, the XED script which does this is disabled by default. If you really want to emulate tab stops using tables, pass option <literal>-p edit.convert-tabs.to-table yes</literal> to w2x.</para></listitem></itemizedlist></section></chapter><chapter xml:id="automatic_index"><title>Index</title><para/></chapter></book>