XMLmind Word To XML Manual

Explains how to install and use XMLmind Word To XML (w2x for short), how to customize the output of w2x and how to embed a w2x processor in a Java™ application.

Hussein Shafie
XMLmind Software
35 rue Louis Leblanc
78120 Rambouillet
France
Phone: +33 (0)9 52 80 80 37
Web: www.xmlmind.com/w2x/
Email: w2x-support@xmlmind.com (public mailing list)

Contents

1 Introduction

2 Installing w2x

2.1 Contents of the installation directory

3 Alternatives to using the w2x command-line utility

3.1 The w2x-app graphical application

3.2 The “Word To XML” add-on for XMLmind XML Editor

3.2.1 Installing the “Word To XML” add-on

3.3 The “Word To XML” servlet

3.3.1 Contents of the servlet software distribution

3.3.2 Installing the servlet

3.3.3 Configuring the servlet

3.3.4 Using the servlet to convert DOCX files

3.3.5 Non-interactive requests

4 Getting started with w2x

4.1 How to generate useful multi-page HTML

5 Going further with w2x

5.1 Stock XED scripts

6 Customizing the output of w2x

6.1 Customizing the XHTML+CSS files generated by w2x

6.1.1 Using a XED script to modify the styles embedded in the XHTML+CSS file

6.1.2 Appending custom styles to the styles embedded in the XHTML+CSS file

6.1.3 Using an external CSS file rather than embedded CSS styles

6.1.4 Combining all the above methods

6.2 Customizing the semantic XML files generated by w2x

6.2.1 Converting custom character styles to semantic tags

6.2.2 Converting custom paragraph styles to semantic tags

6.2.3 The general case

6.3 Generating XML conforming to a custom schema

6.4 Packaging your customization as a w2x plugin

6.4.1 Anatomy of a plugin

6.4.2 Registering a plugin with w2x

7 The w2x command-line utility

7.1 Variables substituted in the parameter values passed to the -p and -pu options

7.2 Default conversion steps

7.3 Automatic conversion step parameters

8 Conversion step reference

8.1 Convert step

8.2 Delete files step

8.3 Edit step

8.4 EPUB step

8.5 Load step

8.6 Save step

8.7 Split step

8.8 Transform step

8.9 Web Help step

9 Embedding w2x in a Java™ application

9.1 Extension points

9.1.1 Custom conversion step

9.1.2 Custom image converters

9.1.2.1 Specifying an external image converter

9.1.2.2 Controlling how image files found in the input DOCX file are converted to standard formats

10 Limitations and implementation specificities

10.1 About tab stops

Index