Contents

1 Introduction 4

2 Installing w2x 5

2.1 Contents of the installation directory 6

3 Alternatives to using the w2x command-line utility 9

3.1 The w2x-app graphical application 9

3.2 The “Word To XML” add-on for XMLmind XML Editor 9

3.2.1 Installing the “Word To XML” add-on 10

3.3 The “Word To XML” servlet 10

3.3.1 Contents of the servlet software distribution 10

3.3.2 Installing the servlet 11

3.3.3 Configuring the servlet 11

3.3.4 Using the servlet to convert DOCX files 12

3.3.5 Non-interactive requests 13

4 Getting started with w2x 15

4.1 How to generate useful multi-page HTML 17

5 Going further with w2x 19

5.1 Stock XED scripts 21

6 Customizing the output of w2x 24

6.1 Customizing the XHTML+CSS files generated by w2x 24

6.1.1 Using a XED script to modify the styles embedded in the XHTML+CSS file 24

6.1.2 Appending custom styles to the styles embedded in the XHTML+CSS file 24

6.1.3 Using an external CSS file rather than embedded CSS styles 25

6.1.4 Combining all the above methods 26

6.2 Customizing the semantic XML files generated by w2x 27

6.2.1 Converting custom character styles to semantic tags 27

6.2.2 Converting custom paragraph styles to semantic tags 28

6.2.3 The general case 30

6.3 Generating XML conforming to a custom schema 33

7 The w2x command-line utility 34

7.1 Variables substituted in the parameter values passed to the -p and -pu options 36

7.2 Default conversion steps 37

7.3 Automatic conversion step parameters 37

8 Conversion step reference 38

8.1 Convert step 38

8.2 Delete files step 40

8.3 Edit step 40

8.4 EPUB step 49

8.5 Load step 49

8.6 Save step 50

8.7 Split step 50

8.8 Transform step 52

8.9 Web Help step 56

9 Embedding w2x in a Java™ application 58

9.1 Extension points 59

9.1.1 Custom conversion step 59

9.1.2 Custom image converters 59

9.1.2.1 Specifying an external image converter 60

9.1.2.2 Controlling how image files found in the input DOCX file are converted to standard formats 61

10 Limitations and implementation specificities 63

10.1 About tab stops 65

Index 66