2.3. Convert ImageDemo document to HTML

The ImageDemo configuration has been created to teach external consultants and local gurus how to configure XXE for XML documents embedding binary or XML images.

<command name="imgd.convertToHTML">
  <macro>
    <sequence>
      <command name="selectFile" parameter="saveFileURL" />
      <command name="imgd.toHTML" parameter='"%_"' />
    </sequence>
  </macro>
</command>

<command name="imgd.toHTML">
  <process>
    <mkdir dir="resources" />
    <mkdir dir="raw" />
    <copyDocument to="__doc.xml">
      <extract xpath="//imgd:image_ab/@data | //imgd:image_eb" toDir="raw"> <!-- 1 -->
        <processingInstruction target="extracted" 
                               data="resources/{$url.rootName}.png" />
      </extract>
      <extract xpath="//imgd:*/svg:svg" toDir="raw">
        <processingInstruction target="extracted" 
                               data="resources/{$url.rootName}.png" />
      </extract>

      <resources match="(https|http|ftp)://.*" />
      <resources match=".+\.(png|jpg|jpeg|gif)" 
                 copyTo="resources" />
      <resources match="(?:.+/)?(.+)\.(\w+)"
                 copyTo="raw" referenceAs="resources/$1.png" />
      <resources match=".+" 
                 copyTo="resources" />
    </copyDocument>

    <convertImage from="raw" to="resources" format="png" />

    <mkdir dir="xslt_graphics" />
    <copyProcessResources resources="xslt_graphics/*" to="xslt_graphics" />

    <transform stylesheet="html.xslt" 
               file="__doc.xml" to="__doc.html"/>

    <upload base="%0"> <!-- 2 -->
      <copyFile file="__doc.html" to="%0" />
      <copyFiles files="resources/*" toDir="resources" />
      <copyFiles files="xslt_graphics/*" toDir="xslt_graphics" />
    </upload>
  </process>
</command>

This example is very similar to the previous one, so if you followed that one, you can follow this one too. The main differences are:

1

Instead of extracting the SVG graphics from svg:svg and replacing this element with another one such as imgd:image_au, it is much simpler to insert an extracted processing instruction inside imgd:image_ab, imgd:image_eb and svg:svg.

Doing this spares the effort of copying all the image geometry attributes (width, height, content_width, content_height, etc.) from the extracted element to the replacement imgd:image_au element.

2

Unlike an RTF file, an HTML file is not self-contained. All the graphics files found in resources/ and in xslt_graphics/ must be copied along with the generated HTML file.
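
To illustrate why the extracted processing instruction described in callout 1 is convenient, here is a minimal XSLT 1.0 sketch of how a stylesheet might consume it. This is not the html.xslt shipped with the ImageDemo configuration: the imgd namespace URI is a placeholder, the sample file name and attribute values are hypothetical, and it assumes that the element still carries its geometry attributes after the extract step.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. "urn:x-example:imgd" is a placeholder, not the real
     ImageDemo namespace URI. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:imgd="urn:x-example:imgd"
                xmlns:svg="http://www.w3.org/2000/svg"
                exclude-result-prefixes="imgd svg">

  <!-- Assumed input after the extract step (hypothetical values):
         <imgd:image_eb width="200" height="100">
           <?extracted resources/figure1.png?>
         </imgd:image_eb> -->
  <xsl:template match="imgd:image_ab | imgd:image_eb | svg:svg">
    <!-- The processing instruction data is the path of the converted PNG. -->
    <img src="{processing-instruction('extracted')}">
      <!-- The geometry attributes are read from the original element,
           so nothing had to be copied to a replacement element. -->
      <xsl:copy-of select="@width | @height" />
    </img>
  </xsl:template>

</xsl:stylesheet>
```

Because a single template can match all three kinds of elements and read the image path from the same processing instruction, the process command does not need a separate replacement element for each image type.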