Chapter 9. Writing a validateHook

Table of Contents

1. Implementing the ValidateHook interface

A validateHook is Java™ code that XXE notifies before and after a document is checked for validity. A document is checked for validity on demand (menu item Tools → Check Validity), but also automatically, just after being opened and just before being saved to disk.

This mechanism has been created to perform semantic validation beyond what can be done using a DTD or schema.

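A validateHook which:

  • checks that the src attribute of img elements does not look like an absolute file path,

  • checks that each <a href="#..."> references an existing name or id,

  • checks that no name or id is defined more than once,

has been written to be used as an example in this tutorial.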

Compile this validateHook by executing ant (see build.xml) in samples/checklinks/. The build creates checklinks.jar. Then test the validateHook by:

  1. Copying checklinks.incl and checklinks.jar to XXE_install_directory/addon/config/xhtml/.

  2. Including checklinks.incl in the XXE configuration file for XHTML which is XXE_install_directory/addon/config/xhtml/xhtml_strict.xxe.

  3. Restarting XXE.

  4. Loading tests/in/sample2.html into XXE and examining all the problems found by the validateHook for this file (click on the Validity tool tab to display the semantic warnings).

How to deploy a validateHook is detailed in Section 28, “validateHook” in XMLmind XML Editor - Configuration and Deployment.

1. Implementing the ValidateHook interface

The ValidateHook interface is very easy to understand:

Method            Description
checkingDocument  Invoked before a document conforming to a schema is validated.
documentChecked   Invoked after a document conforming to a schema has been validated.

The validateHook used as an example in this tutorial only needs to implement the documentChecked method. It therefore extends the adapter class ValidateHookBase rather than implementing the above interface directly.
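The adapter idea can be tried out in isolation. The following is a minimal sketch using stand-in types: SimpleHook, SimpleHookBase and EmptyDocHook are hypothetical names invented for this illustration and are not part of the XXE API. Documents and diagnostics are modeled as plain strings.

```java
import java.util.Arrays;

// Stand-in for a two-method hook interface (hypothetical, NOT the XXE interface).
interface SimpleHook {
    void beforeCheck(String doc);
    String[] afterCheck(String doc, boolean canceled, String[] diagnostics);
}

// Adapter: implements every method with a do-nothing default.
class SimpleHookBase implements SimpleHook {
    @Override
    public void beforeCheck(String doc) {
        // Nothing to do by default.
    }

    @Override
    public String[] afterCheck(String doc, boolean canceled,
                               String[] diagnostics) {
        // By default, pass the diagnostics through unchanged.
        return diagnostics;
    }
}

// A hook that only cares about the "after" notification overrides one method.
final class EmptyDocHook extends SimpleHookBase {
    @Override
    public String[] afterCheck(String doc, boolean canceled,
                               String[] diagnostics) {
        if (canceled || !doc.isEmpty())
            return diagnostics;

        // Append one warning to a copy of the incoming array.
        String[] result = Arrays.copyOf(diagnostics, diagnostics.length + 1);
        result[diagnostics.length] = "document is empty";
        return result;
    }
}
```

With this arrangement, a subclass such as EmptyDocHook does not have to provide an empty beforeCheck method, which is the convenience an adapter class offers.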

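import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
// Imports of the XXE classes used below (Name, Document, Element, Traversal,
// Diagnostic, DiagnosticImpl, ValidateHookBase) are not shown here; see the
// sample source in samples/checklinks/.

public class CheckLinks extends ValidateHookBase {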
    private static final Name SRC = Name.get("src");
    private static final Name NAME = Name.get("name");
    private static final Name ID = Name.get("id");
    private static final Name HREF = Name.get("href");

    public Diagnostic[] documentChecked(Document doc, boolean canceled,
                                        Diagnostic[] diagnostics) {
        if (canceled) // (1)
            return diagnostics;

        final ArrayList<DiagnosticImpl> warnings =
            new ArrayList<DiagnosticImpl>();
        final ArrayList<Element> links = new ArrayList<Element>();
        final HashMap<String, List<Element>> anchors =
            new HashMap<String, List<Element>>();

        Traversal.traverse(doc.getRootElement(), new Traversal.HandlerBase() { // (2)
            public Object enterElement(Element element) {
                String localName = element.getLocalName();

                String anchorName = null;

                if ("img".equals(localName)) { // (3)
                    String src = element.getAttribute(SRC);

                    if (src != null) {
                        if (src.startsWith("file:/") ||
                            src.startsWith("/") ||
                            src.startsWith("\\\\") ||
                            (src.length() >= 3 &&
                             Character.isLetter(src.charAt(0)) &&
                             src.regionMatches(1, ":\\", 0, 2))) {
                            warnings.add(new DiagnosticImpl(
                              element, 
                              "src attribute looks like an absolute file path",
                              Diagnostic.Severity.SEMANTIC_WARNING));
                        }
                    }
                } else if ("a".equals(localName)) {
                    String href = element.getAttribute(HREF);

                    if (href != null) {
                        if (href.startsWith("#")) // (4)
                            links.add(element);
                    } else {
                        anchorName = element.getAttribute(NAME); // (5)
                        if (anchorName != null) {
                            List<Element> elements = anchors.get(anchorName);
                            if (elements == null) {
                                elements = new ArrayList<Element>();
                                anchors.put(anchorName, elements);
                            }

                            elements.add(element);
                        }
                    }
                }

                String id = element.getAttribute(ID); // (6)
                if (id != null && !id.equals(anchorName)) {
                    List<Element> elements = anchors.get(id);
                    if (elements == null) {
                        elements = new ArrayList<Element>();
                        anchors.put(id, elements);
                    }

                    elements.add(element);
                }

                return null;
            }
        });

        int count = links.size();
        for (int i = 0; i < count; ++i) { // (7)
            Element element = links.get(i);

            String id = element.getAttribute(HREF).substring(1);

            if (!anchors.containsKey(id))
                warnings.add(new DiagnosticImpl(
                    element, 
                    "reference to non-existent name or id '" + id + "'", 
                    Diagnostic.Severity.SEMANTIC_WARNING));
        }

        Iterator<Map.Entry<String, List<Element>>> iter = 
            anchors.entrySet().iterator(); // (8)
        while (iter.hasNext()) {
            Map.Entry<String, List<Element>> entry =  iter.next();

            String id = entry.getKey();
            List<Element> elements = entry.getValue();

            count = elements.size();
            for (int i = 1; i < count; ++i) {
                warnings.add(new DiagnosticImpl(
                    elements.get(i), 
                    "name or id '" + id + "' already defined", 
                    Diagnostic.Severity.SEMANTIC_WARNING));
            }
        }

        int warningCount = warnings.size(); // (9)
        if (warningCount > 0) {
            int diagnosticCount = diagnostics.length;
            Diagnostic[] diagnostics2 =
                new Diagnostic[diagnosticCount + warningCount];

            System.arraycopy(diagnostics, 0, diagnostics2, 0, diagnosticCount);
            for (int i = 0; i < warningCount; ++i)
                diagnostics2[diagnosticCount+i] = warnings.get(i);

            diagnostics = diagnostics2;
        }

        return diagnostics;
    }
}
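The test applied to the src attribute of img elements in the listing above can be isolated and tried without XXE. The sketch below copies the same four conditions into a stand-alone method; the class name AbsolutePathHeuristic and the method name looksAbsolute are illustrative and do not appear in the sample.

```java
// Stand-alone copy of the img/@src heuristic (names are illustrative).
final class AbsolutePathHeuristic {
    private AbsolutePathHeuristic() {}

    /** Returns true if src looks like an absolute file path or a file: URL. */
    static boolean looksAbsolute(String src) {
        if (src == null)
            return false;

        return src.startsWith("file:/")       // file: URL
            || src.startsWith("/")            // Unix-style absolute path
            || src.startsWith("\\\\")         // Windows UNC path: \\server\share
            || (src.length() >= 3             // Windows drive path: C:\images
                && Character.isLetter(src.charAt(0))
                && src.regionMatches(1, ":\\", 0, 2));
    }
}
```

As written, the heuristic flags C:\images\logo.png but not C:/images/logo.png, because only a backslash is accepted after the drive letter and colon. Relative paths such as images/logo.png and http: URLs are not flagged.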

1

If the checkingDocument method has been invoked, the documentChecked method is guaranteed to be invoked too, even if the passed canceled argument is true.

2

The document is traversed using the Traversal utility. The anonymous Traversal.Handler:

  • will check <img src="..."> and will possibly add semantic warnings to ArrayList warnings,

  • will add elements <a href="#..."> to ArrayList links,

  • will add elements <a name="..."> or elements having an id attribute to HashMap anchors.

3

img elements are checked here.

If the value of the src attribute looks like an absolute file path, a DiagnosticImpl structure describing the semantic warning is added to ArrayList warnings.

4

Elements <a href="#..."> are added to the ArrayList links here. Verification is done in a subsequent pass.

5

Elements <a name="..."> are added to HashMap anchors here. Verification is done in a subsequent pass.

6

Elements having an id attribute are added to HashMap anchors here. Verification is done in a subsequent pass.

Note that an a element often has both a name and an id attribute set to the same value, and that this should not be considered an error.

7

Elements contained in ArrayList links referencing an unknown name or id are detected here.

8

Elements contained in HashMap anchors having a name or id already in use are detected here.

9

If semantic warnings have been found for the document, they are appended to a copy of the Diagnostic array passed as an argument, and this augmented array is returned as the result of the documentChecked method. Otherwise, the array is returned unchanged.
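The bookkeeping described in callouts 5 to 9 can be exercised without XXE by modeling elements as plain string labels. The sketch below is a simplified model under stated assumptions, not the sample's code: the class LinkBookkeeping and its methods are hypothetical, diagnostics are plain strings, and a LinkedHashMap replaces the HashMap of the listing so that results come out in insertion order. Unlike the listing, addAnchor does not distinguish a elements from other elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-string model of the link bookkeeping (hypothetical, not the XXE API).
final class LinkBookkeeping {
    // Anchor name or id -> labels of the elements defining it.
    final Map<String, List<String>> anchors = new LinkedHashMap<>();
    // Values of href attributes that start with '#'.
    final List<String> links = new ArrayList<>();

    /** Callouts 5 and 6: record name and id; name == id counts only once. */
    void addAnchor(String name, String id, String label) {
        if (name != null)
            anchors.computeIfAbsent(name, k -> new ArrayList<>()).add(label);
        if (id != null && !id.equals(name))
            anchors.computeIfAbsent(id, k -> new ArrayList<>()).add(label);
    }

    /** Callout 7: targets of local links that no element defines. */
    List<String> unknownTargets() {
        List<String> unknown = new ArrayList<>();
        for (String href : links) {
            String target = href.substring(1); // drop the leading '#'
            if (!anchors.containsKey(target))
                unknown.add(target);
        }
        return unknown;
    }

    /** Callout 8: every definition after the first one is a redefinition. */
    List<String> redefinitions() {
        List<String> extra = new ArrayList<>();
        for (List<String> labels : anchors.values()) {
            for (int i = 1; i < labels.size(); ++i)
                extra.add(labels.get(i));
        }
        return extra;
    }

    /** Callout 9: return the same array, or an extended copy if needed. */
    static String[] append(String[] diagnostics, List<String> warnings) {
        if (warnings.isEmpty())
            return diagnostics;

        String[] result =
            Arrays.copyOf(diagnostics, diagnostics.length + warnings.size());
        for (int i = 0; i < warnings.size(); ++i)
            result[diagnostics.length + i] = warnings.get(i);
        return result;
    }
}
```

Collecting links and anchors first and verifying them afterwards is what allows a link to reference an anchor that appears later in the document.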