Chapter 6. Writing a validateHook

Table of Contents

1. What is a validation hook?
2. Compiling and running the code sample
3. Implementing the ValidateHook interface

1. What is a validation hook?

A validateHook is Java™ code that XXE notifies before and after a document is checked for validity. A document is checked for validity on demand (menu item Tools → Check Validity) and also automatically, just after being opened and just before being saved to disk.

This mechanism exists to perform semantic validation beyond what a DTD or schema can express.

In some cases, an alternative to writing a validation hook in Java™ is to write a Schematron and declare it in the configuration file by means of a schematron configuration element, described in XMLmind XML Editor - Configuration and Deployment.

2. Compiling and running the code sample

The code sample used in this tutorial is a validation hook which:

  • checks that the src attribute of the img element is not an absolute file path;

  • checks that the value of the name attribute of an a element is not already in use;

  • checks that, for each href attribute of an a element starting with # (a local reference), an element with that name or id exists.

Compile this validation hook by executing ant (see build.xml) in samples/check_links/. The build creates checklinks.jar. Then test the validation hook as follows:

  1. Copy checklinks.incl and checklinks.jar to XXE_install_directory/addon/config/xhtml/.

  2. Include checklinks.incl in the XXE configuration file for XHTML, which is XXE_install_directory/addon/config/xhtml/xhtml_strict.xxe.

    How to deploy a validation hook is detailed in Section 33, “validateHook” in XMLmind XML Editor - Configuration and Deployment.

  3. Restart XXE.

  4. Clear the Quick Start cache (Options → Preferences, Advanced|Cached Data section; see XMLmind XML Editor - Online Help), then restart XXE one more time. If you forget to do this, XXE will fail to see your extension.

  5. Open samples/tests/in/sample2.html in XXE and examine all the problems found by the validation hook for this file (click the Validity tool tab to display the semantic warnings).

3. Implementing the ValidateHook interface

The ValidateHook interface has two methods:

Method            Description

checkingDocument  Invoked before a document conforming to a DTD or schema is validated. Validation hooks that automatically fix problems in the document being edited must therefore implement checkingDocument.

documentChecked   Invoked after a document conforming to a DTD or schema has been validated. Validation hooks that report semantic errors must therefore implement documentChecked.

The validateHook used as an example in this tutorial only needs to implement the documentChecked method. It therefore extends the adapter class ValidateHookBase rather than implementing the above interface directly.

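import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
// The XXE API classes used below (Name, Document, Element, Traversal,
// Diagnostic, DiagnosticImpl, ValidateHookBase) must be imported as well.

public class CheckLinks extends ValidateHookBase {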
    private static final Name SRC = Name.get("src");
    private static final Name NAME = Name.get("name");
    private static final Name ID = Name.get("id");
    private static final Name HREF = Name.get("href");

    public Diagnostic[] documentChecked(Document doc, boolean canceled,
                                        Diagnostic[] diagnostics) {
        if (canceled) // (1)
            return diagnostics;

        final ArrayList<DiagnosticImpl> warnings =
            new ArrayList<DiagnosticImpl>();
        final ArrayList<Element> links = new ArrayList<Element>();
        final HashMap<String, List<Element>> anchors =
            new HashMap<String, List<Element>>();

        Traversal.traverse(doc.getRootElement(), new Traversal.HandlerBase() { // (2)
            public Object enterElement(Element element) {
                String localName = element.getLocalName();

                String anchorName = null;

                if ("img".equals(localName)) { // (3)
                    String src = element.getAttribute(SRC);

                    if (src != null) {
                        if (src.startsWith("file:/") ||
                            src.startsWith("/") ||
                            src.startsWith("\\\\") ||
                            (src.length() >= 3 &&
                             Character.isLetter(src.charAt(0)) &&
                             src.regionMatches(1, ":\\", 0, 2))) {
                            warnings.add(new DiagnosticImpl(
                              element, 
                              "src attribute looks like an absolute file path",
                              Diagnostic.Severity.SEMANTIC_WARNING));
                        }
                    }
                } else if ("a".equals(localName)) {
                    String href = element.getAttribute(HREF);

                    if (href != null) {
                        if (href.startsWith("#")) // (4)
                            links.add(element);
                    } else {
                        anchorName = element.getAttribute(NAME); // (5)
                        if (anchorName != null) {
                            List<Element> elements = anchors.get(anchorName);
                            if (elements == null) {
                                elements = new ArrayList<Element>();
                                anchors.put(anchorName, elements);
                            }

                            elements.add(element);
                        }
                    }
                }

                String id = element.getAttribute(ID); // (6)
                if (id != null && !id.equals(anchorName)) {
                    List<Element> elements = anchors.get(id);
                    if (elements == null) {
                        elements = new ArrayList<Element>();
                        anchors.put(id, elements);
                    }

                    elements.add(element);
                }

                return null;
            }
        });

        int count = links.size();
        for (int i = 0; i < count; ++i) { // (7)
            Element element = links.get(i);

            String id = element.getAttribute(HREF).substring(1);

            if (!anchors.containsKey(id))
                warnings.add(new DiagnosticImpl(
                    element, 
                    "reference to non-existent name or id '" + id + "'", 
                    Diagnostic.Severity.SEMANTIC_WARNING));
        }

        Iterator<Map.Entry<String, List<Element>>> iter = 
            anchors.entrySet().iterator(); // (8)
        while (iter.hasNext()) {
            Map.Entry<String, List<Element>> entry = iter.next();

            String id = entry.getKey();
            List<Element> elements = entry.getValue();

            count = elements.size();
            for (int i = 1; i < count; ++i) {
                warnings.add(new DiagnosticImpl(
                    elements.get(i), 
                    "name or id '" + id + "' already defined", 
                    Diagnostic.Severity.SEMANTIC_WARNING));
            }
        }

        int warningCount = warnings.size(); // (9)
        if (warningCount > 0) {
            int diagnosticCount = diagnostics.length;
            Diagnostic[] diagnostics2 =
                new Diagnostic[diagnosticCount + warningCount];

            System.arraycopy(diagnostics, 0, diagnostics2, 0, diagnosticCount);
            for (int i = 0; i < warningCount; ++i)
                diagnostics2[diagnosticCount+i] = warnings.get(i);

            diagnostics = diagnostics2;
        }

        return diagnostics;
    }
}

1

If the checkingDocument method has been invoked, the documentChecked method is guaranteed to be invoked too, even if the canceled argument passed to it is true.

2

The document is traversed using the Traversal utility. The anonymous Traversal.Handler

  • will check <img src="..."> and will possibly add semantic warnings to ArrayList warnings,

  • will add elements <a href="#..."> to ArrayList links,

  • will add elements <a name="..."> or elements having an id attribute to HashMap anchors.

3

img elements are checked here.

If the value of the src attribute looks like an absolute file path, a DiagnosticImpl structure describing the semantic warning is added to ArrayList warnings.
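The heuristic used for this check can be tried out on its own, without XXE. The following is a minimal sketch that copies the conditions of the listing into a standalone method; the class name AbsolutePathHeuristic and the method name looksLikeAbsoluteFilePath are hypothetical and are not part of the code sample.

```java
final class AbsolutePathHeuristic {
    private AbsolutePathHeuristic() {}

    /** Returns true if src looks like an absolute file path or a file: URL. */
    static boolean looksLikeAbsoluteFilePath(String src) {
        if (src == null) {
            return false;
        }
        return src.startsWith("file:/")       // file: URL
            || src.startsWith("/")            // Unix-style absolute path
            || src.startsWith("\\\\")         // Windows UNC path (\\server\share)
            || (src.length() >= 3             // Windows drive path (C:\...)
                && Character.isLetter(src.charAt(0))
                && src.regionMatches(1, ":\\", 0, 2));
    }

    public static void main(String[] args) {
        String[] samples = {
            "images/logo.png",
            "/usr/share/logo.png",
            "C:\\images\\logo.png",
            "\\\\server\\share\\logo.png",
            "file:/tmp/logo.png",
            "C:/images/logo.png"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + looksLikeAbsoluteFilePath(sample));
        }
    }
}
```

Note that the drive-letter test only matches a backslash after the colon, so a value such as C:/images/logo.png is not flagged by this heuristic.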

4

Elements <a href="#..."> are added to the ArrayList links here. Verification is done in a subsequent pass.

5

Elements <a name="..."> are added to HashMap anchors here. Verification is done in a subsequent pass.

6

Elements having an id attribute are added to HashMap anchors here. Verification is done in a subsequent pass.

Note that an a element often has both a name and an id attribute set to the same value, and that this should not be considered an error.
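The rule about identical name and id values can be illustrated in isolation. The following is a simplified sketch, assuming elements are represented by plain string labels instead of XXE Element objects; the class AnchorRegistry and its methods are hypothetical names introduced for this illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AnchorRegistry {
    // Anchor name or id -> labels of the elements declaring it, in document order.
    private final Map<String, List<String>> anchors = new LinkedHashMap<>();

    /** Records one element; name and id may each be null. */
    void register(String elementLabel, String name, String id) {
        if (name != null) {
            add(name, elementLabel);
        }
        // Skip the id when it merely repeats the name of the same element,
        // so that <a name="x" id="x"> is recorded once, not twice.
        if (id != null && !id.equals(name)) {
            add(id, elementLabel);
        }
    }

    private void add(String key, String elementLabel) {
        anchors.computeIfAbsent(key, k -> new ArrayList<>()).add(elementLabel);
    }

    /** Returns the labels of all elements declaring the given name or id. */
    List<String> declarations(String key) {
        List<String> found = anchors.get(key);
        return found == null ? new ArrayList<>() : found;
    }
}
```

With this sketch, registering one element with name and id both set to intro yields a single declaration for intro, while a second element reusing intro as its id yields two.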

7

Elements contained in ArrayList links referencing an unknown name or id are detected here.
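This second pass boils down to a set-membership test. The following is a minimal sketch, assuming href values and anchor names are available as plain strings rather than XXE Element objects; the class LocalRefCheck and the method findDanglingRefs are hypothetical names, not part of the code sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class LocalRefCheck {
    private LocalRefCheck() {}

    /**
     * Returns the targets of local references ("#target") that match
     * none of the known anchor names or ids.
     */
    static List<String> findDanglingRefs(List<String> hrefs, Set<String> anchorNames) {
        List<String> dangling = new ArrayList<>();
        for (String href : hrefs) {
            if (!href.startsWith("#")) {
                continue; // Not a local reference: out of scope for this check.
            }
            String target = href.substring(1); // Drop the leading '#'.
            if (!anchorNames.contains(target)) {
                dangling.add(target);
            }
        }
        return dangling;
    }
}
```

Collecting all anchors first and checking references afterwards is what allows a link to point to an anchor that appears later in the document.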

8

Elements contained in HashMap anchors having a name or id already in use are detected here.
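The duplicate detection relies on the fact that each map entry lists every element declaring a given name or id; everything after the first entry of that list is a duplicate. The following is a simplified sketch using string labels in place of XXE Element objects; the class DuplicateAnchorReport and the method duplicateWarnings are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DuplicateAnchorReport {
    private DuplicateAnchorReport() {}

    /** Builds one warning message per element that redeclares a name or id. */
    static List<String> duplicateWarnings(Map<String, List<String>> anchors) {
        List<String> warnings = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : anchors.entrySet()) {
            List<String> elements = entry.getValue();
            // Index 0 is the first declaration, which is considered legitimate;
            // the loop therefore starts at index 1.
            for (int i = 1; i < elements.size(); i++) {
                warnings.add(elements.get(i) + ": name or id '"
                             + entry.getKey() + "' already defined");
            }
        }
        return warnings;
    }
}
```

The listing stores anchors in a HashMap, so the order in which warnings for different names are produced is unspecified; passing a LinkedHashMap to this sketch keeps the output in insertion order.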

9

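If semantic warnings have been found for the document, a new array is created containing the Diagnostic objects passed as an argument followed by these warnings, and this augmented array is returned as the result of the documentChecked method. Otherwise, the array passed as an argument is returned unchanged.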
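The array handling of this last step can also be written with java.util.Arrays.copyOf instead of System.arraycopy. The following is a minimal sketch of that alternative, not the code of the sample; the class ArrayAppend and the method append are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

final class ArrayAppend {
    private ArrayAppend() {}

    /**
     * Returns an array containing the items of existing followed by the
     * items of extra. When extra is empty, existing itself is returned.
     */
    static <T> T[] append(T[] existing, List<? extends T> extra) {
        if (extra.isEmpty()) {
            return existing;
        }
        // copyOf allocates a longer array of the same runtime type and
        // copies the existing items; the new slots are initially null.
        T[] result = Arrays.copyOf(existing, existing.length + extra.size());
        for (int i = 0; i < extra.size(); i++) {
            result[existing.length + i] = extra.get(i);
        }
        return result;
    }
}
```

As in the listing, the array received as an argument is never modified: either it is returned as is, or a new, longer array is returned.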