Docx Subdomain

This module provides a mail-merge capability for merging input data into an MS Word .docx template. The generated output document is either a Word .docx or a PDF. (Be aware that exporting to PDF requires more memory.)

The module consists of a single domain service, DocxService. This provides an API to merge a .docx template against its input data. The input data is represented as a simple HTML file.

The service supports several data types:

  • plain text

  • rich text

  • date

  • bulleted lists

  • tables

The implementation uses docx4j and jdom2. Data binding to custom XML parts (the .docx file format’s built-in support) is not used, because repeating datasets, which are required for lists and tables, were not supported prior to Word 2013.

Dependency Management

If your application inherits from the Apache Isis starter app (org.apache.isis.app:isis-app-starter-parent) then that will define the version automatically:

pom.xml
<parent>
    <groupId>org.apache.isis.app</groupId>
    <artifactId>isis-app-starter-parent</artifactId>
    <version>2.0.0-M6</version>
    <relativePath/>
</parent>

Alternatively, import the core BOM. This is usually done in the top-level parent pom of your application:

pom.xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.apache.isis.core</groupId>
            <artifactId>isis-core</artifactId>
            <version>2.0.0-M6</version>
            <scope>import</scope>
            <type>pom</type>
        </dependency>
    </dependencies>
</dependencyManagement>

In addition, add a section for the BOM of all subdomains:

pom.xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.apache.isis.subdomains</groupId>
            <artifactId>isis-subdomains</artifactId>
            <scope>import</scope>
            <type>pom</type>
            <version>2.0.0-M6</version>
        </dependency>
    </dependencies>
</dependencyManagement>

Dependencies

In the domain module(s) of your application, add the following dependency:

pom.xml
<dependencies>
    <dependency>
        <groupId>org.apache.isis.subdomains</groupId>
        <artifactId>isis-subdomains-docx-applib</artifactId>
    </dependency>
</dependencies>

To output to PDF, the following dependency must also be added:

pom.xml
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-export-fo</artifactId>
</dependency>

Usage

The .docx templates use Word custom content controls as placeholders. To work with these placeholders, enable the "Developer" tab in Word's ribbon:

[Figure: enabling the Developer ribbon in Word]

You can then toggle on Design Mode to create/edit/remove custom content controls.

For example, see Template.docx.

[Figure: Template.docx]

To programmatically mail-merge into the template, we create an HTML document that provides the input. For example:

input HTML
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<html>
  <body>
    <p id="PPSN" class="plain">1234567A</p>
    <p id="CustomerName" class="plain">Mrs Fidelma O'Leary</p>
    <p id="Date" class="date">31/1/2012</p>
    <p id="Decision" class="rich">
      I am writing to you about your claim for jobseeker's credits. I have
      decided that you are not entitled to this benefit.
    </p>
    <p id="Decision2" class="rich">
      What follows below is a table that has been merged in, adding
      additional rows dynamically as necessary based on the input data.
    </p>
    <ul id="Reasons2">
      <li>
        <p>This is some reason text (without a following
          paragraph)</p>
      </li>
      <li>
        <p>This would be some additional reason text</p>
        <p>This reason has one additional text paragraph, eg documenting
          the grounds</p>
      </li>
      <!-- ... -->
    </ul>
    <table id="Relatives">
      <tr>
        <td>Charlie O'Leary</td>
        <td>Husband</td>
        <td></td>
      </tr>
      <tr>
        <td>Mary O'Leary</td>
        <td>Daughter</td>
        <td>14</td>
      </tr>
      <!-- ... -->
    </table>
  </body>
</html>
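
The input HTML does not have to be written by hand. The following is a minimal sketch of assembling it from application data, assuming the structure shown above (a `<p>` per field whose `id` names the content control and whose `class` names the data type, and a `<table>` of `<tr>`/`<td>` rows). `InputHtmlBuilder` and its methods are hypothetical helpers for illustration and are not part of DocxService.

```java
/** Hypothetical helper (not part of DocxService) that assembles the input HTML as a string. */
final class InputHtmlBuilder {

    private final StringBuilder body = new StringBuilder();

    /** Adds a <p> whose id matches a content control and whose class names the data type. */
    InputHtmlBuilder field(String id, String type, String value) {
        body.append("    <p id=\"").append(escape(id))
            .append("\" class=\"").append(escape(type)).append("\">")
            .append(escape(value)).append("</p>\n");
        return this;
    }

    /** Adds a <table> with one <tr> per row and one <td> per cell. */
    InputHtmlBuilder table(String id, String[]... rows) {
        body.append("    <table id=\"").append(escape(id)).append("\">\n");
        for (String[] row : rows) {
            body.append("      <tr>");
            for (String cell : row) {
                body.append("<td>").append(escape(cell)).append("</td>");
            }
            body.append("</tr>\n");
        }
        body.append("    </table>\n");
        return this;
    }

    /** Wraps the accumulated elements in the XML declaration, <html> and <body>. */
    String build() {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
             + "<html>\n  <body>\n" + body + "  </body>\n</html>\n";
    }

    /** Escapes the characters that would otherwise break the markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

For example, `new InputHtmlBuilder().field("PPSN", "plain", "1234567A").table("Relatives", new String[] {"Mary O'Leary", "Daughter", "14"}).build()` yields a document in the same shape as the sample. Escaping the values keeps the document well-formed when the data contains characters such as `&` or `<`.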

We also parse the template into an internal data structure. This is usually done once during bootstrapping, because the template is almost certainly immutable and parsing can take a second or two:

WordprocessingMLPackage docxTemplate =
    docxService.loadPackage(io.openInputStream("Template.docx"));
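
One way to make sure each template is parsed only once is to keep the parsed result in a small cache. The sketch below is an assumption about how an application might do this, not something the module provides: `TemplateCache` is a hypothetical class, generic in the parsed type so that it stands alone, and the loader function is supplied by the caller.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache: parses each template at most once and reuses the result. */
final class TemplateCache<T> {

    private final Map<String, T> parsed = new ConcurrentHashMap<>();
    private final Function<String, T> loader;

    /** The loader turns a template name into its parsed form. */
    TemplateCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns the parsed template, invoking the loader only on first request. */
    T get(String templateName) {
        return parsed.computeIfAbsent(templateName, loader);
    }
}
```

In an application, `T` would be `WordprocessingMLPackage` and the loader would wrap the `docxService.loadPackage(...)` call shown above, translating any checked exception it throws.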

We then merge the input into the template as follows:

// val is Lombok's lombok.val: a final local variable with an inferred type
val baos = new ByteArrayOutputStream();
val params = DocxService.MergeParams.builder()
        .docxTemplateAsWpMlPackage(docxTemplate)            // (1)
        .inputAsHtml(inputHtml)                             // (2)
        .matchingPolicy(DocxService.MatchingPolicy.STRICT)
        .outputType(DocxService.OutputType.DOCX)
        .output(baos)
        .build();
docxService.merge(params);

final byte[] docxActual = baos.toByteArray();

(1) docx template, as shown above
(2) input HTML, as shown above
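
The merged bytes can then be stored or streamed however the application requires. As a minimal sketch using only the JDK, the hypothetical helper below (`MergedOutputWriter` is not part of the module) writes them to a temporary file:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: persists merged output bytes to disk. */
final class MergedOutputWriter {

    /** Writes the bytes to a new temporary file with the given suffix, e.g. ".docx". */
    static Path writeToTempFile(byte[] merged, String suffix) {
        try {
            Path target = Files.createTempFile("merged-", suffix);
            Files.write(target, merged);
            return target;
        } catch (IOException e) {
            // Rethrow unchecked so that callers need not declare IOException
            throw new UncheckedIOException(e);
        }
    }
}
```

With the snippet above, this would be called as `MergedOutputWriter.writeToTempFile(docxActual, ".docx")`.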