Prerequisites
This page aims to help you write your own TagHandlers and does not cover the full breadth of the Design Importer. For a comprehensive view of the Design Importer functionality, go through the documentation links below:
http://dev.day.com/docs/en/cq/current/wcm/campaigns/landingpages.html
http://dev.day.com/docs/en/cq/current/wcm/campaigns/landingpages/extending-and-configuring-the-design-importer.html
Introduction
Lifecycle
Resolution
Content Aggregation
Writing your own TagHandler
Inside the SDK
TagHandler Boilerplate
TagHandler Example
Useful Links
The primary job of the Design Importer is to transform the input HTML into a top-level generated canvas component and a set of CQ components contained within it.
The TagHandler is responsible for handling an HTML element and all the HTML elements nested within it. The TagHandler receives SAX events corresponding to the HTML tags as they are encountered while parsing the HTML document. The output of the TagHandler could be markup, meta, script, includes, CQ components or a mix of any of those. The diagram below illustrates how HTML snippets are transformed by TagHandlers into the desired output content.
TagHandlers are POJOs instantiated every time a tag needs to be handled. Each TagHandler has an associated TagHandlerFactory which is responsible for rolling out TagHandler instances.
TagHandlerFactories are implemented as OSGi services responsible for:
- Configuration of TagHandler
- Rolling out the corresponding TagHandler instance
- Injecting other OSGi service references as required by the TagHandler
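The split between factory and handler can be sketched in plain Java. This is only an illustration under stated assumptions: `SketchTagHandler`, `SketchTagHandlerFactory`, `LinkRewriter` and the anchor classes are hypothetical stand-ins, not the real `com.day.cq.wcm.designimporter.api` types, and a real factory would receive its configuration and service references from the OSGi container rather than through a constructor.

```java
// Simplified stand-ins, NOT the real Design Importer API types.
interface SketchTagHandler {
    String describe();
}

interface SketchTagHandlerFactory {
    SketchTagHandler create();
}

// Hypothetical service that a real factory would obtain as an OSGi reference.
interface LinkRewriter {
    String rewrite(String href);
}

// Plain object holding per-tag state, hence one fresh instance per handled tag.
class AnchorTagHandler implements SketchTagHandler {
    private final LinkRewriter rewriter;
    private final boolean openInNewWindow;

    AnchorTagHandler(LinkRewriter rewriter, boolean openInNewWindow) {
        this.rewriter = rewriter;
        this.openInNewWindow = openInNewWindow;
    }

    @Override
    public String describe() {
        String target = openInNewWindow ? " [new window]" : "";
        return rewriter.rewrite("/content/page") + target;
    }
}

// Long-lived factory: keeps configuration and services, rolls out handlers.
class AnchorTagHandlerFactory implements SketchTagHandlerFactory {
    private final LinkRewriter rewriter;        // injected service (assumption)
    private final boolean openInNewWindow;      // configuration value (assumption)

    AnchorTagHandlerFactory(LinkRewriter rewriter, boolean openInNewWindow) {
        this.rewriter = rewriter;
        this.openInNewWindow = openInNewWindow;
    }

    @Override
    public SketchTagHandler create() {
        // Pass configuration and services on to every new handler instance.
        return new AnchorTagHandler(rewriter, openInNewWindow);
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        SketchTagHandlerFactory factory =
                new AnchorTagHandlerFactory(href -> href + ".html", true);
        System.out.println(factory.create().describe());
    }
}
```

The factory is the only long-lived object in this sketch; every call to `create()` returns a new handler, so handlers can keep per-tag state without interfering with each other.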
The Design Importer framework controls when and how the callback methods of individual TagHandlers are invoked. The steps below describe how the Design Importer framework invokes the various TagHandlers:
- Design Importer receives SAX events for individual tags as the HTML document is getting parsed.
- If there is no TagHandler active, the Design Importer resolves an appropriate TagHandler. As a part of initialization, the callback method [TagHandler#beginHandling()](http://dev.day.com/docs/en/cq/current/javadoc/com/day/cq/wcm/designimporter/api/TagHandler.html#beginHandling(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes)) is invoked by the Design Importer.
- If a TagHandler is already active, the Design Importer invokes one of the [startElement()](http://dev.day.com/docs/en/cq/current/javadoc/com/day/cq/wcm/designimporter/api/TagHandler.html#startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes)), [endElement()](http://dev.day.com/docs/en/cq/current/javadoc/com/day/cq/wcm/designimporter/api/TagHandler.html#endElement(java.lang.String, java.lang.String, java.lang.String)) or [characters()](http://dev.day.com/docs/en/cq/current/javadoc/com/day/cq/wcm/designimporter/api/TagHandler.html#characters(char[], int, int)) callback methods.
- It's the responsibility of the active TagHandler to instantiate and initialize child TagHandlers, if required. For example, the ParsysComponentTagHandler on encountering nested component div tags instantiates and delegates to appropriate child TagHandlers.
- The delegation continues recursively until a "leaf" TagHandler is reached. Thus, the handling happens in the form of a delegation chain.
- On receiving endElement(), the parent TagHandler is responsible for finishing up the child TagHandler, popping the child TagHandler off the stack, and aggregating the content generated by the child TagHandler into itself.
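The steps above can be sketched as a small, self-contained simulation. It is a rough illustration only: `DelegatingHandler` is a hypothetical class, the callbacks take plain strings instead of the SAX arguments of the real interface, and the rule "every nested div gets a child handler" is an assumption made for the demo. A single child reference per handler plays the role of the stack, because each child keeps its own child in turn.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified handler; not the real TagHandler interface.
class DelegatingHandler {
    private final String name;
    private final List<String> log;
    private final StringBuilder content = new StringBuilder();
    private DelegatingHandler child;  // currently active child, if any
    private int childDepth;           // elements opened inside the child's own element

    DelegatingHandler(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    void beginHandling(String tag) {
        log.add(name + " begins <" + tag + ">");
    }

    void startElement(String tag) {
        if (child != null) {
            childDepth++;
            child.startElement(tag);          // delegate down the chain
        } else if (tag.equals("div")) {       // assumed rule for this demo
            child = new DelegatingHandler(name + "/" + tag, log);
            childDepth = 0;
            child.beginHandling(tag);
        } else {
            content.append('<').append(tag).append('>');
        }
    }

    void characters(String text) {
        if (child != null) {
            child.characters(text);
        } else {
            content.append(text);
        }
    }

    void endElement(String tag) {
        if (child == null) {
            content.append("</").append(tag).append('>');
        } else if (childDepth > 0) {
            childDepth--;
            child.endElement(tag);
        } else {
            // The child's own element has closed: finish it, aggregate, release.
            child.endHandling();
            content.append('[').append(child.content).append(']');
            child = null;
        }
    }

    void endHandling() {
        log.add(name + " ends");
    }

    String content() {
        return content.toString();
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        DelegatingHandler root = new DelegatingHandler("root", log);
        root.beginHandling("section");
        root.startElement("div");
        root.characters("hi");
        root.endElement("div");
        root.endHandling();
        System.out.println(root.content());
        log.forEach(System.out::println);
    }
}
```

Square brackets in the aggregated content mark where a child's output was folded into its parent, which makes the nesting of the delegation chain visible.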
The diagram below illustrates the delegation chain of the existing TagHandlers.
Question: How does the Design Importer framework decide which TagHandler should handle which HTML tag?
Answer: Each TagHandler declares the kind of tag it can handle via the tag.pattern OSGi property. This property stores the regular expression that must match an HTML tag for the TagHandler to handle it. Keeping the pattern in a configurable OSGi property makes the parsing logic easy to reconfigure.
Remember that TagHandlers are plain Java objects instantiated by their corresponding TagHandlerFactories. The TagHandlerFactories, in turn, are the OSGi services that define the configuration. Listed below are a few examples of the tag.pattern property in out-of-the-box TagHandlerFactories:
CanvasComponentTagHandlerFactory
```java
/**
 * The TagHandlerFactory that rolls out {@link CanvasComponentTagHandler} instances
 */
@Service
@Component(metatype = true)
@Properties({
    @Property(name = Constants.SERVICE_RANKING, intValue = 5000, propertyPrivate = false),
    @Property(name = TagHandlerFactory.PN_TAGPATTERN, value = CanvasComponentTagHandlerFactory.TAG_PATTERN)
})
public class CanvasComponentTagHandlerFactory implements TagHandlerFactory {
    public static final String TAG_PATTERN = "<div .*(?=id=\"(?i)cqcanvas\").*>";
}
```
TextComponentTagHandlerFactory
```java
/**
 * The TagHandlerFactory that rolls out {@link TextComponentTagHandler} instances
 */
@Service
@Component(metatype = true)
@Properties({
    @Property(name = Constants.SERVICE_RANKING, intValue = 5000, propertyPrivate = false),
    @Property(name = TagHandlerFactory.PN_TAGPATTERN, value = TextComponentTagHandlerFactory.TAG_PATTERN)
})
public class TextComponentTagHandlerFactory implements TagHandlerFactory {
    public static final String TAG_PATTERN = "<(p|span|div)\\s+.*data-cq-component=\"(?i)text\".*?>";
}
```
ImgTagHandlerFactory
```java
/**
 * The TagHandlerFactory that rolls out {@link ImgTagHandler} instances
 */
@Service
@Component(metatype = true)
@Properties({
    @Property(name = Constants.SERVICE_RANKING, intValue = 5000, propertyPrivate = false),
    @Property(name = TagHandlerFactory.PN_TAGPATTERN, value = ImgTagHandlerFactory.TAG_PATTERN),
    @Property(name = "service.factoryPid", value = "com.day.cq.wcm.designimporter.api.TagHandler")
})
public class ImgTagHandlerFactory implements TagHandlerFactory {
    public static final String TAG_PATTERN = "<img(?!.* data-cq-component=\"(?i)image\").*>";
}
```
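To get a feel for what these patterns accept, they can be tried out with `java.util.regex` alone. The sketch below assumes that the pattern is matched against the whole opening tag rendered as a string; that is an assumption about how the framework applies tag.pattern, and `TagPatternDemo` with its `handles()` helper is hypothetical.

```java
import java.util.regex.Pattern;

class TagPatternDemo {
    // Patterns copied from the out-of-the-box factories listed above.
    static final String CANVAS = "<div .*(?=id=\"(?i)cqcanvas\").*>";
    static final String TEXT = "<(p|span|div)\\s+.*data-cq-component=\"(?i)text\".*?>";
    static final String IMG = "<img(?!.* data-cq-component=\"(?i)image\").*>";

    // Assumption: the whole opening tag must match the pattern.
    static boolean handles(String tagPattern, String openingTag) {
        return Pattern.matches(tagPattern, openingTag);
    }

    public static void main(String[] args) {
        String[] tags = {
            "<div id=\"cqcanvas\">",
            "<p class=\"intro\" data-cq-component=\"Text\">",
            "<img src=\"a.png\">",
            "<img src=\"a.png\" data-cq-component=\"image\">"
        };
        for (String tag : tags) {
            System.out.println(tag
                    + " canvas=" + handles(CANVAS, tag)
                    + " text=" + handles(TEXT, tag)
                    + " img=" + handles(IMG, tag));
        }
    }
}
```

Note how the inline `(?i)` flag makes only the attribute value case-insensitive, and how the negative lookahead in the img pattern leaves image component tags to a different handler.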
Note: Since regular expressions can overlap, multiple TagHandlers may qualify for a particular tag. In case of such conflicts, the TagHandler with the highest ranking value, as denoted by the OSGi property SERVICE_RANKING, is the one picked.
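The conflict rule in the note can be sketched as "filter by pattern, then take the highest ranking". The code below is a hedged illustration: `RankedFactory` and `RankingResolver` are hypothetical names, the "generic-div" entry with ranking 1000 is invented for the demo, and the sketch does not address how ties between equal rankings are broken.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical stand-in for a registered TagHandlerFactory service.
class RankedFactory {
    final String name;
    final int ranking;         // plays the role of SERVICE_RANKING
    final Pattern tagPattern;  // plays the role of tag.pattern

    RankedFactory(String name, int ranking, String tagPattern) {
        this.name = name;
        this.ranking = ranking;
        this.tagPattern = Pattern.compile(tagPattern);
    }
}

class RankingResolver {
    // Among all factories whose pattern matches, pick the highest ranking.
    static Optional<RankedFactory> resolve(List<RankedFactory> factories, String openingTag) {
        return factories.stream()
                .filter(f -> f.tagPattern.matcher(openingTag).matches())
                .max(Comparator.comparingInt((RankedFactory f) -> f.ranking));
    }

    public static void main(String[] args) {
        List<RankedFactory> factories = Arrays.asList(
                new RankedFactory("generic-div", 1000, "<div.*>"),
                new RankedFactory("text-component", 5000,
                        "<(p|span|div)\\s+.*data-cq-component=\"(?i)text\".*?>"));
        String tag = "<div data-cq-component=\"text\">";
        System.out.println(resolve(factories, tag).map(f -> f.name).orElse("none"));
    }
}
```

Both patterns match the tag in `main`, so the ranking alone decides which factory wins.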
Each TagHandler is responsible for controlling the lifecycle of its nested TagHandlers. Once a TagHandler starts handling an html element, it must also handle all the nested html elements. The nested elements could well map to other TagHandlers. It's the responsibility of the TagHandler to instantiate, destroy and control the nested TagHandlers. The Design Importer framework doesn't interfere here.
This may sound intimidating, but this recurring logic is encapsulated by the AbstractTagHandler. The easiest way to reuse this functionality is therefore to extend the AbstractTagHandler.
It's important to understand the type of content the TagHandlers emit. The table below describes the various content types.
Content Type | Description |
---|---|
HtmlContentType.META | Meta content typically defined within the HTML meta tags |
HtmlContentType.MARKUP | The HTML markup. This is what the majority of the TagHandlers emit |
HtmlContentType.SCRIPT_INCLUDE | External JavaScript included via the HTML script tag |
HtmlContentType.SCRIPT_INLINE | JavaScript defined inline, within the HTML script tag |
HtmlContentType.STYLESHEET_INCLUDE | External CSS included via the HTML link tag |
HtmlContentType.STYLES_INLINE | Inline styles defined within the HTML style tag |
Note: Your TagHandler must implement the HTMLContentProvider interface if it emits any html content. Typically, most TagHandlers emit some content.
ComponentTagHandlers, in addition to the HTML content, also emit in-memory CQ components that are later persisted to the JCR repository. TagHandlers that implement the PageComponentProvider interface are automatically called back, at the end of their handling, for the components they have generated.
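One way to picture content aggregation is a set of buffers keyed by content type, where a parent folds a finished child's buffers into its own. The sketch below is an assumption-laden illustration: `SketchContentType` merely mirrors the names in the table above, and `ContentBuffers` with its `absorb()` method is hypothetical rather than part of the Design Importer API.

```java
import java.util.EnumMap;
import java.util.Map;

// Mirrors the content type names from the table; not the real HtmlContentType.
enum SketchContentType {
    META, MARKUP, SCRIPT_INCLUDE, SCRIPT_INLINE, STYLESHEET_INCLUDE, STYLES_INLINE
}

// Hypothetical per-handler buffers, one per content type.
class ContentBuffers {
    private final Map<SketchContentType, StringBuilder> buffers =
            new EnumMap<>(SketchContentType.class);

    void append(SketchContentType type, String content) {
        buffers.computeIfAbsent(type, t -> new StringBuilder()).append(content);
    }

    String get(SketchContentType type) {
        StringBuilder buffer = buffers.get(type);
        return buffer == null ? "" : buffer.toString();
    }

    // Fold everything a finished child produced into this (parent) handler.
    void absorb(ContentBuffers child) {
        for (Map.Entry<SketchContentType, StringBuilder> entry : child.buffers.entrySet()) {
            append(entry.getKey(), entry.getValue().toString());
        }
    }
}

class AggregationDemo {
    public static void main(String[] args) {
        ContentBuffers parent = new ContentBuffers();
        ContentBuffers child = new ContentBuffers();
        parent.append(SketchContentType.MARKUP, "<div>");
        child.append(SketchContentType.MARKUP, "<p>hi</p>");
        child.append(SketchContentType.SCRIPT_INLINE, "alert(1);");
        parent.absorb(child);
        parent.append(SketchContentType.MARKUP, "</div>");
        System.out.println(parent.get(SketchContentType.MARKUP));
        System.out.println(parent.get(SketchContentType.SCRIPT_INLINE));
    }
}
```

Keeping the types separate until the very end lets the top-level handler decide where each kind of content finally goes, for example scripts versus markup.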
Writing a TagHandler should be fairly easy once you understand the architecture described above. Follow the cookbook steps listed below to write your own TagHandlers:
- Define your TagHandler implementation class.
  - Design the TagHandler around the HTML fragment it will handle.
  - Since the HTML fragment you're handling could contain nested tags, you'll need to handle them as well. The easiest way is to extend the AbstractTagHandler class. There are very few cases where you wouldn't want to extend AbstractTagHandler.
  - Implement/override the beginHandling() method for any initialization activities.
  - Implement/override the endHandling() method for finalizing. This is where you would typically update the AbstractTagHandler buffers, namely metadata, htmlBuffer, scriptBuffer, referencedScripts and pageComponents.
- Define your TagHandler factory.
  - Make sure your TagHandlerFactory is a valid OSGi service that implements the Design Importer TagHandlerFactory service interface.
  - Define the OSGi property TagHandlerFactory.PN_TAGPATTERN to be the regular expression that matches the HTML tag you intend to handle.
  - Implement the create() method to instantiate and return your tag handler.
  - Use the @Reference annotation to have existing OSGi services injected. You may pass these service instances to your tag handlers when you instantiate them within the create() method.
  - You could also expose more OSGi configuration properties via the @Property or @Properties annotations and use them to configure the behaviour of your TagHandlers.
- Build and deploy.
  - With the provided maven project, simply execute the following command at the top level:
    mvn -PautoInstallPackage clean install
  - The above command builds the code, runs the unit tests, and deploys the result into the CQ instance running at localhost.
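The handler side of these steps can be sketched without the SDK on the classpath. Everything here is a stand-in: `SketchAbstractTagHandler` only imitates two of the AbstractTagHandler buffers named above, the callbacks take a plain attribute map instead of SAX arguments, and `BannerTagHandler`, the `data-title` attribute and the `myproject/components/banner` resource type are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for AbstractTagHandler with two of its buffers.
abstract class SketchAbstractTagHandler {
    protected final StringBuilder htmlBuffer = new StringBuilder();
    protected final List<Map<String, String>> pageComponents = new ArrayList<>();

    void beginHandling(String tagName, Map<String, String> attributes) { }

    void endHandling() { }
}

class BannerTagHandler extends SketchAbstractTagHandler {
    private String title = "";

    @Override
    void beginHandling(String tagName, Map<String, String> attributes) {
        // Initialization: remember what the opening tag tells us.
        title = attributes.getOrDefault("data-title", "");
    }

    @Override
    void endHandling() {
        // Finalization: publish the results into the inherited buffers.
        Map<String, String> component = new LinkedHashMap<>();
        component.put("sling:resourceType", "myproject/components/banner"); // hypothetical
        component.put("jcr:title", title);
        pageComponents.add(component);
        htmlBuffer.append("<!-- banner: ").append(title).append(" -->");
    }
}

class BannerDemo {
    public static void main(String[] args) {
        BannerTagHandler handler = new BannerTagHandler();
        handler.beginHandling("div", Collections.singletonMap("data-title", "Summer Sale"));
        handler.endHandling();
        System.out.println(handler.pageComponents);
        System.out.println(handler.htmlBuffer);
    }
}
```

In the real SDK the boilerplate classes stub out the corresponding methods, so the same two decisions apply there: what to capture in beginHandling() and what to publish in endHandling().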
The SDK contains a starter maven project built by following the steps described at http://dev.day.com/docs/en/cq/aem-how-tos/development/how-to-build-aem-projects-using-apache-maven.html
The SDK includes a boilerplate tag handler implementation which can be used to quickly build your custom tag handlers. In addition, the SDK comes with an example implementation that can be used for reference. Both the boilerplate and the example are detailed below:
The following files comprise the boilerplate:
- bundle/src/main/java/com/mycompany/myproject/MyTagHandler.java
- bundle/src/main/java/com/mycompany/myproject/MyTagHandlerFactory.java
Commonly used methods are stubbed out for you to fill in. Please follow the code comments for further help.
An example tag handler implementation is provided to help you better understand how to write custom tag handlers.
The supplied example implementation is that of a plain text component tag handler. It transforms an html section into plain text, by simply stripping off all the nested html tags, before storing it as the text property of the foundation text component.
The following files comprise the example implementation:
- bundle/src/main/java/com/mycompany/myproject/example/PlainTextComponentTagHandler.java
- bundle/src/main/java/com/mycompany/myproject/example/PlainTextComponentTagHandlerFactory.java
For more help, please refer to the code comments within the source files.
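The core idea of the plain text example, keeping character data and dropping nested tags, can be sketched in a few lines. This is not the code shipped in the SDK: `PlainTextSketchHandler` is a hypothetical class whose callbacks are simplified versions of the SAX-style ones, and it stops short of storing the result on a text component.

```java
// Hypothetical sketch; see PlainTextComponentTagHandler.java for the real example.
class PlainTextSketchHandler {
    private final StringBuilder text = new StringBuilder();

    void startElement(String tagName) {
        // Nested opening tags contribute nothing to the plain text.
    }

    void endElement(String tagName) {
        // Nested closing tags are dropped as well.
    }

    void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);  // only character data is kept
    }

    String plainText() {
        return text.toString();
    }
}

class PlainTextDemo {
    // Convenience wrapper so the demo can pass strings instead of char arrays.
    static void feed(PlainTextSketchHandler handler, String s) {
        char[] chars = s.toCharArray();
        handler.characters(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        // Events for: Hello <b>big</b> world
        PlainTextSketchHandler handler = new PlainTextSketchHandler();
        feed(handler, "Hello ");
        handler.startElement("b");
        feed(handler, "big");
        handler.endElement("b");
        feed(handler, " world");
        System.out.println(handler.plainText());
    }
}
```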
- http://dev.day.com/docs/en/cq/current/wcm/campaigns/landingpages.html
- http://dev.day.com/docs/en/cq/current/wcm/campaigns/landingpages/extending-and-configuring-the-design-importer.html
- http://dev.day.com/docs/en/cq/current/javadoc/com/day/cq/wcm/designimporter/api/TagHandler.html
- http://felix.apache.org/documentation/subprojects/apache-felix-maven-scr-plugin/scr-annotations.html