Expert Consultancy from Yellow Pelican

Component JET Files

A site about Talend

Talend Component Java Emitter Templates (JET)

By far the most complex part of your component design is writing the Java Emitter Template (JET) files. You will have a minimum of one file and a maximum of three. These files are the code of your component.

What is a Java Emitter Template (JET)?

JET is part of the Eclipse Modeling Framework Project (EMF) and is a JSP-like syntax that makes it easy to write templates for generating code. JET is a generic template engine. In the case of our component design, we want it to generate Java code.

The tutorials here are not intended to teach you JET. You will learn enough about JET to build your components. If you want to know more about JET, then the Eclipse JET tutorial is a good place to start.

When code is emitted, it is appended to the JET StringBuffer and then returned by the generate method.
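
As a rough illustration of that mechanism, the sketch below hand-writes the kind of class a template might be compiled into. The class name HelloTemplate and the constants TEXT_1 and TEXT_2 are hypothetical names chosen for this example; a real generated class is larger, but the idea of appending to a StringBuffer and returning its contents from generate is the same.

```java
// Hand-written approximation of a compiled template (all names are hypothetical).
class HelloTemplate {
    // Literal text from the template body is held in constants.
    private static final String TEXT_1 = "System.out.println(\"Hello, ";
    private static final String TEXT_2 = "\");";

    // The argument is whatever the caller passes to the template.
    public String generate(Object argument) {
        final StringBuffer stringBuffer = new StringBuffer();
        stringBuffer.append(TEXT_1);   // literal template text
        stringBuffer.append(argument); // value of an expression such as <%=argument %>
        stringBuffer.append(TEXT_2);   // literal template text
        return stringBuffer.toString();
    }

    public static void main(String[] args) {
        // Prints the emitted line of Java source code.
        System.out.println(new HelloTemplate().generate("Talend"));
    }
}
```

Note that the result is a String of Java source; nothing in it is executed until the generated code is itself compiled and run.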

How does Talend use JET?

This is a very important concept that you may already understand as a Talend developer, but it is even more important to understand as a component designer. Talend is a code generator. Each time you build your Job, code is generated. The JET files of all of the components that you use in your Job are executed each time you build it, and each plays its part in the generation of your code. If you understand this concept, then the remainder of this tutorial will make a lot more sense.

The Three Template Files

The following list shows the three template files for the imaginary component tMyComponent.

  • tMyComponent_begin.javajet
  • tMyComponent_main.javajet
  • tMyComponent_end.javajet

If you're familiar with the tJavaflex component, then you'll recognise the concept of begin, main and end. These are the same three code sections that you define for tJavaflex. If you are not familiar with tJavaflex, then you may find it helpful to review this component now. If you wanted most of the functionality of a component without actually building one, then tJavaflex is what you would use. It provides a great insight into how these three code sections work together.
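
To make the relationship between the three sections more concrete, the following sketch imitates the order in which they run: the begin code once before the data flow, the main code once for each row, and the end code once afterwards. The class SectionOrderSketch, its run method and the sample rows are hypothetical; this is an illustration of the idea, not code that Talend generates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: imitates the order in which begin, main and end code runs.
class SectionOrderSketch {
    static List<String> run(List<String> rows) {
        List<String> log = new ArrayList<>();

        // begin section: runs once, before any row is processed
        int count = 0;
        log.add("begin");

        for (String row : rows) {
            // main section: runs once for every row in the flow
            count++;
            log.add("main:" + row);
        }

        // end section: runs once, after the last row
        log.add("end:" + count);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b")));
    }
}
```

A variable declared in the begin code (count here) remains visible to the main and end code, which is why the three sections are best thought of as one piece of code split into three parts.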

A Simple Template File

Now let's look at a simple template file. The code shown below is from the file tJava_begin.javajet. The component tJava is a simple yet powerful component, so it is a good place to start.

<%@ jet
        imports="
                org.talend.core.model.process.INode
                org.talend.core.model.process.ElementParameterParser
                org.talend.designer.codegen.config.CodeGeneratorArgument
        "
%>

<%
        CodeGeneratorArgument codeGenArgument = (CodeGeneratorArgument) argument;
        INode node = (INode)codeGenArgument.getArgument();
%>

<%=ElementParameterParser.getValue(node, "__CODE__") %>

Template Arguments

An argument is passed to component templates, of type CodeGeneratorArgument. This will provide us with important information about our component and we'll learn more about this later. For now, it is sufficient to understand that an argument has been passed.

Directives

Directives are messages to the JET engine. As we work through this tutorial, we'll look at the different Directives and how they are used.

Directives have the syntax: -

<%@ directive { attr="value" }* %>

Jet Directive

The Jet Directive defines a number of attributes and communicates these to the JET engine. The Jet Directive is identified by the section starting with <%@ jet and ending with %>.

At a minimum, a Talend component must specify the imports attribute, listing the small set of Classes that the template needs.

Don't expect to find too much information on these and other Classes. We'll hopefully cover the most important ones in this and other articles. API Documentation is available for these.

Imports

As we can see from the above, we use the Jet Directive to import the Classes that JET requires for our component.

                org.talend.core.model.process.INode
                org.talend.core.model.process.ElementParameterParser
                org.talend.designer.codegen.config.CodeGeneratorArgument

These three imports are probably the least that we would expect to see in a typical component. If you were to create a component using the Component Designer, then you would see the following additional imports. We can, thus, take it that this is the set of most commonly required imports.

                org.talend.core.model.metadata.IMetadataTable
                org.talend.core.model.metadata.IMetadataColumn
                org.talend.core.model.process.IConnection
                org.talend.core.model.process.IConnectionCategory
                org.talend.core.model.metadata.types.JavaTypesManager
                org.talend.core.model.metadata.types.JavaType
                java.util.List
                java.util.Map

JET Scripting Elements

JET has two scripting language elements - scriptlets and expressions. A scriptlet is a statement fragment, and an expression is a complete Java expression.

Each scripting element has a <% based syntax as follows: -

<% this is a scriptlet %>
<%= this is an expression %>

White space is optional after <%, and <%=, and before %>. If you want to use the %> character sequence as literal characters in a scriptlet, rather than to end the scriptlet, you can escape them by typing %\>. Similarly, the <% character sequence can be escaped by using <\%.

Scriptlets

JET scriptlets contain fragments of Java code. A scriptlet may or may not cause Java code to be emitted.

Let's take a look at the scriptlet from our example above.

<%
        CodeGeneratorArgument codeGenArgument = (CodeGeneratorArgument) argument;
        INode node = (INode)codeGenArgument.getArgument();
%>

In this scriptlet we are simply creating two Objects for use by the JET engine.

The first line creates codeGenArgument of type CodeGeneratorArgument. argument is an argument that was passed to the template, as described in the Template Arguments section, above.

The second line creates node of type INode. This is created by calling the method codeGenArgument.getArgument(). We'll learn more about why we're doing this in the Expressions section.

In the example above, we've used a scriptlet to create two Objects that may be used by the JET engine. We can also use scriptlets to determine if Java should be emitted. Consider the following example that I have borrowed from the JET tutorial.

<% if (Calendar.getInstance().get(Calendar.AM_PM) == Calendar.AM) { %>
System.out.println("Good Morning");
<% } else { %>
System.out.println("Good Afternoon");
<% } %>

In this example, scriptlets are being used to determine which code should be emitted. If it is morning when the code is generated, then the code System.out.println("Good Morning"); will be emitted. If not, then System.out.println("Good Afternoon"); will be emitted. If this logic were in one of our components, then only a single line would appear in our Job's code.
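
The point that the decision is made at generation time can be sketched in plain Java. The class GreetingTemplate below is a hypothetical stand-in for what the template above might compile to; to keep the example deterministic, it takes a boolean instead of consulting the Calendar.

```java
// Hypothetical stand-in for a class compiled from the template above.
class GreetingTemplate {
    // The condition is evaluated while generating code,
    // not when the emitted code is later executed.
    static String generate(boolean isMorning) {
        StringBuffer stringBuffer = new StringBuffer();
        if (isMorning) {
            stringBuffer.append("System.out.println(\"Good Morning\");");
        } else {
            stringBuffer.append("System.out.println(\"Good Afternoon\");");
        }
        return stringBuffer.toString();
    }

    public static void main(String[] args) {
        // Only one of the two println statements ever reaches the output.
        System.out.println(generate(true));
    }
}
```

The if statement itself never appears in the emitted text; only the branch that was chosen does.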

Expressions

JET expressions are evaluated and then emitted. In the case of our example above, we have the following expression: -

<%=ElementParameterParser.getValue(node, "__CODE__") %>

In this example, the static method ElementParameterParser.getValue is being used to extract the value of our CODE parameter from node. Remember that node is an object of type INode, as demonstrated in the section on Scriptlets, above. Note that when referencing our parameters, they are both prefixed and suffixed with __.
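
The effect of such an expression can be imitated without Talend on the classpath. In the sketch below, a plain Map stands in for the component's parameters and the lookup method stands in for the parameter parser; both are hypothetical and are not the Talend API. It only shows that whatever value the parameter holds is appended to the output unchanged.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in only: a Map plays the role of the component's parameters.
// This is NOT the Talend API; every name here is hypothetical.
class ExpressionSketch {
    // Returns the stored value, or an empty string if the parameter is absent.
    static String lookup(Map<String, String> parameters, String name) {
        String value = parameters.get(name);
        return value == null ? "" : value;
    }

    static String generate(Map<String, String> parameters) {
        StringBuffer stringBuffer = new StringBuffer();
        // Similar in spirit to an expression such as <%=lookup(parameters, "__CODE__") %>
        stringBuffer.append(lookup(parameters, "__CODE__"));
        return stringBuffer.toString();
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<>();
        parameters.put("__CODE__", "System.out.println(\"typed by the user\");");
        System.out.println(generate(parameters));
    }
}
```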

Include Directive

The Include Directive has the syntax (for example): -

<%@ include file="copyright.jet" %>

We'll look at this directive in a later article.

A Component is not a Class

It is important to know that when you create a component, you are not defining a Class. When you write a Talend Job, you are defining a Class. When you add components to your Job, you are simply merging the code of each component with all of the other code that makes up your Job.

This is an important concept to understand and, if you've ever looked at a Job's Source tab in Talend Studio, you should already be familiar with this idea.

One important effect of this is variable scope. In the next section, we'll look at the Unique Component Identifier, and how this resolves the issue of variable scope.

Unique Component Identifier

One effect of the way in which Talend merges the code of all the components that make up your Job is variable scope. It is important to always name your variables uniquely for each instance of your component, to ensure that there are no name collisions. If you take a look at existing components, you will see logic that looks similar to that shown below: -

<%
    CodeGeneratorArgument codeGenArgument = (CodeGeneratorArgument) argument;
    INode node = (INode)codeGenArgument.getArgument();
    String cid = node.getUniqueName();
%>

We've already looked at the first two lines, so now let's look at the third line. We are creating a String variable named cid (Component ID) and assigning it the value returned by the method node.getUniqueName. In our imaginary component named tMyComponent, this would assign the value tMyComponent_1 for the first instance of the component, tMyComponent_2 for the second instance, and so on.

Let's say that we now want a simple loop, using the variable i. Often, we would express this as: -

for(int i = 0; i < 10; i++) {

Because of the way the code for our entire Job is constructed, it is conceivable that we may define a second variable named i that is in the same scope as the first. Because of this, we must always qualify our variables by using the variable cid in a JET Expression, as shown below.

for(int i_<%=cid %> = 0; i_<%=cid %> < 10; i_<%=cid %>++) {

If you were to look at a Job's Source tab in Talend Studio, you would see the code that was emitted for this component and it would look something like this: -

for(int i_tMyComponent_1 = 0; i_tMyComponent_1 < 10; i_tMyComponent_1++) {
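
The substitution described above can be reproduced with ordinary string concatenation. The class UniqueNameSketch and its loopHeader method are hypothetical helpers written for this illustration; they simply show that two instances of the same component end up with differently named loop variables.

```java
// Hypothetical helper: builds the loop header a template would emit for a given cid.
class UniqueNameSketch {
    static String loopHeader(String cid) {
        // Suffix the variable name with the component's unique name.
        String variable = "i_" + cid;
        return "for(int " + variable + " = 0; " + variable + " < 10; " + variable + "++) {";
    }

    public static void main(String[] args) {
        // Two instances of the same component produce non-colliding names.
        System.out.println(loopHeader("tMyComponent_1"));
        System.out.println(loopHeader("tMyComponent_2"));
    }
}
```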

Conclusion

This tutorial should provide an insight into the overall structure of Java Emitter Template (JET) files. We have given this some context by looking at a simple component that accepts a parameter and simply injects it into the output code of the Job.

You have now learned enough to create a simple component. Of course, you also need your component to do something useful. Most Talend components need to process data. Components often start a data flow, process a data flow, or end a data flow. We'll look at this in the next article and build our first component.





© www.TalendByExample.com