Expert Consultancy from Yellow Pelican

Talend Load Context Example

On 30th May 2013, I published a tutorial on building a Reusable Context Load Job. I strongly recommend following that newer tutorial, as I believe it is best practice for building reliable Jobs that remain maintainable through the entire development-to-production lifecycle. You may still find this older tutorial a useful reference, especially if you are new to Talend, as it explains the general process in some detail.

This is a working example of a Job that loads Context from a file. In my opinion, loading Context from a file is the easiest and most reliable method for managing Context. A future example will extend this to show how a reusable Job can be created that can be called by all or most of your Jobs, meaning that Context is always loaded in a consistent way.

Context Overview

Context represents the parameters that are passed to your Job at runtime. These parameters control the way your Job executes; for example, Context Variables may identify the database connections that your Job will use.

For more information on context, read Talend Context Reference.

General Context Group

Context Groups may be defined to specify collections of Context Variables. This eases the maintenance of Context Variables and their addition to your Jobs. In this example, we're going to create a General Context Group where we will specify General Context Variables that may be relevant to most, if not all, of your Jobs.

Amongst other things, this Context Group will define a Context Variable that specifies the location of the files that hold your Context. Of course, this is one value that you cannot specify in a Context File. This value may either be hard-coded (although it may be modified at runtime) or be determined from Operating System variables. This is a personal preference.

To create a new Context Group, right-click Contexts in the Repository pane, to activate the Contexts menu. Select Create Context group to create a new group, as shown in the image below.

Talend Load Context Example Image 1

Create a new Context Group named General. Also complete the fields Purpose and Description. When you have completed these fields, press Next.

Talend Load Context Example Image 2

You can now add the following Context variables.

  • contextDir, String
  • tmpDir, String

Here, we're defining two Context Variables, with a String type. contextDir will be used to identify the location of your Context files and tmpDir will identify the location for a directory for holding temporary files. tmpDir is only used as an example. This requirement may or may not be relevant to your own specific needs.

Talend Load Context Example Image 3

Now select the Values as table tab, where you'll provide a value for contextDir. The following values are examples for a Unix-like system and for Windows. You may also use Operating System properties to determine the location.

"/usr/local/Talend/Context"
"C:/Talend/Context"

There is no need to change the value of tmpDir, as you'll be loading this value from the Context file. Note that the default value for a String Context Variable is null; this is the literal String "null" and not a Java null reference. See Talend Context Reference for more information.
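Returning to contextDir, the Operating System approach could be sketched in plain Java roughly as follows. This is only an illustration: the class name, the TALEND_CONTEXT_DIR environment variable and the Talend/Context sub-directory are assumptions made for this example and are not defined by Talend. It derives the directory from the standard user.home system property unless the environment variable overrides it.

```java
import java.util.Map;

class ContextDirSketch {
    // Hypothetical environment variable name, chosen for this illustration only.
    static final String ENV_NAME = "TALEND_CONTEXT_DIR";

    // Returns the override when it is set, otherwise <userHome>/Talend/Context.
    static String resolve(Map<String, String> env, String userHome) {
        String override = env.get(ENV_NAME);
        if (override != null && !override.isEmpty()) {
            return override;
        }
        return userHome + "/Talend/Context";
    }

    public static void main(String[] args) {
        // Uses the real environment and the standard user.home system property.
        System.out.println(resolve(System.getenv(), System.getProperty("user.home")));
    }
}
```

Passing the environment and home directory in as arguments keeps the lookup easy to try with different values before you settle on an expression for contextDir.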

When you've completed these settings, press Finish.

Talend Load Context Example Image 4

Create Job and add Context Group

Create a new Job called ContextLoadExample. This Job will load the General Context Group variables from a file.

Talend Load Context Example Image 5

When the Job has been created, drag the General Context Group from the Repository Browser to either the Job Design tab or to the Context tab. If you haven't already done so, select the Context tab.

You will see from the image below that the two Context Variables have now been added to your Job and are grouped under the General Context Group. This permanently links these Context Variables in your Job to the Context Group that has been defined in the Repository.

Talend Load Context Example Image 6

Associate General Context with an External File

You've already defined the directory that will contain your Context file. You'll now define a Global Variable that identifies the full path name of your Context file. As there is no requirement for this value to ever be changed through parameters, you'll use globalMap rather than creating an additional Context Variable.

  • Add tJava Component (InitJob)
  • Add the following code: globalMap.put("generalContextFile", context.contextDir + "/" + contextStr + ".General.cfg");

We're now able to reference a String in our Job, with a value something like this (depending on the directory you previously specified):

/tmp/TalendByExample/context/Default.General.cfg
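To see how this value is put together outside of Talend, here is a minimal stand-alone sketch. The HashMap is only a stand-in for Talend's globalMap, and the class and method names are made up for this example; the directory and the Context name Default are simply the sample values used above.

```java
import java.util.HashMap;
import java.util.Map;

class ContextFilePathSketch {
    // Mirrors the tJava expression: <contextDir>/<contextName>.General.cfg
    static String buildPath(String contextDir, String contextName) {
        return contextDir + "/" + contextName + ".General.cfg";
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap, which stores values as Object.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("generalContextFile", buildPath("/tmp/TalendByExample/context", "Default"));

        // Values must be cast back to String when read, as in the Job.
        System.out.println((String) globalMap.get("generalContextFile"));
    }
}
```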

Your Job should now look similar to the screenshot below.

Talend Load Context Example Image 7

Check Context File

Next, your Job will check whether your Context File exists and exit if the file is not found.

  • Add tFileExist Component (CheckGeneralContextFileExists)
  • Set File name/Stream to (String) globalMap.get("generalContextFile")
  • Connect InitJob to CheckGeneralContextFileExists using Trigger - On Subjob Ok
  • Add tDie Component (DieOnNotGeneralContextFileExists)
  • Set Die message to jobName + ": cannot open file " + (String) globalMap.get("generalContextFile")
  • Connect CheckGeneralContextFileExists to DieOnNotGeneralContextFileExists using Trigger - Run if (NotGeneralContextFileExists)
  • Set the If expression to ! (Boolean) globalMap.get("tFileExist_1_EXISTS")

When you've completed these steps, your Job should look like the screenshot below. You'll notice that Components and Triggers have been named appropriately. To access Trigger - Run if, right-click tFileExist and select it from the pop-up menu. To edit the If expression, select the connector.

You'll also notice that the result from tFileExist is made available through globalMap.
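The logic of this check can be approximated in plain Java as below. This is a sketch rather than the code Talend generates: the class and method names are hypothetical, the HashMap stands in for globalMap, and java.nio.file.Files.exists stands in for tFileExist. The exit code of 4 in main is chosen to match the failure exit code reported at the end of this tutorial.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

class FileExistsSketch {
    // Records whether the file exists, under the same key that tFileExist uses in this Job.
    static void checkExists(Map<String, Object> globalMap) {
        String file = (String) globalMap.get("generalContextFile");
        globalMap.put("tFileExist_1_EXISTS", Files.exists(Paths.get(file)));
    }

    // Builds the Die message used when the file is missing.
    static String dieMessage(String jobName, Map<String, Object> globalMap) {
        return jobName + ": cannot open file " + (String) globalMap.get("generalContextFile");
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("generalContextFile", "/tmp/TalendByExample/context/Default.General.cfg");
        checkExists(globalMap);

        // Same condition as the Run if trigger (NotGeneralContextFileExists).
        if (!(Boolean) globalMap.get("tFileExist_1_EXISTS")) {
            System.err.println(dieMessage("ContextLoadExample", globalMap));
            System.exit(4);
        }
    }
}
```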

You can now run your Job. It should fail, as you have not yet created the Context file.

Talend Load Context Example Image 8

Create Context File

Now create a new file, using a text editor. The file will be named Default.General.cfg and will be located in the directory that you specified in contextDir.

Add the following text to your file, setting the value of tmpDir to a value of your choice. We will not be using this value in this tutorial, so you may leave this value unchanged, if you wish.

tmpDir=/tmp
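Each line of the file is a key and a value separated by "=". As a rough illustration of that format, the following sketch splits such lines into key/value pairs in plain Java. The class and method names are hypothetical, and splitting on the first "=" only is a choice made for this sketch; it is not a statement about how Talend parses the file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ContextLineSketch {
    // Splits each "key=value" line on the first '=' and skips lines without a key.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String line : lines) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse(List.of("tmpDir=/tmp"))); // prints {tmpDir=/tmp}
    }
}
```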

Load Context File

Now that you have tested that the Context file exists, it can be loaded.

  • Add tJava Component (InitReadGeneralContext)
  • Add the following code: System.out.println(jobName + ": loading Context file " + (String) globalMap.get("generalContextFile"));
  • Connect CheckGeneralContextFileExists to InitReadGeneralContext using Trigger - Run if (GeneralContextFileExists)
  • Set the If expression to (Boolean) globalMap.get("tFileExist_1_EXISTS")
  • Add tFileInputDelimited Component (ReadGeneralContext)
  • Set File name/Stream to (String) globalMap.get("generalContextFile")
  • Set Field Separator to "="
  • Connect InitReadGeneralContext to ReadGeneralContext using Trigger - On Subjob Ok
  • Add tContextLoad Component (LoadGeneralContext)
  • Connect ReadGeneralContext to LoadGeneralContext using Row - Main (GeneralContext)
  • Copy the schema from LoadGeneralContext to ReadGeneralContext (see Talend Schema Reference for help on copying a schema)
  • Add tJava Component (DeInitReadGeneralContext)
  • Add the following code:
    // Flag read by the Run if trigger (MissingGeneralContext)
    Boolean missingGeneralContext = false;
    // tmpDir counts as missing if it is null, empty or the literal String "null"
    if (context.tmpDir == null ||
        "".equals(context.tmpDir) ||
        "null".equals(context.tmpDir)) {
        missingGeneralContext = true;
        System.err.println(jobName + ": context.tmpDir not defined in " + (String) globalMap.get("generalContextFile"));
    }
  • Connect ReadGeneralContext to DeInitReadGeneralContext using Trigger - On Subjob Ok
  • Add tDie Component (DieOnMissingGeneralContext)
  • Set Die message to jobName + ": missing Context for " + (String) globalMap.get("generalContextFile")
  • Connect DeInitReadGeneralContext to DieOnMissingGeneralContext using Trigger - Run if (MissingGeneralContext)
  • Set the If expression to missingGeneralContext
  • Add tJava Component (DeInitJob)
  • Add the following code System.out.println(jobName + ": tmpDir: " + context.tmpDir);
  • Connect DeInitReadGeneralContext to DeInitJob using Trigger - On Subjob Ok
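The read, load and validate steps above can be mimicked outside of Talend with java.util.Properties, which also reads key=value text. Treat this as a sketch under assumptions: Properties is a different mechanism from tFileInputDelimited and tContextLoad and has its own escaping rules (for example, it treats backslashes as escape characters), and the class and method names here are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class LoadContextSketch {
    // Loads key=value pairs from the given text; a stand-in for the read and load steps.
    static Properties load(String text) {
        Properties context = new Properties();
        try {
            context.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return context;
    }

    // Same test as DeInitReadGeneralContext: absent, empty, or the literal text "null".
    static boolean isMissing(Properties context, String key) {
        String value = context.getProperty(key);
        return value == null || "".equals(value) || "null".equals(value);
    }

    public static void main(String[] args) {
        Properties context = load("tmpDir=/tmp\n");
        if (isMissing(context, "tmpDir")) {
            System.err.println("ContextLoadExample: context.tmpDir not defined");
            System.exit(4);
        }
        System.out.println("ContextLoadExample: tmpDir: " + context.getProperty("tmpDir"));
    }
}
```

Keeping the missing-value test in its own method makes it straightforward to apply the same check to every variable you expect the file to define.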

Your Job should now look like the screenshot below and is ready to run. If it completes successfully, it should display the following message, and the exit code of the Job will indicate success [exit code=0].

ContextLoadExample: tmpDir: /tmp

You may also want to try removing the definition of tmpDir from your Context file. In this instance, you should see your Job fail [exit code=4] with the following messages.

ContextLoadExample: context.tmpDir not defined in /usr/local/Talend/context/Default.General.cfg
ContextLoadExample: missing Context for /usr/local/Talend/context/Default.General.cfg

Talend Load Context Example Image 9

Conclusion

You have created a Context Group where Talend will load a specific Context Variable from an external file. Talend checks that the file exists, attempts to load the Context Variable and then checks that the Context Variable was successfully loaded.

In a future tutorial, we'll look at how this Job can be extended, so that it is reusable (as a Subjob) in all of the Jobs that you write.





© www.TalendByExample.com