Expert Consultancy from Yellow Pelican

Scheduling Talend Jobs using cron

A site about Talend

Scheduling using cron

In our articles Talend Job Deployment and Job Shell Launchers, we looked at how Jobs may be exported from Talend for deployment. The next logical step is to schedule your Jobs so that they run at appropriate times. In this article, we will look at how Jobs may be scheduled using the Unix scheduler cron, which is also available on other Unix-like operating systems, including OS X and Linux.

If you would like to see an example of scheduling a Talend Job using cron, read our article Stock Market Analysis Project - Get Stock Quotes.

An Introduction to Using cron

cron is a daemon that executes scheduled commands. For more information on cron, issue the command man cron from a command prompt.

cron wakes up every minute and executes commands based on the definitions found in crontab files. These files are maintained using the program crontab. For more information on crontab, issue the command man crontab from a command prompt.

The following table shows the fields that make up each entry in a crontab file. Fields are separated by whitespace (spaces or tabs); the examples in this article use tabs. For more information on the format of a crontab file, issue the command man 5 crontab from a command prompt.

Field           Allowed Values
minute          0-59 (*=every minute)
hour            0-23 (*=every hour)
day of month    1-31 (*=every day of month)
month           1-12 (or use names) (*=every month)
day of week     0-7 (0 or 7 is Sunday, or use names) (*=every day of week)
command         The command to be run
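
To make the field order concrete, the following sketch splits one crontab entry into its five time fields and its command. The function describe_entry is a hypothetical helper written for this illustration; it is not part of cron or Talend, and it assumes the entry has at least five whitespace-separated fields.

```shell
#!/bin/sh
# describe_entry is a hypothetical helper: it splits one crontab entry into
# its five time fields and the command, and labels each part.
describe_entry() {
    # set -f stops the shell from expanding the * wildcards as file names
    set -f
    set -- $1
    set +f
    echo "minute:       $1"
    echo "hour:         $2"
    echo "day of month: $3"
    echo "month:        $4"
    echo "day of week:  $5"
    # Everything after the fifth field is the command
    shift 5
    echo "command:      $*"
}

# The entry is single-quoted so that $HOME and * are passed through unchanged
describe_entry '21,41 8 * * 1,2,3,4,5 $HOME/talend/bin/runTalendJob StockMarket GetQuotes'
```

Running the script prints one labelled line per field, which can be a quick way to check that an entry has its values in the right columns.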

The following table shows some typical time field values.

Minute     Hour                   Day of Month   Month   Day of Week
21,41      8                      *              *       1,2,3,4,5
1,21,41    9,10,11,12,13,14,15    *              *       1,2,3,4,5
1,21,51    16                     *              *       1,2,3,4,5
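
One way to check what a set of time fields means is to expand the minute and hour lists into clock times. The sketch below does this for the three rows above. The function list_runs is a hypothetical helper for illustration only; it understands comma-separated lists, not ranges, steps or the * wildcard.

```shell
#!/bin/sh
# list_runs is a hypothetical helper: given a comma-separated minute list and
# a comma-separated hour list, it prints every HH:MM combination.
list_runs() {
    minutes=$1
    hours=$2
    for h in $(echo "$hours" | tr ',' ' '); do
        for m in $(echo "$minutes" | tr ',' ' '); do
            printf '%02d:%02d\n' "$h" "$m"
        done
    done
}

# The three rows of typical time field values
list_runs 21,41 8
list_runs 1,21,41 9,10,11,12,13,14,15
list_runs 1,21,51 16
```

For these three rows the script prints 26 times, starting at 08:21 and ending at 16:51.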

Example crontab File

The following shows an example crontab file. This file is maintained using the command crontab -e. You may also maintain your own local file and submit it using the command crontab MyCrontabFile.cron. Note that this will replace the entire crontab file for the current user account.

21,41	8	*	*	1,2,3,4,5 $HOME/talend/bin/runTalendJob StockMarket GetQuotes
1,21,41	9,10,11,12,13,14,15	*	*	1,2,3,4,5 $HOME/talend/bin/runTalendJob StockMarket GetQuotes
1,21,51	16	*	*	1,2,3,4,5 $HOME/talend/bin/runTalendJob StockMarket GetQuotes

This example crontab file has been taken from our tutorial Stock Market Analysis Project. In this example, we schedule a Job to run roughly every 20 minutes while the stock market is open. The Job runs at 21 and 41 minutes past 8am, and at 1, 21 and 41 minutes past each hour from 9am to 3pm. In the 4pm hour it runs at 16:01, 16:21 and 16:51 (rather than 16:41), so that the last run catches the closing prices. The Job runs every weekday, Monday to Friday.
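
Because submitting a file with crontab MyCrontabFile.cron replaces the whole crontab for the current user, it can be worth saving the existing one first. The following is a minimal sketch of that idea. The function install_crontab and the backup file name in the usage comment are assumptions made for this illustration; only crontab -l and crontab FILE are standard crontab invocations.

```shell
#!/bin/sh
# install_crontab is a hypothetical helper: it saves the current crontab to a
# backup file before replacing it with the contents of the given file.
install_crontab() {
    new_file=$1
    backup_file=$2
    if [ ! -r "$new_file" ]; then
        echo "install_crontab: cannot read $new_file" 1>&2
        return 1
    fi
    # crontab -l fails when the user has no crontab yet; keep an empty backup then
    crontab -l > "$backup_file" 2>/dev/null || : > "$backup_file"
    # This replaces the entire crontab for the current user
    crontab "$new_file"
}

# Example (not run here):
#   install_crontab MyCrontabFile.cron "$HOME/crontab.backup"
```

If the new file turns out to be wrong, the previous schedule can be restored by submitting the backup file in the same way.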

A Simple Job Control Script

When we exported our Job, Talend created a launch script, for example, GetQuotes_run.sh. In some cases, simply executing this script is sufficient for running our Job. Often, we want more control over the Job we are executing.

The following shows the typical content of a Talend-generated launch script.

cd `dirname $0`
 ROOT_PATH=`pwd`
 java -Xms256M -Xmx1024M -cp $ROOT_PATH/../lib/dom4j-1.6.1.jar:$ROOT_PATH/../lib/filecopy.jar:$ROOT_PATH/../lib/mysql-connector-java-5.1.22-bin.jar:$ROOT_PATH/../lib/talend_file_enhanced_20070724.jar:$ROOT_PATH/../lib/talendcsv.jar:$ROOT_PATH:$ROOT_PATH/../lib/systemRoutines.jar::$ROOT_PATH/../lib/userRoutines.jar::.:$ROOT_PATH/getquotes_0_1.jar:$ROOT_PATH/libmaketickersymbols_0_1.jar:$ROOT_PATH/libmysqlsharedconnection_0_1.jar:$ROOT_PATH/libcontextreader_0_4.jar: talendbyexample.getquotes_0_1.GetQuotes --context=Default "$@"

As the crontab file above shows, we use a single shell script, runTalendJob, to execute and control our Jobs. This is a very basic script, and we will enhance it as we work through this project. The current version of the script is shown below.

#!/bin/sh
# runTalendJob: run an exported Talend Job and log its output.
# Usage: runTalendJob project_folder job_name

if [ ${#} -ne 2 ]; then
        echo "runTalendJob: usage: project_folder job_name" 1>&2
        exit 2
fi

# All output (stdout and stderr) is appended to one log file per Job
LogFile="${HOME}/talend/${1}/${2}.log"

echo "`date`: Starting Job ${1}.${2}" >> "${LogFile}"
"${HOME}/talend/${1}/Jobs/${2}/${2}_run.sh" >> "${LogFile}" 2>&1
JobExitStatus=${?}

if [ ${JobExitStatus} -eq 0 ]; then
        echo "`date`: Job ${1}.${2} ended successfully" >> "${LogFile}"
else
        echo "`date`: Job ${1}.${2} ended in error" >> "${LogFile}"
fi
# Pass the Job's exit status back to the caller
exit ${JobExitStatus}

This script assumes that our exported Job has been extracted to the directory $HOME/talend/ProjectName/Jobs (where $HOME is our home directory and ProjectName is the name of our Talend project, for example StockMarket), so that the launch script is found at $HOME/talend/ProjectName/Jobs/JobName/JobName_run.sh. For more information on exporting Jobs, read the articles Talend Job Deployment and Job Shell Launchers.
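
Before adding a Job to the crontab, it may help to confirm that its launch script is where the control script expects it. The following sketch checks for the path $HOME/talend/ProjectName/Jobs/JobName/JobName_run.sh. The function check_job is a hypothetical helper and is not part of Talend.

```shell
#!/bin/sh
# check_job is a hypothetical helper: it reports whether the launch script
# for the given project folder and Job name exists and is executable.
check_job() {
    launcher="${HOME}/talend/${1}/Jobs/${2}/${2}_run.sh"
    if [ -x "$launcher" ]; then
        echo "found: $launcher"
    else
        echo "missing or not executable: $launcher" 1>&2
        return 1
    fi
}

# Example (not run here):
#   check_job StockMarket GetQuotes
```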

The key features of this script are that it gives us a simple naming convention for specifying the Job that we want to run, and that the Job's output (stdout and stderr) is appended to a log file. The script also exits with the Job's own exit status.
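
Since every run appends a status line to the log, the log can be used to see how often a Job has failed. The sketch below counts the "ended in error" lines written by the control script. The function count_failures is a hypothetical helper; it assumes the log file location and message text used in the script above.

```shell
#!/bin/sh
# count_failures is a hypothetical helper: it counts the "ended in error"
# lines that the control script has appended to a Job's log file.
count_failures() {
    log_file="${HOME}/talend/${1}/${2}.log"
    if [ ! -f "$log_file" ]; then
        # No log yet means the Job has not been run through the script
        echo 0
        return 0
    fi
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function's status at 0
    grep -c "Job ${1}\.${2} ended in error" "$log_file" || true
}

# Example (not run here):
#   count_failures StockMarket GetQuotes
```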





© www.TalendByExample.com