Expert Consultancy from Yellow Pelican

Reusable Job LibMakeDimDate

A site about Talend

LibMakeDimDate

This reusable and configurable Job, generates the data that is required to populate a Data Warehouse Date Dimension. This is often referred to as a Time Dimension; however, in this series of articles, we will be creating Dimensions for both Date and Time.

Download LibMakeDimDate

You may download LibMakeDimDate and import (Job Designs->Import items) in to a Talend Project of your choice. This download is supported by Talend 5.2.2 and above.

Download

Usage

This Job may be added to any other Job (the parent Job), and provides the input for creating a Date Dimension. You may use the output as-is, drop the columns that are not of interest, modify existing values or add additional ones - without the need to modify the library Job itself.

The output from this Job contains many attributes. It does not include attributes for Financial Periods or Public Holidays. These additional attributes will be added in a later version; however, you may easily add these for yourself.

To see this library Job in use, read our article Create Data Warehouse. This is one of a series of articles that are described in Talend Stock Market Analysis Project.

Parameters

The following parameters (Context Variables) may be overridden, by passing them from a parent Job.

ParameterDefault ValueDescription
epoch01-Jan-1970The first date to be created
daysPastToday365The last day created will be today+daysPastToday. This may be a negative number.
quarters"Quarter 1;Quarter 2;Quarter 3;Quarter 4"A semi-colon (;) delimited list of descriptions.
months"January;February;March;April;May;June;July;
August;September;October;November;December"
A semi-colon (;) delimited list of descriptions.
shortMonths"Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec"A semi-colon (;) delimited list of descriptions.
days"Sunday;Monday;Tuesday;Wednesday;Thursday;
Friday;Saturday"
A semi-colon (;) delimited list of descriptions.
shortDays"Sun;Mon;Tue;Wed;Thu;Fri;Sat"A semi-colon (;) delimited list of descriptions.

Generated Data

The table shown, below, demonstrates the data generated by this Job. You may drop columns that are not required (from within your own Job), modify values, and add your own. You may modify the textual values by passing your own, as detailed above.

Naming Conventions

The variable naming conventions break away from the Talend (Java) convention as it is expected that this data will be loaded in to a database and it is, therefore, more helpful to use a naming convention that is more in keeping with what you would expect to see in a database.

DATE_ID

DATE_ID is the Primary Key for this Dimension and, as can be seen from the sample data shown below, it has been derived from the Date. Some may argue against this and say that it should simply be an arbitrary number. I would, generally, agree with this; however, I have chosen to do this so that the key may be calculated by data loaders, rather than requiring a look-up. It is also the intention, that this Dimension may be refreshed on a daily basis and that it may contain attributes that are relative to the current day. To achieve this, it is helpful that there is a guarantee that these keys do not mutate.

Sample Data

ColumnSample ValueDescription
DATE_ID20131112Primary Key
THE_DATE12-11-2013
YEAR2013
QUARTER_NUMBER4
QUARTERQuarter 4Descriptions may be passed as delimited (;) list.
MONTH_NUMBER11
MONTHNovemberDescriptions may be passed as delimited (;) list.
MONTH_SHORTNovDescriptions may be passed as delimited (;) list.
WEEK_OF_YEAR46
WEEK_OF_MONTH3
DAY_OF_YEAR316
DAY_OF_MONTH12
DAY_OF_WEEK_NUMBER3
DAY_OF_WEEKTuesdayDescriptions may be passed as delimited (;) list.
DAY_OF_WEEK_SHORTTueDescriptions may be passed as delimited (;) list.
DAY_OF_WEEK_IN_MONTH2
IS_WEEKDAYtrue
NH_SEASONAutumnNorthern Hemisphere.
SH_SEASONSpringSouthern Hemisphere.
IS_THIS_YEARtrueRequires daily refresh.
IS_LAST_YEARfalseRequires daily refresh.
IS_NEXT_YEARfalseRequires daily refresh.
IS_THIS_QUARTERtrueRequires daily refresh.
IS_LAST_QUARTERfalseRequires daily refresh.
IS_NEXT_QUARTERfalseRequires daily refresh.
IS_THIS_MONTHtrueRequires daily refresh.
IS_LAST_MONTHfalseRequires daily refresh.
IS_NEXT_MONTHfalseRequires daily refresh.
IS_THIS_WEEKtrueRequires daily refresh.
IS_LAST_WEEKfalseRequires daily refresh.
IS_NEXT_WEEKfalseRequires daily refresh.
IS_TODAYtrueRequires daily refresh.
IS_YESTERDAYfalseRequires daily refresh.
IS_TOMORROWfalseRequires daily refresh.

Job Design

This Job is added to a parent Job using the tRunJob Component. Once added, you should hit the button Copy Child Job Schema and then connect this component to another component of your choice. This may, for example, be a t[DB]Output component.




Expert Consultancy from Yellow Pelican
comments powered by Disqus

© www.TalendByExample.com