Expert Consultancy from Yellow Pelican

Talend t[DB]Output Reference

A site about Talend

Talend t[DB]Output Reference

Talend provides a number of t[DB]Output components, providing the ability to write data to database objects. Although Talend provides a generic component, tDBOutput, you are more than likely to find a custom component that supports the database vendor of your choice.

Our examples will be based on MySQL; but referring to other database vendors when appropriate. The principles learned here, can be applied to all of the database vendors that Talend supports.

tMysqlOutput

Talend provides the tMysqlOutput component, for manipulating MySQL database Tables, Views and Synonyms. This is a flexible component that can perform both Data Definition Language (DDL) and Data Manipulation Language (DML) operations which are sub-sets of Structured Query Language (SQL).

Although not demonstrated in the examples shown here, I would always recommend that you use the tMysqlConnection component to establish your database connections, rather than configuring them directly within this component. For more information on this, read our t[DB]Connection tutorial, and Working with Databases.

The simplest way to get going with this component, is to import the database schemas that you want to use, in to the Talend Repository. More information on this can be found in Working with Databases.

By importing your database schemas, you can then add Repository items to your Job design workspace. When you drag a database object to the design workspace, you will be able to choose the Component that should be used. In this case, it will be the tMysqlInput option.

The following screenshot shows the Component tab, for our newly created component.

Image 1

Property Type

Property Type allows you to specify if your database connection parameters should be taken from the Repository (as they are in our example), or Built-In to the component itself.

As discussed in Working with Databases, I would recommend that you do not use this option. Add a t[DB]Connection component and use this to maintain your database connections.

Use an existing connection

This option allows you to use an existing database connection. That is, a connection that has already been established with a t[DB]Connection component. On selecting this option, a new drop-down list will be displayed. This list will show all of the available connections. This is the connection method that I would always recommend.

Table

Here, you can specify the name of the table name that both Data Definition Language (DDL) and Data Manipulation Language (DML) operations will be performed against.

Action on table

This option allows you to specify any Data Definition Language (DDL) operation that should be performed against Table. The following options are available.

  • Default
  • Drop and create table
  • Create table
  • Create table if not exists
  • Drop table if exists and create
  • Clear table
  • Truncate table

Action on data

This option allows you to specify the Data Manipulation Language (DML) operation that should be performed. As this Job creates a new table each time it runs, this option is set to Insert as we want to only create new rows.

  • Insert
  • Update
  • Insert or update
  • Update or insert
  • Delete
  • Replace
  • Insert or update on duplicate key or unique index
  • Insert ignore

Insert

With the Insert action, data will be inserted. If a Primary Key violation occurs, this will be reported, but the Job will continue.

Update

With the Update action, data will be updated. If a record cannot be found for update, this will be ignored.

Number of rows per insert

Note. This option is dependent on both database vendor and Action on data. Please check the options available for your database vendor and the action that you are performing.

Number of rows per insert can be found on the Advanced settings tab (when you check the option Extend Insert) and has a default value of 100. This value specifies the number of rows that will be inserted before a transaction is Committed. This value works in conjunction with Auto Commit, that we specified for our tMysqlConnection component.

Batch Size

Note. This option is dependent on both database vendor and Action on data. Please check the options available for your database vendor and the action that you are performing.

Batch Size can be found on the Advanced settings tab (when you check the option Use Batch Size) and has a default value of 10,000. This value specifies the number of rows that will be modified before a transaction is Committed. This value works in conjunction with Auto Commit, that we specified for our tMysqlConnection component.

Schema

We would usually expect the Schema to have been propagated to this component, when it was connected to the preceding component. If this has not happened, hit the Sync columns button. This will always copy the Schema from the preceding component.




Expert Consultancy from Yellow Pelican
comments powered by Disqus

© www.TalendByExample.com