Expert Consultancy from Yellow Pelican

Talend with Salesforce

A site about Talend

Talend Salesforce Reference

In this series of articles, we'll take a look at Salesforce and how to get data in and out of it using Talend. If you're new to Salesforce, read our Salesforce Guide, which will help you to get started.

Salesforce is a Cloud Service and, as with other Cloud Services, it brings some new challenges. If you're used to working with modern Relational Databases (RDBMS) such as Oracle that are hosted in your local Data Centre, then you may be excused for thinking that you've travelled more than 20 years back in time.

Many of the features that you've come to expect are not available (at least not yet) with your Cloud Service, and operations are noticeably slower.

Talend provides a number of components to help you connect to and use Salesforce. Bear in mind that a Cloud Service has different design considerations from a typical Relational Database (RDBMS).

Developer Force

If you are just getting started with Salesforce, then you can create a free account at Developer Force and try things out.

Connecting to Salesforce

Establishing your connection to Salesforce is probably the first place to start exploring. As with connections to any data source, I always establish a Talend Repository Connection to my development database, but I never refer to this connection within a Job. The repository connection is only used as a development aid. Within my Jobs, I always use the connection component tSalesforceConnection. The connection parameters are then externalised using Context Variables. There are many ways to externalise your Context. My preference is to use a reusable Subjob, and I have described one in the article LibContextReader.
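To illustrate the idea of externalised connection parameters, here is a minimal plain-Java sketch that reads key/value pairs from properties-style text and fails loudly when a required value is missing. It is not Talend's own Context mechanism, nor the LibContextReader Subjob; the class name and the keys sf_endpoint, sf_username and sf_timeout are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class SalesforceContextSketch {

    /** Parses properties-style text (key=value per line) into a Properties object. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new IllegalStateException("Could not read context text", e);
        }
        return props;
    }

    /** Returns a required value, or throws if it is missing or blank. */
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing context value: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Hypothetical file content; in practice this would live outside the Job.
        String text = "sf_endpoint=https://login.salesforce.com/services/Soap/u/25.0\n"
                    + "sf_username=youremail@yourcompany.com\n"
                    + "sf_timeout=60000\n";
        Properties ctx = parse(text);
        System.out.println(require(ctx, "sf_endpoint"));
        System.out.println(Integer.parseInt(require(ctx, "sf_timeout")));
    }
}
```

Failing on a missing value at load time means a misconfigured environment is caught before the Job attempts to connect.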

For more information on setting up a Salesforce Repository Connection, read the article on Salesforce Metadata.

Connection Parameters

Here are the connection parameters that you'll need when connecting to Salesforce and using the Salesforce API. In later versions of Talend, you can additionally connect to Salesforce using OAuth 2.0. In this article, we will concentrate on establishing a Basic Login Type.

Parameter | Default Value (Talend 5.4.1) | Comment
WebService URL | https://login.salesforce.com/services/Soap/u/25.0 |
Username | youremail@yourcompany.com |
Password | password | When using the Salesforce API, you must provide both a password and Security Token. This parameter is the two values, concatenated.
Timeout (milliseconds) | 60000 |
Salesforce Version | 25.0 | Required for Bulk API only.

For details of your specific connection parameters, contact your Salesforce Administrator.
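The Password parameter can be sketched in plain Java as follows. This is only an illustration of the concatenation: the password comes first, followed directly by the Security Token, with no separator. The class name, method name and sample values are hypothetical.

```java
public class SalesforceLoginSketch {

    /**
     * Builds the value for the Password parameter:
     * the account password followed directly by the Security Token.
     */
    static String apiPassword(String password, String securityToken) {
        if (password == null || password.isEmpty()) {
            throw new IllegalArgumentException("password is required");
        }
        if (securityToken == null || securityToken.trim().isEmpty()) {
            throw new IllegalArgumentException("securityToken is required");
        }
        // Trim the token only; stray whitespace often creeps in when it is copied.
        return password + securityToken.trim();
    }

    public static void main(String[] args) {
        // Placeholder values, not real credentials.
        System.out.println(apiPassword("password", "TOKEN123"));
    }
}
```

In a Talend Job, the same result could be had by concatenating two Context Variables in the component's Password field, for example context.sf_password + context.sf_token, where both variable names are assumptions for this example.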

Standard & Bulk API

Salesforce provides two APIs: Standard and Bulk.

Generally speaking, I use the Standard API whenever possible; however, there are some advantages to using the Bulk API, especially when processing large data volumes. At the time of writing, some versions of Talend are lacking in their implementation of the Bulk API components, so you will want to check that the Bulk API works for you. The articles here will assume that it does, but I will comment on issues where helpful.

Query Timeout

When you query Salesforce, Talend allows you to specify a timeout in milliseconds. The default value is 60000 (1 minute). Note that Salesforce also enforces its own Query Timeout: all queries must return within 2 minutes. You should consider both of these timeouts when querying Salesforce, especially when querying large objects and where you have a constraint in your query. Also bear in mind that the Salesforce 2-minute timeout is, at the time of writing, an immovable restriction.
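One way to reason about the two timeouts together is that whichever is shorter applies first. The following sketch works on that assumption; the class and method names are hypothetical, and the 2-minute figure is simply the Salesforce limit quoted above.

```java
public class QueryTimeoutSketch {

    /** The Salesforce-side query limit described above: 2 minutes. */
    static final long SALESFORCE_QUERY_LIMIT_MS = 2 * 60 * 1000L;

    /**
     * Returns the timeout that will actually apply to a query:
     * the smaller of the client-side setting and the Salesforce limit.
     */
    static long effectiveTimeoutMs(long clientTimeoutMs) {
        if (clientTimeoutMs <= 0) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        return Math.min(clientTimeoutMs, SALESFORCE_QUERY_LIMIT_MS);
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeoutMs(60000));   // prints 60000
        System.out.println(effectiveTimeoutMs(300000));  // prints 120000
    }
}
```

In other words, raising the Talend timeout beyond 120000 gains nothing, because the query will already have been stopped on the Salesforce side.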

Custom Objects & Fields

As well as providing a standard set of Objects and Fields, Salesforce allows Salesforce Developers to define their own. In a typical Salesforce environment, you are likely to see a number of custom Objects and Fields.

These custom Objects and Fields are easily identifiable, as they all have the suffix __c. For example, you may have a Custom Object named MyCustomObject__c or a custom field named Account.MyCustomField__c.
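Because the marker is a fixed part of the name, a simple string check is enough to pick out custom Objects and Fields, for instance when filtering a list of column names. A minimal sketch, with a hypothetical class and method name:

```java
public class CustomNameSketch {

    /** Returns true if the API name carries the __c marker of a custom Object or Field. */
    static boolean isCustom(String apiName) {
        return apiName != null && apiName.endsWith("__c");
    }

    public static void main(String[] args) {
        System.out.println(isCustom("MyCustomObject__c"));        // prints true
        System.out.println(isCustom("Account.MyCustomField__c")); // prints true
        System.out.println(isCustom("Account"));                  // prints false
    }
}
```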

Internationalisation

This section provides some key points on Salesforce internationalisation. For more information on this subject, read Salesforce Internationalisation.

Date & Time

All date and time values stored by Salesforce are stored in UTC.
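In practice, this means that any value you read should be treated as UTC and converted only when you present it or load it into a system that expects local time. The sketch below uses java.time (Java 8 or later, which may be newer than the Java version bundled with older Talend releases) and assumes the value arrives as an ISO-8601 string ending in Z; the sample timestamp and the class name are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class UtcTimeSketch {

    /** Parses a UTC ISO-8601 timestamp and expresses it in the given time zone. */
    static ZonedDateTime toZone(String utcTimestamp, String zoneId) {
        return Instant.parse(utcTimestamp).atZone(ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        // 09:30 UTC on 1 July is 10:30 in London (British Summer Time, UTC+1).
        ZonedDateTime london = toZone("2014-07-01T09:30:00.000Z", "Europe/London");
        System.out.println(london.getHour() + ":" + london.getMinute()); // prints 10:30
    }
}
```

The conversion changes only the representation; the underlying instant is the same.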

Character Encoding

By default, data stored in Salesforce is encoded using UTF-8. Data may, alternatively, be encoded using ISO-8859-1. Before reading or writing data, ask your Salesforce Administrator which encoding scheme is being used by your Organisation.

When reading from Salesforce, the Salesforce API will always deliver data in the chosen character set.
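To see why it matters which encoding you assume, here is a small plain-Java sketch that decodes the same bytes with both character sets. It does not call Salesforce or Talend; the class name and sample text are hypothetical, and Unicode escapes are used so that the example does not depend on the encoding of the source file itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    /** Decodes raw bytes using the given character set. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "Caf\u00e9" is "Cafe" with an acute accent on the final letter.
        byte[] utf8Bytes = "Caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Correct assumption: the text round-trips unchanged.
        String right = decode(utf8Bytes, StandardCharsets.UTF_8);
        // Wrong assumption: the two UTF-8 bytes of the accented letter
        // are read as two separate ISO-8859-1 characters.
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.length()); // prints 4
        System.out.println(wrong.length()); // prints 5
    }
}
```

A mismatch like this does not raise an error; it silently corrupts accented characters, which is why it is worth confirming the encoding up front.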

Change Data Capture (CDC)

If you need to identify changes to your Salesforce data, for example, to extract data for a Data Warehouse, then read this article on Salesforce Change Data Capture (CDC).





© www.TalendByExample.com