A site about Talend
In this series of articles, we'll take a look at Salesforce and how to get data in and out using Talend. If you're new to Salesforce, read our Salesforce Guide, which will help you to get started.
Salesforce is a Cloud Service and, as with other Cloud Services, it brings some new challenges. If you're used to working with modern Relational Databases (RDBMS) such as Oracle that are hosted in your local Data Centre, then you may be excused for thinking that you've travelled more than 20 years back in time.
Many of the features that you've come to expect are not available (at least not yet) with your Cloud Service, and things certainly start to slow down.
Talend provides a number of components to help you to connect to and use Salesforce. Bear in mind that a Cloud Service has different design considerations to your typical Relational Database (RDBMS).
If you are just getting started with Salesforce, then you can create a free account at Developer Force and try things out.
Establishing your connection to Salesforce is probably the first place to start exploring. As with connections to any data source, I always establish a Talend Repository Connection to my development database, but I never refer to this connection within a Job. The repository connection is only used as a development aid. Within my Jobs, I always use the connection component tSalesforceConnection. The connection parameters are then externalised using Context Variables. There are many ways to externalise your Context. My preference is to use a reusable Subjob, and I have described one in the article LibContextReader.
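As a rough illustration of what externalising connection parameters means, the sketch below reads them from key=value text rather than hard-coding them. It is plain Java, not the LibContextReader Subjob, and the class name and the keys `sf_username` and `sf_timeout` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ContextLoader {
    // Reads key=value pairs from any Reader (a file, a string, ...).
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Fails fast when a required value is missing or blank.
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing context value: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Hypothetical keys; in practice these would live in an external file.
        String text = "sf_username=user@example.com\nsf_timeout=60000\n";
        Properties ctx = load(new StringReader(text));
        System.out.println(require(ctx, "sf_username"));
        System.out.println(Integer.parseInt(require(ctx, "sf_timeout")));
    }
}
```

Failing fast on a missing value is a design choice: a Job that stops with a clear message is easier to diagnose than one that attempts a login with an empty parameter.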
For more information on setting up a Salesforce Repository Connection, read the article on Salesforce Metadata.
Here are the connection parameters that you'll need when connecting to Salesforce and using the Salesforce API. In later versions of Talend, you can additionally connect to Salesforce using OAuth 2.0. In this article, we will concentrate on establishing a Basic Login Type.
|Parameter||Default Value (Talend 5.4.1)||Comment|
|Password||password||When using the Salesforce API, you must provide both a password and a Security Token. This parameter holds the two values concatenated: the password followed immediately by the Security Token.|
|Salesforce Version||25.0||Required for Bulk API only.|
For details of your specific connection parameters, contact your Salesforce Administrator.
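The Password parameter described in the table can be assembled as follows. This is a minimal sketch: the class and method names are hypothetical, and the values shown are placeholders, not real credentials.

```java
final class SalesforceCredentials {
    // Builds the value for the Password parameter:
    // the account password followed directly by the Security Token.
    static String apiPassword(String password, String securityToken) {
        if (password == null || password.isEmpty()) {
            throw new IllegalArgumentException("password is required");
        }
        if (securityToken == null || securityToken.isEmpty()) {
            throw new IllegalArgumentException("security token is required");
        }
        // No separator is inserted between the two values.
        return password + securityToken;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(apiPassword("myPassword", "ABC123TOKEN"));
    }
}
```

In a Talend Job, the same concatenation could be written as an expression over two Context Variables, so that the password and the token can be maintained separately.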
Salesforce provides two APIs: Standard and Bulk.
Generally speaking, I use the Standard API whenever possible; however, there are some advantages to using the Bulk API, especially when processing large data volumes. At the time of writing, some versions of Talend are lacking in their implementation of the Bulk API components, so you will want to ensure that the API does work for you. The articles here will assume that the API does work, but I will comment on issues where helpful.
When you query Salesforce, Talend allows you to specify a timeout in milliseconds. The default value is 60000 (1 minute). Note that Salesforce also enforces its own query timeout: all queries must return within 2 minutes. You should consider both of these timeouts when querying Salesforce, especially when querying large objects and where you have a constraint in your query. Also bear in mind that the Salesforce 2-minute timeout is, at the time of writing, an immovable restriction.
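The interaction between the two timeouts can be sketched in a few lines. The constants below are taken from the figures quoted in this article; the class and method names are hypothetical and are not part of Talend or Salesforce.

```java
final class QueryTimeouts {
    // Figures quoted in this article, in milliseconds.
    static final int TALEND_DEFAULT_TIMEOUT_MS = 60_000;
    static final int SALESFORCE_QUERY_LIMIT_MS = 120_000;

    // Whichever timeout is shorter is the one a slow query hits first.
    static int effectiveLimitMs(int clientTimeoutMs) {
        return Math.min(clientTimeoutMs, SALESFORCE_QUERY_LIMIT_MS);
    }

    // True when raising the client timeout further cannot help,
    // because the Salesforce limit applies first.
    static boolean cappedBySalesforce(int clientTimeoutMs) {
        return clientTimeoutMs >= SALESFORCE_QUERY_LIMIT_MS;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimitMs(TALEND_DEFAULT_TIMEOUT_MS));
        System.out.println(cappedBySalesforce(300_000));
    }
}
```

With the default setting, the client gives up after one minute; setting the client timeout above two minutes gains nothing, since the query itself must be made faster.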
As well as Salesforce having a standard set of Objects & Fields, Salesforce also allows Salesforce Developers to specify their own and, in a typical Salesforce environment, you are likely to see a number of custom Objects and Fields.
These custom Objects and Fields are easily identifiable, as they all carry the suffix __c. For example, you may have a Custom Object named MyCustomObject__c; custom Fields are named in the same way.
This section provides some key points on Salesforce internationalisation. For more information on this subject, read Salesforce Internationalisation.
All date and time values stored by Salesforce are stored in UTC.
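Because the stored values are in UTC, any conversion to local time is left to the reader of the data. The sketch below uses the standard `java.time` classes (Java 8 or later) and assumes the timestamp arrives as an ISO-8601 string ending in `Z`; the class name, the sample value and the target zone are illustrative only.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class SalesforceTime {
    // Interprets an ISO-8601 UTC timestamp and shifts it to the given zone.
    static ZonedDateTime toLocal(String utcTimestamp, String zoneId) {
        return Instant.parse(utcTimestamp).atZone(ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        // Midday UTC in July is 13:00 in London (British Summer Time).
        ZonedDateTime local = toLocal("2014-07-01T12:00:00.000Z", "Europe/London");
        System.out.println(local.getHour());
    }
}
```

Keeping values in UTC for as long as possible, and converting only at the point of display or reporting, avoids ambiguity around daylight-saving changes.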
By default, data stored in Salesforce is encoded using UTF-8. Data may, alternatively, be encoded using ISO-8859-1. Before reading or writing data, ask your Salesforce Administrator which encoding scheme is being used by your Organisation.
When reading from Salesforce, the Salesforce API will always deliver data in the chosen character set.
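To see why the encoding matters, the sketch below decodes the same bytes with each of the two character sets mentioned above. It uses only standard Java classes; the class name and sample text are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingCheck {
    // Decodes raw bytes using the character set the caller believes is in use.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "\u00e9" is e-acute; in UTF-8 it occupies two bytes.
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Correct charset: the original text comes back.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8));

        // Wrong charset: each of the two bytes becomes its own character.
        System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Plain ASCII text looks identical under both encodings, so a mismatch often goes unnoticed until the first accented character appears in the data.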
If you need to identify changes to your Salesforce data, for example, to extract data for a Data Warehouse, then read this article on Salesforce Change Data Capture (CDC).