Expert Consultancy from Yellow Pelican

Salesforce Change Data Capture (CDC)

A site about Talend

Identifying Changes

Salesforce does not provide any formal Change Data Capture Data (CDC) mechanism, as you may find with a Relational Databases (RDBMS). If you need to extract changed data from Salesforce, then I would recommend using the SystemModstamp field; which can be found on every Salesforce object. By using this date and time value, and by recording the previous date and time that data was extracted, you are able to incrementally select new, modified and (optionally) deleted rows from Salesforce. I would recommend incorporating an overlap in the date and time values, to mitigate the risk of lost changes due to longer running transactions.

CDC Example

The following table shows a sample series of extracts for the Account object.

ObjectExtract StartLast Extract StartOverlap (seconds)Query (Where)
Account2014-01-31T12:00:00.000Z60
Account2014-01-31T13:00:00.000Z2014-01-31T12:00:00.000Z60where SystemModstamp >= 2014-01-31T11:59:00.000Z
Account2014-01-31T14:00:00.000Z2014-01-31T13:00:00.000Z60where SystemModstamp >= 2014-01-31T12:59:00.000Z

In the above scenario, we can reasonably expect (you may need to review the overlap value) to capture all changes to the Salesforce object. We will not be able to reliable determine the nature of the change (in respect to how our Data Warehouse or other receiving system will perceive the change); that is, determining if the change is an Insert, Update, Delete or a restoration from he Recycle Bin. The onus is, therefore, on the receiving system to determine the nature of the change, for example, if the row has been seen before.

Event Date, Time, User and Delete Status Values

The following table shows the key fields that are available for tracking the modification history of a row.

FieldDescription
CreatedDateThe date and time that the row was created, for example, 2014-01-31T12:00:00.000Z
CreatedIdThe User.ID of the user that created the row.
LastModifiedDateThe date and time that the row was created, last modified, deleted or restored for example, 2014-01-31T12:00:00.000Z
LastModifiedByThe User.ID of the user that created, last modified, deleted or restored the row.
SystemModstampThe date and time that the row was created, last modified, deleted, restored or other system events.
IsDeletedA boolean value to show if a row is in the Recycle Bin. Unless requested, these rows are not available for query.

The following table shows typical values for these fields, based on the specified events.

ActionCreatedDateLastModifiedDateSystemModStampIsDeleted
Insert Row2014-01-31T12:00:00.000Z2014-01-31T12:00:00.000Z2014-01-31T12:00:00.000Zfalse
Update Row2014-01-31T12:00:00.000Z2014-01-31T12:15:00.000Z2014-01-31T12:15:00.000Zfalse
Delete Row2014-01-31T12:00:00.000Z2014-01-31T12:30:00.000Z2014-01-31T12:30:00.000Ztrue
Restore Row2014-01-31T12:00:00.000Z2014-01-31T12:45:00.000Z2014-01-31T12:45:00.000Zfalse

Deleted Rows

Most, but not all, Salesforce Objects allow rows to be deleted. Deleted rows are moved to a Recycle Bin; however, they do not remain there indefinitely. If a row is deleted, then the field isDeleted will be set to true. When querying Salesforce Objects, you must specify if you want deleted rows to be selected.




Expert Consultancy from Yellow Pelican
comments powered by Disqus

© www.TalendByExample.com