Labels

Apache Hadoop (3) ASP.NET (2) AWS S3 (2) Batch Script (3) BigQuery (21) BlobStorage (1) C# (3) Cloudera (1) Command (2) Data Model (3) Data Science (1) Django (1) Docker (1) ETL (7) Google Cloud (5) GPG (2) Hadoop (2) Hive (3) Luigi (1) MDX (21) Mongo (3) MYSQL (3) Pandas (1) Pentaho Data Integration (5) PentahoAdmin (13) Polybase (1) Postgres (1) PPS 2007 (2) Python (13) R Program (1) Redshift (3) SQL 2016 (2) SQL Error Fix (18) SQL Performance (1) SQL2012 (7) SQOOP (1) SSAS (20) SSH (1) SSIS (42) SSRS (17) T-SQL (75) Talend (3) Vagrant (1) Virtual Machine (2) WinSCP (1)

Friday, April 10, 2015

Talend: How to add StageModifiedDate and a CheckSum value in Text file Output

Consider you have a text file with User details, the text file doesn't have any data about created or modified date, in that case when you need to move this data to OLEDB database or to Data warehouse you may need stage modified date and Checksum value to improve the performance of the data load.

In this case, make sure you pass StageModifiedDate and CheckSum value to your staging tables, so those values helps you to perform incremental load to target main table.

To do this with Talend, design your job as like below:



In tMap, create a new variable 'stagemodifieddate' as shown below (make sure you use date format and output date pattern provide the required date format), this will add new output StageModifiedDate to schema.



To get CheckSum output, add tAddCRCRow component, and check all those columns that may undergo update, in below example I checked all. By doing so a new CRC column will be generated.



To verify tAddCRCRow, click on Edit Schema, you can see new column CRC


 Now, in text output you can add these two columns along with other data.




No comments:

Post a Comment