Labels

Apache Hadoop (3) ASP.NET (2) AWS S3 (2) Batch Script (3) BigQuery (21) BlobStorage (1) C# (3) Cloudera (1) Command (2) Data Model (3) Data Science (1) Django (1) Docker (1) ETL (7) Google Cloud (5) GPG (2) Hadoop (2) Hive (3) Luigi (1) MDX (21) Mongo (3) MYSQL (3) Pandas (1) Pentaho Data Integration (5) PentahoAdmin (13) Polybase (1) Postgres (1) PPS 2007 (2) Python (13) R Program (1) Redshift (3) SQL 2016 (2) SQL Error Fix (18) SQL Performance (1) SQL2012 (7) SQOOP (1) SSAS (20) SSH (1) SSIS (42) SSRS (17) T-SQL (75) Talend (3) Vagrant (1) Virtual Machine (2) WinSCP (1)

Thursday, December 1, 2011

How to handle Slowly Changing Dimension using Pentaho Data Integration?

The below steps help the user to handle Slowly Changing Dimension using Pentaho Data Integration (PDI) tool.

We can maintain historical attribute of a dimension by using "Combinational Lookup/Update" control in PDI.

Step 1: Design your package as shown below. In my example I had taken "dbo.Branch" table as source with the following columns:

SELECT     [BranchID]
    ,[BranchName]
    ,[Address1]
    ,[Address2]
    ,[City]
    ,[State]
    ,[Country]
    ,[Telephone]
FROM [dbo].[Branch]




















Step 2: Add a Combination Look up, and configure the settings as shown below:
































Step 3: Select the target table "dbo.DimBranch", map the source columns with the target columns, give th ename of the Surrogate Key in tachnical key field text box "Branch Key", select Use Auto incremental field and enter a datafield column name in Date of last update field.

Step 4: Save and Run the package, the package will execute successfully. Now change the values in any column of the source table and run the package again. A new row will be created with updated row and new surrogate key in target table.



1 comment: