Labels

Apache Hadoop (3) ASP.NET (2) AWS S3 (2) Batch Script (3) BigQuery (21) BlobStorage (1) C# (3) Cloudera (1) Command (2) Data Model (3) Data Science (1) Django (1) Docker (1) ETL (7) Google Cloud (5) GPG (2) Hadoop (2) Hive (3) Luigi (1) MDX (21) Mongo (3) MYSQL (3) Pandas (1) Pentaho Data Integration (5) PentahoAdmin (13) Polybase (1) Postgres (1) PPS 2007 (2) Python (13) R Program (1) Redshift (3) SQL 2016 (2) SQL Error Fix (18) SQL Performance (1) SQL2012 (7) SQOOP (1) SSAS (20) SSH (1) SSIS (42) SSRS (17) T-SQL (75) Talend (3) Vagrant (1) Virtual Machine (2) WinSCP (1)

Tuesday, December 21, 2010

SSIS Incremental Load using Package History

Incremental Load using package History

1. Create Package History table in Warehouse:

CREATE TABLE [dbo].[PackageHistory](
 [PackageHistoryId] [int] IDENTITY(1,1) NOT NULL,
 [PackageName] [varchar](50) NOT NULL,
 [RunDateTime] [smalldatetime] NOT NULL,
 [SourceDateTime] [smalldatetime] NOT NULL,
 CONSTRAINT [PackageHistory_PK] PRIMARY KEY CLUSTERED
(
 [PackageHistoryId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

2. Create the variables as shown below:















3. Design ETL Package as shown below:



3. Config 'Load Max ETL Date' task as shown below:



 4. Load staging fact table.

5. Config 'Load Max ODS Date' task as shown below:



6. Create tasks to Load fact table. Select data whose modified/createddate > Max(RundateTime) in package history.



7. truncate Staging table.

8. Config 'Insert Package History' task as shown below:

8. When the package is run, the data whose created/modified dates are greater than Max(RundateTime) in package history is loaded in to warehouse.

No comments:

Post a Comment