Data Integration Best Practices
(Healthy Habits for SAS® Data Integration Studio Users)
SAS® Professionals Convention
14-16 July 2009
www.definitivequality.com
Copyright © 2009 Definitive Quality Solutions Limited. Registered in England No.:05141146.
Abstract:
Version 9 of the SAS System offers tools to help developers and business users manage and organise the wealth of data and processes that face SAS professionals today. SAS® Data Integration Studio benefits from many features that support healthy habits for data integration, but they can only 'be of use' if they are 'being used'.
DI Studio allows customisation of the custom tree, error monitoring, job status handling, data validation, “conformed” data model support, self-documentation, and role assignment. Identification of the benefits behind using these functions is often enough to motivate users into controlled and organised methods of working.
This paper describes examples of best practice for developing data integration suites to ensure quality, efficiency and resilience are built into the heart of your enterprise’s information estate.
Subjects:
Data Integration Structure
Data Integration Organisation
Capture Control (CCT Tables)
Error Monitoring
Data Validation
Data Protection (Scrambler)
Conformed Modelling
SQL Optimisation
Self Documentation
Role Assignment
Rename Standard Transforms
SAS DI Studio Version 3.4 under SAS Intelligence Platform 9.1.3
Data Integration Structure
Challenge: How can you best deliver Business Intelligence from a variety of source systems across a diverse consumer base?
Solution: Employ a Data Integration flow structure.
Source Systems → Detailed Data Model → Subject Specific Data Marts → Subject Specific Business Intelligence
Data Integration Organisation
Challenge: How can you keep track of the thousands of jobs typically created in a data integration suite?
Solution: Utilise the custom tree in SAS Data Integration Studio.
Create folders for each integration layer.
Subdivide them by:
  Jobs
  Libraries
  Tables
Number the folders to preserve order.
Stick to the methodology (e.g. don’t transform in the capture layer).
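The folder convention above can be sketched in a few lines. This is only an illustration in Python, not something DI Studio runs: the layer names in LAYERS are hypothetical, loosely based on the flow structure described earlier, and custom_tree is an invented helper.

```python
from pathlib import PurePosixPath

# Hypothetical layer names; substitute the layers of your own flow structure.
LAYERS = ["Capture", "Detailed Data Model", "Data Marts", "Business Intelligence"]
SUBFOLDERS = ["Jobs", "Libraries", "Tables"]

def custom_tree(layers, subfolders=SUBFOLDERS):
    """Return numbered folder paths, one per layer and object type."""
    paths = []
    for number, layer in enumerate(layers, start=1):
        # Zero-padded prefix keeps folders in flow order when sorted by name.
        layer_folder = PurePosixPath(f"{number:02d} {layer}")
        for sub in subfolders:
            paths.append(str(layer_folder / sub))
    return paths

for path in custom_tree(LAYERS):
    print(path)
```

The numeric prefix matters because a tree sorted alphabetically would otherwise place "Business Intelligence" before "Capture".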
Capture Control
Challenge: How can I perform incremental extracts from several source systems?
Solution: Define Capture Control Tables for each source table.
Status – To ensure smooth running of the DI suite (Started, Failed, or Success).
From/To Datetimes – To extract against the “last updated” column in the database. Also useful to determine processing times as data increases day by day.
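A minimal sketch of what one capture control table might hold, using Python with an in-memory SQLite database as a stand-in for a SAS library. The column names (status, from_dttm, to_dttm), the ORDERS source table and the create_cct helper are assumptions for illustration; only the three status values and the From/To datetimes come from the slide.

```python
import sqlite3

def create_cct(conn, source_table):
    """Create a capture control table for one source table (hypothetical layout)."""
    cct_name = f"{source_table}_CCT"
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{cct_name}" ('
        " status TEXT NOT NULL CHECK (status IN ('Started', 'Failed', 'Success')),"
        " from_dttm TEXT NOT NULL,"   # start of the extract window (ISO 8601)
        " to_dttm TEXT NOT NULL"      # end of the extract window (ISO 8601)
        ")"
    )
    return cct_name

conn = sqlite3.connect(":memory:")
cct = create_cct(conn, "ORDERS")
conn.execute(f'INSERT INTO "{cct}" VALUES (?, ?, ?)',
             ("Success", "2009-07-13T00:00:00", "2009-07-14T00:00:00"))
print(conn.execute(f'SELECT * FROM "{cct}"').fetchall())
```

Restricting status to the three listed values means a mistyped status is rejected when it is written rather than discovered later in a failed run.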
Send the job status to a dataset with the same name as the job.
Only extract records which have been updated since the last run.
Diagram: Source Systems → Capture Job (with Pre and Post steps) → Conformed Model, with CoreInfo Tables alongside.
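The incremental extract itself can be sketched as a filter on the "last updated" column. The example below uses Python and SQLite purely for illustration; the orders table, its columns and the extract_window helper are hypothetical, and in practice the window would be read from the capture control table for the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table with a "last updated" column.
conn.execute("CREATE TABLE orders (order_id INTEGER, last_updated TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    (1, "2009-07-12T09:00:00"),
    (2, "2009-07-13T18:30:00"),
    (3, "2009-07-14T07:15:00"),
])

def extract_window(conn, from_dttm, to_dttm):
    """Return only rows updated inside the half-open window [from_dttm, to_dttm)."""
    return conn.execute(
        "SELECT order_id, last_updated FROM orders "
        "WHERE last_updated >= ? AND last_updated < ? ORDER BY order_id",
        (from_dttm, to_dttm),
    ).fetchall()

# From/To values would come from the capture control table for this source.
rows = extract_window(conn, "2009-07-13T00:00:00", "2009-07-14T00:00:00")
print(rows)
```

A half-open window is one possible design choice: a record stamped exactly on a boundary is picked up by one run only, so consecutive windows neither overlap nor leave gaps.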
Capture Control – Pre-Processing
Is this the first time the job has run successfully today?
  No: Warn that duplicate facts will occur.
  Yes: Did the previous run fail, or not finish?
    No: Warn that this is a replacement run.
    Yes: Update dates in CCT table for this source (&source_table._CCT).
Capture Control – Post-Processing
Did the job run successfully?
  Yes: Update dates in CCT table for this source (&source_table._CCT).
  No: Update CCT table with Status = “Failed”.
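The post-processing decision could look roughly like the sketch below, again in Python with SQLite standing in for the real environment. It assumes a single-row control table with hypothetical columns status, from_dttm and to_dttm, and it assumes that a successful run also sets the status to Success; the slide itself only says that the dates are updated.

```python
import sqlite3

def post_process(conn, cct_name, job_succeeded, run_from, run_to):
    """Record the outcome of a capture job in its capture control table."""
    if job_succeeded:
        # Success: store the window just processed so the next run can start from it.
        conn.execute(
            f'UPDATE "{cct_name}" SET status = ?, from_dttm = ?, to_dttm = ?',
            ("Success", run_from, run_to),
        )
    else:
        # Failure: leave the dates untouched and flag the run as failed.
        conn.execute(f'UPDATE "{cct_name}" SET status = ?', ("Failed",))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "ORDERS_CCT" (status TEXT, from_dttm TEXT, to_dttm TEXT)')
conn.execute('INSERT INTO "ORDERS_CCT" VALUES (?, ?, ?)',
             ("Started", "2009-07-13T00:00:00", "2009-07-14T00:00:00"))
post_process(conn, "ORDERS_CCT", False, "2009-07-14T00:00:00", "2009-07-15T00:00:00")
print(conn.execute('SELECT * FROM "ORDERS_CCT"').fetchall())
```

Leaving the dates unchanged on failure means the next run can cover the same window again, so no updates are skipped.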
Error Monitoring
Challenge: How can I keep my production support department informed of job failures/successes?
Solution: Email job statistics to a designated mailbox.
Create a User Transform called Email_Stats.
Add the Email_Stats transform to each job.
Drag the Target table to one input.
Drag Email_Stats to the other input.
(Email_Stats table contains email addresses of recipients.)
Don’t “hard-code” email addresses.
  What happens when people leave?
  Different recipients for dev/prod.
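To illustrate why a recipients table beats hard-coded addresses, here is a small Python sketch. The email_stats layout (an environment column plus an address column), the example.com addresses and the recipients helper are assumptions; the slides do not show the actual columns of the Email_Stats table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per recipient per environment.
conn.execute("CREATE TABLE email_stats (environment TEXT, address TEXT)")
conn.executemany("INSERT INTO email_stats VALUES (?, ?)", [
    ("dev", "developer@example.com"),
    ("prod", "support@example.com"),
    ("prod", "admin@example.com"),
])

def recipients(conn, environment):
    """Look up recipients at run time instead of hard-coding them in the job."""
    rows = conn.execute(
        "SELECT address FROM email_stats WHERE environment = ? ORDER BY address",
        (environment,),
    )
    return [address for (address,) in rows]

print(recipients(conn, "prod"))
```

When someone leaves, only a row in the table changes; no job has to be edited or redeployed.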
Email_Stats transform properties.
Only emails if the job has failed.
Last job in flow always sends email to Admin & Support.
Set Last Job to Yes.
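The two sending rules (email on failure, and always email from the last job in the flow) can be sketched as a single decision. This Python example only builds a message and does not send it; build_status_email, its parameters and the subject wording are hypothetical and are not the Email_Stats transform's actual options.

```python
from email.message import EmailMessage

def build_status_email(job_name, job_failed, last_job, recipients):
    """Return a message to send, or None when no email is needed."""
    # Ordinary jobs stay silent on success; the last job in the flow always reports.
    if not (job_failed or last_job):
        return None
    message = EmailMessage()
    message["To"] = ", ".join(recipients)
    message["Subject"] = f"{job_name}: {'FAILED' if job_failed else 'completed'}"
    message.set_content(f"Job {job_name} finished. Failed: {job_failed}.")
    return message

print(build_status_email("load_orders", False, False, ["support@example.com"]))
print(build_status_email("final_job", False, True, ["support@example.com"])["Subject"])
```

Keeping intermediate jobs silent on success avoids flooding the mailbox, while the final job's message confirms that the whole flow reached its end.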