Dirty data is everywhere, and it's headed for a database near you. Extraction, transformation, and loading (ETL) can be difficult, but often the most challenging part of that process is validating and cleaning up the data. Information must be cleansed so that it retains its original message and business value while conforming to the expectations of the destination system(s).
In this session, we'll discuss some design patterns for addressing different types of dirty data using SQL Server Integration Services (SSIS). We'll review the various cleansing tools accessible from within SSIS, including native Integration Services components, T-SQL, and SSIS scripting. In addition, we'll briefly review the new SQL Server Data Quality Services and its integration with SSIS. We'll cap off the discussion with demonstrations of several methods for data cleansing.