Identity Mapping and De-Duplicating
In an enterprise, merging master data, such as
customer data, from multiple sources is a common problem. Typically, the sources
do not share a single key that identifies the same customer in each of them,
so you have to match records based on the similarity of strings, such as names and addresses.
In this session, we are going to check how the different string-comparison
algorithms included in SQL Server 2012 and SQL Server 2014 work. We are going to
use the SOUNDEX Transact-SQL function, four different algorithms that come with
Master Data Services (Levenshtein, Jaccard, Jaro-Winkler and
Ratcliff-Obershelp), and the Fuzzy Lookup transformation from Integration Services.
Finally, we are going to introduce how SQL Server 2012 Data Quality Services
(DQS) helps us here. We are also going to tackle the performance problems of
merging data through string matching.
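To give a feel for what similarity-based matching involves, here is a minimal Python sketch of two of the measures named above, Levenshtein and Jaccard. It is an illustration only and not the Master Data Services implementation: the function names are hypothetical, the Levenshtein score is normalized by the longer string's length, and the Jaccard coefficient is computed over character bigrams, which is an assumption about tokenization. No case folding or whitespace cleanup is done.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Scale the distance to 0..1 (assumed normalization: longer length)."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def bigrams(s: str) -> set:
    """Set of adjacent character pairs (assumed tokenization)."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard_similarity(a: str, b: str) -> float:
    """Size of the shared bigram set divided by size of the combined set."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)
```

Scoring every customer in one source against every customer in another requires a number of comparisons equal to the product of the two row counts, which is one reason performance becomes a concern when merging this way.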
Dejan Sarka, MCT and
SQL Server MVP, is an independent trainer and consultant who focuses on the
development of database and business intelligence applications. Besides projects, he spends about half of his
time on training and mentoring. He is the founder of the Slovenian SQL Server
and .NET Users Group. He is the main author or coauthor of twelve
books about databases and SQL Server, and he has also developed many courses
and seminars for Microsoft, SolidQ and Pluralsight.