Everybody hates poor-quality data. But let’s face it: in most organizations you will run into data quality issues, at least from time to time. So instead of merely complaining about poor data quality, a more useful question is: how do you deal with it?
For business intelligence (BI) applications to add value to the corporation, some level of data integration invariably takes place. This might happen in a data warehouse (DW), an operational data store (ODS), an enterprise resource planning (ERP) or customer relationship management (CRM) application, or some other system. We’ll assume that the data quality problems originate in the upstream (primary) systems generating your source data. In this article, I’ll focus exclusively on DW solutions.
There are two fundamentally different ways of dealing with bad data: either you load all of the data “as is” (and deal with the errors later) or you clean/scrub the data on the way in to the DW. The former is the approach advocated by Data Vault architects, the latter I will label the “Ralph Kimball” approach, in honor of his extensive writing on this subject. This article focuses on the pros and cons of both approaches.
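To make the contrast concrete, here is a minimal sketch of the two loading strategies. All names (`is_valid`, `scrub`, the sample rows and fields) are hypothetical illustrations, not part of any real Data Vault or Kimball toolkit:

```python
def is_valid(row):
    """Toy validity check: a row needs a non-empty customer_id."""
    return bool(row.get("customer_id"))

def scrub(row):
    """Toy cleansing rule: normalize the country code to upper case."""
    fixed = dict(row)
    fixed["country"] = fixed.get("country", "").strip().upper()
    return fixed

def load_as_is(rows):
    """Data Vault style: land every row unchanged; defects are handled downstream."""
    return list(rows)

def load_scrubbed(rows):
    """Kimball style: clean on the way in; divert rows that cannot be fixed."""
    loaded, rejected = [], []
    for row in rows:
        fixed = scrub(row)
        (loaded if is_valid(fixed) else rejected).append(fixed)
    return loaded, rejected

source = [
    {"customer_id": "42", "country": " us "},
    {"customer_id": "", "country": "DE"},   # defective row: no customer_id
]

print(len(load_as_is(source)))              # prints 2: both rows land, warts and all
loaded, rejected = load_scrubbed(source)
print(len(loaded), len(rejected))           # prints "1 1": one cleaned, one diverted
```

The trade-off the article explores falls out of this sketch: the "as is" warehouse never loses a row but pushes error handling onto every downstream consumer, while the scrubbing approach delivers clean data at the cost of deciding, up front, what to do with rows it rejects.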
Article sourced from www.b-eye-network.com.