Data integration can be as simple or as complex as an organization demands. Successful data integration projects allow data to be accessed, profiled, enriched, de-duplicated and consolidated to provide a single view of customers, products or operations. With multiple data integration strategies to choose from, many organizations automatically look to ETL (extract-transform-load). Although ETL is the most common approach, it may not be the best one for a specific business. There are numerous data integration options available today, including ELT (also known as in-database integration or SQL push-down).
ETL: The Old School Approach
ETL is a technology architecture that gathers and consolidates data from disparate data sources into a repository (such as a data warehouse or data mart) by integrating the data and providing it with a common structure. ETL is one of the most common data integration methods used in the marketplace, and it often involves IT professionals writing their own custom code. However, it's not always the best method. Many see ETL as the de facto solution for data integration because it can handle large quantities of complex data as well as data transformations that require multiple passes. It's also handy when an organization requires data transformation, frequent access, analytical processing or longitudinal reporting. Organizations that prefer ETL are those that require data consolidation, since the technology can handle large batch migrations of data.
However, for the majority of enterprises ETL may be far more trouble than it's worth. Some of the more important issues include:
- It's a poor fit for synchronization since it can't address high-concurrency, low-latency data needs
- Hand-coding makes it too hard to scale; since there's no set process, data integration can be inaccurate or incomplete - plus a company is dependent on the specific style of its developers
- Beyond the risk of human error, the time and effort required to manually maintain a data management system make it inefficient
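To make the hand-coded ETL pattern described in this section concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two source lists (crm_rows, billing_rows), the customers table and the use of an in-memory SQLite database as a stand-in for the warehouse. It is not taken from any particular ETL product. The point to notice is that cleansing and de-duplication run in application code, before anything reaches the repository.

```python
import sqlite3

# Hypothetical source systems, invented for this sketch.
crm_rows = [
    {"email": " Ann@Example.com ", "name": "Ann Lee"},
    {"email": "bob@example.com", "name": "Bob Ray"},
]
billing_rows = [
    {"email": "ann@example.com", "name": "Ann Lee"},
    {"email": "cy@example.com", "name": "Cy Poe"},
]


def extract():
    """Extract: gather rows from each source system."""
    return crm_rows + billing_rows


def transform(rows):
    """Transform: normalize emails and de-duplicate in application code."""
    seen = {}
    for row in rows:
        email = row["email"].strip().lower()
        # Keep the first name seen for each normalized email address.
        seen.setdefault(email, row["name"].strip())
    return sorted(seen.items())


def load(conn, customers):
    """Load: write only the already-cleaned rows to the repository."""
    conn.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    conn.commit()


# In-memory SQLite stands in for the data warehouse (an assumption).
conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

etl_result = conn.execute(
    "SELECT email, name FROM customers ORDER BY email"
).fetchall()
print(etl_result)
```

Because the rules live in the transform function, every change to them is a code change that someone has to write, test and maintain, which is the dependency on developers noted in the list above.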
ELT/In-Database: Quality, Actionable Data Delivered Quickly and Efficiently
Some organizations are now looking to the extract, load and transform (ELT) method, also known as in-database integration, as an alternative to ETL. With this process, most of the data transformations occur after the data has been loaded into its intended repository. The data arrives "raw" and is then transformed and moved to tables before being made available to users. Transforming the data after it's reached its destination helps optimize performance and minimize cost. In-database integration functions at the infrastructure level while ETL functions at the integration server level; therefore, in-database integration optimizes performance in most cases. Additionally, the in-database method leverages the convenience of virtualization and cloud computing - already part of the data warehousing infrastructure - which helps to speed processes and control costs.
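The load-then-transform sequence can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the raw_customers staging table, the customers target table and the sample rows are hypothetical, and an in-memory SQLite database stands in for the repository. The raw rows are loaded untouched, and the cleansing and de-duplication are then expressed as SQL that the database engine itself executes.

```python
import sqlite3

# In-memory SQLite stands in for the target repository (an assumption).
conn = sqlite3.connect(":memory:")

# Extract + Load: raw rows go into a staging table exactly as received.
conn.execute("CREATE TABLE raw_customers (source TEXT, email TEXT, name TEXT)")
raw_rows = [
    ("crm", " Ann@Example.com ", "Ann Lee"),
    ("crm", "bob@example.com", "Bob Ray"),
    ("billing", "ann@example.com", "Ann Lee"),
    ("billing", "cy@example.com", "Cy Poe"),
]
conn.executemany("INSERT INTO raw_customers VALUES (?, ?, ?)", raw_rows)

# Transform: normalization and de-duplication are pushed down to the
# database as SQL, and the result lands in the user-facing table.
conn.execute(
    """
    CREATE TABLE customers AS
    SELECT LOWER(TRIM(email)) AS email,
           MIN(TRIM(name))    AS name
    FROM raw_customers
    GROUP BY LOWER(TRIM(email))
    """
)
conn.commit()

elt_result = conn.execute(
    "SELECT email, name FROM customers ORDER BY email"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_customers").fetchone()[0]
print(elt_result)
print(raw_count)
```

In this sketch the staging table still holds every raw row after the transformation, so the rules can be changed and re-run inside the database without extracting the data again.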
With in-database integration/ELT, organizations can:
- Reduce time-to-market for new applications using a standardized enterprise data model
- Deliver constantly updated reports with real-time reporting
- Control costs through centralized development and reduction of core integration expenses
About the Author
Daniel Teachey is the senior director of marketing for DataFlux.