(Teradata funded the research into appliance benchmarks and this article is a summary of a white paper that can be found here.)
There is no official definition of a data warehouse appliance, but the appliances in circulation today combine database software optimized for data warehousing with their own physical platform, including an operating system, server (CPU and RAM) and integrated mass storage. Current offerings are a mix of proprietary and non-proprietary physical components and databases, some of which have been extensively modified. The better-known appliance vendors are Netezza and DATAllegro.
Though some consider Teradata to be the original appliance vendor, the market perceives it to be the competitive target of the new entrants.
Relational databases designed for only one physical platform are not new. In fact, historically they have been more the rule than the exception. Teradata is now available for Linux and Microsoft Windows, but was wed to a single hardware/OS combination until a few years ago. Other examples of this approach that were successful in their time are Tandem, Digital's Rdb, IBM's DB2 for mainframes and DB2/400 for the AS/400. There is nothing new about a database system wed to a single platform, and no inherent advantage in it, so the appeal of data warehouse appliances must lie elsewhere.
Appliance vendors cite a price differential and point to their use of non-proprietary hardware. In fact, no hardware is non-proprietary. Components assembled in an appliance-specific enclosure and backplane, with interconnect hardware and software, can hardly be considered non-proprietary. Building on an open source database such as PostgreSQL or Ingres could be considered a cost saving, but in some cases these databases have been massively modified and enhanced. The vendors must then bear all of the subsequent development costs themselves and cannot avail themselves of the advances provided by the open source community.
This raises the question: what is the source of the cost savings, claimed to be as much as 50 percent, that the vendors can pass along? There are a number of answers. In some cases these appliances are stripped-down versions of fully functional relational databases for data warehousing. They lack much of the functionality that evolved over 25 years of data warehousing. They are Moore's Law pure plays, substituting less expensive, lower-performing hardware (particularly disk drives) for intellectual capital. Close examination shows they are largely feature-poor, not only in database function, but in essential capabilities like system management, load balancing and mixed workload management.
In other cases they have turned to less expensive hardware components, such as the CPU, with less stringent performance and reliability specifications. But the more dramatic cost savings may come from the use of less expensive (7,200 rpm) disk drives with a lower mean time between failures (MTBF). In essence, they are trading availability for a lower price.
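To make the availability trade-off concrete, here is a rough sketch of how MTBF translates into expected drive replacements per year. It assumes a constant failure rate, so expected failures are simply total drive-hours divided by MTBF. The drive count and both MTBF values are hypothetical figures chosen for illustration; they are not taken from any vendor's specifications or from the white paper.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def expected_annual_failures(drive_count: int, mtbf_hours: float) -> float:
    """Expected drive failures per year under a constant failure rate.

    This is the usual first-order approximation: total drive-hours in a
    year divided by the mean time between failures of one drive.
    """
    return drive_count * HOURS_PER_YEAR / mtbf_hours


# Hypothetical system: 500 drives, compared at two illustrative MTBF ratings.
DRIVES = 500
SCENARIOS = {
    "lower-MTBF drives (assumed 500,000 h)": 500_000,
    "higher-MTBF drives (assumed 1,000,000 h)": 1_000_000,
}

for label, mtbf in SCENARIOS.items():
    failures = expected_annual_failures(DRIVES, mtbf)
    print(f"{label}: about {failures:.2f} expected failures per year")
```

Under these assumed numbers, halving the MTBF doubles the expected number of failures a system of a given size must absorb each year. Whether that affects availability in practice depends on how well the system handles a failed drive, which this sketch does not model.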
Query performance in a contrived benchmark may seem compelling, but performance gains in one area may hide weaknesses in others. The myriad workload possibilities in a true production environment, and the variety of query types presented in parallel, can easily confound a system designed to perform in one mode only. This is why the TPC develops benchmark suites: to measure the performance of a system across a wide range of probable situations. A focus on query performance alone may not expose a glaring lack of advanced database and data warehouse management features. These shortcomings typically show up as production environments that are more difficult and expensive to run, if not as outright failure of the project.
As with any new technology that enters the market, it is useful to separate the wheat from the chaff in the marketing messages. Some of the appliance vendors marry their sparse functionality and abundant hardware to contrived testing scenarios that show their products in the most favorable light. There is nothing wrong with this; any organization in a competitive market would be remiss if it did not do the same. But it does highlight the need for the market to be vigilant.