Advertisment

Making Sense of Data

author-image
PCQ Bureau
New Update

Data warehousing: Old wine?

Advertisment

An ordinary warehouse is where you store goods before distributing them for use. Similarly, a data warehouse is a database where you store your data, before it is put to use. At the operational level, organizations work with transactional data, which can be simple, like rail or airline reservation, or sales. Most often it involves linking different but related business activities of an organization, like receipt of a sales order, placing of a purchase order, change in stock, dispatch of goods, invoicing, and payments. Such data is usually stored in relational databases that are managed using RDBMS (Relational Database Management Systems).

Can they work magic?

From the looks of it data warehousing and mining can do magic for your business. Can they really? The answer is not an unqualified yes. Data mining can bring out patterns that are available in the historical data contained in the warehouse. It cannot predict results of customer behavior, if the basic information is not there. To take the example of beer and baby products, to get an answer from your warehouse, there should be data in it, about the sales of beer when kept next to baby products and otherwise. Without this, the warehouse cannot pull the answer out like a magician pulling a rabbit out of his hat.

RDBMS, however, perform poorly when summarizing a database, something that managers work with to monitor performance and trends in their organizations. Complex customized cross tabulations are worse, making it difficult to build an MIS (Management Information System) or DSS (Decision Support System) over traditional RDBMS. The need to support both operational and MIS systems requires a parallel database that supports quick summarization. That way a resource intensive cross-tabulation does not interfere with any on-going transactions. This parallel database is the data warehouse.

Advertisment

A warehouse facilitates analytical processing and integration of data. Most large organizations have many operational systems running, implemented using different operating systems and DBMS. Fragmentation of organizational data often makes it impossible to carry out any meaningful analysis on it.

Once organizational data is moved into a warehouse it is used exclusively for analysis, and from the perspective of an operational system it immediately becomes out-of-date and unusable.

The multi-dimensional database stores only those parts of transactional data that interest managers–month, product, sale quantity, market, outlet, etc. Other transactional information like customer name and invoice number is not moved in from the operational system. It optimizes storage for quick summarization and retrieval. This often involves redundant storage of the same information. In some implementations, data is summarized as soon as it is loaded and the summations along with the business dimensions are stored in the database itself. This makes retrieval of summary data a breeze but can often lead to dramatic increases in the size of the database. Just a few megabytes of operational data can grow into hundreds of megabytes when the dimensions of interests are many.

Data mining: Where to keep the beer?

Using specialized software tools, data mining helps you discern patterns and leverage them to enhance your business. A popular example is of a chain store, where an analysis of data unearthed a seemingly bizarre pattern. Outlets where beer was stored next to items meant for infants recorded a higher sale of beer than other outlets! The explanation is that in the evenings usually the fathers are required to fetch baby stuff, such as nappies and milk powder. The sight of beer on the nearby racks is usually irresistible. The retail business was thus able to push up sales of beer in all stores, by leveraging on this information!

Advertisment