
Data Integration: Challenges & Solutions

PCQ Bureau

A database was originally defined as 'a large collection of data organized for rapid search and retrieval'. The term applied to information written consistently to a particular set of files and shepherded by a single master program, called the 'database engine'. In those good old days, your enterprise data would be found inside those files. But as businesses grew and the software in use multiplied, the locations and uses of your data changed rapidly as well.


Today, data within your enterprise is present not only in your ERP (and related) systems, but also in word processing files, spreadsheets, e-mail and IM logs. Not all of this data is filed properly or suited for 'rapid search and retrieval' either. Integrating all this data and deriving a positive business benefit from this sea of information therefore seems quite out of the question. Or is it?



Challenges of integration

Some of the challenges would be painfully evident to the bold person who ventures into your info-sea. Still, let us identify a few key problems you face when you start making sense of your data:




- Multiple formats: documents, e-mail, databases

- Distributed nature: servers, employee workstations, archived storage, portable computers, cellphones

- Unstructured storage: not all methods of input guarantee a trustworthy classification and tagging scheme that lets a searcher find the data quickly

- Mixed with 'junk': data useful to a wider audience or to a business decision is often buried inside comparatively unneeded information (like client background information inside routine e-mails)

- Discontinuity and finding relationships: a piece of information may have started as a phone call (that went untranscribed), then moved to e-mail, a few Word documents and finally a CRM application. It is next to impossible to track this trail without a proper logging system in place

Why integrate?



When the odds of building a system that makes sense of the collected information look daunting given all these challenges, why would a CIO choose to go ahead with the project? The answer lies in the business benefits that the enterprise can reap by knowing what it already has. Take any business app that cross-sells to a customer or creates appropriate offers for them. Such applications leverage knowledge already gleaned from past deals, the background of that client and the future potential those trends suggest. This can only come about when your automated system can harness information usefully across disparate systems. That is the difference between an 'information' system and a 'knowledge' system. Only an integrated 'database' can give you knowledge out of information, and that's a benefit worth laboring for.

Solutions



The simplest solution is to implement a knowledge management system. Such a system would integrate data from a variety of sources and provide a mechanism for searching across all of them easily. A piecemeal approach, such as implementing separate CRM, SFA, DMS and workflow systems, would soon turn into more information silos. The best part is that the solution need not be a single-vendor stack, and that may not even be the ideal route. Today, products in the market can frequently interface with one another through plug-ins, connectors and other data exchange technologies to function seamlessly together. Last month, we looked at seamlessly integrating your business applications and letting information flow across them in a meaningful way (see 'Open Standards for Seamless Integration', pg74). The system we propose here is a step forward from there, letting you operate on data that's already there as well.
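To make the idea of searching across several sources concrete, here is a minimal sketch in Python. It assumes every source can be reduced to a plain-text record; the `Record` fields, the `UnifiedIndex` class and the sample data are all hypothetical and exist only for illustration. A real knowledge management product would add connectors, ranking and access control on top of something like this.

```python
from collections import defaultdict
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Record:
    """One item normalised from any source (all field names are hypothetical)."""
    source: str   # e.g. "crm", "email", "document"
    ref: str      # identifier within that source
    text: str     # searchable content


def tokenize(text: str) -> set[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class UnifiedIndex:
    """A toy inverted index: word -> set of records containing it."""

    def __init__(self) -> None:
        self._index: dict[str, set[Record]] = defaultdict(set)

    def add(self, record: Record) -> None:
        for word in tokenize(record.text):
            self._index[word].add(record)

    def search(self, query: str) -> list[Record]:
        """Return records containing every query word, sorted for stable output."""
        words = tokenize(query)
        if not words:
            return []
        matches = set.intersection(*(self._index.get(w, set()) for w in words))
        return sorted(matches, key=lambda r: (r.source, r.ref))


# Illustrative data standing in for three separate systems.
index = UnifiedIndex()
index.add(Record("crm", "acct-17", "Acme Corp renewal due in March"))
index.add(Record("email", "msg-202", "Re: Acme renewal pricing"))
index.add(Record("document", "proposal.doc", "Proposal for Globex expansion"))

hits = index.search("acme renewal")
for hit in hits:
    print(hit.source, hit.ref)
```

The point of the sketch is that one query reaches the CRM entry and the e-mail together, which separate silos would not allow.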


DW and BI



A DW (data warehouse) is the ultimate goal of the desired solution. Such warehouses let you perform a variety of offline processing and analysis on the integrated information. However, one needs to be careful of pitfalls while implementing a DW solution. These include the time and effort spent in cleaning up and loading the data into your warehouse, maintaining the focus of the DW implementation, making sure that the DW stores the data that you need, and the usual needs of security and integrity. The moment you have historical information ready for analysis, everyone will want to query it in a hundred different ways. And they will want you to feed in more kinds of data to generate even more reports. Therefore, months after you consider your DW 'complete', you will find still more work to do.
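The clean-up and loading step mentioned above is where much of the effort goes. The following is a minimal sketch of it, using Python's built-in `sqlite3` module as a stand-in for a warehouse. The `sales` table, its columns and the sample rows are assumptions made for illustration, not part of any particular DW product.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from different systems:
# inconsistent casing, stray whitespace, a duplicate and a missing amount.
raw_rows = [
    {"customer": " Acme Corp ", "amount": "1200.50", "source": "crm"},
    {"customer": "acme corp", "amount": "1200.50", "source": "crm"},
    {"customer": "Globex", "amount": "", "source": "spreadsheet"},
    {"customer": "Initech", "amount": "310", "source": "email"},
]


def clean(row):
    """Return a normalised (customer, amount, source) tuple, or None if unusable."""
    customer = " ".join(row["customer"].split()).title()
    try:
        amount = float(row["amount"])
    except ValueError:
        return None  # reject rows without a usable amount
    return customer, amount, row["source"]


def load(conn, rows):
    """Clean the rows and load them, skipping rejects and duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "customer TEXT, amount REAL, source TEXT, "
        "UNIQUE (customer, amount, source))"
    )
    cleaned = [c for c in map(clean, rows) if c is not None]
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT OR IGNORE INTO sales VALUES (?, ?, ?)", cleaned)
    return len(rows) - len(cleaned)  # number of rejected rows


conn = sqlite3.connect(":memory:")
rejected = load(conn, raw_rows)
loaded = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded, "rejected:", rejected)
```

Even in this toy case, one of four rows is rejected and another is a duplicate, which hints at why cleaning tends to dominate a DW schedule.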

BI (business intelligence) tools feed into larger DSS (decision support systems). As your business evolves, the needs of the DSS morph too, and you would need to constantly evaluate which subsets of the data the queries apply to. Data from the DW may also need to be fed 'back' into smaller data stores (like spreadsheets) so that users can perform different analyses (like pivots) on them. Also, if yours is a business with an online presence, you might want to take a look at web-based DW and BI as well. While the applicable principles remain largely the same, they can be used to redesign and retarget your campaigns for immediate and visible results. For example, analyzing visitor patterns on your customer portal and restructuring the way visitors access data may instantly translate into more satisfied or more disgruntled visitors. Also, compared with non-web data, the value of web-based data is volatile and sensitive to very 'minor' changes in configuration and visitorship.
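Feeding warehouse data 'back' into a spreadsheet for pivot-style analysis, as described above, can be sketched with nothing but the Python standard library. The region/quarter/amount rows below are hypothetical stand-ins for the result of a warehouse query, and the CSV text is what a user might open in a spreadsheet.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows as they might come out of a warehouse query.
rows = [
    ("North", "Q1", 100.0),
    ("North", "Q2", 150.0),
    ("South", "Q1", 80.0),
    ("North", "Q1", 20.0),
]


def pivot(rows):
    """Sum amounts into a {region: {quarter: total}} table."""
    table = defaultdict(lambda: defaultdict(float))
    for region, quarter, amount in rows:
        table[region][quarter] += amount
    return table


def to_csv(table):
    """Render the pivot as CSV text, one row per region, one column per quarter."""
    quarters = sorted({q for cols in table.values() for q in cols})
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["region", *quarters])
    for region in sorted(table):
        # Missing region/quarter combinations are written as 0.0.
        writer.writerow([region, *(table[region].get(q, 0.0) for q in quarters)])
    return out.getvalue()


table = pivot(rows)
csv_text = to_csv(table)
print(csv_text)
```

In practice a spreadsheet's own pivot feature would do the summing; the sketch only shows the shape of the data that flows back out of the warehouse.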

Bottom line



In the end, it is evident that DI (data integration) is a business necessity, a technological possibility and an implementation challenge. The components required for a successful implementation are simple and common to all successful IT implementations: requirement analysis, system study, capacity and resource planning, deployment of appropriate software and continued maintenance. The scope and effects of the solution, however, are far more wide-ranging than those of any other IT implementation.
