February 1, 2010



The typical application development lifecycle has many phases: requirements
gathering, application design, data and storage requirements, application
development, user acceptance testing and the eventual deployment of the
application itself. At some point in the evolution of the application other
tasks like integration and data quality are added to refine the applicability
and usefulness of the deployed application.

Usually, the overall business driver for application development is to
capture and process information needed to meet specific business goals. That is,
the application exists to serve the application domain or business functions it
is designed to address. In determining the requirements, analysts will work with
business owners to model the usage scenarios. This usually involves documenting
the business processes that the system needs to support and defining the
information components that the system needs to process.

Direct Hit!

Applies To: Enterprise & Solution Architects, Data Quality Practitioners
USP: Learn what digital signal processing techs can offer to IT Management
Primary Link: www.mastersoftresearch.com
Search Engine Keywords: Data Quality, Data Modeling

The art of modelling an information system is to represent the scope of data
definition and collection against the things in the real world that need to be
captured.  This dictates the scope of the data collection as well as the
processing and subsequent maintenance of the application.  It is important that
information being modelled is represented accurately and that it does not
over-represent what is required by the application.  The information model must
also be as succinct as possible for reasons of process efficiency.

Digitising the real world
Ultimately, the combined information and processing system will mirror the
real world but with a built-in bias towards the specific world-view determined
by the parameters of the application.

This built-in bias towards a specific view of the world can be likened to
photography. Photos, like information systems, capture and process the views of
the real world that the photographer is interested in.   The capturing of
essential characteristics of the real world is not unlike taking snapshots of
that real world; and viewing a series of snapshots is like the reconstruction of
reality. If we agree on the snapshot analogy above, then the representation of
information and processing requirements as an application is rather like
flipping through a series of snapshots like an animated cartoon [1].

Taking the analogy further, the nature of the application also gives an
indication of the type of snapshots we are taking. In the case of most
information systems, the snapshots are digital rather than analog in form. This
is not just because most IT applications are digital, but because current
knowledge representation technology limits us in how information can be
digitised as knowledge.

Even though Google Maps is a rich multi-dimensional view, it
is by necessity a context-specific view and therefore an incomplete
representation of the real world.

Even without the limitations of knowledge representation, most applications
are created with a number of business contexts in mind. Besides the digital
slicing of the real world through time, the slicing also operates within a given
context. In Google Maps for example, the real world is represented as a location
map, but there is also a choice of traffic, satellite and terrain, and all four
can be combined to obtain a rich multi-dimensional view of the location.

By re-colouring the images (blue-green for Chandra,
yellow for Hubble, and so on), the composite image has enabled humans to
visualise the physical world.

Reconstruction of the real world
The world of photography offers some insight into the issues involved. 
There are ongoing debates in the world of photography around film vs. digital,
based on the abilities and limitations of each to capture and display the real
world.  Ken Rockwell argues that because film is analog, it can capture much
more information about the real world than digital alternatives [2]. The
argument is that once an image is digitised, the extra information captured by
film is lost, and there is no way to reverse engineer it back to an analog form.
In contrast, analog movies from the past can be digitised with modern scanning
technology into vibrant and attractive modern digital form: you can go forward
from analog to digital, but you can’t go back.

With this in mind, let’s look at the opportunities digital technology has to
offer the world of data management:

1. Reconstructing the real world in interesting ways. For example,
beautiful images of the Universe have been created by combining images from the
Chandra, Hubble and Spitzer telescopes [3], where each telescope is responsible for
capturing a certain part of the visible and invisible (X-ray, Infrared)
spectrum.

2. The ability to digitally retouch or enhance the final image. For example,
most digital photographers perform further processing in their digital darkrooms
before the final photo is released. A variety of software filters are available
for manipulating layers, sharpening, re-colouring or generally re-emphasising
the photo in certain ways.

In information management, we artificially enhance the real world through
pre-processing the data that represents real-world artefacts.

3. Computer Generated Imagery (CGI) allows us to create artificial images and
combine them with other digitised images in the digital darkroom. The result is
almost indistinguishable from reality and offers interesting possibilities not
just in reconstituting the real world but also the ability to enrich and enhance
it.

The realism of combined real action and CGI in movies should not be taken
lightly from an information quality point of view.  When we combine digitised
data about the real world with generated or derived data, we have to remember we
are no longer looking at just the digitised form of our world, but also what has
been created out of thin air and with all the assumptions that come with it.
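
One way to keep those assumptions visible is to label every generated value at the moment it is created. The following is a minimal sketch in Python, not a prescribed method: the names DataPoint and fill_gaps are hypothetical, and filling a gap with the mean of the observed values is an assumption chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataPoint:
    """A value plus a record of whether it was observed or generated."""
    value: float
    derived: bool = False             # True when the value was not captured from the real world
    assumption: Optional[str] = None  # the assumption used to generate it, if any


def fill_gaps(observed: List[Optional[float]]) -> List[DataPoint]:
    """Fill missing observations with the mean of the known ones (illustrative rule only)."""
    known = [v for v in observed if v is not None]
    if not known:
        raise ValueError("cannot derive values without at least one observation")
    mean = sum(known) / len(known)
    note = "gap filled with the mean of the observed values"
    return [
        DataPoint(v) if v is not None else DataPoint(mean, derived=True, assumption=note)
        for v in observed
    ]


# Hypothetical series with one missing observation.
series = fill_gaps([10.0, None, 14.0])

# Anything generated rather than captured can be listed, and questioned, later.
generated = [p for p in series if p.derived]
```

Because the flag and the assumption travel with the value, a later reader of the data can still tell the digitised parts of the picture from the generated ones.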

The question we need to ask is: how well do our assumptions marry with the
purpose of digitising the real world in the first place?  This is pretty simple
in CGI: as long as the virtual reality looks and sounds like the real world, the
makers can focus on the fantasy they are creating.

Even when we watch a CGI movie, some parts of the movie seem more realistic
than other parts. In fact, in the production notes for the movie “Final Fantasy:
The Spirits Within” [4], the makers of the movie highlighted this and then
proceeded to explain the phenomenon by saying that the rendering software
used in making the movie was significantly improved by the programmers during
the making of the film. So their ideas about the virtual worlds they built
were influenced by changes in the technologies they were using.

The analogy here is with defining or filtering data in
different ways and then recombining them to create new, more meaningful
information.

Information management also uses technology to digitise the real world. For
example, a database is a method of synthesising the real world through the use of
data models, metadata, aggregations and calculations, and so on. The goal of
CGI is to create ‘virtual’ worlds: mythical places that resemble the real world
but do not actually represent it. In contrast, information management digitises
the real world to represent it for a specific purpose, such as operating bank
accounts, controlling air traffic, managing forests, and so on.

As knowledge management technologies improve our ability to synthesise the
real world, our assumptions become more important in the use of such
techniques. If we don’t use valid assumptions, or we can’t clearly articulate
them, we compromise our ability to check that our digital world has any
validity.

If we follow Ken Rockwell’s argument that we should continue to shoot slide
film because it has the potential to record the greatest amount of real-world
information, then the challenge for data management practitioners is not how to
improve the quality of data during processing and integration, but how to
delay the interpretation of the data until it is necessary to provide the data
in its interpreted form. In other words, how should we address the crucial
issue of separating the process of collecting the data from the process of
interpreting it? It is therefore critically important to be fully aware of the
digital slices of the real world we create, the assumptions and contexts we use
to represent and process our data, and, most importantly, the information we
remove from the real world when we are digitising it.
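
The idea of delaying interpretation can be illustrated with a small sketch in Python. It is only one possible reading of the argument, under stated assumptions: RawRecord, day_first, month_first and interpret are hypothetical names, and the ambiguous date string is invented for the example. The captured text is stored untouched, and each interpretation is a separate, named function applied only when a view of the data is needed.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Callable, List


@dataclass(frozen=True)
class RawRecord:
    """A captured value stored exactly as received, with no interpretation applied."""
    source: str
    text: str


def day_first(record: RawRecord) -> date:
    """Interpretation A: assume the text is written as day/month/year."""
    return datetime.strptime(record.text, "%d/%m/%Y").date()


def month_first(record: RawRecord) -> date:
    """Interpretation B: assume the text is written as month/day/year."""
    return datetime.strptime(record.text, "%m/%d/%Y").date()


def interpret(records: List[RawRecord], reading: Callable[[RawRecord], date]) -> List[date]:
    """Apply one named interpretation at read time; the stored records are not changed."""
    return [reading(r) for r in records]


# Hypothetical capture: the same text supports two different readings.
captured = [RawRecord(source="web-form", text="01/02/2010")]

as_day_first = interpret(captured, day_first)      # read as 1 February 2010
as_month_first = interpret(captured, month_first)  # read as 2 January 2010
```

Had the date been converted at capture time under one of these assumptions, the other reading could no longer be recovered, which is the digital equivalent of not being able to go back to analog.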

While there are data quality issues around ‘losing touch with reality’,
additional modelling of the real world through a virtual world is acceptable and
useful. However, if we are not fully aware of all the assumptions, we run the
real risk of allowing our digital world to be more ‘virtual’ than real.

Patrick has been in the IT industry in a variety of technical roles for over
twenty years. He enjoys all things technical in his spare time, whether it is
photography, astronomy or technical analysis. He can be contacted at
plam@mastersoftresearch.com

1. More sophisticated applications may also have the ability to
rewind the movie back to a certain place for replay.

2. Ken Rockwell, Film: The Real Raw:
http://www.kenrockwell.com/tech/real-raw.htm

3. NASA, Great Observatories May Unravel 400 Year Old Supernova Mystery:
http://hubblesite.org/newscenter/archive/releases/2004/29/video/e/

4. Final Fantasy: The Spirits Within (2001):
http://www.imdb.com/title/tt0173840/
