Information overdrive

Ensuring databases are accurate is fundamental for any organisation to be able to work effectively, writes Kim Thomas

Kim Thomas, Computing 15 Mar 2007

Information is the lifeblood of the modern organisation. Critical decisions depend on data from business intelligence systems about customers’ buying habits, product sales or the effectiveness of marketing campaigns.

The quality of that data is all too often taken for granted, yet according to research by analyst firm Gartner in 2004, 25 per cent of corporate data is inaccurate. Imagine what that does to the organisation’s ability to make good decisions. As the old saying goes: ‘Garbage in, garbage out’.

Ian Charlesworth, a senior analyst with Ovum, says it is only recently that organisations have begun to take data quality seriously. ‘Three or four years ago, everyone saw data quality as a cost to the business and very much the responsibility of IT to put it right. The tools and techniques that people would use were very much down to the capability of the individual developer and administrator,’ he says.

Now, while ownership is still with IT rather than the business, at least firms are beginning to invest in data cleansing tools to address the problem.

One reason for the change, he argues, is that data quality vendors are advertising their services more heavily. Another is that, as businesses seek to rationalise and standardise their business intelligence systems, data quality issues become more visible. When data is combined as the result of a merger or acquisition, the subject of data quality becomes unavoidable.

Compliance is another driver, with the Sarbanes-Oxley legislation in particular forcing businesses to find ways of maintaining the quality of their financial data. For the public sector, the drive to share data between agencies such as the police and social services has highlighted the importance of having good quality data to share.

But why is so much data inaccurate? ‘The main factor in causing the quality of data to degrade is people,’ says Ted Friedman, vice president of research at Gartner.

‘There is a cultural issue in that people in the business do not understand the knock-on impact of what they do when they are not maintaining high-quality data.’

In a customer relationship management (CRM) system, for example, it is easy for data to be entered incorrectly at source, with names misspelled or date of birth fields left blank. Even accurate data can degrade very quickly, as people move house or change their names after marriage. Most systems hold a few duplicate records, and the data held on a single customer can also vary between systems such as the CRM and the billing system.

‘Data quality can be an indication of how integrated a company’s data processes are,’ says Umesh Hari, global data management and architecture lead in Accenture Information Management Services.

Friedman sums up the main problems caused by poor quality data as ‘productivity loss, customer churn, opportunity cost, lost revenue opportunities and additional or unnecessary waste’.

And it is not just a problem with customer information, as Charlesworth points out: ‘Order entry errors have huge downstream cost implications such as customers being sent the wrong goods because of the order entry.’

It is almost impossible to put a precise figure on the cost of not keeping data in good order, but a 2002 study by the Data Warehousing Institute estimated the loss to US businesses at $611bn (£316bn) a year.

The not-for-profit advisory service Business Link for London (BLL) experienced the full impact of poor data quality when it was formed in 2001, as the result of a merger of nine small individual Business Links. Nine customer databases were amalgamated, says Mike Pratt, now the organisation’s data integrity manager.

‘There were no geographical boundaries on which Business Link could contact which firm in which area, so there was a huge amount of crossover. And all they did was just bring all the records together, give them a unique ID and dump them into a database,’ he says.

By the time Pratt joined the organisation in 2003 to oversee business intelligence, there were already problems. Customer satisfaction was low because people were receiving four or five copies of the same letter. BLL employees paying visits to customers found they were turning up at the wrong address.

Data cleansing became his first priority. BLL purchased a set of data quality tools from FirstLogic – now part of Business Objects – and found that a shocking 25 per cent of its records were duplicates. It took several months to flag every duplicate among its hundreds of thousands of records.

The duplicates often held slightly different data, including information about work being done with the client. Rather than delete or combine all the duplicates, BLL developed a system of master and subordinate records. If an organisation had three records, one would be the master, which would appear if someone searched for a particular company, and the other two would be subordinates.
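The master and subordinate arrangement can be illustrated with a minimal sketch. It is not BLL's actual system: the Record fields, the match_key rule (same company name and postcode, ignoring case and spaces) and the choice of the lowest ID as master are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    record_id: int
    company: str
    postcode: str
    notes: List[str] = field(default_factory=list)  # work done with the client
    master_id: Optional[int] = None  # None means this record is a master


def squash(text):
    """Lower-case the text and remove all whitespace."""
    return "".join(text.lower().split())


def match_key(record):
    # Hypothetical matching rule: same company name and postcode.
    return (squash(record.company), squash(record.postcode))


def assign_masters(records):
    """Group likely duplicates and point each subordinate at its master."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    for group in groups.values():
        # Assumption: the record with the lowest ID becomes the master.
        master = min(group, key=lambda r: r.record_id)
        for record in group:
            record.master_id = None if record is master else master.record_id
    return records


def search(records, company):
    """Return only master records, so a search shows one entry per company."""
    wanted = squash(company)
    return [r for r in records
            if r.master_id is None and squash(r.company) == wanted]
```

Nothing is deleted in this sketch: subordinate records keep their own notes and stay linked to the master, which is the point of the approach described above.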

The Business Objects software also checks new data coming into the system against 430,000 existing records, only adding genuinely new records. A telemarketing company phones customers quarterly to check the accuracy of the data held on them. This system has worked smoothly, says Pratt, and BLL has saved £100,000 in its marketing expenditure in three years. The proportion of duplicates is down to 1.38 per cent.
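The idea of checking incoming data against existing records before loading it can be sketched as follows. This is not how the Business Objects software works internally; the dictionary fields and the normalise rule (exact match on name and postcode after removing case and spaces) are assumptions, and commercial tools typically use far more tolerant matching.

```python
def normalise(name, postcode):
    """Build a comparison key that ignores case and whitespace."""
    return ("".join(name.lower().split()), "".join(postcode.lower().split()))


def load_new_records(existing, incoming):
    """Append only records whose key is not already in the database.

    Returns the list of records that were actually added.
    """
    seen = {normalise(r["company"], r["postcode"]) for r in existing}
    added = []
    for record in incoming:
        key = normalise(record["company"], record["postcode"])
        if key not in seen:
            seen.add(key)  # also guards against duplicates within the batch
            existing.append(record)
            added.append(record)
    return added
```

Holding the keys in a set keeps each lookup fast even against hundreds of thousands of existing records.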

Data cleansing will identify duplicates, check customer data against publicly held lists and identify missing data or clearly inaccurate data, such as phone numbers held in the wrong format. But to maintain data quality permanently, says Friedman, requires a ‘three-legged stool’ approach of people, technology and processes. Increasingly, organisations are assigning senior managers to data stewardship roles as part of their job responsibilities – a good way, he says, of making the issue visible.
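The kind of checks a cleansing pass makes for missing or wrongly formatted data might look something like the sketch below. The field names and the phone rule (a leading 0 followed by nine or ten digits once spaces are removed) are assumptions for illustration, not rules taken from any particular product.

```python
import re

# Hypothetical rules: which fields must be filled in, and what a phone
# number should look like after spaces are stripped.
REQUIRED_FIELDS = ("name", "date_of_birth", "phone")
PHONE_PATTERN = re.compile(r"0\d{9,10}")


def find_problems(record):
    """Return a list of data quality problems found in one record."""
    problems = []
    for field_name in REQUIRED_FIELDS:
        if not str(record.get(field_name) or "").strip():
            problems.append(f"missing {field_name}")
    phone = str(record.get("phone") or "").replace(" ", "")
    if phone and not PHONE_PATTERN.fullmatch(phone):
        problems.append("phone in wrong format")
    return problems
```

A check like this only flags problems. Deciding who corrects them, and how, is where the people and process legs of the stool come in.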

One of the greatest challenges, according to Andrew Stevenson, head of SI-BAS-Centres of Excellence for Atos Origin, lies in maintaining master data across departments and applications.

‘Following a merger and acquisition there are likely to be product, customer and supplier codes contained in multiple enterprise resource planning applications, with different departments disagreeing on which one is the master,’ he says.

Once the data cleansing tools have identified the gaps and inaccuracies, the firm needs to decide how the master data in one application should be mapped to the master data in another.
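One common way to record such a decision is a cross-reference table that maps each application's local code to a single agreed master code. The sketch below assumes this approach; the system names, codes and function names are invented for illustration.

```python
# Hypothetical cross-reference: (application, local code) -> master code.
CODE_MAP = {
    ("erp_a", "C-1001"): "CUST-000017",
    ("erp_b", "88231"): "CUST-000017",  # same customer, different local code
    ("erp_a", "C-1002"): "CUST-000018",
}


def to_master(system, local_code, code_map):
    """Look up the agreed master code, or None if no mapping exists yet."""
    return code_map.get((system, local_code))


def unmapped(records, code_map):
    """List the (system, local code) pairs that still need a decision."""
    return [pair for pair in records if pair not in code_map]
```

The codes left over by unmapped are the ones on which departments still have to agree.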

The result, for many organisations, is a more streamlined business: possession of good quality data means less waste, less inefficiency, and the ability to target customers effectively.

Mike Pratt, who was originally taken on to look after Business Link for London’s business intelligence, now has a full-time job taking care of the data.

‘Data integrity was seen as an add-on, whereas now it is seen as an integral part of the business in its own right,’ he says.

What the experts say about data management

Before we had the process in place, data was coming to us from external sources and we would just load it into the system. We now have 420,000 companies on our database so the majority of data sources coming into us now are not new, they are existing records, and that is the thing we would never know without this software.
Mike Pratt, data integrity manager, Business Link for London

As business processes become increasingly global, information may be used across many business units and functions. This means that while information might work well within a transaction system or business unit, it may not be compatible or consistent with other business units or product lines in the organisation. This makes it difficult for enterprise-level decision-making.
Umesh Hari, global data management and architecture lead, Accenture Information Management Services

If data is inaccurate we are not able to protect the public. If you go to a domestic violence incident, you need to know if it is the first time or the second time, so you need a picture of what you are going to deal with because you will deal with it differently depending on the information you have.
Graham Dawson, head of information services, Humberside Police

Many tools exist to help ensure integrated data is accurate and reliable. While some of these tools are excellent in allowing end users to build business rules around information quality, none of them can substitute for having good business practices around information quality.
Andrew Stevenson, head of SI-BAS-Centres of Excellence, Atos Origin

The best thing to do is try to get the right sort of data in at source, so it is captured in the right way, and good intelligence can help that. The next thing is to have business owners who understand how the data should be used.
Julie Henry, managing consultant, PA Consulting

Do not be afraid to invest in data cleansing. There is a tendency to forget the information, and it is just one line on the Gantt chart. A significant proportion of your project should be spent on information cleansing, and the successful CRM projects always have a lot of time and effort put into information cleansing.
Peter Simpson, IT relationship manager, Eversheds

A common mistake organisations make is to say: ‘We will try and rectify our information quality problems in the data warehouse.’ My argument is: that is too late – it is like shutting the gate after the horse has bolted. It is best practice to capture these things at source, or as near to source as possible.
Ian Charlesworth, senior research analyst, Ovum

People are beginning to understand they cannot get their hands around this problem because of the sheer volumes of information, and the fact that things are constantly changing. You have to be focused on it on an ongoing basis. That means introducing controls and monitoring points within your data infrastructure.
Ted Friedman, VP Research, Gartner

www.whatpc.co.uk/2185455
This article was printed from the WhatPC? web site
© Incisive Media Ltd. 2008
Incisive Media Limited, Haymarket House, 28-29 Haymarket, London SW1Y 4RX, is a company registered in the United Kingdom with company registration number 04038503