DataFlux - The Leader in Data Quality and Data Integration

Application Designers Need to Bake Data Quality In

Dylan Jones

July 30, 2010

Take a look at the vast majority of column inches, conference presentations and blog posts devoted to data quality and you’ll find an interesting trend – they’re mostly focused on managing data quality after the systems, business processes and data have been created.

As an industry we tend to ignore the most critical aspect of data quality: designing quality measures into applications, so that it is far more difficult for defects to arise in the first place.

A classic example of this is found in applications that manage customer data. I have seen many of these applications ship with poor search utilities, which results in a high rate of duplicated customer records. Even basic fuzzy matching is often omitted, so a user checking whether a contact already has an account can easily miss the existing record and create a new one.
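To give a sense of how little is needed for basic fuzzy matching at the point of entry, here is a minimal sketch using Python's standard-library difflib. The function names, the 0.85 threshold and the sample names are assumptions made for illustration, not taken from any particular application or vendor tool, and a real customer search would usually compare more than the name.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 for two names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def find_possible_duplicates(candidate, existing_names, threshold=0.85):
    """Return (name, score) pairs at or above the threshold, best match first.

    The threshold is an illustrative assumption and would need tuning
    against real customer data.
    """
    scored = [(name, similarity(candidate, name)) for name in existing_names]
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    accounts = ["John Smith", "Jane Doe", "Acme Ltd"]
    # Warn the user before a new account is created for "Jon Smith".
    for name, score in find_possible_duplicates("Jon Smith", accounts):
        print(f"Possible existing account: {name} (score {score:.2f})")
```

Returning candidates with a score, rather than silently blocking the new record, leaves the final decision with the user, who often knows whether two similar names really are the same customer.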

We often see data quality technology being adopted downstream of these applications to resolve these problems, but I think it is now time for application designers to plan with data quality in mind, either by building these types of functions into the original design or by creating easier gateways for external tools to improve quality in real time. The data quality vendors are ready; my hunch is the application designers are still some way off.

Anyone who has ever undertaken a data quality assessment on a large volume of corporate data will know just how much low-hanging fruit is available. Many of the most common data quality issues stem from quite minor data defects such as non-conforming patterns, leading or trailing spaces, hidden nulls, values outside of permitted ranges and other basic outliers. The application logic to spot these defects before they are entered into the system, or at least post a warning to the user, is not especially challenging, and the overall quality of data would increase markedly.
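As a rough illustration of how modest that logic can be, the sketch below checks a single record for the defects listed above and returns warnings instead of rejecting the input. The field names, the list of hidden-null markers, the phone pattern and the age range are all hypothetical; real rules would come from the business that owns the data.

```python
import re

# Illustrative assumptions only; real rules depend on the data in question.
HIDDEN_NULLS = {"", "null", "none", "n/a", "na", "-", "unknown"}
PHONE_PATTERN = re.compile(r"^\+?\d{7,15}$")
AGE_RANGE = (0, 120)


def is_hidden_null(value):
    """True for strings that look populated but carry no information."""
    return value.strip().lower() in HIDDEN_NULLS


def check_record(record):
    """Return a list of warnings for one record (a dict of field -> value)."""
    warnings = []

    # Checks that apply to every text field.
    for field, value in record.items():
        if not isinstance(value, str):
            continue
        if value != value.strip():
            warnings.append(f"{field}: leading or trailing spaces")
        if is_hidden_null(value):
            warnings.append(f"{field}: hidden null")

    # Pattern check on a hypothetical 'phone' field.
    phone = record.get("phone")
    if isinstance(phone, str) and not is_hidden_null(phone):
        if not PHONE_PATTERN.match(phone.strip()):
            warnings.append("phone: non-conforming pattern")

    # Range check on a hypothetical 'age' field.
    age = record.get("age")
    if isinstance(age, int) and not AGE_RANGE[0] <= age <= AGE_RANGE[1]:
        warnings.append("age: outside permitted range")

    return warnings


if __name__ == "__main__":
    entry = {"name": " Ann Lee ", "phone": "12-34", "email": "N/A", "age": 212}
    for warning in check_record(entry):
        print(warning)
```

The same function could be called from a data entry form to warn the user before saving, or run over existing records as part of an assessment.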

With the rise of software as a service and cloud-based data quality services, data quality functions have never been more accessible. It would seem an opportune time for application designers to start thinking far more about how they can bake defect prevention into their application designs.

The commercial incentives for this are obvious. As more organisations wake up to the costs and impacts of poor data quality they will increasingly look to solutions that help them resolve data quality further upstream. Application vendors that can demonstrate a greater awareness and capability surrounding data quality will be able to create a clear market differentiator.

But what do you think? Are your business applications designed with data quality in mind or do they focus on function at the expense of quality? What other ways can application designers “bake data quality in” their application designs?


  1. #1 by dc at July 30th, 2010

    reminds me of this article:

    You Build it, You Break It, You Fix It: Why Applications Must Be Responsible for Data Quality

    Evan Levy
    Information Management Blogs, December 1, 2009

    http://www.information-management.com/blogs/applications_data_quality-10016638-1.html

  2. #2 by Dylan Jones at July 30th, 2010

    Yep, great article by Evan as ever, excellent comments too, well spotted, thanks for continuing the debate.

  3. #3 by Beth Breidenbach at July 30th, 2010

    Dylan, this post has the potential to be a game-changer…excellent!

    I’m reminded of the same argument being made for application security’s place in the development lifecycle. Just as security really should be “baked in” from the get-go, so should data quality.

    Michael Howard’s book ‘Writing Secure Code’ is a classic in the (security) development arena — we need an equivalent for data quality…….
