
Česká pojišťovna Uses DataFlux to Improve the Quality of its Data

A major insurance company in the Czech Republic deploys DataFlux technology to cleanse a data set, resulting in a 90-percent success rate.


 

Quick Facts

  • Česká pojišťovna is the largest and oldest all-purpose insurance company in the Czech Republic with origins dating back to 1827.
  • DataFlux technology was deployed to cleanse a data set with more than 20 million records.
  • The project achieved a 90-percent success rate.
 

The Business

Česká pojišťovna (ČP) is an all-purpose insurance company providing both individual life and non-life insurance, along with insurance for small, medium and large clients in industrial and business segments. ČP, the leader in the Czech insurance market, belongs to Generali PPF Holding, which holds 30.9 percent of the market by volume. ČP is the largest insurance company in the country, with more than nine million active policies.

The Challenge

Thanks to its long history, ČP has an extensive collection of client data, gathered over many years and across many systems (going back to punch cards). One of ČP’s oldest operational systems contains life insurance data for almost ten million policies and more than ten million claims. Some of this data suffered from inconsistencies, duplications and other data quality problems, such as:

  • First names and surnames with missing diacritical marks
  • National ID numbers with an invalid suffix – usually four zeros
  • National ID numbers that didn’t match the client’s gender
  • Name, surname and title(s) that were erroneously typed in a single field, instead of their respective fields
  • Obsolete postal codes and addresses with missing parts
  • Heavily abbreviated names, surnames and addresses
  • Miscellaneous remarks not stored in designated fields
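
Two of the problems above, the placeholder suffix and the mismatch between national ID and gender, can be expressed as simple rules. The sketch below is only an illustration and not the DataFlux rule set. It assumes the commonly documented format of the Czech national ID number (YYMMDD plus a three- or four-digit suffix, with 50 added to the month for women and, for numbers issued since 2004, possibly a further 20). The function names and the "M"/"F" gender codes are hypothetical.

```python
import re
from typing import Optional


def _digits(national_id: str) -> str:
    """Keep only the digits, dropping the optional slash or spaces."""
    return re.sub(r"\D", "", national_id)


def gender_from_national_id(national_id: str) -> Optional[str]:
    """Return 'F' or 'M' as implied by the month part, or None if unparseable."""
    digits = _digits(national_id)
    if len(digits) not in (9, 10):
        return None
    month = int(digits[2:4])
    # Assumed convention: 01-12 (or 21-32) for men, 51-62 (or 71-82) for women.
    if 1 <= month <= 12 or 21 <= month <= 32:
        return "M"
    if 51 <= month <= 62 or 71 <= month <= 82:
        return "F"
    return None


def has_placeholder_suffix(national_id: str) -> bool:
    """Flag a ten-digit ID whose four-digit suffix is all zeros."""
    digits = _digits(national_id)
    return len(digits) == 10 and digits[6:] == "0000"


def gender_mismatch(national_id: str, recorded_gender: str) -> bool:
    """True when the ID implies a gender different from the recorded one."""
    implied = gender_from_national_id(national_id)
    return implied is not None and implied != recorded_gender
```

A record flagged by either check would still need review or correction against another source; the rules only locate suspicious values.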

The poor data quality had an adverse effect on other systems within ČP. It was difficult to effectively cleanse and de-duplicate data after it had been transferred into a central client database. Also, information on policies didn’t always match information on the corresponding claims. Incorrect information on age or gender could have led to erroneous insurance premium calculations. Moreover, the data inconsistency made it virtually impossible to move data from the existing operational system into a new, modern system.

The Solution

ČP selected the DataFlux solution as part of a larger SAS technology offering to improve the quality of its enterprise data. The project consisted of data cleansing and de-duplication of ČP’s main client database. To prevent similar data quality problems in the future, ČP will establish new, custom business rules.
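
To illustrate why missing diacritical marks complicate de-duplication, the following minimal sketch builds a diacritic-insensitive matching key and groups records that share it. It is an assumption-laden illustration, not a description of how DataFlux matches records: the field names `first_name` and `surname`, the functions `match_key` and `group_candidates`, and the sample names in the usage note are all hypothetical.

```python
import unicodedata
from collections import defaultdict


def _fold(text: str) -> str:
    """Strip diacritical marks, ignore case, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def match_key(first_name: str, surname: str) -> str:
    """Build a key that is identical for 'Dvořák' and 'DVORAK'."""
    return f"{_fold(surname)}|{_fold(first_name)}"


def group_candidates(records):
    """Return groups of records that share a key (possible duplicates)."""
    groups = defaultdict(list)
    for record in records:
        key = match_key(record["first_name"], record["surname"])
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]
```

With this key, a record for "Jiří Dvořák" and one typed as "JIRI Dvorak" land in the same candidate group. A shared name alone does not prove that two records describe the same client, so a real process would compare further attributes, such as the national ID number or address, before merging.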

The Results

Thanks to DataFlux, ČP cleansed data in its oldest operational system over a four-month period. The data quality initiative achieved better than a 90-percent success rate (defined as the ratio of cleansed or verified records to the total number of records). The database held more than 20 million records in two datasets: policies and claims. Because the data was historical and static, the main focus was not on the performance of the data quality tool, but on the quality of the resulting information.
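
The success-rate definition given above is a plain ratio. As a minimal sketch, with a hypothetical function name and no real project counts, it could be computed as follows:

```python
def success_rate(cleansed_or_verified: int, total: int) -> float:
    """Ratio of cleansed or verified records to the total number of records."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= cleansed_or_verified <= total:
        raise ValueError("cleansed_or_verified must be between 0 and total")
    return cleansed_or_verified / total
```

For example, 9 cleansed or verified records out of 10 would give a rate of 0.9, that is, 90 percent.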

ČP is pleased with the outcome of the project. “The collaboration with SAS was excellent. The project ran smoothly and the results are impressive,” said Štěpán Cábelka, head of data quality, Česká pojišťovna. “Data – the most valuable asset of our company – has been improved. The project was an important milestone on the road to consistent, error-free and reliable information.”