Data Profiling
Begin your data improvement efforts by knowing where to begin.
The first step to building better data
Data profiling, sometimes called data discovery or data quality analysis, is the essential first step in any data improvement program. Data profiling provides a wealth of information about the integrity of your data and illuminates potential problems you may encounter along the way. An effective data profiling approach allows you to structure a data quality solution to address the specific nature of your organization's data quality issues.
DataFlux Methodology Stage 1 of 5 – Data Profiling

Data profiling allows you to gain a clear perspective on the current integrity of your data, helping you:
- Discover the quality, characteristics and potential problems of information before beginning data-driven projects
- Drastically reduce the time and resources required to find problematic data
- Give business analysts and data stewards more control over the maintenance and management of enterprise data
- Catalog and analyze metadata and discover metadata relationships
Data profiling is completely integrated into the DataFlux data quality integration solution. As you uncover non-standard or duplicate data, you can immediately develop business rules to correct issues, shortening the time between data discovery and data correction.
Diagnose Your Data at the Outset
DataFlux data profiling solutions automatically identify data quality issues in a variety of ways, including statistical analysis, fuzzy matching to discover duplicate content, as well as technology to validate business rules and formats. The components that make up your data profiling project will depend upon the particular needs of your organization. DataFlux has all of the following capabilities integrated into dfPower Studio:
Metadata analysis
Gain a clear view of the information contained in your enterprise data resources
- Organize your data logically across all of your data sources
- Accurately group related data
- Exclude irrelevant data
Outlier detection
Detect data that falls outside normal limits
- Gain insight into data values
- Identify data values that may be considered incorrect
- Drill down to the data to make a more in-depth determination about the data
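As a rough illustration of what detecting values outside normal limits can look like, the sketch below flags values with the interquartile range (IQR) rule. This is a generic technique chosen for the example, not a description of how DataFlux detects outliers. The function name `find_outliers`, the multiplier `k=1.5`, and the sample order amounts are all assumptions.

```python
import statistics

def find_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical order amounts; 9800 looks like a data-entry error.
order_amounts = [120, 135, 128, 142, 131, 9800, 125, 138]
print(find_outliers(order_amounts))  # prints [9800]
```

A flagged value is only a candidate for review. Drilling down to the underlying record is still needed to decide whether it is actually wrong.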
Data validation
Ensure that the data in your tables matches its description
- Catalog metadata across an organization
- Group similar types of data into projects
- Understand relationships across data sources
- Facilitate data consolidation and master data management projects
Pattern analysis
Validate that your data follows a standardized pattern
- Analyze underlying data
- Build correction and validation rules
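One common way to approach pattern analysis is to reduce each value to a character pattern and count how often each pattern occurs. The sketch below assumes a simple convention (letters become `A`, digits become `9`, everything else is kept) and a hypothetical phone-number column. It is not DataFlux's pattern syntax.

```python
from collections import Counter

def to_pattern(value):
    """Map letters to 'A', digits to '9', and keep other characters as-is."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)

def pattern_frequencies(values):
    """Count how often each character pattern occurs in a column."""
    return Counter(to_pattern(v) for v in values)

# Hypothetical phone-number column with mixed formats.
phones = ["919-555-0101", "919-555-0102", "(919) 555-0103", "9195550104"]
for pattern, count in pattern_frequencies(phones).most_common():
    print(pattern, count)
```

The dominant pattern suggests the standard format. The rarer patterns are the values a correction or validation rule would need to handle.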
Relationship discovery
Uncover relationships across tables and databases
- Manage data and the sources and relationships of data across different applications
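A simple heuristic for spotting a possible relationship between two columns is to measure how many distinct values in one column also appear in the other. The sketch below is only an illustration of that idea. The function name `containment` and the sample columns are hypothetical, and real relationship discovery tools typically consider much more than value overlap.

```python
def containment(child_values, parent_values):
    """Fraction of distinct child values that also appear in the parent column."""
    child = set(child_values)
    if not child:
        return 0.0
    return len(child & set(parent_values)) / len(child)

# Hypothetical columns from three tables.
customers_id = [1, 2, 3, 4]
orders_customer_id = [1, 1, 3, 4, 4]
payments_ref = [7, 8, 1]

# A score of 1.0 means every child value exists in the parent column,
# which makes the pair a candidate foreign-key relationship.
print(containment(orders_customer_id, customers_id))
print(containment(payments_ref, customers_id))
```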
Statistical analysis
Reveal trends and commonalities in corporate information
- Uncover minimum and maximum values to determine outliers
- Examine numerical trends via mean, median, mode and standard deviation
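The statistics named above can be computed for a single numeric column with Python's standard library. This is a minimal sketch. The function name `profile_numeric` and the sample `ages` column are assumptions for illustration only.

```python
import statistics

def profile_numeric(values):
    """Summarize a numeric column with basic descriptive statistics."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Hypothetical age column.
ages = [34, 29, 41, 29, 38, 52, 29, 45]
summary = profile_numeric(ages)
for name, value in summary.items():
    print(name, value)
```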
Business rule validation
Guarantee your data meets organizational standards for data quality and business processes
- Build, store and validate data against your organization's unique business rules
- Check data against pre-set domains and ranges
- Compare data using specific formulas
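The three kinds of checks listed above (domains, ranges, and formulas) can be sketched as named rules applied to each record. The rule names, field names, and sample orders below are hypothetical, and the structure is an assumption for illustration rather than how dfPower Studio stores rules.

```python
def validate_record(record, rules):
    """Return the names of the rules that a record fails."""
    return [name for name, check in rules.items() if not check(record)]

# Hypothetical rules: a domain check, a range check, and a formula check.
rules = {
    "status_in_domain": lambda r: r["status"] in {"active", "inactive", "pending"},
    "quantity_in_range": lambda r: 1 <= r["quantity"] <= 1000,
    "total_matches_formula": lambda r: abs(r["total"] - r["quantity"] * r["unit_price"]) < 0.01,
}

orders = [
    {"status": "active", "quantity": 3, "unit_price": 9.99, "total": 29.97},
    {"status": "unknown", "quantity": 0, "unit_price": 5.00, "total": 10.00},
]
for order in orders:
    print(validate_record(order, rules))
```

Keeping the rules in one named collection means the same set can be reapplied whenever new data arrives.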
The Next Step: Data Quality
Data profiling provides an analysis of the data problems you face. The process of correcting these problems and building better data starts in the next phase, data quality.