Data quality through knowledge engineering

Tamraparni Dasu, Gregg T. Vesonder, Jon R. Wright
Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '03), 2003
Traditionally, data quality programs have acted as a preprocessing stage to make data suitable for a data mining or analysis operation. Recently, data quality concepts have been applied to databases that support business operations such as provisioning and billing. Incorporating business rules that drive operations and their associated data processes is critically important to the success of such projects. However, there are many practical complications. For example, documentation on business rules is often meager. Rules change frequently. Domain knowledge is often fragmented across experts, and those experts do not always agree. Typically, rules have to be gathered from subject matter experts iteratively, and are discovered out of logical or procedural sequence, like a jigsaw puzzle. Our approach is to implement business rules as constraints on data in a classical expert system formalism sometimes called production rules. Our system works by allowing good data to pass through a system of constraints unchecked. Bad data violate constraints and are flagged, and then fed back after correction. Constraints are added incrementally as better understanding of the business rules is gained. We include a real-life case study.
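
The flow described in the abstract (records pass through a set of constraints, violators are flagged, corrected records are fed back, and constraints are added incrementally) can be illustrated with a minimal Python sketch. This is not the authors' system: it substitutes plain Python predicates for a production-rule engine, and every name in it (`Rule`, `ConstraintSystem`, the `status` and `circuit_id` fields, the rule names) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]  # hypothetical record shape: field name -> value


@dataclass
class Rule:
    """One business rule as an if-then constraint (names are illustrative)."""
    name: str
    applies: Callable[[Record], bool]  # "if" part: does the rule concern this record?
    holds: Callable[[Record], bool]    # "then" part: constraint that must be true


class ConstraintSystem:
    """Holds the current rule set; rules can be added as they are learned."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def violations(self, record: Record) -> List[str]:
        # A rule is violated when it applies to the record but does not hold.
        return [r.name for r in self.rules
                if r.applies(record) and not r.holds(record)]

    def screen(self, records: List[Record]
               ) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
        # Good records pass through; bad records are flagged with the
        # names of the rules they violate.
        passed, flagged = [], []
        for rec in records:
            broken = self.violations(rec)
            if broken:
                flagged.append((rec, broken))
            else:
                passed.append(rec)
        return passed, flagged


# Assumed example data: provisioning-style records with made-up fields.
system = ConstraintSystem()
system.add_rule(Rule(
    name="active_needs_circuit",
    applies=lambda r: r.get("status") == "active",
    holds=lambda r: bool(r.get("circuit_id")),
))

records = [
    {"id": 1, "status": "active", "circuit_id": "C-100"},
    {"id": 2, "status": "active", "circuit_id": None},
    {"id": 3, "status": "pending", "circuit_id": None},
]
passed, flagged = system.screen(records)

# Feed a corrected copy of the flagged record back through the same rules.
corrected = dict(flagged[0][0], circuit_id="C-200")
repassed, reflagged = system.screen([corrected])

# Later, a newly learned rule is added without touching the existing ones.
system.add_rule(Rule(
    name="active_needs_bill_start",
    applies=lambda r: r.get("status") == "active",
    holds=lambda r: r.get("bill_start") is not None,
))
passed_after, flagged_after = system.screen(records)
```

Keeping each rule as an independent condition and constraint pair is what lets rules be added in whatever order the subject matter experts supply them; a rule set that grows this way will typically flag records that passed earlier, which is the intended behavior in this sketch.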
doi:10.1145/956750.956844 dblp:conf/kdd/DasuVW03 fatcat:gwanvmp7ufdlfpi6mfeizdihya