Thesis Defense - Susan Pollock

Susan Pollock

Thesis Title: Data Quality Rules in the Analytic Health Repository

Location: HSEB 2908
Date: April 6, 2012
Time: 1:00 pm

Supervisory Committee: Stanley Huff, M.D.; Peter Haug, M.D.; John Holmen, Ph.D.


Data quality has become a significant issue in healthcare as large preexisting databases are integrated to provide greater depth for research and process improvement. Large-scale data integration exposes and compounds data quality issues latent in source systems. Although the problems related to data quality in transactional databases have been identified and well addressed, the application of data quality constraints to large-scale data repositories has not, and it requires novel applications of traditional concepts and methodologies.

Despite an abundance of data quality theory, tools, and software, there is no consensus technique available to guide developers in identifying data integrity issues and applying data quality rules in warehouse-type applications. Data quality measures are frequently developed on an ad hoc basis, or methods designed to assure data quality in transactional systems are loosely applied to analytic data stores. These measures are inadequate to address the complex data quality issues in large, integrated data repositories, particularly in the healthcare domain with its heterogeneous source systems.

This study derives a taxonomy of data quality rules from relational database theory. It describes the development and implementation of data quality rules in the Analytic Health Repository at Intermountain Healthcare and situates those rules in the taxonomy. Further, it identifies areas in which more rigorous data quality should be explored. Comparing the implemented rules with the taxonomy demonstrates the superiority of a more structured approach to data quality rule identification.


Susan earned her bachelor’s degree in mathematics at the University of Utah and an associate’s degree in nursing from Weber State University. She is a senior data architect with the Enterprise Data Warehouse at Intermountain Healthcare, where she has worked with the Cardiovascular Clinical Program and on the Analytic Health Repository.