PhD Dissertation Defense - Olga Patterson
Feb 7, 2012 1:00 AM
Automatic Domain Adaptation of Word Sense Disambiguation Based on Sublanguage Semantic Schemata Applied to Clinical Narrative
Location: HSEB 4100C
Date: February 17, 2012
Time: 1:00 pm
Supervisory Committee: John F. Hurdle, MD, PhD; Stephane Meystre, MD, PhD; Bruce Bray, MD; Lewis Frey, PhD; Ellen Riloff, PhD
Domain adaptation of natural language processing systems is challenging because it requires human expertise. While manual effort is effective in creating a high-quality knowledge base, it is expensive and time-consuming. Clinical text adds another layer of complexity because privacy and confidentiality restrictions hinder the sharing of training corpora among research groups. Semantic ambiguity is a major barrier to effective and accurate concept recognition by natural language processing systems.
In my research I propose an automated domain adaptation method that uses a sublanguage semantic schema for all-words word sense disambiguation of clinical narrative. According to the sublanguage theory developed by Zellig Harris, domain-specific language is characterized by a relatively small set of semantic classes that combine into a small number of sentence types. Previous research relied on manual analysis to create language models that could be used for more effective natural language processing. Building on previous research on semantic type disambiguation, I propose a method of resolving semantic ambiguity that uses automatically acquired semantic type disambiguation rules, applied to clinical text that maps ambiguously to a standard set of concepts.
This research aims to provide an automatic method to acquire a Sublanguage Semantic Schema (S3) and to apply this model to disambiguate terms that map to more than one concept with different semantic types. The research is conducted using unmodified MetaMap version 2009, a concept recognition system provided by the National Library of Medicine, applied to a large set of clinical text. The project includes creating and comparing models based on unambiguous concept mappings found in seventeen clinical note types. The effectiveness of the final application was validated through a manual review of a subset of processed clinical notes using recall, precision, and F-score metrics.
Olga Patterson is a PhD candidate in the Department of Biomedical Informatics at the University of Utah. She received a master's degree in Business Information Systems from Utah State University. Her primary research interests involve natural language processing of clinical text for the purposes of decision support and information extraction. Prior to joining the program, Olga worked as a Human Factors Engineer at GE Healthcare, designing user interfaces for surgical X-ray equipment.