CPCP Retreat 2016: Entity Matching for EHR- and Transcriptome-based Phenotyping
Symposium Video

Dr. AnHai Doan describes the task of entity matching across EHR- and transcriptome-based data and introduces Magellan, a new tool that allows non-experts to perform entity matching on their own datasets. Entity matching identifies records that refer to the same real-world entity across multiple datasets, for example gathering all of a patient's data when they have been treated at different medical offices, or selecting all patients who have been treated with a specific drug. The task is challenging because the data vary in ways such as spelling mistakes and abbreviations. Magellan fills an important gap in the data science pipeline by providing a step-by-step workflow that lets individuals perform entity matching without becoming experts in the field. This can potentially save research groups thousands of dollars that would otherwise be spent hiring an expert. Magellan will be released in 2016 as a Python package.
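
To make the difficulty concrete, here is a minimal sketch of matching patient records across two offices when a name is misspelled. It uses only Python's standard-library difflib and is not Magellan's API or workflow; the records, field names, the helper functions name_similarity and match_records, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical records from two medical offices (illustrative data only).
clinic_a = [
    {"id": "A1", "name": "Jonathan Smith", "dob": "1970-03-14"},
    {"id": "A2", "name": "Maria Garcia", "dob": "1985-11-02"},
]
clinic_b = [
    {"id": "B1", "name": "Jonathon Smith", "dob": "1970-03-14"},
    {"id": "B2", "name": "Marla Garza", "dob": "1990-01-20"},
]


def name_similarity(a, b):
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_records(left, right, threshold=0.85):
    """Pair records whose birth dates agree and whose names are similar.

    The threshold is an assumed value, not a recommended setting.
    """
    matches = []
    for l in left:
        for r in right:
            if l["dob"] != r["dob"]:
                continue  # only compare records that share a birth date
            if name_similarity(l["name"], r["name"]) >= threshold:
                matches.append((l["id"], r["id"]))
    return matches


matches = match_records(clinic_a, clinic_b)
print(matches)
```

Requiring an exact birth-date match before comparing names keeps the number of comparisons down, and the similarity threshold tolerates small spelling differences such as "Jonathan" versus "Jonathon". Real datasets generally need more careful rules than this, which is the gap a guided workflow is meant to fill.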
