Medical Concept Normalization by Encoding Target Knowledge

Authors: Nikhil Priyatam, Sangameshwar Patil, Girish Palshikar, Vasudeva Varma
Conference: 2019 Conference on Neural Information Processing Systems (NeurIPS 2019)
Location: Vancouver Convention Centre, Vancouver, Canada
Date: 2019-12-08
Report no: IIIT/TR/2019/116

Abstract

Medical concept normalization aims to map a variable-length message, such as ‘unable to sleep’, to an entry in a target medical lexicon, such as ‘Insomnia’. Current approaches formulate medical concept normalization as a supervised text classification problem. This formulation has several drawbacks. First, creating training data requires manually mapping medical concept mentions to their corresponding entries in a target lexicon. Second, these models fail to map a mention to target concepts that were not encountered during the training phase. Lastly, these models have to be retrained from scratch whenever new concepts are added to the target lexicon. In this work, we propose a method that overcomes these limitations. We first use various text and graph embedding methods to encode medical concepts into an embedding space. We then train a model that transforms concept mentions into vectors in this target embedding space. Finally, we use cosine similarity to find the medical concept nearest to a given input concept mention. Our model scales to millions of target concepts and trivially accommodates a growing target lexicon without incurring significant computational cost. Experimental results show that our model outperforms the previous state of the art by 4.2% and 6.3% in classification accuracy on two benchmark datasets. We also present a variety of studies to evaluate the robustness of our model under different training conditions.
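
The final retrieval step described in the abstract, picking the target concept whose embedding has the highest cosine similarity to the encoded mention, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the three-dimensional toy vectors, and the concept list are hypothetical and do not come from the paper, and the mention vector stands in for the output of a trained mention encoder.

```python
import numpy as np

def normalize_rows(matrix):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.clip(norms, 1e-12, None)

def nearest_concept(mention_vec, concept_matrix, concept_names):
    """Return the concept name with the highest cosine similarity to the mention."""
    concepts = normalize_rows(concept_matrix)
    mention = mention_vec / max(np.linalg.norm(mention_vec), 1e-12)
    sims = concepts @ mention          # one cosine score per target concept
    best = int(np.argmax(sims))
    return concept_names[best], float(sims[best])

# Toy target lexicon (hypothetical embeddings, not taken from the paper).
concept_names = ["Insomnia", "Headache", "Nausea"]
concept_matrix = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Stand-in for the encoded mention 'unable to sleep' (assumed encoder output).
mention_vec = np.array([0.9, 0.1, 0.0])
name, score = nearest_concept(mention_vec, concept_matrix, concept_names)

# Growing the lexicon only appends a row; the mention encoder is not retrained.
concept_names_ext = concept_names + ["Fatigue"]
concept_matrix_ext = np.vstack([concept_matrix, [0.6, 0.8, 0.0]])
name_ext, _ = nearest_concept(np.array([0.5, 0.9, 0.0]),
                              concept_matrix_ext, concept_names_ext)
```

In this sketch, adding a concept is a matter of appending one embedding row, which mirrors the abstract's claim that a growing lexicon is accommodated without retraining. For a lexicon with millions of concepts, an approximate nearest-neighbour index would likely replace the dense matrix product; that choice is not specified in the source.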


Centre for Search and Information Extraction Lab

IIIT Hyderabad Publications
