Disambiguation of Biomedical Text
If you have a question about this talk, please contact Johanna Geiss.
Like text in other domains, biomedical documents contain a range of
terms with more than one possible meaning. These ambiguities form a
significant obstacle to the automatic processing of these
texts. Previous approaches to resolving this problem have made use of
a variety of knowledge sources including the context in which the
ambiguous term is used and domain-specific resources (such as
UMLS). We compare a range of knowledge sources which have been
previously used and introduce a novel one: MeSH terms. The best
performance is obtained using linguistic features in combination with
MeSH terms. Performance exceeds previously reported results on a
standard test set.
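
As a rough illustration of how context features and MeSH terms might be combined in a supervised setting, the sketch below trains a small Naive Bayes classifier on toy examples. The choice of classifier, the ambiguous term "cold", the sense labels, the example sentences and the MeSH headings attached to them are all assumptions made for illustration. None of this is taken from the talk, and it is not the speakers' actual system or feature set.

```python
import math
from collections import Counter, defaultdict

def features(context, mesh_terms):
    """Merge context words and document-level MeSH terms into one feature list."""
    words = ["w=" + w for w in context.lower().split()]
    mesh = ["mesh=" + m for m in mesh_terms]
    return words + mesh

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in learner)."""

    def __init__(self):
        self.sense_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        for feats, sense in examples:
            self.sense_counts[sense] += 1
            for f in feats:
                self.feature_counts[sense][f] += 1
                self.vocab.add(f)

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)
            denom = sum(self.feature_counts[sense].values()) + len(self.vocab)
            for f in feats:
                if f in self.vocab:  # features unseen in training are ignored
                    score += math.log((self.feature_counts[sense][f] + 1) / denom)
            if score > best_score:
                best, best_score = sense, score
        return best

# Hypothetical annotated examples: (context, MeSH terms of the document, sense).
training = [
    (features("patients with a cold reported nasal congestion", ["Rhinovirus", "Humans"]), "common_cold"),
    (features("symptoms of the cold lasted a week", ["Rhinovirus"]), "common_cold"),
    (features("exposure to cold induced vasoconstriction", ["Hypothermia", "Humans"]), "cold_temperature"),
    (features("rats kept in the cold showed shivering", ["Hypothermia", "Rats"]), "cold_temperature"),
]

clf = NaiveBayes()
clf.train(training)

# The same uninformative context, disambiguated by different MeSH terms.
prediction_a = clf.predict(features("the cold was treated at home", ["Rhinovirus"]))
prediction_b = clf.predict(features("the cold was treated at home", ["Hypothermia"]))
```

In this toy setup the context words alone carry little signal, so the MeSH features decide the outcome. That is only meant to show why a document-level knowledge source can complement local linguistic features.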
Our approach is supervised and therefore relies on annotated training
examples. A novel approach to automatically acquiring additional
training data, based on the relevance feedback technique from
Information Retrieval, is presented. Applying this method to generate
additional training examples is shown to lead to a further increase in performance.
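
One way to picture acquiring training data through relevance feedback is sketched below. For each sense, the labelled examples act as the "relevant" documents, a Rocchio-style weighting picks the terms that best separate them from examples of the other senses, and unlabelled texts that score highly against those terms are added as new examples of that sense. The weighting scheme, the threshold, the function names and the toy texts are all assumptions; the talk abstract does not specify how the technique was adapted.

```python
from collections import Counter

def term_weights(relevant, non_relevant):
    """Rocchio-style weights: mean term frequency in relevant minus non-relevant texts."""
    def centroid(texts):
        total = Counter()
        for t in texts:
            total.update(t.lower().split())
        return {term: c / len(texts) for term, c in total.items()}
    pos, neg = centroid(relevant), centroid(non_relevant)
    return {term: pos.get(term, 0.0) - neg.get(term, 0.0) for term in set(pos) | set(neg)}

def build_query(weights, k=3):
    """Keep the k highest positively weighted terms (ties broken alphabetically)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term: w for term, w in ranked[:k] if w > 0}

def score(text, query):
    """Weighted count of query terms occurring in the text."""
    tokens = Counter(text.lower().split())
    return sum(w * tokens[term] for term, w in query.items())

def acquire(pool, query, sense, threshold):
    """Label every pool text whose score reaches the threshold with the given sense."""
    return [(text, sense) for text in pool if score(text, query) >= threshold]

# Hypothetical labelled seed examples and an unlabelled pool.
labelled = {
    "common_cold": ["cold symptoms include nasal congestion",
                    "nasal spray relieved the cold symptoms"],
    "cold_temperature": ["cold exposure lowered skin temperature",
                         "skin temperature fell in the cold"],
}
pool = ["nasal symptoms of a cold in children",
        "cold water immersion and skin temperature",
        "cold storage of samples"]

new_examples = []
for sense, texts in labelled.items():
    others = [t for s, ts in labelled.items() if s != sense for t in ts]
    query = build_query(term_weights(texts, others))
    new_examples += acquire(pool, query, sense, threshold=1.0)
```

Texts that match neither query, such as the third pool entry here, are left unlabelled rather than forced into a sense, which keeps noise out of the extra training data at the cost of coverage.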
This talk is part of the NLIP Seminar Series.