Annotating Genericity: How Do Humans Decide? A Case Study in Ontology Extraction
If you have a question about this talk, please contact Johanna Geiss.

This talk deals with the identification of kind versus non-kind entities in natural language text for ontology extraction. The following two sentences illustrate why genericity annotations matter for the creation of ontologies:

(1) The whale is a mammal.
(2) The whale rescued the scuba diver.

Given this input, an ontology extraction system would typically output the relationships ‘whale - is_a - mammal’ and ‘whale - rescue - scuba diver’. When inserted as such in a real-world ontology, these relations may give the user the false impression that ‘one general feature of whales is that they rescue scuba divers.’ To prevent this reading, it is necessary to tag the first whale with a generic label and the second with a specific label.

Genericity annotation using machine learning relies on a training corpus. Available corpora, however, are limited in the genres they cover and, more importantly, in the range of labels they use to describe the genericity phenomenon. The public annotation schemes linked to those corpora are also often simplified and/or domain-specific. With a view to producing our own training corpus, we propose here an annotation scheme that covers the kind versus object distinction, the specificity phenomenon and reference resolution. The scheme is not domain-specific and, over a small test set from the British National Corpus, produced an inter-annotator agreement of Kappa = 0.74.

We will discuss the scheme, our choice of labels, and the various problems attached to the manual annotation of genericity. In particular, we will show the importance of reference resolution for accurate annotation.

This talk is part of the NLIP Seminar Series.
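For readers unfamiliar with the agreement figure quoted in the abstract (Kappa = 0.74), the following is a minimal sketch of how a kappa statistic can be computed for two annotators. It assumes Cohen's kappa, which the abstract does not specify. The function name `cohen_kappa`, the two-label set (`generic`, `specific`) and the annotations themselves are hypothetical illustrations, not the speakers' scheme or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement, from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used one and the same label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations of eight noun phrases (not data from the talk).
annotator_1 = ["generic", "specific", "generic", "specific",
               "generic", "specific", "generic", "specific"]
annotator_2 = ["generic", "specific", "generic", "specific",
               "generic", "specific", "specific", "specific"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the annotators agree on seven of eight items (observed agreement 0.875) against a chance agreement of 0.5, so kappa comes out at 0.75. Kappa corrects raw agreement for the agreement expected by chance, which is why it is reported instead of a plain percentage.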