Towards automated understanding of scientific papers
If you have a question about this talk, please contact Johanna Geiss.

The large number of scientific papers generated, especially in the life sciences, makes it a challenge for researchers and resource curators to extract and evaluate the knowledge contained within them. Automated text mining methods currently operate mainly on abstracts, but scientists have highlighted the need for automatic processing of the full text. Researchers in information extraction and information retrieval need to be able to recognise areas of interest in papers, and scientists have expressed the need for machine-readable summaries. However, the manual production of semantic markup in papers is very time-consuming and cannot cater for the millions of papers already published.

We have produced a tool (SAPIENT) and an ontology-based annotation scheme (CISP) for annotating core scientific concepts in research papers ('Goal', 'Motivation', 'Object', 'Hypothesis', 'Background', 'Model', 'Experiment', 'Method', 'Observation', 'Result', 'Conclusion'). A corpus of 225 papers covering topics in physical chemistry and biochemistry was annotated at the sentence level by 16 experts using SAPIENT and the CISP-based annotation scheme. Within the SAPIENTA project we plan to use this corpus to enable the automatic recognition of scientific concepts in papers and to generate digital abstracts in both human- and machine-readable formats. We also aim to enable intelligent querying of the content of scientific papers by exploiting the extra semantic information and representing the relevant sections in a first-order logic form that reasoners can handle.

Bio: Dr Maria Liakata has an Oxford DPhil in Computational Linguistics, on the topic of using Inductive Logic Programming to learn pragmatic knowledge from a corpus (Inducing Domain Theories). Since June 2005 she has been a research associate with the Computational Biology group at Aberystwyth University and has worked on interdisciplinary projects, such as the Robot Scientist, involving the automation and formalisation of science.

This talk is part of the NLIP Seminar Series.