Mining scientific diagrams for semantic information

Dr Peter Murray-Rust
Wednesday 27 January 2016, 14:00-15:00
MR4, Centre for Mathematical Sciences, Wilberforce Road, Cambridge.

If you have a question about this talk, please contact Emily Boyd.

Scientific data is often reported only as diagrams in publications, where it is effectively destroyed and lost. This data is often critically valuable to other scientists and to data abstracting services, and frequently has to be recreated manually from the diagram at great expense, with waste and error. Examples include plots, charts, and more complex objects such as chemical structure diagrams and phylogenetic (evolutionary) trees.

I shall show how, in favourable circumstances, it is possible to recreate semantic information from diagrams using well-established computer vision techniques. These include thresholding, binarization, dilation and thinning, OCR, and a variety of domain-specific heuristics. Our Open Source library is based on BoofCV, an open Java image processing library, and is enhanced with tools useful for scientific documents. Some PDF documents contain vector images and are particularly tractable, while others contain only pixel images and suffer from overlap, problems of scale, and loss of detail.
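As a rough illustration of two of the techniques named above, the sketch below binarizes a tiny greyscale grid with a fixed threshold and then applies one pass of dilation. It is not the speaker's library and does not use BoofCV: it is plain Python, and the names `binarize` and `dilate8`, the threshold of 128, and the 5x5 sample "scan" are all assumptions made up for this example.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def binarize(gray: Image, threshold: int) -> Image:
    """Mark pixels darker than `threshold` as foreground (1), the rest as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def dilate8(binary: Image) -> Image:
    """One dilation pass: every foreground pixel also turns on its 8 neighbours."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    # Skip neighbours that fall outside the image.
                    if 0 <= ny < height and 0 <= nx < width:
                        out[ny][nx] = 1
    return out


# Hypothetical scan: a dark horizontal line on light paper with a one-pixel gap.
gray = [
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
    [20, 30, 240, 25, 10],
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
]

binary = binarize(gray, threshold=128)  # the gap survives thresholding
closed = dilate8(binary)                # dilation bridges the gap

for row in closed:
    print("".join("#" if px else "." for px in row))
```

Dilation bridges the gap in the line but also thickens it, which is one reason a pipeline of this kind would typically follow it with thinning to recover a one-pixel-wide skeleton before tracing lines or running OCR.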

I shall show the application to chemistry and phylogenetics and show where errors and loss occur.

http://www.slideshare.net/petermurrayrust/mining-scientific-images

This talk is part of the Computational and Systems Biology series.
