University of Cambridge > Talks.cam > Language Technology Lab Seminars > Clinical De-Identification and Semantic Relatedness
Clinical De-Identification and Semantic Relatedness
If you have a question about this talk, please contact Marinela Parovic.

The first part of this talk will discuss the development of novel low-cost approaches to de-identifying clinical notes.

The second part of the talk will discuss the development of a new dataset of semantic relatedness for sentence pairs. This dataset, STR-2021, has 5,500 English sentence pairs manually annotated for semantic relatedness using a comparative annotation framework. We show that the resulting scores have high reliability (repeat annotation correlation of 0.84). We use the dataset to explore a number of questions on what makes two sentences more semantically related. We also evaluate a suite of sentence representation methods on their ability to place pairs that are more related closer to each other in vector space.

This talk is part of the Language Technology Lab Seminars series.