
Evaluating Data Linkage: Creating longitudinal synthetic data to provide a gold-standard linked dataset


If you have a question about this talk, please contact INI IT.

DLA - Data linkage and anonymisation

When performing probabilistic data linkage on real-world data, we do not know the true linkage; if we did, there would be no need to link. The success of a linkage approach is therefore difficult to evaluate. Often, small hand-linked datasets are used as a ‘gold-standard’ against which the linkage approach is evaluated. However, errors in the hand-linkage, together with the limited size and number of these datasets, do not allow for robust evaluation. This research focuses on the creation of longitudinal synthetic datasets for the domain of population reconstruction. In this talk I will cover the previous and current models we have created to achieve this, and detail how we: define the desired behaviour in the model so as to avoid clashes between input distributions; verify the statistical correctness of the population; and initialise the model such that the starting population meets the temporal requirements of the desired behaviour. To conclude, I will outline the model’s intended use for linkage evaluation and its other potential uses, and take questions.
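As a rough illustration of what evaluating a linkage approach against a gold standard can look like, the sketch below scores a set of predicted links against a set of known true links using precision, recall and F-measure. This is not the speaker's method: the function name `evaluate_linkage`, the record identifiers and the choice of metrics are all assumptions made for the example.

```python
def evaluate_linkage(predicted_links, true_links):
    """Compare predicted record pairs against gold-standard pairs.

    Hypothetical helper for illustration; not taken from the talk.
    """
    # Treat each link as an unordered pair so (a, b) and (b, a) match.
    predicted = {frozenset(pair) for pair in predicted_links}
    truth = {frozenset(pair) for pair in true_links}

    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up identifiers: in a synthetic population the true links are known
# by construction, so they can serve as the gold standard.
true_links = [("birth_1", "marriage_7"), ("birth_2", "marriage_9"),
              ("birth_3", "death_4")]
predicted_links = [("marriage_7", "birth_1"), ("birth_2", "marriage_8"),
                   ("birth_3", "death_4")]

scores = evaluate_linkage(predicted_links, true_links)
print(scores)
```

With a synthetic dataset the set of true links can be made as large as required, which is the property that small hand-linked datasets lack.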



This talk is part of the Isaac Newton Institute Seminar Series.

