
Handling identifier error rate variation in data linkage of large administrative data sources



DLAW02 - Data linkage: techniques, challenges and applications

Co-authors: Gareth Hagger-Johnson (Administrative Data Research Centre for England, University College London), Ruth Gilbert (Institute of Child Health, University College London), Harvey Goldstein (University of Bristol and University College London)

Background: Linkage of administrative data with no unique identifier often relies on probabilistic linkage. Variation in data quality at the individual or organisational level can adversely affect match weight estimation, and can introduce selection bias into the linked data if some subgroups of individuals are more likely to link than others. We quantified individual and organisational variation in identifier error in a large administrative dataset (Hospital Episode Statistics; HES) and incorporated this information within a match probability estimation model.

Methods: A stratified sample of 10,000 admission records was extracted from 2011/2012 HES for three cohorts aged 0-1, 5-6 and 18-19 years. A reference standard was created by linking via NHS number with the Personal Demographic Service for up-to-date identifiers. Based on aggregate tables, we calculated identifier error rates for sex, date of birth and postcode, investigated whether these errors depended on individual characteristics, and evaluated variation between organisations. We used a log-linear model to estimate match probabilities, and used a simulation study to compare readmission rates based on traditional match weights.

Results: Match probabilities differed significantly according to age (p
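To illustrate the kind of calculation the Methods describe, the following is a minimal sketch, not the authors' implementation. It estimates identifier error rates per subgroup by comparing records against a reference standard, then converts an error rate into agreement and disagreement weights. The weight formula is the Fellegi-Sunter form commonly used in probabilistic linkage, which is an assumption about what "traditional match weights" means here; the abstract's log-linear model is not reproduced. All records, field names, the `age_group` key and the chance-agreement probability `u` are synthetic and hypothetical.

```python
import math
from collections import defaultdict

IDENTIFIERS = ("sex", "dob", "postcode")  # hypothetical field names


def error_rates_by_group(pairs, group_key):
    """Return {group: {identifier: error rate}}.

    Each pair is (admin_record, reference_record). An error is counted
    when the admin value differs from the reference-standard value.
    """
    totals = defaultdict(int)
    errors = defaultdict(lambda: defaultdict(int))
    for admin, reference in pairs:
        group = admin[group_key]
        totals[group] += 1
        for field in IDENTIFIERS:
            if admin[field] != reference[field]:
                errors[group][field] += 1
    return {
        group: {field: errors[group][field] / n for field in IDENTIFIERS}
        for group, n in totals.items()
    }


def match_weights(error_rate, u, eps=1e-6):
    """Log2 agreement and disagreement weights, taking m = 1 - error_rate.

    m is clamped away from 0 and 1 so the logarithms stay finite.
    u is the assumed probability of chance agreement among non-matches.
    """
    m = min(max(1.0 - error_rate, eps), 1.0 - eps)
    agree = math.log2(m / u)
    disagree = math.log2((1.0 - m) / (1.0 - u))
    return agree, disagree


def _rec(group, sex, dob, postcode):
    return {"age_group": group, "sex": sex, "dob": dob, "postcode": postcode}


# Synthetic (admin, reference) pairs; values are made up for illustration.
toy_pairs = [
    (_rec("0-1", "F", "2011-03-02", "AA1 1AA"), _rec("0-1", "F", "2011-03-02", "AA1 1AA")),
    (_rec("0-1", "M", "2011-07-19", "BB2 2BB"), _rec("0-1", "M", "2011-07-19", "BB2 9ZZ")),
    (_rec("0-1", "F", "2011-01-30", "CC3 3CC"), _rec("0-1", "F", "2011-01-30", "CC3 3CC")),
    (_rec("0-1", "M", "2011-11-05", "DD4 4DD"), _rec("0-1", "M", "2011-11-05", "DD4 4DD")),
    (_rec("18-19", "F", "1993-05-14", "EE5 5EE"), _rec("18-19", "F", "1993-05-15", "EE5 5EE")),
    (_rec("18-19", "M", "1993-09-21", "FF6 6FF"), _rec("18-19", "M", "1993-09-21", "FF6 6FF")),
]

rates = error_rates_by_group(toy_pairs, "age_group")
for group, by_field in sorted(rates.items()):
    for field, rate in by_field.items():
        agree, disagree = match_weights(rate, u=0.01)  # u = 0.01 is an assumed value
        print(f"{group:>5} {field:<8} error={rate:.2f} "
              f"agree={agree:.2f} disagree={disagree:.2f}")
```

Computing the weights separately for each subgroup shows why error rate variation matters: a field with a higher error rate in one age cohort yields a smaller penalty for disagreement in that cohort, so a single pooled weight would treat the cohorts unequally.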

This talk is part of the Isaac Newton Institute Seminar Series.


