Space Embedding of Records for Privacy Preserving Linkage
- 👤 Speaker: Vassilios Verykios (Hellenic Open University)
- 📅 Date & Time: Wednesday 14 September 2016, 10:00 - 10:30
- 📍 Venue: Seminar Room 1, Newton Institute
Abstract
Massive amounts of data, collected by a wide variety of organizations, need to be integrated and matched in order to facilitate data analyses that may be highly beneficial to businesses, governments, and academia. Record linkage, also known as entity resolution, is the process of identifying records that refer to the same real-world entity from disparate data sets. Privacy Preserving Record Linkage (PPRL) techniques are employed to perform the linkage process in a secure manner, when the data that need to be matched are sensitive. In PPRL , input records undergo an anonymization process that embeds the records into a space, where the underlying data can be matched but not understood by naked eye.
The PPRL problem is picking up a lot of steam lately due to a ubiquitous need for cross matching of records that usually lack common unique identifiers and their field values contain variations, errors, misspellings, and typos. The PPRL process as it is applied to massive ammounts of data comprises of an anonymization phase, a searching phase and a matching phase.
Several searching and anonymization approaches have been developed with the aim to scale the PPRL process to big data without sacrificing quality of the results. Recently, redundant randomized methods have been proposed, which insert each record into multiple independent blocks in order to amplify the probability of bringing together similar records for comparison. The key feature of these methods is the formal guarantees, they provide, in terms of accuracy in the generated results.
In this talk, we present both state-of-the-art private searching methods and anonynimization techniques, by exposing their characteristics, including their strengths and weaknesses, and we also present a comparative evaluation.
Series This talk is part of the Isaac Newton Institute Seminar Series series.
Included in Lists
- All CMS events
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge talks
- Chris Davis' list
- dh539
- Featured lists
- INI info aggregator
- Interested Talks
- Isaac Newton Institute Seminar Series
- ndk22's list
- ob366-ai4er
- rp587
- School of Physical Sciences
- Seminar Room 1, Newton Institute
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.
![[Talks.cam]](/static/images/talkslogosmall.gif)

Vassilios Verykios (Hellenic Open University)
Wednesday 14 September 2016, 10:00-10:30