# What is the chance that the match is coincidence?

I’d like to talk about two topics in forensic statistics on which I have been working recently. The 2010 UK Court of Appeal ruling known as “R v T” asserted that Bayes’ theorem and likelihood ratios should not be used in evaluating forensic evidence, except for DNA and ‘possibly other areas where there is a firm statistical base’. This illustrates that communicating the evidential value of statistical evidence to a court is not easy.

The first, more mathematical topic is the subject of an ongoing collaboration with Dragi Anevski (Lund). Think of it as a nonparametric missing-data estimation problem in which the parameter is restricted by ordering constraints.

Consider a probability distribution over an infinite set. Let p=(p_1,p_2,…) be the vector of all the atoms of this probability distribution, ordered from large to small and augmented with zeros if there are only finitely many atoms. Now take an iid sample of size n from this distribution, count how often each element is observed, and similarly order the resulting counts from large to small. This results in our observed data Y=(Y_1,Y_2,…). The problem we are interested in is how to estimate the underlying vector of ordered probabilities p from the observed vector of ordered counts Y. Note that the k’th most frequent element in the sample is not necessarily the k’th most frequent element in the population!
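To make the data structure concrete, here is a minimal simulation sketch of how the ordered counts Y arise. It is an illustration only, not the estimator discussed in the talk: the geometric-like choice of p, the truncation at K atoms, the sample size and the function names are all assumptions made for the example.

```python
import numpy as np


def ordered_counts(sample):
    """Count how often each distinct element occurs, sorted from large to small."""
    _, counts = np.unique(sample, return_counts=True)
    return np.sort(counts)[::-1]


def simulate_ordered_counts(p, n, rng):
    """Draw an iid sample of size n from atoms with probabilities p; return Y."""
    labels = rng.choice(len(p), size=n, p=p)
    return ordered_counts(labels)


# Hypothetical population: K atoms with geometrically decaying probabilities
# (an assumption for illustration; the talk allows infinitely many atoms).
K = 50
p = 0.8 ** np.arange(K)
p /= p.sum()  # already ordered from large to small

rng = np.random.default_rng(0)
n = 100
Y = simulate_ordered_counts(p, n, rng)

# The naive estimator Y/n, padded with zeros for atoms never observed.
naive = np.zeros(K)
naive[: len(Y)] = Y / n

print("distinct elements observed:", len(Y), "out of", K)
print("largest ordered counts:", Y[:5])
print("matching true probabilities:", np.round(p[:5], 3))
```

Only the ordered counts are kept, so the labels of the observed elements are discarded, which is where the missing-data flavour of the problem comes from. With many rare atoms, a sample of this size will typically miss some of them entirely, and Y/n then assigns them probability zero.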

I will discuss our preliminary results on the nonparametric maximum likelihood estimator of p, which is very different from the naive estimator Y/n, and explain their relevance to the problem of evaluating the evidential value of a rare Y-chromosome match, a problem called “the fundamental problem of forensic mathematics” by Charles Brenner (Berkeley).

The second topic concerns the problem of deciding whether or not two mobile telephones actually belong to the same person, based on the call records (times of calls, locations of GSM towers) of both phones. Here there is no simple model and no simple mathematical problem to be solved, but there is an equally challenging problem: how the statistician can advise a court on the evidential value of the evidence. The court in question will be the United Nations Special Tribunal for Lebanon; the crime is the assassination of former prime minister Rafiq Hariri in 2005.

This talk is part of the Statistics series.