
Nobody Writes Letters Anymore: Helping people make sense of historically significant email collections


If you have a question about this talk, please contact Johanna Geiss.

The archivist’s dilemma is that in a world with vastly more information being created, less of what we should keep may reach the archive in forms that we know how to manage. Much of present archival practice rests on four key facts: important records have generally been written on paper, paper records are (reasonably) persistent, paper records require some level of manual description, and the costs of description and preservation necessitate appraisal and selection. We are, however, moving toward a world in which records that are never committed to paper may prove to be ephemeral, digital objects can be (at least to some extent) self-describing, and the economics of appraisal and retention might therefore reverse. Many projects are now working on reliably getting important digital records into the future, so in this talk I’ll focus on what I see as the natural next step: helping to at least partially automate description. To illustrate how this might be done, I’ll describe joint work with Tamer Elsayed to automatically resolve the identity of people who are mentioned ambiguously (e.g., just by first name) in a collection of email from a failed corporation (Enron). Our results indicate that, at least for people who are well represented in the collection, we can use a generative model to guess the right identity more than 80% of the time. I’ll conclude the talk with a few remarks on our next directions for techniques, evaluation, and additional types of collections to which similar ideas might be applied.

This talk is part of the NLIP Seminar Series.


© 2006-2024 Talks.cam, University of Cambridge.