
The 1000 Genomes Project and Beyond


If you have a question about this talk, please contact Dr. Monica Vega-Hernandez.

The 1000 Genomes Project provides an essential reference catalog of human variation with more than 60 million variant sites, ranging from single nucleotide polymorphisms to structural variant events, including inversions and duplications. Also provided are global allele frequencies and genotypes for 2535 individuals from 26 different populations across Europe, Africa, East and South Asia and the Americas, which enable many other projects to better interpret their results. Primary uses for the 1000 Genomes data sets include serving as imputation panels to create whole-genome variant sets from exome or array-based genotypes, filtering out “normal” or shared variation in rare disease or cancer sequencing projects, and exploring demography and selection in human populations.

The 1000 Genomes Project is now drawing to a close. Here we describe plans to provide long-term support for the 1000 Genomes Project resource so that it remains the valuable data set it is today. We will continue to host both the FTP site and the project website to ensure the community can access both the raw data and the documentation about the project. There will also be a stable version of the 1000 Genomes Browser based on the project’s final data release. This project-specific, Ensembl-based browser displays all of the 1000 Genomes variants as soon as possible and will use the GRCh37 assembly of the human reference genome.

We will also maintain the existing tools and incorporate new ones as appropriate to enable users to easily access the data they desire. Our most popular tools are the Data Slicer, which allows users to select genomic subsections of our alignment (BAM) and variant (VCF) files and thus download just the piece of the file they need, and the Variation Pattern Finder, which allows users to discover patterns of shared variation in a specific region of the genome. Other tools include the VCF to PED converter, which allows users to generate PLINK format files from remotely hosted VCF files, and the recently introduced Allele Frequency Calculator, which calculates allele frequencies in bulk for specific subpopulations from our VCF files.
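
To make the idea behind region slicing and per-population allele frequencies concrete, here is a minimal Python sketch. It is not the project's Data Slicer or Allele Frequency Calculator, whose implementations are not described here. The VCF fragment, the sample names, the population labels and all function names are hypothetical and chosen purely for illustration. The sketch assumes biallelic sites and counts every non-reference allele as ALT.

```python
from collections import defaultdict

# Made-up VCF fragment (hypothetical sites and samples, not real project data).
VCF_TEXT = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4
1\t1000\t.\tA\tG\t100\tPASS\t.\tGT\t0|0\t0|1\t1|1\t0|1
1\t2000\t.\tC\tT\t100\tPASS\t.\tGT\t0|0\t0|0\t0|1\t.|.
1\t9000\t.\tG\tA\t100\tPASS\t.\tGT\t1|1\t1|1\t0|1\t0|0
"""

# Hypothetical sample-to-population panel.
SAMPLE_TO_POP = {"S1": "POP_A", "S2": "POP_A", "S3": "POP_B", "S4": "POP_B"}


def parse_vcf(text):
    """Yield (chrom, pos, ref, alt, {sample: genotype field}) per data line."""
    samples = []
    for line in text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        yield chrom, pos, ref, alt, dict(zip(samples, fields[9:]))


def slice_region(records, chrom, start, end):
    """Keep only records inside chrom:start-end (1-based, inclusive)."""
    return [r for r in records if r[0] == chrom and start <= r[1] <= end]


def alt_allele_frequency(genotypes, sample_to_pop):
    """Return {population: ALT allele frequency}, ignoring missing alleles."""
    alt_count = defaultdict(int)
    called = defaultdict(int)
    for sample, genotype in genotypes.items():
        pop = sample_to_pop.get(sample)
        if pop is None:
            continue  # sample is not in the panel
        # GT is the first colon-separated subfield; alleles are split by | or /.
        for allele in genotype.split(":")[0].replace("/", "|").split("|"):
            if allele == ".":
                continue  # missing call
            called[pop] += 1
            alt_count[pop] += int(allele != "0")
    return {pop: alt_count[pop] / called[pop] for pop in called}


# Slice a region, then report per-population frequencies for each site in it.
for chrom, pos, ref, alt, genotypes in slice_region(parse_vcf(VCF_TEXT), "1", 1, 5000):
    print(chrom, pos, ref, alt, alt_allele_frequency(genotypes, SAMPLE_TO_POP))
```

Real project files are compressed and indexed, so in practice a region query would go through an index rather than a linear scan as done here.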

This talk is part of the Cambiowebinars Seminar Series Feb-Apr 2015 series.



© 2006-2023, University of Cambridge.