Some interesting data: Cryptic DNA sequence periodicities are ubiquitous, organism specific and distinguish introns from exons.

If you have a question about this talk, please contact David MacKay.

Nucleotide bases in DNA are usually assumed to evolve independently. However, within a genome each of the 16 dinucleotide pairs tends to be consistently either commoner or rarer than expected, to the point where together they constitute taxon-specific genomic signatures. Here we extend these observations to explore gapped dinucleotide motifs (GDMs): dinucleotide pairs with 0-9 intervening bases. As with simple dinucleotides, GDMs define signatures that are highly consistent across a genome. Trees constructed from matrices of GDM similarity suggest that phylogenetic signal exists but can sometimes be swamped by convergence. To find the likely basis of GDM signatures, we used principal component analysis to summarise GDM profiles and show that the dominant predictors of both the strength and nature of the patterning are the %AT composition of the genome and the optimal growth temperature of the organism, particularly among prokaryotes. For eukaryotes, other factors such as genome size and chromosome number exert a minor but significant influence. Dividing eukaryotic sequences into intronic and exonic sequences reveals a striking difference in GDM patterns: exons are dominated by 3-base periodicities that are largely absent from introns and from all of the whole-genome sequence profiles. By implication, GDMs have the potential to be used to classify anonymous DNA fragments, either by taxon among species or perhaps even by function within a species. GDMs also illuminate constraints leading to convergent evolution. Either way, the strength of the GDM profiles suggests that DNA evolution is less random than is often assumed.
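To make the idea of a GDM profile concrete, here is a minimal Python sketch. The abstract does not say which statistic was used, so the normalisation below (observed pair count divided by the count expected if bases were independent, given the sequence's own base frequencies) is an assumption, and the function name `gdm_profile` is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def gdm_profile(seq, max_gap=9):
    """Return {(x, y, gap): observed/expected} for each base pair and gap 0..max_gap.

    Assumption (not from the abstract): 'expected' is the pair count predicted
    by independent bases drawn with the sequence's own base frequencies.
    """
    seq = seq.upper()
    base_counts = Counter(b for b in seq if b in BASES)
    total = sum(base_counts.values())
    profile = {}
    if total == 0:
        return profile
    freq = {b: base_counts[b] / total for b in BASES}

    for gap in range(max_gap + 1):
        step = gap + 1  # distance between the two bases of the motif
        pair_counts = Counter()
        n_pairs = 0
        for i in range(len(seq) - step):
            x, y = seq[i], seq[i + step]
            # Skip positions holding ambiguity codes such as N.
            if x in BASES and y in BASES:
                pair_counts[(x, y)] += 1
                n_pairs += 1
        for x, y in product(BASES, repeat=2):
            expected = freq[x] * freq[y] * n_pairs
            ratio = pair_counts[(x, y)] / expected if expected > 0 else float("nan")
            profile[(x, y, gap)] = ratio
    return profile
```

A ratio above 1 marks a motif that is commoner than independence would predict, and a ratio below 1 marks one that is rarer. Under this sketch, a 3-base periodicity of the kind described for exons would show up as ratios that rise and fall with a period of three as the gap increases.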

This talk is part of the Inference Group series.

© 2006-2020 Talks.cam, University of Cambridge.