Next: Walking along the genome. Up: Consensus Sequence Zen Previous: How to be sure

A Paradox: How can two things be the same but different?

Intriguingly, human splice junction donor and acceptor sites have the same consensus sequence for a portion of each site around the junction. Yet when we measured the sequence conservation in bits, we found that the information curves are quite different [Stephens & Schneider, 1992]. How could two sites have the same consensus sequence but be different? This conundrum led us to introduce a computer graphic, called a sequence logo, in order to understand the difference (Fig. 1). A logo depicts an average picture of a set of binding sites as a series of stacks of letters. The height of each stack is the sequence conservation at that position (measured in bits of information; the vertical black bar at each junction is 2 bits high), and the heights of the letters within a stack show the relative proportions of the bases, sorted so that the more frequent bases are on top. From the logos shown, it is clear that the donor and acceptor sites have different 'emphasis', but this cannot be seen with the consensus sequence CAGGT, which matches both of them at the junction. The difference in emphasis is important because it shows that there is more information on the intron side of each junction. This allows more freedom during the evolution of the protein-coding exon side, which is a biologically sensible result. The resemblance between the two junctions suggests that the components of the splice machinery that bind to donors and acceptors have a common ancestor [Stephens & Schneider, 1992].
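The paradox can be reproduced on a small scale. The following Python sketch computes, for a set of aligned sites, the consensus sequence, the information at each position, and the letter heights of a logo stack. It assumes the usual definition of sequence conservation for DNA, 2 bits minus the Shannon uncertainty of the observed base frequencies, and it omits the small-sample correction that a real analysis would apply. The two sets of sequences are invented toy data, not real splice sites, and the function names are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def column_counts(sites):
    """Count the bases at each position of a set of aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(s[i] for s in sites) for i in range(length)]

def consensus(sites):
    """Most frequent base at each position (ties go to the earlier base in BASES)."""
    return "".join(max(BASES, key=lambda b: c[b]) for c in column_counts(sites))

def information_bits(sites):
    """Information per position: 2 - H, with H the Shannon uncertainty in bits.

    Assumption: no small-sample correction is applied.
    """
    n = len(sites)
    bits = []
    for c in column_counts(sites):
        h = -sum((c[b] / n) * math.log2(c[b] / n) for b in BASES if c[b] > 0)
        bits.append(2.0 - h)
    return bits

def logo_stacks(sites):
    """For each position, list (base, height) pairs from bottom to top.

    Each letter's height is its relative frequency times the stack height,
    so the most frequent base ends up on top.
    """
    n = len(sites)
    stacks = []
    for c, r in zip(column_counts(sites), information_bits(sites)):
        letters = [(b, r * c[b] / n) for b in BASES if c[b] > 0]
        letters.sort(key=lambda pair: pair[1])
        stacks.append(letters)
    return stacks

# Invented toy data: same consensus, different emphasis.
toy_a = ["CAGGT", "AAGGT", "CTGGT", "CACGT"]  # conserved on the right
toy_b = ["CAGGT", "CAGAT", "CAGGC", "CAGGT"]  # conserved on the left

if __name__ == "__main__":
    for name, sites in (("toy_a", toy_a), ("toy_b", toy_b)):
        rounded = [round(r, 2) for r in information_bits(sites)]
        print(name, consensus(sites), rounded)
```

Both toy sets yield the consensus CAGGT, yet their information curves peak on opposite sides. That difference is what the consensus sequence hides and what the stack heights of a logo make visible.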


Tom Schneider 2002-12-05