Next: Walking along the genome.
Up: Consensus Sequence Zen
Previous: How to be sure
Intriguingly,
human splice junction
donor and acceptor sites have the same consensus sequence
for a portion of each site
around the junction.
Yet, when we measured the sequence conservation in bits we found
that the information curves are quite different
[Stephens & Schneider, 1992].
How could two sites have the same consensus sequence but be different?
This conundrum led us to introduce
a computer graphic,
called a sequence logo,
in order
to understand the difference
(Fig. 1).
A logo depicts an average picture of the set of binding sites
by a series of stacks of letters.
The height of each stack is the sequence conservation
(measured in bits of information; the vertical black bar
at each junction is 2 bits high) and the heights
of the letters show the relative proportions of the bases,
sorted so that the more frequent bases are on top.
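
The stack and letter heights described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `logo_stacks` is hypothetical, the conservation at each position is taken as 2 bits minus the Shannon uncertainty of the observed base frequencies, and no small-sample correction is applied.

```python
import math
from collections import Counter

BASES = "ACGT"

def logo_stacks(sites):
    """For each aligned position, return a list of (base, height_in_bits)
    ordered from the bottom of the stack to the top.

    Hypothetical helper for illustration; assumes equal-length DNA sites.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to the same length")
    n = len(sites)
    stacks = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        freqs = {b: counts.get(b, 0) / n for b in BASES}
        # Shannon uncertainty of this position, in bits.
        h = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        # Stack height: 2 bits (the maximum for four bases) minus the uncertainty.
        # Assumption: no small-sample correction in this sketch.
        r = 2.0 - h
        # Each letter's height is its frequency times the stack height;
        # sorting ascending puts the most frequent base on top.
        stack = sorted(
            ((b, f * r) for b, f in freqs.items() if f > 0),
            key=lambda item: item[1],
        )
        stacks.append(stack)
    return stacks

# Toy alignment, not real splice-site data.
for position, stack in enumerate(logo_stacks(["CAGGT", "CAGGT", "AAGGT", "CAGGA"])):
    print(position, [(base, round(height, 3)) for base, height in stack])
```

A fully conserved position yields a single letter 2 bits high, while a position with equal base frequencies yields a stack of zero height.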
From the logos shown, it is clear that the donor and acceptor
sites have different 'emphasis',
but this cannot be seen with
the consensus sequence CAGGT,
which matches both of them at the junction.
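
To see how one consensus can hide two different conservation patterns, consider the following minimal sketch. The two alignments are invented toy data, not the splice sites of Fig. 1, and the helper names `consensus` and `information` are hypothetical; the information at each position is again taken as 2 bits minus the Shannon uncertainty, without small-sample correction.

```python
import math
from collections import Counter

def consensus(sites):
    """Most frequent base at each aligned position."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))

def information(sites):
    """Sequence conservation in bits at each position (uncorrected sketch)."""
    n = len(sites)
    bits = []
    for col in zip(*sites):
        h = -sum((c / n) * math.log2(c / n) for c in Counter(col).values())
        bits.append(2.0 - h)
    return bits

# Hypothetical toy alignments, NOT real donor or acceptor data.
left_heavy = ["CAGGT", "CAGGA", "CAGTT", "CAGGT"]   # variation on the right
right_heavy = ["CAGGT", "AAGGT", "CTGGT", "CAGGT"]  # variation on the left

for name, sites in (("left_heavy", left_heavy), ("right_heavy", right_heavy)):
    print(name, consensus(sites), [round(b, 2) for b in information(sites)])
```

Both toy sets reduce to the same consensus string, yet their per-position information values differ, which is the kind of difference a logo makes visible.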
The difference in emphasis is important because it shows that there
is more information on the intron side of each junction.
This allows more freedom during the evolution of the protein-coding
exon side, which is a biologically sensible result.
The resemblance
between the two junctions suggests that the splicing
machinery that binds donors and the machinery that binds acceptors
share a common ancestor
[Stephens & Schneider, 1992].
Tom Schneider
2002-12-05