> next up previous
Next: Appendix 2: Correspondences between Up: Theory of Molecular Machines. Previous: Summary

   
Appendix 1: Introduction to Information Theory

Forty years ago Shannon published several famous papers that rigorously defined the concept of information so that it could be used in designing communications systems [Shannon, 1948,Shannon & Weaver, 1949,Pierce, 1980]. To allow information from several independent sources to be additive, he and earlier workers chose a logarithmic measure. One ``bit'' is the amount of information required to distinguish between two equally likely symbols, two bits are required to distinguish one symbol out of 4, and 3 bits are required to distinguish one symbol out of 8. In general, if there are M equally likely symbols to be distinguished, then one needs $\log_2 M$ bits to pick out one of them.

Communication requires at least three components. A transmitter sends a signal over a communications channel to a receiver that collects the signal for further use. The signal consists of a series of symbols, which convey some average amount of information, R, measured in bits per symbol [Shannon, 1948]. We follow Shannon and other early workers [Shannon, 1948,Brillouin, 1951,Tribus & McIrvine, 1971,Rothstein, 1951] and take this to be the uncertainty of the receiver before receiving symbols minus the uncertainty after reception:

 \begin{displaymath}R = H_{before} - H_{after}
\;\;\;\;\;\mbox{(bits per symbol)}
\end{displaymath} (43)

where an uncertainty H is

 \begin{displaymath}H = -\sum_{i=1}^M p_i \log_2 p_i
\;\;\;\;\;\mbox{(bits per symbol)}
\end{displaymath} (44)

and pi is the probability of each symbol i. When the symbols are equally likely, pi = 1/M and equation (44) simplifies to the form $H_{equal} = \log_2 M$. Likewise, when one symbol is certain, H = 0.

To find the maximum information from equation (43), the symbols appearing at the receiver must be equally likely, (so that $H_{before} = \log_2 M$) and every symbol must be exactly identified (no uncertainty left after reception, Hafter = 0). Under these circumstances the information is $R_{maximum} = \log_2 M$. If there is any noise ( $H_{after} \neq 0$), or the symbols are not equally likely ( Hbefore < Hequal) then this simple formula must not be used and R < Rmaximum.

If the symbols are sent at a rate of Wssymbols per second, then the channel carries WsR bits per second.

Shannon defined the ``channel capacity'' of a communications system and showed that it is:

 \begin{displaymath}C = W\log_{2} \left( \frac{P}{N} + 1 \right)
\;\;\;\;\;\mbox{(bits per second)}
\end{displaymath} (45)

where the bandwidth W is the range of frequencies used in the communication (in cycles per second or Hertz), and P / N is the ``signal-to-noise ratio'' [Shannon, 1949]. At the receiver a certain amount of signal power P (in joules per second) is required to distinguish the signals from each other in the presence of thermal noise N (also in joules per second).

Shannon proved a remarkable theorem about the channel capacity. One part of the theorem says that we cannot send information at a rate faster than the channel capacity. If we try to do this (i.e., WsR > C), a quantity of noise will be received that limits the rate to C. The other part of the theorem is surprising: if we transmit at any rate less than or equal to the channel capacity ( $W_sR \leq C$), then the transmission is possible with as low an error rate as we may desire.

There is a price to be paid to get a low error rate: we must carefully encode the signal before transmission and then carefully decode it afterward. Although both steps require a delay, the overall transmission rate can approach C. Unfortunately the derivation of (45) and the proof of the theorem do not tell us how to make codes which allow transmission at rates close to C. Nevertheless, the formula is useful for understanding and designing communication systems, and methods have been found for creating ``good'' codes [Gilbert, 1966,Hamming, 1986,Sourlas, 1989].




next up previous
Next: Appendix 2: Correspondences between Up: Theory of Molecular Machines. Previous: Summary
Tom Schneider
1999-12-09