A number of other sequence-dependent structural variations have been detected
that may serve locally important functions in DNA metabolism. For example, some
sequences cause bends in the DNA helix. Bends are produced whenever four or more
adenine residues appear sequentially in one of the two strands (Fig. 12-19). Six
adenines in a row produce a bend of about 18°. The
bending observed with this and other sequences may be important in the binding
of some proteins to DNA.
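The sequence rule above (four or more adenines in a row on either strand) can be sketched as a simple scan. This is only an illustration: the function name `find_a_tracts` and the tuple layout are hypothetical, and the code flags candidate sequences without predicting the bend angle.

```python
import re

def find_a_tracts(seq, min_len=4):
    """Return (strand, start, length) for each run of >= min_len adenines.

    `seq` is the top strand written 5'->3'. A run of T on the top strand
    corresponds to a run of A on the complementary (bottom) strand.
    `start` is a 0-based index into the top strand.
    """
    seq = seq.upper()
    hits = []
    for base, strand in (("A", "top"), ("T", "bottom")):
        # Builds a pattern such as "A{4,}" (four or more of the same base).
        for m in re.finditer(f"{base}{{{min_len},}}", seq):
            hits.append((strand, m.start(), m.end() - m.start()))
    return sorted(hits, key=lambda h: h[1])

print(find_a_tracts("GCAAAAAAGCTTTTGC"))
```

The example sequence is made up; it contains six adenines on the top strand and four thymines, which read as four adenines on the bottom strand.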
A rather common type of sequence found in DNA is a palindrome. A
palindrome is a word, phrase, or sentence that is spelled identically reading
forward or backward; two examples are ROTATOR and NURSES RUN. The term is
applied to regions of DNA in which there are inverted repetitions of base
sequence with twofold symmetry occurring over two strands of DNA (Fig. 12-20).
Such sequences are self-complementary within each of the strands and therefore
have the potential to form hairpin or cruciform (cross-shaped) structures (Fig.
12-21). When the inverted sequence occurs within each individual strand of the
DNA, the sequence is called a mirror repeat. Mirror repeats do not have
complementary sequences within the same strand and cannot form hairpin or
cruciform structures. Sequences of these types are found in virtually every
large DNA molecule and can involve a few or up to thousands of base pairs. It is
not known how many palindromes actually occur as cruciforms in cells, although
the existence of at least some cruciform structures has been demonstrated in
vivo in E. coli. Self-complementary sequences cause isolated single strands of
DNA to fold up in solution into complex structures containing multiple hairpins.
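The distinction between a palindrome and a mirror repeat can be made concrete with a short sketch. It assumes one strand is given 5'->3' using only the letters A, C, G, and T; the function names and example sequences are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, also written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindrome(seq):
    """Inverted repeat across the two strands: the strand equals its own
    reverse complement, so it is self-complementary."""
    return seq.upper() == reverse_complement(seq)

def is_mirror_repeat(seq):
    """Inverted repeat within one strand: the strand reads the same in
    both directions, with no complementarity to itself."""
    seq = seq.upper()
    return seq == seq[::-1]

for s in ("GAATTC", "TTAGCACCACGATT"):
    print(s, is_palindrome(s), is_mirror_repeat(s))
```

Only the first kind of sequence can pair with itself, which is why palindromes, and not mirror repeats, have the potential to form hairpins or cruciforms.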
A particularly unusual DNA structure, known as H-DNA, is found in
polypyrimidine/polypurine tracts that also incorporate a mirror repeat within
the sequence. One simple example is a long stretch of alternating T and C
residues, as shown in Figure 12-22. A novel feature of H-DNA is the pairing and
interwinding of three strands of DNA to form a triple helix. Triple-helical DNA
forms spontaneously only within long sequences containing only pyrimidines (or
only purines) in one strand. Two of the three strands in the H-DNA triple helix
(Fig. 12-22c, d) contain pyrimidines and the third contains purines.
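The two sequence features named above (a tract containing only pyrimidines or only purines in one strand, plus a mirror repeat within it) can be screened for as follows. This is a rough sketch: the length thresholds `min_tract` and `min_mirror` are arbitrary assumptions, not values from the text, and a hit marks only a candidate sequence, not a demonstrated triple helix.

```python
import re

def single_class_tracts(seq, min_len):
    """Find runs made only of pyrimidines (C, T) or only of purines (A, G)."""
    pattern = rf"[CT]{{{min_len},}}|[AG]{{{min_len},}}"
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq.upper())]

def contains_mirror_repeat(tract, min_len):
    """True if some window of at least min_len bases reads the same in
    both directions along the strand."""
    n = len(tract)
    for size in range(n, min_len - 1, -1):
        for start in range(n - size + 1):
            window = tract[start:start + size]
            if window == window[::-1]:
                return True
    return False

def h_dna_candidates(seq, min_tract=10, min_mirror=9):
    """Return (start, tract) for tracts that satisfy both sequence criteria."""
    return [(start, tract)
            for start, tract in single_class_tracts(seq, min_tract)
            if contains_mirror_repeat(tract, min_mirror)]

# An alternating T/C stretch embedded in mixed sequence.
print(h_dna_candidates("GATC" + "TC" * 8 + "GATC"))
```

The brute-force window search is quadratic in the number of windows, which is acceptable for short tracts but would need a better algorithm for genome-scale scans.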
These structural variations are interesting because there is a tendency for
many of them to appear at sites where important events in DNA metabolism
(replication, recombination, transcription) are initiated or regulated. For
example, the sites recognized by many sequence-specific DNA-binding proteins
(Chapter 27) are arranged as palindromes, and sequences that can form H-DNA are
found within regions involved in the regulation of expression of a number of
genes in eukaryotes. Much work is still required to define these structures and
determine their functional significance.
We now turn our attention briefly from DNA structure to the expression of the
genetic information contained in DNA. RNA, the second major form of nucleic acid
in cells, plays the role of intermediary in converting this information into a
functional protein.
In eukaryotes DNA is largely confined to the nucleus, whereas protein
synthesis occurs on ribosomes in the cytoplasm. Therefore some molecule other
than DNA must carry the genetic message for protein synthesis from the nucleus
to the cytoplasm. As early as the 1950s, RNA was considered the logical
candidate: RNA is found in both the nucleus and cytoplasm, and the onset of
protein synthesis is accompanied by an increase in the amount of RNA in the
cytoplasm and an increase in its rate of turnover. These and other observations
led several researchers to suggest that RNA carries genetic information from DNA
to the protein biosynthetic machinery of the ribosome. In 1961, François Jacob
and Jacques Monod presented a unified (and essentially correct) picture of many
aspects of this process. They proposed the name messenger RNA (mRNA) for that
portion of the total cell RNA carrying the genetic information from DNA to the
ribosomes, where the messengers provide the templates for specifying amino acid
sequences in polypeptide chains. Although mRNAs from different genes can vary
greatly in length, the mRNAs from a particular gene will generally have a
defined size. The process of forming mRNA on a DNA template is known as
transcription.
In prokaryotes a single mRNA molecule may code for one or several
polypeptide chains. If it carries the code for only one polypeptide, the mRNA is
monocistronic; if it codes for two or more different polypeptides, the mRNA is
polycistronic. In eukaryotes, most mRNAs are monocistronic. (The term cistron,
for purposes of this discussion, refers to a gene. The term itself has
historical roots in the science of genetics, and its formal genetic definition
is beyond the scope of this text.) The minimum length of an mRNA is set by the
length of the polypeptide chain for which it codes. For example, a polypeptide
chain of 100 amino acid residues requires an RNA coding sequence of at least 300
nucleotides, because each amino acid is coded by a nucleotide triplet (Chapter
26). However, mRNAs transcribed from DNA are always somewhat longer than needed
simply to specify the code for the polypeptide sequence(s). The additional
noncoding RNA includes sequences that regulate protein synthesis (Chapter 26).
Figure 12-23 summarizes the general structure of prokaryotic mRNAs.
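The length arithmetic above can be written out in a few lines. The function names are illustrative, and the result is only a lower bound on the coding portion, since real mRNAs also carry noncoding sequence.

```python
def min_coding_nucleotides(polypeptide_lengths):
    """Lower bound on coding sequence length: three nucleotides
    (one triplet) per amino acid residue, summed over all polypeptides
    encoded by the mRNA."""
    return 3 * sum(polypeptide_lengths)

def cistron_class(polypeptide_lengths):
    """Monocistronic if the mRNA codes for one polypeptide,
    polycistronic if it codes for two or more."""
    return "monocistronic" if len(polypeptide_lengths) == 1 else "polycistronic"

# The 100-residue example from the text.
print(min_coding_nucleotides([100]), cistron_class([100]))

# A hypothetical mRNA coding for three polypeptides.
print(min_coding_nucleotides([100, 250, 80]), cistron_class([100, 250, 80]))
```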
Messenger RNAs Code for Polypeptide Chains
Messenger RNA is only one of several classes of cellular RNA. Transfer RNAs
serve as adapter molecules in protein synthesis; covalently linked to an amino
acid at one end, they pair with the mRNA in such a way that the amino acids are
joined in the correct sequence. Ribosomal RNAs are structural components of
ribosomes. There is also a wide variety of special-function RNAs. All of these
are considered in detail in Chapter 25.
Regardless of the class of RNA being synthesized, the product of
transcription is always a single strand of RNA. The single-stranded nature of
these molecules does not mean their structure is random. The single strands tend
to take up a right-handed helical conformation that is dominated by
base-stacking interactions (Fig. 12-24). The stacking interactions are stronger
between two purines than between a purine and a pyrimidine or between two
pyrimidines. The purine-purine interaction is so strong that a pyrimidine
separating two purines will often be displaced from the stacking pattern so that
the purines can interact. Any self-complementary sequences in the molecule will
lead to more complex and specific structures. RNA can base-pair with
complementary strands of either RNA or DNA. The standard base-pairing rules are
identical to those for DNA: guanine pairs with cytosine and adenine pairs with
uracil (or thymine). One difference is that one unusual base pairing, between
guanine and uracil, is fairly common between two strands of RNA; see Fig. 12-26.
The paired strands in RNA or RNA-DNA are antiparallel, as in DNA.
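The pairing rules just described, including the G-U pair and the antiparallel orientation, can be checked with a small sketch. Both strands are assumed to be written 5'->3', and the function name and the `allow_gu` switch are illustrative. For simplicity the check permits G-U wherever it occurs, although the text describes it as common between two RNA strands.

```python
STANDARD = {("G", "C"), ("C", "G"),
            ("A", "U"), ("U", "A"),
            ("A", "T"), ("T", "A")}   # A-T covers RNA-DNA hybrids
GU_PAIRS = {("G", "U"), ("U", "G")}

def pairs_antiparallel(strand1, strand2, allow_gu=True):
    """True if the two strands, each written 5'->3', can pair along their
    whole length in antiparallel orientation."""
    if len(strand1) != len(strand2):
        return False
    allowed = STANDARD | GU_PAIRS if allow_gu else STANDARD
    # Reversing strand2 lines its 3' end up with the 5' end of strand1.
    return all((a, b) in allowed
               for a, b in zip(strand1.upper(), reversed(strand2.upper())))

print(pairs_antiparallel("GGAUC", "GAUCC"))                  # standard pairs only
print(pairs_antiparallel("GGAUC", "GAUUC"))                  # needs one G-U pair
print(pairs_antiparallel("GGAUC", "GAUUC", allow_gu=False))
```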
[Figure caption fragment: ... E. coli, showing many hairpins. RNase P also contains a protein component (not shown). This enzyme functions in the processing of transfer RNAs ...]
Unlike the double helix of DNA, there is no simple, regular secondary
structure that forms a reference point for RNA structure. The three-dimensional
structures of many RNAs, like those of proteins, are complex and unique. Weak
interactions, especially base-stacking (hydrophobic) interactions, again play a
major role in stabilizing structures. Where complementary sequences are present,
the predominant double-stranded structure is an A-form right-handed double
helix. Z-form helices have been made in the laboratory (under very high-salt or
high-temperature conditions). The B form of RNA has not been observed. Breaks in
the regular A-form helix caused by mismatched or unmatched bases in one or both
strands are common, and result in bulges or internal loops (Fig. 12-25). Hairpin
loops form between nearby self-complementary sequences in the RNA strand (Fig.
12-25). The potential for base-paired helical structures in many RNAs is
extensive (Fig. 12-26), and the resulting hairpins can be considered the most
common type of secondary structure in RNA. Certain short base sequences, such as
UUCG, are often found at the ends of RNA hairpins and are known to form
particularly tight and stable loops. Such sequences may play an important role
in nucleating the folding of an RNA molecule into its precise three-dimensional
structure. Important additional structural contributions are made by hydrogen
bonds that are not part of standard Watson-Crick base pairs. For example, the
2'-hydroxyl group of ribose can form a hydrogen bond with other groups, and a
variety of nonstandard base-pairing patterns are also observed. Some of these
properties are evident in the structure of the phenylalanine transfer RNA of
yeast (Fig. 12-27).
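A hairpin of the kind described above, in which two nearby self-complementary segments are separated by a short loop such as UUCG, can be located with a naive scan. This is a minimal sketch under simplifying assumptions: a fixed stem length, a perfect stem (no bulges or internal loops), G-U pairs allowed, and no energy calculation. The parameter defaults are arbitrary and the example sequence is made up.

```python
RNA_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"),
             ("U", "A"), ("G", "U"), ("U", "G")}

def find_hairpins(rna, stem_len=4, min_loop=3, max_loop=8):
    """Return (start, five_prime_arm, loop, three_prime_arm) for every
    position where two arms of length stem_len can pair antiparallel
    around a loop of min_loop..max_loop bases."""
    rna = rna.upper()
    n = len(rna)
    hits = []
    for i in range(n - 2 * stem_len - min_loop + 1):
        five = rna[i:i + stem_len]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop          # start of the 3' arm
            if j + stem_len > n:
                break
            three = rna[j:j + stem_len]
            # The 5' arm pairs with the 3' arm read in reverse.
            if all((a, b) in RNA_PAIRS for a, b in zip(five, reversed(three))):
                hits.append((i, five, rna[i + stem_len:j], three))
    return hits

print(find_hairpins("GGGCUUCGGCCC"))
```

Practical secondary-structure prediction scores competing structures by stability rather than by simple pattern matching, so a scan like this only lists possibilities.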
The analysis of RNA structure and its relationship to function is an emerging
field of inquiry that has many of the same complexities as the analysis of
protein structure. The importance of understanding RNA structure grows as we
become aware of an increasing number of functions of RNA molecules.