Structural Biochemistry/Volume 8 – Wikibooks, open books for an open world

Nucleic acids are long linear polymers, DNA and RNA, that carry genetic information passed from generation to generation. Their building blocks have three main parts: a pentose sugar, a phosphate group, and a nitrogenous base. The sugars and phosphate groups form the structural backbone, while the bases carry the genetic information and distinguish one nucleotide from another. There are two types of bases, purines and pyrimidines. The sugar (ribose or deoxyribose) and the use of uracil or thymine determine whether the nucleic acid is RNA or DNA.

A conceptualized depiction of multiple nucleotides. Green circles represent the pentose sugars, red circles represent the nucleobases, and yellow circles represent the phosphate groups. Note that a single nucleotide consists of one sugar, one base, and one phosphate group.

Nucleic acids are composed of smaller subunits called nucleotides. A nucleotide is a nucleoside joined to one or more phosphoryl groups by an ester linkage. In RNA the nucleotides are called adenylate, guanylate, cytidylate, and uridylate. In DNA they are called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate. A nucleoside is just a base attached to a sugar, without the phosphate groups. The nucleosides in RNA are called adenosine, guanosine, cytidine, and uridine. The nucleosides in DNA are called deoxyadenosine, deoxyguanosine, deoxycytidine, and thymidine.
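The naming scheme above can be sketched as a small lookup table. This is only an illustration: the dictionary names and the `name_unit` function are hypothetical helpers, and the entries are the names listed in the paragraph above.

```python
# Base -> (nucleoside name, nucleotide name), as listed in the text.
RNA_NAMES = {
    "adenine": ("adenosine", "adenylate"),
    "guanine": ("guanosine", "guanylate"),
    "cytosine": ("cytidine", "cytidylate"),
    "uracil": ("uridine", "uridylate"),
}
DNA_NAMES = {
    "adenine": ("deoxyadenosine", "deoxyadenylate"),
    "guanine": ("deoxyguanosine", "deoxyguanylate"),
    "cytosine": ("deoxycytidine", "deoxycytidylate"),
    "thymine": ("thymidine", "thymidylate"),
}


def name_unit(base, nucleic_acid, phosphorylated):
    """Return the nucleoside name (no phosphate) or nucleotide name (with phosphate)."""
    if nucleic_acid == "RNA":
        table = RNA_NAMES
    elif nucleic_acid == "DNA":
        table = DNA_NAMES
    else:
        raise ValueError("nucleic_acid must be 'RNA' or 'DNA'")
    nucleoside, nucleotide = table[base]
    return nucleotide if phosphorylated else nucleoside


print(name_unit("adenine", "RNA", phosphorylated=False))  # adenosine
print(name_unit("thymine", "DNA", phosphorylated=True))   # thymidylate
```

Looking up a base that does not occur in the chosen nucleic acid (for example uracil in DNA) raises a `KeyError`, which mirrors the rule that each polymer uses its own set of four bases.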

In organic chemistry, a phosphate, or organophosphate, is an ester of phosphoric acid. Organic phosphates are important in biochemistry and biogeochemistry.

General Phosphate structure

The backbone of the DNA strand is made from alternating phosphate and sugar residues. The sugars are joined together by phosphate groups that form phosphodiester bonds between the third and fifth carbon atoms of adjacent sugar rings.

Deoxyribose does not contain a hydroxyl group on the 2′ carbon. The absence of this hydroxyl group makes DNA more stable, because the backbone is more resistant to hydrolysis. This is one of the reasons why hereditary material is stored in DNA and not RNA. However, the net negative charge of the phosphate groups must be stabilized by metal ions, such as magnesium or manganese.

In the molecular bonding of deoxyribonucleotides (DNA) and ribonucleotides (RNA), a phosphodiester bond is a strong covalent linkage between a phosphate group and two five-carbon sugar rings. The phosphate group carries a negative charge and bonds to the 3′ carbon of one ring and the 5′ carbon of another.

The phosphodiester bond is formed in a reaction catalyzed by DNA polymerase, in which the incoming nucleoside triphosphate loses two of its phosphates as pyrophosphate (PPi). For example, during DNA elongation dATP releases pyrophosphate as its remaining phosphate forms a phosphodiester bond with the deoxyribose sugar at the 3′ end of the growing chain.

(DNA)n + dATP ⇌ (DNA)n+1 + PPi

A phosphodiesterase is an enzyme that hydrolyzes phosphodiester bonds, for example those of cyclic nucleotide phosphates. Phosphodiesterases are clinically significant and are important in repairing DNA sequences.


Carbohydrates are composed of monosaccharide units and range from the simplest sugars, such as glucose (chemical formula: C6H12O6), to more complex polysaccharides such as starch.
A single nucleotide monomer consists of one sugar molecule connected to 1) a heterocyclic, nitrogen-containing organic base, and 2) a phosphate group that connects the sugar components of different nucleotides together. The organic base is attached to carbon 1′ of the sugar, while the phosphate group is connected to carbon 5′. When nucleotides are strung together, the phosphate of the neighboring nucleotide attaches to carbon 3′ of the sugar.

Monosaccharides consist of aldehyde or ketone groups with hydroxyl groups as substituents. Sugars that contain an aldehyde group are called aldoses, and the sugars that contain a ketone group are called ketoses.

Sugars that are non-superimposable mirror images of each other are called enantiomers. Sugars that are stereoisomers but not mirror images of each other are called diastereoisomers. Diastereoisomers that differ in configuration at a single chiral center are called epimers.
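These definitions can be sketched as a comparison of stereocenter labels. The function below is a hypothetical helper, not a standard library routine. It assumes each sugar is written as a string of R/S descriptors for the same chiral centers in the same order, and it ignores special cases such as meso compounds. The example strings are the C-2 to C-5 descriptors of the open-chain aldohexoses as commonly tabulated.

```python
def classify_stereoisomers(a, b):
    """Classify two stereoisomers given as equal-length strings of R/S labels."""
    if not a or len(a) != len(b):
        raise ValueError("descriptors must be non-empty and of equal length")
    differing = sum(1 for x, y in zip(a, b) if x != y)
    if differing == 0:
        return "identical"
    if differing == len(a):
        return "enantiomers"    # every center inverted: mirror images
    if differing == 1:
        return "epimers"        # diastereomers that differ at one center
    return "diastereomers"      # stereoisomers that are not mirror images


D_GLUCOSE = "RSRR"    # C-2, C-3, C-4, C-5
L_GLUCOSE = "SRSS"
D_MANNOSE = "SSRR"    # differs from D-glucose at C-2 only
D_GALACTOSE = "RSSR"  # differs from D-glucose at C-4 only

print(classify_stereoisomers(D_GLUCOSE, L_GLUCOSE))    # enantiomers
print(classify_stereoisomers(D_GLUCOSE, D_MANNOSE))    # epimers
print(classify_stereoisomers(D_MANNOSE, D_GALACTOSE))  # diastereomers
```

The enantiomer check runs before the epimer check so that a molecule with a single chiral center, where one inversion inverts everything, is reported as an enantiomer pair.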

Sugars can exist in an open-chain form or a ring form. To form a six-membered hemiacetal ring, the carbon of the aldehyde group (C-1) is attacked by the oxygen atom of the C-5 hydroxyl group. The six-membered cyclic hemiacetal is called a pyranose because its structure is similar to that of pyran. To form a five-membered ring in a ketohexose such as fructose, the C-2 carbon of the ketone group is attacked by the oxygen atom of the hydroxyl group on C-5. The five-membered ring is called a furanose because its structure is similar to that of furan. When a furanose or pyranose ring is formed, a new stereocenter is created, and this new chiral carbon is called the anomeric carbon. It can have one of two configurations: in the α configuration (S at the anomeric carbon of D-glucopyranose) the hydroxyl group points down in the Haworth projection of a D-sugar, and in the β configuration (R at the anomeric carbon of D-glucopyranose) it points up. These two forms are diastereomers, not enantiomers, and the α and β forms are called anomers.

A reducing sugar is one that can act as a reducing agent because it has a relatively reactive hemiacetal group at the C-1 position. Examples include glucose, fructose, lactose, and maltose. The anomeric carbon in all of these molecules is free to react.

A non-reducing sugar, such as sucrose, does not react in this way. An acetal group at the anomeric position makes the sugar non-reactive: the structure is modified so that there is no free aldehyde or ketone group to react. In sucrose, neither monosaccharide of the disaccharide can easily open into an aldehyde or ketone, making it nonreactive and thus non-reducing. The glycosidic bond ties up the anomeric carbons, so they are not free to react. To determine whether or not a sugar is reducing, a Fehling’s or Tollens’ test is performed. In the Fehling’s test a brick-red precipitate is the positive result, and in the Tollens’ test a silver mirror is the positive result.

When a reducing sugar is oxidized, its aldehyde carbonyl becomes a carboxyl group.

It is called an O-glycosidic bond if the anomeric carbon is attached to the oxygen atom of a hydroxyl group. It is called an N-glycosidic bond if the anomeric carbon is attached to the nitrogen atom of an amine group.

Glycosidic bonds are also what form the bridges between monosaccharides. If monosaccharides are joined by O-glycosidic bonds, they are called oligosaccharides.

The presence or absence of an -OH group on carbon 2′ of the sugar is the difference between RNA and DNA. In RNA, carbon 2′ carries an -OH group, whereas in DNA carbon 2′ carries just a hydrogen. The sugar in RNA, or “ribonucleic acid,” is “ribose,” while the sugar in DNA, or “deoxyribonucleic acid,” is “deoxyribose.” The prefix deoxy- indicates the loss of the oxygen of the -OH group on carbon 2′ of ribose.
Figures: ribose structure; deoxyribose structure.

Importance of sugar in glycoproteins

Cell membrane drawing. This is a three-dimensional structure of a cell membrane that depicts the relationship between sugars and proteins in glycoproteins.

Proteins with sugars attached, called glycoproteins, are another important component of the cell. The sugar components of glycoproteins are oriented toward the watery cell exterior. They serve as identifiers, like cellular address labels. When signaling molecules passing through bodily fluids encounter particular patterns of sugars, they are either given access or turned away. Glycoproteins therefore act as regulators or gatekeepers in cells. In addition, they help direct the formation of organs and tissues by bringing the correct cells together. Sugar coatings also help cells move through blood vessels by providing traction, latching onto cell surface receptors.


Davis, Alison. “The Chemistry of Health.” NIGMS, August 2006: 36-42.

Deoxyribose Sugar

Deoxyribonucleic acid is typically depicted as a double helix and as the nucleic acid that serves as the template for the development of an organism. DNA, unlike RNA, lacks a hydroxyl (-OH) group at the 2′ carbon. Since there is no 2′ hydroxyl group, DNA can only form phosphodiester linkages from the 3′ carbon of one nucleotide to the 5′ carbon of another.

Due to the lack of the hydroxyl group, DNA is more resistant to hydrolysis than RNA is. The lack of the partially negative hydroxyl group also favors DNA over RNA in stability. There is always a negative charge associated with the phosphodiester bridges that join two nucleotides, and it repels the hydroxyl group in RNA, making RNA less stable than DNA.

The structure of ribose in RNA

The structure of deoxyribose in DNA

Deoxyribose is an aldopentose, meaning that it is a monosaccharide that contains five carbon atoms and has an aldehyde functional group in its linear structure. Essentially, this deoxy sugar is the pentose sugar ribose with the hydroxyl group at position 2 replaced by a hydrogen. Another name for deoxyribose in its ring form is deoxyribofuranose, which reflects the fact that it forms a five-membered ring with four carbon atoms and one oxygen atom.

Biological Importance of Deoxyribose

Derivatives of 2-deoxyribose, as well as of ribose, are important in biological processes. The most important of these derivatives carry a phosphate group attached at the 5-position of the ring. The mono-, di-, and triphosphates are of great importance, as is the 3′,5′-cyclic monophosphate form. Purines and pyrimidines form an important class of compounds with ribose and deoxyribose, including diphosphate dimers called coenzymes. Nucleosides are formed when purines and pyrimidines are coupled with a ribose sugar. Phosphorylated nucleosides are called nucleotides, and common nucleotides have a phosphate group attached at the 5-carbon and a base attached at the 1-carbon.

Nitrogenous bases can react with, and be added at, the hemiacetal carbon of deoxyribose. Common bases are adenine and guanine (purine derivatives), and thymine, uracil, and cytosine (pyrimidine derivatives). When adenine is coupled with ribose, the product is called adenosine, and when it is coupled with deoxyribose, it is called deoxyadenosine. The 5′-triphosphate derivative of adenosine, adenosine triphosphate (ATP), is vital for transferring energy within the cell.

2-deoxyribose and ribose nucleotides are usually found as an unbranched 5′-3′ polymer. The 3′-carbon of one monomer is attached to the 5′-carbon of another monomer, which is then attached through its own 3′-carbon to the 5′-carbon of the next monomer, and the chain can continue on for many millions of monomer units. These long polymer chains have very different physical properties from those of small molecules, and so these polymers make up another division known as macromolecules. The backbone of the polymer is the sugar-phosphate-sugar chain created by the 3′-5′ linkages, which is independent of which base is attached to the sugars.

Chromosomes contain 5′-3′ polymer chains of 2′-deoxyribose nucleotides. Each monomer is a deoxynucleotide of adenine, thymine, guanine, or cytosine, and the polymer is referred to as deoxyribonucleic acid, or DNA for short. In ribonucleic acid, or RNA, thymine is replaced with uracil. DNA found in chromosomes forms long helical structures containing two molecules that run antiparallel to each other, with the backbones facing outward, held together by hydrogen bonds between the complementary nucleotide bases (adenine with thymine, guanine with cytosine), which lie between the helical backbones. The absence of the 2′-hydroxyl group in DNA allows the backbone to be more flexible and to assume the full conformation of the long double helix, which in turn allows for coiling, so that long DNA molecules fit into the small volume of a cell nucleus. RNA, on the other hand, is known to form relatively short double-helical structures.
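Because the two strands pair A with T and G with C and run antiparallel, the partner of a strand can be derived by complementing each base and reversing the order. The sketch below illustrates this; the function names are hypothetical, and sequences are assumed to be plain strings written 5′ to 3′ using only the letters A, T, G, and C.

```python
# Watson-Crick partners in DNA.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'.

    The input is read backwards because the two strands are antiparallel.
    """
    return "".join(DNA_PAIRS[base] for base in reversed(strand_5_to_3.upper()))


def as_rna(dna):
    """Write the same sequence with uracil in place of thymine."""
    return dna.upper().replace("T", "U")


print(complementary_strand("ATGC"))  # GCAT
print(as_rna("ATGC"))                # AUGC
```

Applying `complementary_strand` twice returns the original sequence, which is a quick sanity check on the pairing table.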


Ribose primarily occurs as D-ribose. It is an aldopentose, a monosaccharide containing five carbon atoms that has an aldehyde functional group at one end. Typically, this species exists in the cyclic form. Ribose forms the backbone of RNA and is related to deoxyribose, as found in DNA, which lacks the hydroxyl group on the 2′ carbon.

Ribose makes RNA less resistant to hydrolysis and causes tension in RNA because of the negative charge of the phosphodiester bridge and the hydroxyl group on the 2′ carbon. The hydroxyl group can attack the phosphodiester bond that typically links the sugar to another ribose, thereby forming a cyclic phosphate. An example of a cyclic nucleotide is cyclic adenosine monophosphate (cAMP).

Roles of D-ribose in the body

Aside from being part of the backbone of RNA and, in its deoxy form, of DNA, D-ribose is also important in the creation of ATP, which all cells require to stay alive. It is currently used in medicinal practice to increase muscle energy and improve exercise performance. People with fibromyalgia and chronic fatigue syndrome who took a D-ribose supplement reportedly saw their conditions improve dramatically. The supplement is thought to help because these patients cannot produce a sufficient amount of ATP, and D-ribose helps them produce more.

D-ribose has an important role in improving heart function for patients who suffer symptoms of congestive heart failure (CHF). Ischaemia, a sudden decrease in blood supply, reduces the myocardial ATP level. Adding D-ribose replenishes the ATP level because it shortens the time needed to create and restore ATP. The patient can therefore exercise longer before experiencing left chest pain, because the heart muscle has an adequate amount of ATP. D-ribose also aided in regulating blood circulation in the heart by normalizing and readjusting blood flow through the left ventricle and atrium to accommodate the sudden change in blood supply. As a result, patients suffering from CHF have an improved quality of life after taking D-ribose supplements, because they are able to do more physical activity and return to a near-normal lifestyle.

D-ribose supplements are also important to athletes because they quickly replenish ATP levels in muscle, helping to increase stamina and aid in strength building. D-ribose shortens the time it takes to create ATP because it enters the pentose phosphate pathway directly to form ribose-5-phosphate, without having to go through the glucose-6-phosphate dehydrogenase and 6-phosphogluconate dehydrogenase steps, which are rate-limiting. These rate-limiting enzymes slow down the creation of ATP, so by bypassing them ATP is produced at a higher rate. Hence, D-ribose restores ATP that was lost during exercise faster.

Summary of the roles:
1. Provides a backbone for DNA and RNA
2. Restores ATP in the body
3. Improves muscle stamina
4. Regulates blood circulation in the heart

Natural sources of D-Ribose

D-ribose is a molecule that is naturally produced by the human body and is not found in food sources. However, riboflavin, a compound related to D-ribose that aids in its production, is found in a plethora of foods. Riboflavin, also known as vitamin B2, is found in eggs, milk products, nuts, vegetables, beef, and other protein sources. However, these foods should be kept in dimly lit areas because light can damage riboflavin.

Aside from helping form D-ribose, riboflavin also helps fight off free radicals, which can damage cells. Hence it also acts as an antioxidant for the body. Free radicals can damage cells, accelerate aging, and contribute to health conditions such as heart disease and cancer, so riboflavin helps by reducing the free radicals found in one’s body. Riboflavin also helps produce red blood cells, converts vitamin B6 into a form the body can use, and helps skin develop properly.

Summary of roles:
1. Helps form D-ribose
2. Acts as an antioxidant
3. Helps produce red blood cells
4. Converts vitamin B6 into a form the body can use
5. Helps skin develop properly




Figure: DNA double helix.

A DNA nucleotide is composed of 3 main units: a 5-carbon monosaccharide (deoxyribose), a phosphate group, and a nitrogenous base. While the monosaccharide and phosphate group alternate in sequence and form the backbone of the DNA double helix, the nitrogenous base may differ in every adjoining nucleotide. The four nitrogenous bases present in DNA are adenine (A), guanine (G), cytosine (C), and thymine (T). In RNA, the only differing nitrogenous base is uracil (U), which replaces thymine and differs from it only by the missing methyl group at carbon 5 of the pyrimidine ring. Of the nitrogenous bases, adenine and guanine are purines, in which an aromatic pyrimidine ring is fused to an imidazole ring, while cytosine, thymine, and uracil are pyrimidines, which are single-ring aromatic compounds. Nitrogenous bases, being hydrophobic, tend to face the inside of the double helix, pointing away from the surrounding aqueous environment. If the phosphate backbones faced the inside of the double helix, too many charges would be clustered together, and the double helix would be an unlikely product. The bonds linking the nitrogenous bases of two DNA strands are hydrogen bonds, with 3 H-bonds connecting cytosine and guanine and 2 H-bonds connecting adenine and thymine, while stacked bases are kept in close contact by van der Waals interactions. The aromaticity of the nitrogenous bases accounts for the DNA absorbance peak at 260 nm.
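The hydrogen-bond counts above (3 per G-C pair, 2 per A-T pair) make it easy to total the inter-strand hydrogen bonds in a fully paired duplex. The following is a minimal sketch; `count_hydrogen_bonds` is a hypothetical helper that assumes every base in the given strand is paired with its Watson-Crick partner.

```python
# Hydrogen bonds contributed by the base pair that each base takes part in.
H_BONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand):
    """Total inter-strand hydrogen bonds for a fully paired DNA duplex."""
    return sum(H_BONDS_PER_PAIR[base] for base in strand.upper())


print(count_hydrogen_bonds("ATGC"))  # 10
print(count_hydrogen_bonds("AATT"))  # 8
print(count_hydrogen_bonds("GGCC"))  # 12
```

Only one strand is needed as input, since each base stands for the pair it forms with the opposite strand.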

What is a Purine?


The name was invented by the German chemist Emil Fischer in 1884. A purine is a nitrogenous base that is planar, aromatic, and heterocyclic. Its structure consists of a six-membered ring (the pyrimidine ring) fused to a five-membered ring (the imidazole ring), with nitrogen atoms at positions 1, 3, 7, and 9. Adenine (A) and guanine (G) are examples of purines, and they are among the bases of DNA and RNA. They are also part of the structures of adenosine diphosphate (ADP), adenosine triphosphate (ATP), and several coenzymes. Purines form bonds with pentoses exclusively through the nitrogen atom at position 9.

Purine. Two of the bases found in both DNA and RNA, adenine (A) and guanine (G), are derivatives of purine.

6-amino and 2-amino-6-oxy purine

One derivative of purine, adenine (A), is also commonly known as 6-amino purine. It contains an amine group attached to the carbon atom at position 6, which is double-bonded to the nitrogen atom at position 1 and single-bonded to the carbon atom at position 5. Another derivative of purine, guanine (G), is also known as 2-amino-6-oxy purine. It contains an amine group attached to the carbon atom at position 2, which is double-bonded to the nitrogen atom at position 3 and single-bonded to the nitrogen atom at position 1. Guanine also has a carbonyl group at position 6, hence the 6-oxy.

2-amino-6-oxy purine; Guanine. Arrows indicate direction of hydrogen bonding.

Purine Content in Foods

Food is responsible for approximately 30% of the uric acid in the blood, so regular diet can affect the uric acid level. Some foods increase blood acidity even if their purine content is low.

Lowest Level of Purine: 0-50 mg

tea, coffee, soda, nuts, dairy products, vegetables, cereal, fruits, preserved foods, sweets

Moderate Level of Purine: 50-150 mg

spinach, avocado, beef, turkey, lamb, oysters, fish, peanuts, sausages, duck, chicken

High Level of Purine: 150-1000 mg

kidney, liver, heart, caviar, scallops, lobster, sardines, Thai fish sauce


A diet high in purines can lead to gout, a form of arthritis with symptoms of severe pain, redness, and swelling. Uric acid is a product formed from the breakdown of purines. Uric acid builds up in one’s joints, causing the inflammation and resultant pain.

2 Types of Purine Disorders of Nucleotide Synthesis

Adenylosuccinase deficiency

This disorder causes psychomotor retardation and seizures and is marked by a high level of succinyladenosine in the urine. Currently, there is no treatment.

Phosphoribosylpyrophosphate synthetase superactivity

A recessive disorder that causes overproduction of purines, which results in gout or developmental effects. Treatment can include a diet low in purines.


Purines are biochemically significant components of a myriad of biomolecules besides DNA and RNA, such as ATP, GTP, cyclic AMP, NADH, and coenzyme A. Although purine itself has not been found in nature, it can be produced through organic synthesis. Purines can also act as neurotransmitters through purinergic receptors (for example, adenosine activates adenosine receptors).


Many organisms utilize metabolic pathways in order to synthesize and break down purines. Biologically, purines are synthesized as nucleosides, which are bases attached to ribose.

Laboratory Synthesis

Purines can also be created artificially, not just through in vivo synthesis in purine metabolism. When formamide is heated in an open vessel at 170°C for 28 hours, purine is obtained.


1. Obtain a sample of formamide
2. Heat in an open vessel with a condenser for 28 hours in an oil bath at 170-190°C
3. Remove excess formamide through vacuum distillation
4. Reflux the residue with methanol
5. Filter the methanol solvent and remove by vacuum distillation

Adenine base. The NH group is bonded to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to thymine (in DNA) and uracil (in RNA).

Structure & Function

Adenine (A) is one of the four bases that make up nucleic acids. It is a purine base that pairs with thymine (T) in DNA and uracil (U) in RNA. The pair is held together by two hydrogen bonds, which help stabilize nucleic acid structures. Different structures of adenine mainly result from tautomerization, which allows the molecule to exist as isomeric forms in chemical equilibrium. The molecular formula of adenine is C5H5N5.
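As a worked example, the molar mass of adenine can be computed from its molecular formula. The sketch below is illustrative only: the atomic masses are rounded standard values supplied here rather than taken from the text, and `molar_mass` is a hypothetical helper that handles simple formulas without parentheses.

```python
import re

# Rounded standard atomic masses in g/mol (assumed values, not from the text).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula):
    """Sum atomic masses for a simple formula such as 'C5H5N5'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


print(round(molar_mass("C5H5N5"), 2))   # adenine, about 135.13 g/mol
print(round(molar_mass("C5H5N5O"), 2))  # guanine, about 151.13 g/mol
```

The two results differ by the mass of one oxygen atom, matching the extra carbonyl oxygen in guanine's formula C5H5N5O.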

An adenine molecule bound to deoxyribose, a sugar, is known as deoxyadenosine. An adenine bound to ribose, also a sugar, is known as adenosine, a key component of adenosine triphosphate. When three phosphate groups are attached to adenosine, the nucleotide adenosine triphosphate (ATP) is formed. Adenosine triphosphate is an important source of energy that is used in many cellular mechanisms, primarily in the transfer of energy in chemical reactions. The terminal phosphate of ATP can detach, resulting in a release of energy.

In addition to ATP, adenosine also plays a key role in other organic molecules such as nicotinamide adenine dinucleotide (NAD) and flavin adenine dinucleotide (FAD), both of which are involved in metabolism. Adenine can also be found in tea, vitamin B12, and several other coenzymes.

Formation and other forms of Adenine

In the human body, adenine is synthesized in the liver. Biological systems tend to conserve energy, so adenine is usually obtained through the diet: the body degrades nucleic acid chains to obtain individual bases and reuses them to build new nucleic acids during cell division. The vitamin folic acid is important for adenine synthesis.

Adenine forms adenosine, a nucleoside, when attached to ribose, and deoxyadenosine when attached to deoxyribose; it forms adenosine triphosphate (ATP), a nucleotide, when three phosphate groups are added to adenosine. Adenosine triphosphate is used in cellular metabolism as one of the basic methods of transferring chemical energy between reactions.

In older literature, adenine was sometimes called Vitamin B4. However it is no longer considered a true vitamin (see Vitamin B).
Some think that, at the origin of life on Earth, the first adenine was formed by the polymerizing of 5 hydrogen cyanide (HCN) molecules.


Adenine is one of the byproducts of the Purine metabolism, where inosine monophosphate (IMP) is synthesized with a pre-existing ribose through a complex process involving atoms from the amino acids glycine, glutamine, and aspartic acid, in addition to the formate ions transferred from coenzyme tetrahydrofolate.

Tautomerization
Tautomers are isomers related by a change in the position of a single hydrogen and a single double bond in a three-atom system, such as the keto and enol tautomers of a ketone. Like keto-enol tautomers, adenine, as well as cytosine, guanine, thymine, and uracil, may undergo tautomerization, interchanging between the amino and imino forms by intermolecular proton transfer.

Figure: ketimine-enamine tautomerism of adenine.



Guanine base. The NH group is bonded to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to cytosine.

Guanine is among the five nucleobases found in DNA and RNA. Its formula is C5H5N5O, and it is a planar, bicyclic molecule. Guanine has two forms, keto and enol, of which the keto form is the major one. Guanine, like adenine, is a derivative of purine, and it binds to cytosine through 3 hydrogen bonds. In cytosine, the amino group is the hydrogen-bond donor, and the C2 carbonyl and the N3 nitrogen are the hydrogen-bond acceptors. In guanine, the carbonyl group at C6 acts as the hydrogen-bond acceptor, and the group at N1 and the amino group at C2 act as the hydrogen-bond donors. The nucleoside containing guanine and ribose is called guanosine, and guanine bound to deoxyribose is called deoxyguanosine.

Guanine is capable of being hydrolyzed by strong acids to form ammonia, carbon monoxide, carbon dioxide, and glycine. Guanine oxidizes more readily than adenine, another purine-derivative nitrogenous base in nucleic acids. Guanine has a high melting point of 350°C due to the intermolecular hydrogen bonds between the oxo and amino groups in the crystal of the molecule. Also because of this intermolecular bonding, guanine is relatively insoluble in water as well as in weak acids and bases.

DNA base pair bonding

Figure: G-C base pair.

As the base-pair image shows, guanine and cytosine bond together through noncovalent hydrogen bonding at three distinct sites. Since cytosine and guanine share 3 H-bonds and adenine and thymine share 2 H-bonds, a higher GC content leads to a higher melting point than a higher AT content. An interesting note is that Watson and Crick first hypothesized that guanine and cytosine bonded together through hydrogen bonding at two distinct sites. [1]
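The link between GC content and melting point can be illustrated with a short calculation. The sketch below computes the GC fraction of a sequence and applies the Wallace rule, a common rule of thumb for short oligonucleotides that is not part of the original text: roughly 2 °C per A or T and 4 °C per G or C. Both function names are hypothetical, and the estimate is only a rough guide, not a measured value.

```python
def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C for a short oligonucleotide.

    Wallace rule (an assumption here): Tm = 2 * (A + T) + 4 * (G + C).
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for oligo in ("AAAATTTT", "ATGCATGC", "GGGGCCCC"):
    print(oligo, gc_content(oligo), wallace_tm(oligo))
```

For sequences of equal length, the estimate rises as A-T pairs are replaced by G-C pairs, which is the trend described above.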


Guanine may go through tautomerization, interchanging from the keto to the enol functionality by intermolecular proton transfer.


Guanine is also the name of the white amorphous substance found in fish scales. It serves as an additive to various products such as shampoos, metallic paints, and simulated pearls and plastics providing a pearly iridescent effect. Also, it adds a shimmering luster to eye shadow and nail polish. This pearly luster is produced by the crystalline form of guanine which are rhombic platelets composed of multiple transparent layers that have a high index of refraction that partially reflects and transmits light from layer to layer. To provide this effect, it can be applied by spraying, painting, or dipping.


  1. Watson, James D.; Crick, Francis H. C. (April 1953). “Molecular Structure of Nucleic Acids”. Nature 171: pp. 737-738.

Berg, Jeremy M. John L. Tymoczko. Lubert Stryer. Biochemistry Sixth Edition. New York: W.H. Freeman and Company, 2007.

Purine is a heterocyclic aromatic organic compound that consists of a pyrimidine ring fused to an imidazole ring. Purines and pyrimidines make up the two groups of nitrogenous bases. The name purine was coined by the German chemist Emil Fischer in 1884. Below are the DNA bases.

DNA Bases

Hypoxanthine (6-Hydroxypurine) is a naturally occurring purine derivative and deaminated form of adenine. It is an intermediate in the purine catabolism reaction and is occasionally found as a constituent in the anticodon of tRNA as the nucleosidic base inosine. It is also utilized as a nitrogen source in bacteria and parasite cultures for energy metabolism and nucleic acid synthesis.



Hypoxanthine exists as an intermediate in the biodegradation of AMP (adenosine monophosphate). It is first converted to xanthine with xanthine oxidase before it is excreted as urate.

Figure: urate reaction.

A deleterious reaction that can occur is a spontaneous deamination of adenine to form hypoxanthine. This is a mutagenic process because the result is a pairing of hypoxanthine with cytosine rather than thymine, due to hypoxanthine’s guanine-like form. This could lead to an error in DNA transcription and replication.
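The consequence of this deamination can be traced through two rounds of replication. The sketch below is a simplified illustration: `H` is a hypothetical one-letter code for hypoxanthine, the pairing table encodes the rule that hypoxanthine pairs with cytosine, and strands are compared position by position with strand orientation ignored for clarity.

```python
# Template base -> base placed opposite it during replication.
# "H" stands for hypoxanthine, which pairs with cytosine like guanine does.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "H": "C"}


def replicate(template):
    """Return the new strand, listed position by position against the template."""
    return "".join(PAIRING[base] for base in template)


original = "GATC"
damaged = original.replace("A", "H", 1)  # deaminate the adenine -> "GHTC"
daughter = replicate(damaged)            # "CCAG": C is placed opposite H
granddaughter = replicate(daughter)      # "GGTC": G now replaces the original A

print(original, damaged, daughter, granddaughter)
```

After the second round, the position that started as an A-T pair has become a G-C pair, which is why the deamination is mutagenic. Replicating the undamaged strand twice returns the original sequence.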

Figure: adenine deaminase scheme.



Xanthine is a purine base that is a precursor of uric acid and is generally found in muscle tissue, blood, urine, and some plants. It is a toxic, yellowish-white powder that is insoluble in water and acids but soluble in caustic soda; it sublimes when heated. It is involved in purine degradation, where it is formed from hypoxanthine and converted to uric acid by xanthine oxidase. Some of its derivatives are widely known as mild stimulants, including caffeine, a sleep-inhibiting methylated xanthine found in coffee, and theobromine, a bitter alkaloid found in cacao.


Xanthinuria is a rare genetic disorder of xanthine metabolism in which individuals are unable to convert xanthine into uric acid because they lack the enzyme xanthine oxidase, resulting in an accumulation of xanthine. Symptoms include renal failure and kidney stones. There is currently no treatment available to cure this disease.

Clinical Use

Xanthine derivatives are collectively known as xanthines, a group of alkaloids used as stimulants and bronchodilators. Because of their widespread side effects, many of these derivatives are treated as second-line asthma medications.



Figure: theobromine structure.
Theobromine (xantheose) is a xanthine derivative and bitter alkaloid commonly found in cacao plants. Its name is derived from Theobroma, the genus of the cacao tree; despite what the name might suggest, it does not contain bromine. Its structure is similar to that of caffeine, another well-known purine and xanthine derivative, except that it contains one fewer methyl group. It was first discovered in the cacao plant in 1841, isolated in 1878, and synthesized from xanthine by Hermann Emil Fischer shortly thereafter. In its pure form, it is a water-insoluble, crystalline white powder that has a milder effect than caffeine. Since dark chocolate has higher concentrations of theobromine than milk chocolate, its beneficial effects are better attained from the less diluted dark chocolate.

Therapeutic uses

Theobromine is known as a diuretic: it promotes the removal of excess fluid accumulated in the body from edema and the flushing of excess salts through increased production of urine.

It is also widely used as a vasodilator, which widens blood vessels and improves blood flow. This, in turn, helps reduce blood pressure, although it is reputed that flavanols have a bigger role in promoting that effect.

A patent on the use of theobromine for cancer prevention was granted in 2004, following research that reported anti-carcinogenic activity.



Theobromine has a weaker effect on the human central nervous system than caffeine because it is a weaker inhibitor of cyclic nucleotide phosphodiesterases and a weaker antagonist of adenosine receptors. It stimulates the heart, however, to a much greater degree than caffeine. It has also been cited as contributing to chocolate’s reputed role as an aphrodisiac.

Because theobromine is a myocardial stimulant, it increases the heart rate. As stated above, it also dilates blood vessels, which reduces blood pressure. Theobromine might be useful in treating cardiac failure because its diuretic action helps drain accumulated fluid. Ingesting too much theobromine can cause adverse effects: as a diuretic it increases the amount of urine produced, and it may also cause nausea, restlessness, sleeplessness, and anxiety.


Responsible pet owners should not feed dogs or cats products that contain cacao, because these animals metabolize theobromine much more slowly than humans do. Feeding them such products can cause theobromine poisoning, whose symptoms include digestive issues, dehydration, excitability, and a slow heart rate. Larger quantities of theobromine can result in epileptic-like seizures and even death.

What is a Pyrimidine?

A pyrimidine is a six-membered heterocyclic organic compound made up of 4 carbon atoms and 2 nitrogen atoms, with the nitrogens at positions 1 and 3.[1] It is one of three isomers of diazine, the other two being pyridazine (1,2-diazine) and pyrazine (1,4-diazine).[2] Pyrimidines are aromatic and planar. The nucleobases cytosine (C), uracil (U), and thymine (T) are all pyrimidines, each carrying different chemical groups. A pyrimidine can attach to a sugar phosphate, either ribose phosphate (which has a hydroxyl group at C-2′) or deoxyribose phosphate (which has a hydrogen atom at C-2′), through a glycosidic linkage at N1 to form a nucleotide, the monomeric building block of nucleic acids (DNA and RNA).

Pyrimidine. Two of the bases found in DNA, cytosine (C) and thymine (T), and a base found only in RNA, uracil (U), are derivatives of pyrimidine.

Pyrimidine Biosynthesis

1. Unlike in purine biosynthesis, the pyrimidine ring is synthesized first and then attached to the ribose phosphate.

2. It needs carbamoyl phosphate synthetase, which is located in the cytoplasm.

3. The pathway also depends on enzymes whose activity is controlled at two levels:

  • the amount of enzyme, which is regulated at the level of transcription
  • the activity of the enzyme, which is regulated through feedback inhibition by the pyrimidine nucleotide end products

4. The ring then closes.

5. A C=C double bond is formed when the ring is oxidized.

Chemical Properties

Pyrimidine has properties similar to those of pyridine. One similarity is that as the number of nitrogen atoms in the ring increases, the ring π electrons become less energetic and, as a result, electrophilic aromatic substitution becomes more difficult while nucleophilic aromatic substitution becomes easier. One example is the displacement of the amino group in 2-aminopyrimidine by chlorine, and its reverse reaction. Because their resonance stabilization is reduced, pyrimidines tend to undergo addition and ring-cleavage reactions rather than substitutions. An example of this is the Dimroth rearrangement. Pyrimidines are less basic than pyridines, and N-alkylation and N-oxidation are more difficult in pyrimidines as well.

Cytosine base. The NH group is bonded to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to guanine.


Cytosine is a member of the pyrimidine family and one of the five main nucleobases; it is found in both DNA and RNA. The molecular formula of cytosine is C4H5N3O. Cytosine consists of a heterocyclic aromatic ring, an amine group at C4, and a keto group at C2. Cytosine binds with ribose to form the nucleoside cytidine and with deoxyribose to form deoxycytidine.
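As a quick check on the molecular formula above, the molar mass of a base can be computed by summing atomic weights. The following is a minimal sketch; the function name `molar_mass` and the rounded atomic weights are illustrative choices, not part of the original text, and the parser only handles simple formulas without parentheses.

```python
import re

# Rounded standard atomic weights in g/mol (assumed values for illustration).
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def molar_mass(formula: str) -> float:
    """Sum atomic weights for a simple formula such as 'C4H5N3O'."""
    total = 0.0
    # Each match is (element symbol, optional count); a missing count means 1.
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[element] * (int(count) if count else 1)
    return total

if __name__ == "__main__":
    print("cytosine:", round(molar_mass("C4H5N3O"), 2), "g/mol")
    print("uracil:  ", round(molar_mass("C4H4N2O2"), 2), "g/mol")
```

The same function can be applied to the formula given for uracil (C4H4N2O2) elsewhere in this chapter.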

The molecule is planar, and cytosine forms three hydrogen bonds with guanine in the DNA double helix. In RNA, the nucleoside of cytosine is cytidine, which consists of cytosine and ribose; in DNA, it is deoxycytidine, which consists of cytosine and deoxyribose. The nucleotide of cytosine in DNA is deoxycytidylate, which consists of cytosine, deoxyribose, and phosphate.


Cytosine was discovered in 1894, when it was isolated by hydrolysis of calf thymus tissue. The first structure for cytosine was published in 1903, and the structure was validated when the compound was synthesized that same year (The Columbia Encyclopedia).

Chemical Activity

Base pair GC.svg

From the image, it can be seen that guanine and cytosine pair through noncovalent hydrogen bonding at three distinct sites. Interestingly, Watson and Crick first hypothesized that guanine and cytosine were joined by hydrogen bonds at only two sites. [3]
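The hydrogen-bond counts can be turned into a small calculation. The sketch below assumes the counts stated in this chapter (three bonds per G-C pair, two per A-T pair) and an ideal, fully paired DNA duplex; the function names are hypothetical.

```python
# Hydrogen bonds contributed by the pair that each base forms in a DNA duplex:
# G-C pairs have 3, A-T pairs have 2 (the counts given in the text).
H_BONDS_PER_PAIR = {"G": 3, "C": 3, "A": 2, "T": 2}

def duplex_hydrogen_bonds(strand: str) -> int:
    """Total inter-strand hydrogen bonds when `strand` is fully base-paired."""
    return sum(H_BONDS_PER_PAIR[base] for base in strand.upper())

def gc_fraction(strand: str) -> float:
    """Fraction of positions that are G or C."""
    strand = strand.upper()
    return sum(base in "GC" for base in strand) / len(strand)

if __name__ == "__main__":
    seq = "GATTACA"
    print(seq, "hydrogen bonds:", duplex_hydrogen_bonds(seq))
    print(seq, "GC fraction:", round(gc_fraction(seq), 3))
```

A sequence with a higher G-C fraction has more hydrogen bonds per base pair, which is one reason G-C-rich duplexes are generally harder to separate.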

Cytosine is found in DNA and RNA, and also as part of free nucleotides. When the nucleoside cytidine is bound to three phosphate groups, it forms cytidine triphosphate (CTP). This molecule can act as a cofactor for enzymes, and it can transfer a phosphate to convert adenosine diphosphate (ADP) to adenosine triphosphate (ATP) for use in chemical reactions.

In DNA and RNA, cytosine pairs with guanine through three hydrogen bonds. However, cytosine is unstable and can change into uracil through a process called spontaneous deamination. This can lead to a point mutation if DNA repair enzymes such as uracil glycosylase do not repair the damage by excising the uracil from the DNA.
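The way an unrepaired deamination becomes a point mutation can be traced with a toy model. This is a minimal sketch under simplifying assumptions: strands are written in the same left-to-right order (strand polarity is ignored), uracil is copied as if it were thymine because it pairs with adenine, and the function names are hypothetical.

```python
# Base each template base directs into the newly made strand.
# U pairs with A, so a deaminated cytosine is copied as if it were T.
PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def deaminate(strand: str, position: int) -> str:
    """Convert the cytosine at `position` into uracil."""
    if strand[position] != "C":
        raise ValueError("only cytosine can be deaminated to uracil")
    return strand[:position] + "U" + strand[position + 1:]

def copy_strand(template: str) -> str:
    """Return the strand synthesized against `template` (polarity ignored)."""
    return "".join(PAIRS_WITH[base] for base in template)

if __name__ == "__main__":
    original = "ATCG"
    damaged = deaminate(original, 2)          # C -> U
    first_copy = copy_strand(damaged)         # A is placed opposite the U
    second_copy = copy_strand(first_copy)     # T is placed opposite that A
    print(original, "->", damaged, "->", second_copy)
```

After two rounds of copying, the position that originally held C holds T, so the original C-G pair has become a T-A pair.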


Cytosine may go through tautomerization, interchanging from the amino to the imino functionality by intermolecular proton transfer.


Berg, Jeremy M. John L. Tymoczko. Lubert Stryer. Biochemistry Sixth Edition. New York: W.H. Freeman and Company, 2007.

CYTOSINE. The Columbia Encyclopedia, Sixth Edition

Uracil base. Present only in RNA, the N1 of the molecule bonds to the sugar within the nucleotide, and the other groups participate in hydrogen bonding to adenine.

Uracil is one of the five nucleobases, along with adenine, guanine, cytosine, and thymine, but it is found only in RNA. It is a naturally occurring pyrimidine derivative with the molecular formula C4H4N2O2. Uracil is planar and unsaturated and has the ability to absorb light.


Uracil is found in RNA, where it pairs with adenine via two hydrogen bonds; in DNA it is replaced by thymine. Methylation of uracil produces thymine. Uracil can pair with any of the bases depending on how the molecule is arranged in the helix, but it pairs most readily with adenine. In the uracil-adenine pair, each base acts as both a hydrogen bond donor and a hydrogen bond acceptor. When attached to a ribose sugar, uracil forms the nucleoside uridine.
When a phosphate attaches to uridine, uridine 5′-monophosphate is formed. Nucleotides are formed through a series of phosphoribosyltransferase reactions. Degradation of uracil produces β-alanine, carbon dioxide, and ammonia.
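Because uracil occupies in RNA the position that thymine occupies in DNA, converting a DNA coding sequence into the corresponding RNA sequence amounts to a one-letter substitution. The helper names below are illustrative only.

```python
def dna_to_rna(coding_strand: str) -> str:
    """Write the RNA equivalent of a DNA coding strand: T becomes U."""
    return coding_strand.upper().replace("T", "U")

def rna_to_dna(rna: str) -> str:
    """Inverse substitution: U becomes T."""
    return rna.upper().replace("U", "T")

if __name__ == "__main__":
    print(dna_to_rna("ATGCTT"))
```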

Uracil tautomerization: lactam structure (left) and lactim structure (right)

Uracil, like other bases, undergoes tautomerization. The keto tautomer is referred to as the lactam structure, while the imidic acid tautomer is referred to as the lactim structure. Both tautomeric forms are present at pH 7, with the lactam structure being the major form.

Uracil is a weak acid.

Chemical Activity

Uracil is capable of undergoing reactions such as oxidation, nitration, and alkylation. It can also react with elemental halogens because of the presence of more than one strongly electron donating group. A useful property of uracil is that in the presence of PhOH/NaOCl, it can be visualized in the blue region of UV light.

As stated above, uracil can bind with ribose sugars and phosphates to form very useful molecules such as uridine, uridine monophosphate (UMP), uridine diphosphate (UDP), and uridine triphosphate (UTP).


Uracil is a nucleobase that was discovered in the 1900s through the hydrolysis of yeast (Brown 1994). Uracil is an important component in helping enzymes to carry out different reactions and in the making of polysaccharides (New World Encyclopedia). Because uracil helps enzymes carry out different reactions in cells, it is important in the drug industry, where it helps with delivering drugs throughout the body. Although it is useful in drug delivery, it can increase the risk of cancer when the body is deficient in the nutrient folate (The Individualist). Uracil occurs naturally, but it can also be synthesized in the laboratory by reacting cytosine with water. This reaction produces two compounds, uracil and ammonia (Wikipedia).


Uracil may undergo tautomerization, interconverting between the keto and enol forms by intermolecular proton transfer, owing to its electron-rich ring.


New World Encyclopedia. Uracil. 17 November 2008.

Wikipedia. Uracil. 17 November 2008.

Brown, D.J. Heterocyclic Compounds: The Pyrimidines. Vol 52. New York: Interscience, 1994.

The Individualist. Uracil. 17 November 2008.

Thymine base. Present in only DNA, the N1 of the molecule bonds with the sugar within the nucleotide, and the other groups participate in hydrogen bonding to adenine.


Thymine can be derived from uracil by methylation at the 5th carbon, hence the other name of thymine, 5-methyluracil. Uracil takes its place in RNA, where it also pairs with adenine. Thymine is a planar, single-ring molecule. Thymine combined with deoxyribose yields deoxythymidine, usually called simply thymidine, while thymine combined with ribose yields ribothymidine (5-methyluridine).

Thymine binds with deoxyribose to form the nucleoside deoxythymidine, which is the same thing as thymidine. This compound can be phosphorylated with one, two, or three phosphoric acid groups creating thymidine mono-, di-, or triphosphate, respectively.

Thymine is involved in one of the most common mutations of DNA, which affects two adjacent thymines or cytosines. In the presence of UV light, these may form thymine dimers, causing “kinks” in the DNA molecule that interfere with normal function.

Thymine is also relevant to cancer treatment, where it is the target of the action of 5-fluorouracil (5-FU). Substitution of this compound for thymine (in DNA) and uracil (in RNA) inhibits DNA synthesis in actively dividing cells.


Thymine is a heterocyclic aromatic organic compound and a pyrimidine nucleobase. Heterocyclic compounds are organic compounds (those containing carbon) whose ring structure contains atoms in addition to carbon, such as sulfur, oxygen, or nitrogen. Aromaticity is a chemical property in which a conjugated ring of unsaturated bonds, lone pairs, or empty orbitals exhibits a stabilization stronger than would be expected from conjugation alone.

As the name 5-methyluracil implies, thymine may be derived by methylation of uracil at the fifth carbon. In DNA, thymine (T) binds to adenine (A) via two hydrogen bonds, which helps stabilize the nucleic acid structure.

Thymine joined with deoxyribose creates the nucleoside deoxythymidine, which is synonymous with thymidine. Thymidine can be phosphorylated with one, two, or three phosphoric acid groups, creating TMP, TDP, or TTP (thymidine mono-, di-, or triphosphate), respectively.

One of the common mutations of DNA involves two neighboring thymines or cytosines, which in the presence of ultraviolet light may form thymine dimers, causing “kinks” in the DNA molecule that impair normal function.
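Positions where such dimers could form can be located by scanning a strand for neighboring identical pyrimidines. The sketch below follows the description above literally and reports only adjacent TT and CC pairs; the function name and the decision to ignore mixed TC or CT neighbors are assumptions made for illustration.

```python
PYRIMIDINES = {"T", "C"}

def dimer_prone_sites(strand: str) -> list:
    """Return (index, dinucleotide) for each adjacent TT or CC pair."""
    strand = strand.upper()
    sites = []
    for i in range(len(strand) - 1):
        first, second = strand[i], strand[i + 1]
        # Only two neighboring thymines or two neighboring cytosines count here.
        if first == second and first in PYRIMIDINES:
            sites.append((i, first + second))
    return sites

if __name__ == "__main__":
    print(dimer_prone_sites("ATTGCCA"))
```

Overlapping sites are reported separately, so a run such as TTT yields two candidate positions.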

Thymine can also be a target for the action of 5-FU in cancer treatment. 5-FU acts as a metabolic analog of thymine (in DNA synthesis) or uracil (in RNA synthesis). Substitution of this analog inhibits DNA synthesis in actively dividing cells.


Thymine may undergo tautomerization, interconverting between the keto and enol forms by intermolecular proton transfer.


Al Mahroos, M., et al. “Effect of sunscreen application on UV-induced thymine dimers.” Arch Dermatol 138: 1480-5, 2002.

Ribonucleotide reductase (or RNR) is the enzyme responsible for catalyzing the reduction of ribonucleotides to deoxyribonucleotides. These deoxyribonucleotides can then be utilized by the cell in DNA replication. Additionally, because of the role RNR plays in the formation of deoxyribonucleotides, RNRs are responsible for regulating the rate of DNA synthesis within the cell.[1]

Classes of RNR[2]

  1. Class I: Class I RNRs consist of three subgroups (Ia, Ib, and Ic) which differ only slightly in primary structure; all of the subgroups contain two different dimeric subunits (R1 and R2) and require oxygen in order to form a stable radical. Class Ic RNRs are the most recently discovered, first found in Chlamydia trachomatis. Evidence also suggests their existence in archaea and eubacteria. The sequence of class Ic RNRs shows that the residues in the PCET pathway and in the active site for nucleotide reduction are similar among the three subgroups.[3]
  2. Class II: Class II RNRs form thiyl radicals with the help of adenosylcobalamin, which fulfills the role of the R2 subunit as a radical generator, and utilize thioredoxin or glutaredoxin as electron donors. Class II RNRs are therefore made up of only one type of subunit, occur as monomers or dimers, and neither require oxygen nor are inhibited by it.
  3. Class III: Class III RNRs, like Class I RNRs, are made up of two dimeric protein subunits (NrdG and NrdD); however, unlike in Class I RNRs which require R2 continuously to generate radicals, the small NrdG is only required during the activation of NrdD. The mechanism of Class III RNRs uses formate as an electron donor and generates an oxygen-sensitive glycyl radical, thus rendering the enzymes inactive in the presence of oxygen.

Radical Mechanism of RNR

Despite the differences in structure and electron donor, all three classes of RNR proceed via a free radical mechanism.[4] Ultimately RNR catalyzes a reaction which results in the replacement of the 2′-hydroxyl group of the ribose with a hydrogen atom resulting in a deoxyribose moiety.
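The net chemical change, replacing one hydroxyl group with a hydrogen atom, amounts to the loss of a single oxygen atom from the sugar. A minimal sketch of that bookkeeping follows; the composition of ribose (C5H10O5) is a standard formula rather than something stated in this chapter, and the function names are hypothetical.

```python
# Elemental composition of ribose, C5H10O5 (standard formula, assumed here).
RIBOSE = {"C": 5, "H": 10, "O": 5}

def reduce_hydroxyl(sugar: dict) -> dict:
    """Replace one -OH with -H: hydrogens are unchanged, one oxygen is lost."""
    if sugar.get("O", 0) < 1:
        raise ValueError("no oxygen left to remove")
    product = dict(sugar)      # copy so the input is not modified
    product["O"] -= 1
    return product

def as_formula(composition: dict) -> str:
    """Render a composition dict as a formula string, omitting counts of 1."""
    return "".join(f"{el}{n if n > 1 else ''}" for el, n in composition.items())

if __name__ == "__main__":
    deoxyribose = reduce_hydroxyl(RIBOSE)
    print(as_formula(RIBOSE), "->", as_formula(deoxyribose))
```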

Metallocofactor Assembly in Class I RNR[5]

Although the class I RNRs (Ia, Ib, and Ic) have comparable structures and pathways, the metallocofactors that they require to catalyze the conversion of nucleotides to deoxynucleotides differ remarkably. Studies of how these cofactors are generated, both in vitro and in vivo, and of how damaged cofactors are repaired show the significance of each subgroup’s dependence on a different cofactor. Studies of the pathways and activation of these metallocofactors have also helped our understanding of how biology prevents mismetallation and achieves cluster formation in high yields. All three class I RNRs share a common catalytic mechanism in which the metal cofactor is involved, directly or indirectly, in the oxidation of the conserved cysteine in the active site of the α subunit to a thiyl radical (S•). In classes Ia and Ib, this oxidation is carried out by the Y•.

  1. Class Ia: Class Ia RNR requires a FeIIIFeIII-Y• cofactor. It is located in β2 at the end of a hydrophobic channel, the supposed access route for O2 during cluster assembly. Studies of the E. coli enzyme showed that incubation of apo-β2 with FeII, O2, and reductant results in self-assembly of the FeIIIFeIII-Y• cofactor. In vivo, this process likely requires at minimum a single small protein or molecule to deliver FeII to apo-β2 and to deliver the extra reducing equivalent required to reduce O2 to H2O. This is also plausible because the Ia RNR binds MnII more tightly than FeII, thus requiring some type of chaperone protein to ensure proper metallation.
  2. Class Ib: Class Ib RNR is active with both FeIIIFeIII-Y• and MnIIIMnIII-Y• cofactors. The enzymes can form active FeIIIFeIII-Y• cofactors in vitro, but only the MnIIIMnIII-Y• cofactor was found to be relevant in vivo. This cofactor has been proposed to form via oxidation of a MnIIMnII center by an oxidant that the flavoprotein NrdI creates by reducing O2. In E. coli, studies have found that the manganese cofactor is induced when iron levels in the cell are limiting, pointing to the significance of manganese in this and other organisms. There is also a degree of organism-dependent variation in metal homeostasis to be considered, which may help explain why some organisms rely on one cofactor more frequently than the other.
  3. Class Ic: Class Ic RNR differs from class Ia and Ib RNRs in its proposed bimetallic cofactor, MnIVFeIII. The class Ic RNRs store a one-electron oxidizing equivalent in the metal cluster. In vitro self-assembly of Ic is similar to that of Ia and Ib in that the protein reacts with O2 and a reductant to form its MnIVFeIII cofactor; however, it differs in that it can also react with 2 equivalents of H2O2 to form the active cofactor. The class Ic RNR has been isolated from its native organism in vivo, and its assembly is complicated by the fact that the two different metals have similar affinities for the protein. In vitro studies in C. trachomatis have shown the necessity of regulating the levels of the metals, along with the order of their addition.

There are problems with proper metal loading in all three subclasses of class I RNR. The class Ia RNR requires a FeIIIFeIII-Y• cofactor, but the protein tends to bind MnII more tightly than FeII. In E. coli, correct metallation of NrdB depends on the levels of free MnII and FeII present, while iron chaperones are also present to overcome the preference for binding MnII. The issue in class Ib RNR is that it may bind either the FeIIIFeIII-Y• or the MnIIIMnIII-Y• cofactor, but only the manganese cofactor was found to be relevant in vivo. Ib binding depends on the preference of the individual organism and on the concentrations of each metal that it inherently possesses. The class Ic RNR complicates metallocofactor assembly since it requires two different metals with similar affinities for the same protein. Regulating the levels of both metals is important in order to prevent mismetallation, and successful assembly depends on the presence of both metals. In C. trachomatis, an absent or insufficient supply of MnII may lead to diiron cluster formation instead. In general, if the levels of any of the required metals in a class I RNR are not properly regulated, low activity and improper metallation result, and ultimately DNA synthesis is affected.

Biosynthesis and Repair of Metal Cofactors in Class I RNR[6]

Certain general principles and challenges exist when studying metallocofactor formation with different metals and levels of complexity, as summarized below. Physiological expression conditions are taken into account in studies of metalloenzymes to confirm whether the form of the protein studied in vitro is the same as its active form in vivo. Class I RNRs can control the concentration of the active metal cofactors through biosynthetic and repair pathways.

  1. Cofactors of metal proteins are generated by specific biosynthetic pathways.
  2. The proteins involved in the biosynthetic pathway are often associated with the operon of the metalloprotein of interest, and certain factors can be analyzed by comparing genomic sequences.
  3. To facilitate the exchange of ligands and protein factors, metals are transferred in their reduced state.
  4. There exists a variety of protein factors which include: metal insertase or chaperone to deliver the metal to the active site, specific redox proteins which control the oxidation state of the metal, and GTPases or ATPases which aid in the folding and unfolding processes to allow the metal to be inserted in the active site.
  5. Because of biological redundancy that affects pathway factors, multiple gene deletions are required in order to identify phenotypes in a gene deletion experiment.
  6. A hierarchy of metal delivery to proteins and its regulation is inferred but not completely understood.
  7. Compartmentalization (e.g. periplasm vs cytosol in prokaryotes) and the preferential affinities of proteins for certain metals are two likely factors that help prevent mismetallation at the cellular level.
  8. Several proteins have not been isolated from their native source and instead come from heterologous expression systems, which can lead to mismetallation. Since the optimum level of activity is not fully known, incorrect clusters corresponding to low activity may not be recognized.
  9. Certain oxidants can cause damage to the metal clusters (e.g. NO and O2) and specific pathways are used in their repair.
  10. Protons are typically required during changes in the oxidation state of the metal. Ligands to the metal can reorganize easily, and rearrangement of the carboxylate ligands is critical to the cluster assembly process.

One of the biggest complications is that the metal required for activity is often not the metal that has the highest affinity for a specific protein. The Irving-Williams series (MnII < FeII < CoII < NiII < CuII > ZnII) best describes the relative affinities of proteins for divalent metals, which also depend on the particular protein coordination environment where the binding takes place. For the later metals in the series, chaperone proteins exist to aid their movement to the active sites, while intracellularly they are likely to exist as “free” metals only at low concentration. Besides delivery, these chaperone proteins have another function, which is to help keep the free concentrations of these metals low so that they do not mismetallate other proteins that require MnII and FeII. Compartmentalization can overcome a protein’s binding preference, as certain activities occur in different parts of the cell which have and require varying amounts of a metal. In cyanobacteria, it was found that a MnII-dependent periplasmic protein must fold in the cytosol, where free MnII exists in a higher amount than ZnII, CuI, and CuII.
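The mismetallation problem can be illustrated by asking, for a given required metal, which other divalent metals the Irving-Williams series ranks as binding more tightly. This is a deliberately simplified sketch: it encodes only the ordering written above, ignores metal concentrations and the protein coordination environment, and the function name is hypothetical.

```python
# Ascending part of the Irving-Williams series as written in the text:
# Mn(II) < Fe(II) < Co(II) < Ni(II) < Cu(II). The series also states Cu(II) > Zn(II),
# but it does not rank Zn(II) against the other metals, so Zn is handled separately.
ASCENDING = ["Mn", "Fe", "Co", "Ni", "Cu"]

def stronger_binders(required_metal: str) -> list:
    """Metals that the series ranks as binding more tightly than `required_metal`."""
    if required_metal == "Zn":
        return ["Cu"]
    position = ASCENDING.index(required_metal)
    return ASCENDING[position + 1:]

if __name__ == "__main__":
    for metal in ("Mn", "Fe"):
        print(metal, "is outcompeted by:", stronger_binders(metal))
```

Manganese and iron sit at the weak end of the series, which is consistent with the need for chaperones and compartmentalization described above.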

Techniques to Study RNR Activity[7]

Several laboratory techniques are used to monitor the activity of the RNR metallocofactors. By tracking the movement of these cofactors, the techniques help establish accurate mechanisms for their generation and function in vitro and in vivo.

  1. Whole-Cell Electron Paramagnetic Resonance: EPR was used to study FeIIIFeIII-Y• biosynthesis in S. cerevisiae. It was found that Y• levels were sufficiently high to be detected at endogenous levels under various growth conditions, meaning that the Y• is not modulated as a function of the cell cycle. A small molecule or protein factor must be needed to rapidly reduce the Y• in cell lysates, indicating the presence of a metallocofactor, which was later identified to be iron.
  2. Mössbauer Spectroscopy: This type of spectroscopy monitors the movement of iron from oxidized and reduced iron pools into the RNR cofactor. It allows all oxidation states of iron to be detected simultaneously and is sensitive to the electronic environment surrounding the iron species present. For this technique to be accurate, cells first need to be labelled with the 57Fe isotope.

  1. Herrick J, Sclavi B. (2007). Ribonucleotide reductase and the regulation of DNA replication: an old story and an ancient heritage. Mol Microbiol. 63:22–34
  2. Nordlund P, Reichard P. (2006). Ribonucleotide Reductases. Annu Rev Biochem. 75:681–706
  3. Cotruvo J Jr, Stubbe J. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo. Annu Rev Biochem. 80:733–767
  4. Eklund H, Eriksson M, Uhlin U, Nordlund P, Logan D. (1997). Ribonucleotide reductase: structural studies of a radical enzyme. Biol Chem. 378:821–825
  5. Cotruvo J Jr, Stubbe J. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo. Annu Rev Biochem. 80:733–767
  6. Cotruvo J Jr, Stubbe J. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo. Annu Rev Biochem. 80:733–767
  7. Cotruvo J Jr, Stubbe J. (2011). Class I Ribonucleotide Reductases: Metallocofactor Assembly and Repair In Vitro and In Vivo. Annu Rev Biochem. 80:733–767


Nucleotides consist of a base, a sugar, and a phosphate group. They are the building blocks of nucleic acids. Nucleotides are essential to the body for many reasons. They are needed for gene replication and for transcription into RNA. They are also needed for energy: ATP, the body’s energy currency, is a nucleotide with adenine as its base, and guanine nucleotides (GTP) are also a source of energy. Furthermore, derivatives of nucleotides are necessary in various biosynthetic processes. Nucleotides are necessary in signal transduction pathways as well.

The Biosynthesis of Nucleotides

There are two kinds of pathways in the biosynthesis of nucleotides: de novo and salvage. The following table lists similarities and differences between the two pathways.

De Novo:
  • Simpler compounds are used in the synthesis of nucleotides. Numerous small pathways are repeated to assemble different nucleotides.
  • Synthesizes pyrimidine nucleotides. Bicarbonate, aspartate, and glutamine are used to synthesize the ring of the pyrimidine. The ring then links with ribose phosphate, forming the nucleotide.

Similarities:
  • Both synthesize nucleotides, though they utilize different mechanisms.
  • Both assemble ribonucleotides, which are then used to synthesize deoxyribonucleotides for DNA.

Salvage:
  • Bases are preformed, recovered, and reconnected to a ribose.
  • Synthesizes purine nucleotides. Various precursors may be used to form the purine ring, which is then added to ribose and phosphate.

Feedback inhibition regulates multiple steps in the biosynthesis of nucleotides. Examples include the regulation of aspartate transcarbamoylase in the synthesis of pyrimidines, which is inhibited by CTP and activated by ATP, and the feedback inhibition of glutamine-PRPP amidotransferase by purine nucleotides.

Reduction of Ribonucleotides to Deoxyribonucleotides

Ribonucleotide reductase catalyzes the reduction of ribonucleoside diphosphates to deoxyribonucleoside diphosphates. In this process, electrons flow from NADPH to sulfhydryl groups at the active site of ribonucleotide reductase. The reaction is summarized as follows:
1. An electron is transferred from a cysteine residue on R1 to the tyrosyl radical on R2. This creates a highly reactive cysteine thiyl radical at the active site of R1.
2. The thiyl radical abstracts a hydrogen atom from C-3′ of the ribose, creating a carbon radical.
3. The C-3′ radical promotes the release of the OH group at C-2′, which departs as H2O after protonation by a second cysteine residue.
4. A third cysteine residue then provides a hydride to complete the reduction at C-2′. This returns C-3′ to a radical and also generates a disulfide bond.
5. The C-3′ radical recaptures the hydrogen atom that the first cysteine had abstracted. A deoxyribonucleotide has now been generated and can leave the enzyme.

So What?

The biosynthesis and metabolism of nucleotides are important to the body because disruptions in them can result in pathology. If nucleotides are not degraded properly, certain conditions may arise. An example of this is gout. Urate is the end product of purine degradation, and gout develops when urate accumulates, damaging the joints and causing arthritis.
Similarly, if nucleotides are not synthesized properly, or if not enough are synthesized, conditions will arise as well. An example of this is Lesch-Nyhan syndrome, whose symptoms include mental deficiency, self-mutilation, and gout. This disease is due to the lack of an enzyme that is needed to synthesize purine nucleotides through the salvage pathway.

Source: Berg, Jeremy and Stryer, Lubert. Biochemistry: Fifth Edition. United States of America: W.H. Freeman and Company, 2002.

DNA and RNA Backbone

DNA and RNA are linear polymers built from monomers known as nucleotides, each of which consists of a nitrogenous base, a sugar, and a phosphate group. The linkages between these nucleotides form the backbone of DNA and RNA, and the backbone allows the bases to be arranged into unique genetic sequences. In DNA and RNA backbones, the monomers are connected by phosphodiester bridges. Specifically, a bridge is formed between the 3′-hydroxyl group of one sugar (ribose in RNA, deoxyribose in DNA) and the 5′-hydroxyl group of the adjacent sugar; this is called a 3′-5′ phosphodiester bond. Chemically, the 3′-hydroxyl group of one sugar is esterified to a phosphate group, and the same phosphate group is also esterified to the 5′-hydroxyl group of the adjacent sugar, forming the phosphodiester bridge.

Once the phosphodiester bond is established, the backbone needs to be preserved in order to maintain the genetic information of the nucleotide sequence, so further nucleophilic attack on the backbone must be avoided. The phosphate group in the phosphodiester bond carries a negative charge, which repels nucleophilic species such as hydroxide ions. The fact that DNA lacks a hydroxyl group on the 2′ carbon also makes it more resistant to nucleophilic attack, and thus a more stable hereditary material than RNA.
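The backbone arithmetic is simple enough to sketch. The example below assumes a single linear strand with no phosphate on either end and one negative charge per phosphodiester linkage, as described above; the function name is hypothetical.

```python
def backbone_summary(sequence: str) -> dict:
    """Count linkages in a linear strand of len(sequence) nucleotides.

    Assumes no terminal phosphate groups and one negative charge
    per phosphodiester linkage.
    """
    n = len(sequence)
    if n == 0:
        raise ValueError("sequence must contain at least one nucleotide")
    bonds = n - 1   # each linkage joins the 3' end of one sugar to the 5' end of the next
    return {
        "nucleotides": n,
        "phosphodiester_bonds": bonds,
        "backbone_charge": -bonds,
    }

if __name__ == "__main__":
    print(backbone_summary("ATGC"))
```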



Phosphodiester Linkage in DNA

What is DNA? DNA is a long linear polymer of nucleotides, each containing a deoxyribose sugar with a covalently bonded base; such polymers are known as nucleic acids. One of the major functions of DNA is the storage of genetic information. In DNA, a sequence of three bases, called a codon, encodes a single amino acid. The amino acid is added to a growing protein during the process of translation. In the form of genes, these nucleic acid polymers encode all of the materials an organism needs to live. Genes are small blocks of DNA that tell the cell which proteins it should create. The genes that a given cell receives depend entirely on the parent cells. Genes are passed on from generation to generation as a way of ensuring an organism’s survival genetically.
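The codon idea can be sketched as a lookup over consecutive triplets. The table below is only a small excerpt of the standard genetic code, written with DNA letters for the coding strand; the function name and the use of "Xaa" for codons missing from the excerpt are illustrative choices.

```python
# Excerpt of the standard genetic code (DNA coding-strand codons).
CODON_TABLE = {
    "ATG": "Met", "TTT": "Phe", "TTC": "Phe", "AAA": "Lys", "AAG": "Lys",
    "GGC": "Gly", "GCT": "Ala", "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}

def translate(dna: str) -> list:
    """Read codons from the first base and stop at the first stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any trailing one or two bases that do not form a full codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "Xaa")  # Xaa: not in this excerpt
        if residue == "Stop":
            break
        protein.append(residue)
    return protein

if __name__ == "__main__":
    print("-".join(translate("ATGTTTAAAGGCTGGTAA")))
```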

DNA stands for deoxyribonucleic acid. The prefix “deoxy” distinguishes DNA from its close relative RNA (ribonucleic acid). The prefix indicates that, unlike ribose, deoxyribose does not contain a hydroxyl group at the 2′ carbon; the group is replaced by a single hydrogen atom. The absence of this hydroxyl group is fundamental in determining the way in which DNA is able to condense itself within the nucleus of a cell.

DNA is a nucleic acid that can be duplicated by the enzyme DNA polymerase. Each of the four bases in DNA, adenine (A), cytosine (C), guanine (G), and thymine (T), is bonded covalently to a deoxyribose sugar. The four nucleotide units in DNA are called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate. A nucleotide includes a nucleoside, which is a nitrogenous base bonded to a deoxyribose or ribose group. The four nucleosides in DNA are deoxyadenosine, deoxyguanosine, deoxycytidine, and thymidine. A nucleotide is formed by joining one or more phosphate groups to a nucleoside through ester linkages.

The deoxyribose sugars and phosphate groups form the structural backbone of DNA: a phosphodiester bond links the 3′ carbon of one nucleotide to the 5′ carbon of the next. When DNA is not being replicated, it exists in the cell as a double-stranded helical molecule with the strands lined up antiparallel to each other. That is to say, if one strand is oriented 3′ to 5′, the other strand is oriented 5′ to 3′. The bases of each strand pair very specifically: A pairs with T and C pairs with G, and no other combination occurs in standard DNA. The bases are bound to one another in the interior of the helix by hydrogen bonds, with the phosphodiester backbone facing outward. It is here that the missing 2′ hydroxyl group plays an important role. Its absence allows DNA to adopt its conventional (B-form) double helix; RNA, which does have a hydroxyl group at the 2′ carbon, cannot adopt this same helical structure. The double helix structure of DNA was first proposed by Watson and Crick, and the functions of DNA were demonstrated in a series of experiments which will be discussed in the next few sections.
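The pairing and antiparallel rules described above can be illustrated with a minimal Python sketch. The function names and the example sequence are illustrative assumptions, not part of the original text.

```python
# Watson-Crick pairing rules for DNA as stated in the text: A-T and C-G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the partner base for each position, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand_5_to_3: str) -> str:
    """Return the partner strand written in its own 5' to 3' direction."""
    return complement(strand_5_to_3)[::-1]

top = "ATGCCG"                  # hypothetical strand, written 5' to 3'
print(complement(top))          # partner read 3' to 5': TACGGC
print(reverse_complement(top))  # partner read 5' to 3': CGGCAT
```

Because the strands are antiparallel, the partner strand has to be reversed before it can be written in the conventional 5′ to 3′ direction.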

Why DNA?
It is worth asking why DNA is the primary means by which all cells pass along genetic information. That is to say, why has evolution favored a DNA world over an RNA world, given that the two molecules are so similar structurally? The reasons involve chemical stability, the energy needed to form and break chemical bonds, and the availability of enzymes to perform these tasks. The primary reason is the relative stability of the two molecules. DNA is more chemically stable than RNA because it lacks the hydroxyl group on the 2′ carbon. In RNA, the ribose carries two hydroxyl groups (2′ and 3′) that could take part in a phosphodiester bond, which means that RNA is not forced into the same rigid structure as its deoxy counterpart. Additionally, the deoxyribose sugar in DNA is much less reactive than the ribose sugar in RNA; simply put, C-H groups are significantly less reactive than C-OH (hydroxyl) groups. This difference also explains why RNA is not very stable in alkaline conditions while DNA is: under alkaline conditions the 2′-OH group is deprotonated and attacks the neighboring phosphodiester bond, a reaction that DNA, lacking this group, cannot undergo. Furthermore, double-stranded DNA has relatively small grooves where damaging enzymes cannot attach, making it more difficult for them to attack the DNA. Double-stranded RNA, on the other hand, has much larger grooves and is therefore more easily broken down by enzymes. The connection between the strands of double-stranded DNA is also tighter than in double-stranded RNA; in other words, it is much easier to unzip double-stranded RNA than double-stranded DNA. Overall, RNA can be broken down and re-formed faster, and with less energy, than DNA. It is essential to the organism’s survival and well-being that its genetic material is encoded in something stable and resistant to change. In addition, the sequence of DNA and its physical conformation seem to play a part in DNA’s selection as well.
Another point that helps explain DNA’s prevalence as the primary store of genetic information is the control of the enzymes that break down DNA. The body actively destroys foreign nucleases, which are enzymes that cleave DNA. This is only one of the many ways DNA is protected against damage. The body can also recognize foreign DNA and destroy it while leaving its own DNA intact.

Hyperchromic Effect
Another characteristic feature of DNA is the hyperchromic effect: the increase in absorbance of UV radiation when the double helix is converted to the non-helical, single-stranded form. Equivalently, double-helical DNA absorbs less UV radiation than the separated strands. Base stacking in the helical conformation, held in register by hydrogen bonding between the complementary strands, reduces the capacity of the aromatic rings to absorb UV radiation, lowering UV absorption by about 40%. As the temperature is increased, the hydrogen bonds break and the helical structure begins to unwind. In this unwound form the aromatic rings are free to absorb much more UV radiation.

Properties of DNA

1. Consists of 2 strands (anti-parallel and complementary): DNA has two polynucleotide chains that twist around a helical axis in opposite directions.

2. It is made up of deoxyribose sugars and phosphates, which form the backbone on the exterior, and nucleic acid bases in the interior.

3. Bases are perpendicular to the helix axis and are separated by 3.4 Angstroms.

4. Strands are held together by hydrogen bonds and various other intermolecular forces to form a double helix. The base pairing involves 2 hydrogen bonds for A – T and 3 hydrogen bonds for C – G.

5. Backbone consists of alternating sugars and phosphates, where phosphodiester linkages form the covalent backbone of the DNA. The direction of DNA goes from the 5′ phosphate group to the 3′ hydroxyl group.

6. Repeats every 10 bases

7. Weak forces, namely hydrophobic effects and van der Waals interactions, stabilize DNA.

8. DNA chain is 20 Angstroms wide (2 nm)

9. One nucleotide unit is 3.4 Angstroms long (0.34 nm)

Primary Structure

DNA is made of two polynucleotide chains (strands) which run in opposite directions around a common axis. As a result, DNA has a double helical structure. Each polynucleotide chain of DNA consists of monomer units. A monomer unit consists of three main components: a sugar, a phosphate, and a nitrogenous base. The sugar used in the DNA monomer unit is deoxyribose (it lacks an oxygen atom on the second carbon of the furanose ring). There are four possible nitrogen-containing bases which can be used in the monomer unit of DNA: adenine (A), guanine (G), cytosine (C), and thymine (T). Adenine and guanine are purine derivatives, while cytosine and thymine are pyrimidine derivatives. A polymeric chain forms when nucleosides (a sugar covalently bonded to a nitrogen-containing base) are joined through phosphodiester linkages. Each polymeric chain is a single strand of the DNA molecule. Two strands run in opposite directions to form the double helix. The forces that keep the strands together are hydrogen bonds, hydrophobic interactions, van der Waals forces, and charge-charge interactions. The H-bonds form between base pairs of the antiparallel strands. A base in the first strand forms H-bonds only with a specific base in the second strand. Those two bases form a base pair (the H-bond interaction that keeps the strands together in the double helical structure). The base pairs are adenine-thymine (A-T) and cytosine-guanine (C-G). This interaction indicates that the nitrogen-containing bases are located inside the DNA double helix, while the sugars and phosphates are located on the outside. The hydrophobic bases are inside the double helix of DNA.
The bases, located inside the double helix, are stacked one on top of another. Stacked bases interact with each other through van der Waals forces. Even though van der Waals forces are weak, the sum of those forces can be substantial. The distance between two neighboring bases, which lie perpendicular to the main axis, is 3.4 Å. The DNA structure is repetitive. There are ten bases per turn, so each base is rotated 36° relative to the previous one. The diameter of the double helix is approximately 20 Å. The hydrophobic effect stabilizes the double helix. Structural variation in DNA is due to the different deoxyribose conformations, rotation about the contiguous bonds in the phosphodeoxyribose backbone, and free rotation about the C-1′-N glycosyl bond.
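The helix dimensions quoted above lend themselves to a quick calculation. The following is a minimal sketch, assuming the idealized values from the text (3.4 Å rise per base pair, ten base pairs per turn); the function name and returned keys are illustrative.

```python
RISE_PER_BASE_PAIR_ANGSTROM = 3.4  # spacing between stacked bases (from the text)
BASE_PAIRS_PER_TURN = 10           # idealized value used in the text

def helix_geometry(n_base_pairs: int) -> dict:
    """Estimate the length, number of turns, and twist of an idealized double helix."""
    return {
        "length_angstrom": n_base_pairs * RISE_PER_BASE_PAIR_ANGSTROM,
        "turns": n_base_pairs / BASE_PAIRS_PER_TURN,
        "twist_per_base_pair_deg": 360 / BASE_PAIRS_PER_TURN,
    }

# One full turn: 10 base pairs span 34 Å, with 36° of rotation per base pair.
print(helix_geometry(10))
```

Real DNA deviates from these round numbers (the text later gives about 10.5 base pairs per turn for the B form), so the result is only an estimate.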

The technique of Southern blotting is often used to detect a specific DNA sequence in a sample. The technique is named after Edwin Southern.

DNA Manipulation Techniques

Exploring genes and genomes depends on the technical tools available. Five important DNA manipulation techniques are:

1. Restriction Endonucleases – also known as restriction enzymes

Restriction enzymes cleave DNA at specific recognition sequences, splitting it into defined fragments. Having the DNA cut into specific pieces allows individual DNA segments to be manipulated.
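As a rough illustration of sequence-specific cleavage, the sketch below cuts one strand of a linear DNA at every occurrence of a recognition site. It uses EcoRI, which recognizes GAATTC and cuts after the G, as the default; the function name, parameters, and example sequence are illustrative assumptions, and the staggered cut on the opposite strand is not modeled.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Cut a linear strand at each occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the cut falls
    (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # whatever remains after the last cut
    return fragments

print(digest("AAGAATTCGGGAATTCTT"))  # ['AAG', 'AATTCGGG', 'AATTCTT']
```

Joining the fragments in order reproduces the input, which is a convenient sanity check when experimenting with other sites.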

2. Blotting Technique

To separate and characterize DNA, the Southern blotting technique is used. This technique is similar to the Northern blot, except that Southern blotting is used for DNA and not RNA. It identifies a specific DNA sequence among fragments that have been separated by electrophoresis through an agarose gel. The fragments are separated by size: large fragments stay near the top of the gel and small fragments migrate toward the bottom. Next, the DNA is transferred to a nitrocellulose sheet. Then a 32P-labeled DNA probe that is complementary to the sequence of interest is added to hybridize with the fragments. Finally, autoradiography film is used to visualize the fragment containing the sequence.

3. DNA Sequencing

By using the DNA sequencing technique, a precise nucleotide sequence of a DNA molecule can be determined. The key to DNA sequencing is the generation of DNA fragments whose length depends on the last base of the sequence. Even though there are different alternative methods, they all perform the same procedure on the four reaction mixtures.

A. Chain termination DNA Sequencing

A primer is always needed. To produce fragments, a 2′,3′-dideoxy analog of one dNTP is added to each of the four reaction mixtures. Incorporation of the dideoxy analog stops synthesis at that position, because the analog lacks the 3′-hydroxyl group needed to form the next phosphodiester bond. The four dNTPs used are dATP, dTTP, dCTP, and dGTP. In the end, the new DNA strands are separated by electrophoresis.
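The logic of chain termination can be sketched in a few lines of Python. This is a simplified model, not a laboratory protocol: it assumes every possible termination product is present, ignores the primer length, and uses hypothetical function names.

```python
def termination_fragments(new_strand: str) -> dict:
    """For each base, list the lengths of fragments ending in its dideoxy analog."""
    lengths = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        lengths[base].append(position)
    return lengths

def read_gel(lengths: dict) -> str:
    """Rebuild the sequence by reading fragments from shortest to longest."""
    by_length = {n: base for base, ns in lengths.items() for n in ns}
    return "".join(by_length[n] for n in sorted(by_length))

frags = termination_fragments("GATTC")
print(frags)            # {'A': [2], 'C': [5], 'G': [1], 'T': [3, 4]}
print(read_gel(frags))  # GATTC
```

Each of the four reaction mixtures corresponds to one key of the dictionary; electrophoresis orders the fragments by length, which is what `read_gel` imitates.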

B. Fluorescence Detection of Bases

A fluorescent tag that emits at a different wavelength is attached to each of the four chain-terminating dideoxy nucleotides.
It is an effective method because no radioactive reagents are used and long sequences of bases can be determined.
The fragments are separated by electrophoresis at high voltage.
The fragments are then detected by their fluorescence, and the base sequence is read from the sequence of colors.

C. Top-down (Shotgun) Method of Genome Sequencing

The top-down method and the shotgun method are similar; the main difference is that the top-down method requires a detailed map of the clones, whereas the shotgun method sequences large clones at random and matches the overlapping sequences computationally.

D. Microarrays (Gene chips)

Microarrays are useful for studying the expression of a large number of genes at once. The microarray is created by using either oligonucleotides or cDNA. After hybridization with fluorescently labeled cDNA, the fluorescence at each spot reports the expression level of the corresponding gene, and spots appear red or green. In the commonly used color scheme, red marks genes whose expression has increased relative to the control (gene induction), and green marks genes whose expression has decreased (gene repression).

4. DNA Synthesis

To synthesize DNA, a solid-phase method is used. The solid-phase synthesis is carried out by the phosphite triester method. In this process only one nucleotide is added in each cycle. The first step is the binding of the first nucleotide to the solid support. The incoming nucleotide is supplied as an activated deoxyribonucleoside 3′-phosphoramidite that carries a dimethoxytrityl (DMT) protecting group and a β-cyanoethyl (βCE) protecting group; because it is a basic nucleotide that has been modified and protected, any DNA sequence can be built from such units. After coupling, the phosphite triester is oxidized to a phosphate triester. Finally, the DMT group is removed by the addition of dichloroacetic acid so that the next cycle can begin. Throughout, the desired product remains bound to the insoluble support, and it is released at the end.

5. Polymerase Chain Reaction (PCR)

PCR is a technique for amplifying the DNA sequence that lies between two primer-binding sites. If the flanking sequences are known, millions of copies of the target can be obtained by using this technique. To carry out PCR, a DNA template, the four dNTP precursors, two primers complementary to the ends of the target, and a thermostable DNA polymerase are needed. What makes PCR distinctive is that the temperature is cycled through three different stages and that the cycle is typically repeated about 25 times. The three stages are:

1. Denaturing – The double-stranded parent DNA molecule is denatured into two single strands by heating the solution to 94°C.

2. Annealing – After the solution is cooled to between 50°C and 60°C, the two synthetic oligonucleotide primers hybridize, one to the 3′ end of the target on one strand and the other to the 3′ end of the target on the complementary strand.

3. Polymerization – A thermostable DNA polymerase catalyzes 5′ to 3′ DNA synthesis from each primer at 72°C.
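The statement that PCR yields millions of copies follows from doubling in every cycle. The sketch below is an idealized estimate; the `efficiency` parameter (1.0 meaning perfect doubling) and the function name are assumptions added for illustration.

```python
def pcr_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    With efficiency 1.0 every template is duplicated in each cycle,
    so the count grows as initial_templates * 2 ** cycles.
    """
    return initial_templates * (1 + efficiency) ** cycles

# One template molecule after 25 ideal cycles: 2**25 = 33,554,432 copies.
print(int(pcr_copies(1, 25)))
```

In practice the yield is lower, because efficiency falls below 1.0 as primers and dNTPs are consumed.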

Structural Variation

Structural variation occurs due to the different deoxyribose conformations, free rotation about the C-1′-N glycosyl bond, and rotation about the contiguous bonds in the phosphodeoxyribose backbone.

DNA has three secondary structure forms: A, B, and Z.
A Form:
1. Right handed
2. Glycosyl bond conformation is ANTI
3. Needs 11 base pairs per helical turn
4. Size of diameter is about 26 angstroms
5. Sugar pucker conformation is C-3′ endo

B Form:
1. Like the A form, the B form is right handed.
2. Glycosyl bond conformation is ANTI
3. Needs 10.5 base pairs per helical turn
4. Size of diameter is about 20 angstroms
5. Sugar pucker conformation is C-2′ endo

Z Form:
1. Unlike the A and B forms, the Z form is left handed.
2. Glycosyl bond conformation depends on the base: ANTI (for pyrimidines) and SYN (for purines)
3. Needs 12 base pairs per helical turn
4. Size of diameter is about 18 angstroms
5. Sugar pucker conformation is C-2′ endo (for pyrimidines) and C-3′ endo (for purines)

DNA libraries

A DNA library is a collection of cloned DNA fragments in a cloning vector that can be searched for a DNA of interest. If the goal is to isolate particular gene sequences, two types of library are useful.

Genomic DNA libraries

A genomic DNA library is made from the genomic DNA of an organism. For example, a mouse genomic library could be made by digesting mouse nuclear DNA with a restriction nuclease to produce a large number of different DNA fragments but all with identical cohesive ends. The DNA fragments would then be ligated into linearized plasmid vector molecules or into a suitable virus vector. This library would contain all of the nuclear DNA sequences of the mouse and could be searched for any particular mouse gene of interest. Each clone in the library is called a genomic DNA clone. Not every genomic DNA clone would contain a complete gene since in many cases the restriction enzyme will have cut at least once within the gene. Thus some clones will contain only a part of a gene.

cDNA Library

A cDNA library is a library of DNA copies of mRNAs. Because mature mRNA contains only exons (the introns have been spliced out), a cDNA library makes it possible to isolate the final, expressed version of a gene.

A cDNA library is screened as colonies. To look for a gene, transform bacteria with the collection of plasmids, grow colonies, and screen them with a probe; Southern hybridization can also be used. An oligonucleotide probe that is complementary to the gene of interest will reveal which bacterial colonies carry the DNA that corresponds to that mRNA in their plasmids.

How to make a cDNA library:
1. Isolate mRNA from the cell.
2. Use reverse transcriptase and dNTPs to create a DNA copy of the original mRNA.
3. RNA is easier to degrade than DNA, so treat with alkali solution to degrade the mRNA.
4. Use DNA polymerase to synthesize the second strand on the DNA template.
Ultimately, you end up with double-stranded DNA, one strand of which is identical in sequence to the mRNA (with T in place of U). After doing this for all of the mRNA, you can clone the products into plasmids. The collection of plasmids will include all of the mRNA sequences, but in the form of DNA.[1]
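The steps above can be mirrored at the sequence level with a short Python sketch. It only tracks base pairing, not the enzymology; the function names and the example mRNA are illustrative assumptions.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}  # pairing used by reverse transcriptase
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # pairing used by DNA polymerase

def first_strand_cdna(mrna: str) -> str:
    """Step 2: DNA strand complementary to the mRNA, written 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

def second_strand(cdna: str) -> str:
    """Step 4: DNA strand complementary to the first cDNA strand, written 5' to 3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))

mrna = "AUGGCCUAA"                # hypothetical mRNA, 5' to 3'
first = first_strand_cdna(mrna)   # TTAGGCCAT
second = second_strand(first)     # ATGGCCTAA
print(first, second)
```

The second strand has the same sequence as the mRNA with T in place of U, which is the sense in which one cDNA strand is "identical" to the mRNA.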

Flow of Genetic Information

  • Genetic information storage: genome
  • Replication: DNA –> DNA
  • Transcription: DNA –> RNA
  • Translation: RNA –> Proteins
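The flow from DNA to RNA to protein can be sketched as two small functions. This is a minimal illustration under stated assumptions: the codon table is deliberately partial (five codons of the standard genetic code), the input is taken to be the coding strand (so the mRNA is the same sequence with U in place of T), and the names are hypothetical.

```python
# Partial codon table (standard genetic code); enough for the example only.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GCC": "Ala", "GGC": "Gly", "UAA": "Stop"}

def transcribe(coding_strand: str) -> str:
    """Transcription: the mRNA matches the coding strand, with U replacing T."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list:
    """Translation: read codons three bases at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

mrna = transcribe("ATGTTTGCCGGCTAA")
print(mrna)             # AUGUUUGCCGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Ala', 'Gly']
```

A codon missing from the partial table raises a `KeyError`, which is intentional here: a complete implementation would need all 64 codons.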


  1. Viadiu, Hector. “Making a cDNA Library.” UCSD. Lecture. November 2012.


Berg, Jeremy. Biochemistry. 7th ed. New York: W.H. Freeman and Company, 2012. Print.

Berg, Jeremy; Tymoczko, J.; Stryer, L. (2012). Protein Composition and Structure. Biochemistry (7th ed.). W.H. Freeman and Company. ISBN 1-4292-2936-5.

Hames, David; Hooper, Nigel. Biochemistry. 3rd ed. New York: Taylor and Francis Group, 2005.


Deoxyribonucleic acid (DNA) stores information for the synthesis of specific proteins. Each DNA nucleotide consists of a phosphate group, a sugar (deoxyribose), and a nitrogenous base. DNA is a helical, double-stranded macromolecule with bases projecting into the interior of the molecule. The two strands are always complementary in sequence. One strand serves as a template for the formation of the other during DNA replication, which is the basis of inheritance. This feature of DNA provides a mechanism for the continuity of life. Rosalind Franklin's X-ray crystallography studies of the genetic material were central to uncovering this structure: the X-ray photograph she obtained revealed the physical structure of DNA to be a helix.

DNA has a double helix structure. The outer edges are formed by alternating deoxyribose sugar molecules and phosphate groups, which make up the sugar-phosphate backbone. The two strands run in opposite directions, one going in a 3′ to 5′ direction and the other going in a 5′ to 3′ direction. The nitrogenous bases are positioned inside the helix structure like “rungs on a ladder,” due to the hydrophobic effect, and stabilized by hydrogen bonding.

The two strands run in opposite directions to form the double helix. The strands are held together by hydrogen bonds and hydrophobic interactions. The H-bonds are formed between the base pairs of the anti-parallel strands. The base in the first strand forms a H-bond only with a specific base in the second strand. Those two bases form a base-pair (H-bond interaction that keeps strands together and form double helical structure). The base–pairs in DNA are adenine-thymine (A-T) and cytosine-guanine (C-G). Such interactions provide us an understanding that nitrogen-containing bases are located inside of the DNA double helical structure, while sugars and phosphates are located outside of the double helical structure.

The component consisting of the base and the sugar is known as the nucleoside. DNA contains deoxyadenosine (deoxyribose sugar bonded to adenine), deoxyguanosine (deoxyribose sugar bonded to guanine), deoxycytidine (deoxyribose sugar bonded to cytosine), and deoxythymidine (deoxyribose sugar bonded to thymine). The linkage of the bonds between the base to the sugar is known as the beta-N-Glycosidic linkage. In purines, this occurs between the N-9 and C-1′ and in pyrimidines this occurs between the N-1 and C-1′. A nucleoside and a phosphate group make up a nucleotide. The bond between the deoxyribose sugar of the nucleoside and the phosphate group is a 3′-5′ phosphodiester linkage.

The bases, located inside the double helix, are stacked. Stacking bases interact with each other through the Van der Waals forces. Although the energy associated with a Van der Waals interaction is relatively small, in a helical structure, a large number of atoms are intertwined in such interactions and the net sum of the energy is quite substantial. The distance between two neighboring bases that are perpendicular to the main axis is 3.4 Å. The DNA structure is repetitive. There are ten bases per turn, that is the structure repeats after 34 Å, so every base has a 36° angle of rotation. The radius of the double helix is approximately 10 Å.

An easy way to differentiate between ribonucleosides and deoxyribonucleosides is the atoms bonded to C-2′ of the sugar unit. In a deoxyribonucleoside, C-2′ bears two hydrogens. In a ribonucleoside, C-2′ bears one hydrogen and one hydroxyl group, with the hydroxyl group pointing down (“south”) in the usual drawing of the ring.

Structural variations in DNA can occur through:
1. Different deoxyribose conformations
2. Rotations around the contiguous bonds in the phosphodeoxyribose backbone
3. Free rotation about the C-1′-N glycosyl bond (syn/anti)[1]

Terms and Naming

There are two types of nucleic acids, ribonucleic acid (RNA) and deoxyribonucleic acid (DNA). Recall that a nucleoside is a base + sugar, and a nucleotide is a base + sugar + phosphate. The deoxy- prefix in deoxyribonucleotides is the nomenclature used for DNA. The term ribonucleotides is used for RNA, in which C-2′ of the sugar unit bears an -OH group (versus deoxy, in which C-2′ bears 2 hydrogens). Symbols are used to simplify the names. Take ATP (a precursor of RNA) as an example: the “A” in front signifies that the base is adenine and the “T” in the middle signifies triphosphate. AMP also has an adenine, but the “M” signifies that the sugar is bound to a single phosphate group. Finally, in dAMP, the “d” signifies a 2′-deoxyribonucleotide, whereas plain AMP is a ribonucleotide. In short, the four nucleotide units of DNA are called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate.
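The symbol convention described above is regular enough to decode mechanically. The sketch below is a hypothetical helper for symbols of the form `[d]XYP` such as ATP, AMP, or dAMP; it covers only this simple pattern.

```python
BASES = {"A": "adenine", "G": "guanine", "C": "cytosine", "T": "thymine", "U": "uracil"}
PHOSPHATES = {"M": 1, "D": 2, "T": 3}  # mono-, di-, triphosphate

def parse_nucleotide(symbol: str) -> dict:
    """Decode a symbol such as 'ATP' or 'dAMP' into sugar, base, and phosphate count."""
    deoxy = symbol.startswith("d")
    core = symbol[1:] if deoxy else symbol
    if len(core) != 3 or core[2] != "P":
        raise ValueError(f"unrecognized nucleotide symbol: {symbol}")
    return {
        "sugar": "2'-deoxyribose" if deoxy else "ribose",
        "base": BASES[core[0]],
        "phosphates": PHOSPHATES[core[1]],
    }

print(parse_nucleotide("ATP"))   # ribose, adenine, 3 phosphates
print(parse_nucleotide("dAMP"))  # 2'-deoxyribose, adenine, 1 phosphate
```

Conventional shorthand sometimes drops the "d" for thymine nucleotides (TTP for dTTP), a special case this sketch does not handle.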

Early foundation for DNA structures

The primary structure of a nucleic acid is its covalent structure and nucleotide sequence.
One of the most important contributions to determining the structure of DNA came from the work of Erwin Chargaff and his colleagues in the late 1940s. They found that the four nucleotide bases occur in different ratios in the DNA of different organisms and that the amounts of certain bases are closely related. They concluded the following about the structure of DNA:

DNA general structure and its bases

1. The base composition of DNA generally varies from one species to another.

2. DNA specimens isolated from different tissues of the same species have the same base composition.

3. The base composition of DNA in a given species does not change over time, nutritional states, or environment.

4. In all cellular DNA, regardless of the species, the number of adenine residues is equal to the number of thymine residues (A=T) and the number of guanine residues is equal to the number of cytosine residues (G=C).
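The fourth observation follows directly from complementary base pairing, which a short Python sketch can confirm: build the partner strand for any sequence and count the bases in the whole duplex. The function name and example sequence are illustrative assumptions.

```python
from collections import Counter

PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def duplex_base_counts(top_strand: str) -> Counter:
    """Count every base in a duplex made of a strand and its complement."""
    top = top_strand.upper()
    bottom = "".join(PAIRS[base] for base in top)
    return Counter(top + bottom)

counts = duplex_base_counts("ATGCCGTTA")  # arbitrary example strand
print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # True True
```

The equalities hold for the duplex as a whole, not necessarily for either single strand on its own.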

Later, in the early 1950s, Rosalind Franklin and Maurice Wilkins used a powerful X-ray diffraction technique called X-ray crystallography to deduce the DNA structure. Photographs produced by the X-ray crystallography method are not actually pictures of molecules; rather, they show the spots and smudges produced by X-rays that were diffracted (deflected) as they passed through crystallized DNA. Crystallographers use mathematical equations to translate such patterns of spots into information about the three-dimensional shape of DNA. Franklin and Wilkins found that DNA molecules are helical with two periodicities along their long axis, a primary one of 3.4 Å and a secondary one of 34 Å.

A DNA molecule separating to create new daughter DNA

Watson and Crick later based their model of DNA upon the data they were able to extract from Wilkins and Franklin’s X-ray diffraction photo.

They interpreted the pattern of spots on the X-ray photo to mean that DNA consisted of two chains and was helical in shape. Eventually, Watson and Crick formulated a DNA structure from the diffraction pattern of the X-ray photo, an insight that is still accepted today. In this structure, they proposed that two helical DNA chains of opposite direction wind around the same axis to form a right-handed double helix. The hydrophilic backbones, formed by phosphodiester bonds linking alternating deoxyribose sugars and phosphate groups, face the outside of the helix, where they are surrounded by the aqueous environment. The furanose ring of each deoxyribose sugar is in the C-2′ endo conformation. The purine and pyrimidine bases of both strands are stacked inside the double helix and stabilized by van der Waals interactions.

The double helix has a diameter of about 20 Å (a radius of 10 Å). Adjacent bases on one strand of the double helix are 3.4 Å apart. Every 10 base pairs constitute a 360° turn of the helix, so the length of the helix is 34 Å per 10 base pairs.

Nucleoside (adenosine) with beta-glycosidic bond


DNA molecules are asymmetrical, a property that is essential in the processes of DNA replication and transcription. A double-stranded DNA molecule consists of two complementary but separate strands that are intertwined into a helix through a network of H bonds. Although both right-handed and left-handed helices are among the allowed conformations, right-handed helices are energetically more favorable due to less steric hindrance between the side chains and the backbone. The direction of DNA is determined by the arrangement of the phosphate and deoxyribose sugar groups along the DNA backbone. One end of a DNA strand terminates with the 3′-OH group, whereas the other terminates with the 5′-phosphate group. DNA sequences are conventionally written from the 5′ to the 3′ terminus. In a double helix, the complementary DNA strands are oriented in opposite directions. DNA is a rather rigid molecule: at physiological conditions, DNA curves at a length scale of about 50 nm, which is about 25 times the 2 nm diameter of the double helix. Moreover, the alignment of the bases can indicate the global orientation of a DNA strand. For purine nucleotides (A and G) the most probable angle is approximately 88°, whereas for pyrimidines (C and T) that angle is approximately 105°.


Forces involved in DNA helices

The DNA double helix is held together by two main forces: hydrogen bonds between complementary base pairs inside the helix and the Van der Waals base-stacking interaction.

A G-C pair shows three hydrogen bonds; an A-T pair shows two hydrogen bonds

A typical nucleoside‎

Hydrogen bonds

Watson and Crick found that the hydrogen-bonded base pairs, G with C and A with T, are those that best fit within the DNA structure. It is important to note that three hydrogen bonds can form between G and C, but only two can form between A and T. Other pairings of bases tend to destabilize the double helical structure. This conclusion is consistent with the known fact that in each species the G content is equal to the C content and the T content is equal to the A content.

The three hydrogen bonds that link guanine (G) and cytosine (C) affect the thermal melting of DNA, which therefore depends on base composition. As the base composition varies, the melting point of the molecule will increase or decrease.
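To make the dependence on base composition concrete, the sketch below counts the Watson-Crick hydrogen bonds in a duplex, using two per A-T pair and three per G-C pair as stated in the text. The function name and example sequences are illustrative assumptions.

```python
# Hydrogen bonds contributed by the base pair that each base takes part in.
H_BONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}

def duplex_hydrogen_bonds(strand: str) -> int:
    """Total Watson-Crick hydrogen bonds when `strand` pairs with its complement."""
    return sum(H_BONDS_PER_PAIR[base] for base in strand.upper())

print(duplex_hydrogen_bonds("AAAA"))  # 8
print(duplex_hydrogen_bonds("GGGG"))  # 12
print(duplex_hydrogen_bonds("ATGC"))  # 10
```

Two duplexes of equal length can therefore differ by up to 50% in their number of hydrogen bonds, depending only on G-C content.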

Denaturing and Annealing

Ultraviolet (UV) light can be used to detect whether bases are stacked or unstacked. Stacked bases within the DNA structure are shielded from light, so the absorbance of UV light by double-helical DNA is much lower than that of single-stranded DNA. This characteristic is known as the hypochromic effect, in which less light is absorbed by the double helix of DNA molecules.

The melting temperature (Tm) is the temperature at which half of the DNA is double stranded and half is single stranded. The Tm depends greatly on base composition. Since G-C base pairs are stronger due to their additional hydrogen bond, DNA with high G-C content will have a higher Tm than DNA with greater A-T content.
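A rough numerical illustration of this dependence is the Wallace rule, an empirical rule of thumb for short oligonucleotides that assigns 2°C to each A or T and 4°C to each G or C. The rule is not part of the original text and is only a crude approximation; the function names are illustrative.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm_celsius(seq: str) -> int:
    """Wallace rule estimate for short oligonucleotides: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

print(wallace_tm_celsius("ATATATAT"))  # 16
print(wallace_tm_celsius("GCGCGCGC"))  # 32
```

The two example sequences have the same length, yet the G-C rich one has twice the estimated Tm, in line with the trend described above.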

When heat is applied to double-stranded DNA, the two strands will eventually separate (denature) because the hydrogen bonds between base pairs are disrupted. When the temperature is lowered again, the separated strands spontaneously reassociate to form the double helix. This process is known as annealing.

In biological systems, both denaturing and annealing can occur. Helicases use chemical energy (from ATP) to disrupt the structure of double-stranded nucleic acid molecules. The study of the ability of DNA to reanneal within the laboratory is important in discovering gene structure and expression.

Complex Structures

Complex structures can also be formed from single-stranded DNA. A stem-loop is formed when complementary sequences, within the same strand, pair to form a double helix. Hydrogen bonds between base pairs within the same strand occur. Often, these structures include mismatched bases, resulting in destabilization of the local structure. Such action can be important in higher-order folding, like in tertiary structures.
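Whether a single strand can fold back on itself into a stem-loop can be checked by comparing its two ends. The sketch below is a simplified test that only considers perfect Watson-Crick pairing between the 5′ and 3′ ends; the `min_loop` parameter, the function name, and the example sequence are assumptions for illustration.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def stem_length(seq: str, min_loop: int = 3) -> int:
    """Length of the perfect stem formed by pairing the 5' end with the 3' end.

    Pairing stops at the first mismatch or when fewer than `min_loop`
    unpaired bases would remain for the loop.
    """
    n = 0
    while 2 * (n + 1) + min_loop <= len(seq) and PAIRS[seq[n]] == seq[-(n + 1)]:
        n += 1
    return n

print(stem_length("GGACTTTTGTCC"))  # 4: GGAC pairs with GTCC, leaving a TTTT loop
print(stem_length("AAAAAAAA"))      # 0: no complementary ends
```

Real stem-loops often tolerate the mismatches mentioned above, which this strict check does not model.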

Hypochromic Effect

DNA absorbs very strongly in the UV region, at wavelengths close to 260 nm. Single-stranded DNA absorbs more UV light than double-stranded DNA. UV absorption decreases when DNA forms a double strand, and this characteristic is an indication of DNA stability. Because the stacked bases absorb less of the incident light energy, the double helix is disturbed less by it, so its structure, and therefore its function, remains intact.

The decreased absorbance of the DNA double helix relative to the denatured form is explained by the stacking of the nitrogenous bases in the double helix, which leaves them less exposed to radiation so that they absorb less. The aromaticity of the nitrogenous bases (specifically the purine and pyrimidine ring structures) accounts for the absorption peak at 260 nm.

Weak forces

Various weak forces come together to stabilize the DNA structure.

  • Hydrogen bonds, the linkages between bases, are individually weak, but they are able to stabilize the helix because of the large number present in a DNA molecule.
  • Stacking interactions, also known as van der Waals interactions between bases, are weak, but the large number of these interactions helps to stabilize the overall structure of the helix.
    • The double helix is stabilized by hydrophobic effects: burying the hydrophobic bases in the interior of the helix keeps them away from the surrounding water, whereas the more polar, hydrophilic surfaces are exposed and interact with the exterior water.
    • Stacked base pairs also attract one another through van der Waals forces. The energy associated with a single van der Waals interaction has little significance for the overall DNA structure; however, the net effect summed over the numerous atom pairs results in substantial stability.
    • Stacking also favors the conformations of the rigid five-membered rings of the backbone sugars.
  • Charge-charge interactions: the electrostatic (ion-ion) repulsion between the negatively charged phosphates is potentially destabilizing; however, Mg2+ ions and cationic proteins rich in arginine and lysine residues stabilize the double helix.

Nitrogenous Bases

Nitrogenous bases are the foundational structure of DNA polymers; the structure of a DNA polymer varies with the nitrogenous bases attached to it.

Nitrogenous bases can tautomerize between keto and enol forms. The aromaticity of the pyrimidine (cytosine, thymine, and, in RNA, uracil) and purine (adenine, guanine) ring systems and the electron-rich nature of their -OH and -NH2 substituents enable them to undergo keto-enol tautomeric shifts. The keto tautomer is called a lactam and the enol tautomer is called a lactim. The lactam predominates at pH 7. Keto-enol tautomerization is the interconversion of a keto and an enol form involving the movement of a proton and the shifting of bonding electrons; hence the isomerism qualifies as tautomerism.

Keto-enol tautomerization

Keto-enol tautomerism is important elsewhere in biochemistry as well: the high phosphate-transfer potential of phosphoenolpyruvate arises because the phosphorylated compound is trapped in the less stable enol form, whereas dephosphorylation yields the keto form. In DNA, rare enol tautomers of the bases guanine and thymine can lead to mutation because of their altered base-pairing properties.

Base-stacking interactions

The two strands of double-stranded DNA are held together by a number of weak interactions such as hydrogen bonds, stacking interactions, and hydrophobic effects. Of these, the stacking interactions between base pairs are the most significant. The strength of base-stacking interactions depends on the bases: it is strongest for stacks of G-C base pairs and weakest for stacks of A-T base pairs. The hydrophobic effect stacks the bases on top of one another. The stacked base pairs attract one another through van der Waals forces, typically 2 to 4 kJ mol⁻¹. In addition, base stacking in DNA is favored by the conformations of the somewhat rigid five-membered rings of the sugar-phosphate backbone. The base-stacking interactions, which are largely nonspecific with respect to the identity of the stacked bases, make the major contribution to the stability of the double helix.

Phosphodiester Bond

Phosphodiester Bond between nucleotides

Phosphodiester linkages form the covalent backbone of DNA. A phosphodiester bond is the linkage formed between the 3′ carbon atom of one deoxyribose sugar and the 5′ carbon atom of the adjacent deoxyribose sugar in DNA.

The phosphate groups in a phosphodiester bond are negatively charged. The pKa of the phosphate groups is near 0, so they are negatively charged at neutral pH (pH = 7). The resulting charge-charge repulsion forces the phosphate groups to take positions on opposite sides of the DNA strands, and the charge is neutralized by proteins (histones), metal ions such as magnesium, and polyamines.

The triphosphate or diphosphate forms of the nucleotide building blocks first have to be broken apart to release the energy required to drive the enzyme-catalyzed reaction in which a phosphodiester bond forms and the nucleotide joins the chain. Once a single phosphate or two phosphates (pyrophosphate) break away and take part in the catalytic reaction, the phosphodiester bond is formed.

The hydrolysis of phosphodiester bonds plays an important role in repairing DNA sequences. It is catalyzed by phosphodiesterases, enzymes that facilitate the repairs.

One reason that DNA is more stable than RNA is the absence of the 2′-OH group in DNA. The presence of an OH group on the 2′ carbon makes RNA more susceptible to reaction. A base can remove the H from this hydroxyl group (when everything is in the correct trajectory); the resulting nucleophile attacks the phosphate, the backbone rearranges, and eventually a P-O bond is broken, severing the connection between two sugars.

Secondary Structures of DNA

Major and Minor Grooves

Base pairing of complementary nucleotides makes up the secondary structure of DNA. A single-stranded DNA may also take part in intramolecular base pairing between complementary bases and therefore form secondary structure as well. Base pairing between adenine (A) and thymine (T) and between guanine (G) and cytosine (C) is possible because these base pairs are similar in size. This means there are no “bulges” or “gaps” within the double helix.

Irregular placement of base pairs in a double helix can render the macromolecule nonfunctional. Therefore, if there is something wrong with the structure, signals are sent and DNA repair works to fix the damage.

As a result of the double-helical nature of DNA, the molecule has two asymmetric grooves, one smaller than the other. This asymmetry results from the geometrical configuration of the bonds between the phosphate, sugar, and base groups, which forces the base groups to attach at 120 degree angles instead of 180 degrees. The larger groove, called the major groove, occurs where the backbones are far apart; the smaller one, called the minor groove, occurs where they are close together.

Since the major and minor grooves expose the edges of the bases, the grooves can be used to read the base sequence of a specific DNA molecule. The possibility of such recognition is critical, since proteins must be able to recognize the specific DNA sequences to which they bind in order for the proper functions of the body and cell to be carried out. As you might expect, the major groove is richer in information than the minor groove, which allows DNA-binding proteins to interact with the bases there. This makes the minor groove less ideal for protein binding.

Visual Representation of Major and Minor Grooves in DNA Structure

A form

The following features characterize the A-form DNA structure:

1. Most RNA and RNA-DNA duplexes adopt this form

2. Shorter, wider helix than the B form

3. Deep, narrow major groove not easily accessible to proteins

4. Wide, shallow minor groove accessible to proteins, but with lower information content than the major groove

5. Favored conformation at low water concentrations

6. Base pairs tilted relative to the helix axis and displaced from the axis

7. Sugar pucker C3′-endo (in RNA, the 2′-OH inhibits the C2′-endo conformation)

8. Right handed

9. Diameter of about 26 angstroms

10. About 11 base pairs per helical turn

11. Glycosyl bond conformation is anti

B form

The double-helical structure of normal DNA takes a right-handed form called the B-helix. It is about 20 angstroms in diameter, with a C-2′ endo sugar pucker conformation. The helix makes one complete turn approximately every 10 base pairs (34 Å per turn at 3.4 Å per base pair). B-DNA has two principal grooves, a wide major groove and a narrow minor groove. Many proteins interact in the space of the major groove, where they make sequence-specific contacts with the bases. In addition, a few proteins are known to make contacts via the minor groove.

B and Z form DNA

Z form

DNA sequences can flip from the B form to the Z form and back. The Z form of DNA is a more radical departure from the B structure; the most obvious distinction is its left-handed helical rotation.

The Z form is about 18 angstroms in diameter with 12 base pairs per helical turn, and the structure appears more slender and elongated. The DNA backbone takes on a zigzag appearance. Certain nucleotide sequences fold into left-handed Z helices much more readily than others. Prominent examples are sequences in which pyrimidines alternate with purines, especially alternating C and G or 5-methyl-C and G residues. To form the left-handed helix in Z-DNA, the purine residues flip to the syn conformation, alternating with pyrimidines in the anti conformation. The major groove is barely apparent in Z-DNA, and the minor groove is narrow and deep. The sugar pucker conformation is C-2′ endo for pyrimidines and C-3′ endo for purines.

Z-DNA formation occurs during transcription of genes, at transcription start sites near promoters of actively transcribed genes. During transcription, the movement of RNA polymerase induces negative supercoiling upstream and positive supercoiling downstream of the site of transcription. The negative supercoiling upstream favors Z-DNA formation, so one function of Z-DNA would be to absorb negative supercoiling. At the end of transcription, topoisomerase relaxes the DNA back to the B conformation.

Tertiary structure (3 dimensional)

The tertiary structure of a DNA molecule describes how the double helix, formed by two strands of DNA wound around each other, is arranged in three-dimensional space.

  • Linking number (Lk): in a covalently closed circular DNA, where the two strands cannot be separated, the number of times one strand winds around the other is constant for a given molecule. Lk is an integer made up of two components, Lk = Tw + Wr:
1) Twist (Tw): number of helical turns of the DNA strands
2) Writhe (Wr): number of supercoiled turns in the DNA

As an example, a relaxed circular DNA of about 260 base pairs has an Lk of about 25. If the molecule is underwound by two turns, Lk falls to 23. This deficit can be taken up as two negative supercoils with the twist unchanged, or as two “turns' worth” of unwound, single-stranded DNA with no supercoils. This kind of interconversion of helical and superhelical turns is important in gene transcription and regulation.
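The bookkeeping between linking number, twist, and writhe can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the 260 base pair circle is a hypothetical molecule, and the default of 10.5 base pairs per turn is taken from the B-form entry in the comparison table on this page. The superhelical density formula, (Lk − Lk0)/Lk0, is the usual textbook definition rather than something stated in the text above.

```python
def relaxed_linking_number(n_bp: int, bp_per_turn: float = 10.5) -> int:
    """Nearest-integer Lk of a relaxed closed circle (bp_per_turn is an assumed value)."""
    return round(n_bp / bp_per_turn)


def writhe(lk: int, tw: float) -> float:
    """Writhe implied by Lk = Tw + Wr."""
    return lk - tw


def superhelical_density(lk: int, lk0: int) -> float:
    """Fractional change in linking number relative to the relaxed state."""
    return (lk - lk0) / lk0


# Hypothetical 260 bp circle, underwound by two turns.
lk0 = relaxed_linking_number(260)
lk = lk0 - 2

# Same Lk, two ways to accommodate the deficit:
print(writhe(lk, tw=lk0))   # twist unchanged, deficit appears as negative supercoils
print(writhe(lk, tw=lk))    # twist reduced (unwound region), no supercoils
print(superhelical_density(lk, lk0))
```

A negative writhe or a negative superhelical density both indicate underwinding; because Lk is fixed in a closed circle, any change in twist must be balanced by an opposite change in writhe.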

Quaternary structure and other unusual structure

DNA associates with histones and non-histone proteins to form chromatin. The negative charge of the phosphate groups makes DNA relatively acidic, and these negatively charged groups bind to the basic histones.

Histone Modification

Recent studies show that actively transcribed regions are characterized by specific patterns of histone modification. Experiments on the dynamics of histone modification show a significant kinetic distinction between methylation, phosphorylation, and acetylation. This suggests that these modifications have different roles in gene expression patterns.

Histones are the proteins around which DNA wraps to form chromatin. The basic unit of chromatin is the nucleosome, which is formed by a histone octamer (two molecules each of H2A, H2B, H3, and H4) together with 147 base pairs of DNA wrapped around it in a superhelix. The accessibility of DNA is regulated by higher-order chromatin structures, which arise from the packing of nucleosomes. The N-terminal tails of the histone molecules are believed to contribute to chromatin function by mediating inter-nucleosomal interactions and by recruiting non-histone proteins to the chromatin. The tails direct interactions with chromatin binders, which are thought to be the driving force that modulates chromatin structure. Modifications can also act in other ways, for example through the unfolding or assembly of nucleosomes, which is itself involved in gene regulation. It is hoped that this can provide an explanation of epigenetic inheritance, in which phenotypic differences between individuals cannot be due to differences in DNA, as in monozygotic twins.

Epigenetic inheritance refers to changes in gene activity that are not encoded by the DNA sequence. The modifications involved include phosphorylation, methylation, ADP-ribosylation, SUMOylation, and ubiquitylation. These modifications can be considered active or repressive depending on whether they occur in active or silent genes. It has been shown that methylation can have different outcomes depending on the binders of the histone modifications. Nucleosome positioning is found to be related to the DNA sequence and may contribute to epigenetic inheritance. [2]

Structural Variation in DNA

The structural variation in DNA is mostly due to:

1) Varying deoxyribose conformations (4 total conformations)
2) Rotations about the contiguous bonds in the phosphodeoxyribose backbone (between the C1-C3 and C5-C6)
3) Free rotation about C1′- N-glycosyl bond (resulting in syn or anti conformation)

Because of steric hindrance, purine bases in nucleotides are restricted to two stable conformations with respect to deoxyribose, called syn and anti. Pyrimidines, on the other hand, are generally restricted to the anti conformation because of steric interference between the sugar and the carbonyl oxygen at C-2 of the pyrimidine.

Comparison of A, B, and Z form of DNA

| Property | A form | B form | Z form |
|---|---|---|---|
| Helical sense | Right handed | Right handed | Left handed |
| Diameter | 26 Å | 20 Å | 18 Å |
| Base pairs per helical turn | 11 | 10.5 | 12 |
| Helix rise per base pair | 2.6 Å | 3.4 Å | 3.7 Å |
| Base tilt normal to the helix axis | 20° | 6° | 7° |
| Sugar pucker conformation | C-3′ endo | C-2′ endo | C-2′ endo for pyrimidines and C-3′ endo for purines |
| Glycosyl bond conformation | Anti | Anti | Anti for pyrimidines and syn for purines |
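Two derived quantities follow directly from the values in this comparison: the helical pitch (rise per base pair times base pairs per turn) and the average rotation per base pair (360° divided by base pairs per turn). The short Python sketch below computes both. The dictionary and function names are illustrative only, and the rotation is reported as a magnitude, so the left-handed sense of the Z form is not encoded in the sign.

```python
# Values copied from the comparison of A, B, and Z forms above.
HELIX_FORMS = {
    "A": {"bp_per_turn": 11.0, "rise_angstrom": 2.6},
    "B": {"bp_per_turn": 10.5, "rise_angstrom": 3.4},
    "Z": {"bp_per_turn": 12.0, "rise_angstrom": 3.7},
}


def pitch_angstrom(form: str) -> float:
    """Length of one full helical turn along the axis, in angstroms."""
    params = HELIX_FORMS[form]
    return params["bp_per_turn"] * params["rise_angstrom"]


def rotation_per_bp_degrees(form: str) -> float:
    """Average magnitude of helical rotation between adjacent base pairs."""
    return 360.0 / HELIX_FORMS[form]["bp_per_turn"]


for name in HELIX_FORMS:
    print(name, round(pitch_angstrom(name), 1), round(rotation_per_bp_degrees(name), 1))
```

With 10.5 base pairs per turn the B-form pitch comes out slightly above the 34 Å quoted in the B form section, which uses the rounder figure of 10 base pairs per turn.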


  1. Viadiu, Hector. “DNA Structure” UCSD Lecture. November 2011.
  2. Teresa Barth and Axel Imhof. “Fast signals and slow marks: the dynamics of histone modifications.” Trends in Biochemical Sciences vol.31:11. Nov. 2010 (618-626).

Campbell and Reese’s Biology, 7th Edition

Nelson and Cox’s Lehninger Principles of Biochemistry, 5th Edition
Telomeres (from the Greek telos, “an end”) are long stretches of repeating non-coding DNA sequences at the ends of the DNA strand. They protect the ends of DNA and prevent DNA strands from shortening or attaching to other molecules by masking the chromosome. The Russian scientist Alexey Olovnikov was the first to postulate the problem of chromosomes replicating at the tip.[1] He theorized that in every subsequent replication bits of the DNA would be lost until a critical limit had been reached, whereupon cell division would cease.


Telomerase adding Telomere extension

Telomerase is the enzyme that creates telomeres. It adds specific repeating sequences (“TTAGGG” in all vertebrates) to the ends of DNA strands.

The telomerase enzyme has an RNA template that partially attaches to the shortened end of the DNA strand. New nucleotides then pair with the template, extending the DNA strand. Once the telomerase leaves, the double-stranded DNA is completed by DNA polymerase. Telomerase was discovered in 1985 by Carol W. Greider and Elizabeth Blackburn. For this discovery, they were awarded the 2009 Nobel Prize in Physiology or Medicine along with Jack W. Szostak.[3]

Szostak and Blackburn first discovered telomeres in ciliates. They chose ciliates because, at one stage of its life cycle, a ciliate makes about a million new telomeres. The model they created includes a telomere-dedicated DNA polymerase, which adds telomeric repeats onto chromosome ends. Telomeres are therefore represented as a motif in DNA sequences.

Telomerase's presence in humans is somewhat strange. It is located in the nucleus, which is unsurprising because that is where DNA replication takes place. However, telomerase activity is not present in all cells. It was found to be almost absent in the majority of normal adult tissues, including cardiac and skeletal muscle, lung, liver, and kidney. Because of this curious lack of telomerase activity, a theory arose connecting telomere length to aging and cell senescence. According to this theory, human somatic cells are born with a full number of telomeric repeats, but the telomerase enzyme is not present in some tissues. The cells of those tissues would lose about 50 to 100 nucleotides from each chromosome end each time they underwent replication and division. Eventually, the telomeres would cease to exist and the chromosomes themselves would start losing nucleotides, carrying genetic defects into their next division so that neither daughter cell would be viable. Thus, after a certain number of divisions, a cell would not have enough nucleotides and would die.[4]
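The arithmetic behind this theory can be sketched as a simple countdown. The following Python function is a toy model, not a description of real cell behavior: it assumes a fixed loss per division and no telomerase activity, and the starting telomere length of 5,000 nucleotides used in the example is a hypothetical round number chosen only for illustration. The loss of 50 to 100 nucleotides per division is the range quoted above.

```python
def divisions_until_exhausted(telomere_length_nt: int, loss_per_division_nt: int) -> int:
    """Count divisions until the telomere is used up, assuming a constant loss
    per division and no telomerase (a deliberately simplified model)."""
    if telomere_length_nt < 0:
        raise ValueError("telomere length must be non-negative")
    if loss_per_division_nt <= 0:
        raise ValueError("loss per division must be positive")

    divisions = 0
    remaining = telomere_length_nt
    while remaining > 0:
        remaining -= loss_per_division_nt  # end of the chromosome is not fully copied
        divisions += 1
    return divisions


# Hypothetical starting length; loss values taken from the range in the text.
for loss in (50, 100):
    print(loss, divisions_until_exhausted(5000, loss))
```

Under these assumptions the cell runs out of telomere after a bounded number of divisions, which is the qualitative point of the theory; real telomere dynamics are more variable than a constant per-division loss.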

Telomeres at the end of a chromosome.

The function of telomerase is to replace the short stretches of telomere that are gradually lost during cell division.[5] Under normal conditions without telomerase, a cell divides until it reaches a critical point known as the Hayflick limit.[6] In the presence of telomerase, however, the cell is able to replace the lost DNA and divide without limit. This continuous growth comes with a consequence, as it may eventually lead to cancerous cells.

While the details are not fully known, it would seem that shortened telomeres play a role in aging through the erosion of the DNA over time. The question arises whether telomerase could greatly extend the human lifespan, given its importance in the maintenance of telomeres.[7] Dr. Michael Fossel, a professor of clinical medicine at Michigan State University, has expressed his views on telomerase as a viable treatment for cell senescence.

However, several experiments have raised doubts about telomerase as an effective anti-aging treatment. In an experiment with mice having higher levels of telomerase, the mice also had a higher rate of cancer, which led to a shorter lifespan. In addition, telomerase favors tumorigenesis.[8] Telomerase fosters cancer development by allowing uncontrolled cell growth, which eventually proliferates into tumors. In fact, telomerase activity has been observed in approximately 90% of all human tumors, which suggests that the uncontrolled growth conferred by telomerase has a key role in cancer.

In addition to its possible use as an anti-aging treatment, telomerase has potential as a drug target against cancer.[9] Since it is necessary for the immortality of many cancer cell types, it is believed that if a drug could deactivate telomerase activity in a cell, telomeres would shorten, mutations would accumulate, cell stability would decrease, and the cancer would, in essence, be effectively treated. Experimental drugs have been tested in mouse models, and some have moved on to clinical testing.


Cancer Biology

The significance of studying telomeres can be found in telomerase, which rebuilds the telomere so that cells can keep dividing. Without telomerase, however, the telomere eventually shortens, causing the cell to die. In cancer cells, this enzyme keeps building telomeres long past the cell's average lifetime. These cells are then said to be “immortalized”, since they can divide endlessly, and the result is a tumor. Many researchers believe that telomere maintenance activity is characteristic of most human cancer cells. Though the mechanism behind this is not well understood, its discovery may reveal key elements of telomere function.
Telomerase is the natural enzyme used for telomere repair. It is highly abundant in stem cells, germ cells, hair follicles, and most cancer cells, but its expression is low or in some cases absent in somatic cells. Telomerase functions by adding bases to the ends of the telomeres. Cells with sufficient telomerase activity are considered immortal in the sense that they can divide past the Hayflick limit without entering senescence or apoptosis. For this reason, telomerase is viewed as a potential target for anti-cancer drugs such as telomestatin.

2009 Nobel Prize

The 2009 Nobel Prize in Physiology or Medicine was awarded to three scientists who discovered how chromosomes can be copied completely during cell division and how they are protected against degradation. They showed that the ends of the chromosomes, the telomeres, and the enzyme that forms them, telomerase, are central to protecting chromosomes from degradation: they identified telomerase and explained how telomeres protect the ends of the chromosomes and how they are built by telomerase.
If the telomeres become shortened, cells age. Conversely, if telomerase activity is high, telomere length is maintained and cellular senescence is delayed. This is the case in cancer cells, where telomerase allows the cell to divide without any limit. Certain genetic diseases are caused by a defective telomerase. These discoveries can thus be used to stimulate the development of new therapeutic strategies. Understanding such fundamental mechanisms is an important first step toward new treatments for cancer and other related diseases, as well as for aging.

Hayflick Limit

The Hayflick limit is the number of times a normal cell can divide before it stops dividing, based on the idea that its telomeres reach a critical length.[10] This limit was discovered in the 1960s by Leonard Hayflick, who demonstrated that the cells of a normal fetus divide around 40 to 60 times before entering cell senescence. Repeated mitosis shortens the telomeres, which eventually inhibits cell division, a process analogous to aging. The discovery of this limit, a pillar of biology, refuted the earlier contention of Alexis Carrel who, along with the majority of scientists of that period, believed cells were “immortal”.

Role of Telomere

Telomeres account for the bits of DNA lost at the ends of chromosomes during DNA replication. Since DNA polymerase synthesizes new DNA only in the 5′→3′ direction and needs a primer to start, the very end of the template strand cannot be fully replicated. This results in incomplete ends. However, telomeres are usually very long, ranging from 400 to 600 base pairs in yeast to many kilobases in humans. They are made of repeats six to eight base pairs long, which are usually rich in guanine bases. With long stretches of telomere at the ends of DNA strands, the incomplete strands of DNA still contain the genetic code.

Incomplete ends after replication.

Guanosine Tetraplex: a structure of DNA with four strands of DNA. Often the structure of telomere.

The shortening of telomeres induces cell senescence in humans. This mechanism appears to limit the formation of cancerous cells by capping the number of times a cell can divide. Telomere length has also been proposed in recent publications to account for aging in humans. Since cells replicate identically, there must be a reason why cells within a body lose function and viability with time. Telomeres may have some influence over the aging process, since every subsequent DNA replication results in the shortening of telomeres. Two aspects of this question are: (i) whether telomere length, as measured in specific cell populations in the body, correlates with longevity or disease; and (ii) whether telomere shortening in any cell population causes functional impairment of that cell population. Some argue that telomere length does not correlate with longevity, since mice have longer telomeres than humans yet live much shorter lives. Others argue that it does, since telomere length determines the number of times a cell can divide before it dies or reaches senescence.

Recent Publications

Recently it has been found that telomerase activity is inversely related to the length of the telomeres. In other words, telomere elongation happens more often on short telomeres than on long ones. The research showed a deficiency in telomerase activity at telomeres longer than 125 base pairs, and 2 to 3 times more telomerase activity at telomeres shorter than 125 base pairs. This preferential elongation has been demonstrated in yeast and mice, and now in human somatic cells. Kinetic data indicate that elongation in yeast cells is a single event that extends the telomere to a certain length, whereas in human cells elongation seems to be a gradual process. The researchers showed that telomerase adds a regulated length of telomere in each cell division. In human cells expressing telomerase, long telomeres were maintained but not elongated, whereas shorter telomeres were elongated, which suggests that telomeres cannot be extended indefinitely.[11]

Another interesting paper focused on the role of DNA damage response (DDR) proteins in telomere maintenance. The review says that early-stage DNA repair proteins have a significant role in telomere maintenance, whereas late-stage proteins usually do not take part in telomere repair. The interplay between these proteins and the proteins that cap and protect the telomeres is also very important. Many of the stronger DDR proteins inhibit cell replication, so it would be harmful to the organism for these proteins to take part in telomere repair. The protein caps on the telomeres inhibit the full DNA damage response, which keeps the stronger proteins from “repairing” the telomere ends. It still isn't clear why some DDR proteins participate in telomere maintenance and others do not, but it is clear that repairing a DNA break and repairing telomeres are two different cellular processes, with the former halting cell division.[12]


  1. “Telomeres, telomerase, and aging: Origin of the theory”. Alexey M. Olovnikov. 1999. Retrieved 2009-11-05.
  2. “Repeat Expansion–Detection Analysis of Telomeric Uninterrupted (TTAGGG)n Arrays”. [1]. 2007. Retrieved 2009-11-05. 
  3. “The Nobel Prize in Physiology or Medicine 2009”. [2]. 2009. Retrieved 2009-11-05. 
  4. “What are telomeres and telomerase?”. [3]. Retrieved 2009-11-05. 
  5. “Telomerase: regulation, function and transformation.”. [4]. Retrieved 2009-11-05. 
  6. “Hayflick Limit Theory”. [5]. Retrieved 2009-11-05. 
  7. “Extension of Life-Span by Introduction of Telomerase into Normal Human Cells”. [6]. Retrieved 2009-11-05. 
  8. “Anti-Aging Medicine”. João Pedro de Magalhães. 2008. Retrieved 2009-11-05. 
  9. Foreman, Judy. “Telomerase – a Promising Cancer Drug Stuck in Patent Hell?”. Retrieved 2009-11-05. 
  10. “Cellular Senescence”. João Pedro de Magalhães. 2008. Retrieved 2009-11-17. 
  11. Britt-Compton, Bethan; Capper, Rebecca; Rowson, Jan; Baird, Duncan M. (2009). FEBS Letters (583): 3076–3080. 
  12. Lyndall, David (2009). The EMBO Journal (28): 2174–2187. 

DNA does not always take the form of a double helix. It can often be found creating structures considered abnormal when compared to what is commonly considered DNA. Normally, DNA contains a B-form helix. Improper formation of base pairs can greatly affect DNA’s structure and flexibility.

Single-stranded nucleic acids can form hairpins. Such formations can affect transcription termination in prokaryotes. Double-stranded DNA can form related structures called cruciforms.


Hairpin loops are formed by a fold in a single strand of DNA, in which several bases remain unpaired where the strand loops back upon itself. A hairpin loop is only possible if the strand contains, in the correct sequence, bases complementary to those that appear earlier in the strand. For example, if a DNA strand contained CCGT followed by several bases and then ACGG, the strand would be capable of creating a hairpin loop by folding back on itself.

Hairpin loops can occur in both DNA and RNA, though in RNA the thymine base is replaced by uracil. The number of bases in the loop itself is variable, though loops shorter than three bases do not occur, as steric hindrance makes that configuration too unstable.
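The CCGT ... ACGG example works because ACGG is the reverse complement of CCGT. A minimal Python sketch of that check is given below. The function names are hypothetical, the test only looks at a stem starting at the first base of the strand, and the default minimum loop of three bases reflects the steric limit described above; it is a sequence-level check, not a prediction of whether a hairpin is thermodynamically stable.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def can_form_hairpin(seq: str, stem_len: int, min_loop: int = 3) -> bool:
    """True if the first stem_len bases can pair with a later stretch of the
    same strand, leaving at least min_loop unpaired bases in between."""
    if stem_len <= 0:
        raise ValueError("stem_len must be positive")
    seq = seq.upper()
    partner = reverse_complement(seq[:stem_len])
    # The partner stretch may start no earlier than stem_len + min_loop.
    return seq.find(partner, stem_len + min_loop) != -1


print(reverse_complement("CCGT"))
print(can_form_hairpin("CCGT" + "AAAAA" + "ACGG", stem_len=4))
print(can_form_hairpin("CCGT" + "AA" + "ACGG", stem_len=4))
```

For RNA the same idea applies with U in place of T in the complement table.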

Image: example of a hairpin (PDB 2qoz; the image is of a long-alpha hairpin).


A cruciform DNA structure appears as several hairpin loops arranged to create a cross-like structure composed of DNA.

DNA structure is formed by incomplete exchange of the strands between the double-stranded helices.

Cruciform DNA
Eukaryotic cells contain DNA-binding proteins that can specifically recognize cruciform DNA. Interactions with these ubiquitous proteins play a crucial role in the conformation of cruciform DNA.

An example of such a DNA-binding protein is Crp1p, which is found in the yeast Saccharomyces cerevisiae.


Triple Helix

The triple helix form of DNA is similar to the double helix DNA except that it contains another oligonucleotide that hydrogen bonds to the bases that are already included in the double helix strands of DNA.

The triple-stranded DNA was a very common hypothesis in the 1950s when scientists were having trouble figuring out the true structure of DNA. Watson and Crick, Pauling and Corey all published a triple-helix model proposal. Watson and Crick found problems with the model. The problems were as follows:

  1. Negatively charged phosphates near the axis will repel each other, leaving the question as to how the three-chain structure would stay together.
  2. In a triple-helix model (specifically Pauling and Corey's model), some of the van der Waals distances appear to be too small.[1]


Hinged DNA

Hinged DNA (H-DNA) is a triple helix structure that exists based on hydrogen bonds between DNA bases. The three strands pair by Hoogsteen base pairing, a variation of the base pairing found in nucleic acids such as the A-T pair or the G-C pair. The Hoogsteen base pair applies the “N7 position of purine base and C6 amino group which bind the Watson-Crick face of pyrimidine base.” It is also called H-DNA because of its dependence on hydrogen bonds. H-DNA can be found in vitro or during recombination and also in DNA repair.



G-quadruplexes are a family of four-stranded structures formed by guanine-rich sequences of nucleic acids. Members of this family share a common square arrangement of four guanines (a G-tetrad) centered around a monovalent cation and stabilized by Hoogsteen hydrogen bonding. The guanines may adopt either an anti or a syn alignment about the glycosidic bond. The backbone strands can also adopt a variety of directionalities: all four strands may run in the same direction; three strands may run in one direction and the fourth in the other; two adjacent strands may run in one direction and the other two in the opposite direction; or each strand may have antiparallel neighbors on both sides. The nucleotide sequence with the potential to form a G-quadruplex is GxNaGxNbGxNcGx, where x is the number of G residues in each run and Na, Nb, and Nc are loops of different lengths. G-quadruplexes can form in DNA, RNA, LNA, and PNA, and can be intramolecular, bimolecular, or tetramolecular. Their four-stranded motifs create four grooves of varying widths and depths. Their folding depends on many factors: DNA sequence, the presence of ions, temperature, and the presence of various ligands. They are of special interest because of their biological implications, specifically in telomeres and as contributors to gene regulation.
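The GxNaGxNbGxNcGx pattern lends itself to a regular-expression search. The Python sketch below scans a DNA sequence for candidate motifs. The function name is made up for this example, and the default thresholds (runs of at least three G residues, loops of one to seven bases) are assumptions commonly used in motif searches rather than values given in the text; a match only flags a sequence with the potential to fold, not a confirmed G-quadruplex.

```python
import re


def find_g4_motifs(seq: str, min_g: int = 3, min_loop: int = 1, max_loop: int = 7):
    """Return (start, matched_sequence) for each non-overlapping stretch that
    fits four G-runs separated by three loops. Thresholds are assumed defaults."""
    g_run = f"G{{{min_g},}}"                      # e.g. G{3,}
    loop = f"[ACGT]{{{min_loop},{max_loop}}}"     # e.g. [ACGT]{1,7}
    pattern = re.compile(f"{g_run}(?:{loop}{g_run}){{3}}")
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Four copies of the vertebrate telomeric repeat mentioned on this page.
telomeric = "TTAGGG" * 4
print(find_g4_motifs(telomeric))
print(find_g4_motifs("ATATATATATAT"))
```

The telomeric repeat yields one candidate motif, which fits the observation that G-rich telomere sequences can fold into G-tetrads.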

A shows a G-tetrad, B shows the anti and syn conformations of guanine, C shows the various directionalities of the backbone strands, D shows the different types of loops

Structure determination of G-quadruplexes by crystallography or solution NMR demonstrates significant deviations in conformation and loop geometry, suggesting heterogeneity in strand topology and loop conformation. Varying conformations can result in varying stability. Furthermore, studies of the various conformations reveal that the nature of the loop sequence and the formation of interactions between the loops and the quadruplex core are important elements in controlling quadruplex topology and stability. For example, in the binding of a quinacridine-based ligand to a G-quadruplex, interactions with the sides of the G-stack did not alter the topology, but interaction with the loop sequence altered the conformation of the loops. This hints that the loop sequences of the quadruplex are what actually moderate the binding affinity and specificity of the whole structure.

The four-stranded structure, with four grooves instead of the two found in typical DNA, provides a variety of surfaces for interactions with ligands. Aromatic compounds of various dimensions show favorable interactions with the planar surfaces of terminal guanine tetrads. Intercalation between layers of G-tetrads does not occur, however, because G-tetrads do not allow bulky aromatic compounds to insert themselves between layers of guanine.

Eukaryotic telomeres contain repeats of G-rich sequences that can fold into G-tetrads. It has been postulated that this structure plays an important role in cell aging and in human diseases such as cancer, which makes these structures targets for anticancer drugs.


  1. Problems with triple helix model:
  2. H-DNA:
  3. Cruciform DNA:
  4. Sannohe,Yuta, Sugiyama, Hiroshi. “Overview of Formation of G-Quadruplex Structures” Wiley Online Library. 01 Mar. 2010. 20 Nov. 2010.
  5. Martin Egli, Pradeep S Pallan. “The many twists and turns of DNA: template, telomere, tool, and target” Current Opinion in Structural Biology. 08 Apr. 2010. 20 Nov. 2010
  6. Lubos Bauer,Peter Javorsky, Katarina Tluckova, and Viktor Viglasky. “Evaluation of human telomeric g-quadruplexes: the influence of overhanging sequences on quadruplex stability and folding” Journal of Nucleic Acid. 10 Jun. 2010. 20 Nov. 2010


The structure of DNA does not only exist as secondary structures such as double helices, but it can fold up on itself to form tertiary structures by supercoiling. Supercoiling allows for the compact packing of circular DNA. Circular DNA still exists as a double helix, but is considered a closed molecule because it is connected in a circular form. A superhelix is formed when the double helix is further coiled around an axis and crosses itself. Supercoiling not only allows for a compact form of DNA, but the extent of coiling also affects the DNA’s interactions with other molecules by determining the ability of the double helix to unwind.

Although supercoiling provides an organized way to tightly compact DNA, the structure is relatively unstable as a result of torsional strain. To minimize the energy required to maintain the structure, the amount of twist and writhe is minimized. Twist refers to the number of helical turns the two strands make around each other. Writhe refers to the coiling of the double helix about the superhelical axis: the circular distortion, bending, and overall non-planarity of the DNA strand.

Supercoiling changes the shape of DNA. The benefit of a supercoiled DNA molecule is its compactibility. In comparison to a relaxed DNA molecule of the same length, a supercoiled DNA is more compact. How this is reflected in experimentation is that supercoiled DNA moves faster than relaxed DNA. Therefore, the structural differences can be analyzed in techniques such as electrophoresis and centrifugation.

Supercoiled DNA may hinder and favor the DNA to unwind and thus affect the interaction between DNA and other molecules in cells.

Positive and Negative Supercoiling

1. Negative supercoiling is the left-handed coiling of DNA, in which winding occurs in the counterclockwise direction, opposite to the right-handed turns of the double helix. It is also known as the “underwinding” of DNA.

2. Positive supercoiling is the right-handed coiling of DNA, in which winding occurs in the clockwise direction, the same direction as the turns of the double helix. It is also known as the “overwinding” of DNA.

In a negatively supercoiled molecule the helix itself is underwound and under little twisting stress, while the strain is taken up by the supercoil. Both prokaryotes and eukaryotes usually maintain their DNA in a negatively supercoiled state. Negative supercoiling is prevalent in nature because it prepares the molecule for processes that require separation of the DNA strands. For example, negative supercoiling is advantageous in replication because underwound DNA is easier to unwind, whereas positively supercoiled DNA is overwound and its strands are more difficult to separate.
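The sign convention can be made concrete with the linking number, a standard topological quantity that this section does not otherwise introduce. For closed circular DNA, Lk = Tw + Wr, and the superhelical density is σ = (Lk − Lk0)/Lk0, where Lk0 is the linking number of the relaxed molecule. The sketch below assumes a helical repeat of about 10.5 bp per turn for relaxed B-DNA; the function names and the 4200-bp plasmid are hypothetical and chosen only for illustration.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of relaxed B-DNA (bp per turn)


def relaxed_linking_number(n_bp, bp_per_turn=BP_PER_TURN):
    """Linking number Lk0 of a relaxed closed circle of n_bp base pairs."""
    return n_bp / bp_per_turn


def superhelical_density(lk, n_bp, bp_per_turn=BP_PER_TURN):
    """sigma = (Lk - Lk0) / Lk0; negative means underwound."""
    lk0 = relaxed_linking_number(n_bp, bp_per_turn)
    return (lk - lk0) / lk0


def classify(lk, n_bp, bp_per_turn=BP_PER_TURN):
    """Label a molecule by the sign of its superhelical density."""
    sigma = superhelical_density(lk, n_bp, bp_per_turn)
    if sigma < 0:
        return "negative (underwound)"
    if sigma > 0:
        return "positive (overwound)"
    return "relaxed"


if __name__ == "__main__":
    n_bp = 4200  # hypothetical plasmid, so Lk0 = 400
    for lk in (376, 400, 410):
        print(lk, classify(lk, n_bp), round(superhelical_density(lk, n_bp), 3))
```

With these assumed numbers, a linking number of 376 corresponds to a superhelical density of about −0.06, which is in the range usually quoted for DNA isolated from cells.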

Topoisomerases relieve the torsional strain that arises when the helix is unwound during DNA transcription and DNA replication. Afterwards, the DNA template is supercoiled again as it is packaged into chromatin. RNA polymerase also causes the DNA to be supercoiled in two different directions: the region that RNA polymerase has already passed becomes negatively supercoiled, while the region that it has not yet passed becomes positively supercoiled. Supercoils are generated by these processes.


Topoisomerases are enzymes responsible for the introduction and elimination of supercoils. Different topoisomerases act on positive and negative supercoils, and this specificity helps prevent distortion of the DNA. The two classes of topoisomerases are Type I and Type II. Type I enzymes catalyze the relaxation of supercoiled DNA, while Type II enzymes use the energy from ATP hydrolysis to change DNA topology; bacterial DNA gyrase, for example, adds negative supercoils to DNA. Both classes have important roles in DNA transcription, DNA replication, and recombination.

Topoisomerases form loops (unwound regions of the double helix) of negative supercoils. If the DNA lacks superhelical tension, there is no unwinding of supercoils.

Type I topoisomerase

Type I topoisomerases act by creating transient single-strand breaks in DNA. This class is further divided into type IA and type IB.

Type IA topoisomerases

The Type IA topoisomerase enzyme is a 695-residue monomer that relaxes negatively supercoiled DNA. Type IA cuts one strand of DNA and can catenate two circles of single-stranded DNA. It relaxes supercoiled duplex DNA one turn at a time. Type IA uses a strand-passage mechanism: denaturing a Type IA enzyme that has been incubated with single-stranded DNA yields linear DNA attached to the enzyme through a phospho-Tyr diester linkage.

Type IB topoisomerases

Type IB uses a controlled rotation mechanism to relax both negative and positive supercoils. Type IB cleaves a single strand of duplex DNA through nucleophilic attack by an active-site Tyr on the DNA backbone, yielding a 3′-linked phospho-Tyr intermediate and a free 5′-OH group. Type IB consists of several domains and subdomains. Interestingly, type IA topoisomerases form a covalent intermediate with the 5′ end of the DNA, while type IB topoisomerases form a covalent intermediate with the 3′ end. Historically, type IB topoisomerases were referred to as eukaryotic topo I, but they are present in all three domains of life.

Type II topoisomerase

Type II topoisomerase is an enzyme that requires ATP hydrolysis to complete a reaction cycle in which both strands of a DNA duplex are cleaved, a second duplex is passed through the break, and the break is resealed. In other words, Type II cuts both strands of one DNA double helix, passes another unbroken double helix through the gap, and then rejoins the cut strands. The class is split into two subclasses, type IIA and type IIB topoisomerases, which share similar structures and mechanisms. Examples of type IIA topoisomerases include eukaryotic topo II, E. coli gyrase, and E. coli topo IV. An example of a type IIB topoisomerase is topo VI. Introducing supercoils requires energy because supercoiled DNA is torsionally strained. By coupling the reaction to ATP hydrolysis, gyrase can introduce negative supercoils.

In bacteria, a well-known Type II topoisomerase is DNA gyrase, which acts similarly to human Type II topoisomerase. Certain antibiotics act on the bacterial enzyme by blocking the binding of ATP to gyrase, thereby inactivating the breaking and rejoining of bacterial DNA chains.
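One consequence of the two mechanisms, given here as standard textbook background rather than something stated above, is that a Type I enzyme changes the linking number in steps of 1 (one strand is cut per cycle), while a Type II enzyme changes it in steps of 2 (a whole duplex is passed per cycle). The toy sketch below counts catalytic cycles under that assumption. The function names are hypothetical, and real enzymes do not necessarily relax DNA completely.

```python
TYPE_I_STEP = 1   # assumed change in linking number per Type I cycle
TYPE_II_STEP = 2  # assumed change in linking number per Type II cycle


def cycles_to_relax(delta_lk, step):
    """Return (cycles, leftover |dLk|) when relaxing in fixed-size steps."""
    cycles, leftover = divmod(abs(delta_lk), step)
    return cycles, leftover


def gyrase_cycle(lk):
    """One ATP-driven gyrase cycle introduces negative supercoils (Lk - 2)."""
    return lk - TYPE_II_STEP


if __name__ == "__main__":
    delta_lk = -24  # hypothetical underwound plasmid
    print("Type I :", cycles_to_relax(delta_lk, TYPE_I_STEP))
    print("Type II:", cycles_to_relax(delta_lk, TYPE_II_STEP))
    print("Lk after one gyrase cycle from 400:", gyrase_cycle(400))
```

An odd linking-number difference cannot be removed completely in steps of 2, which is why the second value returned can be nonzero for the Type II case.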


Nucleosomes allow for the compact packing of linear DNA. A nucleosome is a complex of DNA and histones, consisting of about 146 base pairs of DNA wrapped in a left-handed superhelix around a histone octamer. Histones are small proteins that contain a large number of positively charged amino acids, such as lysine and arginine, which allow them to bind to the negatively charged DNA molecule. The histone octamer is composed of two copies each of H2A, H2B, H3, and H4. A fifth histone, H1, helps hold the two loops of DNA in place on the octamer. Nucleosomes are further arranged in a stacked helical complex. Through the extensive wrapping of DNA around the histones, as well as the helical arrangement of the nucleosomes, linear DNA is compacted. Further folding of the nucleosomes eventually forms a chromosome.

Chromatin refers to the structure of DNA and its accompanying histones. Chromatin is composed of repeating units called nucleosomes. The five major histones found in chromatin are H2A, H2B, H3, H4, and H1.

Histone genes are organized in gene clusters and are expressed during S phase. Once expressed, the histone proteins assemble into histone octamers. Through its interactions with 146 base pairs of the DNA double helix, a histone octamer becomes a nucleosome. The binding of histones to DNA depends on the amino acid sequence of the histones, not on the nucleotide sequence of the DNA.

Histones and Transcription Regulation

Histones appear to remain attached to the DNA even during transcription. Because nucleosomes are able to change shape and position, transcription can occur and RNA polymerase can move along the DNA strand. Slight loosening of the binding between the histones and DNA is accomplished by acetylation of the histones, which neutralizes their positively charged residues. Binding is made tighter when the acetyl groups are removed, which restores the positive charge of the histones; methylation, which does not itself change the charge, is also associated with tighter packing at some sites. By modifying the histones in this manner, gene transcription can be regulated.

Histone chaperones are proteins that mediate the assembly and disassembly of chromatin so that nucleosomes form correctly and fold into stable conformations. They shield the histones from forming incorrect and unwanted aggregates with DNA, which would otherwise arise from the strong electrostatic attraction between the two: DNA is a negatively charged molecule and histones are positively charged, so they have a strong affinity for each other. Histone chaperones, which are generally acidic (negatively charged) proteins, bind the histones and mask their positive charge, guiding them to form octamers and to bind DNA correctly.

There are different structural types of histone chaperones, including β-sandwich, α/α earmuff, β-propeller, and β-barrel chaperones. β-sandwich chaperones are monomers that form β-sheets with the histones. An example is ASF1 (anti-silencing function 1), which is associated with the effects of its overexpression in yeast. ASF1 is also the first histone chaperone used during assembly of chromatin. α/α earmuff chaperones are dimers that adopt α-helical conformations. An example is the NAP family of chaperones, which transport histones from the cytoplasm to the nucleus during chromatin assembly. β-propeller chaperones were the first chaperones to be characterized using NMR and crystallography techniques; these pentameric chaperones function in the storage of histones. β-barrel chaperones are hetero-oligomers that help facilitate transcription through chromatin. In addition, there are irregular or variant histone chaperones that do not fit into any specific structural category. These different types of chaperones are involved in different stages of the assembly or disassembly of chromatin.

The energetics of chromatin assembly and disassembly are regulated by histone chaperones. Assembly is an energetically favored process because the binding of histones to DNA forms a more stable structure, with a decrease in energy. Disassembly, on the other hand, is energetically unfavorable and requires ATP to break apart the stable histone/DNA interactions.

Nucleosome Sliding

Nucleosome sliding is a frequent result of energy-dependent nucleosome remodelling in vitro.

ATP-Dependent Nucleosome Sliding Mechanism

The paper “Mechanisms of ATP-dependent nucleosome sliding” by Gregory D. Bowman examines how ATPase motors engage and manipulate nucleosomal DNA and discusses possible mechanisms for ATP-dependent sliding of nucleosomes. ATPase motors of this kind are shared between chromatin remodelers and a variety of other protein machines. The ATPase motor generates torsional strain when it engages DNA at an internal site on the nucleosome; this strain in the nucleosomal DNA results from the motor acting at the SHL2 region. Protection of nucleosomal DNA between SHL2 and the entry/exit site increases when the Isw2 ATPase is activated. ATP-dependent crosslinking of the Isw2 subunit Dpb4 to SHL4 promotes hydrolysis-dependent changes. Iswi-type remodelers form template-committed complexes that allow nucleosomes to slide processively.

Bowman also explains possible variations of the bulge/loop propagation model using ATPase motors. One model suggests that the ATPase motor uses its translocase activity to pull DNA in from an entry/exit site in a continuous manner. This pumping allows a remodeler to create a bulge that would rapidly diffuse to the distant entry/exit site. Another model suggests that the histone-DNA contacts are disrupted by a DNA loop that the remodeler ATPase develops around the SHL2 region. This disruption pulls DNA in from the linker, and the ATPase motor would move toward the dyad along the DNA loop.

Chromatin Remodeler

Chromatin remodelers are mainly involved in DNA packaging and in facilitating transcript elongation. For example, when a DNA strand is wound onto nucleosomes for packaging into chromatin, chromatin remodelers arrange the nucleosomes at regular intervals for effective condensation of the DNA. Furthermore, in processes where nucleosomes have to be modified, chromatin remodelers may disassemble the nucleosome into histones or even detach the whole nucleosome from the DNA. The processes that require nucleosome modification by a chromatin remodeler include DNA repair, recombination, transcription, and replication. The following picture displays an example of how a chromatin remodeler may be used during transcription catalyzed by RNA Polymerase II.

RNA Polymerase II Transcription

Palindromic Sequencing

A palindromic sequence is a sequence of nucleotides within double-stranded DNA and/or RNA that reads the same from 5’ to 3’ on one strand as it does from 5’ to 3’ on the other, complementary strand. It is also known as a palindrome or an inverted repeat.

Base pairing within the DNA double helix is complementary: adenine (A) pairs with thymine (T) in DNA or uracil (U) in RNA, while cytosine (C) pairs with guanine (G). If a sequence is palindromic, the nucleotide sequence of one strand is therefore the same as that of its reverse complement. An example of a palindromic sequence is 5’-GGATCC-3’, which has the complementary strand 3’-CCTAGG-5’. This is the sequence that the restriction endonuclease BamHI binds to and cleaves at a specific cleavage site. When the complementary strand is read in its own 5’ to 3’ direction, the sequence is 5’-GGATCC-3’, which is identical to the first strand, making it a palindromic sequence.

Another restriction enzyme, EcoRI, recognizes and cleaves the following palindromic sequence:

5’ – G A A T T C – 3’

3’ – C T T A A G – 5’
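The test described above, comparing a strand with its reverse complement, can be sketched in a few lines of Python. This is a minimal illustration for DNA sequences written 5’ to 3’; the function names are hypothetical and the sketch does not handle RNA or ambiguous bases.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read in its own 5'->3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """A DNA site is palindromic if it equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for site in ("GGATCC", "GAATTC", "GGATCA"):  # BamHI site, EcoRI site, non-palindrome
        print(site, reverse_complement(site), is_palindromic(site))
```

Both recognition sites from the text pass the check, while a sequence such as GGATCA does not, because its reverse complement is TGATCC.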


Image of a palindrome in a DNA structure. A = Palindrome, B = Loop, C = Stem

Relationship between Sequence and Protein Structure

Many researchers have studied the relationship between palindromic sequences and protein structures. Studies have shown that the frequent appearance of palindromic sequences, also called palindromic peptides, in protein sequences is not just due to chance. Scientists suggest that these sequences are important for the structure and function of different proteins. These protein groups include DNA-binding proteins, ion channels and rhodopsin, and metal-binding proteins and receptors. By comparing palindromes with sets of sequences from databases, scientists can try to find the roles of palindromic sequences.

Another topic within palindromic sequences which is being studied is whether the symmetry of palindromic sequences affects the structure and folds of peptides. One hypothesis is that by reversing the sequence, the resulting folds would be mirror-images of the original fold. The conclusion states that because both the original and reverse proteins have identical amino acid compositions which lead to similar hydrophobic-hydrophilic patterns, the reversing sequence results in the same folds as opposed to the mirror-image folds. Another hypothesis guided by research is that by reversing a sequence, the fold could change or possibly be destroyed. This shows evidence that the similarity in reverse sequencing does not reflect structural similarity, which means that they do not form symmetrical protein structures.

Effect on genomic instability in yeast

Palindromic sequences have been tied to different genomic rearrangements in different organisms, depending on the length of the repeated sequences. Shorter palindromic sequences (shorter than 30 bp) are very stable, while longer sequences are not stable in vivo. These sequences occur in both eukaryotes and prokaryotes. They also increase inter- and intrachromosomal recombination between homologous sequences. Hairpin structures can form from palindromic sequences through base pairing within single-stranded DNA. These structures can be substrates for structure-specific nucleases and repair enzymes, which can lead to a double-strand break in the DNA. This in turn can lead to loss of genomic material and can stimulate meiotic recombination. A 140-bp mutated palindromic sequence inserted in yeast has been shown to lower postmeiotic segregation and increase the rate of gene conversion, while shorter sequences do the opposite. Research also found that double-strand breaks are induced during meiosis by the long 140-bp palindromic sequence. In the long hairpin structure, the stem-loop is not entirely covered, so the loop is exposed to the processing endonuclease, which makes nicks in the loop. The nick creates a gap that is repaired using the wild-type strand. The induction of double-strand breaks during meiosis is what causes genomic instability.

Likelihood of palindromic sequences in proteins

Few studies have focused on the significance of palindromic sequences in proteins, but those that exist tell us a good deal about the relationship between palindromic sequences and protein function. By understanding how these palindromic sequences form and what properties they have, researchers can tie them to functions. It has been found that decreasing the complexity of the amino acid composition increases the likelihood of a palindromic sequence. The likelihood of palindromic sequences in proteins may also be related to the frequent formation of alpha helices by palindromes.


(c) Acdx, from the Wikipedia Commons

Jankowski, Craig, Dilip K. Nag, and Farooq Nasar. “Long Palindromic Sequences Induce Double-Strand Breaks during Meiosis in Yeast.” National Center for Biotechnology Information. U.S. National Library of Medicine, 20 May 2000. Web. 7 Dec. 2012.

“Palindromic Sequences.” Wikipedia. Wikimedia Foundation, 11 Aug. 2012. Web. 7 Dec. 2012.

Sheari, Armita, Mehdi Kargar, Ali Katanforoush, Shahriar Arab, Mehdi Sadeghi, Hamid Pezeshk, Changiz Eslahchi, and Sayed-Amir Marashi. “A Tale of Two Symmetrical Tails: Structural and Functional Characteristics of Palindromes in Proteins.” National Center for Biotechnology Information. U.S. National Library of Medicine, 11 June 2008. Web. 7 Dec. 2012.
When a DNA solution is heated enough, the double-stranded DNA unwinds as the hydrogen bonds that hold the two strands together weaken and finally break. The process of separating double-stranded DNA into single strands is known as DNA denaturation, or DNA denaturing. The temperature at which the DNA is half denatured, meaning half double-stranded and half single-stranded, is called the melting temperature (Tm). The amount of strand separation, or melting, is measured by the absorbance of the DNA solution at 260 nm. Nucleic acids absorb light at this wavelength because of the electronic structure of their bases, but when two strands of DNA come together, the close proximity of the bases in the two strands quenches some of this absorbance. When the two strands separate, this quenching disappears and the absorbance rises by 30% to 40%. This increase is called hyperchromicity. The converse, the hypochromic effect, is the reduced absorption of ultraviolet light by stacked bases in a double helix.
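As a rough illustration of how a melting temperature might be read off an absorbance curve, the sketch below converts A260 readings into a fraction denatured and linearly interpolates the temperature at which that fraction reaches one half. The data points and function names are hypothetical, and taking the lowest and highest readings as the fully double-stranded and fully single-stranded baselines is a simplifying assumption.

```python
def fraction_denatured(absorbances):
    """Scale A260 readings to 0 (fully duplex) .. 1 (fully single-stranded)."""
    a_min, a_max = min(absorbances), max(absorbances)
    return [(a - a_min) / (a_max - a_min) for a in absorbances]


def hyperchromicity_percent(absorbances):
    """Percent rise in A260 between the lowest and highest readings."""
    a_min, a_max = min(absorbances), max(absorbances)
    return 100.0 * (a_max - a_min) / a_min


def melting_temperature(temps, absorbances):
    """Interpolate the temperature at which half of the DNA is denatured."""
    points = list(zip(temps, fraction_denatured(absorbances)))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve does not cross 50% denaturation")


if __name__ == "__main__":
    temps = [60, 70, 75, 80, 85, 90, 95]                # degrees C (made-up data)
    a260 = [1.00, 1.01, 1.05, 1.15, 1.30, 1.36, 1.37]   # made-up absorbances
    print("Tm (deg C):", round(melting_temperature(temps, a260), 1))
    print("Hyperchromicity (%):", round(hyperchromicity_percent(a260), 1))
```

For these made-up readings, the absorbance rises by 37%, within the 30% to 40% range quoted above, and the interpolated Tm falls between 80 and 85 °C.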

Applications of DNA denaturation

Sequence differences between two DNA molecules can also be detected by using DNA denaturation. The DNA is heated and denatured into the single-stranded state, and the mixture is then cooled to permit the strands to re-hybridize. Hybrid molecules form between similar sequences, and any differences between those sequences disrupt the base pairing.

What determines the Melting Temperature (Tm)?

While the ratios of G (guanine) to C (cytosine) and of A (adenine) to T (thymine) in an organism’s double-stranded DNA are fixed at 1:1, the GC content (the percentage of G + C) can vary considerably from one DNA to another. The GC content of a DNA has a significant effect on its Tm. Because G-C pairs form three hydrogen bonds while A-T pairs form only two, the higher the GC content, the higher the Tm. Thus, a double-stranded DNA rich in G and C needs more energy to be separated than one that is rich in A and T, which means a higher melting temperature (Tm). Above the Tm, DNA denatures, and below it, DNA anneals. Annealing is the reverse of denaturation.
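The dependence of Tm on base composition can be sketched numerically. The example below computes GC content and applies the Wallace rule, Tm ≈ 2(A + T) + 4(G + C) in °C, which is a common rule of thumb for short oligonucleotides and is not taken from this text. It ignores salt concentration and sequence context, so the output is only a rough estimate. The function names and sequences are hypothetical.

```python
def gc_content(seq):
    """Percentage of bases in seq that are G or C."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq):
    """Rough Tm estimate (deg C) for a short oligo: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Three hypothetical 16-mers with increasing GC content.
    for oligo in ("ATATATATATATATAT", "ATGCATGCATGCATGC", "GCGCGCGCGCGCGCGC"):
        print(oligo, gc_content(oligo), wallace_tm(oligo))
```

The estimate rises with GC content for oligonucleotides of equal length, which matches the trend described above.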

One question sometimes raised about thermal denaturation is how heat, which has no preferred direction, can produce the directed rotation needed to unwind a double helix. Heat is not a force applied along an axis; it increases the random thermal motion of the molecules. Once the hydrogen bonds between the base pairs are broken, the strands are free to rotate about one another, and this random motion carries them apart because the separated state is favored at temperatures above the Tm.

Other methods to denature DNA

Heating is not the only way to denature DNA. Organic solvents such as dimethyl sulfoxide and formamide, or high pH, can also break the hydrogen bonding between DNA strands. Low salt concentration can denature double-stranded DNA as well, by removing the ions that shield the negative charges on the two strands from each other.

A related question is how two strands that are wound around each other many times, often many hundreds of turns, actually separate after the hydrogen bonds have been broken. In linear DNA the ends are free to rotate, so the strands can unwind progressively. In covalently closed circular DNA the strands remain topologically linked and cannot separate completely unless a strand is cut, which in cells is the role of topoisomerases.

Side-by-side models, in which there is no net winding of the strands around each other, have been proposed as a way to avoid this unwinding problem. They are not supported by the structural evidence, however, and the double helix remains the accepted structure of DNA.
The Avery-MacLeod-McCarty experiment was presented by Oswald Avery, Colin MacLeod, and Maclyn McCarty in 1944. During the 1930s and early 1940s, Avery and MacLeod carried out this work at the Rockefeller Institute for Medical Research, and it continued after the departure of MacLeod. The work concerned the virulence (a measure of deadly potency) of bacteria. The experiment would allow them to determine whether rough bacteria could be transformed into smooth bacteria, that is, whether the genetic information causing the transformation was passed along. By isolating and purifying the chemical component responsible, they could deduce whether it had the characteristics of a protein or of a DNA molecule.


The purpose behind this experiment was to better understand the chemical component that carries genetic information and transforms one bacterial strain into another.


Bacteria grown in petri dishes form spots, or colonies, as they multiply under suitable conditions. Colonies of virulent (deadly) bacteria look smooth, like tiny droplets, whereas colonies of non-deadly bacteria have rough, uneven edges. While studying a kind of pneumonia caused by bacteria in mice, researchers isolated a “variant” (mutant) rough strain that did not kill the mice. In the mouse experiments on which Avery’s group built, first performed by Frederick Griffith in 1928, a mouse was injected simultaneously with “boiled” (dead) smooth bacteria and live rough bacteria. After a short while, surprisingly, the mouse died. When samples were taken from the dead mice and cultured in a petri dish, what grew in the culture was in fact the smooth, deadly bacteria. This suggested that something from the “dead” bacteria had converted the rough bacteria into smooth bacteria. The rough bacteria had been permanently converted, or transformed, into the smooth, dangerous bacteria. Control experiments confirmed that smooth bacteria could not be grown from the boiled culture and that no disease resulted if the dead smooth bacteria were injected alone. All of this implied that a chemical component of the smooth bacteria survived and transformed the rough bacteria into smooth ones. By isolating and purifying that chemical component, Avery, MacLeod, and McCarty showed that it was DNA, not protein, that transferred the genetic information from the smooth to the rough bacteria.

Simplified Experiment

Avery-MacLeod-McCarty Experiment

Here is a simpler demonstration of the experiment by Oswald Avery, Colin MacLeod, and Maclyn McCarty. There are two sets of bacteria: one is smooth (virulent) and one is rough (nonvirulent).

1) Deadly encapsulated bacteria are injected into the mouse – the mouse dies.

2) Non-encapsulated, nonvirulent bacteria are injected into the mouse – the mouse lives.

3) The virulent bacteria are heated at a temperature that kills them and are then injected into the mouse – the mouse lives.

4) The heat-killed virulent bacteria are mixed with living non-encapsulated, nonvirulent bacteria, and the mixture is injected into the mouse – the mouse dies.

5) Finally, live, non-encapsulated harmless bacteria are mixed with DNA extracted from the heat-killed lethal bacteria. These “harmless” bacteria are then injected into the mouse – the mouse dies.

From these experiments, Avery and his group showed that nonvirulent bacteria become deadly after mixing with the DNA of the virulent bacteria. This demonstrates that the nonvirulent bacteria became virulent because of genetic information that originally came from the virulent bacteria. The protein from the virulent bacteria had already been denatured in Step 3. Thus, it was DNA and not protein that transferred the genetic information to the nonvirulent bacteria.

Griffith Experiment

In 1928, Frederick Griffith performed a transformation experiment using pneumonia-causing bacteria and mice. This experiment provided evidence that some particular chemical within cells is the genetic material. The objective of the experiment was to find the material within the cells that is responsible for carrying genetic information.

For the experiment, Griffith used Streptococcus pneumoniae, a bacterium that causes pneumonia. The bacterium has two strains, a smooth strain and a rough strain. The smooth strain causes pneumonia and has a polysaccharide coating around it. The rough strain does not cause pneumonia and lacks the polysaccharide coating. For his first experiment, Griffith took the S strain (smooth strain) and injected it into mice. He found that the mice contracted pneumonia and died. He then took the R strain (rough strain) and injected it into mice, and found that they did not contract pneumonia and survived the injection.

From these first two experiments Griffith supposed that the polysaccharide coating on the bacteria somehow caused the illness, so he used heat to kill bacteria of the S strain and injected the dead bacteria into mice. He found that the mice lived, which indicated that the polysaccharide coating was not what caused the disease, but rather something inside the living cell. He then hypothesized that the heat used to kill the bacteria had denatured a protein within the living cells that caused the disease. Next he injected mice with a mixture of heat-killed S strain and live R strain, which resulted in the mice dying.

Griffith performed a necropsy on the dead mice and isolated S strain bacteria from the corpses. He concluded that the live R strain bacteria must have absorbed genetic material from the dead S strain bacteria. This is called transformation, a process in which one strain of a bacterium absorbs genetic material from another strain and turns into the type of bacterium whose genetic material it absorbed. Since heat denatures proteins, protein was unlikely to be the genetic material, and the evidence pointed instead to DNA. Griffith’s experiment was a precursor to the Avery experiment. Avery, MacLeod, and McCarty followed up on it because they wanted a more definitive experiment and answer.

Avery, MacLeod, and McCarty used heat to kill virulent Streptococcus pneumoniae bacteria and extracted RNA, DNA, carbohydrates, lipids, and proteins, all of which were considered possible carriers of genetic information, from the dead cells. Each type of molecule was added to a culture of live non-virulent bacteria to determine which was responsible for changing them into virulent bacteria. DNA was the only molecule that turned the non-virulent cells into virulent cells, and they concluded that it was the genetic material within cells.


Lockshin, Richard A., The Joy of Science 2007.


In 1944 Oswald Avery and colleagues carried out an experiment using pathogenic bacteria to determine which material contained genetic information. Their experiment concluded that it is DNA, not protein, that is the hereditary material. Despite these findings, the popular and widely accepted view remained that protein encoded genetic information, on account of its diversity of function and its much greater variety compared to DNA. To gain more evidence on the role of DNA, the scientists Alfred Hershey and Martha Chase decided to perform a simple but effective experiment involving bacteriophages.


To understand the experiment, we must first examine the vector that played a crucial role in it: the bacteriophage. Bacteriophages are viruses that infect bacteria such as Escherichia coli. They consist of a protein coat, collar, base plate, tail fibers, and, most importantly, a head that houses the genetic material. They have a feature that makes them the perfect candidate for testing whether DNA or protein carries the genetic information: an outer capsule of protein surrounds an inner core of DNA. Bacteriophages, being viruses, are unable to proliferate on their own, as they lack the necessary machinery. Viruses invade a host cell, inject their genetic material, and have the host replicate the viral genes. Knowing this, Alfred Hershey and Martha Chase saw that if they labeled the bacteriophages, they would be able to track what is passed on to the host bacteria: the labeled DNA or the labeled protein coat.


A bacteriophage was taken and its encased DNA was labeled with radioactive 32P, while its protein coat was left nonradioactive. The bacteriophage was exposed to a sample of bacteria. The phage attached to the surface of the bacterial cell and injected the labeled DNA. The sample was then chilled to arrest growth and shaken vigorously for several minutes in a Waring Blender. This process separated the phage coats from the surface of the bacteria. The sample was then centrifuged at high speed. The bacterial cells were at the bottom of the tube and the phage particles were in the supernatant. Hershey and Chase discovered that there was no disruption in the reproduction of phages inside the bacteria. A new generation of viruses had successfully propagated inside the host cell, and these phages exhibited 32P radioactivity in their own DNA.

Another set of bacteriophages was then examined, this time with a protein coat labeled with radioactive 35S and nonradioactive DNA. The same procedure was followed: the phage attached to the bacterial wall and was allowed to inject its genetic material. Vigorous shaking of the bacteria caused the radioactive viral sheath to detach from the bacteria. Injection of the viral DNA into the bacteria still occurred, and new phages were observed to have been produced. However, analysis of the new phages inside the bacteria showed that they were not radioactive, although radioactivity should have been present in the new phages if protein were in fact the genetic material passed on to the progeny. This experiment therefore illustrated that DNA, not protein, is the source of genetic information.



The first bacterial cell contained phages with observable radioactivity, illustrating that the radioactive label present in the parent phage was passed on to the new phages. The second bacterial cell, however, showed no hint of 35S, showing that the label was removed along with the protein coat and did not enter the bacteria. Hershey and Chase therefore deduced that the genetic material being passed on is DNA and not protein, as had previously been accepted.

This famous 1952 experiment allowed Hershey and Chase to demonstrate that it was DNA, not protein, that functioned as the T2 phage’s genetic material. Viral proteins, labeled with radioactive sulfur, remained outside the host cell during infection. In contrast, the viral DNA, which was labeled with radioactive phosphorus, entered the bacterial cell. They concluded that DNA is in fact the material within cells that contains the genetic information.


Berg, Jeremy, John Tymoczko, Lubert Stryer. Biochemistry.

Historical information

The determination of the DNA structure would not have been possible without the work of Erwin Chargaff, an Austro-Hungarian biochemist. Chargaff did his first work on lipids and lipoproteins, but after reading about an experiment by Oswald Avery showing that DNA was the material encoding genetic information, he turned his attention to DNA.

The tetranucleotide hypothesis, proposed by Phoebus Levene, was the mainstream theory in Chargaff’s time. The theory suggested “that DNA was made up of equal amounts of four bases – adenine, guanine, cytosine, and thymine – but that it was organized in a way that was too simple to enable it to carry genetic information.” (As was established later, the four bases are held together by hydrogen bonds and lie on the inside of the DNA helix, while the sugar-phosphate backbone is on the outside; the two strands are complementary, so the sequence of one determines that of the other.) Despite Avery’s experimental results showing that DNA encodes life, the scientific community was convinced that DNA was too simple to carry genetic information. Chargaff was not satisfied with the tetranucleotide hypothesis because so little data supported it.


Chargaff and his colleagues collected DNA samples from several sources. The DNA they collected was subjected to acid, which hydrolyzed the phosphodiester bonds and broke up the backbone. Once the phosphodiester bonds were broken, the individual components could be separated, using the then fairly new technique of paper chromatography, and were free to be analyzed. Ultraviolet spectrophotometry was used to measure the exact amounts of the bases present in the DNA sample.

UV spectrophotometry showed that the four bases, the purines (adenine and guanine) and the pyrimidines (cytosine and thymine), were not all present in equal amounts. Chargaff and his partners thus showed that the tetranucleotide hypothesis was wrong in assuming that all four bases occur in equal amounts; in other words, the G + C content of a DNA sample need not equal its A + T content. (In RNA, thymine is replaced by uracil.) What Chargaff noticed, however, was that although the four bases were not all equal, certain bases were equal to each other: the amount of guanine matched the amount of cytosine, and the same held true for adenine and thymine. The A/T and C/G ratios held true for all organisms and for both of the separated strands. This proportionality between a purine and a pyrimidine, together with its holding for both strands, would be crucial in determining the helical structure of DNA, although Chargaff himself did not see it.

The experiment gave two discoveries, which are now summarized as Chargaff’s Rule:

1. The number of adenine bases is equal to the number of thymine bases, and the number of cytosine bases is equal to the number of guanine bases:
A = T
C = G
A + T + C + G = 100%

2. The proportions A:T and C:G hold true for both strands.

For example, in human DNA the four bases adenine (A), thymine (T), cytosine (C), and guanine (G) are present in these percentages: A = 30.9% and T = 29.4%; G = 19.9% and C = 19.8%. These approximate A = T and G = C equalities display Chargaff’s Rule, which remained unexplained until the discovery of the double helix by Watson and Crick.
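The human figures quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `chargaff_ratios` and the dictionary layout are assumptions made for the example, not part of any standard library.

```python
# Base composition of human DNA in percent, as quoted in the text above.
human = {"A": 30.9, "T": 29.4, "G": 19.9, "C": 19.8}


def chargaff_ratios(comp):
    """Return the ratios that Chargaff's Rule is concerned with.

    comp maps each base letter (A, T, G, C) to its percentage in the sample.
    """
    return {
        # Expected to be close to 1 by Chargaff's Rule.
        "A/T": comp["A"] / comp["T"],
        "G/C": comp["G"] / comp["C"],
        # Purines (A, G) versus pyrimidines (T, C); also close to 1.
        "purine/pyrimidine": (comp["A"] + comp["G"]) / (comp["T"] + comp["C"]),
        # Not constrained by the rule; differs between organisms.
        "(A+T)/(G+C)": (comp["A"] + comp["T"]) / (comp["G"] + comp["C"]),
    }


if __name__ == "__main__":
    for name, value in chargaff_ratios(human).items():
        print(f"{name}: {value:.3f}")
```

The first three ratios come out close to 1, while (A+T)/(G+C) does not, which is the observation that contradicted the tetranucleotide hypothesis.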


Berg, Jeremy. Biochemistry. 6th edition. ISBN-13 9780716787242

Campbell, Neil. Biology. Pearson Publishing. Dec 2004.

Watson, James. DNA : The Secret of Life. Knopf Publishing Group. Aug 2004.

Inspirations to the Discovery of DNA Structure

James Watson began his research on DNA structure when he was in college. In 1945, during his third year of college, he read Erwin Schrödinger’s What Is Life? and took away the message that genes are the key components of living cells, so “we must know how genes act”. In 1951, at an international conference in Naples, Maurice Wilkins of King’s College, London, showed his clear X-ray pictures of DNA, which Watson saw. Determined to work on DNA structure, Watson moved to Sir Lawrence Bragg’s biophysics unit at the Cavendish Laboratory in Cambridge, England, where he met biophysicist Francis Crick. Many scientists, including Rosalind Franklin, had begun research on DNA structure with the help of X-ray diffraction. During the same year, Franklin held a seminar at King’s College in London, to which Watson was invited. Her X-ray photo revealed the physical structure of DNA as a helix. During the seminar, Watson learned that Franklin’s research indicated that DNA had a helical structure consisting of two to four interlaced helical chains. Each helix had a phosphate-sugar backbone with attached bases (adenine, guanine, thymine, and cytosine). The bases were shown to be attached to the inside of the helix, possibly forming links between the helical chains. After Franklin’s seminar, Watson decided to build DNA models.

Continuation on the Discovery

However, the diffraction pictures predicted by these models did not fit those of real DNA. The models that Watson built turned out to be wrong (the bases were on the outside of the helix and the helix was dehydrated) because he had misinterpreted Franklin’s findings. Watson’s and Crick’s research on DNA structure was terminated by King’s College, and they had to return to their previous research: tobacco mosaic virus (TMV) for Watson and proteins for Crick.

Although banned from researching the structure of DNA, Watson was able to continue because one of the main components of TMV is nucleic acid, and Crick continued with it outside of his official research. In 1952, Watson described Alfred Hershey’s discovery that the genetic material of viruses is DNA, comparing the DNA in virus heads to “a hat in a hatbox”. Watson and Crick also had a disastrous meeting with Erwin Chargaff of Columbia University, who had discovered the ratios of the amounts of the DNA bases. From John Griffith, the nephew of Frederick Griffith, whose work had helped establish that DNA is the carrier of genetic information, Crick learned that guanine (G) is attracted to cytosine (C), and adenine (A) to thymine (T). Crick deduced that the bases must fit together like two interleaved decks of cards, stacked on top of one another inside the entwined backbones.

Watson was convinced that DNA must be helical, based on Crick’s proposed DNA structure and the X-ray diffraction plates. When Franklin showed Crick and Watson her X-ray pictures of DNA, the pictures did not show the radial symmetry necessary for helices, but they did show that the crystals were overlapping.

In the autumn of 1952, Watson became friends with Linus Pauling’s son, Peter. At that time, Linus Pauling was one of the few people in the scientific community who pondered the importance of DNA structure. From Peter, Watson learned that Linus Pauling had written a paper on DNA structure proposing three helically entwined chains with the sugar-phosphate backbone at the center of the coil, a structure that outdated X-ray pictures appeared to “prove”. This structure was a triple helix, not to be confused with the alpha helix, the protein structure Pauling had discovered earlier. Watson immediately knew that Pauling’s structure was incorrect because of the previous models that he himself had built. In fact, Pauling’s structure left out an important detail: he had not assigned ionization charges to the phosphate groups. With no electric charge holding the long thin chains together, the chains would unravel and fall apart; without the charges, the nucleic acid was not even an acid.

Watson and Crick knew that Linus Pauling was their main competitor in determining the structure of DNA. Knowing that one of the greatest scientists had made several mistakes in deducing the DNA structure, Watson and Crick resolved to tackle the problem at the Cavendish Laboratory. They worked on the DNA model using metal plates and Franklin’s X-ray pictures of DNA, provided by Maurice Wilkins and Max Perutz. Besides matching the bases, they also determined that the width between the two DNA strands must be less than two nanometers. In order to fit the bases inside the strands, Watson first believed that like bases should be paired with each other. However, pairs of like bases could not be fitted within such a small width. Watson then discovered that the bases in their keto forms pair as A-T and C-G, and that these base pairs fit inside the double strands. In five weeks, Watson and Crick built a DNA model that is indeed the correct structure of DNA.

On April 25, 1953, Watson and Crick published their article “Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid” in Nature, becoming the first to publish the structure of DNA as a double helix.

Importance of Discovery

This discovery shed light on how genetic material could be passed on from generation to generation, and showed how simple the transfer of genetic material is. In fact, our present understanding of the storage and utilization of a cell’s genetic information is based on work made possible by this discovery.

DNA Structure – Leading to Function

After looking at the X-ray diffraction images made by Rosalind Franklin, Watson and Crick were able to deduce that the shape of DNA was a double helix, and from Chargaff’s experiment they were able to deduce that G pairs with C and A pairs with T. The base pairs are of equal length and fit exactly between the two sugar-phosphate chains. Watson and Crick hypothesized that the bonds holding the two chains together are hydrogen bonds between the paired bases, which are easily broken. The discovery of the DNA structure thus gave them a very good idea of how DNA might replicate itself, and thus of how genetic material is passed on.

In their article, Watson and Crick claim that DNA is a double helix, and they criticize Pauling’s earlier attempt to define the structure, noting that it lacked the much-needed hydrogen bond stabilization and underestimated the van der Waals interactions of base stacking. The helix is right-handed, and the two chains run in opposite directions. The bases point toward the inside of the helix, and the sugar-phosphate linkages form the outer backbone. The helix repeats every 10 residues, or 34 Angstroms (3.4 Angstroms per residue), as seen in the crystallographic data from Rosalind Franklin. The diameter of the helix was found to be 20 Angstroms, and there is a rotation of 36 degrees per base, giving 10 bases every 360 degrees. The most innovative idea of Watson and Crick’s model was that the two chains are held together by the purine and pyrimidine bases. Through hydrogen bonding, a purine must pair with a pyrimidine, creating a complementary pair. Using experimental data showing that the amounts of adenine and thymine were very close, as were those of guanine and cytosine, they stated that adenine bonds to thymine and guanine bonds to cytosine. They arrived at this by comparing the ratios A:T, C:G and A:G, finding that the first two were the closest to 1 whereas the third varied. This led them to conclude that A bonds only with T and C only with G. The DNA nucleotide must also contain deoxyribose and not ribose, because the extra oxygen on ribose would make too close a van der Waals contact and interfere with the structure.
They also found that each of the bases is capable of tautomerizing between the enol and keto forms. Experimentally, it was determined that the keto form predominates at physiological pH. This also suggested how DNA may denature as the pH changes, through conversion from the keto to the enol form.
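The helix dimensions quoted above are related by simple arithmetic, sketched below in Python. The constant and function names are illustrative choices for this example; the numerical values are the ones given in the text for the Watson and Crick model.

```python
# Values quoted in the text for the Watson-Crick model of DNA.
RISE_PER_RESIDUE_ANGSTROM = 3.4  # distance between adjacent bases along the axis
RESIDUES_PER_TURN = 10           # bases in one full turn of the helix


def helix_repeat(rise=RISE_PER_RESIDUE_ANGSTROM, per_turn=RESIDUES_PER_TURN):
    """Length of one full helical turn, in Angstroms."""
    return rise * per_turn


def rotation_per_residue(per_turn=RESIDUES_PER_TURN):
    """Rotation about the helix axis contributed by each base, in degrees."""
    return 360.0 / per_turn


def duplex_length(n_base_pairs, rise=RISE_PER_RESIDUE_ANGSTROM):
    """Approximate axial length, in Angstroms, of a stretch of n base pairs."""
    return n_base_pairs * rise


if __name__ == "__main__":
    print(f"repeat per turn: {helix_repeat():.1f} Angstroms")
    print(f"rotation per base: {rotation_per_residue():.1f} degrees")
    print(f"length of 100 base pairs: {duplex_length(100):.1f} Angstroms")
```

With 10 residues per turn and 3.4 Angstroms per residue, the repeat is 34 Angstroms and each base contributes 36 degrees of rotation, matching the figures in the text.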

Later Discoveries

Watson and Crick’s discovery led to many new investigations, such as the structure of RNA, how DNA contains all the information for protein production, and the Human Genome Project, which set out to map all human genes (once estimated at about 100,000, now thought to be roughly 20,000 protein-coding genes).

Although the discovery of the structure of DNA is attributed to Watson and Crick, a key player in helping them discover this structure was the scientist Rosalind Franklin. Franklin, along with Maurice Wilkins, worked on DNA, applying X-ray crystallography to find out its structural properties. X-ray crystallography involves exposing a crystal specimen (here, DNA) to X-rays to determine the locations of the atoms in the “molecules that comprises basic unit of crystal called unit cell”. The task, however, was not an easy one.

Obtaining a clear diffraction pattern of an object requires that the crystal be pure and the X-ray beam strong enough. However, as Franklin realized, DNA existed in two forms in equilibrium, which resulted in a very unclear diffraction pattern. These forms were the A form, which is the dehydrated form of DNA, and the B form, which forms a long, fibrous structure under humid conditions. Franklin, applying her chemistry background, proceeded to isolate these two forms using clever laboratory techniques such as “manipulation of the critical hydration of her specimens” [2]. The A and B forms were then separated and subjected to X-ray crystallography, yielding the pictures that would be the basis of Watson and Crick’s helical DNA structure, among them the famous Photo 51 of the B form of DNA.

Fiber Diffraction

Fiber diffraction is a method used to determine structural information about a molecule from X-ray scattering data. Rosalind Franklin used this technique to discover structural information about DNA. In the experiment, a fiber is placed in the path of an X-ray beam at a right angle to it, and the diffraction pattern is recorded on the film of a detector placed a few centimeters away from the fiber. The fiber diffraction pattern is a two-dimensional pattern showing the helical symmetry of a molecule, rather than the three-dimensional symmetry obtained by X-ray crystallography. A good fiber diffraction pattern exhibits four-quadrant symmetry; the axis aligned with the fiber is called the meridian, and the axis perpendicular to the fiber is called the equator. Franklin obtained a diffraction pattern using a non-crystalline DNA fiber, and from it she deduced the B form of DNA.


  • The diffraction pattern obtained by Franklin and Wilkins showed an X pattern, which hinted at a two-stranded helical form.
  • They also observed that the pattern was consistent and inferred that the helix’s diameter must also be consistent.
  • The helical turn of DNA correlates with the horizontal lines in the picture and measures 34 Angstroms. They also calculated that the gap between base pairs was 3.4 Angstroms, as measured by the distance from the center of the X to the ends. Simple math (34 / 3.4) then gives 10 nucleotides per turn.

Franklin and Wilkins also showed that the sugar-phosphate backbones are on the outside of the helix and not on the inside, as had previously been thought. They came to this conclusion from the A and B forms of DNA: the existence of hydrated and dry forms showed that water could easily come in and bind to DNA, which could only happen if the sugar-phosphate backbones were on the outside.


– showed that DNA’s helical structure is composed of two strands

– established that DNA’s diameter is uniform throughout

– calculated that one helical turn is 34 Angstroms, that the distance between base pairs is 3.4 Angstroms, and that there are 10 nucleotides per helical turn

– showed that the sugar-phosphate backbones are located on the outside of the structure


Berg, Jeremy. Biochemistry.

DNA replication is required for all cell division, which allows organisms to grow. In DNA replication, the two strands of the parental double helix are separated, and each serves as the template for a new complementary strand, so that each daughter molecule carries exactly the same genetic information as the original. The point at which strand separation starts is called the “origin”. The double-stranded structure of DNA aids the replication mechanism: once the two strands are separated, the complementary strand of each is recreated by DNA polymerase, an enzyme that specializes in making complementary strands. It finds the correct complementary base for each position on the template and extends the new strand in the 5′ to 3′ direction. Because each daughter molecule preserves one of the original strands, the process is called “semiconservative replication”.

DNA replication is essential in the life cycles of biological organisms. It is initiated when the double-stranded DNA located at the origin of replication is separated, or melted. Once the double-stranded DNA is melted, the melted region is propagated and a mature replication fork forms. DNA melting and replication fork formation are coordinated by initiators, helicases, and other cellular factors. Recent advances in structural biochemistry studies of initiators and replicative helicases have focused on archaeal and eukaryotic cells. The results of these studies have provided new insight into possible mechanisms of the early stages of DNA replication.
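The template-copying rule described above can be sketched in a few lines of Python. This is a toy model, not a simulation of the enzymes involved: strands are plain strings written 5′ to 3′, and the names `complement_strand` and `replicate` are hypothetical helpers introduced only for this illustration.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template):
    """Return the strand complementary to `template`.

    Both the template and the result are written 5' to 3'. Because the two
    strands of a duplex are antiparallel, the template is read in reverse.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template))


def replicate(duplex):
    """Semiconservative replication of a duplex given as (top, bottom).

    Each daughter duplex keeps one parental strand and gains one newly
    synthesized complementary strand.
    """
    top, bottom = duplex
    daughter_1 = (top, complement_strand(top))        # old top, new bottom
    daughter_2 = (complement_strand(bottom), bottom)  # new top, old bottom
    return daughter_1, daughter_2


if __name__ == "__main__":
    parent = ("ATGCCGTA", complement_strand("ATGCCGTA"))
    d1, d2 = replicate(parent)
    print("parent:    ", parent)
    print("daughter 1:", d1)
    print("daughter 2:", d2)
```

Both daughters come out identical to the parent, and each contains exactly one of the original strands, which is what "semiconservative" means.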

Genomic DNA replication is an essential process common to all living things. Replication can be divided into initiation, elongation, and termination steps.

Initiation of DNA replication

During initiation, initiators recognize and then bind the replication origin DNA, converting it to a replication fork. Initiation consists of the following steps: initiators assemble around the origin DNA, and the dsDNA origin is melted. The melting of dsDNA produces a replication fork on each side of the origin to allow bi-directional replication. Before this can happen, however, there are topological limitations that must be overcome to convert the melted origin into a fork structure.
Biochemical methods can be used to detect the initial melting of origin dsDNA that is induced by the assembly of initiators at the origin. In archaeal and eukaryotic cellular systems, the timing of origin melting is still uncertain. However, origin melting has been shown to be induced by the assembly of LTag. SV40 LTag is capable of inducing origin melting and unwinding, and it is therefore considered to be the initiator in the eukaryotic system. It has been used as a model to study the origin recognition, assembly, and melting processes.
To convert the melted dsDNA origin into an active replication fork, the assembled initiators expand the melted region and position the helicase on the fork.

The initiation step is one of three steps in DNA replication (along with elongation and termination). In initiation, replication proteins called initiators convert the origin DNA into a replication fork. This is accomplished first by the initiator proteins assembling around the DNA, which causes melting of the dsDNA (double-stranded DNA) origin. Origin melting then starts to produce a replication fork on each side of the melted origin, giving bi-directional replication. Ring-shaped helicases assist in this process. However, the mechanism by which the initiators and helicase melt and unwind the origin DNA is not well understood, owing to the lack of high-resolution structures of the intermediates.

In eukaryotic and archaeal cellular systems, the initiator proteins include Orc, Cdc6, Cdt1, and the MCM (mini-chromosome maintenance) helicase. MCM is one of the most important factors in the formation of the unwound fork. MCM forms hexamers that can dimerize into double hexamers. The helicase SV40 large T antigen (LTag) is able to recognize the origin DNA and can melt and unwind the DNA into a replication fork without the use of cofactors. SV40 LTag is considered the archetypal initiator/helicase in eukaryotic systems and is a model for studying recognition, assembly, and melting.

Crystal structures of the LTag hexamer reveal a channel 13-17 Å wide, which is large enough for ssDNA to pass through but not dsDNA (20 Å). It is believed that melted ssDNA is encircled in the central channel of the hexameric helicase, even during assembly at the origin.

LTag also has β-hairpins in the central channel that are configured in a planar arrangement. The β-hairpins form two adjacent planar rings with the DR/F loops, which contribute to the narrowest part of the channel in the AAA+ domain. It is an open question whether LTag expands to accommodate dsDNA or whether the dsDNA is modified by the initiator/helicase to fit the narrow channel. For the latter to occur, LTag must squeeze and crush the dsDNA, which disrupts the base pairs and melts the dsDNA. This model is often referred to as the squeeze-to-open model.

The most widely accepted model for fork unwinding is that of a ring-shaped helicase that encircles one DNA strand and migrates along it, splitting the dsDNA into ssDNA.

In prokaryotic cells, bacterial replicases contain a polymerase, polymerase III (Pol III), a β2 factor, and a DnaX complex. They are highly processive, yet cycle rapidly during Okazaki fragment synthesis. DnaA (an origin recognition protein) can start the melting of the origin into single-stranded DNA (ssDNA). The ssDNA is the site for loading the hexameric helicase DnaB (which exists only as single hexamers). The DnaB hexamer (DnaB6) separates the two strands at the replication fork, translocating in the 5′ to 3′ direction and using ATP hydrolysis to move down the strand. The DNA polymerase III holoenzyme (Pol III HE) makes contact at the replication fork and functions as a dimer that appears to have a regulated affinity for the lagging strand, allowing it to recycle between primers during Okazaki fragment synthesis. Primase interacts with the helicase and synthesizes short RNA primers for Okazaki fragment synthesis. The RNA primers are extended by the Pol III HE until a signal is received to recycle to the next primer at the replication fork. During the process, the RNA primers are removed and the gaps between the Okazaki fragments are filled by DNA polymerase I, and the remaining nicks are sealed by DNA ligase. DnaB has its N-terminal end free for docking primase, making it easy for the primase to capture the ssDNA emerging from the N-terminal domain during fork unwinding.

Initiating Replication in Archaea and Eukaryotes by Melting the Double Stranded DNA

Although not much is known about the initiation of replication by the melting of double-stranded DNA, recent studies have shed light on possible mechanisms for this process. Two co-crystal structures from archaea that contain both the initiators and the origin DNA show how the initiators recognize the double-stranded origin DNA. The complexes, Cdc6/Orc-dsDNA, show the double-stranded DNA deforming and bending, but not melting. Thus, researchers believe that additional factors, such as the MCM helicase mentioned in the section above, are needed to trigger the melting of the double-stranded DNA and to generate higher-order complexes at the origin of replication.

PDB 1ltl EBI

This image represents an example of the structure of a DNA replication initiator—specifically showing the Cdc21 and Cdc54 (similar to the Cdc6 described above) N-terminal domain. The initiator, Cdc6/ORC 1 (which is not depicted here but can be represented by the picture above) binds to the origin of replication and bends the DNA.
In eukaryotes, the SV40 LTag at the origin is able to trigger the melting of the origin of replication and the subsequent unwinding of DNA, making it the initiator-helicase used as a model system for examining origin recognition, assembly, and the melting of double-stranded DNA. The crystal structures of LTag hexamers that are not bound to DNA have channels that seem able to bind only single-stranded DNA: the channels are usually about 13 to 17 Å (angstroms) wide, while double-stranded DNA has a diameter of about 20 Å, so a double-stranded DNA molecule cannot fit inside the channel. Generally, studies of DNA translocation have shown that for double-stranded DNA to fit inside the channel of an LTag hexamer without changing its shape, the channel must be at least 20 Å in diameter. In addition to having channels that are too narrow, crystal structures of LTag hexamers show a planar arrangement of β-hairpins in the central channel.

Beta hairpin

Here is an example of a β-hairpin, a component of the LTag hexamer structure. The β-strands in the β-hairpin are antiparallel, meaning that the N-terminus of one β-strand is aligned with the C-terminus of the other. In the case of the LTag hexamer, the β-hairpins lie in the same plane in the central region of the channel.
Recently, cryo-EM has demonstrated that LTag hexamer channels can bind double-stranded DNA, with two hexamers surrounding the double-stranded DNA. Researchers, however, are still unsure whether the double-stranded DNA changes configuration because of the initiator-helicase or whether the LTag channel widens to allow the double-stranded DNA to bind. One model, the squeeze-to-open model, asserts that the LTag hexamer can fit the origin double-stranded DNA into its narrower channel by squeezing the DNA through the channel. As a result, base pairs are disrupted and melting of the double-stranded origin DNA occurs. This model has been proposed, and is in the process of being tested, because it appears to be consistent with the known data regarding DNA melting.

The Formation of the Replication Fork by the Squeeze-Pumping Model:
The squeeze-pumping model derives from information that comes from the structure of the Ltag hexamers. The structure includes a narrow channel as mentioned above, an AAA+ motor domain, a side channel where single stranded DNA can exit, and inter-Zn domains. This model is based on the DNA being melted by the squeeze-to-open model described above, where the melted DNA is pumped to the Zn-Domain until it generates the single stranded DNA loop which can then leave the channel and form the replication fork.

Translocation of Single and Partially Hydrolyzed Double Stranded DNA:
Researchers have demonstrated that double-hexameric LTag and MCM have the ability to unwind DNA. LTag has been shown to be able to unwind long double stranded DNA that include an internal origin sequence in its double-hexameric form. This differs from the steric exclusion model of fork unwinding—which is the most widely accepted model. This model is based on evidence showing that a ring-shaped helicase surrounds and moves down one of the DNA strands toward the double-stranded DNA fork while exposing the single stranded DNA strand in the process.

PDB 1g8y EBI

The above image represents a ring-shaped hexameric helicase structure that surrounds and moves down the DNA strands (which are not depicted in the photo).
Citation: Gai D, Chang YP, Chen XS. Origin DNA melting and unwinding in DNA replication. Curr Opin Struct Biol. 2010 Sep 24. [Epub ahead of print]. Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA.

Links between DNA replication and protein synthesis

For decades, DNA replication and protein synthesis were studied separately, and few scientists discussed the link between these two critical processes in living organisms. In their article “When DNA replication and protein synthesis come together”, Jonathan Berthon, Ryosuke Fujikane, and Patrick Forterre provide a detailed explanation of the connections between these seemingly independent fields of structural biochemistry. They suggest that unexpected but real connections between DNA replication and protein synthesis are found in the three domains of life, especially in Archaea and Eukarya, and that there are mechanisms that couple DNA and protein synthesis. Such mechanisms can be found in the activities of (p)ppGpp, a guanosine polyphosphate derivative, and of GTPases of the Obg family.

The stringent response is a phenomenon that links DNA replication in bacteria to changes in the availability of amino acids. When amino acid starvation occurs, a dramatic increase in the intracellular (p)ppGpp concentration is observed, which initiates the shut-down of rRNA gene transcription as well as protein synthesis. This process, however, varies among different bacteria. For instance, in Bacillus subtilis, amino acid starvation, along with inhibiting rRNA gene transcription, blocks the elongation step of DNA replication. (p)ppGpp also inhibits the DnaG primase in Bacillus subtilis and could directly affect Okazaki fragment synthesis on the lagging strand during replication. In Escherichia coli, on the other hand, the stringent response leads to immediate interference with the initiation of DNA replication. These findings are important evidence of a direct connection between protein synthesis and DNA replication: starving the cell of the amino acids needed for proteins has the potential to stop DNA replication.
Another source of connection is the Obg family. Obg is known for its ability to couple ribosome biogenesis, a critical step in protein production because proteins are synthesized on ribosomes from mRNA, with DNA replication. Scientists argue that the link between ribosome biogenesis and DNA replication starts from proteins that originally functioned in the making of ribosomes. These proteins participate in the regulation of the stringent response in bacteria as well as in the stabilization of DNA replication forks. One type of Obg, called ObgE, is useful in controlling the levels of (p)ppGpp. One important link between DNA replication and protein synthesis found in ObgE is that depletion of ObgE causes problems in chromosome segregation and cell separation. This finding is significant in showing that changes in certain proteins within the cell directly affect the pattern of DNA replication and the organism’s genetic processing. For this reason, Obg has been studied to establish the direct role that this type of protein plays in connecting DNA replication and protein synthesis.
Similarly, a protein family called NOG1 (nucleolar G-protein) also participates in the making of ribosomes. Nog1p, from this family, belongs to a complex that contains many other proteins that take part directly in DNA replication, such as Orc6p (origin recognition complex), Mcm6p, some subunits of the MCM complex, Yph1p, and Rrb1p. Kilian made the important statement that changes in proteins that connect ribosome biogenesis to DNA replication would cause “chromosome instability” and “tumor formation”. He also concludes that there exists a network of proteins that directly link the production of ribosomes and DNA replication in the domain Eukarya.
The studies and conclusions above concern Bacteria and Eukarya; no comparably clear evidence has been found for the domain Archaea. Scientists have, however, found in Archaea a cluster of genes that encode both DNA replication and translation proteins. This cluster includes numerous genes, among them essential ones such as aIF-2, a likely point of regulation for both DNA replication and protein synthesis; phosphorylation of its eukaryotic counterpart, eIF-2, is a major component in the mechanism of protein synthesis in eukaryotic cells. Another important component is Nop10, which plays a part in rRNA maturation. From these components, it can be concluded that there is indeed a close relation between protein synthesis and DNA replication. One important example is the phenomenon in which the two ribosomal proteins L44E and S27E interfere with the DNA replication process under special conditions such as amino acid starvation, discussed above in the case of the stringent response.
In conclusion, in both Archaea and Eukarya, experimental data confirm or suggest close connections between protein synthesis and DNA replication. The stringent response is one example: amino acid starvation inhibits the initiation of DNA replication.

Figure: Semiconservative replication.

The DNA replication process works in an “assembly line” fashion. The DNA double helix is unwound and each strand is copied. Many enzymes take part and must be present for this vital process to occur correctly.

Biological Proteins and Enzymes Required for DNA Replication (in chronological order)

Replication Fork

When DNA is replicated, helicase separates the two strands of the double helix, creating a replication fork. The two new strands synthesized at the fork are called the leading strand and the lagging strand. The leading strand is synthesized continuously in the 5′ to 3′ direction by DNA polymerase as the fork opens. The lagging strand is built on the opposite template, where overall growth appears to run in the 3′ to 5′ direction, so it is synthesized discontinuously as short pieces called Okazaki fragments. Primase lays down RNA primers, and DNA polymerase uses the 3′-OH group of each primer to extend DNA in the 5′ to 3′ direction. The RNA primers are then replaced with deoxyribonucleotides, and DNA ligase joins the fragments to complete the strand.

As the DNA unwinds, the helix ahead of the fork is forced to rotate and becomes over-twisted. This is a problem because an over-twisted helix eventually cannot be unwound any further, and replication stalls. Enzymes called DNA topoisomerases solve this problem. Topoisomerase I cuts the backbone of one strand, allowing the DNA to relax, while topoisomerase II cuts the backbones of both strands, allowing one DNA duplex to pass through another and so preventing DNA molecules from tangling together.


Helicases are motor proteins that move along double-stranded nucleic acids and actively unwind the double helix. The enzyme uses the energy produced by the hydrolysis of ATP to ADP to unwind and separate the strands of DNA by breaking the hydrogen bonds between the annealed nucleotide bases. Helicase opening of the double strand can be categorized into two cases: active opening and passive opening. In active opening, the helicase directly destabilizes the double-stranded nucleic acid (dsNA) to promote separation of the two strands. In passive opening, the helicase binds to single-stranded nucleic acid (ssNA) that has been exposed because thermal fluctuation opened part of the double strand. Active opening has been found to increase the rate of DNA unwinding sevenfold compared to passive opening. The product of this action is two template strands: one serves as the template for the leading strand and the other as the template for the lagging strand.

The leading strand is the new strand that is synthesized continuously, without interruption, while the lagging strand is formed in fragments called Okazaki fragments. This explains how both new strands grow in the 5′ to 3′ direction even though the two parental strands are antiparallel: fragmentary synthesis lets the lagging strand grow 5′ to 3′ at the molecular level while appearing overall to extend in the 3′ to 5′ direction.
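The antiparallel relationship described above can be illustrated with a minimal Python sketch. The function name `synthesize_new_strand` and the example sequence are hypothetical, chosen only for illustration. The sketch shows one thing: because the polymerase reads its template 3′ to 5′ while building 5′ to 3′, the new strand written 5′ to 3′ is the reverse complement of the template written 5′ to 3′.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize_new_strand(template_5_to_3: str) -> str:
    """Return the new strand, written 5'->3', for a template written 5'->3'.

    The polymerase reads the template 3'->5', so we walk the template
    backwards and append the complementary base at each step.
    """
    new_strand = []
    for base in reversed(template_5_to_3):
        new_strand.append(COMPLEMENT[base])
    return "".join(new_strand)

# Hypothetical parental strand, written 5'->3'.
parent = "ATGCGTTC"
daughter = synthesize_new_strand(parent)
print(daughter)
# Copying the daughter strand in turn regenerates the parental sequence.
print(synthesize_new_strand(daughter) == parent)
```

The same function applies to either parental strand; what differs at a real fork is only whether the copy can be made in one continuous run or must be made in pieces.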

Single-Stranded DNA Binding Proteins

The single-stranded DNA binding proteins bind to the DNA templates in a way that prevents the two separated strands from reannealing. These proteins keep the strands apart so that both can serve as templates for replication. This allows the remainder of the replication machinery to get into position and begin making new DNA strands.

DNA Polymerase

(see DNA Polymerase Section)

RNA Primase

RNA primase attaches to the lagging-strand template at a position adjacent to the helicase. Its function in DNA replication is to lay down short RNA primers, which it synthesizes in the 5′ to 3′ direction. Each primer provides the starting point from which DNA polymerase adds complementary nucleotides, and the stretch of new DNA that extends from one primer to the next is known as an Okazaki fragment. Primase is needed repeatedly on the lagging strand because DNA polymerase can add nucleotides only in the 5′ to 3′ direction, whereas the lagging strand as a whole grows in the 3′ to 5′ direction as the fork opens; the leading strand needs only a single primer to begin.
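As a rough illustration of primed, fragmentary synthesis, the following Python sketch splits a lagging-strand template into fixed-length pieces, starts each piece with a short RNA primer (shown in lower case), and fills in the rest with DNA. The function names, the fragment and primer lengths, and the example sequence are all hypothetical, and the sketch ignores the order in which fragments are made in time.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "u", "T": "a", "G": "c", "C": "g"}  # lower case marks RNA

def okazaki_fragments(template_3_to_5, fragment_len=6, primer_len=2):
    """Return primed fragments for a template written in reading order (3'->5').

    Each fragment is written 5'->3': an RNA primer followed by DNA.
    """
    fragments = []
    for start in range(0, len(template_3_to_5), fragment_len):
        piece = template_3_to_5[start:start + fragment_len]
        primer = "".join(RNA_PAIR[b] for b in piece[:primer_len])
        dna = "".join(DNA_PAIR[b] for b in piece[primer_len:])
        fragments.append(primer + dna)
    return fragments

def mature_strand(template_3_to_5):
    """Strand after primers are replaced with DNA and fragments are ligated."""
    return "".join(DNA_PAIR[b] for b in template_3_to_5)

template = "TACGGATCCA"  # hypothetical template, written 3'->5'
print(okazaki_fragments(template, fragment_len=5, primer_len=2))
print(mature_strand(template))
```

Replacing each lower-case RNA base with its DNA equivalent and joining the pieces gives the same sequence as `mature_strand`, which mirrors the primer replacement and ligation steps.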

Figure: DNA polymerase.
DNA Replicases from a Bacterial Perspective

Mitochondrial DNA Replication

Human Mitochondrion Genome

Mitochondrial DNA (mtDNA) is maintained separately from nuclear DNA. Because of its small size, mtDNA carries only 37 genes, 13 of which encode proteins, whereas the haploid nuclear genome encodes over 20,000 genes. Nevertheless, it can serve as a model system for studying nuclear DNA replication. The circular human mtDNA genome contains approximately 16,600 base pairs. Its genes are necessary for making ATP by oxidative phosphorylation. mtDNA replication does not appear to be restricted to a specific phase, so it can take place repeatedly during a cell cycle.

The endosymbiont hypothesis proposes that mitochondria descend from free-living bacteria that were engulfed by a host cell, giving rise to the first eukaryote. The existence of mtDNA itself supports this hypothesis. Because mitochondria were once free-living bacteria, the mechanics of mtDNA maintenance might be expected to resemble those of prokaryotes more than those of eukaryotes.

A mechanism of mtDNA replication was first proposed in 1972 on the basis of electron microscopy. All replicating mtDNA molecules observed had a single-stranded branch, which suggested that leading-strand and lagging-strand synthesis are uncoupled in mitochondria, unlike at the replication fork of nuclear DNA. Human mtDNA is typically arranged in covalently closed circles about one genome in length. Three modes of mtDNA replication have been described. In the first, a strand-displacement replication fork, leading-strand DNA synthesis occurs in the absence of lagging-strand DNA synthesis. In the second, DNA synthesis is carried out by conventional coupled leading- and lagging-strand synthesis. In the third, delayed lagging-strand DNA synthesis is accompanied by incorporation of RNA on the lagging strand, termed RITOLS (RNA incorporation throughout the lagging strand).

The question of how mammals replicate their mtDNA was reopened (“mtDNA replication redux”) in an attempt to test the idea that biased segregation of human pathological mtDNA variants is related to a replicative advantage, as had been suggested for yeast mtDNA. Two-dimensional agarose gel electrophoresis (2D-AGE), which had been used to define details of the replication mechanisms of nuclear, plasmid, and viral genomes, was used to resolve replication intermediates from mitochondria. Many replication intermediates from crude mitochondrial preparations were sensitive to single-strand nuclease, as predicted by the strand-displacement model (SDM), while a subset formed arcs indistinguishable from those associated with replicating nuclear and prokaryotic DNA.

However, purer preparations of mitochondria yielded RNA/DNA hybrids rather than partially single-stranded DNA. This suggested that the SDM intermediates seen in earlier studies could be explained by RNA loss during isolation and processing.

In conclusion, the mechanisms of mtDNA replication remain controversial. The strand-displacement model requires a minimum of two primer maturation events for each strand, and the same applies to RITOLS replication. The identification of Dna2 and Fen1 in mitochondria provides new tools for studying mtDNA replication: manipulating their expression and studying mutant variants that disrupt mtDNA replication may prove very informative.

Mutations, deletions, and other rearrangements of mtDNA have been found to increase with a mammal’s age. The accumulation of mutated mtDNA in single cells causes respiratory chain deficiency, which shortens life span. When many mutations are present, it can also cause aging phenotypes such as weight loss and hair loss.[1]


  1. DNA Replication and Transcription in Mammalian Mitochondria.

Holt, Ian J. “Mitochondrial DNA replication and repair: all a flap.”

  • Berthon, Jonathan, Ryosuke Fujikane, and Patrick Forterre. “When DNA replication and protein synthesis come together”. Trends in Biochemical Sciences. Vol.34, no.9 (2009): 429-434. Cell Press.
  • Dahai Gai, Y Paul Chang and Xiaojiang Chen. “Origin DNA melting and unwinding in DNA Replication.” Current Opinion in Structural Biology 2010, 20:1-7. Elsevier.
  • Charles S. McHenry. “DNA Replicases from a Bacterial Perspective.” Annual Review of Biochemistry Volume 40 2011 July, 403-36.

General information

The full process of DNA replication involves the intricate and coordinated interplay of more than 20 proteins. In 1958, Arthur Kornberg and his colleagues isolated DNA polymerase from E. coli. DNA polymerase was the first known enzyme that promotes formation of the bonds joining the units of the DNA backbone. E. coli has several DNA polymerases, designated by Roman numerals, that play important roles in DNA replication and repair.

DNA polymerase is an enzyme that synthesizes a new DNA strand from an existing DNA template and also helps repair DNA in order to avoid mutations. It catalyzes the formation of the phosphodiester bonds that make up the backbone of DNA molecules, and it uses magnesium ions in catalysis to balance the charge of the phosphate groups.

Nucleotides are added only to the 3′ end of the growing strand, so DNA polymerase cannot start a new chain on its own. Another function of DNA polymerase is error correction, the removal of mistakes made in the new DNA strand. The DNA polymerase family consists of 7 subgroups: A, B, C, D, X, Y, and RT. Eukaryotes have at least 15 different DNA polymerases. However, none of the eukaryotic polymerases can remove primers, and only the elongation polymerases can proofread the sequence.

Although there are different types of DNA polymerases, all share common structural features: they differ greatly in detail but have a very similar overall shape. At least 5 structural classes of DNA polymerase have been identified. Each takes the shape of a hand, with specific regions referred to as the fingers, the palm, and the thumb. In all classes, the thumb and fingers wrap around the DNA, holding it across the active site of the enzyme, while the palm contributes the residues that make up this active site. Moreover, all DNA polymerases use similar catalytic strategies.

Diagram of DNA polymerase extending and proofreading a DNA strand

General Formulation

DNA polymerases are the catalysts in the step-by-step addition of deoxyribonucleotide units to a DNA chain. The reaction catalyzed is

(DNA)ₙ + dNTP ↔ (DNA)ₙ₊₁ + PPᵢ

where dNTP stands for any deoxyribonucleoside triphosphate and PPi is a pyrophosphate ion.


1. All four activated precursors, the deoxynucleoside 5′-triphosphates dATP, dGTP, dCTP, and dTTP, are needed for the reaction, in addition to Mg²⁺ ions. Typically, two metal ions take part in the reaction: one interacts with the primer and the other with the dNTP. Carboxylate groups of residues in the enzyme’s active site hold the two metal ions in place.

2. The new DNA chain is constructed directly on a pre-existing DNA template. DNA polymerases catalyze phosphodiester bond formation efficiently only if the base on the incoming nucleoside triphosphate is complementary to the base on the template strand. In other words, DNA polymerase is a template-directed enzyme: it reads the existing DNA strand as a template and synthesizes the complementary sequence as a new strand.

3. DNA polymerases require a primer to start synthesis. The chain-elongation reaction is a nucleophilic attack by the 3′-OH terminus of the growing chain on the innermost phosphorus atom of the deoxynucleoside triphosphate. Therefore, a primer strand with a free 3′-OH group must already be bound to the template strand. This primer is made of RNA: because RNA synthesis can begin without a primer, a short RNA segment starts the synthesis of DNA. Once DNA synthesis has been initiated, the RNA piece is removed and replaced by the proper DNA sequence. The reaction forms a phosphodiester bridge and releases pyrophosphate. The subsequent hydrolysis of pyrophosphate by pyrophosphatase, which yields two ions of orthophosphate (Pi), helps drive the polymerization forward. Elongation of the DNA chain proceeds in the 5′-to-3′ direction.

4. Many DNA polymerases can remove mismatched nucleotides as a means of error correction. These polymerases possess a distinct nuclease activity that eliminates incorrect bases through a separate reaction: the polymerase backs up by one base pair, excises the incorrect base, replaces it with the proper one, and continues replication. Because of this 3′ to 5′ exonuclease activity, DNA replication is remarkably accurate. This process is called proofreading. It is not perfect, however, which is why natural mutations and related diseases can still arise.
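The four points above can be tied together in a small Python sketch: extension requires a primer, each added nucleotide is chosen by complementarity to the template and releases one pyrophosphate, and an optional proofreading step removes a mismatch before synthesis continues. The function `extend`, its parameters, and the error model are hypothetical simplifications, and the primer is written with DNA letters for brevity even though real primers are RNA.

```python
import random

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend(template_3_to_5, primer, error_rate=0.0, proofread=True, rng=None):
    """Extend `primer` (written 5'->3') along a template read 3'->5'.

    Returns (new strand, pyrophosphates released, mismatches removed).
    """
    if not primer:
        raise ValueError("a primer with a free 3'-OH is required")
    rng = rng or random.Random(0)
    strand = list(primer)
    pyrophosphates = 0
    corrections = 0
    while len(strand) < len(template_3_to_5):
        correct = PAIR[template_3_to_5[len(strand)]]
        # Occasionally choose a wrong dNTP to mimic a misincorporation.
        if rng.random() < error_rate:
            base = rng.choice([b for b in "ATGC" if b != correct])
        else:
            base = correct
        strand.append(base)       # (DNA)n + dNTP -> (DNA)n+1 + PPi
        pyrophosphates += 1
        if proofread and base != correct:
            strand.pop()          # 3'->5' exonuclease removes the mismatch
            corrections += 1
    return "".join(strand), pyrophosphates, corrections

new_strand, ppi, fixes = extend("TACGGATC", "A", error_rate=0.5)
print(new_strand, ppi, fixes)
```

With proofreading enabled, the finished strand is fully complementary to the template regardless of the error rate, at the cost of extra incorporation steps. Setting `proofread=False` leaves any mismatches in place.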

Eukaryotic DNA Polymerases

DNA polymerases play a key role in the synthesis of DNA; without them, life would cease to exist. These polymerases are multi-subunit complexes whose components must work together to function efficiently. They act on single-stranded DNA (the template) to synthesize a complementary strand. In eukaryotic cells, there are 5 families of DNA polymerase, which together comprise as many as 15 different enzymes. Three DNA polymerases are critical for DNA replication: polymerase α-primase, polymerase δ, and polymerase ε. These three function at the replication fork. The DNA strands are unwound by the MCM helicase, which is part of the CMG complex (Cdc45-MCM-GINS). Polymerase α-primase initiates replication on both the leading and lagging strands by laying down RNA primers of about 10 nucleotides.

After initiation, polymerases δ and ε are brought to the complex and tethered there, which increases the productivity of the enzymes. Specifically, Pol δ synthesizes the lagging strand while Pol ε synthesizes the leading strand. These roles were established by genetic experiments. For Pol ε, a mutation was introduced into the active site that increased the enzyme’s error rate, leaving a characteristic signature in the regions where it was active. With the help of reporter genes, these experiments showed that Pol ε does indeed participate in synthesis of the leading strand. The same approach was applied to Pol δ to demonstrate its activity on the lagging strand.

Bases must be incorporated with consistent accuracy. Fortunately, a misincorporation occurs only about once in every 10,000 replicated base pairs. When a mismatch does occur at the end of the primer strand, the strand must be moved out of the polymerase active site and into the exonuclease domain. There the error is removed by proofreading, allowing synthesis to continue. [1]
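To give a feel for what a rate of one misincorporation per 10,000 base pairs means before proofreading, here is a short Python calculation. As an illustrative assumption it applies that rate to a sequence the size of the roughly 16,600 base pair human mtDNA genome mentioned earlier in this chapter, and it treats every position as independent; both function names are hypothetical.

```python
def expected_misincorporations(length_bp, error_rate=1e-4):
    """Average number of wrong bases inserted over `length_bp` positions."""
    return length_bp * error_rate

def chance_of_any_error(length_bp, error_rate=1e-4):
    """Probability of at least one misincorporation, assuming independence."""
    return 1.0 - (1.0 - error_rate) ** length_bp

MTDNA_LENGTH = 16_600  # approximate size of human mtDNA in base pairs
print(expected_misincorporations(MTDNA_LENGTH))
print(chance_of_any_error(MTDNA_LENGTH))
```

Even for such a small genome, more than one misincorporation is expected on average per copy, which is why the proofreading step matters.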

Polymerase Families

Because polymerases are central to life, their structures and roles have been studied extensively. To date, 7 different families (or domains) have been identified, labeled with letters: A, B, C, D, X, Y, and reverse transcriptase (RT). Five of these occur in eukaryotic cells; others are unique to bacteria and archaea. All families share a core structure of palm, finger, and thumb domains, from which they diverge according to their specific cellular functions. Family A includes Pol I, which functions in DNA repair and also takes part in processing Okazaki fragments during replication of the lagging strand. Family B includes the eukaryotic polymerases delta, alpha, and epsilon. Family C harbors polymerase III. Family D includes polymerases that are exclusive to archaea. Families X and Y include enzymes that carry out repair.


Schematic summary of the compositions of eukaryotic DNA polymerases: [a] polymerase α, [b] polymerase ε, and [c] polymerase δ. The common core structure can be seen in the larger catalytic subunits. Each polymerase also has its own unique smaller subunits that allow it to function in its own specific way.

Within the Eukaryotic DNA Polymerase Structures

As noted earlier, the polymerases are multi-subunit entities and are very complex. Each consists of a large catalytic subunit (a member of the B family) and several smaller subunits. The architecture of the B family polymerases is consistent: an N-terminal domain, a 3′-5′ exonuclease domain, and palm, finger, and thumb domains, arranged in a ring-like structure. The catalytic subunits of all the eukaryotic polymerases are assumed to be related, having arisen from a common ancestor by gene duplication. Studies show, however, that the catalytic subunit of Pol ε is larger than the other two because of additional sequences.

High-resolution structures are essential for further analysis of polymerases. To date, much progress has been made in determining the structures of the different subunits that make up the polymerases, but only at low resolution. The first structure reported was the cryo-EM structure of Pol ε. Researchers are working toward high-resolution structures because these would allow a deeper understanding of the fidelity of DNA synthesis and of how the highly regulated genome is maintained in eukaryotic cells. They would also allow the design of genetic experiments to explore interactions of and within the complexes.[2]


“Molecular Recognition and Catalysis in Translation Termination Complexes” by Bruno P. Klaholz. IGBMC (Institute of Genetics and of Molecular and Cellular Biology), Department of Structural Biology and Genomics, Illkirch, F-67404 France. Trends in Biochemical Sciences, May 2011, Vol 36, No. 5

“Crystal Structure and Functional Analysis of the Eukaryotes Class” Mol. Cell 14, 233-245. Kong, C. et al. (2004)
DNA initiation is the first stage of the DNA replication process. During this stage, the double-stranded DNA (dsDNA) is first separated into single strands by breaking the hydrogen bonds between base pairs. The separation of dsDNA into single-stranded DNA (ssDNA) is known as DNA melting, and the proteins responsible for it are called initiator proteins. In the next step, helicase proteins bind to the DNA and unwind it to create a replication fork. In eukaryotic and archaeal cells, melting and unwinding of DNA are accomplished mainly by the mini-chromosome maintenance (MCM) helicase along with multiple initiation proteins. However, helicases such as large T antigen (LTag) of simian virus 40 (SV40) and E1 of bovine papillomavirus (BPV) can melt and unwind dsDNA without any additional cofactors (Chen et al.). Because the viral replication systems closely resemble the eukaryotic and archaeal ones, LTag and E1 are studied intensively by researchers hoping to gain a better understanding of the replication process.


Differences among the viral proteins, for example in the arrangement of β-hairpins and the mode of ATP binding, can lead to different mechanisms of melting and unwinding. LTag of SV40 is believed to follow the double-pump looping model described in Chen et al. First, LTag binds to the dsDNA at the replication origin as a double hexamer and compresses it, breaking the hydrogen bonds between the two strands. The two hexamers then pump the dsDNA ahead of the replication origin into the double hexamer, creating a replication fork in which the ssDNA emerges as loops. Further pumping of dsDNA elongates the replication fork and allows fork progression. E1 of BPV, on the other hand, uses a mechanism that closely follows the steric exclusion model (Chen et al.). In this model, E1 helicase exists only as a single hexamer, which separates into two trimers that each bind one strand of the dsDNA at the origin to induce melting. After the dsDNA has melted, the two trimers rejoin to form a hexamer that binds only one strand and unwinds the DNA to create a replication fork.


Although both models present plausible mechanisms for DNA melting and replication fork formation, both still require further supporting evidence. Questions such as how LTag binds to dsDNA and whether the E1 hexamer can separate into two trimers remain unanswered, so more intensive investigation is needed.


Chen, Xiaojiang S, Paul Chang and Dahai Gai. “Origin DNA melting and unwinding in DNA replication”. Current Opinion in Structural Biology 2010, 20:1-7.

Meselson-Stahl Experiment

Meselson and Stahl Experiment

Theories of Replication of DNA


Conservative replication: The daughter DNA is composed entirely of new DNA, and the parent DNA retains its original backbones and bases.


Semiconservative replication: Replication produces two copies of DNA, each made up of 50% DNA from the parent helix and 50% new DNA. Each daughter double helix contains one old strand (from the parent) and one new strand (the complementary strand resulting from replication).


Dispersive replication: This form of replication also produces daughter DNA that consists of 50% new DNA and 50% parent DNA. In this case, however, the new and old DNA are shuffled, and fragments of each are found on both strands of both copies of the DNA after replication.

Figure: Schematic of the three theories of replication, by CJHIGGIN

The Experiment

Watson and Crick proposed that DNA replicates semi-conservatively, but conservative and dispersive replication remained plausible until they could be disproved. In 1957, Matthew Meselson and Franklin Stahl devised an experiment to determine whether DNA replication follows a conservative, semi-conservative, or dispersive model.


Meselson and Stahl cultured Escherichia coli in a medium containing a heavy isotope of nitrogen (15N) as the only nitrogen source, instead of the more common nitrogen-14 (14N). After several generations, the E. coli DNA was composed of nucleotide bases containing the 15N isotope. The (15N) DNA was denser than the common (14N) DNA, and the difference in densities allowed the two to be separated by density gradient equilibrium sedimentation.

To achieve separation of the E. coli DNA by densities, the DNA was mixed with a solution of CsCl and centrifuged. A CsCl density gradient was created as a result of sedimentation and diffusion working against each other. The DNA molecules were found in the area of the CsCl density gradient that was equal to their own density.

The (15N) E. coli cells were transferred to a medium that contained only (14N). DNA was isolated from the first generation of cells grown in the (14N) medium and analyzed by density gradient equilibrium sedimentation. Then DNA from the second generation of E. coli grown in the (14N) medium was extracted and analyzed.


The first generation of E. coli grown in the 14N medium contained a single DNA band halfway between the positions expected for the 14N DNA band and the 15N DNA band. This demonstrated the presence of DNA that was lighter than the DNA from the original population of E. coli grown in the 15N medium, but still heavier than 14N DNA. From the position of this intermediate band in the density gradient, it was apparent that the DNA was a hybrid containing both 14N and 15N. This eliminated the conservative model of replication, which would have produced two distinct bands: one matching the position of the 15N-containing DNA and one matching the position expected for DNA containing only 14N. Only the dispersive and semi-conservative models fit the result.

The second generation of E. coli grown in the 14N medium contained two distinct bands. One of the bands was 14N DNA, and the other was the intermediate (14N/15N) DNA. This result supported the theory of semiconservative replication, since dispersive replication would have produced a single band of progressively lower density in each successive generation.
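The reasoning behind the band patterns can be checked with a minimal Python simulation. It is a sketch under simplifying assumptions: each duplex is represented only by the 15N fraction of its two strands, dispersive replication is assumed to spread old material evenly over all four daughter strands, and the function names `replicate` and `bands` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def replicate(duplex, model):
    """Replicate one duplex, given as (15N fraction strand 1, strand 2)."""
    a, b = duplex
    light = Fraction(0)  # strands made in 14N medium contain no 15N
    if model == "semiconservative":
        return [(a, light), (b, light)]
    if model == "conservative":
        return [(a, b), (light, light)]
    if model == "dispersive":
        share = (a + b) / 4  # old material spread over four strands
        return [(share, share), (share, share)]
    raise ValueError(f"unknown model: {model}")

def bands(model, generations):
    """Map each band (overall 15N fraction) to its number of duplexes."""
    population = [(Fraction(1), Fraction(1))]  # fully 15N-labelled start
    for _ in range(generations):
        population = [d for duplex in population
                      for d in replicate(duplex, model)]
    return dict(Counter((a + b) / 2 for a, b in population))

for model in ("conservative", "semiconservative", "dispersive"):
    print(model, bands(model, 1), bands(model, 2))
```

Only the semiconservative model gives a single intermediate band after one generation and then an intermediate band plus a light band after two, which matches the observations described above.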

The figure below illustrates the theoretical outcome of the conservative, dispersive, and semiconservative models along with the experimental outcome obtained by Meselson and Stahl.


Figure: A schematic of the appearance of fractions of DNA samples after centrifuging in a density gradient, by CJHIGGIN


Berg, Jeremy M., John L. Tymoczko, and Lubert Stryer. “Exploring Genes and Genomes.” Biochemistry. New York: W. H. Freeman, 2007. 113-14. Print.

Campbell and Reese’s Biology, 7th Edition

Nelson and Cox’s Lehninger Principles of Biochemistry, 5th Edition

General Information

A knockout mouse is a mouse used by researchers in laboratory experiments aimed at understanding the consequences of inactivating, or “knocking out,” a specific gene. In general, the over- and/or under-expression of genes in an organism for experimentation is known as transgenic technology [1]. A knockout is made by disrupting the existing gene or replacing it with an artificial piece of DNA that is a mutated version of the targeted gene [2]. The disruption causes a loss of gene activity, which changes the mouse’s phenotype. When the phenotype is affected, changes in appearance, behavior, and other physical characteristics should be evident in the offspring.

Purpose and Applications

Because many genes are similar in mice and humans, inactivating or “knocking out” a particular gene in a mouse can provide evidence about the function of the corresponding gene in humans [3]. The effect usually shows up as a change in the animal’s physical characteristics, behavioral characteristics, or the biochemical pathways that regulate the mouse’s functions [4]. This laboratory technique has been used in various types of research:

  • Cancer Research
  • Cystic Fibrosis
  • Lung, Heart, Blood, and Parkinson Diseases
  • Aging
  • Anxiety
  • Arthritis
  • Diabetes
  • Obesity
  • Neural Pathway Functions
  • Substance Abuse

Knockout mice can also be useful in studying how different recreational drugs affect the animal, which can help in testing therapies for drug abuse in humans [5]. As another example, a p53 knockout mouse lacks the p53 gene, which codes for a protein that inhibits the growth of tumors and stops cell division. Without this gene, the mouse is at risk of developing various types of cancer (blood, lung, brain, bone, etc.). This model is useful because humans with an abnormality in this gene have Li-Fraumeni syndrome, a rare autosomal dominant hereditary disorder that puts people at a much higher risk of developing cancer.

Limitations and Weaknesses

Although knockout mouse technology is an excellent research tool, complications frequently occur when a particular gene is knocked out. For example, the mouse might depend on the gene of interest for other important bodily functions; if the gene is disrupted, the mouse might die or stop functioning correctly in unexpected ways. Conversely, knocking out the gene may produce no observable change in any of the mouse’s characteristics. Either outcome makes it extremely difficult to relate the study to humans. Gene knockouts in mouse embryos may also prevent the embryos from growing into adult mice. This limits the studies to the prenatal stage, further weakening the comparison between the gene knockout in the mouse and the gene’s role in humans.

Methods of Preparing Knockout Mice

Knockout Mouse Breeding Scheme

Knockout mice are created from embryonic stem cells (ES cells), which are harvested approximately 4 days after fertilization. ES cells are used at this early stage so that the swapped gene sequence is passed on to daughter cells during division and ends up in the cells of the adult mouse. The gene is altered by one of two methods:

Gene Targeting

In gene targeting, a particular gene is manipulated within the nucleus of the mouse ES cells through homologous recombination [6]. To begin, the DNA sequence of the gene to be replaced must be known. Next, a new DNA sequence is made that will be inserted into the chromosome in place of the wild-type allele. This artificial, inactive sequence is nearly identical to the gene being knocked out, and it is flanked on both sides by DNA identical to the sequences that flank the gene on the chromosome. After the artificial DNA is introduced, the cell recognizes the identical stretches of DNA and “trades” the existing gene for the artificial DNA. Since the artificial DNA is inactive, the function of the existing gene has now been “knocked out” by gene targeting. The altered cells keep growing and dividing, carrying the new sequence with them.
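The “trade” performed by homologous recombination can be pictured with a toy string model in Python. This is only a sketch: the function `target_gene`, the flanking sequences, and the lower-case `neo` marker standing in for the inactive insert are all hypothetical, and real recombination depends on much longer stretches of homology.

```python
def target_gene(chromosome, left_arm, insert, right_arm):
    """Swap whatever lies between two homology arms for `insert`.

    If either arm is not found, the chromosome is returned unchanged,
    mirroring the need for matching sequence on both sides of the gene.
    """
    start = chromosome.find(left_arm)
    if start == -1:
        return chromosome
    inner_start = start + len(left_arm)
    end = chromosome.find(right_arm, inner_start)
    if end == -1:
        return chromosome
    return chromosome[:inner_start] + insert + chromosome[end:]

# Hypothetical chromosome: flank, gene (ATGAAATTT), flank.
chromosome = "TTGGATCCATGAAATTTCTGCAGAA"
knocked_out = target_gene(chromosome, "GGATCC", "neo", "CTGCAG")
print(knocked_out)
```

The flanks are left intact and only the sequence between them changes, which is the sense in which the targeted gene is replaced rather than the chromosome being altered at random.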

Example: An embryo in the blastocyst stage is isolated from a mouse strain with gray fur. The embryonic stem cells are removed from the blastocyst and grown in tissue culture. The homologous recombinant gene is transferred into the cells, which are then grown in ganciclovir and neomycin. The cells carrying the new gene are transferred back into blastocysts, and many of these blastocysts are implanted into a pseudo-pregnant mouse with white fur. The mouse gives birth to some white mice and some mice with patches of gray. The patched mice, which carry both the gray-fur and white-fur genes, are mated with white mice. If the gametes of a patched mouse derive from the recombinant stem cells, its offspring are all gray, and these offspring are heterozygous for the targeted gene. The heterozygous mice are then mated with each other, and the offspring that are homozygous for the recombinant allele are identified. The end result is the knockout mouse, in which both alleles have been knocked out.[7]
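The final breeding step, in which heterozygous mice are mated to obtain homozygous knockouts, follows ordinary Mendelian ratios. The Python sketch below enumerates a Punnett square; the allele symbols `+` (wild type) and `-` (knocked out) and the function name `cross` are illustrative assumptions.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Return offspring genotype probabilities for two diploid parents.

    Each parent is a two-character string of alleles, e.g. "+-".
    """
    offspring = Counter(
        "".join(sorted(pair)) for pair in product(parent1, parent2)
    )
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}

# Heterozygote x heterozygote: a quarter of the pups lack both copies.
print(cross("+-", "+-"))
# Heterozygote x wild type: no homozygous knockouts are produced.
print(cross("+-", "++"))
```

Under these assumptions only about one pup in four from a heterozygote cross is a full knockout, so the litters have to be genotyped to find them.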

Gene Trapping

Gene trapping uses a piece of artificial DNA that carries a “reporter gene” and is designed to insert into any gene at random. The inserted DNA prevents normal RNA splicing of the gene, so the gene cannot produce its protein and its function is eliminated. The activity of the reporter gene can then be observed and studied to determine the existing gene’s normal function in the mouse.

Which method is better?

For both of these methods, a DNA vector is used to carry the artificial DNA into the embryonic stem cells of the mice. Once the DNA has been introduced, the cells are cultured in vitro and then injected into mouse embryos. These embryos are implanted into female mice, which then give birth to mice with the knocked-out genes.

Both methods have their own advantages. In gene targeting, the sequence of the target gene is known, so researchers can knock out exactly the sequence(s) they find interesting. In gene trapping, the gene that is knocked out is not known in advance, because the “reporter gene” inserts at random rather than at a chosen site. This can produce many different kinds of knockout mice, including knockouts of genes that would not readily have been chosen for targeting. However, the randomness makes finding the function of a specific gene cumbersome: researchers need to spend a lot of time testing the ES cells to identify which gene has been knocked out.


  1. Wikipedia: Knockout Mouse [41]
  2. National Human Genome Research Institute: Knockout Mice [42]


  1. “What is transgenic technology”. Knockout Mouse and Transgenic Research. Retrieved 2009-11-14. 
  2. Twyman, Richard. “Knockout Mice”. The Human Genome. Retrieved 2009-11-14. 
  3. “NIH Knockout Mouse Project”. National Institute of Health. Retrieved 2009-11-14. 
  4. “Knockout Mice”. National Human Genome Research Institute. Retrieved 2009-11-14. 
  5. Berg, Jeremy (2006). Biochemistry (6th ed.). W. H. Freeman. ISBN 0716787245.
  6. Twyman, Richard. “Knockout Mice”. The Human Genome. Retrieved 2009-11-14. 
  7. Campbell, A. Malcolm. “Homologous Recombination and Knockout Mouse”. Davidson College. Retrieved 2009-11-18.

Transgenic animals are animals that have had foreign genes from another organism introduced into their genome. A foreign gene (such as one encoding a hormone or blood protein) is cloned and injected into the nuclei of another animal’s in vitro fertilized egg. Some cells integrate the transgene and express the foreign gene, and the developing embryo is then surgically implanted in a surrogate mother. If the embryo develops, the result is a transgenic animal carrying a particular gene from another species.

Applications of transgenic technology include improving livestock, for example by producing higher-quality wool in sheep or by increasing an animal’s muscle mass so that it yields more meat for consumption. Transgenic animals can also be used for medical purposes, such as producing human proteins: a desired transgene is inserted into the genome of an animal in a manner that causes the target protein to be expressed in the milk of the transgenic animal.

Another example involves mice. Normal mice cannot be infected by the polio virus because, unlike humans, they lack the cell surface molecule that serves as the receptor for polio. However, the polio receptor gene can be injected into a mouse to produce a transgenic mouse. This mouse can then be infected by the polio virus and displays symptoms similar to those seen in humans affected by the virus.

Current studies with transgenic animals include work on animals such as rhesus macaques that carry the human gene for Huntington’s disease. This allows scientists to research options that could provide a cure for Huntington’s, or at least a better treatment. Other animals, such as mice, including those that contain human stem cells, are used to develop medicines and treatment options for diabetes, strokes, and blindness.

The Human Genome Project has also been of great help to work with transgenic animals. With the DNA sequence of the human genome available, scientists can study genes as potential drug targets, which helps them identify the genes most relevant to a cure for the disease they are studying.

The expression of a transgene can also be engineered to take place in plants, for example by taking the bioluminescence gene that gives fireflies their glow and introducing it into a plant.

Transgenic Animals: Countless Benefits to Humanity

The three most common areas in which transgenic animals are produced for the benefit of human welfare are agriculture, medicine, and industry.

Agricultural Applications

Farmers have always wanted the best breed of each type of animal, with the best traits it can possibly have. Conventional breeding can take a lot of time and is not entirely efficient. With new advances in technology, selected characteristics can be developed in a species in much less time and with more accuracy.

Not only are animals produced more efficiently, but the quality of the animals is enhanced as well. Examples include cows that produce milk with a modified composition (for example, less lactose) and sheep that produce a lot more wool.

Animals with these new qualities must also be protected. Scientists are working on creating animals that are resistant to particular diseases, which would reinforce the two benefits stated above.

Medical Applications

Animals whose genes have been modified so that they show disease symptoms can be studied, and cures may be developed as a result. At Harvard, scientists created a transgenic mouse known as OncoMouse® or the Harvard mouse, which carries genes that promote the development of a variety of cancers found in humans.

Xenotransplantation, the transplantation of living cells, tissues, or organs from one species to another, may play a major role in medicine in the future. Because of worldwide shortages of organs, advances in the genetic manipulation of animals may be used to make their organs compatible with humans. For example, transgenic pigs may become a source of transplant organs, because pig and human organs are similar. Currently, however, a pig protein prevents the human immune system from accepting the organs. If this pig protein can be successfully replaced by a human protein, pigs could help meet a major need for transplant organs such as hearts, livers, and kidneys. Transgenic animals can also be used to produce refined drugs and nutritional supplements for the pharmaceutical industry. For example, insulin and blood anti-clotting factors may soon be extracted from the milk of transgenic animals such as goats, sheep, and cows. Such milk is also the subject of major research aimed at treating diseases such as phenylketonuria or cystic fibrosis.

Human gene therapy is another medical application that is gaining wide acclaim. In essence, it is the transfer of genetic information into a patient’s tissues and organs. As a result, defective copies of genes can be eliminated or their normal functions restored. The procedure can also provide new functions to cells. For example, to combat cancer and other diseases, a gene that causes the production of immune system mediator proteins can be inserted. With this therapy, many genetic diseases could have potential cures further down the road.
There are two paths to gene therapy. The first is the direct transfer of genes into the patient. The second is the use of living cells as vehicles to transport the genes of interest. Both paths have advantages and disadvantages.
Direct gene transfer is the simplest way of administering the gene of choice, and there are two methods. In the first, genes are delivered via liposomes or other biological microparticles into the patient’s tissue or bloodstream. In the second, genes are introduced using genetically engineered viruses, such as retroviruses or adenoviruses. Because of biological safety concerns, the viruses must first be altered so that they are not infectious. The simplicity of direct gene transfer also brings major weaknesses. For example, it does not allow control over where the therapeutic gene will insert: the transferred gene will either insert randomly into the patient’s chromosomes or remain unintegrated in the targeted tissue. Moreover, the target tissue may not be easily accessible for direct application of the therapeutic gene.
The second path of gene therapy is the use of living cells to deliver the therapeutic gene. This method is much more complex than direct gene transfer and has three major steps. First, cells from the patient are isolated and propagated in vitro. Second, the therapeutic gene is introduced into these cells using methods similar to those of direct gene transfer. Finally, the genetically modified cells are returned to the patient. The advantage of using cells as gene transfer vehicles is that cells can be manipulated more accurately and precisely in the laboratory than in the body. In addition, some of these cells can propagate continually under laboratory conditions before reintroduction into the patient. Moreover, some cell types can localize to particular regions of the human body; for example, hematopoietic (blood-forming) stem cells can return to the bone marrow upon reintroduction into the body. This can be very useful for delivering a therapeutic gene that needs regional specificity. However, a major disadvantage is the biological complexity of the living cell’s environment. Isolating a specific type of cell requires not only extensive knowledge of its biological markers, but also knowledge of what that cell type needs to stay alive in vitro and continue to divide. Unfortunately, the specific biological markers of many cell types are unknown. Moreover, many normal human cells cannot be sustained in the lab for long periods without amassing deleterious mutations.

Industrial Applications

Transgenic animals have been produced for chemical safety testing, as these animals are sensitive to toxic substances. Transgenic animals may also produce materials that can be used in biochemical reactions. Microorganisms have likewise been engineered to produce enzymes that speed up important reactions.

Production of Transgenic Animals

The production of transgenic animals involves introducing foreign genes into an organism’s genome, its genetic makeup. These inserted genes are known as transgenes. Most importantly, the foreign genes must be transmitted through the germ line, so that every cell, including the germ cells, whose function is to transmit genes to the organism’s offspring, contains the same change in genetic material. The predominant method of creating transgenic animals is DNA microinjection. However, the success rate is very low because DNA insertion is arbitrary. The offspring are the animals studied for the new transgene, but producing offspring that successfully carry the gene is extremely difficult.

Scientists may produce transgenic animals in three main ways: DNA microinjection, retrovirus-mediated gene transfer, and embryonic stem cell-mediated gene transfer.

DNA microinjection

Technique summary

The first animal on which DNA microinjection was tried was the mouse. DNA microinjection is the transfer of a desired gene into the pronucleus of a reproductive cell. The cell is first cultured in vitro. Once it reaches a specific stage of embryonic development, it is transferred into the recipient female.

Technical Explanation

The pipets for this technique must be specially made from extremely thin glass, using a pipet puller and a micro-forge. The tip must be absolutely flat, or it will impede injection into the embryo. The DNA injection pipet should have an internal diameter of about 1 µm or less. Gloves covered with talc should be avoided, as the powder can clog the pipets and lead to failure of the embryos. The embryo being worked on is first viewed under very low magnification. Using a pipet, the embryo is gently drawn by suction onto the tip and held there. The tip of the injection pipet is brought to exactly where the pronucleus is and pushed through the cell membrane and the cytoplasm. It is often hard to see whether the pipet tip has passed through the pronuclear membrane. The most reliable sign of a successful injection is that the pronucleus swells (the volume involved is around 1 pL). After injection, the embryo is moved to the far end of the dish so that the next one can be injected. When a batch of embryos is complete, it is incubated and then evaluated after a period of time. The viable embryos are then transferred to a female’s oviduct.

Retrovirus-mediated gene transfer

Retroviruses are used as vectors that carry genetic material in the form of RNA rather than DNA. The genetic material is transferred into the host cell, resulting in a chimera, an organism that carries genes in addition to its own. These chimeras are inbred for as many as twenty generations until homozygous offspring are formed, carrying two copies of the same transgene in all of their cells.
This approach was shown to work in 1974, when a virus was used as a vector in mouse embryos, which went on to carry the desired transgene.

Embryonic stem cell mediated gene transfer

The technique involves isolating totipotent stem cells (stem cells that can develop into any type of specialized cell) from embryos. The desired gene is inserted into the stem cells. The stem cells containing the DNA of interest are then incorporated into the host’s embryo, resulting in a chimeric animal. A major benefit of this technique is that transgenes can be tested at the molecular level in the cells, which saves a great deal of time because one does not have to wait for living offspring.

Stem Cells (in Further Detail)

What exactly are stem cells?

Stem cells are now a hot topic for research because of their seemingly endless potential. They are cells that can develop into many different types of cells in the body during early life and growth. Stem cells also serve as an internal repair system, dividing essentially without limit to replenish damaged cells for the duration of the organism’s life span. When a stem cell divides, each new cell can either remain a stem cell or become a more specialized cell, such as a liver cell, a white blood cell, or a brain cell.

How are stem cells set apart from other types of cell?

Two primary properties set stem cells apart. First, stem cells are unspecialized cells that can renew themselves through cell division, sometimes after long periods of inactivity. Second, under the right physiological conditions, they can be induced to become tissue-specific or organ-specific cells with distinctive functions. In some organs, such as the gut and bone marrow, stem cells divide regularly to replace injured or worn-out cells.

In the past, researchers mainly worked with two categories of stem cells from humans and animals: embryonic stem cells and somatic stem cells, which are also called adult stem cells. The first embryonic stem cells were derived from mouse embryos, as described above, around 1981. Human embryonic stem cells were later derived from embryos created for reproductive purposes, which was made possible by the intense research done with mouse embryos. More recently, a third category has been identified: induced pluripotent stem cells (iPSCs). These are adult cells that have been genetically reprogrammed to behave like stem cells.

Why are stem cells valuable for living organisms?

Typically in blastocysts, which are embryos only 3–5 days old, the inner cells give rise to all of the cells of the organism’s body, including specialized cells and organs such as the skin, heart, lungs, reproductive cells (sperm and eggs), and other tissues. In adult tissues such as bone marrow and muscle, stem cells can replace cells that are damaged, affected by disease, or simply worn out.

Stem cell research continues to add new insights into how organisms develop from cells and how damaged cells are repaired. Stem cells may also be used to screen new drugs before they are brought to market, and to better understand both cell development and the irregularities that cause birth defects.

Special characteristics of stem cells

Stem cells are distinct from other cells of the body. All types of stem cells have three defining characteristics: they can divide and renew themselves for long periods, they are unspecialized, and they can give rise to many different cell types. Each of these properties is explained in more depth below.

The first property is the ability of stem cells to divide and renew themselves for long periods. Muscle and nerve cells typically do not replicate themselves, but stem cells can do so many times. Stem cells that replicate in the laboratory for months can yield millions of cells. If the resulting cells remain unspecialized like the parent cells, the cells are said to be capable of long-term self-renewal. Two questions about long-term self-renewal are of particular interest: why embryonic stem cells can replicate for an entire year in the laboratory without differentiating while non-embryonic stem cells usually cannot, and which factors in living organisms regulate stem cell replication and self-renewal.
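To get a feel for how repeated division leads to millions of cells, the following back-of-the-envelope sketch counts how many rounds of doubling are needed to reach a target population. The starting number of 100 cells is an assumption made only for illustration, the function name `doublings_needed` is hypothetical, and the model ignores cell death and differentiation.

```python
def doublings_needed(start_cells, target_cells):
    """Count rounds of division needed to grow from start_cells to target_cells.

    Assumes every cell divides in every round and none are lost.
    """
    if start_cells <= 0:
        raise ValueError("start_cells must be positive")
    cells = start_cells
    rounds = 0
    while cells < target_cells:
        cells *= 2   # each round of division doubles the population
        rounds += 1
    return rounds


# Hypothetical starting population of 100 cells growing to one million.
print(doublings_needed(100, 1_000_000))  # 14
```

Because growth is exponential, even a small starting population needs only a modest number of divisions to reach the millions, provided the cells keep dividing without differentiating.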

Finding out how stem cell division is regulated during normal development may help explain how irregular cell division leads to cancer. It could also lead to more efficient growth of embryonic and non-embryonic stem cells in the laboratory. Keeping stem cells unspecialized requires special conditions, which are set up by signals in the cells that induce them to replicate and remain unspecialized.

Stem cells are unspecialized, so they cannot carry out the specialized tasks performed in specific tissues or organs. For example, a stem cell cannot work with other cells to carry oxygen molecules through the body as red blood cells do. What is unique about stem cells is their potential to give rise to specialized cells such as nerve cells, brain cells, or muscle cells.

Stem cells can give rise to specialized cells. The process by which unspecialized stem cells become specialized is known as differentiation. Differentiation has multiple steps, and the cell becomes more specialized at each one. Many factors help to control this progression. Signals both inside and outside the cell move the stem cell through each stage. Outside signals include physical contact with neighboring cells, chemicals given off by other cells, and the presence of specific molecules in the immediate environment. Inside signals are controlled by the cell’s genes, which are carried on its DNA and direct what the cell does. Understanding how stem cells are regulated can help researchers grow cells or even tissues for drug screening and cell therapy, which is what makes stem cells so special and such an important subject of research.

Different types of stem cells

Embryonic stem cells
These cells come from embryos. Most of them are derived from eggs that were fertilized in vitro, that is, in the laboratory, and then donated to laboratories for research. The embryos from which human stem cells are derived are usually about 4–5 days old and are in the blastocyst form, which is essentially a hollow ball of cells. The blastocyst has three structures: the trophoblast, the embryoblast (or pluriblast), and the blastocoel. The trophoblast is the layer that surrounds the blastocoel, the blastocoel is the hollow cavity of the blastocyst, and the embryoblast is the mass of cells that will turn into the defined structures of the fetus.

How are embryonic stem cells identified?

While embryonic stem cells are being created, they are tested at various checkpoints to see whether they have the properties that allow them to be called embryonic stem cells. This process is known as characterization. There is no universally agreed test for identifying embryonic stem cells, but several tests can be used. The first is to grow the stem cells for a number of months, which shows that the cells are capable of long-term growth and self-renewal. The cells are observed under a microscope to confirm that they are in good condition and have not differentiated. A second test is to look for transcription factors that are characteristic of undifferentiated cells, specifically Nanog and Oct4. Transcription factors help turn genes on or off when needed, which is integral to cell differentiation and embryonic development; Nanog and Oct4 help keep stem cells undifferentiated. A third test is to use specific techniques to look for cell surface markers that undifferentiated stem cells display. A fourth test is to examine the chromosomes under a microscope to check whether they are damaged or whether their number has changed. A fifth test is to see whether the cells can be grown again after being frozen and thawed. The sixth and last test is to determine whether the human embryonic stem cells are pluripotent. This may be done by allowing the cells to differentiate spontaneously in the laboratory, by manipulating the cells so that they differentiate to form cells of the three germ layers, or by injecting the cells into a mouse with an impaired immune system to test for the development of a teratoma, a benign tumor. Because the immune system of the mouse does not reject the injected stem cells, their growth and differentiation can be observed. The tumor contains a mixture of differentiated or partly differentiated cell types, showing that embryonic stem cells can differentiate into many different types of cells.

How does differentiation of embryonic stem cells occur?

When embryonic stem cells are kept under the right conditions, they remain in the unspecialized state. When the cells are allowed to aggregate and form what are called embryoid bodies, they begin to differentiate spontaneously and can form many different types of cells. This shows that the sample of embryonic stem cells is in good condition, but the method is not an efficient way to produce particular cell types.

Mouse embryonic cells that are directed in differentiation, photo by Terese Winslow, [37]

To generate cultures of specific types of differentiated cells, such as blood cells or brain cells, researchers control the differentiation of the embryonic stem cells. They can modify the chemical composition of the culture medium, the surface of the culture dish, or the cells themselves by inserting specific genes. After a long period of trial and error, some standard protocols have been established for this directed differentiation into certain cell types. If directed differentiation of embryonic stem cells can be carried out successfully, the resulting cells could be used to treat diseases including Parkinson’s disease, Duchenne’s muscular dystrophy, heart disease, vision loss, and traumatic spinal cord injury.

Adult stem cells
Adult stem cells are thought to be undifferentiated cells found among the differentiated cells of a tissue or organ. They can renew themselves and can differentiate to give some or all of the major specialized cell types of that tissue or organ. The main job of adult stem cells in an organism is to maintain and repair the tissues in which they are located. Unlike embryonic stem cells, which are defined by where they come from, the origin of some adult stem cells in mature tissues is still being researched.

As more research is conducted on adult stem cells, they are being found in many more tissues than previously thought. This has opened up the possibility of using adult stem cells for transplants. Blood-forming hematopoietic stem cells from bone marrow are already widely used in transplants. It is now evident that stem cells also exist in the heart and the brain. If the differentiation of these stem cells can be controlled correctly, it may be feasible to use them for transplantation-based therapies.

Adult stem cells were first discovered in bone marrow, which contains two kinds: hematopoietic stem cells and bone marrow stromal stem cells, which were discovered second. The stromal cells are few in number but can generate fat, bone, cartilage, and fibrous connective tissue.

Location of adult stem cells and their role?

Adult stem cells are located in many different organs and tissues, including bone marrow, brain, blood vessels, skin, teeth, heart, liver, ovarian epithelium, and testis. Within each tissue, stem cells live in a particular area. In many tissues, some stem cells may be pericytes, the cells that make up the outer layer of small blood vessels. Stem cells usually do not divide for long periods until they are prompted to by the need for normal tissue maintenance, by injury, or by disease.

Normally the number of stem cells in each tissue is small, and once they are removed from the body their ability to divide becomes limited, which makes it difficult to produce large numbers of them. As a result, researchers are looking for improved ways to grow large quantities of adult stem cells in the laboratory so that specific cell types can be created to treat diseases and injuries. Possible uses include regenerating bone from cells of the bone marrow stroma, making insulin-producing cells to help treat type 1 diabetes, and regenerating heart muscle that has been badly damaged by a heart attack.

Identification of adult stem cells

Researchers typically use several methods to identify adult stem cells. One is to tag the cells in living tissue with molecular markers and then determine which specialized cell types they produce. Another is to remove the cells from a living organism, tag them in the laboratory, and transplant them into another organism to observe whether the cells repopulate their tissue of origin.

One of the primary things that must be shown is that a single adult stem cell can produce an entire colony of genetically identical cells that in turn give rise to the correct differentiated cell types of that tissue. To confirm experimentally that the cells are indeed adult stem cells, researchers show that they can produce genetically identical cells, that they can rebuild the tissue after being transplanted into another animal, or both.

Adult cell differentiation

Differentiation of Both Hematopoietic and stromal stem cells, photo by Terese Winslow, [38]

Normal differentiation
Adult stem cells can divide when needed and produce mature cells that have the shape, structure, and function characteristic of the tissue in which they reside. Some examples follow. Hematopoietic stem cells produce all types of blood cells, including B lymphocytes, T lymphocytes, natural killer cells, basophils, monocytes, and red blood cells. Mesenchymal stem cells produce a variety of cell types, including bone cells, fat cells, and cartilage cells. Neural stem cells of the brain can produce neurons, astrocytes, and oligodendrocytes. Epithelial stem cells reside in the lining of the digestive tract and produce cells including goblet cells, enteroendocrine cells, and absorptive cells. The stem cells of the skin reside in the basal layer of the epidermis and produce keratinocytes, which form a protective layer.

Particular adult stem cells may differentiate into cell types of organs or tissues other than those predicted by their lineage, such as blood-forming cells differentiating into heart muscle cells. This type of differentiation is known as transdifferentiation. Its occurrence in human beings has still not been fully proven. One possible explanation for observations of this kind is the fusion of a donor cell with a recipient cell. Another is that the injected stem cells give off factors that prompt the recipient’s own stem cells to begin the repair process. Where transdifferentiation has been observed, it involves only a very small number of cells.

Scientists have shown that some adult cells can be reprogrammed into different cell types in the laboratory using precise genetic alterations. This may offer a way to replace cells that have been damaged or lost because of disease. In diabetes, for example, insulin-producing pancreatic beta cells can be recreated by reprogramming other cells of the pancreas. The reprogrammed cells were very close in appearance and shape to actual pancreatic beta cells. When put into mice whose own pancreatic beta cells were not working, the reprogrammed cells improved the regulation of blood sugar levels.

Adult somatic cells can be reprogrammed to behave like embryonic stem cells by introducing embryonic genes, and cells of this type are known as induced pluripotent stem cells (iPSCs). With iPSCs, it may be possible to introduce cells that are accepted by the recipient and will not be rejected, which is important when generating new tissue. However, iPSCs remain under study until they can be produced so that they reliably keep to their designated cell type.

Similarity among stem cells

Human embryonic and adult stem cells have both similarities and differences with respect to their use in regenerative therapy or the repair of damaged tissue and cells. A primary difference is the number and kind of differentiated cell types each can become. Embryonic stem cells can become all of the cell types in the body because they are pluripotent. Adult stem cells are thought to be limited to differentiating into the cell types of their tissue of origin.

A noteworthy difference is that embryonic stem cells can be grown with relative ease in the laboratory. Adult stem cells are rare in mature tissues, so finding them can be difficult, and unlike embryonic stem cells, there is not yet a reliable way to expand them in the laboratory. This difference matters because cell replacement therapies often require large numbers of cells in order to work properly.

Moreover, tissues derived from embryonic and adult stem cells may differ in their likelihood of rejection after injection or transplantation. Embryonic stem cells have not yet been studied extensively in this respect, as testing with cells derived from hESCs was only recently approved by the FDA (Food and Drug Administration). Adult stem cells and the tissues formed from them are presumed to be less likely to be rejected after transplantation. This is because the patient’s own cells can be expanded in the laboratory, induced to differentiate into a specific cell type, and then re-injected into that same patient. Using the patient’s own cells and the tissues derived from them greatly decreases the probability of rejection by the immune system. This is a major benefit, since rejection can otherwise be controlled only with immunosuppressive drugs, which have side effects of their own.

Uses for stem cells

Using adult stem cells to repair heart muscle cells, photo by Terese Winslow, [39]

There are many uses for stem cells, both in research and in the clinic. Studying human embryonic stem cells will yield information about human development. The principal goal is to determine how undifferentiated stem cells become differentiated cells and later form organs and tissues; gene regulation is central to this process. Many of the most serious abnormalities in humans result from aberrant cell division and differentiation. Research on iPS cells has shown that specific factors are associated with the genetic and molecular signaling involved, but introducing these factors into cells in a way that properly controls these processes will require special techniques.

Human stem cells may also be used to screen new drugs. Differentiated cells derived from stem cells can be used to test whether a drug is harmful. A familiar example is the use of cancer cells to screen potential anti-tumor drugs. To check whether drugs actually work, the test conditions must be very similar from one drug to the next, which requires precise control over the cell type into which the stem cells differentiate.

Another widespread use of stem cells is to generate cells and tissues that repair damaged or diseased tissue in cell therapy. Such regenerated cells and tissues could aid in treating conditions such as Alzheimer’s disease, stroke, heart disease, osteoarthritis, and spinal cord injury.

Checklist for successful transplant of stem cells

1) Proliferate extensively and produce sufficient quantities of tissue
2) Differentiate into the desired cell type
3) Survive in the recipient after transplantation
4) Integrate into the surrounding tissue after transplantation
5) Function correctly for the duration of the recipient’s life
6) Cause no harm to the recipient

Ethical conflicts with stem cells?

The main concern with stem cells involves human embryonic stem cells, which have generated a great deal of public interest and controversy. Pluripotent stem cells, which can become many different types of cells in the human body, are derived from human embryos that are a few days old. The major debate concerns when life technically begins, whether embryos or even fetuses should be considered alive in that sense, and who has the authority to decide such a question.

United States’ position on stem cells

In 2001, the Bush administration offered federal funds for research on human embryonic stem cells provided that three criteria were met. On March 9, 2009, President Barack Obama issued Executive Order 13505, Removing Barriers to Responsible Scientific Research Involving Human Stem Cells. This order allowed the National Institutes of Health (NIH) to adopt a different strategy for human stem cell research. It also revoked Executive Order 13435 and the presidential statement of August 9, 2001.


Arlan Richardson. Photo. “Use of Transgenic mice in Aging Research.” 1997

Gordon, Jon W. Photo. “Transgenic Technology and Laboratory Animal Science.” 1997.

“2009 Executive Order Disposition Tables: Removing Barriers to Responsible Scientific Research Involving Human Stem Cells.” . 11 March 2009. 2 December 2009.

Margawati, Endang Tri. “Transgenic Animals: Their Benefits To Human Welfare.” ActionBioscience. Jan 2003. 15 Nov 2009

“Stem Cell Basics.” In Stem Cell Information. Bethesda, MD: National Institutes of Health, U.S. Department of Health and Human Services, 2009. . 3 December 2009.

“Transgenic Animals and Genetic Research.”. 16 Nov 2009.

“What are Some Issues in Stem Cell Research.”. 9 November 2009. 3 December 2009.

Winslow, Terese. Photo. 2001. 2 Dec. 2009.

Zwaka, Thomas P. “Use of Genetically Modified Stem Cells in Experimental Gene Therapies.”

Zwaka, Thomas P. Photo. “Use of Genetically Modified Stem Cells in Experimental Gene Therapies.”

Transgenic plants are genetically engineered to have genes from other organisms inserted into their genome. They are a class of genetically modified organisms (GMOs). The introduced genes do not have to come from the plant kingdom; they can also come from animals, viruses, or bacteria. Uses of exogenous gene introduction include virus immunity, a replacement for pesticides, the ability to grow in acidic soil, and greater nutritional content.

Making Transgenic Plants

Breeding transgenesis cisgenesis

Transgenic plants are constructed by inserting genes from other organisms into the host plant’s DNA. For this to happen, the desired gene must first be isolated and cloned. Several changes must then be made to the gene so that it can be inserted into the plant and expressed effectively. First, a promoter sequence must be added. The promoter acts as an on/off switch that controls where and in response to what cues the gene is expressed. Second, the coding sequence itself must sometimes be modified. For example, the Bt gene for insect resistance has a higher proportion of A-T nucleotide pairs than plant genes, which tend to have more G-C pairs. A-T pairs can be replaced with G-C pairs in a manner that does not significantly change the amino acid sequence, leading to greater production of the gene product in plant cells. Third, a terminator sequence must be added to signal that the end of the gene has been reached. Finally, a selectable marker gene must be included to identify plant cells that have successfully integrated the transgene.
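The idea of changing nucleotides without changing the encoded protein can be sketched in a few lines of Python. This is only an illustration, not the procedure used for any real Bt construct: the function names are hypothetical, the rule "pick the synonymous codon with the most G or C bases" is an assumed simplification (real codon optimization relies on the host plant's codon-usage tables), and the input is assumed to be an uppercase coding sequence whose length is a multiple of three.

```python
# Toy GC enrichment by synonymous codon substitution (illustrative only).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def gc_count(seq):
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def translate(dna):
    """Translate a coding sequence codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def enrich_gc(dna):
    """Replace each codon with the synonymous codon richest in G/C.

    Assumption: maximizing G/C per codon stands in for real
    host-specific codon optimization.
    """
    recoded = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        synonyms = sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
        recoded.append(max(synonyms, key=gc_count))
    return "".join(recoded)


original = "ATGAAATTTGAT"        # hypothetical A-T rich fragment
recoded = enrich_gc(original)
print(original, translate(original), gc_count(original))
print(recoded, translate(recoded), gc_count(recoded))
```

The protein sequence printed for both fragments is the same, while the count of G and C bases rises, which is the property the modification relies on.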

Agrobacterium System

Two methods commonly used to transform plants are the Agrobacterium method and the “Gene Gun” method. The Agrobacterium method uses Agrobacterium tumefaciens, a soil-dwelling bacterium that can infect plant cells by introducing transfer DNA (T-DNA) from a tumor-inducing (Ti) plasmid (i.e. a DNA molecule that can replicate independently of chromosomal DNA and is often circular) into the host’s nuclear DNA. The bacterium belongs to the family Rhizobiaceae and is responsible for many tumors found in plants. The Ti plasmid contains the T-DNA as well as a series of vir (virulence) genes that direct the infection process.

Agrobacterium tumefaciens can therefore be used as a vector for gene transfer into plants. First, a hybrid plasmid that carries only the T-DNA from a Ti plasmid is cut open with a restriction enzyme and a foreign gene is inserted, creating a recombinant plasmid. The recombinant plasmid is then transferred into an Agrobacterium tumefaciens cell that contains a Ti plasmid from which the T-DNA has been removed. The Agrobacterium carrying the engineered plasmid is then used to infect a plant, and it integrates the T-DNA with the foreign gene into the plant genome. For this to work, the DNA must be able to penetrate into the plant cells. This is often done with electroporation, in which brief high-voltage electrical pulses are administered to naked protoplasts (i.e. plant cells stripped of their cell walls) mixed with the DNA. The electrical pulses open pores in the plasma membrane, allowing the DNA to enter the protoplast, which can then be grown into a mature plant by treating it with hormones.

In the “Gene Gun” method, gold or tungsten microspheres (about 1 micrometer in diameter) are coated with DNA or RNA carrying the gene of interest. The microspheres are then accelerated into undifferentiated target cells in a petri dish. Once inside the cells, the DNA coating the microspheres is released and can be incorporated into the host plant genome. An advantage of the Agrobacterium method is that a high percentage of transformed plants carry a single copy of the T-DNA. In addition, an abundance of vector systems is available for carrying out this method.

Biolistic Method

This method delivers DNA-coated microprojectiles by accelerating them into the cells of interest. The microprojectiles are usually made of tungsten or gold. They are accelerated either by a gunpowder charge or by a burst of high-pressure helium. Plants made using the biolistic method often carry multiple copies of the introduced gene, which is still able to segregate in a Mendelian pattern. This method helps increase the diversity seen in plants. The biolistic method has some advantages over the Agrobacterium method: plants that undergo bombardment remain fertile, it is the only reliable method for transforming the chloroplast, and it does not require a transformation vector.

Importance of Transgenic Plants

The new methods developed to transform plants have opened a new field of interest. Transgenic plants are used to solve many problems in the agricultural sector. In addition, transgenic plants can be used in the medical field.

Nutrients of Transgenic Plants

When people go to the supermarket, they tend to avoid fruit that is soft or overripe. A major problem in agriculture is that fruit continues to ripen, and therefore softens, during processing and transport. Using the methods for creating transgenic plants, scientists have been able to slow the ripening process. Three companies have applied this technology to slow the ripening of tomatoes, and other companies hope to do the same for fruits such as mangoes and papayas.

Cereal grains and legume seeds are a major source of protein for many people. However, they often lack certain amino acids, such as lysine in cereal grains and methionine in legume seeds. Much effort has gone into creating seeds with higher nutritional value. Transgenic tobacco and canola seeds have shown a 33% increase in methionine. In addition, the nutritional value of potatoes has been increased by transforming them with AmA1, a gene from amaranth.

Increasing the nutritional value of plants and fruits can help address many malnutrition problems and diseases. Vitamin A deficiency is a major problem in Asia; it affects around 124 million children and causes blindness. The main staple in Asia is rice, but rice does not contain any vitamin A. Research is under way to develop rice that is rich in vitamin A. Scientists have identified the genes that encode the enzymes of β-carotene (pro-vitamin A) synthesis in the endosperm of transgenic rice seed, and they hope to use this information to engineer rice that supplies vitamin A.

Uses of Transgenic Crops

The use of transgenic plants for pathogen resistance has received the most attention from popular media, and GMOs have been a topic of debate since their introduction in the mid-1990s. The two best known cases are virus immunity in papayas and insect resistance in crops such as corn, conferred by a gene from Bacillus thuringiensis (Bt). The papaya ringspot virus (PRSV), which severely damages papaya trees, was taking a major toll on the papaya industry in Hawaii. Genes for the coat protein of the virus were inserted into papaya tissue using the gene gun. Some of the papaya cells incorporated the viral genes into their DNA, giving the plants immunity to PRSV. This saved the Hawaiian papaya industry. The introduction and use of Bt crops has been even more publicized. The Bt gene codes for the Cry proteins, toxins that specifically target and kill the larvae of butterflies and moths. Crops engineered with this gene, such as corn, rice, and potatoes, produce the Cry proteins and have proved very effective at stopping insect pests such as the European corn borer caterpillar. The protein is very selective: it does not harm other insects (e.g. beetles, flies, bees, wasps) and is considered safe for human consumption. The use of the Bt endotoxin has greatly reduced the use of pesticides on crops. However, pests can evolve resistance to Bt corn, so refuge crops that do not contain the toxin are planted to slow the evolution of caterpillar resistance to the Cry proteins.

GMOs have also been engineered to improve the nutritional quality of food, to extend shelf life by delaying senescence, to allow corn to grow in acidic soil, to protect strawberries from cold temperatures, and for a variety of other uses.


Bessin, Ric. Bt-Corn: “What It Is And How It Works”. University of Kentucky College of Agriculture. January 2004.

Transgenic Crops: An Introduction and Resource Guide. Colorado State University Soil and Crop Sciences. March 2006.

Lipps G (editor). (2008). Plasmids: Current Research and Future Trends. Caister Academic Press.
Raven, Peter. “Biology of Plants”. W.H. Freeman and Company. New York. 2005.
“Harvest of Fear” (Film) – Nova. PBS. 2004

Peña, Leandro. Transgenic Plants: Methods and Protocols. Totowa, NJ: Humana, 2005. Print.


Difference in absorbance of ultraviolet light between single stranded and double stranded DNA

Hypochromicity describes a material’s decreasing ability to absorb light. Hyperchromicity is the material’s increasing ability to absorb light.

The hypochromic effect describes the lower absorbance of ultraviolet light by double stranded DNA compared with its single stranded counterpart. In double stranded DNA the bases are stacked, which contributes both to the stability and to the hypochromicity of the DNA.

When double stranded DNA is denatured, the stacked bases come apart and the DNA becomes less stable. It also absorbs more ultraviolet light, since the bases are no longer stacked or held in hydrogen bonds and are therefore free to absorb light. Ways to denature DNA include raising the temperature, adding a denaturant, and increasing the pH.

Importance of Hypochromic Effect

Nucleic acid melting curve showing hyperchromicity as a function of temperature

Measuring the absorbance of light is important for monitoring the melting and annealing of DNA. At the melting temperature (Tm), half of the DNA is denatured and half is double stranded. When the temperature is lowered below the Tm, the denatured strands anneal back into double stranded DNA. When the temperature is above the Tm, the DNA is denatured.

Because melting occurs abruptly over a narrow temperature range, monitoring the absorbance of the DNA at various temperatures reveals the melting temperature. Knowing the temperature at which DNA melts and anneals, scientists can separate DNA strands and anneal them with other DNA strands. This is important in creating hybrid DNAs, which consist of two DNA strands from different sources. Since DNA strands can anneal only if their sequences are similar, the formation of hybrid DNA can indicate similarities between the genomes of different organisms.
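As a rough illustration of how a melting temperature might be read off an absorbance curve, the Python sketch below normalizes the absorbances to a fraction denatured and interpolates the temperature at which that fraction reaches one half. The function names, the sigmoidal test curve, and the use of the lowest and highest readings as the double stranded and single stranded baselines are all assumptions made for this example; real melting-curve analysis usually fits sloping baselines.

```python
import math


def simulate_curve(temps, tm, width, a_ds=1.0, a_ss=1.4):
    """Hypothetical sigmoidal melting curve, used only to make test data.

    a_ds and a_ss are the assumed absorbances of fully double stranded
    and fully single stranded DNA.
    """
    return [a_ds + (a_ss - a_ds) / (1.0 + math.exp(-(t - tm) / width))
            for t in temps]


def fraction_denatured(absorbances):
    """Rescale absorbances so the lowest reading is 0 and the highest is 1."""
    low, high = min(absorbances), max(absorbances)
    return [(a - low) / (high - low) for a in absorbances]


def melting_temperature(temps, absorbances):
    """Temperature at which half of the DNA is denatured (linear interpolation)."""
    theta = fraction_denatured(absorbances)
    points = list(zip(temps, theta))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 < 0.5 <= f1:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve never crosses the half-denatured point")


def hyperchromicity_percent(absorbances):
    """Percent rise in absorbance from the lowest to the highest reading."""
    low, high = min(absorbances), max(absorbances)
    return 100.0 * (high - low) / low


temps = list(range(40, 101))                       # degrees Celsius
curve = simulate_curve(temps, tm=72.5, width=2.0)  # made-up parameters
print(round(melting_temperature(temps, curve), 2))
print(round(hyperchromicity_percent(curve), 1))
```

The same two functions could be applied to measured absorbance readings, provided the temperature range is wide enough to include both the fully annealed and the fully denatured plateaus.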


  • Berg, Jeremy & Tymoczko, John. (2006). Biochemistry 6th edition. W.H. Freeman and Company.

Because DNA contains all of the hereditary information and the instructions for protein production, it is crucial that it undergo very few changes. DNA is constantly bombarded by radiation and chemical mutagens that can cause mutations. However, the rate of mutation is very low because of the four main types of DNA repair.

DNA Injury Detection and Signaling

The human genome is under constant toxic stress from normal cellular conditions such as free radicals or errors in DNA replication, as well as extrinsic conditions such as UV radiation. To combat these stresses and properly maintain the genome, the DDR pathway, or DNA damage response pathway has evolved. This pathway serves to detect errors or abnormalities, propagate the detection signal, and activate systems to correct the issue. If the damage is irreparable, the cell undergoes apoptosis, or programmed cell death, to avoid passing on the potentially lethal errors in DNA. Cells come across DNA damage constantly, so the DDR pathway is vital to cell survival.

The most lethal form of DNA damage comes from ionizing radiation, which causes breaks in the double strand. The repair protein RAD51 quickly collects into foci at sites of DNA damage. It has been suggested that damage-induced phosphorylation of the histone variant H2AX marks the sites of DNA breaks; many other repair proteins also collect at these sites of H2AX accumulation. Mice lacking H2AX show immune system degradation and an increased incidence of tumors.

The major regulators of the cellular response to DNA damage are the kinases ATM (ataxia telangiectasia mutated) and ATR (ataxia telangiectasia and Rad3-related), which regulate the phosphorylation of over 700 proteins. This phosphorylation is the initial step in signaling DNA damage.

“Structural dynamics in DNA damage signaling and repair” is an article by J. Jefferson P. Perry, Elizabeth Cotner-Gohara, Tom Ellenberger, and John A. Tainer. It examines DNA damage responses with a focus on the role of proteins in these pathways. DNA is continually damaged by metabolites and toxicants, so DNA repair and damage responses are essential to life. Repair proceeds in three steps: the damage is first detected, then removed, and finally replaced with the correct DNA sequence. The pathway regenerates a 3′ terminus that is extended by a DNA polymerase using the undamaged strand as the template, and the repair is completed by a ligase that reseals the DNA backbone. Because this process generates toxic intermediates, there is strong “genetic selection” for coordinating the steps as the DNA is restored. Protein structures are found to be linked to the coordination of steps within the DNA damage response and repair pathways. This is important because these proteins are also related to the DNA replication process.

When different methods are combined, the dynamics of DNA repair complexes can be studied in great detail. Such methods include X-ray crystallography, NMR, small-angle X-ray scattering (SAXS), and hydrogen-deuterium exchange mass spectrometry (DXMS). Together they provide information from the nanoscale down to the atomic level. For instance, SAXS gives information on the flexibility of macromolecules in solution, as well as on entire pathways and their interactions in solution. DXMS reveals the conformational changes that take place during repair at a resolution approaching a single amino acid. Combining different structural biochemistry methods therefore helps scientists discover how DNA repair and the damage response system are coordinated. Current studies found that the “Transition between different enzyme conformations can involve non-native interactions that lower the energy barrier for inter-conversion between different states” (1). This finding is important because it connects conformational changes in DNA repair complexes to the biological outcomes of those changes. As the quotation states, interactions formed during a change in enzyme conformation lower the energy barrier for converting between states during the restoration of damaged DNA. Conversely, altering the normal flexibility and stability of repair proteins can cause serious genetic diseases. Changes in DNA and ATP binding have been linked to cancer, and defects in the flexibility and stability of the DNA repair framework have been linked to aging disorders such as Cockayne syndrome and trichothiodystrophy (TTD).

This damage repair is carried out by the multi-domain nucleotide excision repair (NER) helicase. It removes bulky, helix-distorting lesions by cutting them out of the strand that needs repair. This is a very precise process in which only the damaged strand is removed, because the undamaged strand must serve as the template for repair. The NER proteins assemble in a way that allows the damaged site to be verified before the DNA backbone is actually cut. One example is yeast Rad4, a multi-domain protein that binds to the distorted part of the helix being repaired by NER. Binding of the protein has been shown to stabilize the distorted DNA structure. Observations show that Rad4 inserts a beta hairpin through the DNA helix and displaces its bases. One surprising discovery was that Rad4 binds not to the damaged strand but to the undamaged one. The helical axis is offset by the damaged strand, producing a bend that increases the surface over which Rad4 interacts with the DNA in the regions neighboring the hairpin. This extended interaction stabilizes the damaged DNA, although its bases are now exposed to the solvent. The stabilization assists NER as it repairs the damaged strand.

Another important component of the DNA damage response and repair is the base excision repair (BER) pathway. Unlike NER, BER can detect and remove single nucleotides carrying the smallest of modifications, such as the addition of a single methyl group. It is therefore extremely efficient at repairing small lesions. In BER, the oxidative damage-specific glycosylases OGG1 and MutM interact with 8-oxoG bases. Compared with guanine, 8-oxoG carries a hydrogen-bond donor at N7 and an acceptor at O8. These groups interact with OGG1 and allow selective excision of the damaged base. The complex formed is known as a pseudo-Michaelis complex. Overall, a variety of mechanisms in DNA damage response and repair have been observed by combining methods ranging from NMR and X-ray crystallography to SAXS.

Below is an image of a process of DNA repair where the DNA ligase I is repairing a chromosomal damage.

Role of 9-1-1 in DNA Repair

DNA repair consists of detecting existing damage and then actually repairing it. 9-1-1 is a heterotrimeric protein (it consists of three subunits, at least one of which differs from the other two) that wraps around DNA to initiate the recruitment of specific checkpoint proteins and temporarily halts the cell cycle. More specifically, it causes phosphorylation of Sc-Mec1/Hs-ATR (ataxia telangiectasia and Rad3-related), where the prefixes Sc- and Hs- refer to Saccharomyces cerevisiae (a eukaryotic species) and Homo sapiens, respectively. The protein kinases Chk1 and Sc-Rad53/Hs-CHK2 are then activated, resulting in arrest of the cell cycle at the G1/S, intra-S, or G2/M phase. This activation also leads to the accumulation of repair gene products, stabilization of the replication fork, and decreased production of cyclins (proteins that drive the cell cycle forward). 9-1-1 works with Sc-Cdc28 to selectively accumulate Sc-Ddc2. The presence of Sc-Ddc2/Hs-ATRIP, Sc-Mec1/Hs-ATR, and 9-1-1 together activates the checkpoint regardless of whether DNA damage has been detected.

Mismatch Repair

Mismatch repair corrects mistakes in nucleotide pairing that escape the proofreading ability of DNA polymerase during replication. Incorrectly paired nucleotides deform the secondary structure of DNA. The MSH2-MSH6 dimer binds to the mismatch on the strand. MLH1, an endonuclease, then binds to the MSH dimer and nicks the strand. Exonucleases degrade the region between the nick and the mismatch, DNA polymerase delta inserts the correct nucleotides, and DNA ligase reconnects the strand. In this way the repair enzymes cut out the distorted portion of the new DNA strand and use the old DNA strand as a template to fill in the gap. In E. coli, the mismatch repair enzymes recognize the old DNA strand by the presence of methyl groups on certain sequences. In eukaryotic cells, it is not known how the enzymes distinguish between the old and new DNA strands.
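The logic of using the old strand as the template can be illustrated with a toy Python model. It is a deliberate simplification and not a description of the enzymatic mechanism: both strands are written position by position so that index i of one pairs with index i of the other (strand polarity is ignored), only Watson-Crick pairs count as correct, single bases are replaced rather than an excised stretch, and the function names are hypothetical.

```python
# Watson-Crick partners; anything else at a position counts as a mismatch.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template, new_strand):
    """Indices where the new strand does not pair with the template (old) strand."""
    return [i for i, (old, new) in enumerate(zip(template, new_strand))
            if COMPLEMENT[old] != new]


def repair_new_strand(template, new_strand):
    """Rewrite mismatched positions of the new strand from the template.

    The template is trusted, mirroring the way the old strand is treated
    as correct during mismatch repair.
    """
    repaired = list(new_strand)
    for i in find_mismatches(template, new_strand):
        repaired[i] = COMPLEMENT[template[i]]
    return "".join(repaired)


old_strand = "ATGCCGTA"   # hypothetical parental strand
new_strand = "TACGGTAT"   # hypothetical daughter strand with one error
print(find_mismatches(old_strand, new_strand))
print(repair_new_strand(old_strand, new_strand))
```

Swapping the two arguments would "repair" the old strand to match the error instead, which is why distinguishing old from new strands matters to the cell.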

Direct Repair

In direct repair, the damaged nucleotide is not replaced; instead, it is chemically restored to its correct structure. UV light from the sun creates pyrimidine dimers by forming covalent bonds between adjacent pyrimidines. Some eukaryotic cells have an enzyme called photolyase, which uses the energy of light to break the covalent bonds of the pyrimidine dimer.
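Because pyrimidine dimers form only between neighboring pyrimidines on the same strand, the positions where they could arise can be listed directly from a sequence. The short Python sketch below does this for one strand; the function name is hypothetical, and the code says nothing about how likely a dimer is at each site, which depends on the particular bases and on the UV dose.

```python
PYRIMIDINES = {"C", "T"}


def dimer_prone_sites(dna):
    """Return (index, dinucleotide) for every pair of adjacent pyrimidines."""
    return [(i, dna[i:i + 2])
            for i in range(len(dna) - 1)
            if dna[i] in PYRIMIDINES and dna[i + 1] in PYRIMIDINES]


# Hypothetical single-strand sequence.
print(dimer_prone_sites("AGTTCAG"))
```

Each reported site is a candidate for the covalent linkage that photolyase would reverse.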

Nucleotide Excision

NER Helicase

DNA repair is carried out by the nucleotide excision repair (NER) helicase, a protein that is composed of multiple domains. NER assembles around damaged DNA regions (which, because of their error, contain a bulge or lesion that encourages NER to bind) in a stepwise manner, allowing damage to be carefully verified before the actual excision is performed. For example, yeast Rad4 protein (an analogue of mammalian XPC) indirectly detects DNA damage by binding to a nearby undamaged region. The damaged DNA strand is flexible, allowing a stable complex to form which includes Rad23, the protein that actually repairs the damage.

If XPC-Rad4 cannot detect a damaged site, one alternative involves the DDB1-DDB2 dimer. This dimer forms a complex with a damaged DNA region and a ubiquitin ligase. The complex ubiquitinates XPC and DDB2, the latter of which then releases the DNA molecule, passing it on to XPC and the normal NER process.

Nucleotide Excision Repair can be divided into two subcategories: Global Genome Repair and Transcription Coupled Repair.

Global Genome Repair begins when the XPC-hHR23B dimer binds to the damaged DNA and Transcription Factor IIH (TFIIH) binds to the complex. XPG then binds and the DNA is further unwound. The nucleases XPG and XPF cleave the DNA, which removes the damaged segment. DNA polymerase delta then fills in the gap with the correct nucleotides and DNA ligase reconnects the strand.

Transcription Coupled Repair occurs when RNA polymerase stalls at the damaged site. Cockayne Syndrome B protein (CSB) displaces the RNA polymerase and recruits TFIIH and XPG. The DNA is unwound before the nucleases XPG and XPF cleave it. The damaged section is then removed, DNA polymerase delta fills in the gap, and ligase reconnects the strand.

Source: Molecular Cell Biology, Lodish et al., 6th edition (2008), pages 145-160

The Base-Excision Repair pathway

Not all damage is large enough to cause the lesions that are detected by NER. The base excision repair (BER) pathway repairs single-nucleotide errors, sometimes as slight as the addition of a methyl group. While small, such damage can often be enough to impede DNA replication or produce nonfunctional proteins. Damage detection in the BER pathway is difficult because the errors are not only small but also numerous. Numerous enzymes are used to detect different small errors and initiate the BER pathway.

The first step in base-excision repair is the excision of the modified base. Enzymes called DNA glycosylases, each of which recognizes a certain type of modified base, cleave the bond between the 1′-carbon of the deoxyribose sugar and the base and remove the base. An enzyme called apurinic or apyrimidinic (AP) endonuclease then breaks the phosphodiester bond, and another enzyme removes the deoxyribose sugar. DNA polymerase adds the correct nucleotide to the free 3′-OH group. Finally, DNA ligase reconnects the DNA strand by forming a phosphodiester bond.

Backbone repair and DNA ligase

Damage to the sugar-phosphate backbone of DNA is repaired by DNA ligases. Because the DNA backbone is common to all organisms, these ligases are likewise found in every organism that uses DNA as its genetic material. DNA ligase seals breaks in the backbone by a three-step process. In the first step, several of the enzyme’s domains adopt a specific conformation, allowing an active site lysine residue to be adenylated. In the last two steps, the enzyme encircles the broken DNA strand and fuses the two ends together.

Double-Strand Break Repair

Breaks in the double strand of DNA are common but particularly hazardous to the cell because they increase the chance of genetic mutation. Major causes of double-strand breaks include reactive oxygen species from oxidative metabolism, ionizing radiation, and enzyme errors. A break can be repaired in one of two major ways: homology-directed repair and the nonhomologous DNA end joining pathway (NHEJ).

Homology-Directed Repair

Any diploid organism can use homology-directed repair, even if the diploidy is temporary, as in bacteria. Types of homology-directed repair include homologous recombination, single strand annealing, and breakage-induced replication. In homologous recombination, an identical or nearly identical sequence of DNA is required as a template for repair. Such a template is available only during and shortly after DNA replication (the S phase of the cell cycle and the period that follows it), before mitosis. Nucleotide sequences are then exchanged between the similar strands.

Nonhomologous DNA End Joining Pathway (NHEJ)

NHEJ arose as an alternative to homology-directed repair, as template donors are usually not available in nondividing cells. Its remarkably flexible mechanism allows NHEJ to convert a wide diversity of substrates into the desired product. Like other DNA repair processes, it requires three main types of protein: a nuclease to resect damaged DNA, polymerases to fill in new DNA, and a ligase to restore the strand. Key components include Ku, DNA-PKcs, Artemis, Pol X family polymerases, and the ligase complex consisting of XLF, XRCC4, and DNA ligase IV. Each DNA end can be modified independently multiple times, and because the pathway is so flexible, some enzymes can substitute for others. The solution to the problem of joining heterogeneous DNA ends at double-strand breaks has been shown to have evolved convergently in prokaryotes and eukaryotes.


  1. Huen, M. SY. “Assembly of checkpoint and repair machineries at DNA damage sites.” Trends in Biochemical Sciences, Volume 35, Issue 2, 101-108, 28 October 2009.
  2. Perry JJ, Cotner-Gohara E, Ellenberger T, Tainer JA. “Structural dynamics in DNA damage signaling and repair.” Curr. Opin. Struct. Biol. 2010 Jun; 20(3): 283-294.
  3. Pierce, Benjamin A., Jung H. Choi, and Mark E. McCallum. Genetics: a Conceptual Approach. New York, NY: W.H. Freeman, 2008. Print.
  4. Lieber MR. The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annu Rev Biochem. 2010;79:181–211.
  5. Eichinger, S. Christian and Stefan Jentsch. “9-1-1: PCNA’s specialized cousin.” Trends in Biochemical Sciences, Volume 36, Issue 11, 563-568, 04 October 2011.


Mismatch repair (MMR) in mammals is an important mechanism in the overall process of DNA repair. MMR works by removing incorrectly paired bases in double-stranded DNA and replacing them with the correct bases. However, MMR has other known functions, including mutagenesis under certain in vivo conditions.

Canonical MMR Mechanism

Errors in DNA replication pose many problems to both the integrity of the DNA and to the individual. MMR is one way that these errors are fixed, and deficiency in MMR is known to cause cancerous tumors in animal models.

The basic MMR system relies on the proteins MutSα, MutLα, EXO1, RFC, PCNA, RPA, polymerase-δ, and DNA ligase I. MMR proceeds in three basic steps, known as licensing, degradation, and resynthesis.


In licensing, MutSα binds to the mismatch in the DNA, which causes MutSα to change conformation into a sliding clamp. This change depends on an exchange of ADP for ATP. MutLα is then recruited to form a ternary complex with MutSα, which diffuses along the DNA until it reaches PCNA.

PCNA, or proliferating cell nuclear antigen, is a protein that can undergo a conformational change to form a ring around DNA. To attach to the DNA, it relies on RFC, or replication factor C, which uses ATP hydrolysis to load PCNA onto the DNA. Loading is efficient only when there is a “nick” in the DNA or an apurinic/apyrimidinic (AP) site, in which case PCNA attaches at the 3′ end of the nick. RFC can load PCNA without a nick in the DNA, but only with extremely low efficiency.

Once the complex reaches PCNA, the cryptic endonuclease of MutLα is activated and introduces additional nicks on both sides of the mismatch, on the same strand. This is only necessary when PCNA is bound on the 3′ side of the mismatch. Nicks are made only on the same strand as the mismatch because PCNA is not symmetric and has two distinct faces. MutLα can interact with PCNA only on a specific face, so the complex has a fixed orientation that remains constant even while it slides along the DNA. The endonuclease activity of MutLα resides in one of its heterodimer subunits, PMS2, which nicks only the strand on which PCNA was loaded.

The nicks are made close to the mismatch, which is essential for repair, because the MutSα/MutLα complexes that make them are most concentrated around the mismatch site, and this correlates with a greater frequency of collisions with PCNA. This is especially important in replication, where PCNA molecules remain on the DNA for an extended period of time even after replication. RFC loads them at the 3′ terminus of an Okazaki fragment or of the leading strand. They are bound in a fixed orientation, which allows MutSα/MutLα to cleave the nascent DNA strand even when the gaps between Okazaki fragments have long been ligated. This nick generation therefore gives the MMR system the correct directionality.

DNA Repair by DNA ligase I


In degradation, EXO1 is loaded at the nicks created by the PCNA-activated MutSα/MutLα complex. EXO1 is an exonuclease that can only cut in the 5′ to 3′ direction. It creates a large single-stranded gap that starts at the nick and ends around 150 bases after the mismatch, on the same strand as the mismatch.


Resynthesis involves PCNA, polymerase-δ, and DNA ligase I in order to replace the removed bases and, overall, fix the mismatch error.
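The degradation and resynthesis steps can be sketched as a toy string model. This is only an illustration under simplifying assumptions, not a biochemical simulation: the strands are short strings, the nick is assumed to lie 5′ of the mismatch, and the function names `excise` and `resynthesize` and the `overshoot` parameter are hypothetical names chosen for this example.

```python
# Toy model of MMR degradation and resynthesis (illustrative assumptions only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def excise(nascent, nick, mismatch, overshoot=150):
    """Mimic EXO1: remove bases 5'->3' from the nick to `overshoot` bases past the mismatch.

    `nascent` is the new strand written 5'->3'; `nick` and `mismatch` are
    0-based indices into it. Excised positions are marked with '-'.
    """
    if nick > mismatch:
        raise ValueError("this sketch only handles a nick 5' of the mismatch")
    end = min(len(nascent), mismatch + overshoot + 1)
    return nascent[:nick] + "-" * (end - nick) + nascent[end:]


def resynthesize(gapped, template_aligned):
    """Mimic polymerase-delta: fill each gap position with the complement of the template.

    `template_aligned` is the template strand written 3'->5', so that index i
    pairs with index i of the nascent strand.
    """
    return "".join(
        COMPLEMENT[t] if n == "-" else n
        for n, t in zip(gapped, template_aligned)
    )


template = "TACGGATCCA"   # written 3'->5'
nascent = "ATGCTTAGGT"    # index 4 holds a T opposite G (the mismatch)
gapped = excise(nascent, nick=1, mismatch=4, overshoot=3)
repaired = resynthesize(gapped, template)
print(gapped)
print(repaired)
```

The small `overshoot` value is used only to keep the example readable; the default of 150 reflects the approximate figure given in the text. Ligation by DNA ligase I is not modeled, since the string has no explicit backbone.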

EXO1 Independent Mechanism

Although this has not been proven in humans, EXO1-deficient mice showed fewer mutations than MSH2- and MLH1-deficient mice, which indicates that a mismatch repair mechanism exists that does not require EXO1. Indeed, MMR directed by a 5′ nick could occur without EXO1 through the use of polymerase-δ, MutSα, RPA, RFC, and PCNA. When there is a nick 5′ of the mismatch, polymerase-δ can catalyze strand displacement, after which FEN1 can remove the displaced strand containing the mismatch. DNA ligase I would then seal the remaining nick.

Insertion/Deletion Loops and Trinucleotide Repeats

Insertion/Deletion loops (IDLs) and trinucleotide repeats (TNRs) interact largely with MMR in both error-preventing and error-propagating ways.

Origin of IDLs

IDLs arise from the activity of polymerase on TNRs. Trinucleotide repeats are tracts in which a single triplet of nucleotides is repeated many times. Such repeats have been implicated in diseases such as Fragile X syndrome. When polymerase reads these repeats, it slows down. Helicase does not slow down, however, and because it is relatively faster, long stretches of single-stranded DNA are exposed. These strands can bunch up and form an IDL, which would cause polymerase to create shorter than usual DNA strands.
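As a rough illustration of what a trinucleotide repeat tract looks like in sequence data, the sketch below scans a DNA string for a triplet repeated several times in a row. The function name `find_trinucleotide_repeats` and the `min_repeats` threshold of 5 are assumptions made for this example; they are not biological or clinical cut-offs.

```python
import re


def find_trinucleotide_repeats(seq, min_repeats=5):
    """Return (start, unit, count) for each tract of one triplet repeated >= min_repeats times."""
    # ([ACGT]{3}) captures a triplet; \1{n,} requires at least n further copies of it.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_repeats - 1))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAA...
        tracts.append((match.start(), unit, len(match.group(0)) // 3))
    return tracts


# Hypothetical example sequence: eight CAG repeats flanked by unrelated bases.
example = "TTGA" + "CAG" * 8 + "TTC"
print(find_trinucleotide_repeats(example))
```

The repeat unit is reported in whichever reading frame the scan meets first, so the same tract could be reported as CAG, AGC, or GCA depending on the flanking sequence.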

The Error-Preventing Role of MMR

When this happens, MMR can work to fix it. If the loop is no more than two to three extrahelical nucleotides long, canonical MMR can fix it. If the loop is longer, it can be fixed in a MutSβ-mediated way. This must happen by some other MMR mechanism, because in the regular process PCNA would not be able to diffuse past the large loop. Thus, there must be some MMR that is not mediated by EXO1.

The Error-Propagating Role of MMR

In certain cases, MMR may be “hijacked” to cause TNR expansion. In a cruciform loop structure, where there are loops in both strands at the same relative position, PCNA loaded by RFC onto one of the loops may activate the endonucleolytic activity of MutSβ and MutLα, and the resulting cleavage may cause one of the loops to collapse. When polymerase replaces the missing nucleotides, the trinucleotide repeat is extended. In diseases caused by TNRs, such as Huntington’s disease, a larger number of repeats has been linked to more severe disease, so this is an important field of study.

Antibody Variation and Class-Switching

Antibody variation, although it has a variety of causes, is largely dependent on MMR. After VDJ recombination, a process that recombines the variable, diversity, and joining regions of the immunoglobulin genes, a variety of IgM antibodies can be made. However, there are further mechanisms for antibody variability.

The role of MMR in Somatic Hypermutation

Somatic hypermutation (SHM) is a process whereby many mutations arise in the variable region of the antibody. It works through activation-induced cytidine deaminase (AID), which converts C nucleotides to U. This occurs during transcription, because AID works best on single-stranded DNA. The result is a mismatch in the DNA. Uracil DNA-glycosylase then begins base-excision repair (BER) by removing the incorrect U base, which leaves an apyrimidinic (AP) site. This site can be the target of EXO1 in order for MMR to occur.

In this case, EXO1 cleavage may cause a large swath of DNA to be excised. When polymerase goes to fix it, AP sites and remaining uracil nucleotides may cause incorrect mutations at the sites where AID acts. This would result in changes in the variable site of the antibody and ultimately, different antigen recognition.
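The first two steps described above (deamination by AID, then base removal by uracil DNA-glycosylase) can be sketched on a string. This is a deliberately simplified illustration: it assumes every C on the exposed single strand is deaminated with the same probability, which ignores any sequence preference of the real enzyme. The function names `aid_deaminate` and `ung_excise` and the example sequence are hypothetical.

```python
import random


def aid_deaminate(ssdna, probability, rng):
    """Convert each C in a single-stranded stretch to U with the given probability."""
    return "".join(
        "U" if base == "C" and rng.random() < probability else base
        for base in ssdna
    )


def ung_excise(strand):
    """Replace each U with '_' to mark the AP site left after the base is removed."""
    return strand.replace("U", "_")


rng = random.Random(0)  # fixed seed so repeated runs give the same result
variable_region = "AGCTACGCTAGCCA"  # made-up sequence for illustration
deaminated = aid_deaminate(variable_region, 0.5, rng)
print(deaminated)
print(ung_excise(deaminated))
```

Each `_` marks a position where repair synthesis has no base to copy, which is where the text above places the opportunity for mutation.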

The role of MMR in Class-Switch Recombination

MMR can also cause the class of antibody to change, such as from IgM to IgG, while the antibody continues to recognize the same antigen. When there are two AP sites and EXO1 excises the section of DNA up to the other gap, the resulting double-strand break can lead to class-switch recombination (CSR).


  1. Peña-Diaz, J., & Jiricny, J. (2012). Mammalian mismatch repair: error-free or error-prone? Trends in biochemical sciences, 37(5), 206–14. doi:10.1016/j.tibs.2012.03.001
  2. Zhao, J. et al. (2009) Mismatch repair and nucleotide excision repair proteins cooperate in the recognition of DNA interstrand crosslinks. Nucleic Acids Res. 37, 4420-4429
  3. Lopez Castel, A. et al (2010) Repeat instability as the basis for human diseases and as a potential target for therapy. Nat. Rev. Mol. Cell Biol. 11, 165-170.


DNA strand breaks are often caused by internal and external factors. The termini of these breaks frequently require processing before the missing nucleotides can be replaced by DNA polymerases and the strands rejoined by DNA ligases. The enzyme polynucleotide kinase/phosphatase (PNKP) plays an important role in repairing DNA strand breaks by catalyzing the restoration of the termini. PNKP also participates in other DNA repair pathways through interactions with other DNA repair proteins such as XRCC1 and XRCC4. PNKP is important in maintaining the genomic stability of normal tissues, such as developing neural cells, and in enhancing the resistance of cancer cells to genotoxic therapeutic agents.

Polynucleotide kinase/phosphatase

Damage to cellular DNA is implicated in aging, in cancer etiology and treatment, and in neurological disorders. DNA damage takes different forms, including base modification, base loss, and strand breaks. It can be triggered by intracellular agents, primarily reactive oxygen species (ROS), and by exogenous agents. To protect themselves, cells have evolved a battery of repair pathways that counter the mutational and cytotoxic consequences of DNA damage. Strand breaks are caused by physical and chemical agents, such as ionizing radiation (IR) and ROS, and by enzymatic processes. They therefore come in a wide variety of forms and can be further classified by the nature of their termini. The enzyme PNKP carries 5′-kinase and 3′-phosphatase activities that are essential for processing the termini of single- and double-strand breaks. Small molecule inhibitors of PNKP have been shown to sensitize cells to IR and chemotherapeutic agents. Mutations in PNKP, like mutations in genes that encode other strand break repair proteins, have been connected to a severe autosomal recessive neurological disorder.

Chemistry of strand break termini

IR- and free radical-induced breaks

Ionizing radiation (IR) causes strand breaks by generating hydroxyl radicals, which react at different carbon atoms of the deoxyribose group. At 3′-termini this produces a variety of end groups, of which two predominate: phosphate and phosphoglycolate. Phosphoglycolate formation depends on the presence of oxygen, while 3′-phosphate groups are produced under both normoxia and anoxia. At 5′-termini, the major end group is phosphate. In addition to singly damaged sites, ionizing radiation generates complex lesions, which contain two or more damaged bases or strand breaks in close proximity. Complex lesions include frank DSBs, and the ratio of SSBs to DSBs has been determined to be ~25:1. Hydrogen peroxide also causes strand breaks. Its damage is similar to IR-mediated damage, but it causes far fewer frank DSBs. Bleomycin, a chemotherapeutic agent, additionally produces DSBs with 3′-phosphoglycolate termini.

Camptothecin-induced breaks

The enzyme topoisomerase I relieves torsional strain by creating a DNA cut with a 5′-OH terminus and a covalent 3′-phosphate-enzyme intermediate. Camptothecin acts on this intermediate and prevents the strand from being rejoined, leaving a DNA-enzyme ‘dead-end’ complex. Hydrolysis of this complex by tyrosyl-DNA phosphodiesterase (TDP1) produces breaks with 3′-phosphate and 5′-OH termini.

Repair-endonuclease induced breaks

Damaged bases can be removed by DNA glycosylases. The resulting abasic sites are then cleaved by one of two classes of enzymes. AP endonucleases hydrolyze the phosphodiester bond 5′ to the abasic site to give 3′-OH and 5′-deoxyribose phosphate termini; DNA polymerase β can then convert the 5′-deoxyribose phosphate terminus to 5′-phosphate. AP lyases cleave the phosphodiester bond 3′ to the abasic site by a β-elimination reaction to give an α,β-unsaturated aldehyde attached to the 3′-phosphate at one terminus and a 5′-phosphate at the other. Many DNA glycosylases have this lyase activity. The pentenal moiety can then be eliminated by an AP endonuclease to give 3′-OH or by an AP lyase to give 3′-phosphate. NEIL1 and NEIL2, mammalian DNA glycosylases with β,δ-lyase activity, remove a broad range of mutagenic and cytotoxic oxidative pyrimidine lesions and purine-derived formamidopyrimidines.

Molecular architecture of PNKP

PNKP is a multidomain enzyme. It consists of an N-terminal forkhead-associated (FHA) domain and a C-terminal catalytic domain composed of fused phosphatase and kinase subdomains. The FHA and catalytic domains are linked by a flexible polypeptide segment. The FHA domain selectively binds acidic casein kinase 2 (CK2)-phosphorylated regions in XRCC1 and XRCC4, which are important scaffolding proteins in the repair of DNA SSBs and DSBs. Aprataxin and APLF are DNA repair factors that also contain FHA domains that bind CK2-phosphorylated XRCC1 and XRCC4. This shared specificity could allow the binding of these proteins to the phosphorylated scaffolding factors to be regulated in a coordinated way. The catalytic domains of PNKP and T4 polynucleotide kinase are similar in that both contain contiguous kinase and phosphatase subdomains, but they differ in that the T4 enzyme lacks an FHA domain and its kinase subdomain lies N-terminal to the phosphatase subdomain.

In murine PNKP, the two catalytic active sites are positioned on the same side of the protein. The murine and T4 kinase subdomains share a similar structure, with a bipartite active site cleft that has separate ATP and DNA binding sites. The ATP binding site includes the Walker A (P-loop) and B motifs conserved in various kinases, as well as an aspartic acid that activates the 5′-hydroxyl for attack on the ATP γ-phosphate. The DNA binding sites of the mammalian and phage enzymes differ. In phage PNK, the DNA binding cleft forms a narrow channel that leads to the conserved catalytic aspartic acid residue and accommodates single-stranded substrates. The mammalian enzyme instead phosphorylates 5′-hydroxyl termini within nicks, gaps, or DSBs with single-stranded 3′ overhanging ends, and phosphorylates single-stranded 5′ termini less efficiently. A broad DNA recognition groove, composed of two distinct positively charged surfaces, selectively recognizes these larger, double-stranded DNA substrates. Structural information from small angle X-ray scattering experiments, together with the effects of amino acid substitutions on the kinase surfaces, showed that DNA substrates bind across these surfaces in a defined orientation.

The phosphatase subdomain has a haloacid dehalogenase fold, which is employed by many phosphatases. Enzymes with this fold depend on Mg2+ and proceed through an acyl-phosphate intermediate formed with a catalytic aspartate. Mammalian PNKP acts on a multitude of 3′-phosphate ends, such as those within nicks, gaps, DSBs, and single-stranded termini. Two narrow channels, surrounded by large positively charged loops, lead to the phosphatase active site but are not wide enough to admit double-stranded substrates. This suggests that accommodating double-stranded substrates requires either remodeling of the phosphatase substrate binding surface or unwinding of the DNA.

The DNA repair scaffold proteins XRCC1 and XRCC4 interact with PNKP through binding of the PNKP FHA domain to phosphorylated motifs on XRCC1 and XRCC4. FHA domains are phosphopeptide-binding modules with a β-sandwich fold, in which a series of loops jut out from one side of the β-sandwich and provide a peptide binding surface with a marked preference for targets that contain a phosphothreonine residue. Although XRCC1 and XRCC4 are structurally unrelated, they share similar motifs that are phosphorylated by CK2 and act as the binding sites for the PNKP FHA domain. A cluster of CK2 phosphorylation sites between residues 515 and 526 in XRCC1 is needed for interaction with PNKP, and amino acid substitutions within this region significantly reduce the efficiency of SSB repair. Similarly, a primary CK2 site in XRCC4, Thr233, is needed for PNKP binding and for efficient repair of DSBs in vivo. The sequence around these sites is significantly conserved. In addition to the primary phosphothreonine, a conserved serine is phosphorylated, and the structure of the complex reveals a dynamic interaction of this residue with Arg35 or Arg44 of the PNKP FHA domain. A tyrosine residue is conserved at the -4 position and an asparagine residue at the +3 position. Not all of these interactions are conserved in the complex of the FHA domain with XRCC4, in which the residue at the +3 position is a glutamic acid. Because the peptides are acidic, long-range electrostatic interactions with the largely positively charged peptide-binding surface contribute to binding specificity. Threonine phosphorylation at the +4 position also plays a role in binding selectivity through the recruitment of a second PNKP FHA domain.

PNKP and single-strand break repair (SSBR)

SSBR is a multienzyme pathway that uses different participants depending on the causative agent. IR-induced strand breaks, for example, involve the loss of at least one nucleotide. Damage recognition and correction of the termini of the broken strand are carried out by poly(ADP-ribose) polymerase (PARP), XRCC1, AP endonuclease 1 (APE1) and PNKP, with other proteins acting as backups. Replacement of the missing nucleotides and resealing of the strand then proceed by a short patch pathway that involves DNA polymerase β and DNA ligase III, or by a long patch pathway that uses DNA polymerase δ and/or ε, the FEN1 endonuclease and DNA ligase I. After IR, APE1 removes 3′-phosphoglycolates while PNKP hydrolyses 3′-phosphate groups, because the 3′-phosphatase activity of APE1 is much weaker than that of PNKP. PNKP also ensures that 5′-OH termini are phosphorylated. Because the phosphatase activity of PNKP is much more active than its kinase activity, the phosphatase reaction takes priority at strand breaks that have both 3′-phosphate and 5′-OH termini. The importance of the phosphatase activity for the rapid repair of hydrogen peroxide-induced SSBs in mammalian cells was shown by the failure of overexpressed phosphatase-defective PNKP to compensate for Xrcc1 deficiency. Correspondingly, a small molecule inhibitor of PNKP phosphatase activity dramatically retards SSBR in irradiated human cells. While the phosphatase activity of PNKP is clearly important, the physiological importance of its 5′-kinase activity has yet to be determined.

In a commonly accepted model for the repair of radiation-induced SSBs, PARP binds the SSB and catalyzes the polymerization of chains of ADP-ribose onto acceptor chromatin proteins and itself. This attracts the scaffold protein XRCC1, and possibly also its tightly bound partner DNA ligase III. XRCC1 in turn recruits PNKP or APE1 to restore the terminal groups that DNA polymerase β needs to add the missing base and that DNA ligase III needs to rejoin the strand. Analysis of protein-protein interactions found direct interactions of XRCC1 with PNKP, DNA polymerase β and DNA ligase III, which suggests that a tetrameric complex can form between the four proteins. Various models could account for how such a complex forms. While there is evidence for interactions between XRCC1 and PNKP, there is also evidence against the idea that XRCC1 recruits either PNKP or APE1 to the strand break. Experiments that cross-linked proteins to DNA substrates were used to track the temporal association of SSBR proteins in HeLa cells. For substrates with either 3′-phosphoglycolate or 3′-phosphate termini, APE1 and PNKP were recruited to the strand breaks before XRCC1/DNA ligase III. In addition, immunodepletion of APE1 or PNKP diminished the binding of XRCC1 to these substrates. This indicated that APE1 and PNKP recruit XRCC1 to sites of oxidative damage rather than the reverse. Conversely, PNKP foci were found in the nuclei of hydrogen peroxide-treated cells expressing XRCC1 but not in cells lacking XRCC1. This suggests that although XRCC1 might not be required for the initial recruitment of PNKP or APE1, it promotes the focal accumulation and stimulation of these enzymes at sites of chromosomal damage.

Although the DNA repair protein XRCC1 lacks inherent enzymatic activity, it can enhance both the kinase and phosphatase activities of PNKP. The mechanism of this stimulation was worked out using fluorescence measurements of the binding between PNKP and substrates that mimic different strand breaks. PNKP bound tightly to a nicked substrate with a 5′-OH terminus, with a Kd value of 0.25 μM, but this was only 5- to 6-fold tighter than its binding to the identical duplex bearing a 5′-phosphate. PNKP therefore remains bound to the product of its kinase activity. The presence of XRCC1 did not influence the binding of PNKP to the nonphosphorylated substrate, but it abolished the interaction of PNKP with the phosphorylated duplex, which indicates that XRCC1 displaces PNKP from the reaction product. Measurements of the kinetics of product accumulation under limiting enzyme concentration confirmed that adding XRCC1 increases PNKP enzymatic turnover. Similar kinetic data were observed for the PNKP phosphatase activity.
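To see why a 5- to 6-fold difference in Kd still leaves PNKP substantially bound to its product, the quoted values can be put into the standard one-site binding relation, fraction bound = [L] / ([L] + Kd). The sketch below assumes simple 1:1 binding with the titrated partner in excess, which the text does not state; the function name `fraction_bound` and the chosen concentrations are illustrative.

```python
def fraction_bound(free_um, kd_um):
    """Fraction occupied for simple 1:1 binding: [L] / ([L] + Kd), both in micromolar."""
    return free_um / (free_um + kd_um)


KD_5OH_UM = 0.25  # from the text: nicked substrate with a 5'-OH terminus
# 5- to 6-fold weaker binding to the 5'-phosphate product, as stated in the text.
KD_5P_RANGE_UM = (5 * KD_5OH_UM, 6 * KD_5OH_UM)

for conc_um in (0.25, 1.0, 5.0):  # illustrative concentrations, not measured values
    substrate = fraction_bound(conc_um, KD_5OH_UM)
    product_hi = fraction_bound(conc_um, KD_5P_RANGE_UM[0])
    product_lo = fraction_bound(conc_um, KD_5P_RANGE_UM[1])
    print(
        f"{conc_um:5.2f} uM: substrate {substrate:.2f}, "
        f"product {product_lo:.2f} to {product_hi:.2f}"
    )
```

Under these assumptions, the product complex remains appreciably populated at micromolar concentrations, which is consistent with the idea that releasing PNKP from the product limits its turnover.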

The relationship between PNKP and XRCC1 is further complicated by CK2-mediated phosphorylation of XRCC1. Besides promoting interaction with other proteins, XRCC1 phosphorylation stabilizes the XRCC1-DNA ligase III complex. Multiple sites of CK2-mediated XRCC1 phosphorylation, clustered within specific locations, have been observed in vitro. XRCC1 phosphorylation is needed to recruit XRCC1 and PNKP to nuclear foci in hydrogen peroxide-treated or γ-irradiated cells and to promote more rapid repair of SSBs, although a lack of XRCC1 phosphorylation did not affect cell survival. Further analysis found that a triple-mutant XRCC1 expressed in cells without functional XRCC1 failed to fully restore rapid SSBR, which points to an important interaction with PNKP. The repair defect could be overcome by overexpression of PNKP. This shows that XRCC1 plays an important role in increasing PNKP enzyme turnover, especially when the cell contains a limiting concentration of PNKP.

Phosphorylation of XRCC1 by CK2 enhances its stimulation of the kinase and phosphatase activities of PNKP measured in vitro, compared with nonphosphorylated XRCC1. Stimulation by nonphosphorylated XRCC1 is nevertheless also due to enhanced enzymatic turnover of PNKP. This presents a puzzle, since phosphorylated and nonphosphorylated XRCC1 bind PNKP at different sites and with different affinities, yet both stimulate PNKP by a similar mechanism. Phosphorylated XRCC1 binds the FHA domain with a Kd value of 4 nM, while the nonphosphorylated protein binds the catalytic domain of PNKP with a 10-fold weaker affinity. A phosphorylation-independent interaction between PNKP and XRCC1 may therefore exist in human cells, and PNKP was found to co-immunoprecipitate with the XRCC1 triple mutant expressed in human 293T cells. Although 85–90% of cellular XRCC1 is phosphorylated, this does not mean that the key cluster of amino acids involved in interaction with the FHA domain is fully phosphorylated. Treatment of cells with hydrogen peroxide increased phosphorylation at the cluster and gave an approximately 3-fold increase in PNKP copurifying with XRCC1. Cells may therefore enhance CK2-mediated phosphorylation of XRCC1, and its subsequent interaction with the PNKP FHA domain, in direct response to a challenge by hydrogen peroxide or radiation, in order to deal efficiently with high levels of DNA damage. Unstressed cells, by contrast, may cope with the comparatively low level of endogenous DNA damage by a different method, in which nonphosphorylated XRCC1, or XRCC1 with a restricted degree of phosphorylation, activates PNKP by binding to its catalytic domain.

Depletion of PNKP in cells, like deletion of Pnk1 in fission yeast, confers sensitivity to camptothecin. XRCC1 oversees the repair of camptothecin-induced strand breaks by forming a complex with TDP1, DNA ligase III and PNKP. The neurodegenerative disorder spinocerebellar ataxia with axonal neuropathy-1 (SCAN1) is caused by mutation of TDP1. SCAN1 cells have a reduced capacity to repair camptothecin-induced SSBs and also display slow repair of hydrogen peroxide-induced SSBs. This suggests that TDP1 is required to repair lesions generated by oxidative processes, lesions that may account for the neurodegeneration observed in SCAN1. Supporting evidence comes from experiments with fission yeast in G0, in which Tdp1 and Pnk1 act sequentially to process the 3′-termini of naturally occurring SSBs.

PNKP and base excision repair (BER)

Base excision repair (BER) is the cellular mechanism responsible for the repair of most minor base modifications induced by IR, ROS and alkylating agents. The first step is removal of the modified base by a DNA glycosylase, followed by cleavage of the DNA at the newly formed apurinic/apyrimidinic (AP) site by APE1. Alternatively, the glycosylase can cleave the AP site with its own AP lyase activity. The discovery of the mammalian DNA glycosylases nei endonuclease VIII-like-1 (NEIL1) and NEIL2 made it clear that PNKP is involved in the BER pathway, because these glycosylases possess β,δ-AP lyase activity that generates 3′-phosphate termini. Rather than binding directly to PNKP, they are associated with larger complexes that contain other BER components, including PNKP. The NEIL glycosylases act on a variety of base lesions, including thymine glycol, 5-hydroxyuracil and 8-oxoguanine. They can also cleave intact abasic sites generated by glycosylases that do not possess AP lyase activity, as well as the pentenal moiety generated by the β-elimination AP lyase activity of other DNA glycosylases. The NEIL glycosylases would therefore compete with APE1, which forms the basis of a different, APE1-independent BER pathway. Although current research cannot indicate to what extent NEIL1- or NEIL2-catalyzed cleavage of abasic sites occurs in cells, such cleavage could explain the increased sensitivity of PNKP-depleted cells to the alkylating agent methyl methanesulfonate (MMS). This sensitivity came as a surprise, because the major lesions inflicted by MMS are N7-methylguanine and N3-methyladenine, with little if any direct strand scission. Since the human DNA glycosylase responsible for removing these methylated bases, MPG, does not possess AP lyase activity, the task of acting on the abasic sites generated by MPG to produce strand breaks with 3′-phosphate termini must fall to NEIL1 or NEIL2. Downregulating aprataxin expression also causes cells to be sensitive to MMS.

PNKP and double-strand break repair (DSBR)

Of the two major double-strand break repair pathways, there is evidence that PNKP participates in nonhomologous end joining. In contrast, loss of PNKP failed to influence IR-induced sister chromatid exchange, which suggests that PNKP may not be involved in homologous recombination. PNKP also plays a role in a back-up, XRCC1-dependent DSB repair pathway. Evidence for PNKP participation in NHEJ came from experiments with human cell-free extracts, which showed that PNKP kinase activity was required before linearized plasmid substrates bearing 5′-OH termini could be joined. Efficient phosphorylation depended on XRCC4 and DNA-PK. Just as XRCC1 links PNKP to DNA ligase III, XRCC4 links PNKP to DNA ligase IV. CK2-mediated phosphorylation of XRCC4 Thr233 is involved in the interaction with the PNKP FHA domain and in stimulating XRCC4–DNA ligase IV mediated ligation of a 5′-dephosphorylated plasmid substrate in vitro. In an Xrcc4-deficient cell line, expression of a Thr233 phosphorylation-site mutant of XRCC4 instead of wild-type XRCC4 reduces survival following irradiation by approximately 30% and slows the rate of DSB repair.

The functional role of the XRCC4-PNKP interaction was determined by combining biophysical and biochemical examination. While phosphorylation of XRCC4 confers a tight affinity for PNKP, nonphosphorylated XRCC4 can also bind PNKP. In that case, binding is to the catalytic domain of PNKP and the affinity is weaker. Just as XRCC1 stimulates PNKP turnover at SSBs, nonphosphorylated XRCC4 can stimulate PNKP enzymatic turnover at DSBs. Phosphorylated XRCC4, by contrast, failed to stimulate PNKP and in fact blocked PNKP-mediated DNA phosphorylation. In the additional presence of DNA ligase IV, however, the complex it forms with phosphorylated XRCC4 reverses the inhibition and stimulates PNKP turnover. The ratio of XRCC4:DNA ligase IV:PNKP in HeLa cells was found to be ∼7:1:3, with almost half of the XRCC4 phosphorylated at Thr233. Only a fraction of the XRCC4 in cells can therefore be complexed to DNA ligase IV, which leaves open the possibility of an FHA-independent interaction between XRCC4 and PNKP. This FHA-independent interaction was confirmed by co-immunoprecipitation of XRCC4 with PNKP expressed in cells depleted of endogenous PNKP. PNKP also has an important function in processing DSB 3′-phosphoglycolate termini, especially 3′-overhanging and blunt-ended termini. These termini are produced by IR, bleomycin and enediyne compounds such as neocarzinostatin. Although APE1 can remove phosphoglycolate groups at SSB termini and recessed DSB termini, it is less effective at blunt-ended DSB termini and completely ineffective at overhanging termini.

Physiological roles and clinical potential of PNKP

PNKP is involved in several DNA repair pathways that protect cells from endogenous and exogenous genotoxic agents. Neurological disorders with various symptoms arise when NHEJ genes or SSBR/BER genes are disrupted. For example, microcephaly occurs in people with mutations in LIG4, which encodes DNA ligase IV, and deletion of Xrcc1 in mice causes seizures. PNKP mutations have been found to cause a severe autosomal recessive neurological disease, MCSZ, that is characterized by microcephaly, intractable seizures, and developmental delay. Analysis of affected families identified mutations in both the kinase and phosphatase domains. Taken together, the range of symptoms shown by patients with MCSZ reflects the involvement of PNKP in multiple DNA repair pathways.

PNKP has also been linked to pathophysiological conditions. Elevated expression of PNKP in arthrofibrotic tissue suggests a role for PNKP in mitigating the effects of ROS generated by macrophages. In addition, physiologically and environmentally relevant doses of cadmium and copper, which are known to elicit neurotoxic and carcinogenic effects, have been observed to inhibit PNKP.

The DNA repair capacity of tumor cells is an important factor in the clinical response to many antineoplastic agents. Inhibitors of several DNA repair enzymes, including PNKP, are therefore under investigation because they can sensitize cells to radiation and chemotherapeutic drugs. Through this research, a small-molecule inhibitor of PNKP phosphatase activity was identified and shown to heighten the sensitivity of cells to IR and camptothecin. Camptothecin is the parent compound of two clinically important topoisomerase I poisons, irinotecan and topotecan, that are frequently used to treat colon and ovarian cancers.


PNKP is an important enzyme in the cellular processing of strand break termini, and it takes part in many DNA repair pathways. More research is needed to identify how it is regulated, how it collaborates with other repair enzymes, and what its physiological role is in neurons and other tissues. Because it is involved in a variety of repair pathways, PNKP is seen as a therapeutic target in the treatment of cancer. New inhibitory compounds will therefore need to be identified, researched, and optimized for clinical use. Further research should also aim to identify synthetic lethal partners of PNKP, in order to assess whether PNKP inhibitors could be used as single agents against tumors deficient in those partner proteins.


Weinfeld, Michael. “Tidying up loose ends: the role of polynucleotide kinase/phosphatase in DNA strand break repair.” Trends in Biochemical Sciences 36.5 (2011): 262-71. PubMed. Web. 21 Nov. 2012.

DNA Packaging

DNA packaging is an important process in living cells. Without it, a cell could not accommodate the large amount of DNA stored inside. For example, a bacterial cell that is 1 to 2 µm long contains a DNA molecule about 400 times as long as the cell itself (Becker et al. 530). Eukaryotic cells face even bigger challenges. A typical human cell has enough “DNA to wrap around the cell more than 15,000 times” (531). DNA packaging is therefore crucial because it allows this large amount of DNA to fit in a cell that is many times smaller.
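The scale of the packaging problem can be estimated with simple arithmetic. The sketch below assumes the commonly quoted rise of about 0.34 nm per base pair for B-form DNA; the genome size and cell length in the usage lines are hypothetical values chosen only for illustration, not figures taken from Becker et al.

```python
RISE_PER_BP_NM = 0.34  # assumed rise per base pair of B-form DNA, in nanometres


def contour_length_um(base_pairs: int) -> float:
    """Stretched-out length of a DNA molecule in micrometres."""
    return base_pairs * RISE_PER_BP_NM / 1000.0


def length_to_cell_ratio(base_pairs: int, cell_length_um: float) -> float:
    """How many times longer the stretched DNA is than the cell."""
    return contour_length_um(base_pairs) / cell_length_um


if __name__ == "__main__":
    genome_bp = 2_400_000   # hypothetical genome size
    cell_um = 2.0           # hypothetical cell length
    print(f"DNA length: {contour_length_um(genome_bp):.0f} um")
    print(f"DNA is about {length_to_cell_ratio(genome_bp, cell_um):.0f}x the cell length")
```

Changing `genome_bp` or `cell_um` shows how quickly the ratio grows for larger genomes, which is why eukaryotic cells need the more elaborate packaging described in the following paragraphs.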

The DNA in bacterial cells is either circular or linear. To fit inside the bacterial cell, the supercoiled DNA is folded into loops, and each loop is organized into bead-like packets containing small basic proteins that are analogous to the histones found in eukaryotes (533).

In eukaryotic cells, DNA packaging is more complicated because these cells contain far more DNA than bacterial cells. More proteins are therefore required for the process, with histones being the most important. Histones consist largely of positively charged amino acids such as lysine and arginine, which give the protein an overall positive charge. Histones therefore interact favorably with the negatively charged phosphate groups of DNA. There are five main types of histone: H1, H2A, H2B, H3, and H4 (533). Two each of H2A, H2B, H3, and H4 join to form an octamer, around which 146 base pairs of DNA are wrapped like a bead on a string. This bead, consisting of eight histone molecules and 146 DNA base pairs, is known as the nucleosome. Nucleosomes are connected to one another by DNA linkers of about 50 base pairs to form a fiber-like structure called chromatin. H1 is believed to be associated with these DNA linkers. Chromatin fibers can be compacted further into higher-order structures called heterochromatin or euchromatin, depending on the degree of packing. Ultimately, DNA packaging in eukaryotic cells can lead to the formation of chromosomes, which are visible only during cell division and in a few other situations (533-535). Eukaryotic cells also package DNA in mitochondria and chloroplasts, not only in the nucleus. The overall shape of organellar DNA resembles that of bacterial DNA rather than that of eukaryotic nuclear DNA.
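The numbers in this paragraph (146 base pairs per core, a linker of about 50 base pairs, eight core histones per nucleosome) allow a rough count of nucleosomes and histones for a given stretch of DNA. The sketch below is only an estimate: it assumes perfectly regular spacing and, as a labeled simplification, one H1 molecule per nucleosome.

```python
CORE_BP = 146          # DNA wrapped around one histone octamer (from the text)
LINKER_BP = 50         # linker DNA between nucleosomes (from the text)
CORE_HISTONES = 8      # two each of H2A, H2B, H3 and H4


def nucleosome_count(total_bp: int) -> int:
    """Number of whole nucleosome repeats that fit on total_bp of DNA."""
    return total_bp // (CORE_BP + LINKER_BP)


def histone_count(total_bp: int, include_h1: bool = True) -> int:
    """Estimated histone molecules; one H1 per nucleosome is an assumption."""
    per_nucleosome = CORE_HISTONES + (1 if include_h1 else 0)
    return nucleosome_count(total_bp) * per_nucleosome


if __name__ == "__main__":
    dna_bp = 1_000_000  # hypothetical stretch of chromatin
    print(nucleosome_count(dna_bp), "nucleosomes")
    print(histone_count(dna_bp), "histone molecules (including assumed H1)")
```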

Histone chaperones and the nucleosome assembly processes

Histones are proteins that allow DNA to be tightly packaged into units called nucleosomes. The DNA wraps itself around the histones.

Chromatin is made of DNA and proteins (mainly histones) and gives structure to a chromosome.

A nucleosome consists of acidic DNA wrapped around basic histone proteins.

Histone chaperones
Histone chaperones guide the folding pathways: they assist in the wrapping and unwrapping of DNA around the histones.

The tight coiling of DNA compacts it and limits access to it, so nucleosomes must be loosened or disassembled before the underlying DNA can be read.

Need for histone chaperones
Nucleosomes are assembled and disassembled in a stepwise fashion. Histone chaperones guide, control, and regulate each step of this pathway.

Structural forms of histone chaperones
Since histone chaperones participate at each step of the nucleosome assembly processes, there are different chaperones needed for each different step.[1]


Becker, Wayne M, et al. The World of the Cell. 7th ed. New York: Pearson/Benjamin Cummings, 2009. Print.

Churchill, Das, Tyler The histone shuffle: histone chaperones in an energetic dance
DNA’s many properties, such as its minuscule size, its capacity to carry information, and its ability to be amplified, give it the potential to produce unique materials. The DNA molecule is arguably one of the most promising functional nanomaterials to date. DNA’s characteristic traits of self-assembly and molecular recognition make it a key player in the bottom-up construction of nanotechnology. The key factor in DNA’s role as a nanomaterial lies in the molecule’s sticky ends. A sticky end is a short single-stranded overhang that protrudes from the end of a double-stranded DNA molecule. When paired with their complementary counterparts, sticky ends cohere to form a diverse range of molecular complexes. Under lab conditions these sticky-end sequences are programmable, which gives control over intermolecular interactions and a predictable geometry at the point of cohesion. Such control allows for the construction of artificial DNA structures. Over the years the use of DNA in nanostructures has expanded in three main directions.

  1. The synthesis of artificial networks using DNA.
  2. The integration of DNA onto surfaces.
  3. The formation of metal or semiconductor nanoparticle assemblies along DNA.

The Fabrication of Artificial Networks

Because the DNA double helix runs along a single axis, the molecule is unbranched. This is a problem for using DNA to fabricate nanomaterials, because joining DNA molecules by sticky ends then allows construction in only a single linear direction. However, branched DNA does occur in nature as an intermediate formed when chromosomes exchange information during meiosis. This structure is formally known as the Holliday junction. Over the years scientists have been able to design sequences that lead to stable synthetic variants of the Holliday junction. This allows for the fabrication of artificial networks consisting of native DNA.

Attachment or Integration of DNA onto Solid State Surfaces

The first step in DNA technology is to attach the molecule to the surface of interest. There are three different methods that may be used for such attachments.

  1. Electrostatic interaction between DNA and a substrate.
  2. Covalent binding of a chemical group attached to the DNA end.
  3. Binding of a protein attached to the DNA end to the corresponding antibody immobilized at the surface.

Owing to their large surface area and high surface free energy, nanoparticles can adsorb biomolecules strongly and facilitate attachment. Generally, adsorption of biomolecules onto the naked surface of a bulk material results in their denaturation and loss of bioactivity. However, because nanoparticles can reduce the distance between the redox site of DNA and the working electrode surface, biomolecules are able to maintain their bioactivity. This is because the rate of electron transfer falls off exponentially with the distance between the two sites.
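The exponential distance dependence mentioned above is often written as k(d) = k0 · exp(−β · d). The sketch below uses that simple form; the decay constant `beta_per_nm` and the prefactor `k0` are hypothetical illustration values, not measurements from the cited sources.

```python
import math


def electron_transfer_rate(k0: float, beta_per_nm: float, distance_nm: float) -> float:
    """Rate at a given separation, assuming k = k0 * exp(-beta * d)."""
    return k0 * math.exp(-beta_per_nm * distance_nm)


if __name__ == "__main__":
    k0 = 1.0e6      # hypothetical rate at contact, per second
    beta = 10.0     # hypothetical decay constant, per nanometre
    for d in (0.5, 1.0, 1.5, 2.0):
        rate = electron_transfer_rate(k0, beta, d)
        print(f"d = {d:.1f} nm -> k = {rate:.3e} per second")
```

Under these assumptions every additional 0.1 nm of separation reduces the rate by a factor of e, which illustrates why shortening the distance between the redox site and the electrode matters so much.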

Metal Nanoparticles

Metal nanoparticles have the potential to revolutionize conventional technology and experimental medicine because of their electrical properties. There have been three main approaches to using DNA to assemble metal or semiconducting nanoparticles.

  1. Electrostatic binding of negatively charged DNA to positively charged colloids
  2. Chemical binding of colloids to DNA
  3. Formation of nucleation sites along DNA molecules followed by metal or semiconductors deposition to allow for the direct growth of nanoparticles along the DNA.

Prerequisites for Structural DNA Nanotechnology

Before DNA can be used as nanomaterial it must be structured into the correct form, which can be achieved through three main ideas:

  1. Hybridization
  2. Stably branched DNA
  3. Synthesis of designed sequences.

Hybridization mostly entails the use of sticky-ended cohesion, which joins pieces of linear duplex DNA through hydrogen bonding. Sticky-ended cohesion is very useful because complementary base pairing lets us predict which sticky ends will cohere with one another. The product of cohesion is ordinary double-helical DNA, so the local structure at the join is known in advance and does not have to be determined by crystallography first.
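The predictability of sticky-ended cohesion comes down to Watson-Crick complementarity, which is easy to check in code. The following is a minimal sketch, assuming both overhangs are written 5′→3′ and contain only A, C, G, and T; the function names are illustrative and not part of any library.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def sticky_ends_cohere(overhang_a: str, overhang_b: str) -> bool:
    """True if two single-stranded overhangs (both 5'->3') can pair fully."""
    return overhang_a.upper() == reverse_complement(overhang_b)


if __name__ == "__main__":
    print(sticky_ends_cohere("AAGC", "GCTT"))  # complementary pair
    print(sticky_ends_cohere("AAGC", "AAGC"))  # same sequence, not complementary
    print(sticky_ends_cohere("AATT", "AATT"))  # self-complementary overhang
```

Because each overhang pairs only with its reverse complement, a designer can assign a distinct sticky-end sequence to every join and thereby program which pieces cohere.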

DNA before ligase

Stably branched DNA in combination with hybridization is what allows DNA to serve as a construction material. Branched DNA molecules self-assemble to form larger arrangements. Where the DNA forms a junction, strands have come together in a complementary fashion, and the sticky ends left over on each arm bind to other junctions so that the molecules “self-assemble” into two-dimensional or three-dimensional lattices.

However, there is one major problem with naturally occurring junctions: they are unstable because of their sequence symmetry. This symmetry allows a specific type of isomerization known as branch migration, in which the branch point relocates, making the entire structure unstable. To make a stable DNA structure we must minimize sequence symmetry. Natural sequences, though, are highly symmetric at such junctions, so we must synthesize DNA molecules of arbitrary sequence. Laboratories have been able to do this since the 1980s, and it is now possible to order such molecules, known as “vanilla” DNA because they lack complexity. Because these DNA molecules are readily synthesized, DNA can be obtained in the structural forms needed to use it as a nanomaterial.

Individual Steps

Motif Design

This step involves switching the connections between DNA strands. It resembles genetic recombination in that something close to crossing over is performed: connections are switched between two different DNA double helices, which ultimately gives a new connectivity. It is important to note that this step is purely theoretical; it is a drawing stage, similar to preparing blueprints, for the later sequence design.
Motif design is carried out through an operation called reciprocal exchange. If only a single reciprocal exchange is performed, the result is merely a conformer and nothing new has been produced. At least two exchanges must be carried out to obtain a different topology.
Several key motifs have been generated:

  1. DX motif: has exchanges between strands of opposite polarity and is known for its persistence length (stiffness), which is about twice that of conventional linear duplex DNA
  2. DX + J motif: has an extra domain, usually perpendicular to the plane of the two helix axes, which serves as a topographic marker that is easy to distinguish with an atomic force microscope
  3. TX motif: has three domains joined in a pattern known as 1-3 fashion (where the top helical domain of one is joined to the bottom domain of another). These three domains are useful because the 2D arrays they create contain cavities.
  4. PX motif: has exchanges at every point where two double helices can be juxtaposed
  5. JX2 motif: is a topoisomer of the PX motif that lacks two of the exchanges the PX motif has

Sequence Design and Symmetry Minimization

In this step the motifs designed previously are turned into actual base sequences that can be synthesized in the lab. The main goal is to get the strands to adopt the designed structure, which is an excited state relative to ordinary linear duplex DNA. One effective method for achieving this is minimization of sequence symmetry, which has worked very well for the design of branched molecules.

There are a few helpful guidelines to follow to avoid getting unstable molecules:

  1. prevent long stretches of Guanine, because these could form other structures near crossover points
  2. avoid tracts that are inherently symmetric, such as homopolymer tracts, polypyrimidine tracts, polypurine tracts, and alternating purine-pyrimidine tracts
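The two guidelines above can be turned into a simple screening step for candidate sequences. The sketch below is one possible way to do this and is not a published design tool; the threshold `limit=4` is a hypothetical cutoff chosen for illustration, and the function names are invented for this example.

```python
import re


def longest_run(items: str) -> int:
    """Length of the longest run of identical consecutive characters."""
    best = run = 0
    previous = None
    for item in items:
        run = run + 1 if item == previous else 1
        best = max(best, run)
        previous = item
    return best


def longest_alternating(classes: str) -> int:
    """Length of the longest stretch in which neighbours always differ."""
    best = run = 1 if classes else 0
    for previous, current in zip(classes, classes[1:]):
        run = run + 1 if current != previous else 1
        best = max(best, run)
    return best


def tract_report(seq: str, limit: int = 4) -> dict:
    """Return the tract types in seq that are longer than `limit` bases."""
    seq = seq.upper()
    # R = purine (A or G), Y = pyrimidine (C or T)
    classes = "".join("R" if base in "AG" else "Y" for base in seq)
    measures = {
        "G run": max((len(r) for r in re.findall("G+", seq)), default=0),
        "homopolymer": longest_run(seq),
        "purine/pyrimidine-only tract": longest_run(classes),
        "alternating purine-pyrimidine": longest_alternating(classes),
    }
    return {name: n for name, n in measures.items() if n > limit}


if __name__ == "__main__":
    for candidate in ("ACGTTGCA", "AAGGGGGGTC"):
        print(candidate, tract_report(candidate))
```

A sequence that returns an empty report passes these particular checks; it may still need a fuller symmetry analysis across all strands of the design.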

There are some situations where these previous tips don’t necessarily work, in the case of a 12-arm junction for example. Here it is impossible to flank the branch points with different base pairs. So instead of attempting to eliminate symmetry around the center of the junction, we take identical nucleotide pairs and space them at four-step intervals around the junction.

There are a few situations where scientists have completely ignored sequence symmetry and have managed to design structures, such as with DNA origami. Scientists used a long single strand of viral DNA to scaffold a couple hundred shorter strands to produce a two-dimensional or three-dimensional shape. Two astonishing DNA origami pieces that they were able to create include a smiley face, and a map of the Western Hemisphere.

Uses of DNA as a Nanomaterial

DNA has a variety of uses as a nanomaterial. One is the construction of nanomechanical devices, including molecules that self-assemble, change their own shapes, and walk along a DNA sidewalk. DNA can also be used to organize or move other species, such as proteins, enzymes, and nanoelectronic components. These uses have spurred researchers to investigate the possibilities for programming the information in DNA beyond its role in the genetic code.


Several kinds of DNA-based systems have been developed: periodic assembly, which uses DNA to organize proteins and nanoelectronic components; algorithmic assembly, which is derived from DNA-based computation; and seminatural constructs.

Organization of proteins and nanoelectronic components using Periodic Assembly

DNA can be used to organize proteins, and proteins can be linked to one another through structural DNA nanotechnology. For example, biotin groups have been attached to DNA arrays, which were then bound to streptavidin. Another focus of structural DNA nanotechnology is organizing nanoelectronic and nanophotonic components. DNA scaffolds can likewise be used to organize metallic nanoparticles, such as gold.

Algorithmic Assembly

Structural DNA nanotechnology can also be used to organize aperiodic matter through algorithmic assembly, although this is not as simple as periodic assembly. One advantage of algorithmic assembly is its ability to generate complex patterns using only a few tiles. Its disadvantage is that it is extremely sensitive to errors and more error-prone than periodic assembly.
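A frequently cited illustration of how a few tile types can produce a complex pattern is a set of tiles that computes XOR, which grows a Sierpinski-triangle pattern. The sketch below simulates only the logic of such a system, not DNA chemistry; the grid layout and seed placement are assumptions made for this example and are not described in the source text.

```python
def xor_assembly(rows: int, width: int) -> list:
    """Grow a pattern row by row; each new tile is the XOR of two tiles below it."""
    seed = [0] * width
    seed[0] = 1  # a single "1" tile in the seed row (illustrative choice)
    pattern = [seed]
    for _ in range(rows - 1):
        below = pattern[-1]
        new_row = [
            below[i] ^ (below[i - 1] if i > 0 else 0)
            for i in range(width)
        ]
        pattern.append(new_row)
    return pattern


if __name__ == "__main__":
    for row in xor_assembly(8, 8):
        print("".join("#" if tile else "." for tile in row))
```

Only four input combinations exist for XOR, so only four rule tiles are needed, yet the printed pattern is aperiodic. The same property explains the error sensitivity noted above: one wrong tile changes every tile that is later computed from it.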

Seminatural Constructs

It is also possible to combine other chemical species with DNA in nanoconstructs. Metallo-organic complexes were placed in junction sites, which led to different properties in presence or absence of metal. It is also possible to stimulate G-tetrad formation by use of a square-shaped organometallic molecule.

Because DNA can be combined with other synthetic molecular species, it is extremely valuable as a construction material. A further question is whether the arrangement can be changed over time, which adds a fourth dimension to structural DNA nanotechnology.

Devices can be based on structural transitions. However, such systems are limited to two states because they ignore the programmability of DNA. Other devices rely on sequence-dependent transitions, so many individually addressable devices can operate in the same solution.


Seeman, Nadrian C. “Nanomaterials Based on DNA”. Annual Review of Biochemistry 2010. Vol. 79: 65-87. 03/11/2010. DOI: 10.1146/annurev-biochem-060308-102244

Abu-Salah, Khalid M., Ansari, Anees A., Alrokayan, Salman A. “DNA-Based Applications in Nanobiotechnology”. Journal of Biomedicine and Biotechnology 2010. 2010:15. 12/1/2001


Structural DNA nanotechnology (SDN) concentrates on designing and building nucleic acid complexes, often in combination with nanoparticles and other nanomaterials. It aims to achieve complete control of DNA structure in space and time. With SDN, scientists can fold DNA into a wide variety of designed shapes. Nanotechnology and nanoscience are inherent to the construction of DNA, because its structure and dimensions are measured on the nanoscale.

Fundamental concepts

A DNA four-arm junction showing the nucleotide sequences.

These four strands associate into a DNA four-arm junction because this structure maximizes the number of correct base pairs, with As matched to Ts and Cs matched to Gs.[2]

DNA nanotechnology creates complex structures out of nucleic acids by taking advantage of the specificity of base pairing in nucleic acid molecules. The structure of a nucleic acid molecule consists of a sequence of nucleotides, distinguished by which nucleobase they contain (A,C,T,G). Nucleic acids have the property that two molecules will bind to each other to form a double helix only if the two sequences are complementary, meaning that they form matching sequences of base pairs, with A’s only binding to T’s, and C’s only to G’s. Because the formation of correctly matched base pairs is energetically favorable, nucleic acid strands are expected in most cases to bind to each other in the conformation that maximizes the number of correctly paired bases. This property, that the sequence determines the pattern of binding and the overall structure, is used by the field of DNA nanotechnology in that sequences are artificially designed so that a desired structure is favored to form.[3]

Most if not all DNA nanostructures use branched DNA, the most basic form of which looks like a four-way intersection. This simple, rigid branched structure is made from four separate DNA strands, each partly complementary to two of the others. Naturally occurring branched structures exist, the Holliday junction being one example. The difference is that in the artificial structure the base sequence of each arm is different, so the junction point is fixed in position, giving rigidity and stability to the structure.[3]
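The idea that four strands with different arm sequences give a fixed junction can be sketched in code. The example below builds the four strands from four arm sequences and applies a simplified one-step test for branch migration: the branch point can slide only if two opposite arms carry the same base next to the junction. The arm sequences, function names, and the one-step criterion are illustrative assumptions, not a complete stability analysis.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def junction_strands(arms: list) -> list:
    """Build the strands of a junction.

    arms[i] is the 5'->3' sequence of the strand that runs toward the
    branch point along arm i. Each strand then leaves along the next arm,
    where it pairs with the strand entering on that arm.
    """
    n = len(arms)
    return [arms[i].upper() + reverse_complement(arms[(i + 1) % n]) for i in range(n)]


def can_branch_migrate(arms: list) -> bool:
    """Simplified one-step test for a four-arm junction."""
    ends = [arm[-1].upper() for arm in arms]  # bases flanking the branch point
    return ends[0] == ends[2] or ends[1] == ends[3]


if __name__ == "__main__":
    arms = ["ACGC", "TGAA", "CATG", "GTCT"]  # hypothetical arm sequences
    for strand in junction_strands(arms):
        print(strand)
    print("can migrate:", can_branch_migrate(arms))
```

In a natural Holliday junction the two exchanging duplexes are homologous, so opposite arms are identical and the test returns true at every step; an artificial junction is designed so that it returns false.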

Initial DNA Nanotechnology Process

Motif design and sequence design are the two major steps in the initial process of structural DNA nanotechnology. It is essential to generate new DNA motifs that can self-assemble from strands of DNA. Motif design relies on “reciprocal exchange” of connections between two DNA double helices to form new DNA motifs. Following motif design, sequence design assigns a specific base sequence to each individual strand. Sequence design is vital to the self-assembly of the previously designed motif.

Constructs and DNA Origami

With the use of motif design, sequence design and nano materials, DNA can be (and has been) formed into numerous different constructs. Scaffolding strands and helper strands are utilized in the folding and formation of DNA origami. Paul W.K. Rothemund demonstrated that with strand folding, he was able to create a smiley face of DNA. One of the important discoveries of these constructs and origami creations is the increase in surface area available to the DNA. This can be utilized in the embedding of nano mechanical devices, building three dimensional objects and even potential therapeutic delivery uses.[4]

DNA-Based Nanomechanical Devices

In addition to triumphs in controlling the structure of DNA nanomaterials in space, considerable success has been achieved in changing these structures in a controlled way over time. Such DNA-based structures fall under the study of nanomechanical devices. Nanomechanical motion can be achieved by exploiting DNA structural transitions such as the B-Z transition and the controlled conversion of a PX structure to a JX2 structure. Motion can also be achieved using DNA sequence dependency. Virtually every sequence-dependent DNA device exploits a technique in which an 8-nt “toehold” is added to a controlling strand in the device to allow for a change of state. Initially, this toehold in the controlling strand is unpaired, but when a complete complement is added to the strand, the toehold and its complement bind together, and the remainder of the strand is removed by branch migration. Examples of devices that use this technique are the molecular tweezers developed by Bernard Yurke and his co-workers, and nanomechanical devices constructed to walk on DNA “sidewalks” in a controlled manner, first through human intervention and more recently through autonomous action.[4]


  1. Churchill, Das, Tyler The histone shuffle: histone chaperones in an energetic dance
  2. Mao, Chengde (December 2004). “The Emergence of Complexity: Lessons from DNA”. PLoS Biology 2 (12): 2036–2038. doi:10.1371/journal.pbio.0020431. PMC 535573. PMID 15597116
  3. Seeman, Nadrian C. (June 2004). “Nanotechnology and the double helix”. Scientific American 290 (6): 64–75. doi:10.1038/scientificamerican0604-64. PMID 15195395
  4. Seeman, Nadrian C. (July 2010). “Nanomaterials Based on DNA”. Annual Review of Biochemistry 79: 65–87. doi:10.1146/annurev-biochem-060308-102244

Holliday Junction

A Holliday junction is a mobile junction in which four DNA strands come together to form four double-helical arms. The model was proposed by Robin Holliday in 1964 to explain the exchange of genetic information in yeast, a process now known as homologous recombination. (1)

2-D View of the Holliday Junction

However, this structure is unstable because it contains homologous sequence symmetry, which allows it to isomerize. Because these junctions lie between homologous sequences, they can slide up and down the DNA. When several single strands of the same length are present, the free end of each strand can pair with another DNA helix that also has free ends. The strands are then crossed over, which creates the Holliday junction. (2)


Proliferation is the growth or production of cells in an organism. Controlling proliferation, which is achieved by the genes of a cell, maintains the homeostasis of the body. Proliferative genes such as proto-oncogenes, that promote cell division, facilitate growth. Antiproliferative genes, on the other hand, inhibit proliferation and are important in limiting positive cell growth. Antiproliferative genes are categorized into three groups: tumor suppressor genes, genes that can kill a cell through loss of function, and genes that code for growth inhibitory proteins. Problems in either proliferative or antiproliferative genes may cause uncontrolled cell growth, leading to severe consequences.


J P Rouault, R Rimokh, C Tessa, G Paranhos, M Ffrench, L Duret, M Garoccio, D Germain, J Samarut, and J P Magaud: “BTG1, a member of a new family of antiproliferative genes.”

Protein-DNA Recognition

The binding affinity of a protein for DNA was traditionally thought to be dictated by contacts with the bases and phosphates at the binding interface. However, recent discoveries have shed light on how interactions between DNA and selected residues that were previously thought to lie outside the binding interface affect gene expression. When these regions bind to DNA, they do not produce a fixed structure. Rather, they create a dynamic complex that serves to regulate the transcription of RNA from DNA. This includes regulation not only of the machinery used to make RNA but also of modifications to the pre-mRNA. [1]

Basic Mechanisms of Protein-DNA interaction

The process whereby DNA is transcribed into RNA and RNA is translated into a protein in its simplest form was thought to be controlled by characteristic indirect and direct mechanisms. These mechanisms are generally understood in terms of hydrogen bonding and electrostatic interactions. Hydrogen bonding between corresponding bases serves as a direct DNA template. The structure of DNA is also indirectly influenced by interactions between phosphate groups.[2]

Delving further into DNA Readout

Looking beyond the basic mechanisms that influence DNA conformation, we must also consider other factors that determine how DNA can change its shape and flexibility during transcription. One of the primary factors that determines the geometry of a DNA molecule is its groove width. The electrostatic potential of the molecule, which largely governs its ability to bind proteins, is dictated by this geometry. Proteins that bind to DNA contain segments that recognize specific DNA sequences. These segments are referred to as ID segments because they are intrinsically disordered, and they can distinguish and bind to DNA at multiple stages. Although the mechanism by which ID segments influence the shape of DNA is not fully understood, it is theorized that ID segments behave much like globular proteins and fold into a designated shape once they make contact with their target sequence on the DNA. One of the most important roles of ID segments is guiding DNA-binding proteins along a complex. ID segments also help a DNA complex transition into a more specific state. [3]

ID Segments in Protein-Protein Interactions

The function of ID segments has been best studied in protein-protein interactions. In these interactions, ID regions initially appear to serve no distinguishing purpose. However, upon closer inspection, it is clear that ID regions influence both binding affinity and selectivity between proteins. ID regions can change the binding affinity between two proteins by adopting different conformations which allows for different binding interactions and more mobility. [4]

ID segments in Protein-DNA Interactions

Although the role of ID segments in DNA-protein interactions is not as well understood as the role of ID segments in protein-protein interactions, researchers do know that in most DNA-protein interactions for eukaryotic cells transcription factors only target the DNA binding region of the protein. This may be because ID regions tend to be more susceptible to degradation by proteolysis. Roughly 70% of proteins that bind DNA contain ID regions, which are usually localized on the tail. These ID segments give the tail a charge which functions in the identification of DNA sequences using electrostatic interactions. Electrostatic DNA-protein interactions are nonspecific, and they minimize the presence of free DNA-binding proteins. [5]

The Mediator complex in humans regulates the ability of RNA polymerase II to express protein-coding genes. Not much is known about how it works, because Mediator is very large (26 subunits and 1.2 MDa) and can change composition depending on the promoter. This makes Mediator appear to have an almost unlimited number of regulatory functions.
Studying Mediator is difficult, because it is a challenging process to isolate it from human cells. It is not very abundant and is found taking different forms in the body. Another reason is that bioinformatics cannot be used to study its function. There is no conclusive evidence for Mediator being in microbes, so it is currently assumed to be only in eukaryotes.

Mediator Control of Transcription Machinery

Mediator’s first discovered function was its activity in activator-dependent transcription. Researchers found that Mediator is a target of DNA-binding transcription factors. However, Mediator may also play a role in activator-independent transcription, because it can stimulate basal transcription (a type of activator-independent transcription). This suggests that Mediator could have control over the initiation of transcription. Further support comes from the finding that Mediator enhances RNA pol II recruitment and stabilizes transcription complexes at the promoter. Mediator has been shown to interact with the pre-initiation complex (PIC), which is the transcription initiation machinery, through its large surface area, which allows for multiple protein-protein interactions. Mediator also acts as a scaffold for PIC assembly.

CDK8 Submodule

Strangely, Mediator can also block RNA pol II from working with the PIC, with the help of the cyclin-dependent kinase 8 (CDK8) submodule. The CDK8 submodule contains the proteins CDK8, MED12, MED13, and cyclin C. Its effect was discovered by separating Mediator complexes with the CDK8 submodule from those without it and observing that CDK8-containing Mediator could not begin transcription. The submodule does this by inhibiting Mediator’s ability to bind the RNA pol II C-terminal domain and to recruit RNA pol II to a pre-initiation complex. Another role of CDK8 is regulating serum response genes such as FOS, EGR1, EGR2, and EGR3. CDK8 helps regulate RNA pol II phosphorylation and elongation at these genes.

Mediator Gene-Selective Functions

Mediator is likely to have a role at all protein-coding genes. New research has shown that MED1, the largest Mediator subunit, is generally targeted by nuclear receptors. MED1 is phosphorylated by extracellular signal-regulated kinase (ERK), which stabilizes MED1 in Mediator. When MED1 is phosphorylated, it activates transcription more effectively.


The human Mediator complex: a versatile, genome-wide regulator of transcription – Dylan J. Taatjes

Aging effects on chromatin

In mammals, aging is a complex process that probably results from a wide variety of interacting molecular changes. In culture, cells have been observed to undergo an irreversible cell cycle arrest known as replicative senescence, which is believed to reflect the aging process at the cellular level. Recent studies of skin biopsies have shown that senescent cells accumulate during aging and can represent up to 15% of the total cells. In eukaryotic cells, the histones and non-histone proteins that form the chromatin fiber are closely associated with DNA, and chromatin has been implicated in determining the senescent phenotype.


During embryogenesis, chromatin states are what allow for the development of an organism. Evidence suggests that the organization of chromatin deteriorates throughout the lifespan of an organism. New evidence suggests that this chromatin deterioration could have an effect on the way that organisms age.

Chromatin is the mass of genetic material, a complex of DNA, proteins, and RNA, located inside the nucleus. Its basic unit comprises about 146 bp of DNA wrapped around an octamer of histones. Chromatin is divided into heterochromatin and euchromatin, with heterochromatin further divided into two types, constitutive and facultative heterochromatin. For a while, chromatin has been known to correlate with just about every process in the mammalian nucleus, but recent advances in technology, allowing loss-of-function experiments and genome-wide approaches, have shed light on a causal relationship between specific changes in chromatin structure and the aging phenotype. Aging in this case is the progressive decline in vitality that eventually ends in death, and an aging phenotype is some outward sign usually associated with aging, such as osteoporosis, sarcopenia, declining immune function, cancer, and many others. Chromatin structure is dynamic and goes through extensive developmental and age-associated remodeling that appears to counter aging and age-associated diseases, such as cancer, and to extend the organismal lifespan. However, non-deterministic changes in chromatin structure might also contribute to the breakdown of nuclear, cell, and tissue function, ultimately causing the very symptoms of aging. Evidence points to the loss of heterochromatin structure, altered patterns of histone modification, loss of key heterochromatin proteins, and increased levels of persistent DNA damage as common signs of both normal and premature aging.

Chromatin as a potential regulator in aging

A characteristic of aging is the loss of homeostatic mechanisms that once acted to offset the macromolecular wear and tear that occurs during an organism’s lifetime. Chromatin, being a macromolecule of the cell, is exposed to stresses that can affect both its structure and function. There is a proposal stating that aging is due to the change from a youthful chromatin configuration to one that helps bring about molecular signatures of aging. One example of this can be found in monozygotic twins, whose chromatin modification patterns diverge increasingly with age. These chromatin changes might be the base to subtle phenotypic variations that become more prominent as twins (as well as closely related individuals) age. This evidence suggests that aging may be the job of chromatin-based epigenetic regulatory mechanisms. The alteration of chromatin structure with regards to aging has been studied in depth; most importantly noted is the reshaping of chromatin during cellular senescence. Cellular senescence is an irreversible state of cell cycle arrest. This arrested state of the cell cycle is hypothesized to reflect aging.

Age-associated deregulation of chromatin modifiers

A common feature of aging is random cell-to-cell variation in gene expression. Also, differences in the expression of chromatin modifiers have been found throughout senescence and aging. This changes the distribution and levels of chromatin modifications throughout the nucleus and at aging-associated loci, leading to the activation of age-promoting physiological responses.

In mammals, there is an age-associated decline in genomic DNA methylation at certain DNA sequences, which has been theorized to promote the deheterochromatinization of these regions. However, DNA methylation tends to increase at certain other sites as aging progresses, which supports the idea that heterochromatin accumulates with tissue aging at these sites, particularly in locations where the histone chaperone HIRA has been observed. Histone chaperones are histone-binding proteins involved in the assembly of histones into nucleosomes, and their activity helps determine chromatin structure and function. In particular, HIRA has been linked to transcription activation and is known to have an evolutionarily conserved role in heterochromatin formation. Experimental data have shown that HIRA either increases in expression or undergoes regulation in aging baboon skin. Given that there is an observed overall decline in heterochromatin but an increase at specific sites, it has been suggested that aging is also associated with remodeling of the chromatin structure.

SIRT1 and SIRT6 in DNA repair and aging

Silent information regulator 2 (Sir2), a protein found in yeast, aids in silencing rRNA genes and telomeres. As yeast age, this ability is weakened. The weakened silencing results in excision of rDNA arrays from the genome, which creates circular episomal DNAs. If Sir2 continues to function normally, the formation of these circles is inhibited and the lifespan of yeast is extended. Sir2 does not seem to directly affect the lifespan of some organisms, but studies on budding yeast by Kaeberlein have shown that the silencing protein Sir2 is a limiting component of longevity: deletions of Sir2 shorten life span, and an extra copy of the gene increases life span. Some evidence suggests that the Sir2 homologs SIRT1 and SIRT6 have very important roles in managing the response to DNA damage and cellular stress. Orthologs of Sir2 have anti-aging functions in many species, including nematodes and flies, but the mechanisms do not appear to involve rDNA circles in these species.

SIRT1 makes changes to chromatin by deacetylating H1K26, H3K9, and H4K16. Findings from experiments on mice show that SIRT1 binds to repetitive elements. After oxidative DNA damage, SIRT1 redistributes genome-wide to damage sites. At damage sites, SIRT1 deacetylates H1, advancing DNA repair. Expression of SIRT1 declines with age, and reduced expression has been correlated with premature aging in mice. However, the actual effect of SIRT1 on mammalian aging has yet to be established, since it has many non-chromatin substrates.

SIRT6 also seems to be involved in repairing damaged DNA, further evidenced by the fact that SIRT6-deficient human cells are more affected by DNA damaging agents. SIRT6 helps in homologous recombination by deacetylating the C-terminal binding protein interacting protein. SIRT6 also directs PARP-1, which is one of the first responders to DNA damage. Low levels of SIRT6 annul Werner helicase function, promoting telomere dysfunction.


O’Sullivan, Roderick J., and Jan Karlseder. “The Great Unravelling: Chromatin as a Modulator of the Aging Process.” Trends in Biochemical Sciences 37.11 (2012): 466-76. Web.

1. DiMauro, Teresea, and Gregory David. “Chromatin Modifications: The Driving Force of Senescence and Aging?” Aging 1.2 (2009): n. pag. Print.
2. Guarente, Leonard. “Sir2 Links Chromatin Silencing, Metabolism, and Aging.” Genes and Dev 14.9 (2000): 1021-026. Print.
3. Pegoraro, Gianluca, Nard Kubben, Ute Wickert, Heike Göhler, Katrin Hoffmann, and Tom Misteli. “Ageing-related Chromatin Defects through Loss of the NURD Complex.” Nature Cell Biology (2009): n. pag. Print.
4. Sedivy, J., G. Banumathy, and P. Adams. “Aging by Epigenetics—A Consequence of Chromatin Damage?” Experimental Cell Research 314.9 (2008): 1909-917. Print.
5. Vaquero, Alejandro, Alejandra Loyola, and Danny Reinberg. “The Constantly Changing Face of Chromatin.” Science of Aging Knowledge Environment 2003.14 (2003): 4re-4. Print.


Saccharomyces cerevisiae Sir2 is an NAD+-dependent histone deacetylase. Its role within the cell is to link chromatin silencing to genomic stability, cellular metabolism, and lifespan regulation. For example, mice deficient in SIRT6 (a member of the Sir2 family) experience genomic instability, metabolic defects, and degenerative pathologies associated with aging, the opposite of what Sir2 function promotes. With new insights into the previously ambiguous SIRT6, scientists have discovered that SIRT6 is a highly substrate-specific histone deacetylase that promotes proper chromatin function in processes such as telomere stabilization and DNA repair.

Sir2: a chromatin-aging connection

Sir2 is the founding member of the family of proteins called sirtuins. These proteins provided the first link between chromatin regulation and aging. Sir2 favors chromatin silencing at sub-telomeric DNA, silent mating-type loci, and rDNA repeats. These effects of Sir2 on chromatin are mediated by its NAD+-dependent histone deacetylase activity: Sir2 catalyzes the deacetylation of lysine residues on the amino-terminal tails of histones H3 and H4 and also on the globular core of histone H3. Deacetylation of H4 lysine 16 (H4K16) and H3 lysine 56 (H3K56) mediates the silencing effects of Sir2.

Example: In budding yeast, Sir2 regulates replicative lifespan through a couple of chromatin-silencing processes.

First, Sir2 suppresses recombination between rDNA repeats, and this prevents the excision of rDNA arrays as circular episomal DNAs.

Second, Sir2 protein levels decrease as replicative age increases, and H4K16 acetylation levels at telomeres increase as a result. These chromatin changes create defects in telomere position-dependent transcriptional silencing and trigger replicative senescence.[6]

A few studies have shown that there are aging-related Sir2 functions that might be chromatin-independent, making the relationship between Sir2 and lifespan regulation even more complex.

For example, Sir2 asymmetrically segregates damaged proteins to the yeast mother cell during cell division; this asymmetry can age the mother cell by forming toxic protein aggregates. Also, Sir2 can block lifespan extension in response to nutrient deprivation or mutations in nutrient-sensing pathways.

Mammalian sirtuin proteins: venturing out from chromatin

Of the seven mammalian SIR2 family members, SIRT1 is the most closely related to yeast Sir2. However, Sir2 appears to deacetylate histones exclusively, while SIRT1 appears to have more than 40 substrates. SIRT1 deacetylates many non-histone proteins and affects many physiologic processes, such as apoptosis.

SIRT1, SIRT6, and SIRT7 are concentrated in different sub-nuclear patterns; SIRT2 is cytoplasmic; SIRT3, SIRT4, and SIRT5 reside in the mitochondria.

SIRTching for a function through knockout mice

SIRT6-deficient mice appear normal at birth, but after a couple of weeks they start to develop degenerative phenotypes such as osteoporosis. They also experience metabolic defects, including very low levels of insulin-like growth factor 1 (IGF-1), that are so severe that these mice die by 1 month of age.

An orphan enzyme finds its substrates

In vitro experiments showed that SIRT6 promotes mono-ADP-ribosylation, an alternative NAD+-dependent reaction in sirtuins. A further breakthrough in understanding SIRT6 function was the discovery of its enzymatic activity and first substrate: NAD+-dependent deacetylation of histone H3 lysine 9. SIRT6 specifically deacetylates H3K9 but, owing to its high specificity, lacks activity on many other histone tail residues.

Two groups independently identified a second substrate for SIRT6: acetylated lysine 56 of histone H3 (H3K56Ac).

To the core and beyond: biochemical dissection of SIRT6 function

Sirtuin proteins have a conserved central “sirtuin domain” flanked by N- and C- terminal extensions. The sirtuin domain supposedly has an enzymatic core and understanding this domain can show scientists the physiologic regulation of sirtuin proteins.

For SIRT6, a recent study showed that the N- and C-terminal domains regulate SIRT6 function: the C terminus is required for proper nuclear localization (but is dispensable for enzymatic activity), while the N terminus is important for chromatin association and intrinsic catalytic activity.

Why is catalytic activity required for chromatin association in the cell?

It could be possible that histone deacetylation by SIRT6 might be able to stabilize SIRT6 availability at chromatin or it can promote propagation of SIRT6 molecules along chromatin.

At the ends of chromosomes: SIRT6 regulates telomeric chromatin

SIRT6 plays an important role in the chromatin-regulatory context by keeping the integrity of telomeric chromatin stable. Telomeres are specialized DNA-protein structures which protect chromosome ends that are linear from degradation and fusion. SIRT6 plays a huge role at telomeres in humans for a couple of reasons:

First, telomere structures need to be correct in order to maintain genomic stability; chromosomal instability is apparent in cancer cells.
Also, telomere length decreases with cellular age. This shows that SIRT6’s role at telomeres correlates with aging.


Through many experiments and discoveries, SIRT6 has been established as a site-specific histone deacetylase that plays very important roles in maintaining telomere integrity, fine-tuning aging-associated gene expression programs, preventing the genome from becoming unstable, and maintaining metabolic homeostasis.

Not only does SIRT6 function at specific sites in the genome, it also binds to additional gene promoters. There might also be interactions between SIRT6 and other sirtuin proteins.

Lastly, SIRT6 might have an impact on cancer due to the fact that there have been links between SIRT6 and cancer by the SIRT6 chromosomal locus.


Tennen, Ruth I., and Katrin F. Chua. “Chromatin regulation and genome maintenance by mammalian SIRT6.” Trends in Biochemical Sciences 36.1 (2011) 39-46. Academic Search Complete. Web. 05 December. 2012.
RNA is also known as ribonucleic acid. It is a part of most living organisms as well as viruses. It contains bases of Adenine, Cytosine, Guanine, and Uracil (instead of Thymine) which all bind to the ribose. RNA can be used to make DNA as well as synthesize proteins. It is the only polymer that can serve as a catalyst to the formation of proteins as well as storing genetic information. The RNA backbone is made of alternating ribose-phosphate groups. RNA can be found usually single stranded in humans, but can appear double stranded in many other organisms, including viruses.

Some viruses have RNA as their primary genetic material. They are known as RNA viruses. These viruses infect cells by first binding to a specific protein or receptor on the surface of the cell. After binding to the cell’s surface, the virus injects its genetic material, the RNA, into the cell. The viral RNA then associates with the ribosomes of the infected cell. Essentially, the virus seizes control of its host’s molecular machinery and uses the host cell’s translational machinery to produce viral proteins. The newly made viral proteins then go on to form new viruses. Furthermore, viral RNA can form replication complexes in which it copies itself. This newly replicated RNA is packaged into the newly created viruses, whose release causes the cell to lyse, or break open. The released viruses can then go on to infect other cells.

RNA is a nucleic acid whose single-stranded, helical structure is built from nucleotides, each consisting of a nitrogenous base, a ribose sugar, and a phosphate group. The bases are adenine, guanine, cytosine, and uracil. The N1 nitrogen of a pyrimidine base or the N9 nitrogen of a purine base is bonded to the 1′ carbon of the pentose sugar by a glycosidic bond. Base pairs of adenine with uracil and of cytosine with guanine are held together by hydrogen bonds. Ribose is a pentose sugar whose carbons are numbered from 1′ to 5′, and it has a hydroxyl group on the 2′ carbon. The 3′ and 5′ carbons of adjacent ribose sugars are linked through phosphate groups by phosphodiester bonds. More importantly, helical regions adopt A-form geometry, which has a deep, narrow major groove and a shallow, wide minor groove. The strand can fold on itself to form secondary structure, as in tRNA and rRNA, and secondary structures stabilized by hydrogen bonds, loop domains, and metal ions such as Mg2+ fold into a specific tertiary form.

Double Stranded RNA

Double-stranded RNAs (dsRNAs) are RNAs that have a complementary strand, similar to DNA. Many viruses have dsRNA genomes and infect a variety of hosts, including animals, humans, fungi, plants, and bacteria. An RNA virus is a virus that contains only RNA as its genetic material, or whose genetic material passes through an RNA intermediate during replication. By the latter definition, Hepatitis B is an example, because even though it has a double-stranded DNA genome, the genome is transcribed into RNA during replication. RNA viruses have very high mutation rates because their polymerases generally lack the proofreading activity that DNA polymerases use to find and correct mistakes. dsRNAs can also be produced synthetically by in vitro transcription, with cloning and PCR used to prepare and amplify the template. dsRNAs trigger the RNAi pathway.

Double-stranded RNA (dsRNA) is important because it helps regulate gene expression in eukaryotic cells. It triggers a form of gene silencing known as RNA interference (RNAi). Interfering RNA is a dsRNA that is chopped into smaller fragments, which bind to mRNA to block gene expression. This reduces the production of the gene’s encoded protein, which helps keep growth at the right level; the same pathway also serves in the cell’s defense against foreign nucleic acids.
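The pairing step described above can be illustrated with a minimal sketch. It assumes a short guide fragment that recognizes its target purely by perfect Watson-Crick complementarity; the function names and the toy sequences are hypothetical and are not taken from any library or from the cited sources.

```python
# Illustrative sketch only: sequences are written 5'->3' with A, C, G, U.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the strand that would base-pair with `rna`, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))

def find_target_site(mrna, guide):
    """Return the index in `mrna` where the guide can pair, or -1 if none."""
    return mrna.find(reverse_complement(guide))

mrna = "AAGCUAAA"   # toy mRNA (assumption)
guide = "UAGC"      # toy guide fragment (assumption)
print(find_target_site(mrna, guide))  # 2
```

Real interfering RNAs are about 21 nucleotides long and tolerate some mismatches, so exact string matching is only a first approximation.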


RNA is usually found in humans as a single-stranded linear polymer. The monomeric units (nucleotides) are linked together by 3′-5′ phosphodiester bridges. (A nucleoside is a ribose sugar connected to a base through the 1′ carbon, while a nucleotide is a nucleoside plus a phosphate group connected to the 5′ carbon of the sugar.) The secondary structure of RNA is stabilized by hydrogen bonds formed by intrastrand pairing of the bases (AU, GC), often resulting in structures such as hairpin loops. The stability of these loops depends on the number of unpaired bases in the loop; more than 10 or fewer than 5 is not very energetically favorable. The structure is often less stable when the bases in the stem of a hairpin loop cannot all form Watson-Crick base pairs. Because it is single stranded, RNA also folds into more complex structures, and sometimes three nucleotides interact to stabilize the structure. Mg2+ ions stabilize these more elaborate structures. In such cases, hydrogen-bond donors or acceptors that are not already engaged in Watson-Crick base pairs can interact and form hydrogen bonds in ‘irregular’ pairings. Because of the extra hydroxyl group on the 2′ carbon, RNA is not as stable as DNA and does not form double helices as easily, although double-stranded RNA has been found in some viruses. The 2′ hydroxyl group also makes RNA prone to self-hydrolysis: the hydroxyl group attacks the adjacent phosphorus, cleaving the phosphodiester bond to the 5′ oxygen of the next nucleotide. This instability also contributes to DNA being the preferred molecule for genetic storage in humans.
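As a rough illustration of the hairpin rules above, the following sketch checks whether a sequence could form a simple stem-loop: the two ends must pair through AU and GC base pairs, and the loop must contain between 5 and 10 unpaired bases. The function name, the fixed stem length, and the restriction to perfect Watson-Crick stems are simplifying assumptions; real folding depends on thermodynamics that this does not model.

```python
# Illustrative sketch only: sequence is written 5'->3' with A, C, G, U.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def is_simple_hairpin(seq, stem_len, min_loop=5, max_loop=10):
    """True if the first and last `stem_len` bases pair and the loop size fits."""
    loop_len = len(seq) - 2 * stem_len
    if stem_len <= 0 or not (min_loop <= loop_len <= max_loop):
        return False
    five_prime = seq[:stem_len]
    three_prime = seq[-stem_len:]
    # The strands in a stem are antiparallel, so pair the 5' arm
    # against the 3' arm read backwards.
    return all((a, b) in WATSON_CRICK
               for a, b in zip(five_prime, reversed(three_prime)))

print(is_simple_hairpin("GCGCAAAAAAGCGC", stem_len=4))  # True: 6-base loop
print(is_simple_hairpin("GCGCAAAGCGC", stem_len=4))     # False: 3-base loop
```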

The technique of Northern blotting is often used to detect specific RNA sequences in a sample.

There are many different types of RNA, and they carry out different functions in the cell.


Messenger RNA

Transcribed from DNA; serves as the template for the synthesis of protein. RNA polymerase uses DNA to make mRNA.

Transfer RNA

Brings activated amino acids from other parts of the cell to the site of translation, the ribosome. tRNA reads the information in the mRNA and translates it into amino acids. In other words, it translates information from the RNA to proteins.

Ribosomal RNA

RNA that takes part in translating messenger RNA into protein; a constituent of ribosomes. rRNA is the most abundant type of RNA and is responsible for the activity of the ribosome, including the formation of peptide bonds, which is carried out by the rRNA in the ribosome.

Small interfering RNA

Binds to messenger RNA and helps degrade it.

Micro RNA

Small non-coding RNA that inhibits translation of its complementary mRNA.

Small nuclear RNA

Responsible for processing pre-mRNA by removing the introns (splicing) from hnRNA, as well as maintaining telomeres.

Interference RNA

Inhibits gene expression by cutting up mRNA.

Structural insights into RNA interference.

The structures of these different types of RNA will vary depending on what they are supposed to do. The tertiary structure varies by function. Even in the simplest sense, some will be relatively long strands of nucleic acids, such as Messenger RNA up to 1.2 kilobases, while others are relatively short sequences of 21 nucleotides such as miRNA.

References
Viadiu, Hector. “Types of RNA.” UCSD. Lecture. November 2012.

Messenger ribonucleic acid (mRNA) is the blueprint for protein production. Transcribed from deoxyribonucleic acid (DNA), mRNA carries genetic information from the cell nucleus to the protein-producing ribosomes located in the cytoplasm. As in DNA, the genetic information is encoded in four nucleotides that are arranged in codons, or triplets of nucleotide bases. Each codon corresponds to a specific amino acid, and the sequence of codons ends with a codon that acts as a stop signal. The protein synthesis process also requires transfer RNA (tRNA) and ribosomal RNA (rRNA). mRNA makes up only about 5% of the RNA found in both prokaryotic and eukaryotic cells.
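The way codons are read can be sketched in a few lines of code. This example builds the standard genetic code as a lookup table and reads an mRNA three bases at a time until it reaches a stop codon. The function name and the toy sequences are hypothetical, and the sketch assumes the reading frame starts at the first base.

```python
# Illustrative sketch only: the standard genetic code, one-letter amino acids,
# with "*" marking stop codons. Codons are ordered with bases U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(mrna):
    """Read codons from the first base and stop at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

print(translate("AUGGCCUAA"))  # MA (Met-Ala, then the stop codon UAA)
```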



During transcription, a DNA strand is copied into RNA by an enzyme, RNA polymerase. RNA is synthesized in the 5′ to 3′ direction, as is also done in DNA replication. Of the two DNA strands, the template strand is the one from which the RNA is synthesized. RNA polymerase reads the template from its 3′ end toward its 5′ end and joins nucleotides to the growing RNA through phosphodiester bonds.

The obvious difference between DNA and mRNA in this stage is in the uracil (U) that is present in RNA instead of thymine (T) in DNA.
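A minimal sketch of this relationship follows. It assumes DNA strands written 5′ to 3′ and models transcription as nothing more than complementary pairing with U in place of T; the function names are hypothetical.

```python
# Illustrative sketch only: strands are written 5'->3'.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_5to3):
    """Build the RNA complementary to a DNA template strand.

    The template is read 3'->5' (hence `reversed`), so the RNA
    comes out written 5'->3'.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3))

def coding_to_mrna(coding_5to3):
    """The mRNA matches the coding strand except that U replaces T."""
    return coding_5to3.replace("T", "U")

print(transcribe("ATGC"))       # GCAU
print(coding_to_mrna("GCAT"))   # GCAU (GCAT is the partner strand of ATGC)
```

Both routes give the same RNA, which is why the mRNA is often described as a copy of the coding strand with U in place of T.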

The RNA first transcribed from the DNA is known as pre-messenger RNA (pre-mRNA), since this exact copy of the DNA region contains both introns and exons. Messenger RNA contains only exons. Introns are removed via splicing by spliceosomes, which recognize intronic sequences based on a GU beginning, a long pyrimidine chain, and an AG ending. Only exons remain in mRNA, mainly because they contain the genetic information used in translation to produce a protein. Introns, however, do not provide genetic information that is used in translation.

Caps and poly(A) tails are added as modifications to protect the ends of the mRNA after transcription and before translation.


In eukaryotes, the product of transcription of a protein-coding gene is pre-mRNA which requires processing to generate functional mRNA. Several processing reactions occur.

5′ processing: capping

Very soon after it has been synthesized by RNA polymerase II, the 5′ end of the primary RNA transcript, pre-mRNA, is modified by the addition of a 5′ cap (a process known as capping). This process involves the addition of 7-methylguanosine (m7G) to the 5′ end. To achieve this, the terminal 5′ phosphate is first removed by a phosphatase. Guanosyl transferase then catalyzes a reaction whereby the resulting diphosphate 5′ end attacks the α phosphorus atom of a GTP molecule to add a G residue in an unusual 5′-5′ triphosphate link. The G residue is then methylated by a methyl transferase that adds a methyl group to the N-7 position of the guanine ring, using S-adenosyl methionine as the methyl donor. The ribose of the adjacent nucleotide (nucleotide 2 in the RNA chain) or the riboses of both nucleotides 2 and 3 may also be methylated to give cap 1 or cap 2 structures, respectively. In these cases, the methyl groups are added to the 2′-OH groups of the ribose sugars.

The cap protects the 5′ end of the primary transcript against attack by ribonucleases that have specificity for 3’5′ phosphodiester bonds and so cannot hydrolyze the 5’5′ bond in the cap structure. In addition, the cap plays a role in the initiation step of protein synthesis in eukaryotes. Only RNA transcripts from eukaryotic protein-coding genes become capped; prokaryotic mRNA and eukaryotic rRNA and tRNAs are uncapped.


RNA splicing is a key step in RNA processing because it precisely removes the intron sequences and joins the ends of neighboring exons to produce a functional mRNA molecule. The exon-intron boundaries are marked by specific sequences. In most cases, at the 5′ boundary between the exon and the intron (the 5′ splice site), the intron starts with the sequence GU, and at the 3′ exon-intron boundary (the 3′ splice site) the intron ends with the sequence AG. Each of these two sequences lies within a longer consensus sequence. A polypyrimidine tract (a conserved stretch of about 11 pyrimidines) lies upstream of the AG at the 3′ splice site. A key signal sequence is the branchpoint sequence, which is located about 20-50 nt upstream of the 3′ splice site. In vertebrates this sequence is 5′-CURAY-3′, where R = purine and Y = pyrimidine (in yeast this sequence is 5′-UACUAAC-3′).
RNA splicing occurs in two steps. In the first step, the 2′-OH of the A residue at the branch site attacks the 3′-5′ phosphodiester bond at the 5′ splice site, causing that bond to break and the 5′ end of the intron to loop round and form an unusual 2′-5′ bond with the A residue in the branchpoint sequence. Because this A residue already has 3′-5′ bonds with its neighbors in the RNA chain, the intron becomes branched at this point to form what is known as a lariat intermediate (so named because it resembles a cowboy’s lasso). In the second step, the new 3′-OH end of exon 1 attacks the phosphodiester bond at the 3′ splice site, causing the two exons to join and releasing the intron, still as a lariat. In each of the two splicing reactions, one phosphate-ester bond is exchanged for another (i.e. these are two transesterification reactions). Since the number of phosphate-ester bonds is unchanged, no ATP is consumed.
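The sequence signals described above can be sketched as a simple pattern search. The example checks that an intron begins with GU and ends with AG, then looks for a vertebrate-style CURAY branchpoint whose A lies 20-50 nt upstream of the 3′ splice site. The function names, the way distance is counted (from the branch A to the end of the intron), and the toy intron are assumptions for illustration; real splice-site recognition relies on longer consensus sequences and the spliceosome.

```python
import re

# Illustrative sketch only. CURAY: R = A or G, Y = C or U.
# A lookahead is used so that overlapping candidate motifs are all examined.
BRANCHPOINT = re.compile(r"(?=CU[AG]A[CU])")

def find_branch_site(intron, min_dist=20, max_dist=50):
    """Return the index of the branch A, or None if the signals are absent."""
    if not (intron.startswith("GU") and intron.endswith("AG")):
        return None
    for match in BRANCHPOINT.finditer(intron):
        a_index = match.start() + 3          # the A is the 4th base of CURAY
        distance = len(intron) - a_index     # nt from branch A to intron end
        if min_dist <= distance <= max_dist:
            return a_index
    return None

def splice(exon1, intron, exon2):
    """Join the exons if the intron carries the expected signals."""
    if find_branch_site(intron) is None:
        raise ValueError("intron lacks GU...AG boundaries or a branchpoint")
    return exon1 + exon2

intron = "GU" + "A" * 10 + "CUGAC" + "U" * 25 + "AG"   # toy intron (assumption)
print(find_branch_site(intron))        # 15
print(splice("AAA", intron, "CCC"))    # AAACCC
```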

3′ processing: cleavage and polyadenylation

A majority of eukaryotic pre-mRNAs undergo polyadenylation, which involves cleavage of the RNA at its 3′ end and the addition of about 200 A residues to form a poly(A) tail. The cleavage and polyadenylation reactions require the existence of a polyadenylation signal sequence (5′-AAUAAA-3′) located near the 3′ end of the pre-mRNA, followed by a sequence 5′-YA-3′ (where Y = a pyrimidine), often 5′-CA-3′, in the next 11-20 nt. A GU-rich sequence (or U-rich sequence) is also usually present further downstream. After these sequence elements have been synthesized, two multisubunit proteins called CPSF (cleavage and polyadenylation specificity factor) and CStF (cleavage stimulation factor F) are transferred from the CTD of RNA polymerase II to the RNA molecule and bind to the sequence elements. A protein complex is formed which includes additional cleavage factors and an enzyme called poly(A) polymerase (PAP). This complex cleaves the RNA between the AAUAAA sequence and the GU-rich sequence. Poly(A) polymerase then adds about 200 A residues to the new 3′ end of the RNA molecule, using ATP as the precursor. Once made, the poly(A) tail protects the 3′ end of the final mRNA against ribonuclease digestion and hence stabilizes the mRNA. In addition, it increases the efficiency of translation of the mRNA. However, some mRNAs, notably histone mRNAs, lack a poly(A) tail. Nevertheless, histone pre-mRNA is still subject to 3′ processing. It is cleaved near the 3′ end by a protein complex that recognizes specific signals, one of which is a stem-loop structure, to generate the 3′ end of the mature mRNA molecule.
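The sequence logic of cleavage and polyadenylation can be sketched as follows. The example finds the AAUAAA signal, looks for a YA dinucleotide beginning 11-20 nt after the signal, cuts immediately after it, and appends 200 A residues. The function name, the exact way the 11-20 nt window is counted, and the choice to cut right after the YA are assumptions for illustration; the protein factors (CPSF, CStF, PAP) are not modeled.

```python
# Illustrative sketch only: pre-mRNA is written 5'->3' with A, C, G, U.
def cleave_and_polyadenylate(pre_mrna, tail_len=200):
    """Return the cleaved, poly(A)-tailed RNA, or None if no site is found."""
    signal_start = pre_mrna.find("AAUAAA")
    if signal_start == -1:
        return None
    signal_end = signal_start + 6
    # Assumption: the YA dinucleotide may begin 11 to 20 nt past the signal.
    for offset in range(11, 21):
        site = signal_end + offset
        dinucleotide = pre_mrna[site:site + 2]
        if len(dinucleotide) == 2 and dinucleotide[0] in "CU" and dinucleotide[1] == "A":
            return pre_mrna[:site + 2] + "A" * tail_len
    return None

# Toy pre-mRNA (assumption): signal, 12-nt spacer, CA site, GU-rich element.
pre_mrna = "GGG" + "AAUAAA" + "G" * 12 + "CA" + "GUGUGU"
mature = cleave_and_polyadenylate(pre_mrna)
print(len(mature))                      # 223
print(mature.endswith("CA" + "A" * 200))  # True
```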

The primary RNA transcript that continues to be synthesized includes both coding (exon) and noncoding (intron) regions. The latter need to be removed and the exons joined together by splicing.


In Eukaryotic cells, following synthesis, mRNA typically goes through a series of modifications before being exported to the cytoplasm for translation. These modifications include a 5’ guanine capping and a polyadenylation at the 3′ end. This strand of Adenine residues (anywhere from 80-250) is called the Poly-A tail and is needed for the export, protection, translation, and stability of the mRNA. Splicing, the process in which introns are removed and exons are joined, also occurs before exportation.

After all the proper modifications have been carried out, the mature mRNAs are ready to be exported through the nuclear pores into the cytoplasm. Nuclear pores are the channels between the nucleus and cytoplasm, and they form a selective barrier that allows macromolecule transport. Alternative splicing patterns allow the same gene to be expressed in slightly different ways in mRNA, creating different but similar proteins. For the mature mRNAs to be exported, a messenger ribonucleoprotein (mRNP) export complex must first form with RNA-binding proteins and transport factors (carriers), since the Mex67-Mtr2 heterodimer, the principal mRNA carrier, binds only loosely to bulk mRNA.

Nuclear Transport

Nuclear export is a pathway unique to eukaryotic cells, because the nuclear and cytoplasmic compartments of eukaryotic cells enable the spatial separation of two processes, transcription and translation. The separation between the two processes allows for multiple steps in between for further modification and regulation of gene expression, which is vital for physiological responses to extra- and intracellular signals.

mRNA nuclear export can be simplified to three stages:

  • 1) the pre-mRNA is transcribed in the nucleus, the site of mRNA synthesis, processing, and packing into mRNP (messenger ribonucleoprotein) complexes (as briefly described earlier)
  • 2) the mRNP molecules are targeted to and translocated through the nuclear pore complexes (NPC) of the nuclear envelope
  • 3) the mRNPs are released into the cytoplasm for translation to occur. Each of these stages involves numerous protein factors and other molecules that need to be recruited to carry out processes.

Formation of mRNP in yeasts:

  • 1) In the nucleus, transcription is mediated through RNA polymerase II. This is followed by modifications like the addition of the 5’ cap, splicing, and 3’ processing. The TREX complex is recruited during these processes and coordinates many of the next steps.
  • 2) The 3’ end processing is necessary because it generates the poly-A tail, which is crucial for the mRNA to be exported. This process requires the factors Rna14, Rna15, and Pcf11. Nab2 is added onto the poly-A tail, and the mRNA then recruits Yra1 and Sub2 during this time. When the mRNA is in contact with Pcf11, Yra1 is transferred to the TREX subunit Sub2. (Yra1-Pcf11 binding is an important early step.) Yra1 is necessary as the adaptor that links Mex67-Mtr2 to the mRNA.
  • 3) The Mex67-Mtr2 heterodimer is recruited.
  • 4) mRNPs can now be remodeled by the DEAD-box helicase Sub2.
  • 5) Yra1 dissociates from the mRNP before export, along with the TREX complex.
  • 6) The mRNP is drawn to the nuclear side of the NPC transport channel, where weak interactions arise with FG nucleoporins (proteins that line the nuclear pore). To increase the efficiency of export, several mechanisms exist to concentrate mRNAs at the nuclear side of the NPC. E.g., several actively transcribed genes like GAL1 are concentrated at the NPC.
  • 7) The mRNP goes through the NPC transport channel to the cytoplasmic side, where it is once again remodeled to prevent it from going back into the nucleus.

Cofactors Involved

The NPC itself has essential proteins that facilitate mRNA nuclear export. Within the NPC, there is a conical, basket-like feature that protrudes into the nucleus called the nuclear basket. It contains proteins like Nup60, Nup2, and Mlp2. The cytoplasmic side similarly has proteins that are cofactors to the export process (Nup159, Dbp5, Gle1). There are several other key proteins and components of mRNA export that will not be discussed here, but the references for this page provide much more insight into the specific functions of these export factors.

Here is a short summary of the principle export factors for yeast and metazoans:

  • Mex67-Mtr2 (yeast) and Nxf1-Nxt1 (metazoan): facilitate bulk mRNA export through NPCs
  • Yra1 (yeast) and ALY (metazoan): adaptor linking Mex67-Mtr2 to the mRNA molecule
  • Sub2 (yeast) and UAP56 (metazoan): DEAD-box helicase involved in assembly of export-competent mRNPs
  • Nab2 (yeast): binds the poly(A) tail of mRNA and Mlp1, and regulates the length of the 3′ poly(A) tail
  • Mlp1 (yeast) and TPR (metazoan): nuclear basket protein that binds to Nab2
  • TREX (both yeast and metazoan): complex involved in coordinating transcription and the export steps that follow
  • TREX-2 (both): directs actively expressed genes to NPCs
  • Gle1 and Gfd1 (yeast) and GLE1 (metazoan): enhance Dbp5 activity
  • Nup159 (yeast) and NUP214 (metazoan): cytoplasmic NPC protein that binds to Dbp5


Recruiting these factors is an essential step for the trafficking and quality control of export. Most molecules transported from the nucleus to the cytoplasm, including tRNA and a small subset of mRNAs, rely on karyopherin receptors, and the direction of their transport is set by the gradient of the GTP-bound state of the small GTPase Ran. Bulk mRNA export is uncharacteristic in this respect: it uses Mex67-Mtr2 (Nxf1-Nxt1 in metazoans), a non-karyopherin receptor. Mex67-Mtr2 is recruited to the mRNP through the TREX complex. Furthermore, recent work in vertebrates shows that binding of the Yra1 homologue ALY to mRNA is stimulated by the ATP-bound form of the Sub2 homologue UAP56, and that this binding increases the ATPase activity of UAP56. Moreover, Nxf1 binds mRNA-associated ALY to form a ternary complex, and the RNA-binding affinity of Nxf1 is increased in the presence of ALY. Taken together, these events result in an mRNP with bound export receptors. It is still unclear how many receptors must bind a single mRNA for efficient export to occur.

Bulk mRNA Export Pathway

Alongside the bulk Nxf1 pathway, a small set of transcripts is exported via the karyopherin Crm1, a protein that also mediates the export of incompletely spliced mRNA from HIV. An mRNA that has not been properly processed and had its introns spliced out is generally kept in the nucleus and degraded.
When the mRNP has been properly processed and has recruited all the necessary receptors and cofactors, it is considered export ready (export competent). The export-competent mRNP is then targeted to the NPC by its recruited export receptor. The export receptor carries the mRNP to the NPC, where it docks and interacts with NPC proteins to allow recognition. The interactions are summarized in the figure below.

Bulk Release into Cytoplasm

The directionality of bulk mRNA release is determined by a different mechanism, since it does not depend on the RanGTP gradient used in karyopherin-mediated export. Instead, it is determined by two important export factors, Dbp5 and Gle1. Dbp5 binds to the cytoplasmic face of the NPC by interacting with the NPC protein Nup214 (Nup159 in yeast). As the mRNP approaches the cytoplasmic side of the NPC, it interacts with Dbp5 and Gle1. These interactions cause a conformational change and trigger the removal of a set of proteins from the mRNP. This remodeling changes the mRNP physically and spatially, making it suitable for release from the NPC into the cytoplasm. The removed proteins are recycled and brought back into the nucleus, where they take part in another cycle of mRNA export. In addition, as the mRNP enters the cytoplasm, specific cytoplasmic mRNA-binding proteins are incorporated. These links to translation further show the inherent connections between steps in gene expression.

Translational Significance

Since mRNA export is essential for proper gene expression, this process must be carried out correctly. Errors in export can disrupt the steps that follow, and consequently translation. For example, errors in recruiting export factors can yield an incorrectly assembled mRNP, and a transcript that does not pass nuclear surveillance may be kept inside the nucleus and degraded by the exosome and various other enzymes. Errors in mRNA export are also linked to many human diseases and developmental defects. These include mutations in genes encoding export proteins or mRNA-binding proteins, as well as mutations in genes that inhibit the correct export of their own mRNA transcripts. Extreme cases also include the down-regulation or hijacking of endogenous mRNA export complexes by viruses, which enables specific viral genes to be expressed in the host. With growing knowledge of the mRNA export process, these malfunctions can be better understood and perhaps more easily prevented, and it may become possible to address many diseases and to gain a fuller understanding of how cellular function is generated at the simplest level: the molecular one.


In prokaryotes, because the mRNA does not need to be modified or transported, it can be translated by the ribosomes right after transcription.

A picture of the translation process.

In eukaryotes, however, mRNA can only be translated after it has been modified and transported to the cytoplasm (as mature mRNA). mRNA is translated into protein on ribosomes, which are either free in the cytoplasm or bound to the endoplasmic reticulum. Translation starts when the ribosome binds to a site near the 5′ end of the mRNA. The ribosome moves along the mRNA until it comes across the start codon AUG. There the ribosome is joined by an initiator tRNA that recognizes the start codon and carries methionine (Met) in eukaryotes or formylmethionine (fMet) in bacteria. Next, an aminoacyl-tRNA that can base pair with the next codon joins the ribosome complex. It is delivered by an elongation factor (EF-Tu in bacteria) together with a source of energy (usually GTP). The fMet (in bacteria) or Met group is covalently bonded to the incoming amino acid of the aminoacyl-tRNA. The initiator tRNA is then released and the ribosome shifts one codon toward the 3′ end. A new aminoacyl-tRNA arrives, and the growing chain is joined to its amino acid. This process continues until the ribosome reaches a stop codon (UAA, UAG, or UGA). The chain of linked amino acids is the protein encoded by the mRNA. The ribosomal complex then splits back into its separate parts, which re-assemble when a new mRNA needs to be translated into protein.

The elongation process “terminates” when a stop codon reaches the A site of the ribosome. Incoming tRNA, which carries the subsequent amino acid, will not be accepted by the ribosome at the A site. The A site will then be specific to a protein called the release factor. The release factor will hydrolyze the bond of the tRNA to the polypeptide in the P site, thus releasing the polypeptide chain. The two ribosomal subunits, release factor, and mRNA then come apart to signify the end of the termination process.
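The scan for a start codon, codon-by-codon elongation, and termination at a stop codon can be sketched in a few lines of Python. This is only an illustrative model: the function name `translate` and the toy sequence are assumptions made for this example, and the table is the standard genetic code written with one-letter amino acid symbols.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# '*' marks a stop codon (UAA, UAG, UGA).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the chain is released
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("GGAUGUUUGGCUAAGG"))  # MFG
```

Here the reading frame is set by the first AUG, so the codons read are AUG (Met), UUU (Phe), GGC (Gly), and then the stop codon UAA.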

  • Stop Codon – A stop codon is a sequence of three nitrogenous bases in the mRNA that signals the termination of polypeptide elongation, or translation. The amino acid sequence is then released from the mRNA template and folds into its final 3D conformation.
Stop Codon Sequence


In some instances the nucleotide composition of an mRNA is changed after transcription. This process is called editing. In humans, the apolipoprotein B (apoB) mRNA is one such case. This editing takes place in some tissues but not in others. The edit gives the mRNA an early stop codon, so translation produces a shorter protein.
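The effect of such an edit can be illustrated with a toy transcript. In the sketch below, a single C-to-U change turns a CAA codon into the stop codon UAA, which is the kind of change seen in apolipoprotein B editing. The sequence and the function names are hypothetical and chosen only for this example.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def codons_before_stop(mrna):
    """Count in-frame codons read before the first stop codon."""
    count = 0
    for i in range(0, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count

def edit_c_to_u(mrna, position):
    """Return a copy of mrna with the C at `position` (0-based) changed to U."""
    if mrna[position] != "C":
        raise ValueError("edited base must be a C")
    return mrna[:position] + "U" + mrna[position + 1:]

unedited = "AUGGCUCAAGGUUAA"        # hypothetical toy transcript
edited = edit_c_to_u(unedited, 6)   # CAA -> UAA, an early stop codon
print(codons_before_stop(unedited), codons_before_stop(edited))  # 4 2
```

The edited transcript is read for two codons instead of four, which mirrors the shorter protein described above.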

Altering the mRNA sequence through base modification (mRNA editing) frequently generates protein diversity. Several proteins have been identified as similar to C-to-U mRNA editing enzymes, based on their structural domains and the presence of a catalytic domain characteristic of cytidine deaminases. These proteins might represent novel mRNA editing systems that could affect proteome diversity, which makes their structure, expression, and relevance to biomedically significant processes or pathologies of interest.


After a certain amount of time, the message carried by an mRNA is degraded. Because mRNA has a limited lifetime, the cell can quickly change its protein production in response to changing needs. Different mRNAs have different lifetimes, and the life span of mRNA molecules in the cytoplasm is an important factor in determining the pattern of protein synthesis within a cell. Prokaryotic mRNA molecules are often degraded by enzymes within a few minutes of their synthesis, which is one reason why prokaryotes can vary their patterns of protein synthesis so quickly in response to changes in their environment. Eukaryotic mRNA, on the other hand, typically survives for hours, days, or in some instances weeks. One example in multicellular organisms is the mRNAs for the hemoglobin polypeptides in developing red blood cells: these mRNAs are unusually stable and are translated repeatedly. Research on yeasts suggests that a common pathway for mRNA degradation begins with the enzymatic shortening of the poly(A) tail, which helps trigger the action of enzymes that remove the 5’ cap. Removal of the 5’ cap is a critical step and is itself regulated by particular nucleotide sequences in the mRNA. Once the cap is removed, nuclease enzymes rapidly chew up the mRNA. This pathway therefore relies on deadenylation: shortening of the poly(A) tail is initiated by a deadenylase, and afterward the mRNA is either fully degraded or, in certain cells, stored.
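To get a feel for how much mRNA lifetime matters, the sketch below compares a short-lived and a long-lived message. It assumes simple first-order (exponential) decay, which is a modeling assumption and not something stated in the text, and the half-lives are hypothetical values chosen only for illustration.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of an mRNA pool left after `elapsed` time, assuming first-order decay."""
    return 0.5 ** (elapsed / half_life)

# Hypothetical half-lives in minutes, chosen only for illustration.
print(fraction_remaining(10, half_life=5))    # 0.25
print(fraction_remaining(10, half_life=600))  # close to 1: almost nothing has decayed
```

Under this assumption, a message with a half-life of a few minutes is largely gone after ten minutes, while a message with a half-life of hours is essentially untouched over the same period.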

Another mechanism that blocks expression of specific mRNA molecules, involving microRNAs (miRNAs), has also become of interest. miRNAs are formed from longer RNA precursors that fold back on themselves, forming a long, double-stranded hairpin structure held together by hydrogen bonds. An enzyme called Dicer cuts the double-stranded RNA into short fragments. One of the two strands is degraded, and the other strand, the miRNA, associates with a large protein complex. This small single-stranded RNA allows the complex to bind to any mRNA molecule with a complementary sequence, and the complex then either degrades the target mRNA or blocks its translation.

Scientists also observed that gene expression could be inhibited by RNA molecules. They noticed that injecting double-stranded RNA molecules into a cell somehow turned off a gene with the same sequence, and called this phenomenon RNA interference (RNAi). It was later discovered that this interference is due to small interfering RNAs (siRNAs), which are similar in size and function to miRNAs. Research showed that the cellular machinery that makes siRNAs is the same machinery that makes miRNAs, and the mechanisms by which these small RNAs function are also the same. Because the cellular RNAi pathway can lead to the destruction of RNAs with sequences complementary to the small RNAs, it is believed that the pathway originally acted as a natural defense against infection by RNA viruses.
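Targeting by complementary sequence can be sketched as a search for sites in an mRNA that pair with a small RNA. The example below is a simplification: it requires perfect Watson-Crick pairing along the whole small RNA, whereas real small RNAs can act through partial pairing, and the sequences and function names are hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence that pairs with `rna` in antiparallel orientation."""
    return "".join(PAIR[base] for base in reversed(rna))

def find_target_sites(mrna, small_rna):
    """Return 0-based start positions in mrna that pair perfectly with small_rna."""
    site = reverse_complement(small_rna)
    return [i for i in range(len(mrna) - len(site) + 1) if mrna.startswith(site, i)]

print(find_target_sites("AAGCUAAAGCUA", "UAGC"))  # [2, 8]
```

Because the two strands pair in antiparallel orientation, the site searched for in the mRNA is the reverse complement of the small RNA (here GCUA for UAGC).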


  • David Hames, Nigel Hooper. Biochemistry. Third edition. Taylor and Francis Group. New York,2005.
  • Neil A. Campbell, Jan B Reece. Biology Seventh Edition, 2005 Pearson Education, Inc.

Nuclear export of mRNA. Murray Stewart. MRC Laboratory of Molecular Biology, Hills Rd., Cambridge CB2 0QH, UK

NMD Introduction

NMD is shorthand for “nonsense-mediated mRNA decay”. NMD is an mRNA surveillance mechanism that detects nonsense mutations and prevents the expression of truncated or incorrect proteins.
If translated, such mRNAs can produce truncated proteins that carry dominant-negative activities. After transcription, the mRNA first assembles with ribonucleoprotein components and then undergoes pre-mRNA processing. Because average intron sizes in eukaryotic cells are relatively large, the probability of aberrant splicing is high, which increases the production of nonsense (stop) codons such as UAA, UAG, and UGA. NMD is triggered by exon junction complexes (EJCs), which are deposited during pre-mRNA processing. EJCs can promote ribosome recruitment and are then displaced by the ribosome during translation. EJCs downstream of a nonsense codon are not displaced, since the ribosome is released from the transcript before reaching them. These remaining EJCs signal the recruitment of UPF1 after the mRNA has been transported from the nucleus into the cytosol, where RNA is degraded. Overall, NMD is considered both a process for degrading aberrant mRNAs that would encode truncated proteins and a way to regulate the expression of normal transcripts.

NMD Factors

The essential proteins of the core NMD machinery are UPF1, UPF2, and UPF3. In addition, SMG-1, SMG-5, SMG-6, and SMG-7 mediate the cycle of UPF1 phosphorylation and dephosphorylation.

UPF and SMG proteins

The UPF and SMG proteins are the core machinery of the NMD mechanism.

Here are some of the proteins and their functions:

  1. UPF1: mainly cytoplasmic, some nuclear. Functions: ATPase; helicase; phosphoprotein substrate of SMG-1; recruited by eRFs to stop codons; undergoes a cycle of phosphorylation and dephosphorylation; promotes translation; histone mRNA decay.
  2. UPF2: cytoplasmic, but has nuclear localization signals. Functions: EJC adapter protein that binds both UPF1 and UPF3; binds RNA in vitro; promotes translation.
  3. UPF3: mainly nuclear, some cytoplasmic. Functions: EJC protein with short and long isoforms that distribute differentially into distinct cytoplasmic protein complexes with UPF1; promotes translation.
  4. SMG-1: cytoplasmic. Functions: member of the phosphoinositide 3-kinase-related kinase family; phosphorylates UPF1.
  5. SMG-5: mainly cytoplasmic, some nuclear. Functions: interacts with PP2A and promotes UPF1 dephosphorylation.
  6. SMG-6: mainly cytoplasmic, some nuclear. Functions: interacts with PP2A and promotes UPF1 dephosphorylation.
  7. SMG-7: mainly cytoplasmic, some nuclear. Functions: interacts with PP2A and promotes UPF1 dephosphorylation; when overexpressed, it recruits UPF1 to P-bodies.

Exon-Junction Complex (EJC)

The exon junction complex (EJC) contains four core proteins: eIF4AIII, MAGOH, MLN51, and Y14. These four proteins were established as the core of the complex by techniques such as cross-linking, coimmunoprecipitation, mutational analysis, and RNase H footprinting. The EJC core serves as a platform to which the other EJC components attach; those components associate more transiently with the mRNA during its journey from the nucleus to the cytoplasm.

NMD during a pioneer round of translation

NMD acts during a pioneer round of translation so that large amounts of truncated proteins with deleterious activities are not produced. To achieve this, mRNAs are scanned for premature termination codons (PTCs) and degraded during the early rounds of translation. This further suggests that mRNAs are scanned either in the nucleus or just after entering the cytoplasm, while they are still associated with the nucleus.

During the pioneer round of translation, NMD occurs when UPF1 interacts with UPF2, which is bound to UPF3.
The EJC core, with UPF2 and UPF3 attached, is recruited to exon-exon junctions during mRNA splicing. In an aberrant transcript, at least one EJC is deposited downstream of the premature stop codon. It can then interact with UPF1, which is recruited by the cap-binding complex (CBC) and the eukaryotic release factors eRF1 and eRF3. The interaction between UPF1 and UPF2 is strengthened by CBC during the pioneer round of translation. In the final step, the mRNA decays. In contrast to a PTC-containing mRNA, a normal transcript escapes NMD because its EJCs lie upstream of the stop codon and are displaced by the ribosome before UPF1 is recruited. After this step, normal transcripts exchange the proteins at both their 5’ and 3’ ends and proceed to bulk translation.
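The positional logic described here, where a stop codon counts as premature when an EJC remains downstream of it, can be written as a one-line rule. The sketch below is a deliberately simplified model: positions are hypothetical nucleotide coordinates counted from the 5′ end, and the function name is an assumption for this example.

```python
def triggers_nmd(stop_position, ejc_positions):
    """True if at least one EJC lies downstream of the stop codon."""
    return any(ejc > stop_position for ejc in ejc_positions)

ejcs = [120, 300, 450]          # hypothetical EJC positions on one transcript
print(triggers_nmd(200, ejcs))  # True: the EJCs at 300 and 450 are never displaced
print(triggers_nmd(500, ejcs))  # False: the ribosome displaces every EJC before stopping
```

A transcript with no EJCs at all returns False under this rule, which is why the sketch covers only the EJC-dependent route described above.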

Molecular Interactions that Define Nonsense Codons

A second signal, such as an EJC, is required to define a stop codon as premature and thus trigger NMD. In the first step, the mRNA undergoes the pioneer round of translation, which leads to recognition of the stop codon by the eukaryotic release factors eRF1 and eRF3. UPF1 then recruits the protein kinase SMG-1, and together with the eRFs they form a transient complex called SURF. In the next step, the SURF complex interacts with an EJC, which is required for SMG-1 to phosphorylate UPF1. This interaction triggers degradation of the mRNA and eventually releases the release factors and the 40S and 60S ribosomal subunits.

UPF1 Dephosphorylation, P-Body Recruitment, and mRNA Decay

After UPF1 is phosphorylated by SMG-1 in the previous step, it recruits the factors that dephosphorylate it.
The late molecular events in NMD degrade the mRNA. Phosphorylated UPF1 first recruits the SMG-5/SMG-7 heterodimer and the phosphatase PP2A. Recruitment to P-bodies then triggers PP2A to dephosphorylate UPF1. Dephosphorylated UPF1 releases PP2A and SMG-5/SMG-7 from the mRNP, and the mRNA is decapped, after which the mRNA body is rapidly degraded by 5’ to 3’ exonucleases.

Recent Study

Recent research suggests that mammalian NMD is not a single linear pathway. The findings so far come from studying both EJC-dependent and EJC-independent NMD.


  1. NMD
  2. Yao-Fu Chang, J. Saadi Imam, and Miles F.Wilkinson The Nonsense-Mediated Decay RNA Surveillance Pathway Link:


Transfer RNA (tRNA) has a primary, secondary, and tertiary (L-shaped) structure. tRNA binds activated amino acids and carries them to the ribosome. At the ribosome, an initiator tRNA binds to the start codon to begin translation. Subsequent tRNAs deliver their amino acids by pairing with the messenger RNA (mRNA), so that the amino acids are joined into proteins.

The structure of tRNA contains an amino acid attachment site and a template-recognition site. The template-recognition site is called an anticodon and contains a sequence of three bases that are complementary to the codon on the mRNA. In eukaryotic cells, tRNA is transcribed from DNA in the nucleus and travels to the cytoplasm. Each tRNA can be used repeatedly.

There are 61 codons that code for the 20 amino acids. However, most prokaryotic cells have only 30-40 different tRNAs, and eukaryotes have about 50 different tRNAs. This is possible because the third nucleotide of the codon, also called the wobble base, allows wobble pairing of the anticodon to the codon, so one tRNA can read more than one codon.
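The figure of 61 follows from simple counting: four bases in three positions give 64 codons, and three of those are stop codons. The short sketch below enumerates them and also groups the sense codons by their first two bases, since codons in one group differ only at the third (wobble) position. The variable names are chosen for this example only.

```python
from itertools import product

STOP_CODONS = {"UAA", "UAG", "UGA"}
all_codons = ["".join(c) for c in product("UCAG", repeat=3)]
sense_codons = [c for c in all_codons if c not in STOP_CODONS]

# Codons sharing the first two bases differ only at the third (wobble) position.
families = {}
for codon in sense_codons:
    families.setdefault(codon[:2], []).append(codon)

print(len(all_codons), len(sense_codons), len(families))  # 64 61 16
```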

An example of the crystal structure of Yeast Phenylalanine of tRNA.

Role in Protein Synthesis

In protein synthesis, a tRNA molecule carries a specific activated amino acid to the site of synthesis. The amino acid is esterified to the 3′- or 2′-hydroxyl group of the terminal adenylate of the tRNA. This joining of tRNA and amino acid forms an aminoacyl-tRNA and is catalyzed by a specific enzyme called an aminoacyl-tRNA synthetase (aaRS). There are 20 aminoacyl-tRNA synthetases, one for each amino acid. Each aaRS recognizes all of the tRNAs that accept its amino acid. The esterification reaction, also called charging of the tRNA, is powered by ATP.

Protein synthesis starts when a charged tRNA (a tRNA with an amino acid attached), the mRNA, and the small and large ribosomal subunits come together to form the initiation complex, which contains a peptidyl binding site (P site) and an aminoacyl binding site (A site). The first tRNA, known as the initiator tRNA, binds to the mRNA start codon, AUG; thus, the first amino acid in the chain is methionine. To add further amino acids to the polypeptide chain, a second charged tRNA must come in and pair its anticodon with the next mRNA codon in the vacant A site. The P site and A site are in close proximity, which allows a stable peptide bond to form between the carboxyl group of the amino acid in the P site and the amino group of the amino acid on the tRNA in the A site. The reaction is catalyzed by peptidyl transferase. The complex then moves along the mRNA in a process called translocation, which displaces the tRNA in the P site. The tRNA in the A site moves into the P site so that another charged tRNA can move into the A site. This process continues until a stop codon is reached and the polypeptide chain is released from the ribosome.

1. amino acid + ATP –> aminoacyl-AMP + PPi
2. aminoacyl-AMP + tRNA –> aminoacyl-tRNA + AMP


tRNA Structure


2. Secondary Structure

The secondary structure resembles a cloverleaf because of its four base-paired stems, also called arms. The cloverleaf contains three non-base-paired loops: the D loop, the anticodon loop, and the TΨC loop. The terminal CCA is not base paired. The acceptor stem is a duplex formed between the 5’ segment and the 3’ segment.

The acceptor stem, which is not a loop, is the site where the enzyme aminoacyl-tRNA synthetase attaches an amino acid. It is located opposite the anticodon arm, which reads the mRNA.

Each loop closes an arm: the D arm ends in the D loop, and the anticodon arm ends in the anticodon loop.
In the figure, hydrogen bonds are shown within the cloverleaf structure. These hydrogen bonds stabilize the structure.

3. Tertiary Structure

The tertiary structure can be described as a compact, three-dimensional L shape. It is held together and stabilized by base pairing and base stacking, including base pairs between nucleotides in the D loop and the TΨC loop. At one end of the L is the three-base sequence called the anticodon.


The anticodon region of a transfer RNA is a sequence of three bases that are complementary to a codon in the messenger RNA. During translation, pairing between the anticodon and the mRNA codon positions the tRNA on the ribosome. The amino acid is attached at the 3′ end of the tRNA, and it will be joined to the growing chain by a peptide bond. Prokaryotic cells contain about 35 tRNAs with different anticodons, and eukaryotic cells about 50.
A tRNA with the anticodon CCC is complementary to the codon GGG, and the anticodon AAA is complementary to the codon UUU.
Since each type of tRNA has a different anticodon, the anticodon identifies which codon a given tRNA reads.
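Codon and anticodon pair in antiparallel orientation, which only becomes visible with a non-symmetric example. The sketch below computes the strict Watson-Crick anticodon for a codon, with both written 5′ to 3′. It ignores wobble pairing and modified bases, and the function name is an assumption for this example.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Watson-Crick anticodon for an mRNA codon, both written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(codon))

for codon in ("GGG", "UUU", "AUG"):
    print(codon, anticodon_for(codon))
# GGG CCC
# UUU AAA
# AUG CAU
```

For GGG and UUU the orientation makes no difference, but the start codon AUG pairs with the anticodon CAU, not UAC, when both are read 5′ to 3′.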

tRNA Aminoacylation

An aminoacyl-tRNA is an amino acid ester of tRNA, also called a charged tRNA. Forming a peptide bond between free amino acids is thermodynamically unfavorable, so the amino acids are first activated as aminoacyl-tRNAs to drive chain formation.
In the aminoacylation of tRNA molecules, an amino acid is esterified to the 3′ end of a tRNA containing the corresponding anticodon. These pairings of amino acids and tRNAs define the genetic code. The aminoacyl-tRNA synthetases (AARSs) catalyze the aminoacylation of tRNAs. This process plays an important role in transferring genetic information from the nucleotide sequence of a gene to the amino acid sequence of a protein. When errors occur, aminoacyl-tRNA synthetases correct them through editing mechanisms, which prevent the synthesis or release of incorrectly aminoacylated tRNA.


An AARS is an enzyme that catalyzes the esterification of a specific amino acid to a tRNA to form an aminoacyl-tRNA. AARSs play a major role in translation during protein synthesis. Recent research has shown that AARSs also have ex-translational roles, that is, roles outside translation.


The accuracy of protein translation depends on how precisely each AARS recognizes both the amino acid to be activated and the cognate tRNA molecules. This is a crucial step in the fidelity of translation. All AARSs carry out the same two-step reaction:

Step 1: The AARS binds ATP and the amino acid and forms an aminoacyl-adenylate intermediate, in which a covalent linkage is made between the 5′-phosphate of ATP and the carboxyl end of the amino acid.[8][9] The energy from ATP hydrolysis thus activates the amino acid, and the resulting aminoacyl-AMP serves as an energy store.[10]
Step 2: The amino acid is transferred to the appropriate tRNA and covalently bound to either the 2’-OH or the 3’-OH of the 3′-terminal adenosine of the tRNA. The energy stored in aminoacyl-AMP is used to transfer the amino acid to the tRNA to form aminoacyl-tRNA.


Modified versions of AARSs and natural fragments of them take part in ex-translational functions, as confirmed in recent studies. The interplay of AARSs appears to be at the center of homeostatic mechanisms that control angiogenesis, inflammation, metabolism, and tumorigenesis.[11] In recent experiments, one ex-translational function of AARSs was found to be the interplay between natural extracellular fragments of human TrpRS and TyrRS in angiogenesis. TyrRS was found to have a nuclear localization signal that is controlled by its cognate tRNA (tRNA-Tyr), so that a decrease in the level of tRNA-Tyr increases nuclear import of the AARS, which affects many gene regulatory mechanisms. In contrast, an increase in the level of tRNA-Tyr decreases nuclear import of TyrRS. Thus, the subcellular distribution of TyrRS is directly controlled by the demands of protein synthesis, and this control is an example of a homeostatic mechanism that balances a translational function with an ex-translational one.[12]

However, recent research also shows that the ex-translational function of AARSs is itself regulated, which contrasts with the discussion above. This regulation is considered an auto-balancing process in which a natural fragment controls the activity of its own parent protein. Further research is needed to confirm the specific role of AARSs in ex-translational functions.

Binding to Ribosome

tRNA’s function is to bring amino acids to the ribosome during translation.

tRNA binds at the A, P, and E sites of the ribosome. The A site binds the aminoacyl-tRNA specified by the codon positioned at that site; the codon thereby determines the next amino acid to be added to the peptide chain. The A site accepts a new aminoacyl-tRNA only when the P site is occupied. The P site holds the peptidyl-tRNA, the tRNA that carries the amino acid chain synthesized so far. Lastly, the E site holds the empty tRNA before it leaves the ribosome.
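How the three sites are occupied over successive cycles can be traced with a toy model. In the sketch below each tRNA is labeled by the codon it reads, and every snapshot is an (E, P, A) tuple. The function name and the codon list are hypothetical, and the model leaves out factors, GTP, and termination.

```python
def elongation_states(codons):
    """Return (E, P, A) site snapshots for a toy elongation cycle."""
    e_site, p_site, a_site = None, codons[0], None  # initiator tRNA starts in the P site
    states = [(e_site, p_site, a_site)]
    for codon in codons[1:]:
        a_site = codon                               # charged tRNA enters the A site
        states.append((e_site, p_site, a_site))
        # Translocation: P-site tRNA moves to E, A-site tRNA moves to P.
        e_site, p_site, a_site = p_site, a_site, None
        states.append((e_site, p_site, a_site))
        e_site = None                                # the empty tRNA leaves the E site
    return states

for state in elongation_states(["AUG", "UUU", "GGC"]):
    print(state)
```

Each pass through the loop corresponds to one amino acid added to the chain: a tRNA arrives in the A site, the sites shift by one position, and the empty tRNA exits.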

Three dimensional image of a tRNA.



PURPLE:Acceptor stem

RED: D arm

BLUE: Anticodon arm

BLACK: Anticodon

GREEN: T arm

Diseases Caused by Mitochondrial tRNA Gene Mutation

Human mitochondria, organelles in the cell, contain 22 tRNAs encoded by their own genome. Mutations in these tRNA genes can cause serious diseases. Seven groups of diseases linked to mitochondrial tRNA gene mutations are listed here:

Basal ganglia calcification, cerebellar atrophy, increased lactate; a CT image of a person diagnosed with MELAS

1- The np5601 G->A and np3243 A->G gene mutations are related to MELAS (mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes). Most patients develop this disease before the age of 40, with epilepsy and lactic acidosis. Some of them die between the ages of 20 and 30.

2- The np8363 G->A, np8356 T->C, and np8344 A->G gene mutations are related to MERRF (myoclonic epilepsy with ragged red fibers). MERRF affects the central nervous system, causing epilepsy, dementia, and epicophosis.

Example of “ragged red fibers” in MELAS syndrome.

3- The np4274 T->C gene mutation is related to LIMM (lethal infantile mitochondrial myopathy). Most patients are newborns with nerve defects and lactic acidosis, and die within one month.

4- The np1644 G->T gene mutation is related to subacute necrotizing encephalomyelopathy (SNE). This disease shows familial autosomal recessive inheritance and occurs in newborn babies.

5- The np606 A->G gene mutation is related to rhabdomyolysis. Toxins released by muscle cells are the main cause of rhabdomyolysis.

6- The np4500 G->A gene mutation is related to splenic lymphoma, a common malignant tumor of the spleen. Splenic lymphoma is normally caused by the spread of an advanced-stage lymphoma.

7- The np4336 A->G, np15927, and np15928 gene mutations are related to Parkinson’s disease and Alzheimer’s disease. Parkinson’s disease is a degenerative disorder of the central nervous system.


1. Inheritance of Mitochondrial Disease:

2. Diseases of Human Mitochondria tRNA:






Ribosomal RNA, also known as rRNA, is a major component of the ribosome. rRNA helps build polypeptides: it provides a mechanism for decoding mRNA into amino acids and interacts with tRNA during translation. rRNA was once thought to be only a structural component of ribosomes, but it is actually a catalytic element in protein synthesis. It is the most abundant type of RNA in the cell (about 80%).

The ribosome is made of a large and a small subunit, each containing rRNA. The prokaryotic ribosome is 70 svedbergs (70S) in size. A svedberg is a unit of the sedimentation coefficient, that is, of how fast a molecule sediments when centrifuged. The 70S ribosome contains a large 50S subunit, which includes the 23S and 5S rRNAs, and a small 30S subunit, which contains the 16S rRNA. The 23S, 16S, and 5S rRNAs are essential for protein synthesis and for the structure and function of the ribosome. These RNAs are formed by cleaving the primary 30S transcript and processing it further, with the molecules folding to form internal base-paired structures. Chemical probing experiments have provided a detailed model of the secondary structure of the 16S rRNA, and the secondary structure was also obtained by analyzing and comparing sequences. The 16S rRNA folds together with ribosomal proteins to form the 30S subunit, and conformational changes of the 16S rRNA play a crucial role in ribosome assembly. The 5S rRNA is an important part of the large subunit of the ribosomes of most organisms. rRNA is the most abundant of the three major types of RNA, with a relative amount of about 80% in E. coli, for example, followed by tRNA (15%) and finally mRNA (5%). Ribosomal RNA has a mass of 1.2 x 10^3 kd and 3700 nucleotides in E. coli.

With the help of X-ray crystallography, scientists have been able to reveal detailed features of these secondary structures.

The Polymerase Chain Reaction (PCR) has been of great importance in the amplification of rRNA genes. PCR is used to amplify rRNA genes in many organisms; however, amplification of rRNA genes via traditional PCR methods has been found not to work in extremely thermophilic organisms.

The ribosome contains tRNA-binding sites formed largely by rRNA, including an A site and a P site. At the A site, the rRNA binds an aminoacyl-tRNA, a tRNA bound to an amino acid. The growing peptide chain, carried by the peptidyl-tRNA in the P site, is transferred onto this amino acid. After the transfer, the now empty tRNA leaves the P site and is ejected through the E site. The mRNA then shifts 3 bases (1 codon) so that the next aminoacyl-tRNA can bind to the A site.

In prokaryotes, rRNAs are formed by cleavage and other modifications of nascent RNA chains. The precursors of transfer and ribosomal RNA are thus cleaved and chemically modified after transcription (DNA –> RNA).

Base Pairing

rRNA plays an important part in base pairing between the codon and the anticodon. “Adenine 1493, one of three universally conserved bases in 16S rRNA, forms hydrogen bonds with the bases in both the codon and the anticodon only if the codon and anticodon are correctly paired.”


M. Ogle and V. Ramakrishnan. Annu. Rev. Biochem. 74 (2005):129-177.
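As a rough illustration of what "correctly paired" means, the sketch below checks whether an anticodon is the antiparallel Watson-Crick complement of a codon. The function name is hypothetical, both sequences are assumed to be written 5′ to 3′, and wobble pairing at the third codon position is deliberately ignored.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def pairs_correctly(codon: str, anticodon: str) -> bool:
    """Return True if the anticodon is the reverse complement of the codon.

    Both sequences are written 5'->3'. Only strict Watson-Crick pairs are
    accepted; wobble pairs are not modelled in this simplified sketch.
    """
    if len(codon) != 3 or len(anticodon) != 3:
        return False
    # The strands pair antiparallel, so read the codon backwards.
    expected = "".join(WATSON_CRICK[base] for base in reversed(codon.upper()))
    return anticodon.upper() == expected


if __name__ == "__main__":
    print(pairs_correctly("AUG", "CAU"))  # anticodon written 5'->3'
    print(pairs_correctly("AUG", "GGG"))
```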

RNA interference in cultured cells.

Small RNA is a classification of RNA that includes small interfering RNA (siRNA), micro RNA (miRNA), and piwi-interacting RNA (piRNA). These small RNAs play important roles in biological and disease processes.

siRNA & micro RNA

Small interfering RNA (siRNA) is a class of RNA molecules around 20-25 nucleotides in length. They are mostly involved in the RNA interference (RNAi) pathway, where they interfere with the expression of a specific gene.

siRNA is a type of double-stranded RNA that was found to direct cleavage of target mRNA at specific sites. siRNAs were subsequently designed to silence target transcripts through transfection into mammalian cells. This allowed for the development of RNAi-based applications such as a new class of therapeutics.

Micro RNA (miRNA) is a class of RNA molecules found in eukaryotic cells. They are generally 20-25 nucleotides in length and are involved in translational repression and gene silencing. They are similar to siRNA and were found to negatively regulate expression of target transcripts.

The stem-loop of a pre-microRNA.

These two types of RNA were established as guides that govern the silencing of target transcripts. This raised the question of how these small RNAs are produced, and it was found that immunoprecipitates from Drosophila S2 cells processed dsRNA (double-stranded RNA) into siRNA in vitro. miRNA was found to be derived from a conserved stem-loop precursor, which suggests that a dicing step could be required for miRNA biogenesis. The stem-loop forms part of a miRNA precursor several hundred nucleotides long, which is then processed into miRNA. The existence of this precursor was shown in Drosophila pupae.

In analyzing small RNA pathways in Drosophila, studies of isolated dicer-1 and dicer-2 mutants showed that Dicer-1 and Dicer-2 are responsible for the biogenesis of miRNA and siRNA, respectively. Dicer-1 processes pre-miRNA independently of ATP, while Dicer-2 processes dsRNA in an ATP-dependent manner. In mammalian cells, however, a single Dicer generates both miRNA and siRNA.

siRNAs effect silencing by programming RNAi effectors (such as RISC) to target mRNA. RISC is a magnesium-dependent endoribonuclease that is directed by miRNA and siRNA to cleave target mRNA.

The effector mechanism of miRNA is controversial. The disparity arises because there is no well-defined biochemical readout for miRNA-induced RISC activity comparable to the clear one for siRNA.

Making RISC

RISC: the effector complex for small RNAs

It is known that small RNAs aid in the regulation of gene expression. However, small RNAs cannot function individually to catalyze reactions. Instead, they come together and form RNA-induced silencing complexes (RISCs) in order to help with silencing genes and locating RISC targets. In this sense, the assembly of RISC is crucial for the small RNAs to do their job. [13]

Argonaute: the core component of RISC

The Argonaute (Ago) family of proteins is a main component of RISC that is essential to RISC’s function of target recognition and silencing. The Ago family can be divided into the Ago subfamily and the Piwi subfamily. These proteins, each with their own characteristics, are in charge of the functions of the small RNAs that they are paired with. siRNAs and micro RNAs bind to Ago proteins, while piRNAs bind to Piwi proteins. In mammals, the four proteins from the Ago subfamily (AGO1, AGO2, AGO3, AGO4) hinder translation of their target mRNAs, with AGO2 having the unique ability within its subfamily to induce RNA interference. In flies, AGO2 also triggers RNA interference guided by siRNA, while AGO1 works with miRNA. Unlike in mammals, both AGO1 and AGO2 in flies can cleave their targets and cause RNA interference. [13]

Two steps in RISC assembly: RISC loading and unwinding

There are two steps involved in RISC assembly. The first step is called RISC-loading, and this is when small RNA duplexes are incorporated into Ago proteins. Prior to this step, the double-stranded siRNAs and miRNAs are converted by RNase III enzymes (Drosha and Dicer) into small RNA duplexes: siRNA duplexes and miRNA-miRNA* duplexes. In the second step, the double-stranded small RNA duplexes are separated into two strands inside the Ago protein. Of the two strands, the strand with a less stable 5’ end is kept, serving as the ‘guide strand’. The other strand, called the ‘passenger strand’, is thrown out in order to produce a functional RISC. This strand selection in which one strand is preferred over the other is referred to as the ‘asymmetric rule’. [13]
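The strand selection step can be sketched as follows. This is a crude illustration, not a thermodynamic model: stability of each duplex end is approximated by counting hydrogen bonds (2 for A-U, 3 for G-C) over the first few base pairs, the window of 4 pairs is an arbitrary assumption, the duplex is treated as fully complementary with no 3′ overhangs, and all function names and the example sequence are hypothetical.

```python
H_BONDS = {"A": 2, "U": 2, "G": 3, "C": 3}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def end_stability(strand: str, window: int = 4) -> int:
    """Approximate stability of the duplex end at this strand's 5' end."""
    return sum(H_BONDS[base] for base in strand.upper()[:window])


def select_guide(strand: str, window: int = 4) -> tuple[str, str]:
    """Return (guide, passenger) for a perfectly paired small RNA duplex.

    The strand whose 5' end sits in the less stable end of the duplex is
    kept as the guide. On a tie, the input strand is returned as the guide.
    """
    strand = strand.upper()
    other = reverse_complement(strand)
    if end_stability(strand, window) <= end_stability(other, window):
        return strand, other
    return other, strand


if __name__ == "__main__":
    guide, passenger = select_guide("UUAAGCAGCUGACGGCGCGCC")
    print("guide:    ", guide)
    print("passenger:", passenger)
```

In this example the input strand has an A/U-rich 5′ end, while its partner has a G/C-rich 5′ end, so the input strand is kept as the guide.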

Genome Encoded Small RNA

The Human Genome Project observed a relatively small number of protein-coding genes relative to genome size. It is believed that only five percent of the genome encodes proteins. miRNA, siRNA, and piRNA are part of the noncoding genome.

miRNAs are believed to exist in hundreds of species and are identified through forward genetics by isolation of miRNA mutants, bioinformatic predictions based on the stem-loop, and direct cloning of small RNA. It is unclear exactly how pre-miRNA is converted, and studies indicate that pri-miRNA and pre-miRNA processing occur separately in the nucleus and cytoplasm. A double-stranded RNA region is a feature of pri-miRNA and aids its processing into pre-miRNA. Dicer and Drosha are among the factors required for small RNA maturation. They are believed to function with dsRNA-binding proteins, which aid in miRNA production.

Endo-siRNAs play important roles in regulating genome functions in diverse species. They cleave target mRNA so that RNA-dependent RNA polymerases can use the cleaved mRNA as a template to prime synthesis of secondary siRNAs. These are then loaded onto non-slicing Argonaute proteins to contribute to target silencing. This corresponds to the spreading of RNAi along the mRNA and has been linked to silencing in worms.

piRNAs are small RNAs that also aid in interference but focus on repetitive sequences. piRNAs in mammals map uniquely in the genome and cluster in a small number of loci of around 10 to 83 kb. Findings on the amplification of piRNA led to a ping-pong model, in which processing alternates between Ago3 and Aubergine to create new piRNA through each successive round. Different Piwi proteins carry out piRNA functions both cooperatively and independently of one another. piRNAs play an important role in germ line development and the maintenance of genomic integrity. They are also involved in silencing, although how is still unknown; studies suggest that they regulate DNA methylation.


  1. Fuxreiter, Monika, Istvan Simon, and Sarah Bondos. “Dynamic Protein–DNA Recognition: Beyond What Can Be Seen.” Trends in Biochemical Sciences 36.8 (2011): 415
  2. Fuxreiter, Monika, Istvan Simon, and Sarah Bondos. “Dynamic Protein–DNA Recognition: Beyond What Can Be Seen.” Trends in Biochemical Sciences 36.8 (2011): 415
  3. Fuxreiter, Monika, Istvan Simon, and Sarah Bondos. “Dynamic Protein–DNA Recognition: Beyond What Can Be Seen.” Trends in Biochemical Sciences 36.8 (2011): 415
  4. Fuxreiter, Monika, Istvan Simon, and Sarah Bondos. “Dynamic Protein–DNA Recognition: Beyond What Can Be Seen.” Trends in Biochemical Sciences 36.8 (2011): 415
  5. Fuxreiter, Monika, Istvan Simon, and Sarah Bondos. “Dynamic Protein–DNA Recognition: Beyond What Can Be Seen.” Trends in Biochemical Sciences 36.8 (2011): 415-416
  8. Desogus, Gianluigi; Flavia Todone; Peter Brick; and Silvia Onesti. “Active Site of Lysyl-tRNA Synthetase: Structural Studies of the Adenylation Reaction.” Biochemistry, 2000 vol 39, 8418-8425.
  9. Klug, William, and Michael Cummings. Concepts of Genetics. 5th Edition. Upper Saddle River, NJ: Prentice Hall, 1997.
  10. Hartwell, Leland; Leroy Hood; Michael Goldberg; Ann Reynolds; Lee Silver; and Ruth Veres. Genetics: From Genes to Genomes. Boston: McGraw-Hill, 1999.
  11. Guo, Min, and Paul Schimmel. “Homeostatic mechanisms by alternative forms of tRNA synthetases.” Trends in Biochemical Sciences. Web. 7 Dec. 2012.
  12. G. Fu et al. “tRNA-controlled nuclear import of a human tRNA synthetase.” J. Biol. Chem., 287 (2012), pp. 9330–9334
  13. Kawamata, Tomoko and Tomari, Yukihide. “Making RISC”, Trends in Biochemical Sciences, July 2010: 368-375. Retrieved on 21 November 2012.
  14. Liu, Qinghua; Paroo, Zain. “Biochemical Principles of Small RNA Pathways.” Annu. Rev. Biochem. 79 (2010): 295-319.

Image from Wikipedia Commons


MicroRNAs (miRNAs) are short, single-stranded RNAs about 21 nucleotides in length. Their function is to regulate gene expression. Like other types of RNA, miRNAs are transcribed from DNA; however, they are not translated into protein. miRNAs are non-coding RNAs that bind to complementary mRNA and inhibit its translation. miRNAs and siRNAs both function to interfere with gene expression. However, miRNAs are single-stranded, whereas siRNAs are double-stranded.

miRNAs have been found to play a crucial role in regulation of the DNA damage response. The transmission of genetic information in eukaryotic cells requires accuracy in DNA replication and chromosome segregation, as well as the ability to sense and repair spontaneous and induced DNA damage. In order to maintain genomic integrity, cells mount a DNA damage response, a complex network of signaling pathways. This network is composed of coordinated sensors, transducers and effectors in cell cycle arrest, apoptosis and DNA repair.[1]

miRNAs have recently been linked to various diseases. Recent research has shown a connection between dysregulation of miRNAs and certain diseases, which calls for further research into the robust regulation of miRNA activity.[2]

miRNAs were once considered to be very stable molecules, because miRNA expression is known to be strictly controlled by mechanisms acting at the level of transcription and of the processing of miRNA precursors. Recently, however, scientists have identified another mechanism that is important for miRNA homeostasis: the active degradation of mature miRNAs. Degradation of miRNA plays a role in dynamic miRNA expression patterns. Research has shown that miRNA degradation can affect specific sets of miRNAs, although how this specificity comes about remains unknown.[3]

Formation & Function

The main function of miRNAs is to regulate the translation of mRNA. In the nucleus, miRNAs are first transcribed as primary miRNAs (pri-miRNAs) with a cap and a poly-A tail. The pri-miRNAs are then processed into precursor miRNAs (pre-miRNAs) by an enzyme called Drosha. A pre-miRNA is a stem-loop structure about 70 nucleotides long. The pre-miRNAs are then exported into the cytoplasm and cut into mature miRNAs by an enzyme called Dicer. These mature miRNAs are incorporated into the RNA-induced silencing complex (RISC) and activate it. The activated RISC then allows the miRNA to bind the targeted mRNA and silence gene expression. In animal cells, miRNAs more commonly base pair only partially with the mRNA and inhibit protein translation. The binding of a miRNA to complementary mRNA can lead to degradation of the mRNA and therefore terminate protein translation, or the miRNA can inhibit the reading of the 5′-cap and prevent translation. In plant cells, miRNAs are more likely to pair perfectly with the target mRNA and promote cleavage. In summary, microRNAs are formed from the hairpins of long single-stranded RNAs that fold in on themselves. The double-stranded hairpins are cut by Dicer, which results in a single-stranded microRNA. The microRNA then forms a microRNA-protein complex that can degrade a targeted mRNA or block its translation. In a few instances, miRNAs have shown signs of promoting translation, especially under starvation conditions. The reasons for such activity are not known.
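The difference between perfect pairing and partial pairing can be illustrated with a small Python sketch. It rests on assumptions that are not stated in the text above: partial pairing is modelled as a match to the miRNA "seed" (taken here as nucleotides 2-8 from the 5′ end), only strict Watson-Crick pairs are counted, and the function names and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with rna, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def classify_site(mirna: str, mrna: str) -> str:
    """Classify miRNA-mRNA pairing as 'perfect', 'seed', or 'none'."""
    mirna = mirna.upper()
    mrna = mrna.upper()
    # Perfect pairing: the whole miRNA is complementary to a stretch of mRNA.
    if reverse_complement(mirna) in mrna:
        return "perfect"
    # Partial pairing (assumption): only the seed, nucleotides 2-8, matches.
    seed = mirna[1:8]
    if reverse_complement(seed) in mrna:
        return "seed"
    return "none"


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative 22-nucleotide sequence
    full_site = "AAAA" + reverse_complement(mirna) + "AAAA"
    seed_site = "CCCC" + reverse_complement(mirna[1:8]) + "CCCC"
    for target in (full_site, seed_site, "AAAAAAAAAAAA"):
        print(classify_site(mirna, target))
```

A "perfect" result corresponds to the plant-like case that promotes cleavage, while a "seed" result corresponds to the partial pairing that is more common in animal cells.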

Canonical miRNA Function

Emerging studies have demonstrated that miRNAs play a critical role in interacting with the canonical DNA damage response. The DNA damage response is an active system that includes the initiation of transcriptional programs, enhancement of DNA repair, and apoptosis if damage is severe. Double-strand breaks in DNA are mended by the homologous recombination and non-homologous end joining repair pathways. Other forms of DNA damage are repaired by base excision repair (BER), nucleotide excision repair (NER), and DNA mismatch repair (MMR).

miRNAs play a significant part in gene regulation and other cellular functions. Many important genes in the DNA damage response are regulated by corresponding miRNAs, which act on the response by way of their target genes. During DNA repair, chromatin remodeling takes place to give DNA repair proteins access to damaged DNA. With miRNA targets such as ATM, H2AX, and RAD52, DNA damage-responsive genes are under inhibition by miRNAs. It has been shown that higher expression of a certain miRNA, such as miR-421, will reduce ATM expression and downregulate H2AX in particular cellular situations.

DNA-damaging agents used in various treatments have been shown to induce miRNA expression. DNA damage correlates with the activation of miRNAs, underlining that miRNAs regulate the DNA damage response according to the magnitude of the damage.

Noncanonical miRNA Function

A recent study has shown that a miRNA can control the biogenesis of another miRNA by directly targeting its primary transcript in the nucleus. A particular example is miR-709, which negatively regulates miR-15a/16-1. miR-709 is found in the mouse nucleus, where it binds specifically to a 19-nucleotide recognition element on the primary transcript of miR-15a/16-1. This blocks the processing of the miR-15a/16-1 primary transcript into the precursor. It therefore regulates maturation at a post-transcriptional level, that is, after the primary transcript is made but before the precursor is formed. Because miR-709 regulates miR-15a/16-1, which in turn regulates cell apoptosis, miR-709 indirectly regulates cell apoptosis. This demonstrates that a miRNA can affect gene expression within a cell by regulating the biogenesis of other miRNAs.

miRNAs can also regulate long ncRNAs, non-protein-coding transcripts that are generally longer than 200 nucleotides. The first experimental evidence that long ncRNAs are functional miRNA targets was provided by Hansen. In the experiment, the antisense transcript of cerebellar degeneration-related protein 1 (CDR1), which is a circular ncRNA, was shown to be nearly perfectly complementary to miR-671, which is found in the nucleus. miR-671 directs the cleavage of the CDR1 antisense transcript in an AGO2-dependent manner. This negative regulation of the circular antisense ncRNA also decreases the CDR1 sense transcript. The study shows that antisense RNA can be destabilized by a miRNA through AGO2-mediated cleavage, and that the sense mRNA can be stabilized by the circular noncoding antisense RNA.

miRISCs and Their Components

miRNAs combine with Argonaute (AGO) proteins and GW182 proteins to form miRNA-induced silencing complexes, or miRISCs. AGO attaches to the N-terminus of the GW182 protein, while the miRNA binds to AGO. The GW182 protein seems to be the more important of the two, as it contains the main silencing region. This was discovered when miRNA-induced repression was still effective even after the knockdown of AGO in Drosophila cells.

miRNA Inhibition of Translation

miRNAs possess several methods of inhibiting translation. Suppression can occur both before and after the initiation of translation, although the former seems to be preferred.

Before Initiation

  1. miRISC can suppress translation by interfering with recognition of the 5′ cap structure by eIF4F. The miRNA prevents the cap from being read, so initiation never starts. By contrast, mRNAs that could be translated without the cap recognition step were not suppressed by the miRNA.
  2. miRNAs are also able to prevent the formation of a functional ribosome. On a normal mRNA, the 40S and 60S ribosomal subunits come together to form the 80S complex, which translates the mRNA. miRNAs inhibit the 60S subunit from joining the 40S subunit, so translation is never able to start.

Post Initiation

  1. miRNA blocks elongation of the polypeptide being translated.
  2. The ribosome is forced to drop off from the mRNA. The 40S and 60S ribosomal subunits separate before translation is complete.
  3. The miRNA induces proteolysis of the newly translated polypeptide chain. The chain is broken up by enzymes.

The mechanisms behind these three modes of post-initiation inhibition are not yet fully understood.

miRNA stability

In contrast to the earlier suggestion that miRNAs are highly stable, recent research has shown that individual miRNAs in certain environments are subject to accelerated decay, which alters miRNA levels and thereby affects their activity.[4]

During miRNA biogenesis, miRNAs are transcribed by RNA polymerase II as primary transcripts (pri-miRNAs) and are then matured in a multi-step process to produce the mature, functional miRNA. In one case, the pri-miRNAs are capped and polyadenylated and are quite long (several kilobases). Pri-miRNAs possess hairpin structures that include the mature miRNA sequence in their stem. In another case, the precursor miRNAs (pre-miRNAs) are located in the introns of mRNAs or other non-coding RNAs. In either case, the nuclear RNase type III enzyme Drosha, in a complex with its co-factor DiGeorge syndrome critical region 8 homolog (DGCR8), cleaves near the base of the stem, which releases a pre-miRNA of about 70 nucleotides.[5]

Deadenylation and Decay

In deadenylation, the miRNA binds to AGO, GW182, and also poly(A)-binding protein (PABP). The PABP attaches to the GW182 protein, forming a slightly different miRISC. This miRISC promotes removal of the poly(A) tail of the mRNA, which is followed by removal of the 5′-cap and decay of the mRNA. Deadenylation is effective because it rids the cell of excess mRNA, eliminating the chance of accidental translation. The decayed fragments are collected by the P bodies, and reused by the miRNA.

The degradation of miRNAs is carried out by several miRNA-degrading enzymes. Many such enzymes have been identified, including both 3′ to 5′ and 5′ to 3′ exoribonucleases. Recent research has shown that certain RNases take part in the turnover of different sets of miRNAs in different organisms. However, the substrate specificity and phylogenetic conservation of individual miRNA turnover enzymes still need to be investigated.[6]

microRNA-206 and Synapse Repair

In a mouse model of ALS, production of microRNA-206 is induced when mice develop the disease, and deficiency of microRNA-206 accelerates its progression.
-MicroRNA-206 is required for regeneration of damaged neuromuscular synapses (the junctions that carry signals between nerve and muscle cells). When a synapse is damaged, microRNA-206 turns on repair. Without microRNA-206, synapse repair is impaired, although some synapses can still grow back.
-MicroRNA-206 does this through histone deacetylase and fibroblast growth factor (FGF) signaling pathways. Growth factors are specific signals from other cells that tell the cell to grow.
-MicroRNA-206 blocks translation of its targets. This in turn affects histone deacetylation, which condenses chromatin and therefore blocks transcription.
-MicroRNA-206 slows the progression of ALS by repairing neuromuscular synapses.

Many microRNA genes are found in intergenic regions, where they have their own miRNA gene promoter and regulatory units. Approximately forty percent of miRNA genes lie in the introns of protein-coding and non-protein-coding genes, or even in exons. These miRNAs are usually found in an orientation in which they are regulated together with their host gene. Between forty-two and fifty percent of miRNA genes share a common promoter and originate from polycistronic units. A polycistronic unit contains 3-6 discrete loops from which the mature miRNAs are processed, although the miRNAs of such a family are not necessarily homologous in structure and function. The promoters share some motifs with the promoters of other genes transcribed by RNA polymerase II, such as protein-coding genes. In addition, the DNA template does not have the final say in mature miRNA production, because about five percent of human miRNAs show RNA editing, the site-specific modification of RNA sequences to yield products different from those encoded by their DNA. This increases the diversity and scope of miRNA action beyond what is implied by the genome alone.

miRNAs and Disease


Recent studies have shown that miRNAs are involved in causing diseases. In the case of cancer, researchers found that miRNAs can inhibit the E2F1 protein, which regulates cell proliferation. The miRNAs bind to the E2F1 mRNA before it can be translated into protein.
One microRNA, miR-21, was labeled as the first oncomir. It is known to aid in tumor growth and metastasis by targeting naturally occurring tumor suppressors in the human body. Tropomyosin 1 (TPM1) is a direct target of miR-21, along with programmed cell death 4 (PDCD4) and maspin, all of which are inversely correlated with the expression of miR-21 in the presence of tumors. This shows that miR-21 has the ability to target multiple genes and inhibit multiple metabolic pathways at the same time.

Kidney Fibrosis

Renal fibrosis is the excessive accumulation of fibrous (connective) tissue, occurring as a reparative process after scarring or trauma to the kidney. This type of nephropathy directly promotes renal dysfunction, which ultimately leads to kidney failure and death. Studies have shown that a certain microRNA, miR-21, is significantly elevated in expression during the progression of kidney scarring. Experiments were conducted in mice to validate this specific sequence and its effect.

The abrogation of miR-21 in mice produced no overt abnormalities and no obvious suppression or prevention of tumor growth; however, these mice developed far less interstitial scar tissue in response to kidney injury. Analysis detected groups of genes, and the metabolic pathways they belong to, that were inhibited by miR-21. One of these involves peroxisome proliferator-activated receptor-α (Pparα), a regulator of lipid metabolism, and was studied using synthesized anti–miR-21 oligonucleotides to inhibit miR-21. Pparα was found to ease the effects of ureteral-obstruction-induced kidney fibrosis. miR-21 also regulates a redox metabolic pathway that involves a protein called Mpv17l. The repression of Mpv17l in cells enhances kidney damage by increasing the production of oxygen radicals.

These studies demonstrate that miR-21 has a broad spectrum of influences on the microscopic scale and can be a suitable target for antifibrotic and cancer therapies.

Heart Disease

Other studies have addressed miRNA function in the heart by inhibiting miRNA maturation in the murine heart, which showed that miRNAs play an essential role during its development. The expression levels of specific miRNAs change in diseased human hearts, pointing to their involvement in cardiomyopathies. In addition, several specific miRNAs have been identified in animal models, mostly mice, with roles during heart development and under pathological conditions. These roles include regulating key factors important for cardiogenesis, the hypertrophic growth response, and cardiac conductance.


Fabian, Marc R., Nahum Sonenberg, and Witold Filipowicz. “Regulation of MRNA Translation and Stability by MicroRNAs.” Annual Review of Biochemistry (2010): 351-79.
Neil A. Campbell, Jane B. Reece “Biology 8th edition”

External links

  • Chau, and Tran. “MicroRNA-21 Promotes Fibrosis of the Kidney by Silencing Metabolic Pathways.” US National Library of Medicine. 15 Feb. 2012. Web. 20 Oct. 2012.
  • Liu, Youhua. “Renal Fibrosis: New Insights into the Pathogenesis and Therapeutics.” Nature Publishing Group, 19 Sept. 2005. Web. 20 Oct. 2012.
  • Chen X, Liang H, Zhang CY, Zen K. “miRNA regulates noncoding RNA: a noncanonical function model.” Trends Biochem Sci. 2 Sept. 2012. Web. 25 Oct. 2012.
  • Guohui Wan, Rohit Mathur, Xiaoxiao Hu, Xinna Zhang, Xiongbin Lu. “miRNA response to DNA damage.” Trends in Biochemical Sciences, Volume 36, Issue 9, September 2011.
  1. Wan, Guohui, Rohit Mathur, Xiaoxiao Hu, Xinna Zhang, and Xiongbin Lu. “miRNA response to DNA damage.” Trends in Biochemical Sciences. Web. 7 Dec. 2012.
  2. Chang, T.C. and Mendell, J.T. (2007) microRNAs in vertebrate physiology and human disease. Annu. Rev. Genomics Hum. Genet. 8, 215–239
  3. Großhans, Rüegger. “MicroRNA turnover: when, how, and why.” Trends Biochem Sci. 2012. Web. 6 Dec. 2012.
  4. Großhans, Rüegger. “MicroRNA turnover: when, how, and why.” Trends Biochem Sci. 2012. Web. 6 Dec. 2012.
  5. Krol, J. et al. (2010) The widespread regulation of microRNA biogenesis, function and decay. Nat. Rev. Genet. 11, 597–610
  6. Großhans, Rüegger. “MicroRNA turnover: when, how, and why.” Trends Biochem Sci. 2012. Web. 6 Dec. 2012.

1. Small nuclear RNA (snRNA)

Small nuclear RNAs (snRNAs) are small RNA molecules found in the nucleus of eukaryotic cells. They are usually 300 nucleotides or smaller, and they are not the only RNAs in the nucleus. The function of snRNA was discovered a few years before ribozymes were. snRNAs are transcribed by RNA polymerase II or RNA polymerase III. They are important because they take part in pre-mRNA splicing and processing, which is the removal of introns from hnRNA, and are involved in the maintenance of telomeres, the ends of chromosomes. Five snRNAs make up the spliceosome, which is responsible for removing introns from nuclear pre-mRNA in eukaryotes. The spliceosome interacts with the ends of an RNA intron. It cuts at specific points to release the intron, then immediately joins the two exons that were adjacent to the intron. The snRNAs are also responsible for mediating catalysis and aligning splice sites. Thomas Cech and Sidney Altman discovered that RNA molecules can serve as catalysts, which changed views of molecular evolution. snRNAs are always found with specific proteins, with which they form complexes called small nuclear ribonucleoproteins (snRNPs), or snurps. Their secondary structures are highly conserved in organisms ranging from yeast to human beings. A large group of snRNAs are called small nucleolar RNAs (snoRNAs). snoRNAs are responsible for cleaving long eukaryotic pre-rRNA. They are important in RNA biogenesis and guide chemical modifications of ribosomal RNA (rRNA) and other RNAs (tRNA and snRNA). Many snoRNAs are produced from processed introns. The host gene for a snoRNA often encodes a ribosomal protein or translation factor. [43] [44]

2. Small Nucleolar RNA molecule (snoRNA)

snoRNAs, or small nucleolar RNAs, modify ribosomal RNAs (rRNAs) by mediating the cleavage of long pre-rRNA strands into their functional subunits (18S, 5.8S and 28S molecules). snoRNAs can also add the finishing modifications to rRNA subunits. [45]

3. Micro RNA (miRNA)

Micro RNA (miRNA) is a gene-regulatory small RNA that is typically 21-23 nucleotides long. It is similar to small interfering RNA (siRNA) in that it binds to complementary mRNA molecules and inhibits their translation. However, unlike siRNA, which is a double-stranded RNA, miRNA is single-stranded and is only partially complementary to its target mRNA molecules. This class of RNA is non-coding.

Micro RNA has a great range of functions. It is involved in cellular growth, development and insulin secretion, among other things.

However, excess miRNA has been implicated in diseases such as fragile X mental retardation, as well as some forms of cancer.

4. Small interfering RNA (siRNA)

siRNAs, which are known by several different names (small interfering RNA, short interfering RNA, and silencing RNA), were discovered by David Baulcombe’s research group in Norwich, England. They are double-stranded RNA molecules roughly 20-25 nucleotides in length, with 2-nucleotide overhangs on the 3′ ends. They are largely responsible for the RNA interference (RNAi) pathway, which interferes with the expression of a gene. Other RNAi-related pathways, such as antiviral mechanisms and the shaping of the chromatin structure of a genome, are also mediated by siRNA. The discovery that siRNA can be produced synthetically allowed for the induction of RNAi in mammalian cells. This in turn allows for biomedical research in drug development, such as treatments for Human Immunodeficiency Virus (HIV).
However, using RNAi through siRNA in living animals is difficult, because siRNA responds differently in different types of cells and its effectiveness varies from very good to poor. It is not yet understood why the effectiveness of siRNA in living animals varies so widely. Artificial siRNA can be produced enzymatically using Dicer, the enzyme that cuts up double-stranded RNA (dsRNA). By transfecting artificial siRNAs that target specific transcripts, researchers can probe gene function. Although this is a useful tool, the high cost of production puts this method of probing gene function out of reach for many laboratories and researchers. siRNAs can be generated in vitro by chemical synthesis, in vitro transcription, or RNase III/Dicer digestion of long dsRNAs, or in vivo from plasmids, PCR cassettes, or viral vectors carrying a CMV or polymerase III transcription unit. siRNAs are used for loss-of-function studies and are very sequence specific.

5. RNA Interference (RNAi)

RNAi was discovered by Craig Mello and Andrew Fire in the 1990s, in the course of experiments with antisense RNA. It is the process in which double-stranded RNA triggers the degradation of homologous mRNA.


RNA interference occurs when double-stranded RNA is cut by an enzyme called Dicer. Dicer chops the double-stranded RNA into short fragments 20-25 base pairs long. These fragments then form a complex with RISC and a homologous strand of RNA, which is then catalytically cleaved by RISC.
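The two steps just described can be sketched as a toy model in Python. Everything here is a simplification for illustration: `dice` and `risc_cleave` are hypothetical names, only one strand of the dsRNA is tracked, the fragment length is fixed at 21 (any value from 20 to 25 is accepted), and the cut is placed in the middle of the paired site as an assumption rather than at the biochemically precise position.

```python
from typing import Optional

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with rna, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def dice(strand: str, fragment_length: int = 21) -> list[str]:
    """Cut one strand of a long dsRNA into consecutive short fragments."""
    if not 20 <= fragment_length <= 25:
        raise ValueError("fragment_length should be between 20 and 25")
    strand = strand.upper()
    last_start = len(strand) - fragment_length
    # Leftover bases shorter than one fragment are discarded.
    return [strand[i:i + fragment_length]
            for i in range(0, last_start + 1, fragment_length)]


def risc_cleave(guide: str, mrna: str) -> Optional[tuple[str, str]]:
    """Cleave mrna if it contains a site complementary to the guide.

    Returns the two cleavage products, or None if no site is found.
    """
    site = reverse_complement(guide)
    start = mrna.upper().find(site)
    if start == -1:
        return None
    cut = start + len(site) // 2  # illustrative cut position (assumption)
    return mrna[:cut], mrna[cut:]


if __name__ == "__main__":
    guide = dice("UUAAGCAGCUGACGGCGCGCC" + "AUAUAUAUAU")[0]
    mrna = "AAAA" + reverse_complement(guide) + "CCCC"
    print(risc_cleave(guide, mrna))
```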



RNAi is used by cells to degrade mRNA as a defense mechanism against viral genetic material that may have infected the cell, and to shut down the effects of specific genes after transcription without having to regulate expression at the level of the cell’s DNA. It can also be used as a gene silencing technique. siRNA is introduced into a cell with transfection reagents, which increase the amount of RNA and DNA that can be taken up by cultured cells. RNAi is used in the biomedical field to silence disease-causing genes. The siRNA can either be injected into specific cells or delivered using modified viruses. One proposed use of RNAi is a contraceptive that stops sperm from fertilizing eggs by silencing the gene that encodes the protein allowing sperm to bind to the egg. RNAi is also being used to knock out genes in salamanders to discover which genes are responsible for their regenerative capabilities, with the aim of treating diseases previously thought to be incurable, such as Huntington’s, Parkinson’s, and Alzheimer’s, by triggering the regeneration of the neurons whose death is responsible for such disorders.

6. Interference RNA (iRNA)

Interference RNA, or iRNA, is used for gene regulation. It is an antisense RNA (complementary to another RNA, usually mRNA). It is currently being researched for its anti-cancer properties. It is related to siRNA, since siRNA is involved in the RNA interference pathway. RNA interference (RNAi) is a phenomenon of gene silencing at the mRNA level that offers a quick and easy way to determine the function of a gene both in vivo and in vitro. [46]

7. RNA is a component of telomerase

Telomerase is a ribonucleoprotein (a ribonucleic acid-protein complex). It is an enzyme that maintains the telomeres (ends) of chromosomes during DNA replication, using its RNA component as the template for the repeats it adds. It has been found to be useful in developing therapeutic, pharmaceutical, and diagnostic reagents.


8. Non-coding RNA (ncRNA)

Non-coding RNA is any RNA molecule that is not translated into a protein. It comes in many different forms, such as ribosomal RNA (rRNA), transfer RNA (tRNA), and small RNAs [microRNA and small interfering RNA (siRNA)]. Non-coding RNA can be small or very large. Small non-coding RNA molecules are also known as sRNA, whereas large or long non-coding RNAs are known as lncRNA. The DNA sequence from which a non-coding RNA is transcribed is often referred to as an RNA gene.

There is growing interest in small, barely detectable non-coding RNA molecules, because some of them have been found to play an important role in the regulation of gene expression. The genes that encode them are known as RNA genes. In the early 1990s, American geneticist Victor Ambros and his colleagues first identified these molecules in the worm Caenorhabditis elegans. They were found to be responsible for turning off gene expression during worm development. This novel function was later discovered in other species as well. A decade later, another American geneticist, Stephen R. Holbrook of Lawrence Berkeley National Laboratory in California, discovered several other previously undetected potential RNA genes using a complex computer program called RNAGENiE. Much research is currently being conducted on these tiny non-coding RNA molecules. In recent years, biotech and pharmaceutical companies have been looking into RNA genes as drug targets, prompted by interest in RNA genes expressed during bacterial infections and their pathogenic effects through the regulation of host gene expression.

9. Antisense RNA

Antisense RNA is an RNA strand that is complementary to a messenger RNA (mRNA) strand transcribed within the cell. The antisense RNA is a single-stranded RNA molecule. It is introduced into a cell in order to inhibit the translation of the mRNA. It does this by base pairing with the complementary mRNA strand, which prevents the mRNA from being translated.

Antisense RNA was once thought to be broadly useful as a therapeutic technique; however, only a small number of drugs have been developed through the use of antisense RNA. Designing antisense RNAs that are effective for disease therapy has proven difficult.

10. tmRNA

An RNase can remove the 3’ end of an mRNA, leaving the mRNA without a stop codon for the ribosome to sense and stop translation. Once the ribosome has translated to the end of such an mRNA, it becomes stuck, with a peptidyl-tRNA in its P site. The cell resolves this with tmRNA, which releases ribosomes that are stuck on an mRNA. tmRNA has characteristics of both a tRNA and an mRNA.

In E. coli, the tmRNA is SsrA. SsrA is folded so that part of it looks like a tRNA, with an alanine attached at one end, and it also contains a tag sequence. SsrA enters the A site of the stuck ribosome, and the alanine on SsrA forms a peptide bond with the polypeptide that is stuck on the ribosome. The tag sequence on the tmRNA is then translated like an mRNA and added to the amino acids of the stuck polypeptide. The string of about 12 added amino acids is called a proteolysis tag. At the end of the tag sequence, a stop codon signals the ribosome to stop translation and release both itself and the SsrA-tagged peptide. SspB, a helper protein, can then recognize the proteolysis tag on the polypeptide chain and bring it to the protease ClpXP to be destroyed. [1]
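The rescue sequence described above (stall on an mRNA with no stop codon, transfer of the tmRNA alanine, then translation of the tag) can be sketched with a small translation routine. This is a minimal sketch under stated assumptions: `TAG_ORF` is an illustrative tag-encoding sequence supplied for the example rather than taken from the text, and `translate` and `rescue` are hypothetical helper names.

```python
# Toy model of tmRNA-mediated ribosome rescue (trans-translation).
# The standard genetic code is built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed tag reading frame ending in a stop codon (illustrative, not from the text).
TAG_ORF = "GCAAACGACGAAAACUACGCUUUAGCAGCUUAA"

def translate(rna):
    """Translate codon by codon; return (peptide, reached_stop)."""
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[i:i + 3]]
        if residue == "*":
            return "".join(peptide), True
        peptide.append(residue)
    return "".join(peptide), False

def rescue(mrna):
    """Return the released peptide, adding the tmRNA alanine and tag if the ribosome stalls."""
    peptide, reached_stop = translate(mrna)
    if reached_stop:
        return peptide               # normal termination, no rescue needed
    tag, _ = translate(TAG_ORF)      # the ribosome switches to the tmRNA reading frame
    return peptide + "A" + tag       # alanine carried by the tmRNA, then the encoded tag

print(rescue("AUGGCUAAAGGU"))   # truncated mRNA with no stop codon
print(rescue("AUGGCUUAA"))      # intact mRNA with a stop codon
```

In this model the tagged product is simply a longer string; in the cell, the tag is what SspB and ClpXP recognize to degrade the incomplete protein.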

11. Catalytic RNA

Catalytic RNAs carry out enzymatic reactions. They are usually found in complexes with proteins, where the catalytic activity resides in the RNA portion rather than the protein. [2]


RNA polymerase is an enzyme that produces RNA, catalyzing the initiation and elongation of RNA chains from a DNA template. This process is known as transcription, and RNA polymerase is its key component. The reaction that this enzyme catalyzes is:
(RNA)n + ribonucleoside triphosphate ⇌ (RNA)n+1 + PPi
RNA polymerases are relatively large. The size of RNA polymerase in a typical eukaryotic cell is roughly 500 kDa. In bacteria it is roughly 400 kDa, and in T7 bacteriophage it is roughly 100 kDa. Their speed of transcription is about 50 bases per second. A typical mRNA that codes for an average protein takes about 20 seconds to transcribe in a prokaryotic cell and about 3 minutes in a eukaryotic cell. It takes longer in eukaryotes mainly because eukaryotic genes contain introns, which are transcribed along with the coding segments.
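As a rough check on these figures, the transcript length implied by a given rate and time can be computed directly. The sketch below is illustrative only; the function name and the assumption of a constant rate with no pausing are not from the text.

```python
def nucleotides_transcribed(rate_nt_per_s, seconds):
    """Length of RNA made at a constant rate, ignoring pauses (a simplifying assumption)."""
    return rate_nt_per_s * seconds

RATE = 50  # bases per second, the figure quoted in the text

prokaryotic_length = nucleotides_transcribed(RATE, 20)       # about 20 seconds
eukaryotic_length = nucleotides_transcribed(RATE, 3 * 60)    # about 3 minutes

print(prokaryotic_length, eukaryotic_length)
```

Under these assumptions the prokaryotic transcript is on the order of 1,000 nucleotides and the eukaryotic primary transcript on the order of 9,000, with the difference reflecting the additional intron sequence that must be transcribed.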

Requirements to Function

For RNA polymerases to carry out their function, the following components must be present for catalysis to occur.
1. A template of DNA. The preferred template is double-stranded DNA. Single-stranded DNA may also serve as a template, but RNA strands or DNA-RNA hybrids may not be used.
2. Activated precursors. The reaction requires the ribonucleoside triphosphates ATP (adenine-ribose-triphosphate), GTP (guanine-ribose-triphosphate), CTP (cytosine-ribose-triphosphate), and UTP (uracil-ribose-triphosphate). These are nucleotides with three phosphates attached to the 5’ carbon of the ribose sugar.

Example of Ribonucleoside triphosphate (ATP)

3. Divalent metal. Unlike DNA polymerase, RNA polymerase does not need a primer, but it does require a divalent metal ion; magnesium or manganese ions are effective.

The direction of synthesis is from 5′ to 3′ and the synthesis is driven by the hydrolysis of pyrophosphate. There have been hybridization experiments that show RNA synthesized by RNA polymerase is complementary to its DNA template.
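The template-directed, 5′ to 3′ nature of synthesis can be illustrated with a short sketch. The function name and the example template are assumptions made for illustration; the only chemistry encoded is base complementarity and the release of one pyrophosphate per nucleotide added after the first.

```python
# Minimal sketch of template-directed RNA synthesis.
# The template is written 3'->5', so the RNA product reads 5'->3' left to right.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the complementary RNA (5'->3') for a DNA template given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

template = "TACGGATTC"            # hypothetical template strand, 3'-TACGGATTC-5'
rna = transcribe(template)

# Each nucleotide added to the growing 3' end releases one pyrophosphate (PPi);
# the first nucleotide keeps its triphosphate, so n nucleotides release n - 1 PPi.
pyrophosphates_released = len(rna) - 1

print(rna, pyrophosphates_released)
```

Because the product is complementary to the template at every position, it will hybridize to the template strand, which is the basis of the hybridization experiments mentioned above.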


RNA Biogenesis Pol I, Pol II, and Pol III

Gene transcription takes place in the nucleus of eukaryotic cells and is performed by three different multisubunit RNA polymerases: Pol I, Pol II, and Pol III. Little is known about the biogenesis of these RNA polymerases, from their synthesis in the cytoplasm to their arrival in the nucleus for transcription. Only recently have studies shown that polymerase assembly intermediates, assembly factors, and factors required for polymerase nuclear import exist in the cell cytoplasm. Pol II is the best characterized of the three, so it is the basis of most studies on the biogenesis of RNA polymerase.

Structure and Assembly of RNA Polymerase II

RNA Polymerase II Complex

RNA Pol II transcribes mRNAs and small non-coding RNAs and contains 12 polypeptide subunits. Each of the three RNA polymerases has its own specific role in transcription, but all of them share a ten-subunit catalytic core. The peripheral subunits are what differentiate their structure and function. The structure of the Pol II core has been determined and can be used to model the homologous cores of Pol I and Pol III. Within the core, the two large subunits (Rpb1 and Rpb2) form opposite sides of the enzyme, and the core can be divided into three interacting subassemblies.

3D Structure Model of RNA Polymerase II

The assembly pathway of the eukaryotic RNA polymerase core was first inferred from studies of bacterial RNA polymerase, because the RNA Pol II core subunits are homologous to those of the bacterial enzyme. Assembly of the bacterial polymerase is initiated by the formation of the αα dimer, which interacts with β to form a bound intermediate. The active-center cleft is completed only when β′ joins in the final step of assembly, so the polymerase is not catalytically active until it is complete. The bacterial and eukaryotic enzymes appear to assemble in an equivalent manner.

In vitro assembly experiments have also been conducted to determine how RNA Pol II is put together. The assembly of three mutant large subunits was followed using pulse-chase experiments. Scientists found that Rpb2 and Rpb3 were the first to interact, and the bound complex then interacted with Rpb1. However, because larger mutated subunits were used, final assembly could not be completed without Rpb6, Rpb10, and Rpb12, which are not normally required for final assembly of normal-sized RNA Pol II.

RNA Nuclear Import

If any Pol II subunit is missing during assembly, an excess of Rpb1 accumulates in the cytoplasm, indicating that the polymerase must be fully assembled before it is allowed to enter the nucleus and take part in transcription. Pol II nuclear localization factors have been identified as functional polymerase-interacting proteins in the cell. Depletion of GPN1 or GPN3 causes Rpb1 to accumulate, whereas expression of GPN1 removes the excess Rpb1. GPN1 binding to Pol II may also directly influence the ability of GTP to bind properly. Homologs of GPN1 also aid in the biogenesis and final assembly of Pol II. GPN1 interacts with the CCT complex, which chaperones many subunits during the formation of Pol II.

Nuclear Import Signal

The individual components of Pol II, the subunits and GPN proteins, do not carry a nuclear import signal, which is why Pol II cannot enter the nucleus until it is fully assembled. Iwr1 is a factor that interacts with fully assembled Pol II and supplies a nuclear import signal. Deletion of Iwr1 leads to an accumulation of all the Pol II subunits, showing that Iwr1 is most likely key to the final step of the process. Iwr1 binds to the active site of Pol II and can “sense” completion by interacting with the Rpb1 and Rpb2 subunits, ensuring that Pol II is fully assembled; this acts as the final checkpoint before entry into the nucleus. After import, a nuclear export signal is used to trigger the recycling of Iwr1. Currently, Iwr1 depletion is only known to affect the subunits and factors of Pol II; nothing has been found on how it affects Pol I and Pol III.

Biogenesis for RNA Pol I and Pol III

The biogenesis of Pol I and Pol III may depend on the chaperones Hsp90 and R2TP, because subunits of Pol I and Pol III were discovered among the client proteins of these two chaperones. Consistent with this, deletion of A135, a Pol I subunit, results in Hsp90 binding to Pol I’s largest subunit, A190. Several photobleaching experiments conducted on Pol I suggested that Pol I is assembled at promoter sites. It is unclear what happens to Pol I after transcription, because it remains a stable complex and does not dissociate; scientists are trying to determine whether Pol I differs fundamentally between organisms.

Pol III is the least understood of the three polymerases. An NLS sequence was discovered near the N-terminus of the second-largest Pol III subunit, C128. When this sequence is deleted, C128 and some other Pol III subunits accumulate in the cytoplasm, while the remaining Pol III subunits stay intact and nuclear. This suggests that the core of Pol III is assembled in the cytoplasm and that the remaining subunits bind the core in the nucleus. Pol III appears to follow the same assembly pathway as Pol II, as revealed by native mass spectrometry.

Because all three RNA polymerases share a core of ten homologous subunits, it is plausible that their assembly is coordinated and occurs simultaneously. A given subunit in any of the three polymerases can therefore be better understood by also studying the corresponding subunits of the other polymerases at the same stage of biogenesis.

RNA Polymerase Translocates

RNA molecules thousands of nucleotides long are synthesized by multi-subunit DNA-dependent RNA polymerases. The reiterative nucleotide condensation reaction proceeds at rates of tens of nucleotides per second. It is tightly linked to the translocation of the enzyme along the DNA template (the threading of the DNA and the emerging RNA molecule through the enzyme). Repeating the nucleotide addition/translocation cycle without the enzyme dissociating from the DNA or the RNA requires both isomorphic and metamorphic conformational flexibility of a magnitude sufficient to accommodate the essential molecular motions. [3]

Types of RNA Polymerase

Eukaryotic cells have three types of RNA polymerases.
Pol I: Synthesizes most of the ribosomal RNA (rRNA) found in ribosomes. Ribosomes are the protein-making machinery of the cell.
Pol II: Creates mRNAs. Messenger RNAs provide the template that ribosomes use for protein synthesis. Pol II also creates many small nuclear RNAs, which help modify RNAs after they are formed.
Pol III: Creates tRNAs, which carry amino acids to the ribosome, as well as the 5S rRNA and other small RNAs.

These three types of polymerases can be distinguished from one another in the lab by how strongly they are inhibited by the poison alpha-amanitin. Pol I is completely resistant to this poison, Pol II is highly sensitive, and Pol III is moderately sensitive.
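The classification by toxin sensitivity can be written as a simple decision rule. The sketch below is hypothetical: the two-dose design, the 50% activity cutoff, and the function name are assumptions chosen for illustration, not an established assay protocol.

```python
# Hypothetical decision rule for telling the polymerases apart by alpha-amanitin sensitivity.
# Inputs are the fraction of transcription activity remaining (0.0 to 1.0) at two doses.
def classify_polymerase(activity_low_dose, activity_high_dose, cutoff=0.5):
    """Guess the polymerase from residual activity at a low and a high toxin dose."""
    if activity_low_dose < cutoff:
        return "Pol II"     # highly sensitive: inhibited even by a low dose
    if activity_high_dose < cutoff:
        return "Pol III"    # moderately sensitive: inhibited only by a high dose
    return "Pol I"          # resistant at both doses

print(classify_polymerase(0.10, 0.00))
print(classify_polymerase(0.90, 0.20))
print(classify_polymerase(0.95, 0.90))
```

In practice the doses and cutoff would have to be calibrated for the organism and assay in question.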

RNA polymerases in eukaryotic cells are composed of several subunits. The majority of them are small and unique to each type of polymerase. However, there are two large subunits that are similar among all of the polymerases, which suggests that these polymerases evolved from a common ancestral polymerase. The two large subunits form the functional core of the enzyme, while the smaller subunits tend to provide the functions specific to each type of polymerase.


In bacteria, the RNA polymerase holoenzyme is made up of two parts, a core polymerase and a sigma factor. The core polymerase has the components needed for elongation in transcription, while the sigma factor is only needed for transcriptional initiation. The core polymerase is made up of two α subunits, one β, and one β’ subunit (α2ββ’), while the sigma factor consists only of σ. In total, there are five types of subunit in RNA polymerase: alpha (α), beta (β), beta’ (β’), sigma (σ), and omega (ω). The function of omega is not well understood, but it is thought to stabilize RNA polymerase.

In bacterial DNA, the promoter sequence is recognized by the sigma unit of the RNA polymerase. Upon recognition of the promoter sequence, the sigma factor will guide the RNA polymerase to the promoter. This sigma factor will then bind the RNA polymerase to the promoter through the α unit of the core polymerase. [4]


Archaeal RNA polymerases are quite similar to eukaryotic RNA polymerases, and especially to RNA Polymerase II; they resemble a stripped-down version of the eukaryotic system. Thermostable DNA polymerases (not RNA polymerases) from archaea are used in PCR because they can withstand the high temperature used to separate DNA strands.


RNA polymerases have a multitude of structural features that help in the transcription process. A structure known as the clamp keeps the polymerase anchored to the DNA. The flap ensures that the mRNA is retained. The rudder helps separate the RNA from the DNA template, preventing the DNA/RNA hybrid from extending. DNA does not enter the mouth of the polymerase directly. It is usually held sideways, with a sharp bend to its left as it exits the polymerase. mRNA is believed to leave from the back of the polymerase. NTPs do not enter the active site through the same channel that DNA is pulled through, but through a secondary channel.

Typical RNA polymerase structure

Similarities and Differences between RNA Polymerase and DNA Polymerase

The synthesis of RNA and DNA is similar in many respects. Both proceed in the 5′->3′ direction. In both, elongation occurs when the 3’ OH group at the terminus of the growing chain makes a nucleophilic attack on the innermost (α) phosphate of the incoming nucleoside triphosphate. In both, synthesis is driven by the hydrolysis of pyrophosphate. The two differ in that RNA polymerase does not require a primer, whereas DNA polymerase does. In addition, DNA polymerase can correct mistakes in nucleotide incorporation, whereas RNA polymerase lacks this ability to excise mismatched nucleotides.


  1. Joan L. Slonczewski, John W. Foster. “Microbiology: An Evolving Science.”
  2. Joan L. Slonczewski, John W. Foster. “Microbiology: An Evolving Science.”
  3. Macromolecular micromovements: how RNA polymerase translocates. Svetlov V, Nudler E.
  4. Joan L. Slonczewski, John W. Foster. “Microbiology: An Evolving Science.”

Transcription Elongation Complex (TEC)

To start transcription, RNA polymerase (RNAP) must recognize and bind to a promoter sequence. Accessory factors help the polymerase form an open promoter complex, in which the DNA strands separate to expose the bases, forming a transcription bubble. RNAP then typically undergoes abortive initiation, in which it synthesizes and releases short RNA transcripts while remaining at the promoter. Eventually RNAP escapes the promoter region by forming a stable transcription elongation complex (TEC), which is able to transcribe the whole gene.

Single-molecule Techniques

Atomic Force Microscopy (AFM)

Atomic force microscopy is a technique used to image ultrastructural alterations in the TEC, such as the change in bend angle of the template DNA induced by RNAP. The TEC is placed on a flat surface and scanned with an AFM cantilever, which is a beam anchored at one end. Deflections of the cantilever are detected by a laser reflected off it. This allows a two-dimensional image of the transcriptional complex to be reconstructed.

Single Molecule Fluorescence

Another technique used to monitor transcription is to fluorescently tag the RNAP itself. This method allows promoter search or elongation to be monitored with minimal perturbation. Specifically, structural changes in the TEC can be examined using fluorescence resonance energy transfer (FRET). FRET can follow the distance between two fluorescently labeled positions by measuring the change in fluorescence intensity.
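The link between FRET signal and distance is usually expressed through the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the distance between donor and acceptor and R0 is the Förster radius of the dye pair. This relation is brought in here for illustration and is not stated in the text above; the function names and the 5 nm Förster radius in the example are assumptions.

```python
import math

def fret_efficiency(distance_nm, forster_radius_nm):
    """Transfer efficiency for a donor-acceptor pair separated by distance_nm."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

def distance_from_efficiency(efficiency, forster_radius_nm):
    """Invert the Forster relation to recover the distance from a measured efficiency."""
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

R0 = 5.0  # assumed Forster radius in nm; the real value depends on the dye pair

for r in (3.0, 5.0, 7.0):
    print(r, round(fret_efficiency(r, R0), 3))
```

The sixth-power dependence means efficiency changes steeply when the separation is close to R0, which is why FRET is sensitive to small conformational changes on the nanometer scale.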


By attaching beads to single RNAP molecules, one can record the position of these beads to determine changes in the location or rotational state of the enzyme. Specifically, the position or rotation of a bead can be measured sensitively from the light it scatters. One can also apply force to the beads with an optical trap (OT), a tightly focused beam of infrared laser light that exerts force on the beads by means of radiation pressure. In addition, force can be applied by means of laminar fluid flow: the end of the DNA template can be attached to a second bead so that the flow exerts force on the free bead, which places tension on the DNA template.

Transcription Initiation

Steps in initiation

Promoter Search

Transcription requires binding of the holoenzyme to a DNA promoter sequence that is embedded in a large excess of nonpromoter genomic DNA. This problem is common to all sequence-specific DNA-binding proteins. Two independent mechanisms, sliding and intersegment transfer, have been proposed to increase the efficiency of the search process. Sliding occurs when RNAP associates with nontarget DNA and diffuses along it in a random “walk” until it reaches the target site. Intersegment transfer involves the polymerase searching for the promoter by crossing from one DNA segment to another while transiently bound to both segments.
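The sliding mechanism can be pictured as a one-dimensional random walk along the DNA. The sketch below is a deliberately simplified model: the lattice of integer positions, the reflecting ends, the seeded random generator, and all names are assumptions made for illustration, and intersegment transfer is not modeled.

```python
import random

def sliding_search(start, promoter, dna_length, rng):
    """Count the steps a random walker needs to reach the promoter position.

    Positions are integers 0 .. dna_length - 1; the walker moves one position
    left or right per step and cannot leave the ends of the DNA.
    """
    position, steps = start, 0
    while position != promoter:
        position += rng.choice((-1, 1))
        position = min(max(position, 0), dna_length - 1)   # stay on the DNA
        steps += 1
    return steps

rng = random.Random(0)   # fixed seed so the run is reproducible
steps = sliding_search(start=10, promoter=40, dna_length=50, rng=rng)
print(steps)
```

Because a random walk wanders back and forth, the step count is typically far larger than the 30 positions separating start and target, which illustrates why pure sliding becomes slow over long distances and why a second mechanism such as intersegment transfer has been proposed.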

Open-Complex Formation

When it locates a promoter site, RNAP undergoes a structural transition from the closed complex to the open complex (OPC). RNAP bends and unwinds a segment of DNA with the aid of initiation factors such as sigma, creating the transcription bubble. Sigma is dubbed the “housekeeping” factor because it directs RNAP to recognize a vast number of promoters in bacteria. For instance, AFM imaging of complexes at an E. coli promoter revealed that the DNA is bent by between 55° and 88°, which is consistent with the bend angles inferred from gel mobility assays.

Abortive Initiation

After forming the OPC, RNAP starts synthesizing an RNA oligonucleotide complementary to the DNA template strand. Although RNAP forms a highly stable complex during the elongation phase, the initially transcribing complex (ITC) is highly unstable; it spontaneously releases short RNA chains and restarts synthesis, a process known as “abortive initiation.”


On-Pathway Elongation

During transcription, RNAP translocates along the template DNA, synthesizing an mRNA that can be thousands of nucleotides long. When the mRNA reaches 9-11 nt in length, RNAP leaves the promoter region and enters the elongation phase. In this phase, the TEC is very stable and remains tightly bound to both the DNA template and the nascent RNA during nucleotide addition. The major stabilizing factor of the complex is thought to be the base pairing within the RNA:DNA hybrid. The “sliding clamp” model states that the extensive protein-nucleic acid contacts within the polymerase contribute greatly to RNA retention, increasing the overall stability. The “clamp”, which consists of narrow protein channels, surrounds the hybrid to prevent any shearing motion between the RNA and the DNA.
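The transition from abortive cycling to stable elongation once the transcript reaches 9-11 nt can be sketched as a simple stochastic loop. This is a toy model only: the escape length of 10 nt (chosen from within the quoted range), the per-nucleotide abort probability, and the function name are assumptions, not measured values.

```python
import random

ESCAPE_LENGTH = 10   # assumed, within the 9-11 nt range given in the text

def initiate(rng, abort_probability=0.3):
    """Simulate abortive cycling; return (lengths of released short RNAs, escaped length)."""
    aborted = []
    length = 0
    while length < ESCAPE_LENGTH:
        length += 1                                   # add one nucleotide
        if length < ESCAPE_LENGTH and rng.random() < abort_probability:
            aborted.append(length)                    # release the short RNA
            length = 0                                # restart synthesis at the promoter
    return aborted, length

aborted, escaped = initiate(random.Random(0))
print(len(aborted), escaped)
```

Every released transcript in the model is shorter than the escape length, and the loop ends only when a transcript reaches that length, mirroring the idea that promoter escape marks entry into the stable elongation phase.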

Off-Pathway Events

On-pathway elongation is frequently interrupted by entry into off-pathway states, which play an important role in regulating RNA synthesis. One example is transcriptional pausing during elongation. Pauses can reduce the rate of mRNA production, recruit factors to the TEC that modify subsequent transcription, function as a precursor to termination, or affect messenger splicing. Long “stabilized” pauses are known to play a regulatory role; they are associated with the formation of RNA hairpins in the transcript, which is thought to inactivate RNAP. A series of studies has shown that pauses lasting 20 seconds or more are associated with base misincorporation during RNA synthesis, suggesting a role in proofreading.


Termination is a tricky step: because the TEC is so stable, RNAP must dissociate accurately, releasing the mRNA and the DNA template. In prokaryotes, termination occurs at specific sequences that code for a stable hairpin in the nascent RNA. In general, termination might be caused by allosteric interactions between the RNA hairpin and RNAP that trigger the TEC to release the RNA and DNA. Some studies concluded that termination proceeds through an intermediate elongation-incompetent state, whereas others support termination occurring rapidly with no intermediates.


Herbert, Kristina M., William J. Greenleaf, and Steven M. Block. “Single-Molecule Studies of RNA Polymerase: Motoring Along.” Annual Review of Biochemistry 77 (2008): 149-176. Print.

RNA-dependent RNA polymerase (RdRP) is an enzyme that catalyzes the replication of RNA from an RNA template. This contrasts with the typical, better-known RNA polymerase, which catalyzes the transcription of RNA from a DNA template.


The most famous viral RdRP is the poliovirus 3Dpol. The virus has an RNA genome, which enters the cell through receptor-mediated endocytosis. The RNA is able to act as a template for complementary RNA synthesis. The complementary strand in turn acts as a template to produce new viral genomes, which are packaged and released when the cell lyses, going on to infect other cells. Because this method of replication involves no DNA stage, replication is rapid. The downside, however, is that there is no ‘back-up’ DNA copy.

Several eukaryotes also have RdRPs, which are involved in RNA interference; they amplify microRNAs and small temporal RNAs, and they produce double-stranded RNA using small interfering RNAs as primers. These RdRPs are used in defense mechanisms, but they can be usurped by RNA viruses for their own benefit.

Polio Virus

The first interaction of poliovirus with a host cell is binding to a specific cell surface protein, the poliovirus receptor (PVR). The PVR is a sialylated cell surface glycoprotein and a member of the immunoglobulin superfamily, whose members contain loop-shaped Ig domains. PVR has three Ig loops on the outside of the cell, numbered starting with the one farthest from the cell surface. The virus particle binds to loop 1 of the receptor.

The poliovirus genome is made of positive-sense single-stranded RNA that encodes a polyprotein of 2100-2400 amino acids. Both ends of the genome are modified: the 5′ end by a covalently attached basic protein, VPg, which consists of 23 amino acids, and the 3′ end by polyadenylation. In a series of cleavages, viral proteases cleave themselves out and break the polyprotein down into 10 separate gene products involved in replication and packaging.

The viral protease 2A cleaves the p220 subunit of the cap-binding complex, making capped host cell mRNA unrecognizable to ribosomes. In this way the 2A protease abrogates most of the host cell’s own protein synthesis. Translation of viral mRNA depends instead on a 5′ UTR that contains an internal ribosome entry site, which serves as a docking site for ribosomal subunits.

Replication occurs entirely in the cytoplasm. In addition to serving as a template for protein synthesis, the positive-sense genome is used as a template for the synthesis of negative-sense strands. Because the host cell lacks the machinery needed to replicate RNA, poliovirus uses a viral RNA-dependent RNA polymerase to produce RNA molecules of the opposite polarity. The viral protein VPg, covalently attached to uridine, serves as the primer. The first round of replication produces a single antisense molecule. This antisense template is used to produce copies of the original genome, which are packaged into viral capsids before being released.

Once the viral RNA has been translated to produce the necessary proteins and the genome has been replicated, the virus must package the newly synthesized RNA molecules inside capsids to complete the virus particle. The capsid proteins self-assemble into an immature capsid that contains the required proteins but has not yet undergone the final cleavages. The mature poliovirus capsid has icosahedral symmetry and contains 60 copies each of the viral capsid proteins VP1, VP2, VP3, and VP4. The viral RNA enters the incomplete capsid and is secured inside when the viral proteases make the final cleavages. Once the genomes have been packaged into mature virions, the virus particles await the cell’s lysis in order to be released. As many as 100,000 virions can be released from a single infected cell.

Binding of the virus to the receptor causes a conformational change in the capsid. VP4, an internal capsid protein, detaches from the capsid. The capsid swells and the poliovirus genome becomes susceptible to degradation. When VP1 is released, the genome is released into the cytoplasm of the cell. This viral entry strategy is very inefficient; only about 1% of the virus particles initiate an infection.

Discovered in the 1980s, RNA helicases are enzymes that use ATP to bind and remodel RNA and ribonucleoprotein complexes (RNPs). Almost all helicases work and interact with many other proteins inside a multi-component assembly. While it is unknown exactly how RNA helicases locate their binding sites on these complexes, experiments suggest that they either bind to cofactors, which then guide them to the complex, or find the binding sites themselves according to a complex code of features on the RNAs. RNA helicases play an important role in eukaryotic RNA metabolism and are found in all kingdoms of life, yet little is known about how they work in the cell. RNA helicases are similar to DNA helicases and share similar functions.

RNA Helicase Classifications

RNA helicases can also be classified into six superfamilies (SFs). SFs 1 and 2 are comprised of helicases that are non-ring forming. All eukaryotic RNA helicases belong to these superfamilies. SFs 3 to 6 are helicases that can form rings and can be found in bacteria and viruses.

SFs 1 and 2 can be broken down into well-defined helicase families. Each family has distinct structural and functional properties. Six of the families have RNA helicases while the rest consist of DNA helicases. Helicases in SF 1 and 2 have a core made of two similar helicase domains and have at least 12 characteristic sequence motifs at positions in the helicase core. Not all helicases in one family will have the same motifs but they have high sequence conservation. In other families, sequence conservation is low. Across superfamilies, sequence conservation is even lower. This suggests differentiation between DNA and RNA helicases was not an evolutionary force in the classification of helicase families.

The helicase core is also flanked on either side by C- and N-terminal domains. The terminal domains are essential to the helicase’s cellular specificity because they help recruit the helicase to specific complexes. They accomplish this through their interactions with other proteins or by recognizing specific nucleic acid sections. Unlike the core’s sequence motifs, the C- and N-terminal domains are not conserved between families. Certain families in SF1 and SF2 are also identifiable by a characteristic beta-hairpin between the Va and VI motifs of the helicase core. The helicase families that show this beta-hairpin are the Ski2-like, DEAH/RHA, and NS3/NPH-II families. Other families, such as the Upf1-like and RIG-I-like families, have noticeable inserts between or within the helicase core domains.

NPH-II helicase, found in vaccinia virus, and NS3 helicase from the hepatitis C virus are two RNA helicases that are essential for viral replication and have been studied extensively. Both of these helicases load onto a 3′ single strand of RNA and move toward the 5′ end of the strand. They unwind the RNA in bursts and pauses, beginning at the junction of the single and double strand. During a pause, the helicase may be preparing to continue unwinding, or it may dissociate from the RNA. At present, little is known about the fundamental characteristics of helicases acting on RNA.

Source: Li PTX, Vieregg J, Tinoco I Jr. How RNA Unfolds and Refolds. Annu Rev Biochem. 2008;77:77-100.

RNA Helicase Mechanisms

RNA helicases employ two mechanisms for unwinding: canonical unwinding and local strand separation. Both are ATP-dependent, because ATP binding is needed not only for the helicase to bind the duplex but also to keep the two helicase domains together. In both mechanisms, the helicase domains bind the nucleic acid in similar orientations and make contact with the RNA’s sugar-phosphate backbone. This allows for complete attachment of the RNA helicase and movement along the RNA by 1 nt per ATP consumed. Many translocating helicases can move in bursts of up to 18 nt before they perform a rate-limiting step, allowing quick unwinding of the RNA duplex. In local strand separation, the bound RNA strand often shows bends in its backbone in the presence of ATP analogs, while in canonical unwinding no such bend is seen. The bends disfavor the duplex structure and most likely represent the RNA conformation of the two strands after the duplex is unwound.
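The stoichiometry quoted above (1 nt per ATP, bursts of up to 18 nt) gives a simple estimate of the cost of translocating through a duplex. The sketch below is a back-of-the-envelope calculation; the function name is hypothetical, and it assumes every burst reaches the maximum length, which gives a lower bound on the number of rate-limiting steps.

```python
import math

NT_PER_ATP = 1       # from the text: 1 nt moved per ATP consumed
MAX_BURST_NT = 18    # from the text: bursts of up to 18 nt

def unwinding_cost(duplex_bp):
    """Return (ATP consumed, minimum number of bursts) to move through a duplex."""
    atp = math.ceil(duplex_bp / NT_PER_ATP)
    bursts = math.ceil(duplex_bp / MAX_BURST_NT)
    return atp, bursts

print(unwinding_cost(40))
print(unwinding_cost(18))
```

For a 40 bp duplex this gives 40 ATP and at least 3 bursts, so the rate-limiting step between bursts is encountered far less often than the ATP-driven single-nucleotide steps.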

Canonical Duplex Unwinding[edit]

Canonical Unwinding Mechanism for RNA Helicase

When RNA helicases unwind RNA strands canonically, the helicase attaches to a single-stranded region of the RNA and then translocates along the bound strand. Translocation has a defined direction, either 3′ to 5′ or 5′ to 3′, and displaces the complementary strand. Each translocation step involves multiple processes, including ATP binding and hydrolysis, which drive the process forward. This type of unwinding requires strands to have single-stranded regions in a defined polarity with respect to the duplex. The RNA helicase families known to use this mechanism are Upf1-like, Ski2-like, RIG-I-like and DEAH/RHA.

Local Strand Separation[edit]

Duplex Unwinding by Local Strand Separation

Local strand separation occurs when an RNA helicase loads directly onto a duplex region of the RNA and uses ATP to separate the strands. Unlike the canonical method, this type of unwinding requires neither a single-stranded region with a specific orientation nor ATP hydrolysis. ATP binding is sufficient for duplex unwinding to occur; ATP hydrolysis, however, is needed to detach the helicase from the RNA. Sometimes the enzyme dissociates before the strands have completely separated, because the strands re-anneal quickly. As the RNA duplex gets longer, this type of unwinding becomes unfavorable and inefficient. The DEAD-box family unwinds duplexes this way and can only handle duplexes of 10 to 12 base pairs.

Other Functions[edit]

RNA helicases also have functions other than unwinding duplexes. They can displace proteins from RNAs, which is called RNP remodeling. RNP remodeling appears to be important to how RNA helicases function, since RNAs are usually bound to proteins in vivo. RNP remodeling does not, however, depend on duplex unwinding, and it is performed both by helicases that unwind canonically and by DEAD-box proteins. Some helicases can only remove certain proteins, while others can remove a wider variety of proteins.

RNA helicases have also been shown to help in the RNA folding process. One example is the RNA helicases that facilitate and regulate RNA folding in fungal mitochondria, acting as RNA chaperones. They should not be confused with protein chaperones, which also help in RNA folding. RNA chaperones guide the RNA through the series of folding steps while continually proofreading: they determine whether the substrates formed are correct. If so, the process continues; if not, the substrate is discarded and the RNA chaperone opens up a new reaction path for RNA folding. Protein chaperones, on the other hand, catalyze the steps of the folding pathway and help stabilize the resulting RNA structure.

Other helicase families show activities related to the innate immune system. The RIG-I RNA helicase translocates along an RNA duplex, but instead of unwinding it, the helicase acts as a pattern recognition receptor and determines whether viral RNA is present in the cytoplasm. It detects viral RNA by the long double-stranded RNAs that are created during viral replication. Only viral RNAs are detected because the majority of RNAs in eukaryotic cells form only short RNA duplexes, which are ideal for the local strand unwinding process.

DEAH/RHA helicases also help in both the separation of a spliced mRNA from the spliceosome and the ligation of the exon to the mRNA. They separate the spliced mRNA from the spliceosome by attaching to the mRNA and moving in the 3′ to 5′ direction, breaking RNA-RNA and RNA-protein interactions along the way.


Jankowsky, Eckhard. “RNA helicases at work: binding and rearranging.” Trends in Biochemical Sciences. xx (2010): 1-11. Web.
Riboswitches are recently discovered RNA domains that function as gene expression regulators. A riboswitch is a portion of an mRNA strand that is able to bind small molecules and alter gene activity. An mRNA that possesses a riboswitch is able to regulate its own activity depending on whether or not a molecule is bound to it. Riboswitches are located in the 5′ untranslated regions of messenger RNA. These functional domains exist in bacteria and have also been engineered in the laboratory[47]. Riboswitches are significant because proteins had generally been believed to be primarily responsible for the complexity, specificity, and efficiency of gene control. Most riboswitches exist in bacteria, although some have also been found in plants and fungi[48].

Riboswitches were first described by Ronald Breaker’s lab in 2002, when they used in-line probing of Escherichia coli btuB mRNA to show that it could bind a metabolite (adenosylcobalamin, AdoCbl) and inhibit translation of the strand’s product without the help of proteins[49].

The original meaning of riboswitch was a messenger RNA that can sense small metabolite molecules. While this is still the usage today, some have expanded the meaning to include other types of RNA. An mRNA that contains a riboswitch can regulate its own activity, which shows that molecules can evolve to regulate themselves. These RNAs were seen to distinguish between very similar molecules or analogs, which shows the intricacy of the mechanism. This finding revealed that the capabilities of RNA are much greater than once thought, and it illustrates how much remains to be learned about living systems. Riboswitches allow RNA to respond to different concentrations of molecules, almost as though the RNA were determining its own actions. Because the definition of a riboswitch has expanded, many different kinds are known today.

As the mantra of structural biochemistry is that structure determines function, it is not a surprise that the structure of the riboswitch allows for such great function. Most RNAs do not need to conform to the strict Watson-Crick model of DNA, which allows for many variations in RNA structure. This great variation is responsible for the abilities of riboswitches. Riboswitches are made of two parts: the aptamer domain and the expression platform. The aptamer domain essentially acts as a receptor that binds specific ligands. The expression platform can toggle between two different secondary structures in response to ligand binding, creating a plethora of possible structures. In both parts of a riboswitch there is a switching sequence, which directs the expression of the genes.

Types of Riboswitches[edit]

There are several types of riboswitches known, some of which are:

  • TPP riboswitch : this riboswitch binds TPP (thiamin pyrophosphate) in order to regulate the transport and synthesis of thiamin as well as other metabolites with similar properties.
  • Lysine riboswitch : this riboswitch binds lysine and regulates its biosynthesis, catabolism, and transport.
  • Glycine riboswitch : this riboswitch regulates glycine metabolism. It is the only riboswitch currently known to perform cooperative binding.
  • FMN riboswitch : this riboswitch binds FMN (flavin mononucleotide) in order to regulate the transport and synthesis of riboflavin.
  • Purine riboswitch : this riboswitch binds purines to regulate their transport and metabolism. Different forms of this riboswitch bind either guanine or adenine, depending on the pyrimidine in the riboswitch.
  • Cobalamin riboswitch : this riboswitch binds adenosylcobalamin, the coenzyme form of vitamin B12, in order to moderate the synthesis and transport of cobalamin and other similar metabolites.

Many others are also known, such as the SAM riboswitch, PreQ1 riboswitch, SAH riboswitch, glmS riboswitch, and cyclic di-GMP riboswitch.


Riboswitch model

Riboswitches consist of two functional components, the conserved aptamer region and the highly variable expression platform. Unlike proteins, only four nucleotides are available to generate the specificity required by the riboswitch to bind[50].

The aptamer domain is usually a single binding site that has a highly conserved primary and secondary RNA structure and forms selective binding pockets for ligands. It essentially acts as a sensor for metabolites within the cell. Since it is located at the 5′ end of mRNA, it is usually the first to be transcribed by RNA polymerase.

Structural data show that hydrogen bonds, van der Waals forces, and other interactions form between the aptamer and the substrate, and also with adjacent RNA regions, which improves aptamer-substrate affinity. Other aptamers may use an induced-fit mechanism with deep binding pockets[51].

The expression platform is commonly located downstream from the aptamer.


Most riboswitches function within feedback pathways by sensing metabolites and turning “off” the ability to express genes that would produce proteins that would continue the production of that metabolite[52].
The aptamer region tends to recognize ligands that are closely related to the gene products downstream from the riboswitch expression platform.


  1. ^ Wang, J., Lee, E., Morales, D., Lim, J., Breaker, R. “Riboswitches that Sense S-adenosylhomocysteine and Activate Genes Involved in Coenzyme Recycling”. Molecular Cell 29, 691–702, March 28, 2008.
  2. ^ Nahvi, A., Sudarsan, N., Ebert, M., Zou, X., Brown, K., Breaker, R., “Genetic Control by a Metabolite Binding mRNA” Chemistry & Biology, Vol. 9, 1043-1049, September, 2002.
  3. ^ Coppins, R., Hall, K., Groisman, A. “The intricate world of riboswitches” Current Opinion in Microbiology, Volume 10, Issue 2, April 2007, Pages 176-181.
  4. ^ Breaker, R. “Complex Riboswitches” Science, Vol. 319, 1795-1797, 28 March 2008.

How RNA Unfolds and Refolds[edit]

In general, RNA unfolds from tertiary structure to secondary structure to single-stranded RNA, and it folds in the reverse order. RNA can be unfolded by raising the temperature to denature it, or in some cases by enzymes such as RNA-dependent RNA polymerases (RdRps) or helicases. Scientists use optical tweezers, also called laser tweezers, and fluorescence resonance energy transfer, also known as FRET, to study how secondary and tertiary RNA structures unfold and refold. They also use cation binding to study how ribozymes fold and unfold.

Secondary Structure RNA[edit]

Secondary RNA structure can be unfolded by increasing the temperature or by using chemical reagents to denature the RNA. Another technique used to study how RNA unfolds is optical tweezers, which apply a force that unfolds the RNA at physiological temperature and in buffer solutions (79). For example, each end of a hairpin RNA is attached to a bead: one bead is held in an optical trap and the other on a micropipette. The RNA can then be pulled and unzipped as the micropipette moves.

RNA refolding is the reverse of RNA unfolding. When the micropipette moves back, the RNA relaxes and refolds. However, if the relaxation force applied by the optical tweezers increases, this can cause the RNA to misfold (81).

Misfolding in RNA can be corrected by increasing the force. When force is increased, the RNA will try to refold into an active and functional form.

Tertiary Structure RNA[edit]


Tertiary RNA structure is relatively weak. Changing the temperature or the solution, even to conditions not much different from the physiological state, can therefore destabilize the RNA interactions.


A technique called FRET, fluorescence resonance energy transfer, can be used to understand how RNA folds (78). Scientists label two nucleotides on the RNA strand with dyes, and as the RNA folds, the FRET signal allows them to measure the distance between the two labeled nucleotides, and so to follow the motif they belong to, with respect to time. Scientists can also use FRET to understand the changes in RNA conformation when RNA is bound to Mg2+ or ribosomal proteins.
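The way a FRET signal reports on distance can be sketched with the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the separation of the two dyes and R0 is the Förster radius of the dye pair. The function names and the 5 nm value of R0 below are assumptions chosen for illustration; they do not come from the cited review.

```python
def fret_efficiency(r_nm, r0_nm):
    """Transfer efficiency for dyes r_nm apart, given Forster radius r0_nm."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def distance_from_efficiency(efficiency, r0_nm):
    """Invert the Forster relation to recover the dye separation in nm."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_NM = 5.0  # assumed Forster radius, illustrative only

if __name__ == "__main__":
    for r in (3.0, 5.0, 7.0):
        print(f"r = {r} nm -> E = {fret_efficiency(r, R0_NM):.3f}")
```

Because of the sixth-power dependence, the efficiency changes steeply when the dyes are near R0 apart, which is why a docked (folded) and an undocked (unfolded) conformation give clearly different signals.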

Single-molecule of RNA Enzymes[edit]

To study single molecules of RNA enzymes, scientists use ribozymes and the FRET technique. The difference from the study of secondary or tertiary RNA structure is that scientists add a series of Mg2+ pulses and observe the FRET signals in order to tell whether the ribozyme is docked (folded) or undocked (unfolded). From these Mg2+ “pulse-chase experiments,” scientists can find the “kinetic fingerprints” of the hairpin ribozymes’ enzymatic states (84). Based on this, scientists were able to figure out that ribozymes participate in chemical reactions such as oxidation or reduction, synthesis of nucleotides, and formation of peptides. Thus, the study of ribozymes reinforces the RNA World hypothesis, which states that RNA preceded DNA (85).

Effects of Ligand and Protein Binding to RNA[edit]

Another way that RNA unfolds is through the presence or absence of ligands and/or proteins. Specific proteins and/or ligands bind to RNA and cause it to unzip. Using a technique called single-molecule fluorescence, scientists have studied ribonucleoproteins (RNPs) and their effect on RNA (88). With this technique, scientists can count RNP subunits in bacteriophage, complementing “electron cryomicroscopy and crystallography” (89). When RNA hairpins unfold, RNPs are assembled and proteins bind to the RNA, causing the RNA to change conformation.

There are three commonly used applications of single-molecule fluorescence techniques. The first is simply counting the subunits in a ribonucleoprotein (RNP). The second is following the annealing of two hairpins, which requires the unfolding of both; the specific role of proteins in this process is still not entirely clear. The third uses the fact that RNP assembly is sequential, which indicates that events occurring early in protein binding result in conformational changes in the RNA. By labeling telomeric RNA with a pair of fluorophores at different positions, scientists have found that the binding of p65 protein can induce a conformational change.[1]

In DNA, arginine is the component used to bind and stabilize the molecule. However, in RNA, it is argininamide, and not arginine, that binds and stabilizes the TAR hairpin.[2]

Enzymes used to unfold RNA[edit]

Scientists learned that RNA needs an input of energy in order to unfold, whereas RNA folding does not require energy because it is a spontaneous reaction. According to the authors of “How RNA Unfolds and Refolds,” one ATP is used to unfold three to four base pairs in RNA (89). Enzymes such as helicases or RNA-dependent RNA polymerases therefore help RNA unfold by using the chemical energy released when nucleoside triphosphates undergo hydrolysis. For example, helicases use the energy from ATP hydrolysis to strip off the proteins bound to RNA and to unfold double-stranded RNA (90). As the concentration of ATP goes up, this step therefore becomes faster. Although RNA-dependent RNA polymerase has not been completely explored, scientists expect that it works similarly to helicases.
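As a rough illustration of the figure quoted above (one ATP per three to four base pairs), the ATP cost of unfolding a duplex of a given length can be bracketed as follows. The function name and the example lengths are hypothetical, and the calculation ignores everything except that single ratio.

```python
import math


def atp_needed(n_base_pairs, bp_per_atp_low=3, bp_per_atp_high=4):
    """Return (fewest, most) ATP molecules needed to unfold n_base_pairs.

    fewest assumes 4 bp are opened per ATP, most assumes 3 bp per ATP.
    """
    if n_base_pairs < 0:
        raise ValueError("number of base pairs cannot be negative")
    fewest = math.ceil(n_base_pairs / bp_per_atp_high)
    most = math.ceil(n_base_pairs / bp_per_atp_low)
    return fewest, most


if __name__ == "__main__":
    for length in (12, 20, 100):
        low, high = atp_needed(length)
        print(f"{length} bp duplex: between {low} and {high} ATP")
```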

Another way that RNA can be unfolded is by binding single stranded RNA to a single-strand specific protein. However, in this situation, the binding must be strong so that it can overcome the forces seen in base pair bonding.[3]

In viral RNA replication, RNA must be single-stranded in order for its sequence to be read during replication and translation. The RNA molecule is first unfolded by an RNA-dependent RNA polymerase or a ribosome. These enzymes use the energy from the hydrolysis of nucleoside triphosphates to unfold the RNA substrates. The RNA needs to have varying sequences.[4]

Work Cited:
Li, Pan T.X., Tinoco Jr, Ignacio, and Vieregg, Jeffrey. “How RNA Unfolds and Refolds.” Annual Review of Biochemistry. 2008. 77-100. Print.

Tertiary Structure Folding[edit]

Tertiary folding refers to the interactions between distal domains that form the structure needed for the RNA to carry out catalytic and regulatory functions. These interactions are fairly weak and can easily be disrupted by small changes in temperature or solution conditions. The FRET technique is used to observe the tertiary folding of RNA by measuring the distance between two fluorescently dyed nucleotides. This enables observation in real time of specific tertiary motifs, which consist of distinctive folds of the interacting RNA strands. The FRET technique was crucial in studying RNA folding by measuring the conformational change during the binding of salt (Mg2+) or ribosomal protein. One prominent motif that has been studied extensively is the tetraloop-receptor interaction, which is present in many large folded RNAs and has been used to create synthetic RNA “building blocks.”

Using optical tweezers and increasing the force, RNA can be unfolded through four distinct conformations in the following order: kissing complex, two linked hairpins, one hairpin, and single strand. Similarly, when the force is decreased, the single strand can be refolded in the reverse order into the kissing complex. The kissing interaction is defined as base pairing (between complementary sequences) of two hairpin loops, and a hairpin loop is created when two complementary sequences in a single RNA meet and bind.


Salt effect on tertiary structures[edit]

In tertiary structures, folding and stability are highly dependent on ionic conditions, especially the concentration of Mg2+. Metal ions thus have a greater effect on tertiary structures than on secondary structures of RNA. Mg2+ slows down the kinetics of breaking tertiary interactions, but only moderately affects the folding rates.

Common motifs that demonstrate salt effects include intron ribozymes, pseudoknots, and loop-loop interactions:
In intron ribozymes distinct rips were observed in MgCl2, indicating the unfolding of a structural domain. When there was no Mg2+, no rips were observed.

In pseudoknots compact structures are formed, and have increased stability with bound Mg2+.

In loop-loop interactions force manipulation is used to see how an intramolecular kissing complex changes. This can be seen from the unfolding and refolding of secondary structures. The base pair sequence affects the salt dependence of kissing interactions.[5]


Work Cited: Li, Pan T.X., Jeffrey Vieregg, and Ignacio Tinoco. “How RNA Unfolds and Refolds.” Annual Review of Biochemistry 77.1 (2008): 77-100. Print.

  1. Li PTX, Vieregg J, Tinoco I Jr. How RNA Unfolds and Refolds. Annu Rev Biochem. 2008;77:77-100.
  2. Li PTX, Vieregg J, Tinoco I Jr. How RNA Unfolds and Refolds. Annu Rev Biochem. 2008;77:77-100.
  3. Li PTX, Vieregg J, Tinoco I Jr. How RNA Unfolds and Refolds. Annu Rev Biochem. 2008;77:77-100.
  4. Li PTX, Vieregg J, Tinoco I Jr. How RNA Unfolds and Refolds. Annu Rev Biochem. 2008;77:77-100.
  5. Li PTX, Vieregg J, Tinoco I Jr. How RNA Unfolds and Refolds. Annu Rev Biochem. 2008;77:77-100.

Mechanical unfolding of RNA has become the preferred method for studying the RNA folding problem because it does not need high temperatures or denaturants. The invention of laser tweezers has made this technique even more valuable, in that it allows force to be applied to single molecules of RNA and gives a clear view of their unfolding and refolding. To discover the structural transitions of RNA, a measured force is applied and the end-to-end distance of the strand is measured; this also allows the mechanical work done to be calculated. Another technique, FRET, or fluorescence resonance energy transfer, is used to reveal the kinetics of conformational change during RNA folding (Li, Vieregg, Tinoco Jr., 78).

When studying metal ion binding, mechanical unfolding is the preferred method of study for several reasons. The use of optical tweezers allows RNA to be studied under physiological conditions, and the force applied by the tweezers only affects non-covalent interactions. Because of this, thermodynamic calculations and interpretations remain simple; the force applied affects only the RNA’s structure, and not the activity of water molecules and ions (85). Therefore, scientists do not need to consider colligative properties in their experiments. Furthermore, RNA conformation can be manipulated so that scientists can measure the formation or breaking of interactions in RNA. Overall, the development of this method has made the study of RNA structure easier and more accurate.
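The calculation of mechanical work from a pulling experiment can be sketched as the area under the force-extension curve, W = ∫ F dx. The sketch below uses trapezoidal integration; the function name and the sample trace are invented for illustration and are not data from the review.

```python
AVOGADRO = 6.02214076e23
# 1 pN*nm = 1e-21 J; multiply by Avogadro's number and divide by 1000 for kJ/mol
PN_NM_TO_KJ_PER_MOL = 1e-21 * AVOGADRO / 1000.0


def mechanical_work(extension_nm, force_pn):
    """Trapezoidal estimate of the work (in pN*nm) along a force-extension trace."""
    if len(extension_nm) != len(force_pn):
        raise ValueError("extension and force must have the same length")
    work = 0.0
    for i in range(1, len(extension_nm)):
        dx = extension_nm[i] - extension_nm[i - 1]
        work += 0.5 * (force_pn[i] + force_pn[i - 1]) * dx
    return work


if __name__ == "__main__":
    # Hypothetical trace: end-to-end distance grows by 18 nm near 14 to 15 pN
    extension = [0.0, 6.0, 12.0, 18.0]
    force = [14.0, 14.5, 14.5, 15.0]
    w = mechanical_work(extension, force)
    print(f"work = {w:.1f} pN*nm = {w * PN_NM_TO_KJ_PER_MOL:.1f} kJ/mol")
```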


Li, Pan T.X., Jeffrey Vieregg, and Ignacio Tinoco. “How RNA Unfolds and Refolds.” Annual Review of Biochemistry 77.1 (2008): 77-100. Print.


Short RNAs play an important role in biochemistry. They include transfer RNAs, small nuclear RNAs, microRNAs, and others. Many non-protein-coding RNAs are less than 200 nt long, and these can be called short RNAs (sRNAs). This section describes how to profile short RNAs using Helicos single-molecule sequencing. To lengthen the RNA chains so that they can be converted to cDNA, scientists use tailing and 3′-end processing.

Profiling short RNAs[edit]


In RNA isolation, scientists want to purify sRNA from total RNA or cultured cells. The materials used for this are the mirVana™ miRNA Isolation Kit, the miRNeasy Mini Kit, and the RNA/DNA kit. The RNA/DNA kit can be used to isolate large amounts of sRNA from total RNA. TBE-Urea polyacrylamide gel electrophoresis and overnight elution are then used to obtain the sRNA from each kit.
In the process of making cDNA, scientists use Escherichia coli PolyA polymerase and 100 mM CTP substrate. They then use the reverse transcriptase called ThermoScript. Phenol, chloroform, and isoamyl alcohol are used together with 5 M ammonium acetate, and the cDNA synthesis primer and RNAse A are also used. The primer was made by Integrated DNA Technologies and its sequence is TCG CGA GCG GCC GCG GGG GGG GGG GGrG rGrG.
In profiling 3′-end sRNAs, the reverse transcriptase called SuperScript III, USER enzyme, the dTU-V cDNA synthesis primer with the sequence TTTTUTTUTUTTTUTTTTUTTTUTTV, RNAse H, and RNAse 1f are used.
Overall, the two methods use similar materials. Both use 100% and 70% ethanol, 10 mM dNTPs, AMPure® beads, a magnetic stand for 1.5-mL tubes, and a PCR machine. RNAse inhibitors, either ANTI-RNAse or RNAseOUT, are also used in both methods.
In sequencing cDNA, scientists use 20 U/μL Terminal Transferase, dATP, 1 mM Biotin-ddATP, 10 mg/mL bovine serum albumin, Quant-iT™ OliGreen® ssDNA Reagent, NanoDrop 3300, HeliScope™ Single Molecule Sequencer, and Helicos® Flow Cells.


Before beginning to profile short RNAs, scientists have to understand the goals of the experiment. sRNA of the desired length must be separated from the long RNAs. Only some sRNAs carry the 3′ polyA tail found on mRNAs, which can be used to convert sRNAs to cDNA. Random hexamers can only sometimes be used to convert short RNAs to cDNA, since the RNAs are often too short for the conversion to go smoothly. Some RNAs also have modifications at their 3′ and 5′ ends that make conversion to cDNA harder.
There are two methods for detecting sRNAs. The first method detects sRNAs with a 3′-OH, and the second detects sRNAs with a 3′ polyA tail. First, the sRNAs are isolated using one of several kits. If only the fraction of sRNAs shorter than 200 nt is needed, the mirVana kit, the miRNeasy kit, or the RNA/DNA kit is useful. TBE-Urea denaturing polyacrylamide gel electrophoresis can also be used to isolate sRNA of a specific size range.

The general method of profiling sRNAs is to tail the RNAs with 3′ polyC. First, the RNA is placed in a PCR tube in 30 μL; around 5 ng to 10 ng of short RNA can be used in this step. The tube is incubated in the PCR machine at 85 °C for 2 minutes and then placed on ice for another 2 minutes. To the tube are added 10 μL of 5× E. coli PolyA polymerase buffer, 5 μL of 25 mM MnCl2, 1 μL of 100 mM CTP, 1 μL of Anti-RNAse or RNAseOUT, and 3 μL of 2 U/μL E. coli PolyA polymerase. The solution is mixed and incubated in the PCR machine for 3 hours at 37 °C. After that, 40 μL of water and 10 μL of 5 M ammonium acetate are added to the incubated tube. The solution is then extracted twice with phenol, chloroform, and isoamyl alcohol, and the RNA is precipitated by adding three volumes of 100% EtOH at −80 °C. Finally, the solution is centrifuged at 4 °C for 30 minutes, the precipitate is washed with 70% EtOH and vacuum dried, and the solid is dissolved in 30.5 μL of water.

To make cDNA, 1 μL of 100 mM cDNA synthesis primer is first added to the 30.5 μL solution from the previous step. The solution is incubated for 2 minutes at 70 °C in the PCR machine. At that temperature, more reagents are added: 10 μL of 5× ThermoScript cDNA Synthesis buffer, 5 μL of 0.1 M DTT, 2.5 μL of 10 mM dNTPs, and 1 μL of ThermoScript reverse transcriptase. Finally, the solution is incubated for 15 minutes to inactivate the reverse transcriptase.

After cDNA synthesis, the cDNA is purified. First, the synthesized cDNA is mixed with 1 μL of RNAse A. The AMPure bead suspension is mixed thoroughly so that no beads remain settled. Then 150 μL of the AMPure beads are added to the mixture of cDNA and RNAse, and the solution is incubated at room temperature for 30 minutes. The beads are collected on the magnetic stand and the remaining solution is removed. The beads are washed twice with 200 μL of 70% EtOH and dried for 30 to 45 minutes at room temperature. The cDNA is then eluted from the beads twice with 20 μL of water.
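As a consistency check on the polyC tailing recipe above, the total reaction volume and the final concentration of each stock can be worked out with C_final = C_stock × V_added / V_total. This sketch assumes the listed volumes are read in microliters, and the helper name and data layout are hypothetical.

```python
def final_concentrations(components):
    """components: list of (name, volume_uL, stock_concentration or None).

    Returns the total volume and a dict of final concentrations, in the same
    unit as each stock, for every component that has a stock concentration.
    """
    total = sum(volume for _, volume, _ in components)
    finals = {
        name: stock * volume / total
        for name, volume, stock in components
        if stock is not None
    }
    return total, finals


# Volumes as listed in the tailing step (read here as microliters)
TAILING_MIX = [
    ("RNA sample", 30.0, None),
    ("PolyA polymerase buffer (x)", 10.0, 5.0),
    ("MnCl2 (mM)", 5.0, 25.0),
    ("CTP (mM)", 1.0, 100.0),
    ("RNAse inhibitor", 1.0, None),
    ("E. coli PolyA polymerase", 3.0, None),
]

if __name__ == "__main__":
    total, finals = final_concentrations(TAILING_MIX)
    print(f"total volume: {total} uL")
    for name, value in finals.items():
        print(f"{name}: {value}")
```

Under that reading the components add up to 50 μL, so the 5× buffer ends up at 1×, which is the usual sign that a recipe is internally consistent.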

The second method of profiling sRNAs is to make cDNA from sRNAs that have polyA tails. First, 1 μL of 50 mM dTU-V primer and 1 μL of 10 mM dNTPs are mixed together. The solution is incubated for 5 minutes at 65 °C in the thermocycler, then placed on cold aluminum for 1 minute. Next, the solution is added to 2 μL of 10× SuperScript III reaction buffer, 4 μL of 25 mM MgCl2, 2 μL of 0.1 M DTT, 1 μL of SuperScript III, and 1 μL of RNAseOut. This mixture is incubated for 50 minutes at 85 °C. The next step is to remove the dTU-V primer sequences from the mixture. To do this, 1 μL of USER enzyme is added and the solution is incubated again for 15 minutes at 37 °C. Then 1 μL of RNAse H and 1 μL of RNAse are added to the solution, which can then be used for the next step, the purification of cDNA.

In this step, 180 μL of AMPure beads is warmed to room temperature. The beads are then added to the cDNA solution prepared above. The beads are collected on the magnetic stand and washed twice with 500 μL of 70% EtOH, then dried. Finally, to release the cDNA from the beads, 20 μL of nuclease-free water is added, and the liquid is removed with a pipet to recover the cDNA.
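One way to picture the primer-removal step is to look at what is left of the dTU-V primer once it is cut at every uracil. USER enzyme excises uracil bases and breaks the backbone at those positions; the sketch below simply splits the primer sequence quoted in the materials list at each U and reports the fragment lengths. Treating the cut as a plain string split, and the function names, are simplifying assumptions made for illustration. V is the IUPAC code for A, C, or G.

```python
DTU_V_PRIMER = "TTTTUTTUTUTTTUTTTTUTTTUTTV"  # sequence as given in the materials list

IUPAC_V = ("A", "C", "G")  # V stands for any base except T/U


def fragments_after_uracil_excision(primer):
    """Split the primer at every U and return the non-empty fragments."""
    return [piece for piece in primer.split("U") if piece]


def expand_anchor(primer):
    """List the concrete primers obtained by replacing a final V with A, C, or G."""
    if not primer.endswith("V"):
        return [primer]
    return [primer[:-1] + base for base in IUPAC_V]


if __name__ == "__main__":
    pieces = fragments_after_uracil_excision(DTU_V_PRIMER)
    print("fragments:", pieces)
    print("longest fragment:", max(len(p) for p in pieces), "nt")
    print("anchored variants:", expand_anchor(DTU_V_PRIMER))
```

The 26-nt primer falls apart into pieces of at most 4 nt, which suggests why the primer sequence no longer stays attached to the cDNA after this treatment.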

Sequencing cDNA[edit]

The 3′ end of the cDNA is tailed with polyA residues and blocked using terminal transferase (TdT). First, the cDNA to be tailed (<10 ng) is placed in 10.8 μL of water. Then 2 μL of 10× TdT buffer and 2 μL of 2.5 mM CoCl2 are added, and the mixture is incubated for 5 minutes at 95 °C. Next, 4 μL of 50 mM dATP, 0.2 μL of BSA, and 1 μL of TdT are added to the incubated solution. The mixture is incubated in the PCR machine for 60 minutes at 70 °C and then put on ice for 2 minutes. Then 1 μL of 10× TdT buffer, 1 μL of 2.5 mM CoCl2, 0.5 μL of 200 mM Biotin-ddATP, 6.5 μL of water, and 1 μL of TdT are added to the cold solution. Finally, it is incubated in the PCR machine for 20 minutes at 70 °C.
With one negative charge per connecting phosphate, RNA is considered a polyelectrolyte. In order to achieve neutrality, RNA attracts cations, creating a counterion atmosphere. While weak, this cation binding is crucial for maintaining both secondary and tertiary structures. For this reason, the structure, stability, and reactivity of RNA are heavily dependent on external ionic conditions. Salt, usually in the form of Mg(2+), has been found to have profound kinetic effects on the folded and unfolded states of RNA (Li, Vieregg, Tinoco Jr, 85). The addition of salt raises the kinetic barrier between the folded and unfolded states, but does not affect the point at which the transition between the two states occurs.

This statement is supported by work described by Li, Vieregg, and Tinoco Jr in “How RNA Unfolds and Refolds,” which examined the effect of increasing levels of Mg(2+) on RNA hairpins and a three-helix junction. As predicted, the addition of the salt resulted in higher-energy unfolding and refolding, meaning that a higher concentration of cations stabilizes secondary structures. This increase in stability was found to be the result of a slight decrease in the unfolding rate and a larger increase in the folding rate of the RNA (87). Monovalent cations were also studied, and were found to have the same effect as divalent salts, but to a much lesser extent (86).
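The link between the two rates and the stability can be sketched with the standard two-state relation ΔG = −RT ln(k_fold / k_unfold): raising the folding rate and lowering the unfolding rate both make ΔG of folding more negative. The rate constants below are invented to mimic the described trend (a large increase in folding rate, a slight decrease in unfolding rate); they are not measurements from the review, and the two-state model itself is an assumption.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def folding_free_energy(k_fold, k_unfold, temperature_k=298.15):
    """Two-state folding free energy in kJ/mol from folding and unfolding rates."""
    if k_fold <= 0 or k_unfold <= 0:
        raise ValueError("rate constants must be positive")
    return -R_KJ_PER_MOL_K * temperature_k * math.log(k_fold / k_unfold)


if __name__ == "__main__":
    # Hypothetical rates in 1/s, for illustration only
    low_salt = folding_free_energy(k_fold=1.0, k_unfold=1.0)
    high_salt = folding_free_energy(k_fold=10.0, k_unfold=0.5)
    print(f"low salt:  dG = {low_salt:.2f} kJ/mol")
    print(f"high salt: dG = {high_salt:.2f} kJ/mol")
```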

Metal ions, like Mg(2+), have an even stronger effect on RNA tertiary structure than on secondary structure, because tertiary structure depends more heavily on ionic conditions. The addition of metal ions greatly slows the process of undoing tertiary structure, but has little effect on tertiary folding rates (88). For example, loop-loop interactions, which rely on a “kissing” interaction, become highly stabilized with the addition of a mere millimolar concentration of Mg(2+).


Li, Pan T.X., Jeffrey Vieregg, and Ignacio Tinoco. “How RNA Unfolds and Refolds.” Annual Review of Biochemistry 77.1 (2008): 77-100. Print.

Lentiviral Delivery of designed shRNA’s and the mechanism of RNA interference in mammalian cells.


RNAi was first observed when plant biologists attempted to introduce genes into petunias. When they added a gene intended to deepen the flowers’ purple color, the gene instead inhibited it. The resulting flowers had white patches or were completely white.

Soon after this discovery, another group of researchers realized that the same gene-silencing phenomenon was occurring in experiments with C. elegans. These scientists figured out that RNAi is triggered by double-stranded RNA, which is not typically found in healthy cells. Two of them, Andrew Fire and Craig Mello, were awarded the 2006 Nobel Prize in Physiology or Medicine for this discovery. [53]

Biological Implications[edit]

RNA interference (RNAi) is a natural mechanism within the cell used to silence the expression of certain genes. Small RNA molecules play essential roles in regulating gene expression by RNA interference. There are three basic characteristics of these pathways:

1) Small RNA biogenesis.

2) Formation of RNA-induced silencing complexes (RISCs).

3) Targeting of complementary mRNAs.

RNA interference is triggered when the enzyme Dicer cleaves long double-stranded RNA (dsRNA), producing RNAs of 20 to 30 nucleotides whose sequences can base pair with segments of mRNA transcripts. The newly generated microRNAs (miRNAs) or small interfering RNAs (siRNAs) then assemble into large multiprotein effectors called RNA-induced silencing complexes (RISCs), which bind to complementary target transcripts and trigger their destruction.
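
The dicing and targeting steps can be sketched in a few lines. This is a toy model only: the function names, the 21-nucleotide fragment size (chosen from within the 20 to 30 range given above), and the example mRNA are all assumptions made for illustration. Real Dicer products also carry short overhangs, which are ignored here.

```python
# Watson-Crick partners for RNA bases
PAIR = str.maketrans("AUGC", "UACG")

def reverse_complement(rna):
    """Return the strand that would base pair with `rna`, written 5'->3'."""
    return rna.translate(PAIR)[::-1]

def dice(strand, size=21):
    """Cut a strand into consecutive fragments of `size` nucleotides.

    Simplification: a leftover shorter than `size` is dropped.
    """
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]

def find_targets(mrna, small_rnas):
    """Map each small RNA to the mRNA position it can pair with, if any."""
    hits = {}
    for rna in small_rnas:
        position = mrna.find(reverse_complement(rna))
        if position != -1:
            hits[rna] = position
    return hits

# Hypothetical mRNA and the antisense strand of a 42-nt dsRNA matching part of it
mrna = "AUGGCUAGCUUACGGAUCCGUAAGCUUGACCGAUUCGGCAUAGCCUAAGGCUUAACGUGA"
antisense = reverse_complement(mrna[5:47])
small_rnas = dice(antisense)

for rna, position in find_targets(mrna, small_rnas).items():
    print(rna, "pairs with mRNA starting at index", position)
```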

The cognate RNA is then cleaved in the middle of the region bound to the siRNA strand. This mechanism has been theorized to serve as self-defense, protecting cells against viral infections or cancerous changes.

RNAi can also help in the study of tissue regeneration. By using RNAi to shut down individual genes while tissue regenerates, scientists can work out which genes in amphibians are involved when missing limbs are regrown. By understanding this process, they hope to learn how to regenerate human tissue.

RNAi is proposed to have evolved about a billion years ago, before plants and animals diverged, because it is found in organisms ranging from plants to animals.

Modern hypotheses state that RNAi evolved as a cellular defense mechanism against invaders such as RNA viruses. When they replicate, RNA viruses temporarily produce a double-stranded form. This double-stranded intermediate would trigger RNAi and inactivate the virus’ genes, preventing an infection.

RNAi may also have evolved to combat the spread of genetic elements called transposons within a cell’s DNA. Transposons can wreak havoc by jumping from spot to spot on a genome, sometimes causing mutations that can lead to cancer or other diseases. Like RNA viruses, transposons can take on a double-stranded RNA form that would trigger RNAi to clamp down on the potentially harmful jumping. [54]

Cellular Mechanism

The dicer protein from Giardia intestinalis, which catalyzes the cleavage of dsRNA to siRNAs. The RNase domains are colored green, the PAZ domain yellow, the platform domain red, and the connector helix blue.[1]

RNAi is a process in which RNA is used to silence genes. The main player in this process is the RNA-induced silencing complex (RISC), which is activated by short double-stranded RNA molecules [55]. The dsRNA can be exogenous, coming from a viral infection or from artificial insertion, or endogenous, coming from the cell’s own genome [56].

The cell handles dsRNA in one of two ways depending on its origin, whether exogenous (from outside the cell) or endogenous (from inside the cell).

-Exogenous: The foreign dsRNA is detected and bound by an effector protein, which recruits Dicer to cut up the dsRNA. The same effector protein helps transport the resulting siRNA to RISC. [57]

-Endogenous: The target dsRNA is cut by Dicer into short siRNAs, which are then transported to an active RISC. Once incorporated into RISC, each siRNA base pairs with its corresponding sequence on an mRNA strand, which is then cleaved at that site. Cleaving the mRNA halts synthesis of the protein. [58]
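
The pairing and cleavage step can be sketched as follows. This is a toy model under stated assumptions: the function names and sequences are hypothetical, pairing must be perfect, and the cut is placed at the midpoint of the paired region as a stand-in for the "middle region" cleavage described earlier.

```python
# Watson-Crick partners for RNA bases
PAIR = str.maketrans("AUGC", "UACG")

def reverse_complement(rna):
    """Return the strand that would base pair with `rna`, written 5'->3'."""
    return rna.translate(PAIR)[::-1]

def slice_mrna(mrna, guide):
    """Cleave `mrna` in the middle of the site paired with `guide`.

    Returns the two fragments, or None when the guide finds no
    perfectly complementary site (so the mRNA is left intact).
    """
    site = reverse_complement(guide)
    start = mrna.find(site)
    if start == -1:
        return None
    cut = start + len(site) // 2
    return mrna[:cut], mrna[cut:]

# Hypothetical 21-nt target site embedded in a short mRNA
site = "GCUAUCGGAUCCAUGCAAUCG"
mrna = "AAAA" + site + "CCCC"
guide = reverse_complement(site)

print(slice_mrna(mrna, guide))     # two fragments: translation is halted
print(slice_mrna("AAAAAAAAAAAA", guide))  # no site found
```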

Left: A full-length argonaute protein from the archaea species Pyrococcus furiosus. Right: The PIWI domain of an argonaute protein in complex with double-stranded RNA.

The enzyme dicer trims double stranded RNA, to form small interfering RNA or microRNA. These processed RNAs are incorporated into the RNA-induced silencing complex (RISC), which targets messenger RNA to prevent translation.[2]


RISC stands for RNA-induced silencing complex. Its active components are argonaute proteins, which act as endonucleases: they recognize a sequence on an mRNA strand that is complementary to the bound siRNA and then cleave the mRNA. This process is ATP independent and acts directly through components of the RISC [59] [60].

How RNA pairs with the argonaute protein was determined structurally by X-ray crystallography, which located the active sites and gave accurate information about how the RNA binds. In the active site, the phosphorylated 5′ end of the RNA strand enters and bonds with a cation (e.g., magnesium) and is further held by aromatic stacking involving the 5′ nucleotide of the siRNA. It has been inferred that the active site is able to pair the siRNA with its corresponding mRNA. [61]

Research Implications

RNAi-based therapies have been proposed as a way to regulate or silence disease-causing genes, and this approach has shown promise. Cancer is one proposed target: it is often caused by overactive genes, and regulating their activity could stop its spread.

Viral infections are also hypothetical targets for RNAi therapies. Many believe that RNAi actually evolved as a way to combat RNA viruses. Reducing the expression of important viral genes would leave the virus helpless and prone to attack by the immune system. In vitro studies have already indicated that HIV, poliovirus, HCV, and others can be reduced by these therapies.

RNAi is already serving as a way to identify the function of certain genes. Prior to this discovery, researchers had often resorted to inserting new genes into an organism to see what the effect would be. Now scientists can simply silence the gene of interest and observe how its loss affects the organism. This approach can also shed light on complex cellular pathways.

RNAi has been a novel and highly important discovery for research. For years, scientists had been intensely studying how proteins regulate gene activity, focusing most of their attention on proteins called transcription factors. Now RNA, through RNAi and related processes, is known as an essential player in the cell’s complex technique of gene regulation. [62]

Research Applications

As mentioned in the section above, RNAi can be used to selectively “silence” targeted genes in order to analyze the effects on the model organism. One area of research focuses on regeneration, the regrowth of lost or damaged body parts. This ability is quite common in nature. For example, tree stumps can grow sprouts that develop into new stems, leaves, and flowers; in the lab, a mass of undifferentiated cells can grow into a mature plant; in fact, a section of certain plants composed of fully differentiated cells can also grow into a mature plant. Animals, too, have regenerative abilities, including invertebrates such as sponges, hydra, planarians, and starfish, as well as vertebrates such as salamanders and other amphibians. Humans, on the other hand, have only limited regenerative abilities. Apart from healing wounds, humans can regenerate some of the liver and the tips of fingers and toes [63]. Wouldn’t it be amazing if scientists could find a key to regenerating human tissues? RNAi is currently being used to target specific genes and turn them off in planarians and amphibians in order to analyze the functions of those genes. This way, researchers hope to find out which genes are responsible for regeneration.

Role in Tissue Regeneration

RNA interference (RNAi) is a natural mechanism that organisms use to silence genes when their protein products are no longer needed. The silencing happens when short RNA molecules bind to stretches of mRNA, preventing translation of the mRNA. To home in on the genes that enable planarians to regenerate, Sánchez Alvarado and his coworkers harness RNAi to intentionally interfere with the function of selected genes. By shutting down genes in a systematic way, the researchers hope to identify which genes are responsible for regeneration. They also hope that their work in planarians will provide genetic clues to help explain how amphibians regenerate limbs after an injury. Finding the crucial genes and understanding how they allow regeneration in planarians and amphibians could take us closer to promoting regeneration in humans.


Specifically in planarians, which can regrow a whole worm from a small fraction of the body, RNAi’s ability to shut off specific genes has led to the discovery that the location of head and tail formation is controlled by hedgehog signaling* and the Wnt/β-catenin pathway. The Wnt/β-catenin pathway regulates the formation of the anterior-posterior axis. “Silencing” either hedgehog or Wnt/β-catenin with RNAi causes the head and tail to grow at the wrong ends [64].

In addition, some basic researchers are trying to figure out how stem cells work by studying planarians. These worms resemble stem cells in that they can regenerate, and the resemblance is no coincidence: scientists have discovered that planarians can regenerate because of specialized stem cells in their bodies. Developmental biologist Alejandro Sánchez Alvarado of the University of Utah School of Medicine in Salt Lake City used the gene-silencing technique RNAi to search for planarian genes that are essential for regeneration. He found 240 genes that, when silenced, caused a physical defect in the worm’s growth and regenerative ability. Interestingly, 16 percent of these looked very much like genes that had been linked to human disease.

* In addition to regeneration in planarians, hedgehog signaling is vital to brain, intestinal tract, finger, and toe development in mammals.

RNAi and Neurological Diseases

RNAi not only protects cells from foreign genes; it is also involved in regulating the cell’s own genes, including the cell’s own set of noncoding RNAs. Thus, improperly functioning RNAi can lead to diseases and inherited disorders, including fragile X syndrome. Fragile X causes intellectual disability because of the loss of FMRP, a protein usually synthesized from the FMR1 (fragile X mental retardation 1) gene. It was discovered that FMRP is a component of RISC, suggesting that the loss of this protein prevents RNAi in neurons from functioning properly [65]. (More research needs to be done to establish this link.)

On the flip side, RNAi has the potential to treat neurological diseases as well. In the same way that RNAi eliminates foreign mRNA from viral infections, its high specificity can be used to target mutated versions of normal genes that lead to neurological diseases. RNAi can thus counteract detrimental dominant alleles by “knocking out” expression of the mutant genes while leaving normal ones alone. This potential extends to other diseases, including those caused by triplet expansion or trinucleotide repeats (neurodegenerative diseases such as spinobulbar muscular atrophy and Huntington’s, in addition to fragile X). [66]


  1. Macrae I, Zhou K, Li F, Repic A, Brooks A, Cande W, Adams P, Doudna J (2006). “Structural basis for double-stranded RNA processing by dicer”. Science 311 (5758): 195–8. doi:10.1126/science.1121638. PMID 16410517. 
  2. Hammond S, Bernstein E, Beach D, Hannon G (2000). “An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells”. Nature 404 (6775): 293–6. doi:10.1038/35005107. PMID 10749213. 
  3. U.S. Department of Health and Human Services. Inside the Cell. September 2005.

National Institute of General Medical Sciences [67]

Biology Pages [68]

Functional Genomics, Fragile X Syndrome,and RNA Interference [69]

The New Genetics (2006): n. pag. U.S. Department of Health and Human Services, National Institutes of Health, National Institute of General Medical Sciences. Web.

Main Component: Argonaute

The main components of RISC are the argonaute (Ago) proteins, which associate with the small RNAs. The argonaute family can be divided into the Ago subfamily and the Piwi subfamily. siRNAs and miRNAs bind to the Ago subfamily, and piRNAs bind to the Piwi subfamily. In mammals, each of the four Ago subfamily proteins (AGO1-4) can repress translation, but only AGO2 can cleave the target RNA and produce RNA interference (RNAi).

Two Steps in RISC Assembly: RISC Loading and Unwinding

siRNA and miRNA come from double-stranded RNA that has been chopped up by the RNase III enzymes Drosha and Dicer. The resulting RNAs are called RNA duplexes. There are two models for when the RNA unwinds as it binds to Ago proteins. The “helicase model” proposes that the RNAs are first separated into single strands and then incorporated into the Ago proteins. The “duplex-loading model” states that the double-stranded RNA binds to the Ago protein and then dissociates within it. Recent studies favor the duplex-loading model. RISC assembly can therefore be divided into two steps: first the small RNA duplex is bound to the Ago protein, and then the double-stranded RNA dissociates into two single strands. An Ago protein bound to an RNA duplex is called pre-RISC, while an Ago protein with single-stranded RNA is called mature RISC.

Because the double-stranded RNA unwinds into two single strands, one of them must be discarded. The discarded strand is called the passenger strand and the retained strand is called the guide strand. The strand with the less stable 5’ end serves as the guide strand, while the other strand is discarded.
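
The strand-selection rule can be sketched as a comparison of the two duplex ends. The scoring used here is an assumption made for illustration, not a method from the source: stability is approximated by counting hydrogen bonds (three for G-C, two for A-U) over the first four base pairs at each strand’s 5’ end, in a perfectly paired duplex without overhangs.

```python
# Hydrogen bonds contributed by the pair each base takes part in
# (G-C pairs have three, A-U pairs have two).
H_BONDS = {"A": 2, "U": 2, "G": 3, "C": 3}

def end_stability(strand, n=4):
    """Approximate stability of the duplex end that holds this strand's 5' end."""
    return sum(H_BONDS[base] for base in strand[:n])

def pick_guide(strand_a, strand_b, n=4):
    """Return (guide, passenger); the guide has the less stable 5' end.

    Ties go to strand_a, an arbitrary choice for this sketch.
    """
    if end_stability(strand_a, n) <= end_stability(strand_b, n):
        return strand_a, strand_b
    return strand_b, strand_a

# Hypothetical duplex: strand_b is the reverse complement of strand_a
strand_a = "AUUAGCCGGAUCCGGCGCGC"   # A/U-rich 5' end
strand_b = "GCGCGCCGGAUCCGGCUAAU"   # G/C-rich 5' end

guide, passenger = pick_guide(strand_a, strand_b)
print("guide:    ", guide)
print("passenger:", passenger)
```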

RISC Loading

RISC loading machinery

Ago proteins need the help of RISC-loading machinery to bind RNA. For Drosophila Ago2, this machinery is composed of Dicer-2 (Dcr-2) and R2D2. R2D2 binds the more stable end of the RNA duplex while Dcr-2 binds the less stable end. Although Dcr-2 can both dice RNA and load it into Ago proteins, studies have shown that siRNA duplexes must dissociate from Dcr-2 after dicing and then rebind to the Dcr-2-R2D2 dimer according to their end stability. Humans have only one Dicer; human Dicer and its partner protein TRBP (TAR RNA-binding protein) help load RNA into AGO2-RISC. However, studies have shown that Dicer is needed only for loading into fly Ago2 and not for loading RNA duplexes into other Ago proteins. It seems that there are two pathways of RISC loading: a Dicer-dependent pathway and a Dicer-independent pathway.

Small RNA sorting

siRNA duplexes usually have perfectly complementary sequences, so that all the bases are paired. miRNA-miRNA* duplexes, however, usually have central mismatches. In flies, the Dcr-2-R2D2 complex prefers to bind perfectly complementary, siRNA-like duplexes and disfavors duplexes with mismatches. Ago1, on the other hand, prefers duplexes that have a central mismatch around nucleotides 8-11.
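
The fly sorting rule described above can be sketched as a classifier on mismatch positions. The function names, the example sequence, and the return labels are hypothetical; strands are assumed to be equal in length with no overhangs, and any non-Watson-Crick pair (including G-U) is counted as a mismatch.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def mismatch_positions(guide, passenger):
    """1-based guide positions that are not Watson-Crick paired.

    Both strands are written 5'->3', so the passenger is reversed
    to line each base up with its partner on the guide.
    """
    partner = passenger[::-1]
    return [i + 1 for i, (g, p) in enumerate(zip(guide, partner))
            if COMPLEMENT[g] != p]

def fly_loading_preference(guide, passenger, central=range(8, 12)):
    """Classify a duplex by the preferences described for flies."""
    mismatches = mismatch_positions(guide, passenger)
    if not mismatches:
        return "Dcr-2-R2D2 / Ago2"
    if any(pos in central for pos in mismatches):
        return "Ago1"
    return "no stated preference"

def with_mismatch(guide, position):
    """Hypothetical helper: a passenger that pairs everywhere except `position`."""
    partner = [COMPLEMENT[base] for base in guide]
    partner[position - 1] = guide[position - 1]  # a base never pairs with itself
    return "".join(reversed(partner))

guide = "UAGCUUAUCAGACUGAUGUUG"
perfect = "".join(COMPLEMENT[base] for base in reversed(guide))

print(fly_loading_preference(guide, perfect))
print(fly_loading_preference(guide, with_mismatch(guide, 10)))
```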

Another guide for loading in the right orientation is the identity of the nucleotide at the 5’ end of the guide strand. In flies, Ago1 favors U while Ago2 favors C. In plants, the orientation of the strands also relies heavily on the identity of this nucleotide: Arabidopsis AGO1 prefers U, AGO2 and AGO4 prefer A, and AGO5 prefers C. The MID and PIWI domains of Arabidopsis Ago proteins confer recognition of the nucleotide at the 5’ end. Mammalian Ago proteins, by contrast, prefer perfectly complementary, siRNA-like duplexes and disfavor RNA with non-central mismatches; if the RNA has only central mismatches, the Ago protein incorporates it without difficulty as well. Human Ago proteins also have no preference for the 5’ end nucleotide. Human Ago proteins therefore do not have a strict small RNA sorting system.
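
The 5’ nucleotide preferences listed above amount to a lookup table. The sketch below encodes only the preferences stated in the text; the function name and the example guide sequences are hypothetical, and a species with no entries (such as human, which is described as having no preference) simply returns an empty list.

```python
# Preferred 5' nucleotide of the guide strand, as stated in the text
FIVE_PRIME_PREFERENCE = {
    ("fly", "Ago1"): "U",
    ("fly", "Ago2"): "C",
    ("Arabidopsis", "AGO1"): "U",
    ("Arabidopsis", "AGO2"): "A",
    ("Arabidopsis", "AGO4"): "A",
    ("Arabidopsis", "AGO5"): "C",
}

def favored_agos(species, guide):
    """List the Ago proteins of `species` that favor this guide's 5' nucleotide."""
    first = guide[0]
    return sorted(
        ago
        for (sp, ago), nucleotide in FIVE_PRIME_PREFERENCE.items()
        if sp == species and nucleotide == first
    )

print(favored_agos("Arabidopsis", "AGCUUAUCAGACUGAUGUUGA"))
print(favored_agos("fly", "UAGCUUAUCAGACUGAUGUUG"))
```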


Slicer-dependent unwinding

A pre-RISC loaded with double-stranded RNA is very similar to a mature RISC that is bound to a target mRNA. The passenger strand is therefore like a target RNA for the guide strand. In slicer-dependent unwinding, the passenger strand is cleaved and discarded just as a target mRNA would be. This type of unwinding occurs only in siRNA-like duplexes that have highly complementary strands.

Slicer-independent unwinding

Human AGO1, AGO3, and AGO4 do not have any slicer activity and therefore cannot use slicer-dependent unwinding. Also, if the RNA strands have mismatches, slicing cannot separate the two strands. Another pathway, slicer-independent unwinding, was therefore proposed. In this type of unwinding, mismatches in the RNA duplex actually accelerate the unwinding process and are essential for it. Scientists have dubbed this the “mirror-image” process of target recognition: it is basically the opposite of the guide strand annealing to the target strand.


Kawamata, T., and Tomari, Y. “Making RISC.” Trends in Biochemical Sciences 35.7 (2010): 368-376.

DNA and RNA differ in their structures, functions, and stabilities. DNA has four nitrogenous bases (adenine, thymine, cytosine, and guanine), while RNA has uracil instead of thymine. DNA is typically double-stranded and remains in the nucleus, while RNA is usually single-stranded and can leave the nucleus. In addition, the deoxyribose sugar of DNA is missing an oxygen atom that is present in the ribose of RNA.

Predominant structures

DNA is a double-stranded molecule with a long chain of nucleotides, while RNA is single-stranded in most of its biological roles and has a shorter chain of nucleotides (after transcription and splicing, only exons remain in the RNA). DNA exists mainly as a double helix, while RNA takes on many different shapes and sizes, such as the hairpin. DNA carries an organism’s genetic information, while RNA takes on many different roles; for instance, an RNA can act as an enzyme, in which case it is called a ribozyme. There is a single type of DNA, while there are many types of RNA with different functions, such as mRNA (carries the DNA message to the cytoplasm), tRNA (carries amino acids to the mRNA and ribosomes), and rRNA (ribosomal RNA, the workbench for protein synthesis). DNA cannot catalyze its own synthesis, while some RNAs have catalytic activity; this is taken as support for the RNA World Hypothesis. Base pairing in DNA, A-T (adenine-thymine) and G-C (guanine-cytosine), differs from that in RNA, A-U (adenine-uracil) and G-C (guanine-cytosine).
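
The two sets of pairing rules can be written as lookup tables. This is a minimal sketch; the names `DNA_PAIRS`, `RNA_PAIRS`, and `complement` are chosen here for illustration.

```python
# Base-pairing rules as stated above
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

def complement(sequence, pairs):
    """Return the partner of each base, in the same left-to-right order."""
    return "".join(pairs[base] for base in sequence)

print(complement("ATGC", DNA_PAIRS))
print(complement("AUGC", RNA_PAIRS))
```

Passing a base that does not belong to the chosen table (for example a U with `DNA_PAIRS`) raises a `KeyError`, which makes the thymine/uracil difference explicit.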

Bases and sugars

DNA is a long polymer with a deoxyribose and phosphate backbone and four different nitrogenous bases: adenine, guanine, cytosine, and thymine.
RNA is a polymer with a ribose and phosphate backbone and four different nitrogenous bases: adenine, guanine, cytosine, and uracil.

Figure: Structure of ribose in RNA. Figure: Structure of deoxyribose in DNA.


DNA is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms. It is a medium of long-term storage and transmission of genetic information, while RNA is a nucleic acid polymer that plays an important role in the process of translating genetic information from deoxyribonucleic acid (DNA) into protein products. RNA acts as a messenger between DNA and the protein synthesis complexes known as ribosomes.

Both DNA and RNA are synthesized in the 5′ to 3′ direction. However, RNA synthesis needs no primer. In addition, RNA polymerase lacks the extensive proofreading ability that DNA polymerases use to detect base-pairing errors.

The deoxyribose sugar in DNA is less reactive because the second carbon (C2) carries only C-H bonds. DNA is stable in alkaline conditions. It also has smaller grooves, which makes it harder for damaging enzymes to attach and attack; RNA, on the other hand, has larger grooves, which makes it easier for enzymes to attack. The ribose sugar of RNA is more reactive because of the hydroxyl group on C2. RNA is not stable in alkaline conditions because a base can easily remove the hydrogen from the -OH on C2. After deprotonation, the negatively charged oxygen can attack the adjacent phosphate, displacing the oxygen connected to the 5′ carbon of the next nucleotide and breaking the backbone (hydrolysis).

Unique features

The helix geometry of DNA is B-form. DNA is protected by the body, which destroys enzymes that cleave DNA. DNA can be damaged by exposure to ultraviolet rays. The helix geometry of RNA is A-form. RNA strands are continually made, broken down, and reused. RNA is more resistant to damage by ultraviolet rays.

Comparison chart

Here is a chart that shows the differences between DNA and RNA:

| | DNA | RNA |
| --- | --- | --- |
| Structural Name | Deoxyribonucleic Acid | Ribonucleic Acid |
| Function | Medium of long-term storage and transmission of genetic information. | Transfers the genetic code needed for the creation of proteins from the nucleus to the ribosome. This process prevents the DNA from having to leave the nucleus, so it stays safe. Without RNA, proteins could never be made. |
| Structure | Typically a double-stranded molecule with a long chain of nucleotides. | A single-stranded molecule in most of its biological roles, with a shorter chain of nucleotides. |
| Bases/Sugars | Long polymer with a deoxyribose and phosphate backbone and four different bases: adenine, guanine, cytosine, and thymine. | Shorter polymer with a ribose and phosphate backbone and four different bases: adenine, guanine, cytosine, and uracil. |
| Base Pairing | A-T (Adenine-Thymine), G-C (Guanine-Cytosine) | A-U (Adenine-Uracil), G-C (Guanine-Cytosine) |
| Stability | Deoxyribose sugar in DNA is less reactive because of C-H bonds. Stable in alkaline conditions. DNA has smaller grooves, which makes it harder for damaging enzymes to attach and attack. | Ribose sugar is more reactive because of C-OH (hydroxyl) bonds. Not stable in alkaline conditions. RNA has larger grooves, which makes it easier for enzymes to attack. |
| Unique Traits | The helix geometry of DNA is B-form. DNA is protected by the body, which destroys enzymes that cleave DNA. DNA can be damaged by exposure to ultraviolet rays. | The helix geometry of RNA is A-form. RNA strands are continually made, broken down, and reused. RNA is more resistant to damage by ultraviolet rays. |

Transcription, also known as RNA synthesis, is the process in which a DNA nucleotide sequence is transcribed into RNA. In this process, genetic information is simply copied from one molecule to another. In prokaryotic transcription, the mRNA is made and then translated to make proteins; in prokaryotes, transcription and translation can occur simultaneously in the cytoplasm. In eukaryotic transcription, the genetic material is transcribed in the nucleus.

Transcription in eukaryotes is much more complex than in prokaryotes. One reason for this is the presence of histones on eukaryotic DNA, which tend to hinder the access of polymerases to the promoter. The process of transcription can be thought of as four sequential steps. The first is initiation, during which RNA polymerase II (RNAPII) binds to the DNA to form a preinitiation complex with other transcription factors. The site on the DNA where this happens is called the “promoter.” The second step involves a helicase, an enzyme that unwinds the DNA double helix. After the DNA is unwound, synthesis of RNA can begin based on the DNA template strand; note that uracil in RNA pairs with adenine in DNA. This third step is elongation, during which the polymerase leaves the promoter behind through a process called promoter clearance and transcribes the rest of the DNA strand. The final step is termination: one of several different signals ends synthesis, and the RNA polymerase releases the DNA.
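
The copying rule used during elongation can be sketched as a base-by-base lookup. The function name and the example template are hypothetical; the template is written 3'→5' so that the RNA comes out in the 5'→3' direction in which it is synthesized.

```python
# RNA base added opposite each DNA template base
# (uracil, not thymine, pairs with adenine)
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Build the RNA (5'->3') from a DNA template strand written 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5)

template = "TACGGATTC"       # hypothetical template strand, 3'->5'
print(transcribe(template))  # AUGCCUAAG
```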

Promoter Sites

RNA transcription from DNA begins with the recognition of promoter sites on the DNA strand by RNA Polymerase. These promoter sites are designated base sequences that mark the beginning of transcription on the long DNA strand. Transcription for RNA from DNA does not simply begin anywhere on the DNA strand. The probability that transcription will begin at a desired location just by chance is very slim, thus requiring sequences that RNA polymerase can recognize and initiate transcription. Promoter sites on the DNA sequence provide these starting points for the synthesis of specific RNA sequences from specific genes on the DNA strand. The first nucleotide to be transcribed is numbered +1. The nucleotide upstream to +1 (adjacent to +1 on the 5′ side) will be identified as -1.

Since RNA polymerase, the enzyme that synthesizes RNA from DNA, polymerizes RNA from the 5′ to 3′ end, the promoter site where it attaches is always upstream of the gene of interest, meaning toward the 5′ end of the coding strand. Often, molecules called transcription factors attach to the promoter site and subsequently recruit RNA polymerase to attach there and begin transcription.

In bacteria, there are two distinct sequences upstream (5′) of the first nucleotide to be transcribed that function as promoter sites and determine where transcription will begin. One is located 10 nucleotides to the 5′ side of the first nucleotide to be transcribed (the -10 region) and is called the Pribnow box, with the consensus sequence “TATAAT”. The other, located further upstream at the -35 region, has the consensus sequence “TTGACA”. Note that most often, the first nucleotide to be transcribed is a purine.

The proteins that guide RNA polymerase to genes are the sigma factors. A sigma factor binds the RNA polymerase core enzyme and then helps it recognize a specific DNA sequence, called a promoter, by locating the consensus promoter sequences near the beginning of a gene. A single bacterial species can make several different sigma factors.

Bacterial DNA template: ---- TTGACA (-35) ---- TATAAT/Pribnow box (-10) ---- Start of RNA (+1)
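
The layout above can be turned into a simple pattern search. This is a sketch under stated assumptions: it looks only for exact copies of the two consensus hexamers, and the allowed spacer of 15 to 19 nucleotides between them is an assumption added here, not a figure from the text.

```python
import re

# -35 box, a spacer of assumed length 15-19 nt, then the -10 (Pribnow) box
PROMOTER = re.compile(r"(TTGACA)([ACGT]{15,19})(TATAAT)")

def find_bacterial_promoters(dna):
    """Return (start of -35 box, start of -10 box) for each match, 0-based."""
    return [(m.start(1), m.start(3)) for m in PROMOTER.finditer(dna)]

# Hypothetical sequence with a 17-nt spacer between the two boxes
dna = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCA"
print(find_bacterial_promoters(dna))  # [(2, 25)]
```

Because real promoters rarely match the consensus exactly, an exact-match search like this one would miss most of them; it only illustrates the geometry of the two boxes.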

In eukaryotes, the promoter site exists at the -25 region with the consensus sequence “TATAAA”. This sequence is called the “TATA box” or Hogness box. When the cell needs to transcribe the DNA strand, the “TATA binding protein” (a transcription factor) attaches to the TATA box and subsequently helps RNA polymerase attach there and begin synthesizing the RNA.
In addition to the TATA box, most eukaryotes also have a second promoter site at the -75 region called the CAAT box, with the consensus sequence GGNCAATCT. Finally, RNA transcription in eukaryotes is also stimulated by enhancer sequences found at locations distant from the +1 region, on either the 5′ or the 3′ side.

Eukaryotic DNA template: ---- CAAT box (-75, optional) ---- TATA box (-25) ---- Start of RNA (+1)
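
The two eukaryotic consensus sequences can be searched for in the same way, with the N in GGNCAATCT treated as "any base". The function name and the test sequence are hypothetical, and only exact consensus matches are found.

```python
import re

TATA_BOX = re.compile(r"TATAAA")
CAAT_BOX = re.compile(r"GG[ACGT]CAATCT")  # N in the consensus = any base

def find_boxes(dna):
    """Return the 0-based start positions of each box found in `dna`."""
    return {
        "CAAT": [m.start() for m in CAAT_BOX.finditer(dna)],
        "TATA": [m.start() for m in TATA_BOX.finditer(dna)],
    }

# Hypothetical sequence with the boxes placed 50 nt apart,
# matching the -75 and -25 positions given in the text
dna = "CC" + "GGTCAATCT" + "G" * 41 + "TATAAA" + "G" * 19
print(find_boxes(dna))  # {'CAAT': [2], 'TATA': [52]}
```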

Note: Not all base sequences of promoter sites are identical. They are called consensus sequences because they share common features; however, almost all promoter sequences differ from the idealized consensus sequence by one or two bases.
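
Since real promoters usually differ from the consensus by a base or two, a search can score each candidate by its number of mismatches rather than demand an exact match. The sketch below does this with a plain position-by-position comparison; the function names and the example sequence are hypothetical.

```python
def mismatches(site, consensus):
    """Count positions where `site` differs from `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a != b for a, b in zip(site, consensus))

def best_match(dna, consensus):
    """Return (position, mismatch count) of the window closest to the consensus."""
    width = len(consensus)
    scored = ((i, mismatches(dna[i:i + width], consensus))
              for i in range(len(dna) - width + 1))
    return min(scored, key=lambda item: item[1])

print(mismatches("TATGAT", "TATAAT"))        # 1
print(best_match("GGGTATGATGGG", "TATAAT"))  # (3, 1)
```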

Enzymes that replicate DNA do not rely solely on the sequence of bases when determining binding specificity. The three dimensional structure is also important in determining where replicating proteins will bind. For most DNA-binding proteins, the readout of base pairs through hydrogen bonds or hydrophobic contacts is not sufficient to explain specificity. The shape of the minor groove within a binding site can be “read” by a complementary set of basic side chains of DNA binding molecules, most typically arginines but also lysines, when presented in the correct conformation.

Kinks can contribute to binding specificity by creating conformations that enhance protein-DNA and protein-protein contacts. The DNA-binding site of the catabolite activator protein (CAP) shows large kinks at two steps, which cause an overall bending of the DNA of about 90° around the protein. The kink at each step creates a space for an arginine residue to engage in partial stacking interactions with a thymine at that site.

Challenges Associated with the Elongation Step

As RNAPII transcribes along the gene (that is, along chromatin) during transcript elongation, it has to find a way to deal with nucleosomes. One way of coping with a nucleosome is to disassemble it into separate histones and uncoiled DNA before transcribing (as shown in the figure). Then, as RNAPII transcribes along the uncoiled DNA, the separated histones may rewrap the DNA to form a nucleosome again. Histones can disassemble into further subunits, which include the H2A/H2B dimer (depicted in red) and the H3/H4 dimer (depicted in yellow). This disassembly of nucleosomes into histones is usually assisted by ATP-dependent chromatin remodelers and histone chaperones. The identified ATP-dependent chromatin remodelers include SWI-SNF, ISWI, CHD, and INO80/SWR. FACT (Facilitates Chromatin Transcription) is an example of a histone chaperone; it plays a significant role in destabilizing the nucleosomes on a gene in order to facilitate transcript elongation, mainly by removing the H2A/H2B dimer from the nucleosome.

RNAPII also has to ensure that it inserts the correct nucleotides. This is achieved by a specific structure called the trigger loop, located under the active site of RNAPII, where the nucleotides bind. The trigger loop aligns the incoming nucleotide in the correct orientation for forming a phosphodiester bond with the growing transcript. Only the correct nucleotide aligns properly with the trigger loop, which lets RNAPII be confident that it is inserting the right one.

The mechanism of elongation also plays a significant role in RNAPII fidelity. Transcript elongation proceeds by a Brownian ratchet mechanism, which allows RNAPII to move back and forth along the gene. By removing a misplaced nucleotide and inserting the correct one, RNAPII not only increases fidelity but also enhances the rate of insertion of further nucleotides. Removal of a misplaced nucleotide usually requires general factors that encourage transcript cleavage, such as TFIIS.

In the elongation of RNA transcripts, the sigma factor remains associated with the transcribing complex until about nine bases have been joined [Microbiology]. The core RNA polymerase then continues to move along the template and synthesizes RNA at about 45 bases per second.
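
The quoted elongation rate gives a quick estimate of how long a gene takes to transcribe. The sketch below assumes a constant rate with no pausing, which is a simplification; the function name and the gene lengths are hypothetical.

```python
ELONGATION_RATE = 45  # bases per second, the figure quoted in the text

def transcription_time(gene_length_bases, rate=ELONGATION_RATE):
    """Seconds needed to transcribe a gene at a constant elongation rate."""
    return gene_length_bases / rate

for length in (900, 4500):
    print(f"{length} bases: {transcription_time(length):.0f} s")
```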

Other Factors Affecting Transcript Elongation

Histone modification is positively correlated with transcript elongation. In other words, faster transcript elongation is accompanied by increasing amounts of histone modification. One such modification is histone acetylation, in which acetyl groups are added by histone acetyltransferases (HATs) and removed by histone deacetylases (HDACs). Histone methylation, another type of histone modification, interferes with transcript elongation in order to regulate the rate of histone acetylation. It is believed that histone modification is associated with the disassembly of nucleosomes and thus enhances transcript elongation.

Transcription Repression

Polycomb group (PcG) proteins are necessary for organisms to develop from cells to tissues. PcG proteins form multi-subunit complexes that function as transcriptional repressors, controlling hundreds to thousands of genes during cell differentiation and growth in the normal development of the organism. Most multicellular organisms need PcG proteins for growth and development. Homeotic (HOX) genes are correctly expressed during development because of these same PcG proteins, which are also involved in regulating the cell cycle, cancer, X-chromosome inactivation, cell fate, and stem cell pathways and differentiation, among other developmental matters. They also have enzymatic functions, which they use when they target specific genes and downregulate their transcription. Their work includes the recruitment of other repressors that act together with them.

PcG proteins are best described as two complexes, PRC1 and PRC2, which serve distinct purposes. PRC1 catalyzes the ubiquitylation of histone H2A, which represses gene transcription by making the chromatin more compact and less accessible. PRC2 catalyzes the methylation of histone H3; its components include suppressor of zeste 12 (SUZ12) and embryonic ectoderm development (EED).

PcG proteins do not bind the same sets of genes in all cells. The mechanisms and DNA patterns that regulate the binding of PcG proteins to promoters must therefore be specific and complex.

As previously mentioned, PcG proteins do not work alone: they recruit other factors and proteins to help repress gene transcription. One is the BCL6 co-repressor, also known as BCOR, which is associated with oculofaciocardiodental (OFCD) syndrome. The complex of PcG and BCOR targets germ cell genes.

The best-known case showing how signaling pathways influence the expression of PcG proteins is the conserved hedgehog signaling pathway, which is important during embryonic development. The sonic hedgehog ligand (SHH) is best known for its role in cancer progression and stem cell maturation. More research is yet to be done.


Reference: Lluís Morey and Kristian Helin. Polycomb group protein-mediated repression of transcription. Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark; Centre for Epigenetics, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark.

DNA Damage and Mutagenesis

When DNA is damaged by UV light, RNAPII is hindered from carrying on transcription. The damaged DNA may then be repaired through a mechanism called transcription-coupled nucleotide excision repair (TC-NER). Another response is the polyubiquitylation of RNAPII. Both mechanisms involve the activity of RNAPII and the degradation or removal of the damaged part of the DNA.

Observations also show a relation between transcript elongation and mutagenesis (that is, mutation of the DNA strands). Although very little is known about the specific interactions between the two, observations suggest that an increased rate of transcript elongation results in an increased level of mutation. This points to a critical relation between the level of transcription and the fidelity of DNA replication.

Although a highly transcribed region spends a majority of its time being single-stranded, the rate of mutagenesis during DNA replication does not increase, but active transcription can interfere with the precision of DNA polymerase as it adds nucleotides to the template strand. The single-stranded DNA may not be protected by chromatin proteins and nucleosomes, but there is little evidence to argue that transcription is mutagenic to the DNA template strand.

In transcription-associated recombination (TAR), the RNA newly made by RNA polymerase II can pair with the DNA template strand, forming an RNA-DNA hybrid that displaces the other DNA strand; these structures are called R-loops. R-loops lead to genetic instability because they interfere with replication and can activate the S phase checkpoints. Mutants that accumulate R-loops usually do not make it past S phase and are not viable.

RECQL5 Helicase and Genomic Stability

The enzyme RECQL5 helicase may also have a role in maintaining genomic stability. RECQL5 helps prevent the collapse of replication forks, which would lead to DNA damage, and the accumulation of DNA double-strand breaks, which would interfere with later replication and transcription if the damage were in a coding region. Mutations in proteins similar to RECQL5 have led to an increased rate of cancer.

Gene Traffic

Sometimes more than one RNAPII binds to the same gene for transcript elongation. This creates traffic among the polymerases, which may either decrease the rate of transcript elongation or force the polymerases to move forward at a faster rate. However, very little is currently known about this phenomenon, such as how and why polymerases react to traffic the way they do. Some hypothesize that the main cause is frequent collisions among polymerases that elongate at different rates on the same strand.

tRNA roles in transcription

tRNA is an RNA molecule and is thus transcribed from DNA; other than this, it has little to do with transcription. The primary role of tRNA lies in translation, where it interacts with the mature mRNA to bring the amino acid it carries to the growing polypeptide chain.

Nucleosides consist of a base linked to a ribose or deoxyribose sugar by a glycosidic bond. A nucleoside can be joined by an ester linkage to one or more phosphate groups. A nucleoside with at least one phosphate group is a nucleotide.

Nucleic acids are important macromolecules because they carry information in a form that can be passed from one generation to the next. These macromolecules consist of a large number of linked nucleotides, each of which is made of a sugar, a phosphate, and a nitrogenous base (either a purine or a pyrimidine). The purines are adenine and guanine; the pyrimidines are cytosine, uracil, and thymine. Sugars and phosphates are linked through phosphodiester bonds, creating a common backbone that plays a structural role. The sequence of bases along a nucleic acid chain, on the other hand, carries the genetic information.

The two most common nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). They differ by only a hydroxyl group and in the bonding between the nucleic acids, which is discussed further in the deoxyribonucleic acid and ribonucleic acid sections of this wikibook.

Ribose Structure
Deoxyribose Structure


In 1870, Johann Friedrich Miescher became the first person to isolate the components of DNA. He found a weakly acidic substance of unknown function in the nuclei of human white blood cells and named it “nuclein”. In the 1920s, it was discovered that nucleic acids are a major component of chromosomes. Elemental analysis showed the presence of phosphorus, in addition to 2-deoxyribose sugar and four different heterocyclic bases. Two of the bases are monocyclic and are classified as pyrimidines; the other two are bicyclic and are classified as purines.
Four nucleic acid

By the late 1940s and early 1950s, DNA had been determined to play the main role in inheritance, and the structure of DNA became a subject of universal interest in the scientific world. Erwin Chargaff was a pioneer in working out the composition of DNA. He found that the amount of adenine (A) always equaled the amount of thymine (T), and the amount of guanine (G) always equaled the amount of cytosine (C).
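
Chargaff's observation follows directly from base pairing in a double helix, which the short sketch below illustrates. It builds the partner strand for a made-up example sequence and counts bases across both strands; the sequence and the function names are illustrative assumptions, not data from Chargaff's experiments.

```python
from collections import Counter

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-paired partner strand, written in the opposite direction."""
    return "".join(PAIRS[base] for base in reversed(strand))


def duplex_base_counts(strand: str) -> Counter:
    """Count each base across both strands of the duplex formed by `strand`."""
    return Counter(strand) + Counter(complement(strand))


counts = duplex_base_counts("ATGCGGCTTA")  # made-up example sequence
print(counts["A"] == counts["T"], counts["G"] == counts["C"])
```

Because every A on one strand pairs with a T on the other (and every G with a C), the totals match for any input sequence, even though a single strand on its own need not obey the rule.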

Structure of Nucleic Acid

Each unit of a nucleic acid contains three parts: a phosphate group, a sugar group (deoxyribose or ribose), and a base. The bases are adenine, guanine, cytosine, and thymine (uracil replaces thymine in RNA). When a base is attached to a sugar group, the result is called a nucleoside. The four nucleosides of DNA are deoxyadenosine, deoxyguanosine, deoxycytidine, and thymidine. The four nucleosides of RNA are adenosine, guanosine, cytidine, and uridine. A nucleotide is a nucleoside bound to one or more phosphate groups. The four nucleotide units of DNA are called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate.
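
The naming scheme above is easy to mix up, so the following lookup table restates it as data. It is only a sketch: the dictionary layout and the `describe` helper are illustrative choices, and the RNA nucleotide names (adenylate, guanylate, cytidylate, uridylate) are the ones given earlier in this book.

```python
# (nucleoside, nucleotide) names for each base, grouped by nucleic acid.
NAMES = {
    "DNA": {
        "adenine": ("deoxyadenosine", "deoxyadenylate"),
        "guanine": ("deoxyguanosine", "deoxyguanylate"),
        "cytosine": ("deoxycytidine", "deoxycytidylate"),
        "thymine": ("thymidine", "thymidylate"),
    },
    "RNA": {
        "adenine": ("adenosine", "adenylate"),
        "guanine": ("guanosine", "guanylate"),
        "cytosine": ("cytidine", "cytidylate"),
        "uracil": ("uridine", "uridylate"),
    },
}


def describe(acid: str, base: str) -> str:
    """Spell out how a base becomes a nucleoside and then a nucleotide."""
    nucleoside, nucleotide = NAMES[acid][base]
    return f"{base} + sugar = {nucleoside}; {nucleoside} + phosphate = {nucleotide}"


print(describe("DNA", "thymine"))
print(describe("RNA", "uracil"))
```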

Nucleotides have a distinctive structure composed of three components held together by covalent bonds: a nitrogen-containing base (cytosine, thymine, adenine, or guanine, with uracil in place of thymine in RNA), a 5-carbon sugar (ribose or deoxyribose), and a phosphate group.

The structure of a nucleotide

Nucleic acids are polymers of nucleotides. They are built by forming phosphodiester bonds between the 3′ carbon of one nucleotide and the 5′ carbon of another nucleotide, creating the sugar-phosphate backbone.

Sugar-Phosphate Backbone


Basics of Transcription

Transcription is similar to DNA replication in that DNA is used as a template to make a new nucleotide strand (RNA). The newly synthesized RNA strands are complementary to the DNA template strand.

RNA polymerase uses ribonucleoside triphosphates (rNTPs: rATP, rUTP, rCTP, and rGTP) to synthesize mRNA strands in the 5′->3′ direction.
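
The complementarity and directionality described above can be sketched in a few lines. The function below takes the template strand written 3'->5' (the order in which RNA polymerase reads it) and returns the RNA written 5'->3'. The function name and the example sequence are illustrative assumptions.

```python
# Base-pairing rules for copying a DNA template into RNA (A pairs with U in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


# Made-up template, written in the 3'->5' direction.
print(transcribe("TACGGT"))  # AUGCCA
```

Because the template is read 3'->5' while the RNA grows 5'->3', the two strands are antiparallel and no reversal step is needed when the input is written in reading order.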

Transcription can be broken down into 3 steps:

1. Initiation: Transcription begins when RNA polymerase binds to a DNA region known as a promoter. Additional transcription factors are required to hold the RNA polymerase at the correct region of the DNA. After RNA polymerase binds to the promoter region, it melts 10-15 nucleotide base pairs around the transcription start site, allowing rNTPs to pair with the template strand. Initiation ends when the first two rNTPs are linked by a phosphodiester bond. (Unlike DNA replication, no primer is needed.)

2. Elongation: RNA polymerase leaves the start site and travels down the template in the 3′->5′ direction. The DNA helix is opened ahead of the active site by RNA polymerase itself during this process; a separate helicase is not required.

3. Termination: RNA polymerase is released from the DNA template strand and leaves the DNA, and the completed RNA is released.

Prokaryotic Transcription: Operons

Transcription is very similar in prokaryotes and eukaryotes in that there is an initiation step, an elongation step, and a termination step. However, there are also differences.

Prokaryotes contain operons while eukaryotes do not. Operons are clusters of related genes involved in a similar function and are often found in a contiguous array. Operons are controlled by a single promoter, and as a result, transcription produces 1 mRNA that can be translated into multiple proteins. If there were no operons, there would have to be separate promoters for each gene. Operons help make transcription more efficient; for example:

With Operon:

 |Promoter|Gene A|Gene B|Gene C| ---transcription---> mRNA ---translation---> protein A + protein B + protein C

Without Operon:

 |Promoter|Gene A| ---transcription---> mRNA A ---translation---> protein A
 |Promoter|Gene B| ---transcription---> mRNA B ---translation---> protein B
 |Promoter|Gene C| ---transcription---> mRNA C ---translation---> protein C

While an operon provides the advantage of being able to initiate transcription at one point and transcribe many genes, it has its disadvantages as well. One disadvantage is that if the promoter for the operon sequence is mutated, all the genes in the operon cannot be transcribed.

An operon consists of 3 parts:

1. Structural Genes
2. Promoter region
3. Operator region

The structural genes encode proteins. All structural genes in an operon are transcribed into a single mRNA that encodes multiple proteins.

The promoter region is where RNA polymerase binds to DNA to initiate transcription. Not all promoters have the same sequence. Strong promoters have a nucleotide sequence close to a known consensus sequence; weak promoters have a sequence that differs more from it.

The operator region is located next to, and overlaps with, the promoter region. It is the site where a repressor can bind. When a repressor binds to an operator, RNA polymerase is not able to bind to the promoter, and as a result transcription does not occur.

Inducible Operons

An inducible operon is an operon in which a substance must be bound before transcription will occur: it is normally “off,” but when the substance binds it is turned “on.” The lac operon is an example of an inducible operon. It encodes 3 proteins involved in the metabolism of lactose. The lac operon has 4 regions:

1. CAP binding site: important to increase the rate of transcription of an operon
2. Promoter: location where RNA polymerase binds
3. Operator: location where a repressor binds
4. Genes (ZYA): 
 Gene Z encodes beta-galactosidase, which breaks down lactose
 Gene Y encodes galactoside permease, a transporter protein that allows lactose to get into the cell
 Gene A encodes galactoside transacetylase

Bacteria prefer to metabolize glucose over lactose. Given this fact, we can see three different scenarios:

1. No Lactose, High Glucose

If there is no lactose, the lac operon will not be turned on. A repressor will bind to the operator which overlaps the transcription start site, preventing RNA polymerase from binding at the promoter, and thus preventing transcription of the lac operon. This is important for bacteria to save energy. Because they prefer to metabolize glucose, there is no need to turn on the lac operon in the presence of no lactose and high glucose.

2. High Lactose, High Glucose

Lactose will bind to the repressor, causing a conformational change that forces the repressor to unbind from the operator. As a result, RNA polymerase can bind to the promoter and transcription can occur; however, transcription is weak.

3. High Lactose, Low Glucose

Lactose will still bind to the repressor and force it to unbind from the operator (due to the high amount of lactose present). In addition, the low glucose level causes an increase in cyclic AMP (cAMP). cAMP binds to the catabolite activator protein (CAP), and the cAMP-CAP complex binds to the CAP binding site. This increases RNA polymerase’s binding to the promoter, resulting in high amounts of transcription.
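
The three scenarios above amount to a small piece of logic with two inputs, which the sketch below writes out. It treats lactose and glucose as simply present or absent, which is a simplification of the "high" and "low" levels in the text. The fourth combination (no lactose, low glucose) is not discussed above; the code assumes the bound repressor keeps the operon off in that case.

```python
def lac_operon_transcription(lactose: bool, glucose: bool) -> str:
    """Return 'off', 'low', or 'high' for a simplified lac operon model."""
    repressor_on_operator = not lactose   # lactose pulls the repressor off
    cap_site_occupied = not glucose       # low glucose -> high cAMP -> CAP site bound
    if repressor_on_operator:
        return "off"                      # RNA polymerase cannot bind the promoter
    return "high" if cap_site_occupied else "low"


for lactose, glucose in [(False, True), (True, True), (True, False), (False, False)]:
    print(f"lactose={lactose}, glucose={glucose}:",
          lac_operon_transcription(lactose, glucose))
```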

Repressible Operon

A repressible operon is an operon in which transcription is normally “on,” but when a substance binds it is turned “off.” It is the opposite of an inducible operon.

The Trp Operon is an example of a repressible operon, and is involved in the synthesis of the essential amino acid tryptophan. Tryptophan is either made, or obtained from the environment. The Trp Operon is normally on, transcribing the RNA needed to synthesize tryptophan, however, in the presence of tryptophan (obtained from the environment), the operon is turned off in order to save energy.

Prokaryotic Transcription: Transcription Factors (Sigma)

The function of a transcription factor is to bring RNA polymerase and the promoter together. It will bind to RNA polymerase and at the same time, associate with the DNA promoter. Transcription factors exist in both prokaryotes and eukaryotes.

An example of a prokaryotic transcription factor is the “Sigma Factor.” The sigma factor directs RNA polymerase to the correct location on the DNA.

Eukaryotic Transcription: RNA Polymerase

There are three different types of RNA Polymerase in eukaryotes:

1. RNA polymerase I: makes most rRNA
2. RNA polymerase II: makes mRNA, miRNA, and splicing RNA (snRNA)
3. RNA polymerase III: makes tRNA, 5S rRNA, and some splicing RNA

The structures of all three RNA polymerases are very similar and highly conserved. The components include a site for rNTPs to enter and a site for phosphodiester bond formation. RNA polymerase II contains a region known as the carboxy-terminal domain (CTD), which is specific to RNA pol II. The CTD consists of a seven-amino-acid sequence that is repeated 52 times in vertebrates. It is essential for viability, and RNA polymerase II cannot function without it. Before transcription, the CTD is non-phosphorylated; after the initiation step of transcription, the CTD becomes phosphorylated.

Eukaryotic Transcription: Promoter, Proximal Promoters, and Enhancers

The promoter tells the cell where to start transcription. Transcription factors identify and bind to promoter regions, and also help to recruit RNA polymerases.

Proximal promoters and enhancers are both sites for transcription factors to bind as well. Proximal promoters are about 200 base pairs upstream of the start site. Enhancers are further away from the start site, and can be found up to 50,000 base pairs up or downstream of the site.

Promoter regions were identified through two different experiments: 5′ deletion analysis and linker scanning analysis.

Prokaryotic vs. Eukaryotic Transcription

A few differences between eukaryotic and prokaryotic transcription: eukaryotes have multiple general transcription factors, lack operons, and have a genome packed into chromatin. Prokaryotes, on the other hand, have one general transcription factor (sigma), have operons, and have a genome that is not packed into chromatin; it consists of a chromosome in the cytoplasm, sometimes accompanied by plasmids.

Looking at and comparing the three steps of transcription between Prokaryotes and Eukaryotes:

1. Initiation: the binding of RNA polymerase to double-stranded DNA.

For prokaryotes: RNA polymerase is a multi-subunit complex whose core has to bind to the promoter region of the DNA template. Another subunit, known as sigma, makes this possible by finding and binding to the promoter region using many weak H-bonds with the base pairs (together, the many weak H-bonds add up to a strong net force). The RNA polymerase simply slides along the DNA, and it either finds a promoter or dissociates and starts somewhere else. If it finds a promoter, it proceeds to unwind the DNA, which leads to elongation. Prokaryotes have two promoter sites located upstream of the first nucleotide to be transcribed: the Pribnow box, which is located 10 nucleotides upstream and has the sequence TATAAT, and the -35 region, which has the sequence TTGACA.

For Eukaryotes: Initiation is more complicated in eukaryotes. First, the RNA polymerase does not randomly scan the DNA for promoter regions. Rather, transcription factors bind to the promoter regions and create specific sites for the RNA polymerase to bind to. In eukaryotic cells, there are also 3 different kinds of RNA polymerase, each of which transcribes a different type of RNA. Once an RNA polymerase binds to its respective promoter region (equipped with transcription factors), it forms a transcription initiation complex, which traverses the DNA. Eukaryotes also have two promoter sites: the TATA or Hogness box, which is located at -25 and has the sequence TATAAA, and the CAAT box, which is located at -75 and has the sequence GGNCAATCT. Transcription initiation is stimulated by enhancer sequences.

2. Elongation: the covalent addition of nucleotides to the 3′ end of the growing polynucleotide chain.

For prokaryotes: As the DNA is unwound, its bases become available for pairing. The first ribonucleoside triphosphates (the RNA building blocks) pair with them. The RNA polymerase loses its sigma subunit, leaving only the core. The unwound DNA provides sites for the RNA building blocks to H-bond to (each with its correct base-pairing partner). The triphosphates work like train couplings: each new RNA building block is covalently joined to the chain by a phosphodiester bond.

For Eukaryotes: As the complex moves across the DNA, it unzips it and allows for Watson-Crick base-pairing to occur with transient RNA building blocks, linking them together using phosphodiester bonds.

3. Termination: the recognition of the transcription termination sequence and the release of RNA polymerase.

For prokaryotes: Elongation continues until the polymerase reaches a stop signal on the DNA. The termination signal in E. coli is a base-paired hairpin that is rich in guanine and cytosine. These two nucleotides pair with one another, creating a hairpin turn, which is then followed by several uracil nucleotides. The hairpin acts like a knot in the RNA strand that is being made, so the RNA detaches from the DNA template. The RNA polymerase core dissociates shortly after, and the DNA rewinds. Transcription can also be stopped by the rho protein, which causes the mRNA to fall off the template DNA strand.

For Eukaryotes: When the RNA polymerase complex reaches a termination signal on the DNA, the RNA polymerase detaches from the DNA, allowing it to rewind. The newly transcribed mRNA is then further processed by adding a cap to the 5′ end and a poly(A) tail to the 3′ end. Adding the tail, a process called polyadenylation, attaches several adenine residues to the 3′ end of the mRNA. The cap and poly(A) tail stabilize the mRNA molecule and protect it from degradation.


Chromatin Modification

DNA and proteins together make up chromatin, which can be found as either euchromatin or heterochromatin. Euchromatin is loosely condensed while heterochromatin is tightly condensed. Chromatin condensation is important because it determines transcriptional activity: if chromatin is tightly condensed (heterochromatin), transcription factors and RNA polymerase are not able to get in; if chromatin is loosely condensed, as in euchromatin, transcription factors and RNA polymerase can more easily access the DNA and start transcription. This means that genes located in euchromatin are likely to be transcribed, whereas genes located in heterochromatin are less likely to be transcribed.

30nm Chromatin Structures

Chromatin compaction and relaxation are regulated by modifying histone tails. When an acetyl group is attached to a histone tail, it neutralizes a positive charge on the histone, so the interaction between the negatively charged DNA and the positively charged histone becomes weaker. As a result, RNA polymerase can more easily access the DNA, which facilitates transcriptional activity in vivo. In contrast, when a histone is deacetylated, meaning that the acetyl group is removed from the histone tail, the chromatin structure becomes more compact, and transcription is repressed.


Reverse transcription is the process in which double-stranded DNA molecules are made from single-stranded RNA. It is named for running in the opposite direction to transcription. It requires a reverse transcriptase enzyme, a primer, dNTPs, and an RNase inhibitor.

Reverse transcriptase

A reverse transcriptase, also known as RNA-dependent DNA polymerase, is a DNA polymerase enzyme that transcribes single-stranded RNA into double-stranded DNA. It also helps in the formation of a double helix DNA once the RNA has been reverse transcribed into a single strand cDNA. Normal transcription involves the synthesis of RNA from DNA; hence, reverse transcription is the reverse of this.
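
The two stages described above (RNA to single-stranded cDNA, then cDNA to a double helix) can be sketched as two reverse-complement operations. This is only a sequence-level illustration with made-up input; it ignores primers, RNase H activity, and the rest of the enzymatic mechanism, and the function names are assumptions.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna_5_to_3: str) -> str:
    """Return the cDNA strand (5'->3') complementary to the RNA template."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5_to_3))


def second_strand(cdna_5_to_3: str) -> str:
    """Return the DNA strand (5'->3') complementary to the first cDNA strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna_5_to_3))


rna = "AUGCCA"  # made-up RNA sequence
cdna = first_strand_cdna(rna)
print(cdna)                 # TGGCAT
print(second_strand(cdna))  # ATGCCA
```

The second strand carries the same sequence as the original RNA, with T in place of U, which is why the resulting double-stranded DNA preserves the information in the RNA.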

Mechanism of reverse transcription

Step a: A transfer RNA (tRNA) serves as the primer for synthesis of the first (minus) DNA strand; reverse transcriptase binds to the 3′ end of the tRNA in polymerase binding mode.

Step b: The enzyme binds to the RNA template in RNase H mode and cleaves it.

Step c: Reverse transcriptase uses the polypurine tract (PPT) sequence as a primer and binds in polymerase mode to synthesize the second DNA strand.


Retroviruses store their genetic information in RNA. An example of a retrovirus is HIV, the virus that causes AIDS. In retroviruses, genetic information flows from RNA to DNA. Viruses are enclosed in protein coats and are not capable of independent growth, and therefore cannot reproduce without a host.

Although AIDS is a terrible disease caused by a virus that relies on reverse transcriptase, mankind owes a considerable debt to the enzyme. Reverse transcriptase has seen extensive use in the study of gene expression (via gene chips) and in protein synthesis (mRNA --reverse transcriptase--> cDNA --> insert into a recombinant plasmid --> introduce into E. coli --> have E. coli synthesize more of the original mRNA --> have that mRNA translated into its respective protein).

Reverse transcriptase sometimes makes mistakes when reading the RNA sequence. As a result, the viruses produced by a single infected cell are not all identical. Instead, they show a diversity of molecular differences in their surface coat and enzymes, which makes it difficult for scientists to develop a matching drug for the disease. For the same reason, it is difficult to fight HIV with vaccines, because its surface molecules change continually.

Invention of drugs for HIV

For a while, reverse transcriptase was considered a prime target in many HIV studies. It was discovered that without reverse transcriptase, the viral genome cannot be incorporated into the DNA of the host cell and therefore cannot be reproduced.

As a result, the first major class of drugs developed to slow down HIV infection targets this enzyme; these drugs are called reverse transcriptase inhibitors. They include AZT, 3TC, d4T, ddC, and ddI, which block the recoding of viral RNA into DNA. However, the continual change in HIV surface molecules limits the effect of these drugs.

Common HIV drugs and How They Work

There have been quite a few drug therapies from the 1990s to 2006, and research on new, improved drugs is still ongoing. Highly Active Antiretroviral Therapy (HAART) combines three different types of HIV treatment: protease inhibitors, non-nucleoside reverse transcriptase inhibitors, and nucleoside reverse transcriptase inhibitors.

A protease inhibitor (PI) inhibits the viral protease, which is responsible for proteolytic processing of the viral polypeptide.

A non-nucleoside reverse transcriptase inhibitor (NNRTI) is used in combination with two nucleoside or nucleotide reverse transcriptase inhibitors (NRTIs and NtRTIs). An NNRTI is a non-competitive inhibitor, which means that it binds to the reverse transcriptase enzyme at a site other than the active site. This changes the shape of the active site and reduces the catalytic ability of the enzyme. The movement of the protein domains of the enzyme is blocked, so synthesis of viral DNA does not occur.

The nucleoside reverse transcriptase inhibitors (NRTIs and NtRTIs) instead work as competitive substrate inhibitors: the inhibitor competes with the normal substrate at the active site. In the case of reverse transcriptase, the drug competes with the normal deoxynucleotides for incorporation into the growing viral DNA chain. Once the drug is incorporated, there is no 3′-OH group on its sugar unit, so the next 5′-3′ phosphodiester bond, which is essential for elongation of the DNA chain, cannot form. This is called chain termination. [70]
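
Chain termination can be sketched as a loop that stops as soon as a building block without a 3'-OH group is incorporated. The model below is deliberately minimal: each incoming unit is a hypothetical pair of a base letter and a flag saying whether it has a 3'-OH, and competition between drug and normal substrate is not modeled.

```python
from typing import List, Tuple

# (base, has_3_prime_oh): a normal dNTP has a 3'-OH; an incorporated NRTI does not.
Unit = Tuple[str, bool]


def elongate(incoming: List[Unit]) -> str:
    """Add units in order; stop once a unit lacking a 3'-OH has been added."""
    chain = []
    for base, has_3_prime_oh in incoming:
        chain.append(base)
        if not has_3_prime_oh:
            break  # no 3'-OH, so the next phosphodiester bond cannot form
    return "".join(chain)


normal = [("A", True), ("T", True), ("G", True), ("C", True)]
with_drug = [("A", True), ("T", False), ("G", True), ("C", True)]
print(elongate(normal))     # ATGC
print(elongate(with_drug))  # AT
```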

Another class of HIV/AIDS drugs is the diketoaryl (DKA) integrase inhibitors. Integrase is the third viral enzyme, and it catalyzes a two-step reaction:
1. 3′ Processing: Integrase catalyzes the processing of the 3′ ends of the viral cDNA. The processing corresponds to an endonucleolytic cleavage of the 3′ ends of the viral cDNA.
2. Strand Transfer: After 3′ processing, the 3′-OH ends of the viral cDNA are ligated to the 5′-phosphate of an acceptor DNA, which is the host chromosome.
Pre-Integration Complex: This macromolecular complex forms during and after 3′ processing and undergoes nuclear translocation. It carries the 3′-processed viral cDNA ends, together with viral and cellular proteins, to the nucleus before integration occurs.
DKA inhibitors aim to block the strand-transfer step (step 2). Other inhibitors block both the strand-transfer step and 3′ processing. These integrase inhibitors are still in trials, which should aid the future discovery of more specific anti-AIDS drugs.[71]

List of Current Anti-AIDS Drugs

The following anti-AIDS therapies have been approved by the FDA. Each entry lists, in order: FDA approval year | brand name | generic name | manufacturer.

Fusion inhibitors
2003 | Fuzeon | Enfuvirtide (T-20) | Roche Pharmaceuticals & Trimeris

Nucleoside reverse transcriptase inhibitors (NRTIs)
1987 | Retrovir | Zidovudine (AZT) | GlaxoSmithKline
1991 | Videx | Didanosine (ddI) | Bristol-Myers Squibb
1992 | Hivid | Zalcitabine (ddC) | Roche Pharmaceuticals
1994 | Zerit | Stavudine (d4T) | Bristol-Myers Squibb
1995 | Epivir | Lamivudine (3TC) | GlaxoSmithKline
1997 | Combivir | Lamivudine + Zidovudine | GlaxoSmithKline
1998 | Ziagen | Abacavir | GlaxoSmithKline
2000 | Trizivir | Abacavir + Lamivudine + Zidovudine | GlaxoSmithKline
2000 | Videx EC | Didanosine (ddI) | Bristol-Myers Squibb
2001 | Viread | Tenofovir disoproxil | Gilead Sciences
2003 | Emtriva | Emtricitabine (FTC) | Gilead Sciences
2004 | Epzicom | Abacavir + Lamivudine | GlaxoSmithKline
2004 | Truvada | Emtricitabine + Tenofovir | Gilead Sciences

Non-nucleoside reverse transcriptase inhibitors (NNRTIs)
1996 | Viramune | Nevirapine | Boehringer Ingelheim
1997 | Rescriptor | Delavirdine (DLV) | Pfizer
1998 | Sustiva | Efavirenz | Bristol-Myers Squibb

Protease inhibitors (PIs)
1995 | Invirase | Saquinavir | Roche Pharmaceuticals
1996 | Norvir | Ritonavir | Abbott Laboratories
1996 | Crixivan | Indinavir (IDV) | Merck
1997 | Viracept | Nelfinavir | Pfizer
1997 | Fortovase | Saquinavir Mesylate | Roche Pharmaceuticals
1999 | Agenerase | Amprenavir | GlaxoSmithKline
2000 | Kaletra | Lopinavir + Ritonavir | Abbott Laboratories
2003 | Reyataz | Atazanavir | Bristol-Myers Squibb
2003 | Lexiva | Fosamprenavir | GlaxoSmithKline

Telomerase Reverse Transcriptase

Understanding the structure of telomeres is important because telomeres are vital for genomic stability. Dysregulation of telomeres can lead to apoptosis (cell death) and abnormal cell proliferation. The enzyme telomerase is essential for maintaining telomere repeats in most eukaryotic cells. Telomerase consists of a reverse transcriptase enzyme and an RNA strand that templates the synthesis of the G-rich strand of the telomere terminal repeats. The telomerase reverse transcriptase contains distinctive and variable C- and N-terminal extensions that flank a central reverse transcriptase-like domain. It has two distinguishing properties: stable association with the telomerase RNA, and the ability to reverse transcribe the RNA template segment repeatedly.
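
The repeated copying of a short RNA template can be sketched as follows. The template string used here is a simplified, hypothetical stand-in chosen so that copying it yields TTAGGG, the repeat found at human telomeres; the real telomerase RNA template region is longer, and the alignment and translocation steps between rounds are not modeled.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def copy_rna_template(rna_template_5_to_3: str) -> str:
    """Reverse transcribe an RNA template into the DNA repeat it specifies (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_template_5_to_3))


def extend_telomere(dna_end: str, rna_template_5_to_3: str, rounds: int) -> str:
    """Append one DNA repeat to the 3' end for each round of template copying."""
    return dna_end + copy_rna_template(rna_template_5_to_3) * rounds


TEMPLATE = "CCCUAA"  # hypothetical, simplified template segment
print(copy_rna_template(TEMPLATE))             # TTAGGG
print(extend_telomere("TTAGGG", TEMPLATE, 2))  # TTAGGGTTAGGGTTAGGG
```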

In eukaryotes, telomeres are nucleoprotein structures located at the ends of linear chromosomes. They consist of short repeated sequences together with proteins that interact with these sequences directly or indirectly. One of the jobs of telomeres is to protect the chromosome ends from degradation and from other inappropriate reactions. They also promote the division of chromosomes during meiosis and mitosis. Incomplete replication of telomeres leads to loss of DNA, which is known as “the end replication problem”. Telomerase is crucial in single-celled eukaryotes, where it is required for continued growth of the cell population. In multicellular eukaryotes, on the other hand, telomerase is suppressed in normal somatic tissues but highly expressed in germline tissues such as the ovaries and testes. Telomerase is considered a plausible target for cancer therapy because it is up-regulated in cancer cells. [4]


1-3. Yves Pommier, Allison A. Johnson and Christophe Marchand. Integrase Inhibitors To Treat HIV/AIDS. Volume 4. March 2005.

4. Bloomfield Center for Research in Aging, Lady Davis Institute for Medical Research, Sir Mortimer B. Davis Jewish General Hospital, and Department of Anatomy and Cell Biology and Department of Medicine, McGill University, Montreal, Quebec, Canada.

Although it has been a common belief that regulation of transcription takes place via regions adjacent to the coding region of the gene, mostly through promoters and enhancers, and that the polymerase acts as a machine that quickly “reads the gene”, recent evidence suggests that there is much more to the process than that.

Transcript elongation is extremely complex and highly regulated, and the process is significant because it affects both the organization and the integrity of the genome. This section explores some of the intricacies of transcript elongation by RNA Polymerase II, which have been overlooked for some time.

RNA Polymerase II transcription cycle can be classified into several distinct steps:

  1. RNAPII is recruited to the promoter of a gene where it forms a pre-initiation complex with the general transcription factors
  2. At this point initiation ensues and the promoter is left behind in the process called “promoter clearance”
  3. RNAPII enters processive transcript elongation and the gene is fully transcribed
  4. Then, transcriptional termination occurs, and results in the release and recycling of the RNAPII

Challenges that face RNAPII:

  1. Polymerase needs to escape the promoter
  2. The production of the pre-mRNA transcript needs to be tightly coupled to RNA biogenesis. This includes RNA capping, splicing, transcript cleavage, and polyadenylation.
  3. RNAPII must navigate past nucleosomes and other obstacles (like DNA damage) because transcript elongation occurs in the chromatin
  4. Transcription is affected by DNA metabolic processes: DNA repair, recombination, and replication.
  5. Transcript elongation in highly transcribed genes is carried out by several polymerase molecules at the same time, so gene “traffic” must be regulated.

Alpha-Amanitin–RNA polymerase II complex 1K83

DNA transcription

Transcript elongation is governed by a Brownian ratchet mechanism. The forward-translocated state of the enzyme is stabilized by the binding and hydrolysis of the correct incoming nucleotide. The difficulty with this mechanism is that although the polymerase moves forward rapidly, it can also move backwards. Even after a new phosphodiester bond has formed, the enzyme can backtrack by one or several nucleotides, so that the newly formed 3’ terminus comes out of alignment with the active site.
Brownian motion can bring the RNA back into alignment with the active site. General elongation factors affect the multiple equilibria between the different enzyme states, which helps drive the reaction towards forward translocation.
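
The forward-biased, back-and-forth motion described above can be pictured as a biased random walk. The following is a minimal sketch under stated assumptions, not a model taken from the source: the function name `simulate_translocation`, the single parameter `p_forward`, and the rule that every step moves exactly one nucleotide are illustrative simplifications.

```python
import random

def simulate_translocation(steps, p_forward, seed=0):
    """Toy biased random walk for polymerase translocation.

    Hypothetical simplification: at each step the enzyme moves one
    nucleotide forward with probability p_forward; otherwise it
    backtracks by one nucleotide (never past position 0).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    position = 0
    for _ in range(steps):
        if rng.random() < p_forward:
            position += 1                    # forward translocation
        else:
            position = max(position - 1, 0)  # backtracking
    return position

# Elongation factors are modeled here only as a larger forward bias.
unassisted = simulate_translocation(1000, p_forward=0.5)
assisted = simulate_translocation(1000, p_forward=0.9)
print(unassisted, assisted)
```

Because both runs use the same random draws, the walk with the larger forward bias can never end up behind the other one, which is one way to picture how elongation factors shift the equilibrium towards forward translocation.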

Transcription fidelity is essential, as the correct insertion of nucleotides into the RNA transcript during elongation is highly important for accuracy of gene expression.

A few key components of the transcription machinery help ensure fidelity:

  1. Trigger loop
  2. Rpb9 (subunit on RNAPII)
  3. Transcription factors, especially TFIIS
  4. RNA Polymerase II itself

The Trigger Loop

The trigger loop is located beneath the active site and is involved in multiple interactions with the incoming nucleotide. It plays a key role in fidelity: mismatched nucleotides in the active site do not align correctly with the loop, which greatly reduces the rate of phosphodiester bond formation. In this way the trigger loop mediates phosphodiester bond formation. The trigger loop also discriminates against dNTPs, because it interacts with both the base and the 2’-OH group of an incoming rNTP.

Another important structure on RNAPII is the Rpb9 subunit, which was first discovered by scientists studying yeast strains. Rpb9 delays closure of the trigger loop on the incoming nucleotide, which helps to ensure transcription fidelity by allowing more time for mistakes to be fixed.

RNA Polymerase II and TFIIS

RNAPII itself plays a key role in its own transcription fidelity: because the polymerase can move forwards and backwards along the template it is transcribing, it can correct mistakes through transcript cleavage, which removes erroneously incorporated nucleotides. Transcript cleavage is greatly enhanced by the transcription factor TFIIS. Recent studies also suggest that the role of TFIIS in fidelity is tightly coupled to the function of Rpb9, which suggests that Rpb9 might be involved in RNAPII fidelity both before and after nucleotide addition, by affecting the function of the trigger loop and by mediating TFIIS function.

Transcript elongation occurs on a chromatin template, and chromatin is extremely repressive to transcription. Thus, certain mechanisms must be used to make it more permissive for transcription. These involve the temporary displacement and modification or disassembly of nucleosomes. This can occur through a few different mechanisms, and requires:

  1. Histone chaperones
  2. Chromatin remodeling factors
  3. Histone modifying enzymes

These same factors also aid in the resetting of the chromatin structure after RNAPII has transcribed.

Nucleosome Assembly

First, the H3-H4 dimers are acetylated by the HAT1-RbAp48 complex. Most of the H3 and H4 proteins are then passed on to Asf1, CAF-1, yRtt106, and HIRA, which mediate chromatin assembly. The choice between CAF-1, yRtt106, and HIRA depends on the H3 variant. For example, H3.1 is directed to CAF-1 while H3.3 is directed to HIRA. CAF-1 and yRtt106 act in association with DNA synthesis, whereas HIRA acts outside of DNA synthesis. Histone hand-off to the different histone chaperones depends on the specific acetylation mark H3 K56Ac. In yeast, the role of H3 K56Ac is to drive chromatin assembly during DNA repair and replication. Because H3 K56Ac is harder to detect in humans, it is only speculated to perform the same task there. With H3 K56Ac there is a higher chance that Asf1 will transfer histones to CAF-1 or Rtt106. The choice between the DNA synthesis-dependent pathway and the synthesis-independent pathway depends on the physical interactions between the chaperones. The final chaperone in the hand-off places the histones on DNA. Further mechanisms determine where on the DNA the histones are placed and increase the local concentration of histones. [1]

Nucleosome Disassembly

One of the mechanisms used is nucleosome disassembly in front of elongating RNAPII, and several experiments support this. One such experiment, measuring histone density at the yeast GAL genes, showed that gene activation causes loss of nucleosomes at the promoter and also within the coding region. This loss of histone density is caused by elongating RNAPII. Genome-wide analysis of nucleosome occupancy in yeast revealed that transcription rate and histone density are inversely related, further evidence that nucleosomes are disassembled during transcription.
Another mechanism is the displacement of all core histones during transcription. The histones differ in how readily they are displaced, however. H2A/H2B dimers are located on the exterior of the nucleosome and have fewer protein-DNA contacts. These dimers are rapidly exchanged in response to transcription factors. Conversely, histones H3 and H4 are much less mobile, and their turnover rate is largely independent of that of the H2A/H2B dimer. So, although all core histones are displaced during this process, histones H2A/H2B are moved more readily than H3/H4.

Just as nucleosome assembly occurs in a stepwise manner with histone chaperones, nucleosome disassembly is a stepwise process in the reverse direction. The most significant difference between nucleosome assembly and disassembly is that disassembly requires energy. The energy is needed to break the histone-DNA contacts so that histone chaperones can bind the histones and move them away from the DNA. The factors that determine whether a nucleosome undergoes disassembly are the post-translational histone modifications and the availability of histone chaperones. The outcome is a matter of equilibrium. If only the H2A-H2B equilibrium leans towards removal from DNA, there is only histone exchange. However, if the H3-H4 equilibrium leans towards removal from DNA, there is nucleosome disassembly. H3 K56Ac in promoter regions is a common example in nucleosome disassembly because it shifts the H3-H4 equilibrium towards removal from DNA. These equilibria are affected by histone-chaperone, histone-DNA, and histone-histone interactions.[1]
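
The equilibrium rule stated above can be written as a small decision function. This is only an illustrative sketch: the function name `chromatin_outcome` and its boolean inputs are hypothetical, and the "nucleosome retained" branch (neither equilibrium leaning towards removal) is an assumption, since the text does not discuss that case.

```python
def chromatin_outcome(h2a_h2b_leans_to_removal, h3_h4_leans_to_removal):
    """Map the two histone-DNA equilibria onto the outcome described in the text."""
    if h3_h4_leans_to_removal:
        # Losing H3-H4 means the whole nucleosome comes apart.
        return "nucleosome disassembly"
    if h2a_h2b_leans_to_removal:
        # Only the outer dimers leave; the H3-H4 core stays on DNA.
        return "histone exchange"
    # Assumption: case not covered by the text.
    return "nucleosome retained"

print(chromatin_outcome(True, False))
print(chromatin_outcome(True, True))
```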

Histone Chaperones

Histone chaperones are histone-binding proteins involved in intracellular histone dynamics. Recent evidence has revealed that one histone chaperone, FACT, has a major role in transcript elongation. FACT (facilitates chromatin transcription) facilitates elongation by destabilizing nucleosome structure so that an H2A/H2B dimer can be removed during the passage of RNAPII. Spt6 is an H3 and H4 histone chaperone that maintains chromatin structure, promoting restoration of normal chromatin structure in the wake of RNAPII transcript elongation. Histone chaperones in general are involved in both removing and re-depositing histones during transcript elongation.

RNA Polymerase II Transcription

Histone chaperones not only pack histones and DNA into the nucleosome structure but also disassemble that structure. Histones and DNA fold together in the nucleosome in specific ways, and histone chaperones help oversee every step. H2A, H2B, H3, and H4 are the core histone proteins. Histone chaperones assist in the formation of the tetrasome from the H3-H4 heterotetramer on DNA. The tetrasome in combination with two H2A-H2B dimers forms the nucleosome. Histone chaperones are important because without their help, the positively charged histones would form aggregates with the negatively charged DNA. The fact that the histones have a hydrophobic part and a slightly acidic part makes the protein even more attracted to DNA. Histone chaperones, although not similar in sequence, can be grouped structurally as beta-sheet sandwich chaperones, alpha-earmuff chaperones, beta-propeller chaperones, and beta-barrel and half-barrel chaperones. [1]

Beta-sheet sandwich chaperones
This structure is characterized by an N-terminal core domain of Saccharomyces cerevisiae Asf1 and the absence of an acidic tail. Mutagenesis and NMR chemical shift experiments confirmed that histone binding occurs at the hydrophobic and acidic surface of Asf1. The C-terminal tail of H4 can bind to either the yAsf1 beta-sheet or the H2A mini beta-sheet. Therefore, yAsf1 can access the H3-H4 dimer and form a complex that is structurally conserved. The crystal structure of human ASF1a reveals this complex. Asf1 passes the H3-H4 dimer on to other histone chaperones such as CAF-1 or HIRA. Asf1 regulates transcription, replication and repair. In addition to Asf1, Yaf9 is another chaperone in the beta-sheet sandwich group. Yaf9 plays a role in H2AZ acetylation and deposition into euchromatic promoter regions. It also contains structurally conserved features that could serve as H3-H4 binding sites, and it regulates transcription. [1]

Alpha-earmuff chaperones
This group is responsible for histone delivery from the cytoplasm, binding to H1, and assembling and disassembling nucleosomes. These chaperones are characterized by their use of a long alpha-helix for dimerization and for linking the alpha beta earmuff motifs. Mutagenesis confirms that the histone binding site lies at the central and bottom surfaces of the earmuff domains. Vps75 is one of the chaperones in this group, and it binds the H3-H4 dimer. It is involved in the acetylation of H3 K56, which in turn aids the packing process. Along with acetylation, Vps75 is also responsible for regulating transcription and repair and for maintaining telomere length. The earmuff domains are closer together in Vps75 than in NAP1. NAP1 is responsible for transcription, H2AZ exchange, linker histone deposition, and histone delivery.[1]

Beta-propeller chaperones
This group is characterized by acidic patches that are not as distinct as those of other chaperones. Nucleoplasmin is responsible for regulating nucleolar events and for histone storage during events such as oogenesis, sperm chromatin decompaction, and nucleosomal assembly. The pentameric N-terminal domain of nucleoplasmin has the ability to self-associate into a decamer. The nucleoplasmin core works in conjunction with linker histones and core histones to produce five histone octamers. Along with having a beta-propeller chaperone structure, CAF-1 and RbAp46 have an alpha-helix at the N-terminus. These chaperones are negatively charged at the top and hydrophobic at the bottom. The H4 alpha helix binds between the alpha helix of the chaperone and a binding loop in an acidic region. Consequently, the H3-H4 dimer structure is disrupted. To counteract this, these chaperones take on the role of associating the H3-H4 dimer with several chromatin-modifying complexes such as HAT1, PRC2, NURD, and NURF.[1]

Beta-barrel and half barrel chaperones
This family includes the FACT subunits Spt16, Pob3, and Nhp6 (yeast) and SPT16 and SSRP1 (human). The name FACT reflects the fact that these chaperones facilitate chromatin transcription. The structure of Pob3, a helix-capped beta-barrel, helps it bind to the yeast replication protein A (RPA) complex to assist in replication. Spt16 has linked aminopeptidase and pita-bread domains. Rtt106 binds histones acetylated on H3 K56. Its H3-H4 binding site is located in a loop in the C-terminal domain.[1]

Histone variant chaperone Chz1
This histone chaperone group is characterized as not having a defined sequence or structure. Chz1 is a H2AZ-H2B binding protein. NMR confirms Chz1’s unique structure of alpha-helices that bind mainly on one surface of the H2AZ-H2B dimer with high affinity. Therefore, it dissociates more slowly than it associates. [1]

The oligomeric state of the histone chaperone depends on the stage of the nucleosome assembly. Histone chaperone-histone binding could either be simple or multimeric. The weak individual binding adds up to a high affinity multimeric binding. Although Asf1 and Chz1 are both monomeric, they play different roles in nucleosome assembly. Asf1 prevents H3-H4 tetramer formation. H3 would usually dimerize inside the histone octamer, but Asf1 binds to H3 making it exposed to other histone chaperones. Once Asf1 is released, the H3-H4 tetramer can form with Rtt106 and CAF1 attached. It is after this stage that the tetramer is deposited onto DNA. Chz1 on the other hand binds to an already exposed histone octamer surface of H2AZ-H2B. Therefore, the histones are directly deposited on tetrasomes.[1]

Alpha beta earmuff chaperones assemble nucleosomes in vitro. NAP1 not only assembles histone octamers but also tetrasomes. NAP1 assembles histone octamers through H2A-H2B or H2AZ-H2B. NAP1 assembles histone tetrasomes through H3-H4 and DNA. This process is not as favorable when H2A-H2B is present. Vps75 differs from NAP1 in that it is more specific for H3-H4 and has weaker affinity for other histones. NAP1 binds all the histones that have the common histone fold through one surface. In this respect it differs from FACT, which binds multiple surfaces of multiple histones. FACT carries out different functions depending on whether it is in vitro or in vivo. In vitro, FACT removes the H2A-H2B dimer. In vivo, however, FACT places H2A-H2B dimers on DNA during transcriptional elongation, repair, and replication.[1]

There is only one binding site for H4 in the dCAF-1 p55 and hCAF-1 RbAp48 subunits, but there are multiple binding sites for H3. The single binding site for H4 is on the opposite face of the H3-H4 dimer. This allows the CAF-1 subunits to reach either histone dimers bound to Asf1 or histone tetramers that are being deposited onto DNA.

Np is assembled as a dimer of pentamers. Each face of these histone chaperones could bind an octamer. Np binds H2A-H2B dimers and N1 binds H3-H4 tetramers to form octamers. NASP, nuclear autoantigenic sperm protein, binds H3/H4 and aids the nucleosome assembly process. [1]

Histone chaperone-guided folding pathways

Histone chaperones are extremely important in guiding the histone-DNA folding process, because without them the histones would form intermediates that act as kinetic traps. Nucleosome assembly is an energetically favorable process, and to keep it that way the histone chaperones must prevent the kinetic traps that would stop histones from ever binding to DNA. These kinetic traps are low in energy, so it is not energetically favorable for a histone to leave a trap and bind to DNA when it is more stable in the intermediate. Histone chaperones allow the histone-DNA complex to fold in the manner that is most stable and lowest in energy. Although binding to DNA is an energetically downhill process, the kinetic traps along the pathway are also low in energy, so without the help of histone chaperones a histone would remain stuck in an intermediate and never reach its destination. Histone chaperones work together with ATP-dependent chromatin remodelers, which are useful for getting trapped histones back onto the correct pathway. Since kinetic traps are low in energy, chromatin remodelers free these intermediates by raising their energy. Chromatin remodelers are also useful in breaking histone-DNA contacts. Certain acetylation marks on histones either increase or decrease binding affinity, and they are added or removed depending on whether the nucleosome needs to undergo assembly or disassembly. [1]
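
Why raising the energy of a trapped intermediate helps it escape can be illustrated with an Arrhenius-type rate expression. The sketch below is not taken from the source. It assumes that the escape rate scales as exp(-barrier / RT) and that a remodeler raises the energy of the trapped state without changing the top of the barrier. The function names and all numerical values are hypothetical.

```python
import math

GAS_CONSTANT = 8.314e-3  # kJ/(mol*K)

def escape_rate(barrier_kj_per_mol, temperature_k=310.0, prefactor=1.0):
    """Arrhenius-type rate of leaving a trap over a barrier (arbitrary units)."""
    return prefactor * math.exp(-barrier_kj_per_mol / (GAS_CONSTANT * temperature_k))

def speedup_from_remodeler(barrier, energy_raise, temperature_k=310.0):
    """Factor by which escape speeds up when the trapped state is raised in energy.

    Assumption: raising the trapped state by energy_raise lowers the
    barrier out of the trap by the same amount (never below zero).
    """
    lowered = max(barrier - energy_raise, 0.0)
    return escape_rate(lowered, temperature_k) / escape_rate(barrier, temperature_k)

# Hypothetical numbers: a 50 kJ/mol trap raised by 10 or 20 kJ/mol.
print(speedup_from_remodeler(50.0, 10.0))
print(speedup_from_remodeler(50.0, 20.0))
```

In this picture, even a modest input of energy from ATP hydrolysis makes escape from the trap far more likely, because the rate depends exponentially on the barrier height.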

Nucleosome Remodeling

ATP-dependent chromatin remodeling complexes (remodelers) use the energy of ATP to modify the structure of chromatin.

There are four main families of remodelers:

  1. SWI-SNF
  2. ISWI
  3. CHD
  4. INO80/SWR

One SWI-SNF remodeler known as RSC can stimulate RNA polymerase II transcript elongation through a mononucleosome in a simple reconstituted chromatin transcription system. This is enhanced by histone acetylation as it increases the affinity of RSC for the nucleosome. The remodeler Chd1 is associated with chromatin at sites of active transcription, and it also plays a role in transcript elongation as it interacts with elongation factors Paf, DSIF, and FACT.

Histone Modifications

Covalent modification of histones is another way of modifying chromatin structure, one that does not involve histone removal and replacement. Instead, it alters the packaging of chromatin by affecting internucleosomal contacts or changing electrostatic charge. The covalently attached moieties can also serve as a binding surface for elongation-associated effector complexes.

These effects can be achieved through three mechanisms:

  1. Histone acetylation
  2. Histone methylation
  3. Histone ubiquitination

Selth, Luke A. Sigurdsson, Stefan. Svejstrup, Jesper Q. “Transcript Elongation by RNA Polymerase II”. Annual Review of Biochemistry 2010. Vol. 79: 271-293. 04/01/2010. DOI: 10.1146/annurev.biochem.78.062807.091425

mRNA processing and transfer concerns the movement of mRNA from the nucleus to the cytoplasm. After transcription, which occurs in the nucleus, mRNA must travel to the cytoplasm, where it can reach the ribosomes. mRNA crosses the nuclear membrane in the form of an mRNP (messenger ribonucleoprotein), a structure that contains cargo and carrier components. The cargo (mRNA) requires the assistance of carriers (proteins) in order to be transported across the nuclear membrane. This mechanism makes use of recyclable proteins and imposes directionality.

After the mRNP goes through nuclear processing, it approaches the NPC (nuclear pore complex). Located at the NPC is the TREX-2 complex, which is composed of a Sac3 complex containing Sus1, Cdc31 and Thp1. TREX-2 improves the efficiency with which the mRNP enters the NPC transport channel by promoting interactions between itself and approaching mRNPs. Located along the nuclear face of the NPC, TREX-2 acts as a platform on which the necessary interactions with the mRNP are favorable. In effect, the TREX-2 complex serves as an attractive force that concentrates export-ready mRNPs and helps promote their movement through the NPC.

In addition to the Sac3 complex of the TREX-2 complex, a complex known as the SAGA complex is a part of the TREX-2 complex. The SAGA complex, which is attached to the Sac3 complex, is most known to be the location in which active genes are localized. More specifically, the TREX-2 complex localizes the active genes to pores located on the nuclear basket of the NPC transport channel as well as the SAGA complex.

Once the mRNP clears the NPC, it becomes exposed to cytoplasmic conditions. Located in the cytoplasm is an essential DEAD-box helicase known as Dbp5. Dbp5 is responsible for remodeling the mRNP, essentially removing the carrier components of the carrier-cargo relationship. The remodeling of the mRNP, the specifics of which are unknown, releases Nab2 from the poly(A) tail of the mRNA and also releases Mex67-Mtr2. As a result, Dbp5 isolates the mRNA from the mRNP. However, Dbp5 is not always in an active conformation. Dbp5 moves freely between the nuclear and cytoplasmic sides, but it only acts on the mRNP on the cytoplasmic side. This is because of Gle1 and IP6, which are located on the cytoplasmic face of the NPC. Without these two components, Dbp5 remains in a dormant state: Gle1 enhances the ATPase activity of Dbp5 as well as its affinity for RNA, and IP6 enhances the interaction between Gle1 and Dbp5. When an activated Dbp5 remodels the mRNP into the mRNA and its carriers, Nab2 and Mex67-Mtr2 are recycled. In other words, they move freely from the cytoplasmic side back to the nuclear side and are reused in the formation of new mRNPs. The use of Dbp5 to remodel the mRNP controls the directionality of transport. Once the mRNP is separated into the mRNA and carriers, the mRNA cannot re-enter the nucleus. The isolated mRNA is ready for translation.

Stewart M. “Nuclear export of mRNA”. Trends Biochem Sci. 2010 Nov;35(11):609-17. Epub 2010 Aug 16. Review. Accessed 2012 Nov 20.

In genetics, translation is the process by which mRNA is decoded to produce a polypeptide sequence, otherwise known as a protein. This process is preceded by the transcription of DNA to RNA. This method of synthesizing proteins is directed by RNA and accomplished with the help of the ribosome.
In translation, a cell decodes the genetic message of the mRNA and assembles a new polypeptide chain accordingly. This genetic message is composed of the sequence of codons that make up the mRNA strand. The translator of this message is transfer RNA, or tRNA. The main function of tRNA is to transfer free amino acids from the cytoplasm to a ribosome, where they are attached to the growing polypeptide chain. Normally, a cell is well stocked with all 20 amino acids, either by producing them or by obtaining them from food. The ribosome adds the amino acids brought to it by tRNA molecules to the growing end of the polypeptide chain.
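
The decoding of codons into amino acids can be sketched in a few lines of code. This is a minimal illustration rather than a model of the ribosome: the names `CODON_TABLE` and `translate` are chosen here for the example, the table is the standard genetic code in one-letter amino acid notation with `*` marking stop codons, and the function assumes that the sequence it is given already begins at the first codon to be read.

```python
BASES = "UCAG"
# Standard genetic code, ordered by first, second, third base in UCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

def translate(mrna):
    """Read codons from position 0 and return the peptide up to the first stop."""
    peptide = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":  # stop codon: no amino acid is added
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("AUGUUUGGGUGA"))
```

Here each dictionary lookup plays the role that a matching tRNA plays in the cell: it pairs one codon with the amino acid that codon specifies.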

Ribosome mRNA translation en.svg

The Ribosome

Molecular Picture of a ribosome, Blue = Proteins, Orange = RNA, Red = Active Site

Ribosomes help coordinate the binding of tRNA anticodons with mRNA codons during translation. The ribosome can be thought of as a molecular machine that decodes RNA and uses the information to build a polypeptide with a precise sequence of amino acids. Its decoding process may be compared to a language translation machine that converts one language to another: the ribosome translates the language of the mRNA code into protein sequences that conduct the activities of the cell. In eukaryotes, the ribosomal subunits are produced in the nucleolus of the cell. A ribosome is made up of two subunits, one large and one small. These subunits are made up of proteins and RNA molecules called ribosomal RNA (rRNA). The ribosomes of prokaryotes and eukaryotes differ in many ways. For example, the eukaryotic large subunit has a size of 60S, while the prokaryotic large subunit has a size of 50S, meaning that prokaryotic subunits are smaller. In prokaryotes, the large 50S subunit and the small 30S subunit together form the 70S ribosome (Svedberg units measure sedimentation rate and are not additive). The 30S subunit is responsible for the selection of cognate aminoacyl-tRNAs by facilitating base-pairing between mRNA codons and tRNA anticodons, while the active site, the peptidyl transferase center (PTC), resides in the 50S subunit. A typical E. coli cell has approximately 18,000 ribosomes. The PTC catalyzes both peptidyl transfer during protein elongation and hydrolysis of the peptidyl-tRNA during termination. In vitro, the 50S subunit can synthesize peptide bonds as rapidly as the entire 70S ribosome, indicating that the catalytic activity resides in the 50S subunit. In human health, the ribosome can serve as a target for drugs that work by inhibiting translation within pathogens without affecting the host organism. This is medically important because antibiotics such as streptomycin and kanamycin target the small subunit of bacterial ribosomes while leaving eukaryotic ribosomes unharmed.

The structure of the ribosome consists of a binding site for mRNA and three binding sites for tRNA. For tRNA, the P site (peptidyl-tRNA site) carries the growing polypeptide chain, while the A site (aminoacyl-tRNA site) holds the tRNA that carries the next amino acid that is to be added to the growing chain. The E site (exit site) is the site where discharged tRNAs leave the ribosome. The ribosome holds these two components close together and positions the new amino acid in a way that allows the addition of new amino acids to the carboxyl end of the growing polypeptide chain. As the chain gets longer, it goes through an exit tunnel in the large ribosomal subunit. Once the chain is completed, it is released to the cytosol of the cell through the exit tunnel. A particularly iconic video of Protein Translation shows this process.


In translation, there are three main steps that describe the protein decoding and synthesis process. These steps, in order, are called initiation, elongation, and termination.


The first step of translation is called initiation. In this step, the mRNA, a tRNA carrying the first amino acid of the polypeptide, and the two ribosomal subunits come together to start the process. The small subunit first binds to both the mRNA and a specific initiator tRNA, which carries the amino acid methionine (Met). Next, the subunit scans along the mRNA strand until it reaches the start codon AUG, which signals the start of translation. The start codon also establishes the reading frame for the mRNA strand, which is crucial to synthesizing the correct protein. A shift in the reading frame results in mistranslation of the mRNA. The initiator tRNA then binds to the start codon via hydrogen bonding.
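
The way the start codon fixes the reading frame can be shown with a short example. This is an illustrative sketch only: the function names `find_start_codon` and `split_codons` and the example sequence are made up for this purpose, and scanning is reduced to a simple left-to-right search for the first AUG.

```python
def find_start_codon(mrna):
    """Return the index of the first AUG in the sequence, or -1 if there is none."""
    return mrna.find("AUG")

def split_codons(mrna, frame_start):
    """Split the sequence into complete codons, beginning at frame_start."""
    return [mrna[i:i + 3] for i in range(frame_start, len(mrna) - 2, 3)]

mrna = "GGCAUGGCUUAA"          # hypothetical sequence
start = find_start_codon(mrna)

in_frame = split_codons(mrna, start)      # frame set by the start codon
shifted = split_codons(mrna, start + 1)   # same bases, frame shifted by one

print(in_frame)
print(shifted)
```

Shifting the frame by a single nucleotide changes every downstream codon, which is why a frame shift leads to mistranslation of the whole message.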

The complex consisting of the mRNA, the initiator tRNA, and the small ribosomal subunit attaches to the large ribosomal subunit, which completes the initiation complex. These components are brought together with the help of proteins called initiation factors, which bind to the small ribosomal subunit during initiation. In addition, the cell spends energy in the form of GTP to help form the initiation complex. Once the initiation complex is complete, the initiator tRNA sits in the P site of the ribosome, and the empty A site is ready for the next aminoacyl-tRNA. The polypeptide is always synthesized in one direction, from the N-terminus to the C-terminus.

In bacteria, protein synthesis is initiated by the binding of proteins called initiation factors (IFs) to the small 30S subunit, followed by the binding of fMet-tRNA. (fMet is methionine with a formyl group (HCO-) added to its amino group; the formyl group is later removed.) Three small initiation factors (IF1, IF2, and IF3) are required for the initiation of protein synthesis in bacteria. IF3 first brings the mRNA and the 30S ribosomal subunit together, which allows the ribosome-binding site to find its complementary site on the 16S rRNA. Then IF1 binds to and blocks the A site. IF2 bound to GTP then escorts the initiator N-formylmethionyl-tRNA to the start codon located at the P site. IF3 is released when the initiator tRNA is in place. The 50S subunit then docks onto the 30S subunit, GTP is hydrolyzed, and IF1 and IF2 are released as well. The complex of IFs, the 30S subunit, and fMet-tRNA binds to the mRNA to be translated. To establish the reading frame, bacterial mRNA has a purine-rich sequence of about 8 nucleotides called the Shine-Dalgarno sequence, which is complementary to the rRNA of the small (30S) subunit and helps in the initial binding of the mRNA to the 30S complex. The Shine-Dalgarno sequence lies very close to the AUG codon, where translation starts. The Shine-Dalgarno sequence is found throughout the bacteria.

The process of initiation of translation is complete when the large (50S) subunit binds to the IFs-30S-fMet-tRNA complex. The binding of the 50S subunit is an energy-requiring process. First, the energy-rich GTP binds to the protein IF2 (initiation factor 2), forming the GTP-IF2 complex. The GTP-IF2 complex participates in the formation of the final protein-synthesizing 70S complex, during which the GTP is hydrolyzed to GDP. In eukaryotes, additional initiation factors (eIFs) are involved.
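
How a ribosome-binding site just upstream of an AUG marks the true start can be sketched as a simple sequence search. The code below rests on several assumptions that do not come from the text: the motif `AGGAGG` (a commonly cited E. coli consensus core), the allowed spacing of 4 to 12 nucleotides between the motif and the AUG, the function name `find_sd_start`, and the example sequence are all illustrative. Real recognition depends on base-pairing with the 16S rRNA and tolerates mismatches.

```python
def find_sd_start(mrna, motif="AGGAGG", min_gap=4, max_gap=12):
    """Return (motif_index, aug_index) for the first AUG that has the motif
    min_gap..max_gap nucleotides upstream, or None if no such AUG exists."""
    aug = mrna.find("AUG")
    while aug != -1:
        for gap in range(min_gap, max_gap + 1):
            motif_index = aug - gap - len(motif)
            if motif_index >= 0 and mrna[motif_index:motif_index + len(motif)] == motif:
                return motif_index, aug
        aug = mrna.find("AUG", aug + 1)  # try the next AUG downstream
    return None

# Hypothetical mRNA: the first AUG (index 2) has no motif upstream of it,
# so the search moves on to the AUG that follows the motif.
mrna = "GGAUGCC" + "AGGAGG" + "UAACAU" + "AUGGCU"
print(find_sd_start(mrna))
```

In this example the first AUG is skipped, which illustrates how a nearby ribosome-binding site, rather than the AUG alone, identifies the start codon.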


Elongation begins with an aminoacyl-tRNA being delivered to the A site as a ternary complex with an elongation factor (EF-Tu in bacteria) and GTP. The anticodon of the incoming tRNA base-pairs with the complementary mRNA codon at the A site. The tRNA carries the amino acid that the mRNA codon codes for. The code that relates each codon to its amino acid is known as the genetic code. Hydrolysis of GTP to GDP, with the release of Pi, is required for codon recognition and contributes to the accuracy and efficiency of the process. Next, a new peptide bond is formed between the amino group of the new amino acid in the A site and the carboxyl group of the growing polypeptide. Peptide bond formation is catalyzed by the rRNA of the large subunit. The peptide chain has now been elongated by one amino acid. The tRNA carrying the elongated polypeptide chain is then moved from the A site to the P site, while the empty tRNA in the P site is moved to the E site and exits. The cycle is repeated for each subsequent codon: an aminoacyl-tRNA enters the A site and base-pairs with the codon, GTP is hydrolyzed to GDP, a peptide bond connects the new amino acid to the polypeptide chain, the empty tRNA moves to the E site and exits to be recharged for another “trip”, and the tRNA carrying the elongated chain moves to the P site, where it awaits the next incoming tRNA. In this way amino acids are added one by one to the preceding amino acid.
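
The repeating elongation cycle can be traced through the three tRNA sites with a small simulation. This is a toy sketch under stated assumptions: the function name `elongate`, the dictionary used for the sites, and the two-codon decoding table are hypothetical, and the GTP counter tracks only the hydrolysis event that the text associates with codon recognition.

```python
def elongate(codons, decode):
    """Step through delivery, peptide bond formation, and translocation.

    Returns the finished chain (list of amino acids) and the number of
    GTP molecules counted for codon recognition.
    """
    # After initiation, the initiator tRNA carrying Met sits in the P site.
    sites = {"E": None, "P": ["Met"], "A": None}
    gtp_used = 0
    for codon in codons:
        amino_acid = decode[codon]
        sites["A"] = [amino_acid]   # aminoacyl-tRNA delivered to the A site
        gtp_used += 1               # GTP hydrolysis on codon recognition
        # Peptide bond formation: the chain is transferred onto the A-site tRNA.
        sites["A"] = sites["P"] + sites["A"]
        sites["P"] = []             # P-site tRNA is now empty
        # Translocation: empty tRNA to E, peptidyl-tRNA to P, A is free again.
        sites["E"], sites["P"], sites["A"] = sites["P"], sites["A"], None
        sites["E"] = None           # empty tRNA leaves through the E site
    return sites["P"], gtp_used

peptide, gtp = elongate(["GCU", "UUU"], {"GCU": "Ala", "UUU": "Phe"})
print(peptide, gtp)
```

Each pass through the loop lengthens the chain by exactly one residue and leaves the A site empty for the next aminoacyl-tRNA.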

The release factors recognize and bind to the stop codon in the A site and activate the hydrolysis and release of the polypeptide from the P-site tRNA. The C terminus of the polypeptide chain, which is attached to the P-site tRNA, undergoes attack at the ester carbon by a water molecule, releasing the newly synthesized polypeptide. During elongation, by contrast, the ribosomal PTC catalyzes the aminolysis of the ester bond, in which the alpha-amino group of the A-site aminoacyl-tRNA attacks the carbonyl group of the P-site peptidyl-tRNA. This happens because amines react quickly with esters to form peptide bonds.

Note: The P site is called as such because it only binds to Peptidyl-tRNA molecule, the tRNA anchoring the growing polypeptide chain. The A Site is called as such because it only binds to incoming Aminoacyl-tRNA molecules, the tRNA that contains the free amino acid.


The elongation process stops when any stop codon, a codon signaling the end of translation, reaches the A site of the ribosome. The A site will not accept any incoming tRNA, leaving it free to bind to the release factor. The release factor will hydrolyze the bond between tRNA and polypeptide in the P site, releasing the polypeptide chain. Subsequently, the two ribosomal subunits, release factor, and mRNA come apart when their jobs are done. The polypeptide is released from the mRNA template and allowed to fold into its final 3D conformation.
Also, the ribosome arrives at the end of the coding region, not the end of the RNA. So, the end of the coding region is marked by one of the three stop codons. The formation of the last peptide bond and the subsequent translocation of mRNA leads to ejection of tRNA in the E site and also brings the stop codon into the A site.

Bacteria have two class I release factors that decode the three stop codons: UAG, UAA, and UGA. The class II release factor promotes the dissociation of the class I release factor from the post-termination ribosomal complex after peptidyl-tRNA hydrolysis. Eukaryotes and archaea have a single class I release factor that can decode all three stop codons.
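
The division of labor among class I release factors can be expressed as a lookup table. The factor names are not given in the text above and are supplied here from general knowledge: in bacteria RF1 recognizes UAA and UAG while RF2 recognizes UAA and UGA, and in eukaryotes eRF1 recognizes all three. The function name `class_one_release_factors` and the `domain` argument are illustrative choices, and archaea are left out of this sketch.

```python
STOP_CODON_DECODERS = {
    "bacteria": {"RF1": {"UAA", "UAG"}, "RF2": {"UAA", "UGA"}},
    "eukaryote": {"eRF1": {"UAA", "UAG", "UGA"}},
}

def class_one_release_factors(codon, domain):
    """List the class I release factors able to decode the given codon."""
    factors = STOP_CODON_DECODERS[domain]
    return sorted(name for name, codons in factors.items() if codon in codons)

for stop in ("UAA", "UAG", "UGA"):
    print(stop, class_one_release_factors(stop, "bacteria"))
```

A sense codon such as AUG returns an empty list, reflecting that release factors act only when a stop codon occupies the A site.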

Stop Codon Sequence

Eukaryote-specific Ribosomal Features

Microscopy studies have shown that eukaryotic 40S and 60S ribosomal subunits are largely analogous to prokaryotic 30S and 50S ribosomal subunits. Indeed, a great many structural “landmarks” are conserved, including a central protuberance, two stalks, and a sarcin-ricin loop (SRL). However, within eukaryotic and archaeal ribosomes, subunits have undergone remodeling within specific regions (the 40S subunit, for example, is divided into the “head”, “beak”, “platform”, “body”, “shoulder”, “left foot”, and “right foot” regions). Remodeling through protein addition has occurred primarily on the solvent-exposed faces of both the 40S and 60S eukaryotic subunits. In the 40S subunit, rRNA expansion segments (ESs) have networked with a variety of eukaryote-specific protein components, resulting in structural linkage atop the subunit. Similarly, the solvent-exposed face of the 60S subunit possesses two expanded regions, each containing high concentrations of ESs and eukaryote-specific protein elements.

Tertiary contacts among the eukaryote-specific ribosomal proteins are essential in producing stable subunit configurations. These proteins and their extensions are responsible for the interconnections within both subunits. Within the 40S subunit, extensions of rpS10, rpS12, rpS21, and rpS7 link 11 proteins, creating a “daisy-chain” structure. Likewise, protein-mediated subunit connections are prevalent, with eukaryote-specific extensions forming networks of interaction. These extensions include β sheets and α helices, which may extend tangentially across the ribosomal subunits to interact with distant regions of the structure.

Differences between prokaryotic translation and eukaryotic translation

The picture of prokaryotic translation.

The picture of eukaryotic translation.

•In general, ribosomes involved in eukaryotic translation are about 30% larger than their prokaryotic counterparts.

•Relative to prokaryotic ribosomes, eukaryotic ribosomes require a very large number of assembly, maturation, and initiation factors. They are also subject to a great degree of regulation. While prokaryotic ribosome assembly and translational initiation are influenced by a handful of nonribosomal factors, eukaryotic ribosome development and translational initiation are modulated by approximately 200 maturation factors and a minimum of nine initiation factors, respectively.

•In prokaryotic translation, the P site is positioned directly on the AUG start codon at initiation, whereas in eukaryotic translation the small ribosomal subunit scans along the mRNA until it reaches the AUG.

•Prokaryotic translation is initiated at the Shine-Dalgarno (SD) sequence, a short stretch of bases in the mRNA that base-pairs with an anti-SD sequence located at the 3′ end of the 16S rRNA within the ribosome. Eukaryotic translation, on the other hand, does not involve SD sequences, relying instead on a “scanning mechanism” of sorts that involves Poly-A-Binding Protein (PABP).

•The initiating amino acid in eukaryotic translation is methionine. Conversely, the initiating amino acid in prokaryotic translation is N-Formylmethionine (fMet).

Increasing the Rate of Translation

The majority of proteins are synthesized in 20 seconds to a few minutes. This can be sped up through a number of different processes.

•A single mRNA can be bound by multiple ribosomes, with each ribosome synthesizing its own protein.

•Ribosomal subunits can be rapidly recycled by the cell. Once the ribosome has reached a stop codon toward the 3′ end of the mRNA strand, the subunits can reattach at the beginning and immediately begin synthesis of a new protein.

•Polysomes: a polysome is a cluster of ribosomes bound to a single mRNA molecule that is held in a circular form. Two proteins, bound to the 5′ cap and the poly(A) tail, make the mRNA circular: eIF4 binds to the 5′ cap, and Poly A Binding Protein I (PABPI) binds to the poly(A) tail. These two proteins associate with each other, closing the mRNA into a loop, which contributes to more efficient translation. The ribosomes in a polysome travel from the 5′ end to the 3′ end and then have only a short distance to jump to return to the 5′ end of the mRNA strand and repeat translation.

Nonsense Mutations in Translation

A nonsense mutation is a point mutation (a single base substitution) in a DNA sequence that introduces a premature stop codon into the sequence. For example, if the codon 5′ U A C 3′, which codes for the amino acid tyrosine, had a nonsense mutation in which the C was changed to a G, the new codon would be 5′ U A G 3′, which is a stop codon, and the protein would be truncated. If a nonsense mutation occurs, the cell can respond with nonsense-mediated decay.
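
The UAC to UAG example can be checked with a few lines of Python. This is a minimal sketch under simple assumptions: codons are written as RNA strings, and the helper name is_nonsense is hypothetical. It applies a single base substitution and reports whether a sense codon has become a stop codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def is_nonsense(codon, position, new_base):
    """True if replacing the base at `position` (0, 1 or 2) with
    `new_base` turns a sense codon into a stop codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    return codon not in STOP_CODONS and mutated in STOP_CODONS


print(is_nonsense("UAC", 2, "G"))  # True: UAC (tyrosine) becomes UAG (stop)
print(is_nonsense("UAC", 2, "U"))  # False: UAU is not a stop codon
```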

Nonsense-mediated decay degrades mRNAs that carry a nonsense mutation. The mRNA initially undergoes one round of translation, during which the cell recognizes that something is wrong because of the premature stop codon, and then degrades the mRNA. The nonsense mutation is identified through the exon junction complexes located along the mRNA.

Exon junction complexes are protein complexes that physically associate with the mRNA where two exons have been joined. During translation, as the ribosome moves along the mRNA strand, exon junction complexes are “bumped” off as they come in contact with the ribosome. If translation stops early because of a premature stop codon, the exon junction complexes downstream of it remain on the mRNA. This signals to the cell that something is wrong, and nonsense-mediated decay degrades the mRNA.
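
The rule described above can be written as a one-line test. The sketch below is a deliberate simplification: positions are plain nucleotide indices along the mRNA, the function name triggers_nmd is hypothetical, and any distance thresholds or additional factors the cell may use are ignored. It asks only whether an exon junction complex is left downstream of the point where translation stopped.

```python
def triggers_nmd(stop_index, junction_indices):
    """True if at least one exon junction complex lies downstream of
    the stop codon, i.e. was never displaced by the ribosome."""
    return any(junction > stop_index for junction in junction_indices)


# Hypothetical mRNA with exon junctions at positions 90 and 200.
print(triggers_nmd(120, [90, 200]))  # True: the complex at 200 remains
print(triggers_nmd(300, [90, 200]))  # False: both complexes were displaced
```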

Nonsense-mediated decay is not always beneficial. It does not check whether the truncated protein is functional; it only looks for nonsense mutations along the strand. One example can be seen in the disease cystic fibrosis.

Cystic fibrosis can result from a nonsense mutation in the gene “cystic fibrosis transmembrane conductance regulator,” or CFTR, which codes for a chloride channel. People without cystic fibrosis have at least one working copy of the CFTR gene; only one is needed to prevent the disease. A CFTR nonsense mutation produces a truncated protein, and the cell performs nonsense-mediated decay in response, degrading the CFTR mRNA. The degradation of CFTR mRNA results in few or no CFTR channels, causing cystic fibrosis. However, scientists have discovered that the truncated protein produced from the premature stop codon can still function as a CFTR channel. In this case, a response designed to protect the body ultimately ends up hurting it.


Slonczewski, Joan L. Microbiology. “An Evolving Science.” Second Edition.
Klinge, Sebastian; Voigts-Hoffmann, Felix; Leibundgut, Marc; Ban, Nenad. “Atomic structures of the eukaryotic ribosome.” Trends in biochemical sciences doi:10.1016/j.tibs.2012.02.007 (volume 37 issue 5 pp.189 – 198)
RNA modification must be performed in order to form the various proteins that eukaryotes need to function. RNA modification generates mature RNA. Through RNA modification, a eukaryotic cell can create proteins with diverse functions from a limited number of genes. One of the most common methods employed by eukaryotes is to use spliceosomes to cut out the introns (intervening, non-coding sequences) of the pre-mRNA, leaving only the exons (expressed, coding sequences). Cutting out the intron segments also allows for the possibility of exon rearrangement. The product of splicing all the exons together is called mature mRNA.

What’s the advantage of splitting genes?
Exons are segments that code for proteins and give proteins specific functions. This leads to the concept of exon shuffling: the rearrangement of exons in the mRNA. The resulting mRNAs encode different types of proteins with different functions, binding sites, and catalytic sites. This path could lead to the evolution of new proteins.

Even though introns and exons are quite common in eukaryotic genes, they are rare in prokaryotes. Studies of protein-coding genes whose DNA sequences have been conserved during evolution suggest that introns were once present in the ancestral genes of prokaryotes and were lost over time. One possible reason is that splicing is not time efficient, and time efficiency is very important for prokaryotes because they multiply at a very fast rate.

Schematic depiction of removal of introns from a strand of RNA

Introns are regions within the primary transcript that are removed. The name is short for intervening sequences. The regions that are retained are called exons.

Introns are removed from the primary transcript, the precursor mRNA (pre-mRNA), after the poly(A) tail and the 5′ cap have been added. Introns usually begin with guanine-uracil (GU) and end with adenine-guanine (AG) preceded by a pyrimidine-rich tract; these sequences signal splicing. Introns are spliced from the pre-mRNA by spliceosomes, which are made from proteins and small RNA molecules. Some introns are self-splicing, which means they have the ability to remove themselves from an RNA molecule. One advantage of having genes split by introns is that alternative splicing patterns allow the formation of proteins with varying functions without requiring a new gene for each such protein.

There are four types of introns: Group I introns, Group II introns, nuclear pre-mRNA introns, and transfer RNA introns. Group I introns are found in some rRNA genes and splice themselves out of the transcript. To be spliced, Group I introns must fold into a secondary structure with nine looped stems. Group II introns are found in mitochondria and chloroplasts. They are self-splicing as well, but they cut themselves out differently than Group I introns. They too fold into a secondary structure, but their splicing produces a lariat structure. A lariat is formed when an intron folds back on itself after the upstream exon is cut from it. Nuclear pre-mRNA introns are found in protein-encoding genes in the nucleus. Their removal requires snRNAs and a number of proteins. Transfer RNA introns are found in tRNA genes and need enzymes in order to be spliced out.

The average human gene has 8 introns, and some have more than 100. Introns range in size from 50 to 10,000 nucleotides and are typically longer than exons.

Evolutionary Differences in Existence of Introns

Introns are usually found in the genes of higher eukaryotes such as birds or mammals. Lower eukaryotes such as yeast have fewer introns, and prokaryotes rarely if ever have introns. Study of genes that have been highly conserved in evolution suggests that introns were present in most organisms long ago but were lost in organisms such as prokaryotes as an evolutionary measure to allow faster replication. The presence of introns is thought to contribute to the development of new genes through exon shuffling. The advantage of introns is that exons maintain their function but are able to interact in new ways. Without introns, a crossover would most likely result in a loss of function.

Discovery of Introns

Introns (intervening sequences) were discovered in 1977 by Phillip Sharp and Richard Roberts, who found that several genes are discontinuous. Electron microscopic studies of mRNA hybridized to DNA segments showed the presence of introns. If the gene were continuous, a single stretch of one DNA strand would be displaced by the mRNA. However, it was observed that the strands were displaced in some regions but remained double-stranded in others, thus demonstrating the existence of introns.


Info. obtained (Berg, Stryer, et al.)

General information

Exons are protein-coding segments and one of the two major divisions of DNA that is transcribed into RNA. All genes start with an exon, and the exons are often interrupted by introns (non-protein-coding segments). Exons received their name because they exit the nucleus and allow the DNA sequences to be expressed (the prefix ex- comes from “expressed”)1. Exons are the parts that actually contain the code for particular portions of a protein. The number of exons in a gene can vary from one species to another. Before the functional mRNA is formed, a splicing complex called the spliceosome cuts the pre-mRNA at the exon boundaries, removes all the introns, and connects the exons to each other. The joined exons travel out of the nucleus, where they are eventually translated into proteins.


Exons were first discovered in 1977 by the molecular biologists and 1993 Nobel Prize winners Richard Roberts and Phillip Sharp. They used electron microscopy to study mRNA and DNA hybrids. In the absence of introns, the entire region that is hybridized to the mRNA would be displaced. In their experiment, they observed regions that were not displaced, creating a loop that is indicative of an intron. Initially it had been assumed that the sequences of DNA and mRNA were identical and continuous. The results of their experiments revealed that DNA has stretches of bases not present in mRNA. On that basis they explained the nature of exons.

Roles of exons

With the help of introns, exons can undergo recombination or exon shuffling. Crossovers occur in random, but homologous, positions at a frequency that depends on DNA length. Exon shuffling is a natural process that allows the formation of new functional proteins by creating new arrangements and thus new interactions with minimal risk to the sequence encoding of the functional parts. In the absence of introns, crossovers are likely to disrupt the exon sequence and often create a loss of function.
Moreover, exon shuffling can produce new and useful proteins, which contributes to evolution. Exons encode different domains of the protein products, which are expressed through the processes of transcription, RNA processing, and translation.


1 Jerry Bergman, The Functions of Introns: From Junk DNA to Designed DNA[73]

RNA splicing is a modification of RNA that takes place as the primary transcript is converted into mRNA. In splicing, the introns are cut out and the remaining sequences (called exons) are joined together. This modification occurs in the nucleus, before the RNA is moved to the cytoplasm.

Splicing happens in all the domains of life, but types of splicing differ immensely between the major divisions. Eukaryotes splice many protein-coding messenger RNAs and some non-coding RNAs. Prokaryotes, on the other hand, splice rarely, and when they do, it is mostly non-coding RNAs.

Discovery of RNA Splicing

RNA splicing was discovered by two scientists, Phillip Allen Sharp and Richard J. Roberts, who were awarded the 1993 Nobel Prize in Physiology or Medicine for their achievement. The discovery resolved an earlier paradox: scientists had found RNA in the nucleus that was unusually long compared to the mRNA found in the cytoplasm of the cell. The strange nuclear RNA had a 5′ end containing a cap structure and a 3′ end containing a polyadenosine [poly(A)] tract, the same structures found in the shorter cytoplasmic mRNA. The subsequent discovery of splicing explained how the short mRNA could have the same termini as the longer nuclear RNA: the termini were the same, but the lengths differed because introns had been removed from the middle of the strand. Introns, it was discovered, can also be a problem for the cell; for example, nearly a quarter of all mutations in globin genes responsible for beta-thalassemia cause problems in splicing.

It became apparent, through the development of reactions that reproduced RNA splicing, that splicing proceeds through a branched lariat RNA and that small RNAs are integral to the process. Later it was found that these small nuclear RNAs (snRNAs) are assembled into particles found in spliceosomes. Via an intermediate made up of the lariat RNA and the 5′ exon RNA, the spliceosome removes the intron.


In eukaryotes, genes are transcribed into pre-messenger RNAs comprising both introns and exons. To produce mature mRNAs, the introns are trimmed off and the exons joined back together by the spliceosome, the molecular tailor of the cell. The spliceosome can alter its snipping and stitching in order to generate a variety of mRNAs from a single bolt of pre-mRNA cloth. Alternative splicing is the process by which the spliceosome produces multiple mRNA isoforms from a single pre-mRNA. Alternative splicing has expanded evolutionary possibilities in complex multicellular organisms without an increase in gene number.

The spliceosome is considered one of the most complicated macromolecular machines in the eukaryotic cell. It involves hundreds of RNA and protein interactions, particularly in its assembly and disassembly pathways. The main role of the spliceosome in eukaryotes is to produce messenger RNAs. Genes are transcribed as precursors to mRNAs, called pre-mRNAs, and the mRNAs are then generated by snipping out introns and stitching exons together. Introns are regions of the pre-mRNA that are cut out by spliceosomes and can serve as a source of non-coding RNAs. Exons code for proteins, so they are usually retained. Furthermore, a spliceosome can snip and stitch in different ways to create different mRNAs, which has allowed organisms to increase in complexity without a corresponding increase in gene number. Spliceosomes are considered so complicated because they are responsible for properly recognizing and processing a large number of sequences, and because they have many parts: in budding yeast, for example, the spliceosome comprises five small RNAs and up to 100 different polypeptides. To make things more complicated, humans also use a second splicing apparatus, the minor spliceosome. The study of this complex machinery was long held back by the limitations of working in vivo. However, novel approaches to studying splicing have been developed, such as in vitro assembly and purification of active spliceosomes and microscopic visualization of single spliceosomes. These methods are more specific and widen the scope of study to anywhere from hundreds of RNA molecules down to a single one. One of the new methods involves using microarrays.

Splicing-dependent microarrays allow researchers to distinguish which features of the splicing cycle are universal and which are specific to particular pre-mRNAs. DNA arrays designed to differentiate between spliced and unspliced RNAs are probed with cellular RNA. Analysis of the splicing response shows how the loss of activity of a specific protein directly affects the splicing of individual pre-mRNAs. Overall, microarray studies have shown the importance of pre-mRNA identity in the splicing response.

Analysis of the spliceosome has often run into significant obstacles to understanding its more detailed mechanisms. To overcome such difficulties, laboratories have employed in vitro methods, such as observing single pre-mRNA molecules and purifying active spliceosomes for study with well-characterized enzymes under controlled conditions. Although the approaches are relatively distinct, each contributes a complementary and synergistic view that advances knowledge of the splicing machinery.

The spliceosome is a complex macromolecular machine consisting of small nuclear ribonucleoprotein particles (snRNPs), namely U1, U2, U4, U5, and U6, as well as roughly 100 separate splicing factors. The snRNAs range in length from 107 to 210 nucleotides and associate with proteins to form the snRNPs. Each snRNP contains a single snRNA and multiple proteins.

Splicing is carried out in multi-megadalton complexes. The spliceosome is assembled from its components in an ordered manner. First, the U1 snRNP binds to the 5′ splice site (SS). At the same time, branchpoint bridging protein (BBP) and Mud2 bind to the branch site. Then, the U2 snRNP displaces BBP/Mud2 and binds to the branch site. Next, the U4/U6.U5 tri-snRNP joins the complex. Before splicing, U1 and U4 leave the complex and Prp19 binds. After the first step of splicing is completed, the spliceosome undergoes a conformational change for the ligation step. After ligation, the components of the spliceosome disassemble and are recycled. Therefore, each spliceosome is a single-turnover enzyme.

From all the new and developing techniques, it is now clear that the spliceosome cycle is far from simple; rather, the spliceosome is an “extraordinary dynamic and flexible machine.” New technologies constantly bring new evidence of the detailed reversible and irreversible interactions, kinetics, and mechanisms involving the pre-mRNA substrate. Much about the spliceosome and its process is still unknown. Structural information is still limited, which means many of its functional details are unavailable, and the same is true of its kinetics. However, research on this dynamic machinery continues to develop new methods and to uncover new information about its function.

Pre-mRNA Splicing Process

Splicing requires three sequences in the intron. One end of the intron is the 5′ splice site and the other end is the 3′ splice site; at these sites are short consensus sequences. The order of the exons in the mRNA usually matches their order in the corresponding DNA. The process is aided by the small RNA molecules of the spliceosome, which recognize the beginning of the intron (usually GU) and the end (usually AG) and help catalyze splicing at these sites. Changing a single nucleotide at these sites may prevent splicing from occurring. There are also self-splicing introns. The third sequence important to splicing is located at the branch point, an adenine nucleotide that lies 18 to 40 nucleotides before the 3′ splice site. Deletion or mutation of the adenine nucleotide at the branch point would prevent splicing. Splicing occurs in large structures called spliceosomes.
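
The three sequence requirements can be expressed as a simple check on an intron. The sketch below is an assumption-laden illustration rather than a real splice-site predictor: the function name has_splice_signals is hypothetical, the intron is given as an RNA string running from the 5′ splice site to the 3′ splice site, and "18 to 40 nucleotides before the 3′ splice site" is read as a distance counted back from the end of the intron. Real splice sites involve longer consensus sequences than the two-letter ends tested here.

```python
def has_splice_signals(intron):
    """Check the three signals described in the text:
    GU at the 5' end, AG at the 3' end, and an adenine
    18 to 40 nucleotides before the 3' splice site."""
    if not (intron.startswith("GU") and intron.endswith("AG")):
        return False
    end = len(intron)
    # Bases whose distance from the 3' end is between 18 and 40.
    window = intron[max(0, end - 40):max(0, end - 17)]
    return "A" in window


# Hypothetical 35-nucleotide intron with an A 23 nucleotides from its 3' end.
intron = "GU" + "C" * 10 + "A" + "C" * 20 + "AG"
print(has_splice_signals(intron))                         # True
print(has_splice_signals("GU" + "C" * 31 + "AG"))         # False: no branch point A
```

Changing the first G or the last G of the example to another base makes the check fail, mirroring the statement that a single nucleotide change at a splice site may prevent splicing.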

Before splicing takes place, an intron lies between exon 1 and exon 2. Pre-mRNA is spliced in two steps. In step one, the pre-mRNA is cut at the 5′ splice site, separating exon 1 from the intron. The 5′ end of the intron then attaches to the branch point, folding back on itself and forming a structure called a lariat. The folding back occurs when the guanine nucleotide in the 5′ consensus sequence bonds with the adenine nucleotide at the branch point through transesterification. In step two, a cut is made at the 3′ splice site and the 3′ end of exon 1 is attached to the 5′ end of exon 2. The intron is released as a lariat, becomes linear when the bond at the branch point breaks, and is then degraded by nuclear enzymes. Finally, the mature mRNA, consisting of only the exons spliced together, is moved to the cytoplasm and translated.
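
The net result of the two steps, with the intron removed and exon 1 joined to exon 2, can be sketched as string manipulation. The example is illustrative only: the function name splice, the sequences, and the intron coordinates are invented, and the lariat chemistry itself is not modeled.

```python
def splice(pre_mrna, introns):
    """Remove introns and join the remaining exons in order.

    `introns` is a list of (start, end) index pairs, with `end`
    exclusive, marking each intron within `pre_mrna`.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[cursor:start])  # exon before this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1, a 13-nucleotide intron (GU...AG), exon 2.
pre_mrna = "AUGGCC" + "GUAAGUCUAACAG" + "UUUUAA"
print(splice(pre_mrna, [(6, 19)]))  # AUGGCCUUUUAA
```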

It is important to note that the 5′ cap greatly affects pre-mRNA processing and mRNA export. Its removal is regarded as the first irreversible step in mRNA decay, and so affects gene expression as a whole.

Splice sites of mRNA precursors

Pre-mRNA Splicing: Constitutive vs Alternative

Constitutive splicing refers to mRNA being spliced by the spliceosome in exactly the same way every time.

Alternative splicing allows for different expression of genes through SR proteins, which select alternative sites for splicing, using different exons or expressing them in a different order. By choosing combinations of alternative splice sites, protein isoforms can be created that are structurally and functionally distinct. It is estimated that at least 75% of human genes undergo this mechanism.
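
To give a feel for how combinations of splice sites multiply the number of possible products, the sketch below enumerates isoforms produced by exon skipping alone. It is a simplified assumption, not a description of how SR proteins choose sites: the first and last exons are always kept, internal exons are either included or skipped, exon order is preserved, and the function name exon_skipping_isoforms and the exon labels are hypothetical.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """Return every isoform that keeps the first and last exon and
    includes any subset of the internal exons in their original order.
    Expects at least two exons."""
    first, *internal, last = exons
    isoforms = []
    for count in range(len(internal) + 1):
        for kept in combinations(internal, count):
            isoforms.append("-".join([first, *kept, last]))
    return isoforms


print(exon_skipping_isoforms(["E1", "E2", "E3", "E4"]))
# ['E1-E4', 'E1-E2-E4', 'E1-E3-E4', 'E1-E2-E3-E4']
```

With n internal exons this simple scheme already yields 2 to the power n isoforms from one gene, which is one way to see how alternative splicing adds protein diversity without adding genes.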

Another form of alternative processing uses multiple 3′ cleavage sites: a pre-mRNA sequence may contain two or more potential sites for cleavage. However, this may or may not produce different proteins.

RNA splicing

RNA alternative splicing

Alternative splicing can occur under cellular Stress

Because alternative splicing can control gene expression, it is an important mechanism that a cell can use in response to certain stressful environmental and pathological cellular conditions such as heat, cold, UV-light, oxygen, ion balance, infections, inflammation, fever, etc.

Splicing factors can be enhancing (recognizing positive sequence elements) or silencing (recognizing negative sequence elements). These sequence elements can be exonic or intronic, depending on whether they lie in sequence that is included (exonic) or left out (intronic). The splicing enhancers are typically bound and activated by SR proteins. SR proteins are “serine/arginine-rich”; they are a group of proteins, highly conserved throughout evolution, that participate in both alternative and constitutive splicing. They are involved in regulating and selecting the splice sites.


Alternative or unconventional mRNA splicing can be part of adaptive stress responses in certain cell organelles, such as the endoplasmic reticulum (ER). Moreover, abnormal mRNA splicing could also be related to cell apoptosis. Under stress conditions, unfolded proteins accumulate in the ER and form aggregates. These abnormal agglomerations engage a response process called the unfolded protein response (UPR), which is triggered by a few different stress sensors that reside in the ER. One of those sensors is inositol-requiring enzyme 1α (IRE1α), a type I transmembrane protein which, once activated, initiates the unconventional splicing of the mRNA that encodes the transcription factor X-box binding protein 1 (XBP1), leading to the translation of a more stable spliced form of XBP1 (XBP1s). XBP1s translocates to the nucleus, where it controls the upregulation of a subset of UPR-related genes linked to protein folding, quality control, ER-associated degradation (ERAD), and ER/Golgi biogenesis. Furthermore, prolonged ER stress leads to the inactivation of IRE1α signaling, which in turn is associated with the attenuation of XBP1 mRNA splicing, a process that could sensitize cells to apoptosis.

For heat-shock protein 47 (HSP47), the selection of the 5′ splice site in the non-coding region of the pre-mRNA is performed more efficiently under heat shock. In cold shock, alternative pre-mRNA splicing is induced in neurofibromatosis type 1 (NF1), which brings about a cryptic exon. Stress-induced long-term neuronal hypersensitivity is associated with stress-induced alternative splicing of the pre-mRNA of neuronal acetylcholinesterase (ACHE).

Impact of Heat Shock Stress

Many types of stress, including heat shock, can immediately block many crucial metabolic processes, such as DNA replication, until recovery. Heat shock proteins (HSPs) help protect cells from injury and aid cell recovery after heat shock conditions subside. The blocking of pre-mRNA splicing in heat-shocked cells is well characterized. HSP expression is not affected, however, because HSP genes do not contain any introns.

SRp38 is an SR protein that, when overexpressed, antagonizes the activity of the SR protein SF2/ASF (splicing factor 2/alternative splicing factor). SRp38 is unique in that, when phosphorylated, it activates sequence-specific splicing that requires an as-yet-unidentified cofactor. This activity stems from SRp38’s entry into a complex with U1 and pre-mRNA, which strengthens the interaction of U1 and U2 with the pre-mRNA. When dephosphorylated after heat shock, SRp38 is a strong splicing repressor; after mild heat shock it is rephosphorylated, accompanying the return of splicing activity.

Nuclear stress bodies (nSBs) are proposed to control splicing activity under stress by bringing a set of splicing factors to the region where they bind to SatIII transcripts. The nSBs are the sites of accumulation of heat-shock factor 1 (HSF1) in human cells, and appear fleetingly after mild heat shock and after chemical and hypertonic stresses. They are also the site of accumulation of pre-mRNA splicing factors (SF2/ASF, 9G8, SRp30c for the adenoviral E1A gene). The nSBs are assembled on regions of chromatin that consist of long satellite III (SatIII) DNA. After heat shock, chromatin reorganization occurs along with HSF1-driven transcription of SatIII RNAs. Recruitment of the SF2/ASF and SRp30c proteins requires the stress-induced SatIII transcripts. Reducing this transcription blocks recruitment of the SR protein splicing factors.

Structural Insights into RNA Splicing

The study of introns and alternative splicing has led to an additional classification of introns in both eukaryotes and prokaryotes. Two groups of self-splicing introns have been discovered. Group I introns were the first self-splicing ribozymes to be discovered; Group II introns were reported later. Group II introns have highly complex RNA structures and a uniquely diverse range of chemical reactivity. They can catalyze 2′-5′ bond formation and can retrotranspose into DNA, which requires the help of intron-encoded proteins. Analysis of the Group II secondary structure revealed six structural domains. Domain V is the most phylogenetically conserved (most similar among various groups of organisms). The lower helix of Domain V possesses a catalytic triad of nucleotides that is very similar to one in the U6 spliceosomal RNA. This has led to the belief that Group II introns share a common ancestor with nuclear introns and the eukaryotic spliceosome, and has prompted further meticulous study of Group II introns.

Study of Group II Introns

Group II introns have been further classified into three main structural classes based on their RNA secondary structure: IIA, IIB, and IIC. Analyses of reverse transcriptases have shown that Group IIC introns are the most primitive and simplest lineage of the Group II introns. The secondary structure of the IIC lineage is much simpler than that of IIA and IIB. Because Group IIC introns are much smaller and simpler, they are preferred for crystallization studies. The Group II intron from Oceanobacillus iheyensis was the first to be successfully crystallized and examined via X-ray diffraction. Group IIC also differs significantly from Groups IIA and IIB in that the nucleophile used during the first step of splicing is a water molecule, whereas Groups IIA and IIB use the 2′-OH of an adenosine in Domain VI. Because Group IIC uses a water molecule, its introns are released as linear molecules, while Group IIA and IIB introns are released as branched lariat species. Work on the Group IIC intron has further suggested that the active site is composed of the bulge and catalytic triad of Domain V, mentioned above. These regions have been shown to be influenced by the binding of catalytic metal ions, e.g. magnesium. Such metal ion interactions are also very common in proteins, where they influence shape, e.g. Fe in hemoglobin. The ion helps modulate the electrostatic environment at the core of the intron or protein. Metal ions likewise take part in making and breaking phosphodiester linkages in DNA and RNA polymerases.

Possible Splicing Errors

Although the rate of splicing errors is very low, errors do occasionally occur, and they can arise in several ways. A splicing error most often results from a mutation of a splice site, which can cause the site to lose its function. Exposure of a premature stop codon, or a misplaced or wrongly inserted exon or intron, can likewise alter the product. A shift in the splice location can cause the wrong amino acids to be read, and a protein with wrong amino acids may have reduced specificity. All such mutations can result in wrongly constructed proteins, which can be life threatening, e.g. in sickle cell anemia or cystic fibrosis. Fortunately, many splicing errors are guarded against by nonsense-mediated mRNA decay.

Nonsense-Mediated Decay

Nonsense-mediated decay (NMD) is a cellular mRNA surveillance mechanism that detects incorrectly spliced transcripts and prevents the expression of incorrect proteins. After transcription, the mRNA assembles with proteins into a ribonucleoprotein (RNP). NMD depends on exon junction complexes (EJCs), protein complexes deposited on the mRNA near exon-exon junctions during mRNA processing. Exon junction complexes located past a nonsense codon act as tags on the mRNA RNP: their presence downstream of a stop codon signals that the stop codon is premature and that the message is disorganized or wrongly spliced. The tagged mRNA is exported from the nucleus to the cytosol, where it is degraded.
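
The tagging rule described above can be sketched in a few lines of Python. This is a deliberately simplified toy model, not a description of the real pathway: positions are treated as plain nucleotide indices along the mRNA, and the function name `is_nmd_target` and its parameters are hypothetical names chosen for illustration.

```python
def is_nmd_target(stop_codon_pos, ejc_positions):
    """Toy rule: flag an mRNA for decay if any exon junction complex
    (EJC) sits downstream of (past) the stop codon.

    stop_codon_pos -- index of the first stop codon along the mRNA (assumed input)
    ejc_positions  -- indices where EJCs are bound (assumed input)
    """
    return any(pos > stop_codon_pos for pos in ejc_positions)


# A normal transcript: the stop codon lies after the last junction.
normal = is_nmd_target(stop_codon_pos=900, ejc_positions=[120, 450])

# A premature stop codon: one EJC is still downstream of it.
faulty = is_nmd_target(stop_codon_pos=300, ejc_positions=[120, 450])
```

In this sketch `normal` evaluates to False and `faulty` to True, mirroring the idea that an EJC left past a nonsense codon marks the message as wrongly spliced.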



Small Nuclear Ribonucleoprotein Particles (snRNPs)

Small nuclear ribonucleoprotein particles (snRNPs) are complexes made of small nuclear RNAs (snRNAs) and specific proteins. snRNPs make up the larger spliceosome. The U1 snRNP recognizes the 5′ splice site: a six-nucleotide sequence of the U1 snRNA base pairs with the splice site on the pre-mRNA. From this point, the spliceosome assembles along the pre-mRNA molecule. The U2 snRNP binds to the branch site of the intron through base pairing between the U2 snRNA and the pre-mRNA. The U4, U5, and U6 snRNPs then join the U1 and U2 complexes to form the complete spliceosome.
The splicing process itself begins with the interaction of U5 with the exon sequence at the 5′ splice site. After breaking from U4, U6 undergoes an intramolecular reorganization that allows it to base pair with U2 and interact with the 5′ end of the intron, displacing U1 from the spliceosome. The U2-U6 complex forms a helix that makes up the center of the spliceosome. U4 thus prevents U6 from acting until the splice sites are correctly aligned. Once alignment has occurred, the first transesterification reaction cuts the phosphodiester bond at the 5′ exon and produces a lariat intermediate. Rearrangements within the spliceosome then set up the second transesterification reaction on the pre-mRNA: U5 aligns the 5′ exon with the 3′ splice site so that the exon can attack it more easily, producing the spliced product. To finish the splicing process, U2, U5, and U6 are released from the lariat intron.

snRNP Biogenesis Cycle

The biogenesis of snRNPs begins with the transcription of a monomethylguanosine (m7G)-capped snRNA precursor by RNA polymerase II. Following its transcription, the snRNA is exported out of the nucleus, where it associates with Sm proteins to form the Sm core domain. This then triggers hypermethylation of the m7G cap, generating the 2,2,7-trimethylguanosine (TMG, m3G) cap. The two-part nuclear localization signal (NLS), consisting of the Sm core domain and the TMG cap, directs the snRNP back to the nucleus. After re-entering the nucleus, the snRNP completes the biogenesis cycle in subnuclear domains called Cajal bodies. It is still largely unknown which proteins join the snRNP at which stage of the biogenesis cycle. The U6 snRNP does not follow the steps above and is thought to carry out its biogenesis within the nucleoplasm.

snRNP Assembly Factors

Current work on snRNPs has revealed cellular assembly strategies for RNA-protein complexes. snRNPs form in vivo through the coordinated action of a complex assembly line containing assembly chaperones, scaffolding proteins, and catalysts. RNP assembly factors serve two functions. First, they increase assembly efficiency by promoting the accumulation of higher-order building blocks. Second, they prevent Sm proteins from aggregating and from assembling snRNP cores on the wrong RNA, or on RNAs that do not contain an Sm site. Various reports have described this as a ‘proofreading’ function of the assembly machinery. These strategies resemble those used for protein complexes and point to common rules for how molecular machines are made in vivo.


RNA degradation is an ancient and important process. It is a normal physiological process by which the cell maintains a balanced RNA concentration. To do this, cells produce an abundance of RNases, or ribonucleases, enzymes that break RNA apart. RNases play significant roles in defending the cell against RNA viruses, eliminating old and non-viable RNAs, and processing RNA during gene expression; they are also used as tools to manipulate nucleotides for RNA sequencing. Viable RNAs avoid degradation by RNases by protecting themselves with “armor” such as the G-cap and the poly-A tail. There are also RNase inhibitors that bind to RNases and prevent them from cleaving RNA strands.



A ribonuclease is an enzyme that cleaves the phosphodiester bonds between the nucleotides that form the RNA backbone. A phosphodiester bond in a single RNA strand links the 3′ carbon atom of one ribose sugar to the 5′ carbon atom of the ribose of the adjacent nucleoside. The functions of these enzymes overlap with those of the small nuclear RNAs that act on mRNAs, in that they drive RNA degradation by breaking RNA into shorter partial strands.
Ribonucleases can be grouped into two categories: endoribonucleases and exoribonucleases.

Two RNA nucleotides with the bases adenine and cytosine.


Endoribonucleases, much like restriction endonucleases, are able to recognize certain nucleotides within an RNA and cleave at a specific site. These enzymes can cleave both single- and double-stranded RNAs.


Exoribonucleases cleave RNA by removing terminal nucleotides from either the 5’ end or the 3’ end of the RNA strand. Enzymes that remove nucleotides from the 5′ end are called 5′-3′ exoribonucleases, and enzymes that remove nucleotides from the 3′ end are called 3′-5′ exoribonucleases.
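
The two directions of exoribonuclease action can be illustrated with a short Python sketch. It treats an RNA as a plain string written 5′ to 3′ and ignores all chemistry; the function names `exo_5_to_3` and `exo_3_to_5` are hypothetical and chosen only for this example.

```python
def exo_5_to_3(rna, n):
    """Remove n terminal nucleotides from the 5' end of an RNA string
    written 5'->3', as a 5'-3' exoribonuclease would."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return rna[n:]


def exo_3_to_5(rna, n):
    """Remove n terminal nucleotides from the 3' end of an RNA string
    written 5'->3', as a 3'-5' exoribonuclease would."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # max(..., 0) keeps the slice valid when n exceeds the RNA length.
    return rna[:max(len(rna) - n, 0)]


print(exo_5_to_3("GGCAG", 2))  # CAG
print(exo_3_to_5("GGCAG", 2))  # GGC
```

An endoribonuclease, by contrast, would cut at an internal position and leave two fragments rather than trimming an end.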

Both endoribonucleases and exoribonucleases can be further divided into subclasses based on their chemical cleavage mechanisms, such as phosphorolytic and hydrolytic activity. They exist in all three domains of life: bacteria, archaea, and eukaryotes. They are involved in the degradation of many different RNA species, such as messenger RNA, transfer RNA, ribosomal RNA, and microRNA.

The left side shows the hydrolytic activity of RNases, in which a water molecule attacks the 3′ ester bond between the phosphate group and the 5′-OH group of the adjacent sugar, breaking off one nucleotide. The right side shows the phosphorolytic activity of RNases, in which a phosphate group attacks the same bond, breaking off a nucleotide with two phosphate groups on it.


The left side shows a strand of mRNA protected by a G-cap and a poly-A tail. The Dcp1/2 protein recognizes the G-cap at the 5′ end of the RNA and removes it, after which the exoribonuclease Xrn1 starts chewing off the RNA strand from the 5′ end. The right side shows a deadenylase protein recognizing the poly-A tail and chewing it off. Then comes the exosome, another exoribonuclease, which degrades RNA in the 3′ to 5′ direction. The middle shows an endoribonuclease binding to a specific sequence within the RNA and cleaving it internally.

Important RNases

RNase A

RNase A was first crystallized more than 50 years ago. It was the first enzyme and the third protein whose amino acid sequence was determined. It is a single-chain protein of 124 amino acid residues that uses 19 of the 20 amino acids, lacking only tryptophan. The enzyme was found in the exocrine cells of the bovine pancreas. It is a very tough and highly stable enzyme, mostly because of its structure and folding pattern. Its molecular formula is C575H907N171O192S12, and its general structure consists of alpha helices and beta sheets cross-linked by four disulfide bridges. RNase A is a basic protein. It specifically attacks at the 3′ phosphate of a pyrimidine nucleotide. The cleavage involves two steps: 1) the 3′,5′-phosphodiester bond is cleaved to generate a 2′,3′-cyclic phosphodiester intermediate, and 2) the cyclic phosphodiester is hydrolyzed to a 3′-monophosphate.

Example: RNase A cleavage of pG-pG-pC-pA-pG yields two fragments, pG-pG-pCp and A-pG.
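
The cleavage rule in this example (cut on the 3′ side of each pyrimidine) can be sketched in Python. This is a minimal illustration that assumes complete digestion and tracks only the base sequence, not the phosphate groups; the names `PYRIMIDINES` and `rnase_a_fragments` are hypothetical.

```python
PYRIMIDINES = {"C", "U"}  # the pyrimidine bases found in RNA


def rnase_a_fragments(rna):
    """Split an RNA string (written 5'->3') after every pyrimidine,
    mimicking complete digestion by a pyrimidine-specific RNase."""
    fragments = []
    current = ""
    for base in rna.upper():
        current += base
        if base in PYRIMIDINES:
            # Cleavage on the 3' side of the pyrimidine ends this fragment.
            fragments.append(current)
            current = ""
    if current:
        # Whatever follows the last pyrimidine is the final fragment.
        fragments.append(current)
    return fragments


print(rnase_a_fragments("GGCAG"))  # ['GGC', 'AG']
```

The two fragments `GGC` and `AG` correspond to pG-pG-pCp and A-pG in the example above.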

RNase A can be activated by potassium and sodium salts and is inhibited by alkylation of His12 or His119, the residues needed to initiate RNA cleavage. It cleaves a strand of RNA in two stages, transphosphorylation followed by hydrolysis.

RNase P

Sidney Altman discovered and named RNase P while working at the Laboratory of Molecular Biology in Cambridge, England, where he studied the functions of tRNA. He proposed that altering spatial relationships in tRNA, either by inserting new nucleotides or by deleting existing ones, would affect or change the function of the tRNA. His experiments with E. coli showed that mutated tRNAs could not develop into fully mature tRNAs and therefore could not serve their function of delivering amino acids during protein synthesis. However, these dysfunctional tRNAs quickly reverted to wild-type tRNAs. By isolating the RNA transcript of the tRNA gene, Altman found that extra nucleotides hang off the 5′ and 3′ ends of tRNAs. When these tRNAs were introduced into a live medium, an enzyme was observed to cut off the extra nucleotides through the cleavage of a phosphodiester bond, exposing the 5′ end of the molecule. This RNase differed from previously known RNases in its specificity for the 5′ end of the tRNA. Altman also showed that RNase P-like activities were present in cells from a variety of organisms, including humans.
RNase P is unusual in that it is a ribozyme: its catalytic activity resides in an RNA subunit. Unlike self-cleaving ribozymes, it cleaves other RNAs and is not consumed in the reaction. In bacteria, the RNA is accompanied by a single protein subunit of about 120 amino acid residues. RNase P is found in archaea, bacteria, and eukarya, as well as in the chloroplasts of plants. Its composition differs from one organism to another, but its function is conserved. RNase P is a crucial component in the production of functional tRNA molecules.

RNase T2

Overall structure of RNase T2 proteins.

T2 family ribonucleases are considered transferase-type RNases and are distinguished from the RNase A and RNase T1 protein families by three features.

– First, T2 ribonucleases are more widely distributed: they are found in bacteria, plants, protozoans, animals, and even viruses, whereas RNase T1 enzymes exist only in bacteria and fungi, and RNase A family enzymes are highly represented in animals.

– Second, the optimal pH for the activity of many T2 ribonucleases is between 4 and 5. By contrast, the RNase T1 and RNase A families have optimal activity at alkaline (pH 7-8) or weakly acidic (pH ~7) conditions.

– Third, T2 ribonucleases show little base specificity. The T2 family generally cleaves at all four bases, whereas the RNase A and RNase T1 families tend to be specific for pyrimidine and guanosine bases, respectively.

The biological roles of T2 ribonucleases vary. These endoribonucleases are found in organisms across kingdoms and have been shown to perform a variety of functions beyond their ability to hydrolyze RNA. Examples include the scavenging of nucleic acids, the degradation of self-RNA, the modulation of host immune responses, and service as cellular cytotoxins.

Other T2 RNase Properties

T2 RNases are transferase-type RNases that catalyze the cleavage of single-stranded RNA (ssRNA) through a 2′,3′-cyclic phosphate intermediate. The products of this reaction are mono- or oligonucleotides with a 3′ phosphate group. Typically, these RNases are secreted from the cell or targeted to specific compartments within the cell, such as vacuoles, which may be important for how their activity is modulated. The structure and mechanism of this family are well known from X-ray crystallography. Crystallography has defined the base binding sites, called B1 for the base on the 5′ side and B2 for the base on the 3′ side of the cleaved bond, as well as a core made up of alpha and beta structures. Catalysis by T2 RNases depends on one to three histidine residues, and alteration or mutation of these residues inactivates the RNase. The two main steps of catalysis are transphosphorylation and hydrolysis. Further study is being conducted in the following areas for these RNases:

  1. Discovering how RNases from this family can function independently of catalysis
  2. Mutational analysis to determine the regions necessary for nuclease-independent functions
  3. Determining how these RNases enter the cell, which may reveal how proteins cross membranes that many other molecules cannot


1. T2 Family ribonucleases: ancient enzymes with diverse roles.
Luhtala N, Parker R.
Trends Biochem Sci. 2010

2. Raines, Ronald T. “Ribonuclease A.” Chem Rev. Wisconsins: University of Wisconsins, 1998. 1045-065. Print.

3. Gopalan, Venkat, Agustin Vioque, and Sidney Altman. “RNase P: Variations and Uses*.” RNase P: Variations and Uses. JBC Papers in Press, 10 Dec. 2011. Web. 20 Nov. 2012. .

4. Cuchillo, C. M.; Nogués, M. V.; Raines, R. T. (2011). “Bovine pancreatic ribonuclease: Fifty years of the first enzymatic reaction mechanism”. Biochemistry 50: 7835-7841. PMC 3172371. PMID 21838247.

5. J. Holzmann, P. Frank, E. Löffler, K. Bennett, C. Gerner & W. Rossmanith (2008). “RNase P without RNA: Identification and functional reconstitution of the human mitochondrial tRNA processing enzyme”. Cell 135 (3): 462–474. doi:10.1016/j.cell.2008.09.013. PMID 18984158.
Telomeres (from the Greek telos, “an end”) are long stretches of repeating non-coding DNA sequence at the ends of chromosomes. They protect the ends of the DNA by capping the chromosome, which prevents the loss of coding DNA and attachment to other molecules. The Russian scientist Alexey Olovnikov was the first to postulate the problem of replicating chromosomes at their tips.[1] He theorized that with every replication bits of the DNA would be lost until a critical limit was reached, whereupon cell division would cease.


Telomerase adding Telomere extension

Telomerase is the enzyme that builds telomeres. It adds a specific repeating sequence (“TTAGGG” in all vertebrates) to the 3′ ends of DNA strands.

The telomerase enzyme carries an RNA template that partially base pairs with the shortened end of the DNA strand. New nucleotides are then added against the template, extending the DNA strand. Once telomerase leaves, DNA polymerase completes the double-stranded DNA. Telomerase was discovered in 1985 by Carol W. Greider and Elizabeth Blackburn. For this discovery, they were awarded the 2009 Nobel Prize in Physiology or Medicine along with Jack W. Szostak.[3]
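
As a rough illustration of repeat addition, the Python sketch below appends copies of the vertebrate repeat to the 3′ end of a strand written 5′ to 3′, and counts how many intact repeats sit at that end. It models only the sequence outcome, not the RNA template or the enzyme; the function names are hypothetical.

```python
TELOMERE_REPEAT = "TTAGGG"  # vertebrate telomeric repeat, as given in the text


def extend_telomere(strand, n_repeats):
    """Append n_repeats copies of the telomeric repeat to the 3' end
    of a DNA strand written 5'->3'."""
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    return strand + TELOMERE_REPEAT * n_repeats


def count_terminal_repeats(strand):
    """Count how many intact telomeric repeats sit at the 3' end."""
    count = 0
    while strand.endswith(TELOMERE_REPEAT):
        strand = strand[: -len(TELOMERE_REPEAT)]
        count += 1
    return count


extended = extend_telomere("ACGT", 2)
print(extended)                          # ACGTTTAGGGTTAGGG
print(count_terminal_repeats(extended))  # 2
```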

Szostak and Blackburn first discovered telomeres in ciliates. They chose ciliates because at one stage of their life cycle, they make a million new telomeres. The model created includes a telomere-dedicated DNA polymerase, which adds telomeric repeats onto chromosome ends. Therefore, telomeres are represented as a motif in DNA sequences.

The presence of telomerase in humans is somewhat puzzling. It is located in the nucleus, which is unsurprising because that is where DNA replication takes place. However, telomerase activity is not present in all cells. It was found to be almost absent in the majority of normal adult tissues, including cardiac and skeletal muscle, lung, liver, and kidney. This curious lack of telomerase activity gave rise to a theory connecting telomere length to aging and cell senescence. According to this theory, human somatic cells are born with a full complement of telomeric repeats, but the telomerase enzyme is not present in some tissues. The cells of those tissues lose about 50 to 100 nucleotides from each chromosome end each time they replicate and divide. Eventually the telomeres would be used up and the chromosomes themselves would start losing nucleotides, carrying genetic defects into the next division so that neither daughter cell would be viable. Thus, after a certain number of divisions, a cell runs out of telomeric nucleotides and dies.[4]
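
The arithmetic behind this theory can be made concrete with a small Python sketch. The loss of 50 to 100 nucleotides per division comes from the text; the starting telomere length of 5,000 nucleotides is purely an illustrative assumption, and the function name is hypothetical.

```python
def divisions_until_telomere_lost(telomere_length_nt, loss_per_division_nt):
    """Count how many divisions it takes for a telomere to be used up
    when a fixed number of nucleotides is lost at every division."""
    if loss_per_division_nt <= 0:
        raise ValueError("loss per division must be positive")
    divisions = 0
    remaining = telomere_length_nt
    while remaining > 0:
        remaining -= loss_per_division_nt
        divisions += 1
    return divisions


START_LENGTH_NT = 5000  # assumed starting length, for illustration only

print(divisions_until_telomere_lost(START_LENGTH_NT, 50))   # 100
print(divisions_until_telomere_lost(START_LENGTH_NT, 100))  # 50
```

Under these assumed numbers the telomere would be exhausted after somewhere between 50 and 100 divisions, which shows how a finite repeat reserve translates into a finite number of divisions.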

Telomeres at the end of a chromosome.

The function of telomerase is to replace the short stretches of telomere that are gradually lost during cell division.[5] Under normal conditions without telomerase, a cell divides until it hits a critical point known as the Hayflick limit.[6] In the presence of telomerase, however, the cell can replace the lost DNA and divide without limit. This continuous growth comes at a cost, as it may eventually lead to cancerous cells.

While the details are not fully known, it seems that shortened telomeres play a role in aging through the erosion of DNA over time. The question arises whether telomerase, given its importance in maintaining telomeres, could greatly extend the human lifespan.[7] Dr. Michael Fossel, a professor of clinical medicine at Michigan State University, has expressed his views on telomerase as a viable treatment for cell senescence.

However, several experiments have raised doubts about telomerase as an effective anti-aging treatment. In an experiment with mice that had higher levels of telomerase, the mice also had a higher rate of cancer, which led to a shorter lifespan. In addition, telomerase favors tumorigenesis.[8] Telomerase fosters cancer development by allowing uncontrolled cell growth, with the proliferating cells eventually forming tumors. In fact, telomerase activity has been observed in approximately 90% of all human tumors, which suggests that the unlimited growth conferred by telomerase plays a key role in cancer.

In addition to its possible use as an anti-aging treatment, telomerase has potential as a drug target against cancer.[9] Since telomerase is necessary for the immortality of many cancer cell types, it is believed that if a drug could deactivate telomerase in a cancer cell, the telomeres would shorten, mutations would accumulate, cell stability would decrease, and the cancer would, in essence, be treated. Experimental drugs have been tested in mouse models, and some have moved on to clinical testing.


Cancer Biology

The significance of studying telomeres lies in telomerase, which rebuilds the telomere so that cells can keep dividing. In normal cells, however, the telomere eventually shortens, causing the cell to die. In cancer cells, telomerase keeps building telomeres long past the cell’s normal lifetime. These cells are then said to be “immortalized”, since they can divide endlessly, and this results in a tumor. Many researchers believe that telomere maintenance activity is characteristic of most human cancer cells. Though the mechanism behind this is not well understood, its discovery may reveal key elements of telomere function.
Telomerase is the natural enzyme used for telomere repair. It is highly abundant in stem cells, germ cells, hair follicles, and most cancer cells, but its expression is low or in some cases absent in somatic cells. Telomerase functions by adding bases to the ends of the telomeres. Cells with sufficient telomerase activity are considered immortal in the sense that they can divide past the Hayflick limit without entering senescence or apoptosis. For this reason, telomerase is viewed as a potential target for anti-cancer drugs such as telomestatin.

2009 Nobel Prize

The 2009 Nobel Prize in Physiology or Medicine was awarded to three scientists who discovered how chromosomes can be copied completely during cell division and how they are protected against degradation. They showed that the ends of the chromosomes, the telomeres, protect the chromosomes from degradation, and they identified telomerase, the enzyme that builds the telomeres.
If the telomeres become shortened, cells age. Conversely, if telomerase activity is high, telomere length is maintained and cellular senescence is delayed. This is the case in cancer cells, where telomerase allows the cell to divide without limit. Certain genetic diseases are caused by a defective telomerase. These discoveries can thus stimulate the development of new therapeutic strategies. Understanding such a fundamental mechanism is an important first step toward new treatments for cancer and other related diseases, as well as for aging.

Hayflick Limit

The Hayflick limit is the number of times a normal cell can divide before it stops dividing, based on the idea that telomeres reach a critical length.[10] The limit was discovered in the 1960s by Leonard Hayflick, who demonstrated that cells from a normal fetus divide around 40 to 60 times before entering cell senescence. Repeated mitosis shortens the telomeres, which eventually inhibits cell division, a process analogous to aging. The discovery of this limit, a pillar of biology, refuted the earlier contention of Alexis Carrel who, along with the majority of scientists of the time, believed that cells were “immortal”.

Role of Telomere

Telomeres compensate for the bits of DNA lost from the ends of chromosomes during DNA replication. Since DNA polymerase synthesizes new DNA only in the 5′→3′ direction and needs a primer to start, the extreme 3′ end of the template strand is not replicated. This results in the incomplete ends shown in the diagram below. However, telomeres are usually very long, ranging from 400 to 600 base pairs in yeast to many kilobases in humans. They are made of repeats six to eight base pairs long, which are usually rich in guanine. With long stretches of telomere at the ends of the DNA strands, the incompletely replicated DNA still contains the full genetic code.

Figure: Incomplete ends.

Guanosine tetraplex: a DNA structure formed from four strands. Telomeres often adopt this structure.

The shortening of telomeres induces cell senescence in humans. By limiting the number of cell divisions, this mechanism appears to restrain the formation of cancerous cells. Telomere length has also been proposed in recent publications to account for aging in humans. Since cells replicate identically, there must be a reason why cells within a body lose function and viability with time. Telomeres may have some influence over the aging process, since every successive round of DNA replication shortens them. Two aspects of this question are: (i) whether telomere length, as measured in specific cell populations in the body, correlates with longevity or disease; and (ii) whether telomere shortening in any cell population causes functional impairment of that cell population. Some argue that telomere length does not correlate with longevity, since mice have much longer telomeres than humans yet live much shorter lives. Others argue that telomere length does correlate with longevity, since it determines the number of times a cell can divide before it dies or reaches senescence.

Recent Publications

Recently it has been found that telomerase activity is inversely related to telomere length. In other words, elongation happens more often on short telomeres than on long ones. The research showed little telomerase activity on telomeres longer than 125 base pairs and 2 to 3 times more telomerase activity on telomeres shorter than 125 base pairs. This preferential elongation has been demonstrated in yeast and mice, and now in human somatic cells. Kinetic data indicate that elongation in yeast cells is a single event that extends the telomere to a certain length, whereas in human cells elongation seems to be a gradual process. The researchers showed that telomerase adds a regulated length of telomere in each cell division. They also showed that in human cells expressing telomerase, long telomeres were maintained but not elongated, whereas shorter telomeres were elongated, which indicates that telomeres cannot be extended indefinitely.[11]

Another paper focused on the role of DNA damage response (DDR) proteins in telomere maintenance. The review reports that early-stage DNA repair proteins have a significant role in telomere maintenance, whereas late-stage proteins usually do not take part in telomere repair. The interplay between these proteins and the proteins that cap and protect the telomeres is also very important. Many of the stronger DDR proteins inhibit cell replication, so it would be harmful to the organism for these proteins to take part in telomere repair. The protein caps on the telomeres inhibit the full DNA damage response, which keeps the stronger proteins from “repairing” the telomere ends. It is still not clear why some DDR proteins participate in telomere maintenance and others do not, but it is clear that repairing a DNA break and repairing telomeres are two different cellular processes, with the former halting cell division.[12]


  1. “Telomeres, telomerase, and aging: Origin of the theory”. Alexey M. Olovnikov. 1999. Retrieved 2009-11-05.
  2. “Repeat Expansion–Detection Analysis of Telomeric Uninterrupted (TTAGGG)n Arrays”. 2007. Retrieved 2009-11-05.
  3. “The Nobel Prize in Physiology or Medicine 2009”. 2009. Retrieved 2009-11-05.
  4. “What are telomeres and telomerase?”. Retrieved 2009-11-05.
  5. “Telomerase: regulation, function and transformation”. Retrieved 2009-11-05.
  6. “Hayflick Limit Theory”. Retrieved 2009-11-05.
  7. “Extension of Life-Span by Introduction of Telomerase into Normal Human Cells”. Retrieved 2009-11-05.
  8. “Anti-Aging Medicine”. João Pedro de Magalhães. 2008. Retrieved 2009-11-05.
  9. Foreman, Judy. “Telomerase – a Promising Cancer Drug Stuck in Patent Hell?”. Retrieved 2009-11-05.
  10. “Cellular Senescence”. João Pedro de Magalhães. 2008. Retrieved 2009-11-17.
  11. Britt-Compton, Bethan; Capper, Rebecca; Rowson, Jan; Baird, Duncan M. (2009). FEBS Letters (583): 3076–3080.
  12. Lydall, David (2009). The EMBO Journal (28): 2174–2187.

RNA plays a variety of roles in gene expression, from messenger to catalyst to regulator. For instance, in E. coli, 80% of RNA is ribosomal RNA, 15% is transfer RNA, and only 5% is messenger RNA. Messenger RNA is the template for protein synthesis. Since the amount of mRNA is relatively small, affinity chromatography is used to purify it.
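
To see how little mRNA a total RNA preparation contains, the percentages above can be applied to a sample mass. The sketch below uses only the E. coli fractions quoted in the text; the 100 µg sample size and the names used are illustrative assumptions.

```python
# Approximate composition of E. coli RNA, as quoted in the text.
ECOLI_RNA_FRACTIONS = {"rRNA": 0.80, "tRNA": 0.15, "mRNA": 0.05}


def rna_class_amounts(total_rna_ug):
    """Split a total RNA mass (in micrograms) into the expected amount
    of each RNA class, using the fractions above."""
    return {name: fraction * total_rna_ug
            for name, fraction in ECOLI_RNA_FRACTIONS.items()}


amounts = rna_class_amounts(100.0)  # hypothetical 100 ug of total RNA
for name, micrograms in amounts.items():
    print(f"{name}: {micrograms:.1f} ug")
```

With these fractions, only about 5 µg of a 100 µg preparation would be mRNA, which is why a selective purification step is needed.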

This technique exploits the structure of mRNA, which is polyadenylated at the 3′ end to form a poly(A) tail. The poly(A) tail can pair with a complementary poly(T) sequence through hydrogen bonds. The poly(A) region is therefore used to selectively isolate mRNA from the rest of the RNA by affinity chromatography. Only mRNA carrying a poly(A) tail is bound by the poly(T) in the column. RNAs that lack poly(A) tails flow out of the column at high salt concentrations, separating the mRNA from the other RNAs. The poly(A) mRNA can then be washed out of the column by adding a low-salt elution buffer.
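
The selection logic of the column can be mimicked in Python by sorting sequences according to whether they end in a run of A residues. This is only a conceptual sketch: the minimum tail length of 10 is an arbitrary assumption, real binding depends on salt and temperature, and the function names are hypothetical.

```python
def has_poly_a_tail(rna, min_tail=10):
    """Return True if the 3' end of the RNA carries at least
    min_tail consecutive A residues (min_tail is an assumed cutoff)."""
    tail_length = len(rna) - len(rna.rstrip("A"))
    return tail_length >= min_tail


def oligo_dt_select(rnas, min_tail=10):
    """Split RNAs into those retained by a poly(T) column (bound)
    and those that pass through it (flow_through)."""
    bound = [r for r in rnas if has_poly_a_tail(r, min_tail)]
    flow_through = [r for r in rnas if not has_poly_a_tail(r, min_tail)]
    return bound, flow_through


sample = ["AUGGCC" + "A" * 20,  # mRNA-like, long poly(A) tail
          "GGCUAGC",            # no tail
          "ACGUAAA"]            # tail too short to count
bound, flow_through = oligo_dt_select(sample)
```

Here `bound` holds only the first sequence, while the other two end up in `flow_through`, matching the high-salt step in which RNAs without poly(A) tails leave the column.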

Affinity chromatography is very useful for the purification of RNA and the synthesis of cDNA.

Figure: Isolating mRNA from total RNA.

RNA extraction is a technique for isolating and purifying RNA from tissues and samples. There are several ways of extracting RNA. The presence of ribonuclease enzymes within the tissue cells complicates extraction and purification, because they quickly degrade the isolated RNA. Isolated and purified RNA can be used to study gene expression, biomarkers, drug efficacy, and much more.


First, homogenize the sample tissue in Trizol solution using a vortex mixer. The mixing speed is very important: the tissue can only be homogenized at high speed, but friction generates heat, which may accelerate RNA degradation.
Add chloroform to the finely ground mixture to perform a liquid phase separation. For every milliliter of Trizol reagent, add 0.2 mL of chloroform. Cap the tubes, shake the mixtures vigorously for 15 seconds, and incubate them at room temperature. Centrifuge the mixtures at no more than 14,000 rpm for 15 minutes at 2-8 °C. After centrifugation, the mixture separates into three distinct layers: a lower red chloroform layer, an interphase layer of remaining tissue and fat, and a clear upper aqueous layer. RNA remains exclusively in the aqueous layer. Transfer the aqueous layer to another container. Precipitate the RNA with isopropyl alcohol, adding 0.5 mL of isopropyl alcohol for every mL of Trizol reagent used. Incubate at room temperature for 10 minutes and centrifuge again for 4 minutes in RNA elution plates. Remove all supernatant and wash the RNA with ethanol, then with buffers. Make sure to dry off any remaining alcohol, because it lowers the quality of the RNA and promotes RNA degradation. The last wash should use RNase-free water to elute the isolated and purified RNA.
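
The reagent ratios in this procedure scale with the volume of Trizol used, so they are easy to compute ahead of time. The helper below is a small sketch that uses only the two ratios stated above (0.2 mL chloroform and 0.5 mL isopropyl alcohol per mL of Trizol); the function name and dictionary keys are hypothetical.

```python
CHLOROFORM_PER_ML_TRIZOL = 0.2   # mL chloroform per mL Trizol (from the text)
ISOPROPANOL_PER_ML_TRIZOL = 0.5  # mL isopropyl alcohol per mL Trizol (from the text)


def trizol_reagent_volumes(trizol_ml):
    """Return the chloroform and isopropyl alcohol volumes (in mL)
    needed for a given starting volume of Trizol reagent."""
    if trizol_ml <= 0:
        raise ValueError("Trizol volume must be positive")
    return {
        "chloroform_ml": CHLOROFORM_PER_ML_TRIZOL * trizol_ml,
        "isopropanol_ml": ISOPROPANOL_PER_ML_TRIZOL * trizol_ml,
    }


volumes = trizol_reagent_volumes(2.0)
print(f"chloroform: {volumes['chloroform_ml']:.2f} mL")    # chloroform: 0.40 mL
print(f"isopropanol: {volumes['isopropanol_ml']:.2f} mL")  # isopropanol: 1.00 mL
```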




Short RNAs include different classes of molecules such as small nuclear RNA, microRNA, and transfer RNA. Many functional short RNAs are poorly characterized; among them are short RNAs found at the promoters and 3′ termini of genes. Short RNAs are also involved in paramutation, an interaction between two alleles at a locus in which one allele induces a change in the other. Short RNAs can be studied with single-molecule sequencing (SMS), and strategies for profiling short RNA species with this method have been described and evaluated in laboratory research. A short RNA is often the final product of a functional precursor RNA species in the cell. Examples of processes that generate short RNA species include splicing and the 3′ end RNA processing used to make mature long mRNAs. However, many classes of functional non-coding RNA most likely derive from precursors that are longer than mature long mRNAs. RNA is fractionated into species by various methods in order to study the functions of short RNAs. For example, Dr. Philipp Kapranov and his colleagues use Helicos single-molecule sequencing to study short RNAs in the cell [3].

RNA Isolation for short RNA[edit]

To isolate short RNA (sRNA), the sRNA is purified from the total RNA of cultured cells with the mirVana miRNA Isolation Kit or the miRNeasy Mini Kit (Qiagen). The RNA/DNA Kit is used to prepare large amounts of sRNA from cultured cells. With these kits, specific sRNA fractions can be further purified by TBE-urea polyacrylamide gel electrophoresis followed by elution. Electrophoresis is the motion of dispersed particles relative to a fluid under the influence of a spatially uniform electric field.

Difficulties of Analysing short RNAs[edit]

Studying short RNAs is challenging for several reasons:
1) The sRNA must be specifically isolated within the desired length range.
2) sRNAs lack a molecular handle such as the 3′ poly(A) tail of mRNAs, which is used for conversion into cDNA.
3) sRNAs are sometimes too short for efficient conversion into cDNA with hexamers.
4) Some sRNAs carry modifications at their 5′ and 3′ ends that interfere with subsequent molecular manipulation, so these sRNAs often cannot be detected by commonly used methods.
5) Some classes of sRNAs have stable secondary structures that prevent detection by enzymatic methods, which are often carried out under nondenaturing conditions.
6) Certain sRNAs, such as miRNAs, are too short to be mapped efficiently to complex genomes, which makes it difficult to discover sRNAs and their specific functions.
7) Many methods depend on ligation and PCR amplification, which can misrepresent the composition and quantity of the RNA species in the cell.
8) The sRNA fraction can be dominated by highly abundant RNA classes such as snRNA, rRNA, and snoRNA, which complicates complete characterization of the sRNA population in the cell.
Overall, the most challenging part of studying sRNAs is their physical nature: their short length makes experiments difficult to conduct.

Isolation of sRNA Fraction[edit]

The sRNA fraction can be isolated using either the miRNeasy kit or the DNA/RNA Kit (Qiagen). If desired, sRNA of a specific size range can instead be isolated by TBE-urea denaturing polyacrylamide gel electrophoresis. One method of studying sRNAs involves tailing the RNA with 3′ poly(C). The RNA is mixed with water in a PCR tube according to the protocol, incubated for 2 minutes at 85 °C in a PCR machine, and then placed on ice for at least 2 minutes. After this incubation, the reagents, including E. coli poly(A) polymerase buffer, are added to the solution and mixed. The sRNA is extracted twice with phenol/chloroform/isoamyl alcohol, mixing for a few seconds each time, and then precipitated with 100% ethanol (EtOH) for half an hour at −80 °C. The sample is centrifuged at 4 °C for 30 minutes, washed with 70% EtOH, and finally resuspended in water.

Preparation of cDNA Sequence[edit]

For sequencing, the cDNA is given a poly(A) tail at its 3′ end using terminal transferase (TdT), and the 3′ end is then blocked [8]. The tailed cDNA can thus bind to the oligo-dT present at the surface of the flow cell. This step is shared with the poly(A) protocol. The cDNA can be quantified with a regular NanoDrop if the expected amount of cDNA is in the range of 5-10 ng. The tailing of the cDNA proceeds as follows. The cDNA is prepared in water for tailing; the incubation time and the amount of buffer vary with its volume. After incubation, a measured microliter amount of TdT buffer is added to the cDNA solution, and the cDNA is incubated again in the PCR machine at 37 °C for an hour. The incubated cDNA sample is then heated to 95 °C for 5 minutes. Later, TdT buffer is added and the incubation is repeated to obtain the final tailed cDNA, whose concentration in the sample is then measured.



Nematodes are one of the most diverse of all animal phyla. Half of the 28,000 different species of nematodes are parasitic. Nematodes, a type of roundworm, have tubular digestive systems with openings at both ends. Found in nearly all parts of the world, nematodes have adapted to almost all ecosystems, across varying levels of water salinity, temperature, and altitude. Not only have they adapted, nematodes have evolved to outnumber other animals that coexist in the same ecosystems. For example, nematodes constitute 90% of all animals on the seafloor.

Although some nematodes are completely dependent on other types of animals for reproduction, some of the strategies used by nematodes seem rather advanced. For example, the parasitic tetradonematid nematode is hypothesized to induce fruit mimicry in tropical ants. Infected ants develop bright red gasters, tend to be more sluggish, and walk with their gasters in a conspicuous elevated position. These changes likely cause frugivorous birds to confuse the infected ants for berries and eat them. Parasite eggs passed in the bird’s feces are subsequently collected by foraging Cephalotes atratus and are fed to their larvae, thus completing the tetradonematid life cycle.

The genomes of nematodes are distinct from other metazoans[edit]

RNA regulation is an important and pervasive process, made possible by both RNA molecules and RNA-binding proteins. RNA molecules function as regulators and targets in diverse pathways pertinent to the proper decoding of the genome. RNA-binding proteins act as effectors of RNA stability and translation efficiency, guide transcripts to defined locations within the cell, control the fidelity of gene decoding, and function as cofactors to promote the activity of functional and structural RNA molecules.

Wild-type C. elegans hermaphrodite stained with the fluorescent dye Texas Red to highlight the nuclei of all cells

Caenorhabditis elegans is a model organism[edit]

The facile genetics of the nematode Caenorhabditis elegans make it a useful observational model organism for the study of RNA regulatory mechanisms. The function of specific genes in this organism can be disrupted in a relatively straightforward manner by RNA interference (RNAi). The use of RNAi allows researchers to determine the function of specific genes, by silencing their functions in certain ways. In another important application, it was discovered that this organism showed behavioral responses to nicotine, including acute response, tolerance, withdrawal, and sensitization.

Nematodes contain an expanded genome[edit]

A surprising discovery based on the laboratory study of C. elegans is that the genome of nematodes contains an expansion of putative RNA-binding proteins relative to other metazoans. The RNA-binding protein Pumilio has 11 homologs in C. elegans, while it has only two homologs in humans. The CCCH-type tandem zinc finger (TZF) family, which includes the mammalian protein tristetraprolin (TTP), has 16 homologs in roundworms. Lastly, there exist 27 Argonaute homologs in nematodes.

A homologous trait is any characteristic of organisms that is derived from a common ancestor. Paralogs are homologs present in the same species, and usually differ in function. Paralogs arise from gene duplication. Orthologs are homologs present in different species, and usually are similar in function. Orthologs usually arise from speciation when one species diverges into two separate species. Homology among proteins, DNA, and RNA is often concluded on the basis of sequence similarity. It is more effective to compare amino acid sequences than nucleotide base sequences because there are 20 distinct amino acids and only 4 distinct nucleotide bases.
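The alphabet-size argument above can be made concrete with a short sketch. It assumes, purely for illustration, that residues are uniformly distributed and independent, which real sequences are not; the function names `chance_identity` and `percent_identity` are hypothetical.

```python
def chance_identity(alphabet_size: int) -> float:
    """Probability that two random residues match, assuming a uniform alphabet."""
    if alphabet_size <= 0:
        raise ValueError("alphabet size must be positive")
    return 1.0 / alphabet_size


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Under the uniform assumption, unrelated nucleotide sequences match by chance
# at about 25% of positions, but unrelated protein sequences at only about 5%.
print(f"nucleotides: {chance_identity(4):.0%}, amino acids: {chance_identity(20):.0%}")
print(percent_identity("ACGU", "ACGA"))
```

Because the background level of chance matches is lower for proteins, a given percent identity between two amino acid sequences is stronger evidence of homology than the same percent identity between two nucleotide sequences.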

Forward and reverse genetic experiments have provided data that highlight the basis for the expansion of nematode homologs. Specifically, the data indicate that the expansions of RNA-binding protein families may play roles in germline development, gametogenesis, and early embryogenesis.

The PUF family of RNA-binding proteins[edit]

The founding members of the PUF family of RNA-binding proteins are Pumilio and FBF, which together maintain the population of progenitor cells in the distal region of the germline. This group promotes the cellular switch from spermatogenesis to oogenesis at the onset of adulthood. During the transition from mitosis to meiosis, when the single-celled state becomes a syncytial region, the meiotic nuclei recellularize. Sperm are formed first, during the L4 larval stage, and stored in the spermatheca; the germline then switches to producing oocytes.

PUF-8 and PUF-9 resemble enzymes in that their RNA binding shows a high level of specificity. For instance, the nine-nucleotide FBF binding element (FBE, 5’-UGURNNAUA-3’) recognized by the PUF domains differs by a single nucleotide from the eight-nucleotide NRE (Nanos Response Element). The NRE is only a single nucleotide shorter than the FBE, yet the two PUF proteins discriminate between the two elements more than 30-fold.
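The element 5’-UGURNNAUA-3’ quoted above is written with IUPAC ambiguity codes, where R stands for a purine (A or G) and N for any base. A minimal sketch of how such a pattern can be located in an RNA sequence is shown below; the helper names and the example sequence are hypothetical, and the lookup table covers only the codes needed here.

```python
import re

# Subset of the IUPAC nucleotide codes needed for this motif
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "[AG]", "N": "[ACGU]"}


def motif_to_regex(motif: str) -> str:
    """Translate an IUPAC-coded RNA motif into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(rna: str, motif: str = "UGURNNAUA") -> list:
    """Return the 0-based start positions of every (possibly overlapping) match."""
    # A lookahead keeps matches zero-width so overlapping sites are all reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [match.start() for match in pattern.finditer(rna.upper())]


# Hypothetical sequence: UGUACCAUA matches (R = A), UGUCCCAUA does not (R = C)
print(find_motif("GGCUUGUACCAUACCUGUCCCAUA"))
```

A single base change at the R position is enough to remove a match, which mirrors the point that closely related elements can still be told apart.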

The nematode TZF binding specificity is different from the nematode PUF binding specificity. MEX-5 is a TZF protein that binds with high affinity but relaxed specificity to any uridine-rich sequence. The relaxed specificity means that MEX-5 binds to both uridine-rich sequences and polyuridine, while TTP binds more than 80-fold more tightly to AU-rich elements (AREs) than to polyuridine.


The function of RNA-binding proteins, like that of all proteins, is dictated by their structure. Novel function in RNA-binding protein families that share a common domain but show a new binding specificity is based on structural changes. Biochemistry and genetics serve as a basis for identifying critical sequence elements and structural changes, yet fail to provide a mechanistic explanation of how these elements relate directly to novel function. Further structural studies of RNA-binding protein families are therefore necessary.


1 R. Lehmann and C. Nusslein-Volhard, The maternal gene nanos has a central role in posterior pattern formation of the Drosophila embryo, Development 112 (1991), pp. 679–691.

2 B. Zhang, M. Gallegos, A. Puoti, E. Durkin, S. Fields, J. Kimble and M.P. Wickens, A conserved RNA-binding protein that regulates sexual fates in the C. elegans hermaphrodite germ line, Nature 390 (1997), pp. 477–484.

3 B.C Varnum, Q.F. Ma, T.H. Chi, B. Fletcher and H.R. Herschman, The TIS11 primary response gene is a member of a gene family that encodes proteins with a highly conserved sequence containing an unusual Cys-His repeat, Mol Cell Biol 11 (1991), pp. 1754–1758.

Basis of RNA[edit]

-RNA bases: A,G,C,U
-Base Pairs: A-U, G-C
-non-canonical pairs: G-U
-Stability: G-C > A-U > G-U
-Single Stranded: the strand folds upon itself to form base pairs; it can adopt diverse forms of secondary structure

Secondary Structure[edit]

Structure Rules[edit]

  1. Base pairing stabilizes the structure
  2. Unpaired sections (loops) destabilize the structure
  3. When a base in one position changes, the base it pairs with must also change to maintain the same structure (covariation)


-Most base pairs are non-crossing base pairs: for any two pairs (i, j) and (i’, j’) with i < i’, either i < i’ < j’ < j (nested) or i < j < i’ < j’ (side by side)
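The non-crossing condition can be checked mechanically: two pairs cross when exactly one end of the second pair lies between the two ends of the first. The sketch below is illustrative only; the function name `is_non_crossing` is hypothetical and positions are treated as plain integers.

```python
from itertools import combinations


def is_non_crossing(pairs) -> bool:
    """Return True if no two base pairs cross (i.e. the structure has no pseudoknot)."""
    # Normalize each pair so that the smaller index comes first, then sort.
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for (i, j), (k, l) in combinations(ordered, 2):
        # After sorting, i <= k. The pairs cross when k falls inside (i, j)
        # while l falls outside it.
        if i < k < j < l:
            return False
    return True


print(is_non_crossing([(1, 10), (2, 9), (12, 20)]))  # nested and side-by-side pairs
print(is_non_crossing([(1, 10), (5, 15)]))           # crossing pairs
```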

Circular Representation[edit]

-Base pairs of a secondary structure represented by a circle
-Arc drawn for each base pairing in the structure


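-The number of possible RNA secondary structures S(n) for a sequence of length n is given by a recurrence relation:

S(n+1) = S(n) + Σ_{j=1}^{n-1} S(j-1) S(n-j), (n ≥ 2), with S(0) = S(1) = S(2) = 1
-For a sequence of length n = 27 there are approximately 1.3 billion possible secondary structures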
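The recurrence can be evaluated directly to see how quickly the number of structures grows. The sketch below makes two assumptions that are not spelled out in the recurrence line as written: the sum runs over j = 1 to n-1, and the starting values are S(0) = S(1) = S(2) = 1 (the usual convention for this counting problem). The function name `count_structures` is hypothetical.

```python
def count_structures(n: int) -> int:
    """Number of secondary structures for a sequence of length n.

    Uses S(m+1) = S(m) + sum_{j=1}^{m-1} S(j-1) * S(m-j) for m >= 2,
    with the assumed starting values S(0) = S(1) = S(2) = 1.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    s = [1, 1, 1]  # S(0), S(1), S(2)
    for m in range(2, n):
        s.append(s[m] + sum(s[j - 1] * s[m - j] for j in range(1, m)))
    return s[n]


# The first few values, followed by the length quoted in the text
print([count_structures(k) for k in range(7)])
print(count_structures(27))
```

Even for short sequences the count grows roughly exponentially, which is why structure prediction relies on dynamic programming rather than enumeration.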

Types of regions[edit]

1) Hairpin Loop – 4 or more bases long for each loop
2) Bulge Loop – bases on one side cannot form base pairs
3) Interior Loop – bases on both sides cannot form base pairs
4) Multi Loop (Junctions) – 2 or more double stranded regions converge to form a closed structure

Structure Prediction Methods[edit]

1) Maximize Base Pairs

-Determine set of maximal base pairs
-Align bases based on their ability to pair up to determine the optimal structure
-Nussinov Algorithm: 4 ways to get the optimal structure between i and j
-Find the structure with the most base pairs: A-U and G-C
2) Minimize Energy

-Determine maximum of scores for 4 structures at a particular position
-Stacks are the dominant stabilizing force
-Energy minimization algorithm predicts the secondary structure by minimizing the free energy
-Require estimation of energy terms contributing to secondary structure
-Dynamic programming approach:

1) Initialization
2) Recursion
3) Traceback
-Does not require prior sequence alignment
-Energy associated with any position is only influenced by local sequence and structure
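The three dynamic programming steps listed above (initialization, recursion, traceback) can be illustrated with a simplified base-pair maximization in the style of the Nussinov algorithm. This is a sketch under stated assumptions rather than a reference implementation: only A-U and G-C pairs are scored, each pair counts as 1, and the `min_loop` parameter (minimum number of unpaired bases enclosed by a pair, set here to 3) is an assumed convention that can be changed.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def nussinov(seq: str, min_loop: int = 3):
    """Return (maximum number of base pairs, sorted list of paired positions)."""
    seq = seq.upper()
    n = len(seq)

    # 1) Initialization: every subsequence starts with a score of zero pairs.
    dp = [[0] * n for _ in range(n)]

    # 2) Recursion: fill the table for increasingly long subsequences i..j.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(dp[i + 1][j], dp[i][j - 1])      # i unpaired, or j unpaired
            if (seq[i], seq[j]) in PAIRS:
                best = max(best, dp[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):                   # split into two substructures
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best

    # 3) Traceback: recover one optimal set of pairs from the filled table.
    pairs = []
    stack = [(0, n - 1)] if n else []
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain any pair
        if dp[i][j] == dp[i + 1][j]:
            stack.append((i + 1, j))
        elif dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
        elif (seq[i], seq[j]) in PAIRS and dp[i][j] == dp[i + 1][j - 1] + 1:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if dp[i][j] == dp[i][k] + dp[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return (dp[0][n - 1] if n else 0), sorted(pairs)


# Hypothetical hairpin: three G-C pairs closing a loop of four adenines
print(nussinov("GGGAAAACCC"))
```

The four options considered in the recursion correspond to the four ways of obtaining the optimal structure between i and j. Energy minimization follows the same table-filling scheme but replaces the pair count with estimated free energy terms.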


  1. How Do RNA Folding Algorithms Work? S.R. Eddy. Nature Biotechnology, 22:1457-1458, 2004.

Making RISC’s[edit]

MicroRNAs, Piwi-interacting RNAs, and small interfering RNAs are unique in the world of RNA catalysts because they cannot perform any designated functions on their own. In order for small RNAs to function, they must first form RISCs. RISC stands for RNA-induced silencing complex. RISCs play an important role in regulating a multitude of biological processes by interfering with gene expression. Analyzing the assembly of these effector complexes can help us gain a better understanding of how small RNAs such as siRNA silence specific sequences. The assembly of RISCs has puzzled the scientific community because the final product contains single-stranded RNA, while its precursors contain double-stranded RNA.[10]


The central protein of a RISC is Argonaute, abbreviated Ago. The term Argonaute encompasses a family of proteins that act as the catalysts in RISCs. The specific function that the small non-coding RNA will perform is determined, in part, by which Ago protein it is associated with. There are two primary classes of Ago proteins. One class binds to miRNAs and siRNAs, while the other primarily binds to piRNAs. Argonaute proteins share the ability to prevent translation. However, they differ in how they interfere with the production of polypeptides. For example, in humans the AGO2 protein cleaves its target RNA to carry out RNAi, whereas in flies the AGO1 protein works with miRNA to regulate gene expression.[11]

The Assembly of RISC’s[edit]

Although Ago proteins are central to the formation of RISCs, the mere binding of a small noncoding RNA to its partner protein will not result in a complete RISC. Research has shown that RISC assembly is the result of a highly regulated mechanism. This pathway processes the small RNA until the desired RISC is produced. The assembly of a RISC can be broken down into two primary steps, loading and unwinding. In the first step, the noncoding RNA is “loaded” onto its corresponding Ago protein. In the second step, the double-stranded small RNA is separated inside the Ago protein. This is the “unwinding” step, which results in a RISC containing single-stranded RNA.[12]

ATP Powers RISC Assembly[edit]

Researchers have found that ATP is needed to load miRNA onto the Ago protein but is not required to unwind the duplex within the protein. These results have been confirmed for both Drosophila and humans. Upon closer examination of Ago proteins, it was found that they lack any domains that could be used to bind ATP. Scientists hypothesize that the ATP is consumed by other machinery during the loading of the non-coding RNA.[13]


  1. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  2. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  3. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  4. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  5. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  6. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  7. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  8. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  9. Kapranov, Philip; Ozsolak Fatih and Milos Patrice M. (2012). Profiling of short RNAs using Helicos single-molecule sequencing. Cambridge, MA: Helicos Biosciences & Methods Mol Biol. 
  10. Kawamata, Tomoko, and Yukihide Tomari. “Making RISC.” Trends in Biochemical Sciences 35.7 (2010): 368
  11. Kawamata, Tomoko, and Yukihide Tomari. “Making RISC.” Trends in Biochemical Sciences 35.7 (2010): 368
  12. Kawamata, Tomoko, and Yukihide Tomari. “Making RISC.” Trends in Biochemical Sciences 35.7 (2010): 368-369
  13. Kawamata, Tomoko, and Yukihide Tomari. “Making RISC.” Trends in Biochemical Sciences 35.7 (2010): 373-374


Retrovirus particles are assembled by Gag, a single virus-encoded protein. Assembling an infectious particle involves diverse interactions, both specific and nonspecific, between Gag proteins and RNA. These interactions are vital for construction of the particle, packaging of the viral RNA into the particle, and placement of the primer for viral DNA synthesis.

Gag proteins: the building blocks of retroviruses[edit]

Retroviruses comprise 6 genera (alpha-, beta-, gamma-, delta-, epsilon-, and lenti-retroviruses) and also a subfamily called Spumaviridae. The Gag protein is the building block of retrovirus particles; expression of this protein allows efficient assembly of virus-like particles in most, if not all, mammalian cells. When viruses are being assembled, Gag proteins interact with lipids in the plasma membrane, with RNAs, and with other Gag molecules. These interactions are involved in formation of the particle, selection of the RNA species to be included in the particle, and refolding of the packaged RNAs.

A nonspecific interaction with RNAs drives virus particle assembly[edit]

Purified, recombinant Gag proteins are soluble in aqueous solution, so on their own they do not undergo the protein-protein interactions needed for virus particle formation. However, adding a single-stranded nucleic acid triggers in vitro assembly of the Gag proteins into virus-like particles (VLPs).
Particle assembly in vivo likewise depends on Gag-nucleic acid (NA) interactions. NAs of only about 20-40 nucleotides can support assembly in vitro, although NAs this short can bind only a few Gag molecules.
The NC domain of Gag is the site of interaction with NAs. When the NC domain binds NA, the conformation of the capsid domain is altered, which ultimately exposes new interfaces for protein-protein interaction.
In many retroviruses, the NC domain of Gag is the domain most strongly attracted to RNA. However, the matrix domain at the N-terminus is positively charged and is also able to interact with RNAs.


Rein, Alan, et al. “Diverse interactions of retroviral Gag proteins with RNAs” Trends in Biochemical Sciences 36.7 (2011) 373-379. Academic Search Complete. Web. 21 Nov. 2012.

Different techniques are commonly used to determine or alter DNA sequences. The most important ones include:

  1. DNA sequencing: It provides a great deal of information about gene architecture, the control of gene expression, and protein structure.
  2. Solid-phase synthesis of nucleic acids: The desired sequences of nucleic acids can be synthesized precisely de novo and used to identify or amplify other nucleic acids.
  3. Restriction enzyme analysis: Restriction endonucleases allow an investigator to manipulate DNA segments by cutting them at specific recognition sites.
  4. Polymerase chain reaction (PCR): This is one of the most useful methods for amplifying a DNA sequence a billion-fold. The technique requires denaturing the DNA and hybridizing primers complementary to the forward and reverse strands; DNA is then synthesized on both strands in the 5′ to 3′ direction. This technique is used to recognize pathogens and genetic diseases, to determine the source of a hair left at the scene of a crime, and to resurrect genes from fossils.
  5. Blotting techniques: Southern and Northern blots are used to detect specific DNA and RNA sequences, respectively.1
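The billion-fold figure quoted for PCR follows from repeated doubling. The sketch below assumes an idealized reaction in which every template is copied in every cycle; the `efficiency` parameter (1.0 meaning perfect doubling) and both function names are hypothetical, and real reactions fall short of this ideal.

```python
import math


def copies_after(cycles: int, start_copies: int = 1, efficiency: float = 1.0) -> float:
    """Copies present after a number of cycles, each multiplying by (1 + efficiency)."""
    return start_copies * (1 + efficiency) ** cycles


def cycles_for_fold(fold: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles that reaches at least the requested amplification."""
    return math.ceil(math.log(fold) / math.log(1 + efficiency))


# With perfect doubling, 30 cycles give 2**30 copies, just over a billion.
print(copies_after(30))
print(cycles_for_fold(1e9))
```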


1. Berg, Jeremy M. 2007. Biochemistry. Sixth Ed. New York: W.H. Freeman. 68-69, 78. 2. Voet, Voet, Pratt (2004). – Fundamentals of Biochemistry
In the article “The interface between transcription and mechanisms maintaining genome integrity,” the main interfaces between transcription and other DNA-related processes, and their relevance to genome integrity in eukaryotes, are explored. Researchers at the Cancer Research UK London Research Institute have studied transcription, translation, and RNA mutagenesis and their effects on maintaining genome integrity. Specifically, eukaryotic RNA polymerase II transcription affects processes such as chromatin remodeling and DNA repair.
Loss of genome integrity causes changes in gene expression. For example, loss of a chromosome region is a mutation that in turn alters the protein produced from the genetic code. The research group has also reported that the movement of RNA polymerases through chromatin directly and indirectly affects the integrity of the genomic region being transcribed.

Obstacles to transcription
The factors that affect the maintenance of genome integrity during transcription are nucleosomes and DNA damage. First, RNA polymerase and its co-factors attempt to temporarily move obstructing nucleosomes away from the region of DNA being transcribed. Second, DNA damage such as a double-strand break or a bulky lesion stops the RNA polymerase from transcribing further because the damage acts as a large obstacle. Simple base damage, however, is not a permanent obstacle, as there are dedicated pathways that attempt to repair the lesion.

The pathway that repairs polymerase-stalling DNA lesions is called TC-NER, or transcription-coupled nucleotide excision repair. The TC-NER pathway removes DNA damage in the transcribed region that is encountered by RNA polymerase II. If there are lesions further along the transcribed region that the RNA polymerase has not yet reached, another pathway called GG-NER, or global genome nucleotide excision repair, is used. TC-NER depends on the Cockayne syndrome proteins CSA and CSB, which act as cofactors in this pathway. However, the exact mechanism by which the CS proteins help remove DNA lesions through TC-NER is still to be determined.

A third pathway (in addition to TC-NER and GG-NER) used to resolve damage-induced stalling is RNA polymerase II polyubiquitylation. The polyubiquitylation process has 3 types. The RNA polymerase II K63 pathway is independent and does not lead to degradation of the RNA polymerase. In the more direct pathway, however, RNA polymerase II is subject to mono-ubiquitylation followed by K48 poly-ubiquitylation. After these two steps, the polymerase is degraded. Degradation of the polymerase itself is typically the last resort for dealing with a polymerase stalled at a bulky lesion. Once the polymerase is degraded, removal of the DNA lesion is attempted again, either through a second round of TC-NER or GG-NER when another RNA polymerase II encounters the lesion, or by DNA recombination.

Another phenomenon that affects genome integrity occurs during the process of DNA replication. During DNA replication, a replication fork is formed where there is a leading and lagging strand that make up the two sides of the fork. At the center of the fork, DNA polymerase slides along and replicates the DNA template. Pol ε is responsible for the leading strand synthesis and the Pol δ is responsible for lagging strand synthesis. In addition to the polymerases, there are primases, helicases and supplemental enzymes all present at this replication fork. The collision of these individual machineries can evidently cause severe consequences.

The realization that these clashes can occur at the replication fork provides further evidence for the phenomenon of transcription-associated mutagenesis, or TAM. Highly transcribed regions of the genome tend to have a lower density of packaging proteins such as nucleosomes. This leads to a more open structure and to ‘single-strandedness’. The loss of chromatin proteins and the resulting disruption of DNA packaging leave the DNA more susceptible to damage and loss of genome integrity. Overall, the spontaneous mutation rate of a eukaryotic gene is proportional to its transcription level. This means that the repeated passage of RNA polymerase II over the same segment of DNA leads to a higher probability of mutagenesis. The research group has described an example of such mutation: in yeast, highly transcribing a particular region leads to the accumulation of dUTP instead of dTTP in the DNA during replication. This example supports the conclusion that replication fork breakdown does occur when replication clashes with ongoing transcription.

RNA mutagenesis does not contribute to the loss of genome integrity as significantly as DNA mutagenesis does. This is because the mRNAs used to produce proteins are short-lived, especially in comparison to tRNA and rRNA. Because mRNA is so short-lived, it is doubtful that RNA mutagenesis is a significant cause of mutant protein.

In conclusion, during transcription and replication of DNA, a loss of genome integrity may occur as a result of DNA lesions and of clashes between the proteins and polymerases involved in these normal processes. The cell uses pathways such as TC-NER, GG-NER, and RNA polymerase II ubiquitylation to attempt to remove the lesions. Better computational models for studying these pathways may help to explain genome instability more fully in the future.

1. Svejstrup, Jesper Q. The interface between transcription and mechanisms maintaining genome integrity. Clare Hall Laboratories, Cancer Research UK London Research Institute, Blanche Lane, South Mimms, EN6 3LD, UK. Trends in Biochemical Sciences Vol. 35 No. 6

2. Mefford, H.C and Eichler, E.E. (2009). Duplication hotspots, rare genomic disorders, and common disease. Curr, Opin. Genet. Dev. 19, 196-204.


Restriction enzymes were first discovered by Werner Arber and Hamilton Smith. Daniel Nathans pioneered their use which led to recombinant DNA technology.

Restriction Endonucleases[edit]

Restriction endonucleases, also known as restriction enzymes, are responsible for the phenomenon in bacteria known as host-controlled restriction modification or phenotypic modification. Restriction/modification enzyme systems are divided into three categories: Type I, Type II, and Type III. The key distinction between these systems is that Type II systems contain separate restriction and methylation enzymes, while Type I and Type III enzymes carry both restriction and methylation activities in one enzyme consisting of two or three heterologous subunits. Typical commercial restriction enzymes used in molecular biology come from Type II systems. Type II restriction endonucleases recognize specific palindromic sequences (sequences that read the same on both strands when each strand is read in the 5′ to 3′ direction). The restriction enzyme recognizes a particular sequence of base pairs (about 4-8 bp long) with an axis of rotational symmetry. Once this recognition site is found, the enzyme cleaves a phosphodiester bond in each strand of the double-helical DNA. The number and size of the fragments produced depend on how frequently the recognition site occurs in the DNA being cut. The restriction enzyme cuts the DNA into smaller fragments so that they can be analyzed and manipulated more easily. Restriction endonucleases can help with analyzing chromosome structure, sequencing long DNA molecules, isolating genes, and creating new DNA molecules to be cloned.
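How the number and size of fragments depend on the occurrences of the recognition site can be sketched with a simple in-silico digest. The example is deliberately simplified: it treats the DNA as linear, looks only at the top strand, and takes the cut position as an offset into the site. The function name `digest` and the test sequence are hypothetical; the EcoRI site GAATTC, cut after the first G, is the commonly cited one and is used here only as an illustration.

```python
def digest(dna: str, site: str, cut_offset: int) -> list:
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the number of bases from the start of the site to the cut
    on the top strand (for example 1 for G^AATTC).
    """
    dna, site = dna.upper(), site.upper()

    # Collect the cut position for every occurrence of the site.
    cut_positions = []
    start = dna.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = dna.find(site, start + 1)

    # Slice the sequence between consecutive cut positions.
    fragments, previous = [], 0
    for position in cut_positions:
        fragments.append(dna[previous:position])
        previous = position
    fragments.append(dna[previous:])
    return fragments


# Two sites in this hypothetical sequence give three fragments.
fragments = digest("AAGAATTCTTTTGAATTCCC", "GAATTC", 1)
print(fragments, [len(f) for f in fragments])
```

A sequence with no recognition site is returned as a single uncut fragment, and each additional site adds one more fragment.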

Cleavage by a restriction enzyme can generate several kinds of ends. Typically, the resulting fragments carry 3′-hydroxyl and 5′-phosphate termini. Some cleavages produce single-stranded overhangs, called cohesive ends or sticky ends, while others generate blunt ends. These ends can subsequently be annealed and ligated to vector DNA or to any DNA with compatible ends. Not all cuts are made directly at the axis of symmetry: BamHI, for example, cuts the two strands at staggered positions.

It is also possible to visualize restriction fragments by gel electrophoresis. There are three different methods that can be used. The first is a polyacrylamide gel, which can separate fragments of up to 1000 base pairs. The next is an agarose gel, which can separate fragments of up to 20 kb. The last is pulsed-field gel electrophoresis (PFGE), which can separate fragments of up to millions of nucleotides based on the stretching and relaxation of DNA as the electric field is turned on and off. Autoradiography or staining with ethidium bromide can then be used to visualize the DNA. The gel is run in an electric field, which separates the fragments by size: smaller fragments travel farther through the gel, while larger fragments stay closer to the start. By comparison with a known size standard, the fragment sizes, and hence the locations where the restriction enzyme cut, can be determined.

Suppose the following segment of DNA is recognized by the restriction enzyme. The red (*) symbolizes the axis of symmetry. One major feature of these cleavage sites is their twofold rotational symmetry about this axis. The cleavage site is also highlighted on the diagram. Once the site has been recognized, the phosphodiester bond between the highlighted C-G and G-C will be cleaved by the restriction enzyme on the corresponding strands of the double helix. Note that different restriction enzymes have different cleavage sites. Therefore cleavage of the phosphodiester bond will not always be between C-G or G-C.

5′ C-C-G-C-G-G 3′

3′ G-G-C-G-C-C 5′
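The twofold symmetry of such a site can be verified by checking that the sequence equals its own reverse complement. The sketch below is a minimal illustration with hypothetical function names; it uses the CCGCGG example shown above.

```python
# Translation table mapping each DNA base to its complement
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read in the 5' to 3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(seq: str) -> bool:
    """True if both strands read the same in the 5' to 3' direction."""
    return seq.upper() == reverse_complement(seq)


# The example site, with the dashes used in the text removed
site = "C-C-G-C-G-G".replace("-", "")
print(site, reverse_complement(site), is_palindromic_site(site))
```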


Restriction enzyme Eco RI.JPG

BamHI as an example[edit]

BamHI is an example of a restriction enzyme. It cleaves palindromic sequences that are 6 bases long. For example:

File:BamHI palendromic sequence.jpg

This image shows where BamHI would cleave the palindromic sequence.

BamHI binds to non-specific DNA as a dimer and slides along the DNA strands, quickly reading the sequence to find its palindromic recognition site of 6 bases. If it encounters 6 bases that do not form the palindromic site, it may still cut them, but it does so poorly. BamHI works best when it finds the palindromic sequence.
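The scanning and cutting behaviour can be illustrated with a short sketch that searches a linear sequence for the BamHI recognition site, GGATCC, and cuts after the first G (G^GATCC). Only the top strand is modelled; on the bottom strand the enzyme cuts at the symmetric position, which leaves a four-base GATC overhang. The function name `digest_linear` and the example sequence are hypothetical, and the weak cutting at imperfect sites is deliberately ignored.

```python
SITE = "GGATCC"   # BamHI recognition sequence
CUT_OFFSET = 1    # BamHI cuts after the first G: G^GATCC

def digest_linear(seq, site=SITE, cut_offset=CUT_OFFSET):
    """Cut the top strand of a linear DNA at every exact match of the site."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments

dna = "AAAGGATCCTTTTGGATCCAA"  # hypothetical sequence with two BamHI sites
print(digest_linear(dna))  # ['AAAG', 'GATCCTTTTG', 'GATCCAA']
```

Two sites in a linear molecule give three fragments, and each fragment downstream of a cut begins with the GATC bases that form the sticky end.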

File:Catalytic mechanism.jpg
This image shows the mechanism that BamHI uses to cleave the phosphodiester bond. In the first step, the substrate reacts with water and gains a hydroxide group, which leaves the phosphate with two negative charges. A metal ion is usually present in the transition state, because its positive charge balances the two negative charges and makes the transition state more stable. Magnesium works well in this role. Calcium is sometimes used as well, but it works less well because it hinders the cleavage process. Water is then added to the transition state and donates a proton to the leaving group, which finally breaks the bond and cleaves the DNA.

Apart from the phosphodiester bond, which is now broken so that the DNA is cleaved, essentially everything remains the same between the pre-reactive and post-reactive states, as shown in the following picture.

File:Pre and post reactive states for BamHI.jpg

The red image shows the post-reactive state and the blue image shows the pre-reactive state. The most obvious difference is in the middle, where the phosphodiester bond is now broken (near the A) and the two bases are separated.

The 3-D image of the BamHI for pre-reactive state

The 3-D image of the BamHI for post-reactive state

Restriction Enzyme Control

Restriction endonucleases must satisfy two requirements, which account for their high specificity. First, they must not degrade host DNA that contains the sequence they recognize. Second, they must cleave only the sites they specifically recognize, which means telling their own recognition sequence apart from every other sequence. Restriction endonucleases are able to spare host DNA through the process of methylation. Methylase enzymes (methyltransferases) present in the organism protect the organism's own DNA, including its palindromic recognition sequences, from being cleaved by the restriction endonucleases. Once the methylases have methylated the adenines within the organism's own recognition sites, the restriction enzymes will not cut at these sites. Every restriction endonuclease is paired with a specific methylase in the host cell that methylates the corresponding sequence sites in the host's DNA. This combination of self-methylation and restriction enzyme action is known as a restriction-modification system. The restriction endonucleases produced by the host organism can cleave only sequences that are not marked with methylated adenines, which is how restriction enzyme activity is controlled.
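The logic of a restriction-modification system can be sketched as a simple filter: a recognition site is cleaved only if it has not been marked by the host methylase. In the sketch below, methylation is modelled abstractly as a set of site start positions rather than as individual methylated bases; the function name `cuttable_sites`, the example sequence, and the use of the EcoRI site GAATTC are assumptions made for illustration.

```python
def cuttable_sites(seq, site, methylated_starts):
    """Return start positions of recognition sites that are still cleavable.

    methylated_starts: set of start positions of sites that the host
    methylase has already modified (an abstract stand-in for methylated
    adenines within those sites).
    """
    seq = seq.upper()
    hits = []
    pos = seq.find(site)
    while pos != -1:
        if pos not in methylated_starts:
            hits.append(pos)
        pos = seq.find(site, pos + 1)
    return hits

sequence = "TTGAATTCAAGAATTCTT"  # hypothetical DNA with two GAATTC sites

# Host DNA: both sites methylated, so the host's own enzyme cuts nowhere.
print(cuttable_sites(sequence, "GAATTC", {2, 10}))  # []

# Foreign DNA with the same sequence but no methylation is cut at both sites.
print(cuttable_sites(sequence, "GAATTC", set()))    # [2, 10]
```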

Cleavage Mechanism

As stated above, restriction endonucleases catalyze the hydrolysis of phosphodiester bonds in DNA, cleaving the bond between the 3′ oxygen and the phosphorus atom. The mechanism of this reaction is based on nucleophilic attack on the phosphorus, which creates a pentacoordinate transition state with a trigonal bipyramidal structure.

Type I
Type I restriction enzymes were the first to be identified and are characteristic of two different strains of E. coli. The recognition site is asymmetrical and is composed of two portions, one containing 3-4 nucleotides and another containing 4-5 nucleotides, separated by a spacer. Several enzyme cofactors are required for their activity. Type I restriction enzymes possess three subunits called HsdR, HsdM, and HsdS: HsdR is required for restriction, HsdM is necessary for adding methyl groups to host DNA (methyltransferase activity), and HsdS is important for the specificity of cut site recognition, in addition to its role in methyltransferase activity.

Type II
Typical type II restriction enzymes differ from type I restriction enzymes in several ways. They are composed of only one type of subunit; their recognition sites are usually undivided, palindromic, and 4-8 nucleotides in length; they recognize and cleave DNA at the same site; and they require no cofactors for their activity other than Mg2+. These are the most commonly available and most widely used restriction enzymes.

Type III
Type III restriction enzymes recognize two separate non-palindromic sequences that are inversely oriented. They cut DNA about 20-30 base pairs after the recognition site.

Analyzing Restriction Digests

Res gel.jpg

After a restriction enzyme digest of plasmid DNA, the DNA fragments were analyzed on an agarose gel. By examining the pattern of bands obtained on the gel, the sizes of the DNA and the vector can be determined. This can be used to confirm whether the correct plasmid was isolated.

From the figure above, the following conclusions can be drawn:

Lanes 3 and 6 were cut twice by the enzyme. The two bands represent two pieces of linear DNA: the bottom band is the size of the gene inserted between the two enzyme sites in the multiple cloning site, and the other is the size of the rest of the plasmid.

Lanes 4 and 7 were cut once by the enzyme. A band that moves more slowly than the other linear DNA indicates that the DNA is nicked. Nicked DNA uncoils but remains circular, and it usually migrates more slowly than linear DNA of the same size.

Lanes 5 and 8 were undigested. Supercoiled DNA migrates more rapidly than linear DNA of the same size.
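The number of linear bands expected in each lane follows from how a circular plasmid is cut: one cut linearizes the plasmid into a single full-length fragment, while two cuts release two fragments whose sizes add up to the plasmid length. The sketch below computes the expected fragment sizes; the function name `circular_fragment_sizes`, the 4000 bp plasmid, and the cut positions are hypothetical values used only for illustration.

```python
def circular_fragment_sizes(plasmid_length, cut_positions):
    """Return linear fragment sizes (largest first) after cutting a circular plasmid."""
    cuts = sorted(set(p % plasmid_length for p in cut_positions))
    if not cuts:
        return []  # an uncut plasmid stays circular, so there are no linear fragments
    sizes = []
    for i, cut in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]
        # Distance around the circle to the next cut; a single cut spans the whole plasmid.
        sizes.append((next_cut - cut) % plasmid_length or plasmid_length)
    return sorted(sizes, reverse=True)

print(circular_fragment_sizes(4000, [500]))        # single cut -> [4000]
print(circular_fragment_sizes(4000, [500, 1700]))  # two cuts   -> [2800, 1200]
```

In this hypothetical case the 1200 bp band would correspond to the insert and the 2800 bp band to the rest of the plasmid.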

A gel envelope that is to be placed on a test pad in preparation for Southern blotting, photo by Linda Bartlett, [40]

Southern blotting, named after its inventor, Edwin Southern, is a common technique used in molecular biology to separate and characterize DNA. It is a hybridization technique that enables researchers to determine whether a specific nucleotide sequence is present in a mixture of DNA fragments that has been separated by agarose gel electrophoresis.

Southern blotting was the first technique of its kind. Soon afterward, analogous techniques known as Western blotting and Northern blotting (among others) were named by analogy, following the convention set by the name of Southern blotting. These techniques detect the presence of proteins and RNA, respectively, rather than DNA.


DNA is digested with one or more restriction enzymes, and the resulting fragments are separated according to size by electrophoresis through an agarose or acrylamide gel. The DNA is then denatured and transferred from the gel to a solid support (usually a nitrocellulose filter), because the DNA is inaccessible to DNA probes while it is embedded in the gel. The relative positions of the DNA fragments are preserved during their transfer to the filter. The DNA fragments attached to the filter are then exposed to a strand of radioactively labeled DNA that is complementary to the DNA sequence of interest. Autoradiography is then used to locate the positions of the bands complementary to the probe.
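The detection step can be sketched as a search for fragments that contain a sequence complementary to the probe. The example below is a simplification that requires an exact match, whereas real hybridization tolerates some mismatches depending on the conditions. Each fragment is given as its top strand only, and because the denatured fragment supplies both strands, a probe is counted as hybridizing if either the probe sequence or its reverse complement appears in that top strand. The function names, fragments, and probe are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def hybridizing_fragments(fragments, probe):
    """Return the indices of fragments that a single-stranded probe can pair with."""
    probe = probe.upper()
    probe_rc = reverse_complement(probe)
    return [
        index
        for index, fragment in enumerate(fragments)
        if probe in fragment.upper() or probe_rc in fragment.upper()
    ]

fragments = ["AAAG", "GATCCTTTTG", "GATCCAA"]  # hypothetical restriction fragments
print(hybridizing_fragments(fragments, "CAAAAG"))  # [1]
```

Only the fragment at index 1 would produce a band on the autoradiograph in this example.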


1. The DNA sample is digested, if necessary, with one or more appropriate restriction enzymes (restriction endonucleases). The digest is then separated by gel electrophoresis, usually on an agarose gel. If a large number of restriction fragments are present, the sample may appear as a faint smear rather than as discrete bands.

Until the 1970s, DNA fragments were isolated using gravity. Scientists then found another way to separate DNA fragments: gel electrophoresis, which uses electricity to separate molecules of different sizes.

2. The digest is then denatured to allow for transfer onto a membrane. Since the sample DNA is double-stranded and only single-stranded DNA can be transferred, the sample must be denatured by soaking it in an appropriate alkaline solution (e.g. 0.5 M NaOH). If the fragments are still too large to be transferred (more than 15 kb), the sample may first be treated with an appropriate acid to depurinate it (i.e. remove the purines), which breaks it down into smaller fragments. The sample is then neutralized before continuing with the procedure.

3. The sample is transferred onto a membrane, a sheet of special blotting paper, for analysis. The transfer to a membrane preserves the positions that the DNA fragments reached during electrophoresis. A nitrocellulose membrane is generally used, though some prefer nylon for its better binding capacity; nylon is also less fragile than nitrocellulose. The membrane is laid on top of the gel, and paper towels are usually placed on top of the membrane to apply pressure evenly and ensure an even transfer. The transfer is usually done by capillary action, which may take several hours. Alternatively, a vacuum apparatus can be used; this works much like capillary action but may be faster. The DNA can also be moved out of the gel and onto the membrane by electrophoresis, a process called electrotransfer.

4. The membrane is then treated with UV light to cross-link the DNA to it covalently and irreversibly. Alternatively, the membrane may be baked at around 80°C for several hours (this should only be done with a nylon membrane, since nitrocellulose is highly combustible).

5. The membrane is then probed with labeled, single-stranded DNA (this is the target DNA sequence). This process is also known as hybridization. The labeled DNA binds to the membrane DNA via the binding of complementary strands. The label is generally a 32P probe label, though a bioluminescent probe or biotin/streptavidin may also be used. The reason that hybridization is important is that it allows you to physicall