Vertebrates comprise all species of animals within the subphylum Vertebrata. Vertebrates represent the overwhelming majority of the phylum Chordata, with about 69,276 species described. Vertebrates include the jawless fishes and the jawed vertebrates, which include the cartilaginous fishes and the bony fishes. The bony fishes in turn, cladistically speaking, include the tetrapods, which include amphibians, reptiles, birds and mammals. Extant vertebrates range in size from the frog species Paedophryne amauensis, at as little as 7.7 mm, to the blue whale, at up to 33 m. Vertebrates make up less than five percent of all described animal species. The vertebrates traditionally include the hagfish, which do not have proper vertebrae due to their loss in evolution, though their closest living relatives, the lampreys, do. Hagfish do, however, possess a cranium. For this reason, the vertebrate subphylum is sometimes referred to as "Craniata" when discussing morphology. Molecular analysis since 1992 has suggested that hagfish are most closely related to lampreys, and so are vertebrates in a monophyletic sense.
Others consider them a sister group of vertebrates in the common taxon Craniata. The word vertebrate derives from the Latin word vertebratus, which in turn comes from vertebra, a term for any of the bones or segments of the spinal column. All vertebrates are built along the basic chordate body plan: a stiff rod running through the length of the animal, with a hollow tube of nervous tissue above it and the gastrointestinal tract below. In all vertebrates, the mouth is found at, or right below, the anterior end of the animal, while the anus opens to the exterior before the end of the body. The remaining part of the body continuing after the anus forms a tail with vertebrae and spinal cord, but no gut. The defining characteristic of a vertebrate is the vertebral column, in which the notochord found in all chordates has been replaced by a segmented series of stiffer elements separated by mobile joints. However, a few vertebrates, such as the sturgeon and coelacanth, have secondarily lost this anatomy and retain the notochord into adulthood.
Jawed vertebrates are typified by paired appendages, but this trait is not required in order for an animal to be a vertebrate. All basal vertebrates breathe with gills. The gills are carried right behind the head, bordering the posterior margins of a series of openings from the pharynx to the exterior. Each gill is supported by a cartilaginous or bony gill arch. The bony fish have three pairs of arches, cartilaginous fish have five to seven pairs, while the primitive jawless fish have seven. The vertebrate ancestor no doubt had more arches than this, as some of their chordate relatives have more than 50 pairs of gills. In amphibians and some primitive bony fishes, the larvae bear external gills, branching off from the gill arches. These are reduced in adulthood, their function taken over by the gills proper in fishes and by lungs in most amphibians. Some amphibians retain the external larval gills in adulthood, the complex internal gill system as seen in fish being irrevocably lost early in the evolution of tetrapods.
While the more derived vertebrates lack gills, the gill arches still form during fetal development and form the basis of essential structures such as jaws, the thyroid gland, the larynx, the columella and, in mammals, the malleus and incus. The central nervous system of vertebrates is based on a hollow nerve cord running along the length of the animal. Of particular importance and unique to vertebrates is the presence of neural crest cells. These are progenitors of stem cells, critical to coordinating the functions of cellular components. Neural crest cells migrate through the body from the nerve cord during development and initiate the formation of neural ganglia and structures such as the jaws and skull. The vertebrates are the only chordate group to exhibit cephalisation, the concentration of brain functions in the head. A slight swelling of the anterior end of the nerve cord is found in the lancelet, a chordate, though it lacks the eyes and other complex sense organs comparable to those of vertebrates.
Other chordates do not show any trends towards cephalisation. A peripheral nervous system branches out from the nerve cord to innervate the various systems. The front end of the nerve tube is expanded by a thickening of the walls and expansion of the central canal of the spinal cord into three primary brain vesicles: the prosencephalon (forebrain), mesencephalon (midbrain) and rhombencephalon (hindbrain), which are further differentiated in the various vertebrate groups. Two laterally placed eyes form around outgrowths from the midbrain, except in hagfish, though this may be a secondary loss. The forebrain is well developed and subdivided in most tetrapods, while the midbrain dominates in many fish and some salamanders. Vesicles of the forebrain are usually paired, giving rise to hemispheres like the cerebral hemispheres in mammals. The resulting anatomy of the central nervous system, with a single hollow nerve cord topped by a series of vesicles, is unique to vertebrates. All invertebrates with well-developed brains, such as insects and squids, have a ventral rather than dorsal system of ganglia, with a split brain stem running on each side of the mouth or gut.
Vertebrates originated about 525 million years ago during the Cambrian explosion, which saw
Glycine is an amino acid that has a single hydrogen atom as its side chain. It is the simplest amino acid, with the chemical formula NH2‐CH2‐COOH. Glycine is one of the proteinogenic amino acids. It is encoded by all the codons starting with GG. Glycine is known as a "helix breaker", due to its ability to act as a hinge in the secondary structure of proteins. Glycine is a sweet-tasting crystalline solid and is the only achiral proteinogenic amino acid. It can fit into hydrophilic or hydrophobic environments, due to its minimal side chain of only one hydrogen atom. The acyl radical is glycyl. Glycine was discovered in 1820 by the French chemist Henri Braconnot when he hydrolyzed gelatin by boiling it with sulfuric acid. He called it "sugar of gelatin", but the French chemist Jean-Baptiste Boussingault showed that it contained nitrogen. The American scientist Eben Norton Horsford, a student of the German chemist Justus von Liebig, proposed the name "glycocoll". The name comes from the Greek word γλυκύς, "sweet tasting".
In 1858, the French chemist Auguste Cahours determined that glycine was an amine of acetic acid. Although glycine can be isolated from hydrolyzed protein, this is not used for industrial production, as it can be manufactured more conveniently by chemical synthesis. The two main processes are amination of chloroacetic acid with ammonia, giving glycine and ammonium chloride, and the Strecker amino acid synthesis, which is the main synthetic method in the United States and Japan. About 15 thousand tonnes are produced annually in this way. Glycine is also cogenerated as an impurity in the synthesis of EDTA, arising from reactions of the ammonia coproduct. In aqueous solution, glycine itself is amphoteric: at low pH the molecule can be protonated with a pKa of about 2.4 and at high pH it loses a proton with a pKa of about 9.6. Glycine is not essential to the human diet, as it is biosynthesized in the body from the amino acid serine, which is in turn derived from 3-phosphoglycerate, but the metabolic capacity for glycine biosynthesis does not satisfy the need for collagen synthesis.
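The amphoteric behaviour described above can be illustrated with a short calculation. The following is a minimal sketch, assuming the two approximate pKa values quoted in the text (2.4 for the carboxyl group, 9.6 for the amino group) and the standard Henderson-Hasselbalch relation; the function names are hypothetical and chosen only for this illustration.

```python
def glycine_net_charge(ph, pka_carboxyl=2.4, pka_amino=9.6):
    """Estimate the average net charge of glycine at a given pH.

    Assumes ideal Henderson-Hasselbalch behaviour and the approximate
    pKa values quoted in the text; real solutions deviate somewhat.
    """
    # Fraction of carboxyl groups that are deprotonated (COO-, charge -1).
    frac_carboxylate = 1.0 / (1.0 + 10 ** (pka_carboxyl - ph))
    # Fraction of amino groups that are protonated (NH3+, charge +1).
    frac_ammonium = 1.0 / (1.0 + 10 ** (ph - pka_amino))
    return frac_ammonium - frac_carboxylate


def isoelectric_point(pka_carboxyl=2.4, pka_amino=9.6):
    """pH at which the average net charge is zero (mean of the two pKa values)."""
    return (pka_carboxyl + pka_amino) / 2.0


if __name__ == "__main__":
    for ph in (1.0, 6.0, 12.0):
        print(f"pH {ph:4.1f}: net charge {glycine_net_charge(ph):+.3f}")
    print("estimated isoelectric point:", isoelectric_point())
```

Under these assumptions the molecule is mostly cationic at low pH, mostly anionic at high pH, and close to neutral overall (as a zwitterion) around the midpoint of the two pKa values.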
In most organisms, the enzyme serine hydroxymethyltransferase catalyses this transformation via the cofactor pyridoxal phosphate: serine + tetrahydrofolate → glycine + N5,N10-methylene tetrahydrofolate + H2O. In the liver of vertebrates, glycine synthesis is catalyzed by glycine synthase. This conversion is reversible: CO2 + NH4+ + N5,N10-methylene tetrahydrofolate + NADH + H+ ⇌ glycine + tetrahydrofolate + NAD+. Glycine is degraded via three pathways. The predominant pathway in animals and plants is the reverse of the glycine synthase pathway mentioned above. In this context, the enzyme system involved is called the glycine cleavage system: glycine + tetrahydrofolate + NAD+ ⇌ CO2 + NH4+ + N5,N10-methylene tetrahydrofolate + NADH + H+. In the second pathway, glycine is degraded in two steps. The first step is the reverse of glycine biosynthesis from serine with serine hydroxymethyltransferase. Serine is then converted to pyruvate by serine dehydratase. In the third pathway of glycine degradation, glycine is converted to glyoxylate by D-amino acid oxidase.
Glyoxylate is then oxidized by hepatic lactate dehydrogenase to oxalate in an NAD+-dependent reaction. The half-life of glycine and its elimination from the body vary significantly with dose. In one study, the half-life ranged up to 4.0 hours. The principal function of glycine is as a precursor to proteins. Most proteins incorporate only small quantities of glycine, a notable exception being collagen, which contains about 35% glycine due to its periodically repeated role in the formation of collagen's helix structure in conjunction with hydroxyproline. In the genetic code, glycine is coded by all codons starting with GG, namely GGU, GGC, GGA and GGG. In higher eukaryotes, δ-aminolevulinic acid, the key precursor to porphyrins, is biosynthesized from glycine and succinyl-CoA by the enzyme ALA synthase. Glycine provides the central C2N subunit of all purines. Glycine is an inhibitory neurotransmitter in the central nervous system, particularly in the spinal cord and retina. When glycine receptors are activated, chloride enters the neuron via ionotropic receptors, causing an inhibitory postsynaptic potential.
Strychnine is a strong antagonist at ionotropic glycine receptors, whereas bicuculline is a weak one. Glycine is a required co-agonist along with glutamate for NMDA receptors. In contrast to the inhibitory role of glycine in the spinal cord, this behaviour is facilitated at the glutamatergic receptors, which are excitatory. The LD50 of glycine is 7930 mg/kg in rats, and it usually causes death by hyperexcitability. In the US, glycine is typically sold in two grades: United States Pharmacopeia (USP) grade and technical grade. USP grade sales account for 80 to 85 percent of the U.S. market for glycine. If purity greater than the USP standard is needed, for example for intravenous injections, a more expensive pharmaceutical grade glycine can be used. Technical grade glycine, which may or may not meet USP grade standards, is sold at a lower price for use in industrial applications, e.g. as an agent in metal complexing and finishing. USP glycine has a wide variety of uses, including as an additive in pet food and animal feed, in foods and pharmaceuticals as a sweetener or taste enhancer, or as a component of food supplements and protein drinks.
Two glycine molecules in a dipeptide form are referred to as a diglycinate. Because they use a different s
Amino acids are organic compounds containing amine and carboxyl functional groups, along with a side chain specific to each amino acid. The key elements of an amino acid are carbon, hydrogen, oxygen and nitrogen, although other elements are found in the side chains of certain amino acids. About 500 naturally occurring amino acids are known and can be classified in many ways. They can be classified according to the core structural functional groups' locations as alpha-, beta-, gamma- or delta-amino acids. In the form of proteins, amino acid residues form the second-largest component (water is the largest) of human muscles and other tissues. Beyond their role as residues in proteins, amino acids participate in a number of processes such as neurotransmitter transport and biosynthesis. In biochemistry, amino acids having both the amine and the carboxylic acid groups attached to the first carbon atom have particular importance. They are known as α-amino acids. They include the 22 proteinogenic amino acids, which combine into peptide chains to form the building blocks of a vast array of proteins.
These are all L-stereoisomers, although a few D-amino acids occur in bacterial envelopes, as a neuromodulator, and in some antibiotics. Twenty of the proteinogenic amino acids are encoded directly by triplet codons in the genetic code and are known as "standard" amino acids. The other two are selenocysteine and pyrrolysine, which are encoded via variant codons. N-formylmethionine is generally considered a form of methionine rather than a separate proteinogenic amino acid. Codon–tRNA combinations not found in nature can also be used to "expand" the genetic code and form novel proteins known as alloproteins incorporating non-proteinogenic amino acids. Many important proteinogenic and non-proteinogenic amino acids have biological functions. For example, in the human brain, glutamate and gamma-amino-butyric acid are the main excitatory and inhibitory neurotransmitters, respectively. Hydroxyproline, a major component of the connective tissue collagen, is synthesised from proline. Glycine is a biosynthetic precursor to porphyrins used in red blood cells.
Carnitine is used in lipid transport. Nine proteinogenic amino acids are called "essential" for humans because they cannot be produced from other compounds by the human body and so must be taken in as food. Others may be conditionally essential for certain ages or medical conditions. Essential amino acids may also differ between species. Because of their biological significance, amino acids are important in nutrition and are commonly used in nutritional supplements, fertilizers and food technology. Industrial uses include the production of drugs, biodegradable plastics and chiral catalysts. The first few amino acids were discovered in the early 19th century. In 1806, French chemists Louis-Nicolas Vauquelin and Pierre Jean Robiquet isolated a compound in asparagus, subsequently named asparagine, the first amino acid to be discovered. Cystine was discovered in 1810, although its monomer, cysteine, remained undiscovered until 1884. Glycine and leucine were discovered in 1820. The last of the 20 common amino acids to be discovered was threonine in 1935 by William Cumming Rose, who also determined the essential amino acids and established the minimum daily requirements of all amino acids for optimal growth.
The unity of the chemical category was recognized by Wurtz in 1865, but he gave no particular name to it. Usage of the term "amino acid" in the English language is from 1898, while the German term, Aminosäure, was used earlier. Proteins were found to yield amino acids after enzymatic digestion or acid hydrolysis. In 1902, Emil Fischer and Franz Hofmeister independently proposed that proteins are formed from many amino acids, whereby bonds are formed between the amino group of one amino acid and the carboxyl group of another, resulting in a linear structure that Fischer termed "peptide". In the generic structure of an amino acid, R represents a side chain specific to each amino acid. The carbon atom next to the carboxyl group is called the α-carbon. Amino acids containing an amino group bonded directly to the alpha carbon are referred to as alpha amino acids. These include amino acids such as proline which contain secondary amines, which used to be referred to as "imino acids". The alpha amino acids are the most common form found in nature, but only when occurring in the L-isomer.
The alpha carbon is a chiral carbon atom, with the exception of glycine which has two indistinguishable hydrogen atoms on the alpha carbon. Therefore, all alpha amino acids but glycine can exist in either of two enantiomers, called L or D amino acids, which are mirror images of each other. While L-amino acids represent all of the amino acids found in proteins during translation in the ribosome, D-amin
The X chromosome is one of the two sex-determining chromosomes in many organisms, including mammals, and is found in both males and females. It is a part of the XY sex-determination system and the X0 sex-determination system. The X chromosome was named for its unique properties by early researchers, which resulted in the naming of its counterpart, the Y chromosome, for the next letter in the alphabet, following its subsequent discovery. It was first noted that the X chromosome was special by Hermann Henking. Henking was studying the testicles of Pyrrhocoris and noticed that one chromosome did not take part in meiosis. Chromosomes are so named because of their ability to take up staining. Although the X chromosome could be stained just as well as the others, Henking was unsure whether it was a different class of object and consequently named it X element, which later became X chromosome after it was established that it was indeed a chromosome. The idea that the X chromosome was named after its similarity to the letter "X" is mistaken. All chromosomes normally appear as an amorphous blob under the microscope and only take on a well-defined shape during mitosis.
This shape is vaguely X-shaped for all chromosomes. It is entirely coincidental that the Y chromosome, during mitosis, has two very short branches which can look merged under the microscope and appear as the descender of a Y-shape. It was first suggested that the X chromosome was involved in sex determination by Clarence Erwin McClung in 1901. After comparing his work on locusts with Henking's and others, McClung noted that only half the sperm received an X chromosome. He called this chromosome an accessory chromosome, insisted that it was a proper chromosome, and theorized that it was the male-determining chromosome. Luke Hutchison noticed that the number of possible ancestors on the X chromosome inheritance line at a given ancestral generation follows the Fibonacci sequence. A male individual has an X chromosome, which he received from his mother, and a Y chromosome, which he received from his father. The male counts as the "origin" of his own X chromosome, and at his parents' generation, his X chromosome came from a single parent.
The male's mother received one X chromosome from her mother and one from her father, so two grandparents contributed to the male descendant's X chromosome. The maternal grandfather received his X chromosome from his mother, and the maternal grandmother received X chromosomes from both of her parents, so three great-grandparents contributed to the male descendant's X chromosome. Five great-great-grandparents contributed to the male descendant's X chromosome, and so on. The X chromosome in humans spans more than 153 million base pairs. It represents about 800 protein-coding genes, compared to the Y chromosome containing about 70 genes, out of 20,000–25,000 total genes in the human genome. Each person usually has one pair of sex chromosomes in each cell. Females have two X chromosomes, whereas males have one X and one Y chromosome. Both males and females retain one of their mother's X chromosomes, and females retain their second X chromosome from their father. Since the father retains his X chromosome from his mother, a human female has one X chromosome from her paternal grandmother and one X chromosome from her mother.
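The ancestor counts walked through above (one parent, two grandparents, three great-grandparents, five great-great-grandparents) can be reproduced with a short sketch. It assumes only the two inheritance rules stated in the text: a male receives his X chromosome from his mother, and a female receives one X chromosome from each parent. The function name and its parameters are hypothetical, introduced for illustration.

```python
def x_ancestor_counts(generations, sex="male"):
    """Count, per ancestral generation, the ancestors who could have
    contributed to one person's X chromosome(s).

    Rules assumed (from the text): a male's X comes only from his mother;
    a female's two X chromosomes come from her mother and her father.
    """
    # Number of male and female individuals on the X inheritance line
    # at the current generation, starting with the person themselves.
    males, females = (1, 0) if sex == "male" else (0, 1)
    counts = []
    for _ in range(generations):
        # Each male contributes one mother; each female contributes
        # one mother and one father to the previous generation.
        males, females = females, males + females
        counts.append(males + females)
    return counts


if __name__ == "__main__":
    print(x_ancestor_counts(5, sex="male"))    # parents, grandparents, ...
    print(x_ancestor_counts(5, sex="female"))
```

For a male this yields 1, 2, 3, 5, 8 for successive generations back, matching the counts given in the text; for a female the same sequence starts one step later, at 2.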
This inheritance pattern follows the Fibonacci numbers at a given ancestral depth. Genetic disorders that are due to mutations in genes on the X chromosome are described as X linked. If a gene on the X chromosome carries a disease-causing mutation, it always causes illness in male patients, since men have only one X chromosome and therefore only one copy of each gene. Females, in contrast, may stay healthy and only be carriers of the genetic illness, since they have another X chromosome and thus the possibility of a healthy gene copy. For example, hemophilia and red-green colorblindness run in families this way. The X chromosome carries hundreds of genes but few, if any, of these have anything to do directly with sex determination. Early in embryonic development in females, one of the two X chromosomes is randomly and permanently inactivated in nearly all somatic cells. This phenomenon is called X-inactivation or Lyonization, and creates a Barr body. If X-inactivation in the somatic cell meant a complete de-functionalizing of one of the X chromosomes, it would ensure that females, like males, had only one functional copy of the X chromosome in each somatic cell.
This was assumed to be the case. However, recent research suggests that the Barr body may be more biologically active than was supposed; the partial inactivation of the X-chromosome is due to repressive heterochromatin that compacts the DNA and prevents the expression of most genes. Heterochromatin compaction is regulated by Polycomb Repressive Complex 2; the following are some of the gene count estimates of human X chromosome. Because researchers use different approaches to genome annotation their predictions of the number
Transcription is the first step of gene expression, in which a particular segment of DNA is copied into RNA by the enzyme RNA polymerase. Both DNA and RNA are nucleic acids. During transcription, a DNA sequence is read by an RNA polymerase, which produces a complementary, antiparallel RNA strand called a primary transcript. Transcription proceeds in the following general steps:
1. RNA polymerase, together with one or more general transcription factors, binds to promoter DNA.
2. RNA polymerase creates a transcription bubble. This is done by breaking the hydrogen bonds between complementary DNA nucleotides.
3. RNA polymerase adds RNA nucleotides.
4. RNA sugar-phosphate backbone forms with assistance from RNA polymerase to form an RNA strand.
5. Hydrogen bonds of the RNA–DNA helix break, freeing the newly synthesized RNA strand.
6. If the cell has a nucleus, the RNA may be further processed. This may include polyadenylation and splicing.
7. The RNA may exit to the cytoplasm through the nuclear pore complex.
The stretch of DNA transcribed into an RNA molecule is called a transcription unit and encodes at least one gene.
If the gene encodes a protein, the transcription produces messenger RNA. Alternatively, the transcribed gene may encode non-coding RNA such as microRNA, ribosomal RNA, transfer RNA, or enzymatic RNA molecules called ribozymes. Overall, RNA helps synthesize and process proteins. In virology, the term may also be used when referring to mRNA synthesis from an RNA molecule. For instance, the genome of a negative-sense single-stranded RNA virus may be a template for a positive-sense single-stranded RNA. This is because the positive-sense strand contains the information needed to translate the viral proteins for viral replication afterwards. This process is catalyzed by a viral RNA replicase. A DNA transcription unit encoding a protein may contain both a coding sequence, which will be translated into the protein, and regulatory sequences, which direct and regulate the synthesis of that protein. The regulatory sequence before the coding sequence is called the five prime untranslated region. As opposed to DNA replication, transcription results in an RNA complement that includes the nucleotide uracil in all instances where thymine would have occurred in a DNA complement.
Only one of the two DNA strands serves as a template for transcription. The antisense strand of DNA is read by RNA polymerase from the 3' end to the 5' end during transcription. The complementary RNA is created in the opposite direction, in the 5' → 3' direction, matching the sequence of the sense strand with the exception of switching uracil for thymine. This directionality is because RNA polymerase can only add nucleotides to the 3' end of the growing mRNA chain. This use of only the 3' → 5' DNA strand eliminates the need for the Okazaki fragments that are seen in DNA replication. This also removes the need for an RNA primer to initiate RNA synthesis, as is the case in DNA replication. The non-template strand of DNA is called the coding strand, because its sequence is the same as the newly created RNA transcript, except for the substitution of uracil for thymine. This is the strand that is used by convention when presenting a DNA sequence. Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA.
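The relationship between the template strand, the coding strand and the RNA transcript can be made concrete with a small sketch. It is an illustration only: sequences are plain strings of A, C, G and T, the template is assumed to be written in the 3' → 5' direction in which it is read, and the function names are hypothetical.

```python
# Base-pairing rules for building RNA against a DNA template:
# A pairs with U (not T) in the RNA product.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5):
    """Return the RNA (written 5' -> 3') made from a DNA template strand.

    The template is given in the 3' -> 5' direction, i.e. in the order
    RNA polymerase reads it, so the product is built left to right.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5)


def rna_from_coding_strand(coding_5_to_3):
    """Return the RNA sequence implied by the coding (sense) strand.

    The transcript matches the coding strand except that uracil
    replaces thymine.
    """
    return coding_5_to_3.replace("T", "U")


if __name__ == "__main__":
    coding = "ATGCCA"      # 5' -> 3', the strand shown by convention
    template = "TACGGT"    # 3' -> 5', complementary to the coding strand
    print(transcribe_template(template))
    print(rna_from_coding_strand(coding))
```

Both calls produce the same transcript, which is why a gene's sequence can be reported from the coding strand alone.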
As a result, transcription has a lower copying fidelity than DNA replication. Transcription is divided into initiation, promoter escape, elongation and termination. Transcription begins with the binding of RNA polymerase, together with one or more general transcription factors, to a specific DNA sequence referred to as a "promoter" to form an RNA polymerase-promoter "closed complex". In the "closed complex" the promoter DNA is still fully double-stranded. RNA polymerase, assisted by one or more general transcription factors, then unwinds approximately 14 base pairs of DNA to form an RNA polymerase-promoter "open complex". In the "open complex" the promoter DNA is partly unwound and single-stranded. The exposed, single-stranded DNA is referred to as the "transcription bubble." RNA polymerase, assisted by one or more general transcription factors, then selects a transcription start site in the transcription bubble, binds to an initiating NTP and an extending NTP complementary to the transcription start site sequence, and catalyzes bond formation to yield an initial RNA product.
In bacteria, RNA polymerase core enzyme consists of five subunits: 2 α subunits, 1 β subunit, 1 β' subunit and 1 ω subunit. In bacteria, there is one general RNA transcription factor: sigma. RNA polymerase core enzyme binds to the bacterial general transcription factor sigma to form RNA polymerase holoenzyme and then binds to a promoter. In archaea and eukaryotes, RNA polymerase contains subunits homologous to each of the five RNA polymerase subunits in bacteria and also contains additional subunits. In archaea and eukaryotes, the functions of the bacterial general transcription factor sigma are performed by multiple general transcription factors that work together. In archaea, there ar
Mammals are vertebrate animals constituting the class Mammalia, characterized by the presence of mammary glands which in females produce milk for feeding their young, a neocortex, fur or hair, and three middle ear bones. These characteristics distinguish them from reptiles and birds, from which they diverged in the late Triassic, 201–227 million years ago. There are around 5,450 species of mammals. The largest orders are the rodents, bats and Soricomorpha. The next three are the Primates, the Cetartiodactyla and the Carnivora. In cladistics, which reflect evolution, mammals are classified as endothermic amniotes. They are the only living Synapsida. The early synapsid mammalian ancestors were sphenacodont pelycosaurs, a group that produced the non-mammalian Dimetrodon. At the end of the Carboniferous period around 300 million years ago, this group diverged from the sauropsid line that led to today's reptiles and birds. The line following the stem group Sphenacodontia split off several diverse groups of non-mammalian synapsids (sometimes referred to as mammal-like reptiles) before giving rise to the proto-mammals in the early Mesozoic era.
The modern mammalian orders arose in the Paleogene and Neogene periods of the Cenozoic era, after the extinction of non-avian dinosaurs, and have been among the dominant terrestrial animal groups from 66 million years ago to the present. The basic body type is quadruped, and most mammals use their four extremities for terrestrial locomotion. Mammals range in size from the 30–40 mm bumblebee bat to the 30-meter blue whale, the largest animal on the planet. Maximum lifespan varies from two years for the shrew to 211 years for the bowhead whale. All modern mammals give birth to live young, except the five species of monotremes, which are egg-laying mammals. The most species-rich group of mammals, the cohort called placentals, have a placenta, which enables the feeding of the fetus during gestation. Most mammals are intelligent, with some possessing large brains, self-awareness and tool use. Mammals can communicate and vocalize in several different ways, including the production of ultrasound, scent-marking, alarm signals and echolocation.
Mammals can organize themselves into fission-fusion societies and hierarchies, but can also be solitary and territorial. Most mammals are polygynous. Domestication of many types of mammals by humans played a major role in the Neolithic revolution, and resulted in farming replacing hunting and gathering as the primary source of food for humans. This led to a major restructuring of human societies from nomadic to sedentary, with more co-operation among larger and larger groups, and ultimately the development of the first civilizations. Domesticated mammals provided, and continue to provide, power for transport and agriculture, as well as food and leather. Mammals are also hunted and raced for sport, and are used as model organisms in science. Mammals have been depicted in art since Palaeolithic times, and appear in literature, film and religion. Decline in numbers and extinction of many mammals is primarily driven by human poaching and habitat destruction, chiefly deforestation. Mammal classification has been through several iterations since Carl Linnaeus initially defined the class.
No classification system is universally accepted. George Gaylord Simpson's "Principles of Classification and a Classification of Mammals" provides systematics of mammal origins and relationships that were universally taught until the end of the 20th century. Since Simpson's classification, the paleontological record has been recalibrated, and the intervening years have seen much debate and progress concerning the theoretical underpinnings of systematization itself, partly through the new concept of cladistics. Though field work gradually made Simpson's classification outdated, it remains the closest thing to an official classification of mammals. Most mammals, including the six most species-rich orders, belong to the placental group. The three largest orders in numbers of species are Rodentia (mice, porcupines, beavers and other gnawing mammals), Chiroptera (bats) and Soricomorpha. The next three biggest orders, depending on the biological classification scheme used, are the Primates (including the apes and lemurs), the Cetartiodactyla and the Carnivora. According to Mammal Species of the World, 5,416 species were identified in 2006.
These were grouped into 153 families and 29 orders. In 2008, the International Union for Conservation of Nature completed a five-year Global Mammal Assessment for its IUCN Red List, which counted 5,488 species. According to research published in the Journal of Mammalogy in 2018, the number of recognized mammal species is 6,495, including 96 extinct. The word "mammal" is modern, from the scientific name Mammalia coined by Carl Linnaeus in 1758, derived from the Latin mamma. In an influential 1988 paper, Timothy Rowe defined Mammalia phylogenetically as the crown group of mammals, the clade consisting of the most recent common ancestor of living monotremes and therian m
In molecular biology, DNA replication is the biological process of producing two identical replicas of DNA from one original DNA molecule. DNA replication occurs in all living organisms, acting as the basis for biological inheritance. The cell possesses the distinctive property of division. DNA is made up of a double helix of two complementary strands. During replication, these strands are separated. Each strand of the original DNA molecule then serves as a template for the production of its counterpart, a process referred to as semiconservative replication. As a result of semiconservative replication, the new helix will be composed of an original DNA strand as well as a newly synthesized strand. Cellular proofreading and error-checking mechanisms ensure near perfect fidelity for DNA replication. In a cell, DNA replication begins at specific locations in the genome, known as origins of replication. Unwinding of DNA at the origin and synthesis of new strands, accommodated by an enzyme known as helicase, results in replication forks growing bi-directionally from the origin.
A number of proteins are associated with the replication fork to help in the initiation and continuation of DNA synthesis. Most prominently, DNA polymerase synthesizes the new strands by adding nucleotides that complement each strand. DNA replication occurs during the S-stage of interphase. DNA replication can also be performed in vitro. DNA polymerases isolated from cells and artificial DNA primers can be used to start DNA synthesis at known sequences in a template DNA molecule. Polymerase chain reaction, ligase chain reaction and transcription-mediated amplification are examples. DNA exists as a double-stranded structure, with both strands coiled together to form the characteristic double helix. Each single strand of DNA is a chain of four types of nucleotides. Nucleotides in DNA contain a deoxyribose sugar, a phosphate and a nucleobase. The four types of nucleotide correspond to the four nucleobases adenine, cytosine, guanine and thymine, commonly abbreviated as A, C, G and T. Adenine and guanine are purine bases, while cytosine and thymine are pyrimidines.
These nucleotides form phosphodiester bonds, creating the phosphate-deoxyribose backbone of the DNA double helix with the nucleobases pointing inward. Nucleobases are matched between strands through hydrogen bonds to form base pairs. Adenine pairs with thymine, and guanine pairs with cytosine. DNA strands have a directionality, and the different ends of a single strand are called the "3′ end" and the "5′ end". By convention, if the base sequence of a single strand of DNA is given, the left end of the sequence is the 5′ end, while the right end of the sequence is the 3′ end. The strands of the double helix are anti-parallel, with one being 5′ to 3′ and the opposite strand 3′ to 5′. These terms refer to the carbon atom in deoxyribose to which the next phosphate in the chain attaches. Directionality has consequences in DNA synthesis, because DNA polymerase can synthesize DNA in only one direction by adding nucleotides to the 3′ end of a DNA strand. The pairing of complementary bases in DNA means that the information contained within each strand is redundant.
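Because of base pairing and the anti-parallel arrangement, either strand determines the other. The following minimal sketch shows this redundancy, assuming sequences are plain strings of A, C, G and T written 5′ to 3′ as in the convention above; the function name is hypothetical.

```python
# Watson-Crick pairing rules stated in the text.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5_to_3):
    """Return the partner strand, also written 5' -> 3'.

    The partner runs anti-parallel, so each base is complemented and
    the order is reversed to keep the 5' end on the left.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand_5_to_3))


if __name__ == "__main__":
    strand = "ATGC"
    partner = reverse_complement(strand)
    print(strand, "pairs with", partner)
    # Applying the operation twice recovers the original strand,
    # which is the sense in which each strand's information is redundant.
    print(reverse_complement(partner) == strand)
```

The round trip in the last line illustrates why a single strand is enough to reconstruct the full double helix during replication.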
Phosphodiester bonds are stronger than hydrogen bonds. This allows the strands to be separated from one another. The nucleotides on a single strand can therefore be used to reconstruct nucleotides on a newly synthesized partner strand. DNA polymerases are a family of enzymes. DNA polymerases in general cannot initiate synthesis of new strands, but can only extend an existing DNA or RNA strand paired with a template strand. To begin synthesis, a short fragment of RNA, called a primer, must be created and paired with the template DNA strand. DNA polymerase adds a new strand of DNA by extending the 3′ end of an existing nucleotide chain, adding new nucleotides matched to the template strand one at a time via the creation of phosphodiester bonds. The energy for this process of DNA polymerization comes from hydrolysis of the high-energy phosphate bonds between the three phosphates attached to each unincorporated base. Free bases with their attached phosphate groups are called nucleotides. When a nucleotide is being added to a growing DNA strand, the formation of a phosphodiester bond between the proximal phosphate of the nucleotide and the growing chain is accompanied by hydrolysis of a high-energy phosphate bond with release of the two distal phosphates as a pyrophosphate.
Enzymatic hydrolysis of the resulting pyrophosphate into inorganic phosphate consumes a second high-energy phosphate bond and renders the reaction effectively irreversible. In general, DNA polymerases are highly accurate, with an intrinsic error rate of less than one mistake for every 10^7 nucleotides added. In addition, some DNA polymerases also have proofreading ability. Finally, post-replication mismatch repair mechanisms monitor the DNA for errors, being capable of distinguishing mismatches in the newly synthesized DNA strand from the original strand sequence. Together, these three discrimination steps enable replication fidelity of less than one mistake for every 10^9 nucleotides added. The rate of DNA replication in a living cell was first measured as the rate of phage T4 DNA elongation in phage-infected E. coli. During the period of exponential DNA increase at 37 °C, the rate was 749 nucleotides per second