A chromosome is a deoxyribonucleic acid (DNA) molecule with part or all of the genetic material of an organism. Most eukaryotic chromosomes include packaging proteins which, aided by chaperone proteins, bind to and condense the DNA molecule to prevent it from becoming an unmanageable tangle. Chromosomes are normally visible under a light microscope only when the cell is undergoing the metaphase of cell division. Before this happens, every chromosome is copied once and the copy is joined to the original by a centromere, resulting either in an X-shaped structure if the centromere is located in the middle of the chromosome or in a two-arm structure if the centromere is located near one of the ends. The original chromosome and the copy are now called sister chromatids. During metaphase the X-shaped structure is called a metaphase chromosome. In this condensed form chromosomes are easiest to distinguish and study. In animal cells, chromosomes reach their highest compaction level in anaphase, during chromosome segregation.
Chromosomal recombination during meiosis and subsequent sexual reproduction plays a significant role in genetic diversity. If these structures are manipulated incorrectly, through processes known as chromosomal instability and translocation, the cell may undergo mitotic catastrophe. This will make the cell initiate apoptosis, leading to its own death, but sometimes mutations in the cell hamper this process and thus cause progression of cancer. Some use the term chromosome in a wider sense, to refer to the individualized portions of chromatin in cells, whether or not they are visible under light microscopy. Others use the concept in a narrower sense, to refer to the individualized portions of chromatin during cell division, visible under light microscopy due to high condensation. The word chromosome comes from the Greek χρῶμα (chroma, "colour") and σῶμα (soma, "body"), describing their strong staining by particular dyes. The term was coined by von Waldeyer-Hartz, referring to the term chromatin, introduced by Walther Flemming. Some of the early karyological terms have become outdated.
For example, Chromatin and Chromosom both ascribe color to a non-colored state. The German scientists Schleiden, Virchow and Bütschli were among the first scientists who recognized the structures now familiar as chromosomes. In a series of experiments beginning in the mid-1880s, Theodor Boveri gave the definitive demonstration that chromosomes are the vectors of heredity. Wilhelm Roux had suggested that each chromosome carries a different genetic load, and Boveri was able to confirm this hypothesis. Aided by the rediscovery at the start of the 1900s of Gregor Mendel's earlier work, Boveri was able to point out the connection between the rules of inheritance and the behaviour of the chromosomes. Boveri influenced two generations of American cytologists, among them Edmund Beecher Wilson, Nettie Stevens, Walter Sutton and Theophilus Painter. In his famous textbook The Cell in Development and Heredity, Wilson linked together the independent work of Boveri and Sutton by naming the chromosome theory of inheritance the Boveri–Sutton chromosome theory.
Ernst Mayr remarks that the theory was hotly contested by some famous geneticists: William Bateson, Wilhelm Johannsen, Richard Goldschmidt and T. H. Morgan, all of a rather dogmatic turn of mind. Complete proof came from chromosome maps in Morgan's own lab. The number of human chromosomes was published in 1923 by Theophilus Painter. By inspection through the microscope, he counted 24 pairs, which would mean 48 chromosomes. His error was copied by others, and it was not until 1956 that the true number, 46, was determined by Indonesia-born cytogeneticist Joe Hin Tjio. The prokaryotes, bacteria and archaea, typically have a single circular chromosome, but many variations exist. The chromosomes of most bacteria, which some authors prefer to call genophores, can range in size from only 130,000 base pairs in the endosymbiotic bacteria Candidatus Hodgkinia cicadicola and Candidatus Tremblaya princeps, to more than 14,000,000 base pairs in the soil-dwelling bacterium Sorangium cellulosum. Spirochaetes of the genus Borrelia are a notable exception to this arrangement, with bacteria such as Borrelia burgdorferi, the cause of Lyme disease, containing a single linear chromosome.
Prokaryotic chromosomes have less sequence-based structure than eukaryotic ones. Bacteria typically have a single point, the origin of replication, from which replication starts, whereas some archaea contain multiple replication origins. The genes in prokaryotes are often organized in operons and, unlike those of eukaryotes, do not usually contain introns. Prokaryotes do not possess nuclei. Instead, their DNA is organized into a structure called the nucleoid, which occupies a defined region of the bacterial cell. This structure is, however, dynamic and is maintained and remodeled by the actions of a range of histone-like proteins, which associate with the bacterial chromosome. In archaea, the DNA in chromosomes is even more organized, with the DNA packaged within structures similar to eukaryotic nucleosomes. Certain bacteria also contain plasmids or other extrachromosomal DNA. These are circular structures in the cytoplasm that contain cellular DNA and play a role in horizontal gene transfer. In prokaryotes and viruses, the DNA is often densely packed and organized.
Enzymes are macromolecular biological catalysts. Enzymes accelerate chemical reactions. The molecules upon which enzymes may act are called substrates, and the enzyme converts the substrates into different molecules known as products. Almost all metabolic processes in the cell need enzyme catalysis in order to occur at rates fast enough to sustain life. Metabolic pathways depend upon enzymes to catalyze individual steps. The study of enzymes is called enzymology, and a new field of pseudoenzyme analysis has grown up, recognising that during evolution some enzymes have lost the ability to carry out biological catalysis, which is often reflected in their amino acid sequences and unusual 'pseudocatalytic' properties. Enzymes are known to catalyze more than 5,000 biochemical reaction types. Most enzymes are proteins, although a few are catalytic RNA molecules; the latter are called ribozymes. Enzymes' specificity comes from their unique three-dimensional structures. Like all catalysts, enzymes increase the reaction rate by lowering its activation energy. Some enzymes can make their conversion of substrate to product occur many millions of times faster.
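How much a lower activation energy speeds up a reaction can be estimated with the Arrhenius relation, k = A·exp(−Ea/RT). The sketch below is illustrative only and is not taken from the source: it assumes the pre-exponential factor A is unchanged by the catalyst, uses a temperature of 298.15 K, and the function names are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def rate_enhancement(delta_ea_j_per_mol, temperature_k=298.15):
    """Factor by which the rate constant grows when the activation
    energy is lowered by delta_ea (assumes an unchanged prefactor A)."""
    return math.exp(delta_ea_j_per_mol / (R * temperature_k))


def required_lowering(factor, temperature_k=298.15):
    """Activation-energy reduction (J/mol) needed for a given speed-up."""
    return R * temperature_k * math.log(factor)


if __name__ == "__main__":
    # Lowering needed for a million-fold acceleration, in kJ/mol.
    print(round(required_lowering(1e6) / 1000, 1))
```

Under these assumptions, a million-fold acceleration corresponds to lowering the activation energy by roughly 34 kJ/mol at room temperature, which is why modest changes in the energy barrier translate into very large changes in rate.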
An extreme example is orotidine 5'-phosphate decarboxylase, which allows a reaction that would otherwise take millions of years to occur in milliseconds. Chemically, enzymes are like any catalyst: they are not consumed in chemical reactions, nor do they alter the equilibrium of a reaction. Enzymes differ from most other catalysts by being much more specific. Enzyme activity can be affected by other molecules: inhibitors are molecules that decrease enzyme activity, and activators are molecules that increase activity. Many therapeutic drugs and poisons are enzyme inhibitors. An enzyme's activity decreases markedly outside its optimal temperature and pH, and many enzymes are denatured when exposed to excessive heat, losing their structure and catalytic properties. Some enzymes are used commercially, for example in the synthesis of antibiotics. Some household products use enzymes to speed up chemical reactions: enzymes in biological washing powders break down protein, starch or fat stains on clothes, and enzymes in meat tenderizer break down proteins into smaller molecules, making the meat easier to chew.
By the late 17th and early 18th centuries, the digestion of meat by stomach secretions and the conversion of starch to sugars by plant extracts and saliva were known, but the mechanisms by which these occurred had not been identified. French chemist Anselme Payen was the first to discover an enzyme, diastase, in 1833. A few decades later, when studying the fermentation of sugar to alcohol by yeast, Louis Pasteur concluded that this fermentation was caused by a vital force contained within the yeast cells called "ferments", which were thought to function only within living organisms. He wrote that "alcoholic fermentation is an act correlated with the life and organization of the yeast cells, not with the death or putrefaction of the cells." In 1877, German physiologist Wilhelm Kühne first used the term enzyme, which comes from Greek ἔνζυμον, "leavened" or "in yeast", to describe this process. The word enzyme was later used to refer to nonliving substances such as pepsin, and the word ferment was used to refer to chemical activity produced by living organisms.
Eduard Buchner submitted his first paper on the study of yeast extracts in 1897. In a series of experiments at the University of Berlin, he found that sugar was fermented by yeast extracts even when there were no living yeast cells in the mixture. He named the enzyme that brought about the fermentation of sucrose "zymase". In 1907, he received the Nobel Prize in Chemistry for "his discovery of cell-free fermentation". Following Buchner's example, enzymes are usually named according to the reaction they carry out: the suffix -ase is combined with the name of the substrate or with the type of reaction. The biochemical identity of enzymes was still unknown in the early 1900s. Many scientists observed that enzymatic activity was associated with proteins, but others argued that proteins were merely carriers for the true enzymes and that proteins per se were incapable of catalysis. In 1926, James B. Sumner showed that the enzyme urease was a pure protein and crystallized it. The conclusion that pure proteins can be enzymes was definitively demonstrated by John Howard Northrop and Wendell Meredith Stanley, who worked on the digestive enzymes pepsin and chymotrypsin.
These three scientists were awarded the 1946 Nobel Prize in Chemistry. The discovery that enzymes could be crystallized eventually allowed their structures to be solved by X-ray crystallography. This was first done for lysozyme, an enzyme found in tears and egg whites that digests the coating of some bacteria. This high-resolution structure of lysozyme marked the beginning of the field of structural biology and the effort to understand how enzymes work at an atomic level of detail. An enzyme's name is often derived from its substrate or the chemical reaction it catalyzes, with the word ending in -ase. Examples are alcohol dehydrogenase and DNA polymerase. Different enzymes that catalyze the same chemical reaction are called isozymes. The International Union of Biochemistry and Molecular Biology has developed a nomenclature for enzymes, the EC numbers. The first number broadly classifies the enzyme based on its mechanism; the top-level classification is: EC 1, Oxidoreductases: catalyze oxidation/reducti
Neutrophils are the most abundant type of granulocytes and the most abundant type of white blood cells in most mammals. They form an essential part of the innate immune system, and their functions vary in different animals. They are formed from stem cells in the bone marrow and differentiated into subpopulations of neutrophil-killers and neutrophil-cagers. They are short-lived and highly motile, as they can enter parts of tissue where other cells or molecules cannot. Neutrophils may be subdivided into segmented neutrophils and banded neutrophils (bands). They form part of the polymorphonuclear cells family together with basophils and eosinophils. The name neutrophil derives from staining characteristics on hematoxylin and eosin histological or cytological preparations. Whereas basophilic white blood cells stain dark blue and eosinophilic white blood cells stain bright red, neutrophils stain a neutral pink. Neutrophils normally contain a nucleus divided into 2–5 lobes. Neutrophils are a type of phagocyte and are normally found in the bloodstream. During the beginning phase of inflammation, particularly as a result of bacterial infection, environmental exposure, and some cancers, neutrophils are among the first inflammatory cells to migrate towards the site of inflammation.
They migrate through the blood vessels, then through interstitial tissue, following chemical signals such as Interleukin-8, C5a, fMLP, Leukotriene B4 and H2O2 in a process called chemotaxis. They are the predominant cells in pus, accounting for its whitish or yellowish appearance. Neutrophils are recruited to the site of injury within minutes following trauma and are the hallmark of acute inflammation. When adhered to a surface, neutrophil granulocytes have an average diameter of 12–15 micrometers in peripheral blood smears. In suspension, human neutrophils have an average diameter of 8.85 µm. With the eosinophil and the basophil, they form the class of polymorphonuclear cells, named for the multilobulated shape of the nucleus. The nucleus has separate lobes connected by chromatin. The nucleolus disappears as the neutrophil matures, something that happens in only a few other types of nucleated cells. In the cytoplasm, the Golgi apparatus is small, ribosomes are sparse, and the rough endoplasmic reticulum is absent.
The cytoplasm contains about 200 granules, of which a third are azurophilic. Neutrophils will show increasing segmentation as they mature. A normal neutrophil should have 3–5 segments. Hypersegmentation is not normal but occurs in some disorders, most notably vitamin B12 deficiency. This is noted in a manual review of the blood smear and is positive when most or all of the neutrophils have 5 or more segments. Neutrophils are the most abundant white blood cells in humans. The stated normal range for human blood counts varies between laboratories, but a neutrophil count of 2.5–7.5 × 10^9/L is a standard normal range. People of African and Middle Eastern descent may have lower counts. A report may divide neutrophils into segmented neutrophils and bands. When circulating in the bloodstream and inactivated, neutrophils are spherical. Once activated, they change shape and become more amorphous or amoeba-like and can extend pseudopods as they hunt for antigens. Neutrophils have a preference to engulf refined carbohydrates over bacteria. In 1973 Sanchez et al. found that the neutrophil phagocytic capacity to engulf bacteria is affected when simple sugars are digested, and that fasting strengthens the neutrophils' phagocytic capacity to engulf bacteria.
However, the digestion of normal starches has no effect. It was concluded that the function, not the number, of phagocytes in engulfing bacteria was altered by the ingestion of sugars. In 2007 researchers at the Whitehead Institute of Biomedical Research found that, given a selection of sugars, neutrophils engulf some types of sugar preferentially. The average lifespan of inactivated human neutrophils in the circulation has been reported by different approaches to be between 5 and 90 hours. Upon activation, they marginate and undergo selectin-dependent capture followed, in most cases, by integrin-dependent adhesion, after which they migrate into tissues, where they survive for 1–2 days. Neutrophils are much more numerous than the longer-lived monocyte/macrophage phagocytes, so a pathogen is likely to first encounter a neutrophil. Some experts hypothesize that the short lifetime of neutrophils is adaptive: it minimizes propagation of those pathogens that parasitize phagocytes, because the more time such parasites spend outside a host cell, the more likely they are to be destroyed by some component of the body's defenses.
Because neutrophil antimicrobial products can damage host tissues, their short life limits damage to the host during inflammation. Neutrophils will be removed after phagocytosis of pathogens by macrophages. PECAM-1 and phosphatidylserine on the cell surface are involved in this process. Neutrophils undergo a process called chemotaxis via amoeboid movement, which allows them to migrate toward sites of infection or inflammation. Cell surface receptors allow neutrophils to detect chemical gr
Chromosome 19 is one of the 23 pairs of chromosomes in humans. People normally have two copies of this chromosome. Chromosome 19 spans more than 58.6 million base pairs, the building material of DNA. The following are some of the gene count estimates of human chromosome 19. Because researchers use different approaches to genome annotation, their predictions of the number of genes on each chromosome vary. Among various projects, the collaborative consensus coding sequence project (CCDS) takes a conservative strategy, so CCDS's gene number prediction represents a lower bound on the total number of human protein-coding genes. The following is a partial list of genes on human chromosome 19. The following diseases are some of those related to genes on chromosome 19.
In biology, a gene is a sequence of nucleotides in DNA or RNA that codes for a molecule that has a function. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function. The transmission of genes to an organism's offspring is the basis of the inheritance of phenotypic traits. These genes make up different DNA sequences called genotypes. Genotypes along with environmental and developmental factors determine what the phenotypes will be. Most biological traits are under the influence of polygenes as well as gene–environment interactions. Some genetic traits are instantly visible, such as eye color or number of limbs, and some are not, such as blood type, risk for specific diseases, or the thousands of basic biochemical processes that constitute life. Genes can acquire mutations in their sequence, leading to different variants, known as alleles, in the population. These alleles encode slightly different versions of a protein, which cause different phenotypical traits.
Usage of the term "having a gene" typically refers to containing a different allele of the same, shared gene. Genes evolve due to natural selection (survival of the fittest) and genetic drift of the alleles. The concept of a gene continues to be refined. For example, regulatory regions of a gene can be far removed from its coding regions, and coding regions can be split into several exons. Some viruses store their genome in RNA instead of DNA, and some gene products are functional non-coding RNAs. Therefore, a broad, modern working definition of a gene is any discrete locus of heritable, genomic sequence which affects an organism's traits by being expressed as a functional product or by regulation of gene expression. The term gene was introduced by Danish botanist, plant physiologist and geneticist Wilhelm Johannsen in 1909. It is inspired by the ancient Greek γόνος, which means offspring and procreation. The existence of discrete inheritable units was first suggested by Gregor Mendel. From 1857 to 1864, in Brno, he studied inheritance patterns in 8000 common edible pea plants, tracking distinct traits from parent to offspring.
He described these mathematically as 2^n combinations, where n is the number of differing characteristics in the original peas. Although he did not use the term gene, he explained his results in terms of discrete inherited units that give rise to observable physical characteristics. This description prefigured Wilhelm Johannsen's distinction between genotype and phenotype. Mendel was also the first to demonstrate independent assortment, the distinction between dominant and recessive traits, the distinction between a heterozygote and homozygote, and the phenomenon of discontinuous inheritance. Prior to Mendel's work, the dominant theory of heredity was one of blending inheritance, which suggested that each parent contributed fluids to the fertilisation process and that the traits of the parents blended and mixed to produce the offspring. Charles Darwin developed a theory of inheritance he termed pangenesis, from Greek pan and genesis / genos. Darwin used the term gemmule to describe hypothetical particles that would mix during reproduction. Mendel's work went largely unnoticed after its first publication in 1866, but was rediscovered in the late 19th century by Hugo de Vries, Carl Correns and Erich von Tschermak, who reached similar conclusions in their own research.
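Mendel's count of 2^n combinations can be checked by direct enumeration. The following is a minimal sketch, assuming each of n characteristics comes in exactly two alternative forms; the trait names are illustrative examples and the helper `trait_combinations` is hypothetical.

```python
from itertools import product


def trait_combinations(traits):
    """Enumerate every combination of forms for the given traits.

    `traits` maps a trait name to a pair of alternative forms; the
    result is a list of dicts, one per combination.
    """
    names = list(traits)
    forms = [traits[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*forms)]


# Illustrative traits, each with two alternative forms.
pea_traits = {
    "seed shape": ("round", "wrinkled"),
    "seed colour": ("yellow", "green"),
    "flower colour": ("purple", "white"),
}

combos = trait_combinations(pea_traits)
print(len(combos))  # n = 3 traits, so 2**3 combinations
```

Adding a fourth two-form trait to the dictionary doubles the number of combinations, which is the growth the 2^n expression describes.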
In 1889, Hugo de Vries published his book Intracellular Pangenesis, in which he postulated that different characters have individual hereditary carriers and that inheritance of specific traits in organisms comes in particles. De Vries called these units "pangenes", after Darwin's 1868 pangenesis theory. Twenty years later, in 1909, Wilhelm Johannsen introduced the term 'gene' (William Bateson had introduced 'genetics' in 1905), while Eduard Strasburger, amongst others, still used the term 'pangene' for the fundamental physical and functional unit of heredity. Advances in understanding genes and inheritance continued throughout the 20th century. Deoxyribonucleic acid was shown to be the molecular repository of genetic information by experiments in the 1940s to 1950s. The structure of DNA was studied by Rosalind Franklin and Maurice Wilkins using X-ray crystallography, which led James D. Watson and Francis Crick to publish a model of the double-stranded DNA molecule whose paired nucleotide bases indicated a compelling hypothesis for the mechanism of genetic replication.
In the early 1950s the prevailing view was that the genes in a chromosome acted like discrete entities, indivisible by recombination and arranged like beads on a string. The experiments of Benzer using mutants defective in the rII region of bacteriophage T4 showed that individual genes have a simple linear structure and are likely to be equivalent to a linear section of DNA. Collectively, this body of research established the central dogma of molecular biology, which states that proteins are translated from RNA, which is transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses. The modern study of genetics at the level of DNA is known as molecular genetics. In 1972, Walter Fiers and his team were the first to determine the sequence of a gene: that of Bacteriophage MS2 coat protein. The subsequent development of chain-termination DNA sequencing in 1977 by Frederick Sanger improved the efficiency of sequencing and turned it into a routine laboratory tool.
An automated version of the Sanger method was used in early phases of the
Protein Data Bank
The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy and submitted by biologists and biochemists from around the world, are accessible on the Internet via the websites of its member organisations. The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB. The PDB is a key resource in areas of structural biology such as structural genomics. Most major scientific journals, and some funding agencies, now require scientists to submit their structure data to the PDB. Many other databases use protein structures deposited in the PDB. For example, SCOP and CATH classify protein structures, while PDBsum provides a graphic overview of PDB entries using information from other sources, such as Gene Ontology. Two forces converged to initiate the PDB, one of which was a small but growing collection of sets of protein structure data determined by X-ray diffraction.
In 1969, with the sponsorship of Walter Hamilton at the Brookhaven National Laboratory, Edgar Meyer began to write software to store atomic coordinate files in a common format to make them available for geometric and graphical evaluation. By 1971, one of Meyer's programs, SEARCH, enabled researchers to remotely access information from the database to study protein structures offline. SEARCH was instrumental in enabling networking, thus marking the functional beginning of the PDB. The Protein Data Bank was announced in October 1971 in Nature New Biology as a joint venture between Cambridge Crystallographic Data Centre, UK and Brookhaven National Laboratory, USA. Upon Hamilton's death in 1973, Tom Koetzle took over direction of the PDB for the subsequent 20 years. In January 1994, Joel Sussman of Israel's Weizmann Institute of Science was appointed head of the PDB. In October 1998, the PDB was transferred to the Research Collaboratory for Structural Bioinformatics. The new director was Helen M. Berman of Rutgers University.
In 2003, with the formation of the wwPDB, the PDB became an international organization. The founding members are PDBe, RCSB and PDBj; the BMRB joined in 2006. Each of the four members of wwPDB can act as deposition, data processing and distribution centers for PDB data. The data processing refers to the fact that wwPDB staff review and annotate each submitted entry. The data are then automatically checked for plausibility. The PDB database and the PDB holdings list are both updated weekly. As of 17 October 2018, the breakdown of current holdings is as follows: 120,052 structures in the PDB have a structure factor file; 9,734 structures have an NMR restraint file; 3,486 structures have a chemical shifts file; and 2,531 structures have a 3DEM map file deposited in EM Data Bank. These data show that most structures are determined by X-ray diffraction, but about 10% of structures are now determined by protein NMR. When using X-ray diffraction, approximations of the coordinates of the atoms of the protein are obtained, whereas estimations of the distances between pairs of atoms of the protein are found through NMR experiments.
Therefore, the final conformation of the protein is obtained, in the latter case, by solving a distance geometry problem. A few proteins are determined by cryo-electron microscopy. The significance of the structure factor files, mentioned above, is that, for PDB structures determined by X-ray diffraction that have a structure factor file, the electron density map may be viewed. The data of such structures are stored on the "electron density server". In the past, the number of structures in the PDB has grown at an approximately exponential rate, passing the milestone of 100 registered structures in 1982, 1,000 in 1993, 10,000 in 1999, and 100,000 in 2014. However, since 2007, the rate of accumulation of new protein structures appears to have plateaued. The file format initially used by the PDB was called the PDB file format. This original format was restricted by the width of computer punch cards to 80 characters per line. Around 1996, the "macromolecular Crystallographic Information file" format, mmCIF, which is an extension of the CIF format, started to be phased in.
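The 80-character, punch-card-derived layout means that fields in the original PDB format are identified by column position rather than by delimiters. The sketch below illustrates this with an ATOM record. The column ranges follow the published PDB format description as recalled here and should be checked against the official specification; the atom-name alignment is simplified, the helper names are hypothetical, and the coordinate values are made up for illustration.

```python
def format_atom_record(serial, name, res_name, chain_id, res_seq,
                       x, y, z, occupancy, temp_factor, element):
    """Build one fixed-width, 80-column ATOM line (simplified)."""
    return (
        "ATOM  "                               # cols 1-6: record name
        f"{serial:>5d}"                        # cols 7-11: atom serial
        " "                                    # col 12: unused
        f"{name:<4s}"                          # cols 13-16: atom name
        " "                                    # col 17: alt. location
        f"{res_name:>3s}"                      # cols 18-20: residue name
        " "                                    # col 21: unused
        f"{chain_id:1s}"                       # col 22: chain identifier
        f"{res_seq:>4d}"                       # cols 23-26: residue number
        " "                                    # col 27: insertion code
        "   "                                  # cols 28-30: unused
        f"{x:8.3f}{y:8.3f}{z:8.3f}"            # cols 31-54: coordinates
        f"{occupancy:6.2f}{temp_factor:6.2f}"  # cols 55-66
        "          "                           # cols 67-76: unused
        f"{element:>2s}"                       # cols 77-78: element symbol
        "  "                                   # cols 79-80: charge
    )


def parse_atom_record(line):
    """Read fields back out of an ATOM line by column position."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "temp_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


# Made-up values, used only to exercise the two helpers.
atom_line = format_atom_record(1, "CA", "VAL", "A", 1,
                               6.204, 16.869, 4.854, 1.00, 49.05, "C")
atom = parse_atom_record(atom_line)
print(len(atom_line), atom["res_name"], atom["x"], atom["y"], atom["z"])
```

Because every field has a fixed width, values such as atom serial numbers and coordinates have hard upper limits in this format, which is one practical reason a more flexible format was introduced.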
mmCIF is now the master format for the PDB archive. An XML version of this format, called PDBML, was described in 2005. The structure files can be downloaded in any of these three formats. In fact, individual files are easily downloaded into graphics packages using web addresses:
For PDB format files, use, e.g., http://www.pdb.org/pdb/files/4hhb.pdb.gz or http://pdbe.org/download/4hhb
For PDBML files, use, e.g., http://www.pdb.org/pdb/files/4hhb.xml.gz or http://pdbe.org/pdbml/4hhb
The "4hhb" is the PDB identifier. Each structure published in PDB receives a four-character alphanumeric identifier, its PDB ID. The structure files may be viewed using one of several free and open source computer programs, including Jmol, Pymol, VMD and Rasmol. Other non-free, shareware programs
A base pair is a unit consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, Watson–Crick base pairs allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence. The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes. Intramolecular base pairs can occur within single-stranded nucleic acids.
This is particularly important in RNA molecules, where Watson–Crick base pairs permit the formation of short double-stranded helices, and a wide variety of non-Watson–Crick interactions allow RNAs to fold into a vast range of specific three-dimensional structures. In addition, base-pairing between transfer RNA and messenger RNA forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code. The size of an individual gene or an organism's entire genome is often measured in base pairs because DNA is usually double-stranded. Hence, the number of total base pairs is equal to the number of nucleotides in one of the strands. The haploid human genome is estimated to be about 3.2 billion bases long and to contain 20,000–25,000 distinct protein-coding genes. A kilobase is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA. The total amount of DNA base pairs on Earth is estimated at 5.0×10^37, weighing 50 billion tonnes.
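The relationship between the two figures quoted above, a count of base pairs and a mass in tonnes, can be checked with a short back-of-the-envelope calculation. The sketch below assumes an average molar mass of about 650 g/mol per base pair, a common rule of thumb that is not stated in the source; the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23       # particles per mole
BP_MOLAR_MASS_G = 650.0        # assumed average g/mol per base pair


def to_kilobases(base_pairs):
    """Convert a base-pair count to kilobases (1 kb = 1000 bp)."""
    return base_pairs / 1000.0


def dna_mass_tonnes(base_pairs, bp_molar_mass=BP_MOLAR_MASS_G):
    """Approximate mass of double-stranded DNA in metric tonnes."""
    grams = base_pairs * bp_molar_mass / AVOGADRO
    return grams / 1e6  # 1 tonne = 1e6 g


if __name__ == "__main__":
    print(f"{to_kilobases(3.2e9):.1e} kb in a haploid human genome")
    print(f"{dna_mass_tonnes(5.0e37):.1e} tonnes for 5.0e37 bp")
```

Under this assumption, 5.0×10^37 base pairs come to a little over 5×10^10 tonnes, the same order of magnitude as the 50 billion tonnes quoted.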
In comparison, the total mass of the biosphere has been estimated to be as much as 4 TtC. Hydrogen bonding is the chemical interaction that underlies the base-pairing rules described above. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content. But, contrary to popular belief, the hydrogen bonds do not stabilize the DNA significantly; stabilization is mainly due to stacking interactions. The larger nucleobases, adenine and guanine, are members of a class of double-ringed chemical structures called purines; the smaller nucleobases, cytosine, thymine and uracil, are single-ringed pyrimidines. Purines are complementary only with pyrimidines: pyrimidine-pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established. Purine-pyrimidine base-pairing of AT or GC or UA results in proper duplex structure. The only other purine-pyrimidine pairings would be AC, GT and UG; these pairings are mismatches because the patterns of hydrogen donors and acceptors do not correspond. The GU pairing, with two hydrogen bonds, does occur fairly often in RNA. Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a melting point that is determined by the length of the molecules, the extent of mispairing, and the GC content.
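The pairing rules and the role of GC content can be illustrated with a few lines of code. The sketch below is a minimal example rather than a reference implementation: the function names are hypothetical, the 18-base sequence is simply an example input, and the melting-temperature estimate uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is an assumption brought in here and applies only to short oligonucleotides.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(seq, pairs=DNA_PAIRS):
    """Base-by-base partner strand, written in the same left-to-right
    order (so it reads 3' to 5' if `seq` is written 5' to 3')."""
    return "".join(pairs[base] for base in seq)


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def wallace_tm_celsius(seq):
    """Rough melting temperature for a short DNA oligonucleotide."""
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


dna = "ATCGATTGAGCTCTAGCG"
print(complement(dna))                               # paired DNA strand
print(complement("AUCGAUUGAGCUCUAGCG", RNA_PAIRS))   # paired RNA strand
print(gc_content(dna), wallace_tm_celsius(dna))
```

Note that `complement` does not reverse the sequence; a reverse complement, which writes the partner strand 5' to 3', would additionally reverse the result. More accurate melting-temperature models also account for salt concentration and nearest-neighbour effects, which this sketch ignores.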
Higher GC content results in higher melting temperatures. Conversely, regions of a genome that need to separate frequently (for example, the promoter regions for often-transcribed genes) are comparatively GC-poor. GC content and melting temperature must also be taken into account when designing primers for PCR reactions. The following DNA sequences illustrate double-stranded pairing patterns. By convention, the top strand is written from the 5' end to the 3' end. A base-paired DNA sequence:
ATCGATTGAGCTCTAGCG
TAGCTAACTCGAGATCGC
The corresponding RNA sequence, in which uracil is substituted for thymine in the RNA strand:
AUCGAUUGAGCUCUAGCG
UAGCUAACUCGAGAUCGC
Chemical analogs of nucleotides can take the place of proper nucleotides and establish non-canonical base-pairing, leading to errors in DNA replication and DNA transcription. This is due to their isosteric chemistry. One common mutagenic base analog is 5-bromouracil, which resembles thymine but can base-pair to guanine in its enol form. Other chemicals, known as DNA intercalators, fit into the gap between adjacent bases on a single strand and induce frameshift mutations by "masquerading" as a base, causing the DNA replication machinery to skip or insert additional nucleotides at the intercalated site.
Most intercalators are known or suspected carcinogens. Examples include ethidium bromide and acridine. An unnatural base pair is a designed subunit of DNA that is created in a laboratory and does not occur in nature. DNA sequences have been described which use newly created nucleobases to form a third base pair, in addition to the two ba