1.
RNA polymerase II
–
RNA polymerase II (RNAP II) is a multiprotein complex. It is one of the three RNAP enzymes found in the nucleus of eukaryotic cells, and it catalyzes the transcription of DNA to synthesize precursors of mRNA and most snRNA and microRNA. A 550 kDa complex of 12 subunits, RNAP II is the most studied type of RNA polymerase. A wide range of transcription factors are required for it to bind to upstream gene promoters and begin transcription. Early studies suggested a minimum of two RNAPs, one of which synthesized rRNA in the nucleolus while the other synthesized other RNA in the nucleoplasm, the part of the nucleus outside the nucleolus. In 1969, Robert Roeder and William Rutter discovered an additional RNAP that was responsible for transcription of some kind of RNA in the nucleoplasm. The finding was obtained by DEAE-Sephadex ion-exchange chromatography. The technique separated the enzymes by their order of elution (I, II, III), and the enzymes were named accordingly: RNAP I, RNAP II, and RNAP III. This discovery demonstrated that there was an additional enzyme present in the nucleoplasm, which allowed for the differentiation between RNAP II and RNAP III. The eukaryotic core RNA polymerase II was first purified using transcription assays. The purified enzyme typically has 10-12 subunits and is incapable of specific promoter recognition. DNA-directed RNA polymerase II subunit RPB1 is an enzyme that in humans is encoded by the POLR2A gene. RPB1 is the largest subunit of RNA polymerase II. It contains a carboxy-terminal domain (CTD) composed of up to 52 heptapeptide repeats that are essential for polymerase activity. The CTD was first discovered in the laboratory of C. J. Ingles at the University of Toronto and by J. L. Corden at Johns Hopkins University. In combination with several other subunits, the RPB1 subunit forms the DNA-binding domain of the polymerase.
RPB3 exists as a heterodimer with another polymerase subunit, POLR2J, forming a core subassembly. RPB3 strongly interacts with RPB1-5, 7, and 10-12. RNA polymerase II subunit B4 (RPB4), encoded by the POLR2D gene, is the fourth-largest subunit. RPB5 is encoded in humans by the POLR2E gene. Two molecules of this subunit are present in each RNA polymerase II. RPB5 strongly interacts with RPB1, RPB3, and RPB6. RPB6 forms a structure with at least two other subunits that stabilizes the transcribing polymerase on the DNA template. RPB7 is encoded by POLR2G and may play a role in regulating polymerase function. RPB7 interacts strongly with RPB1 and RPB5. RPB8 interacts with subunits RPB1-3, 5, and 7. RPB9 makes up the groove in which the DNA template is transcribed into RNA. RPB10 is the product of the gene POLR2L. It interacts with RPB1-3 and 5, and strongly with RPB3. RPB11 is itself composed of three subunits in humans: POLR2J, POLR2J2, and POLR2J3.
2.
Escherichia coli
–
Escherichia coli is a Gram-negative, facultatively anaerobic, rod-shaped, coliform bacterium of the genus Escherichia that is commonly found in the lower intestine of warm-blooded organisms. Most E. coli strains are harmless, but some serotypes can cause food poisoning in their hosts. The harmless strains are part of the flora of the gut and can benefit their hosts by producing vitamin K2. E. coli is expelled into the environment within fecal matter. The bacterium grows massively in fresh fecal matter under aerobic conditions for 3 days, but its numbers decline slowly afterwards. E. coli and other facultative anaerobes constitute about 0.1% of gut flora. Cells are able to survive outside the body for a limited amount of time, which makes them potential indicator organisms to test environmental samples for fecal contamination. A growing body of research, though, has examined environmentally persistent E. coli, which can survive for extended periods outside of a host. The bacterium can be grown and cultured easily and inexpensively in a laboratory setting, and has been intensively investigated for over 60 years. E. coli is a chemoheterotroph whose chemically defined medium must include a source of carbon. Under favorable conditions, it takes only 20 minutes to reproduce. E. coli is a Gram-negative, facultatively anaerobic, nonsporulating bacterium. Cells are typically rod-shaped and are about 2.0 μm long and 0.25–1.0 μm in diameter, with a cell volume of 0.6–0.7 μm³. E. coli stains Gram-negative because its cell wall is composed of a thin peptidoglycan layer and an outer membrane. During the staining process, E. coli picks up the color of the counterstain safranin. The outer membrane surrounding the cell wall provides a barrier to certain antibiotics, such that E. coli is not damaged by penicillin. Strains that possess flagella are motile; the flagella have a peritrichous arrangement.
E. coli can live on a variety of substrates and uses mixed-acid fermentation in anaerobic conditions, producing lactate, succinate, ethanol, and acetate. Optimum growth of E. coli occurs at 37 °C, and it uses oxygen when it is present and available. It can, however, continue to grow in the absence of oxygen using fermentation or anaerobic respiration. The ability to continue growing in the absence of oxygen is an advantage to bacteria because their survival is increased in environments where water predominates. The bacterial cell cycle is divided into three stages. The B period occurs between the completion of cell division and the beginning of DNA replication. The C period encompasses the time it takes to replicate the chromosomal DNA. The D period refers to the stage between the conclusion of DNA replication and the end of cell division. The doubling rate of E. coli is higher when more nutrients are available. However, the lengths of the C and D periods do not change, even when the doubling time becomes less than the sum of the C and D periods. At the fastest growth rates, replication begins before the previous round of replication has completed, resulting in multiple replication forks along the DNA.
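The 20-minute doubling time mentioned above implies exponential growth under ideal conditions. The following is a minimal sketch, assuming unlimited nutrients and a constant doubling time; the function name `cells_after` and its parameters are illustrative and do not come from the source.

```python
def cells_after(minutes: float, doubling_time: float = 20.0, initial: int = 1) -> float:
    """Idealized exponential growth: N(t) = N0 * 2 ** (t / T).

    Assumes a constant doubling time T (20 minutes by default, the figure
    quoted in the text for favorable conditions) and no nutrient limitation.
    """
    if doubling_time <= 0:
        raise ValueError("doubling_time must be positive")
    return initial * 2 ** (minutes / doubling_time)


# One cell after one hour (three doublings) and after two hours (six doublings).
one_hour = cells_after(60)
two_hours = cells_after(120)
```

Real cultures leave this regime quickly as nutrients run out, so the numbers are only an upper bound for the early growth phase.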
3.
Entrez
–
The name Entrez was chosen to reflect the spirit of welcoming the public to search the content available from the NLM. Entrez Global Query is a search and retrieval system that provides access to all databases simultaneously with a single query string. Entrez can efficiently retrieve related sequences, structures, and references. The Entrez system can provide views of gene and protein sequences and chromosome maps. Some textbooks are available online through the Entrez system. The Entrez front page provides, by default, access to the global query. All databases indexed by Entrez can be searched via a single query string, supporting boolean operators and search term tags to limit parts of the search statement to particular fields. This returns a unified results page that shows the number of hits for the search in each of the databases. Entrez also provides a similar interface for searching each particular database and for refining search results. The Limits feature allows the user to narrow a search using a web forms interface. The History feature gives a numbered list of recently performed queries. Results of previous queries can be referred to by number and combined via boolean operators. Search results can be saved temporarily in a Clipboard. Users with a MyNCBI account can save queries indefinitely and can also choose to have updates with new search results e-mailed for saved queries of most databases. Entrez is widely used in the field of biotechnology as a reference tool for students and professionals alike. The databases searched by Entrez include PubMed, which holds biomedical literature citations and abstracts, including Medline articles from journals. In addition to using the search engine forms to query the data in Entrez, NCBI provides the Entrez Programming Utilities (eUtils) for more direct access to query results. The eUtils are accessed by posting specially formed URLs to the NCBI server. There was also an eUtils SOAP interface, which was terminated in July 2015.
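As an illustration of the "specially formed URLs" used by the eUtils, the sketch below only builds an ESearch request URL and does not contact the server. The base address and the `db`, `term`, and `retmax` parameters follow NCBI's public E-utilities documentation as best understood here; the helper name `esearch_url` and the example query are illustrative assumptions.

```python
from urllib.parse import urlencode

# Base address of the E-utilities service (assumed from NCBI's public documentation).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for one Entrez database.

    db     -- database name, e.g. "pubmed"
    term   -- query string; boolean operators and field tags such as
              [Title] are passed through and URL-encoded
    retmax -- maximum number of record identifiers to return
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# A fielded boolean query of the kind described above.
url = esearch_url("pubmed", "RNA polymerase II[Title] AND yeast")
```

Fetching the URL with any HTTP client would return the matching record identifiers. Heavy use is subject to NCBI's usage guidelines, which are not covered here.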
In 1991, Entrez was introduced in CD form. In 1993, a client-server version of the software provided connectivity with the internet. In 1994, NCBI established a website, and Entrez was a part of the initial release. In 2001, the Entrez Bookshelf was released, and in 2003, the Entrez Gene database was developed.
4.
Protein Data Bank
–
The Protein Data Bank (PDB) is a crystallographic database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The PDB is overseen by an organization called the Worldwide Protein Data Bank (wwPDB). The PDB is a key resource in areas of structural biology. Most major scientific journals, and some funding agencies, now require scientists to submit their structure data to the PDB. Many other databases use protein structures deposited in the PDB. For example, SCOP and CATH classify protein structures, while PDBsum provides a graphic overview of PDB entries using information from other sources, such as Gene Ontology. By 1971, one of Meyer's programs, SEARCH, enabled researchers to access information from the database to study protein structures offline. SEARCH was instrumental in enabling networking, thus marking the beginning of the PDB. Upon Hamilton's death in 1973, Tom Koetzle took over direction of the PDB for the subsequent 20 years. In January 1994, Joel Sussman of Israel's Weizmann Institute of Science was appointed head of the PDB. In October 1998, the PDB was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB). The new director was Helen M. Berman of Rutgers University. In 2003, with the formation of the wwPDB, the PDB became an international organization. The founding members are PDBe, RCSB, and PDBj. Each of the four members of wwPDB can act as a deposition, data processing, and distribution center for PDB data. Data processing refers to the fact that wwPDB staff review and annotate each submitted entry. The data are then automatically checked for plausibility. The PDB database is updated weekly, as is the PDB holdings list. As of 14 March 2017, 103,514 structures in the PDB have a structure factor file, 9,057 structures have an NMR restraint file, and 2,826 structures have a chemical shifts file.
For structures determined by NMR, the final conformation of the protein is obtained by solving a distance geometry problem. A few proteins are determined by cryo-electron microscopy. The significance of the structure factor files mentioned above is that, for PDB structures determined by X-ray diffraction that have such a file, the electron density map may be viewed. The data of such structures are stored on the electron density server. Since 2007, the rate of accumulation of new protein structures appears to have plateaued. The file format originally used by the PDB was called the PDB file format. This original format was restricted by the width of computer punch cards to 80 characters per line. Around 1996, the macromolecular Crystallographic Information File format, mmCIF, which is an extension of the CIF format, started to be phased in.
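The 80-character, punch-card-derived layout means that fields in a legacy PDB file are identified by column position and not by delimiters. The sketch below writes and reads one ATOM record under that assumption. The column ranges follow the published PDB format description as recalled here and should be checked against the wwPDB specification; the helper names `format_atom` and `parse_atom` and the sample coordinates are illustrative only.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z, occ, temp, element):
    """Write one fixed-width ATOM record, padded to 80 characters."""
    line = (
        f"{'ATOM':<6}{serial:>5} {name:<4} {res_name:>3} {chain:1}{res_seq:>4}    "
        f"{x:>8.3f}{y:>8.3f}{z:>8.3f}{occ:>6.2f}{temp:>6.2f}          {element:>2}"
    )
    return line.ljust(80)


def parse_atom(line):
    """Read fields back by column slice (0-based, end-exclusive)."""
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21],                # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
        "element": line[76:78].strip(),   # columns 77-78
    }


# Round trip with made-up coordinates for a backbone nitrogen atom.
record = format_atom(1, " N", "MET", "A", 1, 27.34, 24.43, 2.614, 1.0, 9.67, "N")
atom = parse_atom(record)
```

Because every field has a fixed width, values such as atom serial numbers overflow once they exceed their column range. This is one practical limitation of the fixed-width layout that mmCIF avoids.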
5.
National Center for Biotechnology Information
–
The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine, a branch of the National Institutes of Health. The NCBI is located in Bethesda, Maryland, and was founded in 1988 through legislation sponsored by Senator Claude Pepper. The NCBI houses a series of databases relevant to biotechnology and biomedicine and is an important resource for bioinformatics tools and services. Major databases include GenBank for DNA sequences and PubMed, a database for the biomedical literature. Other databases include the NCBI Epigenomics database. All these databases are available online through the Entrez search engine. NCBI is directed by David Lipman, one of the authors of the BLAST sequence alignment program. He also leads a research program, including groups led by Stephen Altschul, David Landsman, Eugene Koonin, John Wilbur, and Teresa Przytycka. NCBI is listed in the Registry of Research Data Repositories, re3data.org. NCBI has had responsibility for making available the GenBank DNA sequence database since 1992. GenBank coordinates with individual laboratories and other databases such as those of the European Molecular Biology Laboratory. Since 1992, NCBI has grown to provide other databases in addition to GenBank. The NCBI assigns a unique identifier to each species of organism. The NCBI has software tools that are available by WWW browsing or by FTP. For example, BLAST is a sequence similarity searching program that can do sequence comparisons against the GenBank DNA database in less than 15 seconds. The NCBI Bookshelf is a collection of freely accessible, downloadable books. Some of the books are online versions of previously published books, while others, such as Coffee Break, are written and edited by NCBI staff. BLAST is an algorithm used for calculating sequence similarity between biological sequences, such as nucleotide sequences of DNA and amino acid sequences of proteins.
BLAST is a tool for finding sequences similar to the query sequence within the same organism or in different organisms. It searches for the query sequence on NCBI databases and servers and posts the results back to the browser in the chosen format. Input sequences to BLAST are mostly in FASTA or GenBank format, while output can be delivered in a variety of formats such as HTML and XML. HTML is the default output format for NCBI's web page. Entrez is both an indexing and a retrieval system, holding data from various sources for biomedical research.
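FASTA, the common input format mentioned above, is plain text: each record starts with a header line beginning with `>` and is followed by one or more lines of sequence. The sketch below is a minimal reader written for illustration; the function name `parse_fasta` and the sample records are assumptions, and established libraries handle many edge cases this version ignores.

```python
def parse_fasta(text: str) -> dict:
    """Return {identifier: sequence} for every record in a FASTA string.

    The identifier is the header text up to the first whitespace;
    sequence lines belonging to one record are concatenated.
    """
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Two made-up records, the first one wrapped over two lines.
sample = ">seq1 example fragment\nATGC\nGGTA\n>seq2\nTTAA\n"
sequences = parse_fasta(sample)
```

The resulting strings could then be submitted as query sequences to a similarity search.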
6.
UniProt
–
UniProt is a freely accessible database of protein sequence and functional information, with many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature. The UniProt consortium comprises the European Bioinformatics Institute (EBI), the Swiss Institute of Bioinformatics (SIB), and the Protein Information Resource (PIR). EBI, located at the Wellcome Trust Genome Campus in Hinxton, UK, hosts a large resource of bioinformatics databases and services. SIB, located in Geneva, Switzerland, maintains the ExPASy servers, which are a resource for proteomics tools. In 2002, EBI, SIB, and PIR joined forces as the UniProt consortium. Each consortium member is heavily involved in protein database maintenance and annotation. Until recently, EBI and SIB together produced the Swiss-Prot and TrEMBL databases. These databases coexisted with differing protein sequence coverage and annotation priorities. Swiss-Prot aimed to provide reliable protein sequences associated with a high level of annotation. TrEMBL was created in recognition that sequence data were being generated at a pace exceeding Swiss-Prot's ability to keep up. Meanwhile, PIR maintained the PIR-PSD and related databases, including iProClass, a database of protein sequences and curated families. The consortium members pooled their resources and expertise and launched UniProt in December 2003. UniProt provides four core databases, including UniProtKB, UniParc, and UniRef. The UniProt Knowledgebase (UniProtKB) is a protein database partially curated by experts, consisting of two sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. As of 19 March 2014, release 2014_03 of UniProtKB/Swiss-Prot contains 542,782 sequence entries. UniProtKB/Swiss-Prot is a manually annotated, non-redundant protein sequence database. It combines information extracted from literature and biocurator-evaluated computational analysis. The aim of UniProtKB/Swiss-Prot is to provide all known relevant information about a particular protein.
Annotation is regularly reviewed to keep up with current scientific findings. The manual annotation of an entry involves detailed analysis of the protein sequence and of the scientific literature. Sequences from the same gene and the same species are merged into the same database entry. Differences between sequences are identified, and their cause documented. A range of sequence analysis tools is used in the annotation of UniProtKB/Swiss-Prot entries. Computer predictions are manually evaluated, and relevant results are selected for inclusion in the entry. These predictions include post-translational modifications, transmembrane domains and topology, signal peptides, domain identification, and protein family classification. Relevant publications are identified by searching databases such as PubMed. The full text of each paper is read, and information is extracted and added to the entry.
7.
Chromosome
–
A chromosome is a DNA molecule with part or all of the genetic material of an organism. Prokaryotes usually have one single circular chromosome, whereas most eukaryotes are diploid. Chromosomes in eukaryotes are composed of chromatin fiber. Chromatin fiber is made of nucleosomes. A nucleosome is a histone octamer with part of a longer DNA strand attached to and wrapped around it. Chromatin fiber, together with associated proteins, is known as chromatin. Chromatin is present in most cells, with a few exceptions, for example, red blood cells. Occurring only in the nucleus of cells, chromatin contains the vast majority of DNA, except for a small amount inherited maternally. Chromosomes are normally visible under a microscope only when the cell is undergoing the metaphase of cell division. Before this happens, every chromosome is copied once, and the copy is joined to the original by a centromere, resulting in an X-shaped structure. The original chromosome and the copy are now called sister chromatids. During metaphase, a chromosome is in its most condensed state. In this highly condensed form, chromosomes are easiest to distinguish and study. In prokaryotic cells, chromatin occurs free-floating in cytoplasm, as these cells lack organelles. The main information-carrying macromolecule is a single piece of coiled double-helix DNA, containing many genes, regulatory elements, and other noncoding DNA. The DNA-bound macromolecules are proteins that serve to package the DNA. Chromosomes vary widely between different organisms. Some species, such as certain bacteria, also contain plasmids or other extrachromosomal DNA. These are circular structures in the cytoplasm that contain cellular DNA and play a role in horizontal gene transfer. Chromosomal recombination during meiosis and subsequent sexual reproduction plays a significant role in genetic diversity.
In prokaryotes and viruses, the DNA is often densely packed and organized, in the case of archaea by homologs to eukaryotic histones. Small circular genomes called plasmids are often found in bacteria and also in mitochondria and chloroplasts, reflecting their bacterial origins. Some use the term chromosome in a wider sense, to refer to the individualized portions of chromatin in cells. However, others use the concept in a narrower sense, to refer to the individualized portions of chromatin during cell division. The word chromosome comes from the Greek χρῶμα (chroma, "colour") and σῶμα (soma, "body"), describing their strong staining by particular dyes. Schleiden, Virchow, and Bütschli were among the first scientists who recognized the structures now so familiar to everyone as chromosomes. The term was coined by von Waldeyer-Hartz, referring to the term chromatin. In a series of experiments beginning in the mid-1880s, Theodor Boveri gave the definitive demonstration that chromosomes are the vectors of heredity. His two principles were the continuity of chromosomes and the individuality of chromosomes. It is the second of these principles that was so original.
8.
Prokaryote
–
A prokaryote is a unicellular organism that lacks a membrane-bound nucleus, mitochondria, or any other membrane-bound organelle. The word prokaryote comes from the Greek πρό ("before") and καρυόν ("nut" or "kernel"). Prokaryotes can be divided into two domains, Archaea and Bacteria. In contrast, species with nuclei and organelles are placed in the domain Eukaryota. In the prokaryotes, all the intracellular water-soluble components are located together in the cytoplasm enclosed by the cell membrane, as opposed to being held in separate cellular compartments. Bacteria, however, do possess protein-based bacterial microcompartments, which are thought to act as primitive organelles enclosed in protein shells. Some prokaryotes, such as cyanobacteria, may form large colonies. Others, such as myxobacteria, have multicellular stages in their life cycles. Molecular studies have provided insight into the evolution and interrelationships of the three domains of biological species. Eukaryotes are organisms, including humans, whose cells have a well-defined membrane-bound nucleus. The division between prokaryotes and eukaryotes reflects the existence of two very different levels of cellular organization. Distinctive types of prokaryotes include extremophiles and methanogens; these are common in extreme environments. Prokaryotes have a cytoskeleton, albeit more primitive than that of the eukaryotes. At least some also contain intracellular structures that can be seen as primitive organelles. Membranous organelles are known in some groups of prokaryotes, such as vacuoles or membrane systems devoted to special metabolic properties. In addition, some species also contain carbohydrate-enclosed microcompartments, which have distinct physiological roles. Most prokaryotes are between 1 µm and 10 µm, but they can vary in size from 0.2 µm to 750 µm. Bacteria and archaea reproduce through asexual reproduction, usually by binary fission.
DNA transfer between prokaryotic cells occurs in bacteria and archaea, although it has mainly been studied in bacteria. In bacteria, gene transfer occurs by three processes: bacterial virus (bacteriophage)-mediated transduction, plasmid-mediated conjugation, and natural transformation. Transduction of bacterial genes by bacteriophage appears to reflect an occasional error during intracellular assembly of virus particles. The transfer of bacterial DNA is under the control of the bacteriophage's genes and not the bacterial genes. Conjugation in the well-studied E. coli system is controlled by plasmid genes. Infrequently during this process, a plasmid may integrate into the host bacterial chromosome and subsequently transfer part of the host bacterial DNA to another bacterium. Plasmid-mediated transfer of host bacterial DNA also appears to be an accidental process and not a bacterial adaptation. Natural bacterial transformation involves the transfer of DNA from one bacterium to another through the intervening medium. For a bacterium to bind, take up, and recombine donor DNA into its own chromosome, it must first enter a special physiological state called competence. About 40 genes are required in Bacillus subtilis for the development of competence. The length of DNA transferred during B. subtilis transformation can be as much as a third of the chromosome, up to the whole chromosome.
9.
DNA polymerase
–
In molecular biology, DNA polymerases are enzymes that synthesize DNA molecules from deoxyribonucleotides, the building blocks of DNA. These enzymes are essential to DNA replication and usually work in pairs to create two identical DNA strands from a single original DNA molecule. During this process, DNA polymerase reads the existing DNA strands to create two new strands that match the existing ones. Every time a cell divides, DNA polymerases are required to duplicate the cell's DNA. In this way, genetic information is passed down from generation to generation. Before replication can take place, an enzyme called helicase unwinds the DNA molecule from its tightly woven form. This opens up, or unzips, the double-stranded DNA to give two single strands of DNA that can be used as templates for replication. In 1956, Arthur Kornberg and colleagues discovered DNA polymerase I. They described the DNA replication process by which DNA polymerase copies the base sequence of a template DNA strand. Kornberg was awarded the Nobel Prize in Physiology or Medicine in 1959 for this work. DNA polymerase II was discovered by Thomas Kornberg and Malcolm E. Gefter in 1970 while further elucidating the role of Pol I in E. coli DNA replication. The main function of DNA polymerase is to synthesize DNA from deoxyribonucleotides. The DNA copies are created by the pairing of nucleotides to bases present on each strand of the original DNA molecule. This pairing always occurs in specific combinations, with cytosine pairing with guanine and thymine pairing with adenine. By contrast, RNA polymerases synthesize RNA from ribonucleotides, using either RNA or DNA as a template. When synthesizing new DNA, DNA polymerase can add free nucleotides only to the 3′ end of the newly forming strand. This results in elongation of the newly forming strand in a 5′–3′ direction. No known DNA polymerase is able to begin a new chain; it can only add a nucleotide onto a pre-existing 3′-OH group, and therefore needs a primer.
Primers consist of RNA or DNA bases. In DNA replication, the first two bases are always RNA, and are synthesized by another enzyme called primase. It is important to note that the directionality of the newly forming strand is opposite to the direction in which DNA polymerase moves along the template strand. Since DNA polymerase requires a free 3′-OH group for initiation of synthesis, it can synthesize in only one direction, by extending the 3′ end of the existing nucleotide chain. Hence, DNA polymerase moves along the template strand in a 3′–5′ direction, and the daughter strand is formed in a 5′–3′ direction. This difference enables the resultant double-stranded DNA to be composed of two DNA strands that are antiparallel to each other. The function of DNA polymerase is not quite perfect, with the enzyme making about one mistake for every billion base pairs copied. Error correction is a property of some, but not all, DNA polymerases. This process corrects mistakes in newly synthesized DNA. When an incorrect base pair is recognized, DNA polymerase moves backwards by one base pair of DNA.
10.
Protein
–
Proteins are large biomolecules, or macromolecules, consisting of one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, and responding to stimuli. A linear chain of amino acid residues is called a polypeptide. A protein contains at least one long polypeptide. Short polypeptides, containing fewer than 20–30 residues, are rarely considered to be proteins and are commonly called peptides, or sometimes oligopeptides. The individual amino acid residues are bonded together by peptide bonds. The sequence of amino acid residues in a protein is defined by the sequence of a gene, which is encoded in the genetic code. In general, the genetic code specifies 20 standard amino acids. Sometimes proteins have non-peptide groups attached, which can be called prosthetic groups or cofactors. Proteins can also work together to achieve a particular function, and they often associate to form stable protein complexes. Once formed, proteins only exist for a certain period of time and are then degraded and recycled by the cell's machinery through the process of protein turnover. A protein's lifespan is measured in terms of its half-life and covers a wide range. Proteins can exist for minutes or years, with an average lifespan of 1–2 days in mammalian cells. Abnormal or misfolded proteins are degraded more rapidly, either because they are targeted for destruction or because they are unstable. Like other biological macromolecules such as polysaccharides and nucleic acids, proteins are essential parts of organisms. Many proteins are enzymes that catalyse biochemical reactions and are vital to metabolism. Proteins also have structural or mechanical functions, such as actin and myosin in muscle and the proteins in the cytoskeleton. Other proteins are important in cell signaling, immune responses, cell adhesion, and the cell cycle.
In animals, proteins are needed in the diet to provide the essential amino acids that cannot be synthesized. Digestion breaks the proteins down for use in metabolism. Methods commonly used to study protein structure and function include immunohistochemistry, site-directed mutagenesis, X-ray crystallography, and nuclear magnetic resonance. Most proteins consist of linear polymers built from series of up to 20 different L-α-amino acids. All proteinogenic amino acids possess common structural features, including an α-carbon to which an amino group, a carboxyl group, and a variable side chain are bonded. Only proline differs from this basic structure, as it contains an unusual ring to the N-end amine group. The amino acids in a chain are linked by peptide bonds. Once linked in the chain, an individual amino acid is called a residue, and the linked series of carbon, nitrogen, and oxygen atoms is known as the main chain or protein backbone. The peptide bond has two resonance forms that contribute some double-bond character and inhibit rotation around its axis, so that the alpha carbons are roughly coplanar.
11.
DNA replication
–
In molecular biology, DNA replication is the biological process of producing two identical replicas of DNA from one original DNA molecule. This process occurs in all living organisms and is the basis for biological inheritance. DNA is made up of a double helix of two complementary strands. During replication, these strands are separated. Each strand of the original DNA molecule then serves as a template for the production of its counterpart, a process referred to as semiconservative replication. Cellular proofreading and error-checking mechanisms ensure near-perfect fidelity for DNA replication. In a cell, DNA replication begins at specific locations, or origins of replication, in the genome. Unwinding of DNA at the origin and synthesis of new strands results in replication forks growing bi-directionally from the origin. A number of proteins are associated with the fork to help in the initiation and continuation of DNA synthesis. Most prominently, DNA polymerase synthesizes the new strands by adding nucleotides that complement each template strand. DNA replication occurs during the S-stage of interphase. DNA replication can also be performed in vitro. DNA polymerases isolated from cells and artificial DNA primers can be used to initiate DNA synthesis at known sequences in a template DNA molecule. The polymerase chain reaction, a laboratory technique, cyclically applies such artificial synthesis to amplify a specific target DNA fragment from a pool of DNA. DNA usually exists as a double-stranded structure, with both strands coiled together to form the characteristic double helix. Each single strand of DNA is a chain of four types of nucleotides. Nucleotides in DNA contain a deoxyribose sugar, a phosphate, and a nucleobase. These nucleotides form phosphodiester bonds, creating the phosphate-deoxyribose backbone of the DNA double helix with the bases pointing inward.
Nucleotides are matched between strands through hydrogen bonds to form base pairs. Adenine pairs with thymine, and guanine pairs with cytosine. DNA strands have a directionality, and the different ends of a single strand are called the 3′ end and the 5′ end. By convention, if the sequence of a single strand of DNA is given, the left end of the sequence is the 5′ end and the right end is the 3′ end. The strands of the helix are anti-parallel, with one running 5′ to 3′ and the other 3′ to 5′. These terms refer to the carbon atom in deoxyribose to which the next phosphate in the chain attaches. Directionality has consequences in DNA synthesis, because DNA polymerase can synthesize DNA in only one direction, by adding nucleotides to the 3′ end of a DNA strand. The pairing of complementary bases in DNA means that the information contained within each strand is redundant.
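The pairing rules and the anti-parallel arrangement described above can be combined in a short worked example: given one strand written 5′ to 3′, the partner strand is obtained by complementing every base and reversing the result. This is a minimal sketch; the name `reverse_complement` and the restriction to the four unambiguous bases are choices made for illustration.

```python
# A pairs with T, G pairs with C.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    Complementing gives the partner strand read 3' to 5'; reversing it
    restores the conventional left-to-right 5' to 3' orientation.
    """
    seq = seq.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    return seq.translate(_COMPLEMENT)[::-1]


partner = reverse_complement("ATGC")  # complement TACG, reversed to GCAT
```

Applying the function twice returns the original sequence, which reflects the redundancy of information between the two strands.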
12.
Exonuclease
–
Exonucleases are enzymes that work by cleaving nucleotides one at a time from the end of a polynucleotide chain. A hydrolyzing reaction that breaks phosphodiester bonds at either the 3′ or the 5′ end occurs. Its close relative is the endonuclease, which cleaves phosphodiester bonds in the middle of a polynucleotide chain. In both archaea and eukaryotes, one of the routes of RNA degradation is performed by the multi-protein exosome complex. This process involves the exonuclease catching up to pol II. Pol I then synthesizes DNA nucleotides in place of the RNA primer it had just removed. DNA polymerase I also has 3′ to 5′ and 5′ to 3′ exonuclease activity. The 3′ to 5′ activity can only remove one mononucleotide at a time, and the 5′ to 3′ activity can remove mononucleotides or up to 10 nucleotides at a time. In 1971, I. R. Lehman discovered exonuclease I in E. coli. Since that time, there have been numerous discoveries, including exonucleases II, III, IV, V, VI, VII, and VIII. Each type of exonuclease has a specific type of function or requirement. Exonuclease I breaks apart single-stranded DNA in a 3′→5′ direction. It does not cleave DNA strands without terminal 3′-OH groups because they are blocked by phosphoryl or acetyl groups. Exonuclease II is associated with DNA polymerase I, which contains a 5′ exonuclease that clips off the RNA primer contained immediately upstream from the site of DNA synthesis in a 5′→3′ manner. Exonuclease III has four catalytic activities, including 3′ to 5′ exodeoxyribonuclease activity. Exonuclease IV adds a water molecule, so it can break the bond of an oligonucleotide to nucleoside 5′ monophosphate. This exonuclease requires Mg2+ in order to function and works at higher temperatures than exonuclease I. Exonuclease V is a 3′ to 5′ hydrolyzing enzyme that acts on linear double-stranded DNA and single-stranded DNA and requires Ca2+. This enzyme is important in the process of homologous recombination.
Exonuclease VIII is a 5′ to 3′ dimeric protein that does not require ATP or any gaps or nicks in the strand. The 3′ to 5′ human type endonuclease is known to be essential for the processing of histone pre-mRNA. Following the removal of the downstream cleavage product, a 5′ to 3′ exonuclease continues to break down the product until it is completely degraded, which allows the nucleotides to be recycled. This initiates transcriptional termination, because DNA or RNA strands should not build up in the body. CCR4-NOT is a general transcription regulatory complex in yeast that is associated with mRNA metabolism, transcription initiation, and mRNA degradation. CCR4 has been found to contain RNA and single-stranded DNA 3′ to 5′ exonuclease activities. Another component associated with the CCR4 complex is the CAF1 protein, which has been found to contain 3′ to 5′ or 5′ to 3′ exonuclease domains in the mouse and Caenorhabditis elegans. This protein has not been found in yeast, which suggests that it is likely to have an abnormal exonuclease domain like the one seen in a metazoan. Yeast contains the Rat1 and Xrn1 exonucleases.
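The one-nucleotide-at-a-time action of an exonuclease can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the strand is a plain string written 5′ to 3′, and the function name exonuclease_digest and its direction labels are hypothetical.

```python
def exonuclease_digest(strand, steps, direction="3to5"):
    """Remove up to `steps` nucleotides, one at a time, from one end.

    `strand` is written 5' to 3'. A "3to5" enzyme starts at the 3' end
    (the right end of the string); a "5to3" enzyme starts at the 5' end.
    Returns the remaining strand and the released mononucleotides in the
    order they were cleaved.
    """
    if direction not in ("3to5", "5to3"):
        raise ValueError("direction must be '3to5' or '5to3'")
    released = []
    for _ in range(steps):
        if not strand:
            break  # nothing left to digest
        if direction == "3to5":
            released.append(strand[-1])
            strand = strand[:-1]
        else:
            released.append(strand[0])
            strand = strand[1:]
    return strand, released
```

An endonuclease, by contrast, would cut at an internal position rather than at either end, so it is not covered by this sketch.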
13.
DNA polymerase III holoenzyme
–
DNA polymerase III holoenzyme is the primary enzyme complex involved in prokaryotic DNA replication. It was discovered by Thomas Kornberg and Malcolm Gefter in 1970. The complex has high processivity and, specifically referring to the replication of the E. coli genome, works in conjunction with four other DNA polymerases. DNA Pol III is a component of the replisome, which is located at the replication fork. The replisome is composed of the following: two DNA Pol III enzymes, each comprising α, ε and θ subunits (the α subunit has the polymerase activity, the ε subunit has 3′→5′ exonuclease activity, and the θ subunit stimulates the ε subunit's proofreading); two β units, which act as sliding DNA clamps and keep the polymerase bound to the DNA; two τ units, which act to dimerize two of the core enzymes; one γ unit, which acts as a clamp loader for the lagging-strand Okazaki fragments; and χ and ψ, which form a 1:1 complex and bind to γ or τ. The γ unit is made up of five subunits, which include three γ subunits, one δ subunit and one δ′ subunit; the δ subunit is involved in copying the lagging strand. χ can also mediate the switch from RNA primer to DNA. DNA polymerase III synthesizes DNA at a rate of around 1000 nucleotides per second. DNA Pol III activity begins after strand separation at the origin of replication. Because DNA synthesis cannot start de novo, an RNA primer, complementary to part of the single-stranded DNA, is synthesized by primase. DNA polymerase III has a high processivity and therefore synthesizes DNA very quickly. This high processivity is due in part to the β-clamps that hold onto the DNA strands. The removal of the RNA primer allows DNA ligase to ligate the DNA-DNA nick between the new fragment and the previous strand. DNA polymerases I and III, along with many other enzymes, are all required for the high fidelity and high processivity of DNA replication.
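The quoted synthesis rate allows a rough back-of-the-envelope estimate of how long replication takes. The Python sketch below is an illustration only: the function name is hypothetical, the default of two forks assumes bidirectional replication from a single origin, and the genome size of about 4.6 million base pairs for E. coli used in the note is an assumption that does not come from this text.

```python
def replication_time_seconds(genome_bp, rate_nt_per_s=1000.0, forks=2):
    """Estimate the time to copy a genome at a constant synthesis rate.

    genome_bp      -- genome length in base pairs
    rate_nt_per_s  -- nucleotides added per second at each fork
                      (about 1000 for DNA Pol III, as quoted in the text)
    forks          -- number of forks working in parallel (assumed 2)
    """
    if genome_bp < 0 or rate_nt_per_s <= 0 or forks < 1:
        raise ValueError("genome_bp must be >= 0, rate > 0 and forks >= 1")
    return genome_bp / (rate_nt_per_s * forks)
```

With the assumed 4.6 million base pairs, replication_time_seconds(4_600_000) gives 2300 seconds, or a little under 40 minutes. The estimate ignores initiation, pauses and termination.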
14.
DNA polymerase I
–
DNA polymerase I is an enzyme that participates in the process of prokaryotic DNA replication. Discovered by Arthur Kornberg in 1956, it was the first known DNA polymerase. It was initially characterized in E. coli and is ubiquitous in prokaryotes. In E. coli and many other bacteria, the gene that encodes Pol I is known as polA. The physiological function of Pol I is mainly to repair damage to DNA. In 1956, Arthur Kornberg and his colleagues discovered Pol I by using Escherichia coli extracts to develop a DNA synthesis assay. The scientists added labeled thymidine so that a polymer of DNA, not RNA, could be retrieved. It was discovered that the P-fraction contained Pol I and heat-stable factors that were essential for the DNA synthesis reactions; these factors were identified as nucleoside triphosphates, the building blocks of nucleic acids. The S-fraction contained multiple deoxynucleoside kinases. Pol I mainly functions in the repair of damaged DNA. Pol I belongs to the alpha/beta protein class, which consists of alpha and beta segments that are scattered throughout any given protein. E. coli DNA Pol I consists of four domains with two separate enzymatic activities. The fourth domain consists of an exonuclease that proofreads the product of DNA Pol I and is able to remove any mistakes committed by Pol I. The other three domains work together to sustain DNA polymerase activity. E. coli contains five different DNA polymerases: DNA Pol I, DNA Pol II, DNA Pol III, DNA Pol IV, and DNA Pol V. Eukaryotic cells contain five different DNA polymerases: α, β, γ, δ, and ε. Eukaryotic DNA polymerase β is most similar to E. coli DNA Pol I because its function is associated with DNA repair. DNA polymerase β is mainly used in base-excision repair and nucleotide-excision repair. A total of 15 human DNA polymerases have been identified.
DNA polymerases also cannot initiate DNA chains, so synthesis must be initiated by short RNA or DNA segments known as primers. In order for DNA polymerization to take place, two requirements must be met. First, all DNA polymerases must have both a template strand and a primer strand; unlike RNA polymerases, DNA polymerases cannot synthesize DNA from a template strand alone. Synthesis must be initiated by a short RNA segment, known as an RNA primer, which is synthesized by primase. Second, DNA polymerases can only add new nucleotides to the preexisting strand through hydrogen bonding. Since all DNA polymerases have a similar structure, they all share a two-metal-ion-catalyzed polymerase mechanism. One of the metal ions activates the primer 3′ hydroxyl group, which then attacks the α-phosphate of the dNTP; the second metal ion stabilizes the negative charge on the leaving oxygen. The X-ray structures of the polymerase domain of all DNA polymerases have been said to resemble a human right hand.
15.
Mutant
–
The natural occurrence of genetic mutations is integral to the process of evolution. The study of mutants is an integral part of biology: by understanding the effect that a mutation in a gene has, it is possible to establish the normal function of that gene. Although not all mutations have a phenotypic effect, the common usage of the word mutant is generally pejorative and reserved for noticeable mutations. Previously, people used the word sport to refer to abnormal specimens. The scientific usage is broader, referring to any organism differing from the wild type. Mutants should not be confused with organisms born with developmental abnormalities, which are caused by errors during morphogenesis. In a developmental abnormality, the DNA of the organism is unchanged; conjoined twins are the result of developmental abnormalities. Chemicals that cause developmental abnormalities are called teratogens. These may also cause mutations; chemicals that induce mutations are called mutagens. Most mutagens are also considered to be carcinogens. Within any given environment, a species has a variety of different competitors for resources, and predators to be wary of. This applies to realistic environments, which are ephemeral and dynamic and comprise a multitude of different aspects, especially as seasons change. This means that for a species to exist within such an ecosystem, it must be able to adapt to the environmental cues it is given. This gives rise to species having special advantages over others. A difference between species is prevalent in this manner. This is an example of a beneficial mutation producing a mutant that can live within a harsh environment. This success will cause organism A to have a higher reproduction rate, thereby rapidly spreading the mutations that helped organism A to survive. This creates a new mutant of the original species. Another example occurred during the Industrial Revolution: with the massive increase in factories and other facilities, there was an influx of air pollution in many different environments.
This influx caused a change in many ecosystems. The peppered moth had been white in color. Unfortunately, because of the influx of air pollution, the trees it would hide in started turning darker and darker from the black smoke coming from the factories.
16.
Ultraviolet
–
Ultraviolet (UV) is electromagnetic radiation with a wavelength from 10 nm to 400 nm, shorter than that of visible light but longer than X-rays. UV radiation constitutes about 10% of the light output of the Sun. It is also produced by electric arcs and specialized lights, such as tanning lamps. The effects of UV are greater than simple heating effects. Suntan, freckling and sunburn are familiar effects of over-exposure, along with a risk of skin cancer. Living things on dry land would be damaged by ultraviolet radiation from the Sun if most of it were not filtered out by the Earth's atmosphere. More-energetic, shorter-wavelength extreme UV below 121 nm ionizes air so strongly that it is absorbed before it reaches the ground. Ultraviolet is also responsible for the formation of bone-strengthening vitamin D in most land vertebrates, including humans. The UV spectrum thus has effects both beneficial and harmful to human health. Ultraviolet rays are invisible to most humans: the lens in a human eye ordinarily filters out UVB frequencies or higher, and humans lack color receptor adaptations for ultraviolet rays. Under some conditions, children and young adults can see ultraviolet down to wavelengths of about 310 nm. Near-UV radiation is visible to some insects, mammals, and birds. Small birds have a fourth color receptor for ultraviolet rays, which gives birds true UV vision. Reindeer use near-UV radiation to see polar bears, which are poorly visible in regular light because they blend in with the snow. UV also allows mammals to see urine trails, which is helpful for animals finding food in the wild. The males and females of some species look identical to the human eye. Ultraviolet means beyond violet, violet being the color of the highest frequencies of visible light; ultraviolet has a higher frequency than violet light. The discoverer of UV radiation, Johann Wilhelm Ritter, called them oxidizing rays to emphasize chemical reactivity and to distinguish them from heat rays.
The terms chemical rays and heat rays were eventually dropped in favour of ultraviolet and infrared radiation. In 1878, the sterilizing effect of short-wavelength light on bacteria was discovered. By 1903 it was known that the most effective wavelengths were around 250 nm. In 1960, the effect of ultraviolet radiation on DNA was established. The discovery of ultraviolet radiation below 200 nm, named vacuum ultraviolet because it is absorbed by air, was made in 1893 by the German physicist Victor Schumann.
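The link between shorter wavelength and higher photon energy can be made concrete with the relation E = hc/λ. The Python sketch below is a minimal illustration; the function name photon_energy_ev is hypothetical, and the constants are the exact SI-defined values of the Planck constant, the speed of light and the elementary charge.

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant, joule seconds
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light in vacuum, metres per second
JOULES_PER_EV = 1.602176634e-19    # one electronvolt expressed in joules


def photon_energy_ev(wavelength_nm):
    """Return the energy of one photon, in electronvolts, for a wavelength in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV
```

Under this relation the long-wavelength edge of the UV range (400 nm) corresponds to roughly 3.1 eV per photon and the short-wavelength edge (10 nm) to roughly 124 eV.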
17.
Hypothesis
–
A hypothesis is a proposed explanation for a phenomenon. For a hypothesis to be a scientific hypothesis, the scientific method requires that one can test it. Scientists generally base scientific hypotheses on previous observations that cannot satisfactorily be explained with the available scientific theories. Even though the words hypothesis and theory are often used synonymously, a scientific hypothesis is not the same as a scientific theory. A working hypothesis is a provisionally accepted hypothesis proposed for further research. In formal logic, the hypothesis is the antecedent P of a proposition of the form "if P, then Q"; P is the assumption in a "What If" question. As Philip Wadler put it: "Remember, the way that you prove an implication is by assuming the hypothesis." In its ancient usage, hypothesis referred to a summary of the plot of a classical drama. The English word hypothesis comes from the ancient Greek word ὑπόθεσις (hupothesis). In Plato's Meno, Socrates dissects virtue with a method used by mathematicians, that of investigating from a hypothesis. In this sense, hypothesis refers to an idea or to a convenient mathematical approach that simplifies cumbersome calculations. In common usage in the 21st century, a hypothesis refers to an idea whose merit requires evaluation. For proper evaluation, the framer of a hypothesis needs to define specifics in operational terms. A hypothesis requires more work by the researcher in order to either confirm or disprove it. In due course, a hypothesis may become part of a theory or occasionally may grow to become a theory itself. Normally, scientific hypotheses have the form of a mathematical model. In entrepreneurial science, a hypothesis is used to formulate provisional ideas within a business setting. The formulated hypothesis is then evaluated and shown to be either true or false through a verifiability- or falsifiability-oriented experiment. Any useful hypothesis will enable predictions by reasoning. It might predict the outcome of an experiment in a laboratory setting or the observation of a phenomenon in nature.
The prediction may also invoke statistics and only talk about probabilities. Other philosophers of science have rejected the criterion of falsifiability or supplemented it with other criteria, such as verifiability or coherence. The scientific method involves experimentation to test the ability of some hypothesis to adequately answer the question under investigation. In contrast, unfettered observation is not as likely to raise unexplained issues or open questions in science. A thought experiment might also be used to test the hypothesis. In framing a hypothesis, the investigator must not currently know the outcome of a test, or the outcome must remain reasonably under continuing investigation. Only in such cases does the experiment, test or study potentially increase the probability of showing the truth of a hypothesis.
18.
Wild type
–
Wild type refers to the phenotype of the typical form of a species as it occurs in nature. Originally, the wild type was conceptualized as a product of the standard normal allele at a locus, in contrast to that produced by a non-standard, mutant allele. Mutant alleles can vary to an extent, and can even become the wild type if a genetic shift occurs within the population. Continued advancements in genetic mapping technologies have created a better understanding of how mutations occur. In general, however, the most prevalent allele, i.e. the one with the highest gene frequency, is the one deemed wild type. Wild-type alleles are indicated with a + superscript, for example w+ and vg+ for red eyes and full-size wings, respectively. Manipulation of the genes behind these traits led to the current understanding of how organisms form and how traits mutate within a population. Research involving the manipulation of wild-type alleles has application in many fields, including fighting disease. The genetic sequence for wild-type versus mutant phenotypes, and how these genes interact in expression, is the subject of much research. Better understanding of these processes is hoped to bring about methods for preventing and curing diseases that are currently incurable, such as infection with the herpes virus. One example of such promising research was a study examining the link between wild-type mutations and certain types of lung cancer. Research is also being done on the manipulation of certain traits in viruses to develop new vaccines. This research may lead to new ways to combat deadly viruses such as the Ebola virus. Research using wild-type mutations is also being done to establish how viruses transition between species, in order to identify harmful viruses with the potential to infect humans.
Selective breeding to enhance the most beneficial traits is the structure upon which agriculture is built. Genetic manipulation expedited the evolution process to make crop plants and animals larger and more disease resistant. Research into wild-type mutations has allowed the creation of modified crops that are more efficient food producers. Utilization of these wild-type mutations has led to plants capable of growing in extremely arid environments. As more is understood about these genes, agriculture will continue to become a more efficient process. Amplification of advantageous genes allows the best traits in a population to be present at much higher percentages than normal. These changes have also been the reason certain plants and animals are almost unrecognizable when compared to their ancestral lines.
19.
Nitrogen mustard
–
The nitrogen mustards are cytotoxic chemotherapy agents similar to mustard gas. Although their common use is medicinal, in principle these compounds can also be deployed as chemical warfare agents. Nitrogen mustards are nonspecific DNA alkylating agents. Nitrogen mustard gas was stockpiled by several nations during the Second World War; production and use are now strongly restricted. Also during World War II, an incident during the air raid on Bari, Italy, led to the release of mustard gas that affected several hundred soldiers. Medical examination of the survivors showed a decreased number of lymphocytes. After World War II was over, the Bari incident and the Yale group's studies eventually converged, prompting a search for similar compounds. Due to its use in those studies, the nitrogen mustard known as HN2 became the first chemotherapy drug, mustine. Nitrogen mustards are not related to the mustard plant or its pungent essence, allyl isothiocyanate. The original nitrogen mustard drug, mustine, is no longer commonly in use because of excessive toxicity. Other nitrogen mustards developed as treatments include cyclophosphamide, chlorambucil, uramustine, ifosfamide, melphalan, and bendamustine. Bendamustine has recently re-emerged as a viable chemotherapeutic treatment. Nitrogen mustards that can be used for chemical warfare purposes are tightly regulated. Their weapon designations are HN1, bis(2-chloroethyl)ethylamine; HN2, bis(2-chloroethyl)methylamine; and HN3, tris(2-chloroethyl)amine. Normustard has been used to make other drugs, for example mazapertine, aripiprazole and fluanisone. Canfosfamide was also made from normustard. Some nitrogen mustards of opiates were also prepared, although these are not known to be antineoplastic. Nitrogen mustards form cyclic aminium ions (aziridinium rings) by intramolecular displacement of the chloride by the amine nitrogen. This aziridinium group then alkylates DNA once it is attacked by the N-7 nucleophilic center on the guanine base.
A second attack after the displacement of the second chlorine forms the second alkylation step, which results in the formation of interstrand cross-links (ICLs), as was shown in the early 1960s. At that time it was proposed that the ICLs were formed between the N-7 atoms of guanine residues in a 5′-d(GC) sequence. These kinds of lesions are effective at forcing the cell into apoptosis via p53. Note that the alkylating damage itself is not cytotoxic and does not directly cause cell death. Later it was clearly demonstrated that nitrogen mustards (NMs) form a 1,3 ICL in the 5′-d(GNC) sequence. The strong cytotoxic effect caused by the formation of ICLs is what makes NMs effective chemotherapeutic agents. Other compounds used in cancer chemotherapy that have the ability to form ICLs are cisplatin, mitomycin C, carmustine, and psoralen.
20.
Psoralen
–
Psoralen is the parent compound in a family of natural products known as furocoumarins. It is structurally related to coumarin by the addition of a furan ring. Psoralen occurs naturally in the seeds of Psoralea corylifolia, as well as in the fig, celery, parsley, and West Indian satinwood. It is widely used in PUVA (psoralen plus UVA) treatment for psoriasis, eczema, and vitiligo. Many furocoumarins are extremely toxic to fish, and some are deposited in streams in Indonesia to catch fish. Psoralen is a mutagen, and is used for this purpose in molecular biology research. PUVA therapy has shown clinical efficacy. Unfortunately, a side effect of PUVA treatment is a higher risk of skin cancer. An important use of psoralen is in PUVA treatment for skin problems such as psoriasis and eczema, which takes advantage of the high UV absorbance of psoralen. The psoralen is applied first to sensitise the skin, then UVA light is applied to clear up the skin problem. Psoralen has also been recommended for treating alopecia. Psoralens are also used in photopheresis, where they are mixed with the extracted leukocytes before UV radiation is applied. Despite the photocarcinogenic properties of psoralen, it was used as a tanning activator in sunscreens until 1996. Psoralens are used in tanning accelerators, although psoralen increases the skin's sensitivity to light. Some patients have had severe skin loss after sunbathing with psoralen-containing tanning activators. Patients with lighter skin colour suffer four times as much from the melanoma-generating properties of psoralens as those with darker skin. An additional use for optimized psoralens is the inactivation of pathogens in blood products. The synthetic amino-psoralen, amotosalen HCl, has been developed for the inactivation of infectious pathogens in platelet and plasma blood components prepared for transfusion support of patients.
Prior to clinical use, amotosalen-treated platelets were tested and found to be non-carcinogenic in the established p53 knockout mouse model. The technology is currently in routine use in certain European blood centers and has recently been approved in the US. Psoralen intercalates into the DNA double helix, where it is positioned to form adducts with adjacent pyrimidine bases, preferentially thymine. Several physicochemical methods have been employed to derive binding constants for psoralen-DNA interactions. Water solubility is important for two reasons: pharmacokinetics relating to drug solubility in blood, and the need to use organic solvents. Psoralens can also be activated by irradiation with long-wavelength UV light. While UVA-range light is the clinical standard, research showing that UVB is more efficient at forming photoadducts suggests that its use may lead to higher efficacy and shorter treatment times. The photochemically reactive sites in psoralens are located at each of the double bonds in the furan ring.
21.
PubMed Identifier
–
PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. The United States National Library of Medicine (NLM) at the National Institutes of Health maintains the database as part of the Entrez system of information retrieval. From 1971 to 1997, online access to the MEDLARS Online (MEDLINE) computerized database had been primarily through institutional facilities, such as university libraries. PubMed, first released in January 1996, ushered in the era of private, free, home- and office-based MEDLINE searching. The PubMed system was offered free to the public in June 1997, when MEDLINE searches via the Web were demonstrated, in a ceremony, by Vice President Al Gore. Information about the journals indexed in MEDLINE, and available through PubMed, is found in the NLM Catalog. As of 5 January 2017, PubMed has more than 26.8 million records going back to 1966, selectively to the year 1865, and very selectively to 1809; about 500,000 new records are added each year. As of the same date, 13.1 million of PubMed's records are listed with their abstracts. In 2016, NLM changed the system so that publishers will be able to directly correct typos. Simple searches on PubMed can be carried out by entering key aspects of a subject into PubMed's search window. When a journal article is indexed, numerous article parameters are extracted and stored as structured information. Such parameters include Article Type, Secondary identifiers, and Language. The publication type parameter enables many special features. These clinical queries can generate small sets of robust studies with considerable precision. Since July 2005, the MEDLINE article indexing process has extracted important identifiers from the article abstract and put them in a field called Secondary Identifier. The secondary identifier field stores accession numbers to various databases of molecular sequence data, gene expression or chemical compounds.
For clinical trials, PubMed extracts trial IDs for the two largest trial registries, ClinicalTrials.gov and the International Standard Randomized Controlled Trial Number Register. A reference which is judged particularly relevant can be marked, and related articles can be identified. If relevant, several studies can be selected and related articles to all of them can be generated using the Find related data option. The related articles are then listed in order of relatedness. To create these lists of related articles, PubMed compares words from the title and abstract of each citation, as well as the MeSH headings assigned, using a powerful word-weighted algorithm. The related articles function has been judged to be so precise that some researchers suggest it can be used instead of a full search. A strong feature of PubMed is its ability to automatically link to MeSH terms and subheadings. For example, "bad breath" links to "halitosis" and "heart attack" to "myocardial infarction". Where appropriate, these MeSH terms are automatically expanded, that is, they include more specific terms. Terms like "nursing" are automatically linked to "Nursing [MeSH]" or "Nursing [Subheading]". This important feature makes PubMed searches automatically more sensitive and avoids false-negative hits by compensating for the diversity of medical terminology. The My NCBI area can be accessed from any computer with web access. An earlier version of My NCBI was called PubMed Cubby.
22.
Eukaryotic DNA replication
–
Eukaryotic DNA replication is a conserved mechanism that restricts DNA replication to only once per cell cycle. Replication of chromosomal DNA is central to the duplication of a cell and is necessary for the maintenance of the eukaryotic genome. DNA replication is the action of DNA polymerases synthesizing a DNA strand complementary to the original template strand. To synthesize DNA, the double-stranded DNA is unwound by DNA helicases ahead of the polymerases. Replication permits the copying of a single DNA double helix into two DNA helices, which are divided into the daughter cells at mitosis. The replisome is responsible for copying the entirety of genomic DNA in each proliferative cell. This process allows for the high-fidelity passage of hereditary information from parental cell to daughter cell and is thus essential to all organisms. Much of the cell cycle is built around ensuring that DNA replication occurs without errors. In the G1 phase of the cycle, many of the DNA replication regulatory processes are initiated. In eukaryotes, the vast majority of DNA synthesis occurs during the S phase of the cycle. During G2, any damaged DNA or replication errors are corrected. Finally, one copy of the genome is segregated to each daughter cell at mitosis, or M phase. These daughter copies each contain one strand from the parental duplex DNA. This mechanism is conserved from prokaryotes to eukaryotes and is known as semiconservative DNA replication. Initiation of eukaryotic DNA replication is the first stage of DNA synthesis, in which the DNA double helix is unwound. The priming event on the lagging strand establishes a replication fork. Priming of the DNA helix consists of synthesis of an RNA primer to allow DNA synthesis by DNA polymerase α. Priming occurs once at the origin on the leading strand and at the start of each Okazaki fragment on the lagging strand.
DNA replication is initiated from specific sequences called origins of replication. To initiate DNA replication, multiple replicative proteins assemble on and dissociate from these replicative origins. The individual factors work together to direct the formation of the pre-replication complex (pre-RC). Both Cdc6 and Cdt1 proteins associate with the already bound origin recognition complex (ORC) independently of each other. The ORC, Cdc6, and Cdt1 together are required for the association of the minichromosome maintenance (MCM) complex proteins with replicative origins during the G1 phase of the cell cycle. Eukaryotic origins of replication control the formation of a number of complexes that lead to the assembly of two bidirectional DNA replication forks. These events are initiated by the formation of the pre-RC at the origins of replication. This process takes place in the G1 stage of the cell cycle. Pre-RC formation involves the ordered assembly of many replication factors, including the origin recognition complex, Cdc6 protein, Cdt1 protein, and minichromosome maintenance proteins. The subsequent transition involves the assembly of additional replication factors to unwind the DNA.
23.
Pre-replication complex
–
A pre-replication complex (pre-RC) is a protein complex that forms at the origin of replication during the initiation step of DNA replication. Formation of the pre-RC is required for DNA replication to occur. Complete and faithful replication of the genome ensures that each daughter cell will carry the same genetic information as the parent cell. Accordingly, formation of the pre-RC is an important part of the cell cycle. As organisms evolved and became more complex, so did their pre-RCs. The following is a summary of the components of the pre-RC among the different domains of life. In bacteria, the main component of the pre-RC is DnaA. The pre-RC is complete when DnaA occupies all of its binding sites within the bacterial origin of replication. The archaeal pre-RC is very different from the bacterial pre-RC and can serve as a simplified model of the eukaryotic pre-RC. It includes a single origin recognition complex (ORC) protein, Cdc6. The eukaryotic pre-RC is the most complex and highly regulated pre-RC. In most eukaryotes it is composed of six ORC proteins, Cdc6, Cdt1, and a heterohexamer of MCM proteins. The MCM heterohexamer arguably arose via MCM gene duplication events and subsequent divergent evolution. The pre-RC of Schizosaccharomyces pombe is notably different from that of other eukaryotes. Sap1 is also included in the S. pombe pre-RC because it is required for Cdc18 binding. The pre-RC of Xenopus laevis also has an additional protein, MCM9. Recognition of the origin of replication is a critical first step in the formation of the pre-RC. In different domains of life this process is accomplished differently. In prokaryotes, origin recognition is accomplished by DnaA. DnaA binds tightly to a 9-base pair consensus sequence in oriC, 5′-TTATCCACA-3′. There are 5 such 9-bp sequences and 4 non-consensus sequences within oriC that DnaA binds with differential affinity. DnaA binds R4, R1, and R2 with high affinity and R5, I1, I2, I3, and R3 with lesser affinity.
The pre-RC is complete when DnaA occupies all of the high- and low-affinity binding sites. Archaea have 1–3 origins of replication. The origins are generally AT-rich tracts that vary based on the archaeal species. The single archaeal ORC protein recognizes the AT-rich tracts and binds DNA in an ATP-dependent fashion. Eukaryotes typically have multiple origins of replication, at least one per chromosome. Saccharomyces cerevisiae is the only known eukaryote with a defined initiation sequence, TTTTTATG/ATTTA/T. This initiation sequence is recognized by ORC1-5; ORC6 is not known to bind DNA in S. cerevisiae. Initiation sequences in S. pombe and higher eukaryotes are not well defined. However, the initiation sequences are generally either AT-rich or exhibit bent or curved DNA topology.
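Locating candidate DnaA binding sites can be sketched as a simple motif scan for the 9-bp consensus 5′-TTATCCACA-3′ given in the text. The Python sketch below is illustrative only: the function name find_sites, the mismatch allowance used to approximate non-consensus sites, and the example sequence in the note are all assumptions; it scans one strand and does not model binding affinity.

```python
DNAA_BOX = "TTATCCACA"  # 9-bp consensus sequence quoted in the text


def find_sites(sequence, motif=DNAA_BOX, max_mismatches=0):
    """Scan `sequence` (written 5' to 3') for windows resembling `motif`.

    Returns a list of (start_index, window, mismatch_count) tuples for
    every window with at most `max_mismatches` differing positions.
    """
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits
```

For a made-up sequence such as "GGTTATCCACAGGTTATACACAGG", an exact scan reports the consensus copy at index 2, and allowing one mismatch also reports the near-consensus copy at index 13.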
24.
Helicase
–
Helicases are a class of enzymes vital to all living organisms. Their main function is to unpackage an organism's genes. They are motor proteins that move directionally along a nucleic acid phosphodiester backbone, separating two annealed nucleic acid strands using energy derived from ATP hydrolysis. There are many helicases, reflecting the great variety of processes in which strand separation must be catalyzed. Approximately 1% of eukaryotic genes code for helicases. The human genome codes for 95 non-redundant helicases: 64 RNA helicases and 31 DNA helicases. They also function to remove nucleic acid-associated proteins and catalyze homologous DNA recombination. Metabolic processes of RNA such as translation, transcription, ribosome biogenesis, RNA splicing, RNA transport, RNA editing, and RNA degradation are all facilitated by helicases. Helicases move incrementally along one nucleic acid strand of the duplex with a directionality and processivity specific to each particular enzyme. Helicases adopt different structures and oligomerization states. Whereas DnaB-like helicases unwind DNA as ring-shaped hexamers, other enzymes have been shown to be active as monomers or dimers. In the latter case, the helicase acts comparably to an active motor, unwinding and translocating along its substrate as a direct result of its ATPase activity. Helicases may process much faster in vivo than in vitro due to the presence of proteins that aid in the destabilization of the fork junction. Enzymatic helicase action, such as unwinding nucleic acids, is achieved through the lowering of the activation barrier of each specific action. The size of the barrier to be overcome by the helicase contributes to its classification as an active or passive helicase. In passive helicases, a significant activation barrier exists. Certain nucleic acid combinations will decrease unwinding rates, while various destabilizing forces can increase the unwinding rate.
In passive systems, the rate of unwinding (Vun) is less than the rate of translocation (Vtrans). Another way to view the passive helicase is its reliance on the transient unraveling of the base pairs at the replication fork to determine its rate of unwinding. In active helicases, Vun is approximately equal to Vtrans. Another way to view the active helicase is its ability to directly destabilize the replication fork to promote unwinding. These two categories of helicases may also be modelled as mechanisms. In such models the passive helicases are conceptualized as Brownian ratchets, driven by thermal fluctuations and subsequent anisotropic gradients across the DNA lattice. Depending upon the organism, such helix-traversing progress can occur at rotational speeds in the range of 5,000 to 10,000 rpm. DNA helicases were discovered in E. coli in 1976. This helicase was described as a DNA unwinding enzyme that is found to denature DNA duplexes in an ATP-dependent reaction, without detectably degrading them. The first eukaryotic DNA helicase was discovered in 1978 in the lily plant. Since then, DNA helicases have been discovered and isolated in other bacteria, viruses, yeast, flies, and higher eukaryotes. Below is a history of discovery: 1976 – Discovery
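The passive/active distinction above comes down to comparing the unwinding rate with the translocation rate. The following is a minimal Python sketch of that comparison, not an established classification procedure: the function name `classify_helicase` and the 0.9 cutoff used for "approximately equal" are assumptions made for illustration.

```python
def classify_helicase(v_un: float, v_trans: float, active_ratio: float = 0.9) -> str:
    """Label a helicase 'active' or 'passive' from its two rates.

    v_un    -- unwinding rate (Vun)
    v_trans -- translocation rate (Vtrans), in the same units as v_un
    active_ratio -- hypothetical cutoff for "Vun approximately equal to Vtrans"
    """
    if v_trans <= 0:
        raise ValueError("v_trans must be positive")
    if v_un < 0:
        raise ValueError("v_un must be non-negative")
    # Active helicases unwind almost as fast as they translocate;
    # passive helicases unwind noticeably more slowly.
    return "active" if v_un / v_trans >= active_ratio else "passive"
```

For instance, `classify_helicase(25.0, 100.0)` gives `"passive"` under this cutoff; measured rates depend on sequence and experimental conditions, so any fixed cutoff is only a rough guide.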
25.
DnaA
–
DnaA is a protein that activates initiation of DNA replication in bacteria. It is an initiation factor which promotes the unwinding of DNA at oriC. The onset of the initiation phase of DNA replication is determined by the concentration of DnaA. DnaA accumulates during growth and then triggers the initiation of replication. Replication begins with active DnaA binding to 9-mer repeats upstream of oriC. Binding of DnaA leads to strand separation at the 13-mer repeats. This binding causes the DNA to loop in preparation for melting open by the helicase DnaB. The active form of DnaA is bound to ATP. Immediately after a cell has divided, the level of active DnaA within the cell is low. Although the active form of DnaA requires ATP, the formation of the oriC/DnaA complex and subsequent DNA unwinding does not require ATP hydrolysis. The oriC site in E. coli has three AT-rich 13 base pair regions followed by four 9 bp regions with the sequence TTAT(C or A)CA(C or A)A. Around 10 DnaA molecules bind to the 9 bp regions, which wrap around the proteins, causing the DNA at the AT-rich region to unwind. There are 8 DnaA binding sites within oriC, to which DnaA binds with differential affinity. When DNA replication is about to commence, DnaA occupies all of the high- and low-affinity binding sites. The denatured AT-rich region allows for the recruitment of DnaB, which complexes with DnaC. DnaC helps the helicase to bind to and to properly accommodate the ssDNA at the 13 bp region; this is accomplished by ATP hydrolysis, after which DnaC is released. Single-strand binding proteins stabilize the single DNA strands in order to maintain the replication bubble. DnaB is a 5′→3′ helicase, so it travels on the lagging strand. It associates with DnaG to form the only primer for the leading strand. The interaction between DnaG and DnaB is necessary to control the length of Okazaki fragments on the lagging strand. DNA polymerase III is then able to start DNA replication.
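Locating candidate 9 bp DnaA binding sites in a sequence is a simple pattern search. The following minimal Python sketch assumes the degenerate consensus TTAT(C or A)CA(C or A)A commonly given for the 9-mer; the function name and the regular expression are illustrative assumptions, and only the strand as written is scanned (no reverse complement).

```python
import re

# Assumed 9 bp consensus: TTAT(C or A)CA(C or A)A.
# The lookahead lets overlapping sites be reported as well.
DNAA_BOX = re.compile(r"(?=(TTAT[CA]CA[CA]A))")


def find_dnaa_boxes(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 9-mer) for each consensus match."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in DNAA_BOX.finditer(seq)]
```

Because DnaA binds its sites with differential affinity, an exact-match scan like this would miss weaker sites that deviate from the consensus.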
26.
DnaB helicase
–
DnaB helicase is an enzyme in bacteria which opens the replication fork during DNA replication. Initially, when DnaB binds to DnaA, it is associated with DnaC. After DnaC dissociates, DnaB binds DnaG. The N-terminal domain has a structure that forms an orthogonal bundle. The C-terminal domain contains an ATP-binding site and is probably the site of ATP hydrolysis. In eukaryotes, helicase function is provided by the MCM complex. The DnaB helicase is the product of the dnaB gene. The helicase enzyme that is produced is a hexamer in E. coli. The energy for DnaB activity is provided by NTP hydrolysis. Mechanical energy moves the DnaB into the fork, physically splitting it in half. In E. coli, DnaB is a protein of six 471-residue subunits. During DNA replication, the lagging strand of DNA binds in the central channel of DnaB. The binding of dNTPs causes a conformational change which allows DnaB to translocate along the DNA. At least 10 different enzymes or proteins participate in the initiation phase of replication. They open the DNA helix at the origin and establish a prepriming complex for subsequent reactions. The crucial component in the initiation process is the DnaA protein, a member of the AAA+ ATPase protein family. Many AAA+ ATPases, including DnaA, form oligomers and hydrolyze ATP relatively slowly. This ATP hydrolysis acts as a switch mediating interconversion of the protein between two states. In the case of DnaA, the ATP-bound form is active. Eight DnaA protein molecules, all in the ATP-bound state, assemble to form a helical complex encompassing the R and I sites in oriC. DnaA has a higher affinity for the R sites than for the I sites. The I sites, which bind only the ATP-bound DnaA, allow discrimination between the active and inactive forms of DnaA. The tight right-handed wrapping of the DNA around this complex introduces an effective positive supercoil. The associated strain in the nearby DNA leads to denaturation in the A:T-rich DUE region.
The complex formed at the origin also includes several DNA-binding proteins: HU, IHF
27.
DnaG
–
DnaG is a bacterial DNA primase and is encoded by the dnaG gene. The enzyme DnaG, like any other DNA primase, synthesizes short strands of RNA known as oligonucleotides during DNA replication. These oligonucleotides are known as primers because they act as a starting point for DNA synthesis. DnaG catalyzes the synthesis of oligonucleotides that are 10 to 60 nucleotides long. These RNA oligonucleotides serve as primers, or starting points, for DNA synthesis by bacterial DNA polymerase III. DnaG is important in bacterial DNA replication because DNA polymerase cannot initiate the synthesis of a DNA strand. DnaG synthesizes a single RNA primer at the origin of replication. This primer serves to prime leading strand DNA synthesis. For the other parental strand, the lagging strand, DnaG synthesizes an RNA primer every few kilobases. These primers serve as substrates for the synthesis of Okazaki fragments. Primases tend to initiate synthesis at specific three-nucleotide sequences on single-stranded DNA templates; for E. coli DnaG the sequence is 5′-CTG-3′. DnaG contains three separate protein domains: a zinc-binding domain, an RNA polymerase domain, and a DnaB helicase binding domain. Several bacteria use DnaG as their DNA primase, including Escherichia coli, Bacillus stearothermophilus, and Mycobacterium tuberculosis. E. coli DnaG has a molecular weight of about 60 kilodaltons (60,000 daltons) and contains 581 amino acids. DnaG performs this catalysis near the replication fork that is formed by DnaB helicase during DNA replication. DnaG must be complexed with DnaB in order for it to catalyze the formation of the oligonucleotide primers. The mechanism for primer synthesis by primases involves two NTP binding sites on the primase protein. Prior to the binding of any NTPs to form the RNA primer, the ssDNA template binds to DnaG. The ssDNA contains a three-nucleotide recognition sequence that recruits NTPs based on Watson-Crick base pairing.
After binding DNA, DnaG must bind two NTPs in order to generate an enzyme-DNA-NTP-NTP quaternary complex. The Michaelis constants for the NTPs vary depending on the primase and templates. The two NTP binding sites on DnaG are referred to as the initiation site and the elongation site. The initiation site is the site at which the NTP to be incorporated at the 5′ end of the primer binds. The elongation site binds the NTP that is added to the 3′ end of the primer. The reaction between the two bound NTPs results in a dinucleotide and breaking of the bond between the α and β phosphorus, releasing pyrophosphate. This reaction is irreversible because the pyrophosphate that is formed is hydrolyzed into two inorganic phosphate molecules by the enzyme inorganic pyrophosphatase. In E. coli, primers begin with a triphosphate adenine-guanine dinucleotide at the 5′ end. The rate-limiting step of primer synthesis occurs after NTP binding but before or during dinucleotide synthesis. The DnaG primase is a 581-residue monomeric protein with three domains, according to proteolysis studies
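A quick arithmetic check relates the 581-residue length of E. coli DnaG to its mass. The following minimal Python sketch uses the common rule of thumb of roughly 110 daltons per amino acid residue; that average, the constant name and the function name are assumptions of this sketch rather than measured values.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an approximation, not a value taken from the text.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(residues: int, subunits: int = 1) -> float:
    """Estimate protein mass in kilodaltons from residue and subunit counts."""
    if residues <= 0 or subunits <= 0:
        raise ValueError("residues and subunits must be positive")
    return residues * subunits * AVERAGE_RESIDUE_MASS_DA / 1000.0
```

Under this approximation `estimate_mass_kda(581)` comes to about 64 kDa, which puts a monomeric 581-residue primase in the range of tens of kilodaltons. The same function handles oligomers through the `subunits` argument.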
28.
Origin recognition complex
–
In molecular biology, the origin recognition complex (ORC) is a multi-subunit DNA binding complex that binds in all eukaryotes in an ATP-dependent manner to origins of replication. The subunits of this complex are encoded by the ORC1, ORC2, ORC3, ORC4, ORC5 and ORC6 genes. ORC is a central component of eukaryotic DNA replication, and remains bound to chromatin at replication origins throughout the cell cycle. ORC directs DNA replication throughout the genome and is required for its initiation. ORC bound at replication origins serves as the foundation for assembly of the pre-replication complex (pre-RC), which includes Cdc6, Tah11, and the Mcm2-Mcm7 complex. Pre-RC assembly during G1 is required for licensing of chromosomes prior to DNA synthesis during S phase. Cell cycle-regulated phosphorylation of Orc2, Orc6, Cdc6, and MCM by the cyclin-dependent protein kinase Cdc28 regulates initiation of DNA replication. In yeast, ORC also plays a role in the establishment of silencing at the mating-type loci Hidden MAT Left (HML) and Hidden MAT Right (HMR). ORC participates in the assembly of transcriptionally silent chromatin at HML and HMR by recruiting the Sir1 silencing protein to the HML and HMR silencers. Both Orc1 and Orc5 bind ATP, though only Orc1 has ATPase activity. The binding of ATP by Orc1 is required for ORC binding to DNA and is essential for cell viability. The ATPase activity of Orc1 is involved in formation of the pre-RC. ATP binding by Orc5 is crucial for the stability of ORC as a whole. Only the Orc1-5 subunits are required for origin binding; Orc6 is essential for maintenance of pre-RCs once formed. Interactions within ORC suggest that Orc2-3-6 may form a core complex.
29.
ORC1
–
Origin recognition complex subunit 1 is a protein that in humans is encoded by the ORC1 gene. The origin recognition complex (ORC) is a highly conserved six-subunit protein complex essential for the initiation of DNA replication in eukaryotic cells. The protein encoded by this gene is the largest subunit of the origin recognition complex. This protein is found to be selectively phosphorylated during mitosis. It is also reported to interact with MYST histone acetyltransferase 2, a protein involved in control of transcription silencing. ORC1 has been shown to interact with
30.
ORC2
–
Origin recognition complex subunit 2 is a protein that in humans is encoded by the ORC2 gene. The origin recognition complex (ORC) is a highly conserved six-subunit protein complex essential for the initiation of DNA replication in eukaryotic cells. Studies in yeast demonstrated that ORC binds specifically to origins of replication and serves as a platform for the assembly of additional factors such as Cdc6. The protein encoded by this gene is a subunit of the ORC complex. This protein forms a core complex with ORC3, ORC4, and ORC5. It also interacts with CDC45L and MCM10, which are known to be important for the initiation of DNA replication. ORC2 has been shown to interact with
31.
ORC3
–
Origin recognition complex subunit 3 is a protein that in humans is encoded by the ORC3 gene. The origin recognition complex (ORC) is a highly conserved six-subunit protein complex essential for the initiation of DNA replication in eukaryotic cells. Studies in yeast demonstrated that ORC binds specifically to origins of replication and serves as a platform for the assembly of additional factors such as Cdc6. The protein encoded by this gene is a subunit of the ORC complex. Studies of a similar gene in Drosophila suggested a possible role of this protein in neuronal proliferation and olfactory memory. Alternatively spliced transcript variants encoding distinct isoforms have been reported for this gene. ORC3 has been shown to interact with
32.
ORC4
–
Origin recognition complex subunit 4 is a protein that in humans is encoded by the ORC4 gene. The origin recognition complex (ORC) is a highly conserved six-subunit protein complex essential for the initiation of DNA replication in eukaryotic cells. Studies in yeast demonstrated that ORC binds specifically to origins of replication and serves as a platform for the assembly of additional factors such as Cdc6. The protein encoded by this gene is a subunit of the ORC complex. It has been shown to form a core complex with ORC2L, -3L, and -5L. Three alternatively spliced variants encoding the same protein have been reported. ORC4 has been shown to interact with
33.
ORC5
–
Origin recognition complex subunit 5 is a protein that in humans is encoded by the ORC5 gene. The origin recognition complex (ORC) is a highly conserved six-subunit protein complex essential for the initiation of DNA replication in eukaryotic cells. Studies in yeast demonstrated that ORC binds specifically to origins of replication and serves as a platform for the assembly of additional factors such as Cdc6. The protein encoded by this gene is a subunit of the ORC complex. It has been shown to form a core complex with ORC2L, -3L, and -4L. Alternatively spliced transcript variants encoding distinct isoforms have been described. ORC5 has been shown to interact with
34.
ORC6
–
Origin recognition complex subunit 6 is a protein that in humans is encoded by the ORC6 gene. The origin recognition complex (ORC) is a highly conserved six-subunit protein complex essential for the initiation of DNA replication in eukaryotic cells. Studies in yeast demonstrated that ORC binds specifically to origins of replication and serves as a platform for the assembly of additional factors such as Cdc6. The protein encoded by this gene is a subunit of the ORC complex. It has been shown that this protein and ORC1 are loosely associated with the core complex consisting of ORC2, -3, -4 and -5. Gene silencing studies with small interfering RNA demonstrated that this protein plays an essential role in coordinating chromosome replication and segregation with cytokinesis. ORC6 has been shown to interact with MCM5, ORC2, Replication protein A1, ORC4, DBF4, ORC3, CDC45-related protein, MCM4 and Cell division cycle 7-related protein kinase.
35.
Cdt1
–
DNA replication factor Cdt1 is a protein that in humans is encoded by the CDT1 gene. The protein encoded by this gene is a key licensing factor that acts along with the protein Cdc6. CDT1 belongs to a family of replication proteins conserved from yeast to humans. Examples of orthologs in other species include cdt1 in S. pombe, double parked (Dup) in Drosophila melanogaster, and Cdt1 in Xenopus laevis. DNA replication factor CDT1 has been shown to interact with SKP2. Cdt1 is recruited by the origin recognition complex in origin licensing. Null mutations for Cdt1 are lethal in yeast: the spores undergo mitosis without DNA replication. The overexpression of Cdt1 causes rereplication in H. sapiens, which activates the Chk1 pathway, preventing entry into mitosis.