Protein Data Bank
The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy and submitted by biologists and biochemists from around the world, are accessible on the Internet via the websites of its member organisations. The PDB is overseen by an organization called the Worldwide Protein Data Bank (wwPDB). The PDB is a key resource in areas such as structural genomics. Most major scientific journals, and some funding agencies, now require scientists to submit their structure data to the PDB. Many other databases use protein structures deposited in the PDB. For example, SCOP and CATH classify protein structures, while PDBsum provides a graphic overview of PDB entries using information from other sources, such as Gene Ontology. Two forces converged to initiate the PDB; one of them was a small but growing collection of sets of protein structure data determined by X-ray diffraction.
In 1969, with the sponsorship of Walter Hamilton at the Brookhaven National Laboratory, Edgar Meyer began to write software to store atomic coordinate files in a common format to make them available for geometric and graphical evaluation. By 1971, one of Meyer's programs, SEARCH, enabled researchers to remotely access information from the database to study protein structures offline. SEARCH was instrumental in enabling networking, thus marking the functional beginning of the PDB. The Protein Data Bank was announced in October 1971 in Nature New Biology as a joint venture between the Cambridge Crystallographic Data Centre, UK, and Brookhaven National Laboratory, USA. Upon Hamilton's death in 1973, Tom Koetzle took over direction of the PDB for the subsequent 20 years. In January 1994, Joel Sussman of Israel's Weizmann Institute of Science was appointed head of the PDB. In October 1998, the PDB was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB); the new director was Helen M. Berman of Rutgers University.
In 2003, with the formation of the wwPDB, the PDB became an international organization. The founding members are PDBe, RCSB, and PDBj; the BMRB joined in 2006. Each of the four members of wwPDB can act as a deposition, data processing and distribution center for PDB data. Data processing here refers to the annotation of each submitted entry. The data are then automatically checked for plausibility. The PDB database and its holdings list are updated weekly. As of 17 October 2018, the breakdown of current holdings is as follows: 120,052 structures in the PDB have a structure factor file; 9,734 structures have an NMR restraint file; 3,486 structures have a chemical shifts file; and 2,531 structures have a 3DEM map file deposited in the EM Data Bank. These data show that most structures are determined by X-ray diffraction, but about 10% of structures are now determined by protein NMR. When using X-ray diffraction, approximations of the coordinates of the atoms of the protein are obtained, whereas estimations of the distances between pairs of atoms of the protein are found through NMR experiments.
Therefore, the final conformation of the protein is obtained, in the latter case, by solving a distance geometry problem. A few proteins are determined by cryo-electron microscopy. The significance of the structure factor files, mentioned above, is that, for PDB structures determined by X-ray diffraction that have a structure factor file, the electron density map may be viewed. The data for such structures are stored on the "electron density server". In the past, the number of structures in the PDB has grown at an approximately exponential rate, passing the 100 registered structures milestone in 1982, the 1,000 in 1993, the 10,000 in 1999, and the 100,000 in 2014. However, since 2007, the rate of accumulation of new protein structures appears to have plateaued. The file format initially used by the PDB was called the PDB file format. This original format was restricted by the width of computer punch cards to 80 characters per line. Around 1996, the "macromolecular Crystallographic Information File" format, mmCIF, which is an extension of the CIF format, started to be phased in.
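The milestone years quoted above can be turned into rough doubling times. The following is a minimal sketch, assuming steady exponential growth between each pair of consecutive milestones; the function names are illustrative and the figures are only those given in the text.

```python
import math

# Milestones quoted in the text: (year, number of registered structures).
MILESTONES = [(1982, 100), (1993, 1_000), (1999, 10_000), (2014, 100_000)]

def doubling_time(year_a, count_a, year_b, count_b):
    """Years per doubling, assuming steady exponential growth between two points."""
    growth_rate = math.log(count_b / count_a) / (year_b - year_a)
    return math.log(2) / growth_rate

def doubling_times(milestones):
    """Return (start_year, end_year, doubling_time) for consecutive milestones."""
    return [
        (a[0], b[0], doubling_time(*a, *b))
        for a, b in zip(milestones, milestones[1:])
    ]

for start, end, years in doubling_times(MILESTONES):
    print(f"{start}-{end}: doubling roughly every {years:.1f} years")
```

Under this assumption, the interval from 1999 to 2014 shows a longer doubling time than the one before it, which is in line with the slowdown the text describes.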
mmCIF is now the master format for the PDB archive. An XML version of this format, called PDBML, was described in 2005. The structure files can be downloaded in any of these three formats. In fact, individual files are easily downloaded into graphics packages using web addresses. For PDB format files, use, e.g., http://www.pdb.org/pdb/files/4hhb.pdb.gz or http://pdbe.org/download/4hhb ; for PDBML files, use, e.g., http://www.pdb.org/pdb/files/4hhb.xml.gz or http://pdbe.org/pdbml/4hhb (here "4hhb" is the PDB identifier). Each structure published in the PDB receives a four-character alphanumeric identifier, its PDB ID. The structure files may be viewed using one of several free and open source computer programs, including Jmol, PyMOL, VMD, and RasMol. Other non-free, shareware programs
In biology, a gene is a sequence of nucleotides in DNA or RNA that codes for a molecule that has a function. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function. The transmission of genes to an organism's offspring is the basis of the inheritance of phenotypic traits. These genes make up different DNA sequences called genotypes. Genotypes, along with environmental and developmental factors, determine what the phenotypes will be. Most biological traits are under the influence of polygenes as well as gene–environment interactions. Some genetic traits are visible, such as eye color or number of limbs, and some are not, such as blood type, risk for specific diseases, or the thousands of basic biochemical processes that constitute life. Genes can acquire mutations in their sequence, leading to different variants, known as alleles, in the population. These alleles encode slightly different versions of a protein, which cause different phenotypical traits.
Usage of the term "having a gene" typically refers to containing a different allele of the same, shared gene. Genes evolve due to natural selection (survival of the fittest) and genetic drift of the alleles. The concept of a gene continues to be refined. For example, regulatory regions of a gene can be far removed from its coding regions, and coding regions can be split into several exons. Some viruses store their genome in RNA instead of DNA, and some gene products are functional non-coding RNAs. Therefore, a broad, modern working definition of a gene is any discrete locus of heritable, genomic sequence which affects an organism's traits by being expressed as a functional product or by regulation of gene expression. The term gene was introduced by Danish botanist, plant physiologist and geneticist Wilhelm Johannsen in 1909. It is inspired by the ancient Greek γόνος (gonos), which means offspring and procreation. The existence of discrete inheritable units was first suggested by Gregor Mendel. From 1857 to 1864, in Brno, he studied inheritance patterns in 8000 common edible pea plants, tracking distinct traits from parent to offspring.
He described these mathematically as 2^n combinations, where n is the number of differing characteristics in the original peas. Although he did not use the term gene, he explained his results in terms of discrete inherited units that give rise to observable physical characteristics. This description prefigured Wilhelm Johannsen's distinction between genotype and phenotype. Mendel was also the first to demonstrate independent assortment, the distinction between dominant and recessive traits, the distinction between a heterozygote and a homozygote, and the phenomenon of discontinuous inheritance. Prior to Mendel's work, the dominant theory of heredity was one of blending inheritance, which suggested that each parent contributed fluids to the fertilisation process and that the traits of the parents blended and mixed to produce the offspring. Charles Darwin developed a theory of inheritance he termed pangenesis, from Greek pan and genesis / genos. Darwin used the term gemmule to describe hypothetical particles. Mendel's work went largely unnoticed after its first publication in 1866, but was rediscovered in the late 19th century by Hugo de Vries, Carl Correns, and Erich von Tschermak, who reached similar conclusions in their own research.
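Mendel's 2^n count, mentioned above, can be checked by direct enumeration. The sketch below assumes that each characteristic comes in exactly two forms; the three pea traits listed are illustrative inputs chosen for the example, not data taken from this text.

```python
from itertools import product

# Illustrative two-form traits (assumed inputs for this example).
PEA_TRAITS = {
    "seed shape": ("round", "wrinkled"),
    "seed colour": ("yellow", "green"),
    "flower colour": ("purple", "white"),
}

def trait_combinations(traits):
    """Enumerate every combination of forms across the given traits."""
    names = list(traits)
    forms = [traits[name] for name in names]
    return [dict(zip(names, choice)) for choice in product(*forms)]

combos = trait_combinations(PEA_TRAITS)
# With n two-form traits, the enumeration yields 2**n combinations.
print(len(combos), 2 ** len(PEA_TRAITS))
```

Adding a fourth two-form trait to the dictionary doubles the number of combinations again, which is what the formula predicts.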
In 1889, Hugo de Vries published his book Intracellular Pangenesis, in which he postulated that different characters have individual hereditary carriers and that inheritance of specific traits in organisms comes in particles. De Vries called these units "pangenes", after Darwin's 1868 pangenesis theory. Twenty years later, in 1909, Wilhelm Johannsen introduced the term 'gene', a few years after William Bateson had introduced 'genetics', while Eduard Strasburger, amongst others, still used the term 'pangene' for the fundamental physical and functional unit of heredity. Advances in understanding genes and inheritance continued throughout the 20th century. Deoxyribonucleic acid (DNA) was shown to be the molecular repository of genetic information by experiments in the 1940s to 1950s. The structure of DNA was studied by Rosalind Franklin and Maurice Wilkins using X-ray crystallography, which led James D. Watson and Francis Crick to publish a model of the double-stranded DNA molecule whose paired nucleotide bases indicated a compelling hypothesis for the mechanism of genetic replication.
In the early 1950s the prevailing view was that the genes in a chromosome acted like discrete entities, indivisible by recombination and arranged like beads on a string. The experiments of Benzer using mutants defective in the rII region of bacteriophage T4 showed that individual genes have a simple linear structure and are likely to be equivalent to a linear section of DNA. Collectively, this body of research established the central dogma of molecular biology, which states that proteins are translated from RNA, which is transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses. The modern study of genetics at the level of DNA is known as molecular genetics. In 1972, Walter Fiers and his team were the first to determine the sequence of a gene: that of the bacteriophage MS2 coat protein. The subsequent development of chain-termination DNA sequencing in 1977 by Frederick Sanger improved the efficiency of sequencing and turned it into a routine laboratory tool.
An automated version of the Sanger method was used in early phases of the
A base pair is a unit consisting of two nucleobases bound to each other by hydrogen bonds. Base pairs form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, Watson–Crick base pairs allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence. The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes. Intramolecular base pairs can occur within single-stranded nucleic acids.
This is particularly important in RNA molecules, where Watson–Crick base pairs permit the formation of short double-stranded helices, and a wide variety of non-Watson–Crick interactions allow RNAs to fold into a vast range of specific three-dimensional structures. In addition, base-pairing between transfer RNA and messenger RNA forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code. The size of an individual gene or an organism's entire genome is often measured in base pairs because DNA is usually double-stranded. Hence, the number of total base pairs is equal to the number of nucleotides in one of the strands. The haploid human genome is estimated to be about 3.2 billion bases long and to contain 20,000–25,000 distinct protein-coding genes. A kilobase is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA. The total amount of related DNA base pairs on Earth is estimated at 5.0×10^37, weighing 50 billion tonnes.
In comparison, the total mass of the biosphere has been estimated to be as much as 4 TtC (trillion tons of carbon). Hydrogen bonding is the chemical interaction that underlies base pairing. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content. But, contrary to popular belief, the hydrogen bonds do not stabilize the DNA significantly; stabilization is mainly due to stacking interactions. The larger nucleobases, adenine and guanine, are members of a class of double-ringed chemical structures called purines. Purines are complementary only with pyrimidines: pyrimidine-pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established. Purine-pyrimidine base-pairing of AT or GC or UA results in proper duplex structure. The only other purine-pyrimidine pairings would be AC and GT and UG. The GU pairing, with two hydrogen bonds, does occur fairly often in RNA. Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a melting point that is determined by the length of the molecules, the extent of mispairing, and the GC content.
Higher GC content results in higher melting temperatures. Conversely, regions of a genome that need to separate frequently (for example, the promoter regions for often-transcribed genes) are comparatively GC-poor. GC content and melting temperature must also be taken into account when designing primers for PCR reactions. The following DNA sequences illustrate double-stranded pairing patterns. By convention, the top strand is written from the 5' end to the 3' end.
A base-paired DNA sequence:
ATCGATTGAGCTCTAGCG
TAGCTAACTCGAGATCGC
The corresponding RNA sequence, in which uracil is substituted for thymine in the RNA strand:
AUCGAUUGAGCUCUAGCG
UAGCUAACUCGAGAUCGC
Chemical analogs of nucleotides can take the place of proper nucleotides and establish non-canonical base-pairing, leading to errors in DNA replication and DNA transcription. This is due to their isosteric chemistry. One common mutagenic base analog is 5-bromouracil, which resembles thymine but can base-pair to guanine in its enol form. Other chemicals, known as DNA intercalators, fit into the gap between adjacent bases on a single strand and induce frameshift mutations by "masquerading" as a base, causing the DNA replication machinery to skip or insert additional nucleotides at the intercalated site.
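The pairing rules and the notion of GC content described above can be sketched in a few lines. This is a minimal illustration, assuming strands contain only the canonical bases; the function names are hypothetical, and the paired strand is returned aligned base by base with the input (that is, read 3' to 5'), matching the layout of the example sequences.

```python
# Canonical Watson-Crick partners for DNA and RNA.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

def paired_strand(strand, pairs=DNA_PAIRS):
    """Return the partner strand, aligned position by position with the input."""
    return "".join(pairs[base] for base in strand)

def gc_content(strand):
    """Fraction of bases in the strand that are G or C."""
    return sum(base in "GC" for base in strand) / len(strand)

top_dna = "ATCGATTGAGCTCTAGCG"
top_rna = "AUCGAUUGAGCUCUAGCG"
print(paired_strand(top_dna))             # bottom DNA strand from the example
print(paired_strand(top_rna, RNA_PAIRS))  # bottom RNA strand from the example
print(gc_content(top_dna))                # GC fraction of the top strand
```

A lookup table keeps the rule explicit and raises a KeyError on any non-canonical base, which is a reasonable default for a sketch that does not model analogs or mismatches.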
Most intercalators are known or suspected carcinogens. Examples include ethidium bromide and acridine. An unnatural base pair is a designed subunit of DNA which is created in a laboratory and does not occur in nature. DNA sequences have been described which use newly created nucleobases to form a third base pair, in addition to the two ba
Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product. These products are often proteins, but in non-protein-coding genes such as transfer RNA or small nuclear RNA genes, the product is a functional RNA. The process of gene expression is used by all known life (eukaryotes and prokaryotes), and utilized by viruses, to generate the macromolecular machinery for life. Several steps in the gene expression process may be modulated, including transcription, RNA splicing, translation, and post-translational modification of a protein. Gene regulation gives the cell control over structure and function, and is the basis for cellular differentiation and the versatility and adaptability of any organism. Gene regulation may also serve as a substrate for evolutionary change, since control of the timing and amount of gene expression can have a profound effect on the functions of the gene in a cell or in a multicellular organism. In genetics, gene expression is the most fundamental level at which the genotype gives rise to the phenotype, i.e. the observable trait.
The genetic code stored in DNA is "interpreted" by gene expression, and the properties of the expression give rise to the organism's phenotype. Such phenotypes are often expressed by the synthesis of proteins that control the organism's shape, or that act as enzymes catalysing specific metabolic pathways characterising the organism. Regulation of gene expression is thus critical to an organism's development. A gene is a stretch of DNA. Genomic DNA consists of two antiparallel and reverse complementary strands, each having 5' and 3' ends. With respect to a gene, the two strands may be labeled the "template strand," which serves as a blueprint for the production of an RNA transcript, and the "coding strand," which includes the DNA version of the transcript sequence. The production of the RNA copy of the DNA is called transcription, and is performed in the nucleus by RNA polymerase, which adds one RNA nucleotide at a time to a growing RNA strand as per the complementarity law of the bases. This RNA is complementary to the template 3' → 5' DNA strand, which is itself complementary to the coding 5' → 3' DNA strand.
Therefore, the resulting 5' → 3' RNA strand is identical to the coding DNA strand with the exception that thymines are replaced with uracils in the RNA. A coding DNA strand reading "ATG" is indirectly transcribed through the "TAC" in the non-coding template strand as "AUG" in the mRNA. In prokaryotes, transcription is carried out by a single type of RNA polymerase, which needs a DNA sequence called a Pribnow box as well as a sigma factor to start transcription. In eukaryotes, transcription is performed by three types of RNA polymerases, each of which needs a special DNA sequence called the promoter and a set of DNA-binding proteins (transcription factors) to initiate the process. RNA polymerase I transcribes most ribosomal RNA (rRNA) genes. RNA polymerase II transcribes all protein-coding genes but also some non-coding RNAs. Pol II includes a C-terminal domain (CTD) that is rich in serine residues; when these residues are phosphorylated, the CTD binds to various protein factors that promote transcript maturation and modification. RNA polymerase III transcribes 5S rRNA, transfer RNA genes, and some small non-coding RNAs.
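The relationship between the coding strand, the template strand and the mRNA described above can be sketched as two lookups. This is a minimal illustration with hypothetical function names; strands are handled as position-aligned strings, so the template is effectively written 3' to 5' beneath a coding strand written 5' to 3'.

```python
# DNA base -> complementary DNA base (coding strand to template strand).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Template DNA base -> RNA base added by the polymerase.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def template_strand(coding):
    """Template strand aligned base by base with the coding strand."""
    return "".join(DNA_COMPLEMENT[base] for base in coding)

def transcribe(template):
    """mRNA built by pairing each template base with its RNA partner."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template)

coding = "ATG"
template = template_strand(coding)
mrna = transcribe(template)
print(coding, template, mrna)
# The mRNA matches the coding strand with T replaced by U.
print(mrna == coding.replace("T", "U"))
```

Going through the template explicitly, rather than replacing T with U directly, mirrors the indirect route the text describes while giving the same result.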
Transcription ends when the polymerase encounters a sequence called the terminator. While transcription of prokaryotic protein-coding genes creates messenger RNA (mRNA) that is ready for translation into protein, transcription of eukaryotic genes leaves a primary transcript of RNA (pre-mRNA), which first has to undergo a series of modifications to become a mature mRNA. These include 5' capping, a set of enzymatic reactions that add 7-methylguanosine (m7G) to the 5' end of pre-mRNA and thus protect the RNA from degradation by exonucleases. The m7G cap is then bound by the cap binding complex heterodimer, which aids in mRNA export to the cytoplasm and also protects the RNA from decapping. Another modification is 3' cleavage and polyadenylation, which occur if a polyadenylation signal sequence is present in the pre-mRNA, between the protein-coding sequence and the terminator. The pre-mRNA is first cleaved and then a series of ~200 adenines are added to form a poly(A) tail, which protects the RNA from degradation. The poly(A) tail is bound by multiple poly(A)-binding proteins necessary for mRNA export and translation re-initiation. A very important modification of eukaryotic pre-mRNA is RNA splicing.
The majority of eukaryotic pre-mRNAs consist of alternating segments called exons and introns. During the process of splicing, an RNA-protein catalytic complex known as the spliceosome catalyzes two transesterification reactions, which remove an intron, release it in the form of a lariat structure, and then splice neighbouring exons together. In certain cases, some introns or exons can be either removed or retained in mature mRNA. This so-called alternative splicing creates a series of different transcripts originating from a single gene. Because these transcripts can potentially be translated into different proteins, splicing extends the complexity of eukaryotic gene expression. Extensive RNA processing may be an evolutionary advantage made possible by the nucleus of eukaryotes. In prokaryotes, transcription and translation happen together, whilst in eukaryotes, the nuclear membrane separates the two processes, giving time for RNA processing to
A dimer is an oligomer consisting of two monomers joined by bonds that can be either strong or weak, covalent or intermolecular. The term homodimer is used when the two molecules are identical, and heterodimer when they are not. The reverse of dimerisation is often called dissociation. When two oppositely charged ions associate into dimers, they are referred to as Bjerrum pairs. Carboxylic acids form dimers by hydrogen bonding of the acidic hydrogen and the carbonyl oxygen when anhydrous. For example, acetic acid forms a dimer in the gas phase, where the monomer units are held together by hydrogen bonds. Under special conditions, most OH-containing molecules form dimers. Borane occurs as the dimer diborane, due to the high Lewis acidity of the boron center. Excimers and exciplexes are excited structures with a short lifetime. For example, noble gases do not form stable dimers, but do form the excimers Ar2*, Kr2* and Xe2* under high pressure and electrical stimulation. Molecular dimers are often formed by the reaction of two identical compounds, e.g.: 2A → A-A.
In this example, monomer "A" is said to dimerise to give the dimer "A-A". An example is a diaminocarbene, which dimerises to give a tetraaminoethylene: 2 C(NR2)2 → (R2N)2C=C(NR2)2. Carbenes are highly reactive and readily form bonds. Dicyclopentadiene is an asymmetrical dimer of two cyclopentadiene molecules that have reacted in a Diels-Alder reaction to give the product. Upon heating, it "cracks" to give identical monomers: C10H12 → 2 C5H6. Many nonmetallic elements occur as dimers: hydrogen, nitrogen, oxygen, and the halogens, i.e. fluorine, chlorine, bromine and iodine. Noble gases can form dimers linked by van der Waals bonds, for example dihelium or diargon. Mercury occurs as the mercury(I) cation, formally a dimeric ion. Other metals may form a proportion of dimers in their vapour. Known metallic dimers include Li2, Na2, K2, Rb2 and Cs2. Many small organic molecules, most notably formaldehyde, easily form dimers; the dimer of formaldehyde is dioxetane. In the context of polymers, "dimer" also refers to the degree of polymerization 2, regardless of the stoichiometry or condensation reactions.
This is applicable to disaccharides. For example, cellobiose is a dimer of glucose, even though the formation reaction produces water: 2 C6H12O6 → C12H22O11 + H2O. Here, the dimer has a stoichiometry different from the pair of monomers. Amino acids can also form dimers, which are called dipeptides. An example is glycylglycine. Another example is carnosine. Pyrimidine dimers are formed by a photochemical reaction from pyrimidine DNA bases. This cross-linking causes DNA mutations, which can lead to skin cancers.
Apoptosis is a form of programmed cell death that occurs in multicellular organisms. Biochemical events lead to characteristic cell changes and death. These changes include blebbing, cell shrinkage, nuclear fragmentation, chromatin condensation, chromosomal DNA fragmentation, and global mRNA decay. The average adult human loses between 50 and 70 billion cells each day due to apoptosis. For an average human child between the ages of 8 and 14, approximately 20 to 30 billion cells die per day. In contrast to necrosis, which is a form of traumatic cell death that results from acute cellular injury, apoptosis is a highly regulated and controlled process that confers advantages during an organism's lifecycle. For example, the separation of fingers and toes in a developing human embryo occurs because cells between the digits undergo apoptosis. Unlike necrosis, apoptosis produces cell fragments called apoptotic bodies that phagocytic cells are able to engulf and remove before the contents of the cell can spill out onto surrounding cells and cause damage to them. Because apoptosis cannot stop once it has begun, it is a highly regulated process.
Apoptosis can be initiated through one of two pathways. In the intrinsic pathway the cell kills itself because it senses cell stress, while in the extrinsic pathway the cell kills itself because of signals from other cells. Weak external signals may also activate the intrinsic pathway of apoptosis. Both pathways induce cell death by activating caspases, which are proteases, or enzymes that degrade proteins. The two pathways both activate initiator caspases, which then activate executioner caspases, which then kill the cell by degrading proteins indiscriminately. Research on apoptosis has increased substantially since the early 1990s. In addition to its importance as a biological phenomenon, defective apoptotic processes have been implicated in a wide variety of diseases. Excessive apoptosis causes atrophy, whereas an insufficient amount results in uncontrolled cell proliferation, such as cancer. Some factors like Fas receptors and caspases promote apoptosis, while some members of the Bcl-2 family of proteins inhibit apoptosis.
German scientist Karl Vogt was first to describe the principle of apoptosis in 1842. In 1885, anatomist Walther Flemming delivered a more precise description of the process of programmed cell death. However, it was not until 1965 that the topic was resurrected. While studying tissues using electron microscopy, John Foxton Ross Kerr at the University of Queensland was able to distinguish apoptosis from traumatic cell death. Following the publication of a paper describing the phenomenon, Kerr was invited to join Alastair R. Currie, as well as Andrew Wyllie, who was Currie's graduate student, at the University of Aberdeen. In 1972, the trio published a seminal article in the British Journal of Cancer. Kerr had initially used the term programmed cell necrosis, but in the article, the process of natural cell death was called apoptosis. Kerr, Wyllie and Currie credited James Cormack, a professor of Greek language at the University of Aberdeen, with suggesting the term apoptosis. Kerr received the Paul Ehrlich and Ludwig Darmstaedter Prize on March 14, 2000, for his description of apoptosis.
He shared the prize with Boston biologist H. Robert Horvitz. For many years, neither "apoptosis" nor "programmed cell death" was a highly cited term. Two discoveries brought cell death from obscurity to a major field of research: the identification of components of the cell death control and effector mechanisms, and the linkage of abnormalities in cell death to human disease, in particular cancer. The 2002 Nobel Prize in Medicine was awarded to Sydney Brenner, H. Robert Horvitz and John E. Sulston for their work identifying genes that control apoptosis. The genes were identified by studies in the nematode C. elegans, and homologues of these genes function in humans to regulate apoptosis. In Greek, apoptosis translates to the "falling off" of leaves from a tree. Cormack, professor of Greek language, reintroduced the term for medical use, as it had a medical meaning for the Greeks over two thousand years before. Hippocrates used the term to mean "the falling off of the bones". Galen extended its meaning to "the dropping of the scabs".
Cormack was no doubt aware of this usage. Debate continues over the correct pronunciation, with opinion divided between a pronunciation with the second p silent and one with the second p pronounced, as in the original Greek. In English, the p of the Greek -pt- consonant cluster is typically silent at the beginning of a word, but articulated when used in combining forms preceded by a vowel, as in helicopter or the orders of insects: Diptera, etc. In the original Kerr, Wyllie & Currie paper, there is a footnote regarding the pronunciation: "We are most grateful to Professor James Cormack of the Department of Greek, University of Aberdeen, for suggesting this term. The word "apoptosis" is used in Greek to describe the "dropping off" or "falling off" of petals from flowers, or leaves from trees. To show the derivation clearly, we propose that the stress should be on the penultimate syllable, the second half of the word being pronounced like "ptosis", which comes from the same root "to fall", and is already used to describe the drooping of the upper eyelid."
The initiation of apoptosis is tightly regulated by activation mechanisms, because once apoptosis has begun, it inevitably leads to the death of the cell. The two best-understood activation mechanisms are the intrinsic pathway and the extrinsic pathway. The intrinsic pathway is activated by intracellular signals generated when cells are stressed and depends on the release of proteins from th
A chromosome is a deoxyribonucleic acid (DNA) molecule with part or all of the genetic material of an organism. Most eukaryotic chromosomes include packaging proteins which, aided by chaperone proteins, bind to and condense the DNA molecule to prevent it from becoming an unmanageable tangle. Chromosomes are normally visible under a light microscope only when the cell is undergoing the metaphase of cell division. Before this happens, every chromosome is copied once, and the copy is joined to the original by a centromere, resulting either in an X-shaped structure if the centromere is located in the middle of the chromosome or a two-arm structure if the centromere is located near one of the ends. The original chromosome and the copy are now called sister chromatids. During metaphase the X-shaped structure is called a metaphase chromosome. In this highly condensed form chromosomes are easiest to distinguish and study. In animal cells, chromosomes reach their highest compaction level in anaphase during chromosome segregation.
Chromosomal recombination during meiosis and subsequent sexual reproduction play a significant role in genetic diversity. If these structures are manipulated incorrectly, through processes known as chromosomal instability and translocation, the cell may undergo mitotic catastrophe. Usually, this will make the cell initiate apoptosis leading to its own death, but sometimes mutations in the cell hamper this process and thus cause progression of cancer. Some use the term chromosome in a wider sense, to refer to the individualized portions of chromatin in cells, either visible or not under light microscopy. Others use the concept in a narrower sense, to refer to the individualized portions of chromatin during cell division, visible under light microscopy due to high condensation. The word chromosome comes from the Greek χρῶμα (chroma, "colour") and σῶμα (soma, "body"), describing their strong staining by particular dyes. The term was coined by von Waldeyer-Hartz, referring to the term chromatin, which was introduced by Walther Flemming. Some of the early karyological terms have become outdated.
For example, Chromatin and Chromosom both ascribe color to a non-colored state. The German scientists Schleiden, Virchow and Bütschli were among the first scientists who recognized the structures now familiar as chromosomes. In a series of experiments beginning in the mid-1880s, Theodor Boveri gave the definitive demonstration that chromosomes are the vectors of heredity, establishing two principles, the continuity and the individuality of chromosomes; it is the second of these principles that was so original. Wilhelm Roux had suggested that each chromosome carries a different genetic load, and Boveri was able to confirm this hypothesis. Aided by the rediscovery at the start of the 1900s of Gregor Mendel's earlier work, Boveri was able to point out the connection between the rules of inheritance and the behaviour of the chromosomes. Boveri influenced two generations of American cytologists, among them Edmund Beecher Wilson, Nettie Stevens, Walter Sutton and Theophilus Painter. In his famous textbook The Cell in Development and Heredity, Wilson linked together the independent work of Boveri and Sutton by naming the chromosome theory of inheritance the Boveri–Sutton chromosome theory.
Ernst Mayr remarks that the theory was hotly contested by some famous geneticists: William Bateson, Wilhelm Johannsen, Richard Goldschmidt and T. H. Morgan, all of a rather dogmatic turn of mind. Eventually, complete proof came from chromosome maps in Morgan's own lab. The number of human chromosomes was published in 1923 by Theophilus Painter. By inspection through the microscope, he counted 24 pairs, which would mean 48 chromosomes. His error was copied by others, and it was not until 1956 that the true number, 46, was determined by Indonesia-born cytogeneticist Joe Hin Tjio. The prokaryotes (bacteria and archaea) typically have a single circular chromosome, but many variations exist. The chromosomes of most bacteria, which some authors prefer to call genophores, can range in size from only 130,000 base pairs in the endosymbiotic bacteria Candidatus Hodgkinia cicadicola and Candidatus Tremblaya princeps, to more than 14,000,000 base pairs in the soil-dwelling bacterium Sorangium cellulosum. Spirochaetes of the genus Borrelia are a notable exception to this arrangement, with bacteria such as Borrelia burgdorferi, the cause of Lyme disease, containing a single linear chromosome.
Prokaryotic chromosomes have less sequence-based structure than eukaryotic ones. Bacteria typically have a single point (the origin of replication) from which replication starts, whereas some archaea contain multiple replication origins. The genes in prokaryotes are often organized in operons and do not usually contain introns, unlike those of eukaryotes. Prokaryotes do not possess nuclei. Instead, their DNA is organized into a structure called the nucleoid. The nucleoid occupies a defined region of the bacterial cell. This structure is, however, dynamic, and is maintained and remodeled by the actions of a range of histone-like proteins, which associate with the bacterial chromosome. In archaea, the DNA in chromosomes is even more organized, with the DNA packaged within structures similar to eukaryotic nucleosomes. Certain bacteria also contain plasmids or other extrachromosomal DNA. These are circular structures in the cytoplasm that contain cellular DNA and play a role in horizontal gene transfer. In prokaryotes and viruses, the DNA is often densely packed and organized.