Transduction is the process by which foreign DNA is introduced into a cell by a virus or viral vector. An example is the viral transfer of DNA from one bacterium to another, which is a form of horizontal gene transfer. Transduction does not require physical contact between the cell donating the DNA and the cell receiving it, and it is DNase resistant. Transduction is a common tool used by molecular biologists to stably introduce a foreign gene into a host cell's genome. When viruses, including bacteriophages, infect bacterial cells, their normal mode of reproduction is to harness the replication, transcription, and translation machinery of the host bacterial cell to make numerous virions, or complete viral particles, including the viral DNA or RNA and the protein coat. Transduction was discovered by Norton Zinder and Joshua Lederberg at the University of Wisconsin–Madison in 1952 in Salmonella. Transduction happens through either the lytic cycle or the lysogenic cycle. If the lysogenic cycle is adopted, the phage chromosome is integrated into the bacterial chromosome, where it can stay dormant for thousands of generations.
If the lysogen is induced, the phage genome is excised from the bacterial chromosome and initiates the lytic cycle, which culminates in lysis of the cell and the release of new phage particles. The packaging of bacteriophage DNA has low fidelity, and small pieces of bacterial DNA, together with the bacteriophage genome, may become packaged into the phage particle. At the same time, some phage genes are left behind in the bacterial chromosome. There are three types of recombination events that can lead to this incorporation of bacterial DNA into the viral DNA, leading to two modes of genetic recombination. Generalized transduction is the process by which any bacterial DNA may be transferred to another bacterium via a bacteriophage. It is a rare event. In essence, this is the packaging of bacterial DNA into a viral envelope. This may occur in two main ways: recombination and headful packaging. If bacteriophages undertake the lytic cycle of infection upon entering a bacterium, the virus will take control of the cell's machinery for use in replicating its own viral DNA.
If by chance bacterial chromosomal DNA is inserted into the viral capsid that is normally used to encapsulate the viral DNA, the mistake will lead to generalized transduction. If the virus replicates using "headful packaging", it attempts to fill the nucleocapsid with genetic material. If the viral genome leaves spare capacity, viral packaging mechanisms may incorporate bacterial genetic material into the new virion. The new virus capsid, now loaded partly with bacterial DNA, goes on to infect another bacterial cell. This bacterial material may become recombined into another bacterium upon infection. When the new DNA is inserted into this recipient cell, it can meet one of three fates. The DNA may be absorbed by the cell and recycled for spare parts. If the DNA was originally a plasmid, it may re-circularize inside the new cell and become a plasmid again. If the new DNA matches a homologous region of the recipient cell's chromosome, it may exchange DNA material with it, much as in bacterial recombination.
Specialized transduction is the process by which a restricted set of bacterial genes is transferred to another bacterium. The genes that get transferred depend on where the prophage is integrated in the chromosome. Specialized transduction occurs when the prophage excises imprecisely from the chromosome, so that bacterial genes lying adjacent to the prophage are included in the excised DNA. The excised DNA is packaged into a new virus particle, which delivers the DNA to a new bacterium, where the donor genes can be inserted into the recipient chromosome or remain in the cytoplasm, depending on the nature of the bacteriophage. When the encapsulated phage material infects another cell and becomes a "prophage", the coded prophage DNA is called a "heterogenote". An example of specialized transduction is λ phage in Escherichia coli. Transduction with viral vectors can be used to modify genes in mammalian cells. It is used as a tool in basic research and is being investigated as a potential means for gene therapy. In these cases, a plasmid is constructed in which the genes to be transferred are flanked by viral sequences that are used by viral proteins to recognize and package the viral genome into viral particles.
This plasmid is inserted into a producer cell together with other plasmids that carry the viral genes required for formation of infectious virions. In these producer cells, the viral proteins expressed by these packaging constructs bind the sequences on the DNA/RNA to be transferred and insert it into viral particles. For safety, none of the plasmids used contains all the sequences required for virus formation, so that simultaneous transfection of multiple plasmids is required to get infectious virions. Moreover, only the plasmid carrying the sequences to be transferred contains signals that allow the genetic materials to be packaged in virions, so that none of the genes encoding viral proteins are packaged. Viruses collected from these cells are applied to the cells to be altered; the initial stages of these infections mimic infection with natural viru
A base pair is a unit consisting of two nucleobases bound to each other by hydrogen bonds. Base pairs form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, Watson–Crick base pairs allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence. The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes. Intramolecular base pairs can occur within single-stranded nucleic acids.
This is particularly important in RNA molecules, where Watson–Crick base pairs permit the formation of short double-stranded helices and a wide variety of non-Watson–Crick interactions allow RNAs to fold into a vast range of specific three-dimensional structures. In addition, base-pairing between transfer RNA and messenger RNA forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code. The size of an individual gene or an organism's entire genome is measured in base pairs because DNA is double-stranded. Hence, the total number of base pairs is equal to the number of nucleotides in one of the strands. The haploid human genome is estimated to be about 3.2 billion bases long and to contain 20,000–25,000 distinct protein-coding genes. A kilobase (kb) is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA. The total amount of related DNA base pairs on Earth is estimated at 5.0×10^37 and weighs 50 billion tonnes.
In comparison, the total mass of the biosphere has been estimated to be as much as 4 TtC (trillion tonnes of carbon). Hydrogen bonding is the chemical interaction that underlies base pairing. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content. Contrary to popular belief, however, the hydrogen bonds do not stabilize the DNA significantly; stabilization is mainly due to stacking interactions. The larger nucleobases, adenine and guanine, are members of a class of double-ringed chemical structures called purines. Purines are complementary only with pyrimidines: pyrimidine-pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established. Purine-pyrimidine base-pairing of AT, GC, or UA results in proper duplex structure. The only other purine-pyrimidine pairings would be AC, GT, and UG. The GU pairing, with two hydrogen bonds, does occur often in RNA. Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a melting point that is determined by the length of the molecules, the extent of mispairing, and the GC content.
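The dependence of melting point on length and GC content can be illustrated with a small sketch. The function names below are hypothetical, and the melting-temperature estimate uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not mentioned in the text above and is only a rough approximation for short oligonucleotides; it ignores mispairing and salt conditions.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees Celsius (Wallace rule).

    Assumption: a short DNA oligonucleotide; each A/T contributes 2 and
    each G/C contributes 4. This is a rule of thumb, not a measured value.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Two sequences of equal length but different GC content.
    for oligo in ("ATATATATATATATAT", "GCGCGCGCGCGCGCGC"):
        print(oligo, gc_content(oligo), wallace_tm(oligo))
```

For two sequences of the same length, the GC-rich one receives the higher estimate, and for a fixed composition the estimate grows with length.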
Higher GC content results in higher melting temperatures. Conversely, regions of a genome that need to separate frequently (for example, the promoter regions of often-transcribed genes) are comparatively GC-poor. GC content and melting temperature must be taken into account when designing primers for PCR reactions. The following sequences illustrate paired double-stranded patterns. By convention, the top strand is written from the 5' end to the 3' end.

A base-paired DNA sequence:

ATCGATTGAGCTCTAGCG
TAGCTAACTCGAGATCGC

The corresponding RNA sequence, in which uracil is substituted for thymine in the RNA strand:

AUCGAUUGAGCUCUAGCG
UAGCUAACUCGAGAUCGC

Chemical analogs of nucleotides can take the place of proper nucleotides and establish non-canonical base-pairing, leading to errors in DNA replication and DNA transcription. This is due to their isosteric chemistry. One common mutagenic base analog is 5-bromouracil, which resembles thymine but can base-pair to guanine in its enol form. Other chemicals, known as DNA intercalators, fit into the gap between adjacent bases on a single strand and induce frameshift mutations by "masquerading" as a base, causing the DNA replication machinery to skip or insert additional nucleotides at the intercalated site.
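The paired example sequences above can be reproduced with a short sketch that applies the Watson–Crick pairing rules base by base. The names DNA_PAIRS, RNA_PAIRS, and complement are illustrative choices, not part of any library. The result is written aligned under the input strand, as in the examples, so it reads 3' to 5' from left to right; it is not reversed.

```python
# Watson-Crick partners for each base (illustrative lookup tables).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(seq: str, pairs: dict = DNA_PAIRS) -> str:
    """Return the complementary strand, aligned position by position.

    Raises KeyError if seq contains a base that is not in the table.
    """
    return "".join(pairs[base] for base in seq.upper())


if __name__ == "__main__":
    dna_top = "ATCGATTGAGCTCTAGCG"
    print(dna_top)
    print(complement(dna_top))

    # Substitute uracil for thymine to obtain the RNA version.
    rna_top = dna_top.replace("T", "U")
    print(rna_top)
    print(complement(rna_top, RNA_PAIRS))
```

A strict lookup table is used so that an unexpected character raises an error instead of being passed through silently.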
Most intercalators are known or suspected carcinogens. Examples include ethidium bromide and acridine. An unnatural base pair is a designed subunit of DNA that is created in a laboratory and does not occur in nature. DNA sequences have been described which use newly created nucleobases to form a third base pair, in addition to the two ba
Origin of replication
The origin of replication is a particular sequence in a genome at which replication is initiated. This can involve either the replication of DNA in living organisms such as prokaryotes and eukaryotes, or that of DNA or RNA in viruses, such as double-stranded RNA viruses. DNA replication may proceed from this point bidirectionally or unidirectionally. The specific structure of the origin of replication varies somewhat from species to species, but all share some common characteristics such as high AT content. The origin of replication binds the pre-replication complex, a protein complex that recognizes the origin and begins to copy DNA. There are significant differences between prokaryotic and eukaryotic origins of replication. Most bacteria have a single circular molecule of DNA and only a single origin of replication per circular chromosome. Most archaea have a single circular molecule of DNA and several origins of replication along this circular chromosome. Eukaryotes have multiple origins of replication on each linear chromosome that initiate at different times, with up to 100,000 present in a single human cell.
Having many origins of replication helps to speed the duplication of their much larger store of genetic material. The segment of DNA copied starting from each unique replication origin is called a replicon. Replicons range from 40 kb in length, in Drosophila, to 300 kb in plants. Mitochondrial DNA in many organisms has two ori sequences. In humans, they are called oriH and oriL for the heavy and light strand of the DNA, each being the origin of replication for single-stranded replication. The two chloroplast DNA ori sequences in Nicotiana tabacum, the tobacco plant, have been characterized as oriA and oriB. Origins of replication are assigned names containing "ori". For plasmids, origins of replication are classified in two ways: by narrow or broad host range, and by high or low copy number. The genome of E. coli consists of a single circular DNA molecule of 4.6 × 10^6 nucleotide pairs. DNA replication begins at a single origin of replication. In E. coli, the origin of replication, oriC, consists of three A–T rich 13-mer repeats and four 9-mer repeats.
Ten to 20 monomers of the replication initiator protein DnaA bind to the 9-mer repeats, and the DNA coils around this protein complex, forming a protein core. This coiling stimulates the AT-rich region in the 13-mer sequence to unwind, allowing the helicase loader DnaC to load the replicative helicase DnaB onto each of the two unwound DNA strands. The helicase DnaB forms the basis of the primosome, a complex of enzymes to which DNA polymerase III is recruited before replication can occur. Many bacteria, including E. coli, contain plasmids that each contain an origin of replication named oriV, for vegetative replication. In general these origins still work by binding DnaA. They are separate from the origins of replication that the bacteria use to copy their genome and are regulated differently. For example, the E. coli plasmid pBR322 uses a protein called Rop/Rom to regulate the number of plasmids within each bacterial cell. This limits the number of plasmids per cell (the copy number) to 30–40.
The pUC series of plasmids, including pUC19, is more commonly used. Compared to pBR322, its ori carries a single point mutation and the regulatory Rop/Rom gene has been removed. With those changes, the bacteria can produce up to 500 copies of pUC19 per cell, which allows genetic engineers to produce large quantities of DNA for research purposes. Other origins of plasmid replication include pSC101, the 15A origin, and bacterial artificial chromosomes. During conjugation, the rolling circle mode of replication starts at the oriT sequence of the F plasmid. In eukaryotes, origins of replication in the budding yeast Saccharomyces cerevisiae were first identified by their ability to support the replication of mini-chromosomes or plasmids, giving rise to the name autonomously replicating sequences, or ARS elements. Each budding yeast origin consists of a short essential DNA sequence that recruits replication proteins. In other eukaryotes, including humans, the base pair sequences at the replication origins vary. Despite this sequence variation, all the origins form a base for assembly of a group of proteins known collectively as the pre-replication complex (pre-RC). First, the origin DNA is bound by the origin recognition complex (ORC), which, with help from two further protein factors, loads the mini chromosome maintenance protein complex.
Once assembled, this complex of proteins indicates that the replication origin is ready for activation. Once the replication origin is activated, the cell's DNA will be replicated. In metazoans, pre-RC formation is inhibited by the protein geminin, which binds to and inactivates Cdt1. Regulation of replication prevents the DNA from being replicated more than once each cell cycle. In humans, an origin of replication has been identified near the Lamin B2 gene on chromosome 19, and the binding of ORC to it has been studied extensively. Viruses often possess a single origin of replication. A variety of proteins have been described as being involved in viral replication. For instance, polyomaviruses utilize host cell DNA polymerases, which attach to a viral origin of replication if the T antigen is present.
The phenotype of an organism is the composite of the organism's observable characteristics or traits, including its morphology or physical form and structure. An organism's phenotype results from two basic factors: the expression of an organism's genetic code, or its genotype, and the influence of environmental factors. Both factors may interact, further affecting phenotype. When two or more different phenotypes exist in the same population of a species, the species is called polymorphic. A well-documented polymorphism is Labrador Retriever coloring. Richard Dawkins, in 1978 and again in his 1982 book The Extended Phenotype, suggested that bird nests and other built structures such as caddis fly larvae cases and beaver dams can be considered "extended phenotypes". The genotype-phenotype distinction was proposed by Wilhelm Johannsen in 1911 to make clear the difference between an organism's heredity and what that heredity produces. The distinction is similar to that proposed by August Weismann, who distinguished between germ plasm and somatic cells.
The genotype-phenotype distinction should not be confused with Francis Crick's central dogma of molecular biology, a statement about the directionality of molecular sequential information flowing from DNA to protein, and not the reverse. The term "phenotype" has sometimes been incorrectly used as a shorthand for phenotypic difference from wild type, leading to the absurd statement that a mutation has no phenotype. Despite its straightforward definition, the concept of the phenotype has hidden subtleties. It may seem that anything dependent on the genotype is a phenotype, including molecules such as RNA and proteins. Most molecules and structures coded by the genetic material are not visible in the appearance of an organism, yet they are observable and are thus part of the phenotype. It may seem that this goes beyond the original intentions of the concept, with its focus on the organism in itself. Either way, the term phenotype includes inherent traits or characteristics that are observable, or traits that can be made visible by some technical procedure.
A notable extension to this idea is the presence of "organic molecules" or metabolites that are generated by organisms from chemical reactions of enzymes. Another extension adds behavior to the phenotype. Behavioral phenotypes include cognitive and behavioral patterns. Some behavioral phenotypes may characterize psychiatric syndromes. Phenotypic variation is a fundamental prerequisite for evolution by natural selection. It is the living organism as a whole that contributes to the next generation, so natural selection affects the genetic structure of a population indirectly via the contribution of phenotypes. Without phenotypic variation, there would be no evolution by natural selection. The interaction between genotype and phenotype has been conceptualized by the following relationship:

genotype + environment → phenotype

A more nuanced version of the relationship is:

genotype + environment + genotype & environment interactions → phenotype

Genotypes have much flexibility in the modification and expression of phenotypes.
The plant Hieracium umbellatum is found growing in two different habitats in Sweden. One habitat is rocky, sea-side cliffs, where the plants are bushy with broad leaves and expanded inflorescences; the other is among sand dunes, where the plants grow prostrate with narrow leaves and compact inflorescences. These habitats alternate along the coast of Sweden, and the habitat that the seeds of Hieracium umbellatum land in determines the phenotype that grows. An example of random variation in Drosophila flies is the number of ommatidia, which may vary between left and right eyes in a single individual as much as it does between different genotypes overall, or between clones raised in different environments. The concept of phenotype can be extended to variations below the level of the gene that affect an organism's fitness. For example, silent mutations that do not change the corresponding amino acid sequence of a gene may change the frequency of guanine-cytosine base pairs. These base pairs have a higher thermal stability than adenine-thymine pairs, a property that might convey, among organisms living in high-temperature environments, a selective advantage on variants enriched in GC content.
Richard Dawkins described a phenotype that included all effects that a gene has on its surroundings, including other organisms, as an extended phenotype, arguing that "An animal's behavior tends to maximize the survival of the genes 'for' that behavior, whether or not those genes happen to be in the body of the particular animal performing it." For instance, an organism such as a beaver modifies its environment by building a beaver dam. When a bird feeds a brood parasite such as a cuckoo, it is unwittingly extending its phenotype.
Enzymes are macromolecular biological catalysts. Enzymes accelerate chemical reactions. The molecules upon which enzymes may act are called substrates, and the enzyme converts the substrates into different molecules known as products. Almost all metabolic processes in the cell need enzyme catalysis in order to occur at rates fast enough to sustain life. Metabolic pathways depend upon enzymes to catalyze individual steps. The study of enzymes is called enzymology, and a new field of pseudoenzyme analysis has grown up, recognising that during evolution some enzymes have lost the ability to carry out biological catalysis, which is reflected in their amino acid sequences and unusual "pseudocatalytic" properties. Enzymes are known to catalyze more than 5,000 biochemical reaction types. Most enzymes are proteins, although a few are catalytic RNA molecules; the latter are called ribozymes. Enzymes' specificity comes from their unique three-dimensional structures. Like all catalysts, enzymes increase the reaction rate by lowering its activation energy. Some enzymes can make their conversion of substrate to product occur many millions of times faster.
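How lowering the activation energy translates into a rate increase can be sketched with the Arrhenius relation, k = A·exp(−Ea/(RT)). The sketch below assumes that the catalyst changes only the activation energy and leaves the pre-exponential factor A unchanged, which is a simplification; the function name rate_enhancement is hypothetical.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def rate_enhancement(delta_ea_kj_per_mol: float,
                     temperature_k: float = 298.15) -> float:
    """Factor by which the rate constant grows when the activation
    energy is lowered by delta_ea_kj_per_mol.

    Assumption: Arrhenius behaviour with an unchanged pre-exponential
    factor, so k_cat / k_uncat = exp(delta_Ea / (R * T)).
    """
    return math.exp(delta_ea_kj_per_mol * 1000.0
                    / (GAS_CONSTANT * temperature_k))


if __name__ == "__main__":
    # Barrier reductions chosen purely for illustration.
    for delta in (10, 20, 34, 60):
        print(f"{delta} kJ/mol lower barrier -> "
              f"{rate_enhancement(delta):.3g}-fold faster")
```

Because the dependence is exponential, a barrier reduction of a few tens of kJ/mol at room temperature is already enough for a speed-up on the order of a million-fold under these assumptions.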
An extreme example is orotidine 5'-phosphate decarboxylase, which allows a reaction that would otherwise take millions of years to occur in milliseconds. Chemically, enzymes are like any catalyst: they are not consumed in chemical reactions, nor do they alter the equilibrium of a reaction. Enzymes differ from most other catalysts by being much more specific. Enzyme activity can be affected by other molecules: inhibitors are molecules that decrease enzyme activity, and activators are molecules that increase activity. Many therapeutic drugs and poisons are enzyme inhibitors. An enzyme's activity decreases markedly outside its optimal temperature and pH, and many enzymes are denatured when exposed to excessive heat, losing their structure and catalytic properties. Some enzymes are used commercially, for example in the synthesis of antibiotics. Some household products use enzymes to speed up chemical reactions: enzymes in biological washing powders break down protein, starch, or fat stains on clothes, and enzymes in meat tenderizer break down proteins into smaller molecules, making the meat easier to chew.
By the late 17th and early 18th centuries, the digestion of meat by stomach secretions and the conversion of starch to sugars by plant extracts and saliva were known, but the mechanisms by which these occurred had not been identified. French chemist Anselme Payen was the first to discover an enzyme, diastase, in 1833. A few decades later, when studying the fermentation of sugar to alcohol by yeast, Louis Pasteur concluded that this fermentation was caused by a vital force contained within the yeast cells called "ferments", which were thought to function only within living organisms. He wrote that "alcoholic fermentation is an act correlated with the life and organization of the yeast cells, not with the death or putrefaction of the cells." In 1877, German physiologist Wilhelm Kühne first used the term enzyme, which comes from Greek ἔνζυμον, "leavened" or "in yeast", to describe this process. The word enzyme was later used to refer to nonliving substances such as pepsin, and the word ferment was used to refer to chemical activity produced by living organisms.
Eduard Buchner submitted his first paper on the study of yeast extracts in 1897. In a series of experiments at the University of Berlin, he found that sugar was fermented by yeast extracts even when there were no living yeast cells in the mixture. He named the enzyme that brought about the fermentation of sucrose "zymase". In 1907, he received the Nobel Prize in Chemistry for "his discovery of cell-free fermentation". Following Buchner's example, enzymes are usually named according to the reaction they carry out: the suffix -ase is combined with the name of the substrate or the type of reaction. The biochemical identity of enzymes was still unknown in the early 1900s. Many scientists observed that enzymatic activity was associated with proteins, but others argued that proteins were merely carriers for the true enzymes and that proteins per se were incapable of catalysis. In 1926, James B. Sumner showed that the enzyme urease was a pure protein and crystallized it. The conclusion that pure proteins can be enzymes was definitively demonstrated by John Howard Northrop and Wendell Meredith Stanley, who worked on the digestive enzymes pepsin and chymotrypsin.
These three scientists were awarded the 1946 Nobel Prize in Chemistry. The discovery that enzymes could be crystallized allowed their structures to be solved by x-ray crystallography; this was first done for lysozyme, an enzyme found in tears and egg whites that digests the coating of some bacteria. This high-resolution structure of lysozyme marked the beginning of the field of structural biology and the effort to understand how enzymes work at an atomic level of detail. An enzyme's name is derived from its substrate or the chemical reaction it catalyzes, with the word ending in -ase. Examples are alcohol dehydrogenase and DNA polymerase. Different enzymes that catalyze the same chemical reaction are called isozymes; the International Union of Biochemistry and Molecular Biology have developed a nomenclature for enzymes, the EC numbers. The first number broadly classifies the enzyme based on its mechanism; the top-level classification is: EC 1, Oxidoreductases: catalyze oxidation/reducti
In evolutionary biology, parasitism is a relationship between species in which one organism, the parasite, lives on or in another organism, the host, causing it some harm, and is adapted structurally to this way of life. The entomologist E. O. Wilson has characterised parasites as "predators that eat prey in units of less than one". Parasites include protozoans such as the agents of malaria, sleeping sickness, and amoebic dysentery. There are six major parasitic strategies of exploitation of animal hosts, namely parasitic castration, directly transmitted parasitism, trophically transmitted parasitism, vector-transmitted parasitism, parasitoidism, and micropredation. Like predation, parasitism is a type of consumer-resource interaction, but unlike predators, parasites, with the exception of parasitoids, are much smaller than their hosts, do not kill them, and live in or on their hosts for an extended period. Parasites of animals are specialised, and reproduce at a faster rate than their hosts. Classic examples include interactions between vertebrate hosts and tapeworms, the malaria-causing Plasmodium species, and fleas.
Parasites reduce host fitness by general or specialised pathology, from parasitic castration to modification of host behaviour. Parasites increase their own fitness by exploiting hosts for resources necessary for their survival, in particular by feeding on them and by using intermediate hosts to assist in their transmission from one definitive host to another. Although parasitism is often unambiguous, it is part of a spectrum of interactions between species, grading via parasitoidism into predation, through evolution into mutualism, and, in some fungi, shading into being saprophytic. People have known about parasites such as roundworms and tapeworms since ancient Egypt and Rome. In Early Modern times, Antonie van Leeuwenhoek observed Giardia lamblia in his microscope in 1681, while Francesco Redi described internal and external parasites including sheep liver fluke and ticks. Modern parasitology developed in the 19th century. In human culture, parasitism has negative connotations. These were exploited to satirical effect in Jonathan Swift's 1733 poem "On Poetry: A Rhapsody", comparing poets to hyperparasitical "vermin".
In fiction, Bram Stoker's 1897 Gothic horror novel Dracula and its many adaptations featured a blood-drinking parasite. Ridley Scott's 1979 film Alien was one of many works of science fiction to feature a terrifying parasitic alien species. First used in English in 1539, the word parasite comes from the Medieval French parasite, from the Latin parasitus, the latinisation of the Greek παράσιτος, "one who eats at the table of another" and that from παρά, "beside, by" + σῖτος, "wheat", hence "food"; the related term parasitism appears in English from 1611. Parasitism is a kind of symbiosis, a close and persistent long-term biological interaction between a parasite and its host. Unlike commensalism and mutualism, the parasitic relationship harms the host, either feeding on it or, as in the case of intestinal parasites, consuming some of its food; because parasites interact with other species, they can act as vectors of pathogens, causing disease. Predation is by definition not a symbiosis, as the interaction is brief, but the entomologist E. O. Wilson has characterised parasites as "predators that eat prey in units of less than one".
Within that scope are many possible strategies. Taxonomists classify parasites in a variety of overlapping schemes, based on their interactions with their hosts and on their life-cycles, which are sometimes complex. An obligate parasite depends on the host to complete its life cycle, while a facultative parasite does not. Parasite life-cycles involving only one host are called "direct". An endoparasite lives inside the host's body. Mesoparasites (some copepods, for example) enter an opening in the host's body and remain partly embedded there. Some parasites can be generalists, feeding on a wide range of hosts, but many parasites, including the majority of protozoans and helminths that parasitise animals, are specialists and host-specific. An early basic, functional division of parasites distinguished microparasites and macroparasites. Each of these had a mathematical model assigned in order to analyse the population movements of the host–parasite groupings. The microorganisms and viruses that can reproduce and complete their life cycle within the host are known as microparasites.
Macroparasites are the multicellular organisms that reproduce and complete their life cycle outside of the host or on the host's body. Much of the thinking on types of parasitism has focussed on terrestrial animal parasites of animals, such as helminths; those in other environments and with other hosts have analogous strategies. For example, the snubnosed eel is a facultative endoparasite that opportunistically burrows into and eats sick and dying fish. Plant-eating insects such as scale insects and caterpillars resemble ectoparasites, attacking much larger plants; as female scale-insects cannot move, they are obligate parasites, permanently attached to their hosts. There are six major parasitic strategies, namely parasitic castration, directly transmitted parasitism, trophically transmitted parasitism, vector-transmitted parasitism, parasitoid
In biology, a gene is a sequence of nucleotides in DNA or RNA that codes for a molecule that has a function. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function. The transmission of genes to an organism's offspring is the basis of the inheritance of phenotypic traits. These genes make up different DNA sequences called genotypes. Genotypes, along with environmental and developmental factors, determine what the phenotypes will be. Most biological traits are under the influence of polygenes as well as gene–environment interactions. Some genetic traits are visible, such as eye color or number of limbs, and some are not, such as blood type, risk for specific diseases, or the thousands of basic biochemical processes that constitute life. Genes can acquire mutations in their sequence, leading to different variants, known as alleles, in the population. These alleles encode different versions of a protein, which cause different phenotypical traits.
The phrase "having a gene" usually refers to carrying a different allele of the same, shared gene. Genes evolve through natural selection (survival of the fittest) and genetic drift of the alleles. The concept of a gene continues to be refined. For example, regulatory regions of a gene can be far removed from its coding regions, and coding regions can be split into several exons. Some viruses store their genome in RNA instead of DNA, and some gene products are functional non-coding RNAs. Therefore, a broad, modern working definition of a gene is any discrete locus of heritable, genomic sequence which affects an organism's traits by being expressed as a functional product or by regulation of gene expression. The term gene was introduced by the Danish botanist, plant physiologist and geneticist Wilhelm Johannsen in 1909. It is inspired by the ancient Greek γόνος, meaning offspring and procreation. The existence of discrete inheritable units was first suggested by Gregor Mendel. From 1857 to 1864, in Brno, he studied inheritance patterns in 8000 common edible pea plants, tracking distinct traits from parent to offspring.
He described these mathematically as 2^n combinations, where n is the number of differing characteristics in the original peas. Although he did not use the term gene, he explained his results in terms of discrete inherited units that give rise to observable physical characteristics. This description prefigured Wilhelm Johannsen's distinction between genotype and phenotype. Mendel was the first to demonstrate independent assortment, the distinction between dominant and recessive traits, the distinction between a heterozygote and a homozygote, and the phenomenon of discontinuous inheritance. Prior to Mendel's work, the dominant theory of heredity was one of blending inheritance, which suggested that each parent contributed fluids to the fertilisation process and that the traits of the parents blended and mixed to produce the offspring. Charles Darwin developed a theory of inheritance he termed pangenesis, from Greek pan and genesis / genos. Darwin used the term gemmule to describe hypothetical particles of inheritance. Mendel's work went largely unnoticed after its first publication in 1866, but was rediscovered in the late 19th century by Hugo de Vries, Carl Correns, and Erich von Tschermak, who reached similar conclusions in their own research.
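The 2^n count can be illustrated with a short enumeration. The sketch below is only one way to read that figure: it assumes each of n characteristics has exactly two alternative forms that assort independently, and it lists every combination of forms that can be passed on together. The function name and the allele labels are hypothetical and chosen purely for illustration.

```python
from itertools import product

def allele_combinations(traits):
    """List every combination formed by picking one allele per trait.

    Assumes two alleles per trait and independent assortment,
    so n traits give 2**n combinations.
    """
    return ["".join(combo) for combo in product(*traits)]

# Hypothetical labels for three pea characteristics, two forms each.
traits = [("R", "r"), ("Y", "y"), ("G", "g")]
combos = allele_combinations(traits)

print(len(combos))  # 8, i.e. 2**3
print(combos)       # ['RYG', 'RYg', 'RyG', 'Ryg', 'rYG', 'rYg', 'ryG', 'ryg']
```

Adding a fourth two-form characteristic doubles the list to 16 entries, which is the doubling per characteristic that the 2^n expression describes.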
In 1889, Hugo de Vries published his book Intracellular Pangenesis, in which he postulated that different characters have individual hereditary carriers and that inheritance of specific traits in organisms comes in particles. De Vries called these units "pangenes", after Darwin's 1868 pangenesis theory. In 1905, William Bateson introduced the term 'genetics', and in 1909 Wilhelm Johannsen introduced the term 'gene', while Eduard Strasburger, amongst others, still used the term 'pangene' for the fundamental physical and functional unit of heredity. Advances in understanding genes and inheritance continued throughout the 20th century. Deoxyribonucleic acid (DNA) was shown to be the molecular repository of genetic information by experiments in the 1940s to 1950s. The structure of DNA was studied by Rosalind Franklin and Maurice Wilkins using X-ray crystallography, which led James D. Watson and Francis Crick to publish a model of the double-stranded DNA molecule whose paired nucleotide bases indicated a compelling hypothesis for the mechanism of genetic replication.
In the early 1950s the prevailing view was that the genes in a chromosome acted like discrete entities, indivisible by recombination and arranged like beads on a string. The experiments of Benzer using mutants defective in the rII region of bacteriophage T4 showed that individual genes have a simple linear structure and are likely to be equivalent to a linear section of DNA. Collectively, this body of research established the central dogma of molecular biology, which states that proteins are translated from RNA, which is in turn transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses. The modern study of genetics at the level of DNA is known as molecular genetics. In 1972, Walter Fiers and his team were the first to determine the sequence of a gene: that of the bacteriophage MS2 coat protein. The subsequent development of chain-termination DNA sequencing in 1977 by Frederick Sanger improved the efficiency of sequencing and turned it into a routine laboratory tool.
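The flow of information described by the central dogma (DNA transcribed to RNA, RNA translated to protein) can be sketched in a few lines. This is a toy illustration, not a bioinformatics tool: the function names and the example sequence are hypothetical, the codon table holds only the four standard-code codons used here, and the input is assumed to be the coding (sense) strand, so transcription reduces to replacing T with U.

```python
# Partial codon table from the standard genetic code (only codons used below).
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual start codon
    "UUU": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "UAA": "Stop",  # stop codon
}

def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        residues.append(residue)
    return residues

dna = "ATGTTTGGCTAA"      # hypothetical example sequence
mrna = transcribe(dna)
print(mrna)               # AUGUUUGGCUAA
print(translate(mrna))    # ['Met', 'Phe', 'Gly']
```

Reverse transcription, the exception noted above, would correspond to running the first step backwards, from RNA to DNA.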
An automated version of the Sanger method was used in early phases of the