Molecular phylogenetics is the branch of phylogeny that analyzes genetic, hereditary molecular differences, predominantly in DNA sequences, to gain information on an organism's evolutionary relationships. From these analyses, it is possible to determine the processes by which diversity among species has been achieved. The result of a molecular phylogenetic analysis is expressed in a phylogenetic tree. Molecular phylogenetics is one aspect of molecular systematics, a broader term that includes the use of molecular data in taxonomy and biogeography. Molecular phylogenetics and molecular evolution are closely linked. Molecular evolution is the process of selective changes at a molecular level throughout various branches in the tree of life. Molecular phylogenetics infers the evolutionary relationships that arise from molecular evolution and results in the construction of a phylogenetic tree. The accompanying figure depicts Haeckel's phylogenetic tree of life, one of the first detailed trees, drawn from the information known in the 1870s.
The theoretical frameworks for molecular systematics were laid in the 1960s in the works of Emile Zuckerkandl, Emanuel Margoliash, Linus Pauling and Walter M. Fitch. Applications of molecular systematics were pioneered by Charles G. Sibley, Herbert C. Dessauer and Morris Goodman, followed by Allan C. Wilson, Robert K. Selander and John C. Avise. Work with protein electrophoresis began around 1956. Although the results were not quantitative and did not improve on morphological classification, they provided tantalizing hints that long-held notions of the classifications of birds, for example, needed substantial revision. In the period 1974–1986, DNA-DNA hybridization was the dominant technique used to measure genetic difference. Early attempts at molecular systematics were termed chemotaxonomy and made use of proteins, enzymes and other molecules that were separated and characterized using techniques such as chromatography. These have since been replaced by DNA sequencing, which produces the exact sequences of nucleotides or bases in either DNA or RNA segments extracted using different techniques.
In general, sequence data are considered superior for evolutionary studies, since the actions of evolution are ultimately reflected in the genetic sequences. At present, it is still an expensive process to sequence the entire DNA of an organism. However, it is quite feasible to determine the sequence of a defined area of a particular chromosome. Typical molecular systematic analyses require the sequencing of around 1000 base pairs. At any location within such a sequence, the bases found in a given position may vary between organisms. The particular sequence found in a given organism is referred to as its haplotype. In principle, since there are four base types, 1000 base pairs could yield 4^1000 distinct haplotypes. However, for organisms within a particular species or in a group of related species, it has been found empirically that only a minority of sites show any variation at all, and most of the variations that are found are correlated, so that the number of distinct haplotypes actually found is small. In a molecular systematic analysis, the haplotypes are determined for a defined area of genetic material.
Haplotypes of individuals of related, yet different, taxa are determined. Haplotypes from a smaller number of individuals from a different taxon are also determined: these are referred to as an outgroup. The base sequences for the haplotypes are then compared. In the simplest case, the difference between two haplotypes is assessed by counting the number of locations where they have different bases: this is referred to as the number of substitutions. The difference between organisms is usually re-expressed as a percentage divergence, by dividing the number of substitutions by the number of base pairs analysed. The hope is that this measure will be independent of the location and length of the section of DNA that is sequenced. An older and superseded approach was to determine the divergences between the genotypes of individuals by DNA-DNA hybridization. The advantage claimed for using hybridization rather than gene sequencing was that it was based on the entire genotype, rather than on particular sections of DNA. Modern sequence comparison techniques overcome this objection by the use of multiple sequences.
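The simplest-case comparison described above can be sketched in a few lines of Python. This is only an illustration: the four 10-base haplotypes and the taxon names are invented for the example, the sequences are assumed to be already aligned to equal length, and real analyses would normally correct for multiple substitutions at the same site.

```python
from itertools import combinations


def count_substitutions(a, b):
    """Count the positions at which two aligned haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_divergence(a, b):
    """Substitutions divided by the number of base pairs, as a percentage."""
    return 100.0 * count_substitutions(a, b) / len(a)


def divergence_matrix(haplotypes):
    """Triangular matrix of divergences, stored as {(name1, name2): percent}."""
    return {
        (m, n): percent_divergence(haplotypes[m], haplotypes[n])
        for m, n in combinations(haplotypes, 2)
    }


# Hypothetical, made-up haplotypes (10 bases each) for illustration only.
HAPLOTYPES = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACGAACGTTC",
    "outgroup": "TCGAACCTTG",
}

MATRIX = divergence_matrix(HAPLOTYPES)

if __name__ == "__main__":
    for (m, n), d in MATRIX.items():
        print(f"{m} vs {n}: {d:.0f}% divergence")
```

In this toy data set the three ingroup taxa differ from one another by 10 to 20 percent, while each differs from the outgroup by 30 to 50 percent, which is the pattern an outgroup is chosen to show.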
Once the divergences between all pairs of samples have been determined, the resulting triangular matrix of differences is submitted to some form of statistical cluster analysis, and the resulting dendrogram is examined to see whether the samples cluster in the way that would be expected from current ideas about the taxonomy of the group. Any group of haplotypes that are all more similar to one another than any of them is to any other haplotype may be said to constitute a clade, which may be represented visually, as in the accompanying figure. Statistical techniques such as bootstrapping and jackknifing help in providing reliability estimates for the positions of haplotypes within the evolutionary trees. Every living organism contains deoxyribonucleic acid, ribonucleic acid and proteins. In general, closely related organisms have a high degree of similarity in the molecular structure of these substances, while the molecules of organisms distantly related s
Plesiomorphy and symplesiomorphy
In phylogenetics, a plesiomorphy, symplesiomorphy or symplesiomorphic character is an ancestral character shared by two or more taxa, but also by other taxa linked earlier in the clade. In this situation, the fact that the taxa under consideration share the trait state may hint that they are related, but the fact that the trait state is plesiomorphic means the hint may be misleading. It must be disregarded, and the question of whether those taxa are related must be determined from other evidence. The term symplesiomorphy was first introduced in 1950 by the German entomologist Willi Hennig. Two related terms are reversal, the loss of a derived trait state that re-establishes the plesiomorphic trait state present in an ancestor, and pseudoplesiomorphy. The concept of plesiomorphy addresses the perils of grouping species together purely on the basis of morphologic or genetic similarity without distinguishing ancestral from derived character states. Since a plesiomorphic character inherited from a common ancestor can appear anywhere in a phylogenetic tree, its presence cannot reveal anything about the relationships within that tree.
A famous example is the trait of breathing via gills in bony fish and cartilaginous fish. Bony fish are more closely related to terrestrial vertebrates, which evolved out of a clade of bony fishes and breathe through their skin or lungs, than they are to sharks and other cartilaginous fish. Gill respiration is shared by the "fishes" because it was present in their common ancestor and lost in the other living vertebrates. The shared trait therefore cannot be treated as evidence that bony fish are more closely related to sharks and rays than they are to terrestrial vertebrates.
In biology, phylogenetics is the study of the evolutionary history and relationships among individuals or groups of organisms. These relationships are discovered through phylogenetic inference methods that evaluate observed heritable traits, such as DNA sequences or morphology, under a model of evolution of these traits. The result of these analyses is a phylogeny: a diagrammatic hypothesis about the history of the evolutionary relationships of a group of organisms. The tips of a phylogenetic tree can be living organisms or fossils, and represent the "end", or the present, in an evolutionary lineage. Phylogenetic analyses have become central to understanding biodiversity, evolution and genomes. Taxonomy is the identification and classification of organisms. It is richly informed by phylogenetics, but remains a methodologically and logically distinct discipline. The degree to which taxonomies depend on phylogenies differs depending on the school of taxonomy: phenetics ignores phylogeny altogether, trying to represent the similarity between organisms instead.
Usual methods of phylogenetic inference involve computational approaches implementing the optimality criteria and methods of parsimony, maximum likelihood and MCMC-based Bayesian inference. All of these depend upon an implicit or explicit mathematical model describing the evolution of the characters observed. Phenetics, popular in the mid-20th century but now largely obsolete, used distance matrix-based methods to construct trees based on overall similarity in morphology or other observable traits, which was assumed to approximate phylogenetic relationships. Prior to 1950, phylogenetic inferences were generally presented as narrative scenarios. Such methods are often ambiguous and lack explicit criteria for evaluating alternative hypotheses. The term "phylogeny" derives from the German Phylogenie, introduced by Haeckel in 1866, and the Darwinian approach to classification became known as the "phyletic" approach. During the late 19th century, Ernst Haeckel's recapitulation theory, or "biogenetic fundamental law", was widely accepted. It was often expressed as "ontogeny recapitulates phylogeny", i.e. the development of a single organism during its lifetime, from germ to adult, successively mirrors the adult stages of successive ancestors of the species to which it belongs.
But this theory has long been rejected. Instead, ontogeny evolves: the phylogenetic history of a species cannot be read directly from its ontogeny, as Haeckel thought would be possible, but characters from ontogeny can be used as data for phylogenetic analyses.
14th century: lex parsimoniae (parsimony principle), William of Ockham, English philosopher and Franciscan friar, though the idea goes back to Aristotle; precursor concept.
1763: Bayesian probability, Rev. Thomas Bayes; precursor concept.
18th century: Pierre Simon first to use ML (maximum likelihood); precursor concept.
1809: evolutionary theory, Philosophie Zoologique, Jean-Baptiste de Lamarck; precursor concept. It was foreshadowed in the 17th and 18th centuries by Voltaire and Leibniz, with Leibniz proposing evolutionary changes to account for observed gaps, suggesting that many species had become extinct, that others had transformed, and that different species sharing common traits may at one time have been a single race. It was also foreshadowed by some early Greek philosophers such as Anaximander in the 6th century BC and the atomists of the 5th century BC, who proposed rudimentary theories of evolution.
1837: Darwin's notebooks show an evolutionary tree.
1843: distinction between homology and analogy, Richard Owen; precursor concept.
1858: paleontologist Heinrich Georg Bronn published a hypothetical tree illustrating the paleontological "arrival" of new, similar species following the extinction of an older species. Bronn did not propose a mechanism responsible for such phenomena; precursor concept.
1858: elaboration of evolutionary theory by Darwin and Wallace, also in Origin of Species by Darwin the following year; precursor concept.
1866: Ernst Haeckel first publishes his phylogeny-based evolutionary tree; precursor concept.
1893: Dollo's Law of Character State Irreversibility; precursor concept.
1912: ML recommended and popularized by Ronald Fisher; precursor concept.
1921: Tillyard uses the term "phylogenetic" and distinguishes between archaic and specialized characters in his classification system.
1940: term "clade" coined by Lucien Cuénot.
1949: jackknife resampling, Maurice Quenouille; precursor concept.
1950: Willi Hennig's classic formalization.
1952: William Wagner's groundplan divergence method.
1953: "cladogenesis" coined.
1960: "cladistic" coined by Cain and Harrison.
1963: first attempt to use ML for phylogenetics, Edwards and Cavalli-Sforza.
1965: Camin-Sokal parsimony, the first parsimony criterion and the first computer program/algorithm for cladistic analysis, both by Camin and Sokal; character compatibility method, also called clique analysis, introduced independently by Camin and Sokal and by E. O. Wilson.
1966: English translation of Hennig; "cladistics" and "cladogram" coined.
1969: dynamic and successive wei
In taxonomy, a group is paraphyletic if it consists of the group's last common ancestor and all descendants of that ancestor excluding a few (typically only one or two) monophyletic subgroups. The group is said to be paraphyletic with respect to the excluded subgroups. The arrangement of the members of a paraphyletic group is called a paraphyly. The term is commonly used in phylogenetics and in linguistics. It was coined to apply to well-known taxa like Reptilia which, as commonly named and traditionally defined, is paraphyletic with respect to mammals and birds. Reptilia contains the last common ancestor of reptiles and all descendants of that ancestor, including all extant reptiles as well as the extinct synapsids, except for mammals and birds. Other commonly recognized paraphyletic groups include fish and lizards. If many subgroups are missing from the named group, it is said to be polyparaphyletic. A paraphyletic group cannot be a clade, or monophyletic group, which is any group of species that includes only a common ancestor and all of its descendants.
Formally, a paraphyletic group is the relative complement of one or more subclades within a clade: removing one or more subclades leaves a paraphyletic group. The term paraphyly, or paraphyletic, derives from the two Ancient Greek words παρά, meaning "beside, near", and φῦλον, meaning "genus, species", and refers to the situation in which one or several monophyletic subgroups of organisms are left apart from all other descendants of a unique common ancestor. Conversely, the term monophyly, or monophyletic, builds on the Ancient Greek prefix μόνος, meaning "alone, unique", and refers to the fact that a monophyletic group includes organisms consisting of all the descendants of a unique common ancestor. By comparison, the term polyphyly, or polyphyletic, uses the Ancient Greek prefix πολύς, meaning "many, a lot of", and refers to the fact that a polyphyletic group includes organisms arising from multiple ancestral sources. Groups that include all the descendants of a common ancestor are said to be monophyletic.
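The formal definition, a clade minus one or more of its subclades, maps directly onto set difference. The following Python sketch illustrates it on a deliberately simplified toy tree of amniotes. The tree shape, the node names and the function names are assumptions made for this example only, not an authoritative phylogeny.

```python
# Toy rooted tree as {parent: [children]}. Simplified and hypothetical:
# it exists only to illustrate the set operations.
TREE = {
    "Amniota": ["Synapsida", "Sauropsida"],
    "Synapsida": ["extinct_synapsids", "Mammalia"],
    "Sauropsida": ["Testudines", "Lepidosauria", "Archosauria"],
    "Archosauria": ["Crocodylia", "Aves"],
}


def clade(tree, node):
    """Return the node together with all of its descendants."""
    members = {node}
    for child in tree.get(node, []):
        members |= clade(tree, child)
    return members


def relative_complement(tree, ancestor, excluded):
    """Clade of `ancestor` with the clades of the excluded subgroups removed."""
    group = clade(tree, ancestor)
    for sub in excluded:
        group -= clade(tree, sub)
    return group


def is_clade(tree, group):
    """True if `group` is exactly some node plus all of its descendants."""
    nodes = set(tree) | {c for kids in tree.values() for c in kids}
    return any(clade(tree, n) == group for n in nodes)


# Traditional Reptilia: everything descended from the common ancestor
# except the mammal and bird subclades.
REPTILIA = relative_complement(TREE, "Amniota", ["Mammalia", "Aves"])

if __name__ == "__main__":
    print(sorted(REPTILIA))
    print("Reptilia is a clade:", is_clade(TREE, REPTILIA))
```

Here the traditional Reptilia fails the clade test because two descendant subclades are missing, whereas a group such as the Archosauria of the toy tree, which keeps all of its descendants, passes it.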
A paraphyletic group is a monophyletic group from which one or more subsidiary clades are excluded to form a separate group. Ereshefsky has argued that paraphyletic taxa are the result of anagenesis in the excluded group or groups. A group whose identifying features evolved convergently in two or more lineages is polyphyletic. More broadly, any taxon that is not paraphyletic or monophyletic can be called polyphyletic. These terms were developed during the debates of the 1960s and 1970s accompanying the rise of cladistics. Paraphyletic groupings are considered problematic by many taxonomists, as it is not possible to talk precisely about their phylogenetic relationships, their characteristic traits and literal extinction. Related terms that may be encountered are stem group, budding cladogenesis, anagenesis and "grade" groupings. Paraphyletic groups are often a relic of earlier erroneous assessments of phylogenetic relationships, or date from before the rise of cladistics. The prokaryotes are a paraphyletic grouping, because they exclude the eukaryotes, a descendant group.
Bacteria and Archaea are prokaryotes, but archaea and eukaryotes share a common ancestor that is not ancestral to the bacteria. The prokaryote/eukaryote distinction was proposed by Edouard Chatton in 1937 and was generally accepted after being adopted by Roger Stanier and C. B. van Niel in 1962. The botanical code abandoned consideration of bacterial nomenclature in 1975. Among plants, dicotyledons are paraphyletic. "Dicotyledon" has not been used as an ICBN classification for decades, but is allowed as a synonym of Magnoliopsida. Phylogenetic analysis indicates that the monocots descend from a dicot ancestor. Excluding monocots from the dicots makes the latter a paraphyletic group. Among animals, several familiar groups are not, in fact, clades. The order Artiodactyla, as traditionally defined, is paraphyletic because it excludes the Cetacea. In the ICZN Code, the two taxa are orders of equal rank. Molecular studies, however, have shown that the Cetacea descend from artiodactyl ancestors, although the precise phylogeny within the order remains uncertain. Without the cetacean descendants, the artiodactyls must be paraphyletic. The class Reptilia as traditionally defined is paraphyletic because it excludes birds and mammals.
In the ICZN Code, the three taxa are classes of equal rank. However, mammals hail from the synapsids and birds are descended from the dinosaurs, both of which are reptiles. Alternatively, reptiles are paraphyletic because they gave rise to birds; birds and reptiles together make up the sauropsids. Osteichthyes, bony fish, are paraphyletic when they include only Actinopterygii and Sarcopterygii, excluding tetrapods. The wasps are paraphyletic, consisting of the narrow-waisted Apocrita without the bees. The sawflies are similarly paraphyletic, forming all of the Hymenoptera except for the Apocrita, a clade deep within the sawfly tree. Crustaceans are not a clade; the modern clade that spans all of them is the Tetraconata. Species have a special status in systematics as being an observable feature of nature itself and a
Hybrid speciation is a form of speciation where hybridization between two different species leads to a new species, reproductively isolated from the parent species. From the 1940s, reproductive isolation between hybrids and their parents was thought to be particularly difficult to achieve, and thus hybrid species were thought to be rare. With DNA analysis becoming more accessible in the 1990s, hybrid speciation has been shown to be a fairly common phenomenon in plants. In botanical nomenclature, a hybrid species is called a nothospecies. Hybrid species are by their nature polyphyletic. A hybrid may be better fitted to the local environment than the parental lineage, and as such natural selection may favor these individuals. If reproductive isolation is subsequently achieved, a separate species may arise. Reproductive isolation may be genetic, behavioural, spatial, or a combination of these. If reproductive isolation fails to establish, the hybrid population may merge with either or both parent species. This will lead to an influx of foreign genes into the parent population, a situation called introgression.
Introgression is a source of genetic variation, and can in itself facilitate speciation. There is evidence that introgression is a ubiquitous phenomenon in plants and in humans, where genetic material from Neanderthals and Denisovans is responsible for much of the immune genes in non-African populations. For a hybrid form to persist, it must be able to exploit the available resources better than either parent species, with which, in most cases, it will have to compete. While grizzly bears and polar bears may have offspring, a grizzly–polar bear hybrid will likely be less suited to either of the ecological roles than the parents themselves. Although the hybrid is fertile, this poor adaptation would prevent the establishment of a permanent population. Lions and tigers have overlapped in a portion of their range and can theoretically produce wild hybrids: ligers, which are a cross between a male lion and a female tiger, and tigons, which are a cross between a male tiger and a female lion. In both ligers and tigons, the females are fertile and the males are sterile.
One of these hybrids, the tigon, carries growth-inhibitor genes from both parents and thus is smaller than either parent species; in the wild it might come into competition with smaller carnivores, e.g. the leopard. The other hybrid, the liger, ends up larger than either of its parents: about a thousand pounds fully grown. No tiger-lion hybrids are known from the wild because each species is confined to geographically separated ranges. Some situations may favour hybrid populations. One example is rapid turnover of available environment types, like the historical fluctuation of water level in Lake Malawi, a situation that generally favors speciation. A similar situation can be found where closely related species occupy a chain of islands. This will allow any present hybrid population to move into new, unoccupied habitats, avoiding direct competition with parent species and giving a hybrid population time and space to establish. Genetics, too, can occasionally favour hybrids. In the Amboseli National Park in Kenya, yellow baboons and anubis baboons regularly interbreed.
The hybrid males reach maturity earlier than their pure-bred cousins, setting up a situation where the hybrid population may over time replace one or both of the parent species in the area. Genetics are more variable and malleable in plants than in animals, probably reflecting the higher activity level in animals. Hybrid genetics will necessarily be less stable than those of species evolving through isolation, which explains why hybrid species appear more common in plants than in animals. Many agricultural crops are hybrids with double or even triple chromosome sets. Having multiple sets of chromosomes is called polyploidy. Polyploidy is usually fatal in animals, where extra chromosome sets upset fetal development, but is often found in plants. A form of hybrid speciation that is relatively common in plants occurs when an infertile hybrid becomes fertile after doubling of the chromosome number. Hybridization without change in chromosome number is called homoploid hybrid speciation. This is the situation found in most animal hybrids. For a hybrid to be viable, the chromosomes of the two organisms will have to be very similar, i.e. the parent species must be closely related, or else the difference in chromosome arrangement will make mitosis problematic.
With polyploid hybridization, this constraint is less acute. Supernumerary chromosome numbers can be unstable, which can lead to instability in the genetics of the hybrid. The European edible frog appears to be a species, but is actually a triploid semi-permanent hybrid between pool frogs and marsh frogs. In most populations, the edible frog population depends on the presence of at least one of the parent species to be maintained, as each individual needs two gene sets from one parent species and one from the other. The male sex determination gene in the hybrids is only found in the genome of the pool frog, further undermining stability. Such instability can also lead to rapid reduction of chromosome numbers, creating reproductive barriers and thus allowing speciation. Hybrid speciation in animals is primarily homoploid. While not very common, a few animal species are the result of hybridization: insects such as the Lonicera fly, some fish, one mammal (the clymene dolphin), and a few birds. One is an unnamed species of Darwin's finches bred with Española cactus finch with a p
Cladistics is an approach to biological classification in which organisms are categorized in groups based on the most recent common ancestor. Hypothesized relationships are typically based on shared derived characteristics that can be traced to the most recent common ancestor and are not present in more distant groups and ancestors. A key feature of a clade is that a common ancestor and all its descendants are part of the clade. All descendants stay in their overarching ancestral clade. For example, if within a strict cladistic framework the terms animals, bilateria/worms, fishes/vertebrata, or monkeys/anthropoidea were used, these terms would include humans. Many of these terms are normally used paraphyletically, outside of cladistics, e.g. as a "grade". Radiation results in the generation of new subclades by bifurcation. The techniques and nomenclature of cladistics have been applied to other disciplines. Cladistics is now the most commonly used method to classify organisms. The original methods used in cladistic analysis and the school of taxonomy derived from the work of the German entomologist Willi Hennig, who referred to it as phylogenetic systematics.
Cladistics in the original sense refers to a particular set of methods used in phylogenetic analysis, although it is now sometimes used to refer to the whole field. What is now called the cladistic method appeared as early as 1901 with a work by Peter Chalmers Mitchell for birds, and subsequently with work by Robert John Tillyard in 1921 and W. Zimmermann in 1943. The term "clade" was introduced in 1958 by Julian Huxley after having been coined by Lucien Cuénot in 1940, "cladogenesis" in 1958, "cladistic" by Cain and Harrison in 1960, "cladist" by Mayr in 1965, and "cladistics" in 1966. Hennig referred to his own approach as "phylogenetic systematics". From the time of his original formulation until the end of the 1970s, cladistics competed as an analytical and philosophical approach to systematics with phenetics and so-called evolutionary taxonomy. Phenetics was championed at this time by the numerical taxonomists Peter Sneath and Robert Sokal, and evolutionary taxonomy by Ernst Mayr. Conceived, if only in essence, by Willi Hennig in a book published in 1950, cladistics did not flourish until its translation into English in 1966.
Today, cladistics is the most popular method for constructing phylogenies from morphological data. In the 1990s, the development of effective polymerase chain reaction techniques allowed the application of cladistic methods to biochemical and molecular genetic traits of organisms, vastly expanding the amount of data available for phylogenetics. At the same time, cladistics rapidly became popular in evolutionary biology, because computers made it possible to process large quantities of data about organisms and their characteristics. The cladistic method interprets each character state transformation implied by the distribution of shared character states among taxa as a potential piece of evidence for grouping. The outcome of a cladistic analysis is a cladogram: a tree-shaped diagram, interpreted to represent the best hypothesis of phylogenetic relationships. Although traditionally such cladograms were generated largely on the basis of morphological characters and originally calculated by hand, genetic sequencing data and computational phylogenetics are now commonly used in phylogenetic analyses, and the parsimony criterion has been abandoned by many phylogeneticists in favor of more "sophisticated" but less parsimonious evolutionary models of character state transformation.
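To make the parsimony criterion concrete, the sketch below counts the minimum number of character state transformations that a given tree requires, using Fitch's small-parsimony algorithm for unordered characters. It is only an illustration: the four taxa, the three binary characters, the two candidate trees and all function names are invented for the example, and the trees are assumed to be rooted and strictly binary.

```python
def fitch(tree, states):
    """Fitch's algorithm for one character.

    `tree` is a leaf name (str) or a 2-tuple of subtrees; `states` maps
    each leaf name to its character state. Returns (state_set, changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change is needed here.
        return common, left_changes + right_changes
    # The children disagree: one transformation must occur on this node's branches.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, matrix):
    """Sum the minimum number of changes over all characters in `matrix`."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Hypothetical data: four taxa scored for three binary characters.
MATRIX = {"A": "000", "B": "001", "C": "110", "D": "111"}

# Two competing hypotheses of relationship.
TREE_1 = (("A", "B"), ("C", "D"))
TREE_2 = (("A", "C"), ("B", "D"))

BEST = min([TREE_1, TREE_2], key=lambda t: parsimony_length(t, MATRIX))

if __name__ == "__main__":
    for name, tree in (("TREE_1", TREE_1), ("TREE_2", TREE_2)):
        print(name, "requires", parsimony_length(tree, MATRIX), "changes")
```

On this toy matrix the first tree needs four transformations and the second needs five, so the parsimony criterion prefers the first. A real analysis would search over many more trees and characters, which is why such calculations are now done computationally.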
Cladists dispute that these models are justified. Every cladogram is based on a particular dataset analyzed with a particular method. Datasets are tables consisting of molecular, ethological and/or other characters and a list of operational taxonomic units, which may be genes, populations, species, or larger taxa that are presumed to be monophyletic and therefore to form, all together, one large clade. Different datasets and different methods, not to mention violations of the mentioned assumptions, often result in different cladograms. Only scientific investigation can show which is more likely to be correct. Until recently, for example, cladograms in which turtles branch off before lizards and birds diverge from each other were generally accepted as accurate representations of the ancestral relations among turtles, lizards and birds. If this phylogenetic hypothesis is correct, the last common ancestor of turtles and birds lived earlier than the last common ancestor of lizards and birds. Most molecular evidence, however, produces cladograms in which lizards branch off first. If this is accurate, the last common ancestor of turtles and birds lived later than the last common ancestor of lizards and birds.
Since the cladograms provide competing accounts of real events, at most one of them is correct. The accompanying cladogram represents the current universally accepted hypothesis that all primates, including strepsirrhines like the lemurs and lorises, had a common ancestor all of whose descendants were primates, and so form a clade. Within the primates, all anthropoids are hypothesized to have had a common ancestor all of whose descendants were anthropoids, so they form the clade called Anthropoidea. The "prosimians", on the other hand, form a paraphyletic taxon. The name Prosimii is not used in phylogenetic nomenclature, whic
Convergent evolution is the independent evolution of similar features in species of different lineages. Convergent evolution creates analogous structures that have similar form or function but were not present in the last common ancestor of those groups. The cladistic term for the same phenomenon is homoplasy. The recurrent evolution of flight is a classic example, as flying insects, birds and bats have independently evolved the useful capacity of flight. Functionally similar features that have arisen through convergent evolution are analogous, whereas homologous structures or traits have a common origin but can have dissimilar functions. Bird and pterosaur wings are analogous structures, but their forelimbs are homologous, sharing an ancestral state despite serving different functions. The opposite of convergence is divergent evolution. Convergent evolution is similar to parallel evolution, which occurs when two independent species evolve in the same direction and thus independently acquire similar characteristics.
Many instances of convergent evolution are known in plants, including the repeated development of C4 photosynthesis, seed dispersal by fleshy fruits adapted to be eaten by animals, and carnivory. In morphology, analogous traits arise when different species live in similar ways and/or a similar environment, and so face the same environmental factors. When species occupy similar ecological niches, similar problems can lead to similar solutions. The British anatomist Richard Owen was the first to identify the fundamental difference between analogies and homologies. In biochemistry, physical and chemical constraints on mechanisms have caused some active site arrangements, such as the catalytic triad, to evolve independently in separate enzyme superfamilies. In his 1989 book Wonderful Life, Stephen Jay Gould argued that if one could "rewind the tape of life [and] the same conditions were encountered again, evolution could take a different course". Simon Conway Morris disputes this conclusion, arguing that convergence is a dominant force in evolution and that, given that the same environmental and physical constraints are at work, life will inevitably evolve toward an "optimum" body plan; at some point, evolution is bound to stumble upon intelligence, a trait presently identified with at least primates and cetaceans.
In cladistics, a homoplasy is a trait shared by two or more taxa for any reason other than that they share a common ancestry. Taxa which do share ancestry are part of the same clade. Homoplastic traits caused by convergence are therefore, from the point of view of cladistics, confounding factors which could lead to an incorrect analysis. In some cases, it is difficult to tell whether a trait has been lost and then re-evolved convergently, or whether a gene has simply been switched off and then re-enabled later. Such a re-emerged trait is called an atavism. From a mathematical standpoint, an unused gene has a steadily decreasing probability of retaining potential functionality over time. The time scale of this process varies greatly in different phylogenies. When two species are similar in a particular character, evolution is defined as parallel if the ancestors were also similar, and convergent if they were not. Some scientists have argued that there is a continuum between parallel and convergent evolution, while others maintain that despite some overlap, there are still important distinctions between the two.
When the ancestral forms are unspecified or unknown, or the range of traits considered is not clearly specified, the distinction between parallel and convergent evolution becomes more subjective. For instance, the striking example of similar placental and marsupial forms is described by Richard Dawkins in The Blind Watchmaker as a case of convergent evolution, because mammals on each continent had a long evolutionary history prior to the extinction of the dinosaurs under which to accumulate relevant differences. The enzymology of proteases provides some of the clearest examples of convergent evolution. These examples reflect the intrinsic chemical constraints on enzymes, leading evolution to converge on equivalent solutions independently and repeatedly. Serine and cysteine proteases use different amino acid functional groups as a nucleophile. In order to activate that nucleophile, they orient an acidic and a basic residue in a catalytic triad. The chemical and physical constraints on enzyme catalysis have caused identical triad arrangements to evolve independently more than 20 times in different enzyme superfamilies.
Threonine proteases use the amino acid threonine as their catalytic nucleophile. Unlike cysteine and serine, threonine is a secondary alcohol. The methyl group of threonine greatly restricts the possible orientations of triad and substrate, as the methyl clashes with either the enzyme backbone or the histidine base. Consequently, most threonine proteases use an N-terminal threonine in order to avoid such steric clashes. Several evolutionarily independent enzyme superfamilies with different protein folds use the N-terminal residue as a nucleophile. This commonality of active site but difference of protein fold indicates that the active site evolved convergently in those families. Convergence occurs at the level of DNA and of the amino acid sequences produced by translating structural genes into proteins. Studies have found convergence in amino acid sequenc