Human Y-chromosome DNA haplogroup
In human genetics, a human Y-chromosome DNA haplogroup is a haplogroup defined by mutations in the non-recombining portions of DNA from the Y chromosome. Single-base mutations that are shared by many people are called single-nucleotide polymorphisms (SNPs); the human Y chromosome accumulates about two mutations per generation. Y-DNA haplogroups represent major branches of the Y-chromosome phylogenetic tree, each sharing hundreds or thousands of mutations unique to that haplogroup. The Y-chromosomal most recent common ancestor, often called Y-chromosomal Adam, is the most recent common ancestor from whom all living men are descended patrilineally. Y-chromosomal Adam is estimated to have lived about 236,000 years ago in Africa. Analysis of later bottlenecks indicates that most Eurasian men are descended from a man who lived about 69,000 years ago. Other major bottlenecks occurred about 5,000 years ago, and as a result most Eurasian men can trace their ancestry back to about a dozen ancestors who lived at that time. Y-DNA haplogroups are defined by the presence of a series of Y-DNA SNP markers.
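The idea that a haplogroup is defined by a series of SNP markers can be illustrated with a minimal sketch. The tree, the branch names (X, X1, X1a, Y) and the marker names (snp1 to snp4) are hypothetical placeholders and do not correspond to real haplogroups or markers. The sketch assumes a man is assigned to the deepest branch for which every defining SNP on the path from the root has been observed.

```python
# Toy Y-chromosome tree: each branch is defined by one SNP and hangs off a parent.
# Branch and SNP names are hypothetical placeholders, not real markers.
TOY_TREE = {
    "root": {"parent": None, "snp": None},
    "X": {"parent": "root", "snp": "snp1"},
    "X1": {"parent": "X", "snp": "snp2"},
    "X1a": {"parent": "X1", "snp": "snp3"},
    "Y": {"parent": "root", "snp": "snp4"},
}


def defining_snps(branch, tree=TOY_TREE):
    """Return the series of SNPs on the path from the root down to `branch`."""
    series = []
    while branch is not None:
        snp = tree[branch]["snp"]
        if snp is not None:
            series.append(snp)
        branch = tree[branch]["parent"]
    return list(reversed(series))


def assign_haplogroup(derived_snps, tree=TOY_TREE):
    """Pick the deepest branch whose whole defining series was observed."""
    best, best_depth = "root", 0
    for branch in tree:
        series = defining_snps(branch, tree)
        if all(s in derived_snps for s in series) and len(series) > best_depth:
            best, best_depth = branch, len(series)
    return best


if __name__ == "__main__":
    print(defining_snps("X1a"))
    print(assign_haplogroup({"snp1", "snp2"}))
```

In this toy model, a sample carrying only snp2 is not placed on branch X1, because the upstream marker snp1 is missing from the series.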
Subclades are defined by a terminal SNP, the SNP furthest down in the Y-chromosome phylogenetic tree. The Y Chromosome Consortium (YCC) developed a system of naming major Y-DNA haplogroups with the capital letters A through T, with further subclades named using numbers and lower case letters (the longhand nomenclature). YCC shorthand nomenclature names Y-DNA haplogroups and their subclades with the first letter of the major Y-DNA haplogroup followed by a dash and the name of the defining terminal SNP. Y-DNA haplogroup nomenclature is changing over time to accommodate the increasing number of SNPs being discovered and tested and the resulting expansion of the Y-chromosome phylogenetic tree. This change has resulted in inconsistent nomenclature being used in different sources. This inconsistency, together with cumbersome longhand nomenclature, has prompted a move towards the simpler shorthand nomenclature. In September 2012, Family Tree DNA provided an explanation of its changing Y-DNA haplogroup nomenclature to individual customers on their Y-DNA results pages.
Phylogenetic tree of Y-DNA haplogroups
Haplogroup A is the NRY macrohaplogroup from which all modern paternal haplogroups descend.
It is sparsely distributed in Africa, being concentrated among Khoisan populations in the southwest and Nilotic populations toward the northeast in the Nile Valley. BT is a subclade of haplogroup A; its site of origin is unknown, but was in either Asia or Africa about 88,000 years ago.

Haplogroup DE
  Haplogroup D: found in Asia
  Haplogroup E: found in Africa, Middle East, Southern Europe
Haplogroup CF
  Haplogroup C: found in Asia and North America
  Haplogroup F: found in Europe, Asia and the Americas

Haplogroup C: found in Asia and North America
  Haplogroup C1
    Haplogroup C1a
      Haplogroup C1a1: found with low frequency in Japan
      Haplogroup C1a2: found with low frequency in Europe, Armenians and Nepal
    Haplogroup C1b
      Haplogroup C1b1
        Haplogroup C1b1a
          Haplogroup C1b1a1: found with low frequency in South Asia, Southwest Asia, northern China
          Haplogroup C1b1a2
            Haplogroup C1b1a2a: found among Lebbo' people in Borneo, Indonesia
            Haplogroup C1b1a2b: found among Han Chinese, Dai people, Murut people, Malay people, Aeta people
          Haplogroup C1b1a3: found with low frequency in Saudi Arabia and Iraq
        Haplogroup C1b1b: found among Dusun people
      Haplogroup C1b2
        Haplogroup C1b2a: found in Indonesia, New Guinea, Melanesia and Polynesia
        Haplogroup C1b2b: found among the indigenous peoples in Australia
  Haplogroup C2: found throughout Eurasia and North America, but especially among Mongols, Tungusic peoples, Na-Dené-speaking peoples

Haplogroup D: found in Japan, the Andaman Islands
  Haplogroup D1
    Haplogroup D1a
      Haplogroup D1a1: found in Tibetans, Qiangic peoples, Yi, Hmong-Mien peoples
      Haplogroup D1a2: found in Tibetans, Qiangic peoples and Turkic peoples
    Haplogroup D1b: found in Japan
  Haplogroup D2: found in Mactan Island, Philippines

Haplogroup E: found in Africa and parts of Middle East and Europe
  Haplogroup E1
    Haplogroup E1a (formerly E1)
    Haplogroup E1b
      Haplogroup E1b1
Genomics is an interdisciplinary field of biology focusing on the structure, evolution and editing of genomes. A genome is an organism's complete set of DNA, including all of its genes. In contrast to genetics, which refers to the study of individual genes and their roles in inheritance, genomics aims at the collective characterization and quantification of genes, which direct the production of proteins with the assistance of enzymes and messenger molecules. In turn, proteins make up body structures such as organs and tissues, control chemical reactions and carry signals between cells. Genomics involves the sequencing and analysis of genomes, using high-throughput DNA sequencing and bioinformatics to assemble and analyze the function and structure of entire genomes. Advances in genomics have triggered a revolution in discovery-based research and systems biology, facilitating understanding of the most complex biological systems such as the brain. The field includes studies of intragenomic phenomena such as epistasis, pleiotropy and other interactions between loci and alleles within the genome.
The name derives from the Greek ΓΕΝ gen, "gene", meaning "become, creation, birth", with subsequent variants such as genealogy, genetics, genomere and genus. While the word genome was in use in English as early as 1926, the term genomics was coined by Tom Roderick, a geneticist at the Jackson Laboratory, over beer at a meeting held in Maryland on the mapping of the human genome in 1986. Following Rosalind Franklin's confirmation of the helical structure of DNA, James D. Watson and Francis Crick's publication of the structure of DNA in 1953 and Fred Sanger's publication of the amino acid sequence of insulin in 1955, nucleic acid sequencing became a major target of early molecular biologists. In 1964, Robert W. Holley and colleagues published the first nucleic acid sequence determined, the ribonucleotide sequence of alanine transfer RNA. Extending this work, Marshall Nirenberg and Philip Leder revealed the triplet nature of the genetic code and were able to determine the sequences of 54 out of 64 codons in their experiments.
In 1972, Walter Fiers and his team at the Laboratory of Molecular Biology of the University of Ghent were the first to determine the sequence of a gene: the gene for bacteriophage MS2 coat protein. Fiers' group expanded on their MS2 coat protein work, determining the complete nucleotide sequences of bacteriophage MS2 RNA and simian virus 40 in 1976 and 1978, respectively. In addition to his seminal work on the amino acid sequence of insulin, Frederick Sanger and his colleagues played a key role in the development of DNA sequencing techniques that enabled the establishment of comprehensive genome sequencing projects. In 1975, he and Alan Coulson published a sequencing procedure using DNA polymerase with radiolabelled nucleotides that he called the Plus and Minus technique. This involved two related methods that generated short oligonucleotides with defined 3' termini. These could be fractionated by electrophoresis on a polyacrylamide gel and visualised using autoradiography. The procedure could sequence up to 80 nucleotides in one go and was a big improvement, but was still laborious.
In 1977 his group was able to sequence most of the 5,386 nucleotides of the single-stranded bacteriophage φX174, completing the first sequenced DNA-based genome. The refinement of the Plus and Minus method resulted in the chain-termination, or Sanger, method, which formed the basis of the techniques of DNA sequencing, genome mapping, data storage and bioinformatic analysis most used in the following quarter-century of research. In the same year, Walter Gilbert and Allan Maxam of Harvard University independently developed the Maxam-Gilbert method of DNA sequencing, involving the preferential cleavage of DNA at known bases, a less efficient method. For their groundbreaking work in the sequencing of nucleic acids, Gilbert and Sanger shared half the 1980 Nobel Prize in chemistry with Paul Berg. The advent of these technologies resulted in a rapid intensification in the scope and speed of completion of genome sequencing projects. The first complete genome sequence of a eukaryotic organelle, the human mitochondrion, was reported in 1981, and the first chloroplast genomes followed in 1986.
In 1992, the first eukaryotic chromosome, chromosome III of brewer's yeast Saccharomyces cerevisiae, was sequenced. The first free-living organism to have its genome sequenced was Haemophilus influenzae, in 1995. The following year, a consortium of researchers from laboratories across North America and Japan announced the completion of the first complete genome sequence of a eukaryote, S. cerevisiae, and since then genomes have continued being sequenced at an exponentially growing pace. As of October 2011, complete sequences are available for 2,719 viruses, 1,115 archaea and bacteria, and 36 eukaryotes, of which about half are fungi. Most of the microorganisms whose genomes have been sequenced are problematic pathogens, such as Haemophilus influenzae, which has resulted in a pronounced bias in their phylogenetic distribution compared to the breadth of microbial diversity. Of the other sequenced species, most were chosen because they were well-studied model organisms or promised to become good models. Yeast (Saccharomyces
The Y chromosome is one of two sex chromosomes in mammals, including humans, and many other animals. The other is the X chromosome. Y is the sex-determining chromosome in many species, since it is the presence or absence of Y that determines the male or female sex of offspring produced in sexual reproduction. In mammals, the Y chromosome contains the gene SRY, which triggers male development. The DNA in the human Y chromosome is composed of about 59 million base pairs. The Y chromosome is passed only from father to son. With a 30% difference between humans and chimpanzees, the Y chromosome is one of the fastest-evolving parts of the human genome. To date, over 200 Y-linked genes have been identified. All Y-linked genes are expressed and hemizygous except in cases of aneuploidy such as XYY syndrome or XXYY syndrome. The Y chromosome was identified as a sex-determining chromosome by Nettie Stevens at Bryn Mawr College in 1905 during a study of the mealworm Tenebrio molitor. Edmund Beecher Wilson independently discovered the same mechanisms the same year.
Stevens proposed that chromosomes always existed in pairs and that the Y chromosome was the pair of the X chromosome discovered in 1890 by Hermann Henking. She realized that the previous idea of Clarence Erwin McClung, that the X chromosome determines sex, was wrong and that sex determination is, in fact, due to the presence or absence of the Y chromosome. Stevens named the chromosome "Y" to follow on from Henking's "X" alphabetically. The idea that the Y chromosome was named after its similarity in appearance to the letter "Y" is mistaken. All chromosomes appear as an amorphous blob under the microscope and only take on a well-defined shape during mitosis. This shape is vaguely X-shaped for all chromosomes. It is coincidental that the Y chromosome, during mitosis, has two short branches which can look merged under the microscope and appear as the descender of a Y-shape. Most therian mammals have only one pair of sex chromosomes in each cell. Males have one Y chromosome and one X chromosome. In mammals, the Y chromosome contains SRY, which triggers embryonic development as a male.
The Y chromosomes of humans and other mammals contain other genes needed for normal sperm production. There are exceptions, however. Among humans, some men have two Xs and a Y, or one X and two Ys, and some women have three Xs or a single X instead of a double X. There are other exceptions in which SRY is damaged, or copied to the X. Many ectothermic vertebrates have no sex chromosomes. If they have different sexes, sex is determined environmentally rather than genetically. For some of them, especially reptiles, sex depends on the incubation temperature. The X and Y chromosomes are thought to have evolved from a pair of identical chromosomes, termed autosomes, when an ancestral animal developed an allelic variation, a so-called "sex locus": possessing this allele caused the organism to be male. The chromosome with this allele became the Y chromosome, while the other member of the pair became the X chromosome. Over time, genes that were beneficial for males and harmful to females either developed on the Y chromosome or were acquired through the process of translocation.
Until recently, the X and Y chromosomes were thought to have diverged around 300 million years ago. However, research published in 2010, and particularly research published in 2008 documenting the sequencing of the platypus genome, has suggested that the XY sex-determination system would not have been present more than 166 million years ago, at the split of the monotremes from other mammals. This re-estimation of the age of the therian XY system is based on the finding that sequences that are on the X chromosomes of marsupials and eutherian mammals are present on the autosomes of platypus and birds. The older estimate was based on erroneous reports that the platypus X chromosomes contained these sequences. Recombination between the X and Y chromosomes proved harmful: it resulted in males without necessary genes found on the Y chromosome, and females with unnecessary or harmful genes only found on the Y chromosome. As a result, genes beneficial to males accumulated near the sex-determining genes, and recombination in this region was suppressed in order to preserve this male-specific region.
Over time, the Y chromosome changed in such a way as to inhibit the areas around the sex-determining genes from recombining at all with the X chromosome. As a result of this process, 95% of the human Y chromosome is unable to recombine. Only the tips of the Y and X chromosomes recombine. The tips of the Y chromosome that could recombine with the X chromosome are referred to as the pseudoautosomal region. The rest of the Y chromosome is passed on to the next generation intact, allowing for its use in tracking human evolution. By one estimate, the human Y chromosome has lost 1,393 of its 1,438 original genes over the course of its existence, and linear extrapolation of this 1,393-gene loss over 300 million years gives a rate of genetic loss of 4.6 genes per million years. Continued loss of genes at the rate of 4.6 genes per million years would result in a Y chromosome with no functional genes (that is, the Y chromosome would lose complete function) within the next 10 million years, or half that time with the current age estimate of 160 million years.
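The linear extrapolation above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (1,438 original genes, 1,393 lost, and ages of 300 or 160 million years); the function names are illustrative, and a constant rate of loss is the same simplifying assumption the estimate itself makes.

```python
def loss_rate(genes_lost, span_myr):
    """Average number of genes lost per million years over the given span."""
    return genes_lost / span_myr


def time_to_zero(genes_remaining, rate):
    """Million years until no genes remain, assuming the rate stays constant."""
    return genes_remaining / rate


ORIGINAL_GENES = 1438
GENES_LOST = 1393
remaining = ORIGINAL_GENES - GENES_LOST  # 45 genes left

for age_myr in (300, 160):
    rate = loss_rate(GENES_LOST, age_myr)
    years_left = time_to_zero(remaining, rate)
    print(f"age {age_myr} Myr: {rate:.1f} genes/Myr, {years_left:.1f} Myr until none remain")
```

With the 300-million-year age the rate is about 4.6 genes per million years and the 45 remaining genes would last a little under 10 million years; with the 160-million-year age the rate is about 8.7 genes per million years and the remaining time is roughly halved.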
Comparative genomic analysis reveals that many mammalian species are experiencing a similar loss of function in their h
A genetic marker is a gene or DNA sequence with a known location on a chromosome that can be used to identify individuals or species. It can be described as a variation. A genetic marker may be a short DNA sequence, such as a sequence surrounding a single base-pair change, or a long one, like minisatellites. For many years, gene mapping was limited to identifying organisms by traditional phenotype markers. This included genes that encoded easily observable characteristics such as blood types or seed shapes. The insufficient number of these types of characteristics in several organisms limited the mapping efforts that could be done. This prompted the development of gene markers which could identify genetic characteristics that are not readily observable in organisms. Some commonly used types of genetic markers are: RFLP, SSLP, AFLP, RAPD, VNTR, SSR (microsatellite polymorphism), SNP, STR, SFP, DArT and RAD markers. Molecular genetic markers can be divided into two classes: a) biochemical markers, which detect variation at the gene product level, such as changes in proteins and amino acids, and b) molecular markers, which detect variation at the DNA level, such as nucleotide changes: deletion, inversion and/or insertion.
Markers can exhibit two modes of inheritance, i.e. dominant/recessive or co-dominant. If the genetic pattern of homozygotes can be distinguished from that of heterozygotes, a marker is said to be co-dominant. Co-dominant markers are generally more informative than dominant markers. Genetic markers can be used to study the relationship between an inherited disease and its genetic cause. It is known that pieces of DNA that lie near each other on a chromosome tend to be inherited together. This property enables the use of a marker to determine the precise inheritance pattern of a gene that has not yet been localized. Genetic markers are employed in genealogical DNA testing for genetic genealogy to determine genetic distance between individuals or populations. Uniparental markers are studied for assessing maternal or paternal lineages. Autosomal markers are used for all ancestry. Genetic markers have to be easily identifiable, associated with a specific locus, and polymorphic, because homozygotes do not provide any information.
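The difference between dominant and co-dominant markers described above can be made concrete with a small sketch. It assumes a single locus with two hypothetical alleles, written "A" and "a", and models what each kind of marker lets an observer see; the scoring functions are illustrative and not a model of any particular laboratory assay.

```python
def score_codominant(genotype):
    """A co-dominant marker reveals both alleles, so each genotype looks different."""
    return "/".join(sorted(genotype))


def score_dominant(genotype, dominant_allele="A"):
    """A dominant marker only shows whether the dominant allele is present."""
    return "present" if dominant_allele in genotype else "absent"


# The three possible genotypes at a locus with two alleles (hypothetical labels).
GENOTYPES = [("A", "A"), ("A", "a"), ("a", "a")]

if __name__ == "__main__":
    for g in GENOTYPES:
        print(g, score_codominant(g), score_dominant(g))
```

The co-dominant score separates all three genotypes, while the dominant score lumps the heterozygote together with one homozygote, which is why co-dominant markers carry more information.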
Detection of the marker can be direct, or indirect using allozymes. Some of the methods used to study the genome or phylogenetics are RFLP, amplified fragment length polymorphism (AFLP), RAPD and SSR. They can be used to create genetic maps. There was a debate over what the transmissible agent of the canine transmissible venereal tumor was. Many researchers hypothesized that virus-like particles were responsible for transforming the cell, while others thought that the cell itself was able to infect other canines as an allograft. With the aid of genetic markers, researchers were able to provide conclusive evidence that the cancerous tumor cell evolved into a transmissible parasite. Furthermore, molecular genetic markers were used to resolve the issue of natural transmission, the breed of origin, and the age of the canine tumor. Genetic markers have also been used to measure the genomic response to selection in livestock. Natural and artificial selection lead to a change in the genetic makeup of the cell. The presence of different alleles due to a distorted segregation at the genetic markers is indicative of the difference between selected and non-selected livestock.
Mitochondrial DNA (mtDNA) is the DNA located in mitochondria, cellular organelles within eukaryotic cells that convert chemical energy from food into a form that cells can use, adenosine triphosphate. Mitochondrial DNA is only a small portion of the DNA in a eukaryotic cell. In humans, the 16,569 base pairs of mitochondrial DNA encode only 37 genes. Human mitochondrial DNA was the first significant part of the human genome to be sequenced. In most species, including humans, mtDNA is inherited from the mother. However, in exceptional cases, human babies inherit mtDNA from both their fathers and their mothers, resulting in mtDNA heteroplasmy. Since animal mtDNA evolves faster than nuclear genetic markers, it represents a mainstay of phylogenetics and evolutionary biology. It also permits an examination of the relatedness of populations, and so has become important in anthropology and biogeography. Nuclear and mitochondrial DNA are thought to be of separate evolutionary origin, with the mtDNA being derived from the circular genomes of the bacteria that were engulfed by the early ancestors of today's eukaryotic cells.
This theory is called the endosymbiotic theory. Each mitochondrion is estimated to contain 2–10 mtDNA copies. In the cells of extant organisms, the vast majority of the proteins present in the mitochondria are coded for by nuclear DNA, but the genes for some, if not most, of them are thought to have been of bacterial origin, having since been transferred to the eukaryotic nucleus during evolution. The reasons why mitochondria have retained some genes are debated. The existence in some species of mitochondrion-derived organelles lacking a genome suggests that complete gene loss is possible, and transferring mitochondrial genes to the nucleus has several advantages. The difficulty of targeting remotely-produced hydrophobic protein products to the mitochondrion is one hypothesis for why some genes are retained in mtDNA. Recent analysis of a wide range of mtDNA genomes suggests that several such features may together dictate mitochondrial gene retention. In most multicellular organisms, mtDNA is inherited from the mother.
Mechanisms for this include simple dilution and degradation of sperm mtDNA in the male genital tract and in the fertilized egg. Whatever the mechanism, this single-parent pattern of mtDNA inheritance is found in most animals, most plants and in fungi. In sexual reproduction, mitochondria are normally inherited from the mother. Most sperm mitochondria are present at the base of the sperm's tail, which is used for propelling the sperm cells. In 1999 it was reported that paternal sperm mitochondria are marked with ubiquitin to select them for later destruction inside the embryo. Some in vitro fertilization techniques, particularly those injecting a sperm into an oocyte, may interfere with this. The fact that mitochondrial DNA is maternally inherited enables genealogical researchers to trace maternal lineage far back in time. This is accomplished on human mitochondrial DNA by sequencing the hypervariable control regions, and sometimes the complete molecule of the mitochondrial DNA, as a genealogical DNA test. HVR1, for example, consists of about 440 base pairs.
These 440 base pairs are compared to the same regions of other individuals to determine maternal lineage. Most often, the comparison is made with the revised Cambridge Reference Sequence. Vilà et al. have published studies tracing the matrilineal descent of domestic dogs from wolves. The concept of the Mitochondrial Eve is based on the same type of analysis, attempting to discover the origin of humanity by tracking the lineage back in time. MtDNA is conserved, and its slow mutation rates make it useful for studying the evolutionary relationships (phylogeny) of organisms. Biologists can determine and then compare mtDNA sequences among different species and use the comparisons to build an evolutionary tree for the species examined. However, due to the slow mutation rates, it is hard to distinguish between closely related species to any large degree, so other methods of analysis must be used. Entities subject to uniparental inheritance and with little to no recombination may be expected to be subject to Muller's ratchet, the accumulation of deleterious mutations until functionality is lost.
Animal populations of mitochondria avoid this through a developmental process known as the mtDNA bottleneck. The bottleneck exploits random processes in the cell to increase the cell-to-cell variability in mutant load as an organism develops: a single egg cell with some proportion of mutant mtDNA thus produces an embryo in which different cells have different mutant loads. Cell-level selection may act to remove those cells with more mutant mtDNA, leading to a stabilisation or reduction in mutant load between generations; the mechanism underlying the bottleneck is debated, with a recent mathematical and experimental
Cladistics is an approach to biological classification in which organisms are categorized in groups (clades) based on the most recent common ancestor. Hypothesized relationships are typically based on shared derived characteristics that can be traced to the most recent common ancestor and are not present in more distant groups and ancestors. A key feature of a clade is that a common ancestor and all its descendants are part of the clade. All descendants stay in their overarching ancestral clade. For example, if within a strict cladistic framework the terms animals, bilateria/worms, fishes/vertebrata, or monkeys/anthropoidea were used, these terms would include humans. Many of these terms are normally used paraphyletically, outside of cladistics, e.g. as a 'grade'. Radiation results in the generation of new subclades by bifurcation. The techniques and nomenclature of cladistics have been applied to other disciplines. Cladistics is now the most commonly used method to classify organisms. The original methods used in cladistic analysis and the school of taxonomy derived from the work of the German entomologist Willi Hennig, who referred to it as phylogenetic systematics.
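The defining property of a clade, that it contains all the descendants of some common ancestor, can be checked mechanically on a tree. The following is a minimal sketch on a hypothetical four-leaf tree; the node names (root, A, B, a1, a2, b1, b2) are placeholders rather than real taxa, and the tree is assumed to be given as a mapping from each node to its children.

```python
# Hypothetical rooted tree: each node maps to its list of children.
CHILDREN = {
    "root": ["A", "B"],
    "A": ["a1", "a2"],
    "B": ["b1", "b2"],
    "a1": [], "a2": [], "b1": [], "b2": [],
}


def descendants(node, children=CHILDREN):
    """All nodes below `node`, found by walking down the tree."""
    found = set()
    stack = list(children[node])
    while stack:
        current = stack.pop()
        found.add(current)
        stack.extend(children[current])
    return found


def leaves_under(node, children=CHILDREN):
    """The leaves (nodes without children) at or below `node`."""
    below = descendants(node, children) | {node}
    return {n for n in below if not children[n]}


def is_clade(group, children=CHILDREN):
    """True if `group` is exactly the set of leaves under some single ancestor."""
    return any(leaves_under(n, children) == set(group) for n in children)


if __name__ == "__main__":
    print(is_clade({"a1", "a2"}))        # all leaves under A
    print(is_clade({"a1", "a2", "b1"}))  # leaves out b2, so not a clade
```

A group such as {a1, a2, b1} corresponds to the paraphyletic usage mentioned above: it shares an ancestor (the root) but omits one of that ancestor's descendants.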
Cladistics in the original sense refers to a particular set of methods used in phylogenetic analysis, although it is now sometimes used to refer to the whole field. What is now called the cladistic method appeared as early as 1901 with a work by Peter Chalmers Mitchell for birds, and subsequently by Robert John Tillyard in 1921 and W. Zimmermann in 1943. The term "clade" was introduced in 1958 by Julian Huxley after having been coined by Lucien Cuénot in 1940, "cladogenesis" in 1958, "cladistic" by Cain and Harrison in 1960, "cladist" by Mayr in 1965, and "cladistics" in 1966. Hennig referred to his own approach as "phylogenetic systematics". From the time of his original formulation until the end of the 1970s, cladistics competed as an analytical and philosophical approach to systematics with phenetics and so-called evolutionary taxonomy. Phenetics was championed at this time by the numerical taxonomists Peter Sneath and Robert Sokal, and evolutionary taxonomy by Ernst Mayr. Conceived, if only in essence, by Willi Hennig in a book published in 1950, cladistics did not flourish until its translation into English in 1966.
Today, cladistics is the most popular method for constructing phylogenies from morphological data. In the 1990s, the development of effective polymerase chain reaction techniques allowed the application of cladistic methods to biochemical and molecular genetic traits of organisms, vastly expanding the amount of data available for phylogenetics. At the same time, cladistics became popular in evolutionary biology, because computers made it possible to process large quantities of data about organisms and their characteristics. The cladistic method interprets each character state transformation implied by the distribution of shared character states among taxa as a potential piece of evidence for grouping. The outcome of a cladistic analysis is a cladogram, a tree-shaped diagram interpreted to represent the best hypothesis of phylogenetic relationships. Although traditionally such cladograms were generated largely on the basis of morphological characters and calculated by hand, genetic sequencing data and computational phylogenetics are now used in phylogenetic analyses, and the parsimony criterion has been abandoned by many phylogeneticists in favor of more "sophisticated" but less parsimonious evolutionary models of character state transformation.
Cladists contend that these models are unjustified. Every cladogram is based on a particular dataset analyzed with a particular method. Datasets are tables consisting of molecular, ethological and/or other characters and a list of operational taxonomic units, which may be genes, populations, species, or larger taxa that are presumed to be monophyletic and therefore to form, all together, one large clade. Different datasets and different methods, not to mention violations of the mentioned assumptions, often result in different cladograms. Only scientific investigation can show which is more likely to be correct. Until recently, for example, cladograms in which turtles branch off before the split between lizards and birds were generally accepted as accurate representations of the ancestral relations among turtles, lizards and birds. If this phylogenetic hypothesis is correct, the last common ancestor of turtles and birds lived earlier than the last common ancestor of lizards and birds. Most molecular evidence, however, produces cladograms in which lizards branch off first. If this is accurate, the last common ancestor of turtles and birds lived later than the last common ancestor of lizards and birds.
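The two competing hypotheses about turtles, lizards and birds can be compared with a small sketch. The nested tuples below are a simplified, assumed encoding of the two tree shapes implied by the text (only three taxa, no branch lengths), and `lca_depth` is an illustrative helper: a larger depth means the last common ancestor sits further from the root and therefore lived later.

```python
# Simplified encodings of the two hypotheses; each tuple is one branching point.
TRADITIONAL = ("turtles", ("lizards", "birds"))  # turtles branch off first
MOLECULAR = ("lizards", ("turtles", "birds"))    # lizards branch off first


def leaves(tree):
    """Set of taxon names contained in a (sub)tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(sub) for sub in tree))


def lca_depth(tree, a, b, depth=0):
    """Depth of the smallest subtree containing both taxa (the root has depth 0)."""
    if isinstance(tree, str):
        return depth
    for sub in tree:
        if {a, b} <= leaves(sub):
            return lca_depth(sub, a, b, depth + 1)
    return depth


if __name__ == "__main__":
    for name, tree in (("traditional", TRADITIONAL), ("molecular", MOLECULAR)):
        tb = lca_depth(tree, "turtles", "birds")
        lb = lca_depth(tree, "lizards", "birds")
        print(name, "turtles+birds:", tb, "lizards+birds:", lb)
```

Under the first encoding the turtle and bird ancestor is at the root (earlier) and the lizard and bird ancestor is one level deeper (later); under the second encoding the order is reversed.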
Since the cladograms provide competing accounts of real events, at most one of them is correct. The currently universally accepted hypothesis is that all primates, including strepsirrhines like the lemurs and lorises, had a common ancestor all of whose descendants were primates, and so form a clade. Within the primates, all anthropoids are hypothesized to have had a common ancestor all of whose descendants were anthropoids, so they form the clade called Anthropoidea. The "prosimians", on the other hand, form a paraphyletic taxon. The name Prosimii is not used in phylogenetic nomenclature, whic
Genetic genealogy is the use of DNA testing in combination with traditional genealogical methods to infer relationships between individuals and find ancestors. Genetic genealogy involves the use of genealogical DNA testing to determine the level and type of the genetic relationship between individuals. This application of genetics became popular with family historians in the 21st century, as tests became affordable. The tests have been promoted by amateur groups, such as surname study groups or regional genealogical groups, as well as research projects such as the Genographic Project. As of 2018, 18.5 million people had been tested. As this field has developed, the aims of practitioners have broadened, with many seeking knowledge of their ancestry beyond the recent centuries for which traditional pedigrees can be constructed. The investigation of surnames in genetics can be said to go back to George Darwin, a son of Charles Darwin. In 1875, George Darwin used surnames to estimate the frequency of first-cousin marriages and calculated the expected incidence of marriage between people of the same surname.
He arrived at a figure between 2.25% and 4.5% for cousin marriage in the population of Great Britain, higher among the upper classes and lower among the general rural population. One famous study examined the lineage of descendants of Thomas Jefferson's paternal line and male-lineage descendants of the freed slave Sally Hemings. Bryan Sykes, a molecular biologist at Oxford University, tested the new methodology in general surname research. His study of the Sykes surname obtained results by looking at four STR markers on the male chromosome. It pointed the way to genetics becoming a valuable assistant in the service of genealogy and history. The first company to provide direct-to-consumer genetic DNA testing was the now defunct GeneTree. However, it did not offer multi-generational genealogy tests. In fall 2001, GeneTree sold its assets to the Salt Lake City-based Sorenson Molecular Genealogy Foundation (SMGF), which originated in 1999. While in operation, SMGF provided free mitochondrial DNA tests to thousands.
GeneTree later returned to genetic testing for genealogy in conjunction with the Sorenson parent company and was part of the assets acquired in the Ancestry.com buyout of SMGF. In 2000, Family Tree DNA, founded by Bennett Greenspan and Max Blankfeld, was the first company dedicated to direct-to-consumer testing for genealogy research. They initially offered eleven-marker Y-chromosome STR tests and HVR1 mitochondrial DNA tests. They originally tested in partnership with the University of Arizona. In 2007, 23andMe was the first company to offer saliva-based direct-to-consumer genetic testing. It was also the first to use autosomal DNA for ancestry testing, which all other major companies now use. By 2018, large DNA genealogy companies had over 18.5 million profiles. The publication of The Seven Daughters of Eve by Sykes in 2001, which described the seven major haplogroups of European ancestors, helped push personal ancestry testing through DNA tests into wide public notice. With the growing availability and affordability of genealogical DNA testing, genetic genealogy as a field grew rapidly.
By 2003, the field of DNA testing of surnames was declared to have "arrived" in an article by Jobling and Tyler-Smith in Nature Reviews Genetics. The number of firms offering tests, and the number of consumers ordering them, rose dramatically. In 2018, a paper in Science estimated that a DNA genealogy search on anybody of European descent would result in a third cousin or closer match 60% of the time. The original Genographic Project was a five-year research study launched in 2005 by the National Geographic Society and IBM, in partnership with the University of Arizona and Family Tree DNA. Its goals were primarily anthropological. The project announced that by April 2010 it had sold more than 350,000 of its public participation testing kits, which test the general public for either twelve STR markers on the Y chromosome or mutations on the HVR1 region of the mtDNA. In 2007, annual sales of genetic genealogical tests for all companies, including the laboratories that support them, were estimated to be in the area of $60 million.
The current phase of the project is Geno 2.0 Next Generation. As of 2018, one million participants in over 140 countries have joined the project. Genetic genealogy has enabled groups of people to trace their ancestry even though they are not able to use conventional genealogical techniques. This may be because they do not know one or both of their birth parents or because conventional genealogical records have been lost, destroyed or never existed. These groups include adoptees, Holocaust survivors, GI babies, child migrants, descendants of children from orphan trains and people with slave ancestry. The earliest test takers were customers, most often those who started with a Y-chromosome test to determine their father's paternal ancestry. These men often took part in surname projects. The first phase of the Genographic Project brought new participants into genetic genealogy. Those who tested were as likely to be interested in direct maternal heritage as their paternal. The number of those taking mtDNA tests increased. The introduction of autosomal SNP tests based on microarray chip technology changed the demographics.
Women were as likely as men to test themselves. Members of the growing genetic genealogy community have been credited with making useful contributions to knowledge in the field. One of the earliest interest groups to emerge was the International Society of Genetic Genealogy. Their stated goal is to promote DNA testing for genealogy. Members advocate the use of genetics in genealogical research and the group faci