Protein dimer
In biochemistry, a protein dimer is a macromolecular complex formed by two protein monomers, or single proteins, which are usually non-covalently bound. Many macromolecules, such as proteins or nucleic acids, form dimers; the word dimer has roots meaning "two parts", di- + -mer. A protein dimer is a type of protein quaternary structure. A protein homodimer is formed by two identical proteins, and a protein heterodimer by two different proteins. Most protein dimers in biochemistry are not connected by covalent bonds. An example of a non-covalent heterodimer is the enzyme reverse transcriptase, which is composed of two different amino acid chains. An exception is dimers that are linked by disulfide bridges, such as the homodimeric protein NEMO. Some proteins contain specialized domains that ensure the specificity of dimerization.
Examples of proteins that form dimers include antibodies, receptor tyrosine kinases, transcription factors, leucine zipper motif proteins, nuclear receptors, 14-3-3 proteins, G protein-coupled receptors, the G protein βγ-subunit dimer, kinesin, triosephosphate isomerase, alcohol dehydrogenase, Factor XI, Factor XIII, Toll-like receptors, fibrinogen, the variable surface glycoproteins of the Trypanosoma parasite, tubulin, and type II restriction enzymes.
Blood plasma
Blood plasma is a yellowish liquid component of blood that holds the blood cells of whole blood in suspension. In other words, it is the liquid part of the blood that carries cells and proteins throughout the body. It makes up about 55% of the body's total blood volume. It is the intravascular fluid part of extracellular fluid. It is mostly water and contains dissolved proteins, clotting factors, hormones, carbon dioxide and oxygen. It plays a vital role in an intravascular osmotic effect that keeps electrolyte concentration balanced and protects the body from infection and other blood disorders. Blood plasma is separated from the blood by spinning a tube of fresh blood containing an anticoagulant in a centrifuge until the blood cells fall to the bottom of the tube; the blood plasma is then poured or drawn off. Blood plasma has a density of 1025 kg/m³, or 1.025 g/ml. Blood serum is blood plasma without clotting factors. Plasmapheresis is a medical therapy that involves blood plasma extraction and reintegration.
Fresh frozen plasma is on the WHO Model List of Essential Medicines, the most important medications needed in a basic health system. It is of critical importance in the treatment of many types of trauma that result in blood loss, and is therefore kept in stock in medical facilities capable of treating trauma or that pose a risk of patient blood loss, such as surgical suite facilities. Blood plasma volume may be expanded by or drained to extravascular fluid when there are changes in Starling forces across capillary walls. For example, when blood pressure drops in circulatory shock, Starling forces drive fluid into the interstitium, causing third spacing. Standing still for a prolonged period causes an increase in transcapillary hydrostatic pressure; as a result, about 12% of blood plasma volume crosses into the extravascular compartment. This causes an increase in hematocrit, serum total protein, and blood viscosity and, as a result of the increased concentration of coagulation factors, orthostatic hypercoagulability.
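As a rough illustration of why a plasma shift raises hematocrit, the following sketch recomputes the cell fraction of blood after part of the plasma leaves the vessels. The function name is hypothetical, and the starting values (plasma at 55% of blood volume, with the remaining 45% treated entirely as red cells of unchanged volume) are simplifying assumptions made for the example, not clinical reference figures.

```python
def hematocrit_after_plasma_loss(plasma_fraction, plasma_loss):
    """Return the cell fraction of blood after some plasma leaves the vessels.

    plasma_fraction: initial plasma share of blood volume (between 0 and 1).
    plasma_loss: share of the plasma volume that shifts out (0 to 1).
    Assumption: everything that is not plasma is red cells, and the
    cell volume itself does not change.
    """
    if not 0 < plasma_fraction < 1:
        raise ValueError("plasma_fraction must be between 0 and 1")
    if not 0 <= plasma_loss <= 1:
        raise ValueError("plasma_loss must be between 0 and 1")
    cells = 1.0 - plasma_fraction
    remaining_plasma = plasma_fraction * (1.0 - plasma_loss)
    # Hematocrit is the cell volume divided by the new total volume.
    return cells / (cells + remaining_plasma)


before = hematocrit_after_plasma_loss(0.55, 0.0)
after = hematocrit_after_plasma_loss(0.55, 0.12)
print(f"hematocrit before: {before:.3f}, after: {after:.3f}")
```

Under these assumptions the hematocrit rises from 0.45 to roughly 0.48, because the same cell volume is now suspended in less plasma. The same concentration effect applies to plasma proteins that stay inside the vessels.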
Plasma was already well known when described by William Harvey in De Motu Cordis in 1628, but knowledge of it extends as far back as Vesalius. The discovery of fibrinogen by William Hewson in about 1770 made it easier to study plasma. Ordinarily, when blood comes into contact with a foreign surface (something other than vascular endothelium), clotting factors become activated and clotting proceeds rapidly, trapping red blood cells and other components in the plasma and preventing the separation of plasma from the blood. Adding citrate and other anticoagulants is a relatively recent advance. Upon formation of a clot, the remaining clear fluid is serum, which is plasma without the clotting factors. The use of blood plasma as a substitute for whole blood and for transfusion purposes was proposed in March 1918, in the correspondence columns of the British Medical Journal, by Gordon R. Ward. "Dried plasmas" in powder or strips of material format were developed and first used in World War II. Prior to the United States' involvement in the war, liquid plasma and whole blood were used.
The "Blood for Britain" program during the early 1940s was quite successful, based on Charles Drew's contribution. A large project began in August 1940 to collect blood in New York City hospitals for the export of plasma to Britain. Drew was appointed medical supervisor of the "Plasma for Britain" project. His notable contribution at this time was to transform the test tube methods of many blood researchers into the first successful mass production techniques. The decision was made to develop a dried plasma package for the armed forces, as it would reduce breakage and make transportation and storage much simpler. The resulting dried plasma package contained two bottles. One bottle contained enough distilled water to reconstitute the dried plasma contained within the other bottle. In about three minutes, the plasma would be ready to use, and it could stay fresh for around four hours. The Blood for Britain program operated for five months, during which 15,000 people donated blood and over 5,500 vials of blood plasma were collected. Following the "Plasma for Britain" project, Drew was named director of the Red Cross blood bank and assistant director of the National Research Council, in charge of blood collection for the United States Army and Navy.
Drew argued against the armed forces directive that blood and plasma were to be separated by the race of the donor. Drew insisted that there was no racial difference in human blood and that the policy would lead to needless deaths as soldiers and sailors were required to wait for "same race" blood. By the end of the war the American Red Cross had provided enough blood for over six million plasma packages. Most of the surplus plasma was returned to the United States for civilian use. Serum albumin replaced dried plasma for combat use during the Korean War. Plasma as a blood product prepared from blood donations is used in blood transfusions as fresh frozen plasma or as plasma frozen within 24 hours after phlebotomy. For whole blood or packed red blood cell (PRBC) transfusions, O- is the most desirable donor type and is considered a "universal donor," since it has neither A nor B antigens and can be safely transfused to most recipients. Type AB+ is the "universal recipient" type for PRBC donations. However, for plasma the situation is somewhat reverse
Base pair
A base pair is a unit consisting of two nucleobases bound to each other by hydrogen bonds. Base pairs form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, Watson–Crick base pairs allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence. The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base-pairing patterns that identify particular regulatory regions of genes. Intramolecular base pairs can occur within single-stranded nucleic acids.
This is particularly important in RNA molecules, where Watson–Crick base pairs permit the formation of short double-stranded helices, and a wide variety of non-Watson–Crick interactions allow RNAs to fold into a vast range of specific three-dimensional structures. In addition, base-pairing between transfer RNA and messenger RNA forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code. The size of an individual gene or an organism's entire genome is often measured in base pairs because DNA is usually double-stranded. Hence, the number of total base pairs is equal to the number of nucleotides in one of the strands. The haploid human genome is estimated to be about 3.2 billion bases long and to contain 20,000–25,000 distinct protein-coding genes. A kilobase is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA. The total amount of related DNA base pairs on Earth is estimated at 5.0×10³⁷, weighing 50 billion tonnes.
In comparison, the total mass of the biosphere has been estimated to be as much as 4 TtC (trillion tonnes of carbon). Hydrogen bonding is the chemical interaction that underlies base pairing. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content. Contrary to popular belief, however, the hydrogen bonds do not stabilize the DNA significantly; stabilization is mainly due to stacking interactions. The larger nucleobases, adenine and guanine, are members of a class of double-ringed chemical structures called purines; the smaller nucleobases are single-ringed structures called pyrimidines. Purines are complementary only with pyrimidines: pyrimidine-pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established, and purine-purine pairings are unfavorable because the molecules are too close together. Purine-pyrimidine base-pairing of AT or GC or UA (in RNA) results in proper duplex structure. The only other purine-pyrimidine pairings would be AC and GT (or UG in RNA); these are mismatches because the patterns of hydrogen-bond donors and acceptors do not correspond. The GU pairing, with two hydrogen bonds, does occur often in RNA. Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a melting point that is determined by the length of the molecules, the extent of mispairing, and the GC content.
Higher GC content results in higher melting temperatures. Conversely, regions of a genome that need to separate frequently (for example, the promoter regions of often-transcribed genes) are comparatively GC-poor. GC content and melting temperature must also be taken into account when designing primers for PCR reactions. The following sequences illustrate double-stranded pairing patterns. By convention, the top strand is written from the 5' end to the 3' end; thus, the bottom strand is written 3' to 5'.
A base-paired DNA sequence:
ATCGATTGAGCTCTAGCG
TAGCTAACTCGAGATCGC
The corresponding RNA sequence, in which uracil is substituted for thymine:
AUCGAUUGAGCUCUAGCG
UAGCUAACUCGAGAUCGC
Chemical analogs of nucleotides can take the place of proper nucleotides and establish non-canonical base-pairing, leading to errors in DNA replication and DNA transcription. This is due to their isosteric chemistry. One common mutagenic base analog is 5-bromouracil, which resembles thymine but can base-pair to guanine in its enol form. Other chemicals, known as DNA intercalators, fit into the gap between adjacent bases on a single strand and induce frameshift mutations by "masquerading" as a base, causing the DNA replication machinery to skip or insert additional nucleotides at the intercalated site.
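The pairing rules and the GC content discussed above can be expressed in a few lines of code. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, only the four canonical bases of each nucleic acid are handled, and the partner strand is returned position by position (so it reads 3' to 5' when the input is written 5' to 3').

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def paired_strand(sequence, rna=False):
    """Return the Watson-Crick partner of each base, in the same order."""
    table = RNA_PAIRS if rna else DNA_PAIRS
    try:
        return "".join(table[base] for base in sequence.upper())
    except KeyError as err:
        raise ValueError(f"unexpected base: {err.args[0]}") from None


def gc_content(sequence):
    """Return the fraction of bases in the sequence that are G or C."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


top = "ATCGATTGAGCTCTAGCG"
print(paired_strand(top))                              # TAGCTAACTCGAGATCGC
print(paired_strand("AUCGAUUGAGCUCUAGCG", rna=True))   # UAGCUAACUCGAGAUCGC
print(gc_content(top))                                 # 0.5
```

Applied to the example sequences, the function reproduces the bottom strands shown earlier. The GC fraction alone does not give a melting temperature; it is only one of the inputs (along with length and mispairing) mentioned in the text.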
Most intercalators are known or suspected carcinogens. Examples include ethidium bromide and acridine. An unnatural base pair is a designed subunit of DNA that is created in a laboratory and does not occur in nature. DNA sequences have been described which use newly created nucleobases to form a third base pair, in addition to the two ba
Chromosome 4
Chromosome 4 is one of the 23 pairs of chromosomes in humans. People normally have two copies of this chromosome. Chromosome 4 spans more than 186 million base pairs and represents between 6 and 6.5 percent of the total DNA in cells. The chromosome is ~191 megabases in length. In a 2012 paper, 757 protein-encoding genes were identified on this chromosome. Of these coding sequences, 211 did not have any experimental evidence at the protein level in 2012, 271 appear to be membrane proteins, and 54 have been classified as cancer-associated proteins. Estimates of the gene count of human chromosome 4 differ: because researchers use different approaches to genome annotation, their predictions of the number of genes on each chromosome vary. Among various projects, the collaborative consensus coding sequence project (CCDS) takes a conservative strategy, so the CCDS gene number prediction represents a lower bound on the total number of human protein-coding genes.
Zygosity
Zygosity is the degree of similarity of the alleles for a trait in an organism. Most eukaryotes have two matching sets of chromosomes; that is, they are diploid. Diploid organisms have the same loci on each of their two sets of homologous chromosomes, except that the sequences at these loci may differ between the two chromosomes in a matching pair and that a few chromosomes may be mismatched as part of a chromosomal sex-determination system. If both alleles of a diploid organism are the same, the organism is homozygous at that locus. If they are different, the organism is heterozygous at that locus. If one allele is missing, it is hemizygous. The DNA sequence of a gene often varies from one individual to another. Those variations are called alleles. Some genes have only one allele because there is low variation, and others have only one allele because deviation from that allele can be harmful or fatal, but most genes have two or more alleles. The frequency of different alleles varies throughout the population. Some genes may have two alleles with equal distribution.
For other genes, one allele may be common and another allele may be rare. Sometimes, one allele is a disease-causing variation. Sometimes, the different variations in the alleles make no difference at all in the function of the organism. In diploid organisms, one allele is inherited from the male parent and one from the female parent. Zygosity is a description of whether those two alleles have identical or different DNA sequences. In some cases the term "zygosity" is used in the context of a single chromosome. The words homozygous, heterozygous, and hemizygous are used to describe the genotype of a diploid organism at a single locus on the DNA. Homozygous describes a genotype consisting of two identical alleles at a given locus; heterozygous describes a genotype consisting of two different alleles at a locus; hemizygous describes a genotype consisting of only a single copy of a particular gene in an otherwise diploid organism; and nullizygous refers to an otherwise diploid organism in which both copies of the gene are missing. A cell is said to be homozygous for a particular gene when identical alleles of the gene are present on both homologous chromosomes.
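The four terms can be summarized as a simple decision rule on the alleles present at one locus. The sketch below is illustrative only: the function name and the representation of a genotype as a tuple of allele labels are assumptions made for the example, and a "missing" copy is modeled simply as an absent entry.

```python
def zygosity(alleles):
    """Classify one locus of an otherwise diploid organism.

    alleles: a sequence holding the allele labels actually present
    at the locus (zero, one, or two entries).
    """
    count = len(alleles)
    if count == 0:
        return "nullizygous"   # both copies of the gene are missing
    if count == 1:
        return "hemizygous"    # only a single copy is present
    if count == 2:
        # Two copies: identical alleles or two different alleles.
        return "homozygous" if alleles[0] == alleles[1] else "heterozygous"
    raise ValueError("a diploid locus carries at most two alleles")


print(zygosity(("P", "P")))  # homozygous
print(zygosity(("P", "p")))  # heterozygous
print(zygosity(("P",)))      # hemizygous
print(zygosity(()))          # nullizygous
```

Comparing labels is a stand-in for comparing DNA sequences; two alleles count as the same here only if their labels are equal.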
The cell or organism in question is called a homozygote. True breeding organisms are always homozygous for the traits that are to be held constant. An individual that is homozygous-dominant for a particular trait carries two copies of the allele that codes for the dominant trait. This allele, often called the "dominant allele", is normally represented by a capital letter. When an organism is homozygous-dominant for a particular trait, the genotype is represented by a doubling of the symbol for that trait, such as "PP". An individual that is homozygous-recessive for a particular trait carries two copies of the allele that codes for the recessive trait. This allele, often called the "recessive allele", is usually represented by the lowercase form of the letter used for the corresponding dominant trait. The genotype of an organism that is homozygous-recessive for a particular trait is represented by a doubling of the appropriate letter, such as "pp". A diploid organism is heterozygous at a gene locus when its cells contain two different alleles of a gene. The cell or organism is called a heterozygote specifically for the allele in question; heterozygosity therefore refers to a specific genotype.
Heterozygous genotypes are represented by a capital letter and a lowercase letter, such as "Rr" or "Ss". Alternatively, a heterozygote for gene "R" is assumed to be "Rr". The capital letter is usually written first. If the trait in question is determined by simple dominance, a heterozygote will express only the trait coded by the dominant allele, and the trait coded by the recessive allele will not be present. In more complex dominance schemes the results of heterozygosity can be more complex. A heterozygous genotype can have a higher relative fitness than either the homozygous-dominant or homozygous-recessive genotype; this is called a heterozygote advantage. A chromosome in a diploid organism is hemizygous when only one copy is present. The cell or organism is called a hemizygote. Hemizygosity is also observed when one copy of a gene is deleted, or, in the heterogametic sex, when a gene is located on a sex chromosome. Hemizygosity must not be confused with haploinsufficiency, which describes a mechanism for producing a phenotype. For organisms in which the male is heterogametic, such as humans, all X-linked genes are hemizygous in males with normal chromosomes, because they have only one X chromosome and few of the same genes are on the Y chromosome.
Transgenic mice generated through exogenous DNA microinjection of an embryo's pronucleus are also considered to be hemizygous, because the introduced allele is expected to be incorporated into only one copy of any locus. A transgenic individual can later be bred to homozygosity and maintained as an inbred line to reduce the need to confirm the genotype of each individual. In cultured mammalian cells, such as the Chinese hamster ovary cell line, a number of genetic loci are present in a functional hemizygous state, due to mutations or deletions in the other alleles. A nullizygous organism carries two mutant alleles for the same gene. The mutant alleles are both complete loss-of-function or 'null' alleles, so homozygous null and n
Crystal structure
In crystallography, crystal structure is a description of the ordered arrangement of atoms, ions or molecules in a crystalline material. Ordered structures arise from the intrinsic tendency of the constituent particles to form symmetric patterns that repeat along the principal directions of three-dimensional space in matter. The smallest group of particles in the material that constitutes this repeating pattern is the unit cell of the structure. The unit cell reflects the symmetry and structure of the entire crystal, which is built up by repetitive translation of the unit cell along its principal axes. The translation vectors define the nodes of the Bravais lattice. The lengths of the principal axes, or edges, of the unit cell and the angles between them are the lattice constants, also called lattice parameters or cell parameters. The symmetry properties of the crystal are described by the concept of space groups. All possible symmetric arrangements of particles in three-dimensional space may be described by the 230 space groups.
The crystal structure and symmetry play a critical role in determining many physical properties, such as cleavage, electronic band structure, and optical transparency. Crystal structure is described in terms of the geometry of arrangement of particles in the unit cell. The unit cell is defined as the smallest repeating unit having the full symmetry of the crystal structure. The geometry of the unit cell is defined as a parallelepiped, providing six lattice parameters taken as the lengths of the cell edges (a, b, c) and the angles between them (α, β, γ). The positions of particles inside the unit cell are described by the fractional coordinates along the cell edges, measured from a reference point. It is only necessary to report the coordinates of a smallest asymmetric subset of particles. This group of particles may be chosen so that it occupies the smallest physical space, which means that not all particles need to be physically located inside the boundaries given by the lattice parameters. All other particles of the unit cell are generated by the symmetry operations that characterize the symmetry of the unit cell.
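To make the role of the six lattice parameters concrete, the following sketch converts fractional coordinates into Cartesian coordinates for a general (triclinic) cell. The function name is hypothetical, and the orientation is an assumed convention chosen for the example: the a edge lies along x and the b edge lies in the xy-plane. Other conventions exist and give coordinates that differ by a rotation.

```python
import math


def fractional_to_cartesian(frac, a, b, c, alpha, beta, gamma):
    """Convert fractional coordinates (u, v, w) to Cartesian (x, y, z).

    a, b, c: cell edge lengths; alpha, beta, gamma: cell angles in degrees
    (alpha between b and c, beta between a and c, gamma between a and b).
    Assumed convention: a along x, b in the xy-plane.
    """
    cos_a, cos_b, cos_g = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sin_g = math.sin(math.radians(gamma))
    # Cell volume divided by a*b*c; zero or imaginary for impossible angles.
    volume_factor = math.sqrt(
        1 - cos_a**2 - cos_b**2 - cos_g**2 + 2 * cos_a * cos_b * cos_g
    )
    u, v, w = frac
    x = a * u + b * cos_g * v + c * cos_b * w
    y = b * sin_g * v + c * (cos_a - cos_b * cos_g) / sin_g * w
    z = c * volume_factor / sin_g * w
    return (x, y, z)


# Body centre of a cubic cell with a 4-unit edge: every coordinate is 2.
print(fractional_to_cartesian((0.5, 0.5, 0.5), 4, 4, 4, 90, 90, 90))
```

For a cubic cell the conversion reduces to multiplying each fractional coordinate by the edge length; the extra terms only matter when the cell angles differ from 90°.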
The collection of symmetry operations of the unit cell is expressed formally as the space group of the crystal structure. Vectors and planes in a crystal lattice are described by the three-value Miller index notation. This syntax uses the indices ℓ, m, and n as directional parameters. By definition, the syntax (ℓmn) denotes a plane that intercepts the three points a1/ℓ, a2/m, and a3/n, or some multiple thereof; that is, the Miller indices are proportional to the inverses of the intercepts of the plane with the unit cell. If one or more of the indices is zero, it means that the plane does not intersect that axis (the intercept is "at infinity"). A plane containing a coordinate axis is translated so that it no longer contains that axis before its Miller indices are determined. The Miller indices for a plane are integers with no common factors. Negative indices are indicated with horizontal bars, as in (1̄23). In an orthogonal coordinate system for a cubic cell, the Miller indices of a plane are the Cartesian components of a vector normal to the plane. Considering only (ℓmn) planes intersecting one or more lattice points, the distance d between adjacent lattice planes is related to the reciprocal lattice vector orthogonal to the planes by the formula
$d = \frac{2\pi}{|\mathbf{g}_{\ell m n}|}$
The crystallographic directions are geometric lines linking nodes of a crystal.
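The rule that Miller indices are proportional to the inverses of the axis intercepts, reduced to integers with no common factor, can be sketched as follows. The function name is hypothetical; intercepts are given in units of the lattice vectors, and an axis the plane never crosses is passed as math.inf. Exact inputs (integers or Fraction values) are assumed, since floating-point intercepts such as 1/3 would not reduce cleanly.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, inf


def miller_indices(intercepts):
    """Return the Miller indices of a plane from its three axis intercepts."""
    reciprocals = []
    for p in intercepts:
        if p == 0:
            raise ValueError("plane contains an axis; translate it first")
        # A plane parallel to an axis has an index of zero on that axis.
        reciprocals.append(Fraction(0) if p in (inf, -inf) else 1 / Fraction(p))
    # Multiply by the least common multiple of the denominators to clear fractions.
    lcm = reduce(lambda x, y: x * y // gcd(x, y), (r.denominator for r in reciprocals), 1)
    integers = [int(r * lcm) for r in reciprocals]
    # Divide out any remaining common factor.
    common = reduce(gcd, (abs(i) for i in integers))
    if common == 0:
        raise ValueError("at least one intercept must be finite")
    return tuple(i // common for i in integers)


print(miller_indices((1, 2, 3)))                 # (6, 3, 2)
print(miller_indices((1, inf, inf)))             # (1, 0, 0)
print(miller_indices((Fraction(1, 2), 1, inf)))  # (2, 1, 0)
```

A negative intercept yields a negative index, which would be written with a bar in the usual notation.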
The crystallographic planes are geometric planes linking nodes. Some directions and planes have a higher density of nodes. These high-density planes have an influence on the behavior of the crystal as follows:
Optical properties: Refractive index is directly related to density.
Adsorption and reactivity: Physical adsorption and chemical reactions occur at or near surface atoms or molecules. These phenomena are thus sensitive to the density of nodes.
Surface tension: The condensation of a material means that the atoms, ions or molecules are more stable if they are surrounded by other similar species. The surface tension of an interface thus varies according to the density on the surface.
Microstructural defects: Pores and crystallites tend to have straight grain boundaries following higher density planes.
Cleavage: This occurs preferentially parallel to higher density planes.
Plastic deformation: Dislocation glide occurs preferentially parallel to higher density planes. The perturbation carried by the dislocation is along a dense direction.
The shift of one node in a more dense direction requires a lesser distortion of the crystal lattice. Some directions and planes are defined by the symmetry of the crystal system. In monoclinic, rhombohedral and trigonal/hexagonal systems there is one unique axis (sometimes called the principal axis) which has higher rotational symmetry than the other two axes. The basal plane is the plane perpendicular to the principal axis in these crystal systems. For triclinic and cubic crystal systems the axis designation is arbitrary and there is no principal axis. For the special case of simple cubic crystals, the lattice vectors are orthogonal and of equal length. So, in this common case, the Miller indices (ℓmn) and [ℓmn] both denote normals/directions in Cartesian coordinates. For cubic crystals with lattice constant a, the spacing d between adjacent l
Protein Data Bank
The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy and submitted by biologists and biochemists from around the world, are freely accessible on the Internet via the websites of its member organisations. The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB. The PDB is a key resource in areas such as structural genomics. Most major scientific journals, and some funding agencies, now require scientists to submit their structure data to the PDB. Many other databases use protein structures deposited in the PDB. For example, SCOP and CATH classify protein structures, while PDBsum provides a graphic overview of PDB entries using information from other sources, such as Gene Ontology. Two forces converged to initiate the PDB: 1) a small but growing collection of sets of protein structure data determined by X-ray diffraction.
In 1969, with the sponsorship of Walter Hamilton at the Brookhaven National Laboratory, Edgar Meyer began to write software to store atomic coordinate files in a common format to make them available for geometric and graphical evaluation. By 1971, one of Meyer's programs, SEARCH, enabled researchers to remotely access information from the database to study protein structures offline. SEARCH was instrumental in enabling networking, thus marking the functional beginning of the PDB. The Protein Data Bank was announced in October 1971 in Nature New Biology as a joint venture between the Cambridge Crystallographic Data Centre, UK, and Brookhaven National Laboratory, USA. Upon Hamilton's death in 1973, Tom Koetzle took over direction of the PDB for the subsequent 20 years. In January 1994, Joel Sussman of Israel's Weizmann Institute of Science was appointed head of the PDB. In October 1998, the PDB was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB). The new director was Helen M. Berman of Rutgers University.
In 2003, with the formation of the wwPDB, the PDB became an international organization. The founding members are PDBe, RCSB, and PDBj. The BMRB joined in 2006. Each of the four members of wwPDB can act as a deposition, data processing and distribution center for PDB data. The data processing refers to the fact that wwPDB staff review and annotate each submitted entry. The data are then automatically checked for plausibility. The PDB database and its holdings list are updated weekly. As of 17 October 2018, the holdings included the following:
120,052 structures in the PDB have a structure factor file.
9,734 structures have an NMR restraint file.
3,486 structures in the PDB have a chemical shifts file.
2,531 structures in the PDB have a 3DEM map file deposited in the EM Data Bank.
These data show that most structures are determined by X-ray diffraction, but about 10% of structures are now determined by protein NMR. When using X-ray diffraction, approximations of the coordinates of the atoms of the protein are obtained, whereas estimations of the distances between pairs of atoms of the protein are found through NMR experiments.
Therefore, the final conformation of the protein is obtained, in the latter case, by solving a distance geometry problem. A few proteins are determined by cryo-electron microscopy. The significance of the structure factor files, mentioned above, is that, for PDB structures determined by X-ray diffraction that have a structure factor file, the electron density map may be viewed. The data of such structures are stored on the "electron density server". In the past, the number of structures in the PDB has grown at an approximately exponential rate, passing the 100 registered structures milestone in 1982, 1,000 in 1993, 10,000 in 1999, and 100,000 in 2014. However, since 2007, the rate of accumulation of new protein structures appears to have plateaued. The file format initially used by the PDB was called the PDB file format. This original format was restricted by the width of computer punch cards to 80 characters per line. Around 1996, the "macromolecular Crystallographic Information File" format, mmCIF, which is an extension of the CIF format, started to be phased in.
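Because the original PDB format is column-based, a coordinate line can be read by slicing fixed character ranges. The sketch below parses a single ATOM record; the function name and the sample values are made up for illustration, and the column positions follow the commonly documented layout of the legacy format, which should be checked against the official format description before being relied on.

```python
def parse_atom_record(line):
    """Extract a few fields from one ATOM/HETATM line of a legacy PDB file."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM or HETATM record")
    line = line.ljust(80)  # tolerate lines with trailing blanks stripped
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain_id": line[21],             # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
    }


# An illustrative 80-character record assembled field by field.
sample = (
    "ATOM  "      # record name
    "    1"       # atom serial number
    " "
    " N  "        # atom name
    " "           # alternate location indicator
    "VAL"         # residue name
    " "
    "A"           # chain identifier
    "   1"        # residue sequence number
    "    "        # insertion code and padding
    "   6.204"    # x coordinate
    "  16.869"    # y coordinate
    "   4.854"    # z coordinate
    "  1.00"      # occupancy
    " 49.05"      # temperature factor
    "          "
    " N"          # element symbol
    "  "          # charge
)
print(len(sample))
print(parse_atom_record(sample))
```

The fixed 80-character width is what makes slicing by position possible; mmCIF files are instead organized as named data items and need a different kind of parser.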
mmCIF is now the master format for the PDB archive. An XML version of this format, called PDBML, was described in 2005. The structure files can be downloaded in any of these three formats. In fact, individual files can be downloaded into graphics packages using web addresses:
For PDB format files, use, e.g., http://www.pdb.org/pdb/files/4hhb.pdb.gz or http://pdbe.org/download/4hhb
For PDBML files, use, e.g., http://www.pdb.org/pdb/files/4hhb.xml.gz or http://pdbe.org/pdbml/4hhb
The "4hhb" is the PDB identifier. Each structure published in the PDB receives a four-character alphanumeric identifier, its PDB ID. The structure files may be viewed using one of several free and open source computer programs, including Jmol, Pymol, VMD, and Rasmol. Other non-free, shareware programs