1.
Jmol
–
Jmol is computer software for molecular modelling of chemical structures in three dimensions. It returns a 3D representation of a molecule that may be used as a teaching tool. It is written in the programming language Java, so it can run on the operating systems Windows, macOS, Linux, and Unix if Java is installed. It is free and open-source software released under the GNU Lesser General Public License version 2.0. A standalone application and a software development kit exist that can be integrated into other Java applications, such as Bioclipse and Taverna. A popular feature is an applet that can be integrated into web pages to display molecules in a variety of ways. For example, molecules can be displayed as ball-and-stick models, space-filling models, ribbon diagrams, etc. Jmol supports a range of chemical file formats, including Protein Data Bank, Crystallographic Information File, and MDL Molfile. There is also a JavaScript-only version, JSmol, that can be used on computers with no Java. The Jmol applet, among other abilities, offers an alternative to the Chime plug-in, which is no longer under active development. While Jmol has many features that Chime lacks, it does not claim to reproduce all Chime functions. Most notably, Chime requires plug-in installation and Internet Explorer 6.0 or Firefox 2.0 on Microsoft Windows, or Netscape Communicator 4.8 on Mac OS 9. Jmol requires Java installation and operates on a variety of platforms. For example, Jmol is fully functional in Mozilla Firefox, Internet Explorer, Opera, and Google Chrome.
2.
ChEMBL
–
ChEMBL or ChEMBLdb is a manually curated chemical database of bioactive molecules with drug-like properties. It is maintained by the European Bioinformatics Institute (EBI), of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton. The database, originally known as StARlite, was developed by a biotechnology company called Inpharmatica Ltd., later acquired by Galapagos NV. The data was acquired for EMBL in 2008 with an award from The Wellcome Trust, resulting in the creation of the ChEMBL chemogenomics group at EMBL-EBI. The ChEMBL database contains compound bioactivity data against drug targets. Bioactivity is reported in Ki, Kd, IC50, and EC50. Data can be filtered and analyzed to develop compound screening libraries for lead identification during drug discovery. ChEMBL version 2 was launched in January 2010, including 2.4 million bioassay measurements covering 622,824 compounds. This was obtained from curating over 34,000 publications across twelve medicinal chemistry journals. ChEMBL's coverage of available bioactivity data has grown to become the most comprehensive ever seen in a public database. In October 2010 ChEMBL version 8 was launched, with over 2.97 million bioassay measurements covering 636,269 compounds. ChEMBL_10 saw the addition of the PubChem confirmatory assays, in order to integrate comparable data. ChEMBLdb can be accessed via a web interface or downloaded by File Transfer Protocol. It is formatted in a manner amenable to computerized data mining. ChEMBL is also integrated into other large-scale chemistry resources, including PubChem and the ChemSpider system of the Royal Society of Chemistry. In addition to the database, the ChEMBL group has developed tools. These include Kinase SARfari, an integrated chemogenomics workbench focussed on kinases.
The system incorporates and links sequence, structure, compounds and screening data. The primary purpose of ChEMBL-NTD is to provide a freely accessible and permanent archive and distribution centre for deposited data. July 2012 saw the release of a new data service, sponsored by the Medicines for Malaria Venture. The data in this service includes compounds from the Malaria Box screening set. myChEMBL, the ChEMBL virtual machine, was released in October 2013 to allow users to access a complete and free, easy-to-install cheminformatics infrastructure. In December 2013, the operations of the SureChem patent informatics database were transferred to EMBL-EBI. In a portmanteau, SureChem was renamed SureChEMBL. 2014 saw the introduction of the new resource ADME SARfari, a tool for predicting and comparing cross-species ADME targets.
3.
ChemSpider
–
ChemSpider is a database of chemicals. ChemSpider is owned by the Royal Society of Chemistry. The database contains information on more than 50 million molecules from over 500 data sources. Each chemical is given a unique identifier, which forms part of a corresponding URL. This is an approach to developing an online chemistry database. The search can be used to widen or restrict already found results. Structure searching on mobile devices can be done using free apps for iOS and for Android. The ChemSpider database has been used in combination with text mining as the basis of document markup. The result is a system linking chemistry documents and information look-up via ChemSpider into over 150 data sources. ChemSpider was acquired by the Royal Society of Chemistry in May 2009. Prior to the acquisition by RSC, ChemSpider was controlled by a private corporation, ChemZoo Inc. The system was first launched in March 2007. ChemSpider has expanded the generic support of a database to include support of the Wikipedia chemical structure collection via their WiChempedia implementation. A number of services are available online. SyntheticPages is an interactive database of synthetic chemistry procedures operated by the Royal Society of Chemistry. Users submit synthetic procedures which they have conducted themselves for publication on the site. These procedures may be original works, but they are more often based on literature reactions. Citations to the published procedure are made where appropriate. They are checked by an editor before posting. The pages do not undergo formal peer review like a journal article. The comments are moderated by scientific editors.
The intention is to collect practical experience of how to conduct useful chemical synthesis in the lab. While experimental methods published in an ordinary academic journal are listed formally and concisely, the procedures in ChemSpider SyntheticPages are given with more practical detail. Comments by submitters are included as well. Other publications with comparable amounts of detail include Organic Syntheses and Inorganic Syntheses.
4.
DrugBank
–
The DrugBank database is a comprehensive, freely accessible, online database containing information on drugs and drug targets. As both a bioinformatics and a cheminformatics resource, DrugBank combines detailed drug data with comprehensive drug target information. Because of its scope, comprehensive referencing and unusually detailed data descriptions, links to DrugBank are maintained for nearly all drugs listed in Wikipedia. DrugBank is widely used by the drug industry, medicinal chemists, pharmacists, physicians, students and the general public. Its extensive drug and drug-target data has enabled the discovery and repurposing of a number of existing drugs to treat rare illnesses. The latest release of the database contains 8227 drug entries, including 2003 FDA-approved small molecule drugs, 221 FDA-approved biotech drugs, 93 nutraceuticals and over 6000 experimental drugs. Additionally, 4270 non-redundant protein sequences are linked to these drug entries. Each DrugCard entry contains more than 200 data fields, with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data. Four additional databases, HMDB, T3DB, SMPDB and FooDB, are also part of a suite of metabolomic/cheminformatic databases. The first version of DrugBank was released in 2006. This early release contained relatively modest information about 841 FDA-approved small molecule drugs and 113 biotech drugs. It also included information on 2133 drug targets. The second version of DrugBank was released in 2009. This greatly expanded and improved version of the database included 1344 approved small molecule drugs and 123 biotech drugs as well as 3037 unique drug targets. Version 2.0 also included, for the first time, withdrawn drugs and illicit drugs. Version 3.0 was released in 2011.
This version contained 1424 approved small molecule drugs and 132 biotech drugs as well as >4000 unique drug targets. Version 3.0 also included drug transporter data, drug pathway data, drug pricing, patent and manufacturing data as well as data on >5000 experimental drugs. Version 4.0 was released in 2014. This version included 1558 FDA-approved small molecule drugs, 155 biotech drugs and 4200 unique drug targets. Version 4.0 also incorporated information on drug metabolites, drug taxonomy, drug spectra, and drug binding constants. Table 1 provides a complete statistical summary of the history of DrugBank's development. All data in DrugBank is non-proprietary or is derived from a non-proprietary source. It is freely accessible and available to anyone. In addition, nearly every item is fully traceable and explicitly referenced to the original source. DrugBank data is available through a web interface and downloads.
5.
European Chemicals Agency
–
ECHA is the driving force among regulatory authorities in implementing the EU's chemicals legislation. ECHA helps companies to comply with the legislation, advances the safe use of chemicals, and provides information on chemicals. It is located in Helsinki, Finland. The Agency, headed by Executive Director Geert Dancet, started working on 1 June 2007. The REACH Regulation requires companies to provide information on the hazards, risks and safe use of chemical substances that they manufacture or import. Companies register this information with ECHA and it is freely available on their website. So far, thousands of the most hazardous and the most commonly used substances have been registered. The information is technical but gives detail on the impact of each chemical on people and the environment. This also gives European consumers the right to ask whether the goods they buy contain dangerous substances. The Classification, Labelling and Packaging Regulation introduces a globally harmonised system for classifying and labelling chemicals into the EU. This worldwide system makes it easier for workers and consumers to know the effects of chemicals. Companies need to notify ECHA of the classification and labelling of their chemicals. So far, ECHA has received over 5 million notifications for more than 100,000 substances. The information is freely available on their website. Consumers can check chemicals in the products they use. Biocidal products include, for example, insect repellents and disinfectants used in hospitals. The Biocidal Products Regulation ensures that there is information about these products so that consumers can use them safely. ECHA is responsible for implementing the regulation. The law on Prior Informed Consent sets guidelines for the export and import of hazardous chemicals. Through this mechanism, countries due to receive hazardous chemicals are informed in advance and have the possibility of rejecting their import.
Substances that may have effects on human health and the environment are identified as Substances of Very High Concern (SVHCs). These are mainly substances which cause cancer, mutation or are toxic to reproduction, as well as substances which persist in the body or the environment. Other substances considered as SVHCs include, for example, endocrine disrupting chemicals. Companies manufacturing or importing articles containing these substances in a concentration above a specified threshold are required to inform users about the presence of the substance and therefore how to use it safely. Consumers have the right to ask the retailer whether these substances are present in the products they buy. Once a substance has been officially identified in the EU as being of very high concern, it will be added to the Candidate List. This list is available on ECHA's website and shows consumers and industry which chemicals are identified as SVHCs. Substances placed on the Candidate List can then move to another list.
6.
IUPHAR/BPS
–
The IUPHAR/BPS Guide to PHARMACOLOGY is an open-access website, acting as a portal to information on the biological targets of licensed drugs and other small molecules. The Guide to PHARMACOLOGY (GtoPdb) is developed as a joint venture between the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS). This replaces and expands upon the original 2009 IUPHAR Database (IUPHAR-DB). The information featured includes pharmacological data and target and gene nomenclature. Overviews and commentaries on each target family are included, with links to key references. The Guide to PHARMACOLOGY was initially made available online in December 2011, with additional material released in July 2012. Its network of over 700 specialist advisors contributes expertise and data. The current PI and grant holder of the GtoPdb project is Prof. Jamie A. Davies. The development and release of the first version of the GtoPdb in 2012 was described in an editorial published in the British Journal of Pharmacology entitled "Guide to Pharmacology.org - an update". The IUPHAR-DB is no longer being developed and all the information contained within this site is now available through the Guide to PHARMACOLOGY. A complete list of all the approved drugs included on the website is available via the ligand list. The Guide to PHARMACOLOGY is being expanded to include information on targets and ligands. Search features on the website include quick and advanced search options. Other features include "Hot topic" news items and a recent receptor-ligand pairing list. A hard copy summary of the database is published as The Concise Guide to Pharmacology 2015/2016, a series of papers forming a bi-annual supplement to the British Journal of Pharmacology. The Guide to PHARMACOLOGY includes links to other relevant resources via target pages. Many of these resources maintain reciprocal links with the relevant Guide to PHARMACOLOGY pages.
As of November 2015 the Wellcome Trust is supporting a new project to develop the Guide to Immunopharmacology. The Guide to PHARMACOLOGY continues to be supported by the British Pharmacological Society.
7.
PubChem
–
PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information, a component of the National Library of Medicine. PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be downloaded via FTP. PubChem contains substance descriptions and small molecules with fewer than 1000 atoms and 1000 bonds. More than 80 database vendors contribute to the growing PubChem database. PubChem consists of three dynamically growing primary databases. As of 28 January 2016: Compounds, 82.6 million entries, contains pure and characterized chemical compounds. Substances, 198 million entries, also contains mixtures, extracts and complexes. BioAssay contains bioactivity results from 1.1 million high-throughput screening programs with several million values. PubChem contains its own online molecule editor with SMILES/SMARTS and InChI support that allows the import and export of all common chemical file formats to search for structures and fragments. In the text search form the database fields can be searched by adding the field name in square brackets to the search term. A numeric range is represented by two numbers separated by a colon. The search terms and field names are case-insensitive. Parentheses and the logical operators AND, OR, and NOT can be used. AND is assumed if no operator is used. PubChem was released in 2004. The American Chemical Society (ACS) has raised concerns about the publicly supported PubChem database. They have a strong interest in the issue since the Chemical Abstracts Service generates a large percentage of the society's revenue. To advocate their position against the PubChem database, ACS has actively lobbied the US Congress. Soon after PubChem's creation, the American Chemical Society lobbied the U.S. Congress to restrict the operation of PubChem, which they asserted competes with their Chemical Abstracts Service.
8.
International Chemical Identifier
–
Initially developed by IUPAC and NIST from 2000 to 2005, the InChI format and algorithms are non-proprietary. The continuing development of the standard has been supported since 2010 by the not-for-profit InChI Trust. The current version is 1.04 and was released in September 2011. Prior to 1.04, the software was freely available under the open source LGPL license, but it now uses a custom license called IUPAC-InChI Trust License. Not all layers have to be provided. For instance, a layer can be omitted if that type of information is not relevant to the particular application. InChIs can thus be seen as akin to a general and extremely formalized version of IUPAC names. They can express more information than the simpler SMILES notation and differ in that every structure has a unique InChI string, which is important in database applications. Information about the 3-dimensional coordinates of atoms is not represented in InChI. The InChI algorithm converts input structural information into a unique InChI identifier in a three-step process: normalization, canonicalization, and serialization. The InChIKey, sometimes referred to as a hashed InChI, is a fixed-length condensed digital representation of the InChI that is not human-understandable. The InChIKey specification was released in September 2007 in order to facilitate web searches for chemical compounds. Unlike the InChI, the InChIKey is not unique: though collisions can be calculated to be very rare, they happen. In January 2009 the final 1.02 version of the InChI software was released. This provided a means to generate so-called standard InChI, which does not allow for user-selectable options in dealing with the stereochemistry and tautomeric layers of the InChI string.
The standard InChIKey is then the hashed version of the standard InChI string. The standard InChI will simplify comparison of InChI strings and keys generated by different groups, and subsequently accessed via diverse sources such as databases and web resources. Every InChI starts with the string InChI= followed by the version number. This is followed by the letter S for standard InChIs. The remaining information is structured as a sequence of layers and sub-layers. The layers and sub-layers are separated by the delimiter / and start with a characteristic prefix letter. There are six layers with important sublayers. The main layer has the following sublayers. Chemical formula: this is the only sublayer that must occur in every InChI. Atom connections: the atoms in the formula are numbered in sequence, and this sublayer describes which atoms are connected by bonds to which other ones. Hydrogen atoms: this sublayer describes how many hydrogen atoms are connected to each of the other atoms. The condensed, 27-character standard InChIKey is a hashed version of the full standard InChI, designed to allow for easy web searches of chemical compounds. Most chemical structures on the Web up to 2007 have been represented as GIF files. The full InChI turned out to be too lengthy for easy searching, and therefore the InChIKey was developed. With all databases currently having below 50 million structures, such duplication appears unlikely at present. A recent study examines the collision rate more extensively, finding that the experimental collision rate is in agreement with the theoretical expectations. Example: morphine has the structure shown on the right. As the InChI cannot be reconstructed from the InChIKey, an InChIKey always needs to be linked to the original InChI to get back to the original structure.
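The layer structure described above can be illustrated with a short sketch. It is a simplified reading of the format, not the official InChI software: the function name `split_inchi` is hypothetical, the first layer is assumed to be the chemical formula, and each later sublayer is assumed to begin with a one-letter prefix. The ethanol string is used as sample input.

```python
def split_inchi(inchi):
    """Split an InChI string into version, standard flag, formula and sublayers.

    Illustrative only: assumes the first layer is the chemical formula and
    that every following sublayer starts with a single prefix letter.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("an InChI must start with 'InChI='")
    # Layers and sublayers are separated by the delimiter '/'.
    version_part, *layers = inchi[len(prefix):].split("/")
    # A trailing 'S' after the version number marks a standard InChI.
    is_standard = version_part.endswith("S")
    version = version_part[:-1] if is_standard else version_part
    formula = layers[0] if layers else ""
    # Keep (prefix letter, content) pairs in order; a list is used because
    # the same prefix letter may appear more than once in a longer InChI.
    sublayers = [(layer[0], layer[1:]) for layer in layers[1:] if layer]
    return {
        "version": version,
        "standard": is_standard,
        "formula": formula,
        "sublayers": sublayers,
    }


ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol["formula"])    # chemical formula sublayer
print(ethanol["sublayers"])  # connection (c) and hydrogen (h) sublayers
```

Splitting on the delimiter is enough to inspect an identifier, but generating or validating an InChI requires the normalization and canonicalization steps of the real algorithm.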
9.
Simplified molecular-input line-entry system
–
The simplified molecular-input line-entry system (SMILES) is a specification in the form of a line notation for describing the structure of chemical species using short ASCII strings. SMILES strings can be imported by most molecule editors for conversion back into two-dimensional drawings or three-dimensional models of the molecules. The original SMILES specification was initiated by David Weininger at the USEPA Mid-Continent Ecology Division Laboratory in Duluth in the 1980s. The Environmental Protection Agency funded the project to develop SMILES. It has since been modified and extended by others, most notably by Daylight Chemical Information Systems. In 2007, a standard called OpenSMILES was developed by the Blue Obelisk open-source chemistry community. Other linear notations include the Wiswesser Line Notation, ROSDAL and SLN. In July 2006, the IUPAC introduced the InChI as a standard for formula representation. SMILES is generally considered to have the advantage of being slightly more human-readable than InChI. The term SMILES refers to a line notation for encoding molecular structures, and specific instances should strictly be called SMILES strings. However, the term SMILES is also used to refer to both a single SMILES string and a number of SMILES strings; the exact meaning is usually apparent from the context. The terms canonical and isomeric can lead to confusion when applied to SMILES. The terms describe different attributes of SMILES strings and are not mutually exclusive. Typically, a number of equally valid SMILES strings can be written for a molecule.
For example, CCO, OCC and C(O)C all specify the structure of ethanol. Algorithms have been developed to generate the same SMILES string for a given molecule; of the many possible strings, these algorithms choose only one. This SMILES is unique for each structure, although dependent on the algorithm used to generate it, and is termed the canonical SMILES. These algorithms first convert the SMILES to a representation of the molecular structure. A common application of canonical SMILES is indexing and ensuring uniqueness of molecules in a database. There is currently no systematic comparison across commercial software to test if such flaws exist in those packages. SMILES notation allows the specification of configuration at tetrahedral centers. These are structural features that cannot be specified by connectivity alone, and SMILES which encode this information are termed isomeric SMILES.
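To see why several different strings can describe one molecule, the following sketch parses a very small subset of SMILES (single-letter atoms, implicit single bonds, and parentheses for branches) and compares the resulting graphs. The function names are hypothetical, and the comparison uses a simple neighbour-based signature that is enough for this example; it is not one of the real canonicalization algorithms and would not distinguish every pair of molecules.

```python
def parse_simple_smiles(smiles):
    """Parse a tiny SMILES subset into (atoms, bonds).

    Supported: single uppercase-letter atoms (C, O, N, ...), implicit single
    bonds between consecutive atoms, and parentheses for branches.
    """
    atoms = []       # element symbol of each atom, by index
    bonds = set()    # pairs (i, j) of bonded atom indices
    previous = None  # index of the atom the next atom attaches to
    branch_stack = []
    for ch in smiles:
        if ch == "(":
            branch_stack.append(previous)   # remember the branch point
        elif ch == ")":
            previous = branch_stack.pop()   # return to the branch point
        elif ch.isalpha() and ch.isupper():
            atoms.append(ch)
            index = len(atoms) - 1
            if previous is not None:
                bonds.add((previous, index))
            previous = index
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return atoms, bonds


def graph_signature(atoms, bonds):
    """Order-independent summary: each atom with its sorted neighbour elements."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(atoms[b])
        neighbours[b].append(atoms[a])
    return sorted((atoms[i], tuple(sorted(neighbours[i]))) for i in neighbours)


signatures = {s: graph_signature(*parse_simple_smiles(s))
              for s in ("CCO", "OCC", "C(O)C", "CC")}
# The three ethanol strings share one signature; ethane (CC) does not.
print(signatures["CCO"] == signatures["OCC"] == signatures["C(O)C"])
print(signatures["CCO"] == signatures["CC"])
```

A production toolkit would instead build the full molecular graph, including hydrogens, bond orders, rings and aromaticity, and then emit one canonical string per structure.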
10.
Chemical formula
–
Chemical formulas are limited to a single typographic line of symbols, which may include subscripts and superscripts. A chemical formula is not a name, and it contains no words. Although a chemical formula may imply certain simple chemical structures, it is not the same as a full chemical structural formula. Chemical formulas can fully specify the structure of only the simplest of molecules and chemical substances. The simplest types of chemical formulas are called empirical formulas, which use letters and numbers indicating the numerical proportions of atoms of each type. Molecular formulas indicate the numbers of each type of atom in a molecule. For example, the empirical formula for glucose is CH2O, while its molecular formula is C6H12O6. Showing structure in a formula is possible if the relevant bonding is easy to show in one dimension. An example is the condensed molecular/chemical formula for ethanol, which is CH3-CH2-OH or CH3CH2OH. For reasons of structural complexity, there is no condensed chemical formula that specifies glucose. Chemical formulas may be used in chemical equations to describe chemical reactions and other chemical transformations, such as the dissolving of ionic compounds into solution. A chemical formula identifies each constituent element by its chemical symbol. In empirical formulas, these proportions begin with a key element and then assign numbers of atoms of the other elements in the compound, as ratios to the key element. For molecular compounds, these numbers can all be expressed as whole numbers. For example, the formula of ethanol may be written C2H6O because the molecules of ethanol all contain two carbon atoms, six hydrogen atoms, and one oxygen atom. Some types of compounds, however, cannot be written with entirely whole-number empirical formulas. An example is boron carbide, whose formula of CBn is a variable non-whole number ratio with n ranging from over 4 to more than 6.5. When the chemical compound of the formula consists of simple molecules, chemical formulas can also suggest the structure of the molecule.
These types of formulas are known as molecular formulas and condensed formulas. A molecular formula enumerates the number of atoms to reflect those in the molecule, so that the formula for glucose is C6H12O6 rather than the glucose empirical formula. However, except for very simple substances, molecular chemical formulas lack needed structural information. For simple molecules, a condensed formula is a type of chemical formula that may fully imply a correct structural formula. For example, ethanol may be represented by the chemical formula CH3CH2OH.
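The relationship between molecular, condensed and empirical formulas can be checked with a small sketch. It assumes formulas written without parentheses, charges or hydrates, and the function names `parse_formula` and `empirical_formula` are illustrative only. Atom counts are reduced by their greatest common divisor to obtain the empirical formula.

```python
import re
from functools import reduce
from math import gcd


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' or 'CH3CH2OH'.

    Assumes element symbols followed by optional whole-number counts, with
    no parentheses, charges or hydrates.
    """
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def empirical_formula(formula):
    """Reduce the atom counts to the smallest whole-number proportions."""
    counts = parse_formula(formula)
    divisor = reduce(gcd, counts.values())
    parts = []
    for symbol, count in counts.items():
        reduced = count // divisor
        parts.append(symbol + (str(reduced) if reduced > 1 else ""))
    return "".join(parts)


print(empirical_formula("C6H12O6"))   # glucose reduces to CH2O
print(parse_formula("CH3CH2OH"))      # condensed ethanol: 2 C, 6 H, 1 O
```

The condensed formula CH3CH2OH and the molecular formula C2H6O give the same atom counts; only the condensed form hints at how the atoms are connected.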
11.
Acid dissociation constant
–
An acid dissociation constant, Ka, is a quantitative measure of the strength of an acid in solution. It is the equilibrium constant for a chemical reaction known as dissociation in the context of acid–base reactions. In the example shown in the figure, HA represents acetic acid, and A− represents the acetate ion. The chemical species HA, A− and H3O+ are said to be in equilibrium when their concentrations do not change with the passing of time. The definition can then be written more simply as $\mathrm{HA} \rightleftharpoons \mathrm{A^-} + \mathrm{H^+}$, $K_\mathrm{a} = \frac{[\mathrm{A^-}][\mathrm{H^+}]}{[\mathrm{HA}]}$. This is the definition in common usage. A weak acid has a pKa value in the approximate range −2 to 12 in water. pKa values for strong acids can, however, be estimated by theoretical means. The definition can be extended to non-aqueous solvents, such as acetonitrile and dimethylsulfoxide. Denoting a solvent molecule by S, $\mathrm{HA} + \mathrm{S} \rightleftharpoons \mathrm{A^-} + \mathrm{SH^+}$, $K_\mathrm{a} = \frac{[\mathrm{A^-}][\mathrm{SH^+}]}{[\mathrm{HA}][\mathrm{S}]}$. When the concentration of solvent molecules can be taken to be constant, $K_\mathrm{a} = \frac{[\mathrm{A^-}][\mathrm{SH^+}]}{[\mathrm{HA}]}$, as before. The value of pKa also depends on the structure of the acid in many ways. For example, Pauling proposed two rules: one for successive pKa of polyprotic acids, and one to estimate the pKa of oxyacids based on the number of =O and −OH groups. Other structural factors that influence the magnitude of the dissociation constant include inductive effects and mesomeric effects. Hammett type equations have frequently been applied to the estimation of pKa. The quantitative behaviour of acids and bases in solution can be understood only if their pKa values are known. These calculations find application in different areas of chemistry, biology, and medicine. Acid dissociation constants are essential in aquatic chemistry and chemical oceanography. In living organisms, acid–base homeostasis and enzyme kinetics are dependent on the pKa values of the acids and bases present in the cell. According to Arrhenius's original definition, an acid is a substance that dissociates in solution, releasing the hydrogen ion H+.
The equilibrium constant for this reaction is known as a dissociation constant. Brønsted and Lowry generalised this further to an exchange reaction.
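As a worked illustration of the dissociation constant, the sketch below computes Ka from equilibrium concentrations using the common definition Ka = [A−][H+]/[HA], and converts it to pKa using the standard relation pKa = −log10(Ka). The function names and the concentrations are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def ka_from_concentrations(a_minus, h_plus, ha):
    """Ka = [A-][H+] / [HA], with all concentrations in mol/L at equilibrium."""
    if ha <= 0:
        raise ValueError("[HA] must be positive")
    return a_minus * h_plus / ha


def pka_from_ka(ka):
    """pKa is the negative base-10 logarithm of Ka."""
    if ka <= 0:
        raise ValueError("Ka must be positive")
    return -math.log10(ka)


# Hypothetical equilibrium concentrations for a weak acid (illustrative values).
ka = ka_from_concentrations(a_minus=1.0e-3, h_plus=1.0e-3, ha=0.1)
print(ka)               # about 1e-5
print(pka_from_ka(ka))  # about 5, inside the weak-acid range quoted above
```

Because pKa is a logarithm, a one-unit change in pKa corresponds to a tenfold change in Ka.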
12.
Refractive index
–
In optics, the refractive index or index of refraction n of a material is a dimensionless number that describes how light propagates through that medium. It is defined as $n = c/v$, where c is the speed of light in vacuum and v is the phase velocity of light in the medium. For example, the refractive index of water is 1.333, meaning that light travels 1.333 times faster in a vacuum than it does in water. The refractive index determines how light is bent, or refracted, when entering a material. The refractive indices also determine the amount of light that is reflected when reaching the interface, as well as the angle for total internal reflection. The definition implies that vacuum has a refractive index of 1. The refractive index varies with the wavelength of light. This is called dispersion and causes the splitting of white light into its constituent colors in prisms and rainbows, and chromatic aberration in lenses. Light propagation in absorbing materials can be described using a complex-valued refractive index. The imaginary part then handles the attenuation, while the real part accounts for refraction. The concept of refractive index is widely used within the full electromagnetic spectrum, from X-rays to radio waves. It can also be used with wave phenomena such as sound. In this case the speed of sound is used instead of that of light, and a reference medium other than vacuum must be chosen. Thomas Young was presumably the person who first used, and invented, the name "index of refraction". At the same time he changed this value of refractive power into a single number, instead of the traditional ratio of two numbers. The ratio had the disadvantage of different appearances. Newton, who called it the proportion of the sines of incidence and refraction, wrote it as a ratio of two numbers, like 529 to 396. Hauksbee, who called it the ratio of refraction, wrote it as a ratio with a fixed numerator. Hutton wrote it as a ratio with a fixed denominator, like 1.3358 to 1. Young did not use a symbol for the index of refraction. In the next years, others started using different symbols: n, m, and µ.
For visible light most transparent media have refractive indices between 1 and 2. A few examples are given in the adjacent table. These values are measured at the yellow doublet D-line of sodium, with a wavelength of 589 nanometers. Gases at atmospheric pressure have refractive indices close to 1 because of their low density. Almost all solids and liquids have refractive indices above 1.3. Aerogel is a very low density solid that can be produced with refractive index in the range from 1.002 to 1.265. Moissanite lies at the other end of the range with a refractive index as high as 2.65. Most plastics have refractive indices in the range from 1.3 to 1.7. For infrared light refractive indices can be considerably higher.
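The definition n = c/v can be turned into a short calculation. The sketch below computes the speed of light in a medium from its refractive index and, as an addition not spelled out in the text above, applies Snell's law (n1 sin θ1 = n2 sin θ2) to find the refraction angle, returning `None` when total internal reflection occurs. The function names are illustrative, and the value 1.333 for water is the one quoted earlier.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def speed_in_medium(n):
    """Speed of light in a medium of refractive index n, from n = c / v."""
    if n <= 0:
        raise ValueError("refractive index must be positive")
    return C_VACUUM / n


def refraction_angle(n1, n2, incidence_rad):
    """Refraction angle in radians from Snell's law, n1 sin(t1) = n2 sin(t2).

    Returns None when no refracted ray exists (total internal reflection).
    """
    sine = n1 * math.sin(incidence_rad) / n2
    if abs(sine) > 1.0:
        return None
    return math.asin(sine)


N_WATER = 1.333
print(speed_in_medium(N_WATER))  # slower than C_VACUUM by a factor of 1.333

# Light passing from air (n taken as 1.0) into water at 30 degrees incidence.
theta = refraction_angle(1.0, N_WATER, math.radians(30.0))
print(math.degrees(theta))

# Going from water to air at a steep angle gives total internal reflection.
print(refraction_angle(N_WATER, 1.0, math.radians(60.0)))
```

Treating air as having n = 1.0 is an approximation that follows from the statement that gases at atmospheric pressure have refractive indices close to 1.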
13.
Relative permittivity
–
The relative permittivity of a material is its permittivity expressed as a ratio relative to the permittivity of vacuum. Permittivity is a property that affects the Coulomb force between two point charges in the material. Relative permittivity is the factor by which the electric field between the charges is decreased relative to vacuum. Likewise, relative permittivity is the ratio of the capacitance of a capacitor using that material as a dielectric to that of a similar capacitor with vacuum as its dielectric. Relative permittivity is also commonly known as dielectric constant, a term deprecated in physics and engineering as well as in chemistry. Relative permittivity is typically denoted as εr and is defined as $\varepsilon_\mathrm{r} = \varepsilon / \varepsilon_0$, where ε is the complex frequency-dependent absolute permittivity of the material, and ε0 is the vacuum permittivity. Relative permittivity is a number that is in general complex-valued, with real and imaginary parts. The relative permittivity of a medium is related to its electric susceptibility, χe, as $\varepsilon_\mathrm{r} = 1 + \chi_\mathrm{e}$. In anisotropic media the relative permittivity is a second rank tensor. The relative permittivity of a material for a frequency of zero is known as its static relative permittivity. The historical term for the relative permittivity is dielectric constant. It is still commonly used, but has been deprecated by standards organizations because of its ambiguity, as some older authors used it for the absolute permittivity ε. The permittivity may be quoted either as a static property or as a frequency-dependent variant. It has also been used to refer to only the real component of the complex-valued relative permittivity. In the causal theory of waves, permittivity is a complex quantity. The imaginary part corresponds to a phase shift of the polarization P relative to E and leads to the attenuation of electromagnetic waves passing through the medium. By definition, the relative permittivity of vacuum is equal to 1.
The relative static permittivity, εr, can be measured for static electric fields as follows. First the capacitance of a test capacitor, C0, is measured with vacuum between its plates. Then, using the same capacitor and distance between its plates, the capacitance Cx with a dielectric between the plates is measured. The relative permittivity can then be calculated as εr = Cx / C0. For time-variant electromagnetic fields, this quantity becomes frequency-dependent. An indirect technique to calculate εr is conversion of radio frequency S-parameter measurement results. Descriptions of frequently used S-parameter conversions for determining the frequency-dependent εr of dielectrics can be found in the literature. Alternatively, resonance-based effects may be employed at fixed frequencies. The relative permittivity is an important piece of information when designing capacitors.
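The capacitance-ratio measurement described above can be sketched in a few lines of Python. This is only an illustration: the function name and the two capacitance readings are hypothetical, not values from any real measurement.

```python
def relative_permittivity(c_dielectric: float, c_vacuum: float) -> float:
    """Return eps_r = Cx / C0 for the same capacitor geometry.

    c_dielectric: capacitance Cx measured with the dielectric between the plates
    c_vacuum:     capacitance C0 measured with vacuum between the plates
    Both values must use the same unit (for example farads).
    """
    if c_vacuum <= 0 or c_dielectric <= 0:
        raise ValueError("capacitances must be positive")
    return c_dielectric / c_vacuum


# Hypothetical readings: 100 pF in vacuum, 470 pF with the sample inserted.
eps_r = relative_permittivity(470e-12, 100e-12)
print(f"relative static permittivity: {eps_r:.2f}")
```

Because only the ratio matters, the plate area and spacing cancel out as long as the geometry is unchanged between the two measurements.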
14.
Infrared spectroscopy
–
Infrared spectroscopy involves the interaction of infrared radiation with matter. It covers a range of techniques, mostly based on absorption spectroscopy. As with all spectroscopic techniques, it can be used to identify and study chemicals. The sample may be solid, liquid, or gas. Infrared spectroscopy is conducted with an instrument called an infrared spectrometer, which produces an infrared spectrum. An IR spectrum is essentially a graph of infrared light absorbance on the vertical axis vs. frequency or wavelength on the horizontal axis. Typical units of frequency used in IR spectra are reciprocal centimeters, with the symbol cm−1. Units of IR wavelength are commonly given in micrometers, symbol μm. A common laboratory instrument that uses this technique is a Fourier transform infrared spectrometer. Two-dimensional IR is also possible. The infrared portion of the electromagnetic spectrum is usually divided into three regions, the near-, mid- and far-infrared, named for their relation to the visible spectrum. The higher-energy near-IR, approximately 14000–4000 cm−1, can excite overtone or harmonic vibrations. The mid-infrared, approximately 4000–400 cm−1, may be used to study the fundamental vibrations and associated rotational-vibrational structure. The far-infrared, approximately 400–10 cm−1, lying adjacent to the microwave region, has low energy. The names and classifications of these subregions are conventions, and are loosely based on the relative molecular or electromagnetic properties. Infrared spectroscopy exploits the fact that molecules absorb frequencies that are characteristic of their structure. These absorptions occur at resonant frequencies, i.e. the frequency of the absorbed radiation matches the vibrational frequency. The energies are affected by the shape of the potential energy surfaces and the masses of the atoms.
In the Born–Oppenheimer and harmonic approximations, the resonant frequencies are also related to the strength of the bond and the mass of the atoms at either end of it. Thus, the frequency of the vibrations is associated with a normal mode of motion. For a vibrational mode in a sample to be IR active, it must be associated with a change in the dipole moment. A permanent dipole is not necessary, as the rule requires only a change in dipole moment. A molecule can vibrate in many ways, and each way is called a vibrational mode. For molecules with N atoms, linear molecules have 3N − 5 vibrational modes, whereas non-linear molecules have 3N − 6. As an example, H2O, a non-linear molecule, has 3 × 3 − 6 = 3 degrees of vibrational freedom, or modes. Simple diatomic molecules have only one bond and only one vibrational band. If the molecule is symmetrical, e.g. N2, the band is not observed in the IR spectrum, but only in the Raman spectrum. Asymmetrical diatomic molecules, e.g. CO, absorb in the IR spectrum. More complex molecules have many bonds, and their vibrational spectra are correspondingly more complex, i.e. big molecules have many peaks in their IR spectra.
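The mode-counting rule and the two unit conventions mentioned above (cm−1 and μm) lend themselves to a short worked sketch. The function names are illustrative choices, not part of any spectroscopy library.

```python
def vibrational_modes(n_atoms: int, linear: bool) -> int:
    """Number of vibrational modes: 3N - 5 (linear) or 3N - 6 (non-linear)."""
    if n_atoms < 2 or (not linear and n_atoms < 3):
        raise ValueError("need at least 2 atoms (3 for a non-linear molecule)")
    return 3 * n_atoms - (5 if linear else 6)


def wavenumber_to_micrometers(wavenumber_per_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometers.

    1 cm = 10,000 micrometers, so wavelength = 10,000 / wavenumber.
    """
    if wavenumber_per_cm <= 0:
        raise ValueError("wavenumber must be positive")
    return 1.0e4 / wavenumber_per_cm


print(vibrational_modes(3, linear=False))   # H2O, non-linear
print(vibrational_modes(2, linear=True))    # N2 or CO, diatomic
print(wavenumber_to_micrometers(4000.0))    # upper edge of the mid-infrared
```

With these definitions the mid-infrared range of 4000–400 cm−1 corresponds to wavelengths of 2.5–25 μm.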
15.
Nuclear magnetic resonance spectroscopy
–
Nuclear magnetic resonance spectroscopy, most commonly known as NMR spectroscopy, is a research technique that exploits the magnetic properties of certain atomic nuclei. This type of spectroscopy determines the physical and chemical properties of atoms or the molecules in which they are contained. It relies on the phenomenon of nuclear magnetic resonance and can provide detailed information about the structure, dynamics, reaction state, and chemical environment of molecules. Suitable samples range from small compounds analyzed with 1-dimensional proton or carbon-13 NMR spectroscopy to large proteins or nucleic acids using 3- or 4-dimensional techniques. The impact of NMR spectroscopy on the sciences has been substantial because of the range of information it provides. NMR spectra are unique, well-resolved, analytically tractable and often highly predictable for small molecules. Thus, in organic chemistry practice, NMR analysis is used to confirm the identity of a substance. Different functional groups are clearly distinguishable, and identical functional groups with differing neighboring substituents still give distinguishable signals. NMR has largely replaced traditional wet chemistry tests such as reagents or typical chromatography for identification. A disadvantage is that a relatively large amount, 2–50 mg, of a purified substance is required. Preferably, the sample should be dissolved in a solvent, because NMR analysis of solids requires a dedicated magic angle spinning (MAS) machine. The timescale of NMR is relatively long, and thus it is not suitable for observing fast phenomena, producing only an averaged spectrum. NMR spectrometers are relatively expensive, although universities usually have them. Modern NMR spectrometers have a very strong, large and expensive liquid helium-cooled superconducting magnet, because resolution directly depends on magnetic field strength.
There are even benchtop NMR spectrometers. The Purcell group at Harvard University and the Bloch group at Stanford University independently developed NMR spectroscopy in the late 1940s and early 1950s. Edward Mills Purcell and Felix Bloch shared the 1952 Nobel Prize in Physics for their discoveries. When placed in a magnetic field, NMR active nuclei absorb electromagnetic radiation at a frequency characteristic of the isotope. The resonant frequency, energy of the absorption, and the intensity of the signal are proportional to the strength of the magnetic field. For example, in a 21 tesla magnetic field, protons resonate at about 900 MHz, and it is common to refer to a 21 T magnet as a 900 MHz magnet. Spinning the sample is necessary to average out diffusional motion, whereas measurements of diffusion constants are done with the sample stationary and spinning off. The vast majority of nuclei in a solution would belong to the solvent, and most regular solvents are hydrocarbons and would contain NMR-reactive protons. For this reason deuterated solvents are used. The most used deuterated solvent is deuterochloroform, although deuterium oxide and deuterated DMSO are used for hydrophilic analytes. The chemical shifts are slightly different in different solvents, depending on electronic solvation effects. NMR spectra are often calibrated against the known solvent residual proton peak instead of added tetramethylsilane. To detect the very small frequency shifts due to nuclear magnetic resonance, the applied magnetic field must be constant throughout the sample volume. High resolution NMR spectrometers use shims to adjust the homogeneity of the field to parts per billion in a volume of a few cubic centimeters. To compensate for inhomogeneity and drift in the magnetic field, shimming is adjusted automatically in modern NMR spectrometers, though in some cases the operator has to optimize the shim parameters manually to obtain the best possible resolution.
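The proportionality between field strength and proton resonance frequency can be checked with a small sketch. It assumes the commonly quoted proton value of roughly 42.577 MHz per tesla for the gyromagnetic ratio divided by 2π; the constant and function names are illustrative.

```python
# Approximate proton (1H) gyromagnetic ratio divided by 2*pi, in MHz per tesla.
# This value is an assumption of the sketch, not a figure from the text above.
PROTON_GAMMA_MHZ_PER_T = 42.577


def proton_frequency_mhz(field_tesla: float) -> float:
    """Proton resonance frequency in MHz for a given field in tesla."""
    return PROTON_GAMMA_MHZ_PER_T * field_tesla


def field_for_frequency_tesla(frequency_mhz: float) -> float:
    """Field in tesla at which protons resonate at the given frequency."""
    return frequency_mhz / PROTON_GAMMA_MHZ_PER_T


print(f"{proton_frequency_mhz(21.0):.1f} MHz at 21 T")
print(f"{field_for_frequency_tesla(900.0):.2f} T for a 900 MHz magnet")
```

Under this assumption a 21 T field gives a little under 900 MHz, which is why the "21 T equals 900 MHz" shorthand should be read as a rounded figure.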
16.
Mass spectrometry
–
Mass spectrometry is an analytical technique that ionizes chemical species and sorts the ions based on their mass-to-charge ratio. In simpler terms, a mass spectrum measures the masses within a sample. Mass spectrometry is used in many different fields and is applied to pure samples as well as complex mixtures. A mass spectrum is a plot of the ion signal as a function of the mass-to-charge ratio. In a typical MS procedure, a sample, which may be solid, liquid, or gas, is ionized, for example by bombarding it with electrons. This may cause some of the molecules to break into charged fragments. The ions are detected by a device capable of detecting charged particles. Results are displayed as spectra of the abundance of detected ions as a function of the mass-to-charge ratio. The atoms or molecules in the sample can be identified by correlating known masses to the identified masses or through a characteristic fragmentation pattern. Goldstein called these positively charged anode rays Kanalstrahlen; the translation of this term into English is canal rays. Wien found that the charge-to-mass ratio depended on the nature of the gas in the discharge tube. Thomson later improved on the work of Wien by reducing the pressure to create the mass spectrograph. The word spectrograph had become part of the international scientific vocabulary by 1884. A mass spectroscope is similar to a mass spectrograph except that the beam of ions is directed onto a phosphor screen. A mass spectroscope configuration was used in early instruments when it was desired that the effects of adjustments be quickly observed. Once the instrument was properly adjusted, a photographic plate was inserted and exposed. The term mass spectroscope continued to be used even though the direct illumination of a phosphor screen was replaced by indirect measurements with an oscilloscope. The use of the term mass spectroscopy is now discouraged due to the possibility of confusion with light spectroscopy.
Mass spectrometry is often abbreviated as mass-spec or simply as MS. Modern techniques of mass spectrometry were devised by Arthur Jeffrey Dempster and F. W. Aston in 1918 and 1919 respectively. Sector mass spectrometers known as calutrons, developed by Ernest O. Lawrence, were used for separating the isotopes of uranium during the Manhattan Project. Calutron mass spectrometers were used for uranium enrichment at the Oak Ridge, Tennessee Y-12 plant established during World War II. In 1989, half of the Nobel Prize in Physics was awarded to Hans Dehmelt and Wolfgang Paul for the development of the ion trap technique. A mass spectrometer consists of three components: an ion source, a mass analyzer, and a detector. The ionizer converts a portion of the sample into ions. There is a wide variety of ionization techniques, depending on the phase of the sample and the efficiency of various ionization mechanisms for the unknown species. An extraction system removes ions from the sample, which are then targeted through the mass analyzer. The differences in masses of the fragments allow the mass analyzer to sort the ions by their mass-to-charge ratio.
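As a rough illustration of what "sorting ions by mass-to-charge ratio" produces, the following sketch bins a list of detected ions into a simple spectrum of abundance versus m/z. The ion list is made up for the example, and a real instrument separates ions physically rather than by arithmetic.

```python
from collections import Counter


def mass_spectrum(ions, ndigits=0):
    """Build a toy mass spectrum from (mass, charge) pairs.

    Each ion is assigned m/z = mass / |charge|, rounded to `ndigits`
    decimals; the result is a list of (m/z, count) pairs sorted by m/z.
    """
    counts = Counter()
    for mass, charge in ions:
        if charge == 0:
            raise ValueError("neutral species are not detected")
        counts[round(mass / abs(charge), ndigits)] += 1
    return sorted(counts.items())


# Hypothetical detected ions as (mass in Da, charge number) pairs.
ions = [(18.0, 1), (18.0, 1), (17.0, 1), (18.0, 1), (36.0, 2)]
for mz, abundance in mass_spectrum(ions):
    print(mz, abundance)
```

Note that the doubly charged ion of mass 36 lands in the same bin as the singly charged ions of mass 18, which is why the horizontal axis of a spectrum is a ratio rather than a mass.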
17.
Essential amino acid
–
An essential amino acid, or indispensable amino acid, is an amino acid that cannot be synthesized de novo by the organism, and thus must be supplied in its diet. The nine amino acids humans cannot synthesize are phenylalanine, valine, threonine, tryptophan, methionine, leucine, isoleucine, lysine, and histidine. Six other amino acids are considered conditionally essential; these six are arginine, cysteine, glycine, glutamine, proline, and tyrosine. Five amino acids are dispensable in humans, meaning they can be synthesized in the body. These five are alanine, aspartic acid, asparagine, glutamic acid and serine. Pyrrolysine, sometimes considered the 22nd amino acid, is not used by humans. Eukaryotes can synthesize some of the amino acids from other substrates. Consequently, only a subset of the amino acids used in protein synthesis are essential nutrients. Estimating the daily requirement for the indispensable amino acids has proven to be difficult; these numbers have undergone considerable revision over the last 20 years. The following table lists the WHO recommended daily amounts currently in use for essential amino acids in adult humans. Food sources are identified based on the USDA National Nutrient Database Release. The recommended daily intakes for children aged three years and older are 10% to 20% higher than adult levels, and those for infants can be as much as 150% higher in the first year of life. Cysteine, tyrosine, and arginine are always required by infants and growing children. Various attempts have been made to express the quality or value of various kinds of protein. Measures include the biological value, net protein utilization, protein efficiency ratio, protein digestibility-corrected amino acid score and the complete proteins concept. Thus, various feedstuffs may be fed in combination to increase net protein utilization. Eating various plant foods in combination can provide a protein of higher biological value.
Certain native combinations of foods, such as corn and beans, soybeans and rice, or red beans and rice, are examples of such combinations. While 100 g of raw broccoli only provides 28 kcal and 3 g of protein, it has over 100 mg of protein per kcal. An egg contains five times as many calories but only four times as much protein. However, a carrot has only 23 mg protein per kcal or twice the minimum recommendation, a banana meets the minimum, and an apple is below recommendation. It is recommended that adult humans obtain 10–35% of their calories as protein. The US FDA daily reference value of 50 g protein per 2000 kcal is 25 mg/kcal per day. This led William Cumming Rose to the discovery of the amino acid threonine. Rose's later work showed that eight amino acids are essential for human beings. Longer term studies established histidine as also essential for adult humans. The distinction between essential and non-essential amino acids is somewhat unclear, as some amino acids can be produced from others. The sulfur-containing amino acids, methionine and homocysteine, can be converted into each other but neither can be synthesized de novo in humans. Likewise, cysteine can be made from homocysteine but cannot be synthesized on its own. So, for convenience, sulfur-containing amino acids are considered a single pool of nutritionally equivalent amino acids, as is the aromatic amino acid pair, phenylalanine and tyrosine. Likewise arginine, ornithine, and citrulline, which are interconvertible by the urea cycle, are considered a single group.
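The protein-per-calorie figures quoted above are simple ratios, and a short sketch makes the arithmetic explicit. The function name is an illustrative choice; the input numbers are the ones given in the text.

```python
def protein_mg_per_kcal(protein_g: float, energy_kcal: float) -> float:
    """Protein density of a food or diet in milligrams of protein per kcal."""
    if energy_kcal <= 0:
        raise ValueError("energy must be positive")
    return protein_g * 1000.0 / energy_kcal


# 100 g of raw broccoli: 3 g protein, 28 kcal (figures from the text).
broccoli = protein_mg_per_kcal(3.0, 28.0)

# US FDA daily reference value: 50 g protein per 2000 kcal.
reference = protein_mg_per_kcal(50.0, 2000.0)

print(f"broccoli:  {broccoli:.0f} mg/kcal")
print(f"reference: {reference:.0f} mg/kcal")
```

The broccoli figure comes out slightly above 100 mg/kcal, consistent with the statement that it has "over 100 mg of protein per kcal".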
18.
Biosynthesis
–
Biosynthesis is a multi-step, enzyme-catalyzed process where substrates are converted into more complex products in living organisms. In biosynthesis, simple compounds are modified, converted into other compounds, or joined together to form macromolecules. This process often consists of metabolic pathways. Some of these pathways are located within a single cellular organelle, while others involve enzymes that are located within multiple cellular organelles. Examples of these pathways include the production of lipid membrane components. The prerequisite elements for biosynthesis include precursor compounds, chemical energy, and catalytic enzymes, which may require coenzymes. These elements create monomers, the building blocks for macromolecules. Biosynthesis occurs due to a series of chemical reactions. For these reactions to take place, the following elements are necessary. Precursor compounds: these compounds are the starting molecules or substrates in a reaction. They may also be viewed as the reactants in a chemical process. Chemical energy: chemical energy can be found in the form of high energy molecules, which are required for energetically unfavorable reactions. Furthermore, the hydrolysis of these compounds drives a reaction forward. High energy molecules, such as ATP, have three phosphates. Often, the terminal phosphate is split off during hydrolysis and transferred to another molecule. Catalytic enzymes: these molecules are special proteins that catalyze a reaction by increasing the rate of the reaction. Coenzymes or cofactors: cofactors are molecules that assist in chemical reactions. These may be metal ions, vitamin derivatives such as NADH and acetyl CoA, or non-vitamin derivatives such as ATP. In the case of NADH, the molecule transfers a hydrogen, whereas acetyl CoA transfers an acetyl group, and ATP transfers a phosphate. Two examples of this type of reaction occur during the formation of nucleic acids.
For some of these steps, chemical energy is required: Precursor molecule + ATP ⇌ product AMP + PPi. Simple compounds may also be converted into other compounds with the assistance of cofactors. For example, the synthesis of phospholipids requires acetyl CoA, while the synthesis of another membrane component, sphingolipids, also depends on cofactors. The general equation for these examples is: Precursor molecule + Cofactor → macromolecule (enzyme-catalyzed). Simple compounds may also join together to create a macromolecule. For example, fatty acids join together to form phospholipids. In turn, phospholipids and cholesterol interact noncovalently in order to form the lipid bilayer. This reaction may be depicted as follows: Molecule 1 + Molecule 2 ⟶ macromolecule. Many intricate macromolecules are synthesized in a pattern of simple, repeated structures. For example, the simplest structures of lipids are fatty acids. Fatty acids are hydrocarbon derivatives; they contain a carboxyl group “head” and a hydrocarbon chain “tail.” These fatty acids create larger components, which in turn incorporate noncovalent interactions to form the lipid bilayer.
19.
Protein
–
Proteins are large biomolecules, or macromolecules, consisting of one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, and responding to stimuli. A linear chain of amino acid residues is called a polypeptide. A protein contains at least one long polypeptide. Short polypeptides, containing fewer than 20–30 residues, are rarely considered to be proteins and are commonly called peptides, or sometimes oligopeptides. The individual amino acid residues are bonded together by peptide bonds. The sequence of amino acid residues in a protein is defined by the sequence of a gene, which is encoded in the genetic code. In general, the genetic code specifies 20 standard amino acids. Sometimes proteins have non-peptide groups attached, which can be called prosthetic groups or cofactors. Proteins can also work together to achieve a particular function, and they often associate to form stable protein complexes. Once formed, proteins only exist for a certain period of time and are then degraded and recycled by the cell's machinery through the process of protein turnover. A protein's lifespan is measured in terms of its half-life and covers a wide range. Proteins can exist for minutes or years, with an average lifespan of 1–2 days in mammalian cells. Abnormal or misfolded proteins are degraded more rapidly, either because they are targeted for destruction or because they are unstable. Like other biological macromolecules such as polysaccharides and nucleic acids, proteins are essential parts of organisms. Many proteins are enzymes that catalyse biochemical reactions and are vital to metabolism. Proteins also have structural or mechanical functions, such as actin and myosin in muscle and the proteins in the cytoskeleton. Other proteins are important in cell signaling, immune responses, cell adhesion, and the cell cycle.
In animals, proteins are needed in the diet to provide the essential amino acids that cannot be synthesized. Digestion breaks the proteins down for use in the metabolism. Methods commonly used to study protein structure and function include immunohistochemistry, site-directed mutagenesis, X-ray crystallography, and nuclear magnetic resonance. Most proteins consist of linear polymers built from series of up to 20 different L-α-amino acids. All proteinogenic amino acids possess common structural features, including an α-carbon to which an amino group, a carboxyl group, and a variable side chain are bonded. Only proline differs from this basic structure, as it contains an unusual ring to the N-end amine group. The amino acids in a polypeptide chain are linked by peptide bonds. Once linked in the chain, an individual amino acid is called a residue, and the linked series of carbon, nitrogen, and oxygen atoms are known as the main chain or protein backbone. The peptide bond has two resonance forms that contribute some double-bond character and inhibit rotation around its axis, so that the alpha carbons are roughly coplanar.
20.
Amino group
–
In organic chemistry, amines are compounds and functional groups that contain a basic nitrogen atom with a lone pair. Amines are formally derivatives of ammonia, wherein one or more hydrogen atoms have been replaced by a substituent such as an alkyl or aryl group. Important amines include amino acids, biogenic amines, trimethylamine, and aniline. Inorganic derivatives of ammonia are also called amines, such as chloramine. Compounds with a nitrogen atom attached to a carbonyl group, thus having the structure R–CO–NR′R″, are called amides and have different chemical properties from amines. An aliphatic amine has no aromatic ring attached directly to the nitrogen atom. Aromatic amines have the nitrogen atom connected to an aromatic ring, as in the various anilines. The aromatic ring decreases the alkalinity of the amine, depending on its substituents. The presence of an amine group strongly increases the reactivity of the aromatic ring, due to an electron-donating effect. Amines are organized into four subcategories. Primary amines: primary amines arise when one of the three hydrogen atoms in ammonia is replaced by an alkyl or aromatic group. Important primary alkyl amines include methylamine and most amino acids. Secondary amines: secondary amines have two organic substituents bound to the nitrogen together with one hydrogen. Important representatives include dimethylamine, while an example of an aromatic amine would be diphenylamine. Tertiary amines: in tertiary amines, nitrogen has three organic substituents. Examples include trimethylamine, which has a fishy smell. Cyclic amines: cyclic amines are either secondary or tertiary amines. Examples of cyclic amines include the 3-membered ring aziridine and the six-membered ring piperidine. N-methylpiperidine and N-phenylpiperidine are examples of cyclic tertiary amines.
It is also possible to have four organic substituents on the nitrogen. These species are not amines but are quaternary ammonium cations and have a charged nitrogen center. Quaternary ammonium salts exist with many kinds of anions. Amines are named in several ways. Typically, the compound is given the prefix amino- or the suffix -amine. The prefix N- shows substitution on the nitrogen atom. An organic compound with multiple amino groups is called a diamine, triamine, tetraamine, and so forth. Hydrogen bonding significantly influences the properties of primary and secondary amines. Thus the melting point and boiling point of amines are higher than those of the corresponding phosphines, but generally lower than those of the corresponding alcohols. For example, methyl and ethyl amines are gases under standard conditions, whereas the corresponding methyl and ethyl alcohols are liquids. Amines possess a characteristic smell; liquid amines have a distinctive fishy smell. The nitrogen atom features a lone pair that can bind H+ to form an ammonium ion R3NH+.
21.
Carboxylic acid
–
A carboxylic acid /ˌkɑːrbɒkˈsɪlɪk/ is an organic compound that contains a carboxyl group. The general formula of a carboxylic acid is R–COOH, with R referring to the rest of the molecule. Carboxylic acids occur widely and include the amino acids and acetic acid. Salts and esters of carboxylic acids are called carboxylates. When a carboxyl group is deprotonated, its conjugate base forms a carboxylate anion. Carboxylate ions are resonance-stabilized, and this increased stability makes carboxylic acids more acidic than alcohols. Carboxylic acids can be seen as reduced or alkylated forms of the Lewis acid carbon dioxide. Carboxylic acids are commonly identified using their trivial names, and usually have the suffix -ic acid. IUPAC-recommended names also exist; in this system, carboxylic acids have an -oic acid suffix. For example, butyric acid is butanoic acid by IUPAC guidelines. The -oic acid nomenclature detail is based on the name of the previously known chemical benzoic acid. Alternatively, the group can be named as a carboxy or carboxylic acid substituent on another parent structure, for example, 2-carboxyfuran. The carboxylate anion of a carboxylic acid is usually named with the suffix -ate, in keeping with the general pattern of -ic acid and -ate for a conjugate acid and its conjugate base. For example, the conjugate base of acetic acid is acetate. The radical •COOH has only a fleeting existence. The acid dissociation constant of •COOH has been measured using electron paramagnetic resonance spectroscopy. The carboxyl radical tends to dimerise to form oxalic acid. Because carboxylic acids are both hydrogen-bond acceptors and hydrogen-bond donors, they participate in hydrogen bonding. Together the hydroxyl and carbonyl groups form the functional group carboxyl. Carboxylic acids usually exist as dimeric pairs in nonpolar media due to their tendency to self-associate. Smaller carboxylic acids are soluble in water, whereas higher carboxylic acids are less soluble due to the increasing hydrophobic nature of the alkyl chain.
These longer chain acids tend to be soluble in less-polar solvents such as ethers. Carboxylic acids tend to have higher boiling points than water, not only because of their increased surface area, but also because of their tendency to form stabilised dimers. Carboxylic acids tend to evaporate or boil as these dimers. For boiling to occur, either the dimer bonds must be broken or the entire dimer arrangement must be vaporised, both of which increase the enthalpy of vaporization requirements significantly.
22.
Butyl group
–
In organic chemistry, butyl is a four-carbon alkyl radical or substituent group with general chemical formula −C4H9, derived from either of the two isomers of butane. In the convention of skeletal formulas, every line ending and line intersection specifies a carbon atom saturated with single-linked hydrogen atoms. The R symbol indicates any radical or other non-specific functional group. Butyl is the largest substituent for which names are commonly used for all isomers. The butyl group's carbon that is connected to the rest of the molecule is called the RI or R-prime carbon. The prefixes sec and tert refer to the number of additional side chains connected to the first butyl carbon. The prefix iso means equal, while the prefix n- stands for normal. The four isomers of butyl acetate demonstrate these four isomeric configurations. In the series of alkyl radicals, butyl is the fourth, and the last to be named for its history. The word butyl is derived from butyric acid, a four-carbon carboxylic acid found in rancid butter. The name butyric acid comes from Latin butyrum, butter. Subsequent alkyl radicals in the series are simply named from the Greek number that indicates the number of carbon atoms in the group: pentyl, hexyl, heptyl, etc. The tert-butyl substituent is very bulky and is used in chemistry for kinetic stabilization. The effect of the tert-butyl group on the progress of a chemical reaction is called the tert-butyl effect, which can be illustrated with a Diels-Alder reaction. Compared to a hydrogen substituent, the tert-butyl substituent accelerates the reaction rate by a factor of 240.
23.
Chemical polarity
–
In chemistry, polarity is a separation of electric charge leading to a molecule or its chemical groups having an electric dipole or multipole moment. Polar molecules must contain polar bonds due to a difference in electronegativity between the bonded atoms. A polar molecule with two or more polar bonds must have an asymmetric geometry so that the bond dipoles do not cancel each other. Polar molecules interact through dipole–dipole intermolecular forces and hydrogen bonds. Polarity underlies a number of physical properties including surface tension, solubility, and melting and boiling points. Not all atoms attract electrons with the same force. The amount of pull an atom exerts on its electrons is called its electronegativity. Atoms with high electronegativities, such as fluorine, oxygen and nitrogen, exert a greater pull on electrons than atoms with lower electronegativities. In a bond, this leads to unequal sharing of electrons between the atoms, as electrons will be drawn closer to the atom with the higher electronegativity. Because electrons have a negative charge, the unequal sharing of electrons within a bond leads to the formation of an electric dipole. Because the amount of charge separated in such dipoles is usually smaller than a fundamental charge, they are called partial charges, denoted as δ+ and δ−. These symbols were introduced by Christopher Kelk Ingold and Edith Hilda Ingold in 1926. The bond dipole moment is calculated by multiplying the amount of charge separated and the distance between the charges. These dipoles within molecules can interact with dipoles in other molecules. Bonds can fall between two extremes: being completely nonpolar or completely polar. A completely nonpolar bond occurs when the electronegativities are identical and therefore possess a difference of zero. A completely polar bond is more correctly called an ionic bond, and occurs when the difference between electronegativities is large enough that one atom actually takes an electron from the other.
The terms polar and nonpolar are usually applied to covalent bonds. To determine the polarity of a covalent bond using numerical means, the difference between the electronegativity of the atoms is used. Bond polarity is typically divided into three groups that are based on the difference in electronegativity between the two bonded atoms. Pauling estimated that a difference of 1.7 corresponds to 50% ionic character. Molecules can be described as polar covalent, nonpolar covalent, or ionic, and certain properties are typical of each kind. A molecule is composed of one or more chemical bonds between molecular orbitals of different atoms. A polar molecule has a net dipole as a result of the opposing charges from polar bonds arranged asymmetrically. Water is an example of a polar molecule, since it has a slight positive charge on one side and a slight negative charge on the other.
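The bond dipole moment described above, the separated charge multiplied by the distance between the charges, can be worked through numerically. The sketch below assumes SI units internally and converts to debye, a unit commonly used for molecular dipoles; the partial charge and bond length in the example are hypothetical.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
DEBYE_IN_C_M = 3.33564e-30              # one debye in coulomb-meters
ANGSTROM_IN_M = 1.0e-10                 # one angstrom in meters


def bond_dipole_debye(partial_charge_e: float, distance_angstrom: float) -> float:
    """Dipole moment mu = q * d, returned in debye.

    partial_charge_e:  separated charge as a fraction of the elementary charge
    distance_angstrom: distance between the charges in angstroms
    """
    q = partial_charge_e * ELEMENTARY_CHARGE_C
    d = distance_angstrom * ANGSTROM_IN_M
    return q * d / DEBYE_IN_C_M


# Hypothetical bond: partial charges of +/-0.2 e separated by 1.0 angstrom.
print(f"{bond_dipole_debye(0.2, 1.0):.2f} D")
```

A full elementary charge separated by one angstrom gives roughly 4.8 D, which provides a convenient scale for judging how "partial" the charges in a given bond are.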
24.
Aliphatic compound
–
In organic chemistry, hydrocarbons are divided into two classes: aromatic compounds and aliphatic compounds, also known as non-aromatic compounds. Aliphatics can be cyclic, but only aromatic compounds contain an especially stable ring of atoms. Open-chain compounds contain no rings of any type, and are thus aliphatic. Aliphatic compounds can be saturated, joined by single bonds, like hexane, or unsaturated, with double bonds or triple bonds, like hexene and hexyne. Besides hydrogen, other elements can be bound to the carbon chain, the most common being oxygen, nitrogen, and sulfur. The least complex aliphatic compound is methane. Most aliphatic compounds are flammable, allowing the use of hydrocarbons as fuel, such as methane in Bunsen burners and as liquefied natural gas, and acetylene in welding. The most important aliphatic compounds are n-, iso- and cyclo-alkanes, and n-, iso- and cyclo-alkenes and -alkynes.
25.
Genetic code
–
The genetic code is the set of rules by which information encoded within genetic material is translated into proteins by living cells. Translation is accomplished by the ribosome, which links amino acids in an order specified by mRNA, using transfer RNA (tRNA) molecules to carry amino acids. The genetic code is highly similar among all organisms and can be expressed in a simple table with 64 entries. The code defines how sequences of nucleotide triplets, called codons, specify which amino acid will be added next during protein synthesis. With some exceptions, a three-nucleotide codon in a nucleic acid sequence specifies a single amino acid. The vast majority of genes are encoded with a single scheme, often referred to as the canonical or standard genetic code, or simply the genetic code, though variant codes exist. While the genetic code determines a protein's amino acid sequence, other genomic regions determine when and where these proteins are produced.
Efforts to understand how proteins are encoded began after DNA's structure was discovered in 1953. George Gamow postulated that sets of three bases must be employed to encode the 20 standard amino acids used by living cells to build proteins. The Crick, Brenner et al. experiment first demonstrated that codons consist of three DNA bases. Marshall Nirenberg and Heinrich J. Matthaei were the first to reveal the nature of a codon, in 1961. They used a cell-free system to translate a poly-uracil RNA sequence and discovered that the polypeptide they had synthesized consisted of only the amino acid phenylalanine. They thereby deduced that the codon UUU specified phenylalanine. Experiments in Severo Ochoa's laboratory then showed that poly-adenine RNA coded for poly-lysine and poly-cytosine RNA for poly-proline; therefore, the codon AAA specified the amino acid lysine, and the codon CCC specified the amino acid proline. Using various copolymers, most of the remaining codons were then determined. Subsequent work by Har Gobind Khorana identified the rest of the genetic code. Shortly thereafter, Robert W. Holley determined the structure of transfer RNA. This work was based upon Ochoa's earlier studies, which had earned Ochoa the Nobel Prize in Physiology or Medicine in 1959 for work on the enzymology of RNA synthesis. Extending this work, Nirenberg and Philip Leder revealed the triplet nature of the code. In these experiments, various combinations of mRNA were passed through a filter that contained ribosomes. Unique triplets promoted the binding of specific tRNAs to the ribosome. Leder and Nirenberg were able to determine the sequences of 54 out of 64 codons in their experiments. Khorana, Holley and Nirenberg received the 1968 Nobel Prize for their work. The three stop codons were named by discoverers Richard Epstein and Charles Steinberg. Amber was named after their friend Harris Bernstein, whose last name means amber in German. The other two stop codons were named ochre and opal in order to keep the color-name theme. H. Murakami and M. Sisido extended some codons to have four and five bases. Steven A. Benner constructed a functional 65th codon. In 2016 the first stable semisynthetic organism was created; it was a bacterium with two synthetic bases. In 2017 a mouse engineered with an extended genetic code that can produce proteins with unnatural amino acids was reported. A reading frame is defined by the initial nucleotide from which translation starts, which sets the frame for a run of successive triplets.
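The 64-entry table and the reading-frame idea described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not a reference implementation: the function name `translate`, the one-letter amino acid codes, and the compact string layout of the standard table (bases ordered U, C, A, G, with `*` marking a stop codon) are choices made for this example only.

```python
from itertools import product

# Standard genetic code written as one 64-character string.
# Codons are enumerated with bases in the order U, C, A, G,
# the first base varying slowest; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence, frame=0):
    """Translate an RNA (or DNA) string into one-letter amino acid codes.

    `frame` is the index of the first nucleotide to read; it fixes the
    run of successive triplets. Translation stops at the first stop codon.
    """
    rna = sequence.upper().replace("T", "U")  # accept DNA notation too
    protein = []
    for i in range(frame, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

With this table, a poly-uracil message such as `translate("UUUUUUUUU")` yields only phenylalanine (`F`), matching the Nirenberg and Matthaei result, and shifting `frame` changes which triplets are read.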
26.
Valine
–
Valine, encoded by the codons GUU, GUC, GUA, and GUG, is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group, an α-carboxylic acid group, and an isopropyl side chain. It is essential in humans, meaning the body cannot synthesize it. Human dietary sources are any proteinaceous foods such as meats, dairy products, soy products, beans and legumes. Along with leucine and isoleucine, valine is a branched-chain amino acid. In sickle-cell disease, valine substitutes for the amino acid glutamic acid in β-globin. Because valine is hydrophobic, the hemoglobin is prone to abnormal aggregation. Valine was first isolated from casein in 1901 by Hermann Emil Fischer. The name valine comes from valeric acid, which in turn is named after the plant valerian due to the presence of the acid in the roots of the plant. According to IUPAC, carbon atoms forming valine are numbered starting from 1, denoting the carboxyl carbon. Because valine is an essential amino acid, animals must ingest it. It is synthesized in plants via several steps starting from pyruvic acid. The initial part of the pathway also leads to leucine. The intermediate α-ketoisovalerate undergoes reductive amination with glutamate. Mice fed a valine-free diet for one day have improved insulin sensitivity. The valine catabolite 3-hydroxyisobutyrate promotes skeletal muscle insulin resistance in mice by stimulating fatty acid uptake into muscle and lipid accumulation. In humans, a restricted diet lowers blood levels of valine. Experiments in mice have shown that dietary valine is essential for hematopoietic stem cell self-renewal. Dietary valine restriction selectively depletes long-term repopulating hematopoietic stem cells in mouse bone marrow. Successful stem cell transplantation was achieved in mice without irradiation after 3 weeks on a valine-free (-Val) diet. Long-term survival of the transplanted mice was achieved when valine was returned to the diet gradually over a 2-week period to avoid refeeding syndrome.
27.
Isoleucine
–
Isoleucine, encoded by the codons AUU, AUC, and AUA (ATT, ATC, and ATA in DNA notation), is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group, an α-carboxylic acid group, and a hydrocarbon side chain. It is essential in humans, meaning the body cannot synthesize it. In other organisms such as bacteria, isoleucine is synthesized from pyruvate employing leucine biosynthesis enzymes. In maple syrup urine disease (MSUD), branched-chain amino acids such as isoleucine cannot be broken down properly; in severe cases, MSUD can lead to damage to the brain cells. As an essential nutrient, isoleucine is not synthesized in the body, hence it must be ingested. In plants and microorganisms, it is synthesized via several steps, starting from pyruvic acid and alpha-ketoglutarate. Its carbon skeleton can also be converted into acetyl-CoA and fed into the TCA cycle by condensing with oxaloacetate to form citrate. In mammals, acetyl-CoA cannot be converted back to carbohydrate but can be used in the synthesis of ketone bodies or fatty acids; hence isoleucine is ketogenic. Biotin, sometimes referred to as vitamin B7 or vitamin H, is required for the full catabolism of isoleucine. Without adequate biotin, the body will be unable to fully break down isoleucine and leucine molecules. Even though this amino acid is not produced in animals, it is stored in high quantities. Foods that have high amounts of isoleucine include eggs, soy protein, seaweed, turkey, chicken, lamb, and cheese. Isoleucine can be synthesized in a multistep procedure starting from 2-bromobutane and diethylmalonate. Synthetic isoleucine was originally reported in 1905. The German chemist Felix Ehrlich discovered isoleucine in hemoglobin in 1903.
28.
Branched-chain amino acid
–
A branched-chain amino acid (BCAA) is an amino acid having an aliphatic side-chain with a branch. Among the proteinogenic amino acids, there are three BCAAs: leucine, isoleucine and valine. Synthesis of BCAAs occurs in all locations of plants, within the plastids of the cell, as determined by the presence of mRNAs that encode enzymes in the metabolic pathway. BCAAs fill several metabolic and physiologic roles. Metabolically, BCAAs promote protein synthesis and turnover, signaling pathways, and metabolism of glucose. Oxidation of BCAAs may increase fatty acid oxidation and play a role in obesity. Physiologically, BCAAs take on roles in the immune system and in brain function. BCAAs are broken down effectively by dehydrogenase and decarboxylase enzymes expressed by immune cells. Lastly, BCAAs share the same transport protein into the brain with aromatic amino acids. Once in the brain, BCAAs may have a role in protein synthesis and the synthesis of neurotransmitters. Dietary BCAA supplementation has been used clinically to aid in the recovery of burn victims. Dietary BCAAs have also been used in an attempt to treat cases of hepatic encephalopathy. They can have the effect of alleviating symptoms, but there is no evidence they benefit mortality rates or nutrition. In mouse studies, BCAAs were shown to cause cell hyper-excitability resembling that usually observed in ALS patients, yet any link between BCAAs and ALS remains to be fully established. BCAA-restricted diets improve glucose tolerance and promote leanness in mice, and promote insulin sensitivity in obese rats. Threonine dehydrogenase catalyzes the deamination and dehydration of threonine to 2-ketobutyrate. Isoleucine forms a negative feedback loop with threonine dehydrogenase. Next, ketoacid reductoisomerase reduces the acetohydroxy acids from the previous step to yield dihydroxyacids in both the valine and isoleucine pathways.
Dihydroxyacid dehydrogenase converts the dihydroxyacids in the next step. The final step in the parallel pathway is conducted by an aminotransferase, which yields the final products valine and isoleucine. Degradation of branched-chain amino acids involves the branched-chain alpha-keto acid dehydrogenase complex. A deficiency of this complex leads to a buildup of the branched-chain amino acids and their toxic by-products in the blood and urine, giving the condition the name maple syrup urine disease. Enzymes involved are branched-chain aminotransferase and 3-methyl-2-oxobutanoate dehydrogenase. While most amino acids are oxidized in the liver, BCAAs are primarily oxidized in skeletal muscle and other peripheral tissues. Administration of either isoleucine or valine alone had no effect on muscle growth. Leucine indirectly activates p70 S6 kinase and stimulates assembly of the eIF4F complex, both of which are essential for mRNA binding in translational initiation. p70 S6 kinase is part of the target of rapamycin (mTOR) complex signaling pathway. At rest, protein infusion stimulates protein synthesis 30 minutes after the start of infusion. Infusion of leucine at rest produces a six-hour stimulatory effect and increases protein synthesis by phosphorylation of p70 S6 kinase in skeletal muscle. Following resistance exercise, without BCAA administration, an exercise session does not affect mTOR phosphorylation.
29.
Metabolic pathway
–
In biochemistry, a metabolic pathway is a linked series of chemical reactions occurring within a cell. The reactants, products, and intermediates of an enzymatic reaction are known as metabolites. In a metabolic pathway, the product of one enzyme-catalyzed reaction acts as the substrate for the next. These enzymes often require dietary minerals, vitamins, and other cofactors to function. Different metabolic pathways function based on their position within a eukaryotic cell and the significance of the pathway in the given compartment of the cell. For instance, the citric acid cycle and the electron transport chain take place in the mitochondria. In contrast, glycolysis, the pentose phosphate pathway, and fatty acid biosynthesis all occur in the cytosol of a cell. The two main types of pathway, anabolic (biosynthetic) and catabolic (degradative), complement each other in that the energy released from one is used up by the other. The degradative process of a catabolic pathway provides the energy required to conduct the biosynthesis of an anabolic pathway. In addition to these two distinct types there is the amphibolic pathway, which can be either catabolic or anabolic based on the need for or the availability of energy. The end product of a pathway may be used immediately, initiate another metabolic pathway, or be stored for later use. Metabolic pathways are often considered to flow in one direction. Although all chemical reactions are in principle reversible, conditions in the cell are often such that it is thermodynamically more favorable for flux to flow in one direction of a reaction. For example, one pathway may be responsible for the synthesis of an amino acid while its breakdown occurs via a separate and distinct pathway. One exception to this rule is the metabolism of glucose. Glycolysis results in the breakdown of glucose, but several reactions in the pathway are reversible and participate in the re-synthesis of glucose (gluconeogenesis). Glycolysis was the first metabolic pathway discovered. As glucose enters a cell, it is phosphorylated by ATP to glucose 6-phosphate. Metabolic pathways are often regulated by feedback inhibition.
Some metabolic pathways flow in a cycle wherein each component of the cycle is a substrate for the subsequent reaction in the cycle. A catabolic pathway is a system that produces chemical energy in the form of ATP, GTP, NADH, NADPH, FADH2, etc. from energy-containing sources such as carbohydrates and fats. Its net reaction is therefore thermodynamically favorable, for it results in a lower free energy for the final products. The end products are carbon dioxide, water, and ammonia. Coupled with an endergonic reaction of anabolism, the cell can synthesize new macromolecules using the original precursors of the anabolic pathway.
30.
Acetoacetic acid
–
Acetoacetic acid is the organic compound with the formula CH3COCH2COOH. It is the simplest beta-keto acid, and like other members of this class it is unstable. The methyl and ethyl esters, which are quite stable, are produced on a large scale industrially as precursors to dyes. In general, the esters are prepared from diketene by treatment with alcohols. Acetoacetic acid can be prepared by the hydrolysis of ethyl acetoacetate followed by acidification of the anion. In general, acetoacetic acid is generated at 0 °C and used in situ immediately. It decomposes to acetone and carbon dioxide; the anion (acetoacetate) does so about 55 times more slowly than the acid form. It is a weak acid, with a pKa of 3.58. Acetoacetic esters are used for the acetoacetylation reaction, which is widely used in the production of arylide yellows. Although the esters can be used in this reaction, diketene also reacts with alcohols. Nitroprusside changes from pink to purple in the presence of acetoacetate, the conjugate base of acetoacetic acid, which is the basis of dipstick tests for ketones. Similar tests are used in dairy cows to test for ketosis.
31.
Lysine
–
Lysine, encoded by the codons AAA and AAG, is an α-amino acid that is used in the biosynthesis of proteins. It contains an α-amino group, an α-carboxylic acid group, and a side chain bearing an ε-amino group. It is essential in humans, meaning the body cannot synthesize it. Lysine is a base, as are arginine and histidine. The ε-amino group often participates in hydrogen bonding and acts as a base in catalysis. The ε-amino group is attached to the ε-carbon, the fourth carbon along the side chain from the α-carbon. O-Glycosylation of hydroxylysine residues in the endoplasmic reticulum or Golgi apparatus is used to mark certain proteins for secretion from the cell. Deficiencies may cause blindness, as well as many other problems due to its ubiquitous presence in proteins. As an essential amino acid, lysine is not synthesized in animals. In plants and most bacteria, it is synthesized from aspartic acid. L-aspartate is first converted to L-aspartyl-4-phosphate by aspartokinase; ATP is needed as an energy source for this step. β-Aspartate semialdehyde dehydrogenase converts this into β-aspartyl-4-semialdehyde; energy from NADPH is used in this conversion. 4-Hydroxy-tetrahydrodipicolinate synthase adds a pyruvate group to the β-aspartyl-4-semialdehyde, and a water molecule is removed. This causes cyclization and gives rise to 4-hydroxy-2,3,4,5-tetrahydrodipicolinate. This product is reduced to 2,3,4,5-tetrahydrodipicolinate by 4-hydroxy-tetrahydrodipicolinate reductase. This reaction consumes an NADPH molecule and releases a water molecule. Tetrahydrodipicolinate N-acetyltransferase opens this ring and gives rise to N-succinyl-L-2-amino-6-oxoheptanedionate; two water molecules and one acyl-CoA are used in this reaction. N-succinyl-L-2-amino-6-oxoheptanedionate is then converted into N-succinyl-LL-2,6-diaminoheptanedionate. This reaction is catalyzed by the enzyme succinyl diaminopimelate aminotransferase; a glutamic acid molecule is used in this reaction and an oxoacid is produced as a byproduct.
N-succinyl-LL-2,6-diaminoheptanedionate is converted into LL-2,6-diaminoheptanedionate by succinyl diaminopimelate desuccinylase; a water molecule is consumed in this reaction and a succinate is produced as a byproduct. LL-2,6-diaminoheptanedionate is converted by diaminopimelate epimerase into meso-2,6-diaminoheptanedionate. Finally, meso-2,6-diaminoheptanedionate is converted into L-lysine by diaminopimelate decarboxylase. It is worth noting, however, that in fungi and euglenoids lysine is instead synthesized via the α-aminoadipate pathway. Lysine is metabolised in mammals to give acetyl-CoA, via an initial transamination with α-ketoglutarate. The bacterial degradation of lysine yields cadaverine by decarboxylation. Allysine is a derivative of lysine used in the production of elastin and collagen.
32.
Protein biosynthesis
–
Protein synthesis is the process whereby biological cells generate new proteins; it is balanced by the loss of cellular proteins via degradation or export. Protein biosynthesis is regulated at multiple steps, principally during transcription and translation. The cistron DNA is transcribed into the first of a series of RNA intermediates. The last version is used as a template in the synthesis of a polypeptide chain. Protein will often be synthesized directly from genes by translating mRNA. However, when a protein must be available on short notice or in large quantities, a protein precursor is produced. A proprotein is an inactive protein containing one or more inhibitory peptides that can be activated when the inhibitory sequence is removed by proteolysis during posttranslational modification. A preprotein is a form that contains a signal sequence that specifies its insertion into or through membranes. The signal peptide is cleaved off in the endoplasmic reticulum. Preproproteins have both sequences still present. During translation, the amino acids are linked together to extend the growing protein chain. This whole complex of processes is carried out by the ribosome. The ribosome latches onto the end of an mRNA molecule and moves along it, capturing loaded tRNA molecules and joining together their amino acids to form a new protein chain. Protein biosynthesis, although broadly similar, differs between prokaryotes and eukaryotes. In transcription, an mRNA chain is generated with one strand of the DNA double helix in the genome as a template; this strand is called the template strand. In eukaryotes, transcription occurs in the nucleus, where the DNA is held. The DNA of the cell is made up of two strands, each with a backbone of sugar and phosphate, held together by hydrogen bonds between the bases of opposite strands. The sugar and the phosphate in each strand are joined together by stronger covalent bonds. The DNA is unzipped by the enzyme helicase, leaving the single nucleotide chain open to be copied.
RNA polymerase reads the DNA strand from the 3-prime end to the 5-prime end, while it synthesizes a single strand of RNA in the 5-prime to 3-prime direction. The general RNA structure is very similar to the DNA structure, but in RNA the nucleotide uracil takes the place that thymine occupies in DNA. In eukaryotes, the initial transcript, heterogeneous nuclear RNA (hnRNA), undergoes splicing of introns via spliceosomes to produce the final mRNA. The single strand of mRNA then leaves the nucleus through nuclear pores. The synthesis of proteins from RNA is known as translation. In eukaryotes, translation occurs in the cytoplasm, where the ribosomes are located.
33.
Phosphorylation
–
Phosphorylation is the addition of a phosphoryl group (PO3−) to a molecule. In biology, phosphorylation and its counterpart, dephosphorylation, are critical for cellular processes. A large fraction of proteins are at least temporarily phosphorylated, as are many sugars and lipids. Phosphorylation is especially important for protein function, as this modification activates many enzymes, thereby regulating their function. Protein phosphorylation is one type of post-translational modification. The prominent role of protein phosphorylation in biochemistry is illustrated by the huge body of studies published on the subject. Phosphorylation of sugars is often the first stage of their catabolism. It allows cells to accumulate sugars because the phosphate group prevents the molecules from diffusing back across their transporter. Phosphorylation of glucose is a key reaction in sugar metabolism because many sugars are first converted to glucose before they are metabolized further. The chemical equation for the conversion of D-glucose to D-glucose-6-phosphate in the first step of glycolysis is D-glucose + ATP → D-glucose-6-phosphate + ADP, with ΔG° = −16.7 kJ/mol. Two enzymes that catalyze this reaction have been identified: a specific glucokinase and a non-specific hexokinase. The hepatic cell is freely permeable to glucose, and the rate of phosphorylation of glucose is the rate-limiting step in glucose metabolism by the liver. Glucose 6-phosphate also plays a role in the regulation of glycogen synthase: a high blood glucose concentration causes an increase in levels of glucose 6-phosphate in liver and skeletal muscle. In liver, synthesis of glycogen is directly correlated with blood glucose concentration, whereas in muscle and adipocytes glucose has only a minor effect on glycogen synthase. High blood glucose releases insulin, stimulating the translocation of specific glucose transporters to the cell membrane. The hexokinase enzyme has a low Km, indicating a high affinity for glucose, so this initial phosphorylation can proceed even when glucose levels in the blood are very low.
The phosphorylation of glucose can be enhanced by the binding of fructose-6-phosphate (F6P). Fructose consumed in the diet is converted to fructose-1-phosphate (F1P) in the liver. This negates the action of F6P on glucokinase, which favors the forward reaction. The capacity of cells to phosphorylate fructose exceeds their capacity to metabolize fructose-1-phosphate. Consuming excess fructose ultimately results in an imbalance in liver metabolism. Phosphorylation of glucose is imperative in processes within the body. For example, phosphorylating glucose is necessary for insulin-dependent mechanistic target of rapamycin pathway activity within the heart. This further suggests a link between intermediary metabolism and cardiac growth. Reversible phosphorylation of proteins is an important regulatory mechanism that occurs in both prokaryotic and eukaryotic organisms. Kinases phosphorylate proteins and phosphatases dephosphorylate them. Many enzymes and receptors are switched on or off by phosphorylation and dephosphorylation.
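The standard free-energy change quoted in this entry for the phosphorylation of glucose by ATP (−16.7 kJ/mol) can be turned into an equilibrium constant through the relation K = exp(−ΔG°/RT). The Python sketch below works through that arithmetic. The temperature of 298.15 K is an assumption (the entry does not state one), and the function name `equilibrium_constant` is hypothetical, chosen for this example.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def equilibrium_constant(delta_g_kj_per_mol, temperature_k=298.15):
    """Return K = exp(-dG/(R*T)) for a standard free-energy change in kJ/mol.

    The default temperature (298.15 K) is an assumed standard value,
    not one taken from the entry.
    """
    delta_g_j_per_mol = delta_g_kj_per_mol * 1000.0  # kJ/mol -> J/mol
    return math.exp(-delta_g_j_per_mol / (R * temperature_k))


# Glucose + ATP -> glucose-6-phosphate + ADP, dG = -16.7 kJ/mol
k_glucose = equilibrium_constant(-16.7)
print(f"K at 298.15 K: {k_glucose:.0f}")
```

Under these assumptions K comes out at several hundred, which is consistent with the forward reaction being strongly favored; at body temperature (about 310 K) the value is somewhat smaller but still far above 1.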
34.
Mechanistic target of rapamycin
–
The mechanistic target of rapamycin (mTOR), also known as FK506-binding protein 12-rapamycin-associated protein 1, is a kinase that in humans is encoded by the MTOR gene. mTOR is a member of the phosphatidylinositol 3-kinase-related kinase family of protein kinases. mTOR links with other proteins and serves as a core component of two distinct protein complexes, mTOR complex 1 (mTORC1) and mTOR complex 2 (mTORC2), which regulate different cellular processes. As a core component of mTORC2, mTOR also functions as a protein kinase that promotes the activation of insulin receptors. mTORC2 has also been implicated in the control and maintenance of the actin cytoskeleton. mTOR was first named as the mammalian target of rapamycin. Rapamycin was discovered in a sample from Easter Island, known locally as Rapa Nui. The bacterium Streptomyces hygroscopicus, isolated from that sample, produces an antifungal that researchers named rapamycin after the island. Rapamycin arrests fungal activity at the G1 phase of the cell cycle. In mammals, it suppresses the immune system by blocking the G1 to S phase transition in T-lymphocytes; thus, it is used as an immunosuppressant following organ transplantation. Researchers isolated rapamycin-resistant mutants of Saccharomyces cerevisiae and discovered that mutations in any of three genes can confer rapamycin resistance. Two of the genes were named TOR1 and TOR2, for targets of rapamycin and in honor of the Spalentor, a city gate in Basel. The third gene is FPR1, which encodes the yeast ortholog of FKBP12. Loss-of-function mutations in FPR1 confer resistance to rapamycin, and also to FK506. Several groups also described the protein independently in 1994, using names such as FRAP, RAFT1, RAPT1 and SEP. Due to the ubiquity of mTOR in animals, the meaning of the m has been changed from mammalian to mechanistic.
mTOR integrates the input from upstream pathways, including insulin and growth factors. mTOR also senses cellular nutrient, oxygen, and energy levels. Rapamycin inhibits mTOR by associating with its intracellular receptor FKBP12. The FKBP12-rapamycin complex binds directly to the FKBP12-Rapamycin Binding (FRB) domain of mTOR, inhibiting its activity. mTOR is the catalytic subunit of two structurally distinct complexes, mTORC1 and mTORC2. The two complexes localize to different subcellular compartments, which affects their activation and function. mTOR Complex 1 (mTORC1) is composed of mTOR, regulatory-associated protein of mTOR (Raptor), mammalian lethal with SEC13 protein 8 (mLST8) and the non-core components PRAS40 and DEPTOR. This complex functions as a nutrient and energy sensor and controls protein synthesis. The activity of mTORC1 is regulated by rapamycin, insulin, growth factors, phosphatidic acid, certain amino acids and their derivatives, and mechanical stimuli. mTOR Complex 2 (mTORC2) is composed of mTOR, rapamycin-insensitive companion of mTOR (RICTOR), mLST8, and mammalian stress-activated protein kinase interacting protein 1 (mSIN1).