1.
Normal distribution
–
In probability theory, the normal distribution is a very common continuous probability distribution. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. The normal distribution is useful because of the central limit theorem: physical quantities that are expected to be the sum of many independent processes often have distributions that are nearly normal. Moreover, many results and methods can be derived analytically in explicit form when the relevant variables are normally distributed. The normal distribution is sometimes informally called the bell curve; however, many other distributions are bell-shaped.

The probability density of the normal distribution is

f(x | µ, σ²) = (1 / √(2πσ²)) · e^(−(x − µ)² / (2σ²))

where µ is the mean or expectation of the distribution, σ is the standard deviation, and σ² is the variance. A random variable with a Gaussian distribution is said to be normally distributed and is called a normal deviate.

The simplest case of a normal distribution is known as the standard normal distribution, with µ = 0 and σ = 1. The factor 1/2 in the exponent ensures that the distribution has unit variance, and this function is symmetric around x = 0, where it attains its maximum value 1/√(2π) and has inflection points at x = +1 and x = −1. Authors may differ on which normal distribution should be called the standard one. For a generic normal distribution with mean µ and standard deviation σ, the probability density must be scaled by 1/σ so that the integral is still 1. If Z is a standard normal deviate, then X = Zσ + µ will have a normal distribution with expected value µ and standard deviation σ. Conversely, if X is a normal deviate with parameters µ and σ², then Z = (X − µ)/σ will have a standard normal distribution.

Every normal distribution is the exponential of a quadratic function, f(x) = e^(ax² + bx + c), where a is negative. In this form, the mean value is µ = −b/(2a); for the standard normal distribution, a is −1/2, b is zero, and c is −ln(2π)/2. The density of the standard Gaussian distribution is denoted with the Greek letter ϕ; the alternative form of the Greek phi letter, φ, is also used quite often. The normal distribution with mean µ and variance σ² is often denoted N(µ, σ²); thus, when a random variable X is distributed normally with mean µ and variance σ², one writes X ~ N(µ, σ²). Some authors advocate using the precision τ = 1/σ² as the parameter defining the width of the distribution, instead of the standard deviation σ or the variance σ².
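As an illustrative aside (not part of the original entry), here is a minimal Python sketch of the density formula above; the helper name normal_pdf is my own choice:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / math.sqrt(2 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# The standard normal attains its maximum 1/sqrt(2*pi) ~ 0.3989 at x = 0.
print(normal_pdf(0.0))

# Standardizing: if X ~ N(mu, sigma^2), then Z = (X - mu) / sigma ~ N(0, 1).
x, mu, sigma = 75.0, 70.0, 3.0
z = (x - mu) / sigma
print(normal_pdf(x, mu, sigma), normal_pdf(z) / sigma)  # equal by the 1/sigma scaling
```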
2.
Probability
–
Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1; the higher the probability of an event, the more certain we are that the event will occur. A simple example is the tossing of a fair coin: since the coin is unbiased, the two outcomes are equally probable, so the probability of heads equals the probability of tails. Since no other outcomes are possible, the probability of either outcome is 1/2. This type of probability is also called a priori probability.

Probability theory is used to describe the underlying mechanics and regularities of complex systems. For example, tossing a coin twice will yield head-head, head-tail, tail-head, or tail-tail. The probability of getting an outcome of head-head is 1 out of 4 outcomes, or 1/4 or 0.25. This interpretation considers probability to be the relative frequency, in the long run, of outcomes. A modification of this is propensity probability, which interprets probability as the tendency of some experiment to yield a certain outcome. Subjectivists assign numbers according to subjective probability, i.e. as a degree of belief. The degree of belief has been interpreted as "the price at which you would buy or sell a bet that pays 1 unit of utility if E, 0 if not E". The most popular version of subjective probability is Bayesian probability, which includes expert knowledge as well as data to produce probabilities. The expert knowledge is represented by some prior probability distribution, and the data are incorporated in a likelihood function. The product of the prior and the likelihood, normalized, results in a posterior probability distribution that incorporates all the information known to date.

The scientific study of probability is a modern development of mathematics. Gambling shows that there has been an interest in quantifying the ideas of probability for millennia, but there are reasons, of course, for the slow development of the mathematics of probability, even though games of chance provided the impetus for its study. According to Richard Jeffrey, before the middle of the seventeenth century the term "probable" meant approvable: a probable action or opinion was one such as people would undertake or hold. However, in legal contexts especially, "probable" could also apply to propositions for which there was good evidence. The sixteenth-century Italian polymath Gerolamo Cardano demonstrated the efficacy of defining odds as the ratio of favourable to unfavourable outcomes.
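To make the prior-times-likelihood recipe concrete, here is a minimal Python sketch (my own illustration, not from the entry) of a Bayesian update for a coin's heads-probability over a grid of candidate values; the observed counts are made up:

```python
# Minimal sketch of a Bayesian update: posterior is proportional to prior * likelihood.
# Candidate values for the coin's heads-probability p, with a uniform prior.
grid = [i / 100 for i in range(1, 100)]
prior = [1.0 / len(grid)] * len(grid)

heads, tosses = 7, 10  # hypothetical observed data

# Binomial likelihood (up to a constant) of the data for each candidate p.
likelihood = [p ** heads * (1 - p) ** (tosses - heads) for p in grid]

unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]  # normalize so it sums to 1

# Posterior mean: the updated degree of belief about p (about 0.667 here).
print(sum(p * w for p, w in zip(grid, posterior)))
```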
3.
Statistics
–
Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data. In applying statistics to, e.g., a scientific, industrial, or social problem, it is conventional to begin with a statistical population or process to be studied. Populations can be diverse topics, such as all people living in a country or every atom composing a crystal. Statistics deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments. The statistician Sir Arthur Lyon Bowley defined statistics as "numerical statements of facts in any department of inquiry placed in relation to each other".

When census data cannot be collected, statisticians collect data by developing specific experiment designs. Representative sampling assures that inferences and conclusions can safely extend from the sample to the population as a whole. In contrast, an observational study does not involve experimental manipulation. Inferences in mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.

A standard statistical procedure involves the test of the relationship between two data sets, or between a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between the two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (the null hypothesis is falsely rejected, a "false positive") and Type II errors (the null hypothesis fails to be rejected when it is false, a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis. Measurement processes that generate statistical data are also subject to error; many of these errors are classified as random or systematic. The presence of missing data or censoring may result in biased estimates, and specific techniques have been developed to address these problems. Statistics continues to be an area of active research, for example on the problem of how to analyze big data.

Statistics is a body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data. Some consider statistics to be a mathematical science rather than a branch of mathematics. While many scientific investigations make use of data, statistics is concerned with the use of data in the context of uncertainty. Mathematical techniques used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure-theoretic probability theory. Ideally, statisticians compile data about the entire population under study; this may be organized by governmental statistical institutes.
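As a concrete instance of testing a null hypothesis of no relationship between two data sets (my own illustration, assuming scipy is available; the measurements are made up), a short Python sketch using a two-sample t-test:

```python
from scipy import stats

# Hypothetical measurements from two groups (made-up data).
group_a = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8, 5.3]
group_b = [5.6, 5.8, 5.5, 5.9, 5.7, 5.4, 6.0]

# Null hypothesis: the two populations have equal means.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# A small p-value quantifies evidence against the null; rejecting a true
# null would be a Type I error, failing to reject a false null a Type II.
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```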
4.
Percentile
–
A percentile is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. For example, the 20th percentile is the value below which 20% of the observations may be found. The term percentile and the related term percentile rank are often used in the reporting of scores from norm-referenced tests; for example, a score at the 86th percentile is higher than 86% of the other scores. The 25th percentile is also known as the first quartile, the 50th percentile as the median or second quartile, and the 75th percentile as the third quartile. In general, percentiles and quartiles are specific types of quantiles.

When ISPs bill burstable internet bandwidth, the 95th or 98th percentile usually cuts off the top 5% or 2% of bandwidth peaks in each month, and then bills at the nearest rate. In this way infrequent peaks are ignored, and the customer is charged in a fairer way. The reason this statistic is so useful in measuring data throughput is that it gives a very accurate picture of the cost of the bandwidth: the 95th percentile says that 95% of the time the usage is below this amount, and the remaining 5% of the time the usage is above that amount. Physicians will often use infant and children's weight and height to assess their growth in comparison to national averages and percentiles, which are found in growth charts. The 85th percentile speed of traffic on a road is often used as a guideline in setting speed limits.

The methods given in the Definitions section are approximations for use in small-sample statistics. In general terms, for very large populations following a normal distribution, percentiles may often be represented by reference to a normal curve plot. The normal distribution is plotted along an axis scaled to standard deviations; mathematically, the normal distribution extends to negative infinity on the left and positive infinity on the right. Note, however, that only a small proportion of individuals in a population will fall outside the −3 to +3 range; for example, with human heights, very few people are above the +3 sigma height level. Percentiles represent the area under the normal curve, increasing from left to right. Each standard deviation represents a fixed percentile; this is related to the 68–95–99.7 rule, or the three-sigma rule.

There is no standard definition of percentile; however, all definitions yield similar results when the number of observations is very large. Some methods for calculating the percentiles are given below. In the nearest-rank method, the percentile is obtained by first calculating the ordinal rank and then taking the value from the ordered list that corresponds to that rank. A percentile calculated using the nearest-rank method will always be a member of the ordered list. The 100th percentile is defined to be the largest value in the ordered list. Example 1: consider an ordered list which contains five data values. What are the 5th, 30th, 40th, 50th and 100th percentiles of this list using the nearest-rank method?
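A minimal Python sketch of the nearest-rank method (my own, not from the entry): the ordinal rank is n = ceil(P/100 × N), and the percentile is the nth value of the sorted list. The five data values here are hypothetical, since the entry's list is not reproduced:

```python
import math

def percentile_nearest_rank(sorted_values, p):
    """Nearest-rank percentile: ordinal rank n = ceil(p/100 * N), 1-based."""
    n = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(n, 1) - 1]  # 1-based rank -> 0-based index

data = [15, 20, 35, 40, 50]  # hypothetical ordered list of five values
for p in (5, 30, 40, 50, 100):
    # Every result is a member of the list; the 100th percentile is the maximum.
    print(p, percentile_nearest_rank(data, p))
```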
5.
Standard deviation
–
In statistics, the standard deviation is a measure that is used to quantify the amount of variation or dispersion of a set of data values. The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance. It is algebraically simpler, though in practice less robust, than the average absolute deviation. A useful property of the standard deviation is that, unlike the variance, it is expressed in the same units as the data. There are also other measures of deviation from the norm, including the mean absolute deviation.

In addition to expressing the variability of a population, the standard deviation is commonly used to measure confidence in statistical conclusions. For example, the margin of error in polling data is determined by calculating the expected standard deviation in the results if the same poll were to be conducted multiple times. This derivation of a standard deviation is often called the standard error of the estimate, or the standard error of the mean when referring to a mean. It is computed as the standard deviation of all the means that would be computed from that population if an infinite number of samples were drawn. It is very important to note that the standard deviation of a population and the standard error of a statistic derived from a sample of that population (such as the mean) are different but related quantities. The reported margin of error of a poll is computed from the standard error of the mean and is typically about twice the standard deviation (the half-width of a 95 percent confidence interval). The standard deviation is also important in finance, where the standard deviation on the rate of return on an investment is a measure of the volatility of the investment.

For a finite set of numbers, the standard deviation is found by taking the square root of the average of the squared deviations of the values from their average value. For example, suppose the marks of a class of eight students are the eight values 2, 4, 4, 4, 5, 5, 7, 9. These eight data points have a mean of 5:

(2 + 4 + 4 + 4 + 5 + 5 + 7 + 9) / 8 = 40 / 8 = 5

This formula is valid only if the eight values with which we began form the complete population. If the values instead were a sample drawn from some large parent population, the result would be called the sample standard deviation, computed by dividing by n − 1 rather than by n. Dividing by n − 1 rather than by n gives an unbiased estimate of the variance of the larger parent population; this is known as Bessel's correction. As a slightly more complicated real-life example, the average height for adult men in the United States is about 70 inches, with a standard deviation of around 3 inches.
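A small illustrative Python sketch continuing the eight-mark example above, contrasting the population standard deviation (divide by n) with the sample standard deviation under Bessel's correction (divide by n − 1):

```python
import math

marks = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(marks)
mean = sum(marks) / n                          # (2 + 4 + ... + 9) / 8 = 5

sq_devs = [(x - mean) ** 2 for x in marks]     # squared deviations from the mean

pop_sd = math.sqrt(sum(sq_devs) / n)           # complete population: divide by n
sample_sd = math.sqrt(sum(sq_devs) / (n - 1))  # Bessel's correction: divide by n - 1

print(mean, pop_sd, sample_sd)  # 5.0, 2.0, ~2.138
```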
6.
Mean
–
In mathematics, mean has several different definitions depending on the context. For a data set, the arithmetic mean is the sum of the values divided by the number of values; an analogous formula applies to the case of a probability distribution. Not every probability distribution has a defined mean; see the Cauchy distribution for an example. Moreover, for some distributions the mean is infinite.

The arithmetic mean of a set of numbers x1, x2, ..., xn is typically denoted by x̄, pronounced "x bar". If the data set were based on a series of observations obtained by sampling from a statistical population, the arithmetic mean is termed the sample mean to distinguish it from the population mean. For a finite population, the population mean of a property is equal to the arithmetic mean of the given property while considering every member of the population. For example, the population mean height is equal to the sum of the heights of every individual divided by the total number of individuals. The sample mean may differ from the population mean, especially for small samples. The law of large numbers dictates that the larger the size of the sample, the more likely it is that the sample mean will be close to the population mean.

Outside of probability and statistics, a wide range of other notions of mean are often used in geometry and analysis; examples are given below. The geometric mean is an average that is useful for sets of positive numbers that are interpreted according to their product:

x̄ = (x1 · x2 · ⋯ · xn)^(1/n)

For example, the geometric mean of the five values 4, 36, 45, 50, 75 is

(4 · 36 · 45 · 50 · 75)^(1/5) = (24300000)^(1/5) = 30.

The harmonic mean is an average which is useful for sets of numbers which are defined in relation to some unit, for example speed:

x̄ = n / (1/x1 + 1/x2 + ⋯ + 1/xn)

AM, GM, and HM satisfy these inequalities: AM ≥ GM ≥ HM, with equality holding if and only if all the elements of the given sample are equal.

In descriptive statistics, the mean may be confused with the median, mode or mid-range, as any of these may be called an "average". The mean of a set of observations is the arithmetic average of the values; however, for skewed distributions, the mean is not necessarily the same as the middle value (the median) or the most likely value (the mode). For example, mean income is typically skewed upwards by a small number of people with very large incomes. By contrast, the median income is the level at which half the population is below and half is above. The mode income is the most likely income, and favors the larger number of people with lower incomes. The mean of a probability distribution is the long-run arithmetic average value of a random variable having that distribution; in this context, it is known as the expected value.
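A short illustrative Python sketch (not from the entry) computing the arithmetic, geometric, and harmonic means of the five values above, showing the AM ≥ GM ≥ HM ordering:

```python
values = [4, 36, 45, 50, 75]
n = len(values)

am = sum(values) / n                 # arithmetic mean: sum / count

product = 1.0
for v in values:
    product *= v
gm = product ** (1 / n)              # geometric mean: (product)^(1/n)

hm = n / sum(1 / v for v in values)  # harmonic mean: n / (sum of reciprocals)

print(am, gm, hm)  # 42.0 >= 30.0 >= 15.0, consistent with AM >= GM >= HM
```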
7.
Central limit theorem
–
Suppose a sample is obtained containing many observations, each observation being randomly generated in a way that does not depend on the values of the other observations, and the average of the observed values is computed. If this procedure is performed many times, the central limit theorem says that the computed values of the average will be distributed according to the normal distribution. The central limit theorem has a number of variants. In its common form, the random variables must be independent and identically distributed (i.i.d.). In variants, convergence of the mean to the normal distribution also occurs for non-identical distributions or for non-independent observations. In more general usage, a central limit theorem is any of a set of weak-convergence theorems in probability theory. When the variance of the i.i.d. variables is finite, the attractor distribution is the normal distribution. In contrast, the sum of a number of i.i.d. random variables with power-law tail distributions decreasing as |x|^(−α−1), where 0 < α < 2, will tend to an alpha-stable distribution with stability parameter α as the number of variables grows.

Suppose we are interested in the sample average

Sn = (X1 + ⋯ + Xn) / n

of these random variables. By the law of large numbers, the sample averages converge in probability and almost surely to the expected value µ as n → ∞. The classical central limit theorem describes the size and the distributional form of the stochastic fluctuations around the deterministic number µ during this convergence. For large enough n, the distribution of Sn is close to the normal distribution with mean µ and variance σ²/n. The usefulness of the theorem is that the distribution of √n(Sn − µ) approaches normality regardless of the shape of the distribution of the individual Xi. Formally, the theorem can be stated as follows.

Lindeberg–Lévy CLT. Suppose {X1, X2, ...} is a sequence of i.i.d. random variables with E[Xi] = µ and Var[Xi] = σ² < ∞. Then as n approaches infinity, the random variables √n(Sn − µ) converge in distribution to a normal N(0, σ²):

√n(Sn − µ) →d N(0, σ²).

Note that the convergence is uniform in z in the sense that

lim(n→∞) sup(z∈ℝ) | Pr[√n(Sn − µ) ≤ z] − Φ(z/σ) | = 0,

where Φ is the standard normal cumulative distribution function.

The Lyapunov variant of the theorem is named after the Russian mathematician Aleksandr Lyapunov. In this variant of the central limit theorem the random variables Xi have to be independent, but not necessarily identically distributed. The theorem also requires that the random variables |Xi| have moments of some order (2 + δ). Suppose {X1, X2, ...} is a sequence of independent random variables, each with finite expected value μi and variance σi². In practice it is usually easiest to check Lyapunov's condition for δ = 1. If a sequence of random variables satisfies Lyapunov's condition, then it also satisfies Lindeberg's condition; the converse implication, however, does not hold.

Lindeberg CLT. In the same setting and with the same notation as above, with sn² = Σ σi², suppose that for every ε > 0

lim(n→∞) (1/sn²) Σ(i=1..n) E[ (Xi − μi)² · 1{|Xi − μi| > ε·sn} ] = 0,

where 1{⋅} is the indicator function. Then the central limit theorem holds.
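An illustrative simulation sketch in Python (my own, assuming numpy is available): standardized averages of skewed exponential samples behave increasingly like N(0, σ²) draws as n grows, as the theorem predicts:

```python
import numpy as np

rng = np.random.default_rng(0)

def standardized_means(n, trials=10_000):
    """Draw `trials` samples of size n from Exp(1) and standardize the means."""
    x = rng.exponential(scale=1.0, size=(trials, n))  # Exp(1): mu = 1, sigma = 1
    means = x.mean(axis=1)                            # S_n for each trial
    return np.sqrt(n) * (means - 1.0)                 # sqrt(n) * (S_n - mu)

for n in (2, 10, 100):
    z = standardized_means(n)
    # As n grows, the empirical mean and std approach 0 and sigma = 1,
    # the parameters of the limiting N(0, sigma^2) distribution.
    print(n, round(z.mean(), 3), round(z.std(), 3))
```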
8.
Confidence interval
–
In statistics, a confidence interval is a type of interval estimate of a population parameter. It is an observed interval, computed from the sample, that in principle differs from sample to sample. How frequently the observed intervals contain the true parameter if the experiment is repeated is called the confidence level. Two-sided confidence limits form a confidence interval, whereas one-sided limits are referred to as lower or upper confidence bounds.

Confidence intervals consist of a range of values that act as good estimates of the population parameter. However, the interval computed from a particular sample does not necessarily include the true value of the parameter: after any particular sample is taken, the parameter is either in the interval or not. Since the observed data are random samples from the true population, a 99% confidence level means that 99% of the intervals obtained from such samples will contain the true parameter. The desired level of confidence is set by the researcher. If a corresponding hypothesis test is performed, the confidence level is the complement of the level of significance; i.e., a 95% confidence interval reflects a significance level of 0.05. The confidence interval contains the parameter values that, when tested, should not be rejected with the same sample. Confidence intervals for difference parameters not containing 0 imply that there is a statistically significant difference between the populations.

In applied practice, confidence intervals are typically stated at the 95% confidence level. However, when presented graphically, confidence intervals can be shown at several confidence levels, for example 90%, 95%, and 99%. Factors affecting the width of the confidence interval include the size of the sample, the confidence level, and the variability in the sample. A larger sample size normally will lead to a better estimate of the population parameter. Confidence intervals were introduced to statistics by Jerzy Neyman in a paper published in 1937.

Interval estimates can be contrasted with point estimates. A point estimate is a single value given as the estimate of a population parameter that is of interest, for example the mean of some quantity. An interval estimate specifies instead a range within which the parameter is estimated to lie. Confidence intervals are commonly reported in tables or graphs along with point estimates of the same parameters, to show the reliability of the estimates. For example, a confidence interval can be used to describe how reliable survey results are. In a poll of election-voting intentions, the result might be that 40% of respondents intend to vote for a certain party; a 99% confidence interval for the proportion in the whole population having the same intention on the survey might be 30% to 50%.
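A minimal Python sketch (my own illustration) of a normal-approximation confidence interval for a proportion, in the spirit of the polling example above; the sample size of 150 is a made-up assumption:

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation CI: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se

# 40% of a hypothetical 150 respondents intend to vote for the party.
low, high = proportion_ci(0.40, 150)           # 95% level (z = 1.96)
print(f"95% CI: {low:.1%} to {high:.1%}")

low, high = proportion_ci(0.40, 150, z=2.576)  # 99% level (z = 2.576)
print(f"99% CI: {low:.1%} to {high:.1%}")      # roughly 30% to 50%
```

With these assumed numbers the 99% interval comes out close to the 30%-to-50% range quoted in the entry, which illustrates how the stated level and the sample size together determine the interval's width.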