1.
Real number
–
In mathematics, a real number is a value that represents a quantity along a continuous line. The adjective real in this context was introduced in the 17th century by René Descartes. The real numbers include all the rational numbers, such as the integer −5 and the fraction 4/3, and all the irrational numbers, such as √2. Included within the irrationals are the transcendental numbers, such as π. Real numbers can be thought of as points on an infinitely long line called the number line or real line. Any real number can be determined by a possibly infinite decimal representation, such as that of 8.632. The real line can be thought of as a part of the complex plane, and complex numbers include real numbers. These descriptions of the real numbers are not sufficiently rigorous by the modern standards of pure mathematics; modern definitions include Dedekind cuts and equivalence classes of Cauchy sequences of rationals, and all these definitions satisfy the axiomatic definition and are thus equivalent. The statement that there is no subset of the reals with cardinality strictly greater than ℵ0 and strictly smaller than that of the reals is known as the continuum hypothesis. Simple fractions were used by the Egyptians around 1000 BC; the Vedic Sulba Sutras (c. 600 BC) include what may be the first use of irrational numbers. Around 500 BC, the Greek mathematicians led by Pythagoras realized the need for irrational numbers, in particular the irrationality of the square root of 2. Arabic mathematicians later merged the concepts of number and magnitude into a general idea of real numbers. In the 16th century, Simon Stevin created the basis for modern decimal notation, and in the 17th century Descartes introduced the term real to describe roots of a polynomial, distinguishing them from imaginary ones. In the 18th and 19th centuries, there was much work on irrational and transcendental numbers. Johann Heinrich Lambert gave the first flawed proof that π cannot be rational, and Adrien-Marie Legendre completed the proof. Évariste Galois developed techniques for determining whether a given equation could be solved by radicals, which gave rise to the field of Galois theory.
Charles Hermite first proved that e is transcendental, and Ferdinand von Lindemann showed that π is transcendental. Lindemann's proof was much simplified by Weierstrass, still further by David Hilbert, and was finally made elementary by Adolf Hurwitz and Paul Gordan. The development of calculus in the 18th century used the set of real numbers without having defined them rigorously. The first rigorous definition was given by Georg Cantor in 1871, and in 1874 he showed that the set of all real numbers is uncountably infinite while the set of all algebraic numbers is countably infinite. Contrary to widely held beliefs, his first method was not his famous diagonal argument. The real number system can be defined axiomatically up to an isomorphism, which is described hereafter. Another possibility is to start from some rigorous axiomatization of Euclidean geometry; from the structuralist point of view all these constructions are on equal footing.
2.
Probability density function
–
In a more precise sense, the PDF is used to specify the probability of the random variable falling within a particular range of values, as opposed to taking on any one value. The probability density function is nonnegative everywhere, and its integral over the entire space is equal to one. The terms probability distribution function and probability function have also sometimes been used to denote the probability density function. However, this use is not standard among probabilists and statisticians. Further confusion of terminology exists because density function has also been used for what is here called the probability mass function; in general, though, the PMF is used in the context of discrete random variables. Suppose a species of bacteria typically lives 4 to 6 hours. What is the probability that a bacterium lives exactly 5 hours? A lot of bacteria live for approximately 5 hours, but there is no chance that any given bacterium dies at exactly 5.0000000000… hours. Instead we might ask: what is the probability that the bacterium dies between 5 hours and 5.01 hours? Let's say the answer is 0.02. Next: what is the probability that the bacterium dies between 5 hours and 5.001 hours? The answer is probably around 0.002, since this is 1/10th of the previous interval; the probability that the bacterium dies between 5 hours and 5.0001 hours is probably about 0.0002, and so on. In these three examples, the ratio (probability of dying during an interval) / (duration of the interval) is approximately constant, and equal to 2 per hour. For example, there is 0.02 probability of dying in the 0.01-hour interval between 5 and 5.01 hours, and (0.02 probability / 0.01 hours) = 2 hour−1. This quantity 2 hour−1 is called the probability density for dying at around 5 hours. Therefore, in response to the question "What is the probability that the bacterium dies at 5 hours?", a literally correct but unhelpful answer is 0, but a better answer can be written as (2 hour−1) dt. This is the probability that the bacterium dies within an infinitesimal window of time around 5 hours, where dt is the duration of this window.
For example, the probability that it lives longer than 5 hours but shorter than 5 hours + 1 nanosecond is roughly (2 hour−1) × (1 nanosecond) ≈ 6 × 10−13. There is a probability density function f with f(5 hours) = 2 hour−1. The integral of f over any window of time is the probability that the bacterium dies in that window. A probability density function is most commonly associated with absolutely continuous univariate distributions. A random variable X has density fX, where fX is a non-negative Lebesgue-integrable function, if Pr[a ≤ X ≤ b] = ∫_a^b fX(x) dx. That is, fX is any function whose integral over a measurable set gives the probability that X falls in that set. In the continuous univariate case above, the reference measure is the Lebesgue measure.
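The shrinking-window reasoning above can be checked numerically. A minimal Python sketch, assuming a hypothetical triangular lifetime density on [4.5, 5.5] hours chosen so that f(5) = 2 hour−1 (the specific density is an illustrative assumption, not from the text):

```python
def f(t):
    # Hypothetical lifetime density (in hour^-1): triangular on [4.5, 5.5]
    # with peak f(5) = 2; it integrates to 1 over its support.
    return max(0.0, 2.0 - 4.0 * abs(t - 5.0))

def prob(a, b, n=100_000):
    # P(a <= T <= b): midpoint-rule approximation of the integral of f.
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Probability of dying in ever-smaller windows just after 5 hours:
for width in (0.01, 0.001, 0.0001):
    p = prob(5.0, 5.0 + width)
    print(f"window {width} h: P ~ {p:.6f}, P/width ~ {p/width:.3f} per hour")
```

As the window shrinks, the printed ratio P/width approaches 2 per hour, which is exactly the probability density at 5 hours.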
3.
Cumulative distribution function
–
In the case of a continuous distribution, it gives the area under the probability density function from minus infinity to x. Cumulative distribution functions are also used to specify the distribution of multivariate random variables. The CDF gives the probability that X lies in the semi-closed interval (a, b] as F(b) − F(a). In the definition above, the "less than or equal to" sign, ≤, is a convention, not a universally used one, but it is important for discrete distributions. The proper use of tables of the binomial and Poisson distributions depends upon this convention; moreover, important formulas like Paul Lévy's inversion formula for the characteristic function also rely on the "less than or equal" formulation. If treating several random variables X, Y, etc., the corresponding letters are used as subscripts, while, if treating only one, the subscript is usually omitted. It is conventional to use a capital F for a distribution function, in contrast to the lower-case f used for probability density functions. This applies when discussing general distributions; some specific distributions have their own conventional notation. The CDF of a continuous random variable X can be expressed as the integral of its probability density function fX as follows: F_X(x) = ∫_{−∞}^{x} f_X(t) dt. In the case of a random variable X whose distribution has a discrete component at a value b, P(X = b) = F_X(b) − lim_{x→b−} F_X(x). If FX is continuous at b, this equals zero and there is no discrete component at b. Every cumulative distribution function F is non-decreasing and right-continuous, which makes it a càdlàg function. Furthermore, lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1. The function f equal to the derivative of F almost everywhere is called the probability density function of the distribution of X. As an example, suppose X is uniformly distributed on the unit interval [0, 1]. Then the CDF of X is given by F(x) = 0 for x < 0; F(x) = x for 0 ≤ x < 1; F(x) = 1 for x ≥ 1.
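The piecewise formula for the uniform CDF translates directly into code; a minimal Python sketch:

```python
def uniform_cdf(x):
    # CDF of the uniform distribution on [0, 1]:
    # 0 below the support, x on [0, 1), and 1 at or above 1.
    if x < 0:
        return 0.0
    if x < 1:
        return float(x)
    return 1.0

# The probability of a semi-closed interval (a, b] is F(b) - F(a):
print(uniform_cdf(0.75) - uniform_cdf(0.25))  # 0.5
```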
Suppose instead that X takes only the discrete values 0 and 1, with equal probability. Then the CDF of X is given by F(x) = 0 for x < 0; F(x) = 1/2 for 0 ≤ x < 1; F(x) = 1 for x ≥ 1. Sometimes, it is useful to study the opposite question and ask how often the random variable is above a particular level. This is called the complementary cumulative distribution function (ccdf), or simply the tail distribution or exceedance, defined as F̄(x) = P(X > x) = 1 − F(x). This has applications in statistical hypothesis testing, for example, because the one-sided p-value is the probability of observing a test statistic at least as extreme as the one observed; thus, provided that the test statistic T has a continuous distribution, the one-sided p-value is given by the ccdf. In survival analysis, F̄(x) is called the survival function and denoted S(x), while the term reliability function is common in engineering. Properties: for a non-negative continuous random variable having an expectation, Markov's inequality states that F̄(x) ≤ E(X)/x. As x → ∞, F̄(x) → 0, and in fact F̄(x) = o(1/x) provided that E(X) is finite. This form of illustration emphasises the median and dispersion of the distribution or of the empirical results. If the CDF F is strictly increasing and continuous, then F⁻¹(p), for p ∈ [0, 1], is the unique real number x such that F(x) = p.
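For the uniform-on-[0, 1] example above, the tail distribution and the inverse (quantile) function are both simple; a Python sketch:

```python
def F(x):
    # CDF of the uniform distribution on [0, 1].
    return 0.0 if x < 0 else (float(x) if x < 1 else 1.0)

def tail(x):
    # Complementary CDF (survival function): P(X > x) = 1 - F(x).
    return 1.0 - F(x)

def quantile(p):
    # Inverse CDF on (0, 1): the unique x with F(x) = p.
    # For the uniform distribution on [0, 1] this is the identity.
    return float(p)

print(tail(0.25))         # 0.75
print(F(quantile(0.25)))  # 0.25
```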
4.
Expected value
–
In probability theory, the expected value of a random variable is, intuitively, the long-run average value of repetitions of the experiment it represents. For example, the expected value in rolling a six-sided die is 3.5. Less roughly, the law of large numbers states that the arithmetic mean of the values almost surely converges to the expected value as the number of repetitions approaches infinity. The expected value is also known as the expectation, mathematical expectation, EV, average, mean value, or mean. More practically, the expected value of a discrete random variable is the probability-weighted average of all possible values. In other words, each value the random variable can assume is multiplied by its probability of occurring, and the resulting products are summed. The same principle applies to an absolutely continuous random variable, except that an integral of the variable with respect to its probability density replaces the sum. The expected value does not exist for random variables having some distributions with large tails; for such random variables, the long tails of the distribution prevent the sum or integral from converging. The expected value is a key aspect of how one characterizes a probability distribution; by contrast, the variance is a measure of dispersion of the possible values of the random variable around the expected value. The variance itself is defined in terms of two expectations: it is the expected value of the squared deviation of the variable's value from the variable's expected value. The expected value plays important roles in a variety of contexts. In regression analysis, one desires a formula in terms of observed data that will give a good estimate of the parameter giving the effect of some explanatory variable upon a dependent variable. The formula will give different estimates using different samples of data. A formula is typically considered good in this context if it is an unbiased estimator, that is, if the expected value of the estimate can be shown to equal the true value of the desired parameter.
In decision theory, and in particular in choice under uncertainty, one example of using expected value in reaching optimal decisions is the Gordon–Loeb model of information security investment. According to the model, one can conclude that the amount a firm spends to protect information should generally be only a fraction of the expected loss. Suppose random variable X can take value x1 with probability p1, value x2 with probability p2, and so on, up to value xk with probability pk. Then the expectation of this random variable X is defined as E[X] = x1 p1 + x2 p2 + ⋯ + xk pk. If all outcomes xi are equally likely, then the weighted average turns into the simple average. This is intuitive: the expected value of a random variable is the average of all values it can take, and thus the expected value is what one expects to happen on average. If the outcomes xi are not equally probable, then the simple average must be replaced with the weighted average; the intuition, however, remains the same: the expected value of X is what one expects to happen on average. Let X represent the outcome of a roll of a fair six-sided die; more specifically, X will be the number of pips showing on the top face of the die after the toss.
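The die example can be finished with the weighted-average formula; a Python sketch using exact rational arithmetic so no rounding obscures the result:

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]  # pips on a fair six-sided die
p = Fraction(1, 6)             # each outcome is equally likely

# E[X] = x1*p1 + x2*p2 + ... + xk*pk
expected = sum(x * p for x in outcomes)
print(expected)         # 7/2
print(float(expected))  # 3.5
```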
5.
Mode (statistics)
–
The mode is the value that appears most often in a set of data. The mode of a discrete probability distribution is the value x at which its probability mass function takes its maximum value; in other words, it is the value that is most likely to be sampled. The mode of a continuous probability distribution is the value x at which its probability density function has its maximum value, so the mode is at the peak. Like the statistical mean and median, the mode is a way of expressing, in a single number, important information about a random variable or a population. The numerical value of the mode is the same as that of the mean and median in a normal distribution, and it may be very different in highly skewed distributions. The mode is not necessarily unique for a given distribution, since the probability mass function or probability density function may take the same maximum value at several points x1, x2, etc. The most extreme case occurs in uniform distributions, where all values occur equally frequently. When a probability density function has multiple local maxima, it is common to refer to all of the local maxima as modes of the distribution; such a continuous distribution is called multimodal. In symmetric unimodal distributions, such as the normal distribution, the mean, median and mode all coincide. The mode of a sample is the element that occurs most often in the collection. For example, if 6 occurs more often than any other value in a sample, the mode of that sample is 6. If two values in a dataset each occur most often, the mode is not unique and the dataset may be said to be bimodal, while a set with more than two modes may be described as multimodal. For a sample from a continuous distribution, the concept is unusable in its raw form, since no two values will be exactly the same; the usual practice is to discretize the data into intervals of equal width, as when constructing a histogram. The mode is then the value where the histogram reaches its peak. A simple algorithm (given in the source as a MATLAB example) computes the mode of a sample by first sorting it in ascending order and then computing the differences of consecutive elements; runs of zero differences mark repeated values, and the longest run identifies the mode.
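A Python sketch of that sort-based procedure (an illustration of the algorithm described, not the original MATLAB listing):

```python
def sample_modes(data):
    # Sort the sample in ascending order, then scan runs of equal
    # consecutive values; the longest run(s) give the mode(s).
    s = sorted(data)
    modes, best = [], 0
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        run = j - i  # length of the run of repeated value s[i]
        if run > best:
            best, modes = run, [s[i]]
        elif run == best:
            modes.append(s[i])
        i = j
    return modes

print(sample_modes([1, 3, 6, 6, 6, 7]))  # [6]
print(sample_modes([1, 1, 2, 2, 3]))     # [1, 2] (bimodal)
```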
Unlike mean and median, the concept of mode also makes sense for nominal data. For example, taking a sample of Korean family names, one might find that Kim occurs more often than any other name; then Kim would be the mode of the sample. In any voting system where a plurality determines victory, a single modal value determines the victor. Unlike median, the concept of mode makes sense for any random variable assuming values from a vector space, including the real numbers. For example, a distribution of points in the plane will typically have a mean and a mode, but the median makes sense only when there is a linear order on the possible values. Generalizations of the concept of median to higher-dimensional spaces include the geometric median. For the remainder, the assumption is that we have a real-valued random variable.
6.
Variance
–
The variance has a central role in statistics. It is used in descriptive statistics, statistical inference, hypothesis testing, and goodness of fit. This makes it a central quantity in numerous fields such as physics, biology, chemistry, cryptography, and economics. The variance of a random variable X is the expected value of the squared deviation from the mean of X: Var(X) = E[(X − μ)²], where μ = E[X]. This definition encompasses random variables that are generated by processes that are discrete, continuous, or neither. The variance can also be thought of as the covariance of a random variable with itself: Var(X) = Cov(X, X). The variance is also equivalent to the second cumulant of the probability distribution that generates X. The variance is typically designated as Var(X), σX², or simply σ². Expanding the definition gives the equivalent formula Var(X) = E[X²] − (E[X])²; in computational floating point arithmetic, this equation should not be used, because it suffers from catastrophic cancellation when the two terms are nearly equal. If a continuous distribution does not have an expected value, as is the case for the Cauchy distribution, it does not have a variance either. Many other distributions for which the expected value does exist also do not have a finite variance because the integral in the variance definition diverges; an example is a Pareto distribution whose index k satisfies 1 < k ≤ 2. The normal distribution with parameters μ and σ is a continuous distribution whose probability density function is given by f(x) = (1 / √(2πσ²)) e^{−(x−μ)² / (2σ²)}. In this distribution, E[X] = μ, and the variance Var(X) is related with σ via Var(X) = ∫_{−∞}^{∞} (x − μ)² (1 / √(2πσ²)) e^{−(x−μ)² / (2σ²)} dx = σ². The role of the normal distribution in the central limit theorem is in part responsible for the prevalence of the variance in probability and statistics. The exponential distribution with parameter λ is a continuous distribution whose support is the semi-infinite interval [0, ∞). Its probability density function is given by f(x) = λ e^{−λx}, its expected value is μ = 1/λ, and the variance is equal to Var(X) = ∫_0^∞ (x − 1/λ)² λ e^{−λx} dx = λ^{−2}. So for an exponentially distributed random variable, σ² = μ². The Poisson distribution with parameter λ is a discrete distribution for k = 0, 1, 2, ….
Its probability mass function is given by p(k) = (λᵏ / k!) e^{−λ}, and it has expected value μ = λ. The variance is equal to Var(X) = Σ_{k=0}^{∞} (λᵏ / k!) e^{−λ} (k − λ)² = λ, so for a Poisson-distributed random variable, σ² = μ. The binomial distribution with parameters n and p is a discrete distribution for k = 0, 1, 2, …, n. Its probability mass function is given by p(k) = (n choose k) pᵏ (1 − p)^{n−k}, and the variance is equal to Var(X) = Σ_{k=0}^{n} (n choose k) pᵏ (1 − p)^{n−k} (k − np)² = np(1 − p).
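The Poisson and binomial variance identities above can be verified by summing the probability mass functions directly; a Python sketch (the Poisson series is truncated at a fixed index, an approximation):

```python
from math import comb, exp

def poisson_variance(lam, kmax=200):
    # Var = sum over k of pmf(k) * (k - mean)^2, with the series truncated
    # at kmax; the pmf is built via the recurrence p(k+1) = p(k)*lam/(k+1).
    pmf, p = [], exp(-lam)
    for k in range(kmax):
        pmf.append(p)
        p *= lam / (k + 1)
    mean = sum(k * q for k, q in enumerate(pmf))
    return sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

def binomial_variance(n, p):
    # Exact finite sum over the binomial pmf.
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    return sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

print(poisson_variance(3.0))       # close to 3.0  (sigma^2 = lambda)
print(binomial_variance(10, 0.5))  # 2.5           (n*p*(1-p))
```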
7.
Random variable
–
In probability and statistics, a random variable, random quantity, aleatory variable, or stochastic variable is a variable whose value depends on the possible outcomes of a random phenomenon. It is common that these outcomes depend on physical variables that are not well understood. For example, when you toss a coin, the outcome of heads or tails depends on the uncertain physics; which outcome will be observed is not certain. Of course the coin could get caught in a crack in the floor, but such a possibility is excluded from consideration. The domain of a random variable is the set of possible outcomes. In the case of the coin, there are two possible outcomes, namely heads or tails. Since one of these outcomes must occur, either the event that the coin lands heads or the event that the coin lands tails must have non-zero probability. A random variable is defined as a function that maps outcomes to numerical quantities, typically real numbers. In this sense, it is a procedure for assigning a numerical quantity to each outcome, and, contrary to its name, this procedure itself is neither random nor variable; what is random is the physics that describes how the coin lands. A random variable's possible values might represent the possible outcomes of a yet-to-be-performed experiment, and they may also conceptually represent either the results of an objectively random process or the subjective randomness that results from incomplete knowledge of a quantity. The mathematics works the same regardless of the interpretation in use. A random variable has a probability distribution, which specifies the probability that its value falls in any given interval. Two random variables with the same probability distribution can still differ in terms of their associations with, or independence from, other random variables. The realizations of a random variable, that is, the results of randomly choosing values according to the variable's probability distribution function, are called random variates.
The formal mathematical treatment of random variables is a topic in probability theory. In that context, a random variable is understood as a function defined on a sample space whose outputs are numerical values. A random variable X : Ω → E is a function from a set of possible outcomes Ω to a measurable space E. The technical axiomatic definition requires Ω to be a probability space. A random variable does not return a probability; the probability of a set of outcomes is given by the probability measure P with which Ω is equipped. Rather, X returns a numerical quantity computed from the outcomes in Ω, for example the number of heads in a collection of coin flips.
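The coin-flip description can be made concrete: X is just a function on the sample space, and probabilities come from the measure on Ω. A minimal Python sketch (the two-flip sample space with a fair coin is an illustrative assumption):

```python
from itertools import product

# Sample space Omega: all outcomes of two fair coin flips,
# each outcome carrying probability 1/4 under the measure P.
omega = list(product("HT", repeat=2))
P = {w: 0.25 for w in omega}

def X(w):
    # The random variable: maps an outcome to the number of heads.
    return w.count("H")

def prob_X_equals(n):
    # P(X = n): sum the measure over the preimage {w : X(w) = n}.
    return sum(P[w] for w in omega if X(w) == n)

print(prob_X_equals(1))  # 0.5
```

Note that X itself is deterministic; the randomness lives entirely in the probability measure P on Ω.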