Coin flipping, coin tossing, or heads or tails is the practice of throwing a coin in the air and checking which side is showing when it lands, in order to choose between two alternatives, sometimes to resolve a dispute between two parties. It is a form of sortition; the party who correctly calls the side wins. The historical origin of coin flipping is the interpretation of a chance outcome as the expression of divine will. Coin flipping was known to the Romans as navia aut caput ("ship or head"), as some coins had a ship on one side and the head of the emperor on the other. In England, this was referred to as cross and pile; the expression "heads or tails" results from heads and tails being considered complementary body parts. During a coin toss, the coin is thrown into the air such that it rotates edge-over-edge several times. Either beforehand or while the coin is in the air, an interested party calls "heads" or "tails", indicating which side of the coin that party is choosing; the other party is assigned the opposite side. Depending on custom, the coin may be caught.
When the coin comes to rest, the toss is complete and the party who called or was assigned the upper side is declared the winner. It is possible for a coin to land on its edge, usually by landing up against an object or by getting stuck in the ground; however, even on a flat surface a coin can land on its edge, with a chance of about 1 in 6000 for an American nickel. Angular momentum prevents most coins from landing on their edges unsupported when flipped; such cases are exceptionally rare, and in most of them the coin is simply re-flipped. The coin may be of any type. Larger coins tend to be more popular than smaller ones; some high-profile coin tosses, such as those at the Cricket World Cup and the Super Bowl, use custom-made ceremonial medallions. Three-way coin flips are also possible, by a different process – this can be done either to choose two out of three or to choose one out of three. To choose two out of three, three coins are flipped: if two come up the same and one comes up different, the different one loses, leaving two players.
To choose one out of three, either reverse this, or add a regular two-way coin flip between the remaining players as a second step. Note that the three-way flip has a 75% chance of working each time it is tried and does not require that "heads" or "tails" be called; a simulation of this is sketched below. A famous example of such a three-way coin flip is dramatized in Friday Night Lights, in which three high school football teams use a three-way coin flip to break a tie. A legacy of this coin flip was to reduce the use of coin flips to break ties in Texas sports, instead using point systems to reduce the frequency of ties. Coin tossing is a simple and unbiased way of settling a dispute or deciding between two or more arbitrary options. In a game-theoretic analysis it provides even odds to both sides involved, requires little effort, and prevents the dispute from escalating into a struggle. It is used in sports and other games to decide arbitrary factors such as which side of the field a team will play from, or which side will attack or defend initially; factors such as wind direction, the position of the sun, and other conditions may affect the decision.
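As an illustration of the three-way flip, here is a minimal Python sketch (an assumed example, not part of the article): all three players flip a coin, and the round resolves whenever exactly one flip differs from the other two, which happens in 6 of the 8 equally likely outcomes, i.e. 75% of the time.

import random

def three_way_flip(rng):
    # One round of "choose two out of three": return the index (0-2) of the
    # eliminated player, or None if all three coins match and the round fails.
    flips = [rng.choice("HT") for _ in range(3)]
    for i, f in enumerate(flips):
        if flips.count(f) == 1:   # the odd one out loses
            return i
    return None

rng = random.Random(0)
trials = 100_000
resolved = sum(three_way_flip(rng) is not None for _ in range(trials))
print(f"fraction of rounds resolved: {resolved / trials:.3f}")   # roughly 0.75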
In team sports it is often the captain who makes the call, while the umpire or referee oversees such proceedings. A competitive method may be used instead of a toss in some situations: for example, in basketball the jump ball is employed, while the face-off plays a similar role in ice hockey. Coin flipping is used to decide which end of the field the teams will play to and/or which team gets first use of the ball, or similar questions, in football matches, American football games, Australian rules football and other sports requiring such decisions. In the U.S. a specially minted coin is flipped in National Football League games. The XFL, a short-lived American football league, attempted to avoid coin tosses by implementing a face-off-style "opening scramble," in which one player from each team tried to recover a loose football; because of the high rate of injury in these events, this approach has not achieved mainstream popularity in any football league, and coin tossing remains the method of choice in American football.
In an association football match, the team winning the coin toss chooses which goal to attack in the first half. For the second half, the teams switch ends, and the team that won the coin toss kicks off. Coin tosses are also used to decide which team has the pick of going first or second in a penalty shoot-out. Before the early-1970s introduction of the penalty shootout, coin tosses were needed to decide the outcome of tied matches; the most famous instance of this was the semifinal game of the 1968 European Championship in Italy between Italy and the Soviet Union, which finished 0-0 after extra time. Italy won the toss and went on to become European champions. In cricket the toss is significant, as the decision whether to bat or field first can influence the outcome of the match.
Central limit theorem
In probability theory, the central limit theorem establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions. For example, suppose that a sample is obtained containing a large number of observations, each observation being randomly generated in a way that does not depend on the values of the other observations, and that the arithmetic mean of the observed values is computed. If this procedure is performed many times, the central limit theorem says that the distribution of the average will be well approximated by a normal distribution. A simple example of this is that if one flips a fair coin many times, the probability of getting a given number of heads in a series of flips will approach a normal curve, with mean equal to half the total number of flips in each series.
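To make the coin-flip example concrete, the following Python sketch (an assumed illustration; the flip count and repetition count are arbitrary choices) compares the simulated distribution of the number of heads in 100 fair flips with the normal curve of mean n/2 and variance n/4.

import math
import random
from collections import Counter

n, reps = 100, 100_000
rng = random.Random(1)
counts = Counter(sum(rng.random() < 0.5 for _ in range(n)) for _ in range(reps))

mu, sigma = n / 2, math.sqrt(n / 4)            # mean and standard deviation of the head count
for k in range(46, 55):
    empirical = counts[k] / reps
    normal = math.exp(-((k - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    print(f"heads = {k}: simulated {empirical:.4f}, normal approximation {normal:.4f}")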
The central limit theorem has a number of variants. In its common form, the random variables must be independent and identically distributed. In variants, convergence of the mean to the normal distribution also occurs for non-identical distributions or for non-independent observations, provided they comply with certain conditions; the earliest version of this theorem, that the normal distribution may be used as an approximation to the binomial distribution, is now known as the de Moivre–Laplace theorem. In more general usage, a central limit theorem is any of a set of weak-convergence theorems in probability theory; they all express the fact that a sum of many independent and identically distributed random variables, or alternatively, random variables with specific types of dependence, will tend to be distributed according to one of a small set of attractor distributions. When the variance of the i.i.d. variables is finite, the attractor distribution is the normal distribution. In contrast, the sum of a number of i.i.d.
random variables with power-law tail distributions decreasing as |x|^(−α−1), where 0 < α < 2, will tend to an alpha-stable distribution with stability parameter α as the number of variables grows. Let X1, X2, …, Xn be a random sample of size n—that is, a sequence of independent and identically distributed random variables drawn from a distribution with expected value µ and finite variance σ². Suppose we are interested in the sample average Sn := (X1 + ⋯ + Xn)/n of these random variables. By the law of large numbers, the sample averages converge in probability and almost surely to the expected value µ as n → ∞; the classical central limit theorem describes the size and the distributional form of the stochastic fluctuations around the deterministic number µ during this convergence. More precisely, it states that as n gets larger, the distribution of the difference between the sample average Sn and its limit µ, when multiplied by the factor √n, approximates the normal distribution with mean 0 and variance σ².
For large enough n, the distribution of Sn is close to the normal distribution with mean µ and variance σ²/n. The usefulness of the theorem is that the distribution of √n(Sn − µ) approaches normality regardless of the shape of the distribution of the individual Xi. Formally, the theorem can be stated as follows: Lindeberg–Lévy CLT. Suppose X1, X2, … is a sequence of i.i.d. random variables with E[Xi] = µ and Var[Xi] = σ² < ∞. Then, as n approaches infinity, the random variables √n(Sn − µ) converge in distribution to a normal N(0, σ²): √n(Sn − µ) →d N(0, σ²). In the case σ > 0, convergence in distribution means that the cumulative distribution functions of √n(Sn − µ) converge pointwise to the cdf of the N(0, σ²) distribution: for every real number z, lim n→∞ Pr[√n(Sn − µ) ≤ z] = Φ(z/σ), where Φ(x) is the standard normal cdf evaluated at x. The convergence is uniform in z in the sense that lim n→∞ sup z∈ℝ |Pr[√n(Sn − µ) ≤ z] − Φ(z/σ)| = 0, where sup denotes the least upper bound of the set.
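The statement can be checked numerically. The sketch below (an assumed example; the exponential distribution, sample size and repetition count are arbitrary choices) draws i.i.d. exponential variables with µ = σ = 1 and compares the empirical distribution function of √n(Sn − µ) against Φ(z/σ).

import math
import random

def phi(x):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = random.Random(42)
n, reps, mu, sigma = 100, 20_000, 1.0, 1.0
scaled = [
    math.sqrt(n) * (sum(rng.expovariate(1.0) for _ in range(n)) / n - mu)
    for _ in range(reps)
]
for z in (-1.0, 0.0, 1.0):
    empirical = sum(s <= z for s in scaled) / reps
    print(f"z = {z:+.1f}: empirical {empirical:.3f}, Phi(z/sigma) {phi(z / sigma):.3f}")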
John von Neumann
John von Neumann was a Hungarian-American mathematician, physicist, computer scientist, and polymath. Von Neumann was regarded as the foremost mathematician of his time and said to be "the last representative of the great mathematicians". He made major contributions to a number of fields, including mathematics, physics, economics and statistics. He was a pioneer of the application of operator theory to quantum mechanics in the development of functional analysis, and a key figure in the development of game theory and the concepts of cellular automata, the universal constructor and the digital computer. He published over 150 papers in his life: about 60 in pure mathematics, 60 in applied mathematics, 20 in physics, and the remainder on special mathematical subjects or non-mathematical ones. His last work, an unfinished manuscript written while he was in hospital, was published in book form as The Computer and the Brain. His analysis of the structure of self-replication preceded the discovery of the structure of DNA. In a short list of facts about his life he submitted to the National Academy of Sciences, he stated, "The part of my work I consider most essential is that on quantum mechanics, which developed in Göttingen in 1926, subsequently in Berlin in 1927–1929.
My work on various forms of operator theory, Berlin 1930 and Princeton 1935–1939." During World War II, von Neumann worked on the Manhattan Project with theoretical physicist Edward Teller, mathematician Stanisław Ulam and others, solving key steps in the nuclear physics involved in thermonuclear reactions and the hydrogen bomb. He developed the mathematical models behind the explosive lenses used in the implosion-type nuclear weapon and coined the term "kiloton" as a measure of the explosive force generated. After the war, he served on the General Advisory Committee of the United States Atomic Energy Commission and consulted for a number of organizations, including the United States Air Force, the Army's Ballistic Research Laboratory, the Armed Forces Special Weapons Project, and the Lawrence Livermore National Laboratory. As a Hungarian émigré, concerned that the Soviets would achieve nuclear superiority, he designed and promoted the policy of mutually assured destruction to limit the arms race.
Von Neumann was born Neumann János Lajos to a wealthy and non-observant Jewish family. (After his arrival in the U.S., he was baptized a Roman Catholic prior to his marriage to his Catholic first wife.) Von Neumann was born in Budapest, Kingdom of Hungary, then part of the Austro-Hungarian Empire; he was the eldest of three brothers. His father, Neumann Miksa, was a banker who had moved to Budapest from Pécs at the end of the 1880s. Miksa's father and grandfather were both born in Zemplén County, northern Hungary. John's mother was Kann Margit. Three generations of the Kann family lived in spacious apartments above the Kann-Heller offices in Budapest. On February 20, 1913, Emperor Franz Joseph elevated his father to the Hungarian nobility for his service to the Austro-Hungarian Empire; the Neumann family thus acquired the hereditary appellation margittai ("of Margitta"), although the family had no connection with the town. Neumann János became margittai Neumann János, which he later changed to the German Johann von Neumann. Von Neumann was a child prodigy.
When he was 6 years old, he could divide two 8-digit numbers in his head and could converse in Ancient Greek. When the 6-year-old von Neumann caught his mother staring aimlessly, he asked her, "What are you calculating?" Children did not begin formal schooling in Hungary until they were ten years of age. His father, Max, believed that knowledge of languages in addition to Hungarian was essential, so the children were tutored in English, French and Italian. By the age of 8, von Neumann was familiar with differential and integral calculus, but he was particularly interested in history, reading his way through Wilhelm Oncken's 46-volume Allgemeine Geschichte in Einzeldarstellungen, a copy of which the family kept in its private library. One of the rooms in the apartment was converted into a library and reading room, with bookshelves from floor to ceiling. Von Neumann entered the Lutheran Fasori Evangélikus Gimnázium in 1911, where Eugene Wigner soon became his friend; this was one of the best schools in Budapest and was part of a brilliant education system designed for the elite.
Under the Hungarian system, children received all their education at the one gymnasium. The Hungarian school system produced a generation noted for intellectual achievement.
Stochastic process
In probability theory and related fields, a stochastic or random process is a mathematical object defined as a collection of random variables. The random variables are usually associated with or indexed by a set of numbers viewed as points in time, giving the interpretation of a stochastic process representing numerical values of some system randomly changing over time, such as the growth of a bacterial population, an electrical current fluctuating due to thermal noise, or the movement of a gas molecule. Stochastic processes are used as mathematical models of systems and phenomena that appear to vary in a random manner; they have applications in many disciplines, including sciences such as biology, ecology and physics, as well as technology and engineering fields such as image processing, signal processing, information theory, computer science and telecommunications. Furthermore, random changes in financial markets have motivated the extensive use of stochastic processes in finance. Applications and the study of phenomena have in turn inspired the proposal of new stochastic processes.
Examples of such stochastic processes include the Wiener process or Brownian motion process, used by Louis Bachelier to study price changes on the Paris Bourse, and the Poisson process, used by A. K. Erlang to study the number of phone calls occurring in a certain period of time; these two stochastic processes are considered the most important and central in the theory of stochastic processes, and were discovered repeatedly and independently, both before and after Bachelier and Erlang, in different settings and countries. The term random function is also used to refer to a stochastic or random process, because a stochastic process can be interpreted as a random element in a function space; the terms stochastic process and random process are used interchangeably, often with no specific mathematical space for the set that indexes the random variables. But these two terms are often used when the random variables are indexed by the integers or an interval of the real line. If the random variables are indexed by the Cartesian plane or some higher-dimensional Euclidean space, the collection of random variables is called a random field instead.
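As an illustration of the second example, here is a minimal Python sketch (an assumed illustration; the rate and time horizon are arbitrary choices) of a Poisson process: inter-arrival times are independent exponential variables with rate λ, and the number of arrivals in [0, T] then has mean λT.

import random

def poisson_arrivals(lam, horizon, rng):
    # Arrival times in [0, horizon], generated from exponential inter-arrival gaps.
    times, t = [], 0.0
    while True:
        t += rng.expovariate(lam)
        if t > horizon:
            return times
        times.append(t)

rng = random.Random(3)
lam, horizon, reps = 2.0, 10.0, 20_000
mean_count = sum(len(poisson_arrivals(lam, horizon, rng)) for _ in range(reps)) / reps
print(f"average number of arrivals: {mean_count:.2f} (expected {lam * horizon:.1f})")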
The values of a stochastic process are not always numbers and can be vectors or other mathematical objects. Based on their mathematical properties, stochastic processes can be divided into various categories, which include random walks, Markov processes, Lévy processes, Gaussian processes, random fields, renewal processes, and branching processes; the study of stochastic processes uses mathematical knowledge and techniques from probability, linear algebra, set theory and topology, as well as branches of mathematical analysis such as real analysis, measure theory, Fourier analysis and functional analysis. The theory of stochastic processes is considered to be an important contribution to mathematics and it continues to be an active topic of research for both theoretical reasons and applications. A stochastic or random process can be defined as a collection of random variables indexed by some mathematical set, meaning that each random variable of the stochastic process is uniquely associated with an element in the set.
The set used to index the random variables is called the index set. Historically, the index set was some subset of the real line, such as the natural numbers, giving the index set the interpretation of time; each random variable in the collection takes values from the same mathematical space, known as the state space. This state space can be, for example, the integers, the real line, or n-dimensional Euclidean space. An increment is the amount that a stochastic process changes between two index values, interpreted as two points in time. A stochastic process can have many outcomes, due to its randomness, and a single outcome of a stochastic process is called, among other names, a sample function or realization. A stochastic process can be classified in different ways, for example, by its state space, its index set, or the dependence among the random variables. One common way of classification is by the cardinality of the index set and the state space. When interpreted as time, if the index set of a stochastic process has a finite or countable number of elements, such as a finite set of numbers, the set of integers, or the natural numbers, the stochastic process is said to be in discrete time.
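For instance, a simple random walk is a discrete-time stochastic process with index set the natural numbers and state space the integers; the Python sketch below (an assumed example, not from the article) generates a few of its sample functions (realizations).

import random

def random_walk_path(steps, rng):
    # One realization: S_0 = 0 and S_t = S_{t-1} +/- 1 with equal probability.
    path = [0]
    for _ in range(steps):
        path.append(path[-1] + rng.choice((-1, 1)))
    return path

rng = random.Random(7)
for _ in range(3):                      # three different outcomes of the same process
    print(random_walk_path(10, rng))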
If the index set is some interval of the real line, then time is said to be continuous. The two types of stochastic processes are respectively referred to as discrete-time and continuous-time stochastic processes. Discrete-time stochastic processes are considered easier to study because continuous-time processes require more advanced mathematical techniques and knowledge, due to the index set being uncountable. If the index set is the integers, or some subset of them, the stochastic process can also be called a random sequence. If the state space is the integers or natural numbers, the stochastic process is called a discrete or integer-valued stochastic process. If the state space is the real line, the stochastic process is referred to as a real-valued stochastic process or a process with continuous state space. If the state space is n-dimensional Euclidean space, the stochastic process is called an n-dimensional vector process or n-vector process. The word stochastic in English was originally used as an adjective with the definition "pertaining to conjecturing", stemming from a Greek word meaning "to aim at a mark, guess", and the Oxford English Dictionary gives the year 16
Entropy (information theory)
Information entropy is the average rate at which information is produced by a stochastic source of data. The measure of information entropy associated with each possible data value is the negative logarithm of the probability mass function for the value, so the entropy is S = −∑i Pi log Pi. When the data source produces a low-probability value, the event carries more "information" than when the source produces a high-probability value. The amount of information conveyed by each event defined in this way becomes a random variable whose expected value is the information entropy. Generally, entropy refers to disorder or uncertainty, and the definition of entropy used in information theory is directly analogous to the definition used in statistical thermodynamics; the concept of information entropy was introduced by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication". The basic model of a data communication system is composed of three elements: a source of data, a communication channel, and a receiver. As expressed by Shannon, the "fundamental problem of communication" is for the receiver to be able to identify what data was generated by the source, based on the signal it receives through the channel.
The entropy provides an absolute limit on the shortest possible average length of a lossless compression encoding of the data produced by a source, and if the entropy of the source is less than the channel capacity of the communication channel, the data generated by the source can be reliably communicated to the receiver. Information entropy is measured in bits, or sometimes in "natural units" (nats) or decimal digits; the unit of measurement depends on the base of the logarithm used to define the entropy. The logarithm of the probability distribution is useful as a measure of entropy because it is additive for independent sources. For instance, the entropy of a fair coin toss is 1 bit, and the entropy of m independent tosses is m bits. In a straightforward representation, log2(n) bits are needed to represent a variable that can take one of n values, if n is a power of 2. If these values are equally probable, the entropy (in bits) is equal to this number. If one of the values is more probable to occur than the others, an observation that this value occurs is less informative than if some less common outcome had occurred.
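As a small worked example (an assumed sketch, not from the article), the entropy S = −∑i Pi log2 Pi can be computed directly; it gives 1 bit for a fair coin, m bits for m independent fair tosses, and about 2.585 bits for a fair six-sided die.

import math

def entropy_bits(probs):
    # Shannon entropy in bits; outcomes with zero probability contribute nothing.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.5]))          # fair coin: 1.0 bit
print(3 * entropy_bits([0.5, 0.5]))      # three independent fair tosses: 3.0 bits
print(entropy_bits([1 / 6] * 6))         # fair six-sided die: ~2.585 bits
print(entropy_bits([0.9, 0.1]))          # biased coin: less than 1 bit
print(entropy_bits([1.0, 0.0]))          # certain outcome: 0.0 bits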
Conversely, rarer events provide more information when observed. Since observation of less probable events occurs more rarely, the net effect is that the entropy received from non-uniformly distributed data is always less than or equal to log2(n). Entropy is zero when one outcome is certain; the entropy quantifies these considerations when a probability distribution of the source data is known. The meaning of the events observed does not matter for the definition of entropy. Entropy only takes into account the probability of observing a specific event, so the information it encapsulates is information about the underlying probability distribution, not the meaning of the events themselves. The basic idea of information theory is that the more one knows about a topic, the less new information one is apt to get about it. If an event is very probable, it is no surprise when it happens and provides little new information. Inversely, if the event was improbable, it is much more informative; the information content is an increasing function of the reciprocal of the probability of the event. If more events may happen, entropy measures the average information content one can expect to get if one of the events happens.
This implies that casting a die has more entropy than tossing a coin, because each outcome of the die has smaller probability than each outcome of the coin. Entropy is a measure of the unpredictability of the state, or equivalently, of its average information content. To get an intuitive understanding of these terms, consider the example of a political poll; such polls happen because the outcome of the poll is not known. In other words, the outcome of the poll is unpredictable, and performing the poll and learning the results gives some new information. Now, consider the case that the same poll is performed a second time shortly after the first poll. Since the result of the first poll is already known, the outcome of the second poll can be predicted well and the results should not contain much new information. Next, consider the example of a coin toss. Assuming the probability of heads is the same as the probability of tails, the entropy of the coin toss is as high as it could be. There is no way to predict the outcome of the coin toss ahead of time: if one has to choose, the best one can do is predict that the coin will come up heads, and this prediction will be correct with probability 1/2.
Such a coin toss has one bit of entropy, since there are two possible outcomes that occur with equal probability, and learning the actual outcome contains one bit of information. In contrast, a coin toss using a coin that has two heads and no tails has zero entropy, since the coin will always come up heads and the outcome can be predicted perfectly.
Measure (mathematics)
In mathematical analysis, a measure on a set is a systematic way to assign a number to each suitable subset of that set, intuitively interpreted as its size. In this sense, a measure is a generalization of the concepts of length and volume. An important example is the Lebesgue measure on a Euclidean space, which assigns the conventional length and volume of Euclidean geometry to suitable subsets of the n-dimensional Euclidean space Rn. For instance, the Lebesgue measure of the interval [0, 1] in the real numbers is its length in the everyday sense of the word, namely 1. Technically, a measure is a function that assigns a non-negative real number or +∞ to certain subsets of a set X; it must further be countably additive: the measure of a "large" subset that can be decomposed into a countable number of "smaller" disjoint subsets is equal to the sum of the measures of the "smaller" subsets. In general, if one wants to associate a consistent size to each subset of a given set while satisfying the other axioms of a measure, one only finds trivial examples like the counting measure.
This problem was resolved by defining measure only on a sub-collection of all subsets, the so-called measurable subsets, which are required to form a σ-algebra. This means that countable unions, countable intersections and complements of measurable subsets are measurable. Non-measurable sets in a Euclidean space, on which the Lebesgue measure cannot be defined consistently, are complicated in the sense of being badly mixed up with their complement. Indeed, their existence is a non-trivial consequence of the axiom of choice. Measure theory was developed in successive stages during the late 19th and early 20th centuries by Émile Borel, Henri Lebesgue, Johann Radon and Maurice Fréchet, among others; the main applications of measures are in the foundations of the Lebesgue integral, in Andrey Kolmogorov's axiomatisation of probability theory and in ergodic theory. In integration theory, specifying a measure allows one to define integrals on spaces more general than subsets of Euclidean space. Probability theory considers measures that assign to the whole set the size 1, and considers measurable subsets to be events whose probability is given by the measure.
Ergodic theory considers measures that are invariant under, or arise naturally from, a dynamical system. Let X be a set and Σ a σ-algebra over X. A function μ from Σ to the extended real number line is called a measure if it satisfies the following properties. Non-negativity: for all E in Σ, μ(E) ≥ 0. Null empty set: μ(∅) = 0. Countable additivity (σ-additivity): for every countable collection E1, E2, E3, … of pairwise disjoint sets in Σ, μ(E1 ∪ E2 ∪ E3 ∪ ⋯) = μ(E1) + μ(E2) + μ(E3) + ⋯. One may require that at least one set E has finite measure; then the empty set automatically has measure zero because of countable additivity, since μ(E) = μ(E ∪ ∅ ∪ ∅ ∪ ⋯) = μ(E) + μ(∅) + μ(∅) + ⋯, which implies that μ(∅) = 0. If only the second and third conditions of the definition above are met, and μ takes on at most one of the values ±∞, then μ is called a signed measure. The pair (X, Σ) is called a measurable space, and the members of Σ are called measurable sets. If (X, Σ_X) and (Y, Σ_Y) are two measurable spaces, a function f: X → Y is called measurable if for every Y-measurable set B ∈ Σ_Y, the inverse image is X-measurable, i.e. f⁻¹(B) ∈ Σ_X.
In this setup, the composition of measurable functions is measurable, making the measurable spaces and measurable functions a category, with the measurable spaces as objects and the sets of measurable functions as arrows. See Measurable function#Term usage variations about another setup. A triple (X, Σ, μ) is called a measure space. A probability measure is a measure with total measure one – i.e. μ(X) = 1. A probability space is a measure space with a probability measure. For measure spaces that are also topological spaces, various compatibility conditions can be placed on the measure and the topology.
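As a small worked example (an assumed sketch, not part of the article), the following Python code builds a measure on the finite set X = {a, b, c}, with Σ the collection of all subsets, from three point masses, and checks the defining properties; since μ(X) = 1 it is also a probability measure.

from itertools import combinations

weights = {"a": 0.5, "b": 0.3, "c": 0.2}            # point masses (an arbitrary choice)

def mu(subset):
    # Measure of a subset: the sum of the point masses of its elements.
    return sum(weights[x] for x in subset)

X = frozenset(weights)
subsets = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

assert mu(frozenset()) == 0                          # null empty set
assert all(mu(E) >= 0 for E in subsets)              # non-negativity
E1, E2 = frozenset({"a"}), frozenset({"b", "c"})     # disjoint measurable sets
assert abs(mu(E1 | E2) - (mu(E1) + mu(E2))) < 1e-12  # additivity on disjoint sets
print("mu(X) =", mu(X))                              # 1.0, so mu is a probability measure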
String (computer science)
In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed after creation. A string is generally considered a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence data types and structures. Depending on the programming language and the precise data type used, a variable declared to be a string may either cause storage in memory to be statically allocated for a predetermined maximum length or employ dynamic allocation to allow it to hold a variable number of elements. When a string appears literally in source code, it is known as a string literal or an anonymous string. In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set called an alphabet. Let Σ be a non-empty finite set of symbols, called the alphabet.
No assumption is made about the nature of the symbols. A string over Σ is any finite sequence of symbols from Σ. For example, if Σ = {0, 1}, then 01011 is a string over Σ; the length of a string s can be any non-negative integer. The empty string is the unique string over Σ of length 0 and is denoted ε or λ; the set of all strings over Σ of length n is denoted Σn. For example, if Σ = {0, 1}, then Σ2 = {00, 01, 10, 11}. Note that Σ0 = {ε} for any alphabet Σ; the set of all strings over Σ of any length is the Kleene closure of Σ and is denoted Σ*. In terms of Σn, Σ* = Σ0 ∪ Σ1 ∪ Σ2 ∪ ⋯. For example, if Σ = {0, 1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, …}. Although the set Σ* itself is countably infinite, each element of Σ* is a string of finite length. A set of strings over Σ is called a formal language over Σ. For example, if Σ = {0, 1}, the set of strings with an even number of zeros is a formal language over Σ. Concatenation is an important binary operation on Σ*. For any two strings s and t in Σ*, their concatenation is defined as the sequence of symbols in s followed by the sequence of symbols in t, and is denoted st.
For example, if Σ = {a, b, …, z}, s = bear, and t = hug, then st = bearhug and ts = hugbear. String concatenation is an associative but non-commutative operation; the empty string ε serves as the identity element. Therefore, the set Σ* and the concatenation operation form a monoid, the free monoid generated by Σ. In addition, the length function defines a monoid homomorphism from Σ* to the non-negative integers. A string s is said to be a substring or factor of t if there exist strings u and v such that t = usv; the relation "is a substring of" defines a partial order on Σ*, the least element of which is the empty string. A string s is said to be a prefix of t if there exists a string u such that t = su. If u is nonempty, s is said to be a proper prefix of t. Symmetrically, a string s is said to be a suffix of t if there exists a string u such that t = us. If u is nonempty, s is said to be a proper suffix of t. Suffixes and prefixes are substrings of t. Both the relations "is a prefix of" and "is a suffix of" are prefix orders. A string s = uv is said to be a rotation of t if t = vu.
For example, if Σ = {0, 1}, the string 0011001 is a rotation of 0100110, where u = 00110 and v = 01. The reverse of a string is a string with the same symbols but in reverse order. For example, if s = abc, the reverse of s is cba. A string that is the reverse of itself is called a palindrome; palindromes include the empty string and all strings of length 1. It is often useful to define an ordering on a set of strings. If the alphabet Σ has a total order, one can define a total order on Σ* called lexicographical order. For example, if Σ = {0, 1} and 0 < 1, the lexicographical order on Σ* includes the relationships ε < 0 < 00 < 000 < … < 0001 < 001 < 01 < 010 < 011 < 0110 < 01111 < 1 < 10 < 100 < 101 < 111 < 1111 < 11111 … The lexicographical order is total if the alphabetical order is, but isn't well-founded for any nontrivial alphabet, even if the alphabetical order is. See Shortlex for an alternative string ordering that preserves well-foundedness. A number of additional operations on strings occur in the formal theory; these are given in the article on string operations.
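The following Python sketch (an assumed illustration, not part of the article) exercises the formal operations described above over the alphabet Σ = {0, 1}: the sets Σn, concatenation, prefix testing, rotation, and reversal.

from itertools import product

SIGMA = ("0", "1")

def sigma_n(n):
    # All strings over SIGMA of length n, i.e. the set Sigma^n.
    return ["".join(p) for p in product(SIGMA, repeat=n)]

def is_prefix(s, t):
    # s is a prefix of t if t = su for some string u.
    return t.startswith(s)

def is_rotation(s, t):
    # s = uv is a rotation of t if t = vu for some split of s into u and v.
    return len(s) == len(t) and any(s[i:] + s[:i] == t for i in range(len(s) or 1))

print(sigma_n(0), sigma_n(2))                 # [''] and ['00', '01', '10', '11']
print("bear" + "hug", "hug" + "bear")         # concatenation is non-commutative
print(is_prefix("01", "0110"))                # True
print(is_rotation("0011001", "0100110"))      # True, with u = 00110 and v = 01
print("abc"[::-1])                            # reversal: 'cba'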
Strings admit the following interpretation as nodes on a graph: fixed-length strings can be viewed as nodes on a hypercube; variable-length strings can be viewed as nodes on the k-ary tree, where k is the number of symbols in Σ; infinite strings can be viewed as i