1.
Dice
–
Dice are small throwable objects with multiple resting positions, used for generating random numbers. Dice are suitable as gambling devices for games like craps and are used in non-gambling tabletop games. A traditional die is a cube, with each of its six faces showing a different number of dots from 1 to 6. When thrown or rolled, the die comes to rest showing on its upper surface a random integer from one to six. A variety of other devices are also described as dice; such specialized dice may have polyhedral or irregular shapes, and may produce results other than one through six. Loaded and crooked dice are designed to favor some results over others, for purposes of cheating or amusement. A dice tray, a tray used to contain thrown dice, is sometimes used for gambling or board games. Dice have been used since before recorded history, and it is uncertain where they originated. The oldest known dice were excavated as part of a backgammon-like game set at the Burnt City, an archeological site in south-eastern Iran, estimated to be from between 2800 and 2500 BCE. Other excavations from ancient tombs in the Indus Valley civilization suggest a South Asian origin. The Egyptian game of Senet was played with dice; it was played from before 3000 BC up to the 2nd century AD, and was likely a racing game, though there is no scholarly consensus on its rules. Dicing is mentioned as an Indian game in the Rigveda and Atharvaveda, and there are several biblical references to casting lots, as in Psalm 22, indicating that dicing was commonplace when the psalm was composed. In Rome, knucklebones was a game of skill played by women and children; although gambling was illegal, many Romans were passionate gamblers who enjoyed dicing, which was known as aleam ludere. Dicing was even a popular pastime of emperors; letters by Augustus to Tiberius and his daughter recount his hobby of dicing. There were two sizes of Roman dice: tali were large dice inscribed with one, three, four, and six on four sides.
Tesserae were smaller dice with sides numbered one to six. Twenty-sided dice date back to the 2nd century AD, and examples from Ptolemaic Egypt date as early as the 2nd century BC. Dominoes and playing cards originated in China as developments from dice; the transition from dice to playing cards occurred there around the Tang dynasty. In Japan, dice were used to play a popular game called sugoroku.
2.
Statistics
–
Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data. In applying statistics to, e.g., a scientific, industrial, or social problem, it is conventional to begin with a population or process to be studied. Populations can be diverse topics, such as all people living in a country or every atom composing a crystal. Statistics deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments. The statistician Sir Arthur Lyon Bowley defined statistics as "numerical statements of facts in any department of inquiry placed in relation to each other". When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples; representative sampling assures that inferences and conclusions can safely extend from the sample to the population as a whole. In contrast, an observational study does not involve experimental manipulation. Inferences in mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena. A standard statistical procedure involves testing the relationship between two data sets, or between a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between the two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (rejecting a true null hypothesis, a "false positive") and Type II errors (failing to reject a false null hypothesis, a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis; measurement processes that generate statistical data are also subject to error.
Many of these errors are classified as random or systematic, and the presence of missing data or censoring may result in biased estimates; specific techniques have been developed to address these problems. Statistics continues to be an area of active research, for example on the problem of how to analyze big data. Statistics is a body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data. Some consider statistics to be a distinct mathematical science rather than a branch of mathematics. While many scientific investigations make use of data, statistics is concerned with the use of data in the context of uncertainty; mathematical techniques used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure-theoretic probability theory. In applying statistics to a problem, it is common practice to start with a population or process to be studied. Populations can be diverse topics, such as all people living in a country or every atom composing a crystal. Ideally, statisticians compile data about the entire population; this may be organized by governmental statistical institutes.
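The Type I error rate of a hypothesis test, mentioned above, can often be computed exactly rather than only quoted. A minimal sketch (the particular test and rejection threshold here are illustrative choices, not from the text): test whether a coin is fair by flipping it 100 times and rejecting the null hypothesis when the head count strays from 50 by 10 or more.

```python
from math import comb

def type_I_error_rate(n: int, threshold: int) -> float:
    """Exact probability, under a fair coin (the null hypothesis),
    that the head count deviates from n/2 by at least `threshold`,
    i.e. the chance the test wrongly rejects a true null."""
    total = 2 ** n
    hits = sum(comb(n, k) for k in range(n + 1)
               if abs(k - n // 2) >= threshold)
    return hits / total

# Rejecting when heads differ from 50 by 10 or more in 100 flips
# gives a Type I error rate close to the conventional 5% level.
rate = type_I_error_rate(100, 10)
print(f"{rate:.4f}")
```

Tightening the threshold lowers the Type I error rate but raises the Type II error rate against any fixed alternative; that trade-off is the core of the framework described above.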
3.
World War II
–
World War II, also known as the Second World War, was a global war that lasted from 1939 to 1945, although related conflicts began earlier. It involved the vast majority of the world's countries, including all of the great powers, eventually forming two opposing military alliances, the Allies and the Axis. It was the most widespread war in history, and directly involved more than 100 million people from over 30 countries. It was marked by mass deaths of civilians, including the Holocaust and the strategic bombing of industrial and population centres, which made World War II the deadliest conflict in human history. From late 1939 to early 1941, in a series of campaigns and treaties, Germany conquered or controlled much of continental Europe, and formed the Axis alliance with Italy and Japan. Under the Molotov–Ribbentrop Pact of August 1939, Germany and the Soviet Union partitioned and annexed territories of their European neighbours: Poland, Finland, Romania and the Baltic states. In December 1941, Japan attacked the United States and European colonies in the Pacific Ocean, and quickly conquered much of the Western Pacific. The Axis advance halted in 1942 when Japan lost the critical Battle of Midway, near Hawaii. In 1944, the Western Allies invaded German-occupied France, while the Soviet Union regained all of its territorial losses and invaded Germany and its allies. During 1944 and 1945 the Japanese suffered major reverses in mainland Asia in South Central China and Burma, while the Allies crippled the Japanese Navy. Thus ended the war in Asia, cementing the total victory of the Allies. World War II altered the political alignment and social structure of the world; the United Nations was established to foster international co-operation and prevent future conflicts.
The victorious great powers (the United States, the Soviet Union, China, the United Kingdom, and France) became the permanent members of the United Nations Security Council. The Soviet Union and the United States emerged as rival superpowers, setting the stage for the Cold War, which lasted for the next 46 years. Meanwhile, the influence of European great powers waned, and the decolonisation of Asia and Africa began. Most countries whose industries had been damaged moved towards economic recovery, and political integration, especially in Europe, emerged as an effort to end pre-war enmities. The start of the war in Europe is generally held to be 1 September 1939, beginning with the German invasion of Poland; Britain and France declared war on Germany two days later. The dates for the beginning of the war in the Pacific include the start of the Second Sino-Japanese War on 7 July 1937, or even the Japanese invasion of Manchuria on 18 September 1931. Others follow the British historian A. J. P. Taylor, who held that the Sino-Japanese War and the war in Europe and its colonies occurred simultaneously; this article uses the conventional dating. Other starting dates sometimes used for World War II include the Italian invasion of Abyssinia on 3 October 1935. The British historian Antony Beevor views the beginning of World War II as the Battles of Khalkhin Gol, fought between Japan and the forces of Mongolia and the Soviet Union from May to September 1939. The exact date of the war's end is also not universally agreed upon. It was generally accepted at the time that the war ended with the armistice of 14 August 1945, rather than the formal surrender of Japan on 2 September 1945.
4.
German tank problem
–
In the statistical theory of estimation, the German tank problem involves estimating the maximum of a discrete uniform distribution from sampling without replacement. It is named after its application in World War II to the estimation of the number of German tanks; the analysis shows the approach that was used and illustrates the difference between frequentist inference and Bayesian inference. Suppose k = 4 tanks with serial numbers 19, 40, 42 and 60 are captured; the maximal observed serial number is m = 60. The unknown total number of tanks is called N. The frequentist approach estimates N ≈ 74, while the Bayesian posterior for N has mean 88.5 with standard deviation 50.22; the posterior distribution has positive skewness, related to the fact that there are at least 60 tanks. In many cases, statistical analysis substantially improved on conventional intelligence; in some cases, conventional intelligence was used in conjunction with statistical methods, as was the case in the estimation of Panther tank production just prior to D-Day. The US Army was confident that the Sherman tank would continue to perform well, as it had versus the Panzer III and Panzer IV tanks in North Africa, but shortly before D-Day, rumors indicated that large numbers of Panzer V (Panther) tanks were being used. To determine whether this was true, the Allies attempted to estimate the number of tanks being produced; to do this, they used the serial numbers on captured or destroyed tanks. The principal numbers used were gearbox numbers, as these fell in two unbroken sequences. Chassis and engine numbers were also used, though their use was more complicated. Various other components were used to cross-check the analysis. Similar analyses were done on wheels, which were observed to be sequentially numbered. The analysis of tank wheels yielded an estimate for the number of wheel molds that were in use.
A discussion with British road wheel makers then estimated the number of wheels that could be produced from this many molds; analysis of wheels from two tanks yielded an estimate of 270 tanks produced in February 1944, substantially more than had previously been suspected. German records after the war showed production for the month of February 1944 was 276. Estimating production was not the only use of this serial-number analysis. According to conventional Allied intelligence estimates, the Germans were producing around 1,400 tanks a month between June 1940 and September 1942; applying the formula below to the serial numbers of captured tanks, the number was calculated to be 256 a month. After the war, captured German production figures from the ministry of Albert Speer showed the number to be 255. Estimates for some specific months are also available. Similar serial-number analysis was used for other military equipment during World War II, and factory markings on Soviet military equipment were analyzed during the Korean War. In the 1980s, some Americans were given access to the production line of Israel's Merkava tanks. The production numbers were classified, but the tanks had serial numbers. The formula has also been used in non-military contexts, for example to estimate the number of Commodore 64 computers built, where the result matches the low-end estimates.
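The frequentist minimum-variance unbiased estimator for this problem, N ≈ m + m/k − 1, is easy to check numerically. A minimal sketch, using the four captured serial numbers from the example above:

```python
def tank_estimate(serials):
    """Frequentist (minimum-variance unbiased) estimate of the total
    number of tanks: m + m/k - 1, where m is the largest observed
    serial number and k is the number of tanks captured."""
    k = len(serials)
    m = max(serials)
    return m + m / k - 1

# The four captured tanks with serial numbers 19, 40, 42 and 60.
print(tank_estimate([19, 40, 42, 60]))  # → 74.0
```

Intuitively, the average gap between observed serial numbers (m/k − 1) estimates how far above m the true maximum N is likely to lie.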
5.
Mark and recapture
–
Mark and recapture is a method commonly used in ecology to estimate an animal population's size. A portion of the population is captured, marked, and released; later, another portion is captured and the number of marked individuals within the sample is counted. The method is most useful when it is not practical to count all the individuals in the population. Another major application for these methods is in epidemiology, where they are used to estimate the completeness of ascertainment of disease registers; typical applications include estimating the number of people needing particular services. Typically a researcher visits a study area and uses traps to capture a group of individuals alive. Each of these individuals is marked with a unique identifier. A mark and recapture method was first used for ecological study in 1896 by C. G. Johannes Petersen to estimate plaice (Pleuronectes platessa) populations. Sufficient time is allowed to pass for the marked individuals to redistribute themselves among the unmarked population. Next, the researcher returns and captures another sample of individuals. Some individuals in this second sample will have been marked during the first visit and are now known as recaptures. Other animals captured during the second visit will not have been captured during the first visit to the study area. These unmarked animals are given a tag or band during the second visit. Population size can be estimated from as few as two visits to the study area; commonly, more than two visits are made, particularly if estimates of survival or movement are desired. Regardless of the number of visits, the researcher simply records the date of each capture of each individual. The capture histories generated are analyzed mathematically to estimate population size, survival, or movement. In the epidemiological setting, different sources of patients take the place of the repeated field visits in ecology.
As an example, suppose a biologist wants to estimate the number of turtles, N, in a lake. She captures 10 turtles on her first visit to the lake and marks their backs with paint. A week later she returns to the lake and captures 15 turtles; five of these 15 turtles have paint on their backs, indicating that they are recaptured animals. The problem is to estimate N. The Lincoln–Petersen method can be used to estimate population size if only two visits are made to the study area. This method assumes that the study population is closed; in other words, the two visits to the area are close enough in time that no individuals die, are born, or move into or out of the study area between visits.
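A minimal sketch of the Lincoln–Petersen estimate for the turtle example above (10 marked on the first visit, 15 captured on the second, 5 of them recaptures):

```python
def lincoln_petersen(n1, n2, m2):
    """Lincoln-Petersen estimate of population size N, where
    n1 = animals marked on the first visit,
    n2 = animals captured on the second visit,
    m2 = marked animals among the second sample (recaptures)."""
    return n1 * n2 / m2

print(lincoln_petersen(10, 15, 5))  # → 30.0
```

The intuition is that the proportion marked in the second sample (5/15) should match the proportion marked in the whole population (10/N), which solves to N ≈ 30.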
6.
Kurtosis
–
In probability theory and statistics, kurtosis is a measure of the "tailedness" of the probability distribution of a real-valued random variable. Depending on the particular measure of kurtosis that is used, there are various interpretations of kurtosis. The standard measure of kurtosis, originating with Karl Pearson, is based on a scaled version of the fourth moment of the data or population. This number is related to the tails of the distribution, not its peak; hence, for this measure, higher kurtosis is the result of infrequent extreme deviations (or outliers), as opposed to frequent modestly sized deviations. The kurtosis of any normal distribution is 3, and it is common to compare the kurtosis of a distribution to this value. Distributions with kurtosis less than 3 are said to be platykurtic, although this does not imply the distribution is "flat-topped" as sometimes reported. Rather, it means the distribution produces fewer and less extreme outliers than does the normal distribution; an example of a platykurtic distribution is the uniform distribution, which does not produce outliers. Distributions with kurtosis greater than 3 are said to be leptokurtic. It is also common practice to use an adjusted version of Pearson's kurtosis, the excess kurtosis, which is the kurtosis minus 3, to provide the comparison to the normal distribution. Some authors use "kurtosis" by itself to refer to the excess kurtosis; for reasons of clarity and generality, however, this article follows the non-excess convention and explicitly indicates where excess kurtosis is meant. Alternative measures of kurtosis include the L-kurtosis, which is a scaled version of the fourth L-moment; these are analogous to the measures of skewness that are not based on ordinary moments. The kurtosis is the fourth standardized moment, defined as Kurt[X] = μ4/σ^4 = E[(X − μ)^4] / (E[(X − μ)^2])^2, where μ4 is the fourth central moment and σ is the standard deviation. Several letters are used in the literature to denote the kurtosis.
A very common choice is κ, which is fine as long as it is clear that it does not refer to a cumulant; other choices include γ2, chosen to be similar to the notation for skewness, although sometimes this is instead reserved for the excess kurtosis. The kurtosis is bounded below by the squared skewness plus 1: μ4/σ^4 ≥ (μ3/σ^3)^2 + 1, and the lower bound is realized by the Bernoulli distribution. There is no upper limit to the excess kurtosis of a general probability distribution. One reason why some authors favor the excess kurtosis is that cumulants are extensive; formulas related to the extensive property are more naturally expressed in terms of the excess kurtosis. Let X1, …, Xn be independent random variables for which the fourth moment exists, and let Y = X1 + ⋯ + Xn. Then the excess kurtosis of Y is Kurt[Y] − 3 = (1 / (∑i σi^2)^2) ∑i=1..n σi^4 · (Kurt[Xi] − 3), where σi is the standard deviation of Xi. In particular, if all of the Xi have the same variance, this simplifies to Kurt[Y] − 3 = (1/n^2) ∑i (Kurt[Xi] − 3). The reason not to subtract off 3 is that the bare fourth moment better generalizes to multivariate distributions, especially when independence is not assumed.
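As a small check on the definition above, a sketch computing the population kurtosis of a dataset as its fourth standardized moment. A symmetric two-point dataset such as {0, 1} (a Bernoulli(1/2) sample) attains the lower bound of 1, since its squared skewness is 0:

```python
def kurtosis(data):
    """Population kurtosis: the fourth central moment divided by the
    squared variance (i.e. the fourth standardized moment)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2

print(kurtosis([0, 1]))        # → 1.0, the lower bound
print(kurtosis([0, 0, 1, 1]))  # → 1.0, same distribution, same kurtosis
```

Data with the same spread but heavier tails (more mass far from the mean) raises m4 faster than m2^2, which is why this ratio measures tailedness.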
7.
Hypergeometric distribution
–
In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the probability of k successes in n draws, without replacement, from a finite population of size N containing exactly K successes. In contrast, the binomial distribution describes the probability of k successes in n draws with replacement. In statistics, the hypergeometric test uses the hypergeometric distribution to calculate the significance of having drawn a specific k successes from the aforementioned population. The test is used to identify which sub-populations are over- or under-represented in a sample, and it has a range of applications. For example, a marketing group could use the test to understand their customer base by testing a set of known customers for over-representation of various demographic subgroups. The following conditions characterize the distribution: the result of each draw can be classified into one of two mutually exclusive categories, and the probability of a success changes on each draw, as each draw decreases the population. The probability mass function is f(k; N, K, n) = C(K, k) · C(N − K, n − k) / C(N, n), where C(a, b) denotes the binomial coefficient; the pmf is positive when max(0, n + K − N) ≤ k ≤ min(n, K). The pmf satisfies the recurrence relation f(k + 1) = f(k) · (K − k)(n − k) / ((k + 1)(N − K − n + k + 1)), with f(0) = C(N − K, n) / C(N, n). As one would expect, the probabilities sum up to 1: ∑k f(k; N, K, n) = 1; this is essentially Vandermonde's identity from combinatorics. Also note that the identity f(k; N, K, n) = f(k; N, n, K) holds; this follows from the symmetry of the problem, but it can also be shown by expressing the binomial coefficients in terms of factorials and rearranging the latter. The classical application of the hypergeometric distribution is sampling without replacement. Think of an urn with two types of marbles, red ones and green ones; define drawing a green marble as a success and drawing a red marble as a failure. Let N describe the number of all marbles in the urn and K the number of green marbles; then N − K corresponds to the number of red marbles. In this example, X is the random variable whose outcome is k, the number of green marbles actually drawn in the experiment. Now suppose the urn contains N = 50 marbles, of which K = 5 are green and 45 are red. Standing next to the urn, you close your eyes and draw n = 10 marbles without replacement. What is the probability that exactly 4 of the 10 are green?
The probability of drawing exactly k green marbles is P(X = k) = f(k; 50, 5, 10). Hence, for exactly 4 green marbles, P(X = 4) = f(4; 50, 5, 10) = 5 · 8145060 / 10272278170 ≈ 0.003964583. Intuitively we would expect it to be even more unlikely for all 5 green marbles to be among the 10 drawn: P(X = 5) = f(5; 50, 5, 10) = 1 · 1221759 / 10272278170 ≈ 0.0001189375, as expected. In Texas Hold'em poker, players make the best hand they can by combining the two cards in their hand with the 5 cards eventually turned up on the table. The deck has 52 cards, 13 of each suit. For this example, assume a player has 2 clubs in the hand and there are 3 cards showing on the table, 2 of which are also clubs.
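A minimal sketch of the hypergeometric pmf, checked against the marble example (an urn of N = 50 marbles, K = 5 green, n = 10 drawn without replacement):

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): probability of k successes in n draws without
    replacement from a population of N containing K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

p4 = hypergeom_pmf(4, 50, 5, 10)   # exactly 4 of the 10 are green
p5 = hypergeom_pmf(5, 50, 5, 10)   # all 5 green marbles are drawn
print(f"P(X=4) = {p4:.9f}, P(X=5) = {p5:.10f}")
```

The same function answers the poker question in the text: with 11 clubs unseen among 47 unseen cards and 2 community cards to come, the chance of a club on each is hypergeometric in exactly this sense.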
8.
Probability theory
–
Probability theory is the branch of mathematics concerned with probability, the analysis of random phenomena. Although it is not possible to predict precisely the results of individual random events, large collections of such events exhibit regular patterns; two representative mathematical results describing such patterns are the law of large numbers and the central limit theorem. As a mathematical foundation for statistics, probability theory is essential to many human activities that involve quantitative analysis of large sets of data. Methods of probability theory also apply to descriptions of complex systems given only partial knowledge of their state, as in statistical mechanics; a great discovery of twentieth-century physics was the probabilistic nature of physical phenomena at atomic scales, described in quantum mechanics. Christiaan Huygens published a book on the subject in 1657, and in the 19th century Pierre-Simon Laplace completed what is today considered the classic interpretation. Initially, probability theory mainly considered discrete events, and its methods were mainly combinatorial. Eventually, analytical considerations compelled the incorporation of continuous variables into the theory. This culminated in modern probability theory, on foundations laid by Andrey Nikolaevich Kolmogorov, who combined the notion of sample space, introduced by Richard von Mises, with measure theory and presented his axiom system for probability theory in 1933. This became the mostly undisputed axiomatic basis for modern probability theory. Most introductions to probability theory treat discrete probability distributions and continuous probability distributions separately; the more mathematically advanced, measure theory-based treatment of probability covers the discrete, the continuous, any mix of these two, and more. Consider an experiment that can produce a number of outcomes. The set of all outcomes is called the sample space of the experiment, and the power set of the sample space is formed by considering all different collections of possible results. For example, rolling an honest die produces one of six possible results. One collection of possible results corresponds to getting an odd number; thus, the subset {1, 3, 5} is an element of the power set of the sample space of die rolls.
In this case, {1, 3, 5} is the event that the die falls on some odd number. If the results that actually occur fall in a given event, that event is said to have occurred. Probability is a way of assigning every event a value between zero and one, with the requirement that the event made up of all possible results be assigned a value of one. For example, the probability that any one of the events {1, 6}, {3}, or {2, 4} will occur is 5/6. This is the same as saying that the probability of the event {1, 2, 3, 4, 6} is 5/6; this event encompasses the possibility of any number except five being rolled. The mutually exclusive event {5} has a probability of 1/6, and the event {1, 2, 3, 4, 5, 6} has a probability of 1, that is, absolute certainty. Discrete probability theory deals with events that occur in countable sample spaces. Modern definition: the modern definition starts with a finite or countable set called the sample space, which relates to the set of all possible outcomes in the classical sense, denoted by Ω.
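The die example above can be sketched by enumerating the sample space directly; for a fair die every outcome is equally likely, so the probability of an event is just its size divided by six:

```python
from fractions import Fraction

# Sample space of a fair six-sided die; each outcome equally likely.
OMEGA = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, given as a subset of the sample space."""
    return Fraction(len(event & OMEGA), len(OMEGA))

print(prob({1, 2, 3, 4, 6}))     # → 5/6, any number except five
print(prob({5}))                 # → 1/6, the complementary event
print(prob({1, 2, 3, 4, 5, 6}))  # → 1, absolute certainty
```

Using exact fractions keeps the additivity of disjoint events visible: prob({1, 6}) + prob({3}) + prob({2, 4}) equals prob({1, 2, 3, 4, 6}) exactly.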
9.
Sample maximum and minimum
–
In statistics, the sample maximum and sample minimum, also called the largest observation and smallest observation, are the values of the greatest and least elements of a sample. They are basic summary statistics, used in descriptive statistics such as the five-number summary and seven-number summary. The minimum and the maximum value are the first and last order statistics. If there are outliers, they necessarily include the sample maximum or sample minimum, or both, depending on whether they are extremely high or low; however, the maximum and minimum need not be outliers. The sample maximum and minimum are the least robust statistics: they are maximally sensitive to outliers. They also realize the maximum absolute deviation: they are the furthest points from any given point. For a sample set, the maximum function is non-smooth and thus non-differentiable; for optimization problems that occur in statistics it often needs to be approximated by a smooth function that is close to the maximum of the set. A smooth maximum, for example g(x1, …, xn) = log(exp(x1) + ⋯ + exp(xn)), is a good approximation of the sample maximum. The sample extrema also provide a non-parametric prediction interval: denoting the sample maximum and minimum by M and m, the interval [m, M] is an (n − 1)/(n + 1) prediction interval for the next observation. For example, if n = 19, then [m, M] gives an 18/20 = 90% prediction interval: 90% of the time, the 20th observation falls between the smallest and largest observations seen heretofore. Likewise, n = 39 gives a 95% prediction interval, and n = 199 gives a 99% prediction interval. Due to their sensitivity to outliers, the sample extrema cannot reliably be used as estimators unless data is clean; robust alternatives include the first and last deciles. They are inefficient estimators of location for mesokurtic distributions, such as the normal distribution. If both endpoints are unknown, then the sample range is a biased estimator for the population range, but correcting as for the maximum above yields the UMVU estimator. If both endpoints are unknown, then the mid-range is an estimator of the midpoint of the interval.
The sample extrema also give a simple test of normality: if the sample extrema lie many sigmas from the mean, say 6, this is strong evidence against normality. Further, this test is very easy to communicate without involved statistics. These tests of normality can be applied if one faces kurtosis risk; this is elaborated in black swan theory.
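The prediction-interval claim above can be checked by Monte Carlo simulation (a minimal sketch, assuming i.i.d. draws; uniform draws via Python's random module are used here for convenience, but the result is distribution-free): with n = 19 prior observations, the next observation should fall inside [m, M] about 90% of the time.

```python
import random

def coverage(n: int, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of trials in which observation n+1 falls between the
    sample minimum and maximum of the first n observations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        m, M = min(sample), max(sample)
        if m <= rng.random() <= M:  # does the next draw land in [m, M]?
            hits += 1
    return hits / trials

# Theory: (n - 1)/(n + 1) = 18/20 = 90% for n = 19.
print(coverage(19))
```

The result does not depend on the underlying continuous distribution, because only the rank of the new observation among the n + 1 draws matters.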
10.
Entropy (information theory)
–
In information theory, systems are modeled by a transmitter, channel, and receiver. The transmitter produces messages that are sent through the channel, and the channel modifies the message in some way. The receiver attempts to infer which message was sent. In this context, entropy is the expected value (average) of the information contained in each message. "Messages" can be modeled by any flow of information. In a more technical sense, there are reasons to define information as the negative of the logarithm of the probability distribution of the possible events or messages. The amount of information of every event forms a random variable whose expected value is the entropy. Units of entropy are the shannon, nat, or hartley, depending on the base of the logarithm used to define it, though the shannon is commonly referred to as a bit. The logarithm of the probability distribution is useful as a measure of entropy because it is additive for independent sources; for instance, the entropy of a fair coin toss is 1 shannon, whereas the entropy of m tosses is m shannons. Generally, you need log2(n) bits to represent a variable that can take one of n values, if n is a power of 2. If these values are equally probable, the entropy (in shannons) is equal to the number of bits; equality between number of bits and shannons holds only while all outcomes are equally probable. If one of the events is more probable than the others, observation of that event is less informative. Conversely, rarer events provide more information when observed. Since observation of less probable events occurs more rarely, the net effect is that the entropy received from non-uniformly distributed data is less than log2(n). Entropy is zero when one outcome is certain. Shannon entropy quantifies all these considerations exactly when a probability distribution of the source is known. The meaning of the events observed does not matter in the definition of entropy; entropy takes into account only the probability of observing a specific event. Generally, entropy refers to disorder or uncertainty. Shannon entropy was introduced by Claude E.
Shannon in his 1948 paper "A Mathematical Theory of Communication". Shannon entropy provides an absolute limit on the best possible average length of lossless encoding or compression of an information source. Entropy is a measure of unpredictability of the state, or equivalently, of its average information content. To get an intuitive understanding of these terms, consider the example of a political poll. Usually, such polls happen because the outcome of the poll is not already known; in other words, the outcome is relatively unpredictable, and performing the poll yields new information. Now, consider the case that the same poll is performed a second time shortly after the first poll; since the outcome of the first poll is already known, the outcome of the second poll can be predicted well and its results carry little new information. Now consider the example of a coin toss. Assuming the probability of heads is the same as the probability of tails, the entropy of the coin toss is as high as it could be. Such a coin toss has one shannon of entropy, since there are two possible outcomes that occur with equal probability, and learning the actual outcome contains one shannon of information. Contrarily, a toss of a coin that has two heads and no tails has zero entropy, since the coin will always come up heads.
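The coin examples above can be checked with a direct implementation of the Shannon entropy H = −∑ p·log2(p), a minimal sketch in which the convention 0·log 0 = 0 handles impossible outcomes:

```python
from math import log2

def shannon_entropy(probs):
    """Shannon entropy in shannons (bits): -sum(p * log2(p)),
    with the convention that terms with p == 0 contribute nothing."""
    return sum(-p * log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))  # fair coin → 1.0 shannon
print(shannon_entropy([1.0, 0.0]))  # two-headed coin → 0.0 (certain)
print(shannon_entropy([0.7, 0.3]))  # biased coin → less than 1 shannon
```

The biased coin illustrates the point made above: a non-uniform distribution over two outcomes carries less than log2(2) = 1 shannon per toss.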