1.
Sorting algorithm
–
A sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. More formally, the output must satisfy two conditions: the output is in nondecreasing order, and the output is a permutation of the input. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple statement. For example, bubble sort was analyzed as early as 1956. Comparison sorting algorithms have a fundamental requirement of Ω(n log n) comparisons; algorithms not based on comparisons, such as counting sort, can have better performance. Sorting algorithms are classified by: computational complexity in terms of the size of the list (for typical serial sorting algorithms, good behavior is O(n log n) and bad behavior is O(n²), with parallel sort in O(log² n); ideal behavior for a serial sort is O(n), but this is not possible in the average case; optimal parallel sorting is O(log n); comparison-based sorting algorithms need at least Ω(n log n) comparisons for most inputs); memory usage (strictly, an in-place sort needs only O(1) memory beyond the items being sorted); recursion (some algorithms are either recursive or non-recursive, while others may be both); stability (stable sorting algorithms maintain the relative order of records with equal keys); whether or not they are a comparison sort (a comparison sort examines the data only by comparing two elements with a comparison operator); general method (insertion, exchange, selection, merging, etc.; exchange sorts include bubble sort and quicksort, and selection sorts include shaker sort and heapsort); whether the algorithm is serial or parallel (the remainder of this discussion concentrates almost exclusively on serial algorithms); and adaptability (whether or not the presortedness of the input affects the running time; algorithms that take this into account are known to be adaptive). 
When sorting some kinds of data, only part of the data is examined when determining the sort order. For example, in the card sorting example, the cards are sorted by their rank, and their suit is ignored. This allows the possibility of multiple different correctly sorted versions of the original list. More formally, the data being sorted can be represented as a record or tuple of values, and the part of the data that is used for sorting is called the key. In the card example, cards are represented as a record (rank, suit), and the key is the rank. A sorting algorithm is stable if, whenever there are two records R and S with the same key and R appears before S in the original list, R also appears before S in the sorted list.
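The role of the key and the stability property can be seen in a short sketch. Python's built-in sorted() happens to be stable, which makes the behavior easy to observe; the deck fragment below is a hypothetical example used only for illustration.

```python
# (rank, suit) records sorted by rank only; the suit is ignored by the key.
cards = [(5, "spades"), (3, "hearts"), (5, "hearts"), (3, "clubs")]

# A stable sort keeps records with equal keys in their input order.
by_rank = sorted(cards, key=lambda card: card[0])

print(by_rank)
# (5, "spades") still precedes (5, "hearts"), exactly as in the input.
```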
2.
Array data structure
–
In computer science, an array data structure, or simply an array, is a data structure consisting of a collection of elements, each identified by at least one array index or key. An array is stored so that the position of each element can be computed from its index tuple by a mathematical formula. The simplest type of data structure is a linear array, also called a one-dimensional array. For example, an array of 10 32-bit integer variables, with indices 0 through 9, may be stored as 10 words at memory addresses 2000, 2004, 2008, ..., 2036, so that the element with index i has the address 2000 + 4 × i. The memory address of the first element of an array is called the first address or foundation address. Because the mathematical concept of a matrix can be represented as a two-dimensional grid, two-dimensional arrays are also sometimes called matrices. In some cases the term vector is used in computing to refer to an array. Arrays are often used to implement tables, especially lookup tables; the word table is sometimes used as a synonym of array. Arrays are among the oldest and most important data structures, and are used by almost every program. They are also used to implement many other data structures, such as lists and strings. They effectively exploit the addressing logic of computers: in most modern computers and many external storage devices, the memory is a one-dimensional array of words, whose indices are their addresses. Processors, especially vector processors, are often optimized for array operations. Arrays are useful mostly because the element indices can be computed at run time; among other things, this feature allows a single iterative statement to process arbitrarily many elements of an array. For that reason, the elements of an array data structure are required to have the same size. The set of valid index tuples and the addresses of the elements are usually fixed while the array is in use. Array types are often implemented by array structures; however, in some languages they may be implemented by hash tables, linked lists, search trees, or other data structures. 
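The address formula from the example above can be sketched directly. The base address 2000 and the 4-byte element size are taken from that example, not from any particular machine.

```python
# Element address computation for a one-dimensional array of 32-bit integers,
# using the example's base address 2000 and 4 bytes per element.
BASE_ADDRESS = 2000
ELEMENT_SIZE = 4  # bytes in one 32-bit integer

def element_address(i: int) -> int:
    """Address of the element with index i: base + size * i."""
    return BASE_ADDRESS + ELEMENT_SIZE * i

print(element_address(0))  # 2000
print(element_address(9))  # 2036
```

The same formula generalizes to multi-dimensional arrays by multiplying each index by the stride of its dimension and summing the products.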
The first digital computers used machine-language programming to set up and access array structures for data tables, and for vector and matrix computations. John von Neumann wrote the first array-sorting program (merge sort) in 1945, during the building of the first stored-program computer. Array indexing was originally done by self-modifying code, and later using index registers. Some mainframes designed in the 1960s, such as the Burroughs B5000 and its successors, used memory segmentation to perform index-bounds checking in hardware. Assembly languages generally have no special support for arrays, other than what the machine itself provides. The earliest high-level programming languages, including FORTRAN, Lisp, COBOL, and ALGOL 60, had support for multi-dimensional arrays. In C++, class templates exist for multi-dimensional arrays whose dimension is fixed at runtime as well as for runtime-flexible arrays. Arrays are used to implement mathematical vectors and matrices, as well as other kinds of rectangular tables. Many databases, small and large, consist of one-dimensional arrays whose elements are records. Arrays are used to implement other data structures, such as lists, heaps, hash tables, deques, queues, stacks, and strings. One or more large arrays are sometimes used to emulate in-program dynamic memory allocation, particularly memory pool allocation. Historically, this has sometimes been the only way to allocate dynamic memory portably
3.
Tony Hoare
–
Sir Charles Antony Richard Hoare FRS FREng, commonly known as Tony Hoare or C. A. R. Hoare, is a British computer scientist. He developed the sorting algorithm quicksort in 1959/1960. Born in Colombo, Ceylon, to British parents, Hoare was the son of a colonial civil servant; his mother was the daughter of a tea planter. Hoare was educated in England at the Dragon School in Oxford, and he then studied Classics and Philosophy at Merton College, Oxford. On graduating in 1956 he did 18 months of National Service in the Royal Navy, and he then went to Moscow State University as a British Council exchange student, where he studied machine translation under Andrey Kolmogorov. In 1960, Hoare left the Soviet Union and began working at Elliott Brothers, Ltd, a computer manufacturing firm. He is now an Emeritus Professor at Oxford, and is also a researcher at Microsoft Research in Cambridge. In 1982, Hoare was elected a Fellow of the Royal Society, and in 2005 he was elected a Fellow of the Royal Academy of Engineering. Speaking at a conference in 2009, he apologised for inventing the null reference, calling it his billion-dollar mistake: "It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive system for references in an object oriented language. My goal was to ensure that all use of references should be absolutely safe, but I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to errors, vulnerabilities, and system crashes." For many years under his leadership his Oxford department worked on formal specification languages such as CSP. But programs have now got very large and very critical – well beyond the scale which can be comfortably tackled by formal methods. There have been many problems and failures, but these have always been attributable to inadequate analysis of requirements or inadequate management control. 
It has turned out that the world just does not suffer significantly from the kind of problem that our research was intended to solve. Hoare received the ACM Turing Award for fundamental contributions to the definition and design of programming languages; the award was presented to him at the ACM Annual Conference in Nashville, Tennessee, on 27 October 1980, by Walter Carlson, chairman of the Awards committee. A transcript of Hoare's speech was published in Communications of the ACM
4.
Merge sort
–
In computer science, merge sort is an efficient, general-purpose, comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the implementation preserves the input order of equal elements in the sorted output. Mergesort is a divide and conquer algorithm that was invented by John von Neumann in 1945; a detailed description and analysis of bottom-up mergesort appeared in a report by Goldstine and von Neumann as early as 1948. Conceptually, a merge sort works as follows: divide the unsorted list into n sublists, each containing one element (a list of one element is considered sorted), then repeatedly merge sublists to produce new sorted sublists until there is only 1 sublist remaining. This will be the sorted list. In a C-like implementation using indices, top down merge sort recursively splits the list into sublists until sublist size is 1, then merges those sublists to produce a sorted list; the copy back step can be avoided by alternating the direction of the merge with each level of recursion. Pseudocode for a top down merge sort operating on lists:

    function merge_sort(m)
        // Base case. A list of zero or one elements is sorted, by definition.
        if length of m ≤ 1 then
            return m
        // Recursive case. First, divide the list into equal-sized sublists
        // consisting of the first half and second half of the list.
        // This assumes lists start at index 0.
        var left := empty list
        var right := empty list
        for each x with index i in m do
            if i < (length of m)/2 then
                add x to left
            else
                add x to right
        // Recursively sort both sublists.
        left := merge_sort(left)
        right := merge_sort(right)
        // Then merge the now-sorted sublists.
        return merge(left, right)

In this example, the merge function merges the left and right sublists. When merge sort operates on linked lists, a node is a reference or pointer to a node; the merge function would be similar to the one used above: it merges two already sorted lists, and handles empty lists. In this case, merge would use nodes for its input parameters. Both monotonic and bitonic runs may be exploited, with lists being convenient data structures. 
In the bottom up merge sort, the starting point assumes each run is one item long. In practice, random input data will have many short runs that just happen to be sorted, so in the typical case the merge sort may not need as many passes because there are fewer runs to merge. In the best case, the input is already sorted, so the natural merge sort need only make one pass through the data. In many practical cases, long runs are present. In sorting n objects, merge sort has an average and worst-case performance of O(n log n). If the running time of merge sort for a list of length n is T(n), then the recurrence T(n) = 2T(n/2) + n follows from the definition of the algorithm (apply the algorithm to two lists of half the size of the original list, and add the n steps taken to merge the resulting two lists). The closed form follows from the master theorem. In the worst case, the number of comparisons merge sort makes is equal to or slightly smaller than n⌈lg n⌉ − 2^⌈lg n⌉ + 1, which is between n lg n − n + 1 and n lg n + n + O(lg n). For large n and a randomly ordered input list, merge sort's expected number of comparisons approaches α·n fewer than the worst case, where α = −1 + ∑_{k=0}^{∞} 1/(2^k + 1) ≈ 0.2645
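The top down procedure translates naturally into a short runnable sketch. This is an illustrative Python transcription of the list-based algorithm, not the C-like index-based version:

```python
# Top down merge sort: split in half, sort each half recursively, then merge.
def merge_sort(m):
    if len(m) <= 1:              # a list of zero or one elements is sorted
        return m
    mid = len(m) // 2
    left = merge_sort(m[:mid])   # recursively sort both sublists
    right = merge_sort(m[mid:])
    return merge(left, right)    # then merge the now-sorted sublists

def merge(left, right):
    result = []
    i = j = 0
    # Take the smaller head element first; <= keeps equal elements in
    # their input order, which is what makes the sort stable.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])      # at most one of these two is non-empty
    result.extend(right[j:])
    return result

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```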
5.
Heapsort
–
In computer science, heapsort is a comparison-based sorting algorithm. It can be thought of as an improved selection sort: the improvement consists of the use of a heap data structure rather than a linear-time search to find the maximum. Although somewhat slower in practice on most machines than a well-implemented quicksort, heapsort is an in-place algorithm, but it is not a stable sort. Heapsort was invented by J. W. J. Williams in 1964, and this was also the birth of the heap, presented already by Williams as a useful data structure in its own right. In the same year, R. W. Floyd published a version that could sort an array in-place. The heapsort algorithm can be divided into two parts. In the first step, a heap is built out of the data. The heap is often placed in an array with the layout of a complete binary tree: for a zero-based array, the root node is stored at index 0, and if i is the index of the current node, then

    iLeftChild  = 2*i + 1
    iRightChild = 2*i + 2

In the second step, a sorted array is created by repeatedly removing the largest element from the heap. The heap is updated after each removal to maintain the heap property; once all objects have been removed from the heap, the result is a sorted array. Heapsort can be performed in place: the array can be split into two parts, the sorted array and the heap. The heap's invariant is preserved after each extraction, so the only cost is that of extraction. The heapsort algorithm involves preparing the list by first turning it into a max heap, then repeatedly moving the maximum to the end of the considered range; this repeats until the range of considered values is one value in length. The steps are:

1. Call the buildMaxHeap() function on the list. Also referred to as heapify(), this builds a heap from a list in O(n) operations.
2. Swap the first element of the list with the final element. Decrease the considered range of the list by one.
3. Call the siftDown() function on the list to sift the new first element to its appropriate index in the heap.
4. Go to step 2 unless the considered range of the list is one element.

The buildMaxHeap() operation is run once, and is O(n) in performance. The siftDown() function is O(log n), and is called n times. Therefore, the performance of this algorithm is O(n + n log n) = O(n log n)
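The two phases described above (buildMaxHeap followed by repeated extraction with siftDown) can be sketched as a short in-place routine, using the zero-based child indices iLeftChild = 2*i + 1 and iRightChild = 2*i + 2:

```python
def sift_down(a, start, end):
    """Repair the max-heap rooted at `start`, within a[start..end] inclusive."""
    root = start
    while 2 * root + 1 <= end:
        child = 2 * root + 1                       # left child
        if child + 1 <= end and a[child] < a[child + 1]:
            child += 1                             # right child is larger
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]  # sift the root down
            root = child
        else:
            return                                 # heap property restored

def heapsort(a):
    n = len(a)
    # buildMaxHeap: sift down every non-leaf node, O(n) overall.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(a, start, n - 1)
    # Repeatedly swap the maximum to the end and shrink the heap by one.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end - 1)
    return a

print(heapsort([6, 5, 3, 1, 8, 7, 2, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Note how the array is split into a shrinking heap prefix and a growing sorted suffix, so no extra storage is needed.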
6.
Comparison sort
–
A comparison sort is a type of sorting algorithm that reads the list elements only through a single abstract comparison operation. The only requirement is that the operator obey two of the properties of a total order: if a ≤ b and b ≤ c then a ≤ c (transitivity), and for all a and b, either a ≤ b or b ≤ a (totality). It is possible that both a ≤ b and b ≤ a; in this case either may come first in the sorted list, and in a stable sort the input order determines the sorted order. A metaphor for thinking about comparison sorts is that someone has a set of unlabelled weights and a balance scale; their goal is to line up the weights in order by their weight without any information except that obtained by placing two weights on the scale and seeing which one is heavier. A comparison sort must have a lower bound of Ω(n log n) comparison operations. This is a consequence of the limited information available through comparisons alone. In this sense, mergesort, heapsort, and introsort are asymptotically optimal in terms of the number of comparisons they must perform; non-comparison sorts can achieve O(n) performance by using operations other than comparisons, allowing them to sidestep this lower bound. Note that comparison sorts may run faster on some lists; many adaptive sorts such as insertion sort run in O(n) time on an already-sorted or nearly-sorted list. The Ω(n log n) lower bound applies only to the case in which the input list can be in any possible order. Comparison sorts generally adapt more easily to complex orders such as the order of floating-point numbers. Additionally, once a comparison function is written, any comparison sort can be used without modification; non-comparison sorts typically require specialized versions for each datatype. This flexibility, together with the efficiency of the above comparison sorting algorithms on modern computers, has led to widespread preference for comparison sorts in most practical work. Some sorting problems admit a strictly faster solution than the Ω(n log n) bound for comparison sorting; an example is integer sorting, where, when the keys form a small range, counting sort is an example algorithm that runs in linear time. 
Other integer sorting algorithms are not asymptotically faster than comparison sorting. The problem of sorting pairs of numbers by their sum is not subject to the Ω(n² log n) bound either; the best known algorithm still takes O(n² log n) time, but only O(n²) comparisons. The number of comparisons that a comparison sort algorithm requires increases in proportion to n log(n), where n is the number of elements to sort. Given a list of n distinct numbers, there are n factorial permutations, exactly one of which is the list in sorted order. The sort algorithm must gain enough information from the comparisons to identify the correct permutation. If the algorithm always completes after at most f(n) steps, it cannot distinguish more than 2^f(n) cases, because the keys are distinct and each comparison has only two possible outcomes. Therefore, 2^f(n) ≥ n!, or equivalently f(n) ≥ log₂(n!). From Stirling's approximation we know that log₂(n!) = n log₂ n − n log₂ e + O(log₂ n) = Ω(n log n)
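The counting argument above can be checked numerically: an algorithm that always finishes within f comparisons distinguishes at most 2^f input orderings, so f must be at least ⌈log₂(n!)⌉. A small sketch:

```python
import math

def min_comparisons(n: int) -> int:
    """Lower bound on worst-case comparisons needed to sort n distinct keys:
    the decision tree must have at least n! leaves, so its height is at
    least ceil(log2(n!))."""
    return math.ceil(math.log2(math.factorial(n)))

for n in [4, 8, 16]:
    print(n, min_comparisons(n))
```

For n = 4 this gives 5, matching the known fact that 5 comparisons suffice and are necessary to sort 4 elements in the worst case.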
7.
Stable sort
–
A stable sorting algorithm maintains the relative order of records with equal keys; see the sorting algorithm entry above for the general definitions. When sorting some kinds of data, only part of the data is examined when determining the sort order: for example, cards may be sorted by their rank while their suit is ignored. This allows the possibility of multiple different correctly sorted versions of the original list. More formally, the data being sorted can be represented as a record or tuple of values, and the part of the data that is used for sorting is called the key; in the card example, cards are represented as a record (rank, suit), and the key is the rank. A sorting algorithm is stable if, whenever there are two records R and S with the same key and R appears before S in the original list, R also appears before S in the sorted list
8.
Main memory
–
Computer data storage, often called storage or memory, is a technology consisting of computer components and recording media used to retain digital data. It is a core function and fundamental component of computers. The central processing unit (CPU) of a computer is what manipulates data by performing computations. In practice, almost all computers use a storage hierarchy, which puts fast but expensive and small storage options close to the CPU and slower but larger and cheaper options farther away. In the Von Neumann architecture, the CPU consists of two main parts: the control unit and the arithmetic logic unit. The former controls the flow of data between the CPU and memory, while the latter performs arithmetic and logical operations on data. Without a significant amount of memory, a computer would merely be able to perform fixed operations and immediately output the result; it would have to be reconfigured to change its behavior. This is acceptable for devices such as desk calculators, digital signal processors, and other specialized devices. Von Neumann machines differ in having a memory in which they store their operating instructions; most modern computers are von Neumann machines. A modern digital computer represents data using the binary numeral system. Text, numbers, pictures, audio, and nearly any other form of information can be converted into a string of bits, or binary digits. The most common unit of storage is the byte, equal to 8 bits. A piece of information can be handled by any computer or device whose storage space is large enough to accommodate the binary representation of the piece of information, or simply data. For example, the complete works of Shakespeare, about 1250 pages in print, can be stored in about five megabytes with one byte per character. Data is encoded by assigning a bit pattern to each character or digit. By adding bits to each encoded unit, redundancy allows the computer to both detect errors in coded data and correct them based on mathematical algorithms. 
A random bit flip is typically corrected upon detection. The cyclic redundancy check (CRC) method is typically used in communications and storage for error detection; a detected error is then retried. Data compression methods make it possible in many cases to represent a string of bits by a shorter bit string and to reconstruct the original string when needed. This utilizes substantially less storage for many types of data at the cost of more computation; an analysis of the trade-off between storage cost savings and the costs of related computations and possible delays in data availability is done before deciding whether to keep certain data compressed or not. For security reasons, certain types of data may be encrypted in storage to prevent the possibility of unauthorized information reconstruction from chunks of storage snapshots. Generally, the lower a storage is in the hierarchy, the lesser its bandwidth. This traditional division of storage into primary, secondary, tertiary and off-line storage is also guided by cost per bit. In contemporary usage, memory is usually semiconductor storage read-write random-access memory, typically DRAM or other forms of fast but temporary storage
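Redundancy-based error detection can be illustrated with the simplest scheme, a single even-parity bit per encoded unit. This is a toy sketch, not the CRC method mentioned above:

```python
# Even parity: append one redundant bit so the total number of 1s is even.
def add_parity(bits):
    """Append a parity bit to an encoded unit (a list of 0/1 values)."""
    return bits + [sum(bits) % 2]

def check_parity(bits):
    """True if the unit is consistent; any single bit flip makes this False."""
    return sum(bits) % 2 == 0

unit = add_parity([1, 0, 1, 1])   # -> [1, 0, 1, 1, 1]
assert check_parity(unit)

unit[2] ^= 1                      # simulate a random bit flip in storage
print(check_parity(unit))         # False: the error is detected
```

A single parity bit only detects an odd number of flipped bits; correcting errors requires more redundancy, as in Hamming codes or the mathematical algorithms mentioned above.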
9.
Big O notation
–
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. It is a member of a family of notations invented by Paul Bachmann, Edmund Landau, and others, collectively called Bachmann–Landau notation. In computer science, big O notation is used to classify algorithms according to how their running time or space requirements grow as the input size grows. Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation. The letter O is used because the growth rate of a function is also referred to as the order of the function. A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function. Associated with big O notation are several related notations, using the symbols o, Ω, ω, and Θ, to describe other kinds of bounds on asymptotic growth rates. Big O notation is also used in many other fields to provide similar estimates. Let f and g be two functions defined on some subset of the real numbers. That is, f(x) = O(g(x)) as x → ∞ if and only if there exist a real number M and a real number x0 such that |f(x)| ≤ M|g(x)| for all x > x0. In many contexts, the assumption that we are interested in the growth rate as the variable x goes to infinity is left unstated. If f(x) is a product of several factors, any constants (factors that do not depend on x) can be omitted. For example, let f(x) = 6x^4 − 2x^3 + 5, and suppose we wish to simplify this function, using O notation, to describe its growth rate as x approaches infinity. This function is the sum of three terms: 6x^4, −2x^3, and 5. Of these three terms, the one with the highest growth rate is the one with the largest exponent as a function of x, namely 6x^4. Now one may apply the rule: 6x^4 is a product of 6 and x^4, in which the first factor does not depend on x. Omitting this factor results in the simplified form x^4. Thus, we say that f(x) is a "big O" of x^4; mathematically, we can write f(x) = O(x^4). One may confirm this calculation using the formal definition: let f(x) = 6x^4 − 2x^3 + 5 and g(x) = x^4. 
Applying the formal definition from above, the statement that f(x) = O(x^4) is equivalent to its expansion, |f(x)| ≤ M|x^4| for some choice of x0 and M and for all x > x0. To prove this, let x0 = 1 and M = 13: then, for all x > x0, |6x^4 − 2x^3 + 5| ≤ 6x^4 + 2x^3 + 5 ≤ 6x^4 + 2x^4 + 5x^4 = 13x^4. Big O notation has two main areas of application. In mathematics, it is used to describe how closely a finite series approximates a given function. In computer science, it is useful in the analysis of algorithms. In both applications, the function g appearing within the O is typically chosen to be as simple as possible, omitting constant factors and lower order terms
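The choice x0 = 1 and M = 13 can be spot-checked numerically; the sampled points below are arbitrary:

```python
# Numeric spot check of the bound |6x^4 - 2x^3 + 5| <= 13 * x^4 for x >= 1,
# i.e. the witness pair (x0, M) = (1, 13) from the proof above.
def f(x):
    return 6 * x**4 - 2 * x**3 + 5

for x in [1, 2, 10, 1000]:
    assert abs(f(x)) <= 13 * x**4

print("bound holds at all sampled points")
```

A finite check like this is of course not a proof, but it is a quick way to catch a mistaken choice of M or x0 before doing the algebra.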
10.
Soviet Union
–
The Soviet Union, officially the Union of Soviet Socialist Republics (USSR), was a socialist state in Eurasia that existed from 1922 to 1991. It was nominally a union of national republics, but its government and economy were highly centralized. The Soviet Union had its roots in the October Revolution of 1917, which established the Russian Socialist Federative Soviet Republic and started the Russian Civil War between the revolutionary Reds and the counter-revolutionary Whites. In 1922, the communists were victorious, forming the Soviet Union with the unification of the Russian, Transcaucasian, Ukrainian, and Byelorussian republics. Following Lenin's death in 1924, a collective leadership, and a brief power struggle, Joseph Stalin came to power in the mid-1920s. Stalin suppressed all opposition to his rule and committed the state ideology to Marxism–Leninism. As a result, the country underwent a period of rapid industrialization and collectivization which laid the foundation for its victory in World War II and postwar dominance of Eastern Europe. Shortly before World War II, Stalin signed the Molotov–Ribbentrop Pact, agreeing to non-aggression with Nazi Germany; in June 1941, the Germans invaded the Soviet Union, opening the largest and bloodiest theater of war in history. Soviet war casualties accounted for the highest proportion of the conflict in the effort of acquiring the upper hand over Axis forces at battles such as Stalingrad. Soviet forces eventually captured Berlin in 1945, and the territory overtaken by the Red Army became satellite states of the Eastern Bloc. The Cold War emerged by 1947 as the Soviet bloc confronted the Western states that united in the North Atlantic Treaty Organization in 1949. Following Stalin's death in 1953, a period of political and economic liberalization, known as de-Stalinization and Khrushchev's Thaw, occurred under the leadership of Nikita Khrushchev. The country developed rapidly, as millions of peasants were moved into industrialized cities. The USSR took an early lead in the Space Race with Sputnik 1, the first-ever satellite, and Vostok 1, the first human spaceflight. 
In the 1970s, there was a brief détente of relations with the United States, but tensions resumed with the Soviet war in Afghanistan from 1979; the war drained economic resources and was matched by an escalation of American military aid to Mujahideen fighters. In the mid-1980s, the last Soviet leader, Mikhail Gorbachev, sought to reform and liberalize the economy through his policies of glasnost and perestroika. The goal was to preserve the Communist Party while reversing economic stagnation. The Cold War ended during his tenure, and in 1989 Soviet satellite countries in Eastern Europe overthrew their respective communist regimes. This led to the rise of strong nationalist and separatist movements inside the USSR as well. In August 1991, a coup d'état was attempted by Communist Party hardliners; it failed, with Russian President Boris Yeltsin playing a high-profile role in facing down the coup. On 25 December 1991, Gorbachev resigned, and the twelve constituent republics emerged from the dissolution of the Soviet Union as independent post-Soviet states
11.
Moscow State University
–
Lomonosov Moscow State University (MSU) is a coeducational and public research university located in Moscow, Russia. It was founded on January 25, 1755 by Mikhail Lomonosov; MSU was renamed after Lomonosov in 1940 and was then known as Lomonosov University. It claims to house the tallest educational building in the world, and it is rated among the universities with the best reputation in the world. Its current rector is Viktor Sadovnichiy. Ivan Shuvalov and Mikhail Lomonosov promoted the idea of a university in Moscow, and Russian Empress Elizabeth decreed its establishment on January 25, 1755. The first lectures were given on April 26; Russians still celebrate January 25 as Students' Day. Saint Petersburg State University and Moscow State University engage in rivalry over the title of Russia's oldest university. The present Moscow State University originally occupied the Principal Medicine Store on Red Square from 1755 to 1787. In the 18th century, the University had three departments: philosophy, medicine, and law. A preparatory college was affiliated with the University until its abolition in 1812. In 1779, Mikhail Kheraskov founded a boarding school for noblemen, which in 1830 became a gymnasium for the Russian nobility. The university press, run by Nikolay Novikov in the 1780s, published the most popular newspaper in Imperial Russia. In 1804, medical education split into clinical, surgical, and obstetrics faculties. During 1884–1897, the Department of Medicine, supported by donations, built a medical campus at Devichye Pole. The campus, and medical education in general, were separated from the University in 1918. As of 2015, Devichye Pole was operated by the independent I. M. Sechenov First Moscow State Medical University and by various other state and private institutions. 
The roots of student unrest in the University reach deep into the nineteenth century. In 1905, a social-democratic organization emerged at the University and called for the overthrow of the Czarist government and the establishment of a republic in Russia; the imperial government repeatedly threatened to close the University. After the October Revolution of 1917, the institution began to admit the children of the proletariat and peasantry. In 1919, the University abolished fees for tuition and established a facility to help working-class children prepare for entrance examinations. During the implementation of Joseph Stalin's First Five-Year Plan, prisoners from the Gulag were forced to construct parts of the newly expanded University. After 1991, nine new faculties were established. The following year, the University gained a special status: it is funded directly from the state budget. On March 19, 2008, Russia's most powerful supercomputer to date was launched at the University; its peak performance of 60 TFLOPS makes it the fastest supercomputer in the Commonwealth of Independent States. Since 1953, most of the faculties have been situated on Sparrow Hills; the main building was designed by architect Lev Vladimirovich Rudnev
12.
Machine translation
–
Current machine translation software often allows for customization by domain or profession, improving output by limiting the scope of allowable substitutions. This technique is effective in domains where formal or formulaic language is used. It follows that machine translation of government and legal documents more readily produces usable output than conversation or less standardised text. With the assistance of these techniques, MT has proven useful as a tool to assist human translators and, in a very limited number of cases, can even produce output that can be used as is. The progress and potential of machine translation have been debated much throughout its history. Since the 1950s, a number of scholars have questioned the possibility of achieving fully automatic machine translation of high quality, and some critics claim there are in-principle obstacles to automating the translation process. The idea of machine translation may be traced back to the 17th century: in 1629, René Descartes proposed a universal language, with equivalent ideas in different tongues sharing one symbol. The field of machine translation was founded with Warren Weaver's Memorandum on Translation; the first researcher in the field, Yehoshua Bar-Hillel, began his research at MIT. A Georgetown University MT research team followed with a demonstration of its Georgetown-IBM experiment system in 1954. MT research programs popped up in Japan and Russia, and the first MT conference was held in London. Real progress was much slower, however, and after the ALPAC report, which found that the ten-year-long research had failed to fulfill expectations, funding was greatly reduced. Beginning in the late 1980s, as computational power increased and became less expensive, various MT companies were launched, including Trados, which was the first to develop and market translation memory technology. 
The first commercial MT system for Russian/English/German-Ukrainian was developed at Kharkov State University. MT on the web started with SYSTRAN offering free translation of small texts, followed by AltaVista Babelfish, which racked up 500,000 requests a day. Franz-Josef Och won DARPA's speed MT competition, and more recently Google announced that Google Translate translates roughly enough text to fill 1 million books in one day. The idea of using computers for translation of natural languages was proposed as early as 1946 by A. D. Booth, and Warren Weaver wrote an important memorandum, Translation, in 1949. Several papers on the topic were published at the time, and even articles in popular journals. A similar application, also pioneered at Birkbeck College at the time, was reading and composing Braille texts by computer. The human translation process may be described as: decoding the meaning of the source text, and re-encoding this meaning in the target language. Behind this ostensibly simple procedure lies a complex cognitive operation requiring in-depth knowledge of the source language, as well as the culture of its speakers. The translator needs the same in-depth knowledge to re-encode the meaning in the target language; in its most general application, this is beyond current technology
13.
National Physical Laboratory, UK
–
The National Physical Laboratory is the national measurement standards laboratory for the United Kingdom, based at Bushy Park in Teddington, London, England. It is the largest applied physics organisation in the UK, and an internationally respected centre of excellence in measurement and materials science. Bushy House was selected as the site of NPL in 1900; today the laboratory provides the scientific resources for the National Measurement System, financed by the Department for Business, Innovation and Skills. NPL also offers a range of commercial services, applying scientific skills to industrial measurement problems, and cooperates with professional networks such as those of the IET to support scientists. NPL is at the forefront of new developments in metrology, such as researching metrology for, and standardizing, nanotechnology. NPL is mainly based on the Teddington site but also has a site in Huddersfield for dimensional metrology. Teddington was also home to the UK National Chemical Laboratory, but this was closed in 1965 and some of its work was transferred to NPL. The laboratory was formerly run by the UK government, with members of staff being part of the civil service. Administration of the NPL was contracted out in 1995 under a GOCO model, with Serco winning the bid; under this regime, overhead costs halved, third-party revenues grew by 16% per annum, and the number of peer-reviewed research papers published doubled. It was decided in 2012 to change the model for NPL from 2014 onward to include academic partners, though the date of the changeover was postponed for up to a year. The laboratory transferred back to Department for Business, Innovation and Skills ownership on 1 January 2015. NPL was initially sited entirely within Bushy House but grew to fill a large selection of buildings on the Teddington site. Many of these buildings were demolished and the work moved to a large state-of-the-art laboratory built between 1998 and 2007. 
In January 2013, funding for a new £25m Advanced Metrology Laboratory was announced; it will be built on the footprint of an unused building. H. J. Gough, one of the pioneers of research into metal fatigue, worked at NPL for 19 years from 1914-38, and the inventor Sir Barnes Wallis did early development work there on the Bouncing Bomb used in the Dam Busters wartime raids. Sydney Goldstein and Sir James Lighthill worked in NPL's aerodynamics division during World War II researching boundary layer theory, and Dr Clifford Hodge also worked there, engaged in research on semiconductors
14.
Magnetic tape data storage
–
Magnetic tape data storage is a system for storing digital information on magnetic tape using digital recording. Modern magnetic tape is most commonly packaged in cartridges and cassettes; the device that performs writing or reading of data is a tape drive, and autoloaders and tape libraries automate cartridge handling. For example, a common cassette-based format is Linear Tape-Open, which comes in a variety of densities and is manufactured by several companies. Sony announced in 2014 that they had developed a storage technology with the highest reported magnetic tape data density, 148 Gbit/in². Initially, magnetic tape for storage was wound on 10.5-inch reels. This de facto standard for computer systems persisted through the late 1980s. Tape cartridges and cassettes were available as early as the mid-1970s and were used with small computer systems. With the introduction of the IBM 3480 cartridge in 1984, large computer systems started to move away from open reel tapes. Magnetic tape was first used to record computer data in 1951 on the Eckert-Mauchly UNIVAC I. The UNISERVO drive's recording medium was a metal strip of 0.5-inch-wide nickel-plated phosphor bronze. Recording density was 128 characters per inch on eight tracks at a speed of 100 in/s. Of the eight tracks, six were data and one was a parity track; making allowance for the empty space between tape blocks, the actual transfer rate was around 7,200 characters per second. A small reel of mylar tape provided separation from the metal tape. IBM computers from the 1950s used ferrous-oxide coated tape similar to that used in audio recording, and IBM's technology soon became the de facto industry standard. Magnetic tape dimensions were 0.5-inch wide, wound on removable reels up to 10.5 inches in diameter. Different tape lengths were available, with 1,200 feet and 2,400 feet being common; during the 1980s, longer tape lengths such as 3,600 feet became available using a much thinner PET film. 
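The quoted rates can be sanity-checked with a little arithmetic. In this sketch, the density, speed and effective rate come from the figures above; the data fraction is derived here, not stated in the source:

```python
# UNISERVO-era transfer rate, from the figures quoted above.
density = 128                  # characters per inch
speed = 100                    # inches per second
raw_rate = density * speed     # 12,800 characters/second if the tape
                               # carried data with no gaps

effective_rate = 7200          # quoted rate, allowing for inter-block gaps
data_fraction = effective_rate / raw_rate
# roughly 0.56, i.e. about 56% of the tape passing the head holds data;
# the rest is inter-block gap
```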
Most tape drives could support a maximum reel size of 10.5 inches. CDC used IBM-compatible 1/2-inch magnetic tapes, but also offered a 1-inch-wide variant. A so-called mini-reel was common for smaller data sets, such as for software distribution. These were 7-inch reels, often with no fixed length; the tape was sized to fit the amount of data recorded on it as a cost-saving measure. Early IBM tape drives, such as the IBM 727 and IBM 729, were mechanically sophisticated floor-standing drives that used vacuum columns to buffer long U-shaped loops of tape
15.
Insertion sort
–
Insertion sort is a simple sorting algorithm that builds the final sorted array one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. Insertion sort iterates, consuming one input element on each repetition: each iteration removes one element from the input data, finds the location it belongs within the sorted list, and inserts it there. It repeats until no input elements remain. Sorting is typically done in place by iterating up the array. At each array position, it checks the value there against the largest value in the sorted list. If larger, it leaves the element in place and moves to the next; if smaller, it finds the correct position within the sorted list, shifts all the larger values up to make a space, and inserts the element into that correct position. The resulting array after k iterations has the property that the first k+1 entries are sorted. Each insertion operates by beginning at the end of the sorted sequence and shifting each element one place to the right until a suitable position is found for the new element; it has the effect of overwriting the value stored immediately after the sorted sequence in the array. To perform a sort, begin at the left-most element of the array. The ordered sequence into which the element is inserted is stored at the beginning of the array, in the set of indices already examined; each insertion overwrites only a single value, the value being inserted. The inner loop moves the element at position i to its correct place so that after the loop, the first i+1 elements are sorted. Note that although the common practice is to implement the sort in place, which requires checking the elements in order, the order of checking input elements is actually arbitrary: the choice can be made using almost any pattern, as long as all elements are eventually checked. The algorithm can also be implemented recursively: to sort an array of n elements, one first sorts the subarray of the first n−1 elements and then inserts the last element into place. 
In this case insertion sort still has a quadratic worst-case running time. During each iteration, the first remaining element of the input is compared with the right-most element of the sorted subsection of the array. The simplest worst-case input is an array sorted in reverse order; the set of all worst-case inputs consists of all arrays where each element is the smallest or second-smallest of the elements before it. In these cases every iteration of the inner loop will scan and shift the entire sorted subsection of the array before inserting the next element. This gives insertion sort a quadratic running time; the average case is also quadratic, which makes insertion sort impractical for sorting large arrays
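The in-place procedure described above can be written as a short Python sketch (the function name is illustrative):

```python
def insertion_sort(a):
    """Sort the list a in place and return it."""
    for i in range(1, len(a)):
        key = a[i]            # next element to insert
        j = i - 1
        # shift larger elements of the sorted prefix one slot right
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key        # insert into the gap
    return a
```

After iteration i of the outer loop, the first i+1 entries are sorted, matching the invariant described in the text.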
16.
Shellsort
–
Shellsort, also known as Shell sort or Shell's method, is an in-place comparison sort. It can be seen as either a generalization of sorting by exchange or sorting by insertion. The method starts by sorting pairs of elements far apart from each other, then progressively reduces the gap between elements to be compared. By starting with far-apart elements, it can move some out-of-place elements into position faster than a simple nearest-neighbor exchange. Donald Shell published the first version of this sort in 1959. The running time of Shellsort is heavily dependent on the gap sequence it uses; for many practical variants, determining their time complexity remains an open problem. Shellsort is a generalization of insertion sort that allows the exchange of items that are far apart. The idea is to arrange the list of elements so that, starting anywhere, considering every hth element gives a sorted list. Such a list is said to be h-sorted; equivalently, it can be thought of as h interleaved lists, each individually sorted. If the file is then k-sorted for some smaller integer k, it remains h-sorted. Following this idea for a sequence of h values ending in 1 is guaranteed to leave a sorted list in the end. An example run of Shellsort with gaps 5, 3 and 1 proceeds as follows: the first pass, 5-sorting, performs insertion sort on five separate subarrays, sorting each in place. The next pass, 3-sorting, performs insertion sort on three subarrays, and the last pass, 1-sorting, is an insertion sort of the entire array. As this illustrates, the subarrays that Shellsort operates on are initially short; later they are longer but almost ordered. In both cases insertion sort works efficiently. Shellsort is unstable: it may change the relative order of elements with equal values. It is an adaptive sorting algorithm in that it executes faster when the input is partially sorted. A common practical choice is Marcin Ciura's gap sequence, combined with an inner insertion sort. 
The question of deciding which gap sequence to use is difficult. Every gap sequence that contains 1 yields a correct sort; however, the properties of the resulting versions of Shellsort may be very different. The table below compares most proposed gap sequences published so far. Some of them have decreasing elements that depend on the size of the sorted array; others are increasing infinite sequences, whose elements less than N should be used in reverse order
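A Python sketch of Shellsort using Ciura's gap sequence with an inner gapped insertion sort (the 2.25 extension factor for arrays larger than the published sequence covers is a common heuristic, not from the source):

```python
def shellsort(a):
    """Sort list a in place using Ciura's gap sequence."""
    gaps = [701, 301, 132, 57, 23, 10, 4, 1]  # Ciura's sequence
    # heuristically extend the sequence for larger arrays
    while gaps[0] < len(a) // 2:
        gaps.insert(0, int(gaps[0] * 2.25))
    for gap in gaps:
        if gap >= len(a):
            continue
        # gapped insertion sort: every gap-th element forms a sorted list
        for i in range(gap, len(a)):
            key = a[i]
            j = i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]
                j -= gap
            a[j] = key
    return a
```

The final pass with gap 1 is an ordinary insertion sort, but by then the array is almost ordered, which is where insertion sort is fast.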
17.
ALGOL
–
Algol, designated Beta Persei, and known colloquially as the Demon Star, is a bright multiple star in the constellation of Perseus. It is the first and best known eclipsing binary, and one of the first non-nova variable stars to be discovered. It is a three-star system, consisting of Beta Persei Aa1, Aa2 and Ab. Algol's magnitude is usually near-constant at 2.1; there is also a secondary eclipse when the brighter star occults the fainter secondary. This secondary eclipse can only be detected photoelectrically. Algol gives its name to its class of eclipsing variable, known as Algol variables. An Ancient Egyptian Calendar of Lucky and Unlucky Days composed some 3,200 years ago is claimed to be the oldest historical document of the discovery of Algol. In May 1783, John Goodricke presented his findings on Algol's variability to the Royal Society, and for his report he was awarded the Copley Medal. In 1881, the Harvard astronomer Edward Charles Pickering presented evidence that Algol was actually an eclipsing binary, and Algol thus became one of the first known spectroscopic binaries. Joel Stebbins at the University of Illinois Observatory used a selenium cell photometer to produce the first-ever photoelectric study of a variable star. The light curve revealed the second minimum and the reflection effect between the two stars. Some difficulties in explaining the observed spectroscopic features led to the conjecture that a third star may be present in the system. From the point of view of the Earth, Algol Aa1 and Aa2 form an eclipsing binary. The total mass of the system is about 5.8 solar masses, and the mass ratios of Aa1, Aa2, and Ab are about 4.5 to 1 to 2. The three components of the triple star used to be, and still sometimes are, referred to as β Per A, B, and C. The Washington Double Star Catalog lists them as Aa1, Aa2, and Ab; a further five faint stars are also listed as companions. In some binaries similar to Algol, a gas flow can be seen, and this system also exhibits x-ray and radio wave flares. 
The x-ray flares are thought to be caused by the magnetic fields of the A and B components interacting with the mass transfer. Mass transfer between the components is small in the Algol system but could be a significant source of change in other Algol-type binaries. The Algol system, with a total mass of about 5.8 solar masses, passed relatively close to the Solar System in the distant past; however, the actual increase in net cometary collisions this may have caused is thought to have been quite small
18.
Unix
–
Among these is Apple's macOS, which is the Unix version with the largest installed base as of 2014. Many Unix-like operating systems have arisen over the years, of which Linux is the most popular. Unix was originally meant to be a convenient platform for programmers developing software to be run on it and on other systems, rather than for non-programmer users. The system grew larger as it spread in academic circles and as users added their own tools to it. Unix was designed to be portable, multi-tasking and multi-user in a time-sharing configuration; these concepts are collectively known as the Unix philosophy. By the early 1980s users began seeing Unix as a potential universal operating system. Under Unix, the operating system consists of many utilities along with the master control program, the kernel. To mediate access to the hardware, the kernel has special rights, reflected in the division between user space and kernel space. The microkernel concept was introduced in an effort to reverse the trend towards larger kernels and return to a system in which most tasks were completed by smaller utilities. The Unix file model worked well in an era when a standard computer consisted of a disk for storage and a data terminal for input and output. However, modern systems include networking and other new devices, and as graphical user interfaces developed, the file model proved inadequate to the task of handling asynchronous events such as those generated by a mouse. In the 1980s, non-blocking I/O and the set of inter-process communication mechanisms were augmented with Unix domain sockets, shared memory, message queues, and semaphores. In microkernel implementations, functions such as network protocols could be moved out of the kernel. Multics introduced many innovations, but had many problems. Frustrated by the size and complexity of Multics, but not by its aims, the last of Bell Labs' researchers to leave Multics (Ken Thompson, Dennis Ritchie, M. D. McIlroy, and J. F. Ossanna) decided to redo the work on a much smaller scale. 
The name Unics, a pun on Multics, was suggested for the project in 1970; Peter H. Salus credits Peter Neumann with the pun, while Brian Kernighan claims the coining for himself. In 1972, Unix was rewritten in the C programming language. Bell Labs produced several versions of Unix that are collectively referred to as Research Unix. In 1975, the first source license for UNIX was sold to faculty at the University of Illinois Department of Computer Science; UIUC graduate student Greg Chesson was instrumental in negotiating the terms of this license. During the late 1970s and early 1980s, the influence of Unix in academic circles led to adoption of Unix by commercial startups, including Sequent, HP-UX, Solaris, and AIX. In the late 1980s, AT&T Unix System Laboratories and Sun Microsystems developed System V Release 4, and in the 1990s Unix-like systems grew in popularity as Linux and BSD distributions were developed through collaboration by a worldwide network of programmers
19.
Java (programming language)
–
Java is a general-purpose computer programming language that is concurrent, class-based, object-oriented, and specifically designed to have as few implementation dependencies as possible. It is intended to let application developers "write once, run anywhere": Java applications are typically compiled to bytecode that can run on any Java virtual machine regardless of computer architecture. As of 2016, Java is one of the most popular programming languages in use, particularly for client-server web applications. Java was originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++. The original and reference implementation Java compilers, virtual machines, and class libraries were originally released by Sun under proprietary licences; as of May 2007, in compliance with the specifications of the Java Community Process, Sun had relicensed most of its Java technologies under the GNU General Public License. Others have also developed alternative implementations of these Sun technologies, such as the GNU Compiler for Java, GNU Classpath, and IcedTea-Web. James Gosling, Mike Sheridan, and Patrick Naughton initiated the Java language project in June 1991. Java was originally designed for interactive television, but it was too advanced for the digital cable television industry at the time. The language was initially called Oak after an oak tree that stood outside Gosling's office; later the project went by the name Green and was finally renamed Java, from Java coffee. Gosling designed Java with a C/C++-style syntax that system and application programmers would find familiar. Sun Microsystems released the first public implementation as Java 1.0 in 1995. It promised "Write Once, Run Anywhere", providing no-cost run-times on popular platforms. Fairly secure and featuring configurable security, it allowed network- and file-access restrictions. 
Major web browsers soon incorporated the ability to run Java applets within web pages, and Java quickly became popular, though it is now mostly used outside of browsers. In January 2016, Oracle announced that Java runtime environments based on JDK 9 will discontinue the browser plugin. The Java 1.0 compiler was re-written in Java by Arthur van Hoff to comply strictly with the Java 1.0 language specification. With the advent of Java 2, new versions had multiple configurations built for different types of platforms: J2EE included technologies and APIs for enterprise applications typically run in server environments, while the desktop version was renamed J2SE. In 2006, for marketing purposes, Sun renamed new J2 versions as Java EE, Java ME, and Java SE. In 1997, Sun Microsystems approached the ISO/IEC JTC1 standards body and later Ecma International to formalize Java, but it soon withdrew from the process; Java remains a de facto standard, controlled through the Java Community Process. At one time, Sun made most of its Java implementations available without charge, despite their proprietary software status, generating revenue from Java through the selling of licenses for specialized products such as the Java Enterprise System. On November 13, 2006, Sun released much of its Java virtual machine as free and open-source software, under the terms of the GNU General Public License. Sun's vice-president Rich Green said that Sun's ideal role with regard to Java was as an evangelist. This did not prevent Oracle from filing a lawsuit against Google shortly after that for using Java inside the Android SDK. Java software runs on everything from laptops to data centers, game consoles to scientific supercomputers. On April 2, 2010, James Gosling resigned from Oracle. There were five primary goals in the creation of the Java language: it must be simple, object-oriented and familiar, and it must be robust and secure
20.
Introduction to Algorithms
–
Introduction to Algorithms is a book by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. It is used as the textbook for algorithms courses at many universities and is commonly cited as a reference for algorithms in published papers. The book sold half a million copies during its first 20 years, and its fame has led to the common use of the abbreviation CLRS, or, for the first edition, CLR. The first edition of the textbook did not include Stein as an author, and it included two chapters that were dropped in the second edition; the abbreviation CLRS came into use after the addition of the fourth author in the second edition. The first edition of the book was known as The Big White Book; with the second edition, the predominant color of the cover changed to green. A third edition was published in August 2009. The mobile depicted on the cover, Big Red by Alexander Calder, can be found at the Whitney Museum of American Art in New York City. The book's appendices cover Counting and Probability and Matrices. MIT course 6.046J/18.410J, Introduction to Algorithms (Fall 2005), held in part by coauthor Charles Leiserson, was released as part of MIT OpenCourseWare, including video recordings and transcripts of the lectures with slides automatically synchronized to video content
21.
Recursion (computer science)
–
Recursion in computer science is a method where the solution to a problem depends on solutions to smaller instances of the same problem. The approach can be applied to many types of problems. The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement; in the same manner, an infinite number of computations can be described by a finite recursive program. Most computer programming languages support recursion by allowing a function to call itself within the program text, and some functional programming languages do not define any looping constructs but rely solely on recursion to repeatedly call code. A common computer programming tactic is to divide a problem into sub-problems of the same type as the original, solve those sub-problems, and combine the results. For example, the factorial function can be defined recursively by the equations 0! = 1 and, for all n > 0, n! = n · (n − 1)!. Neither equation by itself constitutes a complete definition: the first is the base case, and the second is the recursive case. Because the base case breaks the chain of recursion, it is also called the terminating case. The job of the recursive cases can be seen as breaking down complex inputs into simpler ones. In a properly designed recursive function, each recursive call must simplify the input so that the base case is eventually reached; neglecting to write a base case, or testing for it incorrectly, can cause an infinite loop. For some functions there is no base case implied by the input data. Many computer programs must process or generate an arbitrarily large quantity of data, and recursion is one technique for representing data whose exact size the programmer does not know. There are two types of self-referential definitions: inductive and coinductive definitions. An inductively defined recursive data definition is one that specifies how to construct instances of the data. For example, linked lists can be defined inductively: a list of strings is either empty, or a structure that contains a string and a reference to a list of strings. 
The self-reference in the definition permits the construction of lists of any number of strings. Another example of inductive definition is the natural numbers: a natural number is either 1 or n + 1, where n is a natural number. Similarly, recursive definitions are often used to model the structure of expressions and statements in programming languages
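The factorial definition above, with its base case and recursive case, translates directly into a short Python sketch:

```python
def factorial(n):
    """n! via the recursive definition:
    0! = 1 (base case), n! = n * (n-1)! for n > 0 (recursive case)."""
    if n == 0:                        # base case terminates the recursion
        return 1
    return n * factorial(n - 1)       # recursive case: strictly smaller input
```

Each call simplifies the input (n shrinks by 1), so the base case is always reached for non-negative n; omitting the `n == 0` test would recurse forever.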
22.
Median
–
The median is the value separating the higher half of a data sample, a population, or a probability distribution from the lower half. In simple terms, it may be thought of as the middle value of a data set. For example, in the data set 1, 3, 3, 6, 7, 8, 9, the median is 6. The median is a commonly used measure of the properties of a data set in statistics. The basic advantage of the median in describing data compared to the mean is that it is not skewed so much by extremely large or small values, and so it may give a better idea of a typical value. For example, in understanding statistics like household income or assets, which vary greatly, median income may be a better way to suggest what a typical income is. The median of a finite list of numbers can be found by arranging all the numbers from smallest to greatest; if there is an odd number of numbers, the middle one is picked. For example, consider the set of numbers 1, 3, 3, 6, 7, 8, 9. This set contains seven numbers; the median is the fourth of them, which is 6. If there is an even number of observations, then there is no single middle value; the median is then usually defined as the mean of the two middle values. For example, in the set 1, 2, 3, 4, 5, 6, 8, 9, the median is the mean of the middle two numbers: this is (4 + 5) ÷ 2, which is 4.5. The formula used to find the position of the middle number of a data set of n numbers is (n + 1) ÷ 2. This either gives the middle number (for odd n) or the halfway point between the two middle values (for even n). For example, with 14 values, the formula gives 7.5, so the median is the mean of the seventh and eighth values. The median can also be found using a stem-and-leaf plot. There is no widely accepted standard notation for the median; in any case, the use of these or other symbols for the median needs to be explicitly defined when they are introduced. The median is used primarily for skewed distributions, which it summarizes differently from the arithmetic mean. In a sample such as 1, 2, 2, 2, 3, 14, the median is 2, and it might be seen as a better indication of central tendency than the arithmetic mean of 4. 
The widely cited empirical relationship between the locations of the mean and the median for skewed distributions is, however, not generally true. There are, though, various relationships for the absolute difference between them; see below
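The rule above (middle value for an odd count, mean of the two middle values for an even count) can be sketched in Python; the function name is illustrative:

```python
def median(values):
    """Median of a non-empty sequence: the middle value after sorting,
    or the mean of the two middle values when the count is even."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                   # odd count: single middle value
    return (s[mid - 1] + s[mid]) / 2    # even count: mean of the two middles
```

On the examples from the text, the seven-element set yields 6 and the eight-element set yields 4.5.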
23.
Binomial coefficient
–
In mathematics, any of the positive integers that occurs as a coefficient in the binomial theorem is a binomial coefficient. Commonly, a binomial coefficient is indexed by a pair of integers n ≥ k ≥ 0 and is written C(n, k), read "n choose k". It is the coefficient of the x^k term in the polynomial expansion of the binomial power (1 + x)^n, and its value is given by the expression n! / (k! (n − k)!). Arranging binomial coefficients into rows for successive values of n, in which k ranges from 0 to n, gives a triangular array called Pascal's triangle. The properties of binomial coefficients have led to extending the definition beyond the case of integers n ≥ k ≥ 0. Andreas von Ettingshausen introduced the notation in 1826, although the numbers were known centuries earlier; the earliest known detailed discussion of binomial coefficients is in a tenth-century commentary, by Halayudha, on an ancient Sanskrit text, Pingala's Chandaḥśāstra. In about 1150, the Indian mathematician Bhaskaracharya gave an exposition of binomial coefficients in his book Līlāvatī. Alternative notations include C(n, k), nCk, nCk, Ckn, Cnk, and Cn,k, in all of which the C stands for combinations or choices. Many calculators use variants of the C notation because they can represent it on a single-line display; in this form the binomial coefficients are easily compared to k-permutations of n, written as P(n, k), etc. For natural numbers n and k, the binomial coefficient can be defined as the coefficient of the monomial X^k in the expansion of (1 + X)^n; the same coefficient also occurs in the binomial formula, which explains the name "binomial coefficient". This shows in particular that it is a natural number for any natural numbers n and k. There are many other combinatorial interpretations, most of which are easily seen to be equivalent to counting k-combinations. Several methods exist to compute the value of C(n, k) without actually expanding a binomial power or counting k-combinations. 
It also follows from tracing the contributions to X^k in (1 + X)^(n−1) (1 + X). As there is zero X^(n+1) or X^(−1) in (1 + X)^n, one might extend the definition beyond the above boundaries to include C(n, k) = 0 when either k > n or k < 0. This recursive formula then allows the construction of Pascal's triangle, surrounded by white spaces where the zeros, or the trivial coefficients, would be. A more efficient method to compute individual binomial coefficients is given by the formula

  C(n, k) = [n (n − 1) ⋯ (n − k + 1)] / [k (k − 1) ⋯ 1] = ∏_{i=1}^{k} (n + 1 − i) / i,

where the numerator is the falling factorial of n of order k. This formula is easiest to understand via the combinatorial interpretation of binomial coefficients: the numerator gives the number of ways to select a sequence of k distinct objects, retaining the order of selection, from a set of n objects, while the denominator counts the number of distinct sequences that define the same k-combination when order is disregarded. Due to the symmetry of the binomial coefficient with regard to k and n − k, calculation may be optimised by setting the upper limit of the product above to the smaller of k and n − k. The factorial formula follows from this one by multiplying numerator and denominator by (n − k)!; as a consequence it involves many factors common to numerator and denominator, and it is less practical for explicit computation unless common factors are first cancelled
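The product formula above gives an exact integer computation; this Python sketch also applies the symmetry optimisation described in the text:

```python
def binomial(n, k):
    """C(n, k) via the product formula prod_{i=1}^{k} (n+1-i)/i."""
    if k < 0 or k > n:
        return 0              # the extended definition: zero outside 0..n
    k = min(k, n - k)         # symmetry: C(n, k) == C(n, n-k)
    result = 1
    for i in range(1, k + 1):
        # result * (n+1-i) is always divisible by i at this point,
        # so integer division stays exact
        result = result * (n + 1 - i) // i
    return result
```

After step i the running value equals C(n, i), which is why each intermediate division is exact; no factorials are ever materialised.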
24.
Integer overflow
–
The most common result of an overflow is that the least significant representable bits of the result are stored; the result is said to wrap around the maximum. An overflow condition gives incorrect results and, particularly if the possibility has not been anticipated, can compromise a program's reliability and security. The register width of a processor determines the range of values that can be represented. In particular, multiplying or adding two integers may result in a value that is unexpectedly small, and subtracting from a small integer may cause a wrap to a large positive value. If the variable has a signed integer type, a program may make the assumption that the variable always contains a positive value; an integer overflow can cause the value to wrap and become negative. Most computers have two dedicated processor flags to check for overflow conditions. The carry flag is set when the result of an addition or subtraction, considering the operands and result as unsigned numbers, does not fit in the given number of bits; this indicates an overflow with a carry/borrow from the most significant bit. The overflow flag indicates that an overflow has occurred and the result, represented in two's complement form, would not fit in the given number of bits. Handling: if it is anticipated that overflow may occur, it can be detected when it happens; CPUs generally have a way of detecting this to support addition of numbers larger than their register size, typically using a status bit. Propagation: if a value is too large to be stored, it can be assigned a special value indicating that overflow has occurred, which is useful so that the problem can be checked for once at the end of a long calculation rather than after each step. This is often supported in floating-point hardware (FPUs). Run-time overflow detection implementations such as AddressSanitizer are also available for C compilers. Using languages with checked or arbitrary-precision arithmetic may thus be helpful to mitigate this issue; however, in some such languages, situations are still possible where an integer overflow can occur. 
An example is explicit optimization of a code path which is considered a bottleneck by the profiler. In the case of Common Lisp, this is possible by using an explicit declaration to type-annotate a variable to a machine-size word. In Java 8, there are overloaded methods, for example Math#addExact, which throw an ArithmeticException on overflow. In computer graphics or signal processing, it is typical to work on data that ranges from 0 to 1 or from −1 to 1. An example of this is a grayscale image where 0 represents black and 1 represents white. One operation that one may want to support is brightening the image by multiplying every pixel by a constant. Unanticipated arithmetic overflow is a fairly common cause of program errors; such overflow bugs may be hard to discover and diagnose because they may manifest themselves only for very large input data sets
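Since Python integers are arbitrary-precision and never overflow, the wrap-around behaviour described above can be illustrated by masking results to a 32-bit word (a sketch; the helper name is ours, not a standard API):

```python
def add_int32(a, b):
    """Add two numbers with the wrap-around semantics of a signed
    32-bit integer, by keeping only the low 32 bits of the result."""
    result = (a + b) & 0xFFFFFFFF      # truncate to 32 bits
    if result >= 0x80000000:           # reinterpret as two's complement
        result -= 0x100000000
    return result

INT32_MAX = 2**31 - 1                  # 2147483647
```

Adding 1 to INT32_MAX wraps to the most negative value, the classic symptom of a signed variable "assumed positive" suddenly becoming negative.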
25.
Dutch national flag problem
–
The Dutch national flag problem is a computer science programming problem proposed by Edsger Dijkstra; the flag of the Netherlands consists of three colors: red, white and blue. Several solutions exist that have varying performance characteristics, tailored to sorting arrays with either small or large numbers of repeated elements. This problem can also be viewed in terms of rearranging elements of an array: suppose each of the possible elements can be classified into exactly one of three categories (bottom, middle, and top). For example, if all elements are in [0, 1], the bottom could be defined as elements in [0, 0.3) and the top as elements of 0.3 and greater. The problem is then to produce an array such that all bottom elements come before all middle elements, which come before all top elements. One algorithm is to have the top group grow down from the top of the array and the bottom group grow up from the bottom. The algorithm maintains three indices: the bottom of the top group, the top of the bottom group, and the top of the middle group. Elements that are yet to be sorted fall between the middle and the top group. At each step, examine the element just above the middle: if it belongs to the top group, swap it with the element just below the top; if it belongs in the bottom, swap it with the element just above the bottom; if it is in the middle, leave it. Complexity is Θ(n) moves and examinations. The following description of three-way partitioning assumes zero-based array indexing. It uses three indices i, j and n, maintaining the invariant that i ≤ j ≤ n: n holds the boundary of numbers greater than mid, j is the position of the number under consideration, and i is the boundary for the numbers less than mid. See also: American flag sort.
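A minimal sketch of the three-index partition in Python, following the i ≤ j ≤ n invariant described above (here `mid` plays the role of the white band):

```python
def three_way_partition(a, mid):
    """Dijkstra's Dutch-flag partition, in place.

    Invariant: a[:i] < mid, a[i:j] == mid, a[j:n] unexamined, a[n:] > mid.
    """
    i, j, n = 0, 0, len(a)
    while j < n:
        if a[j] < mid:
            a[i], a[j] = a[j], a[i]   # grow the bottom group
            i += 1
            j += 1
        elif a[j] > mid:
            n -= 1
            a[j], a[n] = a[n], a[j]   # grow the top group; re-examine a[j]
        else:
            j += 1                    # already in the middle group
    return a
```

Each element is examined at most twice and moved at most once per examination, matching the Θ(n) bound.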
26.
Version 7 Unix
–
Seventh Edition Unix, also called Version 7 Unix, Version 7 or just V7, was an important early release of the Unix operating system. V7, released in 1979, was the last Bell Laboratories release to see widespread distribution before the commercialization of Unix by AT&T Corporation in the early 1980s. V7 was originally developed for Digital Equipment Corporation's PDP-11 minicomputers and was later ported to other platforms. Unix versions from Bell Labs were designated by the edition of the manual with which they were accompanied; released in 1979, the Seventh Edition was preceded by the Sixth Edition. V7 was the first readily portable version of Unix. The first Sun workstations ran a V7 port by UniSoft; the first version of Xenix for the Intel 8086 was derived from V7; and the VAX port of V7, called UNIX/32V, was the direct ancestor of the popular 4BSD family of Unix systems. The group at the University of Wollongong that had ported V6 to the Interdata 7/32 ported V7 to that machine as well; Interdata sold the port as Edition VII, making it the first commercial UNIX offering. DEC distributed their own PDP-11 version of V7, called V7M; the group responsible (UEG) evolved into the group that later developed Ultrix. At the time of its release, though, V7's greatly extended feature set came at the expense of a decrease in performance compared to V6. The exact number of system calls varies depending on the operating system version, and more recent systems have seen considerable growth in the number of supported system calls: Linux 3.2.0 has 380 system calls and FreeBSD 8.0 has over 450. In 2002, Caldera International released V7 as FOSS under a permissive BSD-like software license. Bootable images for V7 can still be downloaded today and can be run on modern hosts using PDP-11 emulators such as SIMH; an x86 port has been developed by Nordier & Associates. Paul Allen maintains several publicly accessible computer systems, including a PDP-11/70 running Unix Version 7. 
Request a login from Living Computers: Museum + Labs and try running Version 7 Unix on the original equipment. Many new features were introduced in Version 7, including the programming tools lex, lint, and make; the Portable C Compiler was provided along with the earlier, PDP-11-specific C compiler by Ritchie. These first appeared in the Research Unix lineage in Version 7. mpx files were considered experimental, were not enabled in the default kernel, and disappeared from later versions, which offered sockets or CB UNIX's IPC facilities instead. See also: Version 6 Unix; Seventh Edition Unix terminal interface; the Ancient UNIX archive; the Unix Seventh Edition manual (browsable source code); the PDP Unix Preservation Society.
27.
Cache memory
–
Cache memory makes use of the fast SRAM technology, as against the slower main memory (MM) DRAM, and is connected directly to the processor. The term cache is derived from the French and refers to a place where one can hide something. The term has many meanings depending on the context; examples are disk cache, TLB (translation lookaside buffer), branch prediction cache, branch history table, branch address cache, and trace cache, which are physical memories. Others are handled by software, to store data in reserved MM space. Note: "CPU cache" is a term rarely, or wrongly, used in both the literature and the industrial environment; for instance, in US patent documents relating to caches, the term "CPU cache" is used in less than 2% of the documents, as against "cache memory" and "memory cache". Cache general definition: the cache is a memory that stores data transparently ("in silent mode") with respect to the upper level of utilization. In an MM read operation, the cache controller first of all checks whether the data is stored in the cache. In case of a match (a hit), the data is quickly and directly supplied from the cache to the processor without involving the MM; otherwise (a miss) the data is read from MM. A cache stores only a subset of MM data, the most recently used (MRU) data. Data read from MM is temporarily stored in the cache; if the processor requires the same data again, it is supplied by the cache. The cache is effective because of locality of reference. Temporal locality: short instruction loops and routines are a common program structure, so recently accessed items are likely to be accessed again soon. Spatial locality: if a datum is referenced, it is very likely that nearby data will be accessed soon. Instructions and data are transferred from MM to the cache in fixed-size blocks (cache lines); cache line size is in the range of 4 to 512 bytes, so that more than one item of processing data is stored in each cache entry. After a first MM access, all the data of the cache line are available in the cache: the next instruction usually comes from the next memory location, and data are usually structured, with items of these structures normally stored in contiguous memory locations. 
Large line sizes increase spatial locality but also increase the amount of data invalidated in case of line replacement. Cache efficiency is measured in terms of hit rate: the hit rate is the percentage of hits compared to the total number of cache accesses (the opposite of a hit is called a miss). Cache efficiency depends on several elements: cache size, line size, type, and cache architecture. A good figure for cache efficiency, for business applications, may be in the range of 80-95% hits
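The effect of the hit rate can be sketched with the standard average-access-time formula (a miss pays the main-memory latency on top of the cache lookup). The latency numbers below are illustrative assumptions, not figures from the text:

```python
def average_access_time(hit_rate, cache_time_ns, mm_time_ns):
    """Average memory access time: cache lookup plus the miss penalty
    weighted by the miss rate."""
    miss_rate = 1.0 - hit_rate
    return cache_time_ns + miss_rate * mm_time_ns

# Assumed latencies: 1 ns cache, 100 ns main memory. At a 95% hit rate the
# average access is close to the cache speed despite the slow main memory:
print(average_access_time(0.95, 1.0, 100.0))  # 6.0 ns
```

Raising the hit rate from 80% to 95% in this sketch cuts the average access time from 21 ns to 6 ns, which is why the hit-rate range quoted above matters so much.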
28.
Prefix sum
–
In computer science, the prefix sum, cumulative sum, inclusive scan, or simply scan of a sequence of numbers x0, x1, x2, ... is a second sequence of numbers y0, y1, y2, ..., the sums of prefixes of the input sequence: y0 = x0, y1 = x0 + x1, y2 = x0 + x1 + x2, ... Prefix sums have also been much studied in parallel algorithms, both as a test problem to be solved and as a useful primitive to be used as a subroutine in other parallel algorithms. Abstractly, a prefix sum requires only a binary associative operator ⊕. Mathematically, the operation of taking prefix sums can be generalized from finite to infinite sequences; in that context, a prefix sum is known as a partial sum of a series. Prefix summation or partial summation forms a linear operator on the spaces of finite or infinite sequences. In Haskell, prefix sums are computed by scanl and scanl1; the corresponding suffix operations are available as scanr and scanr1. The procedural Message Passing Interface libraries provide an operation MPI_Scan for computing a scan operation between networked processing units. The C++ language has a library function partial_sum; despite its name, it takes a binary operation as one of its arguments and can compute generalized scans. A prefix sum can be calculated in parallel by the following steps. First, compute the sums of consecutive pairs of items in which the first item of the pair has an even index: z0 = x0 + x1, z1 = x2 + x3, etc. Second, recursively compute the prefix sum w0, w1, w2, ... of the sequence z0, z1, z2, ... Third, express each term of the final sequence y0, y1, y2, ... as the sum of up to two terms of these sequences: y0 = x0, y1 = z0, y2 = z0 + x2, y3 = w0, etc. After the first value, each successive number yi is either copied from a position half as far through the w sequence, or is the previous value added to one value in the x sequence. If the input sequence has n elements, then the recursion continues to a depth of O(log n); asymptotically this method takes approximately two read operations and one write operation per item. 
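The three steps above can be sketched sequentially in Python; in a parallel setting, each of the O(log n) recursion levels would perform its pair sums and recombination concurrently:

```python
def prefix_sum(x):
    """Inclusive scan via the recursive pairwise scheme.

    z pairs up neighbours, w is the recursive scan of z, and each y[i] is
    rebuilt from at most two terms: w alone at odd positions, w plus one
    x term at even positions.
    """
    if len(x) <= 1:
        return list(x)
    # Step 1: sums of consecutive pairs (a trailing element of odd-length
    # input is left for step 3 to handle).
    z = [x[i] + x[i + 1] for i in range(0, len(x) - 1, 2)]
    # Step 2: recursively scan the half-length sequence.
    w = prefix_sum(z)
    # Step 3: rebuild y from w and x.
    y = [x[0]]
    for i in range(1, len(x)):
        y.append(w[i // 2] if i % 2 == 1 else w[i // 2 - 1] + x[i])
    return y
```

For x = [1, 2, 3, 4, 5] this computes z = [3, 7], w = [3, 10], and rebuilds y = [1, 3, 6, 10, 15].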
When a data set may be updated dynamically, it may be stored in a Fenwick tree data structure. This structure allows both the lookup of any individual prefix-sum value and the modification of any array value in logarithmic time per operation. For higher-dimensional arrays, the summed-area table provides a data structure based on prefix sums for computing sums of arbitrary rectangular subarrays; this can be a helpful primitive in image convolution operations. Counting sort is a sorting algorithm that uses the prefix sum of a histogram of key frequencies to calculate the position of each key in the sorted output array. List ranking, the problem of transforming a linked list into an array that represents the same sequence of items, can be viewed as computing a prefix sum on the list
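As a sketch of the dynamic case, a minimal Fenwick (binary indexed) tree in Python; both operations walk at most O(log n) nodes by manipulating the lowest set bit of the index:

```python
class FenwickTree:
    """Point update and prefix-sum query, each in O(log n)."""

    def __init__(self, n):
        self.tree = [0] * (n + 1)   # 1-based internal array

    def add(self, i, delta):
        """Add delta to element i (0-based)."""
        i += 1
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i             # jump to the next responsible node

    def prefix_sum(self, i):
        """Sum of elements 0..i inclusive."""
        i += 1
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i             # strip the lowest set bit
        return s
```

For example, after loading [3, 1, 4, 1, 5], `prefix_sum(2)` returns 8; a subsequent `add(1, 2)` makes it 10, with neither operation touching more than a few tree nodes.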
29.
Radix sort
–
A positional notation is required, but because integers can represent strings of characters and specially formatted floating-point numbers, radix sort is not limited to integers. Radix sort dates back as far as 1887 to the work of Herman Hollerith on tabulating machines. Two classifications of radix sorts are least significant digit (LSD) radix sorts and most significant digit (MSD) radix sorts. LSD radix sorts process the integer representations starting from the least significant digit; MSD radix sorts work the other way around. LSD radix sorts typically use the following sorting order: short keys come before longer keys, and keys of the same length are sorted lexicographically. This coincides with the normal order of integer representations, such as the sequence 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. MSD radix sorts use lexicographic order, which is suitable for sorting strings, such as words, or fixed-length integer representations; a sequence such as b, c, d, e, f, g, h, i, j, ba would be sorted as b, ba, c, d, e, f, g, h, i, j. The topic of the efficiency of radix sort compared to other sorting algorithms is somewhat tricky: whether radix sort is equally efficient, less efficient or more efficient than the best comparison-based algorithms depends on the details of the assumptions made. Radix sort complexity is O(wn) for n keys which are integers of word size w. Sometimes w is presented as a constant, which would make radix sort better (for sufficiently large n) than the best comparison-based sorting algorithms, which all perform O(n log n) comparisons to sort n keys. However, if all n keys are distinct, w must be at least log n for a random-access machine to store them, so that would seem to make radix sort at most equally efficient as the best comparison-based sorts. The counter-argument is that comparison-based algorithms are measured in number of comparisons, not actual time complexity; under some assumptions the comparisons will be constant time on average, under others they will not. In a sorting algorithm, the first comparisons made satisfy the randomness condition, but as the sort progresses the keys compared are clearly not randomly chosen anymore. 
For example, consider a bottom-up merge sort: the first pass will compare pairs of random keys, but the last pass will compare keys that are very close in the sorting order, and such comparisons take longer than constant time on average. This makes merge sort, on this class of inputs, take asymptotically more time than a constant-cost comparison count would suggest. That analysis also assumes all memory accesses cost the same, which is not a physically reasonable assumption as we scale n to infinity. A least significant digit (LSD) radix sort is a fast, stable sorting algorithm which can be used to sort keys in integer representation order. Keys may be a string of characters or numerical digits in a given radix. The processing of the keys begins at the least significant digit; the sequence in which digits are processed by an LSD radix sort is the opposite of the sequence in which digits are processed by a most significant digit (MSD) radix sort. An LSD radix sort operates in O(nw) time, where n is the number of keys and w is the key length. A radix sorting algorithm was originally used to sort punched cards in several passes. A computer algorithm for radix sort was invented in 1954 at MIT by Harold H. Seward. In many large applications needing speed, the computer radix sort is an improvement on comparison sorts
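A sketch of an LSD radix sort for non-negative integers in Python, using one stable bucket pass per digit; stability of each pass is what makes the overall sort correct:

```python
def lsd_radix_sort(keys, base=10):
    """Stable LSD radix sort for non-negative integers, O(nw) overall."""
    if not keys:
        return []
    keys = list(keys)
    max_key = max(keys)
    place = 1
    while place <= max_key:                          # one pass per digit
        buckets = [[] for _ in range(base)]
        for k in keys:
            buckets[(k // place) % base].append(k)   # append keeps pass stable
        keys = [k for bucket in buckets for k in bucket]
        place *= base
    return keys
```

With base 10 and keys up to 999, this runs exactly three passes regardless of n, which is the sense in which the running time is O(nw) rather than O(n log n).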
30.
Call stack
–
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or machine stack. Although maintenance of the call stack is important for the proper functioning of most software, the details are normally hidden and automatic in high-level programming languages. Many computer instruction sets provide special instructions for manipulating stacks. A call stack is used for several related purposes, but the main reason for having one is to keep track of the point to which each active subroutine should return control when it finishes executing. An active subroutine is one that has been called but has yet to complete execution, after which control should be handed back to the point of call. Such activations of subroutines may be nested to any level, hence the stack structure. If, for example, a subroutine DrawSquare calls a subroutine DrawLine from four different places, DrawLine must know where to return when its execution completes. To accomplish this, the address following the call instruction, the return address, is pushed onto the call stack with each call. If a called subroutine calls on yet another subroutine, it will push another return address onto the call stack, and so on. If the pushing consumes all of the space allocated for the call stack, an error called a stack overflow occurs, generally causing the program to crash. Adding a subroutine's entry to the call stack is sometimes called winding; conversely, removing entries is unwinding. There is usually exactly one call stack associated with a running program. In high-level programming languages, the specifics of the call stack are usually hidden from the programmer; they are given access only to a set of functions, not to the stack memory itself. This is an example of abstraction. Most assembly languages, on the other hand, require programmers to be involved with manipulating the stack. The actual details of the stack in a programming language depend upon the compiler, the operating system, and the available instruction set. As noted above, the primary purpose of a call stack is to store the return addresses. 
When a subroutine is called, the location of the instruction at which the caller can later resume needs to be saved somewhere. Using a stack to save the return address has important advantages over alternative calling conventions. One is that each task can have its own stack, and thus the subroutine can be reentrant. Another benefit is that recursion is automatically supported: when a function calls itself recursively, a return address needs to be stored for each activation of the function so that it can later be used to return from that activation, and stack structures provide this capability automatically. It is also often convenient to allocate space for a subroutine's local data by simply moving the top of the stack by enough to provide the space; this is very fast when compared to dynamic memory allocation, which uses the heap space. Note that each separate activation of a subroutine gets its own separate space in the stack for locals
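The points above can be illustrated in Python, where the interpreter manages the call stack: each recursive activation gets its own copy of the local n, and exhausting the stack surfaces as a catchable RecursionError (a guarded form of stack overflow) rather than a raw crash:

```python
def factorial(n):
    # Each activation has its own local n on the call stack; the saved
    # return address is what lets "n * factorial(n - 1)" resume here.
    if n <= 1:
        return 1
    return n * factorial(n - 1)

print(factorial(5))  # 120

def runaway():
    # Unbounded recursion pushes frames until the interpreter's depth
    # limit is hit, at which point Python raises RecursionError.
    return runaway()

try:
    runaway()
except RecursionError:
    print("call stack limit reached")
```

In a language without this guard (C, for instance), the same runaway recursion would overflow the stack and typically crash the process, as described above.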