Theoretical computer science
Theoretical computer science is a subfield of general computer science and mathematics that focuses on the mathematical topics of computing, including the theory of computation. It is difficult to circumscribe the theoretical areas precisely; the ACM's Special Interest Group on Algorithms and Computation Theory provides the following description: TCS covers a wide variety of topics including algorithms, data structures, computational complexity, distributed computation, probabilistic computation, quantum computation, automata theory, information theory, program semantics and verification, machine learning, computational biology, computational economics, computational geometry, and computational number theory and algebra. Work in this field is distinguished by its emphasis on mathematical technique and rigor. While logical inference and mathematical proof had existed previously, in 1931 Kurt Gödel proved with his incompleteness theorem that there are fundamental limitations on what statements can be proved or disproved.
These developments have led to the modern study of logic and computability, and indeed the field of theoretical computer science as a whole. Information theory was added to the field with Claude Shannon's 1948 paper "A Mathematical Theory of Communication". In the same decade, Donald Hebb introduced a mathematical model of learning in the brain. With mounting biological data supporting this hypothesis with some modification, the fields of neural networks and parallel distributed processing were established. In 1971, Stephen Cook and, working independently, Leonid Levin proved that there exist practically relevant problems that are NP-complete – a landmark result in computational complexity theory. With the development of quantum mechanics at the beginning of the 20th century came the concept that mathematical operations could be performed on an entire particle wavefunction. In other words, one could compute functions on multiple states simultaneously. This led to the concept of a quantum computer in the latter half of the 20th century, which took off in the 1990s when Peter Shor showed that such methods could be used to factor large numbers in polynomial time – a result which, if implemented, would render most modern public-key cryptography systems insecure.
Modern theoretical computer science research is based on these basic developments, but includes many other mathematical and interdisciplinary problems that have been posed, as shown below. An algorithm is a step-by-step procedure for calculations. Algorithms are used for calculation, data processing, and automated reasoning. An algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Starting from an initial state and initial input, the instructions describe a computation that, when executed, proceeds through a finite number of well-defined successive states, eventually producing "output" and terminating at a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate random input. A data structure is a particular way of organizing data in a computer so that it can be used efficiently. Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example, databases use B-tree indexes for small percentages of data retrieval, and compilers and databases use dynamic hash tables as look-up tables.
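As an illustration of these definitions (the example is ours, not the source's), binary search is an algorithm expressed as a finite list of well-defined instructions: the state is the pair (lo, hi), each iteration is one state transition, and the computation terminates because the interval shrinks at every step.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent.

    Each loop iteration is one well-defined state transition: the
    state is the pair (lo, hi), and the computation terminates after
    finitely many steps because the interval shrinks every time.
    """
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid          # final state: target found
        elif sorted_items[mid] < target:
            lo = mid + 1        # discard the lower half
        else:
            hi = mid - 1        # discard the upper half
    return -1                   # final state: target absent
```

For example, `binary_search([2, 3, 5, 7, 11], 7)` reaches its final state after two iterations and returns 3.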
Data structures provide a means to manage large amounts of data efficiently for uses such as large databases and internet indexing services. Efficient data structures are key to designing efficient algorithms; some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design. Storing and retrieving can be carried out on data stored in both main memory and secondary memory. Computational complexity theory is a branch of the theory of computation that focuses on classifying computational problems according to their inherent difficulty and relating those classes to each other. A computational problem is understood to be a task that is in principle amenable to being solved by a computer, which is equivalent to stating that the problem may be solved by mechanical application of mathematical steps, such as an algorithm. A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used.
The theory formalizes this intuition by introducing mathematical models of computation to study these problems and quantifying the amount of resources needed to solve them, such as time and storage. Other complexity measures are also used, such as the amount of communication, the number of gates in a circuit, and the number of processors. One of the roles of computational complexity theory is to determine the practical limits on what computers can and cannot do. Distributed computing studies distributed systems. A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages; the components interact with each other in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components. Examples of distributed systems vary from SOA-based systems to massively multiplayer online games to peer-to-peer applications and blockchains.
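The notion of quantifying resources such as time can be made concrete by instrumenting an algorithm with a step counter. The sketch below (a hypothetical example, not from the source) counts the comparisons made by a brute-force pair search, whose running time grows quadratically with the input size.

```python
def count_pairs(values, target_sum):
    """Count pairs (i, j) with i < j and values[i] + values[j] == target_sum.

    The step counter makes the "time" resource of complexity theory
    concrete: this brute-force method always performs n*(n-1)/2
    comparisons, so its running time grows quadratically with n.
    """
    count, steps = 0, 0
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1                  # one unit of the "time" resource
            if values[i] + values[j] == target_sum:
                count += 1
    return count, steps
```

Doubling the input length from 10 to 20 raises the comparison count from 45 to 190, roughly quadrupling it, which is exactly the quadratic growth the theory predicts.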
A computer program that runs in a distributed system is called a distributed program, and distributed programming is the process of writing such programs. There are m
Literate programming is a programming paradigm introduced by Donald Knuth in which a program is given as an explanation of the program logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which compilable source code can be generated. The literate programming paradigm, as conceived by Knuth, represents a move away from writing programs in the manner and order imposed by the computer, and instead enables programmers to develop programs in the order demanded by the logic and flow of their thoughts. Literate programs are written as an uninterrupted exposition of logic in an ordinary human language, much like the text of an essay, in which macros are included to hide abstractions and traditional source code. Literate programming tools are used to obtain two representations from a literate source file: one suitable for further compilation or execution by a computer, the "tangled" code, and another for viewing as formatted documentation, which is said to be "woven" from the literate source.
While the first generation of literate programming tools were computer-language-specific, later ones are language-agnostic and exist above the programming languages. Literate programming was first introduced by Donald E. Knuth in 1984; the main intention behind this approach was to treat a program as literature understandable to human beings. This approach was implemented at Stanford University as a part of research on algorithms and digital typography. The implementation was called "WEB" by Knuth, since he believed that it was one of the few three-letter words of English that hadn't already been applied to computing. However, the name aptly suggests the complicated nature of software, delicately pieced together from simple materials. Literate programming is writing out the program logic in a human language with included code snippets and macros. Macros in a literate source file are title-like or explanatory phrases in a human language that describe human abstractions created while solving the programming problem, hiding chunks of code or lower-level macros.
These macros are similar to the algorithms in pseudocode used in teaching computer science. These arbitrary explanatory phrases become precise new operators, created on the fly by the programmer, forming a meta-language on top of the underlying programming language. A preprocessor is used to substitute arbitrary hierarchies, or rather "interconnected 'webs' of macros", to produce the compilable source code with one command and documentation with another. The preprocessor also provides the ability to write out the content of the macros and to add to already created macros in any place in the text of the literate program source file, thereby disposing of the need to keep in mind the restrictions imposed by traditional programming languages or to interrupt the flow of thought. According to Knuth, literate programming provides higher-quality programs, since it forces programmers to explicitly state the thoughts behind the program, making poorly thought-out design decisions more obvious. Knuth also claims that literate programming provides a first-rate documentation system: not an add-on, but one grown naturally in the process of exposition of one's thoughts during a program's creation.
The resulting documentation allows the author to restart his own thought processes at any later time, and allows other programmers to understand the construction of the program more easily. This differs from traditional documentation, in which a programmer is presented with source code that follows a compiler-imposed order and must decipher the thought process behind the program from the code and its associated comments. The meta-language capabilities of literate programming are also claimed to facilitate thinking, giving a higher "bird's eye view" of the code and increasing the number of concepts the mind can retain and process. Applicability of the concept to programming on a large scale, that of commercial-grade programs, is proven by an edition of TeX code as a literate program. Knuth also claims that literate programming can lead to easy porting of software to multiple environments, and cites the implementation of TeX as an example. Literate programming is often misunderstood to refer only to formatted documentation produced from a common file with both source code and comments – which is properly called documentation generation – or to voluminous commentaries included with code.
This is backwards: well-documented code or documentation extracted from code follows the structure of the code, with documentation embedded in the code; in literate programming, code is instead embedded in documentation, with the code following the structure of the documentation. This misconception has led to claims that comment-extraction tools, such as the Perl Plain Old Documentation or Java Javadoc systems, are "literate programming tools". However, because these tools do not implement the "web of abstract concepts" hiding behind the system of natural-language macros, or provide an ability to change the order of the source code from a machine-imposed sequence to one convenient to the human mind, they cannot properly be called literate programming tools in the sense intended by Knuth. Implementing literate programming consists of two steps: weaving, generating a comprehensive document about the program and its maintenance; and tangling, generating machine-executable code. Weaving and tangling are done on the same source so that they are consistent with each other. A classic example of literate programming is the literate implementation of the standard Unix wc word counting program.
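The tangling step can be sketched in a few lines. The example below is hypothetical (the `<<...>>` chunk syntax is modeled loosely on noweb, not on Knuth's WEB): macro names act as the explanatory phrases, and tangling recursively substitutes their definitions to recover compilable code.

```python
import re

def tangle(chunks, root):
    """Recursively expand <<chunk name>> references into compilable code.

    `chunks` maps human-readable macro names to code fragments that may
    themselves reference other chunks -- the "web" of abstractions.
    This mimics only the tangling step; weaving would instead format
    the same source as documentation.
    """
    def expand(text):
        return re.sub(r"<<(.+?)>>", lambda m: expand(chunks[m.group(1)]), text)
    return expand(chunks[root])

# A tiny literate source: the top-level chunk reads like an outline,
# and the explanatory phrase names a lower-level chunk of code.
literate_source = {
    "count words": "total = 0\n<<process each line>>\nprint(total)",
    "process each line": "for line in lines:\n    total += len(line.split())",
}
```

Calling `tangle(literate_source, "count words")` substitutes the named chunk in place, yielding a plain program with no `<<...>>` markers left.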
Knuth presented a CWEB version of this example in Chapter 12 of his Literate Programming book. The same example was rewritten for the noweb lit
Mihalis Yannakakis is Professor of Computer Science at Columbia University. He is noted for his work in computational complexity and other related fields, and he won the Donald E. Knuth Prize in 2005. Yannakakis was born in Athens, Greece, in 1953 and attended Varvakeio High School for his early education. He graduated from the National Technical University of Athens in 1975 with a diploma in Electrical Engineering, and earned his Ph.D. in Computer Science from Princeton University in 1979. His dissertation was entitled "The Complexity of Maximum Subgraph Problems". In 1978 he joined Bell Laboratories, where he served as Director of the Computing Principles Research Department from 1991 until 2001, when he left Bell Laboratories and joined Avaya Laboratories. There he served as Director of the Computing Principles Research Department until 2002. In 2002 he joined Stanford University as a Professor of Computer Science, which he left in 2003 to join Columbia University in 2004, where he serves as the Percy K. and Vida L. W. Hudson Professor of Computer Science.
From 1992 to 2003, Yannakakis served on the editorial board of the SIAM Journal on Computing and was its Editor-in-Chief between 1998 and 2003. He was a member of the editorial board of the Journal of the ACM from 1986 to 2000. Other editorial board memberships include the Journal of Computer and System Sciences, the Journal of Combinatorial Optimization, and the Journal of Complexity. He has also served on conference committees and chaired various conferences, such as the ACM Symposium on Principles of Database Systems and the IEEE Symposium on Foundations of Computer Science. As of November 2015, his publications have been cited close to 27,000 times, and he has an h-index of 86. Yannakakis is known for his contributions to computer science in the areas of computational complexity theory, database theory, computer-aided verification and testing, and algorithmic graph theory. Among his contributions to complexity theory are two papers about PCP theory and about hardness of approximation. In the Annual ACM Symposium on Theory of Computing of 1988, Yannakakis and Christos Papadimitriou introduced the definitions of the complexity classes Max-NP and Max-SNP.
Max-NP and Max-SNP contain a number of interesting optimization problems, and it was shown by Yannakakis and Papadimitriou that these problems can be approximated only to within some bounded error. These findings were able to explain the lack of progress in the research community on the approximability of a number of optimization problems, including 3SAT, the Independent Set problem, and the Travelling Salesman Problem. Yannakakis and Carsten Lund presented a number of findings regarding the hardness of computing approximations at the Annual ACM Symposium on Theory of Computing of 1993; these findings demonstrated the difficulty of efficiently computing approximate solutions to a number of minimization problems such as graph coloring and set covering. Given the unlikely prospect that NP-hard problems such as graph coloring and set covering could be solved optimally in polynomial time, there had been many attempts to develop efficient approximation algorithms for them. In the area of database theory, his contributions include the initiation of the study of acyclic databases and of non-two-phase locking.
Acyclic database schemes are schemes that contain a single acyclic join dependency and a collection of functional dependencies. With regard to non-two-phase locking, Yannakakis demonstrated how knowledge about the structure of a database and the forms of the various transactions executed on it could be used to establish whether a particular locking policy is safe. The commonly used two-phase locking (2PL) policies consist of two stages – one for locking and one for unlocking entities – and to go beyond such policies it is necessary to impose some structure on the entities of the database, which can be modelled as a hypergraph. A safe policy need not be two-phase, and such policies can be classified according to the connectivity of this hypergraph, 2PL policies being only one particular instance. Yannakakis went on to show that, for a natural class of safe locking policies, freedom from deadlocks is determined by the order in which entities are accessed by transactions, and from this he derived simple conditions that guarantee freedom from deadlocks for an L-policy.
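The two-phase property described above is easy to check mechanically. The following sketch (an illustration of the 2PL definition only, not of Yannakakis's hypergraph-based analysis) verifies that a transaction's sequence of lock and unlock actions has its locking phase strictly before its unlocking phase.

```python
def is_two_phase(actions):
    """Check the two-phase locking (2PL) property of one transaction.

    `actions` is a sequence of ("lock", entity) / ("unlock", entity)
    pairs. Under 2PL, once the transaction has unlocked any entity
    (entering its shrinking phase), it may not lock another one.
    """
    shrinking = False
    held = set()
    for op, entity in actions:
        if op == "lock":
            if shrinking:
                return False   # lock requested after an unlock: not 2PL
            held.add(entity)
        elif op == "unlock":
            if entity not in held:
                return False   # unlocking an entity that was never locked
            held.remove(entity)
            shrinking = True   # shrinking phase has begun
    return True
```

So `[lock a, lock b, unlock a, unlock b]` is two-phase, while `[lock a, unlock a, lock b, unlock b]` is not, since the second lock follows an unlock.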
He has contributed to the area of computer-aided verification and testing, where he laid the rigorous algorithmic and complexity-theoretic foundations of the field. Some of his contributions include the design of memory-efficient algorithms for the verification of temporal properties of finite-state programs, determining the complexity of testing whether programs satisfy their specifications expressed in linear-time temporal logic, and verifying that a model with timing constraints satisfies a given temporal property. Along with Alex Groce and Doron Peled, he introduced Adaptive Model Checking, showing that when inconsistencies are present between a system and the corresponding model, the results of
Computer software, or simply software, is a collection of data or computer instructions that tell the computer how to work. This is in contrast to physical hardware, from which the system is built and which performs the work. In computer science and software engineering, computer software is all information processed by computer systems, including programs and data. Computer software includes computer programs and related non-executable data, such as online documentation or digital media. Computer hardware and software require each other, and neither can be realistically used on its own. At the lowest programming level, executable code consists of machine language instructions supported by an individual processor—typically a central processing unit or a graphics processing unit. A machine language consists of groups of binary values signifying processor instructions that change the state of the computer from its preceding state. For example, an instruction may change the value stored in a particular storage location in the computer—an effect that is not directly observable to the user.
An instruction may also invoke one of many input or output operations, for example displaying some text on a computer screen. The processor executes the instructions in the order they are provided, unless it is instructed to "jump" to a different instruction or is interrupted by the operating system. As of 2015, most personal computers, smartphone devices and servers have processors with multiple execution units or multiple processors performing computation together, and computing has become a much more concurrent activity than in the past. The majority of software is written in high-level programming languages, which are easier and more efficient for programmers because they are closer to natural languages than machine languages. High-level languages are translated into machine language using a compiler or an interpreter, or a combination of the two. Software may also be written in a low-level assembly language, which has a strong correspondence to the computer's machine language instructions and is translated into machine language using an assembler.
An outline for what would have been the first piece of software was written by Ada Lovelace in the 19th century, for the planned Analytical Engine. She created proofs to show how the engine would calculate Bernoulli numbers; because of the proofs and the algorithm, she is considered the first computer programmer. The first theory about software—prior to the creation of computers as we know them today—was proposed by Alan Turing in his 1936 paper "On Computable Numbers, with an Application to the Entscheidungsproblem"; this eventually led to the creation of the academic fields of computer science and software engineering. Computer science is the theoretical study of computers and software, whereas software engineering is the application of engineering principles to the development of software. However, prior to 1946, software was not yet the programs stored in the memory of stored-program digital computers as we now understand it; the first electronic computing devices were instead rewired in order to "reprogram" them. On all computer platforms, software can be grouped into a few broad categories.
Based on the goal, computer software can be divided into: application software, which uses the computer system to perform special functions or provide entertainment functions beyond the basic operation of the computer itself (there are many different types of application software, because the range of tasks that can be performed with a modern computer is so large—see list of software); and system software, which manages computer hardware behaviour so as to provide basic functionalities that are required by users, or that other software needs to run properly, if at all. System software is designed to provide a platform for running application software, and it includes the following: operating systems, which are essential collections of software that manage resources and provide common services for other software that runs "on top" of them. Supervisory programs, boot loaders and window systems are core parts of operating systems. In practice, an operating system comes bundled with additional software so that a user can potentially do some work with a computer that has only an operating system.
TeX is a typesetting system designed and written by Donald Knuth and first released in 1978. TeX is a popular means of typesetting complex mathematical formulae. It is popular in academia, especially in mathematics, computer science, engineering, physics and quantitative psychology, and it has largely displaced Unix troff, the other favored formatting system, in many Unix installations, which use both for different purposes. It is also used for many other typesetting tasks in the form of LaTeX, ConTeXt and other macro packages. TeX was designed with two main goals in mind: to allow anybody to produce high-quality books with minimal effort, and to provide a system that would give exactly the same results on all computers, at any point in time. TeX is free software, which made it accessible to a wide range of users. When the first paper volume of Donald Knuth's The Art of Computer Programming was published in 1968, it was typeset using hot metal typesetting set by a Monotype machine. This method, dating back to the 19th century, produced a "classic style" appreciated by Knuth.
When the second edition was published in 1976, the whole book had to be typeset again because the Monotype technology had been replaced by phototypesetting, and the original fonts were no longer available. When Knuth received the galley proofs of the new book on 30 March 1977, he found them inferior. Disappointed by the galley proofs of the second edition of the second volume, he was motivated to design his own typesetting system. Around this time, Knuth saw for the first time the output of a high-quality digital typesetting system and became interested in digital typography. On 13 May 1977, he wrote a memo to himself describing the basic features of TeX. He planned to finish it on his sabbatical in 1978, but as it happened the language was not "frozen" until 1989, more than ten years later. Guy Steele happened to be at Stanford during the summer of 1978, when Knuth was developing his first version of TeX; when Steele returned to the Massachusetts Institute of Technology that autumn, he rewrote TeX's input/output to run under the Incompatible Timesharing System operating system.
The first version of TeX was written in the SAIL programming language to run on a PDP-10 under Stanford's WAITS operating system. For later versions of TeX, Knuth invented the concept of literate programming, a way of producing compilable source code and cross-linked documentation typeset in TeX from the same original file; the language used, WEB, produces programs in DEC PDP-10 Pascal. A new version of TeX, rewritten from scratch and called TeX82, was published in 1982. Among other changes, the original hyphenation algorithm was replaced by a new algorithm written by Frank Liang. TeX82 uses fixed-point arithmetic instead of floating-point, to ensure reproducibility of the results across different computer hardware, and includes a real, Turing-complete programming language, following intense lobbying by Guy Steele. In 1989, Donald Knuth released new versions of TeX and Metafont. Despite his desire to keep the program stable, Knuth realised that 128 different characters for the text input were not enough to accommodate foreign languages.
Since version 3, TeX has used an idiosyncratic version numbering system, where updates are indicated by adding an extra digit at the end of the decimal, so that the version number asymptotically approaches π. This is a reflection of the fact that TeX is now very stable, and only minor updates are anticipated; the current version of TeX is 3.14159265. The design was frozen after version 3.0: no new feature or fundamental change will be added, so all newer versions will contain only bug fixes. Though Donald Knuth himself has suggested a few areas in which TeX could have been improved, he has indicated that he believes that having an unchanged system that will produce the same output now and in the future is more important than introducing new features. For this reason, he has stated that the "absolutely final change" will be to change the version number to π, at which point all remaining bugs will become features. Likewise, versions of Metafont after 2.0 asymptotically approach e, and a similar change will be applied after Knuth's death.
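The numbering scheme lends itself to a one-line illustration. This sketch (not part of TeX itself; the helper name `tex_versions` is hypothetical) generates the successive version strings by extending the decimal expansion of π one digit per release, up to the precision the text mentions:

```python
def tex_versions(pi_digits="3.14159265"):
    """Successive TeX version numbers since version 3.

    Each release appends one more digit of pi to the previous version,
    so the sequence of version numbers asymptotically approaches pi.
    """
    # Slices of increasing length: "3.1", "3.14", "3.141", ...
    return [pi_digits[:i] for i in range(3, len(pi_digits) + 1)]
```

The last element of the generated list is "3.14159265", the version named in the text.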
Since the source code of TeX is in the public domain, other programmers are allowed to improve the system, but are required to use another name to distribute the modified TeX, meaning that the source code can still evolve. For example, the Omega project was developed after 1991 to enhance TeX's multilingual typesetting abilities. Knuth created "unofficial" modified versions, such as TeX-XeT, which allows a user to mix texts written in left-to-right and right-to-left writing systems in the same document. In several technical fields, in particular, computer science, mathematics and physics, TeX has become a de facto standard. Many thousands of books have been published using TeX, including books published by Addison-Wesley, Cambridge University Press, Oxford University Press and Springer. Numerous journals in these fields are produced using TeX or LaTeX, allowing authors to submit their raw manuscript written in TeX. While many publications in other fields, including dictionaries and legal publications, have been produced using TeX, i
Donald Ervin Knuth is an American computer scientist and professor emeritus at Stanford University. He is the author of the multi-volume work The Art of Computer Programming. He contributed to the development of the rigorous analysis of the computational complexity of algorithms and systematized formal mathematical techniques for it; in the process he popularized asymptotic notation. In addition to fundamental contributions in several branches of theoretical computer science, Knuth is the creator of the TeX computer typesetting system, the related METAFONT font definition language and rendering system, and the Computer Modern family of typefaces. As a writer and scholar, Knuth created the WEB and CWEB computer programming systems designed to encourage and facilitate literate programming, and designed the MIX/MMIX instruction set architectures. Knuth opposes granting software patents, having expressed his opinion to the United States Patent and Trademark Office and the European Patent Organisation. Knuth was born in Milwaukee, Wisconsin, to German-Americans Ervin Henry Knuth and Louise Marie Bohning.
His father had two jobs: running a small printing company and teaching bookkeeping at Milwaukee Lutheran High School. Donald, a student at Milwaukee Lutheran High School, received academic accolades there because of the ingenious ways he thought of solving problems. For example, in eighth grade, he entered a contest to find the number of words that the letters in "Ziegler's Giant Bar" could be rearranged to create. Although the judges had only 2,500 words on their list, Donald found 4,500 words, winning the contest. As prizes, the school received a new television and enough candy bars for all of his schoolmates to eat. In 1956, Knuth received a scholarship to the Case Institute of Technology in Ohio, where he joined the Beta Nu Chapter of the Theta Chi fraternity. While studying physics at the Case Institute of Technology, Knuth was introduced to the IBM 650, one of the early mainframes. After reading the computer's manual, Knuth decided to rewrite the assembly and compiler code for the machine used at his school, because he believed he could do it better.
In 1958, Knuth created a program to help his school's basketball team win their games. He assigned "values" to players in order to gauge their probability of scoring points, a novel approach that Newsweek and CBS Evening News later reported on. Knuth was one of the founding editors of the Engineering and Science Review, which won a national award as best technical magazine in 1959. He then switched from physics to mathematics, and in 1960 he received his bachelor of science degree, simultaneously being given a master of science degree by a special award of the faculty, who considered his work exceptionally outstanding. In 1963, with mathematician Marshall Hall as his adviser, he earned a PhD in mathematics from the California Institute of Technology. After receiving his PhD, Knuth joined Caltech's faculty as an assistant professor. He accepted a commission to write a book on computer programming language compilers. While working on this project, Knuth decided that he could not adequately treat the topic without first developing a fundamental theory of computer programming, which became The Art of Computer Programming.
He originally planned to publish this as a single book. As Knuth developed his outline for the book, he concluded that he required six volumes, and then seven, to cover the subject. He published the first volume in 1968. Just before publishing the first volume of The Art of Computer Programming, Knuth left Caltech to accept employment with the Institute for Defense Analyses' Communications Research Division, situated on the Princeton University campus, performing mathematical research in cryptography to support the National Security Agency. Knuth then left this position to join the Stanford University faculty, where he is now Fletcher Jones Professor of Computer Science, Emeritus. Knuth is a writer, as well as a computer scientist. Knuth has been called the "father of the analysis of algorithms". In the 1970s, Knuth described computer science as "a new field with no real identity, and the standard of available publications was not that high. A lot of the papers coming out were quite wrong.... So one of my motivations was to put straight a story, badly told."
By 2011, the first three volumes and part one of volume four of his series had been published. Concrete Mathematics: A Foundation for Computer Science (2nd ed.), which originated with an expansion of the mathematical preliminaries section of Volume 1 of TAoCP, has also been published. Bill Gates has praised the difficulty of the subject matter in The Art of Computer Programming, stating, "If you think you're a good programmer... You should send me a résumé if you can read the whole thing." Knuth is also the author of Surreal Numbers, a mathematical novelette on John Conway's set-theoretic construction of an alternate system of numbers. Instead of simply explaining the subject, the book seeks to show the development of the mathematics. Knuth wanted the book to prepare students for doing creative research. In 1995, Knuth wrote the foreword to the book A=B by Marko Petkovšek, Herbert Wilf and Doron Zeilberger. Knuth is an occasional contributor of language puzzles to Word Ways: The Journal of Recreational Linguistics. Knuth has also delved into recreational mathematics.
He contributed articles to the Journal of Recreational Mathematics beginning in the 1960s, and was acknowledged as a major contributor in Joseph Madachy's Mathematics on Vacation. Knuth has appeared in a number of Numberphile and Computerphile videos on YouTube, where he has discussed topics f
Andrew Chi-Chih Yao is a Chinese computer scientist and computational theorist. He is a Professor and the Dean of the Institute for Interdisciplinary Information Sciences at Tsinghua University. Yao used the minimax theorem to prove what is now known as Yao's principle. Yao became a naturalized U.S. citizen and worked for many years in the U.S., but in 2015 he renounced his U.S. citizenship and became an academician of the Chinese Academy of Sciences. Yao was born in China. He completed his undergraduate education in physics at the National Taiwan University, before completing a Doctor of Philosophy in physics at Harvard University in 1972 and a second PhD, in computer science, from the University of Illinois at Urbana–Champaign in 1975. He was an assistant professor at MIT, an assistant professor at Stanford University, and a professor at the University of California, Berkeley. From 1982 to 1986, he was a full professor at Stanford University. From 1986 to 2004, he was the William and Edna Macaleer Professor of Engineering and Applied Science at Princeton University, where he continued to work on algorithms and complexity.
In 2004, he became a Professor of the Center for Advanced Study, Tsinghua University, and the Director of the Institute for Theoretical Computer Science, Tsinghua University, in Beijing. Since 2010, he has served as the Dean of the Institute for Interdisciplinary Information Sciences at Tsinghua University, and he is also a Distinguished Professor-at-Large at the Chinese University of Hong Kong. In 1996 he was awarded the Knuth Prize, and he received the Turing Award, the most prestigious award in computer science, in 2000, "in recognition of his fundamental contributions to the theory of computation, including the complexity-based theory of pseudorandom number generation and communication complexity". He is a member of the U.S. National Academy of Sciences, a fellow of the American Academy of Arts and Sciences, a fellow of the American Association for the Advancement of Science, a fellow of the Association for Computing Machinery, and an academician of the Chinese Academy of Sciences. His wife, Frances Yao, is also a theoretical computer scientist.
See also: Yao's principle, the Dolev–Yao model, important publications in cryptography, Yao's test, Yao's Millionaires' Problem, and the Yao graph.