String (computer science)
In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed. A string is generally considered a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence data types and structures. Depending on the programming language and the precise data type used, a variable declared to be a string may either cause storage in memory to be statically allocated for a predetermined maximum length or employ dynamic allocation to allow it to hold a variable number of elements. When a string appears literally in source code, it is known as a string literal or an anonymous string. In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set called an alphabet. Let Σ be a non-empty finite set of symbols, called the alphabet.
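For illustration only, here is a minimal Python sketch (Python being an arbitrary choice of language, not one named in the text) showing a string literal, its byte representation under a character encoding, and the fact that "changing" an immutable string produces a new one:

```python
# A string literal (an anonymous string constant in the source code).
greeting = "héllo"

# Under the hood the text is stored using some character encoding;
# encoding to UTF-8 makes the underlying byte sequence explicit.
data = greeting.encode("utf-8")
print(data)          # b'h\xc3\xa9llo' -- the accented character takes two bytes
print(len(greeting)) # 5 characters
print(len(data))     # 6 bytes

# Python strings are immutable, so "changing" a character means
# building a new string; other languages may allow in-place mutation.
shouted = greeting.upper()
print(shouted)       # HÉLLO
```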
No assumption is made about the nature of the symbols. A string over Σ is any finite sequence of symbols from Σ. For example, if Σ = {0, 1}, then 01011 is a string over Σ. The length of a string s is the number of symbols in s and can be any non-negative integer. The empty string is the unique string over Σ of length 0 and is denoted ε or λ. The set of all strings over Σ of length n is denoted Σⁿ. For example, if Σ = {0, 1}, then Σ² = {00, 01, 10, 11}. Note that Σ⁰ = {ε} for any alphabet Σ. The set of all strings over Σ of any length is the Kleene closure of Σ and is denoted Σ*. In terms of Σⁿ, Σ* = ⋃_{n ∈ ℕ ∪ {0}} Σⁿ. For example, if Σ = {0, 1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}. Although the set Σ* itself is countably infinite, each element of Σ* is a string of finite length. A set of strings over Σ (that is, any subset of Σ*) is called a formal language over Σ. For example, if Σ = {0, 1}, the set of strings with an even number of zeros is a formal language over Σ. Concatenation is an important binary operation on Σ*. For any two strings s and t in Σ*, their concatenation is defined as the sequence of symbols in s followed by the sequence of symbols in t, and is denoted st.
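To make the notation concrete, the following minimal Python sketch (an illustrative aside, not part of the formal theory) enumerates Σⁿ for a small alphabet and lists the shortest elements of Σ*:

```python
from itertools import product

sigma = ["0", "1"]  # the alphabet Σ

def sigma_n(n):
    """All strings over Σ of length n, i.e. the set Σ^n."""
    return ["".join(p) for p in product(sigma, repeat=n)]

print(sigma_n(0))  # [''] -- Σ^0 contains only the empty string ε
print(sigma_n(2))  # ['00', '01', '10', '11']

# Σ* is the union of Σ^n over all n >= 0; it is infinite, so only a
# finite prefix of it can ever be enumerated.
kleene_prefix = [s for n in range(3) for s in sigma_n(n)]
print(kleene_prefix)  # ['', '0', '1', '00', '01', '10', '11']
```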
For example, if Σ = {a, b, ..., z}, s = bear, and t = hug, then st = bearhug and ts = hugbear. String concatenation is an associative but non-commutative operation; the empty string ε serves as the identity element. Therefore, the set Σ* together with the concatenation operation forms a monoid, the free monoid generated by Σ. In addition, the length function defines a monoid homomorphism from Σ* to the non-negative integers (that is, |st| = |s| + |t|). A string s is said to be a substring or factor of t if there exist strings u and v such that t = usv. The relation "is a substring of" defines a partial order on Σ*, the least element of which is the empty string. A string s is said to be a prefix of t if there exists a string u such that t = su. If u is nonempty, s is said to be a proper prefix of t. Symmetrically, a string s is said to be a suffix of t if there exists a string u such that t = us. If u is nonempty, s is said to be a proper suffix of t. Suffixes and prefixes are substrings of t. Both the relations "is a prefix of" and "is a suffix of" are prefix orders. A string s = uv is said to be a rotation of t if t = vu.
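The defining equations above translate directly into code. The following Python sketch (purely illustrative; the function names are ad hoc) checks the substring, prefix, suffix, and rotation relations:

```python
def is_substring(s, t):
    """s is a substring (factor) of t if t = u + s + v for some u, v."""
    return s in t

def is_prefix(s, t):
    """s is a prefix of t if t = s + u for some string u."""
    return t.startswith(s)

def is_suffix(s, t):
    """s is a suffix of t if t = u + s for some string u."""
    return t.endswith(s)

def is_rotation(s, t):
    """s = uv is a rotation of t if t = vu for some split of s."""
    return len(s) == len(t) and any(
        t == s[i:] + s[:i] for i in range(len(s) + 1)
    )

print(is_prefix("bear", "bearhug"))        # True
print(is_suffix("hug", "bearhug"))         # True
print(is_substring("arh", "bearhug"))      # True
print(is_rotation("0011001", "0100110"))   # True (u = 00110, v = 01)
```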
For example, if Σ = {0, 1}, the string 0011001 is a rotation of 0100110, where u = 00110 and v = 01. The reverse of a string is a string with the same symbols in reverse order. For example, if s = abc, the reverse of s is cba. A string that is the reverse of itself (e.g., s = madam) is called a palindrome; palindromes include the empty string and all strings of length 1. It is often useful to define an ordering on a set of strings. If the alphabet Σ has a total order, one can define a total order on Σ* called lexicographical order. For example, if Σ = {0, 1} and 0 < 1, the lexicographical order on Σ* includes the relationships ε < 0 < 00 < 000 < ... < 0001 < 001 < 01 < 010 < 011 < 0110 < 01111 < 1 < 10 < 100 < 101 < 111 < 1111 < 11111 < ... The lexicographical order is total if the alphabetical order is, but is not well-founded for any nontrivial alphabet, even if the alphabetical order is. See Shortlex for an alternative string ordering that preserves well-foundedness. A number of additional operations on strings occur in the formal theory; these are given in the article on string operations.
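As a brief illustration of the two orderings, the Python sketch below (an ad hoc example, not drawn from the text) sorts a handful of binary strings under plain lexicographical order and under shortlex order:

```python
words = ["", "1", "0", "00", "01", "10", "000", "011"]

# Lexicographical order: compare symbol by symbol; a proper prefix
# precedes any of its extensions.  Python's default string comparison
# is exactly this order for single-character symbols.
lex = sorted(words)
print(lex)       # ['', '0', '00', '000', '01', '011', '1', '10']

# Shortlex order: shorter strings come first, ties broken
# lexicographically.  Unlike plain lexicographical order, shortlex is
# well-founded: there is no infinite descending chain.
shortlex = sorted(words, key=lambda w: (len(w), w))
print(shortlex)  # ['', '0', '1', '00', '01', '10', '000', '011']
```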
Strings admit the following interpretations as nodes on a graph, where k is the number of symbols in Σ: fixed-length strings can be viewed as nodes on a hypercube; variable-length strings can be viewed as nodes on a k-ary tree; and infinite strings can be viewed as infinite paths on a k-node complete graph.
Phrase structure grammar
The term phrase structure grammar was introduced by Noam Chomsky as the term for grammars studied by Emil Post and Axel Thue. Some authors, however, reserve the term for more restricted grammars in the Chomsky hierarchy: context-sensitive grammars or context-free grammars. In a broader sense, phrase structure grammars are also known as constituency grammars; the defining trait of phrase structure grammars is thus their adherence to the constituency relation, as opposed to the dependency relation of dependency grammars. In linguistics, phrase structure grammars are all those grammars that are based on the constituency relation, as opposed to the dependency relation associated with dependency grammars. Any of several related theories for the parsing of natural language qualify as constituency grammars, and most of them have been developed from Chomsky's work, including Government and binding theory, Generalized phrase structure grammar, Head-driven phrase structure grammar, Lexical functional grammar, the minimalist program, and Nanosyntax.
Further grammar frameworks and formalisms also qualify as constituency-based, although they may not be thought of as having spawned from Chomsky's work, e.g. Arc pair grammar and Categorial grammar. The fundamental trait that these frameworks all share is that they view sentence structure in terms of the constituency relation. The constituency relation derives from the subject–predicate division of Latin and Greek grammars, which is based on term logic and reaches back to Aristotle in antiquity. Basic clause structure is understood in terms of a binary division of the clause into subject and predicate; the binary division of the clause results in a one-to-one-or-more correspondence. For each element in a sentence, there are one or more nodes in the tree structure that one assumes for that sentence. A two-word sentence such as Luke laughed implies three nodes in the syntactic structure: one for the noun Luke, one for the verb laughed, and one for the entirety Luke laughed. The constituency grammars listed above all view sentence structure in terms of this one-to-one-or-more correspondence.
By the time of Gottlob Frege, a competing understanding of the logic of sentences had arisen. Frege rejected the binary division of the sentence and replaced it with an understanding of sentence logic in terms of predicates and their arguments. On this alternative conception of sentence logic, the binary division of the clause into subject and predicate was not possible; this in turn opened the door to the dependency relation. The dependency relation was first acknowledged concretely and developed as the basis for a comprehensive theory of syntax and grammar by Lucien Tesnière in his posthumously published work Éléments de syntaxe structurale. The dependency relation is a one-to-one correspondence: for every element in a sentence, there is exactly one node in the syntactic structure. The distinction between the two relations is thus a graph-theoretical distinction; the dependency relation restricts the number of nodes in the syntactic structure of a sentence to the exact number of syntactic units that that sentence contains.
Thus the two-word sentence Luke laughed implies just two syntactic nodes, one for Luke and one for laughed. Some prominent dependency grammars are Recursive categorical syntax (sometimes called algebraic syntax), Functional generative description, Lexicase, Link grammar, Meaning-text theory, Operator grammar, and Word grammar. Since these grammars are all based on the dependency relation, they are by definition not phrase structure grammars. Other grammars avoid attempts to group syntactic units into clusters in a manner that would allow classification in terms of the constituency vs. dependency distinction. In this respect, the following grammar frameworks do not come down solidly on either side of the dividing line: Construction grammar and Cognitive grammar.
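To illustrate the graph-theoretical difference just described, here is a minimal Python sketch that counts the nodes each kind of analysis assigns to the sentence Luke laughed; the data structures are ad hoc illustrations and are not taken from any particular grammar formalism:

```python
# Constituency analysis: one node per word plus at least one node for each
# larger constituent (a one-to-one-or-more correspondence).  For "Luke
# laughed" this means three nodes: Luke, laughed, and the whole clause S.
constituency_tree = ("S", "Luke", "laughed")

# Dependency analysis: exactly one node per word (a one-to-one
# correspondence).  The verb is the root and the noun depends on it.
dependency_tree = {"laughed": ["Luke"]}

def count_constituency_nodes(node):
    """Count phrasal and lexical nodes in a nested-tuple constituency tree."""
    if isinstance(node, str):
        return 1
    return 1 + sum(count_constituency_nodes(child) for child in node[1:])

def count_dependency_nodes(tree):
    """Count distinct word nodes in a head -> dependents mapping."""
    words = set(tree)
    for dependents in tree.values():
        words.update(dependents)
    return len(words)

print(count_constituency_nodes(constituency_tree))  # 3
print(count_dependency_nodes(dependency_tree))      # 2
```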
Michael Fredric Sipser is an American theoretical computer scientist who has made early contributions to computational complexity theory. He is a professor of Applied Mathematics and Dean of Science at the Massachusetts Institute of Technology. Sipser was born and raised in Brooklyn, New York, and moved to Oswego, New York, when he was 12 years old. He earned his BA in mathematics from Cornell University in 1974 and his PhD in engineering from the University of California at Berkeley in 1980 under the direction of Manuel Blum. He joined MIT's Laboratory for Computer Science as a research associate in 1979 and was a Research Staff Member at IBM Research in San Jose. In 1980, he joined the MIT faculty. He spent the 1985–1986 academic year on the faculty of the University of California at Berkeley and then returned to MIT. From 2004 until 2014, he served as head of the MIT Mathematics Department, and he was appointed Dean of the MIT School of Science in 2014. He is a fellow of the American Academy of Arts and Sciences.
In 2015 he was elected as a fellow of the American Mathematical Society "for contributions to complexity theory and for leadership and service to the mathematical community." He was elected as an ACM Fellow in 2017. Sipser specializes in algorithms and complexity theory, in particular efficient error-correcting codes, interactive proof systems, quantum computation, and establishing the inherent computational difficulty of problems. He introduced the method of probabilistic restriction for proving super-polynomial lower bounds on circuit complexity in a joint paper with Merrick Furst and James B. Saxe; their result was later improved to an exponential lower bound by Andrew Yao and Johan Håstad. In an early derandomization theorem, Sipser showed that BPP is contained in the polynomial hierarchy, a result subsequently improved by Peter Gács and Clemens Lautemann to form what is now known as the Sipser–Gács–Lautemann theorem. Sipser established a connection between expander graphs and derandomization, and he and his PhD student Daniel Spielman introduced expander codes, an application of expander graphs.
With fellow graduate student David Lichtenstein, Sipser proved that generalized Go is PSPACE-hard. In quantum computation theory, he introduced the adiabatic algorithm jointly with Edward Farhi, Jeffrey Goldstone, and Samuel Gutmann. Sipser has long been interested in the P versus NP problem. In 1975, he wagered an ounce of gold with Leonard Adleman that the problem would be solved with a proof that P ≠ NP by the end of the 20th century; Sipser sent Adleman an American Gold Eagle coin in 2000. Sipser is the author of Introduction to the Theory of Computation, a textbook for theoretical computer science. Sipser lives in Cambridge, Massachusetts, with his wife and has two children: a daughter, who graduated from New York University, and a younger son, an undergraduate at MIT.
A finite-state machine (FSM) or finite-state automaton, finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of states at any given time; the FSM can change from one state to another in response to some external inputs. An FSM is defined by a list of its states, its initial state, and the conditions for each transition. Finite state machines are of two types – deterministic finite state machines and non-deterministic finite state machines. A deterministic finite-state machine can be constructed equivalent to any non-deterministic one. The behavior of state machines can be observed in many devices in modern society that perform a predetermined sequence of actions depending on a sequence of events with which they are presented. Simple examples are vending machines, which dispense products when the proper combination of coins is deposited; elevators, whose sequence of stops is determined by the floors requested by riders; traffic lights, which change sequence when cars are waiting; and combination locks, which require the input of combination numbers in the proper order.
The finite state machine has less computational power than some other models of computation such as the Turing machine. The computational power distinction means there are computational tasks that a Turing machine can do but an FSM cannot; this is because an FSM's memory is limited by the number of states it has. FSMs are studied in the more general field of automata theory. An example of a simple mechanism that can be modeled by a state machine is a turnstile. A turnstile, used to control access to subways and amusement park rides, is a gate with three rotating arms at waist height, one across the entryway. Initially the arms are locked, blocking the entry and preventing patrons from passing through. Depositing a coin or token in a slot on the turnstile unlocks the arms, allowing a single customer to push through. After the customer passes through, the arms are locked again. Considered as a state machine, the turnstile has two possible states: Locked and Unlocked. There are two possible inputs that affect its state: putting a coin in the slot (coin) and pushing the arm (push).
In the locked state, pushing on the arm has no effect. Putting a coin in – that is, giving the machine a coin input – shifts the state from Locked to Unlocked. In the unlocked state, putting additional coins in has no effect. However, a customer pushing through the arms, giving a push input, shifts the state back to Locked. The turnstile state machine can be represented by a state transition table, showing for each possible state the transitions between them and the outputs resulting from each input. The turnstile state machine can also be represented by a directed graph called a state diagram. Each state is represented by a node. Edges show the transitions from one state to another, and each arrow is labeled with the input that triggers that transition. An input that doesn't cause a change of state is represented by a circular arrow returning to the original state. The arrow into the Locked node from the black dot indicates that Locked is the initial state. A state is a description of the status of a system that is waiting to execute a transition. A transition is a set of actions to be executed when a condition is fulfilled or when an event is received.
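As a concrete illustration of the turnstile described above, here is a minimal Python sketch; the transition table follows the prose, while the class and function names are merely illustrative choices:

```python
# Transition table for the turnstile: (current state, input) -> next state.
# Locked   + coin -> Unlocked  (a coin unlocks the arms)
# Locked   + push -> Locked    (pushing a locked arm has no effect)
# Unlocked + coin -> Unlocked  (extra coins have no effect)
# Unlocked + push -> Locked    (customer passes through; arms lock again)
TRANSITIONS = {
    ("Locked", "coin"): "Unlocked",
    ("Locked", "push"): "Locked",
    ("Unlocked", "coin"): "Unlocked",
    ("Unlocked", "push"): "Locked",
}

class Turnstile:
    def __init__(self):
        self.state = "Locked"  # the black dot in a state diagram marks this initial state

    def handle(self, event):
        """Apply one input and move to the next state."""
        self.state = TRANSITIONS[(self.state, event)]
        return self.state

turnstile = Turnstile()
for event in ["push", "coin", "coin", "push", "push"]:
    print(event, "->", turnstile.handle(event))
# push -> Locked, coin -> Unlocked, coin -> Unlocked, push -> Locked, push -> Locked
```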
For example, when using an audio system to listen to the radio (the system is in the "radio" state), receiving a "next" stimulus results in moving to the next station. When the system is in the "CD" state, the "next" stimulus results in moving to the next track. Identical stimuli trigger different actions depending on the current state. In some finite-state machine representations, it is also possible to associate actions with a state: an entry action, performed when entering the state, and an exit action, performed when exiting the state. Several state transition table types are used; in the most common representation, the combination of current state and input determines the next state. The complete action information is not directly described in the table and can only be added using footnotes. An FSM definition including the full action information is possible using state tables. The Unified Modeling Language (UML) has a notation for describing state machines. UML state machines overcome the limitations of traditional finite state machines while retaining their main benefits.
UML state machines introduce the new concepts of hierarchically nested states and orthogonal regions, while extending the notion of actions. UML state machines have the characteristics of both Mealy machines and Moore machines: they support actions that depend on both the state of the system and the triggering event, as in Mealy machines, as well as entry and exit actions, which are associated with states rather than transitions, as in Moore machines. The Specification and Description Language (SDL) is a standard from the ITU that includes graphical symbols to describe actions in the transition: send an event, receive an event, start a timer, cancel a timer, start another concurrent state machine, and make a decision. SDL embeds basic data types called "Abstract Data Types", an action language, and an execution semantics in order to make the finite state machine executable. There are a large number of variants to represent an FSM, such as the one in figure 3. In addition to their use in modeling reactive systems
Phrase structure rules
Phrase structure rules are a type of rewrite rule used to describe a given language's syntax and are associated with the early stages of transformational grammar, being first proposed by Noam Chomsky in 1957. They are used to break down a natural language sentence into its constituent parts, known as syntactic categories, including both lexical categories and phrasal categories. A grammar that uses phrase structure rules is a type of phrase structure grammar. Phrase structure rules as they are employed operate according to the constituency relation, and a grammar that employs phrase structure rules is therefore a constituency grammar. Phrase structure rules are of the following form:

A → B C

meaning that the constituent A is separated into the two subconstituents B and C. Some examples for English are as follows:

S → NP VP
NP → (Det) N1
N1 → (AP) N1 (PP)

The first rule reads: an S (sentence) consists of an NP (noun phrase) followed by a VP (verb phrase). The second rule reads: a noun phrase consists of an optional Det (determiner) followed by an N1. The third rule means that an N1 can be preceded by an optional AP (adjective phrase) and followed by an optional PP (prepositional phrase).
The round brackets indicate optional constituents. Beginning with the sentence symbol S, applying the phrase structure rules successively, and finally applying replacement rules to substitute actual words for the abstract symbols, it is possible to generate many proper sentences of English. If the rules are correct, then any sentence produced in this way ought to be grammatically correct. It is also to be expected that the rules will generate syntactically correct but semantically nonsensical sentences, such as the following well-known example: Colorless green ideas sleep furiously. This sentence was constructed by Noam Chomsky as an illustration that phrase structure rules are capable of generating syntactically correct but semantically incorrect sentences. Phrase structure rules break sentences down into their constituent parts; these constituents are represented as tree structures. The tree for Chomsky's sentence can be rendered as a constituency tree (a bracketed sketch is given below). A constituent is any word or combination of words dominated by a single node.
Thus each individual word is a constituent. Further, the subject NP Colorless green ideas, the minor NP green ideas, and the VP sleep furiously are constituents. Phrase structure rules and the tree structures that are associated with them are a form of immediate constituent analysis. In transformational grammar, systems of phrase structure rules are supplemented by transformation rules, which act on an existing syntactic structure to produce a new one; these transformations are not strictly required for generation, as the sentences they produce could be generated by a suitably expanded system of phrase structure rules alone, but transformations provide greater economy and enable significant relations between sentences to be reflected in the grammar. An important aspect of phrase structure rules is that they view sentence structure from the top down: the category on the left of the arrow is the greater constituent, and the immediate constituents to the right of the arrow are the lesser constituents. Constituents are successively broken down into their parts as one moves down a list of phrase structure rules for a given sentence.
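As a stand-in for the missing tree diagram, here is a minimal Python sketch that encodes one plausible constituency tree for Chomsky's sentence and prints it in labeled bracket notation; the category labels and tree shape are simplified assumptions, not Chomsky's original analysis:

```python
# A hand-written top-down derivation: each phrase structure rule is applied
# from S downward, and replacement rules substitute actual words for the
# abstract symbols.  Categories and word attachments are illustrative only.
TREE = (
    "S",
    ("NP",
        ("AP", ("A", "Colorless")),
        ("AP", ("A", "green")),
        ("N", "ideas")),
    ("VP",
        ("V", "sleep"),
        ("AdvP", ("Adv", "furiously"))),
)

def bracketed(node):
    """Render a nested-tuple constituency tree in labeled bracket notation."""
    label, *children = node
    parts = [bracketed(c) if isinstance(c, tuple) else c for c in children]
    return "[" + label + " " + " ".join(parts) + "]"

def leaves(node):
    """Read the sentence back off the leaves of the tree."""
    _, *children = node
    return " ".join(
        leaves(c) if isinstance(c, tuple) else c for c in children
    )

print(bracketed(TREE))
# [S [NP [AP [A Colorless]] [AP [A green]] [N ideas]] [VP [V sleep] [AdvP [Adv furiously]]]]
print(leaves(TREE))
# Colorless green ideas sleep furiously
```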
This top-down view of sentence structure stands in contrast to much work done in modern theoretical syntax. In Minimalism, for instance, sentence structure is generated from the bottom up: the operation Merge combines smaller constituents to create greater constituents until the greatest constituent is reached. In this regard, theoretical syntax abandoned phrase structure rules long ago, although their importance for computational linguistics seems to remain intact. Phrase structure rules as they are employed result in a view of sentence structure that is constituency-based. Thus, grammars that employ phrase structure rules are constituency grammars, as opposed to dependency grammars, which view sentence structure as dependency-based. What this means is that for phrase structure rules to be applicable at all, one has to pursue a constituency-based understanding of sentence structure. The constituency relation is a one-to-one-or-more correspondence: for every word in a sentence, there is at least one node in the syntactic structure that corresponds to that word.
The dependency relation, in contrast, is a one-to-one relation. The distinction can be illustrated with a pair of trees for the same sentence: the constituency tree could be generated by phrase structure rules, with the sentence S broken down into smaller constituent parts, whereas the dependency tree could not be generated by phrase structure rules. A number of representational phrase structure theories of
Computer science is the study of processes that interact with data and that can be represented as data in the form of programs. It enables the use of algorithms to manipulate and communicate digital information. A computer scientist studies the theory of computation and the practice of designing software systems. Its fields can be divided into theoretical and practical disciplines. Computational complexity theory is abstract, while computer graphics emphasizes real-world applications. Programming language theory considers approaches to the description of computational processes, while computer programming itself involves the use of programming languages and complex systems. Human–computer interaction considers the challenges in making computers useful and accessible. The earliest foundations of what would become computer science predate the invention of the modern digital computer. Machines for calculating fixed numerical tasks such as the abacus have existed since antiquity, aiding in computations such as multiplication and division.
Algorithms for performing computations have existed since antiquity, even before the development of sophisticated computing equipment. Wilhelm Schickard designed and constructed the first working mechanical calculator in 1623. In 1673, Gottfried Leibniz demonstrated a digital mechanical calculator, called the Stepped Reckoner. Leibniz may be considered the first computer scientist and information theorist, for, among other reasons, documenting the binary number system. In 1820, Thomas de Colmar launched the mechanical calculator industry when he released his simplified arithmometer, the first calculating machine strong enough and reliable enough to be used daily in an office environment. Charles Babbage started the design of the first automatic mechanical calculator, his Difference Engine, in 1822, which eventually gave him the idea of the first programmable mechanical calculator, his Analytical Engine. He started developing this machine in 1834, and "in less than two years, he had sketched out many of the salient features of the modern computer".
"A crucial step was the adoption of a punched card system derived from the Jacquard loom" making it infinitely programmable. In 1843, during the translation of a French article on the Analytical Engine, Ada Lovelace wrote, in one of the many notes she included, an algorithm to compute the Bernoulli numbers, considered to be the first computer program. Around 1885, Herman Hollerith invented the tabulator, which used punched cards to process statistical information. In 1937, one hundred years after Babbage's impossible dream, Howard Aiken convinced IBM, making all kinds of punched card equipment and was in the calculator business to develop his giant programmable calculator, the ASCC/Harvard Mark I, based on Babbage's Analytical Engine, which itself used cards and a central computing unit; when the machine was finished, some hailed it as "Babbage's dream come true". During the 1940s, as new and more powerful computing machines were developed, the term computer came to refer to the machines rather than their human predecessors.
As it became clear that computers could be used for more than just mathematical calculations, the field of computer science broadened to study computation in general. In 1945, IBM founded the Watson Scientific Computing Laboratory at Columbia University in New York City; the renovated fraternity house on Manhattan's West Side was IBM's first laboratory devoted to pure science. The lab is the forerunner of IBM's Research Division, which today operates research facilities around the world; the close relationship between IBM and the university was instrumental in the emergence of a new scientific discipline, with Columbia offering one of the first academic-credit courses in computer science in 1946. Computer science began to be established as a distinct academic discipline in the 1950s and early 1960s; the world's first computer science degree program, the Cambridge Diploma in Computer Science, began at the University of Cambridge Computer Laboratory in 1953. The first computer science degree program in the United States was formed at Purdue University in 1962.
Since practical computers became available, many applications of computing have become distinct areas of study in their own rights. Although many believed it was impossible that computers themselves could be a scientific field of study, in the late fifties it gradually became accepted among the greater academic population. It is the now well-known IBM brand that formed part of the computer science revolution during this time. IBM released the IBM 704 and the IBM 709 computers, which were used during the exploration period of such devices. "Still, working with the IBM was frustrating if you had misplaced as much as one letter in one instruction, the program would crash, you would have to start the whole process over again". During the late 1950s, the computer science discipline was very much in its developmental stages, and such issues were commonplace. Time has seen significant improvements in the effectiveness of computing technology. Modern society has seen a significant shift in the users of computer technology, from usage only by experts and professionals to a near-ubiquitous user base.
Initially, computers were quite costly, and some degree of human assistance was needed for efficient use—in part from professional computer operators. As computer adoption became more widespread and affordable, less human assistance was needed for common usage. Despite its short history as a formal academic discipline, computer science has made a number of fundamental contributions to science and society—in fact, along with electronics, it is
Avram Noam Chomsky is an American linguist, cognitive scientist, political activist, and social critic. Sometimes called "the father of modern linguistics", Chomsky is a major figure in analytic philosophy and one of the founders of the field of cognitive science. He holds a joint appointment as Institute Professor Emeritus at the Massachusetts Institute of Technology and laureate professor at the University of Arizona, and is the author of over 100 books on topics such as linguistics, war, and mass media. Ideologically, he aligns with libertarian socialism. Born to middle-class Ashkenazi Jewish immigrants in Philadelphia, Chomsky developed an early interest in anarchism from alternative bookstores in New York City. He began studying at the University of Pennsylvania at age 16, taking courses in linguistics and philosophy. From 1951 to 1955, he was appointed to Harvard University's Society of Fellows. While at Harvard, he developed the theory of transformational grammar. Chomsky began teaching at MIT in 1957 and emerged as a significant figure in the field of linguistics for his landmark work Syntactic Structures, which remodelled the scientific study of language.
From 1958 to 1959, he was a National Science Foundation fellow at the Institute for Advanced Study. Chomsky is credited as the creator or co-creator of the universal grammar theory, the generative grammar theory, and the Chomsky hierarchy. He played a pivotal role in the decline of behaviorism, being critical of the work of B. F. Skinner. Chomsky vocally opposed U.S. involvement in the Vietnam War, believing the war to be an act of American imperialism. In 1967, he attracted widespread public attention for his antiwar essay "The Responsibility of Intellectuals". Associated with the New Left, he was arrested multiple times for his activism and was placed on Richard Nixon's Enemies List. While expanding his work in linguistics over the late 1960s and 1970s, he became involved in the so-called linguistics wars with proponents of generative semantics. In the 1980s, Chomsky helped develop government and binding theory. In collaboration with Edward S. Herman, Chomsky co-wrote Manufacturing Consent: The Political Economy of the Mass Media, which articulated the propaganda model of media criticism, and he worked to expose the Indonesian occupation of East Timor.
Additionally, his defense of freedom of speech—including free speech for Holocaust deniers—generated significant controversy in the Faurisson affair of the early 1980s. In the 1990s, Chomsky started developing the minimalist program. Since retiring from active teaching, Chomsky has continued his vocal political activism, including opposing the War on Terror and supporting the Occupy Movement. One of the most cited scholars in history, Chomsky has influenced a broad array of academic fields. He is recognized as a paradigm shifter who helped spark a major revolution in the human sciences, contributing to the development of a new cognitivistic framework for the study of language and the mind. In addition to his continued scholarly research, he remains a leading critic of U.S. foreign policy, contemporary state capitalism, the Israeli–Palestinian conflict, and mainstream news media. His ideas have proved significant within the anti-capitalist and anti-imperialist movements; some of his critics have accused him of anti-Americanism.
Avram Noam Chomsky was born on December 7, 1928, in the East Oak Lane neighborhood of Philadelphia, Pennsylvania. His mother, Elsie, emigrated from Belarus to the United States in 1906, while his father, William Chomsky, left Ukraine for the United States in 1913. William was appointed to the faculty at Gratz College in Philadelphia in 1924, and Elsie also taught at Gratz. William Chomsky was later appointed professor of Hebrew at Dropsie College, where he taught from 1955 to 1977. Noam was the Chomsky family's first child; his younger brother, David Eli Chomsky, was born five years later, in 1934. The brothers were close, although David was more easygoing while Noam could be competitive. Chomsky and his brother were raised Jewish, being taught Hebrew and discussing the political theories of Zionism; as a Jew, Chomsky faced antisemitism as a child from the Irish and German communities living in Philadelphia. Chomsky described his parents as "normal Roosevelt Democrats" who had a center-left position on the political spectrum. He was influenced by his uncle, who owned a newspaper stand in New York City where Jewish leftists came to debate the issues of the day.
Whenever visiting his uncle, Chomsky frequented left-wing and anarchist bookstores in the city, voraciously reading political literature. He described his discovery of anarchism as "a lucky accident", because it allowed him to become critical of Stalinism and other forms of Marxism–Leninism. Chomsky's primary education was at Oak Lane Country Day School, an independent Deweyite institution that focused on allowing its pupils to pursue their own interests in a non-competitive atmosphere. It was there, at age 10, that he wrote his first article, on the spread of fascism following the fall of Barcelona to Francisco Franco's fascist army in the Spanish Civil War. At age 12, Chomsky moved on to secondary education at Central High School, where he joined various clubs and societies and excelled academically but was troubled by the hierarchical and regimented teaching methods. During the same time period, Chomsky atten