Hill climbing
In numerical analysis, hill climbing is a mathematical optimization technique which belongs to the family of local search. It is an iterative algorithm that starts with an arbitrary solution to a problem attempts to find a better solution by making an incremental change to the solution. If the change produces a better solution, another incremental change is made to the new solution, so on until no further improvements can be found. For example, hill climbing can be applied to the travelling salesman problem, it is easy to find an initial solution that visits all the cities but will be poor compared to the optimal solution. The algorithm starts with such a solution and makes small improvements to it, such as switching the order in which two cities are visited. A much shorter route is to be obtained. Hill climbing finds optimal solutions for convex problems – for other problems it will find only local optima, which are not the best possible solution out of all possible solutions. Examples of algorithms that solve convex problems by hill-climbing include the simplex algorithm for linear programming and binary search.
To attempt to avoid getting stuck in local optima, one could use restarts, or more complex schemes based on iterations, or on memory, or on memory-less stochastic modifications. The relative simplicity of the algorithm makes it a popular first choice amongst optimizing algorithms, it is used in artificial intelligence, for reaching a goal state from a starting node. Different choices for next nodes and starting nodes are used in related algorithms. Although more advanced algorithms such as simulated annealing or tabu search may give better results, in some situations hill climbing works just as well. Hill climbing can produce a better result than other algorithms when the amount of time available to perform a search is limited, such as with real-time systems, so long as a small number of increments converges on a good solution. At the other extreme, bubble sort can be viewed as a hill climbing algorithm, yet this approach is far from efficient for modest N, as the number of exchanges required grows quadratically.
Hill climbing is an anytime algorithm: it can return a valid solution if it's interrupted at any time before it ends. Hill climbing attempts to maximize a target function f, where x is a vector of continuous and/or discrete values. At each iteration, hill climbing will adjust a single element in x and determine whether the change improves the value of f. With hill climbing, any change that improves f is accepted, the process continues until no change can be found to improve the value of f. X is said to be "locally optimal". In discrete vector spaces, each possible value for x may be visualized as a vertex in a graph. Hill climbing will follow the graph from vertex to vertex, always locally increasing the value of f, until a local maximum x m is reached. In simple hill climbing, the first closer node is chosen, whereas in steepest ascent hill climbing all successors are compared and the closest to the solution is chosen. Both forms fail if there is no closer node, which may happen if there are local maxima in the search space which are not solutions.
Steepest ascent hill climbing is similar to best-first search, which tries all possible extensions of the current path instead of only one. Stochastic hill climbing does not examine all neighbors before deciding. Rather, it selects a neighbor at random, decides whether to move to that neighbor or to examine another. Coordinate descent does a line search along one coordinate direction at the current point in each iteration; some versions of coordinate descent randomly pick a different coordinate direction each iteration. Random-restart hill climbing is a meta-algorithm built on top of the hill climbing algorithm, it is known as Shotgun hill climbing. It iteratively does hill-climbing, each time with a random initial condition x 0; the best x m is kept: if a new run of hill climbing produces a better x m than the stored state, it replaces the stored state. Random-restart hill climbing is a effective algorithm in many cases, it turns out that it is better to spend CPU time exploring the space, than optimizing from an initial condition.
Hill climbing will not find the global maximum, but may instead converge on a local max
Proceedings of the American Mathematical Society
Proceedings of the American Mathematical Society is a monthly peer-reviewed scientific journal of mathematics published by the American Mathematical Society. As a requirement, all articles must be at most 15 printed pages. According to the Journal Citation Reports, the journal has a 2013 impact factor of 0.840. Proceedings of the American Mathematical Society publishes articles from all areas of pure and applied mathematics, including topology, analysis, number theory, logic and statistics; this journal is indexed in the following databases: Mathematical Reviews Zentralblatt MATH Science Citation Index Science Citation Index Expanded ISI Alerting Services CompuMath Citation Index Current Contents / Physical, Chemical & Earth Sciences. Bulletin of the American Mathematical Society Memoirs of the American Mathematical Society Notices of the American Mathematical Society Journal of the American Mathematical Society Transactions of the American Mathematical Society Official website Proceedings of the American Mathematical Society on JSTOR
Graph traversal
In computer science, graph traversal refers to the process of visiting each vertex in a graph. Such traversals are classified by the order. Tree traversal is a special case of graph traversal. Unlike tree traversal, graph traversal may require that some vertices be visited more than once, since it is not known before transitioning to a vertex that it has been explored; as graphs become more dense, this redundancy becomes more prevalent, causing computation time to increase. Thus, it is necessary to remember which vertices have been explored by the algorithm, so that vertices are revisited as infrequently as possible; this may be accomplished by associating each vertex of the graph with a "color" or "visitation" state during the traversal, checked and updated as the algorithm visits each vertex. If the vertex has been visited, it is ignored and the path is pursued no further. Several special cases of graphs imply the visitation of other vertices in their structure, thus do not require that visitation be explicitly recorded during the traversal.
An important example of this is a tree: during a traversal it may be assumed that all "ancestor" vertices of the current vertex have been visited. Both the depth-first and breadth-first graph searches are adaptations of tree-based algorithms, distinguished by the lack of a structurally determined "root" vertex and the addition of a data structure to record the traversal's visitation state. Note. — If each vertex in a graph is to be traversed by a tree-based algorithm the algorithm must be called at least once for each connected component of the graph. This is accomplished by iterating through all the vertices of the graph, performing the algorithm on each vertex, still unvisited when examined. A depth-first search is an algorithm for traversing a finite graph. DFS visits the child vertices before visiting the sibling vertices. A stack is used when implementing the algorithm; the algorithm begins with a chosen "root" vertex. The algorithm backtracks along visited vertices, until it finds a vertex connected to yet more uncharted territory.
It will proceed down the new path as it had before, backtracking as it encounters dead-ends, ending only when the algorithm has backtracked past the original "root" vertex from the first step. DFS is the basis for many graph-related algorithms, including topological sorts and planarity testing. Input: A graph G and a vertex v of G. Output: A labeling of the edges in the connected component of v as discovery edges and back edges.1 procedure DFS: 2 label v as explored 3 for all edges e in G.incidentEdges do 4 if edge e is unexplored 5 w ← G.adjacentVertex 6 if vertex w is unexplored 7 label e as a discovered edge 8 recursively call DFS 9 else 10 label e as a back edge A breadth-first search is another technique for traversing a finite graph. BFS visits the sibling vertices before visiting the child vertices, a queue is used in the search process; this algorithm is used to find the shortest path from one vertex to another. Input: A graph G and a vertex v of G. Output: The closest vertex to v satisfying some conditions, or null if no such vertex exists.1 procedure BFS: 2 create a queue Q 3 enqueue v onto Q 4 mark v 5 while Q is not empty do 6 w ← Q.dequeue 7 if w is what we are looking for 8 return w 9 for all edges e in G.adjacentEdges do 12 x ← G.adjacentVertex 13 if x is not marked 14 mark x 15 enqueue x onto Q 16 return null Breadth-first search can be used to solve many problems in graph theory, for example: finding all vertices within one connected component.
The problem of graph exploration can be seen as a variant of graph traversal. It is an online problem, meaning that the information about the graph is only revealed during the runtime of the algorithm. A common model is as follows: given a connected graph G = with non-negative edge weights; the algorithm starts at some vertex, knows all incident outgoing edges and the vertices at the end of these edges—but not more. When a new vertex is visited again all incident outgoing edges and the vertices at the end are known; the goal is to visit all n vertices and return to the starting vertex, but the sum of the weights of the tour should be as small as possible. The problem can be understood as a specific version of the travelli
Comparison sort
A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list. The only requirement is that the operator forms a total preorder over the data, with: if a ≤ b and b ≤ c a ≤ c for all a and b, a ≤ b or b ≤ a, it is possible that both a ≤ b ≤ a. In a stable sort, the input order determines the sorted order in this case. A metaphor for thinking about comparison sorts is that someone has a set of unlabelled weights and a balance scale, their goal is to line up the weights in order by their weight without any information except that obtained by placing two weights on the scale and seeing which one is heavier. Some of the most well-known comparison sorts include: Quicksort Heapsort Shellsort Merge sort Introsort Insertion sort Selection sort Bubble sort Odd–even sort Cocktail shaker sort Cycle sort Merge-insertion sort Smoothsort Timsort There are fundamental limits on the performance of comparison sorts.
A comparison sort must have an average-case lower bound of Ω comparison operations, known as linearithmic time. This is a consequence of the limited information available through comparisons alone — or, to put it differently, of the vague algebraic structure of ordered sets. In this sense, mergesort and introsort are asymptotically optimal in terms of the number of comparisons they must perform, although this metric neglects other operations. Non-comparison sorts can achieve O performance by using operations other than comparisons, allowing them to sidestep this lower bound. Comparison sorts may run faster on some lists; the Ω lower bound applies only to the case. Real-world measures of sorting speed may need to take into account the ability of some algorithms to optimally use fast cached computer memory, or the application may benefit from sorting methods where sorted data begins to appear to the user as opposed to sorting methods where no output is available until the whole list is sorted.
Despite these limitations, comparison sorts offer the notable practical advantage that control over the comparison function allows sorting of many different datatypes and fine control over how the list is sorted. For example, reversing the result of the comparison function allows the list to be sorted in reverse. Comparison sorts adapt more to complex orders such as the order of floating-point numbers. Additionally, once a comparison function is written, any comparison sort can be used without modification; this flexibility, together with the efficiency of the above comparison sorting algorithms on modern computers, has led to widespread preference for comparison sorts in most practical work. Some sorting problems admit a faster solution than the Ω bound for comparison sorting; when the keys form a small range, counting sort is an example algorithm. Other integer sorting algorithms, such as radix sort, are not asymptotically faster than comparison sorting, but can be faster in practice; the problem of sorting pairs of numbers by their sum is not subject to the Ω bound either.
The number of comparisons that a comparison sort algorithm requires increases in proportion to n log , where n is the number of elements to sort. This bound is asymptotically tight. Given a list of distinct numbers, there are n factorial permutations one of, the list in sorted order; the sort algorithm must gain enough information from the comparisons to identify the correct permutation. If the algorithm always completes after at most f steps, it cannot distinguish more than 2f cases because the keys are distinct and each comparison has only two possible outcomes. Therefore, 2 f ≥ n!, or equivalently f ≥ log 2 . By looking at the first n / 2 factors of n! = n ⋯ 1, we obtain log 2
Radix sort
In computer science, radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value. A positional notation is required, but because integers can represent strings of characters and specially formatted floating point numbers, radix sort is not limited to integers. Radix sort dates back as far as 1887 to the work of Herman Hollerith on tabulating machines. Most digital computers internally represent all of their data as electronic representations of binary numbers, so processing the digits of integer representations by groups of binary digit representations is most convenient. Radix sorts can be implemented to start at either the most significant digit or least significant digit. For example, when sorting the number 1234 into a list, one could start with the 1 or the 4. LSD radix sorts use the following sorting order: short keys come before longer keys, keys of the same length are sorted lexicographically.
This coincides with the normal order of integer representations, such as the sequence 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. MSD radix sorts use lexicographic order, suitable for sorting strings, such as words, or fixed-length integer representations. A sequence such as "b, c, d, e, f, g, h, i, j, ba" would be lexicographically sorted as "b, ba, c, d, e, f, g, h, i, j". If lexicographic ordering is used to sort variable-length integer representations the representations of the numbers from 1 to 10 would be output as 1, 10, 2, 3, 4, 5, 6, 7, 8, 9, as if the shorter keys were left-justified and padded on the right with blank characters to make the shorter keys as long as the longest key for the purpose of determining sorted order; the topic of the efficiency of radix sort compared to other sorting algorithms is somewhat tricky and subject to quite a lot of misunderstandings. Whether radix sort is efficient, less efficient or more efficient than the best comparison-based algorithms depends on the details of the assumptions made.
Radix sort complexity is O for n keys which are integers of word size w. Sometimes w is presented as a constant, which would make radix sort better than the best comparison-based sorting algorithms, which all perform Θ comparisons to sort n keys. However, in general w cannot be considered a constant: if all n keys are distinct w has to be at least log n for a random-access machine to be able to store them in memory, which gives at best a time complexity Ω; that would seem to make radix sort at best efficient as optimal comparison-based sorts. The counter argument is that comparison-based algorithms are measured in number of comparisons, not actual time complexity. Under some assumptions the comparisons will be constant time on average, under others they will not. Comparisons of randomly generated keys takes constant time on average, as keys differ on the first bit in half the cases, differ on the second bit in half of the remaining half, so on, resulting in an average of two bits that need to be compared.
In a sorting algorithm the first comparisons made satisfies the randomness condition, but as the sort progresses the keys compared are not randomly chosen anymore. For example, consider a bottom-up merge sort; the first pass will compare pairs of random keys, but the last pass will compare keys that are close in the sorting order. This makes merge sort, on this class of inputs, take Ω time; that assumes all memory accesses cost the same, not a physically reasonable assumption as we scale n to infinity, not, in practice, how real computers work. This argument extends from the observation that computers are filled with different types of memory in different and limited quantities. Modern operating systems will position variables into these different systems automatically making memory access time differ as more and more memory is utilized. A Least significant digit Radix sort is a fast stable sorting algorithm which can be used to sort keys in integer representation order. Keys may be a string of characters, or numerical digits in a given'radix'.
The processing of the keys begins at the least significant digit, proceeds to the most significant digit. The sequence in which digits are processed by an LSD radix sort is the opposite of the sequence in which digits are processed by a most significant digit radix sort. An LSD radix sort operates in O time, where n is the number of keys, w is the average key length; this kind of performance for variable-length keys can be achieved by grouping all of the keys that have the same length together and separately performing an LSD radix sort on each group of keys for each length, from shortest to longest, in order to avoid processing the whole list of keys on every sorting pass. A radix sorting algorithm was used to sort punched cards in several passes. A computer algorithm was invented for radix sort in 1954 at MIT by Harold H. Seward. In many large applications needing speed, the computer radix sort is an improvement on comparison sorts. LSD radix sorts have resurfaced as an alternative to high performance comparison-based sorting algorithms that require Ω comparisons, where n is the number of items to be sorted.
Comparison sorts can do no better than Ω execution time but offer the flexibility of being able to sort with respect to more complicated orderings than a lexicographic one. Each k
A* search algorithm
In computer science, A* is a computer algorithm, used in pathfinding and graph traversal, the process of finding a path between multiple points, called "nodes". It enjoys widespread use due to its accuracy. However, in practical travel-routing systems, it is outperformed by algorithms which can pre-process the graph to attain better performance, although other work has found A* to be superior to other approaches. Peter Hart, Nils Nilsson and Bertram Raphael of Stanford Research Institute first published the algorithm in 1968, it can be seen as an extension of Edsger Dijkstra's 1959 algorithm. A* achieves better performance by using heuristics to guide its search. A* was created as part of the Shakey project, which had the aim of building a mobile robot that could plan its own actions. Nils Nilsson proposed using the Graph Traverser algorithm for Shakey's path planning. Graph Traverser is guided by a heuristic function h, the estimated distance from node n to the goal node, it ignores g, the distance from the start node to n.
Bertram Raphael suggested using the sum, g + h. Peter Hart invented the concepts we now call admissibility and consistency of heuristic functions. A* was designed for finding least-cost paths when the cost of a path is the sum of its edge costs, but it has been shown that A* can be used to find optimal paths for any problem satisfying the conditions of a cost algebra; the original 1968 A* paper contained a theorem that no A*-like algorithm could expand fewer nodes than A* if the heuristic function is consistent and A*’s tie-breaking rule is suitably chosen. A correction was published a few years showing that consistency was not required, just admissibility. However, what was shown was that no algorithm can be more efficient than A* with respect to the set of expanded nodes; as shown in Dechter and Pearl’s definitive study of A*'s optimality, it is possible for a different correct algorithm to outperform A* on some graphs while losing to A* on other graphs, or to expand a set of nodes that neither contains the set of nodes expanded by A*, nor is contained in it.
A* is an informed search algorithm, or a best-first search, meaning that it is formulated in terms of weighted graphs: starting from a specific starting node of a graph, it aims to find a path to the given goal node having the smallest cost. It does this by maintaining a tree of paths originating at the start node and extending those paths one edge at a time until its termination criterion is satisfied. At each iteration of its main loop, A * needs to determine, it does so based on the cost of the path and an estimate of the cost required to extend the path all the way to the goal. A* selects the path that minimizes f = g + h where n is the next node on the path, g is the cost of the path from the start node to n, h is a heuristic function that estimates the cost of the cheapest path from n to the goal. A* terminates when the path it chooses to extend is a path from start to goal or if there are no paths eligible to be extended; the heuristic function is problem-specific. If the heuristic function is admissible, meaning that it never overestimates the actual cost to get to the goal, A* is guaranteed to return a least-cost path from start to goal.
Typical implementations of A* use a priority queue to perform the repeated selection of minimum cost nodes to expand. This priority queue is known as fringe. At each step of the algorithm, the node with the lowest f value is removed from the queue, the f and g values of its neighbors are updated accordingly, these neighbors are added to the queue; the algorithm continues. The f value of the goal is the cost of the shortest path, since h at the goal is zero in an admissible heuristic; the algorithm described so far gives us only the length of the shortest path. To find the actual sequence of steps, the algorithm can be revised so that each node on the path keeps track of its predecessor. After this algorithm is run, the ending node will point to its predecessor, so on, until some node's predecessor is the start node; as an example, when searching for the shortest route on a map, h might represent the straight-line distance to the goal, since, physically the smallest possible distance between any two points.
If the heuristic h satisfies the additional condition h ≤ d + h for every edge of the graph h is called monotone, or consistent. In such a case, A* can be implemented more efficiently—roughly speaking, no node needs to be processed more than once —and A* is equivalent to running Dijkstra's algorithm with the reduced cost d' = d + h − h; the following pseudocode describes the algorithm: Remark: the above pseudocode assumes that the heuristic function is monotonic, a frequent case in many practical problems, such as the Shortest Distance Path in road networks. However, if
Connectivity (graph theory)
In mathematics and computer science, connectivity is one of the basic concepts of graph theory: it asks for the minimum number of elements that need to be removed to separate the remaining nodes into isolated subgraphs. It is related to the theory of network flow problems; the connectivity of a graph is an important measure of its resilience as a network. An undirected graph is connected. In a connected graph, there are no unreachable vertices. A graph, not connected is disconnected. An undirected graph G is said to be disconnected if there exist two nodes in G such that no path in G has those nodes as endpoints. A graph with just one vertex is connected. An edgeless graph with two or more vertices is disconnected. In an undirected graph G, two vertices u and v are called connected if G contains a path from u to v. Otherwise, they are called disconnected. If the two vertices are additionally connected by a path of length 1, i.e. by a single edge, the vertices are called adjacent. A graph is said to be connected.
A connected component is a maximal connected subgraph of G. Each vertex belongs to one connected component, as does each edge. A directed graph is called weakly connected if replacing all of its directed edges with undirected edges produces a connected graph, it is connected if it contains a directed path from u to v or a directed path from v to u for every pair of vertices u, v. It is connected, diconnected, or strong if it contains a directed path from u to v and a directed path from v to u for every pair of vertices u, v; the strong components are the maximal connected subgraphs. A cut, vertex cut, or separating set of a connected graph G is a set of vertices whose removal renders G disconnected; the connectivity or vertex connectivity κ is the size of a minimal vertex cut. A graph is called k-vertex-connected if its vertex connectivity is k or greater. More any graph G is said to be k-connected if it contains at least k+1 vertices, but does not contain a set of k − 1 vertices whose removal disconnects the graph.
In particular, a complete graph with n vertices, denoted Kn, has no vertex cuts at all, but κ = n − 1. A vertex cut for two vertices u and v is a set of vertices whose removal from the graph disconnects u and v; the local connectivity κ is the size of a smallest vertex cut separating u and v. Local connectivity is symmetric for undirected graphs. Moreover, except for complete graphs, κ equals the minimum of κ over all nonadjacent pairs of vertices u, v. 2-connectivity is called biconnectivity and 3-connectivity is called triconnectivity. A graph G, connected but not 2-connected is sometimes called separable. Analogous concepts can be defined for edges. In the simple case in which cutting a single, specific edge would disconnect the graph, that edge is called a bridge. More an edge cut of G is a set of edges whose removal renders the graph disconnected; the edge-connectivity λ is the size of a smallest edge cut, the local edge-connectivity λ of two vertices u, v is the size of a smallest edge cut disconnecting u from v. Again, local edge-connectivity is symmetric.
A graph is called k-edge-connected. A graph is said to be maximally connected. A graph is said to be maximally edge-connected. A graph is said to be super-κ if every minimum vertex cut isolates a vertex. A graph is said to be hyper-connected or hyper-κ if the deletion of each minimum vertex cut creates two components, one of, an isolated vertex. A graph is semi-hyper-connected or semi-hyper-κ if any minimum vertex cut separates the graph into two components. More precisely: a G connected graph is said to be super-connected or super-κ if all minimum vertex-cuts consist of the vertices adjacent with one vertex. A G connected graph is said to be super-edge-connected or super-λ if all minimum edge-cuts consist of the edges incident on some vertex. A cutset X of G is called a non-trivial cutset if X does not contain the neighborhood N of any vertex u ∉ X; the superconnectivity κ1 of G is: κ1 = min. A non-trivial edge-cut and the edge-superconnectivity λ1 are defined analogously. One of the most important facts about connectivity in graphs is Menger's theorem, which characterizes the connectivity and edge-connectivity of a graph in terms of the number of independent paths between vertices.
If u and v are vertices of a graph G a collection of paths between u and v is called independent if no two of them share a vertex. The collection is edge-independent if no two paths in it share an edge; the number of mutually independent paths between u and v is written as κ′, the number of mutually edge-independent paths between u and v is written as λ′. Menger's theorem asserts that for distinct vertices u,v, λ equals λ′, if u is not adjacent to v κ equals κ′; this fact is a special case of the max-flow min-cut theorem. The problem of determining whether two vertices in a graph are connected can be solved efficiently using a search algorithm, such as breadth-first search. More it is easy to determine computationally whether a graph is connected, or to count the number of connected compone