In mathematics and computer science, an algorithm is an unambiguous specification of how to solve a class of problems. Algorithms can perform calculation, data processing, automated reasoning, and other tasks; as an effective method, an algorithm can be expressed within a finite amount of space and time and in a well-defined formal language for calculating a function. Starting from an initial state and initial input, the instructions describe a computation that, when executed, proceeds through a finite number of well-defined successive states, producing "output" and terminating at a final ending state; the transition from one state to the next is not necessarily deterministic, as some algorithms, known as randomized algorithms, incorporate random input. The concept of algorithm has existed for centuries. Greek mathematicians used algorithms in the sieve of Eratosthenes for finding prime numbers and the Euclidean algorithm for finding the greatest common divisor of two numbers; the word algorithm itself is derived from the name of the 9th-century mathematician Muḥammad ibn Mūsā al-Khwārizmī, Latinized as Algoritmi.
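The sieve of Eratosthenes mentioned above proceeds by repeatedly crossing out the multiples of each prime; a minimal modern rendering in Python (the function name is ours, not historical):

```python
def sieve_of_eratosthenes(limit):
    """Return all primes <= limit by repeatedly crossing out multiples."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]          # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False  # cross out every multiple of n
    return [n for n, prime in enumerate(is_prime) if prime]

print(sieve_of_eratosthenes(30))  # → [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Note how the procedure fits the definition above: a finite description, a well-defined sequence of states, and guaranteed termination.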
A partial formalization of what would become the modern concept of algorithm began with attempts to solve the Entscheidungsproblem posed by David Hilbert in 1928. Formalizations were framed as attempts to define "effective calculability" or "effective method"; those formalizations included the Gödel–Herbrand–Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's Formulation 1 of 1936, and Alan Turing's Turing machines of 1936–37 and 1939. The word 'algorithm' has its roots in Latinizing the name of Muhammad ibn Musa al-Khwarizmi in a first step to algorismus. Al-Khwārizmī was a Persian mathematician, astronomer and scholar in the House of Wisdom in Baghdad, whose name means 'the native of Khwarazm', a region that was part of Greater Iran and is now in Uzbekistan. About 825, al-Khwarizmi wrote an Arabic-language treatise on the Hindu–Arabic numeral system, which was translated into Latin during the 12th century under the title Algoritmi de numero Indorum; this title means "Algoritmi on the numbers of the Indians", where "Algoritmi" was the translator's Latinization of al-Khwarizmi's name.
Al-Khwarizmi was the most widely read mathematician in Europe in the late Middle Ages, primarily through another of his books, the Algebra. In late medieval Latin, English 'algorism', the corruption of his name, simply meant the "decimal number system". In the 15th century, under the influence of the Greek word ἀριθμός 'number', the Latin word was altered to algorithmus; the corresponding English term 'algorithm' is first attested in the 17th century. In English, 'algorism' was first used in about 1230 and then by Chaucer in 1391. English adopted the French term, but it wasn't until the late 19th century that "algorithm" took on the meaning that it has in modern English. Another early use of the word is from 1240, in a manual titled Carmen de Algorismo composed by Alexandre de Villedieu. It begins thus: Haec algorismus ars praesens dicitur, in qua / Talibus Indorum fruimur bis quinque figuris. Which translates as: Algorism is the art by which at present we use those Indian figures, which number two times five. The poem is a few hundred lines long and summarizes the art of calculating with the new style of Indian dice, or Talibus Indorum, or Hindu numerals.
An informal definition could be "a set of rules that precisely defines a sequence of operations", which would include all computer programs, including programs that do not perform numeric calculations. Generally, a program is only an algorithm if it stops eventually. A prototypical example of an algorithm is the Euclidean algorithm to determine the greatest common divisor of two integers. Boolos and Jeffrey (1974, 1999) offer an informal meaning of the word in the following quotation: No human being can write fast enough, or long enough, or small enough† to list all members of an enumerably infinite set by writing out their names, one after another, in some notation. But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give explicit instructions for determining the nth member of the set, for arbitrary finite n. Such instructions are to be given quite explicitly, in a form in which they could be followed by a computing machine, or by a human capable of carrying out only elementary operations on symbols.
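The Euclidean algorithm makes a convenient concrete example, and it also illustrates the termination requirement just mentioned: the remainder strictly decreases on every pass, so the loop must stop. A sketch in Python:

```python
def euclid_gcd(m, n):
    """Greatest common divisor by Euclid's method: repeatedly replace
    the pair (m, n) by (n, m mod n) until the remainder is zero.
    Terminates because the remainder strictly decreases each pass."""
    while n != 0:
        m, n = n, m % n
    return m

print(euclid_gcd(1071, 462))  # → 21
```

Because the state at each step is fully determined by the pair (m, n), this is a deterministic instance of the state-transition picture described earlier.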
An "enumerably infinite set" is one whose elements can be put into one-to-one correspondence with the integers. Thus Boolos and Jeffrey are saying that an algorithm implies instructions for a process that "creates" output integers from an arbitrary "input" integer or integers that, in theory, can be arbitrarily large. Thus an algorithm can be an algebraic equation such as y = m + n – two arbitrary "input variables" m and n that produce an output y. But various authors' attempts to define the notion indicate that the word implies much more than this, something on the order of: Precise instructions for a fast, efficient, "good" process that specifies the "moves" of "the computer" to find and process arbitrary input integers/symbols m and n, symbols + and =... and "effectively" produce, in a "reasonable" time, an output-integer y at a specified place and in a specified format.
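Boolos and Jeffrey's point can be made concrete with a toy sketch: the set of perfect squares is enumerably infinite and cannot be listed in full, but explicit instructions can produce its nth member for any finite n (the function below is our illustration, not theirs):

```python
def nth_square(n):
    """Explicit instructions for the nth member (n >= 1) of the
    enumerably infinite set {1, 4, 9, 16, ...}."""
    return n * n

# The set itself is infinite, but any particular member is computable:
print([nth_square(n) for n in range(1, 6)])  # → [1, 4, 9, 16, 25]
```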
RGB color space
An RGB color space is any additive color space based on the RGB color model. A particular RGB color space is defined by the three chromaticities of the red, green, and blue additive primaries, and can produce any chromaticity that lies within the triangle defined by those primary colors; the complete specification of an RGB color space also requires a white point chromaticity and a gamma correction curve. As of 2007, sRGB is by far the most commonly used RGB color space. RGB is an abbreviation for red–green–blue. An RGB color space can be understood by thinking of it as all possible colors that can be made from three colored lights for red, green, and blue. Imagine, for example, shining three lights together onto a white wall in a dark room: one red light, one green light, and one blue light, each with dimmers. If only the red light is on, the wall will look red. If only the green light is on, the wall will look green. If the red and green lights are on together, the wall will look yellow. Dim the red light and the wall will become more of a yellow-green. Dim the green light instead, and the wall will become more orange.
Bringing up the blue light a bit will cause the orange to become less saturated and more whitish. In all, each setting of the three dimmers will produce a different result, either in color or in brightness or both; the set of all possible results is the gamut defined by those particular color lamps. Swap the red lamp for one of a different brand that is slightly more orange, and there will be a slightly different gamut, since the set of all colors that can be produced with the three lights will be changed. A computer LCD display can be thought of as a grid of millions of little red, green, and blue lamps, each with their own dimmers; the gamut of the display will depend on the three colors used for the red, green, and blue lights. A wide-gamut display will have very saturated, "pure" light colors, and thus be able to display very saturated, deep colors. RGB is a convenient color model for computer graphics because the human visual system works in a way similar – though not quite identical – to an RGB color space; the most commonly used RGB color spaces are sRGB and Adobe RGB.
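The three dimmer settings map directly onto the familiar RGB triple; a trivial sketch of the thought experiment above (channel values 0–255, names ours):

```python
def mix(red, green, blue):
    """An RGB color is just the three dimmer settings: 0 (off) to 255 (full)."""
    return (red, green, blue)

wall_red    = mix(255, 0, 0)      # only the red lamp on
wall_yellow = mix(255, 255, 0)    # red + green lamps together look yellow
wall_orange = mix(255, 128, 0)    # dim the green lamp: more orange
wall_white  = mix(255, 255, 255)  # all three at full power: white

print(wall_yellow)  # → (255, 255, 0)
```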
Adobe has developed another color space called Adobe Wide Gamut RGB, which is even larger, at the expense of gamut density. As of 2007, sRGB is by far the most commonly used RGB color space in consumer-grade digital cameras, HD video cameras, and computer monitors. HDTVs use a similar space, called Rec. 709, sharing the sRGB primaries. The sRGB space is considered adequate for most consumer applications. Having all devices use the same color space is convenient in that an image does not need to be converted from one color space to another before being displayed. However, sRGB's limited gamut leaves out many saturated colors that can be produced by printers or in film, and thus is not ideal for some high-quality applications; the wider-gamut Adobe RGB is being built into more medium-grade digital cameras, and is favored by many professional graphic artists for its larger gamut. RGB spaces are specified by defining three primary colors and a white point. In the table below the three primary colors and white points for various RGB spaces are given.
The primary colors are specified in terms of their CIE 1931 color space chromaticity coordinates. The CIE 1931 standard defines both the CIE RGB space, which is an RGB color space with monochromatic primaries, and the CIE XYZ color space, which works like an RGB color space except that it has non-physical primaries that cannot be said to be red, green, or blue.
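Given the xy chromaticities of the three primaries and the white point, an RGB space's RGB-to-XYZ matrix can be derived by scaling each primary's XYZ column so that R=G=B=1 maps to the white point. A sketch in Python (pure-Python linear algebra; the sRGB/D65 numbers are the standard published chromaticities):

```python
def rgb_to_xyz_matrix(r, g, b, white):
    """Build an RGB->XYZ matrix from CIE 1931 xy chromaticities of the
    primaries and the white point: scale each primary's XYZ column so
    that R = G = B = 1 reproduces the white point."""
    def xy_to_xyz(x, y):
        return [x / y, 1.0, (1 - x - y) / y]   # XYZ with Y normalized to 1
    cols = [xy_to_xyz(*c) for c in (r, g, b)]  # unscaled primary columns
    W = xy_to_xyz(*white)
    M = [[cols[j][i] for j in range(3)] for i in range(3)]
    def det3(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
              - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
              + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det3(M)
    s = []                                     # solve M * s = W (Cramer's rule)
    for j in range(3):
        Mj = [row[:] for row in M]
        for i in range(3):
            Mj[i][j] = W[i]
        s.append(det3(Mj) / d)
    return [[M[i][j] * s[j] for j in range(3)] for i in range(3)]

# sRGB primaries and the D65 white point:
m = rgb_to_xyz_matrix((0.64, 0.33), (0.30, 0.60), (0.15, 0.06), (0.3127, 0.3290))
print([round(v, 3) for v in m[1]])  # ≈ [0.213, 0.715, 0.072], the luminance row
```

The middle (Y) row of the result is exactly the familiar luminance weighting of the red, green, and blue channels.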
In digital photography, computer-generated imagery, and colorimetry, a grayscale or greyscale image is one in which the value of each pixel is a single sample representing only an amount of light; that is, it carries only intensity information. Grayscale images, a kind of black-and-white or gray monochrome, are composed of shades of gray; the contrast ranges from black at the weakest intensity to white at the strongest. Grayscale images are distinct from one-bit bi-tonal black-and-white images which, in the context of computer imaging, are images with only two colors: black and white. Grayscale images have many shades of gray in between. Grayscale images can be the result of measuring the intensity of light at each pixel according to a particular weighted combination of frequencies, and in such cases they are monochromatic proper only when a single frequency is captured; the frequencies can in principle be from anywhere in the electromagnetic spectrum. A colorimetric grayscale image is an image that has a defined grayscale colorspace, which maps the stored numeric sample values to the achromatic channel of a standard colorspace, which itself is based on measured properties of human vision.
If the original color image has no defined colorspace, or if the grayscale image is not intended to have the same human-perceived achromatic intensity as the color image, there is no unique mapping from such a color image to a grayscale image. The intensity of a pixel is expressed within a given range between a minimum and a maximum, inclusive; this range is represented in an abstract way as a range from 0 to 1, with any fractional values in between. This notation is used in academic papers, but it does not define what "black" or "white" is in terms of colorimetry. Sometimes the scale is reversed, as in printing, where the numeric intensity denotes how much ink is employed in halftoning, with 0% representing the paper white and 100% being a solid black. In computing, although the grayscale can be computed through rational numbers, image pixels are typically quantized to store them as unsigned integers, to reduce the required storage and computation; some early grayscale monitors could only display up to sixteen different shades, which would be stored in binary form using 4 bits.
But today grayscale images intended for visual display are commonly stored with 8 bits per sampled pixel. This pixel depth allows 256 different intensities to be recorded, and also simplifies computation as each pixel sample can be accessed individually as one full byte. However, if these intensities were spaced in proportion to the amount of physical light they represent at that pixel, the differences between adjacent dark shades could be quite noticeable as banding artifacts, while many of the lighter shades would be "wasted" by encoding a lot of perceptually-indistinguishable increments. Therefore, the shades are instead typically spread out evenly on a gamma-compressed nonlinear scale, which better approximates uniform perceptual increments for both dark and light shades, making these 256 shades enough to avoid noticeable increments. Technical uses often require more levels, to make full use of the sensor accuracy and to reduce rounding errors in computations. Sixteen bits per sample is a convenient choice for such uses, as computers manage 16-bit words efficiently.
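The effect of gamma-compressed spacing can be checked numerically: measure the jump in CIE lightness L* between adjacent 8-bit codes under a linear encoding and under an assumed display gamma of 2.2 (one L* unit is roughly one just-noticeable difference):

```python
def lightness(Y):
    """CIE L*: perceptual lightness of a linear luminance Y in [0, 1]."""
    return 116 * Y ** (1 / 3) - 16 if Y > 0.008856 else 903.3 * Y

def step(encode, v):
    """Perceptual jump (in L* units) between adjacent 8-bit codes v and v+1."""
    return lightness(encode(v + 1)) - lightness(encode(v))

linear = lambda v: v / 255            # codes proportional to physical light
gamma  = lambda v: (v / 255) ** 2.2   # gamma-compressed codes (assumed 2.2)

print(round(step(linear, 1), 2))  # darkest linear step: several JNDs → banding
print(round(step(gamma, 1), 3))   # darkest gamma step: well below one JND
print(round(step(gamma, 128), 2)) # mid-gray gamma step: still below one JND
```

With linear coding the darkest steps are each several JNDs apart (visible banding), while the gamma-compressed codes keep every step below one JND, which is why 256 shades suffice.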
The TIFF and PNG image file formats support 16-bit grayscale natively, although browsers and many imaging programs tend to ignore the low-order 8 bits of each pixel. Internally, for computation and working storage, image processing software typically uses integer or floating-point numbers of size 16 or 32 bits. Conversion of an arbitrary color image to grayscale is not unique in general. A common strategy is to use the principles of photometry or, more broadly, colorimetry to calculate the grayscale values so as to have the same luminance as the original color image. In addition to the same relative luminance, this method ensures that both images will have the same absolute luminance when displayed, as can be measured by instruments in its SI units of candelas per square meter, in any given area of the image, given equal whitepoints. Luminance itself is defined using a standard model of human vision, so preserving the luminance in the grayscale image also preserves other perceptual lightness measures, such as L*, which is determined by the linear luminance Y itself, which we will refer to here as Y_linear to avoid any ambiguity.
To convert a color from a colorspace based on a typical gamma-compressed RGB color model to a grayscale representation of its luminance, the gamma compression function must first be removed via gamma expansion to transform the image to a linear RGB colorspace, so that the appropriate weighted sum can be applied to the linear color components (R_linear, G_linear, B_linear). For the Rec. 709 primaries used in sRGB, the weighted sum is Y_linear = 0.2126 R_linear + 0.7152 G_linear + 0.0722 B_linear.
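Putting the steps together – gamma-expand, take the weighted sum, re-compress – a sketch in Python using the sRGB transfer function and the Rec. 709 weights:

```python
def srgb_to_gray(r, g, b):
    """Convert an 8-bit sRGB color to an 8-bit luminance gray value:
    gamma-expand, apply the Rec. 709 weighted sum, then re-compress."""
    def expand(u):                       # sRGB gamma expansion to linear light
        u /= 255
        return u / 12.92 if u <= 0.04045 else ((u + 0.055) / 1.055) ** 2.4
    def compress(y):                     # sRGB gamma compression back to 8 bits
        y = 12.92 * y if y <= 0.0031308 else 1.055 * y ** (1 / 2.4) - 0.055
        return round(y * 255)
    y_linear = 0.2126 * expand(r) + 0.7152 * expand(g) + 0.0722 * expand(b)
    return compress(y_linear)

print(srgb_to_gray(255, 255, 255))  # → 255 (white stays white)
print(srgb_to_gray(255, 0, 0))      # pure red becomes a mid gray, ≈ 127
```

Skipping the expansion step and averaging the gamma-compressed values directly is a common shortcut, but it does not preserve luminance.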
An image is an artifact that depicts visual perception, such as a photograph or other two-dimensional picture, that resembles a subject—usually a physical object—and thus provides a depiction of it. In the context of signal processing, an image is a distributed amplitude of color. Images may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue or hologram. They may be captured by optical devices – such as cameras, lenses, and microscopes – and by natural objects and phenomena, such as the human eye or water. The word 'image' is also used in the broader sense of any two-dimensional figure such as a map, a graph, a pie chart, a painting or a banner. In this wider sense, images can be rendered manually, such as by drawing or the art of painting, rendered automatically by printing or computer graphics technology, or developed by a combination of methods in a pseudo-photograph. A volatile image is one that exists only for a short period of time; this may be a reflection of an object by a mirror, a projection of a camera obscura, or a scene displayed on a cathode ray tube.
A fixed image, also called a hard copy, is one that has been recorded on a material object, such as paper or textile, by photography or any other digital process. A mental image exists in an individual's mind, as something one remembers or imagines; the subject of an image need not be real. For example, Sigmund Freud claimed to have dreamed purely in aural-images of dialogs; the development of synthetic acoustic technologies and the creation of sound art have led to a consideration of the possibilities of a sound-image made up of irreducible phonic substance beyond linguistic or musicological analysis. There are two types of images: still images and moving images. A still image is a single static image; this phrase is used in photography, visual media and the computer industry to emphasize that one is not talking about movies, or in very precise or pedantic technical writing such as a standard. A moving image is a movie or video, including digital video; it could also be an animated display such as a zoetrope. A still frame is a still image derived from one frame of a moving one.
In contrast, a film still is a photograph taken on the set of a movie or television program during production, used for promotional purposes. In literature, imagery is a "mental picture"; it can be both figurative and literal.
Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior results compared with generic data compression methods which are used for other digital data. Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes and for medical imaging, technical drawings, clip art, or comics. Lossy compression methods, especially when used at low bit rates, introduce compression artifacts. Lossy methods are especially suitable for natural images such as photographs in applications where minor loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy compression that produces negligible differences may be called visually lossless. Methods for lossless image compression include: run-length encoding (used as the default method in PCX and as one of the possible methods in BMP, TGA, and TIFF); area image compression; DPCM and predictive coding; entropy encoding; adaptive dictionary algorithms such as LZW (used in GIF and TIFF); DEFLATE (used in PNG, MNG, and TIFF); and chain codes. Methods for lossy compression include: reducing the color space to the most common colors in the image.
The selected colors are specified in the color palette in the header of the compressed image; each pixel just references the index of a color in the palette, and this method can be combined with dithering to avoid posterization. Chroma subsampling: this takes advantage of the fact that the human eye perceives spatial changes of brightness more sharply than those of color, by averaging or dropping some of the chrominance information in the image. Transform coding: this is the most commonly used method. In particular, a Fourier-related transform such as the discrete cosine transform (DCT) is used (N. Ahmed, T. Natarajan and K. R. Rao, "Discrete Cosine Transform," IEEE Trans. Computers, 90–93, Jan. 1974). The DCT is sometimes referred to as "DCT-II" in the context of a family of discrete cosine transforms; the more recently developed wavelet transform is also used extensively, followed by quantization and entropy coding. Fractal compression. While the best image quality at a given compression rate is the main goal of image compression, there are other important properties of image compression schemes: Scalability refers to a quality reduction achieved by manipulation of the bitstream or file.
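As a sketch of the simplest lossless method in the list above, run-length encoding replaces a run of identical pixels with a single (count, value) pair (the helper names are ours):

```python
def rle_encode(pixels):
    """Run-length encode a row of pixels as [count, value] pairs --
    'red dot, red dot, ...' becomes '200 red dots'."""
    runs = []
    for p in pixels:
        if runs and runs[-1][1] == p:
            runs[-1][0] += 1        # extend the current run
        else:
            runs.append([1, p])     # start a new run
    return runs

def rle_decode(runs):
    """Expand [count, value] pairs back into the original row (lossless)."""
    return [p for count, p in runs for _ in range(count)]

row = ['red'] * 200 + ['blue'] * 3
encoded = rle_encode(row)
print(encoded)                       # → [[200, 'red'], [3, 'blue']]
assert rle_decode(encoded) == row    # decoding restores the row exactly
```

The scheme shrinks images with large flat areas dramatically, but can expand noisy images, which is why formats like PCX pair it with other methods.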
Other names for scalability are progressive coding or embedded bitstreams. Despite its contrary nature, scalability may also be found in lossless codecs, in the form of coarse-to-fine pixel scans. Scalability is especially useful for previewing images while downloading them or for providing variable-quality access to e.g. databases. There are several types of scalability: Quality progressive or layer progressive: the bitstream successively refines the reconstructed image. Resolution progressive: first encode a lower image resolution, then encode the difference to higher resolutions. Component progressive: first encode a grey-scale version, then add color. Region of interest coding: certain parts of the image are encoded with higher quality than others; this may be combined with scalability. Meta information: compressed data may contain information about the image which may be used to categorize, search, or browse images; such information may include color and texture statistics, small preview images, and author or copyright information. Processing power: compression algorithms require different amounts of processing power to encode and decode.
Some very high compression algorithms require high processing power. The quality of a compression method is often measured by the peak signal-to-noise ratio (PSNR), which measures the amount of noise introduced through a lossy compression of the image; however, the subjective judgment of the viewer is also regarded as an important, perhaps the most important, measure.
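PSNR is computed from the mean squared error between the original and compressed images; a minimal sketch for 8-bit images given as flat lists of pixel values:

```python
import math

def psnr(original, compressed, peak=255):
    """Peak signal-to-noise ratio between two equal-sized 8-bit images,
    in decibels: 10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf              # identical images: no noise at all
    return 10 * math.log10(peak ** 2 / mse)

a = [10, 20, 30, 40]
b = [20, 30, 40, 50]                 # every pixel off by 10 → MSE = 100
print(round(psnr(a, b), 2))          # → 28.13 (dB)
```

Higher values indicate less introduced noise; typical lossy codecs at reasonable settings land roughly in the 30–50 dB range, though, as noted, PSNR does not always track perceived quality.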
In computer graphics and digital imaging, image scaling refers to the resizing of a digital image. In video technology, the magnification of digital material is known as upscaling or resolution enhancement. When scaling a vector graphic image, the graphic primitives that make up the image can be scaled using geometric transformations, with no loss of image quality. When scaling a raster graphics image, a new image with a higher or lower number of pixels must be generated. In the case of decreasing the pixel number this usually results in a visible quality loss. From the standpoint of digital signal processing, the scaling of raster graphics is a two-dimensional example of sample-rate conversion, the conversion of a discrete signal from one sampling rate to another. Image scaling can be interpreted as a form of image resampling or image reconstruction from the view of the Nyquist sampling theorem. According to the theorem, downsampling to a smaller image from a higher-resolution original can only be carried out after applying a suitable 2D anti-aliasing filter to prevent aliasing artifacts.
The image is reduced to the information that can be carried by the smaller image. In the case of upsampling, a reconstruction filter takes the place of the anti-aliasing filter. A more sophisticated approach to upscaling treats the problem as an inverse problem, solving the question of generating a plausible image that, when scaled down, would look like the input image. A variety of techniques have been applied for this, including optimization techniques with regularization terms and the use of machine learning from examples. An image size can be changed in several ways. Nearest-neighbor interpolation: one of the simpler ways of increasing image size is nearest-neighbor interpolation, replacing every pixel with the nearest pixel in the output; for upscaling this means multiple pixels of the same color. This can preserve sharp details in pixel art, but introduce jaggedness in previously smooth images. 'Nearest' in nearest-neighbor doesn't have to be the mathematical nearest. One common implementation is to always round towards zero; rounding this way produces fewer artifacts and is faster to calculate.
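A minimal sketch of nearest-neighbor interpolation with the round-toward-zero indexing described above:

```python
def nearest_neighbor_scale(image, new_w, new_h):
    """Scale a raster image (a list of rows of pixel values) by copying,
    for every output pixel, the nearest input pixel; the integer division
    rounds toward zero, as described above."""
    old_h, old_w = len(image), len(image[0])
    return [[image[y * old_h // new_h][x * old_w // new_w]
             for x in range(new_w)]
            for y in range(new_h)]

art = [[1, 2],
       [3, 4]]
print(nearest_neighbor_scale(art, 4, 4))
# each source pixel becomes a 2x2 block:
# → [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Note that edges stay perfectly sharp, which is exactly why the method suits pixel art and produces jaggies in photographs.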
Bilinear and bicubic algorithms: bilinear interpolation works by interpolating pixel color values, introducing a continuous transition into the output even where the original material has discrete transitions. Although this is desirable for continuous-tone images, this algorithm reduces contrast in a way that may be undesirable for line art. Bicubic interpolation yields substantially better results, with only a small increase in computational complexity. Sinc and Lanczos resampling: sinc resampling in theory provides the best possible reconstruction for a perfectly bandlimited signal. In practice, the assumptions behind sinc resampling are not completely met by real-world digital images. Lanczos resampling, an approximation to the sinc method, yields better results. Bicubic interpolation can be regarded as a computationally efficient approximation to Lanczos resampling. Box sampling: one weakness of bilinear, bicubic, and related algorithms is that they sample a specific number of pixels; when downscaling below a certain threshold, such as more than twice for all bi-sampling algorithms, the algorithms will sample non-adjacent pixels, which results both in losing data and in rough results.
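A sketch of the bilinear case: each output sample blends the four surrounding input pixels according to the fractional sample position (grayscale only, for brevity):

```python
def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows of numbers) at a fractional
    position by linearly blending the four surrounding pixels."""
    h, w = len(image), len(image[0])
    x0, y0 = int(x), int(y)                      # top-left neighbor
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0                      # fractional offsets
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bot = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bot * fy

img = [[0, 100],
       [100, 200]]
print(bilinear_sample(img, 0.5, 0.5))  # → 100.0, the blend of all four pixels
```

Because every output value is a weighted average, hard edges in the input become gradual ramps in the output, which is the contrast reduction noted above.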
The trivial solution to this issue is box sampling, which is to consider the target pixel a box on the original image and sample all pixels inside the box. This ensures that all input pixels contribute to the output; the major weakness of this algorithm is that it is hard to optimize. Mipmap: another solution to the downscale problem of bi-sampling scaling is mipmaps. A mipmap is a prescaled set of downscaled copies; when downscaling, the nearest larger mipmap is used as the origin, to ensure no scaling below the useful threshold of bilinear scaling is used. This algorithm is fast and easy to optimize, and it is standard in many frameworks such as OpenGL. The cost is using more image memory, exactly one third more in the standard implementation. Fourier-transform methods: simple interpolation based on the Fourier transform pads the frequency domain with zero components. Besides the good conservation of details, notable is the ringing and the circular bleeding of content from the left border to the right border. Edge-directed interpolation: edge-directed interpolation algorithms aim to preserve edges in the image after scaling, unlike other algorithms, which can introduce staircase artifacts.
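A sketch of box sampling for integer scale factors, averaging every pixel in each box so that all input pixels contribute to the output:

```python
def box_downscale(image, factor):
    """Downscale a grayscale image (list of rows) by averaging each
    factor x factor box of pixels; every input pixel contributes to
    the output, regardless of how large the scale factor is."""
    h, w = len(image) // factor, len(image[0]) // factor
    return [[sum(image[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)]
            for y in range(h)]

img = [[0, 0, 100, 100],
       [0, 0, 100, 100],
       [200, 200, 50, 50],
       [200, 200, 50, 50]]
print(box_downscale(img, 2))  # → [[0.0, 100.0], [200.0, 50.0]]
```

Unlike bilinear sampling at large reduction factors, no pixel is skipped; the trade-off is that the box reads factor² pixels per output sample, which is the optimization difficulty mentioned above.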
Examples of algorithms for this task include New Edge-Directed Interpolation, Edge-Guided Image Interpolation, Iterative Curvature-Based Interpolation, and Directional Cubic Convolution Interpolation (DCCI). A 2013 analysis found that DCCI had the best scores in SSIM on a series of test images. Hqx: for magnifying computer graphics with low resolution and/or few colors, better results can be achieved by hqx or other pixel-art scaling algorithms; these maintain a high level of detail. Vectorization: vector extraction, or vectorization, offers another approach. Vectorization first creates a resolution-independent vector representation of the graphic to be scaled; the resolution-independent version is then rendered as a raster image at the desired resolution. This technique is used by Adobe Illustrator's Live Trace and by Inkscape. Scalable Vector Graphics are well suited to simple geometric images, while photographs do not fare well with vectorization due to their complexity. Deep convolutional neural networks: this method uses machine learning for more detailed images such as photographs and complex artwork.
In information technology, lossy compression or irreversible compression is the class of data encoding methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing and transmitting content. Different versions of the same photo at increasing degrees of approximation show how coarser images result as more details are removed. This is opposed to lossless data compression; the amount of data reduction possible using lossy compression is much higher than through lossless techniques. Well-designed lossy compression technology often reduces file sizes significantly before degradation is noticed by the end-user; even when noticeable by the user, further data reduction may be desirable. Lossy compression is most commonly used to compress multimedia data in applications such as streaming media and internet telephony. By contrast, lossless compression is typically required for text and data files, such as bank records and text articles. It can be advantageous to make a master lossless file which can be used to produce additional copies from.
This allows one to avoid basing new compressed copies off of a lossy source file, which would yield additional artifacts and further unnecessary information loss. It is possible to compress many types of digital data in a way that reduces the size of a computer file needed to store it, or the bandwidth needed to transmit it, with no loss of the full information contained in the original file. A picture, for example, is converted to a digital file by considering it to be an array of dots and specifying the color and brightness of each dot. If the picture contains an area of the same color, it can be compressed without loss by saying "200 red dots" instead of "red dot, red dot...... red dot." The original data contains a certain amount of information, and there is a lower limit to the size of file that can carry all the information. Basic information theory says that there is an absolute limit in reducing the size of this data: when data is compressed, its entropy increases, and it cannot increase indefinitely. As an intuitive example, most people know that a compressed ZIP file is smaller than the original file, but repeatedly compressing the same file will not reduce the size to nothing.
Most compression algorithms can recognize when further compression would be pointless and would in fact increase the size of the data. In many cases, files or data streams contain more information than is needed for a particular purpose. For example, a picture may have more detail than the eye can distinguish when reproduced at the largest size intended. Developing lossy compression techniques as closely matched to human perception as possible is a complex task. Sometimes the ideal is a file that provides the same perception as the original, with as much digital information as possible removed; the terms 'irreversible' and 'reversible' are preferred over 'lossy' and 'lossless' for some applications, such as medical image compression, to circumvent the negative implications of 'loss'. The type and amount of loss can affect the utility of the images. Artifacts or undesirable effects of compression may be clearly discernible yet the result still useful for the intended purpose. Or lossy compressed images may be 'visually lossless', or in the case of medical images, so-called Diagnostically Acceptable Irreversible Compression may have been applied.
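The ZIP intuition above is easy to check with Python's zlib: a redundant input compresses dramatically, but compressing the result again gains nothing, because the first pass already brought the data near its entropy limit:

```python
import zlib

data = b"red dot, " * 200        # highly redundant input, 1800 bytes
once = zlib.compress(data)       # first pass removes the redundancy
twice = zlib.compress(once)      # second pass has no redundancy left to remove

print(len(data), len(once), len(twice))
# the first pass shrinks the data dramatically; the second pass cannot,
# and in fact adds a few bytes of container overhead
```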
More generally, some forms of lossy compression can be thought of as an application of transform coding – in the case of multimedia data, perceptual coding: it transforms the raw data to a domain that more accurately reflects the information content. For example, rather than expressing a sound file as the amplitude levels over time, one may express it as the frequency spectrum over time, which corresponds more accurately to human audio perception. While data reduction is a main goal of transform coding, it also allows other goals: one may represent data more accurately for the original amount of space – for example, in principle, if one starts with an analog or high-resolution digital master, an MP3 file of a given size should provide a better representation than raw uncompressed audio in a WAV or AIFF file of the same size; this is because uncompressed audio can only reduce file size by lowering bit rate or depth, whereas compressing audio can reduce size while maintaining bit rate and depth. This compression becomes a selective loss of the least significant data, rather than losing data across the board.
Further, a transform coding may provide a better domain for manipulating or otherwise editing the data – for example, equalization of audio is most naturally expressed in the frequency domain rather than in the raw time domain. From this point of view, perceptual encoding is not essentially about discarding data, but rather about a better representation of data. Another use is for backward compatibility and graceful degradation: in color television, encoding color via a luminance-chrominance transform domain means that black-and-white sets display the luminance, while ignoring the color information. Another example is chroma subsampling: the use of color spaces such as YIQ, used in NTSC, allows one to reduce the resolution on the chrominance components to accord with human perception – humans have the highest resolution for black-and-white (luma) information and lower resolution for color (chroma) information.
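A sketch of that chroma-reduction idea: keep the luma plane at full resolution and keep only one chroma sample per 2×2 block (a 4:2:0-style layout; the helper name is ours):

```python
def subsample_chroma(chroma_plane, factor=2):
    """Reduce a chroma plane (list of rows of samples) by keeping one
    sample per factor x factor block -- a 4:2:0-style layout when
    factor is 2. Luma is left untouched at full resolution, because
    the eye resolves brightness more finely than color."""
    return [row[::factor] for row in chroma_plane[::factor]]

cb = [[10, 12, 50, 52],
      [11, 13, 51, 53],
      [90, 92, 30, 32],
      [91, 93, 31, 33]]
print(subsample_chroma(cb))  # → [[10, 50], [90, 30]]
# the chroma plane now holds a quarter of its original samples
```

Real codecs average each block rather than simply dropping samples, but the storage saving is the same: each chroma plane shrinks to a quarter of its size while perceived quality barely changes.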