1.
Application software
–
An application program is a computer program designed to perform a group of coordinated functions, tasks, or activities for the benefit of the user. Examples of an application include a processor, a spreadsheet, an accounting application, a web browser, a media player, an aeronautical flight simulator. The collective noun application software refers to all applications collectively and this contrasts with system software, which is mainly involved with running the computer. Applications may be bundled with the computer and its software or published separately. Apps built for mobile platforms are called mobile apps, in information technology, an application is a computer program designed to help people perform an activity. An application thus differs from a system, a utility. Depending on the activity for which it was designed, an application can manipulate text, numbers, graphics, some application packages focus on a single task, such as word processing, others, called integrated software include several applications. User-written software tailors systems to meet the specific needs. User-written software includes templates, word processor macros, scientific simulations, graphics. Even email filters are a kind of user software, users create this software themselves and often overlook how important it is. The delineation between system software such as operating systems and application software is not exact, however, and is occasionally the object of controversy. As another example, the GNU/Linux naming controversy is, in part, the above definitions may exclude some applications that may exist on some computers in large organizations. For an alternative definition of an app, see Application Portfolio Management, the word application, once used as an adjective, is not restricted to the of or pertaining to application software meaning. Sometimes a new and popular application arises which only runs on one platform and this is called a killer application or killer app. There are many different ways to divide up different types of application software, web apps have indeed greatly increased in popularity for some uses, but the advantages of applications make them unlikely to disappear soon, if ever. Furthermore, the two can be complementary, and even integrated, Application software can also be seen as being either horizontal or vertical. Horizontal applications are more popular and widespread, because they are general purpose, vertical applications are niche products, designed for a particular type of industry or business, or department within an organization. Integrated suites of software will try to handle every aspect possible of, for example, manufacturing or banking systems, or accounting
2.
Floating-point arithmetic
–
In computing, floating-point arithmetic is arithmetic using formulaic representation of real numbers as an approximation so as to support a trade-off between range and precision. A number is, in general, represented approximately to a number of significant digits and scaled using an exponent in some fixed base. For example,1.2345 =12345 ⏟ significand ×10 ⏟ base −4 ⏞ exponent, the term floating point refers to the fact that a numbers radix point can float, that is, it can be placed anywhere relative to the significant digits of the number. This position is indicated as the exponent component, and thus the floating-point representation can be thought of as a kind of scientific notation. The result of dynamic range is that the numbers that can be represented are not uniformly spaced. Over the years, a variety of floating-point representations have been used in computers, however, since the 1990s, the most commonly encountered representation is that defined by the IEEE754 Standard. A floating-point unit is a part of a computer system designed to carry out operations on floating point numbers. A number representation specifies some way of encoding a number, usually as a string of digits, there are several mechanisms by which strings of digits can represent numbers. In common mathematical notation, the string can be of any length. If the radix point is not specified, then the string implicitly represents an integer, in fixed-point systems, a position in the string is specified for the radix point. So a fixed-point scheme might be to use a string of 8 decimal digits with the point in the middle. The scaling factor, as a power of ten, is then indicated separately at the end of the number, floating-point representation is similar in concept to scientific notation. Logically, a floating-point number consists of, A signed digit string of a length in a given base. This digit string is referred to as the significand, mantissa, the length of the significand determines the precision to which numbers can be represented. The radix point position is assumed always to be somewhere within the significand—often just after or just before the most significant digit and this article generally follows the convention that the radix point is set just after the most significant digit. A signed integer exponent, which modifies the magnitude of the number, using base-10 as an example, the number 7005152853504700000♠152853.5047, which has ten decimal digits of precision, is represented as the significand 1528535047 together with 5 as the exponent. In storing such a number, the base need not be stored, since it will be the same for the range of supported numbers. Symbolically, this value is, s b p −1 × b e, where s is the significand, p is the precision, b is the base
3.
Half-precision floating-point format
–
In computing, half precision is a binary floating-point computer number format that occupies 16 bits in computer memory. In IEEE 754-2008 the 16-bit base 2 format is referred to as binary16. It is intended for storage of many floating-point values where higher precision is not needed, nvidia and Microsoft defined the half datatype in the Cg language, released in early 2002, and implemented it in silicon in the GeForce FX, released in late 2002. The hardware-accelerated programmable shading group led by John Airey at SGI invented the s10e5 data type in 1997 as part of the design effort. This is described in a SIGGRAPH2000 paper and further documented in US patent 7518615 and this format is used in several computer graphics environments including OpenEXR, JPEG XR, OpenGL, Cg, and D3DX. The advantage over 8-bit or 16-bit binary integers is that the dynamic range allows for more detail to be preserved in highlights. The advantage over 32-bit single-precision binary formats is that it half the storage. The F16C extension allows x86 processors to convert half-precision floats to, thus only 10 bits of the significand appear in the memory format but the total precision is 11 bits. In IEEE754 parlance, there are 10 bits of significand, the half-precision binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 15, also known as exponent bias in the IEEE754 standard. The stored exponents 000002 and 111112 are interpreted specially, the minimum strictly positive value is 2−24 ≈5.96 × 10−8. The minimum positive value is 2−14 ≈6.10 × 10−5. The maximum representable value is ×215 =65504 and these examples are given in bit representation of the floating-point value. This includes the sign bit, exponent, and significand, so the bits beyond the rounding point are 0101. Which is less than 1/2 of a unit in the last place, ARM processors support an alternative half-precision format, which does away with the special case for an exponent value of 31. It is almost identical to the IEEE format, but there is no encoding for infinity or NaNs, instead, an exponent of 31 encodes normalized numbers in the range 65536 to 131008
4.
Single-precision floating-point format
–
Single-precision floating-point format is a computer number format that occupies 4 bytes in computer memory and represents a wide dynamic range of values by using a floating point. In IEEE 754-2008 the 32-bit base-2 format is referred to as binary32. It was called single in IEEE 754-1985, in older computers, different floating-point formats of 4 bytes were used, e. g. GW-BASICs single-precision data type was the 32-bit MBF floating-point format. One of the first programming languages to provide single- and double-precision floating-point data types was Fortran, before the widespread adoption of IEEE 754-1985, the representation and properties of the double float data type depended on the computer manufacturer and computer model. Single-precision binary floating-point is used due to its range over fixed point. A signed 32-bit integer can have a value of 231 −1 =2,147,483,647. As an example, the 32-bit integer 2,147,483,647 converts to 2,147,483,650 in IEEE754 form. Single precision is termed REAL in Fortran, float in C, C++, C#, Java, Float in Haskell, and Single in Object Pascal, Visual Basic, and MATLAB. However, float in Python, Ruby, PHP, and OCaml, in most implementations of PostScript, the only real precision is single. The IEEE754 standard specifies a binary32 as having, Sign bit,1 bit Exponent width,8 bits Significand precision,24 bits This gives from 6 to 9 significant decimal digits precision. Sign bit determines the sign of the number, which is the sign of the significand as well, Exponent is either an 8-bit signed integer from −128 to 127 or an 8-bit unsigned integer from 0 to 255, which is the accepted biased form in IEEE754 binary32 definition. Exponents range from −126 to +127 because exponents of −127 and +128 are reserved for special numbers, the true significand includes 23 fraction bits to the right of the binary point and an implicit leading bit with value 1, unless the exponent is stored with all zeros. Thus only 23 fraction bits of the significand appear in the memory format, B0 =1 + ∑ i =123 b 23 − i 2 − i =1 +1 ⋅2 −2 =1.25 ∈ ⊂ ⊂ [1,2 ). Thus, value = ×1.25 ×2 −3 = +0.15625. Note,1 +2 −23 ≈1.000000119,2 −2 −23 ≈1.999999881,2 −126 ≈1.17549435 ×10 −38,2 +127 ≈1.70141183 ×10 +38. The single-precision binary floating-point exponent is encoded using a representation, with the zero offset being 127. The stored exponents 00H and FFH are interpreted specially, the minimum positive normal value is 2−126 ≈1.18 × 10−38 and the minimum positive value is 2−149 ≈1.4 × 10−45. In general, refer to the IEEE754 standard itself for the conversion of a real number into its equivalent binary32 format
5.
Double-precision floating-point format
–
Double-precision floating-point format is a computer number format that occupies 8 bytes in computer memory and represents a wide, dynamic range of values by using a floating point. Double-precision floating-point format usually refers to binary64, as specified by the IEEE754 standard, in older computers, different floating-point formats of 8 bytes were used, e. g. GW-BASICs double-precision data type was the 64-bit MBF floating-point format. Double-precision binary floating-point is a commonly used format on PCs, due to its range over single-precision floating point, in spite of its performance. As with single-precision floating-point format, it lacks precision on integer numbers when compared with a format of the same size. It is commonly simply as double. The IEEE754 standard specifies a binary64 as having, Sign bit,1 bit Exponent,11 bits Significand precision,53 bits This gives 15–17 significant decimal digits precision. If an IEEE754 double precision is converted to a string with at least 17 significant digits and then converted back to double. The format is written with the significand having an implicit integer bit of value 1, with the 52 bits of the fraction significand appearing in the memory format, the total precision is therefore 53 bits. For the next range, from 253 to 254, everything is multiplied by 2, so the numbers are the even ones. Conversely, for the range from 251 to 252, the spacing is 0.5. The spacing as a fraction of the numbers in the range from 2n to 2n+1 is 2n−52, the maximum relative rounding error when rounding a number to the nearest representable one is therefore 2−53. The 11 bit width of the exponent allows the representation of numbers between 10−308 and 10308, with full 15–17 decimal digits precision, by compromising precision, the subnormal representation allows even smaller values up to about 5 × 10−324. The double-precision binary floating-point exponent is encoded using a representation, with the zero offset being 1023. All bit patterns are valid encoding, except for the above exceptions, the entire double-precision number is described by, sign ×2 exponent − exponent bias ×1. Fraction In the case of subnormals the double-precision number is described by, because there have been many floating point formats with no network standard representation for them, the XDR standard uses big-endian IEEE754 as its representation. It may therefore appear strange that the widespread IEEE754 floating point standard does not specify endianness, theoretically, this means that even standard IEEE floating point data written by one machine might not be readable by another. One area of computing where this is an issue is for parallel code running on GPUs. For example, when using NVIDIAs CUDA platform, on video cards designed for gaming, doubles are implemented in many programming languages in different ways such as the following
6.
Quadruple-precision floating-point format
–
That kind of gradual evolution towards wider precision was already in view when IEEE Standard 754 for Floating-Point Arithmetic was framed. In IEEE 754-2008 the 128-bit base-2 format is referred to as binary128. The IEEE754 standard specifies a binary128 as having, Sign bit,1 bit Exponent width,15 bits Significand precision,113 bits This gives from 33 to 36 significant decimal digits precision. The format is written with an implicit lead bit with value 1 unless the exponent is stored with all zeros, thus only 112 bits of the significand appear in the memory format, but the total precision is 113 bits. The bits are laid out as, A binary256 would have a precision of 237 bits. The stored exponents 000016 and 7FFF16 are interpreted specially, the minimum strictly positive value is 2−16494 ≈ 10−4965 and has a precision of only one bit. The minimum positive value is 2−16382 ≈3.3621 × 10−4932 and has a precision of 113 bits. The maximum representable value is 216384 −216271 ≈1.1897 ×104932 and these examples are given in bit representation, in hexadecimal, of the floating-point value. This includes the sign, exponent, and significand, so the bits beyond the rounding point are 0101. Which is less than 1/2 of a unit in the last place, a common software technique to implement nearly quadruple precision using pairs of double-precision values is sometimes called double-double arithmetic. That is, the pair is stored in place of q, note that double-double arithmetic has the following special characteristics, As the magnitude of the value decreases, the amount of extra precision also decreases. Therefore, the smallest number in the range is narrower than double precision. The smallest number with full precision is 1000.02 × 2−1074, numbers whose magnitude is smaller than 2−1021 will not have additional precision compared with double precision. The actual number of bits of precision can vary, in general, the magnitude of the low-order part of the number is no greater than half ULP of the high-order part. If the low-order part is less than half ULP of the high-order part, certain algorithms that rely on having a fixed number of bits in the significand can fail when using 128-bit long double numbers. Because of the reason above, it is possible to represent values like 1 + 2−1074 and they are represented as a sum of three double-precision values respectively. They can represent operations with at least 159/161 and 212/215 bits respectively, a similar technique can be used to produce a double-quad arithmetic, which is represented as a sum of two quadruple-precision values. They can represent operations with at least 226 bits, quadruple precision is often implemented in software by a variety of techniques, since direct hardware support for quadruple precision is, as of 2016, less common
7.
Octuple-precision floating-point format
–
In computing, octuple precision is a binary floating-point-based computer number format that occupies 32 bytes in computer memory. This 256-bit octuple precision is for applications requiring results in higher than quadruple precision and this format is rarely used and very few things support it. Thus only 236 bits of the significand appear in the memory format, the stored exponents 0000016 and 7FFFF16 are interpreted specially. The minimum strictly positive value is 2−262378 ≈ 10−78984 and has a precision of one bit. The minimum positive value is 2−262142 ≈2.4824 × 10−78913. The maximum representable value is 2262144 −2261907 ≈1.6113 ×1078913 and these examples are given in bit representation, in hexadecimal, of the floating-point value. This includes the sign, exponent, and significand, so the bits beyond the rounding point are 0101. Which is less than 1/2 of a unit in the last place, octuple precision is rarely implemented since usage of it is extremely rare. Apple Inc. had an implementation of addition, subtraction and multiplication of numbers with a 224-bit twos complement significand. One can use general arbitrary-precision arithmetic libraries to obtain octuple precision, there is little to no hardware support for it. Octuple-precision arithmetic is too impractical for most commercial uses of it, IEEE Standard for Floating-Point Arithmetic ISO/IEC10967, Language-independent arithmetic Primitive data type
8.
Decimal32 floating-point format
–
In computing, decimal32 is a decimal floating-point computer numbering format that occupies 4 bytes in computer memory. It is intended for applications where it is necessary to emulate decimal rounding exactly, such as financial, like the binary16 format, it is intended for memory saving storage. Decimal32 supports 7 decimal digits of significand and an exponent range of −95 to +96, because the significand is not normalized, most values with less than 7 significant digits have multiple possible representations, 1×102=0. 1×103=0. 01×104, etc. Decimal32 floating point is a new decimal floating-point format, formally introduced in the 2008 version of IEEE754 as well as with ISO/IEC/IEEE60559,2011. IEEE754 allows two alternative methods for decimal32 values. The standard does not specify how to signify which representation is used, in one representation method, based on binary integer decimal, the significand is represented as binary coded positive integer. The other, alternative, representation method is based on densely packed decimal for most of the significand, both alternatives provide exactly the same range of representable numbers,7 digits of significand and 3×26=192 possible exponent values. The remaining combinations encode infinities and NaNs and this format uses a binary significand from 0 to 107−1 =9999999 = 98967F16 =1001100010010110011111112. The encoding can represent binary significands up to 10×220−1 =10485759 = 9FFFFF16 =1001111111111111111111112, as described above, the encoding varies depending on whether the most significant 4 bits of the significand are in the range 0 to 7, or higher. If the 2 bits after the bit are 11, then the 8-bit exponent field is shifted 2 bits to the right. In this case there is an implicit leading 3-bit sequence 100 in the true significand, compare having an implicit 1 in the significand of normal values for the binary formats. Note also that the 00,01, or 10 bits are part of the exponent field, the leading digit is between 0 and 9, and the rest of the significand uses the densely packed decimal encoding. These six bits after that are the exponent continuation field, providing the less-significant bits of the exponent, the last 20 bits are the significand continuation field, consisting of two 10-bit declets. Each declet encodes three decimal digits using the DPD encoding, the DPD/3BCD transcoding for the declets is given by the following table. B9. b0 are the bits of the DPD, and d2. d0 are the three BCD digits, the 8 decimal values whose digits are all 8s or 9s have four codings each. The bits marked x in the table above are ignored on input, but will always be 0 in computed results
9.
Decimal64 floating-point format
–
In computing, decimal64 is a decimal floating-point computer numbering format that occupies 8 bytes in computer memory. It is intended for applications where it is necessary to emulate decimal rounding exactly, such as financial, decimal64 supports 16 decimal digits of significand and an exponent range of −383 to +384, i. e. ±0. 000000000000000×10^−383 to ±9. 999999999999999×10^384. In contrast, the binary format, which is the most commonly used type, has an approximate range of ±0. 000000000000001×10^−308 to ±1. 797693134862315×10^308. Because the significand is not normalized, most values with less than 16 significant digits have multiple representations, 1×102=0. 1×103=0. 01×104. Decimal64 floating point is a new decimal floating-point format, formally introduced in the 2008 version of IEEE754 as well as with ISO/IEC/IEEE60559,2011. IEEE754 allows two alternative methods for decimal64 values. Both alternatives provide exactly the range of representable numbers,16 digits of significand. In both cases, the most significant 4 bits of the significand are combined with the most significant 2 bits of the exponent to use 30 of the 32 possible values of a 5-bit field, the remaining combinations encode infinities and NaNs. In the cases of Infinity and NaN, all bits of the encoding are ignored. Thus, it is possible to initialize an array to Infinities or NaNs by filling it with a byte value. This format uses a binary significand from 0 to 1016−1 =9999999999999999 = 2386F26FC0FFFF16 =1000111000011011110010011011111100000011111111111111112, the encoding, completely stored on 64 bits, can represent binary significands up to 10×250−1 =11258999068426239 = 27FFFFFFFFFFFF16, but values larger than 1016−1 are illegal. As described above, the encoding varies depending on whether the most significant 4 bits of the significand are in the range 0 to 7, or higher. If the 2 bits after the bit are 11, then the 10-bit exponent field is shifted 2 bits to the right. In this case there is an implicit leading 3-bit sequence 100 for the most bits of the true significand, compare having an implicit 1-bit prefix 1 in the significand of normal values for the binary formats. Note also that the 2-bit sequences 00,01, or 10 after the bit are part of the exponent field. Note that the bits of the significand field do not encode the most significant decimal digit. The highest valid significant is 9999999999999999 whose binary encoding is 0111000011011110010011011111100000011111111111111112, the leading digit is between 0 and 9, and the rest of the significand uses the densely packed decimal encoding. This eight bits after that are the exponent continuation field, providing the less-significant bits of the exponent, the last 50 bits are the significand continuation field, consisting of five 10-bit declets
10.
Decimal128 floating-point format
–
In computing, decimal128 is a decimal floating-point computer numbering format that occupies 16 bytes in computer memory. It is intended for applications where it is necessary to emulate decimal rounding exactly, such as financial, decimal128 supports 34 decimal digits of significand and an exponent range of −6143 to +6144, i. e. ±0. 000000000000000000000000000000000×10^−6143 to ±9. 999999999999999999999999999999999×10^6144. Therefore, decimal128 has the greatest range of values compared with other IEEE basic floating point formats, because the significand is not normalized, most values with less than 34 significant digits have multiple possible representations, 1×102=0. 1×103=0. 01×104, etc. Decimal128 floating point is a new decimal floating-point format, formally introduced in the 2008 version of IEEE754 as well as with ISO/IEC/IEEE60559,2011. IEEE754 allows two alternative methods for decimal128 values. The standard does not specify how to signify which representation is used, in one representation method, based on binary integer decimal, the significand is represented as binary coded positive integer. The other, alternative, representation method is based on densely packed decimal for most of the significand, both alternatives provide exactly the same range of representable numbers,34 digits of significand and 3×212 =12288 possible exponent values. In both cases, the most significant 4 bits of the significand are combined with the most significant 2 bits of the exponent to use 30 of the 32 possible values of 5 bits in the combination field, the remaining combinations encode infinities and NaNs. In the case of Infinity and NaN, all bits of the encoding are ignored. Thus, it is possible to initialize an array to Infinities or NaNs by filling it with a byte value. The encoding can represent binary significands up to 10×2110−1 =12980742146337069071326240823050239, as described above, the encoding varies depending on whether the most significant 4 bits of the significand are in the range 0 to 7, or higher. If the 2 bits after the bit are 11, then the 14-bit exponent field is shifted 2 bits to the right. In this case there is an implicit leading 3-bit sequence 100 in the true significand, compare having an implicit 1 in the significand of normal values for the binary formats. Note also that the 00,01, or 10 bits are part of the exponent field, for the decimal128 format, all of these significands are out of the valid range, and are thus decoded as zero, but the pattern is same as decimal32 and decimal64. The leading digit is between 0 and 9, and the rest of the uses the densely packed decimal encoding. This twelve bits after that are the exponent continuation field, providing the less-significant bits of the exponent, the last 110 bits are the significand continuation field, consisting of eleven 10-bit declets. Each declet encodes three decimal digits using the DPD encoding, the DPD/3BCD transcoding for the declets is given by the following table. B9. b0 are the bits of the DPD, and d2. d0 are the three BCD digits, the 8 decimal values whose digits are all 8s or 9s have four codings each
11.
Computer architecture
–
In computer engineering, computer architecture is a set of rules and methods that describe the functionality, organization, and implementation of computer systems. Some definitions of architecture define it as describing the capabilities and programming model of a computer, in other definitions computer architecture involves instruction set architecture design, microarchitecture design, logic design, and implementation. The first documented computer architecture was in the correspondence between Charles Babbage and Ada Lovelace, describing the analytical engine, johnson had the opportunity to write a proprietary research communication about the Stretch, an IBM-developed supercomputer for Los Alamos National Laboratory. Brooks went on to develop the IBM System/360 line of computers. Later, computer users came to use the term in many less-explicit ways, the earliest computer architectures were designed on paper and then directly built into the final hardware form. The discipline of architecture has three main subcategories, Instruction Set Architecture, or ISA. The ISA defines the code that a processor reads and acts upon as well as the word size, memory address modes, processor registers. Microarchitecture, or computer organization describes how a processor will implement the ISA. The size of a computers CPU cache for instance, is an issue that generally has nothing to do with the ISA, system Design includes all of the other hardware components within a computing system. These include, Data processing other than the CPU, such as memory access Other issues such as virtualization, multiprocessing. There are other types of computer architecture, E. g. the C, C++, or Java standards define different Programmer Visible Macroarchitecture. UISA —a group of machines with different hardware level microarchitectures may share a common microcode architecture, pin Architecture, The hardware functions that a microprocessor should provide to a hardware platform, e. g. the x86 pins A20M, FERR/IGNNE or FLUSH. Also, messages that the processor should emit so that external caches can be invalidated, pin architecture functions are more flexible than ISA functions because external hardware can adapt to new encodings, or change from a pin to a message. The term architecture fits, because the functions must be provided for compatible systems, the purpose is to design a computer that maximizes performance while keeping power consumption in check, costs low relative to the amount of expected performance, and is also very reliable. For this, many aspects are to be considered which includes Instruction Set Design, Functional Organization, Logic Design, the implementation involves Integrated Circuit Design, Packaging, Power, and Cooling. Optimization of the design requires familiarity with Compilers, Operating Systems to Logic Design, an instruction set architecture is the interface between the computers software and hardware and also can be viewed as the programmers view of the machine. Computers do not understand high level languages such as Java, C++, a processor only understands instructions encoded in some numerical fashion, usually as binary numbers. Software tools, such as compilers, translate those high level languages into instructions that the processor can understand, besides instructions, the ISA defines items in the computer that are available to a program—e. g
12.
Central processing unit
–
The computer industry has used the term central processing unit at least since the early 1960s. The form, design and implementation of CPUs have changed over the course of their history, most modern CPUs are microprocessors, meaning they are contained on a single integrated circuit chip. An IC that contains a CPU may also contain memory, peripheral interfaces, some computers employ a multi-core processor, which is a single chip containing two or more CPUs called cores, in that context, one can speak of such single chips as sockets. Array processors or vector processors have multiple processors that operate in parallel, there also exists the concept of virtual CPUs which are an abstraction of dynamical aggregated computational resources. Early computers such as the ENIAC had to be rewired to perform different tasks. Since the term CPU is generally defined as a device for software execution, the idea of a stored-program computer was already present in the design of J. Presper Eckert and John William Mauchlys ENIAC, but was initially omitted so that it could be finished sooner. On June 30,1945, before ENIAC was made, mathematician John von Neumann distributed the paper entitled First Draft of a Report on the EDVAC and it was the outline of a stored-program computer that would eventually be completed in August 1949. EDVAC was designed to perform a number of instructions of various types. Significantly, the programs written for EDVAC were to be stored in high-speed computer memory rather than specified by the wiring of the computer. This overcame a severe limitation of ENIAC, which was the considerable time, with von Neumanns design, the program that EDVAC ran could be changed simply by changing the contents of the memory. Early CPUs were custom designs used as part of a larger, however, this method of designing custom CPUs for a particular application has largely given way to the development of multi-purpose processors produced in large quantities. This standardization began in the era of discrete transistor mainframes and minicomputers and has accelerated with the popularization of the integrated circuit. The IC has allowed increasingly complex CPUs to be designed and manufactured to tolerances on the order of nanometers, both the miniaturization and standardization of CPUs have increased the presence of digital devices in modern life far beyond the limited application of dedicated computing machines. Modern microprocessors appear in electronic devices ranging from automobiles to cellphones, the so-called Harvard architecture of the Harvard Mark I, which was completed before EDVAC, also utilized a stored-program design using punched paper tape rather than electronic memory. Relays and vacuum tubes were used as switching elements, a useful computer requires thousands or tens of thousands of switching devices. The overall speed of a system is dependent on the speed of the switches, tube computers like EDVAC tended to average eight hours between failures, whereas relay computers like the Harvard Mark I failed very rarely. In the end, tube-based CPUs became dominant because the significant speed advantages afforded generally outweighed the reliability problems, most of these early synchronous CPUs ran at low clock rates compared to modern microelectronic designs. Clock signal frequencies ranging from 100 kHz to 4 MHz were very common at this time, the design complexity of CPUs increased as various technologies facilitated building smaller and more reliable electronic devices
13.
Arithmetic logic unit
–
An arithmetic logic unit is a combinational digital electronic circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit, which operates on floating point numbers, an ALU is a fundamental building block of many types of computing circuits, including the central processing unit of computers, FPUs, and graphics processing units. A single CPU, FPU or GPU may contain multiple ALUs, in many designs, the ALU also exchanges additional information with a status register, which relates to the result of the current or previous operations. An ALU has a variety of input and output nets, which are the electrical conductors used to digital signals between the ALU and external circuitry. When an ALU is operating, external circuits apply signals to the ALU inputs and, in response, a basic ALU has three parallel data buses consisting of two input operands and a result output. Each data bus is a group of signals that conveys one binary integer number, typically, the A, B and Y bus widths are identical and match the native word size of the external circuitry. The opcode size is related to the number of different operations the ALU can perform, for example, a four-bit opcode can specify up to sixteen different ALU operations. Generally, an ALU opcode is not the same as a machine language opcode, the status outputs are various individual signals that convey supplemental information about the result of an ALU operation. These outputs are usually stored in registers so they can be used in future ALU operations or for controlling conditional branching. The collection of bit registers that store the status outputs are often treated as a single, multi-bit register, zero, which indicates all bits of output are logic zero. Negative, which indicates the result of an operation is negative. Overflow, which indicates the result of an operation has exceeded the numeric range of output. Parity, which indicates whether an even or odd number of bits in the output are logic one, the status input allows additional information to be made available to the ALU when performing an operation. Typically, this is a bit that is the stored carry-out from a previous ALU operation. An ALU is a logic circuit, meaning that its outputs will change asynchronously in response to input changes. In general, external circuitry controls an ALU by applying signals to its inputs, at the same time, the CPU also routes the ALU result output to a destination register that will receive the sum. The ALUs input signals, which are stable until the next clock, are allowed to propagate through the ALU. When the next clock arrives, the destination register stores the ALU result and, since the ALU operation has completed, a number of basic arithmetic and bitwise logic functions are commonly supported by ALUs
14.
Bus (computing)
–
In computer architecture, a bus is a communication system that transfers data between components inside a computer, or between computers. This expression covers all related components and software, including communication protocols. Modern computer buses can use both parallel and bit serial connections, and can be wired in either a multidrop or daisy chain topology, or connected by switched hubs, as in the case of USB. An early computer might contain a hand-wired CPU of vacuum tubes, a drum for main memory. In both examples, computer buses of one form or another move data between all of these devices, in most traditional computer architectures, the CPU and main memory tend to be tightly coupled. In most cases, the CPU and memory share signalling characteristics, the bus connecting the CPU and memory is one of the defining characteristics of the system, and often referred to simply as the system bus. It is possible to allow peripherals to communicate with memory in the same fashion and this is commonly accomplished through some sort of standardized electrical connector, several of these forming the expansion bus or local bus. However, as the differences between the CPU and peripherals varies widely, some solution is generally needed to ensure that peripherals do not slow overall system performance. Many CPUs feature a set of pins similar to those for communicating with memory. Others use smart controllers to place the data directly in memory, most modern systems combine both solutions, where appropriate. As the number of potential peripherals grew, using a card for every peripheral became increasingly untenable. This has led to the introduction of bus systems designed specifically to support multiple peripherals, common examples are the SATA ports in modern computers, which allow a number of hard drives to be connected without the need for a card. However, these systems are generally too expensive to implement in low-end devices. This has led to the development of a number of low-performance bus systems for these solutions. All such examples may be referred to as peripheral buses, although this terminology is not universal, in modern systems the performance difference between the CPU and main memory has grown so great that increasing amounts of high-speed memory is built directly into the CPU, known as a cache. These system buses are used to communicate with most other peripherals, through adaptors. Such systems are more similar to multicomputers, communicating over a bus rather than a network. In these cases, expansion buses are entirely separate and no longer share any architecture with their host CPU, what would have formerly been a system bus is now often known as a front-side bus
15.
IBM System/370
–
The IBM System/370 was a model range of IBM mainframe computers announced on June 30,1970 as the successors to the System/360 family. A Dynamic Address Translation option was not announced until 1972, 128-bit floating point arithmetic on all models, the original System/370 line underwent several architectural improvements during its roughly 20-year lifetime. The first System/370 machines, the Model 155, the Model 165, and these changes included,13 new instructions, among which were MOVE LONG, COMPARE LOGICAL LONG, thereby permitting operations on up to 2^24-1 bytes, vs. They did not include support for virtual memory, in 1972, a very significant change was made when support for virtual memory was introduced with IBMs System/370 Advanced Function announcement. IBM had initially chosen to exclude virtual storage from the S/370 line, the S/370-145 had an associative memory used by the microcode for the DOS compatibility feature from its first shipments in June 1971, the same hardware was used by the microcode for DAT. The 145 microcode architecture simplified the addition of virtual memory, allowing this capability to be present in early 145s without the extensive modifications needed in other models. The Reference and Change bits of the Storage-protection Keys, however, were labeled on the rollers, existing S/370-145 customers were happy to learn that they did not have to purchase a hardware upgrade in order to run DOS/VS or OS/VS1. After installation, these models were known as the S/370-155-II and S/370-165-II, IBM wanted customers to upgrade their 155 and 165 systems to the widely sold S/370-158 and -168. This led to the original S/370-155 and S/370-165 models being described as boat anchors, later architectural changes primarily involved expansions in memory – both physical memory and virtual address space – to enable larger workloads and meet client demands for more storage. This was the trend as Moores Law eroded the unit cost of memory. As with all IBM mainframe development, preserving backward compatibility was paramount, in October 1981, the 3033 and 3081 processors added extended real addressing, which allowed 26-bit addressing for physical storage. This capability appeared later on other systems, such as the 4381 and 3090, the cross-memory services capability which facilitated movement of data between address spaces was actually available just prior to S/370-XA architecture on the 3031,3032 and 3033 processors. As described above, the S/370 product line underwent a major architectural change, the evolution of S/370 addressing was always complicated by the basic S/360 instruction set design, and its large installed code base, which relied on a 24-bit logical address. Most shops thus continued to run their 24-bit applications in a higher-performance 31-bit world and this evolutionary implementation had the characteristic of solving the most urgent problems first, relief for real memory addressing being needed sooner than virtual memory addressing. IBMs choice of 31-bit addressing for 370-XA involved various factors, the System/360 Model 67 had included a full 32-bit addressing mode, but this feature was not carried forward to the System/370 series, which began with only 24-bit addressing. When IBM later expanded the S/370 address space in S/370-XA, several reasons are cited for the choice of 31 bits, in particular, the standard subroutine calling convention marked the final parameter word by setting its high bit. Interaction between 32-bit addresses and two instructions that treated their arguments as signed numbers, input from key initial Model 67 sites, which had debated the alternatives during the initial system design period, and had recommended 31 bits. The following table summarizes the major S/370 series and models, the second column lists the principal architecture associated with each series
16.
SIMD
–
Single instruction, multiple data, is a class of parallel computers in Flynns taxonomy. It describes computers with multiple processing elements that perform the operation on multiple data points simultaneously. Thus, such machines exploit data level parallelism, but not concurrency, there are simultaneous computations, SIMD is particularly applicable to common tasks like adjusting the contrast in a digital image or adjusting the volume of digital audio. Most modern CPU designs include SIMD instructions in order to improve the performance of multimedia use, vector processing was especially popularized by Cray in the 1970s and 1980s. The first era of modern SIMD machines was characterized by massively parallel processing-style supercomputers such as the Thinking Machines CM-1 and these machines had many limited-functionality processors that would work in parallel. Supercomputing moved away from the SIMD approach when inexpensive scalar MIMD approaches based on commodity processors such as the Intel i860 XP became more powerful, the current era of SIMD processors grew out of the desktop-computer market rather than the supercomputer market. Sun Microsystems introduced SIMD integer instructions in its VIS instruction set extensions in 1995, MIPS followed suit with their similar MDMX system. The first widely deployed desktop SIMD was with Intels MMX extensions to the x86 architecture in 1996 and this sparked the introduction of the much more powerful AltiVec system in the Motorola PowerPCs and IBMs POWER systems. Intel responded in 1999 by introducing the all-new SSE system, since then, there have been several extensions to the SIMD instruction sets for both architectures. A modern supercomputer is almost always a cluster of MIMD machines, a modern desktop computer is often a multiprocessor MIMD machine where each processor can execute short-vector SIMD instructions. An application that may take advantage of SIMD is one where the value is being added to a large number of data points. One example would be changing the brightness of an image, each pixel of an image consists of three values for the brightness of the red, green and blue portions of the color. To change the brightness, the R, G and B values are read from memory, a value is added to them, with a SIMD processor there are two improvements to this process. For one the data is understood to be in blocks, instead of a series of instructions saying retrieve this pixel, now retrieve the next pixel, a SIMD processor will have a single instruction that effectively says retrieve n pixels. For a variety of reasons, this can take less time than retrieving each pixel individually. Another advantage is that the instruction operates on all loaded data in a single operation. In other words, if the SIMD system works by loading up eight points at once. Not all algorithms can be vectorized easily, note, Batch-pipeline systems are most advantageous for cache control when implemented with SIMD intrinsics, but they are not exclusive to SIMD features
17.
Streaming SIMD Extensions
–
SSE contains 70 new instructions, most of which work on single precision floating point data. SIMD instructions can greatly increase performance when exactly the same operations are to be performed on multiple data objects, typical applications are digital signal processing and graphics processing. Intels first IA-32 SIMD effort was the MMX instruction set, MMX had two main problems, it re-used existing floating point registers making the CPU unable to work on both floating point and SIMD data at the same time, and it only worked on integers. SSE floating point instructions operate on a new independent register set, SSE was subsequently expanded by Intel to SSE2, SSE3, SSSE3, and SSE4. Because it supports floating point math, it had a wider application than MMX, the addition of integer support in SSE2 made MMX largely redundant, though further performance increases can be attained in some situations by using MMX in parallel with SSE operations. SSE was originally called Katmai New Instructions, Katmai being the name for the first Pentium III core revision. During the Katmai project Intel sought to distinguish it from their product line. It was later renamed Intel Streaming SIMD Extensions, then SSE, AMD eventually added support for SSE instructions, starting with its Athlon XP and Duron processors. SSE originally added eight new 128-bit registers known as XMM0 through XMM7, the AMD64 extensions from AMD added a further eight registers XMM8 through XMM15, and this extension is duplicated in the Intel 64 architecture. There is also a new 32-bit control/status register, MXCSR, the registers XMM8 through XMM15 are accessible only in 64-bit operating mode. This means that the OS must know how to use the FXSAVE and FXRSTOR instructions and this support was quickly added to all major IA-32 operating systems. The first CPU to support SSE, the Pentium III, shared execution resources between SSE and the FPU, while a compiled application can interleave FPU and SSE instructions side-by-side, the Pentium III will not issue an FPU and an SSE instruction in the same clock cycle. SSE introduced both scalar and packed floating point instructions, consider an operation like vector addition, which is used very often in computer graphics applications. To add two single precision, four-component vectors together using x86 requires four floating-point addition instructions and this corresponds to four x86 FADD instructions in the object code. On the other hand, as the following shows, a single 128-bit packed-add instruction can replace the four scalar addition instructions. SSE2, Willamette New Instructions, introduced with the Pentium 4, is an enhancement to SSE. SSE2 adds two major features, double-precision floating point for all SSE operations, and MMX integer operations on 128-bit XMM registers, in the original SSE instruction set, conversion to and from integers placed the integer data in the 64-bit MMX registers. SSE2 enables the programmer to perform SIMD math on any data type entirely with the XMM vector-register file and it offers an orthogonal set of instructions for dealing with common data types