Internal code

The term internal code is a word-for-word translation of the Chinese term neima (內碼, 内码; pinyin: nèimǎ; jyutping: noi6 maa5). It is used primarily by Chinese speakers.

Originally, the term referred to the encoding used “internally” by a Chinese system. It now refers either to the encoding of a character in some character set or to the character encoding in use. It is not an encoding in itself; the actual encoding being referred to has to be determined from context.

On any computer system, the internal code is the native encoding in use. For example, on a Big5-based system (e.g., Microsoft Windows 3.1 localized for traditional Chinese), the internal code is Big5; similarly, on a GB-based system (e.g., DOS running CCDOS), the internal code is GB2312. On early computers, the Chinese language card took the internal codes and rendered the corresponding Chinese characters on the screen.[1] On many modern operating systems (such as all modern Microsoft Windows systems), the internal code is a form of Unicode.

Within a particular encoding, the internal code of a character is simply the value of the code point used to represent that character. For example, in the Big5 encoding, the character "一" (Chinese for "one") has the internal code A440 (hexadecimal); in the GB encoding, the same character has the internal code D2BB.
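The two values above can be reproduced with a short Python sketch. It assumes that Python's built-in "big5" and "gb2312" codecs stand in for the Big5 and GB internal codes discussed here; the helper name internal_code is hypothetical and used only for illustration.

```python
def internal_code(char: str, encoding: str) -> str:
    """Return the internal code of one character as uppercase hexadecimal.

    Assumption: Python's built-in codec for `encoding` matches the
    system encoding the article calls the "internal code".
    """
    return char.encode(encoding).hex().upper()


# The character "一" under the two encodings named in the text.
print(internal_code("一", "big5"))    # A440
print(internal_code("一", "gb2312"))  # D2BB
```

The same character yields a different internal code under each encoding, which is why the encoding has to be known from context.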

The "neima method"

For more examples when the internal code is Unicode, see Unicode.

The internal code can be used as an input method for inputting Han characters; this input method is usually called 內碼 in Chinese and is usually provided for Big5 and GB internal codes; in English, it may be variously called “neima”, “internal code”, “raw code”, or other similar names.

For example, on a Big5-based system, one can input the character 一 by typing “A440” using the “internal code” input method. On a GB-based system, by contrast, one would type “D2BB” to input the same character.
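What such an input method does with the typed digits can be sketched roughly as follows. This is only an illustration, not the implementation of any actual input method editor: the function name neima_input is hypothetical, and Python's "big5" and "gb2312" codecs are assumed to match the system's internal code.

```python
def neima_input(hex_code: str, encoding: str) -> str:
    """Turn typed hexadecimal digits into the character they encode.

    Hypothetical helper: interprets `hex_code` as raw bytes in the
    chosen internal code and decodes them to a character.
    """
    raw = bytes.fromhex(hex_code)  # "A440" -> b"\xa4\x40"
    return raw.decode(encoding)    # raises UnicodeDecodeError if invalid


print(neima_input("A440", "big5"))    # 一
print(neima_input("D2BB", "gb2312"))  # 一
```

A code that is not assigned in the chosen encoding makes decode raise UnicodeDecodeError, which a real input method would presumably report as an invalid entry.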

NeiMa expects the user to input the desired character by providing its value within the user-chosen character set.

For example, to input the Chinese character "不" (pinyin bù, English "not"), one can start the NeiMa editor, switch to Unicode character encoding mode, and then type the character's hexadecimal code point in Unicode, which is 4E0D. NeiMa is a very awkward way of typing characters, because the user needs to know the code points of all the characters needed.

More generally, NeiMa in Unicode mode accepts any Unicode code point, so users are not limited to Chinese characters and can input any other character that Unicode encodes. For example, Latin Capital Letter A ("A") can be input with NeiMa using its Unicode code point, which is 0041.
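In Unicode mode the mapping is a direct conversion between a hexadecimal number and a code point. A minimal sketch, with the hypothetical helper name neima_unicode:

```python
def neima_unicode(hex_code: str) -> str:
    """Return the character at the given hexadecimal Unicode code point."""
    return chr(int(hex_code, 16))


print(neima_unicode("4E0D"))  # 不
print(neima_unicode("0041"))  # A

# The reverse lookup, which a user would need in order to find the code:
print(f"{ord('不'):04X}")      # 4E0D
```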

On a Unicode-based system, one might be able to input a character by typing its Unicode number in hexadecimal; such an input method might also be called “neima”, or it might be called “Unicode”.

On a JIS-based system (Japanese), there might be a kuten input method that allows characters to be input using a form of the internal code called the "kuten form"; this kind of input method is called quwei on GB-based systems (Chinese). Although the kuten (quwei) form is related to the internal code, such input methods are not usually referred to as “internal code” input methods.
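The relationship between the quwei form and the internal code can be sketched for GB2312. The sketch assumes the common EUC-CN byte layout, in which each of the two bytes is the row (qu) or cell (wei) number plus 0xA0; the article itself does not state this offset, and the function names are hypothetical.

```python
def quwei_to_internal(quwei: str) -> str:
    """Convert a four-digit quwei code such as "5027" to internal-code hex.

    Assumption: GB2312 in its EUC-CN form, where byte = number + 0xA0.
    """
    if len(quwei) != 4 or not (quwei.isascii() and quwei.isdigit()):
        raise ValueError("quwei code must be four decimal digits")
    qu, wei = int(quwei[:2]), int(quwei[2:])
    if not (1 <= qu <= 94 and 1 <= wei <= 94):
        raise ValueError("row and cell must each be in the range 1..94")
    return bytes([qu + 0xA0, wei + 0xA0]).hex().upper()


def internal_to_quwei(hex_code: str) -> str:
    """Convert two-byte internal-code hex back to a four-digit quwei code."""
    high, low = bytes.fromhex(hex_code)
    return f"{high - 0xA0:02d}{low - 0xA0:02d}"


print(quwei_to_internal("5027"))  # D2BB
print(internal_to_quwei("D2BB"))  # 5027
```

Under that assumption, row 50, cell 27 corresponds to the internal code D2BB, the GB code given earlier for 一.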

Using an “internal code” input method to input characters is not normally very practical, because the user needs a table of characters with their internal codes. It is useful, however, for inputting special symbols that may be impossible to input using other input methods.
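A table of characters and their internal codes, of the kind a user of this input method would need, could be generated along these lines. The helper name code_table and the sample characters are illustrative choices only, and Python's "big5" and "gb2312" codecs are assumed to approximate the internal codes in question.

```python
def code_table(chars, encodings=("big5", "gb2312")):
    """Build one row per character with its code in each encoding.

    A character that an encoding cannot represent gets None for it.
    """
    rows = []
    for ch in chars:
        row = {"char": ch, "unicode": f"{ord(ch):04X}"}
        for enc in encodings:
            try:
                row[enc] = ch.encode(enc).hex().upper()
            except UnicodeEncodeError:
                row[enc] = None
        rows.append(row)
    return rows


# Two Han characters from the text plus two sample symbols.
for row in code_table("一不※§"):
    print(row)
```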


  1. 朱, 巧明 (2005). 中文信息处理技术教程. p. 162. ISBN 9787302117612.