8-Bit Character Sets [ MPE XL Native Language Programmer's Guide ] MPE/iX 5.0 Documentation
MPE XL Native Language Programmer's Guide
8-Bit Character Sets
Within NLS, each supported language is associated with an 8-bit character
set. (One character set may support many languages.) Before the
introduction of NLS, the only widely supported character set was USASCII,
a 128-character set designed to support American English text. USASCII
uses only seven bits of an 8-bit byte to encode a character; the eighth
or high-order bit is always zero.
By using the eighth bit, it is possible to build supersets of USASCII
that permit the encoding and manipulation of characters required by
languages other than American English. These supersets are referred to as
8-bit or extended character sets. New characters are added with code
values in the range 161-254.
NOTE All NLS 8-bit character sets are supersets of USASCII, and are
occasionally referred to as ASCII character sets.
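The distinction between the 7-bit USASCII range and the extended range can be sketched in a few lines of Python. This is a minimal illustration, not part of NLS: the function names are hypothetical, and the only values taken from the text are the high-order-bit rule for USASCII and the 161-254 range for added characters.

```python
# Range of code values the text gives for added (extended) characters.
EXTENDED_FIRST = 161
EXTENDED_LAST = 254


def is_usascii(byte: int) -> bool:
    """True if the eighth (high-order) bit is zero, i.e. code value 0-127."""
    return (byte & 0x80) == 0


def is_extended(byte: int) -> bool:
    """True if the code value lies in the 161-254 range for added characters."""
    return EXTENDED_FIRST <= byte <= EXTENDED_LAST


def count_extended(data: bytes) -> int:
    """Count the bytes in data whose code values lie in the extended range."""
    return sum(1 for b in data if is_extended(b))
```

For example, count_extended(bytes([0x41, 0xC8, 0x42, 0xA1])) counts the two bytes whose code values fall in the extended range.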
Another method of providing foreign characters not supported by NLS
involves replacing 12 existing characters in USASCII with substitution
characters. This 7-bit substitution set eliminates some characters in
favor of others needed by a particular local language, so a different
substitution set is necessary for each language. The NLS 8-bit character
sets, by contrast, support all USASCII characters (except for \ in KANA8)
in addition to the characters needed to support several Western
European-based languages, Middle Eastern languages, and KATAKANA.
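The drawback of the 7-bit substitution approach is easier to see in code: the same byte means different characters depending on which substitution set is in effect. The sketch below is an illustration only. The table is hypothetical, modeled on the German national variant of ISO 646 rather than taken from this guide, and it lists only some of the replaced positions; render_7bit is likewise an assumed helper name.

```python
from typing import Mapping

# HYPOTHETICAL substitution table, modeled on the German national variant
# of ISO 646; it is not taken from the guide. Keys are USASCII characters
# whose code values are reused; values are the characters shown instead.
GERMAN_STYLE_SUBSTITUTION = {
    "@": "\u00a7",   # section sign
    "[": "\u00c4",   # A with diaeresis
    "\\": "\u00d6",  # O with diaeresis
    "]": "\u00dc",   # U with diaeresis
    "{": "\u00e4",   # a with diaeresis
    "|": "\u00f6",   # o with diaeresis
    "}": "\u00fc",   # u with diaeresis
    "~": "\u00df",   # sharp s
}


def render_7bit(data: bytes, substitution: Mapping[str, str]) -> str:
    """Render 7-bit bytes as text under the given substitution set."""
    text = data.decode("ascii")  # raises if any byte has the eighth bit set
    return "".join(substitution.get(ch, ch) for ch in text)
```

Under this table the bytes b"a[1]" no longer display their brackets, because those code values now stand for other characters. Rendering the same bytes with an empty table leaves them unchanged, which is why data written under one substitution set cannot be read correctly under another.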
NOTE Because 8-bit character sets are used in NLS, all bits of every
byte have significance. Application software must take care to
preserve the eighth (high-order) bit and must not allow it to be
modified or reused for any special purpose. No differentiation
should be made between characters that have the eighth bit turned
off or on, as all are characters of equal status in the extended
character set.
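The effect of failing to preserve the eighth bit can be sketched as follows. This is a minimal illustration with hypothetical function names. It contrasts a common 7-bit habit (masking each byte with 0x7F) with 8-bit clean handling that passes every byte through untouched.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Legacy 7-bit habit: clear the eighth bit of every byte (destroys data)."""
    return bytes(b & 0x7F for b in data)


def copy_8bit_clean(data: bytes) -> bytes:
    """8-bit clean handling: every byte is passed through unchanged."""
    return bytes(data)


# 'A', one byte from the extended range (0xC8 = 200), then 'B'.
sample = bytes([0x41, 0xC8, 0x42])

damaged = strip_high_bit(sample)     # 0xC8 becomes 0x48, the code for 'H'
preserved = copy_8bit_clean(sample)  # identical to sample
```

After masking, the extended character is silently turned into an unrelated USASCII character, and the original value cannot be recovered. Code that treats all 256 byte values alike avoids the problem.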