8-Bit Character Sets [ MPE XL Native Language Programmer's Guide ] MPE/iX 5.0 Documentation
MPE XL Native Language Programmer's Guide

8-Bit Character Sets 

Within NLS, each supported language is associated with an 8-bit character
set.  (One character set may support many languages.)  Before the
introduction of NLS, the only widely supported character set was USASCII,
a 128-character set designed to support American English text.  USASCII
uses only seven bits of an 8-bit byte to encode a character; the eighth
(high-order) bit is always zero.

By using the eighth bit, it is possible to build supersets of USASCII
that permit the encoding and manipulation of characters required by
languages other than American English.  These supersets are referred to
as 8-bit or extended character sets.  New characters are added with
decimal code values in the range 161-254.
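
The split between the 7-bit USASCII range and the added 8-bit range can
be sketched in Python as follows.  This is an illustration only and is
not part of NLS: the function names are hypothetical, and labeling the
values 128-160 and 255 as "other" is an assumption drawn from the range
stated above.

```python
def is_usascii(byte):
    """Return True if the eighth (high-order) bit is zero, i.e. 0-127."""
    return (byte & 0x80) == 0


def classify(byte):
    """Classify one byte value as 'usascii', 'extended', or 'other'.

    'extended' covers decimal 161-254, the range in which the text says
    new characters are added.  'other' (128-160 and 255) is a label made
    up for this sketch.
    """
    if not 0 <= byte <= 255:
        raise ValueError("not an 8-bit value")
    if is_usascii(byte):
        return "usascii"
    if 161 <= byte <= 254:
        return "extended"
    return "other"
```

For example, classify(0x41) yields "usascii" and classify(200) yields
"extended".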


NOTE  All character sets are supersets of USASCII, and are occasionally
      referred to as ASCII character sets.


Another method of providing foreign characters, one that NLS does not
support, involves replacing 12 existing characters in USASCII with
substitution characters.
The 7-bit substitution set eliminates some characters in favor of others
needed by a particular local language.  A different substitution set is
necessary for each language.  The NLS 8-bit character sets support all
USASCII characters (except for \ in KANA8) in addition to the characters
needed to support several Western European languages, the languages of
Middle Eastern countries, and KATAKANA.
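
The cost of a 7-bit substitution set can be sketched in Python as
follows.  The text above does not list the 12 replaced characters or the
contents of any particular substitution set, so this sketch assumes the
German national variant of ISO 646 (DIN 66003), which replaces eight
USASCII characters, as a stand-in.  The names GERMAN_7BIT and
decode_7bit are hypothetical.

```python
# Assumption: German ISO 646 variant (DIN 66003), used only as an example
# of a 7-bit substitution set.  It is not taken from the NLS guide.
GERMAN_7BIT = {
    0x40: "§",  # replaces @
    0x5B: "Ä",  # replaces [
    0x5C: "Ö",  # replaces \
    0x5D: "Ü",  # replaces ]
    0x7B: "ä",  # replaces {
    0x7C: "ö",  # replaces |
    0x7D: "ü",  # replaces }
    0x7E: "ß",  # replaces ~
}


def decode_7bit(data, substitutions):
    """Decode 7-bit bytes, applying a substitution set over USASCII."""
    chars = []
    for b in data:
        if b > 0x7F:
            raise ValueError("not 7-bit data")
        # A substituted code no longer means its USASCII character.
        chars.append(substitutions.get(b, chr(b)))
    return "".join(chars)
```

Under this substitution set the bytes for "a[0]" decode as "aÄ0Ü": the
brackets are no longer available.  An 8-bit character set avoids this
trade-off by placing the added characters above the USASCII range.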


NOTE  Because 8-bit character sets are used in NLS, all bits of every
      byte have significance.  Application software must take care to
      preserve the eighth bit (high-order), not allowing it to be
      modified or reused for any special purpose.  No differentiation
      should be made between characters that have the eighth bit turned
      off or on, as all are characters of equal status in the extended
      character set.
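
The effect of failing to preserve the eighth bit can be sketched in
Python as follows.  This is a minimal illustration, not NLS code: the
function names and the sample byte values are hypothetical, and no claim
is made about which characters those values represent in any particular
character set.

```python
def strip_high_bit(data):
    """Anti-pattern: clears the eighth bit, as some 7-bit software does."""
    return bytes(b & 0x7F for b in data)


def copy_8bit_clean(data):
    """Passes every byte through unchanged, preserving all eight bits."""
    return bytes(data)


# One USASCII byte followed by two bytes with the eighth bit set.
sample = bytes([0x41, 0xC8, 0xE5])

damaged = strip_high_bit(sample)     # 0xC8 -> 0x48, 0xE5 -> 0x65
intact = copy_8bit_clean(sample)     # identical to sample
```

After masking, the two extended bytes become indistinguishable from the
unrelated USASCII codes 0x48 and 0x65, and the original data cannot be
recovered.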