Recognizing Primitive Data Types [ DATA TYPES CONVERSION Programmer's Guide ] MPE/iX 5.0 Documentation
DATA TYPES CONVERSION Programmer's Guide
Recognizing Primitive Data Types
Data is an abstraction of information. Data must be structured in a form
that the computer is designed to process; data conversion is the
translation of information to a form acceptable to the computer.
The 900 Series HP 3000 Computer Systems instruction set is designed to
operate on certain fundamental data types. The following data types are
recognized by MPE XL and its subsystems:
* Characters.
* The following numeric types:
     Integers.
     Real numbers (in floating-point notation).
     Decimals: packed, unpacked, and floating-point.
NOTE Although decimal is not really a system primitive type, it is
included in this manual because it is so widely used on MPE XL.
Floating-point decimals are used by BASIC; packed and unpacked
decimals are used by COBOL and RPG.
Each data type requires a specific bit format. In this manual, bit
fields are described as (bit:length), where bit is the first bit in the
field and length is the number of consecutive bits in the field. For
example, "bits (13:3)" refers to bits 13, 14, and 15. Bit 0 is the most
significant bit.
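The (bit:length) notation can be sketched in Python as follows. This is
an illustration only: the helper name bit_field and the 16-bit default
width are assumptions made for the example, not part of MPE XL.

```python
def bit_field(value, bit, length, width=16):
    """Return the field (bit:length) of a width-bit value.

    Bit 0 is the most significant bit, as in this manual. The function
    name and the 16-bit default width are illustrative assumptions.
    """
    if bit < 0 or length < 1 or bit + length > width:
        raise ValueError("field does not fit in the word")
    shift = width - (bit + length)   # number of bits to the right of the field
    mask = (1 << length) - 1         # 'length' one-bits
    return (value >> shift) & mask


# Bits (13:3) of a 16-bit word are bits 13, 14, and 15, the three
# least significant bits.
word = 0b0000000000000101
print(bit_field(word, 13, 3))        # 5
print(bit_field(word, 0, 1))         # 0 (the most significant bit)
```

Because bit 0 is the leftmost bit, the shift count is measured from the
right-hand end of the word rather than from the field's starting bit.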
Character
Character code formats are primitive data types. Characters are the
letters, numbers, and symbols on your keyboard. The computer relates
each alphanumeric character to an 8-bit (one byte) binary number,
according to a correspondence code. Some of the characters are easily
displayable, like +, ?, 8, and z; some are not, like a blank space or the
carriage return.
MPE supports the two common American English character codes: ASCII
(American Standard Code for Information Interchange) and EBCDIC (Extended
Binary Coded Decimal Interchange Code). Several natural language types
are also supported. See Appendix A for ASCII and EBCDIC codes and
equivalents.
Character data types are useful for storing strings of symbols like
names, addresses, or identification numbers, and for reading the keyboard
or writing to the screen. Remember, variables saved as data type
character are recognized by the computer as symbols, not as numeric
values.
ASCII. MPE and its subsystems use the ASCII data type to represent
character data. ASCII is the format adopted by ANSI, the American
National Standards Institute. Most MPE interfaces use ASCII to accept or
return character data.
Appendix A shows the ASCII and EBCDIC character code values, along with
their decimal, octal, and hexadecimal equivalents.
ASCII is used in this guide as the name of a data type. ASCII data type
corresponds to the ASCII character code format. The codes for byte
values in the range 0 to 127 conform to the ASCII standard format. Byte
values in the range 128 to 255 are interpreted using Hewlett-Packard's
extended ROMAN8 character set. MPE XL and its subsystems use values in
this range to support extended (8-bit) character sets.
Figure 2-1 shows the ASCII data type bit format.
Figure 2-1. Bit Format: ASCII Character
EBCDIC. EBCDIC is another coding format widely used in the computer
industry for character data. Like ASCII, it is based on the byte.
EBCDIC is used in this guide as the name of a data type. EBCDIC data
type corresponds to EBCDIC character code format for byte values in the
range 0 to 255.
Appendix A shows the ASCII and EBCDIC character code values, along with
their decimal, octal, and hexadecimal equivalents.
Figure 2-2 shows the bit format for EBCDIC data type.
Figure 2-2. Bit Format: EBCDIC Character
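To see that the same character maps to different byte values under the
two codes, consider the following Python sketch. It assumes Python's
built-in "cp037" codec as a representative EBCDIC code page; the codes
listed in Appendix A may differ for some characters.

```python
text = "A8z"

ascii_bytes = text.encode("ascii")    # ASCII codes, one byte per character
ebcdic_bytes = text.encode("cp037")   # one EBCDIC code page (an assumption)

for ch, a, e in zip(text, ascii_bytes, ebcdic_bytes):
    # Hexadecimal values are shown with a leading $, as in this manual.
    print(f"{ch!r}: ASCII ${a:02X}   EBCDIC ${e:02X}")
```

Both encodings use exactly one byte per character; only the
correspondence between byte value and symbol changes.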
Numeric
MPE XL subsystems support three primitive data types for numbers:
* Integer.
* Real.
* Decimal.
Integer. An integer is any positive or negative whole number, including
zero. Integers are useful for counting and for incrementing in loops.
Signed integers are a useful form for exchanging numeric data between
languages.
MPE XL integers can be 8, 16, 32, or 64 bits long. They can be unsigned
or signed (+ or -). Signed integers are represented in twos complement
form.
Table 2-1. MPE XL Integer Types
-----------------------------------------------------------------------------------------------
| | | |
| Size Type | Range | Stored At: |
| | | |
-----------------------------------------------------------------------------------------------
| | | |
| 8-bit: unsigned | 0 to 255 | byte addresses |
| | | |
-----------------------------------------------------------------------------------------------
| | | |
| 16-bit: signed | -32,768 to 32,767 | half-word addresses |
| | | |
| unsigned | 0 to 65,535 | half-word addresses |
| | | |
-----------------------------------------------------------------------------------------------
| | | |
| 32-bit: signed | -2,147,483,648 to | word addresses |
| | 2,147,483,647 | |
| | | |
| unsigned | 0 to 4,294,967,295 | word addresses |
| | | |
-----------------------------------------------------------------------------------------------
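The ranges in Table 2-1 follow directly from the number of bits. The
Python sketch below recomputes them; the function names are chosen for
this example only.

```python
def unsigned_range(n):
    """Smallest and largest values of an unsigned n-bit integer."""
    return 0, 2**n - 1


def signed_range(n):
    """Smallest and largest values of a signed (twos complement) n-bit integer."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1


for n in (8, 16, 32):
    print(n, "bits: unsigned", unsigned_range(n), "signed", signed_range(n))
```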
The chart below shows the representation of the whole number (base-ten)
73 as an unsigned integer, a signed positive number, and a signed
negative number.
       Unsigned       |               Signed
----------------------------------------------------------------
                      |      Positive          Negative
         (73)         |       (+73)             (-73)
       01001001       |      01001001          10110111
Unsigned Integer. Unsigned integers are stored in the computer in their
base-two form. If you are reading or writing unsigned integers in a
language, the compiler converts for you, according to the formatting
conventions of the individual language.
An unsigned n-bit number can represent any value from 0 to 2^n - 1.
Reading an Unsigned Integer: One method of reading an unsigned integer
as a base-ten value is to consider the bits as columns whose values are
powers of two. The rightmost (least significant) bit is the units column
and has a weight of 2^0, or 1. Going toward the left (the most
significant bit), the columns have progressively greater weight: 2^0,
2^1, 2^2, ... 2^(n-1). The decimal-based value of unsigned binary
numbers is computed by multiplying the value in each column by the
weight of the column, and then adding all the results. An unsigned
integer represented with ones in the 2^0, 2^3, and 2^6 columns and zeros
in all the other columns would be computed as follows:
1*(2^0) + 1*(2^3) + 1*(2^6) = 73.
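The column method can be sketched in Python as shown below. The function
name read_unsigned is an assumption made for this example; it takes the
bits as a string of 0 and 1 characters.

```python
def read_unsigned(bits):
    """Add up the column weights of a string of '0' and '1' characters."""
    n = len(bits)
    total = 0
    for position, digit in enumerate(bits):
        weight = 2 ** (n - 1 - position)   # leftmost column weighs 2^(n-1)
        total += int(digit) * weight
    return total


print(read_unsigned("01001001"))   # 73
```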
Writing an Unsigned Integer: One method of manually determining the
unsigned integer representation of a base-ten value is to use successive
subtraction. For example, the largest power of 2 that is less than or
equal to the value of decimal-base 73 is 2^6, or 64. Subtracting 64 from
73 leaves a remainder of 9. The largest power of 2 that is less than or
equal to 9 is 2^3, or 8. Subtracting 8 from 9 leaves a remainder of 1.
The only power of 2 that is less than or equal to 1 is 2^0, or 1. This
leaves a remainder of 0, so the computation is finished. Thus, 73 is
represented in binary with a 1 in the 2^0, the 2^3, and the 2^6 columns
and a zero in all the others.
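The successive-subtraction method can be sketched in Python as follows.
The function name write_unsigned and the default width of 8 bits are
assumptions made for this example.

```python
def write_unsigned(value, n=8):
    """Build the n-bit unsigned representation by successive subtraction."""
    if not 0 <= value < 2**n:
        raise ValueError("value does not fit in n bits")
    bits = []
    remainder = value
    for exponent in range(n - 1, -1, -1):   # from 2^(n-1) down to 2^0
        weight = 2**exponent
        if weight <= remainder:
            bits.append("1")                # this power of two is used
            remainder -= weight
        else:
            bits.append("0")
    return "".join(bits)


print(write_unsigned(73))   # 01001001
```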
Signed Integer. Signed integers are stored in the computer in twos
complement form. If you are reading or writing signed integers in a
language, the compiler converts for you, according to the formatting
conventions of the individual language.
A signed n-bit integer in twos complement form can represent any value
from -(2^(n-1)) to +2^(n-1) - 1.
When the n-bit positive integer i is added to its n-bit integer negative
(complement), -i, and both are in twos complement form, the result is
always an n-bit zero.
Reading a Signed Integer: The computer represents both positive and
negative numbers in twos complement form much the same way that it would
represent an unsigned integer: beginning at the rightmost (least
significant) bit and going toward the left, the columns have
progressively greater weight: 2^0, 2^1, 2^2, ... 2^(n-1). The only
difference is that the most significant bit of a twos complement number
is negative. That is, it has a weight of -(2^(n-1)).
To manually convert a signed integer in twos complement form to a
base-ten integer, you can use the column method explained in Unsigned
Integers, above. However, you give the leftmost column of a twos
complement number a weight of -(2^(n-1)).
In the example below, this method is used to interpret the signed binary
integers 01010101 and 10101010, written in twos complement form, as
decimal-based integers:
(01010101)base 2 = the sum of:  |     |  (10101010)base 2 = the sum of:
                                |     |
(1 x 2^0)    =   1              |     |  (0 x 2^0)    =    0
(0 x 2^1)    =   0              |     |  (1 x 2^1)    =    2
(1 x 2^2)    =   4              |     |  (0 x 2^2)    =    0
(0 x 2^3)    =   0              |     |  (1 x 2^3)    =    8
(1 x 2^4)    =  16              |     |  (0 x 2^4)    =    0
(0 x 2^5)    =   0              |     |  (1 x 2^5)    =   32
(1 x 2^6)    =  64              |     |  (0 x 2^6)    =    0
(0 x -(2^7)) =   0              |     |  (1 x -(2^7)) = -128
--------------------------------------------------------------------------
(01010101)base 2 = 85 base 10   | and |  (10101010)base 2 = -86 base 10
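The same computation can be sketched in Python. The function name
read_signed is an assumption made for this example; the only change from
the unsigned column method is the negative weight of the leftmost
column.

```python
def read_signed(bits):
    """Interpret a string of '0' and '1' characters as twos complement."""
    n = len(bits)
    # The most significant column has a weight of -(2^(n-1)).
    total = -int(bits[0]) * 2 ** (n - 1)
    for position in range(1, n):
        total += int(bits[position]) * 2 ** (n - 1 - position)
    return total


print(read_signed("01010101"))   # 85
print(read_signed("10101010"))   # -86
```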
Writing a Signed Integer: Converting a signed base-ten number to twos
complement form is not difficult.
You can represent the positive signed integers just as explained in
Unsigned Integers, above.
You can represent a negative integer quickly and easily using the
following technique, which takes advantage of the properties of binary
numbers: First, ignoring the sign, represent the value as an unsigned
binary integer. Next, reverse all the 0s and 1s. Finally, add 1 to the
result. Thus, the twos complement of 10101010 is (01010101 + 1), or
01010110.
You can check your conversion by adding the positive and negative numbers
(in twos complement form) to see if they total zero. From the example
above, notice that adding the 8-bit integer 10101010 to its twos
complement, 01010110, yields a 9-bit result, 100000000. However, the
system defines the result type to be 8-bit integer and recognizes only
the 8 zeros, so the result is zero.
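The invert-and-add-one technique and the zero-sum check can be sketched
in Python as follows. The function names are assumptions made for this
example; bit patterns are passed as strings of 0 and 1 characters.

```python
def twos_complement(bits):
    """Reverse all the 0s and 1s, then add 1, keeping only n bits."""
    n = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    value = (int(inverted, 2) + 1) % 2**n
    return format(value, f"0{n}b")


def add_n_bit(a, b):
    """Add two n-bit patterns and discard any carry out of the top bit."""
    n = len(a)
    total = (int(a, 2) + int(b, 2)) % 2**n
    return format(total, f"0{n}b")


negative = "10101010"
positive = twos_complement(negative)
print(positive)                        # 01010110
print(add_n_bit(negative, positive))   # 00000000
```

Discarding the carry in add_n_bit mirrors the system's treatment of the
9-bit sum as an 8-bit result.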
Figure 2-3 shows bit formats for the 32-bit integer type.
Figure 2-3. Bit Format: 32-Bit Integer
Real. A real number is a value in the set of zero and the positive or
negative rational numbers. Signed integers and fractions are included,
although fractions may be approximated. Imaginary and complex numbers
are not included in the set of real numbers, although high-level
languages may have constructs for storing and working with them.
The real data type is a useful form for representing very large or small
values. Special formats are reserved to represent zero, infinity, and
NaN (not a number).
Real data type represents real numbers by using a type of floating-point,
or scientific, notation. In this notation, you generally express a very
large or very small number as a fraction multiplied by a power of the
number base. For example, the base-ten number .000025 could be expressed
as +.25 * 10^-4. The general floating-point, or scientific notation, form
is:
Sf F * (B ** Se E)
where: Sf is the sign (+ or -) of the number.
F is the fraction or mantissa.
* is the symbol for multiplication.
B the base is represented as an integer.
** is the symbol for exponentiation.
Se is the sign (+ or -) of the exponent.
E the exponent or characteristic is
represented as an integer.
NOTE In this manual, assume all representations of floating-point real
numbers use an integer base of 10 (decimal-based, or base-ten)
unless otherwise indicated. Internally, the computer uses a base
of two (is binary-based), and the conversion is approximate.
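The approximate nature of the base-two conversion can be observed with a
short Python sketch. It assumes that Python's float is stored as a
binary double-precision value, which is the case on common platforms;
the variable names are illustrative.

```python
from decimal import Decimal

# Decimal(float) shows the exact value of the binary number that is stored.
stored_tenth = Decimal(0.1)
stored_half = Decimal(0.5)

print(stored_tenth)                      # close to, but not exactly, 0.1
print(stored_tenth == Decimal("0.1"))    # False: 0.1 has no exact binary form
print(stored_half == Decimal("0.5"))     # True: 0.5 is a power of two
```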
You can represent real numbers in four ways: choose either IEEE or
HP3000 format, and either single-precision or double-precision size.
IEEE or HP3000 Format. MPE XL recognizes two formats for storing
floating-point real numbers: IEEE and HP3000. Programs compiled in NM
use IEEE as the default. Programs compiled in CM use HP3000, the MPE XL
emulation of the MPE V/E system floating-point format. NM programs
accessing HP3000 data must either specify a special compiler option or
convert CM data to NM before operations.
Single or Double Precision. You can represent single-precision (32-bit)
or double-precision (64-bit) real numbers in both IEEE and HP3000
notation. Table 2-2 shows a summary of the range and accuracy of each.
Table 2-2. Ranges and Accuracies for Floating-Point Real Numbers
----------------------------------------------------------------------------------
|                                | IEEE                   | HP3000               |
----------------------------------------------------------------------------------
| Single precision:              |                        |                      |
| Accuracy (in decimal digits)   | 7.2                    | 6.9                  |
| Range                          | -3.4E38 to -1.4E-45    | -1.2E77 to -8.6E-78  |
|                                | 0                      | 0                    |
|                                | +1.4E-45 to +3.4E38    | +8.6E-78 to +1.2E77  |
----------------------------------------------------------------------------------
| Double precision:              |                        |                      |
| Accuracy (in decimal digits)   | 15.9                   | 16.5                 |
| Range                          | -1.8E308 to -4.9E-324  | -1.2E77 to -8.6E-78  |
|                                | 0                      | 0                    |
|                                | +4.9E-324 to +1.8E308  | +8.6E-78 to +1.2E77  |
----------------------------------------------------------------------------------
| Note: Values in this table are rounded.                                        |
----------------------------------------------------------------------------------
Fields of a Real Number. In MPE XL format, real numbers have three
fields:
Sign.
Mantissa.
Exponent.
Different representations of real numbers have the three fields aligned
on different boundaries. In all formats, the sign field is the first
bit, the mantissa is in normalized form, and the exponent is biased.
The sign field, bit (0:1), is 0 if the number is positive, 1 if negative.
Mantissas are represented in normalized form. That is, the leading one
is stripped and the binary point is not explicitly expressed. Each
expressed mantissa, then, has an implied leading one and binary point.
For example, a mantissa represented by 10101010101010101010101 is
interpreted as the value 1.10101010101010101010101.
The exponents of real numbers are biased. This means that both positive
and negative true exponents are represented using only unsigned binary
integers. The bias amount, or excess, is the difference between the true
exponent and the represented exponent. The negative true exponents
correspond to the lower range of the represented exponents. The positive
true exponents correspond to the upper range of the represented
exponents. The true exponent zero corresponds to the midpoint in the
range of the represented exponents. For example, consider an exponent
field n bits long where the true exponent is T, the represented exponent
is E, and the bias is b. For any real number x, then, x^T = x^(E-b), and
x^E = x^(T+b); that is, T = E - b.
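The relationship between true and represented exponents can be sketched
in Python as follows. The function and constant names are assumptions
made for this example; the bias values are the ones given later in this
chapter for each format.

```python
IEEE_SINGLE_BIAS = 127    # bias values as stated later in this chapter
IEEE_DOUBLE_BIAS = 1023
HP3000_BIAS = 256


def to_represented(true_exponent, bias):
    """E = T + b: the unsigned value stored in the exponent field."""
    return true_exponent + bias


def to_true(represented_exponent, bias):
    """T = E - b: the exponent actually applied to the base."""
    return represented_exponent - bias


print(to_represented(6, IEEE_SINGLE_BIAS))   # 133
print(to_true(133, IEEE_SINGLE_BIAS))        # 6
print(to_represented(-3, HP3000_BIAS))       # 253
```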
Exponent fields of all zeros or all ones are reserved. If the exponent
of a floating-point number is all zeros and the mantissa is zero, the
number is regarded as zero. If the exponent of a floating-point number
is all zeros and the mantissa is not all zero, the number is regarded as
denormalized. If the exponent of a floating-point number is all ones and
the mantissa is zero, the number is regarded as a signed infinity. If
the exponent is all ones and the mantissa is not zero, the interpretation
is NaN (Not-a-Number, undefined).
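The reserved exponent values can be illustrated with a Python sketch
that classifies a 32-bit pattern. It assumes the IEEE single-precision
layout described later in this chapter (exponent in bits (1:8), mantissa
in bits (9:23)); the function name classify_single is an assumption made
for this example.

```python
def classify_single(word):
    """Classify a 32-bit IEEE single-precision bit pattern."""
    exponent = (word >> 23) & 0xFF    # bits (1:8)
    mantissa = word & 0x7FFFFF        # bits (9:23)
    if exponent == 0:                 # exponent field is all zeros
        return "zero" if mantissa == 0 else "denormalized"
    if exponent == 0xFF:              # exponent field is all ones
        return "infinity" if mantissa == 0 else "NaN"
    return "normalized"


for pattern in (0x00000000, 0x00000001, 0x7F800000, 0x7FC00000, 0x42C80000):
    print(f"${pattern:08X}: {classify_single(pattern)}")
```

The sign bit is ignored by the classification, so the negative zero and
negative infinity patterns fall into the same classes as their positive
counterparts.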
If any process attempts to operate on an infinity or a NaN, a system trap
may occur and data may be corrupted. Invalid operation is signaled when
the source is a signaling or a quiet NaN. The result is the destination
format's largest finite number with the sign of the source.
Any operation that involves a signaling NaN or invalid operation returns
a quiet NaN as the result when no trap occurs and a floating-point result
is to be delivered. If an operation is using one or two quiet NaNs as
input, it signals no exception; however, if a floating-point result is to
be delivered, a quiet NaN is returned that is the same as one of the
input NaNs.
IEEE Real Number Format. IEEE numbers conform to the format set up by
the Institute of Electrical and Electronics Engineers and the American
National Standards Institute (std 754-1985). Single-precision numbers
are one NM word, aligned on 32-bit boundaries. Double precision numbers
are two NM words, aligned on 64-bit boundaries.
NOTE In this manual, bit fields are described as (bit:length), where bit
is the first bit in the field and length is the number of
consecutive bits in the field. For example, "bits (11:3)" refers
to bits 11, 12, and 13. Bit 0 is the most significant bit.
IEEE numbers in MPE floating-point notation contain three fields:
Sign: The sign field is bit (0:1), the first bit of the first
word. A value of 0 indicates the number is positive,
and a value of 1 indicates the number is negative. The
sign bit is the only difference between a real number
value and its negative.
Exponent: The single-precision exponent field is bits (1:8) of
the first NM word, and is biased by 127. The
double-precision exponent field is bits (1:11) of the
first NM word, and is biased by 1023.
Mantissa: The single-precision mantissa field is bits (9:23). The
double-precision mantissa field is bits (12:52). MPE
stores the mantissa as normalized data represented as a
binary number of 23 bits for the single-precision
format, and 52 bits for the double-precision format,
with an assumed 1. leading the field.
A previous section, "Fields of a Real Number", explains biased exponent
and normalized mantissa.
IEEE Conversion Example. Consider converting an IEEE single-precision
floating-point number into a base-ten number using this formula:
(-1)^Sign * 2^(Exponent-127) * (1.0 + Mantissa + 2^-23)
where: Sign Bit (0:1), the sign field, is 0 if
number is positive, 1 if negative.
* is the symbol for multiplication.
Exponent Bits (1:8), the exponent field, is the
biased representation of the true
exponent.
+ is the symbol for addition.
Mantissa Bits (9:23) is the normalized form of
the mantissa, or fraction.
2^-23 is added for rounding.
The (base-ten) floating-point number 100.00 (hexadecimal $42c80000) is
represented as 0 10000101 10010000000000000000000. Using the formula, we
obtain the correct result as follows:
Table 2-3. Determining the Base-Ten Equivalent of an IEEE Real Number
    S(ign)   |   E(xponent)    |   M(antissa)
=   0        |   10000101      |   10010000000000000000000
---------------------------------------------------------------------------
    (-1)^S   | * 2^(E-127)     | * (1.0 + M + 2^-23)
=   (-1)^0   | * 2^(133-127)   | * (1.0 + 9/16 + 2^-23)
=   1        | * 64            | * (1.0 + 0.5625 + .00000011920929)
=            |   64            | * 1.56250011920929
=            |   100           |
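The conversion can be checked with a short Python sketch. The function
name decode_ieee_single is an assumption made for this example. The
sketch handles normalized numbers only, and it omits the 2^-23 rounding
term so that the result can be compared exactly with the value returned
by Python's struct module.

```python
import struct


def decode_ieee_single(word):
    """Convert a normalized IEEE single-precision bit pattern to a float."""
    sign = (word >> 31) & 0x1               # bit (0:1)
    exponent = (word >> 23) & 0xFF          # bits (1:8), biased by 127
    fraction = (word & 0x7FFFFF) / 2**23    # bits (9:23) as a binary fraction
    return (-1) ** sign * 2.0 ** (exponent - 127) * (1.0 + fraction)


word = 0x42C80000
print(decode_ieee_single(word))                          # 100.0
# Cross-check: reinterpret the same 32 bits as a big-endian float.
print(struct.unpack(">f", struct.pack(">I", word))[0])   # 100.0
```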
Figure 2-4 shows the bit format for floating-point real numbers in IEEE
single-precision format.
Figure 2-4. Bit Format: Single-Precision Real in IEEE Floating-Point Notation
Figure 2-5 shows the IEEE real number double-precision bit format.
Figure 2-5. Bit Format: Double-Precision Real in IEEE Floating-Point Notation
HP3000 Real Number Format. Single-precision HP3000 real numbers are 32
bits (2 CM words), and double-precision are 64 bits (4 CM words). When
stored in memory, HP3000 reals are aligned on CM word boundaries.
NOTE In this manual, bit fields are described as (bit:length), where bit
is the first bit in the field and length is the number of
consecutive bits in the field. For example, "bits (11:3)" refers
to bits 11, 12, and 13.
Real numbers in HP3000 floating-point notation contain three fields:
Sign: The sign field is bit (0:1) of the first word. A value
of 0 indicates the number is positive and a value of 1
indicates the number is negative. The sign is the only
difference between a real number value and its negative.
Exponent: The exponent field is bits (1:9) of the first CM word in
the single-precision and double-precision format. The
represented exponent range is 0 to 511. Exponents are
biased by +256.
Mantissa: The mantissa field is bits (10:6) of the first CM word
and bits (0:16) of the other words. MPE stores the
mantissa as normalized data of 22 bits for the
single-precision format, and 54 bits for the
double-precision format, with an assumed 1. leading the
field.
A previous section, "Fields of a Real Number", explains biased exponent
and normalized mantissa.
Figure 2-6 shows the HP3000 real number single-precision bit format.
Figure 2-6. Bit Format: Single-Precision Real in HP3000 Floating-point Notation
Figure 2-7 shows the HP3000 real number double-precision bit format.
Figure 2-7. Bit Format: Double-Precision Real in HP3000 Floating-point Notation
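The single-precision field layout described above can be sketched in
Python as follows. This is an illustration built only from the field
description in this section: the function name, the value formula (by
analogy with the IEEE formula), and the treatment of an all-zero word as
zero are assumptions, not a definitive statement of the HP3000 format.

```python
def decode_hp3000_single(word):
    """Decode a 32-bit HP3000 real from its sign, exponent, and mantissa.

    Assumed formula: (-1)^sign * 2^(exponent-256) * (1 + mantissa/2^22).
    """
    if word & 0x7FFFFFFF == 0:              # assumption: all zeros means zero
        return 0.0
    sign = (word >> 31) & 0x1               # bit (0:1)
    exponent = (word >> 22) & 0x1FF         # bits (1:9), biased by 256
    fraction = (word & 0x3FFFFF) / 2**22    # 22 mantissa bits, implied leading 1
    return (-1) ** sign * 2.0 ** (exponent - 256) * (1.0 + fraction)


# Sign 0, exponent 262 (true exponent 6), mantissa 1001 followed by zeros.
print(decode_hp3000_single(0x41A40000))     # 100.0
```

Note that the same value, 100.0, has a different bit pattern here than
in IEEE format, which is why NM programs must convert HP3000 data before
operating on it.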
Decimals. MPE V has system microcode instructions to handle packed
decimals. For compatibility, MPE XL has compiler library procedures that
run in NM and emulate the MPE V instruction set.
In MPE XL, three languages use decimal types. COBOL and RPG use packed
or unpacked decimals. BASIC has its own type, the floating-point
decimal.
In the decimal types, numbers are represented decimal digit by decimal
digit. The individual digits of the decimal number are each represented
in a BCD (Binary Coded Decimal) nibble. Each nibble is four bits long.
Figure 2-8 shows the bit format for each BCD nibble portion of a decimal.
Figure 2-8. Bit Format: BCD Nibble
Packed Decimal Format. Packed decimals represent numbers with BCD
(Binary Coded Decimal) nibbles. In packed decimals, each decimal digit
of the number is individually represented by a 4-bit BCD.
Decimals are always an even number of nibbles long.
Figure 2-8, above, shows the bit format for each BCD nibble portion of a
decimal.
The rightmost (least significant) nibble is for the sign. There are
three defined nibble combinations for the sign nibble. The three defined
codes are:
hexadecimal C (1100) for positive
hexadecimal D (1101) for negative
hexadecimal F (1111) for unsigned
Since each of the other nibbles represents the decimal digits 0 through
9, the valid nibble combinations are 0000 through 1001 for all but the
last nibble.
For example, to represent -52,194 as a packed decimal type, you would use
one nibble for each of the five digits and (the last) one for the sign:
     0101   0010   0001   1001   0100   1101
       5      2      1      9      4    D=negative
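Packing and unpacking can be sketched in Python as shown below. The
function names are assumptions made for this example, as is the choice
to pad with a leading zero nibble when the digit count would otherwise
produce an odd number of nibbles.

```python
SIGN_NIBBLES = {"+": 0xC, "-": 0xD, "unsigned": 0xF}


def pack_decimal(digits, sign="unsigned"):
    """Pack a string of decimal digits and a sign into BCD nibbles."""
    nibbles = [int(d) for d in digits] + [SIGN_NIBBLES[sign]]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)    # assumption: pad with a leading zero nibble
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((high << 4) | low for high, low in pairs)


def unpack_decimal(data):
    """Return the integer value held in a packed decimal."""
    nibbles = [n for byte in data for n in (byte >> 4, byte & 0xF)]
    sign_nibble = nibbles.pop()             # rightmost nibble is the sign
    if sign_nibble not in SIGN_NIBBLES.values() or any(n > 9 for n in nibbles):
        raise ValueError("not a valid packed decimal")
    magnitude = int("".join(str(n) for n in nibbles))
    return -magnitude if sign_nibble == 0xD else magnitude


packed = pack_decimal("52194", "-")
print(packed.hex())             # 52194d
print(unpack_decimal(packed))   # -52194
```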
In COBOL, the PICTURE (PIC) clause specifies the position of the decimal
point. For example, the PIC clause 999V99 specifies that three digits
will be followed by an implied decimal point and two more digits. If you
pass the digits 12345 to a variable defined with this PIC clause, its
value would be 123.45.
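The effect of an implied decimal point can be illustrated in Python. The
sketch does not parse PIC clauses; the function name and the
fraction_digits parameter (the number of 9s after the V) are assumptions
made for this example.

```python
from decimal import Decimal


def apply_implied_point(digits, fraction_digits):
    """Place an implied decimal point fraction_digits from the right."""
    return Decimal(int(digits)).scaleb(-fraction_digits)


# PIC 999V99: three digits, an implied decimal point, two more digits.
print(apply_implied_point("12345", 2))   # 123.45
```

The stored digits are unchanged; only their interpretation depends on
the position given by the clause.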
In COBOL and RPG, using packed decimal will probably make your program
more efficient than using unpacked. If you do use unpacked decimal, the
compiler usually converts to packed for calculations.
Figure 2-9 shows the bit format for the packed decimal.
Figure 2-9. Bit Format: Packed Decimal
Unpacked Decimal Format. COBOL and RPG represent numbers with packed and
unpacked decimal types. For an unpacked decimal, each decimal digit is
one byte long. Unpacked decimals are ASCII characters, interpreted by a
correspondence code. The bit format is the ASCII character format in
Figure 2-1. For more information, see the notes on COBOL and RPG,
"Formatting Data in Programs", later in this chapter.
Floating-Point Decimal Format. HP Business BASIC represents decimal and
short decimal types in floating-point decimal notation. The
floating-point decimal form is similar to the E notation used to
represent very small or very large numbers, as when 3.2E-27 is used to
represent the value 3.2 x 10^-27. The BASIC number is normalized (see
below).
A decimal in HP Business BASIC/XL is 64 bits long; a short decimal is 32
bits long. Table 2-4, below, shows a summary of the range and accuracy
of each.
Table 2-4. Range and Precision for Floating-Point Decimals
--------------------------------------------------------------------------------------------------
|            | BASIC Decimal                                  | BASIC Short Decimal              |
--------------------------------------------------------------------------------------------------
| Precision: | 12 digits                                      | 6 digits                         |
--------------------------------------------------------------------------------------------------
| Range:     | -9.99999999999E511 through -1.00000000000E-511 | -9.99999E63 through -1.00000E-63 |
|            | 0                                              | 0                                |
|            | 1.00000000000E-511 through 9.99999999999E511   | 1.00000E-63 through 9.99999E63   |
--------------------------------------------------------------------------------------------------
The representation of the value zero is a special case. To represent the
value zero, set all the bits to zero. Since the number is normalized, it
is assumed that the mantissa never begins with a zero unless the value of
zero is intended.
Fields of BASIC decimals: Floating-point decimals have three fields:
* Exponent.
* Mantissa.
* Sign.
The exponent field contains a signed integer, represented in twos
complement form. The decimal exponent field is the first 10 bits, bits
(0:10), and ranges from -511 to +511. The short decimal exponent field
is the first seven bits, bits (0:7), and ranges from -63 to +63.
NOTE In this manual, bit fields are described as (bit:length), where bit
is the first bit in the field and length is the number of
consecutive bits in the field. For example, "bits (11:3)" refers
to bits 11, 12, and 13. Bit 0 is the most significant bit.
In the mantissa field, each decimal digit of the number is individually
represented by a BCD (Binary Coded Decimal) nibble. Each nibble is four
bits long. (See Figure 2-8.) Since each nibble in this field represents
the decimal digits 0 through 9, the valid mantissa nibble combinations
are 0000 through 1001.
The number is normalized. That is,
* The decimal point is implied, or assumed to belong, immediately
following the first BCD digit of the mantissa field.
* The first BCD of the mantissa is never zero, unless you intend to
represent the number zero.
The mantissa field of a 64-bit decimal is bits (12:48). It has the
capacity for 12 digits, each represented in a 4-bit nibble. The mantissa
field of a 32-bit decimal is bits (8:24). It has the capacity for 6
digits, each represented in a 4-bit nibble.
The sign field of a 64-bit decimal is bits (60:4), which are the four
least significant bits, or the least significant BCD nibble. The
hexadecimal value C (1100) in the sign nibble indicates the number is
positive, and D (1101) indicates the number is negative.
The sign field of a 32-bit short decimal is a single bit, bit (7:1),
which immediately follows the exponent field. A value of 0 in the sign
bit indicates the number is positive, and a value of 1 indicates the
number is negative.
Figure 2-10 shows the bit format for the floating-point decimal.
Figure 2-10. Bit Format: Floating-Point Decimal
Figure 2-11 shows the bit format for the short floating-point decimal.
Figure 2-11. Bit Format: Short Floating-Point Decimal
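The 64-bit decimal fields described above can be sketched in Python as
follows. This is an illustration built only from the field positions
given in this section; the function name is an assumption, and bits 10
and 11, which the text does not assign to a field, are ignored.

```python
from decimal import Decimal


def decode_basic_decimal(word):
    """Decode a 64-bit floating-point decimal from its three fields."""
    if word == 0:                             # all bits zero represents zero
        return Decimal(0)
    exponent = (word >> 54) & 0x3FF           # bits (0:10), twos complement
    if exponent >= 0x200:
        exponent -= 0x400
    mantissa = (word >> 4) & 0xFFFFFFFFFFFF   # bits (12:48), 12 BCD nibbles
    sign_nibble = word & 0xF                  # bits (60:4)
    digits = f"{mantissa:012X}"               # one character per BCD nibble
    if not digits.isdigit():
        raise ValueError("mantissa nibble outside 0000 through 1001")
    sign = "-" if sign_nibble == 0xD else "+"
    # The decimal point is implied after the first BCD digit.
    return Decimal(f"{sign}{digits[0]}.{digits[1:]}E{exponent}")


# Exponent -27, mantissa digits 320000000000, sign nibble C (positive).
word = ((-27 & 0x3FF) << 54) | (0x320000000000 << 4) | 0xC
print(decode_basic_decimal(word) == Decimal("3.2E-27"))   # True
```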