Data Types Conversion Programmer's Guide: HP 3000 MPE/iX Computer Systems
Chapter 2: Formatting Data Types
Recognizing Primitive Data Types
Data is an abstraction of information. Data must be structured in a form that the computer is designed to process; data conversion is the translation of information to a form acceptable to the computer. The 900 Series HP 3000 Computer Systems instruction set is designed to operate on certain fundamental data types. The following data types are recognized by MPE XL and its subsystems:

  - character (ASCII and EBCDIC)
  - integer
  - real (floating-point)
  - decimal (packed, unpacked, and floating-point decimal)
Each data type requires a specific bit format. In this manual, bit fields are described as (bit:length), where bit is the first bit in the field and length is the number of consecutive bits in the field. For example, "bits (13:3)" refers to bits 13, 14, and 15. Bit 0 is the most significant bit.

Character code formats are primitive data types. Characters are the letters, numbers, and symbols on your keyboard. The computer relates each alphanumeric character to an 8-bit (one byte) binary number, according to a correspondence code. Some of the characters are easily displayable, like +, ?, 8, and z; some are not, like a blank space or the carriage return. MPE supports the two common American English character codes: ASCII (American Standard Code for Information Interchange) and EBCDIC (Extended Binary Coded Decimal Interchange Code). Several natural language types are also supported. See Appendix A for ASCII and EBCDIC codes and equivalents. Character data types are useful for storing strings of symbols like names, addresses, or identification numbers, and for reading the keyboard or writing to the screen. Remember, variables saved as data type character are recognized by the computer as symbols, not as numeric values.

MPE and its subsystems use the ASCII data type to represent character data. ASCII is the format adopted by ANSI, the American National Standards Institute. Most MPE interfaces use ASCII to accept or return character data. Appendix A shows the ASCII and EBCDIC character code values, along with their decimal, octal, and hexadecimal equivalents. ASCII is used in this guide as the name of a data type. The ASCII data type corresponds to the ASCII character code format. The codes for byte values in the range 0 to 127 conform to the ASCII standard format. Byte values in the range 128 to 255 are interpreted using Hewlett-Packard's extended ROMAN8 character set. MPE XL and its subsystems use values in this range to support extended (8-bit) character sets. Figure 2-1 “Bit Format: ASCII Character” shows the ASCII data type bit format.

EBCDIC is another coding format widely used in the computer industry for character data. Like ASCII, it is based on the byte. EBCDIC is used in this guide as the name of a data type. The EBCDIC data type corresponds to the EBCDIC character code format for byte values in the range 0 to 255. Figure 2-2 “Bit Format: EBCDIC Character” shows the bit format for the EBCDIC data type.

MPE XL subsystems support three primitive data types for numbers:

  - integer
  - real
  - decimal
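The (bit:length) notation described at the start of this section maps directly onto shift-and-mask arithmetic. The following C sketch is an illustration only (the helper field16 is hypothetical, not an MPE interface); it extracts a (bit:length) field from a 16-bit word, remembering that bit 0 is the most significant bit:

    #include <stdio.h>
    #include <stdint.h>

    /* Extract the field (bit:length) from a 16-bit word, where bit 0
       is the MOST significant bit, as in this manual's notation. */
    static unsigned field16(uint16_t word, unsigned bit, unsigned length)
    {
        unsigned shift = 16u - bit - length;        /* distance from the LSB */
        return (word >> shift) & ((1u << length) - 1u);
    }

    int main(void)
    {
        uint16_t w = 0x00FF;                  /* bits 8 through 15 are ones */
        /* Bits (13:3) are bits 13, 14, and 15: all ones here, so value 7. */
        printf("bits (13:3) = %u\n", field16(w, 13, 3));
        return 0;
    }

Because bit 0 is numbered from the most significant end, the shift distance is measured from the opposite (least significant) end of the word.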
An integer is any positive or negative whole number, including zero. Integers are useful for counting and for incrementing in loops. Signed integers are a useful form for exchanging numeric data between languages. MPE XL integers can be 8, 16, 32, or 64 bits long. They can be unsigned or signed (+ or -). Signed integers are represented in twos complement form.

Table 2-1 MPE XL Integer Types
The chart below shows the representation of the whole number (base-ten) 73 as an unsigned integer, a signed positive number, and a signed negative number (all shown in 8-bit form):

  Unsigned 73:                    01001001
  Signed +73:                     01001001
  Signed -73 (twos complement):   10110111
Unsigned integers are stored in the computer in their base-two form. If you are reading or writing unsigned integers in a language, the compiler converts for you, according to the formatting conventions of the individual language. An unsigned n-bit number can represent any value from 0 to 2^n - 1.

Reading an Unsigned Integer: One method of reading an unsigned integer as a base-ten value is to consider the bits as columns whose values are powers of two. The rightmost (least significant) bit is the units column and has a weight of 2^0, or 1. Going toward the left (the most significant bit), the columns have progressively greater weight: 2^0, 2^1, 2^2, ..., 2^(n-1). The decimal-based value of an unsigned binary number is computed by multiplying the value in each column by the weight of the column, and then adding all the results. An unsigned integer represented with ones in the 2^0, 2^3, and 2^6 columns and zeros in all the other columns would be computed as follows: 1*(2^0) + 1*(2^3) + 1*(2^6) = 1 + 8 + 64 = 73.

Writing an Unsigned Integer: One method of manually determining the unsigned integer representation of a base-ten value is to use successive subtraction. For example, the largest power of 2 that is less than or equal to the value of decimal-base 73 is 2^6, or 64. Subtracting 64 from 73 leaves a remainder of 9. The largest power of 2 that is less than or equal to 9 is 2^3, or 8. Subtracting 8 from 9 leaves a remainder of 1. The only power of 2 that is less than or equal to 1 is 2^0, or 1. This leaves a remainder of 0, so the computation is finished. Thus, 73 is represented in binary with a 1 in the 2^0, the 2^3, and the 2^6 columns and a zero in all the others.

Signed integers are stored in the computer in twos complement form. If you are reading or writing signed integers in a language, the compiler converts for you, according to the formatting conventions of the individual language. A signed n-bit integer in twos complement form can represent any value from -(2^(n-1)) to +2^(n-1) - 1. When the n-bit positive integer i is added to its n-bit integer negative (complement), -i, and both are in twos complement form, the result is always an n-bit zero.

Reading a Signed Integer: The computer represents both positive and negative numbers in twos complement form much the same way that it would represent an unsigned integer: beginning at the rightmost (least significant) bit and going toward the left, the columns have progressively greater weight: 2^0, 2^1, 2^2, ..., 2^(n-1). The only difference is that the most significant bit of a twos complement number is negative. That is, it has a weight of -(2^(n-1)). To manually convert a signed integer in twos complement form to a base-ten integer, you can use the column method explained in Unsigned Integers, above. However, you give the leftmost column of a twos complement number a weight of -(2^(n-1)). In the example below, this method is used to interpret the signed binary integers 01010101 and 10101010, written in twos complement form, as decimal-based integers:

  01010101 = 2^6 + 2^4 + 2^2 + 2^0    =  64 + 16 + 4 + 1   = +85
  10101010 = -(2^7) + 2^5 + 2^3 + 2^1 = -128 + 32 + 8 + 2  = -86
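Both the column method and successive subtraction are easy to verify in code. The following C sketch is an illustration only, not an MPE library routine; it reads the bit pattern 01001001 by the column method, then writes 73 back out by successive subtraction:

    #include <stdio.h>

    int main(void)
    {
        /* Reading: 01001001 has ones in the 2^0, 2^3, and 2^6 columns. */
        int value = (1 << 0) + (1 << 3) + (1 << 6);
        printf("01001001 = %d\n", value);               /* prints 73 */

        /* Writing: subtract the largest power of two at each column. */
        int n = 73;
        for (int bit = 7; bit >= 0; bit--) {
            int weight = 1 << bit;
            if (n >= weight) { putchar('1'); n -= weight; }
            else             { putchar('0'); }
        }
        putchar('\n');                                  /* prints 01001001 */
        return 0;
    }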
Writing a Signed Integer: Converting a signed base-ten number to twos complement form is not difficult. You can represent the positive signed integers just as explained in Unsigned Integers, above. You can represent a negative integer quickly and easily using the following technique, which takes advantage of the properties of binary numbers: First, ignoring the sign, represent the value as an unsigned binary integer. Next, reverse all the 0s and 1s. Finally, add 1 to the result. Thus, the twos complement of 10101010 is (01010101 + 1), or 01010110.

You can check your conversion by adding the positive and negative numbers (in twos complement form) to see if they total zero. From the example above, notice that adding the 8-bit integer 10101010 to its twos complement, 01010110, yields a 9-bit result, 100000000. However, the system defines the result type to be 8-bit integer and recognizes only the 8 zeros, so the result is zero.

Figure 2-3 “Bit Format: 32-Bit Integer” shows bit formats for the 32-bit integer type.

A real number is a value in the set of zero and the positive or negative rational numbers. Signed integers and fractions are included, although fractions may be approximated. Imaginary and complex numbers are not included in the set of real numbers, although high-level languages may have constructs for storing and working with them. The real data type is a useful form for representing very large or very small values. Special formats are reserved to represent zero, infinity, and NaN (not a number).

The real data type represents real numbers by using a type of floating-point, or scientific, notation. In this notation, you generally express a very large or very small number as a fraction multiplied by a power of the number base. For example, the base-ten number .000025 could be expressed as +.25 * 10^-4. The general floating-point, or scientific notation, form is:

  SfF * (B ** SeE)

where:

  Sf  is the sign of the fraction
  F   is the fraction
  B   is the number base
  Se  is the sign of the exponent
  E   is the exponent
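The invert-and-add-one technique, the wraparound to an n-bit zero, and the negative weight of the leftmost column can all be demonstrated in a few lines of C. This is an illustration only; C's uint8_t and int8_t types stand in for 8-bit integers:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint8_t x = 0xAA;                  /* 10101010 */
        uint8_t neg = (uint8_t)(~x + 1u);  /* reverse the bits, then add 1 */
        printf("complement of 10101010 = 0x%02X\n", neg); /* 0x56 = 01010110 */

        /* Adding a number to its twos complement wraps to zero in 8 bits:
           the ninth (carry) bit falls outside the 8-bit result type. */
        printf("10101010 + 01010110 = 0x%02X\n", (uint8_t)(x + neg)); /* 0x00 */

        /* Reading 10101010 with a -(2^7) weight on the leftmost column: */
        printf("10101010 as signed = %d\n", (int8_t)x);   /* -86 */
        return 0;
    }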
You can represent real numbers in four ways: you can choose either IEEE or HP3000 format, and use either single-precision or double-precision size. MPE XL recognizes two formats for storing floating-point real numbers: IEEE and HP3000. Programs compiled in Native Mode (NM) use IEEE as the default. Programs compiled in Compatibility Mode (CM) use HP3000, the MPE XL emulation of the MPE V/E system floating-point format. NM programs accessing HP3000 data must either specify a special compiler option or convert CM data to NM format before performing operations. You can represent single-precision (32-bit) or double-precision (64-bit) real numbers in both IEEE and HP3000 notation. Table 2-2 “Ranges and Accuracies for Floating-Point Real Numbers” shows a summary of the range and accuracy of each.

Table 2-2 Ranges and Accuracies for Floating-Point Real Numbers
In MPE XL format, real numbers have three fields:

  - a sign field
  - a biased exponent field
  - a normalized mantissa field
Different representations of real numbers have the three fields aligned on different boundaries. In all formats, the sign field is the first bit, the mantissa is in normalized form, and the exponent is biased. The sign field, bit (0:1), is 0 if the number is positive, 1 if negative.

Mantissas are represented in normalized form. That is, the leading one is stripped and the binary point is not explicitly expressed. Each expressed mantissa, then, has an implied leading one and binary point. For example, a mantissa represented by 10101010101010101010101 is interpreted as the value 1.10101010101010101010101.

The exponents of real numbers are biased. This means that both positive and negative true exponents are represented using only unsigned binary integers. The bias amount, or excess, is the difference between the true exponent and the represented exponent. The negative true exponents correspond to the lower range of the represented exponents. The positive true exponents correspond to the upper range of the represented exponents. The true exponent zero corresponds to the midpoint in the range of the represented exponents. For example, consider an exponent field where the true exponent is T, the represented exponent is E, and the bias is b. Then T = E - b, and E = T + b.

Exponent fields of all zeros or all ones are reserved. If the exponent of a floating-point number is all zeros and the mantissa is zero, the number is regarded as zero. If the exponent of a floating-point number is all zeros and the mantissa is not all zero, the number is regarded as denormalized. If the exponent of a floating-point number is all ones and the mantissa is zero, the number is regarded as a signed infinity. If the exponent is all ones and the mantissa is not zero, the interpretation is NaN (Not-a-Number, undefined).

If any process attempts to operate on an infinity or a NaN, a system trap may occur and data may be corrupted. Invalid operation is signaled when the source is a signaling or a quiet NaN; in that case the result is the destination format's largest finite number, with the sign of the source. Any operation that involves a signaling NaN or an invalid operation returns a quiet NaN as the result when no trap occurs and a floating-point result is to be delivered. If an operation uses one or two quiet NaNs as input, it signals no exception; however, if a floating-point result is to be delivered, a quiet NaN is returned that is the same as one of the input NaNs.

IEEE numbers conform to the format set up by the Institute of Electrical and Electronics Engineers and the American National Standards Institute (Std 754-1985). Single-precision numbers are one NM word, aligned on 32-bit boundaries. Double-precision numbers are two NM words, aligned on 64-bit boundaries.
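The reserved exponent patterns can be recognized by inspecting the fields directly. The following C sketch is an illustration; it assumes the host compiler stores float in IEEE single-precision format, as NM compilers do:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <math.h>

    /* Classify an IEEE single by its exponent and mantissa fields. */
    static const char *classify(float f)
    {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);         /* reinterpret, do not convert */
        uint32_t exponent = (bits >> 23) & 0xFFu;   /* bits (1:8)  */
        uint32_t mantissa = bits & 0x7FFFFFu;       /* bits (9:23) */

        if (exponent == 0)     return mantissa == 0 ? "zero" : "denormalized";
        if (exponent == 0xFFu) return mantissa == 0 ? "infinity" : "NaN";
        return "normalized";
    }

    int main(void)
    {
        printf("%s\n", classify(0.0f));      /* zero       */
        printf("%s\n", classify(100.0f));    /* normalized */
        printf("%s\n", classify(INFINITY));  /* infinity   */
        printf("%s\n", classify(NAN));       /* NaN        */
        return 0;
    }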
IEEE numbers in MPE floating-point notation contain three fields:

  - a sign field: bit (0:1)
  - a biased exponent field: bits (1:8) in single precision, bits (1:11) in double precision
  - a normalized mantissa field: bits (9:23) in single precision, bits (12:52) in double precision
A previous section, “Fields of a Real Number”, explains biased exponent and normalized mantissa. Consider converting an IEEE single-precision floating-point number into a base-ten number using this formula:

  (-1)^Sign * 2^(Exponent - 127) * (1.0 + Mantissa * 2^-23)

where:

  Sign      is the value of the sign bit, bit (0:1)
  Exponent  is the biased exponent field, bits (1:8), read as an unsigned integer
  Mantissa  is the mantissa field, bits (9:23), read as an unsigned integer
The (base-ten) floating-point number 100.00 (hexadecimal $42C80000) is represented as 0 10000101 10010000000000000000000. Using the formula, we obtain the correct result as follows:

Table 2-3 Determining the Base-Ten Equivalent of an IEEE Real Number

  Sign      0                                  (-1)^0 = 1
  Exponent  10000101 = 133                     2^(133 - 127) = 2^6 = 64
  Mantissa  10010000000000000000000 = 4718592  1.0 + (4718592 * 2^-23) = 1.5625
  Result    1 * 64 * 1.5625 = 100.0
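The same computation can be written out in C. The sketch below is an illustration only; it decodes $42C80000 field by field and applies the formula (the shift trick for the power of two assumes, as in this example, that the unbiased exponent is not negative):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t bits = 0x42C80000u;     /* 0 10000101 10010000000000000000000 */
        unsigned sign     = bits >> 31;              /* 0       */
        unsigned exponent = (bits >> 23) & 0xFFu;    /* 133     */
        uint32_t mantissa = bits & 0x7FFFFFu;        /* 4718592 */

        /* (-1)^Sign * 2^(Exponent - 127) * (1.0 + Mantissa * 2^-23) */
        double value = (sign ? -1.0 : 1.0)
                     * (double)(1u << (exponent - 127u))
                     * (1.0 + mantissa / 8388608.0); /* 8388608 = 2^23 */
        printf("%f\n", value);                       /* 100.000000 */
        return 0;
    }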
Figure 2-4 “Bit Format: Single-Precision Real in IEEE Floating-Point Notation” shows the bit format for floating-point real numbers in IEEE single-precision format. Figure 2-5 “Bit Format: Double-Precision Real in IEEE Floating-Point Notation” shows the IEEE real number double-precision bit format.

Single-precision HP3000 real numbers are 32 bits (2 CM words), and double-precision HP3000 real numbers are 64 bits (4 CM words). When stored in memory, HP3000 reals are aligned on CM word boundaries.
Real numbers in HP3000 floating-point notation contain three fields:

  - a sign field
  - a biased exponent field
  - a normalized mantissa field
A previous section, “Fields of a Real Number”, explains biased exponent and normalized mantissa.
Figure 2-6 “Bit Format: Single-Precision Real in HP3000 Floating-point Notation” shows the HP3000 real number single-precision bit format. Figure 2-7 “Bit Format: Double-Precision Real in HP3000 Floating-point Notation” shows the HP3000 real number double-precision bit format.

MPE V has system microcode instructions to handle packed decimals. For compatibility, MPE XL has compiler library procedures that run in NM and emulate the MPE V instruction set. In MPE XL, three languages use decimal types. COBOL and RPG use packed or unpacked decimals. BASIC has its own type, the floating-point decimal.

In the decimal types, numbers are represented decimal digit by decimal digit. The individual digits of the decimal number are each represented in a BCD (Binary Coded Decimal) nibble. Each nibble is four bits long. Figure 2-8 “Bit Format: BCD Nibble” shows the bit format for each BCD nibble portion of a decimal.

Packed decimals represent numbers with BCD nibbles: each decimal digit of the number is individually represented by a 4-bit BCD nibble. Packed decimals are always an even number of nibbles long. The rightmost (least significant) nibble is for the sign. There are three defined nibble combinations for the sign nibble:

  - 1100 (hexadecimal C) indicates a positive number
  - 1101 (hexadecimal D) indicates a negative number
  - 1111 (hexadecimal F) indicates a positive (unsigned) number
Since each of the other nibbles represents the decimal digits 0 through 9, the valid nibble combinations are 0000 through 1001 for all but the sign nibble. For example, to represent -52,194 as a packed decimal type, you would use one nibble for each of the five digits and one (the last) for the sign (a C illustration of this packing appears after Table 2-4, below):

  0101   0010   0001   1001   0100   1101
  5      2      1      9      4      D = negative

In COBOL, the PICTURE (PIC) clause specifies the position of the decimal point. For example, the PIC clause 999V99 specifies that three digits will be followed by an implied decimal point and two more digits. If you pass the digits 12345 to a variable defined with this PIC clause, its value would be 123.45.

In COBOL and RPG, using packed decimal will probably make your program more efficient than using unpacked. If you do use unpacked decimal, the compiler usually converts to packed for calculations. Figure 2-9 “Bit Format: Packed Decimal” shows the bit format for the packed decimal.

COBOL and RPG represent numbers with packed and unpacked decimal types. For an unpacked decimal, each decimal digit is one byte long. Unpacked decimals are ASCII characters, interpreted by a correspondence code. The bit format is the ASCII character format in Figure 2-1 “Bit Format: ASCII Character”. For more information, see the notes on COBOL and RPG in “Formatting Data in Programs”, later in this chapter.

HP Business BASIC represents decimal and short decimal types in floating-point decimal notation. The floating-point decimal form is similar to the E notation used to represent very small or very large numbers, as when 3.2E-27 is used to represent the value 3.2 x 10^-27. The BASIC number is normalized (see below). A decimal in HP Business BASIC/XL is 64 bits long; a short decimal is 32 bits long. Table 2-4 “Range and Precision for Floating-Point Decimals”, below, shows a summary of the range and accuracy of each.

Table 2-4 Range and Precision for Floating-Point Decimals
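The nibble layout in the -52,194 example above can be reproduced with a few shifts. The following C sketch is an illustration only; it packs the five digits and the sign nibble into three bytes:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const char *digits = "52194";        /* magnitude of -52,194 */
        uint8_t nibbles[6];
        uint8_t packed[3];                   /* six nibbles = three bytes */

        for (int i = 0; i < 5; i++)
            nibbles[i] = (uint8_t)(digits[i] - '0'); /* one BCD nibble per digit */
        nibbles[5] = 0xD;                    /* sign nibble: 1101 = negative */

        for (int i = 0; i < 3; i++)
            packed[i] = (uint8_t)((nibbles[2*i] << 4) | nibbles[2*i + 1]);

        printf("%02X %02X %02X\n", packed[0], packed[1], packed[2]); /* 52 19 4D */
        return 0;
    }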
The representation of the value zero is a special case. To represent the value zero, set all the bits to zero. Since the number is normalized, it is assumed that the mantissa never begins with a zero unless the value of zero is intended.

Fields of BASIC decimals: Floating-point decimals have three fields:

  - an exponent field
  - a mantissa field
  - a sign field
The exponent field contains a signed integer, represented in twos complement form. The decimal exponent field is the first 10 bits, bits (0:10), and ranges from -511 to +511. The short decimal exponent field is the first seven bits, bits (0:7), and ranges from -63 to +63.
In the mantissa field, each decimal digit of the number is individually represented by a BCD (Binary Coded Decimal) nibble. Each nibble is four bits long. (See Figure 2-8 “Bit Format: BCD Nibble”.) Since each nibble in this field represents the decimal digits 0 through 9, the valid mantissa nibble combinations are 0000 through 1001. The number is normalized. That is, the first (most significant) mantissa digit is never zero unless the value of the number itself is zero.
The mantissa field of a 64-bit decimal is bits (12:48). It has the capacity for 12 digits, each represented in a 4-bit nibble. The mantissa field of a 32-bit short decimal is bits (8:24). It has the capacity for 6 digits, each represented in a 4-bit nibble.

The sign field of a 64-bit decimal is bits (60:4), which are the four least significant bits, or the least significant BCD nibble. The hexadecimal value C (1100) in the sign nibble indicates the number is positive, and D (1101) indicates the number is negative. The sign field of a 32-bit short decimal is bit (7:1). A value of 0 in the sign bit indicates the number is positive, and a value of 1 indicates the number is negative.

Figure 2-10 “Bit Format: Floating-Point Decimal” shows the bit format for the floating-point decimal. Figure 2-11 “Bit Format: Short Floating-Point Decimal” shows the bit format for the short floating-point decimal.
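The field boundaries above can be exercised with a short decoder. The following C sketch is an illustration only: the test value is hypothetical, hand-built to match the documented layout, and bit 0 is treated as the most significant bit of the 64-bit word:

    #include <stdio.h>
    #include <stdint.h>

    /* Decode the documented fields of a 64-bit BASIC decimal:
       exponent in bits (0:10), twos complement; twelve BCD mantissa
       digits in bits (12:48); sign nibble in bits (60:4). */
    static void decode(uint64_t d)
    {
        int exponent = (int)((d >> 54) & 0x3FFu);   /* bits (0:10) */
        if (exponent & 0x200) exponent -= 0x400;    /* sign-extend 10 bits */

        unsigned sign = (unsigned)(d & 0xFu);       /* bits (60:4) */
        printf("exponent = %d, sign = %s, digits =", exponent,
               sign == 0xDu ? "negative" : "positive");

        for (int i = 0; i < 12; i++)                /* bits (12:48) */
            printf(" %u", (unsigned)((d >> (48 - 4 * i)) & 0xFu));
        printf("\n");
    }

    int main(void)
    {
        /* Hypothetical encoding: mantissa 500000000000, exponent 0, sign D. */
        decode(0x000500000000000DULL);
        return 0;
    }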