Concepts of Classes of Data [ Micro Focus COBOL Language Reference ] MPE/iX 5.0 Documentation
Concepts of Classes of Data
The seven categories of data items (see the section The PICTURE Clause in
the chapter Program Definition) are grouped into three classes:
alphabetic, numeric, and alphanumeric. For alphabetic and numeric, the
classes and categories are equivalent. The alphanumeric class includes
the categories of alphanumeric edited, numeric edited and alphanumeric
(without editing). Every elementary item except for an index data item, a
pointer, and a procedure-pointer belongs to one of the classes and
further to one of the categories. The class of a group item is treated
at object time as alphanumeric regardless of the class of elementary
items subordinate to that group item.
For ANS85 only: Every data item that is a function is an elementary
item, and belongs to one of the categories alphanumeric or numeric, and
to the corresponding class; the category of each function is determined
by the definition of the function. The definition is made in these
specifications. (See the section Intrinsic Functions in the chapter
Program Definition.)
The following table depicts the relationship of the class and categories
of data items.
Table 2-3: Data Levels, Classes and Categories
--------------------------------------------------------------------------------------
| Level of Item    | Class        | Category                                         |
--------------------------------------------------------------------------------------
| Elementary       | Alphabetic   | Alphabetic                                       |
--------------------------------------------------------------------------------------
|                  | Numeric      | Numeric                                          |
|                  |              | Internal Floating-point (MF, OSVS and VSC2 only) |
|                  |              | External Floating-point (MF, OSVS and VSC2 only) |
--------------------------------------------------------------------------------------
|                  | Alphanumeric | Numeric Edited                                   |
|                  |              | Alphanumeric Edited                              |
|                  |              | Alphanumeric                                     |
|                  |              | DBCS (MF and VSC2 only)                          |
--------------------------------------------------------------------------------------
| Non-Elementary   | Alphanumeric | Alphabetic                                       |
| Group            |              | Numeric                                          |
|                  |              | Internal Floating-point (MF, OSVS and VSC2 only) |
|                  |              | Numeric Edited                                   |
|                  |              | Alphanumeric Edited                              |
|                  |              | Alphanumeric                                     |
|                  |              | External Floating-point (MF, OSVS and VSC2 only) |
|                  |              | DBCS (MF and VSC2 only)                          |
--------------------------------------------------------------------------------------
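As a rough illustration of the table, the following sketch declares one elementary item of each standard category inside a group. The program and data-names are hypothetical and do not come from this manual; the category of each item follows from its PICTURE character-string.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CATDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; one item per standard category.
       01  CUSTOMER-RECORD.
      *    Class alphabetic, category alphabetic.
           05  CUST-INITIALS     PIC A(3).
      *    Class numeric, category numeric.
           05  CUST-BALANCE      PIC S9(5)V99.
      *    Class alphanumeric, category numeric edited.
           05  CUST-BALANCE-OUT  PIC ZZ,ZZ9.99-.
      *    Class alphanumeric, category alphanumeric edited.
           05  CUST-DATE-OUT     PIC XX/XX/XX.
      *    Class alphanumeric, category alphanumeric.
           05  CUST-CODE         PIC X(8).
       PROCEDURE DIVISION.
      * CUSTOMER-RECORD is a group item, so this MOVE treats it
      * as alphanumeric and fills every byte with spaces,
      * whatever the classes of its subordinate items.
           MOVE SPACES TO CUSTOMER-RECORD.
           STOP RUN.
```

Because the group MOVE ignores the subordinate descriptions, CUST-BALANCE no longer contains valid numeric data afterwards and would need to be initialized before being used in arithmetic.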
Algebraic Signs
Algebraic signs fall into two categories:
1. operational signs, which are associated with signed numeric data
items and signed numeric literals to indicate their algebraic
properties;
2. editing signs, which appear on edited reports to identify the sign
of the item.
The SIGN clause permits the programmer to state explicitly the location
of the operational sign. The clause is optional; if it is not used,
operational signs will be represented as described in the section
Selection of Character Representation and Radix.
Editing signs are inserted into a data item through the use of the sign
control symbols of the PICTURE clause.
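The distinction between the two kinds of sign can be sketched as follows. The data-names are hypothetical; the example only assumes standard SIGN clause and PICTURE editing behavior.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SIGNDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names. The S in each PICTURE requests an
      * operational sign; the SIGN clause states where it is held.
       01  AMT-DEFAULT    PIC S9(4).
       01  AMT-LEADING    PIC S9(4) SIGN IS LEADING.
       01  AMT-SEPARATE   PIC S9(4) SIGN IS TRAILING SEPARATE.
      * Editing signs come from PICTURE sign control symbols.
       01  AMT-MINUS-OUT  PIC ZZZ9-.
       01  AMT-CR-OUT     PIC ZZZ9CR.
       PROCEDURE DIVISION.
           MOVE -25 TO AMT-DEFAULT AMT-LEADING AMT-SEPARATE.
      * Moving to an edited item inserts the editing sign.
           MOVE AMT-DEFAULT TO AMT-MINUS-OUT AMT-CR-OUT.
           DISPLAY AMT-MINUS-OUT.
           DISPLAY AMT-CR-OUT.
           STOP RUN.
```

The three signed items hold the same value and differ only in where the operational sign is stored. The two edited items carry the sign as a printable "-" or "CR" suffix.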
Standard Alignment Rules
The standard rules for positioning data within an elementary item depend
on the category of the receiving item. These rules are:
1. If the receiving data item is described as numeric:
a. the data is aligned by decimal point and is moved to the
receiving character positions with zero fill or truncation
on either end as required
b. when an assumed decimal point is not explicitly specified,
the data item is treated as if it had an assumed decimal
point immediately following its rightmost character and is
aligned as in paragraph a. above.
2. If the receiving data item is a numeric edited data item, the data
moved to the edited item is aligned by decimal point with zero
fill or truncation at either end as required within the receiving
character positions of the data item, except where editing
requirements cause replacement of the leading zeros.
3. If the receiving data item is alphanumeric (other than a numeric
edited data item), alphanumeric edited or alphabetic, the sending
data is moved to the receiving character position and aligned at
the leftmost character position in the data item with space fill
or truncation to the right, as required.
4. For MF, OSVS, and VSC2 only: If the receiving data item is
external floating-point, the leftmost non-zero digit, if one
exists, is aligned on the leftmost digit position; the exponent
is adjusted accordingly.
If the JUSTIFIED clause is specified for the receiving item, these
standard rules are modified as described in the section The JUSTIFIED
Clause in the chapter Program Definition.
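Rules 1 to 3 can be illustrated with a few MOVE statements. This is a minimal sketch with hypothetical data-names; the values noted in the comments are the contents expected under the rules above, with no JUSTIFIED clause in effect.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ALIGNMOV.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names chosen for this sketch.
       01  SRC-NUM      PIC 9(3)V9(3) VALUE 123.456.
       01  SRC-TEXT     PIC X(6)      VALUE "ABCDEF".
       01  DST-NUM      PIC 9(2)V9.
       01  DST-WIDE     PIC 9(5)V9(4).
       01  DST-EDITED   PIC ZZ,ZZ9.9.
       01  DST-SHORT    PIC X(4).
       01  DST-LONG     PIC X(8).
       PROCEDURE DIVISION.
      * Rule 1: decimal-point alignment, truncation at both
      * ends. DST-NUM holds the value 23.4.
           MOVE SRC-NUM TO DST-NUM.
      * Rule 1: zero fill at both ends. Value 00123.4560.
           MOVE SRC-NUM TO DST-WIDE.
      * Rule 2: aligned as above, then leading zeros are
      * replaced by the editing. DST-EDITED holds "   123.4".
           MOVE SRC-NUM TO DST-EDITED.
      * Rule 3: left alignment, truncation on the right.
      * DST-SHORT holds "ABCD".
           MOVE SRC-TEXT TO DST-SHORT.
      * Rule 3: left alignment, space fill on the right.
      * DST-LONG holds "ABCDEF  ".
           MOVE SRC-TEXT TO DST-LONG.
           STOP RUN.
```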
Item Alignment for Increased Object-code Efficiency
Some computer memories are organized so that natural addressing
boundaries exist in the computer memory (for example, word boundaries,
half-word boundaries, byte boundaries). The way in which data is stored
is determined by the object program, and need not respect these natural
boundaries.
However, certain uses of data (for example, in arithmetic operations or
in subscripting) can be facilitated if the data is stored so as to be
aligned on these boundaries. Specifically, additional machine operations
in the object program can be required for the accessing and storage of
data if portions of two or more data items appear between adjacent
natural boundaries, or if certain natural boundaries divide a single data
item.
Data items which are aligned on these natural boundaries in such a way as
to avoid additional machine operations are defined to be synchronized. A
synchronized item is assumed to be introduced and carried in that form;
conversion to synchronized form occurs only during the execution of a
statement (other than READ or WRITE) which stores data in the item.
Synchronization can be accomplished in two ways:
1. by use of the SYNCHRONIZED clause
2. by organizing the data suitably on the appropriate natural
boundaries without the use of the SYNCHRONIZED clause.
When the SYNCHRONIZED clause is used, the resulting special alignment
within a group can affect the results of statements in which the group is
used as an operand. The effect of the implicit FILLER and the semantics
of any statement referencing these groups are described later in this
chapter.
Selection of Character Representation and Radix
The value of a numeric item (defined as numeric by its PICTURE, see the
section The PICTURE Clause - Numeric Data Rules in the chapter Program
Definition) can be represented in the computer's storage in either binary
or decimal form depending on the USAGE clause of the declaration (see the
section The USAGE Clause in the chapter Program Definition). These
numeric formats are:
* DISPLAY
* COMPUTATIONAL, COMP, BINARY (ANS85), COMPUTATIONAL-4 or COMP-4
  (OSVS) (VSC2)
* COMPUTATIONAL-3, COMP-3 (OSVS, VSC2, MF and XOPEN) or
  PACKED-DECIMAL (ANS85)
* For MF only: COMPUTATIONAL-5, COMP-5, COMPUTATIONAL-X or COMP-X
* For OSVS, VSC2, and MF only: COMPUTATIONAL-1, COMPUTATIONAL-2
* For VSC2 and MF only: POINTER
* For MF only: PROCEDURE-POINTER
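For orientation, the formats listed above might be declared as in the following sketch. The data-names are hypothetical, and the dialect notes in the comments simply repeat the markings in the list; a compiler that does not support a given dialect may reject the corresponding line.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. USAGEDEF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names. Items marked with a dialect are
      * extensions and may be rejected by other compilers.
       01  N-DISPLAY    PIC S9(4) USAGE DISPLAY.
       01  N-COMP       PIC S9(4) USAGE COMP.
      * ANS85 spellings.
       01  N-BINARY     PIC S9(4) USAGE BINARY.
       01  N-PACKED     PIC S9(4) USAGE PACKED-DECIMAL.
      * OSVS, VSC2, MF and XOPEN spelling of packed decimal.
       01  N-COMP-3     PIC S9(4) USAGE COMP-3.
      * MF only.
       01  N-COMP-5     PIC S9(4) USAGE COMP-5.
       01  N-COMP-X     PIC 9(4)  USAGE COMP-X.
      * OSVS, VSC2 and MF only: floating-point, no PICTURE.
       01  N-FLOAT-1    USAGE COMP-1.
       01  N-FLOAT-2    USAGE COMP-2.
      * VSC2 and MF only.
       01  P-DATA       USAGE POINTER.
      * MF only.
       01  P-PROC       USAGE PROCEDURE-POINTER.
       PROCEDURE DIVISION.
           STOP RUN.
```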
For ANS85 only: An alphanumeric function is always represented in the
standard data format. Its size is determined by the definition of the
function. The implementor specifies the representation of integer and
numeric functions, and this representation need not be the standard data
format. Integer and numeric functions can be used only in arithmetic
expressions, and represent the value resulting from the evaluation of the
function without the restriction on composite of operands and/or
receiving data items.
When a computer provides more than one means of representing data, the
standard data format must be used for data items other than integer and
numeric functions (ANS85), if not otherwise specified by the data
description.
DISPLAY Format.
The COBOL digit characters from 0 to 9 that represent the number value
are held in radix 10, one digit character per byte of computer storage.
This is the standard data format of the COBOL language. If the data item
is signed and the sign is not specified as SEPARATE (see the section The
SIGN Clause in the chapter Program Definition) the numeric sign is
incorporated into either the leading or trailing digit, according to the
LEADING or TRAILING phrase in the SIGN clause. Signed data is
incorporated into the requisite digit as shown in Table 2-4 below.
(Effectively, bit 6 (hexadecimal value "40") of the character is set
from 0 to 1 if the number has a negative value.) If the data item is
signed and the sign is specified as SEPARATE, then the sign is held as a
separate single COBOL character, additional to the digits, either "+" or
"-" as necessary. If the data item is signed and no SIGN clause applies,
the numeric sign is incorporated into the trailing digit.
In the following table, the numbers in brackets represent the hexadecimal
encoding for the COBOL character. The encoding can be varied by the
CHARSET and SIGN system directives. See your COBOL System Reference for
details of these directives.
Table 2-4 : DISPLAY Non-SEPARATE Sign-Digit Characters
Storage character position requirements for DISPLAY data items are thus
equal to the number of "9"s in the PICTURE clause plus one if the sign is
specified as SEPARATE. The SYNCHRONIZED clause has no effect on DISPLAY
format data declarations.
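The storage rule for DISPLAY items can be checked with a short sketch such as the one below. The data-names are hypothetical, and the example assumes the ANS85 intrinsic function LENGTH is available; the function is used inside COMPUTE because integer functions are restricted to arithmetic expressions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DISPSIZE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names chosen for this sketch.
       01  D-UNSIGNED   PIC 9(5).
       01  D-EMBEDDED   PIC S9(5).
       01  D-SEPARATE   PIC S9(5) SIGN IS LEADING SEPARATE.
       01  ITEM-LEN     PIC 9(2).
       PROCEDURE DIVISION.
      * Five "9"s, no sign: 5 character positions.
           COMPUTE ITEM-LEN = FUNCTION LENGTH(D-UNSIGNED).
           DISPLAY ITEM-LEN.
      * Sign incorporated into the trailing digit: still 5.
           COMPUTE ITEM-LEN = FUNCTION LENGTH(D-EMBEDDED).
           DISPLAY ITEM-LEN.
      * Sign held as its own character: 5 + 1 = 6.
           COMPUTE ITEM-LEN = FUNCTION LENGTH(D-SEPARATE).
           DISPLAY ITEM-LEN.
           STOP RUN.
```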
COMPUTATIONAL, COMP, BINARY (ANS85), COMPUTATIONAL-4 (MF) (OSVS) (VSC2)
Format
This format holds numeric data items in computer storage in pure binary
two's complement representation. In this format, number values are held
in radix of 2 where each computer bit in the representation starting from
the right (least-significant) end represents the presence or absence of
an increasingly significant power of 2 in that value. Negative numbers
are represented by complementing (inverting all the bit values of) their
positive counterpart, and then adding one to the whole. Storage
requirements depend on the number of "9"s in the PICTURE clause, and
whether the numeric data item is signed or unsigned (see the sections The
PICTURE Clause, The COMPUTATIONAL Clause, and The SIGN Clause in the
chapter Program Definition); also your COBOL system will assign storage
for COMPUTATIONAL items in one of two modes: byte-storage and
word-storage. Byte-storage is the default storage-assignment mode for
this COBOL implementation.
Computer Memory Natural Boundaries: The fundamental natural boundaries
of a modern computer's memory are usually based on an eight-bit
character, known as a byte. Within this fundamental framework, machines
fall into two broad categories: those with no other natural boundaries,
called here byte-storage computers, and those with other natural
boundaries based upon multiples of the fundamental boundary of the byte,
called here word-storage computers.
In byte-storage mode, COBOL assigns numeric storage so that each numeric
item occupies the minimum number of bytes (see the section Selection Of
Character Representation And Radix in this chapter); the SYNCHRONIZED
clause has no meaning in this context and hence has no effect.
Within word-storage computers, natural boundaries can occur at 2-byte,
4-byte and/or 8-byte boundaries. The COBOL language can provide such
data item storage-assignment and synchronization when the COMPUTATIONAL
clause and possibly the SYNCHRONIZED clause are used. This word-storage
assignment of COMPUTATIONAL format data is controlled by your COBOL
system directive, IBMCOMP (see your COBOL System Reference for details of
using this feature).
Table 2-5: COMP(UTATIONAL) Format Data Item Character-Position (Byte) Storage Assignment
-----------------------------------------------------------------
| Number of Digits (9s) in    | Character-Positions (Bytes) of  |
| PICTURE Representation      | Storage Assigned                |
-----------------------------------------------------------------
| Signed       | Unsigned     | Byte-Storage   | Word-Storage   |
|              |              | Mode           | Mode           |
-----------------------------------------------------------------
| 1-2          | 1-2          | 1              | 2              |
| 3-4          | 3-4          | 2              | 2              |
| 5-6          | 5-7          | 3              | 4              |
| 7-9          | 8-9          | 4              | 4              |
| 10-11        | 10-12        | 5              | 8              |
| 12-14        | 13-14        | 6              | 8              |
| 15-16        | 15-16        | 7              | 8              |
| 17-18        | 17-18        | 8              | 8              |
-----------------------------------------------------------------
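To show how the table is read, the sketch below declares a few COMP items and notes the storage each would be assigned in byte-storage and word-storage mode. The data-names are hypothetical, and the example assumes the ANS85 intrinsic function LENGTH is available; the lengths actually reported depend on the storage mode in effect.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COMPSIZE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names. Sizes are read from Table 2-5 as
      * byte-storage / word-storage bytes.
      * Signed, 2 digits: 1 / 2.
       01  C-S2      PIC S9(2)  COMP.
      * Signed, 4 digits: 2 / 2.
       01  C-S4      PIC S9(4)  COMP.
      * Signed, 7 digits: 4 / 4.
       01  C-S7      PIC S9(7)  COMP.
      * Unsigned, 7 digits: 3 / 4.
       01  C-U7      PIC 9(7)   COMP.
      * Signed, 12 digits: 6 / 8.
       01  C-S12     PIC S9(12) COMP.
      * Unsigned, 12 digits: 5 / 8.
       01  C-U12     PIC 9(12)  COMP.
       01  ITEM-LEN  PIC 9(2).
       PROCEDURE DIVISION.
           COMPUTE ITEM-LEN = FUNCTION LENGTH(C-S7).
           DISPLAY ITEM-LEN.
           COMPUTE ITEM-LEN = FUNCTION LENGTH(C-U7).
           DISPLAY ITEM-LEN.
           STOP RUN.
```

The difference between C-S7 and C-U7 in byte-storage mode follows from the binary ranges: 9,999,999 is below 2 to the power 24 (16,777,216), so it fits in three unsigned bytes, but it exceeds 2 to the power 23 (8,388,608), so a signed item needs a fourth byte.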
Synchronization: If a data item description contains the SYNCHRONIZED
clause, and word-storage mode is enabled, the position of that item
within the computer storage is aligned so that the right-hand
(least-significant) end is on a natural boundary of the computer's
storage. Extra character positions (bytes) of computer storage are
reserved adjacent to synchronized items to achieve this alignment; these
bytes, known as padding bytes or implicit FILLER bytes, are normally
inaccessible to the computer program except as part of a group item.
Each elementary data item that is described as SYNCHRONIZED is aligned to
the natural storage boundary that corresponds to its data item storage
assignment (according to Table 2-5 above). Thus, in word-storage mode, a
numeric data item with a PICTURE description of S9(5) would be assigned 4
bytes of storage (being 1 padding byte and 3 data bytes). If
SYNCHRONIZED was specified, it would be aligned to the next nearest
4-byte boundary (that is, with the total (4-byte) storage assignment
aligned such that the number of bytes from the beginning of the record
containing that item to the left-hand (most-significant) end of that item
was a multiple of four). If the previous item does not end on a 4-byte
boundary, implicit FILLER assignments are necessary to achieve this.
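The S9(5) case described above can be sketched as follows. The record and data-names are hypothetical, and the offsets in the comments assume word-storage mode with the record itself starting on a suitable boundary; they are an illustration of the rule, not output from a compiler listing.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SYNCDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names. Assumes word-storage mode.
       01  SYNC-RECORD.
      *    Offset 0, 1 byte.
           05  FLAG-BYTE     PIC X.
      *    3 implicit FILLER bytes are reserved here (offsets
      *    1-3) so the next item starts on a 4-byte boundary.
      *    Offsets 4-7, 4 bytes.
           05  SYNC-COUNT    PIC S9(5) COMP SYNC.
      * The same layout with the padding written out.
       01  EXPLICIT-RECORD.
           05  FLAG-BYTE-2   PIC X.
           05  FILLER        PIC X(3).
           05  SYNC-COUNT-2  PIC S9(5) COMP.
       PROCEDURE DIVISION.
           STOP RUN.
```

Writing the FILLER explicitly, as in the second record, is the second of the two ways of achieving synchronization: the data is organized on the natural boundaries without the SYNCHRONIZED clause.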
Other such implicit FILLER bytes can be generated by the use of
SYNCHRONIZED items in non-elementary data descriptions containing an
OCCURS clause (see the section The OCCURS Clause in the chapter Program
Definition). This is because further bytes may need to be reserved for
each group item occurrence in order that the second or subsequent
occurrences have the same alignment to the natural boundaries of the
computer storage as did the first occurrence.
Implicit Synchronization: With word-storage mode enabled, all
record-level data descriptions are automatically synchronized to a full
8-byte boundary.
For MF only: Where automatic alignment takes place, it is sensitive to the
ALIGN directive. See your COBOL System Reference for details.
Example of Implicit FILLER Assignments: The following COBOL data
description will produce the computer storage allocation shown in Figure
2-3. An explanation of the symbols used in Figure 2-3 is given below the
figure.
    01  UNSYNCHRONIZED-RECORD.
        02  UNSYNCHRONIZED-DATA-1     PIC 9(3) DISPLAY.
        02  UNSYNCHRONIZED-DATA-2     PIC X(2).
    01  COMPOUND-REPEATED-RECORD.
        02  ELEMENTARY-ITEM-1         PIC X(2).
        02  GROUP-ITEM OCCURS 3 TIMES.
            03  ELEMENTARY-ITEM-2     PIC X.
            03  ELEMENTARY-ITEM-3     PIC S9(2) COMP SYNC.
            03  ELEMENTARY-ITEM-4     PIC S9(4)V9(2) COMP SYNC.
            03  ELEMENTARY-ITEM-5     PIC X(5).
Figure 2-3 : Sample Computer Storage Allocation
where:
@ indicates implicit FILLER bytes allocated due to automatic
synchronization of a record (01-level) description.
# indicates implicit FILLER bytes allocated when following data
item is explicitly synchronized.
$ indicates implicit FILLER bytes allocated when a non-elementary
item is subject to an OCCURS clause.
9 indicates bytes allocated for a numeric DISPLAY character.
A indicates bytes allocated for an alphanumeric DISPLAY character.
C indicates bytes allocated for COMPUTATIONAL data storage.
Truncation: In data items of USAGE COMP, data is held in binary format
as described in the previous sections. The storage allocated for an item
can have space for larger numbers than specified by the PICTURE clause.
For example, an item described as PIC 99 COMP is normally assigned one
byte, which can hold numbers up to 255.
To conform with the rules of ANSI COBOL, numbers behave as decimal
numbers, regardless of their format. If, in an arithmetic statement, the
result is bigger than the PICTURE clause of a receiving item allows, a
size error occurs, and if the ON SIZE ERROR phrase is specified the result
is not stored in the receiving item. In a non-arithmetic statement, if
this situation occurs, the decimal value is truncated on the left, to the
number of digits specified in the PICTURE clause.
For MF only: However, data in USAGE COMP items can be forced to behave
as binary data, that is, truncation occurs only if it is necessary in
order for the data to fit the space allocated. The behavior of USAGE
COMP items is controlled by the setting of the COBOL system directive,
TRUNC. (See your COBOL System Reference for details on how to invoke this
feature.) This directive selects whether the decimal value is truncated
to the picture size, or the binary value is truncated to the space
available. It distinguishes between results of arithmetic statements,
and data being moved by non-arithmetic statements.
Regardless of the setting of any directive (MF only) an arithmetic
statement gives the size error condition if the result has more decimal
digits than specified in the PICTURE clause of a receiving item.
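The size error behavior can be sketched as below. The program name is hypothetical; the item uses the PIC 99 COMP description discussed in the text.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SIZEERR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Illustrative item, described as PIC 99 COMP.
       01  A   PIC 99 COMP.
       PROCEDURE DIVISION.
           MOVE 13 TO A.
      * 13 + 150 = 163 needs three decimal digits, one more
      * than PIC 99 allows, so the size error condition
      * arises. With ON SIZE ERROR present, the result is not
      * stored and A is left holding 13.
           ADD 150 TO A
               ON SIZE ERROR
                   DISPLAY "SIZE ERROR, A UNCHANGED"
               NOT ON SIZE ERROR
                   DISPLAY "STORED"
           END-ADD.
           DISPLAY A.
           STOP RUN.
```

Note that 163 would fit in the single byte allocated to A; the size error is raised because of the decimal PICTURE, not because of the binary storage.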
Example of Truncation (MF only): The TRUNC directive can change the
results of some operations, as demonstrated in the following examples in
which item A is described as PIC 99 COMP.
--------------------------------------------------------------------
| Operation                  | Result                              |
--------------------------------------------------------------------
|                            | TRUNC | NOTRUNC | TRUNC"ANSI"       |
--------------------------------------------------------------------
| MOVE 163 TO A              | 63    | 163     | 63                |
| MOVE 263 TO A              | 63    | 7       | 63                |
| MOVE 13 TO A, ADD 150 TO A | 63    | 163     | undefined results |
| MOVE 13 TO A, ADD 250 TO A | 63    | 7       | undefined results |
--------------------------------------------------------------------
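The TRUNC and NOTRUNC columns can be reproduced with ordinary remainder arithmetic, which may help in seeing where the values 63 and 7 come from. The sketch below is only a model of the two kinds of truncation, not a demonstration of the directive itself; the data-names are hypothetical and the ANS85 intrinsic function MOD is assumed to be available.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRUNCMOD.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names. The items are wide enough that
      * nothing is truncated within this program.
       01  SENT-VALUE     PIC 9(4).
       01  DECIMAL-KEPT   PIC 9(4).
       01  BINARY-KEPT    PIC 9(4).
       PROCEDURE DIVISION.
           MOVE 263 TO SENT-VALUE.
      * TRUNC: keep the low-order two decimal digits (PIC 99).
      * 263 modulo 100 = 63.
           COMPUTE DECIMAL-KEPT = FUNCTION MOD(SENT-VALUE, 100).
      * NOTRUNC: keep what fits in the one allocated byte.
      * 263 modulo 256 = 7.
           COMPUTE BINARY-KEPT = FUNCTION MOD(SENT-VALUE, 256).
           DISPLAY DECIMAL-KEPT.
           DISPLAY BINARY-KEPT.
           STOP RUN.
```

With 163 in place of 263, the same arithmetic gives 63 and 163, matching the first row of the table: 163 is below 256, so the binary value survives intact under NOTRUNC.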
NOTE
1. This directive has no effect on the truncation of low-order
digits in non-integer data. This always conforms with the
behavior specified in ANSI COBOL.
2. If the IBMCOMP system directive is set, extra upper bytes
may be allocated to a COMP item. These are counted in the
space allocated. When IBMCOMP is on, padding bytes may
be generated before a COMP item with a SYNC clause; these
are not part of the item, and are never affected by data
stored in the item.
3. When a value being stored into a signed item is limited to
the number of digits by the PICTURE clause, it can never be
big enough to overwrite the sign bit. When the NOTRUNC
directive is set, this is not true, and the value, if large
enough, will overwrite the sign bit.