ALTSEQ [ SORT-MERGE/XL General Users Guide ] MPE/iX 5.0 Documentation
ALTSEQ
The >ALTSEQ command defines a collating sequence other than the standard
ASCII or EBCDIC format. The >ALTSEQ command must be preceded by a >DATA
command. It is effective only if the keys are of type BYTE and if the
input data is ASCII. (Refer to Appendix B for information on ASCII and
EBCDIC character set values.)
SYNTAX
>A[LTSEQ] modspec1[, modspec2]...[, modspecN]
PARAMETERS
modspec A set of parameters you use to define your own collating
sequence. You can use more than one group of these
parameters in one or more successive >ALTSEQ commands until
the desired collating sequence is defined.
The modspec parameter set has the following form:
                       { =  }
     [EACH] leftspec   {    }   rightspec
                       {WITH}
or
                       {WITH}
     MERGE leftspec    {    }   rightspec
                       { =  }
To specify leftspec and rightspec use the following form:
{string }
{num byte }
{range string }
EACH The EACH parameter indicates that the collating sequence is
to be modified by assigning each character of leftspec the
ordinal value obtained by taking the ASCII code decimal
value of the corresponding character in rightspec. If
leftspec is longer than rightspec, rightspec is concatenated
to itself enough times to make it equal in length to
leftspec.
MERGE The MERGE parameter indicates that the collating sequence is to
be modified by merging leftspec and rightspec. Characters are
selected alternately from leftspec and rightspec.
NOTE If neither EACH nor MERGE is specified, the collating sequence is
modified as if EACH were specified, except that rightspec is padded
with blanks if it is shorter than leftspec.
= When used in the modspec parameter, the equal sign (=)
functions as a separator between leftspec and rightspec.
WITH The WITH parameter can be used interchangeably with the
equal sign (=) and is generally used when MERGE is
specified.
string A string is a single character or a group of ASCII or EBCDIC
characters specified by enclosing them in quotation marks,
for example, "J" or "JAS".
num byte A numerical specification used in the following form:
[%[(bb)]] nnn
The bb is a base of any decimal number between 2 and 16
inclusive. You specify %(bb) to indicate a base other than
8 or 10.
The % indicates base 8 when no (bb) is specified. If both %
and (bb) are omitted, the nnn parameter is assumed to be a
decimal number (that is, base 10).
The nnn represents a number (integer) with a value between 0
and decimal 255, inclusive. Each n is a digit between 0
and 9, inclusive, or one of the letters A, B, C, D, E, or F.
The letters A through F are used to represent the digits 10
through 15, when a base greater than 10 is used. Each digit
n of nnn must be less than the base bb.
For example, 12 represents the decimal value 12; %12
represents the octal value 12, which is equivalent to the
decimal value 10; and %(16)12 represents the hexadecimal
value 12, which is equivalent to the decimal value 18.
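The num byte rules above can be modelled outside SORT/XL. The following Python sketch is illustrative only and is not part of SORT/XL; the function name parse_num_byte is hypothetical, and the sketch assumes no blanks appear between the prefix and the digits.

```python
import re

# Optional "%" prefix, optional "(bb)" base, then the digits nnn.
_NUM_BYTE = re.compile(r"^(%(\((\d+)\))?)?([0-9A-Fa-f]+)$")
_DIGITS = "0123456789ABCDEF"


def parse_num_byte(spec: str) -> int:
    """Convert a num byte specification such as '12', '%12' or '%(16)12'."""
    match = _NUM_BYTE.match(spec.strip())
    if match is None:
        raise ValueError(f"not a num byte specification: {spec!r}")
    percent, _, base_text, digits = match.groups()
    if base_text is not None:
        base = int(base_text)   # explicit %(bb)
    elif percent:
        base = 8                # a bare % means octal
    else:
        base = 10               # no prefix means decimal
    if not 2 <= base <= 16:
        raise ValueError(f"base must be between 2 and 16: {base}")
    value = 0
    for ch in digits.upper():
        digit = _DIGITS.index(ch)
        if digit >= base:       # each digit must be less than the base
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    if value > 255:
        raise ValueError(f"value must be between 0 and 255: {value}")
    return value


examples = {s: parse_num_byte(s) for s in ("12", "%12", "%(16)12")}
# examples == {"12": 12, "%12": 10, "%(16)12": 18}
```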
range string Specifies two characters separated by a minus sign (-) and
enclosed in quotation marks, or two numeric byte
specifications separated by a minus sign. For example,
"A-Z" or %101-%132 (which is the octal specification for the
character range "A-Z").
NOTE Whenever a minus sign (-) is the second character in a group of
three characters, the group is treated as a range. In all other
cases, the minus sign is treated the same as any other character.
For example, "A-D" represents the four characters A, B, C, and D
while "AD-" represents the three characters A, D, and -.
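The range rule in this note can be illustrated with a short Python sketch. It is a model only, not SORT/XL code: the function name expand_string is hypothetical, and the sketch assumes that groups of three are recognized while scanning the string from left to right.

```python
def expand_string(text: str) -> str:
    """Expand every 'x-y' group of three characters into the range x..y."""
    out = []
    i = 0
    while i < len(text):
        # A minus sign counts as a range marker only when it is the
        # second character of a complete group of three.
        if i + 2 < len(text) and text[i + 1] == "-":
            first, last = ord(text[i]), ord(text[i + 2])
            out.extend(chr(code) for code in range(first, last + 1))
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


four_chars = expand_string("A-D")    # "ABCD"
three_chars = expand_string("AD-")   # "AD-", the minus sign is an ordinary character
```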
DISCUSSION
Each modification of the collating sequence changes the ordinal values in
the translation table assigned to the characters specified by leftspec.
Refer to the >SHOW command for a discussion of the translation table. If
rightspec is longer than leftspec, the extra characters are ignored. If
leftspec is longer than rightspec and neither EACH nor MERGE has been
specified, rightspec is padded with blanks to make it equal in length to
leftspec. For example, the command >ALTSEQ "SAW"="TG" gives S, A, and W
the ordinal values of T, G, and space. (See the discussion below for
explanations of modspec with EACH and MERGE.) These assignments of new
ordinal values are only for collating purposes. That is, the identity of
the character is not lost; data is unchanged and appears in its original
form in the output.
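The padding behavior and the fact that data is left unchanged can be modelled as follows. This Python sketch is an illustration, not the SORT/XL implementation: the translation table is represented as a list of 128 ordinal values, the function name apply_default_modspec is hypothetical, and the four sample words are invented for the demonstration.

```python
def apply_default_modspec(table, leftspec, rightspec):
    """Model a modspec that specifies neither EACH nor MERGE."""
    padded = rightspec.ljust(len(leftspec))   # pad a short rightspec with blanks
    # zip stops at the end of leftspec, so extra rightspec characters are ignored.
    for left_char, right_char in zip(leftspec, padded):
        table[ord(left_char)] = ord(right_char)
    return table


# Start from the standard ASCII sequence, then model >ALTSEQ "SAW"="TG".
table = apply_default_modspec(list(range(128)), "SAW", "TG")

# The table is used only to compare keys; the records themselves are not altered.
words = ["SUN", "ANT", "WEB", "TOP"]
ordered = sorted(words, key=lambda word: [table[ord(ch)] for ch in word])
# ordered == ["WEB", "ANT", "TOP", "SUN"]
```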
You must issue a >DATA command, specifying data type and a collating
sequence type before you can use the >ALTSEQ command in any SORT/XL or
MERGE/XL operation. The system displays the error message THE DATA
COMMAND MUST BE ISSUED BEFORE THE ALTSEQ COMMAND CAN BE ISSUED, if the
>ALTSEQ command is not preceded by a >DATA command.
NOTE The operation of SORT/XL (or MERGE/XL) is slower when you define a
collating sequence with the >ALTSEQ command than when a standard
ASCII or EBCDIC collating sequence is used.
Using modspec With EACH
If EACH is specified, the modifications of the collating sequence are the
same as explained above, except if leftspec is longer than rightspec,
rightspec is concatenated to itself a sufficient number of times to make
it equal in length to leftspec. For example, the command >ALTSEQ EACH
"ADW"="FG" gives A, D, and W the ordinal values obtained by taking the
ASCII code decimal values of F, G, and F. Assuming the basic collating
sequence has been specified as ASCII, this means A=70 appears in the
sixth row, fifth column of the translation table, D=71 in the sixth row,
eighth column, and W=70 in the eighth row, seventh column. Note that 70
and 71 are the ASCII code decimal values of the characters F and G,
respectively. For additional information refer to the "EXAMPLES" section
below.
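The repetition of rightspec under EACH can be sketched in Python. This is a model under stated assumptions, not SORT/XL code: the translation table is a list of 128 ordinal values and the function name apply_each_modspec is hypothetical.

```python
from itertools import cycle


def apply_each_modspec(table, leftspec, rightspec):
    """Model EACH: rightspec is repeated until it covers all of leftspec."""
    # cycle() repeats rightspec indefinitely; zip() stops when leftspec ends.
    for left_char, right_char in zip(leftspec, cycle(rightspec)):
        table[ord(left_char)] = ord(right_char)
    return table


# Model >ALTSEQ EACH "ADW"="FG" on the standard ASCII sequence.
table = apply_each_modspec(list(range(128)), "ADW", "FG")
new_values = [table[ord(ch)] for ch in "ADW"]   # [70, 71, 70]
```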
Using modspec With MERGE
When MERGE is specified in the modspec parameter, the values in the
translation table assigned to the characters specified by leftspec and
rightspec, and the characters in between are modified. Characters are
selected alternately from leftspec and rightspec and the translation
table is modified so the characters collate in this order. The first
character is always selected from leftspec. If leftspec precedes
rightspec in the collating sequence, the sequence is modified so the
characters between the two ranges collate after the merger of the ranges.
If rightspec precedes leftspec, the characters between the two ranges
collate before the first character of the first range. When either range
is exhausted, the characters from the other range are simply appended
until that range is also exhausted. Note that the strings specified by
leftspec and rightspec must be strictly increasing and contiguous
whenever MERGE is specified.
If you wish to do an alphabetic sorting in which each upper case letter
collates ahead of the corresponding lower case letter, use the command
>ALTSEQ MERGE "A-Z" WITH "a-z". The following six special characters
follow the lower case z since the first range precedes the second range:
[ \ ] ^ _ and `
If the modspec is MERGE "a-z" WITH "A-Z", the same six characters precede
the lower case a. For additional information, refer to the "EXAMPLES"
section below.
Consider this form of modspec as a shorthand for the modspec specifying
EACH. For example, the command >ALTSEQ MERGE "A-Z" WITH "a-z" is
equivalent to the longer command >ALTSEQ "AaBb...Zz"="AB...Zab...z",
where ... represents all the necessary characters.
EXAMPLES
The following examples show how to use various parameters with the
>ALTSEQ command, as well as the resulting collating sequences.
Standard ASCII Collating Sequence
To display the standard collating sequence, enter the >DATA IS ASCII,
SEQUENCE IS ASCII and >SHOW SEQUENCE commands, as shown below. Refer to
this display, for comparative purposes, to see what happens to the
collating sequence when you use >ALTSEQ for various functions in the
following examples.
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:10 AM
(c) HEWLETT-PACKARD CO. 1986
>DATA IS ASCII, SEQUENCE IS ASCII
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~ del
Using the EACH Parameter
The following example shows how to use the >ALTSEQ command with the EACH
parameter followed by a string specification:
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:10 AM
(c) HEWLETT-PACKARD CO. 1986
>DATA IS ASCII, SEQUENCE IS ASCII
>ALTSEQ EACH "LMN"="ST"
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A B C D E F G H I J K O P Q R
L= N= S M= T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~ del
The result of the modspec in the above example where EACH "LMN"="ST" is
shown below:
     Original List     Sorted Result

     TOKEN             COST
     MOP               COME
     COST              SING
     COME              NOSE
     TABLE             LONESOME
     MISS              SOLE
     SING              TABLE
     NOSE              MISS
     LONESOME          TOKEN
     SOLE              MOP
During the sort operation, L and N are equated to S, and M is equated to
T.
Using >ALTSEQ Without the EACH Parameter
The following example shows how to use the >ALTSEQ command without
including the EACH parameter. You may use abbreviated forms for >ALTSEQ
(>A), >SHOW SEQUENCE (>SH S), and SEQUENCE IS ASCII (SEQ A), if you wish.
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:15 AM
(c) HEWLETT-PACKARD CO. 1986
>DATA IS ASCII, SEQUENCE IS ASCII
>ALTSEQ "ABC" = "X"
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp= B= C ! " # $ % & ' ( ) * + , -
. / 0 1 2 3 4 5 6 7 8 9 : ; < =
> ? @ D E F G H I J K L M N O P
Q R S T U V W A= X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~ del
The >ALTSEQ command pads X with two blank characters making it equal to
ABC in length. Note the character sp (space) is equated to B and C and
the character A is equated to X. The table position identified by each
character of the left string is replaced by the corresponding character
of the right string until the string ABC is exhausted.
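The layout of the >SHOW SEQUENCE displays in these examples can be approximated with a Python sketch. This is an inference from the displays shown here, not a description of how SORT/XL works internally: the sketch assumes that characters are listed in order of their assigned ordinal values, that characters sharing an ordinal value are listed in order of their ASCII codes, and that every character except the last in such a group is marked with an equal sign. The names show_tokens and char_name are hypothetical.

```python
CONTROL_NAMES = (
    "nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si "
    "dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us"
).split()


def char_name(code):
    """Return the display name used for one ASCII code."""
    if code < 32:
        return CONTROL_NAMES[code]
    if code == 32:
        return "sp"
    if code == 127:
        return "del"
    return chr(code)


def show_tokens(table):
    """List the 128 display tokens for a translation table."""
    order = sorted(range(128), key=lambda code: (table[code], code))
    tokens = []
    for i, code in enumerate(order):
        tied = i + 1 < len(order) and table[order[i + 1]] == table[code]
        tokens.append(char_name(code) + ("=" if tied else ""))
    return tokens


# Model >ALTSEQ "ABC" = "X": A takes the value of X, B and C take blank.
table = list(range(128))
table[ord("A")] = ord("X")
table[ord("B")] = table[ord("C")] = ord(" ")
tokens = show_tokens(table)
rows = [" ".join(tokens[i:i + 16]) for i in range(0, 128, 16)]
```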
Numeric Byte Specification
The following example shows how to use the >ALTSEQ command for a numeric
byte specification:
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:20 PM
(c) HEWLETT-PACKARD CO. 1986
>DATA IS ASCII, SEQUENCE IS ASCII
>ALTSEQ 65=%141
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ B C D E F G H I J K L M N O P
Q R S T U V W X Y Z [ \ ] ^ _ `
A= a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~ del
In this example, the upper case A (represented by the decimal value 65)
is assigned the same ordinal value as the lower case a (represented by
the octal value %141) in the final collating sequence.
Using a Range String Specification
The following example shows how to use the >ALTSEQ command for a range
string specification:
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:25 AM
(c) HEWLETT-PACKARD CO. 1986
>ALTSEQ %101-%132="a-z"
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ [ \ ] ^ _ ` A= a B= b C= c D= d E=
e F= f G= g H= h I= i J= j K= k L= l M=
m N= n O= o P= p Q= q R= r S= s T= t U=
u V= v W= w X= x Y= y Z= z { | } ~ del
The left range in the above example is specified by two numeric byte
specifications separated by a minus sign. Note that the same range can
be represented by "A-Z" (characters), %101-"Z" (octal representation), or
65-90 (decimal representation).
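The equivalence of the octal, decimal, and character forms of this range can be checked with a few lines of Python. The sketch uses only Python's own octal literals and the standard string module; it does not involve SORT/XL.

```python
import string

octal_form = list(range(0o101, 0o132 + 1))              # %101-%132
decimal_form = list(range(65, 90 + 1))                  # 65-90
character_form = [ord(ch) for ch in string.ascii_uppercase]   # "A-Z"

same_range = octal_form == decimal_form == character_form    # True
```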
Collating Upper Case Before Lower Case
The following example shows how to use the >ALTSEQ command for collating
upper case, then lower case characters. This is a commonly used
alternative to the standard collating sequence.
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:30 AM
(c) HEWLETT-PACKARD CO. 1986
>ALTSEQ MERGE "A-Z" WITH "a-z"
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A a B b C c D d E e F f G g H
h I i J j K k L l M m N n O o P
p Q q R r S s T t U u V v W w X
x Y y Z z [ \ ] ^ _ ` { | } ~ del
The six characters [, \, ], ^, _, and ` follow the lower case z. The
result of MERGE "A-Z" WITH "a-z" is as follows:
     Original     Sorted List       Sorted List
     List         Without MERGE     Using MERGE

     CAN          AXE               AXE
     shovel       BROOM             BROOM
     MAN          CAN               boy
     BROOM        DOG               CAN
     TABLE        MAN               DOG
     AXE          TABLE             drawer
     drawer       boy               MAN
     boy          drawer            shovel
     DOG          shovel            TABLE
Collating Lower Case Before Upper Case
The following shows how to use the >ALTSEQ command to collate lower case
alphabetic characters, and have each followed by its corresponding upper
case character:
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:35 AM
(c) HEWLETT-PACKARD CO. 1986
>ALTSEQ MERGE "a-z" = "A-Z"
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ [ \ ] ^ _ ` a A b B c C d D e
E f F g G h H i I j J k K l L m
M n N o O p P q Q r R s S t T u
U v V w W x X y Y z Z { | } ~ del
The six characters [, \, ], ^, _, and ` precede the lower case a.
The result of MERGE "a-z" = "A-Z" is as follows:
     Original     Sorted List       Sorted List
     List         Without MERGE     Using MERGE

     CAN          AXE               AXE
     shovel       BROOM             boy
     MAN          CAN               BROOM
     BROOM        DOG               CAN
     TABLE        MAN               drawer
     AXE          TABLE             DOG
     drawer       boy               MAN
     boy          drawer            shovel
     DOG          shovel            TABLE
Merging Unequal Strings
The following example shows how to use the >ALTSEQ command to merge
unequal strings:
:SORT
HP32214A.01.00 SORT/3000 THU, JUN 4, 1987, 8:40 AM
(c) HEWLETT-PACKARD CO. 1986
>ALTSEQ MERGE "ABCD" WITH "ab"
>SHOW SEQUENCE
nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
sp ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A a B b C D E F G H I J K L M
N O P Q R S T U V W X Y Z [ \ ]
^ _ ` c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~ del
The collating sequence appears as AaBbCDE...Z. The merging of the strings
continues until the right string ab is exhausted; the remaining
characters of the left string, C and D, are then appended.
ADDITIONAL DISCUSSION
Refer to the >DATA and >SHOW commands in this chapter.