HP 3000 Manuals

ALTSEQ [ SORT-MERGE/XL General Users Guide ] MPE/iX 5.0 Documentation


SORT-MERGE/XL General Users Guide

ALTSEQ 

The >ALTSEQ command defines a collating sequence other than the standard
ASCII or EBCDIC format.  The >ALTSEQ command must be preceded by a >DATA
command.  It is effective only if the keys are of type BYTE and if the
input data is ASCII. (Refer to Appendix B for information on ASCII and
EBCDIC character set values.)

SYNTAX 

     >A[LTSEQ] modspec1[, modspec2]...[, modspecN]

PARAMETERS 

modspec                   A set of parameters you use to define your own collating
                          sequence.  You can use more than one group of these
                          parameters in one or more successive >ALTSEQ commands until
                          the desired collating sequence is defined.

                          The modspec parameter set has the following form:

                                               { =  }
                               [EACH] leftspec {    } rightspec
                                               {WITH}

                                                     or

                                               {WITH}
                               MERGE  leftspec {    } rightspec
                                               { =  }

                          To specify leftspec and rightspec use the following form:

                                                    {string      }
                                                    {num byte    }
                                                    {range string}

EACH                      The EACH parameter indicates that the collating sequence is
                          to be modified by assigning each character of leftspec the
                          ordinal value obtained by taking the ASCII code decimal
                          value of the corresponding character in rightspec.  If
                          leftspec is longer than rightspec, rightspec is concatenated
                          to itself enough times to make it equal in length to
                          leftspec.

MERGE                     The MERGE parameter indicates that the collating sequence
                          is to be modified by merging leftspec and rightspec.
                          Characters are selected alternately from leftspec and
                          rightspec.


NOTE                      If neither EACH nor MERGE is specified, the collating
                          sequence is modified as if EACH was specified, but
                          rightspec is padded with blanks if it is shorter than
                          leftspec.

=                         When used in the modspec parameter, the equal sign (=)
                          functions as a separator between leftspec and rightspec.

WITH                      The WITH parameter can be used interchangeably with the
                          equal sign (=) and is generally used when MERGE is
                          specified.

string                    A string is a single character or a group of ASCII or
                          EBCDIC characters specified by enclosing them in quotation
                          marks, for example, "J" or "JAS".

num byte                  A numerical specification used in the following form:

                               [%[(bb)]] nnn

                          The bb is a base, given as a decimal number between 2 and
                          16 inclusive.  You specify %(bb) to indicate a base other
                          than 8 or 10.  The % indicates base 8 when no (bb) is
                          specified.  If both % and (bb) are omitted, the nnn
                          parameter is assumed to be a decimal number (that is,
                          base 10).

                          The nnn represents a number (integer) with a value between
                          0 and decimal 255, inclusive.  Each n is a digit between 0
                          and 9, inclusive, or one of the letters A, B, C, D, E, or
                          F.  The letters A through F represent the digits 10
                          through 15 when a base greater than 10 is used.  Each
                          digit n of nnn must be less than the base bb.

                          For example, 12 represents the decimal value 12; %12
                          represents the octal value 12, which is equivalent to the
                          decimal value 10; and %(16)12 represents the hexadecimal
                          value 12, which is equivalent to the decimal value 18.

range string              Specifies two characters separated by a minus sign (-) and
                          enclosed in quotation marks, or two numeric byte
                          specifications separated by a minus sign.  For example,
                          "A-Z" or %101-%132 (which is the octal specification for
                          the character range "A-Z").
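
The num byte rules above can be checked with a short sketch. The Python below is only an illustration and is not part of SORT-MERGE/XL; the function name parse_num_byte and the choice to raise ValueError for invalid input are assumptions made for this example.

```python
import re

# Matches the form [%[(bb)]]nnn: an optional %, an optional (bb), then digits.
_NUM_BYTE = re.compile(r"^(%(\((\d+)\))?)?([0-9A-Fa-f]+)$")


def parse_num_byte(spec):
    """Return the byte value (0-255) described by a num byte specification."""
    match = _NUM_BYTE.match(spec.strip())
    if match is None:
        raise ValueError("not a num byte specification: %r" % spec)
    percent, _, base_text, digits = match.groups()

    if percent is None:
        base = 10            # neither % nor (bb): decimal
    elif base_text is None:
        base = 8             # % alone: octal
    else:
        base = int(base_text)  # %(bb): explicit base, bb written in decimal
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")

    value = 0
    for ch in digits.upper():
        digit = int(ch, 16)  # A-F stand for 10-15
        if digit >= base:
            raise ValueError("digit %s is not valid in base %d" % (ch, base))
        value = value * base + digit
    if value > 255:
        raise ValueError("value must be between 0 and 255")
    return value


print(parse_num_byte("12"), parse_num_byte("%12"), parse_num_byte("%(16)12"))
```

With the three specifications used in the num byte description, the call prints 12, 10, and 18.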

NOTE                      Whenever a minus sign (-) is the second character in a
                          group of three characters, the group is treated as a
                          range.  In all other cases, the minus sign is treated the
                          same as any other character.  For example, "A-D"
                          represents the four characters A, B, C, and D while "AD-"
                          represents the three characters A, D, and -.
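
The range rule in the note above can be sketched as follows. This Python is an illustration only: the name expand_string is hypothetical, the sketch assumes the rule applies when the quoted text is exactly three characters long, and it raises an error for a descending range because the text does not say how such a range is handled.

```python
def expand_string(text):
    """Expand the contents of a quoted string or range string into characters."""
    # Assumption: a range is a three-character group with '-' in the middle.
    if len(text) == 3 and text[1] == "-":
        first, last = ord(text[0]), ord(text[2])
        if first > last:
            # Not described in the text; rejected here as an assumption.
            raise ValueError("range must be ascending")
        return "".join(chr(code) for code in range(first, last + 1))
    # In every other position the minus sign is an ordinary character.
    return text


print(expand_string("A-D"))
print(expand_string("AD-"))
```

The first call yields the four characters A, B, C, and D; the second returns the three characters A, D, and - unchanged.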

DISCUSSION

Each modification of the collating sequence changes the ordinal values in the
translation table assigned to the characters specified by leftspec.  Refer to
the >SHOW command for a discussion of the translation table.

If rightspec is longer than leftspec, the extra characters are ignored.  If
leftspec is longer than rightspec and neither EACH nor MERGE has been
specified, rightspec is padded with blanks to make it equal in length to
leftspec.  For example, the command

     >ALTSEQ "SAW"="TG"

gives S, A, and W the ordinal values of T, G, and space.  (See the discussion
below for explanations of modspec with EACH and MERGE.)

These assignments of new ordinal values are only for collating purposes.  That
is, the identity of the character is not lost; data is unchanged and appears
in its original form in the output.

You must issue a >DATA command, specifying a data type and a collating
sequence type, before you can use the >ALTSEQ command in any SORT/XL or
MERGE/XL operation.  If the >ALTSEQ command is not preceded by a >DATA
command, the system displays the error message

     THE DATA COMMAND MUST BE ISSUED BEFORE THE ALTSEQ COMMAND CAN BE ISSUED

NOTE                      The operation of SORT/XL (or MERGE/XL) is slower when you
                          define a collating sequence with the >ALTSEQ command than
                          when a standard ASCII or EBCDIC collating sequence is
                          used.
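
The length rules for leftspec and rightspec can be sketched in a few lines. The Python below is a simplified model, not the SORT/XL implementation: the name apply_modspec is hypothetical, and the translation table is modeled as a dictionary that holds only the characters whose ordinal values were changed.

```python
def apply_modspec(left, right, each=False):
    """Return {character: new ordinal value} for one non-MERGE modspec."""
    if not right:
        raise ValueError("rightspec must not be empty")
    if len(right) < len(left):
        if each:
            # EACH: rightspec is concatenated to itself until long enough.
            repeats = -(-len(left) // len(right))  # ceiling division
            right = right * repeats
        else:
            # Neither EACH nor MERGE: rightspec is padded with blanks.
            right = right.ljust(len(left))
    # zip() stops at the end of leftspec, so extra rightspec
    # characters are ignored.
    return {l: ord(r) for l, r in zip(left, right)}


print(apply_modspec("SAW", "TG"))             # W takes the value of a blank
print(apply_modspec("SAW", "TG", each=True))  # W takes the value of T
```

Only the ordinal values used for comparison change in this model; the characters themselves are never rewritten, which matches the statement that data appears in its original form in the output.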

Using modspec With EACH

If EACH is specified, the modifications of the collating sequence are the same
as explained above, except that if leftspec is longer than rightspec,
rightspec is concatenated to itself a sufficient number of times to make it
equal in length to leftspec.  For example, the command

     >ALTSEQ EACH "ADW"="FG"

gives A, D, and W the ordinal values obtained by taking the ASCII code decimal
values of F, G, and F.  Assuming the basic collating sequence has been
specified as ASCII, this means A=70 appears in the sixth row, fifth column of
the translation table, D=71 in the sixth row, eighth column, and W=70 in the
eighth row, seventh column.  Note that 70 and 71 are the ASCII code decimal
values of the characters F and G, respectively.  For additional information
refer to the "EXAMPLES" section below.

Using modspec With MERGE

When MERGE is specified in the modspec parameter, the values in the
translation table assigned to the characters specified by leftspec and
rightspec, and to the characters in between, are modified.  Characters are
selected alternately from leftspec and rightspec, and the translation table is
modified so the characters collate in this order.  The first character is
always selected from leftspec.  If leftspec precedes rightspec in the
collating sequence, the sequence is modified so the characters between the two
ranges collate after the merger of the ranges.  If rightspec precedes
leftspec, the characters between the two ranges collate before the first
character of the first range.  When either range is exhausted, the characters
from the other range are simply appended until that range is also exhausted.

Note that the strings specified by leftspec and rightspec must be strictly
increasing and contiguous whenever MERGE is specified.

If you wish to do an alphabetic sorting in which each upper case letter
collates ahead of the corresponding lower case letter, use the command

     >ALTSEQ MERGE "A-Z" WITH "a-z"
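
The MERGE rules described above can be modeled with a short sketch. This Python is an illustration under stated assumptions, not the SORT/XL algorithm: the name merge_sequence is hypothetical, the base sequence is assumed to be 7-bit ASCII, both arguments are assumed to be already expanded (for example "ABCD" rather than "A-D"), and overlapping ranges are rejected because the text does not describe them.

```python
def merge_sequence(left, right):
    """Return the 128 ASCII codes in collating order after MERGE left WITH right."""
    lcodes = [ord(ch) for ch in left]
    rcodes = [ord(ch) for ch in right]
    for codes in (lcodes, rcodes):
        # Each string must be strictly increasing and contiguous.
        if not codes or any(b - a != 1 for a, b in zip(codes, codes[1:])):
            raise ValueError("strings must be contiguous and increasing")

    # Select characters alternately, starting with leftspec; once one
    # string is exhausted the rest of the other is appended.
    merged = []
    for i in range(max(len(lcodes), len(rcodes))):
        if i < len(lcodes):
            merged.append(lcodes[i])
        if i < len(rcodes):
            merged.append(rcodes[i])

    if lcodes[-1] < rcodes[0]:
        # leftspec first: in-between characters collate after the merger.
        between = list(range(lcodes[-1] + 1, rcodes[0]))
        segment, low, high = merged + between, lcodes[0], rcodes[-1]
    elif rcodes[-1] < lcodes[0]:
        # rightspec first: in-between characters collate before the merger.
        between = list(range(rcodes[-1] + 1, lcodes[0]))
        segment, low, high = between + merged, rcodes[0], lcodes[-1]
    else:
        raise ValueError("overlapping strings are not handled in this sketch")

    order = list(range(128))
    return order[:low] + segment + order[high + 1:]


sequence = merge_sequence("ABCD", "ab")
print("".join(chr(code) for code in sequence[64:73]))
```

For MERGE "ABCD" WITH "ab" the printed slice begins @AaBbCDEF, the same order shown for unequal strings in the "EXAMPLES" section.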

The following six special characters follow the lower case z since the first
range precedes the second range:

     [  \  ]  ^  _  and  `

If the modspec is MERGE "a-z" WITH "A-Z", the same six characters precede the
lower case a.  For additional information, refer to the "EXAMPLES" section
below.

Consider this form of modspec as a shorthand for the modspec specifying EACH.
For example, the command

     >ALTSEQ MERGE "A-Z" WITH "a-z"

is equivalent to the longer command

     >ALTSEQ "AaBb...Zz"= "AB...Zab...z"

where ... represents all the necessary characters.

EXAMPLES

The following examples show how to use various parameters with the >ALTSEQ
command, as well as the resulting collating sequences.

Standard ASCII Collating Sequence

To display the standard collating sequence, enter the >DATA IS ASCII,
SEQUENCE IS ASCII and >SHOW SEQUENCE commands, as shown below.  Refer to this
display, for comparative purposes, to see what occurs to the collating
sequence when you use >ALTSEQ for various functions in the following examples.

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:10 AM
     (c) HEWLETT-PACKARD CO. 1986

     >DATA IS ASCII, SEQUENCE IS ASCII
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ A B C D E F G H I J K L M N O
     P Q R S T U V W X Y Z [ \ ] ^ _
     ` a b c d e f g h i j k l m n o
     p q r s t u v w x y z { | } ~ del

Using the EACH Parameter

The following example shows how to use the >ALTSEQ command with the EACH
parameter followed by a string specification:

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:10 AM
     (c) HEWLETT-PACKARD CO. 1986

     >DATA IS ASCII, SEQUENCE IS ASCII
     >ALTSEQ EACH "LMN"="ST"
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ A B C D E F G H I J K O P Q R
     L= N= S M= T U V W X Y Z [ \ ] ^ _
     ` a b c d e f g h i j k l m n o
     p q r s t u v w x y z { | } ~ del

The result of the modspec in the above example, where EACH "LMN"="ST", is
shown below:

     Original List     Sorted Result

     TOKEN             COST
     MOP               COME
     COST              SING
     COME              NOSE
     TABLE             LONESOME
     MISS              SOLE
     SING              TABLE
     NOSE              MISS
     LONESOME          TOKEN
     SOLE              MOP

During the sort operation, L and N are equated to S, and M is equated to T.

Using >ALTSEQ Without the EACH Parameter

The following example shows how to use the >ALTSEQ command without including
the EACH parameter.  You may use abbreviated forms for >ALTSEQ (>A),
>SHOW SEQUENCE (>SH S), and SEQUENCE IS ASCII (SEQ A), if you wish.

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:15 AM
     (c) HEWLETT-PACKARD CO. 1986

     >DATA IS ASCII, SEQUENCE IS ASCII
     >ALTSEQ "ABC" = "X"
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp= B= C ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ D E F G H I J K L M N O P Q R
     S T U V W A= X Y Z [ \ ] ^ _
     ` a b c d e f g h i j k l m n o
     p q r s t u v w x y z { | } ~ del

The >ALTSEQ command pads X with two blank characters, making it equal to ABC
in length.  Note the character sp (space) is equated to B and C and the
character A is equated to X.  The table position identified by each character
of the left string is replaced by the corresponding character of the right
string until the string ABC is exhausted.

Numeric Byte Specification

The following example shows how to use the >ALTSEQ command for a numeric byte
specification:

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:20 PM
     (c) HEWLETT-PACKARD CO. 1986

     >DATA IS ASCII, SEQUENCE IS ASCII
     >ALTSEQ 65=%141
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ B C D E F G H I J K L M N O P
     Q R S T U V W X Y Z [ \ ] ^ _ `
     A= a b c d e f g h i j k l m n o
     p q r s t u v w x y z { | } ~ del

In this example, the upper case A (represented by the decimal value 65) is
assigned the same ordinal value as the lower case a (represented by the octal
value %141) in the final collating sequence.

Using a Range String Specification

The following example shows how to use the >ALTSEQ command for a range string
specification:

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:25 AM
     (c) HEWLETT-PACKARD CO. 1986

     >ALTSEQ %101-%132="a-z"
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ [ \ ] ^ _ `
     A= a B= b C= c D= d E= e F= f G= g H= h I= i
     J= j K= k L= l M= m N= n O= o P= p Q= q R= r
     S= s T= t U= u V= v W= w X= x Y= y Z= z
     { | } ~ del

The left range in the above example is specified by two numeric byte
specifications separated by a minus sign.  Note that the same range can be
represented by "A-Z" (characters), %101-"Z" (octal representation), or 65-90
(decimal representation).

Collating Upper Case Before Lower Case

The following example shows how to use the >ALTSEQ command for collating upper
case, then lower case characters.  This is a commonly used alternative to the
standard collating sequence.

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:30 AM
     (c) HEWLETT-PACKARD CO. 1986

     >ALTSEQ MERGE "A-Z" WITH "a-z"
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ A a B b C c D d E e F f G g H
     h I i J j K k L l M m N n O o P
     p Q q R r S s T t U u V v W w X
     x Y y Z z [ \ ] ^ _ ` { | } ~ del

The six characters [, \, ], ^, _, and ` follow the lower case z.

The result of MERGE "A-Z" WITH "a-z" is as follows:

     Original       Sorted List       Sorted List
     List           Without MERGE     Using MERGE

     CAN            AXE               AXE
     shovel         BROOM             BROOM
     MAN            CAN               boy
     BROOM          DOG               CAN
     TABLE          MAN               DOG
     AXE            TABLE             drawer
     drawer         boy               MAN
     boy            drawer            shovel
     DOG            shovel            TABLE

Collating Lower Case Before Upper Case

The following example shows how to use the >ALTSEQ command to collate lower
case alphabetic characters, and have each followed by its corresponding upper
case character:

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:35 AM
     (c) HEWLETT-PACKARD CO. 1986

     >ALTSEQ MERGE "a-z" = "A-Z"
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ [ \ ] ^ _ ` a A b B c C d D e
     E f F g G h H i I j J k K l L m
     M n N o O p P q Q r R s S t T u
     U v V w W x X y Y z Z { | } ~ del

The six characters [, \, ], ^, _, and ` precede the lower case a.

The result of MERGE "a-z" = "A-Z" is as follows:

     Original       Sorted List       Sorted List
     List           Without MERGE     Using MERGE

     CAN            AXE               AXE
     shovel         BROOM             boy
     MAN            CAN               BROOM
     BROOM          DOG               CAN
     TABLE          MAN               drawer
     AXE            TABLE             DOG
     drawer         boy               MAN
     boy            drawer            shovel
     DOG            shovel            TABLE

Merging Unequal Strings

The following example shows how to use the >ALTSEQ command to merge unequal
strings:

     :SORT
     HP32214A.01.00  SORT/3000  THU, JUN 4, 1987, 8:40 AM
     (c) HEWLETT-PACKARD CO. 1986

     >ALTSEQ MERGE "ABCD" WITH "ab"
     >SHOW SEQUENCE

     nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si
     dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us
     sp ! " # $ % & ' ( ) * + , - . /
     0 1 2 3 4 5 6 7 8 9 : ; < = > ?
     @ A a B b C D E F G H I J K L M
     N O P Q R S T U V W X Y Z [ \ ]
     ^ _ ` c d e f g h i j k l m n o
     p q r s t u v w x y z { | } ~ del

The collating sequence appears as AaBbCDE...Z.  The merging of the strings
continues until the right string ab is exhausted.

ADDITIONAL DISCUSSION

Refer to the >DATA and >SHOW commands in this chapter.
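
The word lists in the two case-merging examples above can be reproduced with a small model of a collating sequence. The Python below is a sketch only: sequence_key is a hypothetical helper, the two sequences are built by hand for the letters alone rather than by SORT/XL, and each whole word is assumed to be the sort key.

```python
import string

UPPER = string.ascii_uppercase
LOWER = string.ascii_lowercase

# Letter order after MERGE "A-Z" WITH "a-z" and after MERGE "a-z" = "A-Z".
UPPER_FIRST = "".join(u + l for u, l in zip(UPPER, LOWER))  # AaBb...Zz
LOWER_FIRST = "".join(l + u for u, l in zip(UPPER, LOWER))  # aAbB...zZ


def sequence_key(sequence):
    """Build a sort key that ranks characters by their position in sequence."""
    rank = {ch: position for position, ch in enumerate(sequence)}

    def key(record):
        # Only the comparison values change; the record itself is untouched.
        return [rank[ch] for ch in record]

    return key


words = ["CAN", "shovel", "MAN", "BROOM", "TABLE",
         "AXE", "drawer", "boy", "DOG"]

print(sorted(words))                                 # standard ASCII order
print(sorted(words, key=sequence_key(UPPER_FIRST)))  # upper case first
print(sorted(words, key=sequence_key(LOWER_FIRST)))  # lower case first
```

The three printed lists correspond to the "Sorted List Without MERGE" column and the two "Sorted List Using MERGE" columns, and every word comes out exactly as it went in.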

