On the Road Again [ HP LaserRX/MPE: A Journey of Discovery ] MPE/iX 5.0 Documentation
HP LaserRX/MPE: A Journey of Discovery
On the Road Again
Open the TRAPPER2.PRF file on the CD. (Hint: Select the Open Local
command from the File menu. If you need additional help, return to the
"Starting the Engine" section for the procedure.)
Draw the Application Transaction Response graph using the TRAPPER2.PRF
file. To do this:
1. From the Draw Graphs dialog box, select the following:
a. Graph=Application Transaction Response.
It should still be selected from the last time you drew a
graph.
b. X-Axis=Week.
c. Points Every...=Hour.
d. Shift=All Day.
e. Starting Date=15 August 1988.
2. Click OK.
This graph should not look different from the one you would get from
TRAPPER.PRF because it came from the same system and covers the same time
interval. The TRAPPER2.PRF file, however, can give us more detailed
information. To see it:
1. Close all open graphs (either manually or by playing back your
macro).
2. From the Draw Graphs dialog box, select the following:
a. Graph=Application Transaction Response.
It should still be selected from the last time you drew a
graph.
b. X-Axis=Day.
c. Points Every...=5 Minutes.
d. Shift=All Day.
e. Starting Date=15 August 1988.
3. Click OK.
NOTE Points Every...=5 Minutes was not an available option in the
TRAPPER.PRF file because detailed 5-minute data on global or
application records was not extracted, only summaries (hourly
data).
To select the Other application, use the vertical scroll bar to scroll
through the graph. You will see a single day (15 August 1988) with data
points every 5 minutes (about 288 data points on each line). If this
graph is too busy for you, simplify it by using zoom-by-time to enlarge a
portion (select a time in the middle of the day--11:00 to 12:00).
Application Transaction Response Graph
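The point count quoted for the single-day graph follows from simple arithmetic. The sketch below is only an illustration in Python; the function name is hypothetical, and it assumes a complete log with no gaps in the data.

```python
MINUTES_PER_DAY = 24 * 60


def points_per_line(days: int, interval_minutes: int) -> int:
    """Number of data points on one graph line for a span of whole days."""
    return days * MINUTES_PER_DAY // interval_minutes


# X-Axis=Day with Points Every...=5 Minutes
day_at_5_minutes = points_per_line(days=1, interval_minutes=5)

# X-Axis=Week with Points Every...=Hour
week_at_1_hour = points_per_line(days=7, interval_minutes=60)

print(day_at_5_minutes, week_at_1_hour)
```

A one-day graph at 5-minute resolution therefore carries more points per line than a full week at hourly resolution, which is why it can look busy.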
Now you are ready to ask for process-level data--the most detailed
information available from HP LaserRX/MPE. To do this:
1. Select the Process command from the Zoom menu.
2. Define the zoomed area as follows:
a. Place the cursor on the graph to the left of the area that
interests you.
b. Hold down the mouse button, and drag the mouse to the right
until you define the entire area of interest.
c. Release the button.
HP LaserRX/MPE offers two Zoom-detail commands: Application and Process.
Either application or process textual data can be zoomed from a global,
disc space, or application graph.
Although similar, the Time, Application, and Process commands are
mutually exclusive--only one can be enabled at a time. If one is already
enabled, enabling any other will automatically disable the first. Of
course, you can proceed with none enabled.
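The mutually exclusive behavior described above can be pictured as a single setting that holds at most one value. The following Python sketch is a toy model only; the class and method names are hypothetical and say nothing about how HP LaserRX/MPE is actually implemented.

```python
from enum import Enum
from typing import Optional


class ZoomMode(Enum):
    TIME = "Time"
    APPLICATION = "Application"
    PROCESS = "Process"


class ZoomMenu:
    """Toy model of the Zoom menu: at most one command is enabled."""

    def __init__(self) -> None:
        # Having no zoom command enabled is a valid state.
        self.enabled: Optional[ZoomMode] = None

    def enable(self, mode: ZoomMode) -> None:
        # Storing the new mode replaces (disables) the previous one.
        self.enabled = mode

    def disable(self) -> None:
        self.enabled = None


menu = ZoomMenu()
menu.enable(ZoomMode.TIME)
menu.enable(ZoomMode.PROCESS)  # Time is disabled automatically
print(menu.enabled)
```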
HP LaserRX/MPE responds to zoom-by-time differently than it does to
zoom-by-application and zoom-by-process. Zoom-by-time expands a graph's
time scale, while zoom-by-application and zoom-by-process present
detailed tabular information on the graph's components.
For zoom-by-process, this means creating a table that lists all processes
logged during the selected time.
If you zoom-by-process on an application graph that shows a single
application, you will see only processes belonging to that application.
Summary graphs list the five busiest applications in the display period.
If you zoom-by-process on a summary graph, you will see all processes
logged.
In the example here, you have zoom-by-process information on a single
application--Other--so you see only processes that were logged as
belonging to that application. Try it.
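The difference between zooming on a single-application graph and on a summary graph amounts to an optional filter on the application name. The Python sketch below illustrates the idea under stated assumptions: the record layout, the function name, and the sample program and application names (PROGA, APPX, and so on) are all hypothetical, and times are zero-padded "HH:MM" strings so that they compare correctly as text.

```python
from typing import List, NamedTuple, Optional


class ProcessRecord(NamedTuple):
    program: str
    application: str
    minute: str  # zero-padded "HH:MM"


def zoom_by_process(
    records: List[ProcessRecord],
    start: str,
    end: str,
    application: Optional[str] = None,
) -> List[ProcessRecord]:
    """Return records logged between start and end (inclusive).

    application=None models a summary graph (all processes);
    a name models a graph that shows a single application.
    """
    return [
        r
        for r in records
        if start <= r.minute <= end
        and (application is None or r.application == application)
    ]


# Hypothetical sample data
records = [
    ProcessRecord("PROGA", "Other", "11:46"),
    ProcessRecord("PROGB", "APPX", "11:47"),
    ProcessRecord("PROGC", "Other", "12:10"),
]

single = zoom_by_process(records, "11:45", "11:55", application="Other")
summary = zoom_by_process(records, "11:45", "11:55")
print([r.program for r in single], [r.program for r in summary])
```

In this model, a process that belongs to a different application simply never appears in the single-application result, even though it ran in the selected time range.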
Move your cursor to an interesting time--about 11:45. Hold down the
mouse button and drag the mouse to about 11:55, and then release the
button.
Tip Process records can accumulate very quickly. If you include too
long a time in the zoom period, you will get too many process
records--more than you have time to read. When possible, restrict
zoom-by-process to a short time range. If HP LaserRX/MPE cannot fit
all requested process records into a 64-Kbyte data object, it will
suggest you select a smaller interval to allow it to display more
process records.
Your screen will display many numbers, but you do not need to know
everything about every number to use this display. This information is
in tabular rather than graphic form. Because the table is wider and
longer than your screen, all the data cannot be displayed at one time.
Use the vertical and horizontal scroll bars to see additional data.
Unlike the scroll bars on the graphs, these bars let you move through the
window to see other parts of the table.
The vertical scroll bar on the right of the screen moves the data up or
down in the window. You can scroll one line of data at a time by
clicking the top or bottom arrow. To scroll one page (minus one line) at
a time, click the shaded part of the scroll bar. Try it, and note that
the date and time values change toward the left side of the display.
The horizontal scroll bar moves the data to the right or left across the
window to show different columns of data. Clicking either the arrows or
the shaded parts of a scroll bar will move one column to the left or
right. Try it. Note that the first column (Program Name) remains in
position on the screen and is not overwritten.
You can change position rapidly, either vertically or horizontally, by
placing the cursor on the scroll box, and then holding down the mouse
button and dragging the scroll box to another location. Take a moment or
two to practice changing position in the tabular window.
Tip The size of the text in the table is fixed, but the size of the
window is not. You can expand the window to see more text.
Maximizing the window will let you see a screen full of detailed
data.
Let's look at some of the numbers to see if we can determine what is
causing such poor response times.
Notice that some processes are displayed in different colors. To find
what the colors mean, scroll horizontally across the table until you see
a column called Interest Reason. This column lists codes that indicate
why this process record was logged. These codes are also color codes.
Each code corresponds to a certain color, and the color is used in the
displayed graphs. The code-color combinations are specified in the
following list. You can also check Help if you need a reminder about
their meanings.
Code  Color       Cause
C     Red         Process is using too much CPU time.
D     Dark blue   Process is doing too many physical disc I/Os.
P     Green       Process is getting poor response-to-prompt times.
F     Violet      Process is getting poor first-response times.
T     Light blue  Process is doing too many terminal transactions.
N     Black       Process is new (was created in the last minute).
K     Black       Process was just killed (terminated last minute).
Too much, too many, and poor characterize interesting-process thresholds.
You defined these terms when you set up the parameters file (PARM) for
the data collection program on the HP 3000. Refer to the HP LaserRX/MPE
User's Manual: Collection Software for information on how to set up the
collection program and define Interesting Process thresholds.
A process can be interesting for more than one reason. For example, it
might be using a lot of CPU time and doing many disc I/Os, and, at the
same time, it might be getting poor response times. In such cases, the
color used to display the process is chosen in the specific order shown
above: red, dark blue, green, violet, light blue, black. The first color
to match one of the process's Interest Reason codes becomes the color of
the process.
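The color rule is a first-match lookup over the codes in priority order. Here is a minimal Python sketch of that rule; the function name is hypothetical, and the handling of spaces inside the Interest Reason field is an assumption based on how the codes appear in the process table.

```python
from typing import Optional

# Codes and colors in the priority order given in the list above.
COLOR_PRIORITY = [
    ("C", "red"),
    ("D", "dark blue"),
    ("P", "green"),
    ("F", "violet"),
    ("T", "light blue"),
    ("N", "black"),
    ("K", "black"),
]


def process_color(interest_reason: str) -> Optional[str]:
    """Return the display color for an Interest Reason field such as 'N P'."""
    codes = set(interest_reason.replace(" ", ""))  # assumption: ignore blanks
    for code, color in COLOR_PRIORITY:
        if code in codes:
            return color  # first match in priority order wins
    return None  # no recognized code


print(process_color("N P"))  # new, with poor response-to-prompt
print(process_color("PF"))
```

For example, a process that is both new (N) and getting poor response-to-prompt times (P) is shown in green, because P comes before N in the priority order.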
A few of the processes displayed will be green or violet. Green
indicates that the process is getting poor response-to-prompt times.
Violet indicates that the process is getting poor first-response times.
But how poor is poor? Scroll the window horizontally until you see
columns headed Num Trans, Avg 1st Resp, and Avg Prompt.
Column Heading  Definition
Num Trans       Number of terminal transactions completed during the
                1-minute logging interval.
Avg 1st Resp    Average first-response time during the interval.
Avg Prompt      Average response-to-prompt time during the interval.
Unless changed by the user who set up SCOPE(XL), the default
response-time thresholds are as follows:
Response Time  Threshold
Avg 1st Resp   1.0 seconds
Avg Prompt     5.0 seconds
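To judge "how poor is poor" for a given row, you can compare its Avg 1st Resp and Avg Prompt values with the thresholds. The Python sketch below uses the default thresholds listed above. It assumes that a value is "poor" when it is strictly greater than its threshold and that the comparison is made on the per-interval averages; neither detail is confirmed here, and the function name and sample values are hypothetical.

```python
DEFAULT_FIRST_RESPONSE_THRESHOLD = 1.0  # seconds (default listed above)
DEFAULT_PROMPT_THRESHOLD = 5.0          # seconds (default listed above)


def response_interest_codes(
    avg_first_resp: float,
    avg_prompt: float,
    first_threshold: float = DEFAULT_FIRST_RESPONSE_THRESHOLD,
    prompt_threshold: float = DEFAULT_PROMPT_THRESHOLD,
) -> str:
    """Return 'P', 'F', 'PF', or '' for one process record."""
    codes = ""
    if avg_prompt > prompt_threshold:      # assumption: strictly greater
        codes += "P"
    if avg_first_resp > first_threshold:   # assumption: strictly greater
        codes += "F"
    return codes


# Hypothetical values read from the Avg 1st Resp and Avg Prompt columns
print(repr(response_interest_codes(0.4, 2.0)))
print(repr(response_interest_codes(2.5, 7.0)))
```

If your site changed the thresholds when it set up the collector, pass the site values in place of the defaults.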
Do the green and violet processes really have such poor response times,
and could their transactions be what is driving the average response
times so high? The answer is: probably.
Next, you must try to find out what can be causing such poor response
times. Scroll horizontally until you see a group of column headings: Run
Time through Logon Jobname. All of these columns will fit on your screen
if you maximize the process window. Scan down the columns headed
Job/Session, Interest Reason, and Program Name until you find the
beginning of a session.
Look for a command interpreter program--a program with the name CI.. or
one with a colon (:) in the first column of Program Name. Look for a
New Interest Reason code (N). Note the Job/Session number, scan up a few
lines, and then scan down a page or two to find all other processes
sharing that Job/Session number.
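If you prefer to think of this scan as a procedure, it has two parts: spot the record that marks the start of a session, and then collect every record with the same Job/Session number. The Python sketch below illustrates this under stated assumptions: the dictionary keys, the function names, and the sample records (including the :MYUDC command and the PROGA and PROGB programs) are hypothetical.

```python
def is_session_start(record: dict) -> bool:
    """A command interpreter record that carries the New (N) code."""
    name = record["program"]
    is_ci = name.startswith(":") or name == "CI.."
    return is_ci and "N" in record["interest"]


def records_for_session(records: list, job_session: int) -> list:
    """All records sharing a Job/Session number, in listing order."""
    return [r for r in records if r["job_session"] == job_session]


# Hypothetical process table rows, interspersed with another session
records = [
    {"program": "PROGA.PUB.SYS", "interest": "C", "job_session": 101},
    {"program": ":MYUDC", "interest": "N P", "job_session": 205},
    {"program": "PROGB.PUB.SYS", "interest": "N", "job_session": 205},
    {"program": "CI..", "interest": "K", "job_session": 205},
]

starts = [r["job_session"] for r in records if is_session_start(r)]
session = records_for_session(records, starts[0])
print(starts, [r["program"] for r in session])
```

Collecting by Job/Session number rather than by position in the table means the result does not depend on the order in which the records happen to be listed.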
Some Technical Explanation
Usually, processes are listed in the table chronologically. Within the
same minute, however, processes might be listed by their PINs. Thus, you
might see the start of a new process (Interest=New) before you see the
end of the previous process (Interest=Killed). You might also see a
process listed before the command interpreter is listed, even though the
CI must be created first.
A second phenomenon can be confusing. A killed process is not posted
until 1 minute after its death because even after it dies, some
system-level activities can be occurring on its behalf.
One such continuing activity would be the physical I/Os queued by the
last logical disc I/Os the killed process did. Another activity would be
closing the killed process's logon terminal ($STDIN, $STDLIST), thus
completing the last terminal transaction. Such postkill activities are
attributed to the process even after it has died.
You might see a process listed as being New one minute, and then as
Killed a minute later, even if the process only lasted a second or two.
If the process spanned a logging time, it might even appear in three
1-minute samples as New, Active, and then Killed.
Tip Search before and after the time interval where you might expect a
process record to be recorded, and do not try to position things
sequentially within a minute.
Continuing the Journey
If you look around a bit or try another session, you will be able to
identify a sequence such as the following (perhaps with a slight
difference in the order of the records).
                Run   Interest       Job/     Logon  Logon
Program Name    Time  Reason    Que  Session  Ldev   Jobname
:PASSCHECK      21S   N P       I C  409      107    BJ,MAILMAN.HPMAIL.SYS
XCHECK.XSEC.SY  0S    N         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
PASSCHG.PUB.SY  0S    N         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
XCHECK          0S    K         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
CI..            44S   PF        I C  409      107    BJ,MAILMAN.HPMAIL.SYS
PASSCHG.PUB.SY  0S    K         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
CI..            44S   K         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
Sample Process Table
The listing will have other process records interspersed with those that
interest you.
Making Sense
Let's examine these records to try to understand what happened here.
First, the Command Interpreter (CI..) logs on and issues the command
PASSCHECK. Because this is not a valid MPE command, you must assume that
it is a UDC. The session had been logged on for 21 seconds when its data
was recorded (Run Time=21S). It already had a transaction with poor
response-to-prompt time (Interest Reason=P, and the entry itself is
green).
The programs XCHECK.XSEC.SYS and PASSCHG.PUB.SYS were created (Interest
Reason=N). They might have been created by the PASSCHECK UDC.
NOTE You might not see every command interpreter command issued, or even
the CI command that caused the CI.. to be logged. But you will
see the command that was last executed when the process was logged.
If the last command executed is not available--for example, if the
CI has died already--you will see CI.. listed.
Next, the XCHECK program terminates (Interest Reason=K). The total run
time was very short (0S means less than 1 second). If you scroll to the
right, you will see that XCHECK did no terminal transactions.
Notice that CI.. is logged again. Because it is listed as CI.. rather
than by its last command, the command interpreter had probably died
before it was logged. It was not yet marked as killed so that all of its
activities could finish and be accounted for.
Also notice that CI.. had a poor response-to-prompt (Interest Reason=P)
and first-response (Interest Reason=F) transaction. Can you tell which
CI transaction generated this response time? You cannot be sure at
this point; it could have been something such as running a program. (A
program that does not read data from the terminal does not do any
terminal transactions. In this case, the time it takes the program to
run is charged to the CI as Response Time.) The CI record shows only the
response time; it does not name any program that ran during that
interval. This can be confusing.
Next you see that the PASSCHG program was killed. Apparently, it died
earlier because the table shows a run time of less than 1 second. Its
appearance in the log is delayed to allow activities to be processed.
Lastly, CI.. is marked as killed, and the session is officially
finished.
Some Conclusions
Briefly, the session logs on and executes the PASSCHECK UDC. The XCHECK
and PASSCHG programs are run without doing any terminal transactions.
Finally, the CI does something that takes some time, and then logs off.
Something the CI is doing is causing the long response times. What could
that be? It cannot be running a program because the only programs
running are PASSCHG and XCHECK, and neither runs long enough to cause
this response time.
Is that correct? If the CI ran a program, wouldn't you see it on this
display? Maybe not. Remember, you were looking at the Other application
when you did this zoom-by-process. When you zoom-by-process on a single
application, you see only processes belonging to that application. This
means that if the CI--an Other process--runs a program that belongs to
another application (for example, HPMAIL), it will not be listed. Can
you do anything about this?
You can access the Busiest application graph, and then do a
zoom-by-process to see all processes. To do this, close this process
window, and slide the vertical scroll bar's scroll box to the top of the
graph until Busiest is displayed in the dialog window. Draw the
Application Transaction Response graph. Now do a zoom-by-process on the
same time period that you used for the previous graph, and see if you can
find the same job or session. Compare it to the previous example.
Notice that the DSSTRUCK.HPMAIL.SYS program ran. This accounted for that
large response time charged against the command interpreter. Because
DSSTRUCK did no terminal transactions of its own, all of its run time is
charged as response time to CI.., the process that ran it.
In conclusion, running HPMAIL slave trucks (program DSSTRUCK) from a
virtual terminal across DS results in poor response times. Although
these are batch-type activities, they are counted as terminal
transactions because they log on as sessions to DS virtual terminals.
This is a normal activity for HPDesk. In this case, the only abnormal
thing about Trapper is that its only other significant activity is to act
as an electronic mail distribution hub.