HP 3000 Manuals

On the Road Again [ HP LaserRX/MPE: A Journey of Discovery ] MPE/iX 5.0 Documentation


HP LaserRX/MPE: A Journey of Discovery

On the Road Again 

Open the TRAPPER2.PRF file on the CD. (Hint:  Select the Open Local
command from the File menu.  If you need additional help, return to the
"Starting the Engine"  section for the procedure.)

Draw the Application Transaction Response graph using the TRAPPER2.PRF
file.  To do this:

   1.  From the Draw Graphs dialog box, select the following:

          a.  Graph=Application Transaction Response.

              It should still be selected from the last time you drew a
              graph.

          b.  X-Axis=Week.

          c.  Points Every...=Hour.

          d.  Shift=All Day.

          e.  Starting Date=15 August 1988.

   2.  Click  OK.

This graph should look the same as the one you would get from
TRAPPER.PRF because it came from the same system and covers the same time
interval.  The TRAPPER2.PRF file, however, can give us more detailed
information.  To see it:

   1.  Close all open graphs (either manually or by playing back your
       macro).

   2.  From the Draw Graphs dialog box, select the following:

          a.  Graph=Application Transaction Response.

              It should still be selected from the last time you drew a
              graph.

          b.  X-Axis=Day.

          c.  Points Every...=5 Minutes.

          d.  Shift=All Day.

          e.  Starting Day=15 August 1988.

   3.  Click  OK.


NOTE  Points Every...=5 Minutes was not an available option in the
      TRAPPER.PRF file because detailed 5-minute data on global or
      application records was not extracted, only summaries (hourly
      data).

To select the Other application, use the vertical scroll bar to scroll
through the graph.  You will see a single day (15 August 1988) with data
points every 5 minutes (about 288 data points on each line).  If this
graph is too busy for you, simplify it by using zoom-by-time to enlarge a
portion (select a time in the middle of the day--11:00 to 12:00).

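The point count quoted above is simple arithmetic.  A minimal Python
sketch of it follows; the function name points_per_line is illustrative
and is not part of HP LaserRX/MPE, and it assumes every interval in the
span has data.

```python
def points_per_line(span_minutes: int, interval_minutes: int) -> int:
    """Number of data points on one graph line for a given span and interval."""
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    return span_minutes // interval_minutes


MINUTES_PER_DAY = 24 * 60

# X-Axis=Day, Points Every...=5 Minutes
day_points = points_per_line(MINUTES_PER_DAY, 5)

# X-Axis=Week, Points Every...=Hour
week_points = points_per_line(7 * MINUTES_PER_DAY, 60)

print(day_points, week_points)
```

The daily graph therefore carries more points on each line than the
weekly graph, even though it covers one seventh of the time.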
          Application Transaction Response Graph

Now you are ready to ask for process-level data--the most detailed
information available from HP LaserRX/MPE.  To do this:

   1.  Select the Process command from the Zoom menu.

   2.  Define the zoomed area as follows:

          a.  Place the cursor on the graph to the left of the area that
              interests you.

          b.  Hold down the mouse button, and drag the mouse to the right
              until you define the entire area of interest.

          c.  Release the button.

HP LaserRX/MPE offers two Zoom-detail commands:  Application and Process.
Either application or process textual data can be zoomed from a global,
disc space, or application graph.  Although similar, the Time,
Application, and Process commands are mutually exclusive--only one can be
enabled at a time.  If one is already enabled, enabling any other will
automatically disable the first.  Of course, you can proceed with none
enabled.

HP LaserRX/MPE responds to zoom-by-time, zoom-by-application, and
zoom-by-process procedures differently.  Zoom-by-time expands a graph's
time scale, while zoom-by-application and zoom-by-process present
detailed tabular information on the graph's components.  For
zoom-by-process, this means creating a table that lists all processes
logged during the selected time.

If you zoom-by-process on an application graph that shows a single
application, you will see only processes belonging to that application.
Summary graphs list the five busiest applications in the display period.
If you zoom-by-process on a summary graph, you will see all processes
logged.  In the example here, you have zoom-by-process information on a
single application--Other--so you see only processes that were logged as
belonging to that application.

Try it.  Move your cursor to an interesting time--about 11:45.  Hold down
the mouse button and drag the mouse to about 11:55, and then release the
button.

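The selection rules just described can be modeled in a few lines.  The
following Python sketch is illustrative only:  the ProcessRecord fields
and the zoom_by_process function are hypothetical names, not the
HP LaserRX/MPE log format or an interface the product provides, and the
sample records are made up.  Treating both ends of the time range as
inclusive is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ProcessRecord:
    logged_at: datetime   # time the process record was logged
    program: str
    application: str      # application the process belongs to


def zoom_by_process(records: List[ProcessRecord],
                    start: datetime,
                    end: datetime,
                    application: Optional[str] = None) -> List[ProcessRecord]:
    """Return the records logged between start and end (inclusive).

    application=None models a summary graph (all processes are listed);
    a name models a graph that shows a single application.
    """
    return [
        r for r in records
        if start <= r.logged_at <= end
        and (application is None or r.application == application)
    ]


# Made-up records for illustration.
records = [
    ProcessRecord(datetime(1988, 8, 15, 11, 46), "CI..", "Other"),
    ProcessRecord(datetime(1988, 8, 15, 11, 47), "DSSTRUCK.HPMAIL.SYS", "HPMAIL"),
    ProcessRecord(datetime(1988, 8, 15, 12, 30), "CI..", "Other"),
]
start = datetime(1988, 8, 15, 11, 45)
end = datetime(1988, 8, 15, 11, 55)

other_only = zoom_by_process(records, start, end, application="Other")
everything = zoom_by_process(records, start, end)
print(len(other_only), len(everything))
```

Zooming on the single application returns fewer records than zooming on
the summary graph for the same time range, which is why a process that
belongs to a different application can be missing from the table.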
Tip   Process records can accumulate very quickly.  If you include too
      long a time in the zoom period, you will get too many process
      records--more than you have time to read.  When possible, restrict
      zoom-by-process to a short time range.  If HP LaserRX/MPE cannot
      fit all requested process records into a 64-Kbyte data object, it
      will suggest you select a smaller interval to allow it to display
      more process records.

Your screen will display many numbers, but you do not need to know
everything about every number to use this display.  This information is
in tabular rather than graphic form.  Because the table is wider and
longer than your screen, not all the data can be displayed at one time.
Use the vertical and horizontal scroll bars to see additional data.
Unlike the scroll bars on the graphs, these bars let you move through the
window to see other parts of the table.

The vertical scroll bar on the right of the screen moves the data up or
down in the window.  You can scroll one line of data at a time by
clicking the top or bottom arrow.  To scroll one page (minus one line) at
a time, click the shaded part of the scroll bar.  Try it, and note that
the date and time values toward the left side of the display change.

The horizontal scroll bar moves the data to the right or left across the
window to show different columns of data.  Clicking either the arrows or
the shaded parts of the scroll bar will move one column to the left or
right.  Try it.  Note that the first column (Program Name) remains in
position on the screen and is not overwritten.

You can change position rapidly, either vertically or horizontally, by
placing the cursor on the scroll box, and then holding down the mouse
button and dragging the scroll box to another location.  Take a moment or
two to practice changing position in the tabular window.

Tip   The size of the text in the table is fixed, but the size of the
      window is not.  You can expand the window to see more text.
      Maximizing the window will let you see a screen full of detailed
      data.

Let's look at some of the numbers to see if we can determine what is
causing such poor response times.  Notice that some processes are
displayed in different colors.  To find what the colors mean, scroll
horizontally across the table until you see a column called Interest
Reason.  This column lists codes that indicate why this process record
was logged.  These codes are also color codes.  Each code corresponds to
a certain color, and the color is used in the displayed graphs.  The
code-color combinations are specified in the following list.  You can
also check Help if you need a reminder about their meanings.

     Code   Color        Cause

     C      Red          Process is using too much CPU time.
     D      Dark blue    Process is doing too many physical disc I/Os.
     P      Green        Process is getting poor response-to-prompt
                         times.
     F      Violet       Process is getting poor first-response times.
     T      Light blue   Process is doing too many terminal transactions.
     N      Black        Process is new (was created in the last minute).
     K      Black        Process was just killed (terminated in the last
                         minute).

Too much, too many, and poor characterize interesting-process thresholds.
You defined these terms when you set up the parameters file (PARM) for
the data collection program on the HP 3000.  Refer to the HP LaserRX/MPE
User's Manual:  Collection Software for information on how to set up the
collection program and define Interesting Process thresholds.

A process can be interesting for more than one reason.  For example, it
might be using a lot of CPU and doing many physical disc I/Os, and, at
the same time, it might be getting poor response times.  In such cases,
the color used to describe the process curve is chosen in the specific
order shown above:  red, dark blue, green, violet, light blue, black.
The first color to match one of the process's Interest Reason codes
becomes the color of the process curve.

A few of the processes displayed will be green or violet.  Green
indicates that the process is getting poor response-to-prompt times.
Violet indicates that the process is getting poor first-response times.
But how poor is poor?  Scroll the window horizontally until you see
columns headed Num Trans, Avg 1st Resp, and Avg Prompt.

     Column Heading   Definition

     Num Trans        Number of terminal transactions completed during
                      the 1-minute logging interval.
     Avg 1st Resp     Average first-response time during the interval.
     Avg Prompt       Average response-to-prompt time during the
                      interval.

Unless changed by the user who set up SCOPE(XL), the default
response-time thresholds are as follows:

     Response Time    Threshold

     Avg 1st Resp     1.0 seconds
     Avg Prompt       5.0 seconds

If you are wondering whether the green and violet processes actually have
such poor response times, and whether these processes could be
responsible for the high average response times, the answer is:
probably.

Next, you must try to find out what can be causing such poor response
times.  Scroll horizontally until you see a group of column heads, Run
Time through Logon Jobname.  All of these columns will fit on your screen
if you maximize the process window.

Scan down the columns headed Job/Session, Interest Reason, and Program
Name until you find the beginning of a session.  Look for a command
interpreter program--a program with the name CI.. or one with a colon (:)
in the first column of Program Name.  Look for a New Interest Reason code
(N).  Note the Job/Session number, scan up a few lines, and then scan
down a page or two to find all other processes sharing that Job/Session
number.

Some Technical Explanation

Usually, processes are listed in the table chronologically.  Within the
same minute, however, processes might be listed by their PINs.  Thus, you
might see the start of a new process (Interest=New) before you see the
end of the previous process (Interest=Killed).  You might also see a
process listed before the command interpreter is listed, even though the
CI must be created first.

A second phenomenon can be confusing.
A killed process is not posted until 1 minute after its death because
even after it dies, some system-level activities can be occurring on its
behalf.  One such continuing activity would be the physical I/Os queued
by the last logical disc I/Os the killed process did.  Another activity
would be closing the killed process's logon terminal ($STDIN, $STDLIST),
thus completing the last terminal transaction.  Such postkill activities
are attributed to the process even after it has died.

You might see a process listed as being New one minute, and then as
Killed a minute later, even if the process only lasted a second or two.
If the process spanned a logging time, it might even appear in three
1-minute samples as New, Active, and then Killed.

Tip   Search before and after the time interval where you might expect a
      process record to be recorded, and do not try to position things
      sequentially within a minute.

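The same advice can be expressed as a small script:  collect every record
that shares a Job/Session number, whatever order the records appear in,
and order them by minute only.  This is a minimal Python sketch.  The
tuple layout, the times, and the program names are hypothetical and do
not represent an HP LaserRX/MPE file format.

```python
from collections import defaultdict

# Hypothetical records: (minute, program name, interest reason, job/session).
# Within a minute the listed order is not necessarily chronological.
records = [
    ("11:46", "XCHECK", "K", 409),
    ("11:45", ":PASSCHECK", "N P", 409),
    ("11:45", "EDITOR.PUB.SY", "N", 412),
    ("11:47", "CI..", "K", 409),
    ("11:45", "XCHECK.XSEC.SY", "N", 409),
]


def group_by_session(rows):
    """Map each Job/Session number to its records, sorted by minute only."""
    sessions = defaultdict(list)
    for row in rows:
        sessions[row[3]].append(row)
    for rows_for_session in sessions.values():
        # list.sort() is stable, so records within the same minute keep
        # their listed order instead of being given an invented sequence.
        rows_for_session.sort(key=lambda row: row[0])
    return dict(sessions)


sessions = group_by_session(records)
for minute, program, interest, _ in sessions[409]:
    print(minute, program, interest)
```

Sorting on the minute alone, and nothing finer, reflects the point made
above:  the position of a record within a minute carries no reliable
ordering information.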
Continuing the Journey

If you look around a bit or try another session, you will be able to
identify a sequence such as the following (perhaps with a slight
difference in the order of the records).

                Run   Interest       Job/     Logon  Logon
Program Name    Time  Reason    Que  Session  Ldev   Jobname

:PASSCHECK      21S   N P       I C  409      107    BJ,MAILMAN.HPMAIL.SYS
XCHECK.XSEC.SY   0S   N         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
PASSCHG.PUB.SY   0S   N         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
XCHECK           0S   K         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
CI..            44S   PF        I C  409      107    BJ,MAILMAN.HPMAIL.SYS
PASSCHG.PUB.SY   0S   K         I C  409      107    BJ,MAILMAN.HPMAIL.SYS
CI..            44S   K         I C  409      107    BJ,MAILMAN.HPMAIL.SYS

          Sample Process Table

The listing will have other process records interspersed with those that
interest you.

Making Sense

Let's examine these records to try to understand what happened here.

First, the Command Interpreter (CI..) logs on and issues the command
PASSCHECK.  Because this is not a valid MPE command, you must assume that
it is a UDC.  The session had been logged on for 21 seconds when its data
was recorded (Run Time=21S).  It already had a transaction with poor
response-to-prompt time (Interest Reason=P, and the entry itself is
green).

The programs XCHECK.XSEC.SYS and PASSCHG.PUB.SYS were created (Interest
Reason=N).  They might have been created by the PASSCHECK UDC.

NOTE  You might not see every command interpreter command issued, or even
      the CI command that caused the CI.. to be logged.  But you will see
      the command that was last executed when the process was logged.  If
      the last command executed is not available--for example, if the CI
      has died already--you will see CI.. listed.

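The color of each entry in the sample table follows from the precedence
order given earlier (red, dark blue, green, violet, light blue, black).
A minimal Python sketch of that rule follows.  The names COLOR_PRECEDENCE
and entry_color are illustrative, and the fallback for a record with no
recognized code is an assumption.

```python
# Interest Reason codes in precedence order, with their colors.
COLOR_PRECEDENCE = [
    ("C", "red"),
    ("D", "dark blue"),
    ("P", "green"),
    ("F", "violet"),
    ("T", "light blue"),
    ("N", "black"),
    ("K", "black"),
]


def entry_color(interest_reason: str) -> str:
    """Return the color of the first code, in precedence order, that is present."""
    codes = set(interest_reason.replace(" ", ""))
    for code, color in COLOR_PRECEDENCE:
        if code in codes:
            return color
    return "black"  # assumption: no recognized code falls back to black


# Interest Reason values taken from the sample process table.
for reason in ("N P", "N", "K", "PF"):
    print(f"{reason!r:6} -> {entry_color(reason)}")
```

The :PASSCHECK entry (codes N and P) comes out green because green
precedes black in the order, which matches the description of that entry
above.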
Next, the XCHECK program terminates (Interest Reason=K).  The total run
time was very short (0S means less than 1 second).  If you scroll to the
right, you will see that XCHECK did no terminal transactions.

Notice that CI.. is logged again, indicating the command interpreter
probably died before it was logged, but it was not marked as killed to
allow all of its activities to finish and be accounted for.  Also notice
that CI.. had a poor response-to-prompt (Interest Reason=P) and
first-response (Interest Reason=F) transaction.

Can you tell which CI transaction could generate this response time?  You
cannot be sure at this time; it could have been something such as running
a program.  (A program that does not read data from the terminal does not
do any terminal transactions.  In this case, the time it takes the
program to run is charged to the CI as Response Time.)  You will not see
any program that ran for that time interval listed in the CI record as
Response Time.  This can be confusing.

Next you see that the PASSCHG program was killed.  Apparently, it died
earlier because the table shows a run time of less than 1 second.  Its
appearance in the log is delayed to allow activities to be processed.

Lastly, CI.. is marked as killed, and the session is officially finished.

Some Conclusions

Briefly, the session logs on and executes the PASSCHECK UDC.  The XCHECK
and PASSCHG programs are run without doing any terminal transactions.
Finally, the CI does something that takes some time, and then logs off.

Something the CI is doing is causing the long response times.  What could
that be?  It cannot be running a program because the only programs
running are PASSCHG and XCHECK, and neither runs long enough to cause
this response time.  Is that correct?  If the CI ran a program, wouldn't
you see it on this display?

Maybe not.  Remember, you were looking at the Other application when you
did this zoom-by-process.
When you zoom-by-process on a single application, you see only processes
belonging to that application.  This means that if the CI--an Other
process--runs a program that belongs to another application (for example,
HPMAIL), it will not be listed.

Can you do anything about this?  You can access the Busiest application
graph, and then do a zoom-by-process to see all processes.  To do this,
close this process window, and slide the vertical scroll bar's scroll box
to the top of the graph until Busiest is displayed in the dialog window.
Draw the Application Transaction Response graph.  Now do a
zoom-by-process on the same time period that you used for the previous
graph, and see if you can find the same job or session.  Compare it to
the previous example.

Notice that the DSSTRUCK.HPMAIL.SYS program ran.  This accounts for the
large response time charged against the command interpreter.  Because
DSSTRUCK did no terminal transactions of its own, all of its run time is
charged as response time to CI.., the process that ran it.

In conclusion, running HPMAIL slave trucks (program DSSTRUCK) from a
virtual terminal across DS results in poor response times.  Although
these are batch-type activities, they are counted as terminal
transactions because they log on as sessions to DS virtual terminals.
This is a normal activity for HPDesk.  In this case, the only abnormal
thing about Trapper is that its only other significant activity is to act
as an electronic mail distribution hub.

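The accounting rule behind this conclusion can be sketched as follows.
This is a simplified model, not the collection software's actual
algorithm:  the Child type, the function name, and the run times are
hypothetical, chosen only to show how a program that does no terminal
transactions has its run time charged to the CI that ran it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Child:
    program: str
    run_seconds: float
    terminal_transactions: int


def response_charged_to_ci(children: List[Child]) -> float:
    """Sum the run time of child programs that did no terminal transactions.

    Simplified model: a program with terminal transactions of its own is
    assumed to account for its own response time and is left out.
    """
    return sum(c.run_seconds for c in children if c.terminal_transactions == 0)


# Hypothetical run times, for illustration only.
children = [
    Child("XCHECK.XSEC.SYS", 0.4, 0),
    Child("PASSCHG.PUB.SYS", 0.3, 0),
    Child("DSSTRUCK.HPMAIL.SYS", 40.0, 0),
]
charged = response_charged_to_ci(children)
print(f"{charged:.1f} seconds charged to CI..")
```

In this model the two short-lived programs contribute almost nothing, and
nearly all of the charged time comes from the one long-running program,
which is the pattern the process table revealed.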

