HP 3000 Manuals

MPE/iX 5.0 Documentation


HP LaserRX/MPE: A Journey of Discovery

Coming Home 

Whether this system has a CPU bottleneck remains a question.  Where else
would you look to find the answer?  While still using the TRAPPER2.PRF
file, do the following:

   1.  Close all open graphs.

   2.  From the Draw Graphs dialog box, select the following:

          a.  Graph=Global CPU Utilization and Global System CPU
              Utilization.

              Remember to deselect Application Transaction Response.

          b.  X-Axis=Week.

          c.  Points Every...=Hour.

          d.  Shift=All Day.

          e.  Starting Day=15 August 1988.

          f.  Ignore Weekends=Enabled (checked).

              Because this file has no weekend data, HP LaserRX/MPE plots
              a 5-day week.

   3.  Click  OK.

Global CPU Utilization and Global System CPU Utilization Graphs

The resulting graphs appear virtually identical.  Both show how CPU time
was used throughout the week, but different information is detailed on
each.

Look at the Global CPU Utilization graph.  CPU use (top of the violet
Other area) reaches 100 percent and then flattens out for 4 or 5 hours
during the middle of the day (as you saw on the Global Bottlenecks graph
at the beginning of this journey).  This indicates that if more CPU time
were available, somebody would use it.  The question is, who?

Many batch jobs are using CPU time.  Usually, batch jobs are either using
CPU time or are waiting on disc.  Since the Paused-For-Disc area (light
blue) is not large, probably only batch jobs would benefit if more CPU
were available.

Are the interactive sessions getting enough CPU? Yes, the red area at the
bottom of the Global CPU Utilization graph rarely shows more than 10
percent busy.  Most sessions run at higher priority than batch jobs.
Therefore, it is reasonable to say that the interactive sessions are
getting all the CPU they need.

Actually, this situation is quite unusual.  On most systems, interactive
sessions use 30 to 50 percent of the CPU during the day.  Getting more
CPU (or using less) would not help the response times of these
interactive sessions very much.

It looks like the batch jobs are running all day, every day.  Perhaps
having the batch jobs complete more quickly would be a good idea.  More
available CPU might help the batch jobs run faster and complete more
quickly.
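
The plateau described above (CPU use reaching 100 percent and staying
there for several hours) can also be found numerically if the hourly
points are available outside the graph.  The following is a minimal
sketch only.  It assumes the hourly total CPU percentages have been
copied into a Python list; the function name, the 99 percent threshold,
and the sample values are all hypothetical and are not taken from
TRAPPER2.PRF or from HP LaserRX/MPE itself.

```python
from typing import Sequence


def longest_saturated_run(cpu_percent: Sequence[float],
                          threshold: float = 99.0) -> int:
    """Return the longest count of consecutive samples at or above threshold."""
    longest = 0
    current = 0
    for value in cpu_percent:
        if value >= threshold:
            current += 1                    # still on the plateau
            longest = max(longest, current)
        else:
            current = 0                     # plateau ended; start over
    return longest


# Hypothetical hourly samples for one day (assumed values for illustration).
hourly = [35, 40, 62, 88, 100, 100, 100, 100, 100, 91, 70, 55]
print(longest_saturated_run(hourly), "consecutive saturated hours")
```

A long run at the threshold suggests, as the graph does, that somebody
would use more CPU if it were available.  It does not say who; that
still requires the per-category breakdown.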

Axiom

   You can relieve a performance bottleneck either by supplying more of
   the critical resource or by using less of it.

If the CPU is the performance bottleneck, you can get more of the
resource by upgrading to a faster CPU (for example, upgrade a Series 58
system to Series 70 or 950).  To use less of this resource, you must find
a way to avoid using some of the CPU you are currently using.

Re-examine the graphs.  In the Global CPU Utilization graph, Sessions and
Jobs are doing necessary work.  You cannot expect them to use less CPU
without reworking the application.  System is the amount of CPU used by
processes that do not belong to a job or a session.  These are usually
spoolers and data communication monitors.  In this case, you might
suspect that most of this CPU is being used by the DS data communication
monitors, and you cannot do much about that.  Paused indicates time when
the CPU is not being used because it is waiting for disc, so paused time
is available for anyone who needs it.  This leaves the Other category.

What makes up the Other category of CPU use? Look at the Global System
CPU Utilization graph.  It breaks the Other category into its three
components: memory management, disc caching, and interrupt control stack
(ICS).  Processes is the sum of Sessions, Jobs, and System shown on the
previous graph.  Paused is the same Paused time shown on the previous
graph.

If you look at the Global System CPU Utilization graph, you will see the
memory manager is not a problem now that extra memory has been added.
Disc caching is fairly high.  You saw earlier that disc caching was not
helping that much, so you could disable disc caching if extra CPU were
needed.  This might be the time to disable disc caching.  Because ICS is
mainly handling interrupts and it does not appear to be excessive, leave
it alone for now.

Almost a Three-Point Landing

Trapper was short on main memory until 21 April 1988 when additional
memory was installed.  This additional memory eliminated the swapping
activity.  On 29 April, the addition of a third disc drive lowered
overall disc utilization without affecting throughput significantly.
Later, a disc drive was removed.  This increased disc utilization, but
not to the previous level.

Disc caching can eliminate about 50 percent of all disc I/Os, but
requires a substantial amount of CPU to do this.  Disabling disc caching
on all disc drives might benefit the batch jobs.  After disabling it,
monitor the system for 1 or 2 days to see whether the change is
beneficial or detrimental.

Without disc caching, the CPU capacity released might be used by batch
jobs during peak times.  You might expect total CPU used to stay below
100 percent for most, if not all, of the day.  With caching eliminated,
you also can expect to see more Paused-For-Disc time.  The question to be
answered is: Does the increase in Paused time equal or exceed the amount
of CPU released by disabling disc caching? If the new Paused time is less
than the old Paused + Disc Caching time, the extra CPU is probably being
used by Jobs and Sessions to get finished faster.  If the opposite is
true, you underestimated the effectiveness of disc caching, and you
should re-enable it immediately.

There are many other possibilities.  What if you raised the random fetch
quantum on disc caching to 64 sectors? Because database access uses
random I/Os, this might increase the probability of eliminating a
successive I/O, at the cost of increased main memory usage.  Because no
other memory problems are apparent, you might try this.  There is a
chance that having fewer, larger disc cache domains will decrease the CPU
overhead for disc caching because there are fewer domains to search.

There is no single correct answer.  The best you can do is predict what
might help, try it, and evaluate the results.  Then, if it is still
necessary, make a new prediction based on your results and try it again.
When you cannot think of anything more you can do to improve performance,
your system should be optimally tuned.

Undoubtedly, there is more you can learn from analyzing the Trapper
system.  Because this logfile is from a real system, there might be more
surprises awaiting discovery.  But now you might prefer to take a brief
tour of some of the other available logfiles to study how other
environments look when you examine them using HP LaserRX/MPE.
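
The before-and-after test for disabling disc caching described earlier
in this section reduces to one comparison: the new Paused time against
the old Paused + Disc Caching time.  The following is a minimal sketch
of that arithmetic.  It assumes you have read average percentages for
the same peak period off the Global System CPU Utilization graph before
and after the change; the function name, argument names, and sample
numbers are hypothetical and do not come from the Trapper logfile.

```python
def evaluate_caching_trial(old_paused: float,
                           old_caching: float,
                           new_paused: float) -> tuple[float, str]:
    """Compare new Paused time with the old Paused + Disc Caching time.

    All arguments are percentages of total CPU time averaged over the
    same period.  The returned gain is the share of CPU that moved from
    caching overhead to useful work (Jobs and Sessions).
    """
    gain = (old_paused + old_caching) - new_paused
    if gain > 0:
        # Paused grew by less than the CPU released by disabling caching.
        verdict = "leave caching disabled"
    else:
        # Paused grew by as much as, or more than, the CPU released.
        verdict = "re-enable caching"
    return gain, verdict


# Hypothetical readings (assumed values for illustration only).
gain, verdict = evaluate_caching_trial(old_paused=8.0,
                                       old_caching=15.0,
                                       new_paused=14.0)
print(f"{gain:.1f} points of CPU freed for work: {verdict}")
```

The sketch treats the "equal" case as a reason to re-enable caching,
following the question as posed in the text.  It compares averages only,
so 1 or 2 days of monitoring are still needed before trusting the
result.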

