HP 3000 Manuals

Running Out of Cache [ HP LaserRX/MPE: A Journey of Discovery ] MPE/iX 5.0 Documentation


HP LaserRX/MPE: A Journey of Discovery

Running Out of Cache 

On the Global System CPU Utilization graph, notice that a fairly high
amount of CPU, 10 to 15 percent, was spent on disc caching.  It would be
a good idea to check disc performance.  To do this:

   1.  Close all open graphs.

       You can play back your macro script to close the graphs quickly.

   2.  From the Draw Graphs dialog box, select the following:

          a.  Graph=Global Disc Summary.

              Remember to deselect Global Bottlenecks and Global System
              CPU Utilization.

          b.  X-Axis=Year.

          c.  Points Every...=Day.

          d.  Shift=All Day.

          e.  Starting Day=1 March 1988.

   3.  Click OK.

Global Disc Summary Graph

What does this graph show you?  What do the curves mean?  Use Help to
get more information, if necessary.  Basically, the four curves show the
following:

   Logical     Rate of disc transfers that would occur if disc caching
               were not enabled.

   Physical    Actual rate of disc transfers.

   Mem Mgr     Rate of disc transfers caused by memory management
               (swapping).

   Util        Percentage of available disc transfer time being used.

The difference between logical and physical disc transfers equals the
rate of transfers eliminated by disc caching.  This difference is the
benefit of having disc caching.  If disc caching is unavailable on a
disc drive, logical disc transfers equal physical transfers, and there
is no disc-caching benefit.

Disc utilization is calculated as the percentage of time during which
disc transfers are taking place (or the system could not initiate a new
disc transfer for some reason).  Utilization ranges between zero, if no
transfers are taking place, and 100 percent, if every disc drive is
transferring data as fast as possible.

Factors that increase disc utilization include the following:

   *   Increased physical I/O rate.

   *   Increased transfer time.  Longer than normal seek times or very
       large transfer sizes can increase the transfer time.

   *   Channel or controller contention.  If two discs share a
       controller, when one is transferring data the other disc is also
       considered busy because it cannot start a transfer until the
       controller is released.

   *   Physically slower disc drives.  They have longer transfer times
       than faster disc drives when transferring the same data.

What can you tell about Trapper?  The logical disc rate is about twice
the physical disc rate, which means about 50 percent of the disc I/Os
are eliminated due to disc caching.  This isn't bad, but it isn't
terrific either.  It is not uncommon to eliminate 60 to 70 percent of
the disc I/Os.  As a rule of thumb, disc caching uses about 1 percent
of the CPU for each disc I/O per second that it eliminates.
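The arithmetic behind the caching benefit and the 1-percent rule of
thumb can be sketched in a few lines of Python.  This is only an
illustration, not part of HP LaserRX/MPE: the function name
caching_summary and the sample rates are hypothetical, chosen so that
the logical rate is twice the physical rate.  They are not readings
taken from a log file.

```python
def caching_summary(logical_rate, physical_rate, cache_cpu_pct):
    """Summarize the disc-caching benefit and its CPU cost.

    logical_rate  -- logical disc transfers per second
    physical_rate -- physical disc transfers per second
    cache_cpu_pct -- percent of CPU spent on disc caching
    """
    # Transfers per second that caching eliminated (logical minus physical).
    eliminated = logical_rate - physical_rate
    # Share of logical transfers that never reached the disc.
    pct_eliminated = 100.0 * eliminated / logical_rate if logical_rate else 0.0
    # Rule of thumb from the text: about 1 percent of CPU for each
    # disc I/O per second eliminated.
    expected_cpu_pct = 1.0 * eliminated
    return {
        "eliminated_per_sec": eliminated,
        "pct_eliminated": pct_eliminated,
        "expected_cpu_pct": expected_cpu_pct,
        # Positive value: caching costs more CPU than the rule of thumb suggests.
        "cpu_over_rule_of_thumb": cache_cpu_pct - expected_cpu_pct,
    }


# Hypothetical sample values (assumptions for illustration only).
summary = caching_summary(logical_rate=18.0, physical_rate=9.0, cache_cpu_pct=12.0)
print(summary)
```

A positive cpu_over_rule_of_thumb value suggests that caching is
costing more CPU than the rule of thumb would predict for the number of
transfers it eliminates.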
If you apply the 1-percent rule of thumb to Trapper, you will see that
8 to 10 I/Os per second (logical minus physical) are being eliminated,
which should cost roughly 8 to 10 percent of the CPU.  Disc caching is
actually taking 10 to 15 percent of the CPU (from the Global System CPU
Utilization graph).  This shows that although disc caching helps, it
might be using too much CPU.

If Trapper is short on CPU or memory, you might improve overall
performance by turning disc caching off.  Trapper was short on memory
until 21 April, but it looks like enough memory was added to relieve
the shortage.  Turning caching off would therefore help only if Trapper
is short of CPU.  Trapper might be short of CPU, but you cannot
determine that without investigating overall CPU usage.

Examine the Utilization curve.  It runs at about 20 percent until 19
April and then drops to less than 10 percent.  What might have caused
that?  Scan the factors affecting disc utilization listed previously,
and check what you already know:

   *   Did the physical I/O rate change?  Not in a way that explains
       the drop.  The physical I/O rate does change, but it increases.
       This would make utilization increase, not decrease.

   *   Did the transfer times change?  You don't know.

   *   Was there channel or controller contention?  You don't know.

   *   Were disc drives physically faster or slower?  You can't tell.

It is apparent that you cannot determine what caused the drop in
utilization from this graph alone.  You do know at least one thing that
did not cause it, and learning what did not cause a problem can be an
important step toward learning what did cause it.

What next?  You need more details.  At this point, you can close the
Global Disc Summary graph or leave it on your screen for reference.  To
obtain more details, do the following:

   1.  From the Draw Graphs dialog box, select the following:

          a.  Graph=Global Disc Detail.

              Remember to deselect Global Disc Summary.

          b.  X-Axis=Day.

          c.  Points Every...=Hour.

          d.  Shift=All Day.

          e.  Starting Day=Any day before CPU utilization dropped, such
              as 21 April.

   2.  Click OK.
Global Disc Detail Graph

The graph shows each disc drive on the system (or the five most-used
drives, if more than five are used).  The drives are sorted in
descending order from the most used to the least used.

You can see that the disc I/Os are not balanced.  Ldev 2 is used much
more frequently than Ldev 1.  The Utilization value on the Global Disc
Summary graph is an average of all the disc drives, but the Global Disc
Detail graph shows each drive's utilization independently.

Scroll one day to the right (click the gray area to the right side of
the horizontal scroll bar).  Keep scrolling one day at a time until you
reach 29 April.  What is different?  A new disc drive, Ldev 3, was
added.

Would adding this drive affect overall utilization?  Yes, overall
utilization would be affected if the drive had its own controller and
other factors (such as physical I/O rate) remained the same.  Average
utilization should drop.  It did.

Trapper's load is still unbalanced with respect to Ldev 1, but the load
is even between Ldev 2 and Ldev 3.  You could try to shift files to
Ldev 1 to balance the load on this system.  But because Ldev 1 is a
7925 disc, while Ldev 2 and Ldev 3 are 7935 discs, it might be
difficult to balance the load this way.

Extra Credit Exercise

How much good can you do by balancing the load on the disc drives?  In
other words, what is your payback if you make an effort to fine-tune
disc utilization?

To find out, display the Global CPU Utilization graph.  Check how much
time the CPU was idle while awaiting completion of a disc transfer.  On
a system-wide basis, you cannot expect more throughput than you can
obtain by keeping the CPU's paused time at zero.  On the other hand,
you can only assess the effect on terminal response time by calculating
how long specific processes had to wait for disc I/Os to complete.  For
now, only determine throughput.  You will study transaction response
times in the next part of this journey.
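One way to put a number on the payback in the exercise above is
sketched below in Python.  It rests on two assumptions that the text
does not state explicitly: that throughput scales with the time the CPU
spends doing work, and that all of the time the CPU is paused for disc
could be turned into useful work.  The result is therefore an upper
bound, not a prediction.  The function name and the sample percentages
are hypothetical.

```python
def max_throughput_gain_pct(cpu_busy_pct, cpu_paused_for_disc_pct):
    """Upper bound on the throughput gain from eliminating disc waits.

    cpu_busy_pct            -- percent of time the CPU is doing work
    cpu_paused_for_disc_pct -- percent of time the CPU is idle while
                               awaiting completion of a disc transfer

    Assumes (for illustration only) that throughput is proportional to
    CPU busy time and that all paused time could become busy time.
    """
    if cpu_busy_pct <= 0:
        raise ValueError("cpu_busy_pct must be greater than zero")
    return 100.0 * cpu_paused_for_disc_pct / cpu_busy_pct


# Hypothetical readings from a CPU utilization graph.
gain = max_throughput_gain_pct(cpu_busy_pct=60.0, cpu_paused_for_disc_pct=15.0)
print(f"Throughput could rise by at most {gain:.0f} percent")
```

If the paused time is already small, even perfect disc balancing
cannot buy much throughput, which is the point of the exercise.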

