SUCCESS STORY 1
Acceleration of
The Danish company iPerformance takes a new approach to tuning iSeries systems with its tool GiAPA, which has also shown convincing results at German customers.
Although hardware keeps getting less expensive, the total cost of additional CPU, main storage and/or disks can be huge - and sometimes seems unnecessarily high. Against this background, the Danish company iPerformance ApS has for years offered performance improvements on a 'no cure - no pay' basis. It later incorporated the experience gained into a software solution called GiAPA (Global iSeries Application Performance Analyzer), which was introduced on the market in 2005.
The taskforce asked iPerformance to support them in optimizing a batch job. When the outcome of the optimization exceeded expectations - an 80 % improvement in elapsed time - it was decided to ask iPerformance to assist in analyzing all jobs on the production systems.
A manual analysis of thousands of jobs and programs would, however, have been too much to deal with. Kühne + Nagel therefore decided to buy a GiAPA license. The tool uses the same simple analysis methodology, but applies it automatically and simultaneously to all active jobs.
Sales Statistics Batch Job Sped Up by 86 %
Because this job had to be run frequently, it was desirable to speed it up. The first step was to pinpoint exactly what was using most of the run time. A low CPU percentage (when CPU resources are plentiful) means that the job is waiting for something most of the time.
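The reasoning about a low CPU percentage can be illustrated with a small calculation. The Python sketch below is not part of GiAPA: the function names, the 50 % threshold and the sample figures are hypothetical and serve only to show how CPU time relates to elapsed time.

```python
def cpu_percentage(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Return the share of elapsed time a job spent using the CPU, in percent."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 100.0 * cpu_seconds / elapsed_seconds


def mostly_waiting(cpu_seconds: float, elapsed_seconds: float,
                   threshold_pct: float = 50.0) -> bool:
    """True when the job's CPU percentage is below the (assumed) threshold.

    This is only meaningful when CPU resources are plentiful; otherwise the
    job may simply be waiting for its turn on the CPU.
    """
    return cpu_percentage(cpu_seconds, elapsed_seconds) < threshold_pct


if __name__ == "__main__":
    # Hypothetical job: 10 minutes of CPU time during 2 hours of elapsed time.
    pct = cpu_percentage(cpu_seconds=600, elapsed_seconds=7200)
    print(f"CPU percentage: {pct:.1f} %")
    print("Mostly waiting:", mostly_waiting(600, 7200))
```

A job like the hypothetical one above spends most of its elapsed time waiting, so the next question is what it is waiting for, for example disk I/O.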
The random access to several large data base files prevented the operating system's expert cache from making the needed records available in advance, and the files were too large to be kept in main storage.
All the random data base accesses could then be replaced by index search operations, and the indexes would be small enough for storage management to keep them in main storage.
The strategy proved successful: the new version of the program uses only around 40 minutes of elapsed time, and the total CPU time used also decreased. The job's CPU percentage is now very high, because the job no longer needs to wait for data to be fetched from disk. But since it runs at the low batch job priority, it will never disturb any interactive jobs.
GiAPA at Tele Columbus
Advanced Capacity Planning: Analyzing Application Peaks
Tele Columbus Daten und Service GmbH in Hannover, Germany, had a capacity planning problem to solve before the introduction of a large number of additional users of one of their major interactive applications. How much additional CPW capacity was needed?
The application in question was already in use from many work stations, and could be identified by the generic job names A206*, HE*, NI*, PC*, Platz*, TM*, and TS*. But to calculate the additional resources needed, it was not sufficient to know the total CPU and/or I/O usage of these jobs, because the usage of this application varied considerably with the time of day, with some rather significant peaks.
Basing capacity planning on the total use of resources by the current users would be like planning a highway based on the total number of cars per 24 hours, without considering the morning and evening rush hours.
On the other hand, basing the upgrade on the peaks only would likely mean an unnecessarily large investment. It was obvious that a more detailed analysis was needed. The data required to answer the questions was delivered by GiAPA (Global iSeries Application Performance Analyzer) from iPerformance. This software product collects performance data for all jobs and tasks every 15 seconds. During the subsequent analysis of the data, all jobs and tasks not showing any signs of having or causing performance problems are normally removed from the final exception reporting data, but an option allows *ALL data to be kept.
When all data is kept, the GiAPA report options allow very flexible, user-defined selection criteria, capable among other things of automatically accumulating all data belonging to the job names listed above and presenting the results in various ways. In this case two reports were selected:
A histogram showing in how many 15-second collection intervals each CPU percentage was used by the selected jobs.
A report by 15-second interval showing the use of CPU and I/Os by type, both by the selected jobs and by all jobs and tasks.
The first report documented that while peaks reaching almost 100 % of the total available CPU did occur, this happened so rarely that it would be overkill to base the upgrade on these peaks and double the CPW capacity. Satisfying the CPU required by the application in 99.5 % of all cases could be achieved with a considerably smaller upgrade.
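The idea behind the histogram report can be sketched in a few lines of Python. This is an illustration only, not GiAPA's implementation or data format. The layout of the sample records (interval id, job name, CPU percentage), the function names, the bucket width and the case-insensitive name matching are all assumptions. Only the generic job names and the 99.5 % figure are taken from the text.

```python
from collections import Counter
from fnmatch import fnmatchcase
import math

# Generic job names quoted in the text.
GENERIC_NAMES = ["A206*", "HE*", "NI*", "PC*", "Platz*", "TM*", "TS*"]


def is_selected(job_name, patterns=GENERIC_NAMES):
    """True if the job name matches one of the generic names (case-insensitive)."""
    name = job_name.upper()
    return any(fnmatchcase(name, p.upper()) for p in patterns)


def cpu_per_interval(samples):
    """Sum the CPU % of the selected jobs for each collection interval.

    samples: iterable of (interval_id, job_name, cpu_pct) tuples,
    a hypothetical layout with one record per job and 15-second interval.
    """
    totals = {}
    for interval_id, job_name, cpu_pct in samples:
        totals.setdefault(interval_id, 0.0)  # count intervals with no selected job as 0 %
        if is_selected(job_name):
            totals[interval_id] += cpu_pct
    return totals


def histogram(totals, bucket_width=10):
    """Count the intervals falling into each CPU % bucket (0-9, 10-19, ...)."""
    counts = Counter()
    for pct in totals.values():
        bucket = min(int(pct // bucket_width) * bucket_width, 100 - bucket_width)
        counts[bucket] += 1
    return dict(sorted(counts.items()))


def cpu_needed(totals, coverage=0.995):
    """Smallest observed CPU % that covers the given share of intervals (nearest rank)."""
    values = sorted(totals.values())
    rank = math.ceil(coverage * len(values))
    return values[rank - 1]
```

With data of this shape, `histogram(cpu_per_interval(samples))` shows how rarely the peaks occur, and `cpu_needed(...)` gives the CPU percentage that would have been sufficient in 99.5 % of the intervals, which is the kind of figure a smaller upgrade could be based on.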
The second report, which also included totals for disk I/Os, showed that the disk I/O capacity needed to accommodate the additional workload was apparently available, even considering all the other applications running on this server. It also confirmed the results of the histogram report: other applications running in parallel with the selected jobs did not need so much CPU that a larger upgrade was required.
So although GiAPA was never intended to be a capacity planning tool, its ability to deliver any kind of job performance data down to intervals of only 15 seconds can produce reports that enable management to make hardware upgrade decisions on a sounder basis.
Why are Jobs Delayed Although Resources are Available?
The server was in no way overloaded - an average CPU usage of around 65 % meant that 4 or 5 of the 16 CPUs were idling most of the time, and the disk I/O rates were also well within the recommended values. Many batch jobs were running in parallel to produce the print, but they showed close to no use of resources - in fact, they did not really seem to move for prolonged periods of time.
It can be tough to locate the cause of performance problems when jobs are eating up a lot of resources, but it is often more difficult to find the cause when close to no resources are used. A record or object lock was an obvious guess, but did not seem to be the cause. Moreover, the application was designed to allow many parallel jobs (and not cause any locks), and with the number of transactions almost unchanged, such locks would have shown up long before. Fortunately the company had an ace up its sleeve: the software package GiAPA from iPerformance. GiAPA (Global iSeries Application Performance Analyzer) includes options for analyzing where jobs are delayed when they do not move. GiAPA Trace Job Analysis showed that the delays, sometimes exceeding 30 seconds, always occurred when a data base file or member in QTEMP was created or deleted, i.e. within IBM-supplied programs like QDBCRTME (Data Base Create Member).
In the meantime the problem could be avoided by clearing and reusing the files in QTEMP instead of creating and deleting them - and the print started to appear within seconds again.
To follow later