SUCCESS STORY 1

 

Acceleration of "too slow" applications


Unconventional method brings unusual results for Kühne + Nagel (AG & Co.) KG

 

The Danish company iPerformance is taking a new approach to tuning iSeries systems with its tool GiAPA, which has also delivered convincing results for German customers.

Common Germany, the organization for SMB eServer users, held a workshop in Göttingen in mid-October on the theme of application performance optimization. In most companies, increasing response times or excessively long-running jobs lead to a hardware upgrade, since having a large number of employees waiting at their terminals is probably more expensive.

 

Although hardware keeps getting cheaper, the total cost of additional CPU, main storage and/or disks can be huge - and sometimes seems unnecessarily high. Against this background, the Danish company iPerformance ApS has for years offered performance improvements on a 'no cure - no pay' basis, and later built the experience gained into a software solution called GiAPA (Global iSeries Application Performance Analyzer), which was introduced on the market in 2005.

The following example from the international logistics company Kühne + Nagel shows how this tool works. With a number of model 890 systems, Kühne + Nagel is one of Germany's largest iSeries users. The IT department had noticed that the percentage increase in IT hardware costs exceeded the growth in the number of transactions, so it decided to form a task force to change this ratio.

 

The task force asked iPerformance to support it in optimizing a batch job. When the outcome of the optimization was beyond expectation - an 80 % improvement in elapsed time - it was decided to ask iPerformance to assist in analyzing all jobs on the production systems.

 

The methods used were presented by Kaare Plesner, founder and CEO of iPerformance, at the Common workshop in Göttingen. The attendees probably found the techniques surprisingly simple: data originating from generally available operating system display commands is written to a database, then sorted and listed with queries or simple programs that produce exception reports. In most cases it then becomes very clear why and where the performance weaknesses occur, in many cases even down to the source code line number.
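The idea of collecting snapshots and then listing only the exceptions can be illustrated with a minimal Python sketch. It is not GiAPA's implementation: the sample records, the job and program names, and the 25 % threshold are all hypothetical. Each sample stands for one snapshot of where a job was executing, and the report keeps only the code locations that account for a large share of all snapshots.

```python
from collections import Counter

# Hypothetical snapshots: (job name, program, source statement number)
# recorded each time the active jobs were sampled.
samples = [
    ("ORDBATCH", "STAT01", 1200),
    ("ORDBATCH", "STAT01", 1200),
    ("ORDBATCH", "STAT01", 1450),
    ("PRTINV", "PRT07", 310),
    ("ORDBATCH", "STAT01", 1200),
]


def exception_report(samples, min_share=0.25):
    """Return (location, hits, share) for every code location seen in at
    least min_share of all samples, most frequent first."""
    counts = Counter(samples)
    total = len(samples)
    return [
        (location, hits, hits / total)
        for location, hits in counts.most_common()
        if hits / total >= min_share
    ]


report = exception_report(samples)
for (job, program, statement), hits, share in report:
    print(f"{job:10} {program:8} stmt {statement:5} {hits:3} samples {share:.0%}")
```

Sorting by frequency and dropping everything below the threshold is what turns a large volume of raw samples into a short list of suspects.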

 

A manual analysis of thousands of jobs and programs would, however, be too much to handle, so Kühne + Nagel decided to buy a GiAPA license. This tool uses the same simple analysis methodology, but applies it automatically and simultaneously to all active jobs.

The results achieved were presented by Michael Albrecht, responsible for the performance analysis of the international standard applications at the Kühne + Nagel head office in Hamburg. He described a large number of run time improvements achieved using these simple methods: the results so far on one model 890 in Hamburg represented a saving of approximately 4 CPUs, each with a price tag of about € 250,000.

Another significant advantage of GiAPA is its very limited use of resources - less than 1 per mille (0.1 %) of one CPU when collecting performance data for all jobs and tasks.

 

 


 

SUCCESS STORY 2

Sales Statistics Batch Job Sped Up by 86 %

 

Cordes und Graefe KG in Bremen, Germany, ran a batch job that created different levels of sales statistics based on more than 30 million order line records. To create the statistics, additional information had to be fetched from other databases (customer data, rebates, etc.): a total of five other files were accessed by key for every record. The job ran with a rather low CPU percentage for around 5 hours on an iSeries server with abundant CPU, main storage, and disk capacity.

 

Because the job had to be run frequently, it was desirable to speed it up. The first step was to pinpoint exactly what was using most of the run time. A low CPU percentage (when CPU resources are plentiful) means that the job is waiting for something most of the time.

When the job was analyzed with GiAPA (Global iSeries Application Performance Analyzer) from iPerformance, the reason for the low CPU percentage was easy to see: most of the time the job was waiting for physical disk I/Os to complete because of millions of synchronous database reads. GiAPA also showed that the time was used by QDBGETKY (Read a record by key), and showed the programs and source statement numbers performing these reads. The names of the files involved were shown as well, although these were of course already known.

 

The random access to several large database files did not allow the operating system's expert cache to make the records needed available in advance, and the files were too large to be kept in main storage.

However, only very few fields were needed from the files read randomly, and iPerformance therefore suggested reading each of these files at job start, using sequential blocked access, and loading the key fields and the few bytes of data needed into user indexes.

 

All the random database accesses could then be replaced by index search operations, and the indexes would be small enough for storage management to keep them in main storage.
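The pattern described above (one sequential pass over each lookup file, keeping only the key and the few bytes needed, then in-memory lookups per order line) can be sketched in Python as follows. This is an illustration only: a Python dict stands in for a user index, and the file layout, field names and values are hypothetical rather than taken from the customer's application.

```python
import csv
import io

# Hypothetical lookup file; the statistics job only needs rebate_pct.
CUSTOMER_FILE = """cust_id,name,city,rebate_pct
C1,Alpha GmbH,Bremen,5
C2,Beta KG,Hamburg,10
"""

# Hypothetical order lines: (customer key, gross amount).
ORDER_LINES = [("C1", 200.0), ("C2", 100.0), ("C1", 50.0)]


def load_index(text, key_field, wanted_field):
    """Read the whole file once, sequentially, keeping only the key and
    the single field the job actually uses."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key_field]: row[wanted_field] for row in reader}


def net_sales_per_customer(order_lines, rebate_index):
    """Look up the rebate in memory instead of doing a keyed file read
    for every order line."""
    totals = {}
    for cust_id, amount in order_lines:
        rebate = float(rebate_index[cust_id]) / 100
        totals[cust_id] = totals.get(cust_id, 0.0) + amount * (1 - rebate)
    return totals


rebate_index = load_index(CUSTOMER_FILE, "cust_id", "rebate_pct")
totals = net_sales_per_customer(ORDER_LINES, rebate_index)
print(totals)
```

The trade-off is a fixed start-up cost for loading the indexes in exchange for removing a synchronous disk read from every iteration of the main loop, which pays off when the number of order lines is far larger than the number of lookup records.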

 

The strategy proved successful: the new version of the program needs only around 40 minutes of elapsed time, and the total CPU time used also decreased. The job's CPU percentage is now very high, because the job no longer has to wait for data to be fetched from disk, but since it runs at low batch job priority this never disturbs any interactive jobs.

 


 

 

SUCCESS STORY 3

GiAPA at Tele Columbus

Advanced Capacity Planning: Analyzing Application Peaks

 

Tele Columbus Daten und Service GmbH in Hannover, Germany, had a capacity planning problem to solve before the introduction of a large number of additional users of one of their major interactive applications. How much additional CPW capacity was needed?

 

The application in question was already in use on many workstations, and could be identified by the generic job names A206*, HE*, NI*, PC*, Platz*, TM*, and TS*. To calculate the additional resources needed, however, it was not sufficient to know the total CPU and/or I/O usage of these jobs, because usage of the application varied considerably with the time of day, with some rather significant peaks.

 

Basing capacity planning on the total use of resources by the current users would be like planning a highway based on the total number of cars per 24 hours, without considering the morning and evening rush hours.

 

On the other hand, basing the upgrade on the peaks only would likely mean an unnecessarily large investment. It was obvious that a more detailed analysis was needed. The data required to answer these questions was delivered by GiAPA (Global iSeries Application Performance Analyzer) from iPerformance. This software product collects performance data for all jobs and tasks every 15 seconds. During the subsequent analysis of the data, all jobs and tasks that show no signs of having or causing performance problems are normally removed from the final exception reporting data, but an option allows *ALL data to be kept.

 

When all data is kept, the GiAPA report options allow very flexible, user-defined selection criteria. Among other things, they can automatically accumulate all data belonging to the job names listed above and present the results in various ways. In this case two reports were selected:

A histogram showing in how many 15-second collection intervals each CPU percentage was used by the jobs selected.

 

A report by 15-second interval showing the use of CPU and I/Os, by type, for the selected jobs and for all jobs and tasks.

 

The first report documented that although peaks reaching almost 100 % of the total available CPU did occur, they were so rare that it would be overkill to base the upgrade on these peaks and double the CPW capacity. Satisfying the application's CPU demand in 99.5 % of all cases could be achieved with a considerably smaller upgrade.
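The reasoning behind the histogram report can be sketched in a few lines of Python. The interval data below is invented for illustration and the function names are hypothetical; the percentile uses the simple nearest-rank method, which is an assumption and not necessarily how the actual report was evaluated.

```python
from collections import Counter


def cpu_histogram(cpu_pct_per_interval):
    """Count how many collection intervals fell on each whole CPU percentage."""
    return Counter(round(pct) for pct in cpu_pct_per_interval)


def capacity_for_coverage(cpu_pct_per_interval, coverage_permille=995):
    """Smallest observed CPU percentage that covers the requested share of
    intervals (nearest-rank percentile, share given in per mille)."""
    ordered = sorted(cpu_pct_per_interval)
    # Integer ceiling of coverage * n, avoiding floating-point rounding.
    rank = (coverage_permille * len(ordered) + 999) // 1000
    return ordered[rank - 1]


# Hypothetical data: 1000 intervals of 15 seconds each, mostly moderate
# load, with three rare peaks close to 100 %.
intervals = [30] * 600 + [45] * 397 + [98, 99, 99]

histogram = cpu_histogram(intervals)
needed = capacity_for_coverage(intervals)
peak = max(intervals)
print(f"peak {peak} %, needed for 99.5 % of intervals: {needed} %")
```

With data shaped like this, sizing for the absolute peak and sizing for 99.5 % of the intervals give very different answers, which is the distinction the histogram makes visible.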

 

The second report, which also included totals for disk I/Os, showed that the disk I/O capacity needed to accommodate the additional workload was apparently available, even considering all the other applications running on this server. It also confirmed the results from the histogram report: the other applications running in parallel with the selected jobs did not use enough CPU to require a larger upgrade.

 

So although GiAPA was never intended to be a capacity planning tool, its ability to deliver any kind of job performance data down to intervals of only 15 seconds can produce reports that enable management to base hardware upgrade decisions on more accurate data.

 

 


 

SUCCESS STORY 4

Why are Jobs Delayed Although Resources are Available?

Kühne + Nagel (AG & Co.) KG in Hamburg, Germany, ran into serious performance problems when print jobs started piling up on the job queues of their iSeries server. Consignment documents, invoices, etc., that normally were printed within seconds, started having delays of several hours, obviously causing severe problems for the business. The print jobs had been working fine for years - and no changes had been made.

 

The server was in no way overloaded - an average CPU usage of around 65 % meant that 4 or 5 of the 16 CPUs were idling most of the time, and the disk I/O rates were also well within the recommended values. Many batch jobs were running in parallel to produce the print output, but they showed close to no use of resources - in fact, they did not seem to make any progress for prolonged periods of time.

 

It can be tough to locate the cause of performance problems when jobs are eating up a lot of resources, but it is often more difficult to find the cause when close to no resources are used. A record or object lock was an obvious guess, but did not seem to be the cause. Moreover, the application was designed to allow many parallel jobs (and not cause any locks), and with the number of transactions almost unchanged, such locks would have shown up long before.


Fortunately the company had an ace up its sleeve: the software package GiAPA from iPerformance. GiAPA (Global iSeries Application Performance Analyzer) includes options for analyzing where jobs are delayed when they are not making progress. GiAPA Trace Job Analysis showed that the delays, sometimes exceeding 30 seconds, always occurred when a database file or member in QTEMP was created or deleted, i.e. within IBM-supplied programs like QDBCRTME (Data Base Create Member).


With this documentation in hand, the problem could be passed on to IBM, which showed that seize waits (a seize is an operating system internal lock) during the update of the list of owned objects for the user profile QDBSHR were causing the problem. (IBM subsequently decided to issue a PTF to remove the problem for the current version of the operating system, and will of course include this change in the new OS version.)

 

In the meantime the problem could be avoided by clearing and reusing the files in QTEMP instead of creating and deleting them - and the printouts started to appear within seconds again.

 


 

SUCCESS STORY 5

To follow later

 
