Checking I/O Efficiency

In commercial applications, database I/O is almost always the bottleneck, on average accounting for somewhat more than 2/3 of all resource usage. Processors today are so fast that even complex calculations within the business logic only rarely consume noticeable CPU.

If anyone doubts this, just remember that even when several batch jobs are active at the same time, CPU usage will normally still be far from 100 % - the reason being that the jobs spend almost all of their time waiting for I/O.

A job could use 100 % CPU if all data were always available in memory when the programs needed it. This is not the case - the bottleneck is the I/O, where data is read into or written out from main storage.


Bottleneck

With this in mind, it is hardly surprising that the most rewarding optimization in most cases is to speed up I/O, which in turn means reducing the number of accesses.

What is surprising, however, is that many (i.e., most) programmers were never trained to consider this. On the contrary, the saying goes that “Machines are so fast today that performance is no concern”.

GiAPA is, to our knowledge, the only tool that analyzes file accesses down to the relative record number. Doing this, we obtain astonishing successes, and are often able to find very significant speed-up potential within applications believed to run efficiently.

GiAPA's unique analysis of how files are accessed makes it very easy to spot optimization potential. Position the cursor within the Job Summary report on the job you want to analyze, and hit a command key to reach one of the four reports. Each tells its part of the story - a couple of examples are explained below:

Example 1 shows 50+ % optimization potential:

The second line on the report shows file R1NCMRW, file number 5 (GiAPA's internal numbering). Data was collected for 1184 intervals, more than enough to base reliable conclusions on the statistics. 250.792.549 reads were performed, but the difference between the highest and the lowest relative record number seen in the 1184 "HotSpot" samples was only 1665 records. GiAPA therefore suggests in the rightmost column that 250.790.884 reads could be saved if each record was read only once (the 1665 records could be stored in a table in memory).

GiAPA also reports that 18 million reads were used to access only one record (probably parameter information) in file number 6. The pink line shows that the total potential savings appear to be 422 million I/Os. This should cut run time in half - but see also example 2: we could do still more here.
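The idea of reading each record once and keeping it in an in-memory table can be sketched as follows. This is a minimal illustration in Python, not GiAPA code and not the analyzed application: the `CountingFile` class, its `read_record` method, and the demo data are all hypothetical stand-ins for a database file whose reads we want to count.

```python
# Hypothetical stand-in for a database file: each call to read_record
# counts as one read. Names and data are illustrative only.
class CountingFile:
    def __init__(self, records):
        self._records = records
        self.reads = 0

    def read_record(self, rrn):
        self.reads += 1
        return self._records[rrn]


def process_uncached(file, rrns):
    """Issue one read per lookup, even for records already seen."""
    return sum(file.read_record(rrn) for rrn in rrns)


def process_cached(file, rrns):
    """Read each distinct record once and keep it in an in-memory table."""
    table = {}
    total = 0
    for rrn in rrns:
        if rrn not in table:
            table[rrn] = file.read_record(rrn)
        total += table[rrn]
    return total


# Small-scale demo: 1,665 distinct records, 100,000 lookups.
records = {rrn: rrn * 2 for rrn in range(1, 1666)}
lookups = [(i % 1665) + 1 for i in range(100_000)]

plain = CountingFile(records)
cached = CountingFile(records)

# Both variants compute the same result; only the number of reads differs.
assert process_uncached(plain, lookups) == process_cached(cached, lookups)
print(plain.reads, cached.reads)

# The saving quoted in the example follows the same arithmetic:
# total reads minus one read per distinct record.
print(250_792_549 - 1_665)
```

In the demo the cached variant issues one read per distinct record, regardless of how many lookups follow. Whether a table of this size fits comfortably in memory depends on the record length, which the report excerpt does not state.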

    

GiAPA tells: 250.792.549 reads were used on <= 1665 records

FileAnSum

Example 2 reveals that even more could be improved:

Turning to the detailed file statistics report for file 2 (first line on the report above) and paging down through all 1184 samples taken for this file, we could see that the relative record number increased all the time (= file read in arrival sequence), and that the number of I/Os and the relative record number were not far apart, meaning that the file was read one record at a time.
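The reasoning applied to the samples can be expressed as a simple check. The sketch below is an assumption about how such a check could look, not a description of GiAPA's internals: the function name, the sample layout (cumulative I/O count, relative record number) and the 5 % tolerance are all hypothetical.

```python
def looks_like_unblocked_sequential(samples, tolerance=0.05):
    """Return True if the samples suggest a sequential, one-record-per-I/O read.

    samples: list of (cumulative_io_count, relative_record_number) tuples,
    in the order in which they were collected.
    """
    if len(samples) < 2:
        return False
    rrns = [rrn for _, rrn in samples]
    # Arrival sequence: the relative record number keeps increasing.
    ascending = all(a < b for a, b in zip(rrns, rrns[1:]))
    # One record per I/O: the I/O count stays close to the record number.
    close = all(abs(ios - rrn) <= tolerance * rrn for ios, rrn in samples)
    return ascending and close


# Made-up samples for illustration only.
unblocked = [(1000, 1000), (2050, 2040), (3010, 3000)]
blocked = [(10, 1000), (20, 2000), (30, 3000)]
random_access = [(1000, 500), (2000, 20), (3000, 900)]

print(looks_like_unblocked_sequential(unblocked))
print(looks_like_unblocked_sequential(blocked))
print(looks_like_unblocked_sequential(random_access))
```

Only the first set of samples matches the pattern described above. The second has far fewer I/Os than records, which is what a blocked read would look like, and the third jumps around in the file.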

It would be MUCH more efficient to read the file in large blocks. This would require only a very small change in the program source code, if any at all - depending on the type of read used, a change to the CL code might suffice.
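The effect of blocking can be illustrated outside the database with an ordinary file. The following Python sketch reads the same fixed-length records once with one record per request and once with many records per request, and counts the read calls. The record length, blocking factor and file name are arbitrary assumptions for the demo, and the sketch does not show how blocking is configured on the platform GiAPA analyzes.

```python
import os
import tempfile

RECORD_LEN = 100          # hypothetical fixed record length in bytes
RECORDS_PER_BLOCK = 1000  # hypothetical blocking factor


def read_all(path, records_per_read):
    """Read the file sequentially; return (records seen, read calls issued)."""
    records = calls = 0
    # buffering=0 turns off Python's own buffering, so every read() call
    # below is passed on as a separate request.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(RECORD_LEN * records_per_read)
            if not chunk:
                break
            calls += 1
            records += len(chunk) // RECORD_LEN
    return records, calls


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")
    # Create a demo file holding 50,000 fixed-length records.
    with open(path, "wb") as f:
        f.write(b"x" * RECORD_LEN * 50_000)

    single = read_all(path, 1)
    blocked = read_all(path, RECORDS_PER_BLOCK)

print("one record per read:", single)
print("blocked read:       ", blocked)
```

Both runs deliver the same records to the program, but the blocked run needs only a small fraction of the read requests.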

FileStat