I have two big loves in my life, but right now I’ll share only the second one: the accounting information provided by IBM® DB2®. Many application performance problems can be solved by looking into accounting reports. If you want to get a better position inside your organization, you should really work to become a champion in DB2 accounting reports.
Despite the advantages of using those accounting reports, there is still a problem—one that our industry has been facing for more than 25 years. Accounting information takes us only to the package level; it will not directly pinpoint which of the many SQL statements inside a given package is the one that consumes most of the resources.
DB2 10 introduces a new feature as part of its enhanced monitoring support that can help take the consumption figures down to the per-package statement level using a simple and standard method. This new feature, called statement-level monitoring, deserves more attention.
In this two-part article, I will take you through the aspects of implementing this feature in your organization by breaking it up into smaller, easier-to-understand sections. I will focus on what the new data presents, how it is retrieved, and how it can help diagnose performance problems.
Let’s start by looking at what the data presents. DB2 can now optionally collect every thinkable performance number on every single piece of static SQL that has been bound after the system went to new-function mode (NFM). DB2 in NFM will populate a new globalwide unique statement number column (STMT_ID) in SYSIBM.SYSPACKSTMT. This STMT_ID is the anchor point for the data collection. Packages that have not yet been bound after NFM will not have any of this new data collected.
The figures collected will be cumulative—they are not counted per transaction but will be updated in storage as long as the package is present in the environmental descriptor manager (EDM) pool. The figures can be read on the fly from a program you write. You have the option of reading these figures and noting how any DB2 resource is consumed per statement in a given package. For instance, if you do this every 15 minutes, you can see how the statement consumption evolves in this interval and gives you vital information.
It is very important to understand that this way, the figures have no direct relationship to the accounting information since this information is normally carved per unit of work (UOW) or thread. However, in most cases, the data will clearly pinpoint which statements are consuming which resources. If you can save this information and show it to developers in a graphical way, you should start seeing the picture.
The data is anchored on all the information you would expect, such as PACKAGENAME, STMT_ID, COLLECTION, and CONSISTENCY TOKEN. So if you have several versions of a package running or the same version in a different collection, you will get an individual set of data collected by DB2.
To get a peek into the kind of information contained, you can look into your SDSNMACS-datasets in member DSNDQW05 and search for QW0401 (see Figure 1).Figure 1. Example of some of the fields in the 401 records
How do you incorporate this feature into your infrastructure? Let me end the first part of this article by illustrating what you could show to your developers or your fellow DBAs.
This example performance data warehouse is built upon accounting information. Until now, users could see the consumption at the package level, which shows how resources consumed are distributed on program names for a given business area (see Figure 2).Figure 2. A DBA’s view of CPU resources consumed for business area per program name
Figure 2 shows all programs within a given business area, where it is clear that two programs —BASM006 and BASMANV—are using approximately 88 percent of the CPU time within this business area. Until now, there were details about which statement is the most consuming. After implementing the statement-level monitoring feature, you can show the distribution for all thinkable resources. Take a look at the program that used most of the CPU time (BASM006) and parse the individual statement numbers (Figure 3).Figure 3. The distribution of resources used per statement as returned in the 401 records
The big numbers on the legends for the pie charts are the statement numbers. The statement text is easily retrieved from SYSIBM.SYSPACKSTMT. You can now see that for the CPU time (second pie, top row), two statements are in play: 2734762 and 2734763 are consuming approximately 97 percent of the CPU time burned for that package.
In Part 2 of this article, I will talk about how to enable the data collection in DB2 and how to code the calls that will retrieve the data. This process involves advanced Assembler programming.
IBM big data in a minute: Bringing the power of Hadoop to the enterprise
Video: The right tool for the job
Nature of analytics video: IBM and the swan of all fears
IBM redesigns its Big Data & Analytics website with IBM Watson Foundations capabilities
Visit a website with comprehensive resources dedicated to the chief data officer role
Podcast: Learn about the InfoSphere Streams project at GitHub