Simplifying Db2 for z/OS CPU Optimization: Eradicating Inefficient SQL Processing

Without doubt, the IBM Z Mainframe server is recognised as the de facto choice for storing mission critical System of Record (SOR) data in database repositories for 92 of the top 100 global banks, 23 of the 25 top global airlines, the top 10 global insurers & ~70% of all Fortune 500 companies. ~80% of mission critical data is hosted by IBM Z Mainframe servers, processing 30+ Billion transactions per day, including ~90% of all credit card transactions. This data is accessed by ~1.3 Million CICS transactions per second, compared with a Google (mostly search) processing rate of ~70,000 transactions per second. Interestingly enough, despite processing so many mission critical transactions, the IBM Z Mainframe server platform accounts for only ~6.2% of global IT spend. One must draw one’s own conclusions as to why some IT professionals perceive the IBM Z Mainframe server as a legacy platform, not worthy of consideration as a strategic IT server platform…

The digital transformation has delivered exponential growth of data, typically classified as Cloud, Mobile & Social based. This current & ever-growing data source requires intelligent analytics to deliver meaningful business decisions, plus agile application software delivery to gain competitive edge. This digital approach can sometimes deliver a myriad of micro business application changes, personalised for each & every customer, often delivering “pop-up” applications…

IBM Z Mainframe software costs are often criticized as a major barrier to maintaining or indeed commissioning the platform. IBM have tried to minimize these costs with numerous sub-capacity pricing options over the last 30 years or so, but these are perceived by many as overly complicated; although with a modicum of knowledge, a specialized personnel resource can easily control software costs. All that said, IBM have introduced Tailored Fit Pricing for IBM Z, in an attempt to simplify software cost management. A recent blog reviewed the Tailored Fit Pricing for IBM Z offering & whether or not you decide this IBM Z pricing mechanism is suitable for your organization, optimizing IBM Z CPU MSU/MIPS usage is mandatory. Recognizing that the IBM Z Mainframe server is the de facto database server for System of Record data, primarily via the Db2 subsystem, clearly optimizing Db2 CPU usage, whether for OLTP transactions, typically via CICS, or the batch window, has been, & always will be, worthwhile…

All too often, many IT disciplines can be characterized by a generic 80/20 rule & data typically follows suit, where 80% of data is accessed 20% of the time & 20% of data is accessed 80% of the time. The challenge with such a blunt Rule of Thumb (ROT) is that it’s static, but it’s a good starting point. Ideally, for any large data source, there would be a dynamic sampling mechanism that identifies the most active data, loading this into the highest speed memory resource to reduce I/O access times & therefore CPU usage. Dynamic management of such a data buffer would render the 80/20 rule extraneous to requirements, as each & every business has its own data access profile. However, a simple cost benefit & therefore Proof of Value (POV) analysis could ensue.

From a Db2 viewpoint, pre-defined structures such as buffer pools offer some relief in storing highly referenced data in a high-speed server memory resource, but this has a finite capacity versus performance benefit, not necessarily using the fastest memory structures available nor dynamically caching the most accessed data. The business considerations of not optimizing Db2 data access are:

  • Elongated Batch Processing: With ever increasing amounts of data to process & greater demands for 24*7*365 availability & real-time access, data access optimization is fundamental for optimized service delivery, often measured by mission critical SLA & KPI metrics. Optimized batch processing is a fundamental requirement for acceptable customer facing business service delivery.
  • Slow Transaction Response Times: As the nature of customer requirements changes, with mobile device applications exponentially increasing the number of daily transactions, overall system resource capacity is often stressed during peak hours. Optimized transaction response time is a fundamental requirement, being the most transparent service delivered to each & every end customer.

An easy but very expensive solution to remediate batch processing & transaction response issues is to provide more resources via a CPU server upgrade. A more sensible approach is to optimize the currently deployed resources, safeguarding that frequently accessed data is mostly if not always resident in high speed cache, reducing the I/O processing overhead & therefore CPU usage, which in turn will optimize batch processing & transaction response times, while controlling associated IBM Z Mainframe server hardware & software costs.
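Within base Db2, one conventional approach is to dedicate a buffer pool to the hottest objects & size it so they remain memory resident. The following is a minimal sketch only, using hypothetical object & buffer pool names; the -ALTER & -DISPLAY entries are Db2 commands rather than SQL:

  -- Assign a hypothetical high-activity tablespace & index to a dedicated pool
  -- (depending on Db2 version & page size, this may be a pending change requiring a REORG)
  ALTER TABLESPACE PRODDB.CUSTTS BUFFERPOOL BP8;
  ALTER INDEX PRODSCHEMA.CUSTIX1 BUFFERPOOL BP8;

  -- Size the pool (VPSIZE is expressed in buffers) & use LRU page stealing
  -ALTER BUFFERPOOL(BP8) VPSIZE(200000) PGSTEAL(LRU)

  -- Review hit ratios & I/O activity after a representative interval
  -DISPLAY BUFFERPOOL(BP8) DETAIL

As noted above, this remains a static allocation; the pool size & object placement are fixed until a human changes them, whereas the most frequently accessed data changes by hour & by business cycle.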

The ubiquitous Db2 data access method is Structured Query Language (SQL) based, where IBM has its own implementation, SQL for Db2 for z/OS, accessed via the commonly used COBOL (EXEC SQL) programming language or a Db2 Connect API (E.g. ADO.NET, CLI, Embedded SQL, JDBC, ODBC, OLE DB, Perl, PHP, pureQuery, Python, Ruby, SQLJ). There are 2 types of SQL processing, static & dynamic. Static SQL minimizes execution time because its processing is performed in advance; dynamic SQL is processed when the SQL statement is submitted to the IBM Z Db2 server, although the Dynamic Statement Cache provides some relief. Dynamic SQL is more flexible, but potentially slower. The decision to use static or dynamic SQL is typically made by the application programmer. There is a danger that the Dynamic Statement Cache might be considered a panacea for SQL CPU performance optimization, but as per any other performance activity, reviewing historical changes is a good idea. The realm of possibility exists for the Db2 Subject Matter Expert (SME) to be pleasantly surprised that, more often than not, there are still significant SQL CPU optimization opportunities…

From a generic Db2 viewpoint, with static SQL, you cannot change the form of SQL statements unless you make changes to the program. However, you can increase the flexibility of static statements by using host variables. Obviously, application program changes are not always desirable.
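For illustration, a static embedded SQL statement in a COBOL program fixes the statement text, & therefore the access path chosen at BIND time, while host variables supply the values at run time. A minimal sketch, with hypothetical table, column & variable names:

  EXEC SQL
    SELECT BALANCE, CURRENCY
      INTO :WS-BALANCE, :WS-CURRENCY
      FROM PRODDB.ACCOUNT
     WHERE ACCOUNT_ID = :WS-ACCOUNT-ID
  END-EXEC.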

Dynamic SQL provides flexibility; if an application program needs to process many data types & structures, such that the program cannot define a static model for each one, dynamic SQL overcomes this challenge. Dynamic SQL processing is facilitated by Query Management Facility (QMF), SQL Processing Using File Input (SPUFI) or the UNIX System Services (USS) Command Line Processor (CLP). Not all SQL statements are supported when using dynamic SQL. A Db2 application program that processes dynamic SQL accepts as input, or generates, an SQL statement in the form of a character string. Programming is simplified when you can structure programs not to use SELECT statements, or to use only those that return a known number of values of known types.
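A minimal sketch of that character-string flow, assuming a COBOL program where the hypothetical statement text 'UPDATE PRODDB.ACCOUNT SET STATUS = ? WHERE ACCOUNT_ID = ?' has already been moved into the host variable :WS-STMT-TEXT:

  EXEC SQL PREPARE STMT1 FROM :WS-STMT-TEXT END-EXEC.
  EXEC SQL EXECUTE STMT1 USING :WS-STATUS, :WS-ACCOUNT-ID END-EXEC.

For a statement without parameter markers that is executed only once & returns no result set, EXECUTE IMMEDIATE combines the PREPARE & EXECUTE steps.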

For Db2 data access, SQL statement processing requires an access path. The major SQL statement performance factors to consider are the amount of time that Db2 uses to determine the access path at run time & whether the access path is efficient. Db2 determines the SQL statement access path either when you bind the plan or package that contains the SQL statement or when the SQL statement executes. The repeating cost of preparing a dynamic SQL statement can make performance worse than that of the equivalent static SQL statement. However, if you execute the same SQL statement often, using the dynamic statement cache decreases the number of times dynamic statements must be prepared.
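For example, the chosen access path can be externalized with EXPLAIN, & coding dynamic SQL with parameter markers rather than literal values allows a prepared statement to be found & reused in the dynamic statement cache, because the statement text matches exactly on subsequent executions. A minimal sketch, with hypothetical names & assuming a PLAN_TABLE exists under the current SQLID:

  -- Externalize the access path; rows describing it are written to PLAN_TABLE
  EXPLAIN PLAN SET QUERYNO = 100 FOR
    SELECT BALANCE FROM PRODDB.ACCOUNT WHERE ACCOUNT_ID = ?;

  -- Cache-friendly dynamic SQL (parameter marker, statement text identical every time):
  --   SELECT BALANCE FROM PRODDB.ACCOUNT WHERE ACCOUNT_ID = ?
  -- Cache-unfriendly (a different literal each time forces a full prepare):
  --   SELECT BALANCE FROM PRODDB.ACCOUNT WHERE ACCOUNT_ID = 1234567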

Typically, organizations have embraced static SQL over dynamic because static is more predictable, showing little or no change, while dynamic implies ever-changing & unpredictable behaviour. Db2 performance optimization functions have been incorporated into base Db2 (E.g. Buffer Pools) & software products (E.g. IBM Db2 AI for z/OS, IBM Db2 for z/OS Optimizer, IBM Db2 Analytics Accelerator, IBM Z Table Accelerator (IZTA)), with varying levels of benefit & cost. Ultimately IBM Z Mainframe customers need simple, cost-efficient, off-the-shelf solutions of a plug & play variety & without doubt, optimizing static SQL data processing is a pragmatic option for reducing Db2 subsystem CPU usage.

In Db2 Version 10, support for a 64-bit run time was introduced, providing Virtual Storage Constraint Relief (VSCR) & improving the vertical scalability of Db2 subsystems. With Db2 Version 11, the key z/Architecture benefit of 64-bit virtual addressing support was finally delivered, increasing the capacity of central memory & virtual address spaces from 2 GB to 16 EB (Exabytes), eliminating most storage constraints. It therefore follows that any Db2 CPU performance optimization solution should also exploit the z/Architecture 64-bit feature, to support the ever-increasing data storage requirements of today’s digital workloads.

As we have identified, Db2 can consume significant amounts of z/OS CPU accessing & retrieving the same static, frequently used data elements repetitively. Upon analysis, these static, frequently used data elements typically originate from a small percentage of Db2 tablespaces. At first glance these simple SQL programs are considered low risk, but they are repeatedly processed, often at peak processing times, consuming excessive CPU & increasing processing cost accordingly, typically z/OS Monthly Licence Charges (MLC) related. Db2 optimization tools for access path or buffer pool management provide some benefit, but this is not always significant & may require application changes. Patently there is a clear & present requirement for a simple plug & play solution, transparent to Db2 processing, maintaining an optimized high-performance in-memory cache of frequently used Db2 data, safeguarding data integrity in various environments, including SYSPLEX, Data Sharing, et al…

QuickSelect is a plug-in solution, dynamically activated in a batch or OLTP environment (I.E. CICS, IMS/TM), intercepting repetitive SQL statements from Db2 application programs & storing the most active result sets, not necessarily the entire tablespace, in a high-performance in-memory cache, returning to applications the same result set as per Db2, but much faster & using less CPU accordingly. QuickSelect is completely transparent to z/OS applications, eliminating any requirement to change, recompile or relink application source, or rebind packages. QuickSelect processing can be switched on or off with a single keystroke, either reverting to standard Db2 SQL processing or benefiting from the QuickSelect high-speed cache for optimized CPU resource usage.

The 64-bit QuickSelect server, implemented as a started task, intelligently caches data in self-managed memory above the bar, supporting up to 16 EB of memory & eliminating any concerns about encroaching on other commonly used storage areas (E.g. ECSA). The intelligent caching mechanism safeguards that only highly active data is retained, optimizing the associated cache memory size required.

QuickSelect caches frequently requested Db2 SQL result sets, returning these results to the application from the QuickSelect cache whenever a repetition of the same SQL is encountered. For data integrity purposes, QuickSelect immediately invalidates result sets upon detection of changes to the underlying tables, implicitly validating each cache resident SQL result set. Changes to Db2 data by application programs are captured by a standard Db2 VALIDPROC process, attached to the typically small subset of frequently accessed tables of interest to QuickSelect. Db2 automatically activates the VALIDPROC routine whenever the table contents are changed by INSERT, DELETE, UPDATE or TRUNCATE statements, invalidating cached data from the updated tables automatically. For standard Db2 utilities such as LOAD/REPLACE, REORG/DISCARD & RECOVER, table-level changes are identified by a QuickSelect utility-trap, again invalidating cached data from the updated tables automatically. QuickSelect also supports SYSPLEX & Data Sharing environments, tracking update activity via the same XCF functions & processes used by Db2.
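For context, attaching a validation exit routine to a table uses standard Db2 DDL; a minimal sketch, with a hypothetical table & exit routine name:

  -- Attach a validation exit routine to a frequently accessed table; Db2 then
  -- drives the routine whenever the table contents change, which is the hook
  -- used to invalidate any cached result sets built from this table
  ALTER TABLE PRODDB.ACCOUNT ADD VALIDPROC QSVALID;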

QuickSelect delivers the following benefits:

  • CPU Savings: Meaningful reduction (E.g. 20%) in Db2 SQL direct processing CPU; a 10%+ peak time CPU reduction is not uncommon.
  • Faster Processing: Optimized CPU usage delivers shorter batch processing & OLTP transaction response times, for related SLA & KPI objective compliance.
  • Transparent Implementation: No application changes required, whether source code, load module or Db2 package.
  • Survey Mode: Unobtrusive data sampling, with minimal Db2 workload overhead, to identify potential CPU savings from repetitive SQL & tables of interest, before implementation.
  • Staggered Deployment: The ability to implement via granular criteria (E.g. Job, Program, Table, Transaction, Etc.).
  • Reporting & Analytics: Extensive information detailing cache usage for Db2 programs & tables.

Since 1993 Db2 has evolved dramatically, in line with the evolution of the IBM Z Mainframe server. When considering today’s requirement for a digital world, processing ever increasing amounts of mission critical data, a base requirement to optimize CPU processing for Db2 SQL data access is mandatory. In a hybrid support environment, where today’s IBM Z Mainframe support resource requires an even blend of technical & business skills, plug & play, easy-to-use & results driven solutions are required to optimize CPU usage, transparent to the subsystem & related application programs. QuickSelect is such a solution, fully exploiting 64-bit z/Architecture for ultimate scalability, identifying & resolving a common CPU consuming data access problem for a mission critical resource, namely the Db2 subsystem, maintaining System of Record data.

z/OS CPU optimization is a mandatory requirement for every organization, to reduce associated software & hardware costs &, in theory, as a prerequisite for deploying the Tailored Fit Pricing for IBM Z pricing mechanism. Tailored Fit Pricing uses the previous 12 Months of SCRT submissions to establish a baseline for MSU charging over a contracted period, typically 3 years. If there are any unused MSU resources, these are carried forward to the next year, but if those MSU resources remain unused at the end of the contracted period, they are lost, meaning the organization has paid too much. If MSU usage exceeds the agreed Tailored Fit Pricing baseline, the excess MSU resources are charged at a discounted rate. Clearly, achieving an optimal MSU baseline before embarking on a Tailored Fit Pricing contract is arguably mandatory & it therefore follows that optimizing CPU forever more safeguards optimal z/OS MLC charging during the Tailored Fit Pricing contract. QuickSelect for Db2 is a seamless CPU optimization product that will perpetually deliver benefit, assisting organizations to minimize their z/OS MLC costs, whether they continue to proactively manage the R4HA & submit monthly SCRT reports, or embark on a Tailored Fit Pricing contract…

System z: Optimizing DASD I/O Subsystem Performance

Historically there was a very simple synergy between the IBM S/370 Mainframe and its supporting disk I/O (DASD) subsystem, allowing for Mainframe host connectivity to physical and logical disk devices (I.E. 3390). The analysis and tuning of this I/O subsystem has always been, and continues to be, supported by the SMF Type 7n records via IBM RMF and the BMC CMF alternative. However, over the years, major advances in DASD subsystems and the System z Mainframe server have delivered many layers of technology resources (E.g. Cache, Memory, FICON Channels, RAID Storage, Proprietary Microcode, et al), introducing complexity into the identification of DASD I/O subsystem performance problems.

Technology based metrics (E.g. I/O Rate Response Time, I/O MB/S Bandwidth, et al) have also been complemented by more meaningful business focussed Service Level Agreements (SLA). Therefore today’s System z I/O Performance Analyst must gather and act upon proactive, meaningful information from the ever-increasing amounts of performance data available. Put another way, too much data can deliver not enough information! As previously stated, it was forever thus; RMF and CMF have always collected the requisite performance data and arguably no other data source is required (E.g. OMEGAMON/TMON/SYSVIEW Performance Monitor, SAS/MXG/MICS/WPS Performance Database). RMF/CMF is the ideal data source for thorough and timely System z I/O performance management, where intelligent analytics and expert knowledge are required to present this “Golden Record”.

However, today’s System z Support Teams need simple and timely presentation of this data, graphically highlighting potential challenges for their Management and allowing for simple tracking of SLA compliance and technology changes (I.E. Software/Hardware Upgrades).

Additionally, Workload Manager (WLM) can control non-paging queued DASD I/O requests, based upon device busy conditional processing. Therefore the z/OS system can manage I/O priorities in a Sysplex, based on WLM service class goals. WLM dynamically adjusts the I/O priority based on service class goal performance and whether a DASD device can influence the overall performance objectives. For obvious reasons, this WLM function does not micro-manage I/O priorities, only changing a service class period’s I/O priority infrequently. WLM is deployed by many System z users to assist in the automated management of system resources (E.g. CPU, Memory, I/O, et al), based upon Service Level goals.

From a DASD subsystem technology viewpoint, there is no longer an obvious one-to-one direct connection between the Mainframe host and the DASD device. An increasing number of technological advances, both microcode and hardware (E.g. Memory, Fibre Channel, Function Assist Processing, et al), have diminished the requirement for data access directly from the physical device. Put another way, in today’s world of System z servers with multiple cache level CPU chips (I.E. Relative Nest Intensity), massive and multiple processor memory resources (I.E. z13 @ 10 TB Memory), high bandwidth Fibre Channel (I.E. FICON, zHPF) subsystems and a hierarchy of DASD memory (I.E. SSD/Flash, Cache), it’s not uncommon to consider an I/O that requires physical device access as a problem! Finally and most importantly, from a DASD subsystem viewpoint, each of the recognized System z DASD providers, EMC (Symmetrix VMAX), HDS (VSP G1000) and IBM (DS8870), has a highly proprietary DASD subsystem that provides z/OS plug compatibility, but delivers overall I/O performance using its own unique architecture and internal algorithms.

Of course, an over-configured hardware environment will deliver a poor TCO, while an under-configured environment will manifest in SLA issues and bad user experiences; the middle ground always delivers the optimal environment. Resource optimization always demands proactive day-to-day management, from an internal and indeed external communication viewpoint. With the highly proprietary design features of the IHV DASD subsystems, whether EMC, HDS or IBM, having the right information and identifying the precise problem simplifies the communication process with the IHV. Such communication might highlight a resource under provision (E.g. Memory Capacity), a subsystem setting tweak requirement, either host or subsystem based, or indeed a hardware failure. In today’s world, these issues need to be fixed in minutes or hours, not days or weeks.

Therefore, where does today’s System z I/O Performance Analyst start to collect the required information to safeguard that their DASD subsystem is optimized, both from a capacity and performance viewpoint?

A simplistic viewpoint of an I/O health-check should consider the following:

  • Service Level Agreements (SLA): Are overall objectives being delivered or missed?
  • User Experience: Are users (customers) complaining of poor service or response times?
  • I/O Metric Performance: Are there obvious signs of abnormal performance statistics?

Several decades ago, an overall I/O health check might have been a periodic (E.g. Weekly or longer) activity, whereas today it’s undoubtedly a Business As Usual (BAU) and 24*7 activity. Therefore a fully automated solution is required, built upon the tried and tested System z performance fundamentals, namely RMF or CMF. The ideal solution will perform analytics based data reduction, presenting the right information, at the right time, allowing for intelligent business based communication, both internally, to customers and end users from an SLA viewpoint, and externally, with IHV DASD suppliers, safeguarding optimal performance and TCO.

EADM (Easy Analyze DASD Mainframe) is a solution from Technical Storage that performs automated performance analysis of the z/OS I/O subsystem, delivering predictive analytics for better storage capacity planning and performance measurement. The Technical Storage EADM architects have in excess of 40 years of IBM Mainframe experience, specializing in the I/O subsystem, and so it’s no surprise that EADM delivers expert and timely knowledge via an easy-to-use solution.

EADM is an easy-to-install and easy-to-use plug-and-play solution that has no proprietary considerations, requiring no additional System z resources (E.g. CPU, Memory, DASD, et al). Installed on Microsoft server platforms, EADM is easily virtualized via VMware, Hyper-V, et al, requiring no target database for performance data storage. EADM performs a daily health check of the entire System z disk subsystem, working around the clock and delivering customized and automatic user friendly GUI type reports. For today’s System z technician, the open and IP architecture base of EADM allows for secure remote access via Mobile, Tablet or Laptop devices, as and when required.

Operations and performance teams are alerted as soon as performance variances occur, typically in minutes, assisting in the identification of the underlying root problems causing changes in system behaviour. Incorporating intelligent and meaningful I/O performance indicators, with drill-down and zoom-in ability, storage technicians can determine whether the problem is temporary, permanent, local or global. By simplifying the data reduction process (E.g. RMF/CMF data from numerous LPAR/Sysplex environments), EADM safeguards that the internal technical team can efficiently manage their increasingly complex and large DASD environment, for intelligent and timely communications with internal business teams and external suppliers alike.

EADM simplifies the System z I/O subsystem capacity and performance management process, delivering expert reports and timely historical analysis, for example:

  • Automatic daily (24 Hour) analysis of Sysplex wide workload (On-Line TP & Batch) I/O response times
  • Systematic intelligent alerts of early performance variances with exact occurrence time indicators
  • Identification of I/O performance hot-spots with DASD volume and data set level granularity
  • Performance trending at DFSMS Storage Group, Subsystem LCU and DASD volume level
  • DR (E.g. PPRC) simulations to prevent data loss and forecast Data Centre failover scenarios
  • I/O subsystem WLM indicators to determine exactly what impacts performance objectives
  • Full FICON channels and zHPF analysis, incorporating typical I/O throughput indicators
  • HyperPAV and associated LCU indicators to easily balance volumes, optimizing PAV alias allocation
  • Performance monitoring and balancing via intelligent LCU, SSID and I/O analytics
  • DASD capacity usage via DCOLLECT data, comparing assigned vs. allocated vs. actual disk utilization
  • EADM supports System z configurations from entry-level (several LPAR) to complex (multiple CPC/LPAR) environments

A well provisioned and performing System z I/O subsystem is of vital importance for safeguarding today’s ever increasing storage requirements of mission critical business applications. A poorly performing I/O subsystem will generate unnecessary extra CPU overhead, with a tangible TCO impact, in conjunction with potential business impact. Although the advances of the System z server and underlying DASD I/O subsystem can compensate for many application code or data placement issues, the fundamental concepts of analysing and tuning the I/O subsystem remain.

Therefore the savvy and proactive System z customer will safeguard that they find a solution to deliver optimal DASD I/O performance. Without doubt, such an analysis could be performed by a highly-skilled individual, but today’s 21st Century world demands a hybrid of technical and commercial skills. Therefore a solution that incorporates the diagnostic knowledge of the most highly trained technician, performs intelligent analytics on a plethora of Sysplex wide performance data sources and presents the information required, is one that will deliver benefit each and every day. EADM is an example of such a solution, delivering demonstrable System z TCO optimization benefits, while safeguarding a short-term ROI, with simple deployment and resource utilization attributes.