By Leon Katsnelson
By Susan Visser
By Bernie Spang
By the DB2 Guys
By Fred Ho
By Louis T. Cherian
By Shweta Shandilya
By Lawrence Weber
By Serge Rielau
By Dwaine Snow
Unless you’ve been hiding under a rock for the last few months, you know that IBM is pretty excited about the latest release of Informix. The company has good reason to be. During the beta program, Lester Knutsen and Art Kagel worked together to evaluate the newest release of Informix 11.7 server (formerly code-named Panther). They can tell you firsthand that there’s some exciting stuff under the hood. In this article, they’ll share the results of their performance testing, and they’ll describe the new features and how to use them.
We tested Informix 11.7 using the benchmarks that we developed for the Fastest Informix DBA contests. For each test, we started with Informix 11.5, optimized it to get the best possible performance, and measured how quickly the job completed. Then, we installed 11.7 on the same hardware and measured the job again.
Out of the box, version 11.7 was faster on almost every trial. The first benchmark was a batch-billing job that generated bills for more than 100,000 customers. Informix 11.7 ran this job up to 17 percent faster than the best time posted by the previous version.
The next benchmark we tested was from the Fastest Informix DBA Contest III in 2010. It was an online transaction processing (OLTP) benchmark that used the open source BenchmarkSQL Java program to generate 100 sessions doing inserts, updates, and deletes against a database. The goal is to generate as many transactions as possible in a fixed amount of time; this JDBC benchmark closely resembles the TPC-C standard for OLTP. Again, using the same configuration as the prior version, 11.7 generated 12 percent more transactions.
Given that we didn’t use any of the new features available in 11.7, these results are pretty amazing. Informix 11.7 also has a number of exciting new features for data warehouse design and development that we are still testing, using several very complex queries. Some of the new join features of 11.7 have helped us improve the performance of one query by 22 percent.
When it set out to develop the 11.7 release of Informix Dynamic Server, the IBM development team pulled out all of the stops to extend and expand each area where Informix is already very strong. New features enhance OLTP data security, OLTP performance, data warehouse performance, and administration. This part of the article will address the features we think are most significant, what the benefits are, and how to implement them.
What do we mean by OLTP data security? The worst-case scenario in an OLTP environment is loss of transactional data. For example, if you’re a bank and a user completes a transaction, you cannot afford to lose it. Your company is on the hook for that money, whether it’s a $20 ATM withdrawal or a multimillion-dollar electronic funds transfer. To take another scenario, if your primary server crashes, what do you do? If you tell the users who are mid-transaction, “Sorry. System problem. Please try again later,” you will lose a customer or maybe several hundred customers.
Informix 11.7 has a new feature that IBM calls “transaction survival.” We call it “uninterruptible transactions.” If your primary MACH11 server goes down, any transactions begun by a user connected to any secondary server (HDR, SDS, or RSS) will be picked up by the surviving secondary servers. One of the surviving servers is promoted to be the new primary server, and the transaction continues unharmed and uninterrupted. Using SDS secondary servers and the Connection Manager, failover can be completed in mere seconds; this capability will support many organizations’ service level agreements (SLAs), and most users will not even notice the delay.
The best part is that you don’t have to do much at all to implement this amazing feature. No application changes, onmode flag, or SQL application programming interface (API) functions are needed. A single new ONCONFIG parameter (FAILOVER_TX_TIMEOUT) sets the time that secondary servers will wait for a new primary server to make contact before rolling back open transactions after a primary server failure. Your existing applications don’t have to be modified. This is all automatic and effort-free. It is not yet clear whether older applications will need to be recompiled with the latest Informix SDK version (3.70xC1 or later), but most sites will do this as part of an upgrade process anyway. After that, you just need to make sure that all of your critical applications are connected to a writable secondary server rather than connected directly to the primary server.
Informix 11.7 also has native time series data support, which allows relational data to be combined with time series data in the same query. This native support helps improve query performance and reduce storage requirements for time series data. The customizable data loader handles data in almost any format and integrates with real-time streaming data products such as IBM InfoSphere Streams. Plus, the time series toolkit enables you to develop custom analytics to run in your database.
Beyond the underlying database engine improvements, Informix 11.7 has several features you can use to further improve performance. The following sections describe three of these features in an OLTP context; however, each of these features will improve the performance of decision support and data warehouse systems as well.
What’s new: Eliminate index on foreign keys. Informix 11.7 eliminates the requirement for a foreign key constraint to have an index on the foreign key. Indexes on the foreign keys referencing tables with few rows have low selectivity, and the utility of requiring the index has always been questionable.
The engine does not need these indexes to enforce the constraints unless cascading deletes are enabled, and you probably don’t need them for searching because most systems have composite keys containing the foreign key columns that are used for searching and filtering. It is rare in an OLTP query for the lookup tables to be selected as the primary query tables requiring an index on the dependent table to support the join. Removing this requirement will reduce the overhead of inserting and modifying data in tables that have many code columns, without causing any impact on query performance.
What to do. To create a new constraint without a supporting index, follow the constraint definition with the INDEX DISABLED clause:
ALTER TABLE mytable ADD CONSTRAINT FOREIGN KEY (fkcol) REFERENCES fktable(keycol) CONSTRAINT mytable_fk1 INDEX DISABLED;
If you have an existing constraint and would like to take advantage of this feature, you must drop the constraint and re-create it. There is no option to disable index support for an existing constraint. When the constraint creation is complete, the index supporting the constraint is disabled and its space is dropped.
What’s new: Forest of Trees indexes. Forest of Trees indexes combine many of the advantages of a hash index with those of a traditional Informix B+tree index. B+tree indexes can become rather deep with relatively few unique elements on a level in indexes on keys where the first column(s) have low selectivity but are still important to the correct processing of OLTP queries (think geographic codes such as country code, region code, state code, and combinations of these). With a Forest of Trees index, the database architect can specify one or more of the leading columns to be used to create a hash key. Then for each hash value, a separate B+tree is created that contains only the remaining key parts following the hash columns. The result is much flatter B+tree indexes. One of the drawbacks of using a Forest of Trees index is that you can’t perform range scans on the hash columns (for that you will also need a pure B+tree index). But, you can perform range scans on the remaining columns in the index key. The most significant benefit will be indexes that have fewer levels and therefore faster performance.
What to do. Create a Forest of Trees index just as you would create any other index, except that after the column list, specify the HASH ON clause:
CREATE INDEX mytable_fot ON mytable( col_1, col_2, col_3 ) IN indx_dbs
HASH ON ( col_1, col2 ) WITH 200 BUCKETS;
Note that the storage clause must precede the HASH ON clause, which is not clear from the syntax diagrams in the manual.
What’s new: Multiple index scans. Multiple index scans in the optimizer are another big win for OLTP. Often you have a join between two tables on one or more columns, with filters on one or more different columns in those tables. Both sets of columns may be indexed, but by different indexes. In earlier releases of Informix, the optimizer had to select the “best” single index for a query from all the indexes available based on index statistics and data distributions of the key columns. As a result, the engine needed to read many rows of data and apply the final filters using the actual row data, even though indexes of the filter columns were available. The optimizer in Informix Extended Parallel Server (XPS) could use more than one index on each table in a query, and now Informix 11.7 can as well. This capability can enable DBAs and database architects to reduce the number of long key compound indexes, which can also improve insert, delete, and update performance. This feature could be a huge win when running queries like the one previously described against databases that are delivered as part of a third-party application—you can’t control the schema, and the composite indexes that could make such queries reasonably efficient may not exist.
What to do. The best part of this feature is that there is nothing you need to do. Just make sure your statistics and data distributions are properly maintained. Once you have seen multiple index scans improve your query performance, you may want to look into replacing some compound key indexes with simpler ones and reducing the total number of indexes, which may improve delete, update, and insert processing times as well.
Informix 11.7 expands the capabilities of our favorite database in a number of exciting ways. We’re still investigating the new features, and even in this article, we’ve only been able to scratch the surface of what’s possible. Check out Lester’s Informix DBA column in this issue for more tidbits, and let us know what you think of the new version, and what you’re most excited about.
IBM is making quite a bit of noise about the heterogeneous grid capabilities of Informix, and rightfully so. If you have lots of old hardware around, don’t throw it out. Don’t sell it to a junker or to a used equipment reseller; Informix gives you the ability to use different hardware, operating systems, or even different versions of Informix from cluster to cluster. Configure your old hardware as an IDS grid node and add it to your company server farm’s net computing power at near zero cost (since the boxes are already paid for and amortized). You cannot mix hardware, operating system, or Informix versions between MACH11 servers backing each other up, but you can have separate MACH11 clusters running on different platforms that are linked into a single grid using Enterprise Replication. Heterogeneous grid capabilities are a part of the Informix Flexible Grid feature. They also automate some feats that previously required both planning and manual configuration. For example, the new grid management features and the ifxclone utility make it simple to do zero-downtime upgrades. The engine can temporarily (and automatically) convert your HDR pair into an ER pair, upgrade the Informix version on one server, clone the upgraded server, and reestablish HDR.
DB2 TechTalk: Deep Dive on BLU Acceleration in DB2 10.5, Super Analytics Super Easy
Thursday, May 30: 12:30 – 2:00 PM ET
Informix Chat with the Lab: Primary Storage Manager (PSM) a Parallel Backup Alternative to Ontape
Thursday, May 30: 11:30 – 1 PM ET
Big Data Seminar 2013, Featuring Krish Krishnan
June 14 in New York City
marcus evans Pharma Data Analytics Conference
July 10-11 in Philadelphia
IBM Smarter Content Summit 2013
Big Data at the Speed of Business
Broadcast event replay now available
Information on Demand 2013: Early Bird Registration Now Open
November 3-7 in Las Vegas