How Much Do Concurrent Updates Impact Query Performance in Oracle?

by Roger Schrag
Database Specialists, Inc.

Introduction

Oracle has a strict read-consistency model coupled with high concurrency that sets it apart from other database products such as Microsoft’s SQL Server. You can update rows in a table while I query those very same rows. Your updates will not get lost, my query results will not be corrupted, and neither of us will block the other from doing their work. The net result is that with Oracle we can run batch jobs, data loads, and reports all at the same time and never worry about things like “dirty reads” or “read locks.” However, this great functionality comes at a cost to performance—Oracle has to do more work to ensure read-consistency if one user is updating a table while another user is querying it. This raises the question: Just how much slower will a report run if a batch update job is running at the same time?

In this paper, we will first discuss briefly what is meant by read-consistency and how Oracle maintains it in a multi-user environment. Then we will briefly look at how Oracle’s read-consistency model might impact performance from a theoretical standpoint. Next, we’ll spend the bulk of this paper discussing in practical terms how to detect and measure performance degradation caused by concurrent activities that make Oracle work harder to maintain read-consistency. We will look at reproducible examples, TKPROF reports, and v$ views in order to measure how much slower a query runs when the tables being read are undergoing concurrent updates.

Would it be faster to run the update jobs and the report jobs sequentially instead of at the same time? What can we do if both jobs must run at the same time? How can we determine if the two are interacting with each other in a way that degrades performance? You will have the tools to answer questions like these after reading this paper.

A quick note about scope before we get started: This paper looks at the performance implications of Oracle’s read-consistency and concurrency mechanisms. We will not discuss the theoretical correctness of Oracle’s interpretation of the ANSI standards for transaction isolation levels or transaction serializability. Coding practices that developers must use in certain situations to maintain accurate results in a multi-user environment (such as SELECT FOR UPDATE) are also outside the scope of this paper.

Read-Consistency and Concurrency

In this section, we will explain what we mean by read-consistency. Then we’ll look at Oracle’s approach to read-consistency and concurrency, followed by other approaches used by other database vendors. This subject matter can get very complicated. We will keep the discussion at a pretty high level in order to present the concepts in enough detail to follow the rest of this paper, without being weighed down by all of the details of complex algorithms built into the Oracle engine.

We use the term “read-consistency” to mean the accurate retrieval of information. In particular, we want the results of a query to reflect data integrity and correctness as of a single point in time. By data integrity we mean that transactional boundaries must be preserved. For example, you should never see invoice lines without the invoice header—or vice versa—if they were created in the same transaction.

By correctness as of a single point in time, we mean that no matter how long a query takes to run, the result set from the very first row to the very last row must reflect the state of the data in the database as of one point in time. Suppose you have $100 in your checking account and $1,300 in your savings account and you go to your bank’s web site to transfer $500 from savings into checking. What if the bank was computing your daily combined balance right as you executed the funds transfer? Imagine the bank seeing your checking account balance as $100 (before the funds transfer) and your savings account balance as $800 (after the funds transfer). This would lead the bank to compute your combined balance as $900, and you would be hit with a service charge because you were below the $1,000 minimum combined balance. The bank would be wrong to assess this service charge, and their error would stem from the fact that they did not have a read-consistent view of your bank account balances.

Read-Consistency in Oracle

Oracle implements read-consistency in a way that maximizes concurrency. A query in Oracle is never blocked by write operations or other queries, and a write operation is never blocked by queries. That is, I can query a table while another user queries or even updates the same table, and I will not be stopped from proceeding with my query. If the other user writes to the table I am querying, they will never be blocked by my query.

All queries in Oracle give results that are based on the data that was in the database at one single point in time, something we will refer to as a query’s reference point. Thus, no matter how long a query takes to complete, the entire result set will be based on the data as of one exact point in time. Typically a query’s reference point is the moment the query began. Oracle also provides a facility known as read-only transactions where a whole set of queries can be based on the same reference point—the moment the read-only transaction began. This allows read-consistency to be maintained across queries.
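
For example, issuing SET TRANSACTION READ ONLY as the first statement of a transaction fixes a single reference point for every query in that transaction. Here is a minimal sketch, using hypothetical checking_accounts and savings_accounts tables:

SET TRANSACTION READ ONLY;

-- Both queries see the database as of the moment the read-only
-- transaction began, so the two balances are mutually consistent
-- even if a funds transfer commits between the two statements.
SELECT SUM (balance) FROM checking_accounts;
SELECT SUM (balance) FROM savings_accounts;

-- Ends the read-only transaction.
COMMIT;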

Oracle uses a multi-versioning mechanism in order to offer such a high degree of concurrency while maintaining read-consistency. Multi-versioning was introduced way back in Oracle V6. When transactions insert, update, or delete data in tables, information is written to an undo segment (also known as a rollback segment) so that Oracle will be able to back out the change if the user decides to roll back their transaction instead of committing it. However, this undo information does more than allow for transaction rollback—it provides the foundation for the multi-versioning mechanism.

If a query comes across a data block containing changes from another session’s transaction that has not yet committed, the query reads entries from the appropriate undo segment and recreates a version of the data block as it looked before the uncommitted transaction changed it. If a query comes across a data block that was updated and committed after the query began, then the query again reads undo entries, this time recreating a version of the data block as it looked at the query’s reference point.

In summary, Oracle uses the information in undo segments to allow the reconstruction of data blocks as they looked in the past—what we call multi-versioning. Oracle uses multi-versioning to enable queries to run concurrently with updates and other queries with no blocking required. This, in a nutshell, is how Oracle is able to ensure read-consistency while allowing a high degree of concurrency.

Other Approaches to Read-Consistency

Other database vendors take different approaches to read-consistency and concurrency. Some databases limit concurrency in order to avoid circumstances that could cause data integrity or accuracy issues. Others take the approach of allowing concurrency with the caveat that data integrity or accuracy could be compromised.

Using “read locks” and “write locks” it is possible to ensure read-consistency, but at the expense of concurrency. On such databases, users wait in line to access data in certain situations. A batch job might not be able to update data in a table, for example, because a report job is reading data from the same table. Or perhaps a user’s query will have to wait until an update transaction commits before the tables can be accessed.

Allowing concurrent access to data without having a sophisticated mechanism like multi-versioning to preserve read-consistency can lead to a variety of data anomalies. Some databases allow “dirty reads” to happen—a situation where the results of a query include uncommitted transactions. Imagine making a data entry error and rolling back the update, but a batch job that was running at the same time included the erroneous data in its final report! This can happen in databases that allow dirty reads. “Fuzzy reads,” meanwhile, can arise when a query’s result set is not accurate as of a single point in time—the bank funds transfer example discussed earlier is an example of this phenomenon.

The Theoretical Cost of Read-Consistency in Oracle

In this section, we will look at the performance implications of Oracle’s read-consistency mechanisms from a theoretical perspective. Again we will keep the discussion fairly high level and stick to concepts that will provide useful background for the rest of this paper.

Before Oracle can use the contents of a data block while processing a query, it must first check to see if the data block contains any updates made after the query’s reference point. This seems like a very inexpensive check, because Oracle records a timestamp of sorts (called the system change number or SCN) in data blocks whenever they are updated. Comparing the SCN of the query’s reference point to the data block’s SCN seems easy.
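
As an aside, you can watch the SCN advance on your own system. A minimal illustration, assuming your account has execute privilege on the DBMS_FLASHBACK package (nothing later in this paper depends on it):

-- Returns the current system change number; run it twice a few
-- moments apart on a busy database and the value will have grown.
SELECT dbms_flashback.get_system_change_number current_scn FROM dual;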

Oracle must also check to see if the data block contains uncommitted changes made by another session. This check seems inexpensive as well, because Oracle maintains in each data block a list of “interested transactions” (called the ITL) that are active in the data block. If no transactions are active in a data block, then Oracle knows there is no uncommitted work there.

Again, these two checks that Oracle performs when reading every data block during query processing seem like they should have little impact on overall performance. However, we need to consider the situation where one or both of these checks indicate that the contents of a data block are not suitable for use by the query as-is and an alternate version of the data block must be created. In this case, Oracle needs to read data from the undo segment (listed in the ITL) and reconstruct a copy of the data block as it would have looked before the recent or uncommitted transaction. This could require physical disk I/O if the needed undo blocks are not already in the buffer cache. An additional buffer in the buffer cache will be used to hold this altered data block, and additional CPU time will be required to allocate the buffer in the buffer cache, copy the data block, and apply the undo. Furthermore, there is the risk that the necessary undo information will no longer be available because that extent of the undo segment has been reused. In this situation the ORA-01555 “snapshot too old” error will occur and the query will fail.

To summarize: Oracle maintains a timestamp and list of interested transactions in every data block in order to ensure read-consistency. Maintaining this information in every data block does not seem like it should be very expensive. While processing a query, Oracle must check the SCN and ITL on every data block before looking at its contents. This seems like it should be very inexpensive. But if Oracle determines that it cannot use the data block as-is and needs to reconstruct an older version, then a possibly nontrivial performance price will need to be paid.

In general, these performance costs seem like a small price to pay in order to maintain data integrity and accuracy without sacrificing concurrency. Moreover, if we can detect when excessive multi-versioning is happening, quantify it, and figure out ways to avoid it or at least minimize its performance impacts, then we will be even better off.

Measuring the True Cost of Read-Consistency

In this section, we will set theory aside and move on to reproducible test cases. First we will set up a demo schema, and then we will run through two simulations. In each scenario we will measure resource consumption for a query both with and without other activities taking place that force Oracle to create alternate versions of data blocks. The difference in resource consumption in each simulation should give us some insight into the actual resource cost of multi-versioning.

I encourage you to try the simulations shown here on your Oracle system. The results I will present were collected on an Oracle 9i database running on Sun Solaris. I expect you would see similar results on any version of Oracle ranging from Oracle V6 up through Oracle 10g. Although the code was written for Oracle 9i, it should work as-is on Oracle 10g, and with minor modifications on Oracle 8i.

Schema Setup

We will create a few tables for a simulated banking application and populate them with pseudo-random data. By pseudo-random, I mean data values that are scattered over a spectrum in a predictable manner. The data is not truly random. If you drop the tables and reload them on the same database using the code shown here, the tables should end up with the exact same data on the second and subsequent loads. This allows you to repeat a simulation without changing the expected results.

First we create a separate tablespace just for this project. We’ll turn off Automatic Segment Space Management (ASSM) so that Oracle allocates data blocks within segments in a linear, easily predictable manner:

CREATE TABLESPACE test
DATAFILE '/u03/oradata/dev920ee/test01.dbf' SIZE 200m
EXTENT MANAGEMENT LOCAL AUTOALLOCATE
SEGMENT SPACE MANAGEMENT MANUAL;

Next we grant quota on this tablespace to the database user who will own the schema objects we are about to create:

ALTER USER rschrag QUOTA UNLIMITED ON test;

Now we are ready to create a bank_accounts table and load it with 1,000,000 records. Each row will contain information about one checking or savings account. About 90% of the accounts will be set as active and the other 10% will be inactive. We won’t create and populate a bank_customers table, but we will assign a customer_id to each of the bank accounts. We’ll occasionally repeat customer_ids so that some customers have multiple accounts. We’ll also include an other_stuff column in the table so that each row takes up about 120 bytes. The code to create and populate the bank_accounts table is as follows:

CREATE TABLE bank_accounts
(
account_id          NUMBER,
account_number      VARCHAR2(18),
customer_id         NUMBER,
current_balance     NUMBER,
last_activity_date  DATE,
account_type        VARCHAR2(10),
status              VARCHAR2(10),
other_stuff         VARCHAR2(100)
)
TABLESPACE test;

BEGIN
  dbms_random.seed ('Set the random seed so that ' ||
                    'this script will be repeatable');
  FOR i IN 0..9 LOOP
    FOR j IN i * 100000..i * 100000 + 99999 LOOP
      INSERT INTO bank_accounts
      (
      account_id, account_number, customer_id,
      current_balance, last_activity_date,
      account_type, status, other_stuff
      )
      VALUES
      (
      j,
      LPAD (LTRIM (TO_CHAR (TRUNC (dbms_random.value * 1000000000000000000))),
            15, '0'),
      TRUNC (dbms_random.value * 700000),
      TRUNC (dbms_random.value * 5000, 2) + 250.00,
      TO_DATE ('12-31-2005 12:00:00', 'mm-dd-yyyy hh24:mi:ss') -
        dbms_random.value * 30,
      DECODE (TRUNC (dbms_random.value * 3),
              1, 'SAVINGS', 'CHECKING'),
      DECODE (TRUNC (dbms_random.value * 10),
              1, 'INACTIVE', 'ACTIVE'),
      'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
      );
    END LOOP;
    COMMIT;
  END LOOP;
END;
/

To complete the setup of the bank_accounts table, we create a primary key and compute optimizer statistics for the table and primary key index:

ALTER TABLE bank_accounts
ADD CONSTRAINT bank_accounts_pk PRIMARY KEY (account_id)
USING INDEX TABLESPACE test;

BEGIN
  dbms_stats.gather_table_stats (USER, 'BANK_ACCOUNTS', cascade=>TRUE);
END;
/

Next we create a bank_transactions table and load it with approximately 90,000 records. Each row will contain information about one deposit or withdrawal. This table will simulate a queue that is populated by ATMs as people deposit and withdraw money. We’ll set a processed flag to “n” on each row for now to show that the transaction has not been applied to the bank_accounts table yet. The code to create and populate the bank_transactions table is as follows:

CREATE TABLE bank_transactions
(
transaction_id      NUMBER,
account_id          NUMBER,
transaction_date    DATE,
transaction_type    VARCHAR2(10),
amount              NUMBER,
processed           VARCHAR2(1)
)
TABLESPACE test;

DECLARE
  v_transaction_id   NUMBER;
  v_transaction_date DATE;
  v_transaction_type VARCHAR2(10);
  v_amount           NUMBER;
BEGIN
  v_transaction_id := 1;
  v_transaction_date := TO_DATE ('01-01-2006 00:00:00',
                                 'mm-dd-yyyy hh24:mi:ss');
  FOR i IN 1..100000 LOOP
    v_amount := TRUNC (dbms_random.value * 10) * 20 + 20;
    IF TRUNC (dbms_random.value * 2) = 1 THEN
      v_transaction_type := 'DEPOSIT';
    ELSE
      v_amount := 0 - v_amount;
      v_transaction_type := 'WITHDRAWAL';
    END IF;
    INSERT INTO bank_transactions
    (
    transaction_id, account_id, transaction_date,
    transaction_type, amount, processed
    )
    SELECT v_transaction_id, account_id, v_transaction_date,
           v_transaction_type, v_amount, 'n'
    FROM   bank_accounts
    WHERE  account_id = TRUNC (dbms_random.value * 1000000)
    AND    status = 'ACTIVE';
    v_transaction_id := v_transaction_id + SQL%ROWCOUNT;
    v_transaction_date := v_transaction_date + (dbms_random.value / 5000);
  END LOOP;
  COMMIT;
END;
/

To complete the setup of the bank_transactions table, we create a primary key and compute optimizer statistics for the table and primary key index:

ALTER TABLE bank_transactions
ADD CONSTRAINT bank_transactions_pk PRIMARY KEY (transaction_id)
USING INDEX TABLESPACE test;

BEGIN
  dbms_stats.gather_table_stats (USER, 'BANK_TRANSACTIONS', cascade=>TRUE);
END;
/

To complete the schema setup, we create a stored procedure that posts transactions to bank account balances by reading a specified number of bank_transactions rows and updating rows in the bank_accounts table accordingly. The stored procedure does the work in an autonomous transaction as follows:

CREATE OR REPLACE PROCEDURE post_transactions (p_record_count IN NUMBER)
IS
  PRAGMA AUTONOMOUS_TRANSACTION;
  CURSOR c_bank_transactions IS
    SELECT   account_id, transaction_date, amount
    FROM     bank_transactions
    WHERE    processed = 'n'
    ORDER BY transaction_id
    FOR UPDATE;
  v_record_count NUMBER;
BEGIN
  v_record_count := 0;
  FOR r IN c_bank_transactions LOOP
    UPDATE bank_accounts
    SET    current_balance = current_balance + r.amount,
           last_activity_date = r.transaction_date
    WHERE  account_id = r.account_id;
    UPDATE bank_transactions
    SET    processed = 'y'
    WHERE  CURRENT OF c_bank_transactions;
    v_record_count := v_record_count + 1;
    EXIT WHEN v_record_count >= p_record_count;
  END LOOP;
  COMMIT;
END post_transactions;
/

Our schema is now complete, and we are ready to run the simulations.

Simulation #1: Querying a Bank Balance While Posting a Deposit to a Different Account

In our theoretical discussion we pointed out that if Oracle finds uncommitted work in a data block while processing a query, Oracle must reconstruct a version of the data block from before the uncommitted work was applied to the data block. While the test to see if uncommitted work is present does not seem expensive, creating an alternate version of the data block if necessary could take some effort. In this simulation, we will quantify that effort.

First we start SQL*Plus and log onto the database in a brand new session. Then we enable tracing at a highly detailed level with the following statements:

ALTER SESSION SET statistics_level = ALL;
ALTER SESSION SET sql_trace = TRUE;

Next we run the following query to retrieve information about one specific bank account:

SELECT account_number, status, account_type,
       TO_CHAR (last_activity_date, 'mm-dd-yyyy hh24:mi:ss') last_activity,
       TO_CHAR (current_balance, '$999,990.00') current_balance
FROM   bank_accounts
WHERE  account_id = 2;

Depending on the contents of the buffer cache and the shared SQL area, this query may or may not incur physical disk reads and/or a hard parse. To eliminate these variables and establish a discernible pattern, we run the same query three more times. It is important to keep the query exactly the same each time we run it; even a small change in white space will cause a new hard parse.

Now we start a second SQL*Plus session and update (but do not commit) one row in the bank_accounts table as follows:

UPDATE bank_accounts
SET    last_activity_date =
           TO_DATE ('01-03-2006 11:15:22', 'mm-dd-yyyy hh24:mi:ss'),
       current_balance = current_balance + 20
WHERE  account_id = 3;

Notice that this statement updates a different row in the bank_accounts table than the one that we have been querying. However, the two rows should reside in the same data block. (This is one of the reasons we created our test tablespace with ASSM turned off.)
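
If you would like to verify that the two rows really do share a data block, a quick check with the DBMS_ROWID package (a sketch, run from either session) looks like this:

-- Account 2 and account 3 should report the same file and block
-- numbers, confirming that they live in the same data block.
SELECT account_id,
       dbms_rowid.rowid_relative_fno (ROWID) file_no,
       dbms_rowid.rowid_block_number (ROWID) block_no
FROM   bank_accounts
WHERE  account_id IN (2, 3);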

Now we go back to the first SQL*Plus session and repeat our old query a fifth time:

SELECT account_number, status, account_type,
       TO_CHAR (last_activity_date, 'mm-dd-yyyy hh24:mi:ss') last_activity,
       TO_CHAR (current_balance, '$999,990.00') current_balance
FROM   bank_accounts
WHERE  account_id = 2;

In this SQL*Plus session we have now queried a row from a table five times—four times with no read-consistency conflicts and one time with an uncommitted update elsewhere in the same data block that will force Oracle to perform a multi-versioning operation.

We can now exit both SQL*Plus sessions and turn our attention to the trace file we have created. We fetch the trace file and run TKPROF with the aggregate=no and sys=no options:

tkprof simulation1.trc simulation1.prf aggregate=no sys=no
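
If you are not sure which file in user_dump_dest belongs to your session, a query along the following lines, run from the traced session before exiting it, can identify the file. This is a sketch that assumes the usual <instance>_ora_<spid>.trc naming convention:

-- Build the likely trace file name for the current session.
SELECT c.value || '/' || d.instance_name || '_ora_' || a.spid || '.trc' trace_file
FROM   v$process a, v$session b, v$parameter c, v$instance d
WHERE  a.addr = b.paddr
AND    b.audsid = USERENV ('sessionid')
AND    c.name = 'user_dump_dest';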

Our first execution of the query appears in the TKPROF report like this:

SELECT account_number, status, account_type,
       TO_CHAR (last_activity_date, 'mm-dd-yyyy hh24:mi:ss') last_activity,
       TO_CHAR (current_balance, '$999,990.00') current_balance
FROM   bank_accounts
WHERE  account_id = 2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.05          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.02          4          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.03       0.08          4          4          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 97

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID BANK_ACCOUNTS (cr=4 r=4 w=0 time=14008 us)
      1   INDEX UNIQUE SCAN BANK_ACCOUNTS_PK (cr=3 r=3 w=0 time=13763 us)(object id 32144)

We can see that a hard parse took place as well as four physical disk reads. Since the query itself is very simple and only required four logical reads, the hard parse and physical reads overshadow the true work required to perform this query. Looking at the second execution of the query in the TKPROF report, we see:

SELECT account_number, status, account_type,
       TO_CHAR (last_activity_date, 'mm-dd-yyyy hh24:mi:ss') last_activity,
       TO_CHAR (current_balance, '$999,990.00') current_balance
FROM   bank_accounts
WHERE  account_id = 2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          4          0           1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID BANK_ACCOUNTS (cr=4 r=0 w=0 time=58 us)
      1   INDEX UNIQUE SCAN BANK_ACCOUNTS_PK (cr=3 r=0 w=0 time=36 us)(object id 32144)

We can see that no hard parse or physical reads were necessary this time, now that the parse and necessary data blocks are cached in the SGA. We can also see that the execution plan and number of logical reads (four) for this execution are the same as before. Since the query runs so quickly, the CPU and elapsed time figures reported with a resolution of one hundredth of a second are not very useful. However, the top line of the Row Source Operation listing shows that the query ran in 58 microseconds.

We won’t show the TKPROF output for the third and fourth executions of the query here, but they were extremely similar to the second execution. In fact the only difference was that the third and fourth executions ran in 76 and 78 microseconds, respectively.

Now let’s look at the TKPROF output for the fifth execution of the query. Remember that this execution ran after another session had updated but not yet committed a row in the same data block we are accessing:

SELECT account_number, status, account_type,
       TO_CHAR (last_activity_date, 'mm-dd-yyyy hh24:mi:ss') last_activity,
       TO_CHAR (current_balance, '$999,990.00') current_balance
FROM   bank_accounts
WHERE  account_id = 2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          6          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          6          0           1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID BANK_ACCOUNTS (cr=6 r=0 w=0 time=538 us)
      1   INDEX UNIQUE SCAN BANK_ACCOUNTS_PK (cr=3 r=0 w=0 time=64 us)(object id 32144)

This time Oracle performed six logical reads instead of four, and the query took 538 microseconds instead of something in the range of 58 to 78 microseconds. From the Row Source Operation listing we can see that the extra logical reads and time were spent on the table access and not the index access. This is because when Oracle looked at the data block from the table, it found the uncommitted update from the second session and had to build an alternate version of the data block with the uncommitted work rolled back. This extra effort required two logical reads and took roughly 460 microseconds.

Should we really care about two extra logical reads and a couple hundred microseconds? Are we not splitting hairs here? In this simulation the performance penalty caused by the concurrent update would probably not cause anybody concern. However, there is a different way to look at this—we could say that the concurrent update caused this query to require 50% more logical reads and seven times as much execution time. This may not be a concern for one query as simple as the one we have run here. But clearly updates could cause significant performance degradation to concurrent queries if multi-versioning were necessary for a large number of data blocks. Also keep in mind that here we looked at the simplest multi-versioning scenario; what if there had been a dozen updates that were not committed or occurred after the query’s reference point? Creating a version of the data block usable by this query would have been much more expensive.

Another interesting fact this simulation points out is that multi-versioning will occur even when a session queries rows other than the ones being updated by another session. That is, a session could query one set of rows while another session updates a totally different set of rows. But if any queried rows share a data block with updated rows, then multi-versioning will occur.

Simulation #2: Reporting Minimum Bank Balances While Posting Transactions

In the first simulation we looked at a very simple query so that we could pinpoint the performance cost of having Oracle build an alternate version of one data block in a simple case. In this simulation we will examine a situation in which a report is being run to determine which customers have combined bank account balances below the minimum while a batch job is posting withdrawals and deposits from ATMs into the bank_accounts table. In this simulation we will see widespread multi-versioning.

Like last time, we start SQL*Plus, log onto the database in a brand new session, and enable tracing:

ALTER SESSION SET statistics_level = ALL;
ALTER SESSION SET sql_trace = TRUE;

Next, we run a query to display customers whose combined account balance is less than $1,000. To keep the report output brief, we will only look at a subset of customers:

VARIABLE low_balances REFCURSOR

BEGIN
  OPEN :low_balances FOR
    SELECT   /*+ CACHE (bank_accounts) */
             customer_id, COUNT (*) active_accounts,
             SUM (current_balance) combined_balance,
             MAX (last_activity_date) last_activity_date
    FROM     bank_accounts
    WHERE    status = 'ACTIVE'
    AND      customer_id BETWEEN 10000 AND 10999
    GROUP BY customer_id
    HAVING   SUM (current_balance) < 1000
    ORDER BY active_accounts, customer_id;
END;
/

SELECT   b.value, a.name
FROM     v$statname a, v$mystat b
WHERE    a.name IN ('consistent gets', 'consistent changes')
AND      b.statistic# = a.statistic#
ORDER BY a.statistic#;

PRINT low_balances

SELECT   b.value, a.name
FROM     v$statname a, v$mystat b
WHERE    a.name IN ('consistent gets', 'consistent changes')
AND      b.statistic# = a.statistic#
ORDER BY a.statistic#;

The CACHE hint in the query tells Oracle to place data blocks read from the bank_accounts table at the most recently used end of the buffer cache’s LRU list, something that usually does not happen during a full table scan of a large table. This will allow us to eliminate the variability of physical disk reads from the mix when evaluating the TKPROF report.

Also note the statistics reporting before and after the query results are fetched. This allows us to see how many logical reads (in read-consistent mode) Oracle had to perform, as well as how many changes had to be rolled back in alternate versions of data blocks to ensure read-consistency.

As in the first simulation, the query may or may not incur physical disk reads and/or a hard parse the first time it is run. To eliminate these variables and establish a discernible pattern, we again run the same query three more times. Each execution will report about 16,680 consistent gets and zero consistent changes. The bank_accounts table segment is 16,668 blocks long, so the reported number of consistent gets makes intuitive sense. (Figure a few extra reads for overhead.)
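
One way to confirm the size of the table segment is to look at the optimizer statistics we gathered during schema setup (a sketch; the BLOCKS column in user_tables reflects the used blocks recorded by the last dbms_stats run):

-- Blocks below the high-water mark according to the gathered statistics.
SELECT blocks
FROM   user_tables
WHERE  table_name = 'BANK_ACCOUNTS';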

Now we run the query a fifth time, except we call the post_transactions stored procedure after opening the cursor and before fetching the results from it. The post_transactions call will post 10,000 deposits and withdrawals from the bank_transactions table to the bank_accounts table in an autonomous transaction, simulating updates being performed in another session by a batch job while the minimum balance report is running. Since the updates occur after the query’s reference point (the opening of the cursor), the cursor output must be based on bank balances before the post_transactions call. The fifth query execution looks like this:

BEGIN
  OPEN :low_balances FOR
    SELECT   /*+ CACHE (bank_accounts) */
             customer_id, COUNT (*) active_accounts,
             SUM (current_balance) combined_balance,
             MAX (last_activity_date) last_activity_date
    FROM     bank_accounts
    WHERE    status = 'ACTIVE'
    AND      customer_id BETWEEN 10000 AND 10999
    GROUP BY customer_id
    HAVING   SUM (current_balance) < 1000
    ORDER BY active_accounts, customer_id;
END;
/

EXECUTE post_transactions (10000)

SELECT   b.value, a.name
FROM     v$statname a, v$mystat b
WHERE    a.name IN ('consistent gets', 'consistent changes')
AND      b.statistic# = a.statistic#
ORDER BY a.statistic#;

PRINT low_balances

SELECT   b.value, a.name
FROM     v$statname a, v$mystat b
WHERE    a.name IN ('consistent gets', 'consistent changes')
AND      b.statistic# = a.statistic#
ORDER BY a.statistic#;

This fifth execution of the query generated 26,691 consistent gets and 10,000 consistent changes. As we would expect, Oracle had to create alternate versions of data blocks as they would have appeared at the query’s reference point, and this required many extra logical reads.

We have now run the minimum balance report five times—four times with no read-consistency conflicts and one time with numerous committed updates that had to be rolled back in the name of read-consistency. We fetch the trace file and run TKPROF again with the aggregate=no and sys=no options:

tkprof simulation2.trc simulation2.prf aggregate=no sys=no

Our first execution of the minimum balance report looks like this:

SELECT   /*+ CACHE (bank_accounts) */
             customer_id, COUNT (*) active_accounts,
             SUM (current_balance) combined_balance,
             MAX (last_activity_date) last_activity_date
    FROM     bank_accounts
    WHERE    status = 'ACTIVE'
    AND      customer_id BETWEEN 10000 AND 10999
    GROUP BY customer_id
    HAVING   SUM (current_balance) < 1000
    ORDER BY active_accounts, customer_id

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.02       0.06          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        4      5.24       7.84      16669      16679          0          48
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      5.26       7.91      16669      16679          0          48

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 97     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
     48  SORT ORDER BY (cr=16679 r=16669 w=0 time=7846722 us)
     48   FILTER  (cr=16679 r=16669 w=0 time=7835555 us)
    708    SORT GROUP BY (cr=16679 r=16669 w=0 time=7834846 us)
   1281     TABLE ACCESS FULL BANK_ACCOUNTS (cr=16679 r=16669 w=0 time=7795324 us)

We can see that a hard parse took place, as well as numerous physical reads. Looking at the second execution of the query in the TKPROF report, we see:

SELECT   /*+ CACHE (bank_accounts) */
             customer_id, COUNT (*) active_accounts,
             SUM (current_balance) combined_balance,
             MAX (last_activity_date) last_activity_date
    FROM     bank_accounts
    WHERE    status = 'ACTIVE'
    AND      customer_id BETWEEN 10000 AND 10999
    GROUP BY customer_id
    HAVING   SUM (current_balance) < 1000
    ORDER BY active_accounts, customer_id

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        4      2.80       2.79          0      16679          0          48
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      2.80       2.79          0      16679          0          48

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
     48  SORT ORDER BY (cr=16679 r=0 w=0 time=2793933 us)
     48   FILTER  (cr=16679 r=0 w=0 time=2793371 us)
    708    SORT GROUP BY (cr=16679 r=0 w=0 time=2792563 us)
   1281     TABLE ACCESS FULL BANK_ACCOUNTS (cr=16679 r=0 w=0 time=2768765 us)

No hard parse or physical reads were necessary this time. As we would expect, the execution plan and number of logical reads (16,679) for this execution are the same as for the first execution. Moreover, the number of logical reads reported by TKPROF matches what we retrieved from v$mystat.

The third and fourth query executions appeared very similar to the second, but the fifth was again quite different:

SELECT   /*+ CACHE (bank_accounts) */
             customer_id, COUNT (*) active_accounts,
             SUM (current_balance) combined_balance,
             MAX (last_activity_date) last_activity_date
    FROM     bank_accounts
    WHERE    status = 'ACTIVE'
    AND      customer_id BETWEEN 10000 AND 10999
    GROUP BY customer_id
    HAVING   SUM (current_balance) < 1000
    ORDER BY active_accounts, customer_id

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        4      3.42       3.81          0      26691          0          48
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      3.42       3.81          0      26691          0          48

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 97     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
     48  SORT ORDER BY (cr=26691 r=0 w=0 time=3814002 us)
     48   FILTER  (cr=26691 r=0 w=0 time=3813425 us)
    708    SORT GROUP BY (cr=26691 r=0 w=0 time=3812575 us)
   1281     TABLE ACCESS FULL BANK_ACCOUNTS (cr=26691 r=0 w=0 time=3780240 us)

The number of logical reads climbed 60% to 26,691 (matching what we saw in v$mystat). CPU time climbed by over 20%, and elapsed time by over 35%. It is interesting to note that the Row Source Operation listing now shows that a full table scan of the bank_accounts table took 26,691 logical reads, when in fact the table has only 16,668 blocks below the high-water mark. This should be a clue to us that multi-versioning is happening. The 10,000 consistent changes reported from v$mystat confirm this.

An additional performance implication of the multi-versioning that is not explicitly reported by TKPROF is that Oracle had to allocate additional buffers in the buffer cache in order to build those 10,000 additional data block versions. On a busy database, multi-versioning could cause data blocks to get aged out of the buffer cache sooner. This could lead to additional physical reads when those data blocks are later needed.
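
One way to observe this effect while the report and the batch job are running is to look for consistent read clones in the buffer cache. The following sketch counts buffers that hold CR versions of the same data block; note that querying v$bh can itself be expensive on systems with very large buffer caches, so treat it as an occasional diagnostic:

-- Data blocks with more than one consistent read (CR) clone cached.
SELECT   objd, file#, block#, COUNT (*) cr_versions
FROM     v$bh
WHERE    status = 'cr'
GROUP BY objd, file#, block#
HAVING   COUNT (*) > 1
ORDER BY cr_versions DESC;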

In this exercise, we have simulated the concurrent running of a report and a batch job that updates the same data being read by the report. By looking at session statistics and the TKPROF report we can see that doing so caused Oracle to do significant multi-versioning, and this in turn caused a nontrivial performance penalty.

Lessons Learned

By understanding how concurrent updates impact queries in Oracle, we can better understand the performance implications of multi-versioning, detect when excessive multi-versioning is happening, and minimize performance degradation.

Interaction Between Writers and Readers

As advertised, writers do not block readers and readers do not block writers in Oracle. However, Oracle must resort to multi-versioning when users query data residing in data blocks that are being updated by other users at roughly the same time. The users could be working with different rows of data, but if the rows reside in the same data block then multi-versioning will be necessary.

Each time Oracle needs to create an alternate version of a data block, a performance penalty is paid in the form of extra logical reads, extra CPU time, allocation of an extra buffer in the buffer cache, possibly extra physical reads, and the risk that the undo information required to create the alternate version is no longer available (the ORA-01555 condition).

The impact of multi-versioning on overall database performance is usually not an issue. As we have seen, creating an alternate version of a data block can cost as little as two logical reads and a few hundred microseconds of elapsed time (or even less on a server with a faster CPU). However, the scheduling of batch jobs and reports can unintentionally lead to widespread multi-versioning. This can drag down system performance significantly.

Detecting Excessive Multi-versioning

How much multi-versioning should be considered excessive will vary from one environment to the next. Ultimately it becomes a judgment call based on the needs of the users and the system resources available. Oracle gives us a few system-wide and session-level statistics for measuring multi-versioning activity, and there are also a few clues we can look for in individual execution plan statistics.

The v$sysstat, v$sesstat, and v$mystat views show several statistics relating to multi-versioning, including:

consistent gets (logical reads performed in consistent mode)
consistent changes (the number of times undo records were applied to blocks in order to produce read-consistent versions)
no work - consistent read gets (consistent gets that required neither a block cleanout nor any undo to be applied)
cleanouts only - consistent read gets (consistent gets that required a block cleanout but no undo)
rollbacks only - consistent read gets (consistent gets that required undo to be applied but no cleanout)
cleanouts and rollbacks - consistent read gets (consistent gets that required both)

In theory, the bottom four statistics should add up to the value of the consistent gets statistic, but in fact they usually won’t. Steve Adams attributes this to a bug in the consistent gets statistic, which he discusses at http://www.ixora.com.au/q+a/0012/12093712.htm.
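
For a quick system-wide snapshot of these figures, something like the following works (a sketch; join v$sesstat to v$statname and v$session instead if you want per-session values):

-- Multi-versioning related statistics accumulated since instance startup.
SELECT   name, value
FROM     v$sysstat
WHERE    name IN ('consistent gets', 'consistent changes',
                  'no work - consistent read gets',
                  'cleanouts only - consistent read gets',
                  'rollbacks only - consistent read gets',
                  'cleanouts and rollbacks - consistent read gets')
ORDER BY statistic#;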

TKPROF, v$sql_plan_statistics, and v$sql_plan_statistics_all are all able to provide actual resource usage statistics for individual row source operations of a SQL statement’s execution plan. (Note the distinction between actual resource usage and projected usage as estimated by the cost-based optimizer.) You should suspect that multi-versioning is happening when you see either of the following in a row source operation:

A full table scan whose logical reads (the cr figure) exceed the number of blocks the segment has below its high-water mark, as in our second simulation.
An operation whose logical reads are noticeably higher than in earlier executions of the same plan even though it processed the same number of rows, as in our first simulation.
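
With statistics_level set to ALL, as in our simulations, a query like the one below shows the actual logical reads for each row source operation of the most recent execution. This is a sketch; in Oracle 9i the cursor is identified by its address and hash value from v$sql rather than by a sql_id:

-- Actual logical reads and elapsed time per row source operation.
SELECT   id, operation, options, object_name,
         last_cr_buffer_gets, last_elapsed_time
FROM     v$sql_plan_statistics_all
WHERE    address = :address
AND      hash_value = :hash_value
ORDER BY id;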

Reducing Performance Degradation

The key to reducing performance degradation from excessive multi-versioning is to reduce the amount of multi-versioning that takes place. This is usually achieved through schedule adjustment. For example, if the inventory report is running too slowly, then try not to run it at the same time the inventory update batch job runs.

If there is a strong desire to reduce the volume of multi-versioning but conflicting reports and update jobs must run concurrently, then in some cases there could conceivably be a benefit to storing fewer rows per data block by setting PCTFREE artificially high. This will cause segments to take up more space, and operations like full table scans will require the reading of more blocks. But in some cases this can reduce multi-versioning activity.
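
A sketch of what such a change might look like for our demo table follows. The PCTFREE value of 80 is arbitrary and purely for illustration; note that PCTFREE only affects blocks formatted after the change, so the table must be reorganized, and the move invalidates existing indexes:

-- Leave most of each block empty so that fewer rows share a block.
ALTER TABLE bank_accounts PCTFREE 80;

-- Reorganize the existing rows into newly formatted blocks.
ALTER TABLE bank_accounts MOVE TABLESPACE test;

-- The move changes rowids, so the primary key index must be rebuilt.
ALTER INDEX bank_accounts_pk REBUILD;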

Finally, if multi-versioning cannot be avoided, then make sure that multi-versioning is not dragging down buffer cache efficiency by tying up multiple buffers with different versions of the same data blocks. In this situation it is possible that a larger buffer cache could help reduce the negative impact of multi-versioning on cache efficiency.
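
Before growing the cache, the buffer cache advisory introduced in Oracle 9i can give a rough indication of whether a larger cache would actually reduce physical reads. A sketch, assuming db_cache_advice is enabled:

-- Estimated physical reads at various candidate cache sizes (in megabytes).
SELECT   size_for_estimate, estd_physical_read_factor, estd_physical_reads
FROM     v$db_cache_advice
WHERE    name = 'DEFAULT'
AND      advice_status = 'ON'
ORDER BY size_for_estimate;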

It should be emphasized that multi-versioning does not usually cause significant performance problems. You should not pursue the reduction of multi-versioning operations unless you have determined that this is indeed a root cause of performance issues in your environment and there is no lower hanging fruit available to go after first. If you do determine that excessive multi-versioning is a problem in your environment and you cannot address the issue through job schedule adjustment, then think carefully and run validation tests before adjusting PCTFREE settings or buffer cache sizes in production environments.

Conclusion

Oracle’s sophisticated multi-versioning mechanism allows a high degree of concurrency without compromising read-consistency. Users can query data while other users are updating the very same data, and there is no need to worry about data integrity or accuracy issues in the result set. The multi-versioning mechanism was introduced in Oracle V6, and today this part of the Oracle kernel is extremely mature and stable.

Just because we can run reports and batch update jobs against the same data at the same time doesn’t mean we necessarily should. If Oracle needs to create alternate versions of data blocks as they appeared at a time in the past in order to preserve data integrity and accuracy, extra system resources will be required. This is usually not a problem, but excessive multi-versioning can degrade performance to a noticeable degree.

Through the use of system- and session-level statistics and execution plan detailed statistics we can detect when multi-versioning is occurring. Through tracing and the use of TKPROF we can quantify the cost of multi-versioning. Although we don’t typically need to worry about the performance implications of multi-versioning, Oracle gives us the tools to detect when it is happening and to measure the impact on system performance. We can use this information to decide when it is necessary to reschedule conflicting batch jobs and reports, or take more drastic action.

About the Author

Roger Schrag has been an Oracle DBA and application architect for over fifteen years. He started out at Oracle Corporation on the Oracle Financials development team and moved into the roles of production DBA and database architect at various companies in the San Francisco Bay Area. Roger is a frequent speaker at Oracle Open World and the IOUG Live! conferences. He is also Director of Conference Programming for the Northern California Oracle Users Group. In 1995, Roger founded Database Specialists, Inc. which provides remote DBA services and onsite database support for mission critical Oracle systems.



Copyright © 2006 Database Specialists, Inc. http://www.dbspecialists.com