July 2012

Database Specialists

The Specialist

Vol. 12 No. 2

Database News You Can Use

Expert onsite and remote DBA services
for your Oracle databases
Tel: 415-344-0500 · Toll-Free: 1-888-648-0500
In This Issue
RMAN ‘Backup Location’ Syntax
Using DBMS_REDEFINITION
Three things to do to stabilize your RAC environment
Using Large Memory Pages with Oracle 11.2
Database Rx Portal New Features
Welcome to the July, 2012 issue of the Database Specialists newsletter!
Working with Oracle on a day-to-day basis can be a humbling experience.  Just as soon as you think you understand how things work, Oracle releases a new version of its software that turns everything on its head.  Even if you stay with a given version, you'll find sooner or later that Oracle is no longer supporting it or releasing fixes for it.  It takes a lot of effort to stay up to date, and it's especially hard to know how a future version of Oracle will work with your application without substantial experience on that new version.
In this issue of The Specialist, we’ve gathered some great points from our staff consultants Ian Jones, Mike Dean, Terry Sutton, and Gary Sadler regarding features in new versions of Oracle, and how they can affect your databases.
Also, we’re really pleased to announce that we’ll be updating our Database Rx Portal software quite soon with quite a few improvements and new features — Jay Stanley explains some of the more visible changes that are pending.
As always, we hope that you enjoy this issue of The Specialist!

RMAN ‘Backup Location’ syntax & RMAN-05548

Ian Jones, Sr. Staff Consultant

Here is an interesting issue we came across recently after an upgrade from 11.1 to 11.2. This particular database has many development/test/QA copies, and approximately one third of the database is application auditing data that is not needed in the non-production copies. So we skip the AUDITDATA tablespace during the duplication, which both makes the cloning faster and requires less disk space. In 11.1 we were able to create the duplicate databases using the following RMAN command:

$rman target=sys@PRD1 auxiliary=/
RMAN> run {
2> set until time "TO_DATE('01-NOV-2011 20:00:00','DD-MON-YYYY HH24:MI:SS')";
3> allocate auxiliary channel ch1 type disk;
4> duplicate target database to TST1 skip tablespace AUDITDATA;
5> }

 

This worked fine: RMAN performed the duplication and dropped the AUDITDATA tablespace, and we subsequently did a CONTENT=METADATA_ONLY Data Pump export/import to recreate the empty objects in that tablespace, along with the constraints that were dropped (cascade) during the duplication. After the upgrade to 11.2, the exact same duplication fails with error RMAN-05548: The set of duplicated tablespaces is not self-contained.
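For anyone reproducing this, the metadata-only Data Pump round trip looks roughly like the following sketch. The directory object, dump file names, and connect strings here are examples, not taken from the actual environment:

```shell
# Hypothetical sketch: recreate the skipped AUDITDATA objects as empty
# definitions after the clone. DP_DIR and the file names are placeholders.
expdp system@PRD1 content=METADATA_ONLY tablespaces=AUDITDATA \
      directory=DP_DIR dumpfile=auditdata_md.dmp logfile=auditdata_md_exp.log

impdp system@TST1 content=METADATA_ONLY full=Y \
      directory=DP_DIR dumpfile=auditdata_md.dmp logfile=auditdata_md_imp.log
```

The dump file must, of course, be copied (or the directory shared) between the two hosts if they are separate machines.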

 

Sure enough, 11.2 has some nice RMAN enhancements (http://docs.oracle.com/cd/E11882_01/server.112/e22487/chapter1.htm), but unfortunately the transportable tablespace (TTS) check prevents us from doing what we used to (and still need to) do. To cut a long story short, the workaround is to perform the duplication without connecting to the primary database: since RMAN is not connected to the primary, it cannot query the data dictionary to perform the TTS check that causes the error. So our clonings are now performed via:
$rman auxiliary=/
RMAN> run {
2> set until time "TO_DATE('14-FEB-2012 20:00:00','DD-MON-YYYY HH24:MI:SS')";
3> duplicate database to TST1 backup location '/orabackups/PRD1/rman' skip tablespace AUDITDATA;
4> }

Notice the use of the BACKUP LOCATION syntax (also new in 11.2), which allows us to perform the duplication from any directory containing a copy of the production backup; we no longer need to use the same path that the backup had in production. The BACKUP LOCATION syntax can also be used when creating a standby database, again optionally without connecting back to the primary. For example, you can create a standby database from a backup copy out on NFS using the following syntax:

$rman auxiliary=/
RMAN> run {
2> duplicate database for standby backup location '/orabackups/PRD1/rman' nofilenamecheck;
3> }

Notice that the above syntax is able to create the standby database without a separate standby control file backup, which was required in earlier versions. In other words, the command can use the regular control file autobackup to create the standby control file, which is another nice enhancement.

Using DBMS_REDEFINITION – A Case Study

Mike Dean, Sr. Staff Consultant

 

I recently wrote a blog (http://dbspecialists.wpengine.com/blog/database-maintenance/resizing-datafiles-and-understanding-the-high-water-mark/) about understanding the High Water Mark (HWM) in datafiles.  This knowledge of the HWM came in handy for me recently at a customer site.  Over time, their database had grown to over 500GB and was threatening to fill up their filesystem.  They had recently done a purge of about 400GB worth of old data and wanted to shrink the datafiles but were unable to do so because of the High Water Mark.  In order to resolve this and limit downtime, I decided to create a new tablespace and use DBMS_REDEFINITION to move objects from the existing tablespace to the new tablespace.  Here ( http://dbspecialists.wpengine.com/files/presentations/online_redef.html ) is a great presentation by Chris Lawson and Roger Schrag of Database Specialists that gives an overview of how DBMS_REDEFINITION works.  Rather than repeat what they already said, I will describe some of the issues that I ran into when I implemented this procedure on a live database.  This is on a 10.2.0.1 Enterprise Edition database running on 64-bit Linux.

As a very quick review, here are the steps involved in using DBMS_REDEFINITION to move objects to a new tablespace:

1) Run DBMS_REDEFINITION.CAN_REDEF_TABLE to make sure the table can be redefined online.

2) Create the "interim" table as an empty copy of the source table in the new tablespace.

3) Run DBMS_REDEFINITION.START_REDEF_TABLE to begin the process.

4) Copy the table dependents (constraints, triggers, indexes) to the interim table with DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS.

5) Sync the interim table with DBMS_REDEFINITION.SYNC_INTERIM_TABLE.

6) Finish the redefinition process with DBMS_REDEFINITION.FINISH_REDEF_TABLE.  It is during this step that Oracle requires an exclusive lock on the source table, but the lock is very brief.
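Put together, the six steps above can be sketched in PL/SQL roughly as follows. The schema, table, and tablespace names are hypothetical, and error handling is omitted for brevity:

```sql
-- Hypothetical example: move SCOTT.BIG_TABLE to tablespace DATA.
BEGIN
  DBMS_REDEFINITION.CAN_REDEF_TABLE('SCOTT', 'BIG_TABLE');
END;
/

-- Empty copy of the source table in the new tablespace.
CREATE TABLE scott.big_table_interim TABLESPACE data
  AS SELECT * FROM scott.big_table WHERE 0 = 1;

DECLARE
  num_errors PLS_INTEGER;
BEGIN
  DBMS_REDEFINITION.START_REDEF_TABLE('SCOTT', 'BIG_TABLE', 'BIG_TABLE_INTERIM');
  DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS('SCOTT', 'BIG_TABLE', 'BIG_TABLE_INTERIM',
                                          num_errors => num_errors);
  DBMS_REDEFINITION.SYNC_INTERIM_TABLE('SCOTT', 'BIG_TABLE', 'BIG_TABLE_INTERIM');
  DBMS_REDEFINITION.FINISH_REDEF_TABLE('SCOTT', 'BIG_TABLE', 'BIG_TABLE_INTERIM');
END;
/
```

After FINISH_REDEF_TABLE completes, the interim table holds the old data and can be dropped.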

 

The USERS tablespace consisted of 18 datafiles ranging from 20GB to 32GB.  My goal was to decrease each datafile to 10GB, so I ran the following query to determine exactly which objects needed to move:

 

select unique segment_name, owner
  from dba_extents
 where file_id in (select file_id
                     from dba_data_files
                    where tablespace_name = 'USERS')
   and (block_id + blocks - 1) * 8192 > 10737418240;

From this I found that there were more than 100 tables, along with their indexes, that needed to move from the USERS tablespace to the DATA tablespace.

Issue #1) Handling Not Null constraints

The first issue I encountered occurs if the interim table has NOT NULL constraints on it.  If it does, during the COPY_TABLE_DEPENDENTS procedure, you will get this error:

DECLARE
*
ERROR at line 1:
ORA-01442: column to be modified to NOT NULL is already NOT NULL
ORA-06512: at "SYS.DBMS_REDEFINITION", line 984
ORA-06512: at "SYS.DBMS_REDEFINITION", line 1726
ORA-06512: at line 4

The workaround for this is simple.  After creating the interim table with "Create Table As Select ... where 0=1", I run a little PL/SQL loop to drop the NOT NULL constraints from the interim table.  This issue is documented in Oracle Support Article ID 1116785.1: "ORA-01442 DURING DBMS_REDEFINTION.COPY_TABLE_DEPENDENTS"
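A sketch of the kind of loop involved, against the hypothetical interim table SCOTT.BIG_TABLE_INTERIM. The filter on search_condition (a LONG column, fetched into PL/SQL and tested there) assumes the constraint text fits in a VARCHAR2:

```sql
-- Hypothetical sketch: drop the copied NOT NULL constraints from the
-- interim table so COPY_TABLE_DEPENDENTS does not hit ORA-01442.
DECLARE
  CURSOR c IS
    SELECT constraint_name, search_condition
      FROM dba_constraints
     WHERE owner = 'SCOTT'
       AND table_name = 'BIG_TABLE_INTERIM'
       AND constraint_type = 'C';
BEGIN
  FOR r IN c LOOP
    IF r.search_condition LIKE '%IS NOT NULL' THEN
      EXECUTE IMMEDIATE 'ALTER TABLE scott.big_table_interim DROP CONSTRAINT '
                        || r.constraint_name;
    END IF;
  END LOOP;
END;
/
```

Note that the IF test is what keeps genuine check constraints (also constraint_type 'C') from being dropped along with the NOT NULLs.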

 

The second issue I had with NOT NULL constraints is described in Oracle Support Article ID 1089860.1, "Why Are NOT NULL Constraints Not Copied By DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS?".  As it turns out, if the COMPATIBLE parameter is set to 10.2 or higher, COPY_TABLE_DEPENDENTS will copy the NOT NULL constraints, but in NOVALIDATE mode.  This means that when you describe the table, you won't see "NOT NULL".  If COMPATIBLE is not set to 10.2 or higher, the constraints really don't get copied at all.  In this case COMPATIBLE was set to 10.2.0.1, so the workaround was to run "ALTER TABLE ... ENABLE VALIDATE CONSTRAINT" for each NOT NULL constraint on the table.  Again, a simple PL/SQL loop took care of this for me.
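A sketch of such a validation loop, again using a hypothetical table SCOTT.BIG_TABLE (the VALIDATED column of DBA_CONSTRAINTS distinguishes NOVALIDATE-mode constraints):

```sql
-- Hypothetical sketch: validate the NOT NULL constraints that were
-- copied in NOVALIDATE mode when COMPATIBLE >= 10.2.
BEGIN
  FOR r IN (SELECT constraint_name
              FROM dba_constraints
             WHERE owner = 'SCOTT'
               AND table_name = 'BIG_TABLE'
               AND constraint_type = 'C'
               AND validated = 'NOT VALIDATED') LOOP
    EXECUTE IMMEDIATE 'ALTER TABLE scott.big_table ENABLE VALIDATE CONSTRAINT '
                      || r.constraint_name;
  END LOOP;
END;
/
```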

 

Issue #2) Rebuilding Indexes

When COPY_TABLE_DEPENDENTS runs, it rebuilds the indexes, but it rebuilds them in their original tablespace, which is not what I wanted.  So, after running COPY_TABLE_DEPENDENTS, I rebuilt the indexes in the new tablespace:

 

alter index <index_name> rebuild tablespace DATA online nologging;
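With more than 100 tables' worth of indexes involved, it is handy to generate the rebuild commands rather than type them. A minimal sketch (the query itself is my assumption, not from the original procedure):

```sql
-- Generate rebuild commands for every index still sitting in USERS.
SELECT 'alter index ' || owner || '.' || index_name
       || ' rebuild tablespace DATA online nologging;'
  FROM dba_indexes
 WHERE tablespace_name = 'USERS';
```

Spool the output to a script, review it, and run it.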

Issue #3) Handling Context indexes

In this database, several tables have Context indexes built on them.  There is a bug (#4688172) in Oracle 10.2 that prevents DBMS_REDEFINITION from working with these tables. According to the bug: "Online Table Redefinition with domain index works fine in 9.2.x and 10gR1 whereas it fails in 10gR2 with the following errors:

BEGIN
DBMS_REDEFINITION.FINISH_REDEF_TABLE('tc', 'ptab', 'int_ptab');
END;
/
BEGIN
*
ERROR at line 1:
ORA-42012: error occurred while completing the redefinition
ORA-12008: error in materialized view refresh path
ORA-29886: feature not supported for domain indexes
ORA-6512: at “SYS.DBMS_REDEFINITION”, line 76
ORA-6512: at “SYS.DBMS_REDEFINITION”, line 1376
ORA-6512: at line 2 

The workaround is to set event 10995 to level 1 prior to executing any DBMS_REDEFINITION API.  Unfortunately, setting this event requires the database to be restarted, which has not yet been done in this environment.
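For reference, one common way to set such a numeric event so that it takes effect at the next restart is via the spfile; verify the exact event text against the bug note for your version before using it:

```sql
-- Takes effect at the next instance startup (spfile-based instance).
ALTER SYSTEM SET EVENT = '10995 trace name context forever, level 1'
  SCOPE = SPFILE;
```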

 

Issue #4) Library Cache Latching

Each time I moved a table, all dependent procedures, triggers, functions, etc. of that table would become invalid.  Normally, Oracle will just recompile any invalid procedures as they are needed, and there won't be a problem.  In this case, however, one procedure runs almost constantly in the database.  So, when I moved a table and caused this procedure to become invalid, it couldn't be recompiled because it was already in use. Other sessions that tried to execute the procedure would wait on the library cache latch but could never get it.  The solution was to kill the blocking session and allow the procedure to recompile.
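To locate the session pinning the procedure, a query against V$ACCESS along these lines can help (the procedure name here is a placeholder):

```sql
-- Who is holding MY_BUSY_PROC in the library cache?
SELECT a.sid, s.serial#, s.username
  FROM v$access  a
  JOIN v$session s ON s.sid = a.sid
 WHERE a.object = 'MY_BUSY_PROC'
   AND a.type   = 'PROCEDURE';
```

The offending session can then be terminated with ALTER SYSTEM KILL SESSION 'sid,serial#'; once the lock is released, the procedure recompiles on its next execution.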

 

Issue #5) Execution Plan Changing

This is something that can potentially happen any time Oracle parses a SQL statement and chooses an execution plan.  There is always the possibility that Oracle will not choose the best plan, and sometimes the new plan is orders of magnitude worse than what it had been using minutes before.  This happened a couple of times while I was moving all of these objects.  In each case, I forced Oracle to re-parse by collecting fresh optimizer statistics, and luckily Oracle chose the better plan the second time around.
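Re-gathering statistics with immediate cursor invalidation is what forces the re-parse; a sketch using hypothetical names:

```sql
-- With no_invalidate => FALSE, cursors that depend on the table are
-- invalidated right away rather than over a rolling window, so the
-- next execution gets a fresh parse.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname       => 'SCOTT',        -- example schema
    tabname       => 'BIG_TABLE',    -- example table
    cascade       => TRUE,           -- include the indexes
    no_invalidate => FALSE);
END;
/
```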

 

So in spite of the issues I ran into, I would say that DBMS_REDEFINITION is a great way to redefine tables with little if any downtime.  Of course, nothing is perfect and behavior will change from version to version, but it is definitely a tool I would recommend you become familiar with.

Three things to do to stabilize your RAC environment
Terry Sutton

 

The latest issue of Oracle Database Support News has an interesting article, “Top 3 Things to do NOW to stabilize your RAC environment”.  We all know that using RAC adds complexity to our database infrastructure.  And there are multiple install guides and My Oracle Support notes with details on setting up and optimizing your clusters.  This article offers what it considers the three most important things you can do to achieve and maintain stability in your RAC environment.

 

1. Adopt a Patchset Update (PSU) strategy and stick with it.

 

Oracle releases PSUs on a quarterly basis, each containing a small number of critical fixes.  Applying the PSUs quarterly is highly recommended, but for customers who can't schedule quarterly patching, Oracle says you should strive to apply them at least every six months.  This should reduce your likelihood of hitting the most common bugs, and if issues do occur it will speed up resolution (who hasn't been frustrated when Oracle Support's response to a problem is "please apply the latest patches to your database"?).

 

PSUs can be applied separately to the Grid Infrastructure (GI) home, without needing to be applied at the same time as the RDBMS home.  As long as the GI is patched first, the homes can be patched separately if desired, reducing the length of each maintenance window.

 

PSUs in a RAC environment are guaranteed to be rolling installable, both for the GI and RDBMS homes.  This means that they can be applied with no downtime (remember, this is what Oracle says; your mileage may vary).

 

2. Implement Diagwait on pre-11.2 Clusters

 

Oracle says that nearly half of the service requests logged in 2012 have been for pre-11.2 clusters.  For a while now, Oracle has been recommending that the diagwait cluster parameter be set to 13.  By default, the clusterware daemon process OPROCD runs with a timeout value of 1 second and a margin of 0.5 seconds: if OPROCD has not been scheduled for 1.5 seconds, the box is rebooted.  This default of 1.5 seconds is too small to prevent unnecessary reboots on busy systems.

 

Setting diagwait to 13 increases the margin to 10 seconds.  This prevents a large percentage of "false" reboots.  In the event a reboot is still necessary, the added time allows more log data to be flushed to disk prior to the reboot, which can shorten the time needed to determine the root cause.

 

As of 11g Release 2 this change is no longer necessary.  For prior releases, making the change requires a complete cluster shutdown.  You can check the current setting with the command:

$CLUSTERWARE_HOME/bin/crsctl get css diagwait
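Changing the setting follows roughly this pattern; verify the exact steps against the relevant My Oracle Support note for your platform before running, since the clusterware must be fully stopped on all nodes first:

```shell
# Run as root; sketch only, not a substitute for the MOS procedure.
crsctl stop crs                                          # on every node
$CLUSTERWARE_HOME/bin/crsctl set css diagwait 13 -force  # on one node
crsctl start crs                                         # on every node
```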

 

3. Implement OS Watcher Black Box or Cluster Health Monitor

 

Many of you have probably been told by Oracle Support to use OS Watcher.  It's nothing magical, just a fairly complete way of gathering OS information frequently, and it can provide valuable information when trying to debug a problem.  It is fairly lightweight and easy to implement.  It also has the advantage that Oracle Support engineers are used to seeing data presented this way, so resolution of issues can be sped up.  For some reason Oracle now calls OS Watcher "OS Watcher Black Box" (perhaps to make you feel better about installing it on your production system), but names aside, it's a useful tool.
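If you haven't used it, starting OS Watcher Black Box is typically a one-liner; the two arguments are the sample interval in seconds and the number of hours of archives to retain (check the README shipped with your version, as invocation details vary):

```shell
# From the directory where OSWbb is unpacked: sample every 30 seconds,
# keep 48 hours of archive data, run in the background.
./startOSWbb.sh 30 48 &
```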

 

Starting with Oracle 11.2.0.3, Oracle GI installations include a new monitoring tool called Cluster Health Monitor.  It has some similarities to OSWBB, but it collects data more frequently and keeps it for a shorter period of time.

 

Oracle Support strongly recommends that all clustered environments have one or both of these tools running at all times.

 

This note is intended as a brief introduction to some important RAC Best Practices recommended by Oracle.  More detailed descriptions of the recommendations can be found at: https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=ANNOUNCEMENT&id=230.1

Using Large Memory Pages with Oracle 11.2

Gary Sadler, Sr. Staff Consultant

 

A recent late-night incident at a client site involved a database with extreme performance problems following a storage system failure and subsequent database restart. The client observed excessive swapping and paging with the run queue length breaking triple digits and the server and database becoming virtually unresponsive.  The server had not been restarted and the database instance parameters were confirmed to be identical to the settings prior to the database bounce. The SGA was on the order of 50 GB with the buffer cache consuming most of that. After a period of speculation about what might be causing the problem we noticed that the huge pages usage didn’t look quite right.  The operating system was RHEL 5.4.

 

$ cat /proc/meminfo | grep Huge

HugePages_Total: 25600

HugePages_Free:  25431

HugePages_Rsvd:    169

Hugepagesize:     2048 kB

 

With a running database featuring a 50 GB SGA, far more huge pages should have been reserved.  They won't all be reserved upon startup, but certainly more than 169 should have been in use at this point.  As of 11.2.0.2 there is a new instance parameter, USE_LARGE_PAGES, which can be set to one of the following three values:

 

TRUE  – Use huge pages if they’re available (default). Some combination of large and small pages for the SGA is okay.

FALSE – Don’t use huge pages even if they’re available

ONLY  – Allocation of huge pages for the entire SGA is required for the database to start

 

Since the client database above was 11.2.0.3, if we had set USE_LARGE_PAGES=ONLY, then when we bounced the database it would not have started, and we would have seen something like this in the alert log:

 

Fri Apr 20 15:16:11 2012

Starting ORACLE instance (normal)

****************** Large Pages Information *****************

Parameter use_large_pages = only

 

Large Pages unused system wide = 25431 (26041344 KB) (alloc incr 4096 KB)

Large Pages configured system wide = 25600 (51200 MB)

Large Page size = 2048 KB

 

ERROR:

Failed to allocate shared global region with large pages, unix errno = 12.

Aborting Instance startup.

ORA-27137: unable to allocate Large Pages to create a shared memory segment

 

ACTION:

Total Shared Global Region size is 51200 MB. Increase the number of

unused large pages to atleast 25600 (51200 MB) to allocate 100% Shared Global

Region with Large Pages.

***********************************************************

 

This would have flagged the problem immediately.  We would have noticed that shared memory segments were stuck in memory and couldn't be released, prompting a server reboot, which incidentally did solve the problem.

 

For questions about huge pages in general and how to configure them see My Oracle Support doc ID 361468.1 or give us a call.  It is important to note that use of huge pages is not compatible with Automatic Memory Management (AMM) configured through the instance parameters MEMORY_TARGET and MEMORY_MAX_TARGET.  You can use the legacy Automatic Shared Memory Management (ASMM) configured through the instance parameters SGA_TARGET, SGA_MAX_SIZE, and PGA_AGGREGATE_TARGET, however.
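As a rough sizing sketch, the number of 2 MB huge pages needed to back a whole SGA is just a ceiling division; the SGA size below is an example matching the scale of the incident, not a recommendation:

```shell
# Number of huge pages needed to hold the entire SGA (ceiling division).
sga_mb=51200        # total SGA size in MB (example value)
hugepage_kb=2048    # Hugepagesize from /proc/meminfo

pages=$(( (sga_mb * 1024 + hugepage_kb - 1) / hugepage_kb ))
echo "vm.nr_hugepages = $pages"    # the value to put in /etc/sysctl.conf
```

For a 51200 MB SGA and 2048 KB pages this gives 25600 pages, matching the HugePages_Total shown above; in practice a few extra pages of headroom are often configured on top of the minimum.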

Database Rx Portal new features

Jay Stanley, Sr. Staff Consultant

 

We’re pleased to announce that quite soon, we’ll be updating the Database Rx Portal with a host of new features. These new features make using the portal to understand what is going on in your databases even easier than before, and provide new insights into how the database is doing over time.

 

A sampling of some of the new features includes:

  • Graph improvements: graphs can now be rendered using the ‘R’ statistics package, which produces graphs that are very easy to read and understand.
  • There is a new ‘1 year’ duration pulldown for many reports, so that you can see an entire year of activity over time.
  • The Database and Tablespace Growth pages now include the potential % free, assuming all datafiles autoextend as much as they can.
  • There’s a new calculation that estimates the number of days until a given tablespace is full. This can really help in cases where it may take a while to acquire more physical storage.
  • There is now support for graphs of ‘custom metrics’ in the database, so if those are in use, activity over time can be understood.
  • There is now a graph of free archived redo log space over time; this makes it much easier to understand what is happening to that storage over time.
  • We’ve added an ‘elapsed time’ column in our SQL Statement Detail and High Resource SQL pages, so that it’s easier to see the impact of various things on statement performance over time.
  • It’s now easier to see the details about a SQL statement from the main Performance Overview page.
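To illustrate the sort of estimate behind a days-until-full feature, here is a hypothetical linear extrapolation; the numbers are made up, and the actual portal calculation may differ:

```shell
# Illustrative linear extrapolation: growth rate from two measurements,
# then remaining space divided by that rate (integer MB arithmetic).
used_30d_ago_mb=40960   # space used 30 days ago
used_now_mb=46080       # space used today
free_now_mb=5120        # space still available, including autoextend headroom

growth_per_day=$(( (used_now_mb - used_30d_ago_mb) / 30 ))
days_left=$(( free_now_mb / growth_per_day ))
echo "approx days until full: $days_left"
```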

We hope that you enjoy these new features, and would love to hear your thoughts!

Call Database Specialists when you need remote DBA services or onsite support for your mission-critical Oracle database systems. Arrange a free consultation with a senior Database Specialists team member to find out how we can help increase your uptime, improve performance, minimize risk, and reduce costs. Visit our website for no-cost resources, white papers, conference presentations and handy scripts.

Sincerely,
David Wolff
CEO, Database Specialists, Inc.

dwolff@dbspecialists.com
(415) 344-0500 x48
