Shared Server Issues in Oracle 9i

by Brian Keating
Database Specialists, Inc.

About Database Specialists, Inc.
Database Specialists, Inc. provides remote DBA services and onsite database support for your mission critical Oracle systems. Since 1995, we have been providing Oracle database consulting in Solaris, HP-UX, Linux, AIX, and Windows environments. We are DBAs, speakers, educators, and authors. Our team is continually recognized by Oracle, at national conferences and by leading trade publications. Learn more about our remote DBA, database tuning, and consulting services. Or, call us at 415-344-0500 or 888-648-0500.

Introduction

The Oracle shared server architecture provides a mechanism for a relatively large number of user connections to be established to a database with a relatively small amount of physical memory required. However, there are a number of potential issues with the shared server architecture that DBAs should be aware of before even considering implementing this feature. In this paper we will look at three of these issues.

Long Running SQL

First of all, the shared server architecture is particularly sensitive to inefficient SQL. Poorly tuned SQL statements can easily take many minutes or even hours to complete. This is a problem even with dedicated sessions, of course, but the issue is much more severe in a shared server environment. This is because by definition there are a finite number of shared servers available—as specified by the parameter max_shared_servers. Therefore, if frequently-executed SQL statements take a very long time to complete, then it is very easy to get into a situation where all of the shared servers are busy running those statements—and PMON is not able to spawn any more servers because max_shared_servers has been reached. This situation essentially results in a system-wide "hang" in which many user sessions are not able to do anything.

This exact issue was encountered on one of my recent projects. An Oracle 8i database that used shared servers was upgraded to Oracle 9i release 2. The Oracle 9i optimizer chose much less efficient execution plans for some of the application's most frequently executed SQL. This caused those statements to take orders of magnitude longer to complete. (The statements only took a few seconds to complete in Oracle 8i; but they took up to 20 minutes to complete in Oracle 9i!) Fortunately, I was able to resolve this issue with a combination of indexes, histograms and optimizer mode changes.

Shared Server Aborts

Another type of issue that can occur with shared servers is that in some cases a shared server process can completely abort. When this happens, an ORA-00600 [16365] error will be listed in the alert log and a trace file will be generated. PMON will then restart the aborted shared server process automatically.

There is very limited information available on this issue; but my understanding is that this issue can occur when a shared server completes executing a SQL statement and tries to write the results of that statement to a virtual circuit—but that circuit has been updated (or "dirtied") by a dispatcher. In other words the sequence of events is this: A shared server picks up a SQL statement from a virtual circuit and begins executing it. Then a dispatcher places a new message onto that same virtual circuit. When the shared server completes its SQL statement and tries to update the virtual circuit with the results of that statement, the problem occurs. More detailed information on this issue can be found in Metalink Doc ID 106607.1.

This issue has also occurred in the project mentioned above. The issue had occurred in Oracle 8i as well; but the frequency of the issue increased dramatically after the Oracle 9i upgrade; it occurred about one time per week in Oracle 8i, but it occurred about five or six times per day after the Oracle 9i upgrade. In addition, when a shared server abort occurs on this system, the user who was running the SQL statement in question encounters errors in the application such that the user has to log out of the application completely and then log back in. Finally, users reported that every time the error occurs, the user has been in one particular module of the application and they have been trying to execute multiple statements against the database in very quick succession—i.e., they were trying to execute a new statement against the database before their last statement has completed. If the users waited until their previous statement completed before they executed a subsequent statement, then users did not encounter the issue.

The trace files that are generated when a shared server aborts contain the SQL statement that the shared server was executing when it aborted. Almost all of the trace files in our case contained one particular statement, and that statement is called by the module of the application that users encountered the problem in. Also, that statement was one of the statements that was much less efficient—and therefore, took much longer—in Oracle 9i than in Oracle 8i. Therefore, it appears that the reason for the increased frequency of this issue is that the shared servers were taking longer to complete that statement in Oracle 9i than in Oracle 8i; and that increased the probability that the shared server's virtual circuit will have been "dirtied" while that statement was being run. Tuning that statement brought the frequency of shared server aborts back to the Oracle 8i level.

In any case, the bottom line on this issue is that the shared server architecture is designed to be used with "serial" applications; i.e., applications that force the user to wait until their current statement has completed before executing a subsequent statement. Therefore, applications that do not enforce such serial access certainly have the potential to encounter the shared server abort issue.

Service Handler Blocked Status

The final type of shared server issue that I will discuss is an issue in which all of the service handlers in an environment get into a "blocked" state. In a shared server environment, dispatchers "register" with a listener by establishing service handlers to that listener. Evidently, when the service handlers are under a relatively heavy load, those service handlers can sometimes get into a "blocked" state—such that those service handlers will not accept any new connections to the database. When that happens, users receive errors (usually ORA-12537) when they try to establish new connections to the database. Existing connections are apparently not affected, however. The issue generally seems to last for a few minutes at a time, before appearing to resolve itself. The following information on this issue is from Oracle Support:

What these symptoms indicate, is the Service Handlers registered with the TNS Listener for the Oracle9i R2 (9205) instance are going into a 'blocked' status based on the number service request handled since the last SERVICE_UPDATE from the instance's PMON process.

This issue has occurred in the project mentioned above as well. It seemed to start occurring sporadically once the number of concurrent connections to the database reached about 1300; and the issue became extremely frequent—almost every other minute—when the connections reached about 1370. Note that Oracle does not spawn any new dispatchers (and therefore, does not establish any new service handlers) when this issue occurs; and even when I manually increased the number of dispatchers the issue remained.

This certainly appears to be a type of "threshold" issue; i.e. the service handlers stop accepting requests when the load on them reaches a certain level. As such, it may be possible to reconfigure the listener (or instance) to allow the service handlers to operate under a heavier load in order to increase this threshold. However, I can not say for certain that this is the case, because after the above issue occurred, I was obliged to remove the use of the shared server architecture completely from the instance in question—because this issue and the other two mentioned earlier were extremely disruptive to end users.

Conclusion

The instance in the project mentioned above now uses dedicated sessions, with the parameters pga_aggregate_target and workarea_size_policy=auto set. Those parameters have helped to reduce the memory requirements needed to handle dedicated sessions; but note that even with those parameters set the amount of memory required to handle dedicated sessions is still much higher than the memory required for shared server connections. Therefore, if memory is severely limited in an environment—and if the issues above can be avoided—then the shared server architecture may still be a viable choice in that environment.

About the Author

Brian Keating, OCP, has been an Oracle DBA and Unix system administrator for over nine years. He is currently a consultant with Database Specialists, Inc., a consulting group specializing in business solutions based on Oracle technology. You can visit Database Specialists on the web at http://www.dbspecialists.com, and you can contact Brian Keating at bkeating@dbspecialists.com.

Still Looking for Help on this Subject?

Get a Consultation
We would be happy to talk with you about our services and how our senior-level database team might help you. Call Database Specialists at 415-344-0500 or 888-648-0500 or fill out a free consultation request form.
Complimentary Newsletter
If you'd like to receive our complimentary monthly newsletter with database tips and new white paper announcements, sign up for The Specialist.


Copyright © 2005 Database Specialists, Inc. http://www.dbspecialists.com