|SQL or PL/SQL?
|This month’s question comes to us from a member of the Database Specialists team:
“In my experience, SQL statements have unpredictable performance and can be very hard to read especially when they are longer than a dozen lines. Is it OK to use PL/SQL when the same thing can be done with a complex SQL query?”
Senior staff consultant Ian Jones thoughtfully responds:
We have probably all encountered situations where an SQL statement appears to perform dramatically differently than it did previously. There are many possible causes for this. Sometimes this is because of a caching effect where a statement is slow the first time it is executed but faster upon subsequent executions. This is because the data blocks are cached either in the Oracle SGA or even outside of Oracle in the Unix buffer cache or storage array cache. Also, if an SQL statement is very long or complex, its hard parse time can be significant on its first execution. Another common reason why an SQL statement performs differently is because the underlying statistics have changed and so the optimizer has changed the execution plan. This can be detected by examining the historical execution plans using Statspack or AWR data and fixed by reverting the statistics if desired; for example, by using DBMS_STATS.RESTORE_TABLE_STATS or RESTORE_SCHEMA_STATS. Another common cause is “bind variable peeking” when the execution plan changes because Oracle is optimizing for a particular atypical bind variable value. Sometimes the performance of two similar executions is different because the underlying data is very different, this usually happens when the underlying data is heavily skewed so that executions look identical but are not. The best way to confirm this is to use TKPROF and observe the differences in the row counts. In some cases, I have known developers to make a “minor” change to a statement and be puzzled why it now takes several hours to execute instead of minutes: the “I only added one more value into the IN clause” effect!
On the “SQL very hard to read” issue, most of us prefer our SQL statements formatted in a particular way. Hopefully on large projects, there is a coding convention that we are comfortable with. If we write the SQL ourselves, then it should be formatted and commented in such a way that we can understand it months later. However sometimes we find ourselves working on complicated statements produced by others with little or no commenting, and formatted in a way that makes it tricky to understand. In these cases, to assist in building up the necessary knowledge of the tables and indexes involved, I usually take the time to manually rewrite the SQL into the syntax and case that I prefer and if necessary change the table aliases and add comments to the clauses and filters to help understand it. Once I have understood the query, making surgical changes to the original is easier. I have tried various automatic rewrite tools over the years but have never found one that I have consistently stuck with.
Finally, the “Is it OK to use PL/SQL when the same thing can be done with a complex SQL query?” question. The main reason against writing a PL/SQL version of a complex SQL query is performance. The optimizer is usually better at finding a good execution plan for a complicated statement than a human is. To illustrate this, consider a simple multi-table join,
CURSOR c_mj IS
a.col1 = b.col1 AND
b.col2 = c.col2 AND
b.col3 = c.col3 AND
c.col4 = d.col4 AND
a.col7 = ‘CLOSED’ AND
c.col3 = ’20’ AND
d.col5 = ‘D’
Even with a simple statement like this, the optimal execution plan needs to take into account lots of variables including the number of rows in each table, the length of the rows, the selectivity of the where-clause filters, the available indexes, the data skewness of the filter columns, the capability of the machine (I/O characteristics of the system, the CPU performance, available memory), etc. The optimizer will also change the execution plan dynamically over time as the above characteristics change. Of course, we could implement this query from PL/SQL in different ways; for example, here is the simplest “nested loops” approach:
FOR c_a_rec IN (SELECT * FROM a WHERE a.col1 = ‘CLOSED’) LOOP
FOR c_b_rec IN (SELECT * FROM b WHERE b.col1 = a.col1) LOOP
FOR c_c_rec IN (SELECT * FROM c WHERE c.col2 = c_b_rec.col2 AND c.col3 = c_b_rec.col3 AND c.col3 = 10) LOOP
FOR c_d_rec IN (SELECT * FROM d WHERE d.col4 = c_c_rec.col4 AND d.col5 = ‘D’) LOOP
Whatever way we choose to implement this query in PL/SQL, we have effectively hard coded a single row-at-a-time execution plan that is static no matter how the underlying data volumes change.
Another reason against turning a complex SQL statement into a PL/SQL program is that SQL is intrinsically set-oriented and thus well suited for expressing complex requirements. The following example is a very succinct query for rows in table A that are not also in table B. An equivalent PL/SQL program would be more complex and not perform as well.
SELECT * FROM A
SELECT * FROM B
The amount that can be achieved with a single SQL statement has increased dramatically over time. In recent years we have gained new functionality such as regular expressions, sub-query factoring, analytic functions, and the model clause. Staying comfortable with all these enhancements can be an effort because most statements we write don’t use them but, as the philosopher might say, “There is no royal road to high performance SQL”!
I hope this answer helps you. Best of luck.