A single procedure to “hang” the database.

I really enjoy it when people don’t understand the features they are using it makes life really interesting, automatic parallelism is a great new feature of 11.2 but we found a new and interesting way to break things. As part of the application we are implementing customers may re-request fairly large volumes of data to limit the impact on the system as a whole the design is such that a cursor is opened and the data returned in chunks of between  500 rows, if two customers execute re-requests at the same time then two cursors are opened cutomer1 gets 500 rows then customer2 gets 500 rows then back to customer1 and so on. If ten customers issue re-requests results are sent back in chunks of 500 to each customer in turn and so on.

The intention is to prevent customers complaining that they are waiting too long to receive data at the same time as preventing re-requests flooding the database and making the other critical parts of the system becoming unresponsive.

Nice Idea but someone didn’t read enough about automatic parallelism, here is a simple demo, first create a table e.g. (this code has been shamelessly adapted from Jonathan Lewis’ site)

create table t1 
nologging 
 as
 with generator as ( 
 select  --+ materialize 
 rownum id 
 from dual 
 connect by
 rownum <=  10000 
 ) 
 select
 rownum                   id, 
 trunc(dbms_random.value(1,1000))    n1, 
 trunc(dbms_random.value(1,1000))     n2, 
 lpad(rownum,30,'0')          small_vc 
 from
 generator    v1, 
 generator   v2 
 where
 rownum <= 100000000;
/
 
 begin
 dbms_stats.gather_table_stats( 
 ownname      => user, 
 tabname      =>'T1', 
 method_opt    => 'for all columns size 1'
 ); 
 end; 
/

Then a nice simple piece of pl/sql to demonstrate the issue (database version 11.2.0.1, 8 cores, parallel_server_target set to 64, parallel_min_time_threshold=1 you may need different values to force the behaviour)

declare
cursor cursor1 is select * from t1  order by id;
cursor cursor2 is select * from t1  order by id;
cursor cursor3 is select * from t1  order by id;
cursor cursor4 is select * from t1  order by id;
cursor cursor5 is select * from t1  order by id;
cursor cursor6 is select * from t1  order by id;
cursor cursor7 is select * from t1  order by id;
cursor cursor8 is select * from t1  order by id;
cursor cursor9 is select * from t1  order by id;
cursor cursor10 is select * from t1  order by id;
id dbms_sql.number_table;
n1 dbms_sql.number_table;
n2 dbms_sql.number_table;
small_vc dbms_sql.varchar2_table;
BEGIN
open cursor1;
open cursor2;
open cursor3;
open cursor4;
open cursor5;
open cursor6;
open cursor7;
open cursor8;
open cursor9;
open cursor10;

loop
fetch cursor1 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor2 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor3 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor4 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor5 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor6 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor7 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor8 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor9 bulk collect into id,n1,n2,small_vc limit 500;
fetch cursor10 bulk collect into id,n1,n2,small_vc limit 500;
exit when cursor1%notfound;
end loop;

close cursor1;
close cursor2;
close cursor3;
close cursor4;
close cursor5;
close cursor6;
close cursor7;
close cursor8;
close cursor9;
close cursor10;
End;
/

Now execute the stored procedure and wait, and wait, and continue waiting forever…..

The problem is simple(once we knew what was going on), auto parallelism will queue statements if it can’t execute with the degree of parallelism requested by about cursor 8 our database was unable to assign the required parallel slaves so it queued the statement, this means no one else ever gets any rows untill we kill the process. Typically what we were seeing was one session waiting on “PX Queuing: statement queue” event. unfortunately getting the code fixed is going to be a long slow process so at the moment automatic parallelism and statement queuing are switched off.

About these ads

3 Responses to A single procedure to “hang” the database.

  1. […] This post was mentioned on Twitter by OraNA.info, John Seaman. John Seaman said: RT @orana: A single procedure to “hang” the database. http://dlvr.it/GbS3n […]

  2. Dom Brooks says:

    “cutomer1 gets 500 rows then customer2 gets 500 rows then back to customer1. If ten customers issue re-requests results are sent back in chunks of 500 to each customer in turn and so on.”

    How was this design arrived at?
    Sounds like you effectively have a single user database.
    It sounds bizarre and completely unscalable.

    • randomdba says:

      Arrive at a design is a little optimistic, its sort of a third party product, (supplied by a different division of the group). My recomendation is to grab all the rows and disiminate them as fast a possible, or possibly cache most of the data at the mid tier either in the application layer or 11g client result cache as we can supply a superset of the data and filter at the client.
      However I’m currently trying to get the developers to understand bind variables. I’d really like them to get the hang of the technology form the 80s/90s first before we start on the 21st century stuff.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: