A single procedure to “hang” the database.

February 14, 2011

I really enjoy it when people don’t understand the features they are using; it makes life really interesting. Automatic parallelism is a great new feature of 11.2, but we found a new and interesting way to break things. As part of the application we are implementing, customers may re-request fairly large volumes of data. To limit the impact on the system as a whole, the design is such that a cursor is opened and the data returned in chunks of 500 rows. If two customers execute re-requests at the same time, two cursors are opened: customer1 gets 500 rows, then customer2 gets 500 rows, then back to customer1, and so on. If ten customers issue re-requests, results are sent back in chunks of 500 to each customer in turn.

The intention is to stop customers complaining that they are waiting too long to receive data, while at the same time preventing re-requests from flooding the database and making the other critical parts of the system unresponsive.

Nice idea, but someone didn’t read enough about automatic parallelism. Here is a simple demo. First create a table, e.g. (this code has been shamelessly adapted from Jonathan Lewis’ site):

create table t1
nologging
as
with generator as (
    select --+ materialize
        rownum id
    from dual
    connect by rownum <= 10000
)
select
    rownum                              id,
    trunc(dbms_random.value(1,1000))    n1,
    trunc(dbms_random.value(1,1000))    n2,
    lpad(rownum,30,'0')                 small_vc
from
    generator v1,
    generator v2
where
    rownum <= 100000000;
 
begin
    dbms_stats.gather_table_stats(
        ownname    => user,
        tabname    => 'T1',
        method_opt => 'for all columns size 1'
    );
end;
/

Then a nice simple piece of PL/SQL to demonstrate the issue (database version 11.2.0.1, 8 cores, parallel_servers_target set to 64 and parallel_min_time_threshold=1; you may need different values to force the behaviour):
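For reference, the settings we were running with can be applied roughly like this (parallel_degree_policy=auto is what enables automatic DOP and statement queuing in 11.2; the exact values you need will depend on your system):

```sql
-- enable automatic parallelism and statement queuing (11.2)
alter system set parallel_degree_policy = auto;
-- number of parallel servers that can be in use before queuing kicks in
alter system set parallel_servers_target = 64;
-- serial statements estimated to take longer than this many seconds
-- become candidates for automatic parallelism
alter system set parallel_min_time_threshold = 1;
```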

declare
    cursor cursor1 is select * from t1 order by id;
    cursor cursor2 is select * from t1 order by id;
    cursor cursor3 is select * from t1 order by id;
    cursor cursor4 is select * from t1 order by id;
    cursor cursor5 is select * from t1 order by id;
    cursor cursor6 is select * from t1 order by id;
    cursor cursor7 is select * from t1 order by id;
    cursor cursor8 is select * from t1 order by id;
    cursor cursor9 is select * from t1 order by id;
    cursor cursor10 is select * from t1 order by id;
    id       dbms_sql.number_table;
    n1       dbms_sql.number_table;
    n2       dbms_sql.number_table;
    small_vc dbms_sql.varchar2_table;
begin
    open cursor1;
    open cursor2;
    open cursor3;
    open cursor4;
    open cursor5;
    open cursor6;
    open cursor7;
    open cursor8;
    open cursor9;
    open cursor10;

    loop
        fetch cursor1 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor2 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor3 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor4 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor5 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor6 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor7 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor8 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor9 bulk collect into id, n1, n2, small_vc limit 500;
        fetch cursor10 bulk collect into id, n1, n2, small_vc limit 500;
        exit when cursor1%notfound;
    end loop;

    close cursor1;
    close cursor2;
    close cursor3;
    close cursor4;
    close cursor5;
    close cursor6;
    close cursor7;
    close cursor8;
    close cursor9;
    close cursor10;
end;
/

Now execute the anonymous block and wait, and wait, and continue waiting forever…

The problem is simple (once we knew what was going on): automatic parallelism will queue statements if they can’t execute with the degree of parallelism requested. By about cursor 8 our database was unable to assign the required parallel slaves, so it queued the statement, which means no one ever gets any more rows until we kill the process. Typically what we were seeing was one session waiting on the “PX Queuing: statement queue” event. Unfortunately, getting the code fixed is going to be a long, slow process, so for the moment automatic parallelism and statement queuing are switched off.
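If you want to confirm a statement is stuck in the queue, or switch the feature off as we did, something along these lines should work (v$sql_monitor shows queued parallel statements with a STATUS of ‘QUEUED’, though querying it requires the Tuning Pack licence; setting parallel_degree_policy back to manual reverts to pre-11.2 behaviour):

```sql
-- who is currently stuck in the parallel statement queue
select sid, sql_id, status
from   v$sql_monitor
where  status = 'QUEUED';

-- turn off automatic DOP and statement queuing system-wide
alter system set parallel_degree_policy = manual;
```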


Index fragmentation – Just like SQL Server

March 6, 2009

My current project is working for a client consolidating a bunch of SQL Server databases onto a properly licensed and managed SQL Server cluster. It’s been a while since I did a lot in SQL Server, and it seemed like a good way to refresh some skills I hadn’t used for a while. One of the big differences is that SQL Server best practice recommends running a maintenance job to rebuild fragmented indexes, whereas current thinking in Oracle is that frequent (and, most of the time, any) index rebuilds are not required. Index fragmentation in SQL Server has two specific meanings: internal fragmentation is wasted space in an index, the kind of thing you get after heavy deletes of almost all the data in a range of values; external fragmentation is where pages are stored out of order on disk (a more in-depth explanation is here http://www.sql-server-performance.com/articles/per/index_fragmentation_p1.aspx). Most of the discussion about Oracle indexes covers the first type of fragmentation, but I’ve seen the second type discussed far less, so I figured I’d see if the same thing happens. First of all, build some test tables and indexes:

drop table TEST_FRAG;
create table TEST_FRAG (id number not null, value varchar2(10));
insert into TEST_FRAG select rownum*2-1, 'Stuff' from dual connect by level <=200000;
create unique index TEST_FRAG_IDX on TEST_FRAG(id);
insert into TEST_FRAG select rownum*2, 'TEST_FRAG' from dual connect by level <=200000;

drop table TEST_UNFRAG;
create table TEST_UNFRAG (id number not null, value varchar2(10));
insert into TEST_UNFRAG select rownum, 'Stuff' from dual connect by level <=400000;
create unique index TEST_UNFRAG_IDX on TEST_UNFRAG(id);
commit;

begin
DBMS_STATS.GATHER_TABLE_STATS (ownname => 'CHRIS', tabname => 'TEST_FRAG', estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE 1');
DBMS_STATS.GATHER_TABLE_STATS (ownname => 'CHRIS', tabname => 'TEST_UNFRAG', estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE 1');
end;
/

This creates a couple of simple tables and indexes which we can then run some tests on. First of all, this:

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
select count(id) from test_frag where id >=1000 and id <=20000;
select count(id) from test_unfrag where id >=1000 and id <=20000;

Both take next to no time. Taking a look at the traces we can see a small difference; for the first statement we get the following wait events in the trace file:

WAIT #4: nam='db file sequential read' ela= 58 file#=4 block#=19812 blocks=1 obj#=67986 tim=1236296568417821
WAIT #4: nam='db file sequential read' ela= 57 file#=4 block#=18928 blocks=1 obj#=67986 tim=1236296568418122
WAIT #4: nam='db file sequential read' ela= 59 file#=4 block#=19816 blocks=1 obj#=67986 tim=1236296568418396
WAIT #4: nam='db file sequential read' ela= 57 file#=4 block#=18929 blocks=1 obj#=67986 tim=1236296568418700
WAIT #4: nam='db file sequential read' ela= 60 file#=4 block#=19820 blocks=1 obj#=67986 tim=1236296568418972
WAIT #4: nam='db file sequential read' ela= 56 file#=4 block#=18930 blocks=1 obj#=67986 tim=1236296568419259

We can see the range scan is reading blocks from different extents as it progresses through the index. For the second table the waits look like this:

WAIT #3: nam='db file sequential read' ela= 46 file#=4 block#=20847 blocks=1 obj#=67988 tim=1236296568430710
WAIT #3: nam='db file sequential read' ela= 48 file#=4 block#=20848 blocks=1 obj#=67988 tim=1236296568430986
WAIT #3: nam='db file sequential read' ela= 58 file#=4 block#=20849 blocks=1 obj#=67988 tim=1236296568431264
WAIT #3: nam='db file sequential read' ela= 58 file#=4 block#=20850 blocks=1 obj#=67988 tim=1236296568431545
WAIT #3: nam='db file sequential read' ela= 56 file#=4 block#=20851 blocks=1 obj#=67988 tim=1236296568431803
WAIT #3: nam='db file sequential read' ela= 56 file#=4 block#=20852 blocks=1 obj#=67988 tim=1236296568432065

Here we are scanning the blocks in order. Overall, however, the total work done is much the same (I got a total of 44 blocks read from the “fragmented” index and 42 from the “unfragmented” one), and we are not seeing a huge difference in access time, as the elapsed times are pretty much the same. The next test gives some different results.

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
select count(value) from test_frag where id >=1000 and id <=20000;
select count(value) from test_unfrag where id >=1000 and id <=20000;

Looking in the trace file here shows the first query actually has a completely different execution plan; it does this:

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=1005 pr=1001 pw=1001 time=0 us)
  19001   TABLE ACCESS FULL TEST_FRAG (cr=1005 pr=1001 pw=1001 time=509 us cost=277 size=228024 card=19002)

where the second query does this

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=83 pr=83 pw=83 time=0 us)
  19001   TABLE ACCESS BY INDEX ROWID TEST_UNFRAG (cr=83 pr=83 pw=83 time=1460 us cost=84 size=190020 card=19002)
  19001    INDEX RANGE SCAN TEST_UNFRAG_IDX (cr=42 pr=42 pw=42 time=497 us cost=42 size=0 card=19002)(object id 67988)

Why is this? Well, the answer lies in the clustering factor. Because of the way the data was loaded, the clustering factor for the first index is 400,000 and for the second 880 (the second index is slightly smaller as well). This has nothing to do with the placement of the index blocks on disk and everything to do with the slightly bizarre way the data was loaded. Rebuilding the index doesn’t change the clustering factor, so it won’t alter the execution plan unless we manually manipulate the statistics or use hints to force a range scan.
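You can check the figures yourself from the data dictionary (the columns below come from user_indexes; your exact numbers will vary slightly because of the random data):

```sql
select index_name, clustering_factor, leaf_blocks, num_rows
from   user_indexes
where  index_name in ('TEST_FRAG_IDX', 'TEST_UNFRAG_IDX');
```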

So do we have a similar type of fragmentation to the SQL Server definition of “external”? Well, we appear to; however, the impact is not particularly large, and in this case we could avoid it by creating the index with a pctfree of 50. The organisation of the table data is actually the larger factor here, and we can’t fix the clustering factor by rebuilding the index, only by changing the way we load the data.


Really simple queuing PL/SQL calculations..

February 21, 2009


A couple of posts by Chen Shapira (this one barbershop-queue-demo and this one random-thoughts-about-queues) over at prodlife.wordpress.com regarding queuing in systems and queuing theory piqued my interest. Way back at the beginning of my career in IT I worked in an outsourced call centre building reports, providing management information etc., and one critical piece of this work was calculating required staffing levels. A lot of this was based on a couple of pieces of maths, the Erlang B and Erlang C formulas, and as part of the reporting I ended up producing some PL/SQL to perform these calculations. It’s not the best, but if you’re interested it’s here: queuing_calc_plsql.
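For flavour, the core of the maths is quite small: the Erlang B blocking probability can be computed with a simple recurrence, and Erlang C (the probability an arriving call has to queue) follows directly from it. Here is a minimal sketch, not the linked code itself; “traffic” is the offered load in Erlangs (arrival rate times average handling time), and Erlang C assumes traffic is less than the number of servers:

```sql
create or replace function erlang_b (traffic in number, servers in pls_integer)
return number is
    b number := 1;
begin
    -- recurrence: B(A,0) = 1, B(A,n) = A*B(A,n-1) / (n + A*B(A,n-1))
    for n in 1 .. servers loop
        b := (traffic * b) / (n + traffic * b);
    end loop;
    return b;
end;
/

create or replace function erlang_c (traffic in number, servers in pls_integer)
return number is
    b number := erlang_b(traffic, servers);
begin
    -- probability an arriving call has to wait (assumes traffic < servers)
    return (servers * b) / (servers - traffic * (1 - b));
end;
/
```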

This is based on the Excel Erlang C calculator written by Lester Bromley at www.erlang.co.uk; however, my code has a couple of additional caveats. There is no error or input checking in the PL/SQL. I may add it at some point, but then again I may not; it is therefore possible to get nonsensical responses or divide-by-zero errors.

Service time is defined in the functions as the time spent queuing, not the queuing time plus transaction time.

The PL/SQL functions I wrote allow you to specify an interval rather than use a default of one hour. Also, the original provides far more functions than I translated; I’ll probably get around to the others at some point, but it may take a while.

The same conditions apply: you can use any of this code you like, but do not sell, rent or lease it out. It is provided as-is, and I’m not responsible for any loss or damages if it is faulty or provides incorrect answers. If you want to modify or re-code it into a different language, feel free, but it would be polite to contact Lester prior to publishing anything based on this, as the original work is his.

Anyway, back to the point. Chen’s posts got me thinking: could this stuff be useful in an OLTP environment? My thoughts are: possibly. It should be possible to estimate the number of CPUs needed from a given volume of transactions, the CPU time required for each transaction and the desired service level. I suspect the model won’t be perfect, as CPU usage for a transaction may be split into multiple chunks, but hopefully I’ll get some time to produce a test case/simulation and see if it works over the next week or so. If anyone has any bright ideas let me know..

Chris


I didn’t do It

January 6, 2009

A recent incident at home reminded me of an interaction I had a few months ago with a developer. First, the incident at home…

I’d been working on my laptop in the back room (commonly referred to as the X-box room, the office, and occasionally the pit); having finished, I left the laptop on the floor, open, and went off to do something else. An hour or so later, whilst my son was playing the Xbox, I heard a loud crash from the back room, and when I asked “what was that?” the reply came back “nothing”. Obviously this wasn’t true, and on investigating I found that the metal box file I’d left on the couch had fallen off and landed on the laptop, leaving a really impressive scratch on the screen. What’s interesting is the reaction this caused: the instantaneous denial that anything had happened and a desire to avoid being in trouble led to a response of “nothing”, and whilst I was angry, it was obviously my own fault, as I should have put the laptop away rather than leave it lying out.

OK, now how was this similar to the interaction at work? Well, basically we put a new application release containing a significant number of changes into a test environment. The system being primarily a batch-driven data warehouse, the initial results all looked good: all the package updates worked, so everyone went home happy. The next day everything was fine too; the overnight loads completed slightly faster than previously, so everyone was happy. The day after that, I got a call:

Developer: “The overnight batch hasn’t finished, can you see what’s wrong with the database?”

Me: “How far has it got?”

Developer: “It’s still loading the security reference tables right at the start of the batch.”

Me: “OK, I’ll take a look. Did you make any changes to this? I thought the changes were in the widget data loads.”

Developer: “We didn’t change any of this code in this release.”

So it’s off into an investigation of why we suddenly have a performance problem. We start with the usual checks: nothing in the alert log, no hardware issues reported, nothing in the messages logs. A look at what’s actually running shows the process is inserting into the second of a set of tables used for row-level security, and checking the explain plan shows it’s using a really inefficient execution path.

A review of the process shows that we load the first of two tables with about 10 million rows, then load the second with around 35 million rows based on the first table joined to some other static tables. Reviewing these, we find there are no stats on the first table. At this point I’m starting to get an idea of what’s going on, and looking at the code, it looks like this:

truncate table a;

truncate table b;

insert into a select blah, blah, blah from …;

insert into b select a.blah, a.blah, … from a, …;

commit;

A quick dig into the code repository shows this has been changed: the two truncates used to be deletes. This explains a couple of things. The first loads were faster because removing the data from the lookup tables is now a truncate, so it doesn’t generate the undo that it used to; however, the change coincided with the gather stats job running, so the statistics show table a has zero rows instead of 10 million. The following set of loads therefore started to use a really bad execution path, full scanning table a for every row inserted into table b. Mystery solved, and fixed by adding a gather stats step after the data is loaded into each table (alternatively we could have locked the stats).
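The fix in outline looks something like this (table names are the same placeholders as above; dbms_stats.lock_table_stats is the alternative mentioned):

```sql
truncate table a;
insert into a select blah, blah, blah from …;
-- gather stats straight after each load so the optimizer sees
-- the real row counts rather than the post-truncate zeros
begin
    dbms_stats.gather_table_stats(ownname => user, tabname => 'A');
end;
/

-- or, alternatively, freeze a representative set of statistics:
-- exec dbms_stats.lock_table_stats(user, 'A');
```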

What does this show? Well, you should always tell your DBA the truth, as it may save time; we really are not trying to shift blame when we ask questions about what has changed, it’s a tool that allows us to narrow the search for the problem. Also, avoid a culture of blame, as it leads to individuals covering things up, meaning you find out something is going wrong much later, when it may be too late to fix the problem or cost much more to put right.