Flashback logging only has a 2% performance decrease…or not.

February 17, 2010

Firstly, apologies to both people who have previously read my blog. Things have been a little hectic over the past 12 months: working away from home in the Netherlands has taken its toll on my spare time, and weekends are spent either with the family or working. Things have settled into a more stable pattern now, so here goes…

Some background: I've been assigned to a project to configure Data Guard on a very busy production system running a mixed bag of batch processing, messaging and DSS loads. Performance and availability of the system are critical for the business, so everyone was happy that when flashback was proposed as an addition to the system they could quote this bit of marketing: http://www.oracle.com/technology/deploy/availability/htdocs/Flashback_Overview.htm (search for 2% and you will see the bit I mean). Great, everyone was happy. Then I got moved onto the project to help with configuration and testing, and started upsetting people…

Prior to starting the formal tests we had a fairly informal period in which we built the test databases, configured the standby, got the team used to the Data Guard broker's foibles, failed over, rebuilt the primary using Flashback Database, performed some switchovers and switchbacks, broke the configuration with an ill-advised flashback on the primary, fixed the configuration, and switched over a couple more times. Then we hit a (handily unpublished) bug, 5937899, involving flashback, Data Guard and switchovers. Fortunately this was recoverable because we had set guaranteed restore points, which is where our problems really got started.

As part of the test plan we have a set of processes to run. The plan is to run everything; measure the performance of the network, host O/S and SAN; generate AWR reports and so on; then flash back to our guaranteed restore point, alter the configuration and re-run the same processes. That gives us data we can compare: no Data Guard, archive log shipping, LGWR ASYNC and finally LGWR SYNC. The plan was good right up until we tried some Mickey Mouse testing with simple scripts creating tables by selecting from dba_objects, plus delete/insert/commit, really simple stuff. So, first run through, no Data Guard, no flashback, everything is good. Enable Data Guard, configure Flashback Database, set a guaranteed restore point, and performance nosedives; the top 5 wait events look like this:

Top 5 Timed Events

Event                          Waits  Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
log file sync                  2,055      557           271               35.7  Commit
CPU time                                  497                             31.8
db file parallel write        33,547      437            13               28.0  System I/O
flashback buf free by RVWR       430      402           935               25.7  Other
log file parallel write          777       96           124                6.2  System I/O

This is not good. The log file sync time isn't unexpected (in fact it's not much greater than without Data Guard or flashback on), but flashback buf free by RVWR is not something we expected to see at this level. Finding this really doesn't bode well for the rest of the testing, or for the impact on production, so we start digging around: switch off flashback, configure a restore point with flashback logging not enabled, and so on. Eventually I'm looking at the session wait events and the flash recovery area at the same time, and notice that all the sessions go into waiting on flashback buf free by RVWR at the same moment, and that this coincides with a new flashback log being created. Because of the load we have generated, flashback logs are created ~500MB in size, and every session we are running to generate load waits while this happens.
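You can watch this happening from another session without much tooling; a minimal sketch (standard 10g/11g dynamic views), run repeatedly while the load is going:

-- who is stuck behind flashback log creation right now
select sid, event, seconds_in_wait
from   v$session
where  event = 'flashback buf free by RVWR';

-- total space used by flashback logs; a jump of ~500MB here lines up
-- with the burst of waits above as a new log is created
select flashback_size, retention_target
from   v$flashback_database_log;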

Right, so what is the real problem? Well, it's the way we have set things up. With flashback logging enabled and no restore points configured, flashback logs get reused (how soon depends on db_flashback_retention_target), so once the system hits a steady state it shouldn't be creating new flashback logs, just reusing old ones. In our test environment we enable flashback logging and configure a guaranteed restore point, so we never reuse a flashback log, we just create new ones. This isn't representative of the live system, as we will not be setting guaranteed restore points there. So our next step is to find a way to enable flashback to a guaranteed point in test without inflicting the waits we have seen in our quick and dirty testing. On a small system the following seems to get around the problem (a sketch of the commands follows the list):

Configure the test database and standby server.
Set Flashback Database on and set db_flashback_retention_target to a big number (even the default of 1440 will work).
Execute one of our scripts to generate a lot of changes and wait until we have enough flashback logs (yes, I know: how many is enough?).
Set a guaranteed restore point.
Set db_flashback_retention_target to a small number (<60).

What will then happen is that the existing flashback logs will be reused, removing the extreme waits on new flashback log files being created, while still allowing us to flash back and re-run the production-like process tests against the same data.
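In SQL*Plus terms the sequence looks roughly like this. This is a sketch only: the restore point name and the retention values are illustrative, and the load-generation step is whatever your own scripts do.

-- one-off setup: flashback logging with a generous retention target
-- (note: on 10g the database must be mounted, not open, to turn flashback on)
alter system set db_flashback_retention_target=1440;
alter database flashback on;

-- ... run the load-generation scripts here until a good number of
-- flashback logs exist in the flash recovery area ...

-- now pin the rewind point and drop the retention target so the
-- existing logs are reused rather than new ones created
create restore point before_tests guarantee flashback database;
alter system set db_flashback_retention_target=30;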

The real point here isn't about flashback logging or a specific process, it's about testing. In our case we delayed the start of the formal tests by a day, but saved time overall, as we will not end up having to re-run tests that take 4-5 hours to complete; well, at least not for this reason :). In summary:-

Validating your environment before you start time-consuming and possibly expensive tests is a good idea.

Small, fairly crude tests can show up problems in the configuration, shortening the test cycle.

Having time to mess about and quickly retry things is invaluable for DBAs implementing features that haven't been used by the team before.

Never, ever quote a performance figure pulled from marketing material, no matter what the source is; always verify in your own environment.


Index Fragmentation – more thoughts.

March 18, 2009

Looking at my last post I'm somewhat unsatisfied with the script used to create the "fragmented" index, so I thought I'd have another go, modifying it to be like this:

drop table TEST_FRAG2;
create table TEST_FRAG2 (id number not null, value varchar2(10));
insert into TEST_FRAG2 select id, 'stuff' from
(select rownum as id from dual connect by level <=400000)
where mod(trunc(id/400),2)=0;
create unique index TEST_FRAG2_IDX on TEST_FRAG2(id);
insert into TEST_FRAG2 select id, 'stuff' from
(select rownum as id from dual connect by level <=400000)
where mod(trunc(id/400),2)=1;
commit;

begin
DBMS_STATS.GATHER_TABLE_STATS (ownname => 'CHRIS', tabname => 'TEST_FRAG2',
estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE 1');
end;
/

This creates an index which has a similar type of physical fragmentation to the previous example, but a better clustering factor. Running the first test again (I'm seeing a faster response from disk today; however, I wouldn't read too much into that, as my test database is 11g on Oracle Enterprise Linux 5.2 under Oracle VM, on the cheapest Dell server I could get, with one disk and a bunch of third-party RAM):

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
select count(id) from test_frag2 where id >=1000 and id <=20000;
alter session set events '10046 trace name context off';

Taking a look at the trace file shows the following type of activity (this is a small sample): we are reading blocks that are not sequentially placed on disk.

WAIT #11: nam='db file sequential read' ela= 42 file#=4 block#=25070 blocks=1 obj#=68308 tim=1236636891337307
WAIT #11: nam='db file sequential read' ela= 27 file#=4 block#=25956 blocks=1 obj#=68308 tim=1236636891337422
WAIT #11: nam='db file sequential read' ela= 27 file#=4 block#=25960 blocks=1 obj#=68308 tim=1236636891337539
WAIT #11: nam='db file sequential read' ela= 26 file#=4 block#=25071 blocks=1 obj#=68308 tim=1236636891337667
WAIT #11: nam='db file sequential read' ela= 29 file#=4 block#=25968 blocks=1 obj#=68308 tim=1236636891337783
WAIT #11: nam='db file sequential read' ela= 26 file#=4 block#=25964 blocks=1 obj#=68308 tim=1236636891337911
WAIT #11: nam='db file sequential read' ela= 26 file#=4 block#=25072 blocks=1 obj#=68308 tim=1236636891338034
WAIT #11: nam='db file sequential read' ela= 28 file#=4 block#=25976 blocks=1 obj#=68308 tim=1236636891338151

Then, running the second test:

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
select count(value) from test_frag2 where id >=1000 and id <=20000;
select count(value) from test_unfrag where id >=1000 and id <=20000;
alter session set events '10046 trace name context off';

This now uses the same execution plan as before; looking at the tkprof output shows the query on test_frag2 doing more work.

select count(value)
from
test_frag2 where id >=1000 and id <=20000

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.01       0.01        108        155          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.01       0.01        108        155          0           1

select count(value)
from
test_unfrag where id >=1000 and id <=20000

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.01       0.01         83         83          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.01       0.01         83         83          0           1

A coalesce on the index TEST_FRAG2_IDX improves things, but we are still doing slightly more work, as shown below; a rebuild gives exactly the same result (both commands are sketched after the output).

select count(value)
from
test_frag2 where id >=1000 and id <=20000

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.02       0.02         85        132          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.02       0.02         85        132          0           1
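For reference, the two maintenance operations compared here were nothing more exotic than:

alter index TEST_FRAG2_IDX coalesce;
-- or, starting again from the same fragmented state:
alter index TEST_FRAG2_IDX rebuild;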

I'll need to take a look at what's going on here, but I suspect the extra work is because the table data is scattered across more blocks, rather than anything to do with the index structure.
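That suspicion is easy to sanity-check from the dictionary; a sketch, assuming the statistics gathered above are current:

-- if TEST_FRAG2 reports noticeably more blocks for a similar number of
-- rows, the extra buffer gets come from the table, not the index
select table_name, num_rows, blocks
from   user_tables
where  table_name in ('TEST_FRAG2', 'TEST_UNFRAG');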


Index fragmentation – Just like SQL Server

March 6, 2009

My current project is working for a client consolidating a bunch of SQL Server databases into a properly licensed and managed SQL Server cluster. It's been a while since I did a lot in SQL Server, and it seemed like a good way to refresh some skills I hadn't used for a while. One of the big differences is that SQL Server best practice recommends running a maintenance job to rebuild fragmented indexes, whereas current thinking in Oracle is that frequent (and, most of the time, any) index rebuilds are not required. Index fragmentation in SQL Server has two specific meanings. Internal fragmentation is wasted space in an index, the kind of thing you get after heavy deletes of almost all the data in a range of values. The other is external fragmentation, where pages are stored out of order on disk (a more in-depth explanation is here: http://www.sql-server-performance.com/articles/per/index_fragmentation_p1.aspx). Most of the discussion about Oracle indexes covers the first type of fragmentation, but I've seen the second type discussed far less, so I figured I'd see if the same thing happens. First of all, build some test tables and indexes:

drop table TEST_FRAG;
create table TEST_FRAG (id number not null, value varchar2(10));
insert into TEST_FRAG select rownum*2-1, 'Stuff' from dual connect by level <=200000;
create unique index TEST_FRAG_IDX on TEST_FRAG(id);
insert into TEST_FRAG select rownum*2, 'TEST_FRAG' from dual connect by level <=200000;

drop table TEST_UNFRAG;
create table TEST_UNFRAG (id number not null, value varchar2(10));
insert into TEST_UNFRAG select rownum, 'Stuff' from dual connect by level <=400000;
create unique index TEST_UNFRAG_IDX on TEST_UNFRAG(id);
commit;

begin
DBMS_STATS.GATHER_TABLE_STATS (ownname => 'CHRIS', tabname => 'TEST_FRAG', estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE 1');
DBMS_STATS.GATHER_TABLE_STATS (ownname => 'CHRIS', tabname => 'TEST_UNFRAG', estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE 1');
end;
/

This creates a couple of simple tables and indexes which we can then run some tests on. First of all, this:-

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
select count(id) from test_frag where id >=1000 and id <=20000;
select count(id) from test_unfrag where id >=1000 and id <=20000;

Both take next to no time. Taking a look at the traces we can see a small difference: for the first statement we get the following wait events in the trace file.

WAIT #4: nam='db file sequential read' ela= 58 file#=4 block#=19812 blocks=1 obj#=67986 tim=1236296568417821
WAIT #4: nam='db file sequential read' ela= 57 file#=4 block#=18928 blocks=1 obj#=67986 tim=1236296568418122
WAIT #4: nam='db file sequential read' ela= 59 file#=4 block#=19816 blocks=1 obj#=67986 tim=1236296568418396
WAIT #4: nam='db file sequential read' ela= 57 file#=4 block#=18929 blocks=1 obj#=67986 tim=1236296568418700
WAIT #4: nam='db file sequential read' ela= 60 file#=4 block#=19820 blocks=1 obj#=67986 tim=1236296568418972
WAIT #4: nam='db file sequential read' ela= 56 file#=4 block#=18930 blocks=1 obj#=67986 tim=1236296568419259

We can see the range scan is reading blocks from different extents as it progresses through the index. For the second table the waits look like this:

WAIT #3: nam='db file sequential read' ela= 46 file#=4 block#=20847 blocks=1 obj#=67988 tim=1236296568430710
WAIT #3: nam='db file sequential read' ela= 48 file#=4 block#=20848 blocks=1 obj#=67988 tim=1236296568430986
WAIT #3: nam='db file sequential read' ela= 58 file#=4 block#=20849 blocks=1 obj#=67988 tim=1236296568431264
WAIT #3: nam='db file sequential read' ela= 58 file#=4 block#=20850 blocks=1 obj#=67988 tim=1236296568431545
WAIT #3: nam='db file sequential read' ela= 56 file#=4 block#=20851 blocks=1 obj#=67988 tim=1236296568431803
WAIT #3: nam='db file sequential read' ela= 56 file#=4 block#=20852 blocks=1 obj#=67988 tim=1236296568432065

Here we are scanning the blocks in order. Overall, however, the total work done is much the same (I got a total of 44 blocks read from the "fragmented" index and 42 from the "unfragmented"), and we are not seeing a huge difference in access time, as the elapsed time is pretty much the same. The next test gives some different results.

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
select count(value) from test_frag where id >=1000 and id <=20000;
select count(value) from test_unfrag where id >=1000 and id <=20000;

Looking in the trace file here shows the first query actually has a completely different execution plan. It does this:

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=1005 pr=1001 pw=1001 time=0 us)
  19001   TABLE ACCESS FULL TEST_FRAG (cr=1005 pr=1001 pw=1001 time=509 us cost=277 size=228024 card=19002)

where the second query does this:

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=83 pr=83 pw=83 time=0 us)
  19001   TABLE ACCESS BY INDEX ROWID TEST_UNFRAG (cr=83 pr=83 pw=83 time=1460 us cost=84 size=190020 card=19002)
  19001    INDEX RANGE SCAN TEST_UNFRAG_IDX (cr=42 pr=42 pw=42 time=497 us cost=42 size=0 card=19002)(object id 67988)

Why is this? Well, the answer lies in the clustering factor. Because of the way the data was loaded, the clustering factor for the first table is 400,000 and for the second is 880 (the second index is slightly smaller as well). This has nothing to do with the placement of the index blocks on disk, and everything to do with the slightly bizarre way the data was loaded. Rebuilding the index doesn't change the clustering factor, so it won't alter the execution plan unless we manually manipulate the statistics or use hints to force a range scan.
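The numbers quoted here come straight from the dictionary; something like this (assuming fresh statistics):

select index_name, clustering_factor, leaf_blocks
from   user_indexes
where  index_name in ('TEST_FRAG_IDX', 'TEST_UNFRAG_IDX');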

So, do we have a similar type of fragmentation to the SQL Server definition of external fragmentation? Well, we appear to; however the impact is not particularly large, and in this case we could have avoided it by creating the index with a pctfree of 50. The organisation of the table data is actually the larger factor here, and we can't fix the clustering factor by rebuilding the index, only by changing the way we load the data.
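For completeness, the pctfree idea is a one-liner: leaving half of each leaf block free when the index is built gives the second batch of interleaved inserts somewhere to land without block splits, at the cost of a roughly doubled index.

create unique index TEST_FRAG_IDX on TEST_FRAG(id) pctfree 50;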


Really simple queuing PL/SQL calculations…

February 21, 2009


A couple of posts by Chen Shapira (this one, barbershop-queue-demo, and this one, random-thoughts-about-queues) over at prodlife.wordpress.com regarding queuing in systems and queuing theory piqued my interest. Way back in the beginning of my career in IT I worked in an outsourced call centre building reports, providing management information etc., and one critical piece of this work was calculating required staffing levels. A lot of this was based on a couple of pieces of maths, the Erlang B and Erlang C formulas, and as part of the reporting I ended up producing some PL/SQL to perform these calculations. It's not the best, but if you're interested it's here: queuing_calc_plsql.

This is based on the Excel Erlang C calculator written by Lester Bromley at www.erlang.co.uk; however my code has a couple of additional caveats. There is no error or input checking in the PL/SQL (I may add it at some point, but then again I may not), so it is possible to get nonsensical responses or divide-by-zero errors.

Service time is defined in the functions as the time spent queuing, not the queuing time plus transaction time.

The PL/SQL functions I wrote allow you to specify an interval rather than use a default of one hour. Also, the original provides far more functions than I translated; I'll probably get around to the others at some point, but it may take a while.

The same conditions apply: you can use any of this code you like, however do not sell, rent or lease it out. Also, it's provided as-is, and I'm not responsible for any loss or damages if it's faulty or provides incorrect answers. If you want to modify or re-code it into a different language, feel free, but it would be polite to contact Lester prior to publishing anything based on this, as the original work is his.
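If you just want the shape of the maths without downloading anything, here is a minimal sketch (these are not the linked functions, and the names erlang_b/erlang_c are mine): Erlang B via the standard recursion, and Erlang C derived from it. Traffic is the offered load in Erlangs, i.e. arrival rate multiplied by average handling time.

create or replace function erlang_b (
  traffic in number,  -- offered load in Erlangs
  servers in number   -- number of agents/servers
) return number is
  b number := 1;  -- B(E,0) = 1
begin
  -- iterate the recursion B(E,m) = E*B(E,m-1) / (m + E*B(E,m-1))
  for m in 1 .. servers loop
    b := traffic * b / (m + traffic * b);
  end loop;
  return b;  -- probability an arriving call is blocked
end erlang_b;
/

create or replace function erlang_c (
  traffic in number,
  servers in number   -- must exceed traffic or the queue is unstable
) return number is
  b number := erlang_b(traffic, servers);
begin
  -- convert the blocking probability into the probability of queuing
  return servers * b / (servers - traffic * (1 - b));
end erlang_c;
/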

Anyway, back to the point. Chen's posts got me to thinking: could this stuff be useful in an OLTP environment? My thoughts are: possibly. It should be possible to estimate the number of CPUs needed based on a given volume of transactions, the CPU time required for each transaction and the desired service level. I suspect the model won't be perfect, as CPU usage for a transaction may be split into multiple chunks, but hopefully I'll get some time to produce a test case/simulation and see if it works over the next week or so. If anyone has any bright ideas let me know…
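As a taste of what that estimate might look like, using the sketch functions above (every number here is made up purely for illustration):

-- set serveroutput on
declare
  tx_per_sec   number := 200;    -- hypothetical transaction arrival rate
  cpu_per_tx   number := 0.02;   -- hypothetical CPU seconds per transaction
  max_queue_p  number := 0.05;   -- accept at most a 5% chance of queuing
  offered_load number := tx_per_sec * cpu_per_tx;  -- load in Erlangs
  cpus         number := ceil(offered_load) + 1;   -- must exceed the load
begin
  -- add CPUs until the Erlang C queuing probability meets the target
  while erlang_c(offered_load, cpus) > max_queue_p loop
    cpus := cpus + 1;
  end loop;
  dbms_output.put_line('Estimated CPUs required: ' || cpus);
end;
/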

Chris


I didn't do it

January 6, 2009

A recent incident at home reminded me of an interaction I had a few months ago with a developer. First, the incident at home…

I'd been working on my laptop in the back room (commonly referred to as the X-box room, the office, and occasionally the pit). Having finished, I left the laptop on the floor, open, and went off to do something else. An hour or so later, whilst my son was playing the Xbox, I heard a loud crash from the back room, and when I asked "what was that?" the reply came back "nothing". Obviously this wasn't true, and on investigating I found that the metal box file I'd left on the couch had fallen off and landed on the laptop, leaving a really impressive scratch on the screen. What's interesting is the reaction this caused: the instantaneous denial that anything had happened and a desire to avoid being in trouble led to a response of "nothing". And whilst I was angry, it was obviously my own fault, as I should have put the laptop away rather than leave it lying out.

OK, now how was this similar to the interaction at work? Well, basically we put a new application release containing a significant number of changes into a test environment. The system is primarily a batch-driven data warehouse, and the initial results all looked good; all the package updates worked, so everyone went home happy. Next day everything was fine too, the overnight loads completed slightly faster than previously, and everyone's happy. The day after that I get a call:

Developer: "The overnight batch hasn't finished, can you see what's wrong with the database?"

Me: "How far has it got?"

Developer: "It's still loading the security reference tables, right at the start of the batch."

Me: "OK, I'll take a look. Did you make any changes to this? I thought the changes were in the widget data loads."

Developer: "We didn't change any of this code in this release."

So it's off into an investigation of why we suddenly have a performance problem. We start with the usual checks: nothing in the alert log, no hardware issues reported, nothing in the messages logs. A look at what's actually running shows the process is inserting into the second of a set of tables used for row-level security, and checking the explain plan shows it's using a really inefficient execution path.

A review of the process shows that we load the first of two tables with about 10 million rows, then load the second with around 35 million rows based on the first table joined to some other static tables. Reviewing these, we find there are no stats on the first table. At this point I'm starting to get an idea of what's going on, so looking at the code, it looks like this:

truncate table a;

truncate table b;

insert into a select blah, blah, blah from …;

insert into b select a.blah, a.blah, … from a, …;

commit;

A quick dig into the code repository shows this has been changed: the two truncates used to be deletes. This explains a couple of things. The first loads were faster because removing the data from the lookup tables is now a truncate, so it doesn't generate the undo that it used to. However, this change coincided with the gather stats job, so the statistics show table a as having zero rows instead of 10 million, and the following set of loads started to use a really bad execution path, full scanning table a for every row inserted into table b. Mystery solved, and fixed by adding a gather stats step after the data is loaded into each table (alternatively we could have locked the stats).
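The shape of the fix, sticking with the placeholder names from the pseudo-code above (a sketch, not the real batch code):

truncate table a;
truncate table b;

insert into a select blah, blah, blah from …;
commit;

-- gather stats before the dependent load, so the optimizer sees the
-- ~10 million rows in table a instead of the zero rows captured while
-- the table happened to be empty
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'A');
  -- the alternative mentioned above: freeze a known-good set of stats
  -- dbms_stats.lock_table_stats(ownname => user, tabname => 'A');
end;
/

insert into b select a.blah, a.blah, … from a, …;
commit;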

What does this show? Well, you should always tell your DBA the truth, as it may save time. Also, we really are not trying to shift blame when we ask questions about what has changed; it's a tool that allows us to narrow the search for the problem. And avoid a culture of blame, as it leads to individuals covering things up, meaning you find out something is going wrong much later, when it may be too late to fix the problem or cost much more to put right.


How do I become a DBA?

January 6, 2009

My first post is my answer to this question. I see it asked occasionally on the OTN forums and elsewhere; unfortunately my answer probably doesn't help anyone else, as I got here by the following route:-

After leaving school I took physics at university for the same reasons most 18-year-olds take a university degree: I had no idea what career I actually wanted to follow, but I was good at physics and maths, and all my friends were also heading off to university, so I figured I didn't want to be the only one staying in my hometown and getting a job.

Four years later (after a small hiccup and a change of university) I realised that I didn't want to work in the field I had just spent four years studying, and the most likely alternative didn't seem too attractive (apparently 50% of physics graduates end up as accountants). However, I was really interested in computing, so I found myself starting out as an outsourced phone tech support agent for a large Internet provider, back in the days of dial-up. Surprisingly this didn't put me off a career in IT completely; trying to guide people through connecting to the Internet when they have just unpacked their first PC was a sometimes frustrating procedure, especially as almost no one I spoke to had two telephone lines. After a year I moved into the IT department as an MIS analyst (OK, the only MIS analyst) and spent a lot of time with Excel, and over the next three years I worked with Crystal Reports, Oracle SQL and PL/SQL, and MS SQL Server, as well as assorted CTI and telephony technologies. Then in 2000, whilst my employer was between financial controllers and DBAs, my manager managed to get the 8i DBA training courses approved for me and a colleague. For the next four years I was a mix of DBA and developer, working hours that I never want to repeat, until finally moving on in 2004; since then I've been a DBA covering both development and production DBA roles.

Overall I have been very fortunate in working for a company that was setting up for the first time in the UK; I got to try lots of different IT roles and found the one that suits me and that I enjoy. A more normal career path, taking a computer science degree and entering a graduate training program, may have been quicker, but at least this way I know I'm in the role I enjoy most and one that I'm well suited to.

Hence my answer to the question is: I don't know how you become a DBA. I know how I got here, but it's not the most direct route, and it was the result of working for and with the right people at the right time.