Recently I got an opportunity to dig into performance of database. We were not facing any performance issue, but we wanted to understand whats happening under the hood. What are the bottlenecks and how do we prepare and scale our database further.
In future the load on the databases are going to be crazy and we want to make sure that we look into every consumer of our database and optimize the usage.
I went back into the basics of database performance tuning – ASH. Active session history is the beast which has loads of information and provides information at very granular level – “session”. We can dig into ASH data and generate different reports rolled-up against various entities like module, action, sql_id, user_id etc. We can determine how much each of these entities are using database resources.
One of the most fundamental measure of database performance is DB Time. This is the amount of time a session has spend on database activities. This is a combination of CPU Time + non-Idle wait times. This does not consider any idle wait time so even if session is done with its activity but just sitting idle doing nothing, that idle time is not considered part of DB Time.
This gives a very accurate measure of amount of database resources a session is burning.
Active Session History:
ASH or Active Session History provides snapshots of active sessions connected to database (status=ACTIVE in v$session) and loads of other information related to those sessions about what they were doing when snapshot was taken. To understand how we can extract this information and make sense about it, we need to understand how ASH is build and what that information means.
Active session history is exposed by database using a view V$ACTIVE_SESSION_HISTORY. This view stores samples taken every 1 sec. A background process MMNL (Memory manager lite) takes a snapshot from v$session and v$session_wait about the current state of active sessions and dumps the information in V$ACTIVE_SESSION_HISTORY. Note that only active sessions’s snapshot is taken.
V$ACTIVE_SESSION_HISTORY is a memory view stored in the form of circular buffer. It means that this view has size limit on its growth (approx 2MB / CPU). So when it reaches to that size it starts overwriting the oldest records in circular buffer. You can visualize the same using following figure
So how these samples are useful to us in finding DB Time ?
Lets say a session has spend 10 sec of active time in database. It means DB time for that session is 10 sec. During this 10 sec of active time, we will have 10 samples of that session taken as interval time between snapshot is 1 sec. So if we just take a count(*) from V$ACTIVE_SESSION_HISTORY, we will see 10 records for that session. This means that to get DB Time of a session we just have to take count(*) from V$ACTIVE_SESSION_HISTORY for that session.
Its little difficult to adjust to the fact that time parameter can be obtained using count(*) from a view. But this is because interval time between samples is 1 sec.
So DB Time is actually – [ Number of samples ] X [ Time interval between samples ]
Since time interval between samples is 1 sec, we can get DB time as number of samples which is count(*) from V$ACTIVE_SESSION_HISTORY
Again, calculating DB Time using count(*) from ASH is approximation as sessions which are taking fraction of times < 1 sec will not be counted if they happens between 2 samples and DB time of other sessions which ends just before next samples is taken will be counted until last samples when they were active. But those were very few cases and amount of DB time not measure in such cases will be very less compared to total DB time we are measuring. You can visualize the same using following figure.
DBA_HIST_ACTIVE_SESS_HISTORY is a history table which maintains history of active sessions between AWR snapshots. Since V$ACTIVE_SESSION_HISTORY is a memory view and has limited space for growth, data from V$ACTIVE_SESSION_HISTORY flows to DBA_HIST_ACTIVE_SESS_HISTORY table.
DBA_HIST_ACTIVE_SESS_HISTORY table enables us to view active session data from the past. Data in DBA_HIST_ACTIVE_SESS_HISTORY is retained based on retention level set for AWR snapshots (30 days by default). So every time AWR snapshot happens, data from V$ACTIVE_SESSION_HISTORY goes into this table.
But NOT all data from V$ACTIVE_SESSION_HISTORY goes into DBA_HIST_ACTIVE_SESS_HISTORY. When AWR snapshot happens, it picks only 1 out of 10 samples from V$ACTIVE_SESSION_HISTORY into history table.
Complete visualization of active sessions looks like below
Since AWR snapshot interval is 1 hour, it will take AWR snapshots every 1 hour and during that time 1 out of 10 samples form V$ACTIVE_SESSION_HISTORY will move to DBA_HIST_ACTIVE_SESS_HISTORY table.
So if want to calculate DB Time from DBA_HIST_ACTIVE_SESS_HISTORY table, we need to consider sample time as 10 sec instead of 1 sec.
DB Time = [ Number of samples ] X [ Time interval between samples ] = count(1) * 10
That’s why when we are calculating DB Time of a session or SQL from DBA_HIST_ACTIVE_SESS_HISTORY table, we need to take count(*) * 10.
Calculating DB Time
With this brief understanding of ASH, we can use these tables to find out DB time for a session and consolidate sessions to calculate DB Time for module/user etc.
If we can list modules in terms of % of DB time they are consuming in database, modules with high % of DB time are the one using most resources in database and has high impact to database. Same holds true at user level as well. We can find out which user is contributing highest DB time.
I have given couple of queries to find out % of DB Time different modules are contributing in database. This gives a fair idea about module wise resource consumption in database.
Note that I have randomized module names using DBMS_RANDOM.string(‘L’,TRUNC(DBMS_RANDOM.value(10,21))) to avoid exposing actual module name outside.
set line 999 set pagesize 999 set verify off col module format a30; BREAK ON REPORT COLUMN DUMMY HEADING '' compute sum of PCT_CPU on REPORT with total_cpu as (select count(1) from V$ACTIVE_SESSION_HISTORY where SESSION_TYPE = 'FOREGROUND') select module, count(1) "module_cpu", (select * from total_cpu) "total_cpu", round((count(1)/(select * from total_cpu))*100,2) PCT_CPU from V$ACTIVE_SESSION_HISTORY where SESSION_TYPE = 'FOREGROUND' group by module order by PCT_CPU desc;
MODULE module_cpu total_cpu PCT_CPU ------------------------------ ---------- ---------- ---------- vvadcagpas 29355 187065 15.69 cbkcbqchts 26506 187065 14.17 zugkvhbajxpff 24655 187065 13.18 tvfqhrkqzktiiceolrfd 10046 187065 5.37 paczxkndgsaqt 6740 187065 3.6 pzqacdgxeqzbtqhalfc 6220 187065 3.33 nvqqscmayx 5905 187065 3.16 zbirttdyoxxbslopnm 5412 187065 2.89 bbztiihehigeqwxvk 5051 187065 2.7 kvosekxfhkmfw 4687 187065 2.51 aenyopwmok 3977 187065 2.13 brfqfuwdyhyrej 3256 187065 1.74 abruxykxhvqgfg 3227 187065 1.73 zfrahazoqgdxubpni 2759 187065 1.47 ddxinnevqd 2633 187065 1.41 lructcsjkxhxatfidpf 2518 187065 1.35 naawhsfqvcaoathn 2246 187065 1.2 sljzltjhhigepqfmcji 2142 187065 1.15 chvjvntynkjb 2102 187065 1.12 zlzhwwhzbhfdtsszfedj 2005 187065 1.07 yvkuwzjkjhyeamco 1610 187065 .86 fdjwspubzyhyxyzzkgyi 1578 187065 .84 risowlcnkizrbpgt 1557 187065 .83 ywwdycosdpcfsflaubia 1539 187065 .82 pxlqrkmvreppum 1496 187065 .8 amivlnlivqeesnqdkto 1404 187065 .75 jjaqyosqsc 1347 187065 .72 yxkgfgktafvx 1354 187065 .72 gqistslkwdw 1328 187065 .71 bhuzvdpklocmnyimtzux 1272 187065 .68 uimycabvbjcczibrn 1053 187065 .56 ksnvqslyker 997 187065 .53 ofekwpklwpjtha 975 187065 .52 wmkmovueneno 904 187065 .48 umdtggtvdptkubfs 795 187065 .42 dddegpqpwyalotxtoh 761 187065 .41 pwycwyhavccwjuufqjx 733 187065 .39 pxvcbhpfmeuwen 729 187065 .39 nhezqmgkqccdcog 716 187065 .38 ixtljofuazhpp 628 187065 .34 mbotmyoairyqbdg 600 187065 .32 aocabnswozdg 586 187065 .31 qtdbdbuffprxj 586 187065 .31 suiwwwtolcnkbnkcfwxw 560 187065 .3 btbdbtaztym 562 187065 .3 xmvdgeczrot 567 187065 .3 gkwzuwlgsh 538 187065 .29 yhgmyqgqkvzmvbpsi 463 187065 .25 obahkcsiafagmkxxu 457 187065 .24 tcjoiyhzpuutbfhegn 448 187065 .24 brlrkszrpmtw 426 187065 .23 spzvvwzvoh 435 187065 .23 ckvvarzrtpedhjl 334 187065 .18 mnamshpswmsif 329 187065 .18 gwvwypqrpwxkbugms 293 187065 .16 clkuqearqhguezagwjlc 302 187065 .16 fskwtuctalnxhhu 306 187065 .16 dyljhijlwyfmjy 277 187065 .15 eqdifihzda 255 187065 .14 nkrvwqrldd 229 187065 .12 jkevokwkwokbnkbx 231 187065 .12 niatevdadex 198 187065 .11 qfjiifhiubvstyp 200 187065 .11 acquoqslwxedulsmekbu 180 187065 .1 hjfimltalvopvuum 156 187065 .08 bmyyhtzdqebo 152 187065 .08 tkehpofgfe 157 187065 .08 qnhgoqgstvq 131 187065 .07 ddlloytpmyhpzpmk 125 187065 .07 bvborwlump 133 187065 .07 juirsmdjuejot 124 187065 .07 saeoojlzfepjfybpclls 129 187065 .07 ppfdzcozyhxmf 135 187065 .07 xbrwqdopemjqrhebglzs 115 187065 .06 zrfdsejwxgitsorat 119 187065 .06 vsmsefmxnkoeb 97 187065 .05 dvhoxhzayqyhsny 98 187065 .05 cwjpiesdhpxvjs 101 187065 .05 bvffncgtiojaj 98 187065 .05 flfuciblxfzdcbo 87 187065 .05 asucxewzdruwyfmg 66 187065 .04 dztczkivnnbuzyuh 67 187065 .04 hslcljfjmrhduw 73 187065 .04 lywascpugxxjidx 74 187065 .04 cawgdisaagltmie 73 187065 .04 dpqbuvfkef 68 187065 .04 tjpaushwfftmm 54 187065 .03 ulgktcqvsazktojskfo 56 187065 .03 ckoayyqtcxybkukgqvgl 49 187065 .03 aqmphwbrngbotpxehyhe 49 187065 .03 asqllnbivusyu 60 187065 .03 skvrrqoshnirxbud 59 187065 .03 vobtqhfewje 58 187065 .03 xurrifyontk 34 187065 .02 rliqgkwdwao 42 187065 .02 ziwftblonh 36 187065 .02 wzybbgmxkx 30 187065 .02 vhumxxhimauc 16 187065 .01 vssaefphdkebdx 26 187065 .01 vjbjocafrwgdv 11 187065 .01 xgcwcxrlzrjvysanlijw 14 187065 .01 ztzxyfykhnxozdn 20 187065 .01 btlbvffybkyjsn 12 187065 .01 pmmqhsjuqhdaf 22 187065 .01 kaumdvjqfcprmaeygr 11 187065 .01 kadliemxdxhtzleqjr 28 187065 .01 hlebszxkejjgkfbpmjew 16 187065 .01 atwvjasykmlmldbipswh 16 187065 .01 ---------- sum 99.95
Similarly to check DB Time for sessions in past, we need to use DBA_HIST_ACTIVE_SESS_HISTORY table. Following query will provide % DB time used by individual modules calculated as % of total DB time between 2 AWR snapshots
set line 999 set pagesize 999 set verify off BREAK ON REPORT COLUMN DUMMY HEADING '' compute sum of PCT_CPU on REPORT with total_cpu as (select count(1)*10 from dba_hist_active_sess_history where snap_id >= &begin_snap_id and snap_id < &end_snap_id and SESSION_TYPE = 'FOREGROUND') select module, count(1)*10 "module_cpu", (select * from total_cpu) "total_cpu", round((count(1)*10/(select * from total_cpu))*100,2) PCT_CPU from dba_hist_active_sess_history where snap_id >= &&begin_snap_id and snap_id < &&end_snap_id and SESSION_TYPE = 'FOREGROUND' group by module order by PCT_CPU desc
Note that we have used count(*) * 10 in above query because 1 of 10 samples go to this table. Also, you need to specify begin_snap_id and end_snap_id and module DB Time usage during those snapshots will be calculated.
If we want to ignore waits and only focus on pure CPU used by modules, we can list modules as per CPU time they consume as % of total CPU time of database.
I think DB time is still better metric than CPU time because DB time covers waits as well. Waits slow down database.
We can calculate CPU time of a session by doing the same count(*) from V$ACTIVE_SESSION_HISTORY but we need to count only those samples during which session was actually using CPU. An ACTIVE session could be either using CPU to do some processing or it could be actively waiting on something (example IO when doing db file sequential read). Active session using CPU is represented by session_state = ‘ON CPU’. So we will count only those samples for session where session_state was ON CPU
Following query will give CPU time used by each module as % of total CPU time using V$ACTIVE_SESSION_HISTORY
set line 999 set pagesize 999 set verify off BREAK ON REPORT COLUMN DUMMY HEADING '' compute sum of PCT_CPU on REPORT with total_cpu as (select count(1) from V$ACTIVE_SESSION_HISTORY where SESSION_TYPE != 'BACKGROUND' and session_state = 'ON CPU') select module, count(1) "module_cpu", (select * from total_cpu) "total_cpu", round((count(1)/(select * from total_cpu))*100,2) PCT_CPU from V$ACTIVE_SESSION_HISTORY where SESSION_TYPE != 'BACKGROUND' and session_state = 'ON CPU' group by module order by PCT_CPU desc
We can find similar CPU time for past sessions using DBA_HIST_ACTIVE_SESS_HISTORY table. Following query will provide CPU time used by module as % of total CPU time between 2 AWR snapshots
set line 999 set pagesize 999 set verify off BREAK ON REPORT COLUMN DUMMY HEADING '' compute sum of PCT_CPU on REPORT with total_cpu as (select count(1)*10 from dba_hist_active_sess_history where snap_id >= &&begin_snap_id and snap_id < &&end_snap_id and SESSION_TYPE != 'BACKGROUND' and session_state = 'ON CPU') select module, count(1)*10 "module_cpu", (select * from total_cpu) "total_cpu", round((count(1)*10/(select * from total_cpu))*100,2) PCT_CPU from dba_hist_active_sess_history where snap_id >= &&begin_snap_id and snap_id < &&end_snap_id and SESSION_TYPE != 'BACKGROUND' and session_state = 'ON CPU' group by module order by PCT_CPU desc
Note, that in above query from DBA_HIST_ACTIVE_SESS_HISTORY, we have used count(*) * 10 because only 1 of 10 samples goes into this table.
I think for now above information is sufficient to give a basic idea about ASH.
I will be writing more articles on performance tuning using ASH as I dig further.
Hope this helps !!
5 thoughts on “Oracle Performance Tuning – ASH Basics”
I don’t think you can reliably calculate a session’s DB Time like this from a sample unless you know that it is likely to be running constantly between samples – a significant subset of usage.
This is why the columns TM_DELTA_DB_TIME and TM_DELTA_TIME are provided – to allow this to be accurately tracked.
However in my experience these columns I’ve mentioned are not reliable:
If a session is ACTIVE it will be either on CPU or on non-idle wait event. Sum of CPU + non-idle wait event = DB Time. So it has to be running constantly between samples. If its not running constantly between samples than session has become INACTIVE and will not be considered in next immediate sample.
I don’t think session can be running a SQL in ACTIVE status and in-between go to IDLE wait event and later continue with same instance SQL execution after that.
Click to access db-perf-tuning-ow08-131582.pdf
Sure it can.
Consider a SQL statement which is returning many rows back to a client with lots of round trips. It will flick constantly between states and in between any two samples, the round tripping might well add up to significantly more time than the SQL execution.
If this were not the case the additional ASH columns mentioned above would not be required.
If fetch size is small, client have to do many round trips to fetch all records. But fetch starts when execution is complete and wait event we will see is “SQL*Net message to client”, which comes under Idle wait event category.
Let me try this out and verify the result. I think Tanel’s snapper will be helpful in this situation to check if query returning a result goes back to User IO waits or not.
I think the column labels in the first query should be DB Time (not CPU time). Thanks!