Library Cache: Mutex X – Bug 20879889 – Fixed in 11.2.0.4

I recently encountered a bug related to MView log causing very high library cache: mutex x wait events.

I will brief about the debugging steps I tried and fix for the same.

Few things to note before I proceed:-

  • We observed huge wait events for library cache: mutex X whenever we performed flip to standby or when DB was bounced. I am implying that library cache was cold and didn’t had required cursor information and object handles.
  • Load on the database was very high. This includes very high number of sessions connected to database and high number of DMLs performed on same table by many sessions.
  • Table on which DMLs are performed is having MLOG created on that based on primary key. This is required as these changes and data needs to flow to downstream databases via MView (fast refresh). So around 10 downstream sites were registered on this MLOG.

Root Cause Analysis:

After looking at the wait events, we immediately started digging into the root cause. We wanted to understand the bottleneck which is causing these wait events and remove whatever it takes to reduce these wait events.

I started with checking top 10 wait events in last 20 mins from v$active_session_history


SQL>select * from (select event, count(1) from v$active_session_history
2 where sample_time > (sysdate - 20/1440) group by event order by 2 desc) where rownum < 10;

EVENT                                    COUNT(1)
---------------------------------------- ----------
library cache: mutex X                        50943
checkpoint completed                          15170
read by other session                          5487
row cache lock                                 4205
log file sync                                  3137
flashback buf free by RVWR                     1815
db file sequential read                        1675
log file switch completion                     1611
cursor: pin S wait on X                        1516

9 rows selected.

Clearly library cache: mutex X was way higher.

We can check who is causing library cache: mutex X by checking P1 and P2 of that wait event


SQL>select event, p1, count(1) from v$active_session_history where sample_time > (sysdate - 20/1440) and event = 'library cache: mutex X' group by event, p1 order by 3;

EVENT                                    P1         COUNT(1)
---------------------------------------- ---------- ----------
library cache: mutex X                    421399181          1
library cache: mutex X                   3842104349          1
library cache: mutex X                   3477606468          1
library cache: mutex X                   2432877226          1
library cache: mutex X                    955484820          2
library cache: mutex X                        37005         25
library cache: mutex X                   1412465886        297
library cache: mutex X                   2417922189      50615

8 rows selected.

As you can see max wait events are caused by P1 = 2417922189

P1 is idn – can be used for finding the cursor related to mutex

For library cache: mutex X

  1. P1 represents library cache hash bucket number (if idn <= 131072)
  2. P1 represents hash value of the library cache object under protection (if idn > 131072)

In our case hash value was 2417922189. So it represents library cache object.

We can check if this hash value belongs to a cursor (SQL) using v$sql


SQL>select sql_id from v$sql where hash_value = 2417922189;

SQL_ID
-------------
3b7aa6f81x44d

SQL>@sql
Enter SQL_ID:- 3b7aa6f81x44d
old 3: where sql_id = '&SQL_ID'
new 3: where sql_id = '3b7aa6f81x44d'

INSERT /*+ IDX(0) */ INTO "BOOKER"."MLOG$_FULFILLMENT_DEMAND_I"
(dmltype$$,old_new$$,snaptime$$,change_vector$$,xid$$,"WAREHOUSE
_ID","CO_WAREHOUSE_ITEM_ID") VALUES (:d,:o,to_date('4000-01-01:0
0:00:00','YYYY-MM-DD:HH24:MI:SS'),:c,:x,:1,:2)

As you can see this is DML on MLOG table. So clearly MLOG seems to be the bottleneck.

Immediate Action:

Wait events were making every thing stand still and nothing was getting processed. Since database availability was critical and we couldn’t afford to loose any more time because of wait events, our immediate action was to drop MLOG.

But since DB was having thousands of sessions and all stuck in library cache: mutex X, we cannot even get a lock to drop MLOG (not even with ddl_lock_timeout). Killing a session was not helping as they were keep coming back in no time.

So we decided to take quick bounce of DB (by keeping listeners down), dropped this MLOG and started listeners again.

This has fixed the issue and there were no wait events.

On down side, all downstream sites had to do complete refresh of MView followed by fast refresh as they got deregistered from MLOG. But it was OK as size of master table was not very big.

Patch Fix:

We also tried to investigate why an MLOG can cause this issue ? But were not able to get any insight.

MLOG is a table where primary key of changed records in main table will get inserted and when downstream table refreshes the data, these records will get deleted. So we don’t expect much higher size of MLOG. Unfortunately this is case it was 7.7GB (bigger than master table size).

Because the size of MLOG was too high, whenever a downstream database is doing fast refresh it has to update SNAPTIME$$ column in MLOG table which tells MLOG which rows are refreshed by downstream DB at what time. This update might be taking more time and holding enqueue locks. But inserts should not be blocked because of this update as it inserts a new record and doesn’t collied at enqueue level. Also, we were seeing these library cache wait events just after bounce or failover.

This is identified as a bug and Oracle has provided following patch as a fix for this issue

Patch 20879889: INSERT INTO MV LOG LEAVING TOO MANY OPEN CURSORS AFTER UPGR TO 11.2.0.4

Make sure you have this patch applied if you are using MLOG in your database.

Hope this helps !!

Reference:

https://support.oracle.com/epmos/faces/PatchDetail?requestId=19179589&_afrLoop=14055698364071&patchId=20879889&_afrWindowMode=0&_adf.ctrl-state=xqyu7jtft_77

 

 

Mutex: What do we know ?

In my previous article on Latches, I mentioned various things related to latches including different types and their behavior.

In this article I will describe similar things about mutexes.

Mutex are low level serialization / locking mechanism to protect memory structure inside library cache.

Why there was a change from Latches ? 

Mutex was introduced from 10.2 onwards and have been proved very efficient in managing library cache operations.

Mutex takes less memory then latches. Typically latch structure takes around 110 bytes on 32 bit system whereas mutex takes only 30 bytes. This variation is because of number of instructions it takes to acquire latch vs mutex. Latch takes around 150-200 instructions where as mutex takes 30-35 instructions. But the down side is that mutex gives less information about the waits and blockers compared to latches.

As we know the basics of library cache architecture, it consists of hash buckets and each bucket will contain linked list of library cache object handles. Whenever an access to library cache object happens, it is hashed to a bucket and corresponding latch should be obtained to traverse the chain in that bucket. Latch will be released when corresponding object is found or not found.

But there are only 67 library cache latches available to protect 131,072 buckets created by default in library cache. So single latch covers multiple buckets. This creates a false contention, meaning if 2 process are trying to access 2 different buckets protected by same latch, one of them have to wait until other completes traversing its bucket. So even though they are not colliding on same bucket they still will be blocked on each other because latch mechanism.

Further, if required object is found in library cache, process needs to pin that object while it is being used. Pin will basically protect object so that no other process can modify that object or that object will not be discarded from memory while you are accessing. When you are done, you need to take a latch again to unpin the object. These latches to pin and unpin needs memory allocation and deallocation to happen, which is expensive.

Compared to above process, mutex has many benefits

As against 67 latches to protect 131,072 buckets, Oracle introduced 131,072 mutexes to protect each bucket. So false contention is reduced drastically. False contention can still occur if 2 process want to access to 2 different objects which belongs to same bucket but its very rare.

After a mutex is obtained process will traverse the chain until it finds the required object. Once an object is found, a process needs to pin the object to access the same. But in case of mutex, there is no need for “latch: library cache pin”, instead mutex itself will act as a pin. Mutex acts as serialization mechanism to traverse a linked list as well as cursor pin structure. A mutex pin can be referenced by multiple sessions, providing that all sessions reference the mutex in shared mode only. The total number of sessions referencing a mutex in shared (S) mode is called reference count (ref count). The ref count for a mutex is stored in the mutex itself. Whenever a session wants to pin an object in shared pool, it increments this ref count of the mutex associated with that object. This is much faster and no need for any memory allocation or deallocation. Once process is done with that object, it will reduce the ref count. An object is not discarded from memory until ref count is zero.

Why not to increase number of latches ?

a. latches needs more memory and they are allocated upfront. As against Mutexes which are dynamic and are created when requested, latches are created when instance starts and its memory locations are externalized in view v$latch, v$latch_children

b. more latches you have, the more work you may have to do in some form of maintenance, reporting, or garbage collection. This will increase processing demand on server.

Now that we know why Oracle introduced mutex, lets move further in understanding more.

Shared, Exclusive and Examine mode:

Shared mode: Multiple sessions can access the memory structure protected by mutex in shared mode. Meaning every session has read access to structure protected by mutex. Every time a session access or acquire mutex in shared mode, ref count of mutex needs to be updated in mutex structure. Number of sessions accessing a mutex can be seen in lower bytes of P3 value of mutex wait event (cursor: pin S).

Exclusive mode: This mode is incompatible with all modes. Only 1 session can hold the lock in exclusive mode. In this mode, upper bytes of P3 value of mutex wait events represents the SID of holding session.

Examine mode: This mode indicates that mutex is in transition phase from shared mode to exclusive mode. During this time no other session can access or modify the mutex. In this mode, upper bytes of P3 value of mutex wait events represents the SID of holding session and lower bytes represents number of session holding mutex in shared mode.

Mutex Acquisition:

So how mutex acquisition works ?

We have seen in the past article on latches about how latch acquisition has worked. Mutex acquisition is similar to latch acquisition in that it tries for immediate gets and if its not able to acquire, it will spin followed by sleep (but sleep depends on oracle version).

Over past years with different releases of Oracle, mutex acquisition algorithm has changed drastically and I would say it got more stabilized with new releases. In Oracle 10.2, mutex was used for only pinning the object in library cache (only when _kks_use_mutex_pin=true) and there was no sleep.

Below algorithm applies to other mutexes which were introduced in 11g.

Oracle 10.2 – 11.1

Following is the mutex acquisition algorithm in 10.2 – 11.2

- Immediate gets mutex
      - spin gets mutex
            - yield()

Session will try to acquire mutex in immediate mode, if mutex is not available, it will spin. If still mutex is not available it will yield CPU, meaning, that process will be placed at the end of run queue but it will still be “ON CPU”. Because of this, CPU consumption was very high. Also because there was no sleep, wait interface was not able to record actual time waited for mutex acquisition. In AWR report, in top 5 timed events, we could see high number of waits, but total time waited used to be very low and CPU time always used to be close to DB time.

If CPU resources are not available OS will de-schedule such process (which is spinning and yielding CPU) and at that time that actual wait event for cursor: pin S gets recorded. But if a system has lot of CPU (like in many systems), that process will not even get de-scheduled, and Oracle thinks it is not waiting at all.

Therefore, in Oracle 10.2-11.1 the “cursor: pin S” wait time is the pure wait for CPU

For systems which do not have enough spare CPU, “cursor: pin S” used to be top wait events because it used to get de-scheduled from run queue and also whenever it used to spin and yield CPU consumption used to sky rocket. These are the systems where mutex issue was coming into light.

Because of these issue, Oracle released patch 6904068: High CPU usage when there are “cursor: pin S” waits.

This patch was released for 11.1 and later back ported to 10.2. With this patch, Oracle introduced underscore parameter – _first_spare_parameter. Default value of this parameter was 1 (centisec). This parameter provides a sleep time for mutex if its not acquired after first spin. So mutex behavior became similar to latch behavior but wait timeout of 10 ms was too high.

On better side, CPU consumption on system decreased and because of sleep time, this wait event (cursor: pin S) was shown up correctly on top 5 timed event (because total time waited was quiet accurate because of sleep). Parameter _first_spare_parameter was dynamic and if set to 0, mutex acquisition will behave without sleep (aggregated CPU usage).

But will all above discussion, question arises – if cursor: pin S is a shared mutex which can be acquired by multiple sessions concurrently why there would be blocking or wait events ?

Well couple of scenario when we can see this wait event

  1. If a mutex is in transition state “E” as mentioned above
  2. Everytime a session acquires mutex in shared mode, it has to update ref count of that mutex. That can cause contention.

What happened in 11.2 ?

Version 11.2.0.1

Behavior of mutex in 11.2.0.1 is same as 10.2. It was aggressive with no sleep. But it was visible as top event in-spite of no sleep calls.

This happened because Oracle counted wait time as time between first spin and successful mutex acquisition. So all the yield happening after the first spin was considered in waits. So total time waited was shown as high in AWR report and it was shown as one of the top timed events (even without sleep).

Version 11.2.0.2.2 PST

This version had a drastic change in behavior of mutex. Oracle came up with a exhaustive structure for controlling mutex called – mutex waits schemes.

Oracle introduced several underscore parameter to control behavior of mutex spin, yield and sleep.

Following underscore parameters were introduced starting this version

_mutex_wait_scheme – this parameter was introduced with 2 different values for 3 different type of mutex wait configurations.

0 - Always yield
1 - Always sleep for _mutex_wait_time
2 - Exponential back off with max sleep time of _mutex_wait_time

_mutex_spin_count – Number of times the process should spin. Default 255
_mutex_wait_time – amount of time process should sleep. Default 1ms

Along with this parameter, Oracle introduced following parameter to control yield and sleep of mutex

_wait_yield_mode – this defines if the process should yield first or sleep first. possible values are “yield” (default) or “sleep”

_wait_yield_hp_mode – defines high priority processes> default is SMON and VKTM

_wait_yield_sleep_time_msecs – defines how much a process should sleep before it yields again (in millisec). Default 1

_wait_yield_sleep_freq – number of yield cycles to do before it goes to sleep. default 100

_wait_yield_yield_freq – number of sleep cycles to do before it yields. default 20

_mutex_wait_scheme=0

I think in this scheme _mutex_wait_time is not applicable. Becuase sleep time depends on _wait_yield_sleep_freq. But I am not very sure.

With _mutex_wait_scheme=0 and default values for above _wait_yield* parameters above, mutex acquisition will work following way

- Immediate mutex gets
      - spin - _mutex_spin_count times
            - yield() - 100 times
                  - sleep - 20 times (20 cycles of 1 ms each)
            - yield() - 100 times
                  - sleep - 20 times (20 cycles of 1 ms each)

Using above underscore parameter we can vary the behaviour. Example if we change “_wait_yield_mode” to “sleep” instead of “yield”, oracle process will first sleep

- Immediate mutex gets
      - spin 
            - sleep - 20 times (20 cycles of 1 ms each)
      - yield() - 100 times
            - sleep - 20 times (20 cycles of 1 ms each)
      - yield() - 100 times

_mutex_wait_scheme=1

In this scheme _mutex_wait_time comes into picture. This is “always sleep” mode. So process goes to sleep after first spin, but wakes up after timeout of every _mutex_wait_time

So behavior will look like following

- Immediate mutex gets
      - Spin for _mutex_spin_count
            - sleep for _mutex_wait_time
            - sleep for _mutex_wait_time
            - sleep for _mutex_wait_time
            - sleep for _mutex_wait_time
            ...
            ...

_mutex_wait_scheme=2

In this scheme _mutex_wait_time defines the max time a process should wait. This is “exponential backoff” scheme. Wait time increases exponentially until it reaches a max value set by _mutex_wait_time.

Other than this behavior differs in initial spin and yield cycle. It spins and yield 2 times initially before the sleep begins

So behavior will look like following

- Immediate mutex gets
      - Spin for _mutex_spin_count - 2 times
            - yield() - 2 times
                  - sleep - with exponential increase of timeouts

Starting from 11.2.0.2.2, _mutex_wait_scheme=2 is the default behavior

Hope this helps !!

References:

http://blog.tanelpoder.com/2008/08/03/library-cache-latches-gone-in-oracle-11g/

http://oracleinaction.com/latche-lock-pin-mutex/

https://andreynikolaev.wordpress.com