Latches: What do we know?

Latches are low-level serialization mechanisms that protect memory structures inside the SGA. They are lightweight and less sophisticated than enqueues, and can be acquired and released very quickly.

Latch acquisition does not involve any complex algorithm; it is based on the atomic test-and-set instruction of the processor.

Latch Classification:

Latches can be classified in multiple ways:

Shared latch and exclusive latch:

A shared latch is one that can be held by multiple processes/sessions at the same time.

For example, if a session wants to read a block in memory and at the same time another session also wants to read the same block, both can acquire the shared latch. An example of a shared latch is “latch: cache buffers chains”. This latch was exclusive in older versions; Oracle changed it to a shared latch from Oracle 9i onwards.

An exclusive latch is acquired when a session wants to modify a block or memory area. An exclusive latch can be held by only one session/process at a time and is not compatible with any other latch.

Shared and exclusive latches are not compatible with each other. A process holding a shared latch will block another process that needs the exclusive latch, but will allow other processes that need a shared latch on the same memory area. Similarly, a process holding an exclusive latch will block any other process that needs either a shared or an exclusive latch.

An example of exclusive latches is the “library cache” latches in previous versions. These were taken in exclusive mode even for traversing a hash bucket. They are no longer present in 11g, having been replaced by mutexes.

As per Andrey Nikolaev, 460 out of 551 latches are exclusive in Oracle 11.2.0.2. With each new version, Oracle tries to make latches more shareable in order to reduce contention.

Another classification of latches is immediate latches and willing-to-wait latches.

Immediate latches and willing-to-wait latches:

Immediate latches are the ones a session tries to acquire without waiting. If the latch is not available, the session will not wait; it may get terminated or check for another latch. An example of an immediate latch is the redo copy latch. It is an immediate latch because Oracle only wants to know whether anyone else is currently copying redo data to the log buffer, not who exactly is copying and where, as this does not matter to LGWR.

Willing-to-wait latches are the ones that will wait if the latch is not available. These latches have slightly more complex behavior than immediate latches. Most of the latches in Oracle are willing-to-wait latches.
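The two modes are visible in latch statistics. As a minimal illustration (using the standard v$latch columns), immediate-mode attempts are counted in IMMEDIATE_GETS/IMMEDIATE_MISSES, while willing-to-wait attempts are counted in GETS/MISSES/SLEEPS:

SQL> select name, gets, misses, sleeps, immediate_gets, immediate_misses
     from v$latch
     where name = 'redo copy';

A latch like redo copy should show most of its activity under the immediate columns.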

Why do we see wait events on shared latches?

After reading about shared latches and exclusive latches, one question comes to mind regarding shared latches.

If the “latch: cache buffers chains” latch is a shared latch (from 9i onwards) and shared latches don’t block one another, why do we see “latch: cache buffers chains” wait events even in the latest versions of Oracle?

The answer is explained by Andrey Nikolaev, who gives a very practical example of why this happens. When multiple processes are trying to acquire a shared latch (for example, a CBC latch), we should not see any wait events or processes blocking each other. But the moment another session tries to acquire the latch in exclusive mode (on the resource where other sessions hold the shared latch), the exclusive request is given higher preference than the shared requests. The session requesting the exclusive latch gets access to the resource and blocks all other sessions wanting the shared latch, which form a queue. The strange part of this algorithm is that even after the exclusive latch has been released by the process holding it, the other processes cannot acquire the shared latch concurrently. They remain in the latch queue and get the shared latch one by one: the first process in the queue acquires it, and once it is done with the access, it releases the latch and posts the next process that the latch is available. This is one of the major reasons why we see wait events on shared latches.
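To see whether such contention is concentrated on a few hash chains, we can look at the per-child statistics in v$latch_children. A minimal sketch (the rownum wrapper just limits the output to the ten worst children):

SQL> select * from (
         select child#, gets, misses, sleeps
         from v$latch_children
         where name = 'cache buffers chains'
         order by sleeps desc
     )
     where rownum <= 10;

A few children with disproportionate sleeps usually point to hot blocks hashing to those chains.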

Process flow for latch acquisition

Following is the process flow for acquiring a latch:

 - Immediate latch get
       - Spin latch get
             - Add process to the queue of latch waiters
                   - Sleep until posted

The last step in the process flow (sleep until posted) is changed behavior. Until 9i, Oracle used to wake up periodically to check whether the latch had been freed and, if the latch was still not available, go back to sleep. So the process flow in 9i used to be the following:

 - Immediate latch get
       - Spin latch get
             - Add process to the queue of latch waiters
                   - Sleep for a fixed time and wake up
             - Immediate latch get
                   - Sleep for a fixed time and wake up
             - Immediate latch get
                   - Sleep for a fixed time and wake up

This behavior changed from 10g onwards: the holding process wakes up the waiter process after the latch becomes free. This has been explained well by Tanel Poder, and the behavior is described in more detail by Alex Fatkulin.

Sometimes the holding process was not able to wake up the process waiting for the latch, because of bugs or a lower kernel version. So older versions of Oracle used to add a default timeout, so that in case the holding process missed the wakeup call, the waiter would not wait indefinitely and would wake up after the timeout. Oracle introduced a parameter, _enable_reliable_latch_waits={true|false}, which alters this behavior. If it is set to true, no timeout is added and the waiting process continues to sleep until it is posted by the holding process. False gives the timeout behavior.
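As with any underscore parameter, the current value can be checked from the x$ tables (connected as SYS). A minimal sketch using the usual x$ksppi/x$ksppcv join:

SQL> select p.ksppinm parameter, v.ksppstvl value
     from x$ksppi p, x$ksppcv v
     where p.indx = v.indx
     and p.ksppinm = '_enable_reliable_latch_waits';

As always, underscore parameters should only be changed under guidance from Oracle Support.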

The only latch that is an exception to the above process flow is the “process allocation” latch. This latch does not depend on the holding process to post it when the latch is free; it wakes up periodically to try to acquire the latch.

Latch Behavior

Oracle has introduced a few parameters to control latch behavior. We will briefly discuss those parameters and how they affect shared and exclusive latches.

When a process tries to acquire a latch, it first attempts to acquire it in immediate mode. If the latch is not available in immediate mode, the process will spin a defined number of times and try again. If the latch is still not free, the process will go to sleep.

Process Spin:

Spinning is the act of consuming/burning CPU so that the process stays “ON CPU” and does not get descheduled. While spinning, the process burns CPU for a few microseconds in the hope that by the time it is done, the latch will be available for it to acquire. This increases CPU consumption, but it saves time, as it may avoid the need for the process to sleep (which is expensive because it involves context switches).

The number of times a process spins to acquire a latch depends on the type of latch.

Exclusive latch:

For exclusive latches, this “can be” controlled by _spin_count, but changing it needs a database bounce. I said “can be” because exclusive latch spins are actually decided by the “spin” column in the x$ksllclass table.


SQL> select indx, spin from x$ksllclass;

      INDX       SPIN
---------- ----------
         0      20000
         1      20000
         2      20000
         3      20000
         4      20000
         5      20000
         6      20000
         7      20000

8 rows selected.

There are 8 classes of latches (as indicated by the indx column), which we will discuss later in this article. By default, all latches belong to class 0.
If we want to change the spin count for exclusive latches, we need to change the value of the SPIN column for class 0. This can be done by changing _spin_count and bouncing the instance (but that will change the spin count for all classes), or by setting the _latch_class_0 parameter (which changes the spin count for class 0 only). We have similar parameters to change the spin count for the other classes (_latch_class_[0-7]).

So changing _spin_count is not a good idea. Instead, we can move the specific latch whose spin count we want to change to another class (1-7) and change the corresponding underscore parameter (_latch_class_[1-7]), as sketched below.
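A hedged sketch of how this can be done, based on Andrey Nikolaev's work. The latch# below (150) is purely illustrative; look up the real one in v$latchname. Both parameters are undocumented, require an spfile change plus an instance bounce, and should only be touched on a test system or under Oracle Support guidance:

SQL> select latch#, name from v$latchname
     where name = 'cache buffers chains';

-- assume the query returned latch# 150; map it to class 1
SQL> alter system set "_latch_classes" = '150:1' scope=spfile;

-- give class 1 its own spin count
SQL> alter system set "_latch_class_1" = 10000 scope=spfile;

-- bounce the instance, then verify with:
SQL> select indx, spin from x$ksllclass;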

By default, the _spin_count parameter is not applicable to exclusive latches: its default value is 2000, whereas exclusive latches spin 20000 times, as shown in the x$ksllclass output above.
Changing _spin_count explicitly, however, makes it applicable to exclusive latches as well.

Shared Latch:

For shared latches, the number of times a process spins is _spin_count * 2, as demonstrated by Andrey Nikolaev. Also, the _spin_count parameter applies to shared latches by default. So, since the default value of _spin_count is 2000, shared latches spin 4000 times (2000 * 2).
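Whether spinning is effective can be judged from v$latch: misses that were resolved while spinning are counted in SPIN_GETS, while misses that ended in a sleep show up in SLEEPS. A minimal sketch:

SQL> select name, gets, misses, spin_gets, sleeps
     from v$latch
     where misses > 0
     order by sleeps desc;

If sleeps dominate spin_gets for a hot latch, processes are regularly exhausting their spin and paying for context switches.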

Diagnosing latch contention:

I am not going to say much here, because Tanel Poder has already written great scripts – latchprof.sql and latchprofx.sql – which can be used to analyze latch wait events.

Tanel has also written a great article on how to diagnose latch wait events – http://tech.e2sn.com/oracle/troubleshooting/latch-contention-troubleshooting

In the next article, I will try to cover mutexes.

Hope this helps !!

Reference:

https://andreynikolaev.wordpress.com

http://tech.e2sn.com/oracle/troubleshooting/latch-contention-troubleshooting

A Brief Overview of Workload Management in Oracle RAC

This is a brief article about workload management in RAC. I have tried to cover the different components of workload management in RAC and how they are configured on the client side or the server side. I haven’t gone into the details of the configuration steps, but have mentioned briefly how each can be done.

Readers are advised to refer to the Oracle documentation for the details of configuring workload management.

Workload management on RAC

There are 2 major components of workload management:

  1. Failover – if the connection to one instance fails, Oracle should fail the connection over to another instance
  2. Load Balancing – the workload should be distributed equally across the RAC instances

Failover can happen at connect time or at run time. Similarly, load balancing can be achieved at connect time or at run time.

We can also configure these components either on client side or on server side.

I have tried to put workload management in RAC into a single block diagram to give a high-level overview. The following figure summarizes how workload management is configured in RAC.

[Figure: workload management in RAC – failover and load balancing, client side and server side]

Let’s check how to configure each of these components.

Failover

We can achieve failover at connect time or at run time. Depending on when we want failover to happen (connect time or run time), we can configure it on the client side or on the server side.

Connect time failover, Client side

Connect time failover has to be configured on the client side, because connect time failover deals with an instance being down before the session can even get connected. So logically it is not possible to have it on the server side (that would require the connection to complete, in which case it would not be connect time failover).

This is achieved on the client side using the FAILOVER=ON parameter in the TNS string.
Example:

ORCL=
     (DESCRIPTION=
         (ADDRESS_LIST=
         (FAILOVER=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node1-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node2-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node3-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node4-vip) (PORT=1521))
     )
    (CONNECT_DATA= (SERVICE_NAME= ORCL))
)


Run time failover, Client side

At run time, we can achieve failover using Transparent Application Failover (TAF).
TAF can be configured on the client side in the TNS string using the FAILOVER_MODE parameter.
Example:

ORCL=
     (DESCRIPTION=
         (ADDRESS_LIST=
         (FAILOVER=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node1-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node2-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node3-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node4-vip) (PORT=1521))
     )
     (CONNECT_DATA= (SERVICE_NAME=ORCL)
         (FAILOVER_MODE= (TYPE=select) (METHOD=basic))
     )
)


If you check the above TNS string, we have the FAILOVER_MODE parameter, which specifies the failover type and method. If FAILOVER_MODE is specified, then in case of an instance outage, existing connected sessions will automatically fail over at run time to the surviving instances. There is more to TAF than described here; you can check the Oracle documentation or the reference links in this article for complete details.
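Whether TAF is in effect for connected sessions, and whether a session has already failed over, can be checked from v$session. A minimal sketch:

SQL> select username, failover_type, failover_method, failed_over
     from v$session
     where username is not null;

FAILED_OVER changes to YES for sessions that have been relocated by TAF.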

Run time failover, Server side

The same TAF configuration can be done on the server side as well. This is done as part of service management in RAC.
We can use SRVCTL to configure services on RAC and add the TAF parameters.

[oracle@orcl_node1 ~]$ srvctl add service -d orcl -s test -r orcl1 -P BASIC -e SESSION -m BASIC
[oracle@orcl_node1 ~]$ srvctl start service -d orcl -s test
[oracle@orcl_node1 ~]$ srvctl status service -d orcl -s test
Service test is running on instance(s) orcl1
[oracle@orcl_node1 ~]$ srvctl config service -d orcl -s test
Service name: test
Service is enabled
Server pool: orcl_test
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: 
Runtime Load Balancing Goal:
TAF policy specification: BASIC
Preferred instances: orcl1
Available instances:
[oracle@orcl_node1 ~]$

We can also use Fast Connection Failover (FCF), which relies on Fast Application Notification (FAN) events, in OCI clients to get notifications about instance availability. Based on these notifications, clients can reconnect to the available instances.
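Which services have FAN/AQ HA notifications enabled (along with their load balancing goals, discussed below) can be checked from the data dictionary. A sketch, assuming the dba_services columns available in 10g/11g:

SQL> select name, aq_ha_notification, goal, clb_goal
     from dba_services;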

Load Balancing

We can achieve load balancing at connect time or at run time. Depending on when we want load balancing to happen (connect time or run time), we can configure it on the client side or on the server side.

Connect time load balancing, Client side

We can achieve connect time load balancing on the client side using the LOAD_BALANCE=ON parameter in the TNS string.

Example:

ORCL=
     (DESCRIPTION=
         (ADDRESS_LIST=
         (LOAD_BALANCE=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node1-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node2-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node3-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node4-vip) (PORT=1521))
     )
    (CONNECT_DATA= (SERVICE_NAME=ORCL))
)

The LOAD_BALANCE parameter is set to ON by default, so we do not have to specify it explicitly; putting LOAD_BALANCE=OFF, however, disables load balancing. Oracle picks a random host from the address list and thereby tries to balance connections across the database instances. With 11.2, Oracle introduced the SCAN listener, which hands out host IPs in round-robin fashion. So with a single SCAN alias in the TNS entry, connections are balanced across hosts automatically, which effectively deprecates the LOAD_BALANCE parameter in TNS. Example:

ORCL=
     (DESCRIPTION=
         (LOAD_BALANCE=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_scan) (PORT=1521))
         (CONNECT_DATA= (SERVICE_NAME=ORCL))
     )


Connect time load balancing, Server side

We can enable server side load balancing using the CLB_GOAL service attribute.
Oracle introduced the load balancing advisory in 10g, which keeps track of the load on individual instances. Dynamic service registration keeps all listeners aware of the load profile of each instance. For this, we need to set remote_listener to a TNS alias containing the virtual IP addresses of all nodes in the cluster. Even with a SCAN listener, we need to keep the SCAN VIP in the remote_listener parameter, as in the example below.

CLB_GOAL stands for connect time load balancing goal. It defines the expected session duration for the service. For example, if we have an OLTP service and expect lots of short sessions that last a very short time (a few seconds to a few minutes), we can set CLB_GOAL for that service to SHORT. If the service is expected to serve sessions that stay connected for a longer duration (minutes to hours), we can set CLB_GOAL to LONG. Setting CLB_GOAL instructs the listener to route connections based on one of two metrics: load per node (based on the CPU run queue) for CLB_GOAL=SHORT, or the number of current connections for CLB_GOAL=LONG.

  • If CLB_GOAL is SHORT, Oracle uses load per node (based on the CPU run queue) as the metric and routes new connections to the host with the lowest load.
  • If CLB_GOAL is LONG, Oracle uses the number of connections to each instance as the metric and routes new connections to the host with the fewest connections.

Example:

[oracle@orcl_node1 ~]$ srvctl modify service -d orcl -s test -j SHORT
[oracle@orcl_node1 ~]$ srvctl config service -d orcl -s test
Service name: test
Service is enabled
Server pool: orcl_test
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Preferred instances: orcl1
Available instances:
[oracle@orcl_node1 ~]$

Run time load balancing, Server side

We cannot have client side load balancing at run time. This is because run time load balancing needs to balance each unit of work (transaction), as opposed to each connection.
The load balancing advisory serves as the basis for runtime connection load balancing. Using dynamic service registration, services are registered with all listeners, and PMON of each instance updates its load profile on all listeners. Since the listeners know the load profile of all instances, sessions are directed to the most appropriate instance depending on the runtime load balancing goal. Connection allocation is based on the current performance level of the database instances, as indicated by load balancing advisory FAN events. This provides load balancing at the transaction level instead of only at the time of the initial connection.
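The advisory's view of each service can be observed in gv$servicemetric. A hedged sketch (GOODNESS and DELTA are the columns the advisory publishes; a lower goodness means the instance currently looks more attractive for new work, and the 'test' service name is from the earlier examples):

SQL> select inst_id, service_name, goodness, delta
     from gv$servicemetric
     where service_name = 'test'
     order by inst_id;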

The service level of the instances is analyzed based on the runtime load balancing goal:

  • Service time (internet web processing)
  • Throughput (batch processing)

The above runtime load balancing goals can be set using the GOAL parameter of the service (don’t confuse this with the CLB_GOAL parameter, which is for connect time load balancing).

We can set the GOAL parameter on the server side for each service using SRVCTL.

Once we set this parameter to either GOAL_SERVICE_TIME or GOAL_THROUGHPUT, Oracle will balance the load using the following metrics:

  • If GOAL_SERVICE_TIME is used, Oracle checks the service time, i.e., how fast an instance is serving a single transaction. Oracle maintains this metric for each instance, and connections for the service are directed to the instance where service time is best. This is mainly for OLTP workloads.
  • If GOAL_THROUGHPUT is used, Oracle checks the throughput metric, i.e., which instance is doing the most work in the least time, and directs work to the instance with the best throughput. This is mainly for batch processing.

Example:

[oracle@orcl_node1 ~]$ srvctl modify service -d orcl -s test -B SERVICE_TIME
[oracle@orcl_node1 ~]$ srvctl config service -d orcl -s test
Service name: test
Service is enabled
Server pool: orcl_test
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: SERVICE_TIME
TAF policy specification: BASIC
Preferred instances: orcl1
Available instances:

Reference:

http://www.oracle.com/technetwork/database/features/oci/taf-10-133239.pdf

http://docs.oracle.com/cd/B19306_01/rac.102/b14197/hafeats.htm