A Brief Overview of Workload Management in Oracle RAC

This is a brief article about workload management in RAC. It covers the different components of workload management in RAC and how they are configured on the client side or the server side. I haven't gone into detailed configuration steps, but I briefly describe how each piece can be set up.

Readers are advised to refer to the Oracle documentation for the details of configuring workload management.

Workload management on RAC

There are 2 major components of workload management:

  1. Failover – if the connection to one instance fails, Oracle should fail over the connection to another instance
  2. Load Balancing – the workload should be distributed evenly across the RAC instances

Failover can happen at connect time or at run time. Similarly, load balancing can be achieved at connect time or at run time.

We can also configure these components either on client side or on server side.

I tried to put workload management in RAC into a single block diagram to give a high-level overview. The following figure summarizes how workload management is configured in RAC.

[Figure: workload management in RAC – block diagram overview]

Let's check how to configure each of these components.

Failover

We can achieve failover at connect time or at run time. Depending on when we want the failover to happen, we can configure it on the client side or on the server side.

Connect time failover, Client side

Connect time failover is configured on the client side, because this failover happens when an instance is down before the connection is even established. Logically it is not possible to have it on the server side (that would require the connection to complete, and then it would no longer be connect time failover).

This is achieved on the client side using the FAILOVER=ON parameter in the TNS connect string.
Example:

ORCL=
     (DESCRIPTION=
         (ADDRESS_LIST=
         (FAILOVER=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node1-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node2-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node3-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node4-vip) (PORT=1521))
     )
    (CONNECT_DATA= (SERVICE_NAME= ORCL))
)

Run time failover, Client side

At run time, we can achieve failover using Transparent Application Failover (TAF).
TAF is configured on the client side in the TNS string using the FAILOVER_MODE parameter.
Example:

ORCL=
     (DESCRIPTION=
         (ADDRESS_LIST=
         (FAILOVER=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node1-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node2-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node3-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node4-vip) (PORT=1521))
     )
         (CONNECT_DATA=
             (SERVICE_NAME=ORCL)
             (FAILOVER_MODE=(TYPE=select)(METHOD=basic))
         )
     )

In the TNS string above, the FAILOVER_MODE parameter specifies the failover type and method. When FAILOVER_MODE is specified and an instance goes down, existing connected sessions automatically fail over at run time to the surviving instances. TAF has more options than are shown here; check the Oracle documentation or the reference links in this article for complete details about TAF.

Run time failover, Server side

The same TAF setup can be done on the server side as well, as part of service management in RAC.
We can use SRVCTL to create services on RAC and add TAF parameters to them.

[oracle@orcl_node1 ~]$ srvctl add service -d orcl -s test -r orcl1 -P BASIC -e SESSION -m BASIC
[oracle@orcl_node1 ~]$ srvctl start service -d orcl -s test
[oracle@orcl_node1 ~]$ srvctl status service -d orcl -s test
Service test is running on instance(s) orcl1
[oracle@orcl_node1 ~]$ srvctl config service -d orcl -s test
Service name: test
Service is enabled
Server pool: orcl_test
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: 
Runtime Load Balancing Goal:
TAF policy specification: BASIC
Preferred instances: orcl1
Available instances:
[oracle@orcl_node1 ~]$

We can also use Fast Connection Failover (FCF), which relies on Fast Application Notification (FAN) events, so that OCI clients are notified about instance availability. Based on these notifications, clients can reconnect to the available instances.
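
As a rough illustration (using the test service created above; the -q option of srvctl, as in 11g, controls the AQ HA notifications attribute shown in the service configuration), FAN notifications can be enabled for the service so that FCF-capable clients receive them:

[oracle@orcl_node1 ~]$ srvctl modify service -d orcl -s test -q TRUE
[oracle@orcl_node1 ~]$ srvctl config service -d orcl -s test

The service configuration should then report "AQ HA notifications" as true.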

Load Balancing

We can achieve load balancing at connect time or at run time. Depending on when we want it to happen, we can configure load balancing on the client side or on the server side.

Connect time load balancing, Client side

We can achieve connect time load balancing on the client side using the LOAD_BALANCE=ON parameter in the TNS string.

Example:

ORCL=
     (DESCRIPTION=
         (ADDRESS_LIST=
         (LOAD_BALANCE=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node1-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node2-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node3-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node4-vip) (PORT=1521))
     )
    (CONNECT_DATA= (SERVICE_NAME= ORCL))
)

The LOAD_BALANCE parameter is ON by default, so we do not have to specify it explicitly; setting LOAD_BALANCE=OFF disables load balancing. Oracle picks hosts at random from the address list and thereby spreads connections across the database instances. With 11.2, Oracle introduced the SCAN listener, which hands out host IPs in a round-robin fashion, so with a single SCAN alias in tnsnames.ora connections are balanced across hosts automatically, making the LOAD_BALANCE parameter in the TNS string largely redundant. Example:

ORCL=
     (DESCRIPTION=
         (LOAD_BALANCE=ON)
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_scan) (PORT=1521))
         (CONNECT_DATA= (SERVICE_NAME= ORCL))
     )

Connect time load balancing, Server side

We can enable server side load balancing using the CLB_GOAL service attribute.
Oracle introduced the load balancing advisory in 10g, which keeps track of the load on individual instances. Dynamic service registration keeps all listeners aware of the load profile of each instance; for this, we need to set remote_listener to a TNS alias containing the virtual IP addresses of all nodes in the cluster. Even with a SCAN listener, the SCAN address must be kept in the remote_listener parameter.
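
A minimal sketch of the remote_listener setup, assuming a tnsnames.ora alias named LISTENERS_ORCL (an illustrative name) and the same VIP host names used in the earlier examples:

LISTENERS_ORCL=
     (ADDRESS_LIST=
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node1-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node2-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node3-vip) (PORT=1521))
         (ADDRESS= (PROTOCOL=TCP) (HOST=orcl_node4-vip) (PORT=1521))
     )

SQL> alter system set remote_listener='LISTENERS_ORCL' scope=both sid='*';

With 11.2 and SCAN, the EZConnect form can be used instead, for example remote_listener='orcl_scan:1521'.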

CLB_GOAL stands for connect time load balancing goal and defines the expected session duration for the service. For example, if we have an OLTP service with lots of short sessions (lasting a few seconds to a few minutes), we can set CLB_GOAL for that service to SHORT. If the service is expected to serve sessions that stay connected for longer (minutes to hours), we can set CLB_GOAL to LONG. Setting CLB_GOAL instructs the listener to route connections based on a metric: the load per node based on the CPU run queue (CLB_GOAL=SHORT) or the number of current connections (CLB_GOAL=LONG).

  • If CLB_GOAL is SHORT, Oracle uses the load per node (based on the CPU run queue) as the metric and routes the connection to the host with the lowest load.
  • If CLB_GOAL is LONG, Oracle uses the number of connections per instance as the metric and routes the connection to the host with the fewest connections.

Example:

[oracle@orcl_node1 ~]$ srvctl modify service -d orcl -s test -j SHORT
[oracle@orcl_node1 ~]$ srvctl config service -d orcl -s test
Service name: test
Service is enabled
Server pool: orcl_test
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Preferred instances: orcl1
Available instances:
[oracle@orcl_node1 ~]$

Run time load balancing, Server side

We cannot have client side load balancing at run time, because run time load balancing balances each unit of work rather than each connection.
The load balancing advisory is the basis for runtime connection load balancing. With dynamic service registration, services are registered with all listeners, and the PMON process of each instance keeps every listener updated with its load profile. Because the listeners know the load profile of every instance, sessions are directed to the most appropriate instance depending on the runtime load balancing goal. Connection allocation is based on the current performance level reported by the database instances through load balancing advisory FAN events. This provides load balancing at the transaction level instead of only at the time of the initial connection.

The service level of each instance is analyzed based on the runtime load balancing goal:

  • Service time (internet web processing)
  • Throughput (batch processing)

These runtime load balancing goals are set using the GOAL attribute of the service (do not confuse it with the CLB_GOAL attribute, which is for connect time load balancing).

We can set the GOAL attribute on the server side for each service using SRVCTL.

Once we set this attribute to either GOAL_SERVICE_TIME or GOAL_THROUGHPUT, Oracle balances the load using the following metrics:

  • If GOAL_SERVICE_TIME is used, Oracle checks the service time, i.e. how fast an instance is serving a single transaction. Oracle maintains this metric for each instance, and connections for the service are directed to the instance with the best service time. This is mainly for OLTP workloads.
  • If GOAL_THROUGHPUT is used, Oracle checks the throughput metric, i.e. which instance is completing the most work in the least time, and directs work to the instance with the best throughput. This is mainly for batch processing.

Example:

[oracle@orcl_node1 ~]$ srvctl modify service -d orcl -s test -B SERVICE_TIME
[oracle@orcl_node1 ~]$ srvctl config service -d orcl -s test
Service name: test
Service is enabled
Server pool: orcl_test
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: SERVICE_TIME
TAF policy specification: BASIC
Preferred instances: orcl1
Available instances:

Reference:

http://www.oracle.com/technetwork/database/features/oci/taf-10-133239.pdf

http://docs.oracle.com/cd/B19306_01/rac.102/b14197/hafeats.htm

Oracle Real Application Cluster 10g

Introduction

Oracle Real Application Clusters (RAC) is a major advance in database management and an extension of the Oracle single instance database. RAC is basically a cluster of instances working on the same database. An instance is nothing but memory structures and some background processes, so in RAC we have multiple such instances installed and configured on different nodes, all accessing a single database (one set of datafiles). This post explains the technical details of the RAC architecture and also discusses the installation of RAC.

What is Oracle Real Application Cluster 10g?

Software Architecture

A RAC database is a clustered database. A cluster is a group of independent servers that cooperate as a single system. In the event of a system failure, clustering ensures high availability to the users, so access to mission critical data is not lost. Redundant hardware components, such as additional nodes, interconnects and disks, allow the cluster to avoid a single point of failure and provide high availability for the system.

[Figure: Oracle RAC architecture]

The figure above shows the RAC architecture. In RAC, each instance runs on a separate server, and all servers access a database made up of multiple disks. For RAC to act as a single database, each separate instance must be part of the cluster. To external users, all the instances (nodes) that are part of the cluster look like a single database.

For each instance to be part of the cluster, cluster software must be installed and every instance must register with it. From Oracle Database 10g onwards, Oracle provides its own clusterware, software that is installed on the nodes forming the cluster. The advantage of Oracle Clusterware is that the customer does not have to purchase any third party clusterware; it is also integrated with OUI for easy installation. When a node in an Oracle cluster is started, all instances, listeners and services are started automatically. If an instance fails, the clusterware automatically restarts it, so service is often restored before the administrator even notices it was down.

Network Architecture

Each RAC node should have at least one static IP address for the public network (used by the application) and one static IP address for the private cluster interconnect. In addition, each node has one virtual IP address (VIP).

The private network is a critical component of a RAC cluster. It should be used only by Oracle, to carry Cluster Manager and Cache Fusion inter-node traffic. A RAC database does not strictly require a separate private network, but using the public network can degrade database performance (high latency, low bandwidth). Therefore the private network should use high-speed NICs (preferably one gigabit or more) and should be dedicated to Oracle.

Virtual IPs are required for failover, which is exposed to clients as TAF (Transparent Application Failover). TAF is controlled by processes external to the Oracle 10g RAC cluster, which means the failover type and method can be unique for each Oracle Net client. For failover to work, client connections should be made using the VIPs.

Hardware Architecture

Both Oracle Clusterware and Oracle RAC require access to disks that are shared by each node in the cluster. The shared disks must be configured using OCFS (release 1 or 2), raw devices or a third party cluster file system such as GPFS or Veritas.

OCFS2 is a general-purpose cluster file system that can be used to store Oracle Clusterware files, Oracle RAC database files, Oracle software, or any other types of files normally stored on a standard filesystem such as ext3. This is a significant change from OCFS Release 1, which only supported Oracle Clusterware files and Oracle RAC database files. Note that ASM cannot be used to store the Oracle Clusterware files, since the clusterware is installed before ASM and has to be started before ASM can start.

OCFS2 is available free of charge from Oracle as a set of three RPMs: a kernel module, support tools, and a console. There are different kernel module RPMs for each supported Linux kernel.

Installing RAC 10g

Installing RAC is a 5-step process, as given below.

1) Complete Pre-Installation Tasks
Hardware Requirements
Software Requirements
Environment Configuration, Kernel Parameters and so on
2) Perform CRS Installation
3) Perform Oracle Database 10g Installation
4) Perform Cluster Database Creation
5) Complete Post-Installation Tasks

Pre-Installation Tasks

Check System Requirements

– At least 512MB of RAM
Run the below command to check:
# grep MemTotal /proc/meminfo
– At least 1GB of swap space
Run the below command to check:
# grep SwapTotal /proc/meminfo
– The /tmp directory should have at least 400MB of free space
Run the below command to check:
# df -h /tmp

Check Software Requirements
– Package Requirements

For installing RAC, the packages required on Red Hat Enterprise Linux 3 are:
gcc-3.2.3-2
compat-db-4.0.14.5
compat-gcc-7.3-2.96.122
compat-gcc-c++-7.3-2.96.122
compat-libstdc++-7.3-2.96.122
compat-libstdc++-devel-7.3-2.96.122
openmotif21-2.1.30-8
setarch-1.3-1

You can verify whether these (or higher) versions of the packages are present using the following command:
# rpm -q <package_name>

– Create Groups and Users

You can create the UNIX user groups and user IDs using the groupadd and useradd commands. We need one oracle user and two groups – "oinstall" as the primary group and "dba" as the secondary group.

# groupadd -g 500 oinstall
# groupadd -g 501 dba
# useradd -u 500 -d /home/oracle -g "oinstall" -G "dba" -m -s /bin/bash oracle

Configure Kernel Parameters

– Make sure that the following parameters are set in /etc/sysctl.conf

kernel.shmall = 2097152
kernel.shmmax = 536870912
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 658576
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 1048536
net.core.wmem_default = 262144
net.core.wmem_max = 1048536

To load the new settings, run /sbin/sysctl -p

These are the minimum required values; you can set higher values if your server configuration allows.

Setting the system environment

– Set the user Shell limits

cat >> /etc/security/limits.conf << EOF
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
EOF

cat >> /etc/pam.d/login << EOF
session required /lib/security/pam_limits.so
EOF

cat >> /etc/profile << EOF
if [ \$USER = "oracle" ]; then
if [ \$SHELL = "/bin/ksh" ]; then
ulimit -p 16384
ulimit -n 65536
else
ulimit -u 16384 -n 65536
fi
umask 022
fi
EOF

cat >> /etc/csh.login << EOF
if ( \$USER == "oracle" ) then
limit maxproc 16384
limit descriptors 65536
umask 022
endif
EOF

– Configure the Hangcheck Timer

The hangcheck-timer module monitors the Linux kernel for extended operating system hangs that could affect the reliability of a RAC node and cause database corruption. If a hang occurs, the module reboots the node.

You can check whether the hangcheck-timer module is loaded by running the lsmod command as the root user.

/sbin/lsmod | grep -i hang

If the module is not loaded, you can load it manually using the commands below.

modprobe hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
cat >> /etc/rc.d/rc.local << EOF
modprobe hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
EOF

– Configuring /etc/hosts

/etc/hosts contains the hostnames and IP addresses of the servers.

You will need three hostnames for each node in the cluster: a public hostname for the primary interface, a private hostname for the cluster interconnect, and a virtual hostname (VIP) for high availability.

For Node 1

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
140.87.222.169 ocvmrh2045.us.oracle.com ocvmrh2045 #node1 public
140.87.241.194 ocvmrh2045-nfs.us.oracle.com ocvmrh2045-nfs ocvmrh2045-a #node1 nfs
152.68.143.111 ocvmrh2045-priv.us.oracle.com ocvmrh2045-priv #node1 private
152.68.143.112 ocvmrh2053-priv.us.oracle.com ocvmrh2053-priv #node2 private
140.87.222.220 ocvmrh2051.us.oracle.com ocvmrh2051 # Node1 vip
140.87.222.225 ocvmrh2056.us.oracle.com ocvmrh2056 # Node2 vip

For Node 2

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
140.87.222.222 ocvmrh2053.us.oracle.com ocvmrh2053 # Node2 Public
140.87.241.234 ocvmrh2053-nfs.us.oracle.com ocvmrh2053-nfs ocvmrh2053-a # Node2 nfs
152.68.143.111 ocvmrh2045-priv.us.oracle.com ocvmrh2045-priv # Node1 Private
152.68.143.112 ocvmrh2053-priv.us.oracle.com ocvmrh2053-priv # Node2 Private
140.87.222.220 ocvmrh2051.us.oracle.com ocvmrh2051 # Node1 vip
140.87.222.225 ocvmrh2056.us.oracle.com ocvmrh2056 # Node2 vip

– Creating database directories

The following directories must be created with write permission for the oracle user (a sample set of commands follows the list below).

Oracle Base Directories
Oracle Inventory Directory
CRS Home Directory
Oracle Home Directory

In our case the directories are:

Oracle Base Directories – /u01/app/
Oracle Inventory Directory – /u01/app/oraInventory
CRS Home Directory – /u01/app/oracle/product/10.2.0/crs
Oracle Home Directory – /u01/app/oracle/product/10.2.0/db
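
For example, the directories can be created as root on each node along these lines (a minimal sketch using the paths listed above; adjust ownership and permissions to your own standards):

# mkdir -p /u01/app/oraInventory
# mkdir -p /u01/app/oracle/product/10.2.0/crs
# mkdir -p /u01/app/oracle/product/10.2.0/db
# chown -R oracle:oinstall /u01/app
# chmod -R 775 /u01/app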

Configure SSH for User Equivalence

The OUI detects whether the machine on which you are installing RAC is part of a cluster. If it is, you select the other cluster nodes on which the software should also be installed. When OUI pushes the installation to the other nodes from the first node, it would normally ask for login credentials and prompt for a password in the middle of the installation, which we want to avoid. For this we need user equivalence in place, and user equivalence is achieved using SSH. First, configure SSH.

Log on as the "oracle" UNIX user account:

# su - oracle

If necessary, create the .ssh directory in the "oracle" user's home directory and set the correct permissions on it:

$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh

Enter the following command to generate an RSA key pair (public and private key) for version 2 of the SSH protocol:

$ /usr/local/git/bin/ssh-keygen -t rsa

Enter the following command to generate a DSA key pair (public and private key) for version 2 of the SSH protocol:

$ /usr/local/git/bin/ssh-keygen -t dsa

Repeat the above steps on all Oracle RAC nodes in the cluster.

Create authorized key file.

$ touch ~/.ssh/authorized_keys
$ cd ~/.ssh
bash-3.00$ ls -lrt *.pub
-rw-r--r-- 1 oracle oinstall 399 Nov 20 11:51 id_rsa.pub
-rw-r--r-- 1 oracle oinstall 607 Nov 20 11:51 id_dsa.pub

The listing above should show the id_rsa.pub and id_dsa.pub public keys created in the previous step.

In this step, use SSH to copy the contents of the ~/.ssh/id_rsa.pub and ~/.ssh/id_dsa.pub public keys from all Oracle RAC nodes in the cluster to the authorized key file just created (~/.ssh/authorized_keys).

Here node 1 is ocvmrh2045 and node 2 is ocvmrh2053

ssh ocvmrh2045 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh ocvmrh2045 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh ocvmrh2053 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh ocvmrh2053 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Now that this file contains the public keys of both nodes, we should copy it to all the RAC nodes. We don't have to repeat the key collection on every node; just copying the file to the other nodes is enough.

scp ~/.ssh/authorized_keys ocvmrh2053:.ssh/authorized_keys

Set permissions on the authorized key file:

chmod 600 ~/.ssh/authorized_keys

Establish User Equivalence

Once SSH is configured we can go ahead with configuring user equivalence.

su - oracle

exec /usr/local/git/bin/ssh-agent $SHELL
$ /usr/local/git/bin/ssh-add

Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
Identity added: /home/oracle/.ssh/id_dsa (/home/oracle/.ssh/id_dsa)

– Test Connectivity

Try the commands below; they should not ask for a password. They might prompt the first time, but after that they should run without asking for a password.

ssh ocvmrh2045 "date;hostname"
ssh ocvmrh2053 "date;hostname"
ssh ocvmrh2045-priv "date;hostname"
ssh ocvmrh2053-priv "date;hostname"

Partitioning the disk

In order to use OCFS2, you first need to partition the unused disk. You can run "/sbin/sfdisk -s" as the root user to check the existing partitions. We will be creating a single partition to be used by OCFS2. As the root user, run the command below.

# fdisk /dev/sdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won’t be recoverable.

The number of cylinders for this disk is set to 1305.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1305, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-1305, default 1305):
Using default value 1305

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

You can now verify the new partition:

# fdisk -l /dev/sdc

Disk /dev/sdc: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdc1 1 1305 10482381 83 Linux

When finished partitioning, run the 'partprobe' command as root on each of the remaining cluster nodes to ensure that the new partition is visible there.

Configure OCFS2

We will be using OCFS2 in this installation. OCFS is a cluster file system provided by Oracle, designed specifically for RAC installations. Once the disk is configured for OCFS2, it can be used for the clusterware files (the OCR – Oracle Cluster Registry – file and the voting file) as well as for database files.

# ocfs2console

[Figure: ocfs2console main window]
Select Cluster -> Configure Nodes

Click Add in the next window, and enter the name and IP address of each node in the cluster.
Note: use a node name identical to the name returned by the 'hostname' command.

[Figure: ocfs2console – node configuration]

Apply, and Close the window.

After exiting the ocfs2console, you will have a /etc/ocfs2/cluster.conf similar to the following on all nodes. This OCFS2 configuration file should be exactly the same on all of the nodes:

node:
ip_port = 7777
ip_address = 140.87.222.169
number = 0
name = ocvmrh2045
cluster = ocfs2

node:
ip_port = 7777
ip_address = 140.87.222.222
number = 1
name = ocvmrh2053
cluster = ocfs2

cluster:
node_count = 2
name = ocfs2

Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold

You now need to configure the on-boot properties of the O2CB driver so that the cluster stack services will start on each boot. You will also be adjusting the OCFS2 Heartbeat Threshold from its default setting of 7 to 601. All the tasks within this section will need to be performed on both nodes in the cluster as root user.

Set the on-boot properties as follows:

# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure

Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on boot. The current values will be shown in brackets ('[]'). Hitting <ENTER> without typing an answer will keep that current value. Ctrl-C will abort.

Load O2CB driver on boot (y/n) [y]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Specify heartbeat dead threshold (>=7) [7]: 601
Writing O2CB configuration: OK
Loading module "configfs": OK
Mounting configfs filesystem at /config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK

We can now check again to make sure the settings took effect for the o2cb cluster stack.

Verify that ocfs2 and o2cb are started. Check this on both nodes. As root user:

# chkconfig --list | egrep "ocfs2|o2cb"
ocfs2 0:off 1:off 2:on 3:on 4:on 5:on 6:off
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off

If the output does not look like the above on both nodes, turn the services on by running the following commands as root:

# chkconfig ocfs2 on
# chkconfig o2cb on

Create and format the OCFS2 filesystem on the unused disk partition

As root on each of the cluster nodes, create the mount point directory for the OCFS2 file system.

# mkdir /u03

Run the command below as the root user on only one node to create an OCFS2 file system on the unused partition /dev/sdc1 that you created above.

# mkfs.ocfs2 -b 4K -C 32K -N 4 -L /u03 /dev/sdc1
mkfs.ocfs2 1.2.2
Filesystem label=/u03
Block size=4096 (bits=12)
Cluster size=32768 (bits=15)
Volume size=10733944832 (327574 clusters) (2620592 blocks)
11 cluster groups (tail covers 5014 clusters, rest cover 32256 clusters)
Journal size=67108864
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Formatting Journals: done
Writing lost+found: done
mkfs.ocfs2 successful

The above command formats the partition with a volume label of "/u03" (-L /u03), a block size of 4K (-b 4K), a cluster size of 32K (-C 32K) and 4 node slots (-N 4).

Once the OCFS2 filesystem has been created on the disk, you can mount it.

Mount OCFS2 filesystem on both nodes

Run the below command on all nodes to mount the disk having OCFS2 file system.

# mount -t ocfs2 -L /u03 -o datavolume /u03

You can verify whether the disk is mounted correctly using the command below:

# df /u03

Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc1 10482368 268736 10213632 3% /u03

Create the directories for shared files

As root user, run the following commands on node1 only. Since /u03 is on a shared disk, all the files added from one node will be visible on other nodes.

CRS files:

# mkdir /u03/oracrs
# chown oracle:oinstall /u03/oracrs
# chmod 775 /u03/oracrs

Database files:

# mkdir /u03/oradata
# chown oracle:oinstall /u03/oradata
# chmod 775 /u03/oradata

Installing Oracle Clusterware

Before installing the Oracle RAC database, we need to install Oracle Clusterware. The clusterware maintains two important files: the OCR (Oracle Cluster Registry) file and the voting file. The OCR is used to register the nodes involved in the RAC installation and to store all the details about those nodes. The voting file tracks the status of each node: every node registers its presence in the voting file at a regular interval, which is the heartbeat of RAC. If a node goes down, it can no longer register its presence in the voting file, the other nodes find out, and the clusterware then works to bring the failed instance back up.

Follow the steps below to install the clusterware.

From the setup directory, run the ./runInstaller command.

Below are the main screens and the inputs to be given.

Welcome page – Click Next

[Figure: OUI Welcome page]

Specify Inventory Directory and Credentials – Enter the location where the inventory should be created.

[Figure: Specify Inventory Directory and Credentials]

Specify Home Details – Provide the correct location for the CRS home. Note that this does not need to be a shared location: it is where the CRS software is installed, not where the OCR and voting files are kept.

[Figure: Specify Home Details]
Product-Specific Prerequisite Checks – OUI performs the required prerequisite checks. Once they complete, press Next.

[Figure: Product-Specific Prerequisite Checks]

Specify Cluster Configuration – On this screen you add all the servers that will be part of the RAC installation. This is essentially a push install: the installation is pushed to all the nodes added here, so we don't have to install CRS again from node 2.

[Figure: Specify Cluster Configuration]
Specify Network Interface Usage – We need at least one network that is private and not used by the application, so mark one network as private and use it for the interconnect.

[Figure: Specify Network Interface Usage]
Specify OCR Location – This is where you provide the location for the OCR file. Remember that this file must be shared and accessible to all the nodes. We have the shared disk /u03, and in the step "Create the directories for shared files" above we created the /u03/oracrs directory; that directory can be provided here.

[Figure: Specify OCR Location]
Specify Voting Disk Location – On this screen you provide the location for the voting file. This must also be a shared location; you can use the same shared directory created in the step above.

[Figure: Specify Voting Disk Location]

Summary – Click on Install

You can verify the cluster installation by running olsnodes.

bash-3.00$ /u01/app/oracle/product/10.2.0/crs/bin/olsnodes
ocvmrh2045
ocvmrh2053
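
As an additional, optional sanity check (these utilities ship with the 10.2 CRS home; output is not shown here), the clusterware stack and the OCR/voting disk can also be verified:

bash-3.00$ /u01/app/oracle/product/10.2.0/crs/bin/crsctl check crs
bash-3.00$ /u01/app/oracle/product/10.2.0/crs/bin/crsctl query css votedisk
bash-3.00$ /u01/app/oracle/product/10.2.0/crs/bin/ocrcheck
bash-3.00$ /u01/app/oracle/product/10.2.0/crs/bin/crs_stat -t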

Create the RAC Database

You can follow the same steps as for installing a single instance database; only a couple of screens are new in this installation compared to a single instance installation.

[Figure: Oracle Database installation – OUI screen]

4) Specify Hardware Cluster Installation Mode – Select cluster installation and click Select All to select all the nodes in the cluster. This will propagate the installation to all the nodes.

[Figure: Specify Hardware Cluster Installation Mode]

10) Specify Database Storage Options – If you are not using ASM or raw devices but a file system, specify the shared location we created above. This is important because all the instances must have access to the datafiles: we are creating multiple instances, but there is a single database (one set of database files).

At the end, it shows the summary; click Install.

Congratulations! Your new Oracle 10g RAC Database is up and ready for use.

References

Oracle RAC Documentation – http://download.oracle.com/docs/html/B10766_08/toc.htm

Oracle Technical White Paper May 2005 by Barb Lundhild

Converting a single instance database to RAC – http://www.oracle.com/technology/pub/articles/chan_sing2rac_install.html#prelim

Oracle Certified RAC Expert

Hi All,

Just to inform all my viewers..

I have cleared the Oracle Database 10g: RAC Administration exam (1Z0-048). This qualifies me as an Oracle Certified RAC Expert.

About Exam:

This exam (1Z0-048) is part of the Oracle Certified Expert program. The prerequisite is either holding OCP 10g or attending an approved Oracle University course before taking the exam. If you are not an OCP 10g and you are attending the course at Oracle University, you need to submit a course submission form.

How to prepare for this exam:

I don't want to lay down any hard and fast rules of my own about exam preparation, but here is what I did to clear this exam.

1. Read the ILT (Instructor Led Training) materials of the course "Oracle Database 10g: RAC for Administrators Release 2" thoroughly (maybe 2-3 times). I read them twice.

2. If possible, read the book "Oracle Database 10g: RAC Handbook" by K. Gopalkrishnan.

3. Oracle online document – 2 Day + Real Application Clusters Guide

Exam Information

You can find more information about this exam on the Oracle website.

Converting a RAC database to single instance database – Oracle RAC 10g

This post briefly describes the steps to convert a RAC database to a single instance database.
I don't think this is officially supported by Oracle, as I was not able to find any document or Metalink note that provides the steps, so please do not try this in a production environment unless you have confirmation from Oracle Support.
However, I tried the following steps on a test instance and it worked for me.
1) Stop database using srvctl
srvctl stop database -d prod
2) Remove the database entry from crs
bash-3.00$ srvctl remove instance -d prod -i prod1
Remove instance prod1 from the database prod? (y/[n]) y
bash-3.00$ srvctl remove instance -d prod -i prod2
Remove instance prod2 from the database prod? (y/[n]) y
bash-3.00$ srvctl remove database -d prod
Remove the database prod? (y/[n]) y
bash-3.00$
3) Start the database on first instance
SQL> startup
ORACLE instance started.
Total System Global Area  838860800 bytes
Fixed Size                  1222168 bytes
Variable Size             188746216 bytes
Database Buffers          645922816 bytes
Redo Buffers                2969600 bytes
Database mounted.
Database opened.
SQL> alter system set cluster_database=false scope=spfile;
System altered.
SQL> alter system set cluster_database_instances=1 scope=spfile;
System altered.
SQL> alter database disable thread 2;
Database altered.
4) Delete the unwanted thread and redo logfiles
SQL> select thread#, group# from v$log;
THREAD#     GROUP#
---------- ----------
1          1
1          2
1          3
2          4
2          5
2          6
6 rows selected.
SQL> alter database drop logfile group 4;
Database altered.
SQL> alter database drop logfile group 5;
Database altered.
SQL> alter database drop logfile group 6;
Database altered.
5) Drop the unwanted undo tablespace
SQL> drop tablespace UNDOTBS2 including contents and datafiles;

Tablespace dropped.
6) Create pfile from spfile
SQL> create pfile from spfile;

File created.

SQL> shut immediate

Database closed.
Database dismounted.
ORACLE instance shut down.
7) Remove all references to the second instance from the pfile and start the instance using the pfile. Later you can recreate the spfile from the pfile.
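
A minimal sketch of that last part, assuming the edited pfile is kept at /u01/app/oracle/product/10.2.0/db_1/dbs/initprod.ora (an illustrative path; use your own ORACLE_HOME):

SQL> startup pfile='/u01/app/oracle/product/10.2.0/db_1/dbs/initprod.ora'
SQL> create spfile from pfile='/u01/app/oracle/product/10.2.0/db_1/dbs/initprod.ora';
SQL> shutdown immediate
SQL> startup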

Converting a Single Instance Database to RAC Manually – Oracle RAC 10g

There are different ways to convert a single instance database to RAC.

1) Manual Method
2) Using rconfig
3) Using DBCA
4) Using grid control

In this post we will see the step-by-step manual method for converting a single instance database to RAC. Later on I will also try to cover the other methods.

Manual Method

Following are the single instance details

ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1

ORACLE_SID=prod

Datafile location = /u03/oradata/prod (/u03 is an OCFS shared file system, so we are going to keep the datafiles at the same location while converting to RAC)

In case your files are on some other disk and not on a shared file system, you need to copy them to shared storage and then rename the files when you mount the instance.

Database version = 10g R2 (10.2.0.1.0)

Steps to convert single instance to RAC


Step 1) Install clusterware on the nodes on which you want to setup RAC

For detailed steps on installing clusterware on the nodes, you can refer to my previous post – https://avdeo.com/oracle-real-application-cluster-10g/

or you can refer to any other post over the internet. Basically you need to setup the IP addresses and other OS related files and variables before you can install the clusterware.

Your clusterware version must be greater than or equal to the single instance RDBMS version. Make sure you get this step right.

Following are cluster installation details

Cluster name : crs

Cluster install location : /u01/app/oracle/product/10.2.0/crs

OCR File location : /u03/oracrs/ocr.ora

Voting disk location : /u03/oracrs/vote.crs

Step 2) Install Oracle Database 10g Real Application Cluster software

Again for this step, you can refer to my previous post – https://avdeo.com/oracle-real-application-cluster-10g/ or any post on the internet.

Just install the software. The RAC RDBMS software version must be the same as your single instance RDBMS software version.

Following are the details of installation

RAC ORACLE_HOME=/u01/app/oracle/product/10.2.0/db

Number of instances = 2

Node names for 2 instances = ocvmrh2103, ocvmrh2190

/u01 is an individual filesystem and is not mounted commonly on both nodes; this is a separate (non-shared) ORACLE_HOME architecture.

Step 3) Take the backup of single instance database and restore the same to the shared file system location.

This step is not required in my case because I created the database on the shared filesystem, but that was only for demo purposes. In a real scenario, you need to copy the datafiles to the shared filesystem.
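
A rough sketch of what that copy could look like, assuming the single instance files are under /u02/oradata/prod (an illustrative path) and /u03/oradata/prod is the shared OCFS2 location:

SQL> shutdown immediate
bash-3.00$ mkdir -p /u03/oradata/prod
bash-3.00$ cp /u02/oradata/prod/* /u03/oradata/prod/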

Step 4) Copy init.ora file of single instance and add following parameters

bash-3.00$ cp initprod.ora /tmp/initprod.ora

In my case the database name is "prod" and I am converting this single instance database to a two node RAC, so instance 1 becomes prod1 and instance 2 becomes prod2.

Add the following parameters to the /tmp/initprod.ora file:

*.cluster_database = TRUE 
*.cluster_database_instances = 2 
*.undo_management=AUTO 
prod1.undo_tablespace=UNDOTBS1
prod1.instance_name=prod1
prod1.instance_number=1 
prod1.thread=1 
prod1.local_listener=listener_ocvmrh2103
prod2.instance_name=prod2
prod2.instance_number=2 
prod2.local_listener=listener_ocvmrh2190
prod2.thread=2 
prod2.undo_tablespace=UNDOTBS2

Step 5) Change the location of the controlfiles in the above /tmp/initprod.ora file

In my case the controlfiles are already in /u03, which is an OCFS shared filesystem, so I don't have to change the controlfile locations in my init.ora.

In case you have moved the controlfiles along with the datafiles to the shared filesystem location, you need to change the controlfile paths in the init.ora file copied to /tmp.
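
For example, the control_files entry in /tmp/initprod.ora would look something like this (file names are illustrative):

*.control_files='/u03/oradata/prod/control01.ctl','/u03/oradata/prod/control02.ctl','/u03/oradata/prod/control03.ctl'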

Step 6) Create SPFILE from PFILE

SQL> select name from v$database;

NAME
---------
PROD

SQL> create spfile='/u03/oradata/prod/spfileprod.ora' from pfile='/tmp/initprod.ora';

File created.

SQL>

Step 7) Copy spfile to the RAC ORACLE_HOME/dbs location of instance 1 and create pfile

bash-3.00$ cp spfileprod.ora /u01/app/oracle/product/10.2.0/db/dbs/spfileprod.ora
bash-3.00$ pwd
/u01/app/oracle/product/10.2.0/db/dbs
bash-3.00$ cat initprod1.ora
spfile='/u01/app/oracle/product/10.2.0/db/dbs/spfileprod.ora'

Step 8) Create new password file for prod1 instance under RAC oracle home

bash-3.00$ orapwd file=orapwprod1 password=welcome1

Step 9) Start the database in mount state and rename the datafiles and redo log files to the new shared location

In my case the datafiles and online redo logs are already at the shared location, so I don't need to do this step. In a real scenario, however, this step is required (a sketch follows the mount output below).

Make sure that your ORACLE_HOME variable is set to the RAC ORACLE_HOME:

bash-3.00$ echo $ORACLE_HOME
/u01/app/oracle/product/10.2.0/db
bash-3.00$ echo $ORACLE_SID
prod1
SQL> startup mount pfile=initprod1.ora
ORACLE instance started.

Total System Global Area  838860800 bytes
Fixed Size                  1222168 bytes
Variable Size             213912040 bytes
Database Buffers          620756992 bytes
Redo Buffers                2969600 bytes
Database mounted.
SQL>
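
If the files had been copied from a non-shared location, they would be renamed while the database is mounted, along these lines (old and new paths are illustrative):

SQL> alter database rename file '/u02/oradata/prod/system01.dbf' to '/u03/oradata/prod/system01.dbf';
SQL> alter database rename file '/u02/oradata/prod/redo01.log' to '/u03/oradata/prod/redo01.log';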

Step 10) Add second thread to database which will be for instance 2

SQL> alter database add logfile thread 2 group 4 ('/u03/oradata/prod/redo2_01.dbf') size 50M, group 5 ('/u03/oradata/prod/redo2_02.dbf') size 50M, group 6 ('/u03/oradata/prod/redo2_03.dbf') size 50M;

Database altered.

SQL> ALTER DATABASE OPEN;

Database altered.

SQL> alter database enable public thread 2;

Database altered.

Step 11) Create undo tablespace for instance 2

The name of the undo tablespace must be the same as the one specified in the init.ora file in step 4 above.

SQL> CREATE UNDO TABLESPACE UNDOTBS2 DATAFILE '/u03/oradata/prod/undotbs2_01.dbf' size 25M;

Tablespace created.

Step 12) Run $ORACLE_HOME/rdbms/admin/catclust.sql to create cluster database specific views within the existing instance 1

SQL> @?/rdbms/admin/catclust.sql

Step 13)

On the second node, set ORACLE_HOME and SID for instance 2

bash-3.00$ export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db
bash-3.00$ export ORACLE_SID=prod2

Create initprod2.ora on the second node similar to node 1. In this case you have to copy the spfile to the second node as well. Alternatively, you can keep the spfile in a shared location (/u03 in my case) and put that path in initprod2.ora.

bash-3.00$ pwd
/u01/app/oracle/product/10.2.0/db/dbs
bash-3.00$ ls -lrt spfileprod.ora 
-rw-r-----  1 oracle oinstall 3584 Feb 19 12:36 spfileprod.ora
bash-3.00$ cat initprod2.ora 
spfile='/u01/app/oracle/product/10.2.0/db/dbs/spfileprod.ora'

Step 14) Create new password file for instance 2

bash-3.00$ orapwd file=orapwprod2 password=welcome1

Step 15) Start the second instance

SQL> startup pfile=initprod2.ora
ORACLE instance started.

Total System Global Area  838860800 bytes
Fixed Size                  1222168 bytes
Variable Size             213912040 bytes
Database Buffers          620756992 bytes
Redo Buffers                2969600 bytes
Database mounted.
Database opened.

You might face some issues while starting the second instance, because the bdump, udump and cdump directory locations still point to the single instance ORACLE_HOME, which does not exist on node 2.

You might also hit the following error:

SQL> startup
ORA-09925: Unable to create audit trail file
Linux Error: 2: No such file or directory
Additional information: 9925
SQL> Disconnected

Make sure you set the following parameters to valid locations, then copy spfileprod.ora to node 2 again (a sketch follows the list below):

audit_file_dest
background_dump_dest
user_dump_dest
core_dump_dest
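
A rough sketch of fixing these for the second instance (the directory paths are illustrative and must exist on node 2 before the instance is started):

SQL> alter system set background_dump_dest='/u01/app/oracle/admin/prod/bdump' scope=spfile sid='prod2';
SQL> alter system set user_dump_dest='/u01/app/oracle/admin/prod/udump' scope=spfile sid='prod2';
SQL> alter system set core_dump_dest='/u01/app/oracle/admin/prod/cdump' scope=spfile sid='prod2';
SQL> alter system set audit_file_dest='/u01/app/oracle/admin/prod/adump' scope=spfile sid='prod2';

After that, copy the updated spfileprod.ora to node 2 again and restart prod2.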

Step 16) Add the converted database to cluster

Move the spfile to a common location such as /u03/oradata/prod and modify both pfiles so that they refer to the same spfile and there are no longer two copies.
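
For example, mirroring the paths used earlier in this post:

bash-3.00$ mv /u01/app/oracle/product/10.2.0/db/dbs/spfileprod.ora /u03/oradata/prod/spfileprod.ora
bash-3.00$ cat /u01/app/oracle/product/10.2.0/db/dbs/initprod1.ora
spfile='/u03/oradata/prod/spfileprod.ora'

initprod2.ora on node 2 gets the same single spfile line.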

bash-3.00$ srvctl add database -d prod -o /u01/app/oracle/product/10.2.0/db -p /u03/oradata/prod/spfileprod.ora
bash-3.00$ srvctl add instance -d prod -i prod1 -n OCVMRH2103
bash-3.00$ srvctl add instance -d prod -i prod2 -n OCVMRH2190
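
Once registered, the configuration and status can be checked through the clusterware as well (a quick sanity check; output not shown here):

bash-3.00$ srvctl config database -d prod
bash-3.00$ srvctl status database -d prod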

References:
Metalink Note ID : 747457.1