Setting up Replication Slave – MySQL

Introduction:

Replication is one of the basic and most essential feature available in MySQL. Replication allows you to maintain additional copy of database on another server and keep itself updated with latest changes in master databases. Having another copy of data (other than backup) is always a good option. In case of any issues on master server we can always failover to slave and make it a master. This provides high availability. You can have multiple slaves pulling data from a single master, and you can have slaves replicating data off other slaves

There are 2 parts involved in setting up a replication.

Part 1 – Cloning of existing MySQL instance and create a duplicate MySQL environment (similar to master).

This can be done by exporting data using mysqldump and run the sql file created on target database (if data is relatively small) or use “SELECT ... INTO OUTFILE” and “LOAD DATA INFILE”

But relatively easiest way is to setup master slave relationship and set gtid_next variable to the starting gtid of master. This way it will fetch all transactions and make it in sync. Of course you will need GTID to be enabled for this.

Part 2 – Enabling Master – Slave relationship to replicate changes happened on master to slave.

If GTID is enabled, you can configure master – slave using GTID.

If GTID is not enabled, you can use bin_log name and position to configure master – slave.

Setting up replication when GTID is enabled:

So in my current configuration, I have a master where GTID is enabled. After enabling GTID I have loaded the required database “deo”.

I have installed another MySQL setup on another server and enabled GTID.

Parameter Changes:

We need to enable following parameters

Master

server-id=1

We need to assign server ID to master and slave. Remember that server ID should be unique for every MySQL instances in replication.

Also, since GTID is enabled, I am assuming following parameters are setup correctly

gtid_mode=ON
enforce_gtid_consistency=true
log_bin
log_slave_updates=true

Even if GTID is not enabled, you need to setup server_id and log_bin parameters for setting up replication.

Bounce the instance after setting up server_id parameters (and many be GTID parameters if they were not set before).

Slave

server_id=2

Again, I am assuming GTID is enabled on slave mysql instance and above parameters are set correctly.

Bounce the instance after setting up server_id parameters (and many be GTID parameters if they were not set before).

Setting up replication

At this point master looks like following


mysql> show global variables like '%gtid%';
+--------------------------+-------------------------------------------+
| Variable_name | Value |
+--------------------------+-------------------------------------------+
| enforce_gtid_consistency | ON |
| gtid_executed | 2441a23c-b4ae-11e3-9d0b-80c16e2272cc:1-21 |
| gtid_mode | ON |
| gtid_owned | |
| gtid_purged | |
+--------------------------+-------------------------------------------+

mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| deo |
| mysql |
| performance_schema |
| test |
+--------------------+
5 rows in set (0.00 sec)

mysql> show variables like '%server_id%';
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| server_id | 1 |
| server_id_bits | 32 |
+----------------+-------+

It has one extra database “deo” with few tables in that database.

I will create a replication user on master and grant replication privileges to that user

mysql> CREATE USER 'replicate'@'%' IDENTIFIED BY 'welcome';
Query OK, 0 rows affected (0.04 sec)

mysql> GRANT REPLICATION SLAVE ON *.* TO 'replicate'@'%';
Query OK, 0 rows affected (0.01 sec)

mysql> show master status\G
*************************** 1. row ***************************
File: bin_log.000002
Position: 6740
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: 2441a23c-b4ae-11e3-9d0b-80c16e2272cc:1-23
1 row in set (0.00 sec)

At this point my slave looks like following


mysql> show variables like '%gtid%';
+--------------------------+-----------+
| Variable_name | Value |
+--------------------------+-----------+
| enforce_gtid_consistency | ON |
| gtid_executed | |
| gtid_mode | ON |
| gtid_next | AUTOMATIC |
| gtid_owned | |
| gtid_purged | |
+--------------------------+-----------+

mysql> show variables like '%server_id%';
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| server_id | 2 |
| server_id_bits | 32 |
+----------------+-------+

mysql> show slave status\G
Empty set (0.00 sec)

At this point we should point the slave to master and start the slave.

Since we are using GTID, we do not have to provide MASTER_LOG_FILE and MASTER_LOG_POS parameters in CHANGE MASTER TO command.

We can simply use MASTER_AUTO_POSITION=1 and MySQL will automatically find the changes it has to replicate.

Pointing slave to master

Use following command to point slave to master

CHANGE MASTER TO MASTER_HOST = 'master_host', MASTER_PORT = master_port, MASTER_USER = 'replicate', MASTER_PASSWORD = 'welcome', MASTER_AUTO_POSITION = 1;
MASTER_HOST is the host name on which master is running
MASTER_PORT is the MySQL port of master
MASTER_USER is the replication user we created in above step. "replicate" in our case
MASTER_PASSWORD is the password we provided for replicate user.
MASTER_AUTO_POSITION = 1. If GTID is enabled, setting this parameter will automatically replicate the changes.

Start Slave

Once we point slave to master, we need to start the slave.

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

Once you start the slave you can check the status


mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: advait-1.desktop
Master_User: replicate
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: bin_log.000003
Read_Master_Log_Pos: 191
Relay_Log_File: advait-2-relay-bin.000003
Relay_Log_Pos: 397
Relay_Master_Log_File: bin_log.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 191
Relay_Log_Space: 7243
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 2441a23c-b4ae-11e3-9d0b-80c16e2272cc
Master_Info_File: /usr/local/mysql/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 2441a23c-b4ae-11e3-9d0b-80c16e2272cc:1-23
Executed_Gtid_Set: 2441a23c-b4ae-11e3-9d0b-80c16e2272cc:1-23,
8c593930-b4da-11e3-9e2d-b499ba05473c:1-4
Auto_Position: 1
1 row in set (0.00 sec)

So if you consider RETRIEVED_GTID and EXECUTED_GTID variable, you will see it has pulled up all transactions from master and applied on slave.

Retrieved_Gtid_Set: 2441a23c-b4ae-11e3-9d0b-80c16e2272cc:1-23
Executed_Gtid_Set: 2441a23c-b4ae-11e3-9d0b-80c16e2272cc:1-23, 8c593930-b4da-11e3-9e2d-b499ba05473c:1-4

Also, if you check databases on slave, you will see “deo” database is visible and it has all the data same as master.

In EXECUTED_GTID_SET variable, you will see 2 GTIDs separated by comma. This is because if you execute some transactions on slave directly, slave will generate its own GTID, which is 8c593930-b4da-11e3-9e2d-b499ba05473c:1-4 in this case.

In order to prevent any transaction happening on slave, we can make slave read only. This will prevent any user from executing any transactions on slave. But this does allow replicate user to keep replicating transactions happening on master.

Making slave readonly

Add a parameter read_only=ON in /etc/my.cnf on slave side and bounce the slave.


mysql> show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | ON |
| tx_read_only | OFF |
+------------------+-------+

Once slave is readonly, no user can do any transactions on slave, except replicate user and root user.

Hope this helps !!

Enabling GTID in MySQL

What is GTID ?

GTID stands for Global Transaction Identifier. This is a new feature introduced in 5.6 version.

I have given a link in reference section of this article which explains the concept of GTID.

Global transaction identifier is a unique transaction ID assigned to every transaction that happens in MySQL database. This ID has specific importance in maintaining replication slave. Using this ID we can easily track which transactions has been applied to slave and which are yet to be applied. This also prevents applying same transaction again to slave.

Here is how it works. When the master commits a transaction, it generates an identifier consisting of two components:

  • The first component is the server’s server_uuid (server_uuid is a global variable introduced in MySQL 5.6 which holds a UUID, i.e. a 128-bit random number generated so that the probability that two servers in the world have the same server_uuid is so small that we can assume every UUID is unique).
  • The second component is an integer; it is 1 for the first transaction committed on the server, 2 for the second transaction, and so on. Much like an autoincrement column, it is incremented by one for each commit.

The first component ensures that two transactions generated on different servers have different GTIDs. The second component ensures that two transactions generated on the same server have different GTIDs.

In text, we write the GTID as “UUID:N”, e.g., 22096C54-FE03-4B44-95E7-BD3C4400AF21:4711.

The GTID is written as an event to the binary log prior to the transaction

The binary log including the GTID is replicated to the slave. When the slave reads the GTID, it makes sure that the transaction retains the same GTID after it has been committed on the slave. Thus, the slave does not generate a new GTID;

Just to give a brief idea about binary logs – Binary logs are basically logs written for every transactions. They are like the archive logs/online redo logs in Oracle database terminology. I also see GTID is kind of a SCN number in Oracle database terminology.

Steps to enable GTID

You need to set following parameters either in /etc/my.cnf or while starting MySQL instance to enable GTID.

gtid_mode=ON
enforce_gtid_consistency=true (Starting with MySQL 5.6.9, –disable-gtid-unsafe-statements is now named –enforce-gtid-consistency) 
log_bin=/usr/local/mysql/log/bin_log 
log_slave_updates=true

If you are starting the instances using above parameters as arguments inline with mysqld_safe binary, you need to replace underscore(_) with hypen(-) in above parameters and include double hypens at the beginning as below

--gtid-mode=ON
--enforce-gtid-consistency=true (Starting with MySQL 5.6.9, –disable-gtid-unsafe-statements is now named –enforce-gtid-consistency) 
--log-bin=/usr/local/mysql/log/bin_log 
--log-slave_updates=true

I have updated /etc/my.cnf with above parameters and bounced MySQL instance

I can see GTID is now enabled


mysql> show global variables like '%GTID%';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| enforce_gtid_consistency | ON |
| gtid_executed | |
| gtid_mode | ON |
| gtid_owned | |
| gtid_purged | |
+--------------------------+-------+

mysql> show variables like '%gtid_next%';
+---------------+-----------+
| Variable_name | Value |
+---------------+-----------+
| gtid_next | AUTOMATIC |
+---------------+-----------+

mysql> select @@global.gtid_executed;
+------------------------+
| @@global.gtid_executed |
+------------------------+
| |
+------------------------+
1 row in set (0.00 sec)

I will now import a database named “deo” into this MySQL instance.

Exporting deo database from my desktop server MySQL instance:

advait.desktop$ mysqldump --databases deo -u root -p > deo.sql
Enter password:
advait.desktop$

Importing deo database into my new MySQL instance:

/usr/local/mysql/bin/mysql -u root < deo.sql

If I check in my new MySQL instance, I can see deo database


mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| deo |
| mysql |
| performance_schema |
| test |
+--------------------+
5 rows in set (0.00 sec)

If I check the GTID value now


mysql> select @@global.gtid_executed;
+-------------------------------------------+
| @@global.gtid_executed |
+-------------------------------------------+
| 2441a23c-b4ae-11e3-9d0b-80c16e2272cc:1-21 |
+-------------------------------------------+
1 row in set (0.00 sec)

It means that 21 transactions has happened.

References:

GTID concepts – http://dev.mysql.com/doc/refman/5.6/en/replication-gtids-concepts.html

http://svenmysql.blogspot.in/2012/10/failover-and-flexible-replication.html

Hope this helps !!

Installing MySQL Utilities

MySQL utilities are set of scripts provided for DBAs for general database administration. These are basically command line utilities which can be used to perform various tasks.

Following is the listing of MySQL utility version 1.3 that I installed in my environment

advait.desktop$ ls -lL /usr/local/bin/mysql*
-rwxr-xr-x 1 root root 10546 Mar 26 00:34 /usr/local/bin/mysqlauditadmin
-rwxr-xr-x 1 root root 13443 Mar 26 00:34 /usr/local/bin/mysqlauditgrep
-rwxr-xr-x 1 root root 10142 Mar 26 00:34 /usr/local/bin/mysqldbcompare
-rwxr-xr-x 1 root root 13509 Mar 26 00:34 /usr/local/bin/mysqldbcopy
-rwxr-xr-x 1 root root 14781 Mar 26 00:34 /usr/local/bin/mysqldbexport
-rwxr-xr-x 1 root root 12719 Mar 26 00:34 /usr/local/bin/mysqldbimport
-rwxr-xr-x 1 root root 9551 Mar 26 00:34 /usr/local/bin/mysqldiff
-rwxr-xr-x 1 root root 6339 Mar 26 00:34 /usr/local/bin/mysqldiskusage
-rwxr-xr-x 1 root root 13554 Mar 26 00:34 /usr/local/bin/mysqlfailover
-rwxr-xr-x 1 root root 16487 Mar 26 00:34 /usr/local/bin/mysqlfrm
-rwxr-xr-x 1 root root 5218 Mar 26 00:34 /usr/local/bin/mysqlindexcheck
-rwxr-xr-x 1 root root 4717 Mar 26 00:34 /usr/local/bin/mysqlmetagrep
-rwxr-xr-x 1 root root 5298 Mar 26 00:34 /usr/local/bin/mysqlprocgrep
-rwxr-xr-x 1 root root 6452 Mar 26 00:34 /usr/local/bin/mysqlreplicate
-rwxr-xr-x 1 root root 14169 Mar 26 00:34 /usr/local/bin/mysqlrpladmin
-rwxr-xr-x 1 root root 5612 Mar 26 00:34 /usr/local/bin/mysqlrplcheck
-rwxr-xr-x 1 root root 5796 Mar 26 00:34 /usr/local/bin/mysqlrplshow
-rwxr-xr-x 1 root root 7228 Mar 26 00:34 /usr/local/bin/mysqlserverclone
-rwxr-xr-x 1 root root 4881 Mar 26 00:34 /usr/local/bin/mysqlserverinfo
-rwxr-xr-x 1 root root 6194 Mar 26 00:34 /usr/local/bin/mysqluc
-rwxr-xr-x 1 root root 6667 Mar 26 00:34 /usr/local/bin/mysqluserclone

You can find the complete usage of these scripts in official documentation of mysql utilities. I have provided the link for the same in reference section of this article.

I will cover the installation part of MySQL utilities in this article.

Step 1) Download MySQL utilities from http://dev.mysql.com/downloads/tools/utilities/ location

Select the OS as “Linux – Generic” and download .tar.gz file

Step 2) Extract the file

Copy the file to the server on which you have installed MySQL or on any server where you want to install utilities and extract the tar file.

In my case I am going to install this on my desktop where MySQL is running

So my tar file is residing at /home/advait/mysql/mysql-utilities-1.3.6.tar.gz

I can extract the content in one of the directory under my MySQL installation.

So I extracted above tar file at /usr/local/mysql

cd /usr/local/mysql
tar xvfz /home/advait/mysql/mysql-utilities-1.3.6.tar.gz

Step 3) Build and install utilities

Once you extract you will see mysql-utilities-1.3.6 directory. Inside this directory we set setup.py

Use setup.py to build and install utilities

Build should be done using mysql (non root) user. In my case I am using advait user to start/stop and manage mysql. So I will build using advait user

python setup.py build
advait.desktop$ python setup.py build
running build
running build_py
creating build
creating build/lib
creating build/lib/mysql
copying mysql/__init__.py -> build/lib/mysql
creating build/lib/mysql/utilities
copying mysql/utilities/exception.py -> build/lib/mysql/utilities
copying mysql/utilities/__init__.py -> build/lib/mysql/utilities
creating build/lib/mysql/utilities/command
copying mysql/utilities/command/read_frm.py -> build/lib/mysql/utilities/command
copying mysql/utilities/command/rpl_admin.py -> build/lib/mysql/utilities/command
copying mysql/utilities/command/serverinfo.py -> build/lib/mysql/utilities/command
copying mysql/utilities/command/utilitiesconsole.py -> build/lib/mysql/utilities/command
...
changing mode of build/scripts-2.7/mysqlserverclone from 644 to 755
changing mode of build/scripts-2.7/mysqldbcompare from 644 to 755
changing mode of build/scripts-2.7/mysqlserverinfo from 644 to 755
changing mode of build/scripts-2.7/mysqluserclone from 644 to 755

Install should be done using “root” user since we have installed MySQL using root user. If you use any other user (other than root), that user may not have permissions on python libraries and so install can fail.

advaitd-2.desktop$ python setup.py install
running install
running build
running build_py
running build_scripts
running install_lib
creating /usr/local/lib/python2.7/site-packages/mysql
copying build/lib/mysql/__init__.py -> /usr/local/lib/python2.7/site-packages/mysql
creating /usr/local/lib/python2.7/site-packages/mysql/utilities
copying build/lib/mysql/utilities/exception.py -> /usr/local/lib/python2.7/site-packages/mysql/utilities
creating /usr/local/lib/python2.7/site-packages/mysql/utilities/common
copying build/lib/mysql/utilities/common/lock.py -> /usr/local/lib/python2.7/site-packages/mysql/utilities/common
...
changing mode of /usr/local/bin/mysqldbexport to 755
changing mode of /usr/local/bin/mysqldbimport to 755
changing mode of /usr/local/bin/mysqlindexcheck to 755
changing mode of /usr/local/bin/mysqlrplcheck to 755
changing mode of /usr/local/bin/mysqlserverinfo to 755
changing mode of /usr/local/bin/mysqlfrm to 755
running install_egg_info

 

Step 4) Testing

Once install is done, you can verify running one of the script if things are working fine.

All MySQL utilities will get installed at /usr/local/bin directory and they should be part of PATH variable. So no need to give complete path.

Example:

advait.desktop$ mysqldiskusage --server=root:<password>@localhost
# Source on localhost: ... connected.
# Database totals:
+---------------------+------------+
| db_name | total |
+---------------------+------------+
| deo | 546,720 |
| mysql | 1,542,216 |
| performance_schema | 396,071 |
+---------------------+------------+
Total database disk usage = 2,485,007 bytes or 2.37 MB
#...done.

You might face following error if python connector module is not installed

advait.desktop$ mysqldiskusage --server=root:<password>@localhost
Traceback (most recent call last):
 File "/usr/local/bin/mysqldiskusage", line 35, in <module>
 from mysql.utilities.common.server import connect_servers
 File "/usr/local/lib/python2.7/site-packages/mysql/utilities/common/server.py", line 31, in <module>
 import mysql.connector
ImportError: No module named connector

If you face this issue, you need to install python connector from http://dev.mysql.com/downloads/connector/python/ location

Step 5) Installing python connector

Again download the connector from http://dev.mysql.com/downloads/connector/python/ location

scp the same to the machine on which you were installing MySQL utilities

Extract the .tar.gz file to one of the directories in MySQL installation. In my case I am installing them under utilities directory – /usr/local/mysql/mysql-utilities-1.3.6

Build using non-root user (advait in my case).

python setup.py build

Install using root user

python setup.py install

After this you should be good to use MySQL utilities.

Reference:

MySQL Utilities download – http://dev.mysql.com/downloads/tools/utilities/

Connector download – http://dev.mysql.com/downloads/connector/python/

MySQL Utilities overview – http://dev.mysql.com/doc/mysql-utilities/1.3/en/utils-overview.html

Hope this helps !!

Installing MySQL

I am exploring another database storage solution “MySQL” to understand practical applications of using MySQL in our environment.

This is one of the many articles that I will be writing on MySQL. This article gives you brief steps to installing MySQL on Linux.

Nothing different than what is provided in MySQL documentation except that this is less elaborate with only required steps compared to MySQL documentation.

This is how you can do the installation:-

Step 1) Down the software from http://dev.mysql.com/downloads/

If you are downloading only for checking out usability and features and for personal learning – go for community version. Click on “MySQL Community Server” link.

Select the OS as “Linux – Generic”

Scroll to the bottom and you will see last 2 lines provides the option to download binaries in the form of tar.gz files – “Linux – Generic (glibc 2.5) (x86, 64-bit), Compressed TAR Archive” andLinux – Generic (glibc 2.5) (x86, 32-bit), Compressed TAR Archive”

Depending on the hardware server you have download either 32 bit or 64 bit version

Step 2) Extract the files on server (login as root) 

Once you download the software (may be on your laptop), SCP the same to the linux server on which you want to install MySQL.

From here on you should be logged in as “root” user on your linux server

Extract the software on linux server at /usr/local location

In my case software was lying on my linux server at /home/advait/mysql location. I extracted using following commands

cd /usr/local 
tar xvfz /home/advait/mysql/mysql-5.6.16-linux-glibc2.5-x86_64.tar.gz

Create a new symlink “mysql” to point to extracted directory (it looks neat without version number and other details).

ln -s mysql-5.6.16-linux-glibc2.5-x86_64 mysql

Step 3) Run following steps to change ownerships

Since you want a superuser to have control of mysql and you dont want to login every time as “root” user on your server, you should be changing the ownership to the user which should be the administrator of mysql.

Usual convention suggests creating “mysql” user on your linux host and giving ownership to “mysql” user.

But on my desktop, I have a user called “advait” and I will be giving ownership to this user.

chown -R advait mysql
cd mysql
chown -R advait .
chgrp -R <grp name> .

Above <grp name> should be the group your user ID belongs to. In my case its again “advait”. To avoid confusion I have used <grp name> instead.

Step 4) Create MySQL database

you should be in /usr/local/mysql directory. Following script will create MySQL database named “mysql” and all required metadata tables inside that.

it will also create a test database for testing purpose.

scripts/mysql_install_db --user=advait

Step 5) Change permissions and create Log directory

you should be in /usr/local/mysql directory

chown -R advait data
chgrp -R <grp name> data
mkdir log
chown -R advait log
chgrp -R <grp name> log

data directory will hold data about tables and log directory will hold log files.

Step 6) Copy config file and modify

Copy my.cnf from /usr/local/mysql to /etc/

you should be in /usr/local/mysql directory

cp my.cnf /etc/my.cnf

modify following parameters in /etc/my.cnf and put them under [mysql.server] heading

[mysql.server]
log_bin = /usr/local/mysql/log
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
socket = /tmp/mysql.sock
user = advait
join_buffer_size = 256M
sort_buffer_size = 16M
read_rnd_buffer_size = 16M

These parameters will be used by mysql instance to during startup.

Step 7) Start MySQL instance

You should start mysql server using your user ID to which you have given the ownership. In my case “advait”.

Login as “advait” and run following commands to start MySQL instance

cd /usr/local/mysql

Use one of the following command to start MySQL instance.

bin/mysqld_safe &
support-files/mysql.server start

You can actually copy support-files/mysql.server in /etc/init.d and make MySQL instance start automatically when server starts

Step 8) Make MySQL instance start automatically with server reboot

cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysql
chmod +x /etc/init.d/mysql
chkconfig --add mysql

Reference:

http://dev.mysql.com/doc/refman/5.0/en/installing.html

Hope this helps !!

A look into Amazon DynamoDB

Why noSQL ?

From past few years (after 2009) we are seeing high trend towards noSQL databases. Many big enterprises are exploring option for moving services to noSQL databases and many already did.

Its not that existing Oracle databases cannot support the system. We have many rich features and products from Oracle which can be made to work together to support huge internet applications. But there are many vantage points to consider when we are designing a data store for a system. Among those include availability, cost, performance, scalability etc.

Every system is made of many small  services and when system grows bigger, design is made decentralized and service oriented. Data stores are sharded to store information related to service rather than entire system. With this change in architecture even small data store solutions to support individual service is good enough. System will look like having many small services with there own data stores and services can talk to each other via independent API. This way we achieve a goal of isolated services and at the same time make sure data is exchanged to represent business flow.

With such architecture in mind, every services can design and choose the database solution depending on the type of data it supports. Representing every type of data into relational form and storing them in relational database is not a efficient option. That is why enterprises are analyzing data and choosing data store wisely to support there data. More over with smaller data store requirement to support only the required service (rather than entire system), one can always look for cheaper data storage solution especially for tire 2 or tire 3 services.

Few more reasons for considering noSQL data storage solution would be:

  • Data Velocity – Huge data getting generated in less time
  • Data Variety – Structured, semi-structures and unstructured
  • Data Volume – Terabytes and petabytes of data
  • Dynamic Schema – Flexible data model
  • Auto-sharding – Horizontal scalability
  • Continuous Availability
  • Integrated Caching
  • Replication – Automatic replication to support high availability
  • Dynamic provisioned throughput
Above reasons to some extend becomes the limitations for relational databases.

Amazon DynamoDB

Recently I explored a noSQL data store – Amazon DynamoDB.

This is one of the simplest form of data storage solution where data is stored in the form of key-value pair

I have presented the dynamoDB architecture, features and limitation.

You can view the presentation on slideshare – www.slideshare.net/advaitdeo/dynamodb-presentation-31000206

Hope this helps !!

Redo Behavior

What it used to be

After Oracle 9.2, there has been a significant changes in redo behavior. Before 10g, Oracle used to have a single log buffer for writing redo information which eventually gets written to online redo logs in round robin fashion. Of course if your database is in archive log mode information from online redo logs will flow to archive logs before that online redo log file gets overwritten.

Why changed

Oracle 10g onwards behaviour has changed because of contention. You see whenever a session makes a change, oracle has to write the change vector to redo buffer and for that it has to acquire a latch “redo allocation”. If you have just single log buffer, Oracle will have a single latch to protect that memory area and any session making changes have to acquire this latch to write redo change vector to redo buffer.
Think about an OLTP system with many session doing many transactions at the same time. This can cause serious contention for redo allocation latch and was not a scalable solution.

What changed

So from 10g onwards, Oracle has split log buffer into multiple buffer – these are called public redo strands. In 10g it was controlled by parameter log_parallelism and in 11g its controlled by hidden parameter _log_parallelism_max. Default value for _log_parallelism_max is decided by number of CPU. Minimum value for this parameter is 2 for CPU count 16 or below. For every 16 CPUs _log_parallelism_max increases by 1. So for 64 CPU machine _log_parallelism_max will default to 4.

You can check number of public strands in your database using following SQL

SQL> select
 PTR_KCRF_PVT_STRAND ,
 FIRST_BUF_KCRFA ,
 LAST_BUF_KCRFA ,
 TOTAL_BUFS_KCRFA ,
 STRAND_SIZE_KCRFA ,
 indx
from
 x$kcrfstrand ;
PTR_KCRF_PVT_STR FIRST_BUF_KCRFA LAST_BUF_KCRFA TOTAL_BUFS_KCRFA STRAND_SIZE_KCRFA INDX
---------------- ---------------- ---------------- ---------------- ----------------- ----------
00 00000000726BD000 0000000075166E00 87376 44736512 0
00 0000000075167000 0000000077C10E00 87376 44736512 1
00 0000000077C11000 000000007A6BAE00 87376 44736512 2
00 000000007A6BB000 000000007D164E00 87376 44736512 3
00 000000007D165000 000000007FC0EE00 87376 44736512 4
00 000000007FC0F000 00000000826B8E00 87376 44736512 5

As you can see from the output, I have 6 strands of around 43MB each. So my log_buffer must be 43 * 6 ~= 256 MB

SQL>select value from v$parameter where name = 'log_buffer';
VALUE
--------------------
268419072

You can also see that for 6 public redo strands, we have 6 latches to protect those memory structures.

SQL>select ADDR, LATCH#, CHILD#, NAME from v$latch_children where name like 'redo allocation%' order by child#;
ADDR LATCH# CHILD# NAME
---------------- ---------- ---------- ------------------------------
0000000FE1971430 187 1 redo allocation
0000000FE19714D0 187 2 redo allocation
0000000FE1971570 187 3 redo allocation
0000000FE1971610 187 4 redo allocation
0000000FE19716B0 187 5 redo allocation
0000000FE1971750 187 6 redo allocation
6 rows selected.

Please note that I am not talking about private redo strands in this post. Private redo strands and in memory undo can be configured separately to further reduce the contention and improve performance. I will talk about private redo strands in my next post.

The Myth

One of the myth is that log switch happens whenever online redo log file is 100% full
Well, its not like that. There is an internal algorithm that determines the log switch moment. This also has a very good reason because Oracle research finds that doing the log switch at the last moment could incur performance problems. As a result, after the log switch occurs, the archivers are copying only the actual information from the redo log files. Since the redo logs are not 100% full after the log switch and the archive logs are not blank padded after the copy operation has finished, this results in uneven, smaller files than the original redo log files.

How log switch happens

So there is an internal algorithm which decides when log switch should happen. I dont know if this internal algorithm is publicized anywhere, but based on my understanding and putting togetther many random articles, I think here is what happens.

When the RDBMS switches into a new online redo logfile, all the log buffer redo strand memory is “mapped” to the logfile space. If the logfile is larger than the log buffer then each strand will map/reserve its strand size worth of logfile space, and the remaining logfile space (the “log residue”) is still available.

Following figure shows the situation

1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

If online redo log file size is smaller than log buffer then the whole logfile space is divided/mapped/reserved equally among all the strands, and there is no unreserved space (ie no log residue).

Following figure shows the situation

2

 

 

 

 

 

 

 

 

 

 

 

 

 

 

When any process fills a strand such that all the reserved underlying logfile space for that strand is used, AND there is no log residue, then a log switch is scheduled.

Consider first case where we have residue space available in online redo logs. Now when a strand is full it will look out for another “mapping” space in redo log file equivalent to strand size. If its available it will use it.

3

 

 

 

 

 

 

 

 

 

 

 

 

 

 

This will continue until last set of equivalue strand space available in online redo log is allocated as shown in figure below

4

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Now at this point if one of the strand is full and it doesnt have space available, it will trigger log switch. So if other strands are not full and we have space available in other strands, that space gets wasted. So if you have 4 strand and at the end one strand is full, you will have space wastage equivalent to size of 3 strands (which are not full).

But since online redo log file is big enough wastage of space equivalue to 3 strands doesnt really look big.

Online redo log size – 1GB
Space wastage – 3 strands = 3 * 64MB = 192MB

So archive log size would be around (1GB – 192MB) = 810MB
Around 20% wastage

Consider second case where we dont have any residue space available. With initial allocations, strands are mapped to online redo logs and when one of the strand is full, it will trigger a log switch as there is no space available in online redo log file. So same space got wasted (3 strands), but in this case since online redo log file is small we see considerable wastage of space.

Online redo log size – 256MB
Space wastage – 3 strands = 3 * 64MB = 192MB

So archive log size would be around (256 MB – 192 MB) = 64 MB
Around 75% wastage.

Disadvantages of small redo logs

So you can see if you online redo logs are very small (size near to log buffer), you will end up wasting lot of space in online redo logs causing frequent log switches.

 Suggestions

Some suggestion to reduce log switches and to make use of all space in online redo logs

  1. Make online redo log files atleast 4-6 times log buffer size. This ensures we have enough number of allocations of strands before log switch happens
  2. Check optimum value of _log_parallelism_max. While Oracle decides default based on CPU, I would say its better to have lower value. Lower value will make less number of strands and wastage will reduce.
  3. Check for optimum value of log_buffer. default value is usually 1MB * number of CPU. In my opinion its not good to go beyond 128MB. But it may vary with system and in your system you may find higher value to be better.

References

Oracle Forum – https://forums.oracle.com/thread/2346745

https://nzdba.wordpress.com/2011/12/18/hollow-redo-logs/

Doc ID 1356604.1

Effect of Net Timeout Parameter in DG configuration

One of the parameter we configure in physical standby setup is about how much amount of time LGWR on primary should wait for physical standby to respond.

When changes happens on primary side, those redo changes are shipped on physical standby database. If physical standby database is down or if standby server is not reachable, we need to have some time limit on how much time primary should wait for standby to respond (and then move ahead without try to ship redo changes to standby). This limit is defined by Net Timeout parameter.

You can check definition in Oracle docs for the same – http://docs.oracle.com/cd/E11882_01/server.112/e17023/dbpropref.htm#i101032

"The NetTimeout configurable database property specifies the number of seconds the LGWR waits for Oracle Net Services to respond to a LGWR request. It is used to bypass the long connection timeout in TCP."

One of the issue I was seeing is my DG broker was giving following error

Dataguard Configuration...
  Protection Mode: MaxAvailability
  Databases:
    orcl_b - Primary database
      Error: ORA-16825: multiple errors or warnings, including fast-start failover-related errors or warnings, detected for the database
    orcl_a - (*) Physical standby database
      Warning: ORA-16817: unsynchronized fast-start failover configuration
  (*) Fast-Start Failover target

When I checked database info in verbose mode, I saw following

DGMGRL> show database verbose orcl_a

Database - orcl_a

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   1 minute 1 second
  Apply Lag:       3 minutes 7 seconds
  Real Time Query: OFF

This means that even when my DB is in MaxAvailbility mode, I still see lag and standby is not getting in synch with primary.

My broker log file (drc<ORACLE_SID>.log in diagnostic_dest location) was showing following error

08/03/2013 07:51:44
Redo transport problem detected: redo transport for database orcl_a has the following error:
  ORA-16198: Timeout incurred on internal channel during remote archival
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               FSF                               Warning  ORA-16607
  Primary Database            orcl_b                              Error  ORA-16825
  Physical Standby Database   orcl_a                            Warning  ORA-16817

Oracle error ORA-16198 represent timeout issue that must be happening while contacting standby site.

When I sanity checked standby, everything was fine. So I checked NET Timeout parameter which define the timeout value when primary should be able to contact standby.

I realized that timeout value is very less on my system.

When you do show database verbose <unique name>, it shows you properties

NetTimeout                      = '4'

In my case it was set to 4, which is very low value.
As soon as I set this value to around 10, everything was back to normal.

There is no standard value for this parameter, but usual value should be between 10-30 depending on the network config you have. Basically primary should be able to contact standby within this timelimit and hear back from standby.

Downside for keeping this value higher is, in case if something goes wrong with your standby, your primary will hang for that much time.

So, in my case if I am setting a value of 10 sec for Net Timeout parameter and something goes wrong with standby, my primary database will keep trying to send redo entry to standby for 10 sec and till that time commit wont happen (if I am in MaxAvailability mode).

So we need to balance out the value of this parameter and make sure we set optimum value.

Hope this helps !!

Reference:

http://docs.oracle.com/cd/E11882_01/server.112/e17023/dbpropref.htm#i101032