Restoring Slave when GTID is enabled on master

This is a quick post on issues I faced while restoring a slave when GTID was enabled on master.

I have master created few days back and now I am trying to create a slave. I have GTID enabled on master.

Master status:

root [mysql] >show master status \G
*************************** 1. row ***************************
             File: bin_log.000002
         Position: 16682
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-59
1 row in set (0.00 sec)
root [mysql] >show variables like '%gtid%';
+--------------------------+-----------+
| Variable_name            | Value     |
+--------------------------+-----------+
| enforce_gtid_consistency | ON        |
| gtid_executed            |           |
| gtid_mode                | ON        |
| gtid_next                | AUTOMATIC |
| gtid_owned               |           |
| gtid_purged              |           |
+--------------------------+-----------+
6 rows in set (0.00 sec)

Step 1) Take backup from master

mysqldump -u root -p --databases deo fctest nosql --set-gtid-purged=off > /u01/backups/mysql_backup.sql

I used above command to take backup of master.

Note that I used –set-gtid-purged=off. If GTID is enabled on master and we are creating a slave, we want slave to start reading from the point when this backup is created. Because all previous information will be restored and available on slave as part of this backup restore.

If you want slave to start reading from this point and not from the beginning you should NOT set this variable to off. I will explain the significance of this variable later in the post.

Step 2) Install and create MySQL instance

Refer to my previous post http://avdeo.com/2014/02/18/installing-mysql/ for the same

Step 3) SCP backup file to slave server

slave-host.example.com$ scp master-host:/u01/backups/mysql_backup.sql /u01/backups/mysql_backup.sql

Step 4) Import backup into slave

slave-host.example.com$ mysql -u root -p </u01/backups/mysql_backup.sql

Step 5) Start Slave

We have to point slave to our master and then start slave.

When GTID is enabled, you can use MASTER_AUTO_POSITION=1 and slave should automatically start applying from correct GTID

mysql> CHANGE MASTER TO MASTER_HOST = 'master-host.example.com', MASTER_PORT = 3306, MASTER_USER = 'replicate', MASTER_PASSWORD = 'welcome', MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.12 sec)
mysql> start slave;
Query OK, 0 rows affected (0.03 sec)

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master-host.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin_log.000002
          Read_Master_Log_Pos: 16682
               Relay_Log_File: relay_log.000002
                Relay_Log_Pos: 547
        Relay_Master_Log_File: bin_log.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1133
                   Last_Error: Error 'Can't find any matching row in the user table' on query. Default database: 'mysql'. Query: 'SET PASSWORD FOR 'root'@'master-host.example.com'='*DF216F57F1F2066124E1AA5491D995C3CB57E4C2''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 341
              Relay_Log_Space: 17086
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1133
               Last_SQL_Error: Error 'Can't find any matching row in the user table' on query. Default database: 'mysql'. Query: 'SET PASSWORD FOR 'root'@'master-host.example.com'='*DF216F57F1F2066124E1AA5491D995C3CB57E4C2''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
                  Master_UUID: f7718b80-c237-11e4-baa8-a0369f370a52
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 150304 08:03:47
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-59
            Executed_Gtid_Set: c21f28d2-c243-11e4-baf5-2c600c20dba4:1-59,
f7718b80-c237-11e4-baa8-a0369f370a52:1
                Auto_Position: 1
1 row in set (0.00 sec)

Now, if we see above slave status output, its showing error in SQL thread. SQL that is failing is Query: ‘SET PASSWORD FOR ‘root’@’master-host.example.com’=’*DF216F57F1F2066124E1AA5491D995C3CB57E4C2”

This is the first DML that was run on master. Its failing because master-host.example.com entry is not present in mysql.user table.  This is because its a slave on slave host and so root user entry will be present corresponding to slave host only.

Never the less, it should not be running DMLs from the beginning because we have restored everything until GTID 1-59.

Significance of –set-gtid-purged=off

The reason its applying transactions from the beginning is because we used –set-gtid-purged=off while taking backup.

When we are using GTID, we need to know in which GTID is the master and set it on the slave. MySQL keeps two global variables with GTID numbers on it:

  • gtid_executed: it contains a representation of the set of all transaction logged in the binary log
  • gtid_purged: it contains a representation of the set of all transactions deleted from the binary log

So now, the process is the following:

  • take a backup from the master and store the value of gtid_executed
  • restore the backup on the slave and set gtid_purged with the value of gtid_executed from the master

If we don’t use –set-gtid-purged=off in mysqldump, these global variables are automatically set in dump file created by mysqldump. So that when we apply that dump file on slave it will set these variables and slave will start pulling from GTID 60 onwards (in our example).

Fixing the issue

So at this point we have a backup which does not have GTID information. Lets see how to fix this issue manually.

I tried setting GTID_NEXT to 60 by first executing the transaction on master and make GTID 60 on master.

mysql> set gtid_next = "f7718b80-c237-11e4-baa8-a0369f370a52:60";
Query OK, 0 rows affected (0.00 sec)

mysql> begin; commit;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> set gtid_next = "AUTOMATIC";
Query OK, 0 rows affected (0.00 sec)

mysql> stop slave;
Query OK, 0 rows affected (0.02 sec)

mysql> start slave;
Query OK, 0 rows affected (0.03 sec)

But this didn’t work.

We know that master is done until GTID 59 and we have those transaction in backup and slave also have them because backup was restored completely.

Lets try to set GTID_PURGED to remove those transactions from BINLOG

mysql> SET GLOBAL gtid_purged="f7718b80-c237-11e4-baa8-a0369f370a52:1:59";
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

Its saying that GTID_EXECUTED should be empty. Lets check whats the current value and try to set to empty

mysql> show global variables like 'gtid_executed';
+---------------+--------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                |
+---------------+--------------------------------------------------------------------------------------+
| gtid_executed | c21f28d2-c243-11e4-baf5-2c600c20dba4:1-59,
f7718b80-c237-11e4-baa8-a0369f370a52:1:59 |
+---------------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SET GLOBAL gtid_executed = "";
ERROR 1238 (HY000): Variable 'gtid_executed' is a read only variable

We cannot assign empty value to gtid_executed variable because its read-only and since we restored the backup gtid_executed is updated till the point this instance has executed the transactions.

Only way to fix this is using reset master on slave

When we issue reset master on slave, it will reset GTID_EXECUTED and make it null. After that we can set GTID_PURGED till 1-59 and slave can start applying from 60

mysql> reset master;
Query OK, 0 rows affected (0.07 sec)

mysql> show global variables like 'GTID_EXECUTED';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_executed |       |
+---------------+-------+
1 row in set (0.00 sec)
mysql> set global GTID_PURGED="f7718b80-c237-11e4-baa8-a0369f370a52:1-59";
Query OK, 0 rows affected (0.07 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master-host.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin_log.000002
          Read_Master_Log_Pos: 16953
               Relay_Log_File: relay_log.000004
                Relay_Log_Pos: 442
        Relay_Master_Log_File: bin_log.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 16953
              Relay_Log_Space: 1202
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
                  Master_UUID: f7718b80-c237-11e4-baa8-a0369f370a52
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-60
            Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-60
                Auto_Position: 1
1 row in set (0.00 sec)

This fixed the issue.

Hope this helps !!

References:

http://www.percona.com/blog/2013/02/08/how-to-createrestore-a-slave-using-gtid-replication-in-mysql-5-6/

 

Moving InnoDB Table Between Servers

This is a small article telling how to move a innoDB table from one MySQL server to another.

Moving MyISAM table from one server to another is very straight forward process. MyISAM table generates 3 files at file system level.

  • .frm file contains table structure/definition
  • .MYD file contains table data
  • .MYI file contains index data

We can simply copy these 3 files to another MySQL server installation under desired database and it will show up in “SHOW TABLES” command.

But things are different in InnoDB, because InnoDB also stores metadata in default tablespace represented by ibdata* datafiles in data directory.

InnoDB table has 2 types of files generated

  • .frm file contains table structure/definition
  • .ibd file contains data of table as well as index

Apart from above files, we also have metadata like table ID, database name and transaction information about tables in default tablespace (ibdata1 datafile).

This makes InnoDB table special and, simply copying .frm and .ibd files from source server to destination server will not migrate/copy InnoDB table to different server.

We need to do additional steps to complete the copy/migration. Please follow below steps to copy InnoDB table from one server to another.

 

Step 1) On destination database, create required database if not available.

So lets say, I want to move one table called flavors from fctest database on source server to destination server.

For that to happen, I have to make sure that on destination server, I have fctest database created. If its not created, create the same.

You need to do this, even if you want to move this table to some other existing database on destination server.

We have to first move the table to fctest database on destination (same as source) and then we will rename the same to desired database.

On destination server:

root [deo] >show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| deo                |
| menagerie          |
| mysql              |
| performance_schema |
| test               |
+--------------------+
6 rows in set (0.00 sec)

I don’t have fctest database. Creating the same.

root [deo] >create database fctest;
Query OK, 1 row affected (0.00 sec)

root [deo] >
root [deo] >use fctest;
Database changed
root [fctest] >

Step 2) Create empty table structure same as source table

Once you create the database fctest, create empty table flavors whose structure/definition is same as source.

You can get the definition from source using “SHOW CREATE TABLE” command

On Source server:

mysql> show create table flavors;
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                                       |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| flavors | CREATE TABLE `flavors` (
  `flavor_id` int(11) NOT NULL AUTO_INCREMENT,
  `build_type` varchar(255) NOT NULL DEFAULT '',
  `variant` varchar(255) NOT NULL DEFAULT '',
  `threads` varchar(255) NOT NULL DEFAULT '',
  `first_event_id` int(11) NOT NULL DEFAULT '0',
  `last_event_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`flavor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=30 DEFAULT CHARSET=latin1 |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Create the same on destination server

On Destination server:

root [fctest] >CREATE TABLE flavors (
    ->   flavor_id int(11) NOT NULL AUTO_INCREMENT,
    ->   build_type varchar(255) NOT NULL DEFAULT '',
    ->   variant varchar(255) NOT NULL DEFAULT '',
    ->   threads varchar(255) NOT NULL DEFAULT '',
    ->   first_event_id int(11) NOT NULL DEFAULT '0',
    ->   last_event_id int(11) DEFAULT NULL,
    ->   PRIMARY KEY (flavor_id))
    ->  ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec)
root [fctest] >show tables;
+------------------+
| Tables_in_fctest |
+------------------+
| flavors          |
+------------------+
1 row in set (0.00 sec)

Step 3) Discard tablespace on destination server

This step is required so that MySQL will discard current .ibd file generated for this table.

root [fctest] >ALTER TABLE flavors DISCARD TABLESPACE;
Query OK, 0 rows affected (0.00 sec)

Step 4) Copy .frm and .ibd file from source server to destination server

At this point copy the .frm and .ibb files from source server to destination server.

Before copying these files, make sure that no transactions are happening on this table (which you are migrating).

Safest way is to use “FLUSH TABLE <table_name> WITH READ LOCK” on source database which will flush table data and takes read only lock so that no DML will happen on the table.

While you keep that session active, copy the .frm and .ibd files to destination server.

Step 5) Import tablespace on destination server

Once you copy .frm and .ibd files to destination server, import the tablespace, which will make required metadata imported into default tablespace and accept the .ibd file.

root [fctest] >ALTER TABLE flavors IMPORT TABLESPACE;
Query OK, 0 rows affected (0.00 sec)
root [fctest] >show tables;
+------------------+
| Tables_in_fctest |
+------------------+
| flavors          |
+------------------+
1 row in set (0.00 sec)
root [fctest] >select count(1) from flavors;
+----------+
| count(1) |
+----------+
|       29 |
+----------+
1 row in set (0.00 sec)

Step 6) Rename the table to desired database

Since we wanted to move this table to deo database on destination server, we can simply run below rename command and it will move the tables to deo database.

root [fctest] >rename table fctest.flavors to deo.flavors;
Query OK, 0 rows affected (0.00 sec)

you can later drop fctest database on destination server.

Hope this helps !!

Disabling GTID in MySQL replication

There could be situations where we have to disable GTID in our replication setup.
We can follow below steps to do so. No need to change any GTID related parameters. We just have to change slave configuration and bounce slave. No need to touch master.

Step 1) Confirm that you are using GTID

If you check processlist on master, you can see GTID is enabled

master> show processlist \G
*************************** 1. row ***************************
     Id: 51
   User: replicate
   Host: master-1.db.example.com:43420
     db: NULL
Command: Binlog Dump GTID
   Time: 81316
  State: Master has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL

Check the “command” above and it says “Binlog Dump GTID

 

Step 2) Stop slave, reset master_auto_position to 0


mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

mysql> change master to master_auto_position=0;
Query OK, 0 rows affected (0.00 sec)

Step 3) Check slave status for master log file and position

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: master-1.db.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 5
              Master_Log_File: bin_log.000003
          Read_Master_Log_Pos: 191
               Relay_Log_File: relay_log.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: bin_log.000003
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
       ...
       ...
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 8c593930-b4da-11e3-9e2d-b499ba05473c:1-7,
cc4192b6-a60d-11e4-8301-80c16e2272cc:1-168
                Auto_Position: 0
1 row in set (0.01 sec)

 

Step 4) Use change master to point to master log file and position


mysql> change master to master_host='master-1.db.example.com' , master_port = 3306 , master_user = 'replicate' , master_password='welcome' , master_log_file = 'bin_log.000003' , master_log_pos = 191, master_connect_retry = 5;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                              |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1759 | Sending passwords in plain text without SSL/TLS is extremely insecure.                                                                                                                                                                                                               |
| Note  | 1760 | Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information. |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

 

Step 5) start slave


mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

show slave status will still show GTIDs but its not using GTIDs now

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master-1.db.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 5
              Master_Log_File: bin_log.000003
          Read_Master_Log_Pos: 465
               Relay_Log_File: relay_log.000002
                Relay_Log_Pos: 586
        Relay_Master_Log_File: bin_log.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
              ...
              ...
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 8c593930-b4da-11e3-9e2d-b499ba05473c:8
            Executed_Gtid_Set: 8c593930-b4da-11e3-9e2d-b499ba05473c:1-8,
cc4192b6-a60d-11e4-8301-80c16e2272cc:1-168
                Auto_Position: 0
1 row in set (0.00 sec)

 

We can confirm that GTID is not getting used by checking processlist on master

master>show processlist \G
*************************** 1. row ***************************
     Id: 1
   User: root
   Host: localhost
     db: deo
Command: Query
   Time: 0
  State: init
   Info: show processlist
*************************** 2. row ***************************
     Id: 3
   User: replicate
   Host: master-1.db.example.com:59631
     db: NULL
Command: Binlog Dump
   Time: 24
  State: Master has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL
2 rows in set (0.00 sec)

If you check “command” for row 2, you will notice GTID is gone.

Hope this helps !!

Semi-synchronous Replication in MySQL

If we talk about MySQL replication, we always default to asynchronous replication. By asynchronous, we mean master will keep doing transactions while they are getting replicated to slave via IO thread asynchronously. In this replication, its possible that slave will fall behind but that does not impact master in anyway.
Problem with this replication is data-loss. Example: In case slave is falling behind and master host crashes, some of the transactions may not get transfered/replicated to slave. Following 2 options exists in this scenario

  1. Wait for master host to come up. Which may take around 20-25 mins
  2. Promote slave to master with some data loss

Nether options sounds good on production database. That’s why we have semi-synchronous replication.

Purpose of semi-synchronous replication is twofold

  1. Reduce replication lag and keep slave in sync with master
  2. Ensure no data-loss in case master crashes

Semi-synchronous replication has been in place from MySQL 5.5 onwards in the form of plugin. We just have to install the plugin and then we can enable the same. But again this semi-synchronous replication available in MySQL 5.5 and 5.6 does not guarantee “no data loss” completely for the reasons we will see later but it keeps slave in sync with master.

MySQL 5.7 Oracle has made some enhanced semi-synchronous replication which they call “loss-less semi-synchronous replication”. This guarantees that there is no data loss in case master crashes.

We will take a look at how semi-synchronous replication works, how to install the plugin and whats the enhancement in MySQL 5.7 to prevent data loss.

How Semi-Synchronous replication works?

As discussed above, there are 2 different algorithm for semi-synchronous replication.

  1. Normal semi-synchronous replication available in mysql 5.5 and 5.6 – does not guarantee “no data loss”
  2. Loss-less semi-synchronous replication available in mysql 5.7 – guarantee “no data loss”

lets take a look at each and how it works

1) Normal semi-synchronous replication in MySQL 5.5 and 5.6

If we talk above true synchronous replication – In truly synchronous replication, when you commit a transaction, the commit does not complete on master until all replicas have also committed successfully and acknowledged to master.

In MySQL’s semi-synchronous replication when transaction is committed on master, its written to binary log and innodb transaction log and commit is complete. At this point the data has not even reached to any of the slave but on master commit is complete. The client issuing the commit is still on hold and is not told that commit has complete.
At this point slave IO thread pulls transaction data from binary logs and updated relay logs, flush relay logs to disk and acknowledges master that slave has received and written the data. Only after receiving ack from at least 1 slave, MySQL confirms the client that transaction is committed.

There is a problem with above functionality. MySQL is only delaying acknowledge of commit to client. In reality its committing the transaction on master way before that (before even sending data to slave). Since committing of transaction is done before waiting for ack from semi-sync slave, so the committed rows are visible from applications, even though semi-sync slaves may not have received the data. If master is crashed and none of the slaves received the data, the data is lost but applications may have seen them. This is called phantom reads, and in many cases it’s problematic.

2) Loss-less semi-synchronous replication in MySQL 5.7

loss-less semi-synchronous replication in MySQL 5.7 answers above problem. Only change that’s done in this semi-synchronous replication is that commit on primary is delayed until master receives acknowledgement from only of the semi-synchronous slave that data is received and written by the slave. So when committed data is visible from applications, one of the semi-sync slaves have received that. Phantom read risk is much smaller: if both master and the latest semi-sync slave are down at the same time, data is lost. But it’s much less likely to happen compared to normal semi-sync.

To avoid data loss and phantom reads, Normal Semi-sync can’t meet your expectations. Using Loss-Less Semi-sync is needed.
With Loss-Less Semi-Synchronous replication, committed data should be on one of the slaves, so you can recover from the latest slave. You can always do fast failover here.

The acknowledgement from the slave means, it has received the transaction and applied it to the relay log and flushed it to disk. It doesn’t mean the data has actually gone into the database.

Just to make this statement clearer, it means the transaction was written to the relay log, and the relay log was flushed to disk.
There is still the need for the SQL thread to pick it up and apply it to the database. It’s very unlikely not to complete this task, but that way we’re getting a bit more performance.

Implementing semi-synchronous replication

1) check if semi-synchronous plugin is installed using “show plugins;”


mysql> show plugins;
+----------------------------+----------+--------------------+---------+---------+
| Name                       | Status   | Type               | Library | License |
+----------------------------+----------+--------------------+---------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| mysql_native_password      | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| mysql_old_password         | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| sha256_password            | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| MRG_MYISAM                 | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MyISAM                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MEMORY                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| CSV                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| InnoDB                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| INNODB_TRX                 | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCKS               | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
...
...
| PERFORMANCE_SCHEMA         | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| BLACKHOLE                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| FEDERATED                  | DISABLED | STORAGE ENGINE     | NULL    | GPL     |
| ARCHIVE                    | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| partition                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
+----------------------------+----------+--------------------+---------+---------+
42 rows in set (0.01 sec)

You should be looking for “rpl_semi_sync_master” name.
In my case plugin was not installed.

Optionally you can also check system variables


mysql> show variables like 'rpl_semi_sync%';
Empty set (0.00 sec)

Nothing. That means semi-synchronous functionality is not installed.

2) Install semi-synchronous plugin

You need to install “semisync_master.so” on master and “semisync_slave.so” on slave

on Master:


mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.12 sec)

on Slave:


mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)

These .so plugin files are available in <mysql_install_dir>/lib/plugin directory.

“show plugins;” should show the plugin now


mysql> show plugins;
+----------------------------+----------+--------------------+--------------------+---------+
| Name                       | Status   | Type               | Library            | License |
+----------------------------+----------+--------------------+--------------------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL               | GPL     |
| mysql_native_password      | ACTIVE   | AUTHENTICATION     | NULL               | GPL     |
| mysql_old_password         | ACTIVE   | AUTHENTICATION     | NULL               | GPL     |
| sha256_password            | ACTIVE   | AUTHENTICATION     | NULL               | GPL     |
| MRG_MYISAM                 | ACTIVE   | STORAGE ENGINE     | NULL               | GPL     |
...
...
| rpl_semi_sync_master       | ACTIVE   | REPLICATION        | semisync_master.so | GPL     |
+----------------------------+----------+--------------------+--------------------+---------+
43 rows in set (0.00 sec)

You should also see the variables


mysql> show variables like 'rpl_semi_syn%';
+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| rpl_semi_sync_master_enabled       | OFF   |
| rpl_semi_sync_master_timeout       | 10000 |
| rpl_semi_sync_master_trace_level   | 32    |
| rpl_semi_sync_master_wait_no_slave | ON    |
+------------------------------------+-------+
4 rows in set (0.00 sec)

3) Enable semi-synchronous replication

On master:


mysql> SET GLOBAL rpl_semi_sync_master_enabled = 1;
Query OK, 0 rows affected (0.03 sec)

mysql> SET GLOBAL rpl_semi_sync_master_timeout = 10000;
Query OK, 0 rows affected (0.00 sec)

rpl_semi_sync_master_timeout is the timeout in millisec. We are setting a timeout of 10 sec.

On slave:


mysql> SET GLOBAL rpl_semi_sync_slave_enabled = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> stop slave io_thread;
Query OK, 0 rows affected (0.00 sec)

mysql> start slave io_thread;
Query OK, 0 rows affected (0.00 sec)

we have to bounce IO_THREAD on slave so that it will connect back to master as semi-synchronous slave.

Monitoring Semi-synchronous replication

We have few status variables that we can monitor to check if semi-synchronous replication is working as expected or not.

If you check status variables on master, it shows following status variables.


mysql> SHOW STATUS LIKE 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

On replication slave we have only 1 variable Rpl_semi_sync_slave_status.

Among the variables mentioned above, 3 varaibles are important for us to watch

Rpl_semi_sync_master_status -> Whether semi-synchronous replication currently is operational on the master
Rpl_semi_sync_master_yes_tx -> The number of commits that were acknowledged successfully by a slave
Rpl_semi_sync_master_no_tx  -> The number of commits that were not acknowledged successfully by a slave

Currently we see that Rpl_semi_sync_master_status is ON, that means the semi-synchronous replication is working fine.

Lets create a table in “deo” database and insert few records.

On master:


mysql> use deo;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> create table test (col1 smallint not null);
Query OK, 0 rows affected (0.14 sec)

mysql> set autocommit=1;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test values (1),(2),(3);
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from test;
+------+
| col1 |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 1356  |
| Rpl_semi_sync_master_net_wait_time         | 4070  |
| Rpl_semi_sync_master_net_waits             | 3     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 1379  |
| Rpl_semi_sync_master_tx_wait_time          | 4139  |
| Rpl_semi_sync_master_tx_waits              | 3     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 3     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

If you see above, we have 3 transactions done (corrosponding to 3 inserts since autocommit was on) and all are replicated synchronously on slave.

Rpl_semi_sync_master_yes_tx has become 3 and Rpl_semi_sync_master_no_tx is still 0.

Lets stop replication IO thread so that changes in master will not get replicated on slave

on slave:


mysql> stop slave io_thread;
Query OK, 0 rows affected (0.00 sec)

Now, lets insert a record in test table in master

On master:


mysql> insert into test values (4);
Query OK, 1 row affected (10.01 sec)

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 1022  |
| Rpl_semi_sync_master_net_wait_time         | 4090  |
| Rpl_semi_sync_master_net_waits             | 4     |
| Rpl_semi_sync_master_no_times              | 1     |
| Rpl_semi_sync_master_no_tx                 | 1     |
| Rpl_semi_sync_master_status                | OFF   |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 1379  |
| Rpl_semi_sync_master_tx_wait_time          | 4139  |
| Rpl_semi_sync_master_tx_waits              | 3     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 3     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

If you see, Rpl_semi_sync_master_no_tx is updated to 1 and Rpl_semi_sync_master_status is updated to off.
It means that replication is changed back to asynchronous.

Also note that insert statement took 10 secs to complete, because semi-sync replication timeout was set to 10 sec (rpl_semi_sync_master_timeout = 10000)

We can now start the slave IO thread and it will be back to semi-synchronous replication

on slave:


mysql> start slave io_thread;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test;
+------+
| col1 |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
+------+
4 rows in set (0.00 sec)

On master:


mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 1626  |
| Rpl_semi_sync_master_net_wait_time         | 8131  |
| Rpl_semi_sync_master_net_waits             | 5     |
| Rpl_semi_sync_master_no_times              | 1     |
| Rpl_semi_sync_master_no_tx                 | 1     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 1379  |
| Rpl_semi_sync_master_tx_wait_time          | 4139  |
| Rpl_semi_sync_master_tx_waits              | 3     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 3     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

So after starting slave IO thread, it recovered immediately and semi-synchronous replication enabled automatically.

Reference:

http://yoshinorimatsunobu.blogspot.in/2014/04/semi-synchronous-replication-at-facebook.html
http://dev.mysql.com/doc/refman/5.5/en/replication-semisync.html
http://www.percona.com/blog/2012/01/19/how-does-semisynchronous-mysql-replication-work/

Hope this help !!

Converting non-CDB database as PDB in existing CDB

This is a short article on converting existing non-container (non-cdb) 12c database as pluggable database (pdb) to existing container database (cdb).

Version of Non-CDB database = 12.1.0.1

Version of CDB database = 12.1.0.2

Following steps will plug-in non-cdb database into cdb as pdb and also will upgrade the version to 12.1.0.2

My non-cdb database name is deo12c

My cdb database name is deocdb

Step 1) Make sure that non-cdb database is at least 12.1.0.1 version. This is the first version available.

Step 2) Compatible is set to at least 12.0.0 in non-cdb database

Step 3) Open non-cdb database in read-only to create XML file for the PDB

Shutting down non-cdb database (deo12c)

SYS.DEO12C.PRIMARY>shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

Mounting the database and opening in read-only

SYS.DEO12C.PRIMARY>startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 4294967296 bytes
Fixed Size            3719064 bytes
Variable Size         4177526888 bytes
Database Buffers      100663296 bytes
Redo Buffers           13058048 bytes
Database mounted.
SYS.DEO12C.PRIMARY>alter database open read only;

Database altered.

SYS.DEO12C.PRIMARY>

Step 4) Create XML file for PDB

SYS.DEO12C.PRIMARY-RO>exec DBMS_PDB.DESCRIBE('/fs-a01-a/backups/deo12c/deopdb2.xml');

PL/SQL procedure successfully completed.

Step 5) Shutdown non-cdb database

SYS.DEO12C.PRIMARY-RO>SHUTDOWN IMMEDIATE
Database closed.
Database dismounted.
ORACLE instance shut down.
SYS.DEO12C.PRIMARY-RO>

Step 6) Plugin non-cdb database as PDB into CDB

Running following command on CDB database – deocdb

SYS.DEOCDB.PRIMARY>CREATE PLUGGABLE DATABASE deopdb2 USING '/fs-a01-a/backups/deo12c/deopdb2.xml' NOCOPY TEMPFILE REUSE;

Pluggable database created.

SYS.DEOCDB.PRIMARY>

Step 7) Convert the dictionary of new plugged in database to the PDB type

Change the container to new pluggable database

SYS.DEOCDB.PRIMARY>ALTER SESSION set container=deopdb2;

Session altered.

SYS.DEOCDB.PRIMARY>@$ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql

Step 8) Start new pluggable database

SYS.DEOCDB.PRIMARY>alter pluggable database deopdb2 open;

Warning: PDB altered with errors.

When we opened the pluggable database, we got a warning. You can check the details of warning by querying PDB_PLUG_IN_VIOLATIONS table

SYS.DEOCDB.PRIMARY>select name,cause,type,message,status from PDB_PLUG_IN_VIOLATIONS where name = 'DEOPDB2' and status != 'RESOLVED';

NAME                   CAUSE                  TYPE    MESSAGE                        STATUS
------------------------------ ------------------------------ --------- -------------------------------------------------- ---------
DEOPDB2                NATIONAL CHARACTER SET          ERROR    National character set mismatch: PDB national char PENDING
                                    acter set AL16UTF16. CDB national character set UT
                                    F8.

DEOPDB2                OPTION                  WARNING    Database option APS mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option CATJAVA mismatch: PDB installed ve PENDING
                                    rsion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option CONTEXT mismatch: PDB installed ve PENDING
                                    rsion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option DV mismatch: PDB installed version PENDING
                                     NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option JAVAVM mismatch: PDB installed ver PENDING
                                    sion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option OLS mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option ORDIM mismatch: PDB installed vers PENDING
                                    ion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option OWM mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option SDO mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option XML mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option XOQ mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                APEX                  WARNING    APEX mismatch: PDB installed version NULL CDB inst PENDING
                                    alled version 4.2.5.00.08


13 rows selected.

These warnings can be ignored. These are basically coming because dba_registry of container database doesn’t match new pluggable database.

In my case container database has following registry

SYS.DEOCDB.PRIMARY>alter session set container=cdb$root;

Session altered.

SYS.DEOCDB.PRIMARY>col comp_name format a40;
SYS.DEOCDB.PRIMARY>SELECT comp_id,comp_name,version FROM dba_registry;

COMP_ID                COMP_NAME                VERSION
------------------------------ ---------------------------------------- ------------------------------
DV                   Oracle Database Vault            12.1.0.2.0
APEX                   Oracle Application Express        4.2.5.00.08
OLS                   Oracle Label Security            12.1.0.2.0
SDO                   Spatial                    12.1.0.2.0
ORDIM                   Oracle Multimedia            12.1.0.2.0
CONTEXT                Oracle Text                12.1.0.2.0
OWM                   Oracle Workspace Manager         12.1.0.2.0
XDB                   Oracle XML Database            12.1.0.2.0
CATALOG                Oracle Database Catalog Views        12.1.0.2.0
CATPROC                Oracle Database Packages and Types    12.1.0.2.0
JAVAVM                   JServer JAVA Virtual Machine        12.1.0.2.0
XML                   Oracle XDK                12.1.0.2.0
CATJAVA                Oracle Database Java Packages        12.1.0.2.0
APS                   OLAP Analytic Workspace            12.1.0.2.0
XOQ                   Oracle OLAP API                12.1.0.2.0
RAC                   Oracle Real Application Clusters     12.1.0.2.0

16 rows selected.

In my pluggable database, registry has only following components

SYS.DEOCDB.PRIMARY>alter session set container=deopdb2;

Session altered.

SYS.DEOCDB.PRIMARY>alter session set container=deopdb2;

Session altered.

SYS.DEOCDB.PRIMARY>SELECT comp_id,comp_name,version FROM dba_registry;

COMP_ID                COMP_NAME                VERSION
------------------------------ ---------------------------------------- ------------------------------
XDB                   Oracle XML Database            12.1.0.2.0
CATALOG                Oracle Database Catalog Views        12.1.0.2.0
CATPROC                Oracle Database Packages and Types    12.1.0.2.0
RAC                   Oracle Real Application Clusters     12.1.0.2.0

4 rows selected.

You can get rid of above warnings by installing the required components in your pluggable database and match it with container database.

References:

https://docs.oracle.com/database/121/UPGRD/upgrade.htm#CHDBEDDA

Hope this helps !!

MySQL Performance Schema

Performance schema, as name suggest is a tool for monitoring MySQL performance. It was implemented staring from MySQL 5.5 and improved further in MySQL 5.6. Performance Schema includes set of tables that gives information on how database is performing. It provides a way to inspect internal execution of the server at run time. Performance schema monitors every “event” the server does that takes time and has been instrumented so that timing information can be collected. It is implemented using the PERFORMANCE_SCHEMA storage engine and the performance_schema database.

Performance Schema focuses primarily on performance data. This differs from INFORMATION_SCHEMA, which serves for inspection of metadata. Having its own engine allows us to access information about server execution while having minimal impact on server performance. Tables created in performance_schema are persistent in memory. Information from those tables will be wiped out if the database is rebooted. Memory allocation for performance_schema table is all done at server startup, so there is no further memory reallocation or sizing, which greatly streamlines performance.

In this article we will see how different tables are organized and what kind of information each table store. We will also see how we can customize instrumentation in performance schema to gather required performance stats. Performance schema is a very big topic and there have been very exhaustive documentation on the same. I tried to make this article brief and cover major areas of performance schema. For further details readers can check out the document links provided in reference section.

Tables in Performance schema

We can categorize the tables in performance schema in 5 broad categories as described below.

1) Setup tables – These tables contains information about setting up monitoring/instrumentation. We can modify (update) these tables to customize what kind of performance data to be collected. These tables define who, what, how  to monitor, and where to save this data.

2) Current Events tables – These tables records performance data about the current events happening in database. This also depends on what kind of instrumentation we have enabled in setup tables. These tables contains most recently collected monitored events.

3) History events table – Performance data collected in current events table move to these history table after a period of time. So structure wise history tables are same as current events table but they contains historical data

4) Summary tables – These tables contacts performance data summarized over an entity. So performance data of events are summarized by various entities like account_id, host_name or by thread. These tables has aggregate value of events grouped by different entities.

5) Object Instances tables – These tables records what instances of objects (files, locks, mutexes, etc) are currently in use. Example if a mutex is currently in use, you will see an entry corresponding t that mutex in mutex_instances table.

Lets try to dig into each category and see what tables we have and what they store.

1) Setup tables

Most of the tables in performance_schema database is read only tables. There are very few tables which we can modify to adjust instrumentation. Setup tables are among those few tables which can be modified to define what kind of performance data we want to collect.

SETUP_ACTORS – This table defines who to monitor. We can insert username@hostname combination in this table that we want to monitor.

By default it has only 1 entry and every user is monitored

mysql> select * from setup_actors;
+------+------+------+
| HOST | USER | ROLE |
+------+------+------+
| %    | %    | %    |
+------+------+------+
1 row in set (0.00 sec)

 

 

SETUP_OBJECTS – This table defines which objects (table) to monitor. By details mysql does not monitor any activity in performance_schema, information_schema and MySQL databases. Everything else is monitored. If you are interested in monitoring only 1 or 2 databases, you can remove the entry of monitoring all databases and insert into databases that you want to monitor.

Default entries looks file following

mysql> select * from setup_objects;
+-------------+--------------------+-------------+---------+-------+
| OBJECT_TYPE | OBJECT_SCHEMA      | OBJECT_NAME | ENABLED | TIMED |
+-------------+--------------------+-------------+---------+-------+
| TABLE       | mysql              | %           | NO      | NO    |
| TABLE       | performance_schema | %           | NO      | NO    |
| TABLE       | information_schema | %           | NO      | NO    |
| TABLE       | %                  | %           | YES     | YES   |
+-------------+--------------------+-------------+---------+-------+
4 rows in set (0.02 sec)

 

SETUP_INSTRUMENTS – This table defines the “what” part – what to monitor. This is a major table in performance_schema database. This table has list of all instrumentation that is available in mysql. This table lists classes of instrumented objects for which events can be collected. Each instrument added to the source code provides a row for this table, even when the instrumented code is not executed. You can enable or disable each instrumentation in this table.

This table looks like following

mysql> select * from setup_instruments;
+---------------------------------------------------------------------------------------+---------+-------+
| NAME                                                                                  | ENABLED | TIMED |
+---------------------------------------------------------------------------------------+---------+-------+
| wait/synch/mutex/sql/PAGE::lock                                                       | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_sync                                           | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_active                                         | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_pool                                           | NO      | NO    |
| wait/synch/mutex/sql/LOCK_des_key_file                                                | NO      | NO    |
| wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_commit                                       | NO      | NO    |
...
...
...
| statement/com/Set option                                                              | YES     | YES   |
| statement/com/Fetch                                                                   | YES     | YES   |
| statement/com/Daemon                                                                  | YES     | YES   |
| statement/com/Binlog Dump GTID                                                        | YES     | YES   |
| statement/com/Error                                                                   | YES     | YES   |
| statement/abstract/Query                                                              | YES     | YES   |
| statement/abstract/new_packet                                                         | YES     | YES   |
| statement/abstract/relay_log                                                          | YES     | YES   |
| wait/io/socket/sql/server_tcpip_socket                                                | NO      | NO    |
| wait/io/socket/sql/server_unix_socket                                                 | NO      | NO    |
| wait/io/socket/sql/client_connection                                                  | NO      | NO    |
| idle                                                                                  | YES     | YES   |
+---------------------------------------------------------------------------------------+---------+-------+

Instrument names have multiple parts and form a hierarchy. An instrument name consists of a sequence of components separated by ‘/’ characters. Example names:

wait/io/file/myisam/log
wait/io/file/mysys/charset
wait/lock/table/sql/handler
wait/synch/cond/mysys/COND_alarm
wait/synch/cond/sql/BINLOG::update_cond
wait/synch/mutex/mysys/BITMAP_mutex
wait/synch/mutex/sql/LOCK_delete
wait/synch/rwlock/sql/Query_cache_query::lock
stage/sql/closing tables
stage/sql/Sorting result
statement/com/Execute
statement/com/Query
statement/sql/create_table
statement/sql/lock_tables

The instrument name space has a tree-like structure. The components of an instrument name from left to right provide a progression from more general to more specific. The number of components a name has depends on the type of instrument.

The interpretation of a given component in a name depends on the components to the left of it. For example, myisam appears in both of the following names, but myisam in the first name is related to file I/O, whereas in the second it is related to a synchronization instrument:

wait/io/file/myisam/log
wait/synch/cond/myisam/MI_SORT_INFO::cond

SETUP_CONSUMERS – This defines type of consumers for which event information is getting collected. In a way this defines is history information should be kept or not or just current event information should be collected and stored.

mysql> select * from setup_consumers;
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
| events_stages_current          | NO      |
| events_stages_history          | NO      |
| events_stages_history_long     | NO      |
| events_statements_current      | YES     |
| events_statements_history      | NO      |
| events_statements_history_long | NO      |
| events_waits_current           | NO      |
| events_waits_history           | NO      |
| events_waits_history_long      | NO      |
| global_instrumentation         | YES     |
| thread_instrumentation         | YES     |
| statements_digest              | YES     |
+--------------------------------+---------+
12 rows in set (0.00 sec)

If you enable history table for any of the event type, mysql will keep history of performance data collecetd for that event type.
Instrument data will be collected for only those consumers which are enabled in above table.

While we are talking above setup_consumers table, I would like to explain about 3 consumers mentioned in above tables

global_instrumentation – is the highest level consumer. If global_instrumentation is NO, it disables all instrumentation. All other settings are lower level and are not checked; it does not matter what they are set to. If global_instrumentation is YES, the Performance Schema maintains information for global states and also checks the thread_instrumentation consumer.

thread_instrumentation – is checked only if global_instrumentation is YES. Otherwise, if thread_instrumentation is NO, it disables thread-specific instrumentation and all lower-level settings are ignored. No information is maintained per thread and no individual events are collected in the current-events or event-history tables. If thread_instrumentation is YES, the Performance Schema maintains thread-specific information and also checks events_xxx_current consumers.

statement_digest – This consumer requires global_instrumentation to be YES or it is not checked. There is no dependency on the Statement Event consumers. This means you can obtain statistics per digest without having to collect statistics in events_statements_current, which is advantageous in terms of overhead.

Following figure show instrumentation hierarchy:

instru

 

 

 

 

 

 

 

 

SETUP_TIMERS – This table defines that is the unit of values for all events that are collecting performance data. For some time of event it could be timing information, in that case it defines that unit of time that is used for those events. If for few events, unit should be number of instances occurring then this table should be updated to reflect the same.

By default it defines following unit of measure for each type of event

mysql> select * from setup_timers;
+-----------+-------------+
| NAME      | TIMER_NAME  |
+-----------+-------------+
| idle      | MICROSECOND |
| wait      | CYCLE       |
| stage     | NANOSECOND  |
| statement | NANOSECOND  |
+-----------+-------------+
4 rows in set (0.00 sec)

CYCLE refers to the timer that is based on the CPU (processor) cycle counter

2) Current Events tables

Once you update setup tables to set correct monitoring in place, MySQL will start collecting performance data for the events for which instrumentation is enabled.
Following are the current tables

mysql> show tables like '%current%';
+------------------------------------------+
| Tables_in_performance_schema (%current%) |
+------------------------------------------+
| events_stages_current                    |
| events_statements_current                |
| events_waits_current                     |
+------------------------------------------+
3 rows in set (0.00 sec)

 

Each current table corresponds to high level instrument type. As we have seen above, instrument name is a tree type hierarchy structure and top of the tree is high level instrument type.
If we want to find out high level instrument name, we can use following SQL

mysql> select distinct substr(name,1,instr(name,'/')) from setup_instruments where name != 'idle';
+--------------------------------+
| substr(name,1,instr(name,'/')) |
+--------------------------------+
| wait/                          |
| stage/                         |
| statement/                     |
+--------------------------------+
3 rows in set (0.08 sec)

 

Statement – indicates a sql statement (like SELECT) or command (like CONNECT)
Stages  – is the stage of statement processing, the , like SORTING RESULT
Wait – is an instrumented wait event, like WAIT I/O/FILE or WAIT/LOCK
Idle – is when a connection is idle, that is,  a socket is idle if it is waiting for a request from the client

So corresponding to each of above instrument type, we have current tables.
These tables are populated they are enabled in setup_consumers tables. If you check above the output of setup_consumers table, you will see entries corresponding to these tables.

mysql> select * from setup_consumers where name like '%current%';
+---------------------------+---------+
| NAME                      | ENABLED |
+---------------------------+---------+
| events_stages_current     | NO      |
| events_statements_current | YES     |
| events_waits_current      | NO      |
+---------------------------+---------+
3 rows in set (0.01 sec)

 

So by default, instrumentation for current statements are enabled by default

EVENTS_STAGES_CURRENT

This table contains current stage events, one row per thread showing the current status of the thread’s most recent monitored stage event.
Every statement passes through various stages and we can enable/disable instrumentation for every stage. If you have enabled instrumentation for a stage and if your statement is currently in that stage, we will see a record in this table.
Again, if query taking less time, you will hardly see anything in this table for that thread. Advice is to enable history table.

mysql> select * from setup_instruments where name like 'stage%';
+---------------------------------------------------------------------------------------+---------+-------+
| NAME                                                                                  | ENABLED | TIMED |
+---------------------------------------------------------------------------------------+---------+-------+
| stage/sql/After create                                                                | NO      | NO    |
| stage/sql/allocating local table                                                      | NO      | NO    |
| stage/sql/preparing for alter table                                                   | NO      | NO    |
| stage/sql/altering table                                                              | NO      | NO    |
| stage/sql/committing alter table to storage engine                                    | NO      | NO    |
| stage/sql/Changing master                                                             | NO      | NO    |
...
...
...

 

EVENTS_STATEMENTS_CURRENT

This table has data about current statement events, one row per thread showing the current status of the thread’s most recent monitored statement event.
Any user runs a statement, performance data about that statement gets recorded in this table.
Please note that this table stores only 1 row per thread, so as soon as user runs another SQL in same connection, previous SQL will be gone.
That’s why its useful to enable history tables in setup_consumers.

This table has many important columns to store performance data like

TIMER_START, TIMER_END, TIMER_WAIT – Which defines how much time the SQL took
LOCK_TIME – defines how much time lock was held
ROWS_AFFECTED, ROWS_SENT, ROWS_EXAMINED – defines how much work was done
CREATED_TMP_DISK_TABLES, CREATED_TMP_TABLES – defines if temp tables where created and if they got spiled over on disk
SELECT_FULL_JOIN, SELECT_FULL_RANGE_JOIN, SELECT_RANGE, SELECT_RANGE_CHECK, SELECT_SCAN – defines access path

mysql> select * from performance_schema.events_statements_current where thread_id = 31 \G
*************************** 1. row ***************************
              THREAD_ID: 31
               EVENT_ID: 89
           END_EVENT_ID: 112
             EVENT_NAME: statement/sql/select
                 SOURCE: mysqld.cc:939
            TIMER_START: 351432249673000000
              TIMER_END: 351432249845000000
             TIMER_WAIT: 172000000
              LOCK_TIME: 73000000
               SQL_TEXT: select * from event
                 DIGEST: 4f58e5f352636c00d01edc576732852b
            DIGEST_TEXT: SELECT * FROM EVENT
         CURRENT_SCHEMA: deo
            OBJECT_TYPE: NULL
          OBJECT_SCHEMA: NULL
            OBJECT_NAME: NULL
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 20
          ROWS_EXAMINED: 20
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
 SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 1
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 1
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: NULL
     NESTING_EVENT_TYPE: NULL
1 row in set (0.00 sec)

EVENTS_WAITS_CURRENT

This table defines wait event information for current wait events, one row per thread showing the current status of the thread’s most recent monitored wait event.
If a thread has passed that wait event, it will remove that record from this table and insert another record about wait event that thread is experiencing right now.
Its bit difficult to use this view unless thread is experience waits currently and for longer time (enough to get hold of data from this table). Else its good idea to enable history tables.

mysql> select * from events_waits_current \G
*************************** 1. row ***************************
            THREAD_ID: 30
             EVENT_ID: 135
         END_EVENT_ID: 135
           EVENT_NAME: idle
               SOURCE: mysqld.cc:909
          TIMER_START: 351281970836000000
            TIMER_END: 351286937566000000
           TIMER_WAIT: 4966730000000
                SPINS: NULL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
           INDEX_NAME: NULL
          OBJECT_TYPE: NULL
OBJECT_INSTANCE_BEGIN: 0
     NESTING_EVENT_ID: NULL
   NESTING_EVENT_TYPE: NULL
            OPERATION: idle
      NUMBER_OF_BYTES: NULL
                FLAGS: NULL
1 row in set (0.00 sec)

You might see the same timing columns in this table as in events_statements_current table. But note that these timings column belongs to wait event. They represent how much was the wait time, whereas timings column in events_statements_current table represent how much was the time taken by the entire statement.

3) History events table

These tables as name says, stores historical information of the event that happened. As we have learned above that current table stores only current event that is happening for every thread. If the thread has passed that event, data in current table changes. So history tables are available in MySQL which will store history of passed event happened in a thread.
Corresponding to every current table, we have 2 history table _history and _history_long

mysql> show tables like '%history%';
+------------------------------------------+
| Tables_in_performance_schema (%history%) |
+------------------------------------------+
| events_stages_history                    |
| events_stages_history_long               |
| events_statements_history                |
| events_statements_history_long           |
| events_waits_history                     |
| events_waits_history_long                |
+------------------------------------------+
6 rows in set (0.00 sec)

_history table contains the most recent 10 records for every thread and _history_long table contains most recent 10,000 records for every thread.
Example if a thread is executing 20 SQL statements, events_statements_history will records latest 10 SQL statements executed by that thread, where as events_statements_history_long table will record all 20 sql statements. As new events are added to the table, older events are discarded if the table is full.  Events are not added to the table until they have ended.

Structure wise they are same as current table. They just store more data per thread.
You can enable history tables and history_long table from setup_consumers table

mysql> select * from setup_consumers where name like '%history%';
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
| events_stages_history          | NO      |
| events_stages_history_long     | NO      |
| events_statements_history      | NO      |
| events_statements_history_long | NO      |
| events_waits_history           | NO      |
| events_waits_history_long      | NO      |
+--------------------------------+---------+
6 rows in set (0.00 sec)

If you think that storing 10 records in history table and 10000 records in history_long table is less, you can increase how many records you want to store in history and history long for each of the instrument type table.
Following system level variables are available which can be set at start of instance which defines how many records to store in these tables

mysql> show variables like 'performance_schema_%history%';
+--------------------------------------------------------+-------+
| Variable_name                                          | Value |
+--------------------------------------------------------+-------+
| performance_schema_events_stages_history_long_size     | 10000 |
| performance_schema_events_stages_history_size          | 10    |
| performance_schema_events_statements_history_long_size | 10000 |
| performance_schema_events_statements_history_size      | 10    |
| performance_schema_events_waits_history_long_size      | 10000 |
| performance_schema_events_waits_history_size           | 10    |
+--------------------------------------------------------+-------+
6 rows in set (0.00 sec)

Note that changing these values will need mysql reboot to take affect and also increasing these values means that you will need more memory to store these values. If you increase number of records setting in system variables above, but there is no sufficient memory available to store those many records, MySQL will not be able to load history table.
You can check if this situation is happening by tracking some status variables in MySQL. The Performance Schema implements several status variables that provide information about instrumentation that could not be loaded or created due to memory constraints:

mysql> show status like 'performance%';
+-----------------------------------------------+-------+
| Variable_name                                 | Value |
+-----------------------------------------------+-------+
| Performance_schema_accounts_lost              | 0     |
| Performance_schema_cond_classes_lost          | 0     |
| Performance_schema_cond_instances_lost        | 0     |
| Performance_schema_digest_lost                | 0     |
| Performance_schema_file_classes_lost          | 0     |
| Performance_schema_file_handles_lost          | 0     |
| Performance_schema_file_instances_lost        | 0     |
| Performance_schema_hosts_lost                 | 0     |
| Performance_schema_locker_lost                | 0     |
| Performance_schema_mutex_classes_lost         | 0     |
| Performance_schema_mutex_instances_lost       | 0     |
| Performance_schema_rwlock_classes_lost        | 0     |
| Performance_schema_rwlock_instances_lost      | 0     |
| Performance_schema_session_connect_attrs_lost | 0     |
| Performance_schema_socket_classes_lost        | 0     |
| Performance_schema_socket_instances_lost      | 0     |
| Performance_schema_stage_classes_lost         | 0     |
| Performance_schema_statement_classes_lost     | 0     |
| Performance_schema_table_handles_lost         | 0     |
| Performance_schema_table_instances_lost       | 0     |
| Performance_schema_thread_classes_lost        | 0     |
| Performance_schema_thread_instances_lost      | 0     |
| Performance_schema_users_lost                 | 0     |
+-----------------------------------------------+-------+
23 rows in set (0.04 sec)

Each of the above status variables will get updated if corresponding performance schema table is not able to store records. We will see details about above variables in later section of this article.

 

4) Summary tables

Summary tables provide aggregated information for terminated events over time. The tables in this group summarize event data in different ways.
There are bunch of summary tables available in performance schema

 mysql> show tables like '%summary%';
+----------------------------------------------------+
| Tables_in_performance_schema (%summary%)           |
+----------------------------------------------------+
| events_stages_summary_by_account_by_event_name     |
| events_stages_summary_by_host_by_event_name        |
| events_stages_summary_by_thread_by_event_name      |
| events_stages_summary_by_user_by_event_name        |
| events_stages_summary_global_by_event_name         |
| events_statements_summary_by_account_by_event_name |
| events_statements_summary_by_digest                |
| events_statements_summary_by_host_by_event_name    |
| events_statements_summary_by_thread_by_event_name  |
| events_statements_summary_by_user_by_event_name    |
| events_statements_summary_global_by_event_name     |
| events_waits_summary_by_account_by_event_name      |
| events_waits_summary_by_host_by_event_name         |
| events_waits_summary_by_instance                   |
| events_waits_summary_by_thread_by_event_name       |
| events_waits_summary_by_user_by_event_name         |
| events_waits_summary_global_by_event_name          |
| file_summary_by_event_name                         |
| file_summary_by_instance                           |
| objects_summary_global_by_type                     |
| socket_summary_by_event_name                       |
| socket_summary_by_instance                         |
| table_io_waits_summary_by_index_usage              |
| table_io_waits_summary_by_table                    |
| table_lock_waits_summary_by_table                  |
+----------------------------------------------------+
25 rows in set (0.00 sec)

Each summary table has grouping columns that determine how to group the data to be aggregated, and summary columns that contain the aggregated values
I will give 1 liner for few tables to understand how data is stored in these tables.

Event Wait Summaries:

  •     events_waits_summary_global_by_event_name: Wait events summarized per event name
  •     events_waits_summary_by_instance: Wait events summarized per instance
  •     events_waits_summary_by_thread_by_event_name: Wait events summarized per thread and event name

Stage Summaries:

  •     events_stages_summary_by_thread_by_event_name: Stage waits summarized per thread and event name
  •     events_stages_summary_global_by_event_name: Stage waits summarized per event name

Statement Summaries:

  •     events_statements_summary_by_thread_by_event_name: Statement events summarized per thread and event name
  •     events_statements_summary_global_by_event_name: Statement events summarized per event name

4.1 Digest summary Table

In above summary table, you see a table – events_statements_summary_by_digest
So what exactly is digest table

As of MySQL 5.6.5, the Performance Schema maintains statement digest information. Digesting converts a SQL statement to normalized form and computes a hash value for the result.
Normalization permits statements that are similar to be grouped and summarized to expose information about the types of statements the server is executing and how often they occur.

We can enable digest summary by enabling the same in setup_consumers table

The statement event tables (events_statements_current, events_statements_history, and events_statements_history_long) have DIGEST and DIGEST_TEXT columns that contain digest MD5 values and the corresponding normalized statement text strings.

Normalizing a statement transforms the statement text to a more standardized string representation that preserves the general statement structure while removing information not essential to the structure.Object identifiers such as database and table names are preserved. Values and comments are removed, and whitespace is adjusted.

Consider these statements:

SELECT * FROM orders WHERE customer_id=10 AND quantity>20
SELECT * FROM orders WHERE customer_id = 20 AND quantity > 100

To normalize these statements, the Performance Schema replaces data values by ? and adjusts whitespace. Both statements yield the same normalized form and thus are considered “the same”:

SELECT * FROM orders WHERE customer_id = ? AND quantity > ?

The normalized statement contains less information but is still representative of the original statement. Other similar statements that have different comparison values have the same normalized form.

5) Object Instances tables

Instance tables records synchronization object instances. There are three types of synchronization objects: cond, mutex, and rwlock.

cond – represents condition. condition is a synchronization mechanism used in the code to signal that a specific event has happened, so that a thread waiting for this condition can resume work
mutex – mutex is a synchronization mechanism used in the code to enforce that only one thread at a given time can have access to some common resource. The resource is said to be “protected” by the mutex.
rwlock – synchronization mechanism used in the code to enforce that threads at a given time can have access to some common resource following certain rules. The resource is said to be “protected” by the rwlock. The access is either shared (many threads can have a read lock at the same time) or exclusive (only one thread can have a write lock at a given time).

Corresponding to this we have following 5 instances table

mysql> select TABLE_NAME from information_schema.tables where TABLE_NAME like '%instance%' and table_name not like '%summary%';
+------------------+
| TABLE_NAME       |
+------------------+
| cond_instances   |
| file_instances   |
| mutex_instances  |
| rwlock_instances |
| socket_instances |
+------------------+
5 rows in set (0.03 sec)

cond_instances – table lists all the conditions seen by the Performance Schema while the server executes
file_instances – table lists all the files seen by the Performance Schema when executing file I/O instrumentation. If a file on disk has never been opened, it will not be in file_instances. When a file is deleted from the disk, it is also removed from the file_instances table.
mutex_instances – table lists all the mutexes seen by the Performance Schema while the server executes.
rwlock_instances – table lists all the rwlock instances (read write locks) seen by the Performance Schema while the server executes.
socket_instances – table provides a real-time snapshot of the active connections to the MySQL server.

You can enable this instrumentation using setup_instruments table. Every instance tables mentioned above will get an entry if those events are enabled and executed.

setup_instruments table has many instruments as explained above. Enabling these instruments will populate specific instances table

 

+------------------------------------------------------------+---------+-------+
| NAME                                                       | ENABLED | TIMED |
+------------------------------------------------------------+---------+-------+
| wait/synch/mutex/sql/PAGE::lock                            | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_sync                | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_active              | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_pool                | NO      | NO    |
| wait/synch/mutex/sql/LOCK_des_key_file                     | NO      | NO    |
| wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_commit            | NO      | NO    |
...
| wait/synch/rwlock/sql/Binlog_transmit_delegate::lock       | NO      | NO    |
| wait/synch/rwlock/sql/Binlog_relay_IO_delegate::lock       | NO      | NO    |
| wait/synch/rwlock/sql/LOCK_grant                           | NO      | NO    |
| wait/synch/rwlock/sql/LOGGER::LOCK_logger                  | NO      | NO    |
| wait/synch/rwlock/sql/LOCK_sys_init_connect                | NO      | NO    |
| wait/synch/rwlock/sql/LOCK_sys_init_slave                  | NO      | NO    |
...
| wait/synch/cond/sql/TC_LOG_MMAP::COND_active               | NO      | NO    |
| wait/synch/cond/sql/TC_LOG_MMAP::COND_pool                 | NO      | NO    |
| wait/synch/cond/sql/MYSQL_BIN_LOG::COND_done               | NO      | NO    |
| wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond             | NO      | NO    |
| wait/synch/cond/sql/MYSQL_BIN_LOG::prep_xids_cond          | NO      | NO    |
| wait/synch/cond/sql/MYSQL_RELAY_LOG::COND_done             | NO      | NO    |
...
| wait/io/file/sql/map                                       | YES     | YES   |
| wait/io/file/sql/binlog                                    | YES     | YES   |
| wait/io/file/sql/binlog_index                              | YES     | YES   |
| wait/io/file/sql/relaylog                                  | YES     | YES   |
| wait/io/file/sql/relaylog_index                            | YES     | YES   |
| wait/io/file/sql/casetest                                  | YES     | YES   |
...
| wait/io/socket/sql/server_tcpip_socket                     | NO      | NO    |
| wait/io/socket/sql/server_unix_socket                      | NO      | NO    |
| wait/io/socket/sql/client_connection                       | NO      | NO    |
+------------------------------------------------------------+---------+-------+

Lets take a example of mutex_instance table and understand how it works and how these tables play a role.

The mutex_instances table lists all the mutexes seen by the Performance Schema while the server executes. A mutex is a synchronization mechanism used in the code to enforce that only one thread at a given time can have access to some common resource. The resource is said to be “protected” by the mutex.

When two threads executing in the server (for example, two user sessions executing a query simultaneously) do need to access the same resource (a file, a buffer, or some piece of data), these two threads will compete against each other, so that the first query to obtain a lock on the mutex will cause the other query to wait until the first is done and unlocks the mutex.

For every mutex instrumented in the code, the Performance Schema provides the following information.

  •     The setup_instruments table lists the name of the instrumentation point, with the prefix wait/synch/mutex/.
  •     When some code creates a mutex, a row is added to the mutex_instances table. The OBJECT_INSTANCE_BEGIN column is a property that uniquely identifies the mutex.
  •     When a thread attempts to lock a mutex, the events_waits_current table shows a row for that thread, indicating that it is waiting on a mutex (in the EVENT_NAME column), and indicating which mutex is waited on (in the OBJECT_INSTANCE_BEGIN column).
  •     When a thread succeeds in locking a mutex:
    • events_waits_current shows that the wait on the mutex is completed (in the TIMER_END and TIMER_WAIT columns)
    • The completed wait event is added to the events_waits_history and events_waits_history_long tables
    • mutex_instances shows that the mutex is now owned by the thread (in the THREAD_ID column).
  •     When a thread unlocks a mutex, mutex_instances shows that the mutex now has no owner (the THREAD_ID column is NULL).
  •     When a mutex object is destroyed, the corresponding row is removed from mutex_instances.

By performing queries on both of the following tables, a monitoring application or a DBA can detect bottlenecks or deadlocks between threads that involve mutexes:

  •     events_waits_current, to see what mutex a thread is waiting for
  •     mutex_instances, to see which other thread currently owns a mutex

 

6) Other remaining tables

Apart from above 5 categories there are few other tables which records account level and thread level information.

account table – Connection statistics per client account. “Account” here means username + hostname combination


mysql> select * from accounts;
+------+-----------+---------------------+-------------------+
| USER | HOST      | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS |
+------+-----------+---------------------+-------------------+
| NULL | NULL      |                  20 |                26 |
| root | localhost |                   1 |                 7 |
+------+-----------+---------------------+-------------------+
2 rows in set (0.01 sec)

users table – Connection statistics per client user name


mysql> select * from users;
+------+---------------------+-------------------+
| USER | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS |
+------+---------------------+-------------------+
| NULL |                  20 |                26 |
| root |                   1 |                 7 |
+------+---------------------+-------------------+
2 rows in set (0.01 sec)

hosts table – Connection statistics per client host name


mysql> select * from hosts;
+-----------+---------------------+-------------------+
| HOST      | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS |
+-----------+---------------------+-------------------+
| NULL      |                  20 |                26 |
| localhost |                   1 |                 7 |
+-----------+---------------------+-------------------+
2 rows in set (0.00 sec)

host_cache – Information from the internal host cache.

performance_timers – Shows which event timers are available

This table shows available times in your mysql server. If a timer is not listed in this table it cannot be used. value of timers for various event types set in setup_timers should be from performance_timers table


mysql> select * from performance_timers;
+-------------+-----------------+------------------+----------------+
| TIMER_NAME  | TIMER_FREQUENCY | TIMER_RESOLUTION | TIMER_OVERHEAD |
+-------------+-----------------+------------------+----------------+
| CYCLE       |      2932537313 |                1 |             22 |
| NANOSECOND  |      1000000000 |             1000 |            264 |
| MICROSECOND |         1000000 |                1 |            116 |
| MILLISECOND |            1037 |                1 |            135 |
| TICK        |             101 |                1 |            254 |
+-------------+-----------------+------------------+----------------+
5 rows in set (0.00 sec)

threads – table contains a row for each server thread. Each row contains information about a thread and indicates whether monitoring is enabled for it

Performance Schema status variables

We have following performance_schema status variables


mysql> show status like '%performance%';
+-----------------------------------------------+-------+
| Variable_name                                 | Value |
+-----------------------------------------------+-------+
| Performance_schema_accounts_lost              | 0     |
| Performance_schema_cond_classes_lost          | 0     |
| Performance_schema_cond_instances_lost        | 0     |
| Performance_schema_digest_lost                | 0     |
| Performance_schema_file_classes_lost          | 0     |
| Performance_schema_file_handles_lost          | 0     |
| Performance_schema_file_instances_lost        | 0     |
| Performance_schema_hosts_lost                 | 0     |
| Performance_schema_locker_lost                | 0     |
| Performance_schema_mutex_classes_lost         | 0     |
| Performance_schema_mutex_instances_lost       | 0     |
| Performance_schema_rwlock_classes_lost        | 0     |
| Performance_schema_rwlock_instances_lost      | 0     |
| Performance_schema_session_connect_attrs_lost | 0     |
| Performance_schema_socket_classes_lost        | 0     |
| Performance_schema_socket_instances_lost      | 0     |
| Performance_schema_stage_classes_lost         | 0     |
| Performance_schema_statement_classes_lost     | 0     |
| Performance_schema_table_handles_lost         | 0     |
| Performance_schema_table_instances_lost       | 0     |
| Performance_schema_thread_classes_lost        | 0     |
| Performance_schema_thread_instances_lost      | 0     |
| Performance_schema_users_lost                 | 0     |
+-----------------------------------------------+-------+
23 rows in set (0.00 sec)

If an instrumentation could not be loaded because of memory constraints, corresponding status variable will get updated.
Non-zero values of these variables does not mean there is any issue with end user operation. Queries and DMLs will get executed as expected. Non-Zero values of these variables only mean corresponding instrumentation is not getting recorded because of less memory.

Names for these variables have several forms:

Performance_schema_accounts_lost – The number of times a row could not be added to the accounts table because it was full.
Performance_schema_xxx_classes_lost – How many instruments of type xxx could not be loaded.
Performance_schema_hosts_lost – The number of times a row could not be added to the hosts table because it was full.
Performance_schema_xxx_instances_lost – How many instances of object type xxx could not be created.
Performance_schema_xxx_handles_lost – How many instances of object type xxx could not be opened.
Performance_schema_locker_lost – How many events are “lost” or not recorded for various reason.
Performance_schema_session_connect_attrs_lost – The number of times a connection attribute string was larger than the reserved storage.
Performance_schema_users_lost – The number of times a row could not be added to the users table because it was full.

 

References:

http://manuales.guebs.com/mysql-5.6/performance-schema.html

http://dev.mysql.com/doc/refman/5.6/en/performance-schema-quick-start.html

http://mysqlintheenterprise.com/2013/03/21/a-visual-guide-to-the-mysql-performance-schema/

http://www.databasejournal.com/features/mysql/an-overview-of-the-mysql-performance-schema.html

 

Creating Oracle 12c Multitenant Container Database

Oracle has come up with new feature called multitenant database. Using this feature we can have many pluggable databases plugged into single container database.
This article explains pluggable databases architecture and how to create them.
I am not covering administration part. Viewers can check the documentation mentioned in reference section to check administrative part for managing pluggable databases.

Multitenant Architecture

One of the high-profile new features of Oracle 12c Enterprise Edition(EE) is the multitenant option – multitenant container database (CDB). This option allows the creation of many pluggable databases (PDBs) inside a CDB. The PDBs share resources provided by the CDB, such as memory, background processes, UNDO, REDO, and control files. This enables more databases to run on a single platform than was possible with the former Oracle 11gR2 architecture.
You can consider all databases till 11g as non-container databases or non-CDB. In oracle 12c we can still create non-CDB databases using normal “create database” command and administration wise there is no difference in managing those 12c databases compared to 11g database.

Oracle Multitenant architecture contains following:

One root container database
This looks like a normal database, but the purpose is different. This stores Oracle-supplied metadata and common users. It has all component that a normal database has – controlfile, online redo logs, SYSTEM and SYSAUX tablespace, Pfile and SPfile, archive logs and flasback logs etc. You can also create additional users, tablespace in this database.

One seed pluggable database
This is a seed databases which gets created when we create container database. This is nothing but set of datafiles in different location. This seed database can be used to create new pluggable databases.

Zero or more user-created PDBs
We can create multiple pluggable databases in single container databases. There are different methods of creating pluggable databases. Using above seed database to create pluggable database is just 1 method. We can also clone existing pluggable database from same container database or different container database. We can also make other non-CDB oracle 12c databases as pluggable database in our container database. You can refer to doc mentioned in reference section for more details on this.

Following figure show multitenant architecture:

cdbContainer database is same as normal database having all required component same as normal database.
Pluggable database has its own SYSTEM, SYSAUX and data tablespace. All other components like UNDO tablespace, online redo logs, controlfile, archive logs etc are common for container database and pluggable database.

When compared to RAC, I feel this architecture is exactly opposite. In RAC we have multiple instances and 1 database. In multitenant architecture we have single instance and multiple databases.

Creating Multitenant Container database

Creating container database is similar to creating non-container database but with few changes. Following changes should be made if we want to create container database

1) Define enable_pluggable_database parameter to true.
By default this parameter is false. Meaning that by default database will be created as non-multitenant database. If you want to create multitenant database, you have to make this parameter true.
Failing to make this parameter true will give you following error if you try to create multitenant database

ORA-65093: multitenant container database not set up properly

2) Change in “Create database” command
So if you are going to create multitenant database using “create database” command, you need to add couple of clause to this command.
Following clause needs to be added

  • ENABLE PLUGGABLE DATABASE
  • SEED

Following steps can be used to create multitenant database

Step 1) Create pfile and spfile

You can use following parameters as generic to create test database. You can alter the memory parameters and include/remove many parameters as required. But these are good enough to create test instance.

cat initdeocdb.ora

db_name='deocdb'
sga_max_size=4G
shared_pool_size=1200M
shared_pool_reserved_size=300M
java_pool_size = 200M
pga_aggregate_target=1G
processes = 300
audit_trail ='db'
db_block_size=8192
db_domain=''
db_recovery_file_dest='/u01/oradata/databases/flashback'
db_recovery_file_dest_size=50G
open_cursors=300
remote_login_passwordfile='EXCLUSIVE'
undo_tablespace='UNDO_T1'
control_files = (/ctl-01/databases/deocdb/control.ctl, /ctl-02/databases/deocdb/control.ctl)
compatible ='12.0.0'
audit_file_dest = /dumps-01/databases/deocdb/adump
core_dump_dest = /dumps-02/databases/deocdb/cdump
diagnostic_dest = /dumps-01
sec_case_sensitive_logon = FALSE
utl_file_dir = /dumps-01/databases/deocdb/output
enable_pluggable_database=true

create spfile from pfile;


***Its important to set java_pool_size in pfile/spfile. If you do not set java_pool_size, we will see following errors in alert log while creating container database

ORA-04031: unable to allocate 4096 bytes of shared memory (“java pool”,”/51835a0f_ReflectionFactoryGet”,”JOXLE^5cb1fb41″,”:SGAClass”)

Step 2) Create database command

I used following create database command to create container database. Highlighted lines are additional clauses for creating container database.


startup nomount;

CREATE DATABASE deocdb
USER SYS IDENTIFIED BY welcome
USER SYSTEM IDENTIFIED BY welcome
LOGFILE GROUP 1 ('/redo-01-a/databases/deocdb/redo-t01-g01-m1.log',
                 '/redo-03-a/databases/deocdb/redo-t01-g01-m2.log') SIZE 100M BLOCKSIZE 512,
        GROUP 2 ('/redo-02-a/databases/deocdb/redo-t01-g02-m1.log',
                 '/redo-04-a/databases/deocdb/redo-t01-g02-m2.log') SIZE 100M BLOCKSIZE 512,
        GROUP 3 ('/redo-01-a/databases/deocdb/redo-t01-g03-m1.log',
                 '/redo-03-a/databases/deocdb/redo-t01-g03-m2.log') SIZE 100M BLOCKSIZE 512,
        GROUP 4 ('/redo-02-a/databases/deocdb/redo-t01-g04-m1.log',
                 '/redo-04-a/databases/deocdb/redo-t01-g04-m2.log') SIZE 100M BLOCKSIZE 512
MAXLOGHISTORY 1
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 1024
CHARACTER SET UTF8
NATIONAL CHARACTER SET UTF8
EXTENT MANAGEMENT LOCAL
DATAFILE '/u01/oradata/databases/deocdb/system01.dbf'   
    SIZE 700M REUSE AUTOEXTEND ON NEXT 10240K MAXSIZE UNLIMITED
SYSAUX DATAFILE '/u01/oradata/databases/deocdb/sysaux01.dbf'   
    SIZE 550M REUSE AUTOEXTEND ON NEXT 10240K MAXSIZE UNLIMITED
DEFAULT TABLESPACE admin
DATAFILE '/u01/oradata/databases/deocdb/admin-01.dbf' 
    SIZE 500M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED  
DEFAULT TEMPORARY TABLESPACE TEMP 
TEMPFILE '/u01/oradata/databases/deocdb/temp01.dbf' 
    SIZE 20M REUSE AUTOEXTEND ON NEXT 640K MAXSIZE UNLIMITED
UNDO TABLESPACE undo_t1
DATAFILE '/u01/oradata/databases/deocdb/undo_t1-01.dbf'
    SIZE 200M REUSE AUTOEXTEND ON NEXT 5120K MAXSIZE UNLIMITED
ENABLE PLUGGABLE DATABASE
SEED
    FILE_NAME_CONVERT = ('/u01/oradata/databases/deocdb/', 
                         '/u01/oradata/databases/pdbseed/')
    SYSTEM DATAFILES SIZE 125M AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED
    SYSAUX DATAFILES SIZE 100M
USER_DATA TABLESPACE users
DATAFILE '/u01/oradata/databases/pdbseed/users-01.dbf'
    SIZE 200M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;

Step 3) run catcdb.sql

DO NOT RUN catlog.sql and catproc.sql
If you are creating container database, you need to run only catcdb.sql.
This script takes care of running all other dependent scripts internally.

If you run catlog.sql and catproc.sql, this script will fail later at many stages with following errors

catcon: ALL catcon-related output will be written to dbmsxdbt_catcon_2527.lst
catcon: See dbmsxdbt*.log files for output generated by scripts
catcon: See dbmsxdbt_*.lst files for spool files, if any
catconInit: database is not open on the default instance
Unexpected error encountered in catconInit; exiting

In case of normal database creation, we usually run catlog.sql and catproc.sql as mentioned in http://docs.oracle.com/database/121/ADMIN/create.htm#ADMIN11082
But that’s for normal database creation and process is same in 12c. But for creating container database, we should be running only catcdb.sql

catcdb.sql will ask for 3 inputs – new sys password, new system password and temp tablespace

SQL> @?/rdbms/admin/catcdb.sql

Session altered.

Enter new password for SYS:
Enter new password for SYSTEM:
Enter temporary tablespace name: TEMP

Once you provide this, it will create root container database with whatever database name you give. Container name for root container DB will be CDB$ROOT.
It will also create seed pluggable container database. Container name will be seed$pdb

Step 4) Run utlrp.sql to compile invalid objects.

There will be few invalid objects in database. You can run utlrp.sql to compile the same

Validation

You can run few commands to validate if you database is container databases and what all containers it has.

We have a new column in v$database called CDB. If your database is container database it will show as YES

SYS.DEOCDB.PRIMARY>select name, cdb from v$database;

NAME    CDB
--------- ---
DEOCDB    YES

1 row selected.

You can also check different containers in your database using following

SYS.DEOCDB.PRIMARY>select CON_ID, NAME, OPEN_MODE, TOTAL_SIZE from v$containers;

    CON_ID NAME         OPEN_MODE  TOTAL_SIZE
---------- ------------------------------ ---------- ----------
   1 CDB$ROOT       READ WRITE        0
   2 PDB$SEED       READ ONLY  1599078400

V$CONTAINER is the new view available in 12c. When this is run from root container, it shows all containers CDB as well as PDB. But if you run same command from any PDB, it will show that container.

SYS.DEOCDB.PRIMARY>alter session set container=PDB$SEED;

Session altered.

SYS.DEOCDB.PRIMARY>select CON_ID, NAME, OPEN_MODE, TOTAL_SIZE from v$containers;

    CON_ID NAME         OPEN_MODE  TOTAL_SIZE
---------- ------------------------------ ---------- ----------
   2 PDB$SEED       READ ONLY  1599078400

Creating Pluggable database

Following command can be used to create pluggable database from seed pluggable database PDB$SEED

SYS.DEOCDB.PRIMARY>create pluggable database deopdb1 admin user deodba identified by welcome FILE_NAME_CONVERT=('/u01/oradata/databases/pdbseed/','/u01/oradata/databases/deopdb1/');

Pluggable database created.

Above command is in its basic form and all parameter/clauses mentioned are mandatory. You need to provide FILE_NAME_CONVERT so that files from seed database will be copied to location you provided for your pluggable database. There are many parameters to this command and you can explore them in documentation link provided in reference.

Initially when you create pluggable database it will be in MOUNTED state. This is because pluggable database use same memory structure and control file as root container database. So memory structure and control file is already allocated.

SYS.DEOCDB.PRIMARY>select CON_ID, NAME, OPEN_MODE, TOTAL_SIZE from v$containers;

    CON_ID NAME         OPEN_MODE  TOTAL_SIZE
---------- ------------------------------ ---------- ----------
   1 CDB$ROOT       READ WRITE        0
   2 PDB$SEED       READ ONLY  1599078400
   3 DEOPDB1        MOUNTED       0

You need to just open the database.
You can open pluggable database from root container when you logged in using following command

SYS.DEOCDB.PRIMARY>show con_name;

NAME_COL_PLUS_PDB_CONTAINER
-------------------------------------
CDB$ROOT

SYS.DEOCDB.PRIMARY>alter pluggable database deopdb1 open;

Pluggable Database opened.

Or you can change container name and simply start

SYS.DEOCDB.PRIMARY>alter session set container=DEOPDB1;

Session altered.

SYS.DEOCDB.PRIMARY>show con_name;

NAME_COL_PLUS_PDB_CONTAINER
-------------------------------------
DEOPDB1

SYS.DEOCDB.PRIMARY>startup
Pluggable Database opened.

You can shutdown pluggable database using similar commands

if you are in root container, you can use following commands

SYS.DEOCDB.PRIMARY>show con_name;

NAME_COL_PLUS_PDB_CONTAINER
-------------------------------------
CDB$ROOT
SYS.DEOCDB.PRIMARY>alter pluggable database deopdb1 close;

Pluggable database altered.

Else you can change container and shutdown the database.

SYS.DEOCDB.PRIMARY>alter session set container=DEOPDB1;

Session altered.

SYS.DEOCDB.PRIMARY>show con_name;

NAME_COL_PLUS_PDB_CONTAINER
--------------------------------------
DEOPDB1
SYS.DEOCDB.PRIMARY>
SYS.DEOCDB.PRIMARY>
SYS.DEOCDB.PRIMARY>shut immediate;
Pluggable Database closed.

Reference

http://docs.oracle.com/database/121/ADMIN/cdb_intro.htm#ADMIN13508
http://docs.oracle.com/database/121/ADMIN/cdb_plug.htm#ADMIN13556
http://docs.oracle.com/database/121/ADMIN/cdb_create.htm#ADMIN13521
https://docs.oracle.com/database/121/CNCPT/cdbovrvw.htm#CNCPT89234

Hope this helps !!