Pointing MySQL Read-Replica to new master after failover

Here I am describing a simple scenario where we have a read replica (other than slave) which is currently pointing to master mysql instance.
We have another slave which is also pointing to same master. What we are going to see is, how to point read replica to new master after failover.

Environment:

To make you familier with environment, we have 3 instances of mysql

mysql_a -> Current master
mysql_b -> Current slave
mysql_c -> read replica pointing to mysql_a

Technically read replica is same as slave, but in this context we are saying read replica to make you understand that this is additional slave which does not take part in failover scenario.
So excercise is to failover master from mysql_a to mysql_b and point read replica mysql_c to mysql_b

I am using GTID on all 3 instances and both slaves are in sync with master.

On Master (mysql_a):

root [mysql] >show master status \G
*************************** 1. row ***************************
             File: bin_log.000007
         Position: 612
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-91
1 row in set (0.00 sec)

On Slave (mysql_b):

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
               ...
               ...
              Master_Log_File: bin_log.000007
          Read_Master_Log_Pos: 612
               Relay_Log_File: relay_log.000010
                Relay_Log_Pos: 818
        Relay_Master_Log_File: bin_log.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
            ...
            ...
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-91
            Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-91
                Auto_Position: 1

On read-replica (mysql_c):

root [mysql] >show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
               ...
               ...
              Master_Log_File: bin_log.000007
          Read_Master_Log_Pos: 612
               Relay_Log_File: relay_log.000005
                Relay_Log_Pos: 442
        Relay_Master_Log_File: bin_log.000007
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
            ...
            ...
           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:91
            Executed_Gtid_Set: 6e68fbf2-f9a0-11e4-a3f7-90e2ba6ebdb0:1-7,
f7718b80-c237-11e4-baa8-a0369f370a52:1-91
                Auto_Position: 1

Lagging read-replica:

Before I failover, I want to make read replica lag by couple of transaction so that when we recover and point replica to new master we can clearly see transactions are getting applied from old and new master.
I will stop IO slave on read replica (mysql_c) so that it will stop reading binlog from master (mysql_a)

root [mysql] >stop slave IO_thread;
Query OK, 0 rows affected (0.03 sec)

root [mysql] >show slave status \G
*************************** 1. row ***************************
               Slave_IO_State:
               ...
               ...
             Slave_IO_Running: No
            Slave_SQL_Running: Yes

So any change to master will not be replicated to read replica (mysql_c)

Now, lets change few records in master and move forward its GTID

On Master (mysql_a):

root [deo] >show master status \G
*************************** 1. row ***************************
             File: bin_log.000007
         Position: 1220
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-93
1 row in set (0.00 sec)

I did couple of transactions on master (mysql_a) and increased GTID from 1-91 to 1-93.

I see that those trasactions are already applied to slave (mysql_b)

On Slave (mysql_b):

           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-93
            Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-93
                Auto_Position: 1
1 row in set (0.00 sec)

But we didnt get those transaction on read replica (mysql_c) as IO thread is down

On read replica (mysql_c):

           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:91
            Executed_Gtid_Set: 6e68fbf2-f9a0-11e4-a3f7-90e2ba6ebdb0:1-7,
f7718b80-c237-11e4-baa8-a0369f370a52:1-91
                Auto_Position: 1

Failover:

Now, we will failover master to slave (mysql_b)

Failover is simple process. We just have to stop current master and make sure slave is not read only. Plus any other custom configuration that you have done on master, you can need to do the same on slave.

So after failover to mysql_b, my new master looks like below

mysql> show master status \G
*************************** 1. row ***************************
             File: bin_log.000003
         Position: 2495
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-93
1 row in set (0.00 sec)

we can also do reset master, that will flush binlog file and reset GTID to 1. But lets not do that as that is not the objective of our excercise.

Pointing read replica to new master:

Now, we need to point our read replica (which is still pointing to old master (mysql_a)) to new master (mysql_b)
To do that we need to recover the transactions that were missing from old master and continue receving transactions from new master.

When we do new transactions on new master, we get different GTID.
Example, I did 1 transaction after failover on new master and my new master status looks like below

mysql> show master status \G
*************************** 1. row ***************************
             File: bin_log.000003
         Position: 2802
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: c21f28d2-c243-11e4-baf5-2c600c20dba4:1,
f7718b80-c237-11e4-baa8-a0369f370a52:1-93
1 row in set (0.00 sec)
c21f28d2-c243-11e4-baf5-2c600c20dba4:1 is the new GTID

If we just start slave on read replica it will not be able to do fetch any transaction as old master is down. So we need to first point read replica to new master and then start slave which will fetch all old and new transactions

root [mysql] >stop slave;
Query OK, 0 rows affected (0.03 sec)

root [mysql] >change master to master_host='mysql_b.example.com', master_port=3306, master_user='replicate', MASTER_PASSWORD = 'welcome', MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)

root [mysql] >start slave;
Query OK, 0 rows affected (0.03 sec)

root [mysql] >show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: mysql_b.example.com
                  ...
                  ...
           Retrieved_Gtid_Set: c21f28d2-c243-11e4-baf5-2c600c20dba4:1,
f7718b80-c237-11e4-baa8-a0369f370a52:92-93
            Executed_Gtid_Set: 6e68fbf2-f9a0-11e4-a3f7-90e2ba6ebdb0:1-7,
c21f28d2-c243-11e4-baf5-2c600c20dba4:1,
f7718b80-c237-11e4-baa8-a0369f370a52:1-93
                Auto_Position: 1

In above output of slave status, we can check Retrieved_Gtid_Set and we can clearly see it has retrieved previous master’s 2 transactions f7718b80-c237-11e4-baa8-a0369f370a52:92-93 as well as new transaction that I did after failover – c21f28d2-c243-11e4-baf5-2c600c20dba4:1
In Executed_Gtid_Set we can see it has executed all these transactions – c21f28d2-c243-11e4-baf5-2c600c20dba4:1 and f7718b80-c237-11e4-baa8-a0369f370a52:1-93

Hope this helps !!

Apache Cassandra – NoSQL storage solution

These days I am exploring another storage solution – Cassandra.

Apache Cassandra datastore was originally developed by Facebook as open source NoSQL data storage system. Its actually based on Amazon’s dynamoDB database. Apache Cassandra is an open source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients.

Datastax Technology has created enterprise edition of Cassandra which is built on Apache Cassandra. Today we have multiple flavors of Cassandra available from Apache as well as datastax.

Cassandra is a NoSQL database storage solution and it stores the data using simple key-value pairs. Along with enterprise software, datastax also provide huge documentation for learning Cassandra. They also provide self-paced training and instructor led training for learning Cassandra.

I have started learning Cassandra using self-paced training available at following location – https://academy.datastax.com/courses

Apart from that, datastax also has very active blog where they discuss different issues and features available in Cassandra – http://www.datastax.com/dev/blog/

Installation:

You can either go with full installation of Cassandra on multiple physical nodes and creating a cluster or you can simulate a cluster on single node using CCM (Cassandra Cluster Manager).

Going for official Cassandra software on multiple physical nodes might not be feasible for everyone. Thats why CCM is the best utility to learn Cassandra.

You can find instruction to install CCM at following location – http://www.datastax.com/dev/blog/ccm-a-development-tool-for-creating-local-cassandra-clusters

Valid Versions:

At the time of this writing, most stable version of Apache Cassandra is 2.1.14. Latest version of Apache Cassandra released is 2.1.15. You have older version like 2.0.9 which was also stable.

You can get complete list of Apache Cassandra at – http://archive.apache.org/dist/cassandra/

You can check datastax community versions of Cassandra at http://planetcassandra.org/cassandra/

Community version is for learning and is free to download, install and play around.

CCM Installation Issue:

I faced following issue when I installed CCM on my ubuntu 12.04 machine.

ccm create --version=2.0.9 --nodes=6 deo
Downloading http://archive.apache.org/dist/cassandra/2.0.9/apache-cassandra-2.0.9-src.tar.gz to /tmp/ccm-2oKzAH.tar.gz (10.810MB)
  11335077  [100.00%]
Extracting /tmp/ccm-2oKzAH.tar.gz as version 2.0.9 ...
Compiling Cassandra 2.0.9 ...
Deleted /home/local/advaitd/.ccm/repository/2.0.9 due to error
Traceback (most recent call last):
  File "/usr/local/bin/ccm", line 5, in <module>
    pkg_resources.run_script('ccm==2.0.3.1', 'ccm')
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 499, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1235, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/ccm-2.0.3.1-py2.7.egg/EGG-INFO/scripts/ccm", line 72, in <module>
    cmd.run()
  File "/usr/local/lib/python2.7/dist-packages/ccm-2.0.3.1-py2.7.egg/ccmlib/cmds/cluster_cmds.py", line 127, in run
    cluster = Cluster(self.path, self.name, install_dir=self.options.install_dir, version=self.options.version, verbose=True)
  File "/usr/local/lib/python2.7/dist-packages/ccm-2.0.3.1-py2.7.egg/ccmlib/cluster.py", line 51, in __init__
    dir, v = self.load_from_repository(version, verbose)
  File "/usr/local/lib/python2.7/dist-packages/ccm-2.0.3.1-py2.7.egg/ccmlib/cluster.py", line 64, in load_from_repository
    return repository.setup(version, verbose)
  File "/usr/local/lib/python2.7/dist-packages/ccm-2.0.3.1-py2.7.egg/ccmlib/repository.py", line 40, in setup
    download_version(version, verbose=verbose, binary=binary)
  File "/usr/local/lib/python2.7/dist-packages/ccm-2.0.3.1-py2.7.egg/ccmlib/repository.py", line 221, in download_version
    raise e
ccmlib.common.CCMError: Error compiling Cassandra. See /home/local/advaitd/.ccm/repository/last.log for details

 

I posted the same error on github community and immediately got a solution – https://github.com/pcmanus/ccm/issues/268

Suggested me to use binary version of Cassandra for download -v binary:2.0.9. Cluster creation was successful after using binary version.

You can create as many nodes cluster as you want to. All it does is, it creates those many directories and treat them as separate nodes.

I create six node cluster on my ubuntu machine.

advaitd@desktop:~$ ccm status
Cluster: 'deo'
--------------
node1: UP
node3: UP
node2: UP
node5: UP
node4: UP
node6: UP

CCM Installation details:

CCM creates hidden directory under your home directory and a separate installation directory for each node under that hidden directory as shown below.

advaitd@desktop:~$ pwd
/home/local/advaitd
advaitd@desktop:~$ ls -rlt .ccm
total 12
drwxr-xr-x 3 advaitd domain^users 4096 Apr 23 12:15 repository
-rw-r--r-- 1 advaitd domain^users    4 Apr 23 12:15 CURRENT
drwxr-xr-x 8 advaitd domain^users 4096 Apr 23 12:15 deo
advaitd@desktop:~$ ls -rlt .ccm/deo/
total 28
-rw-r--r-- 1 advaitd domain^users  291 Apr 23 12:15 cluster.conf
drwxr-xr-x 8 advaitd domain^users 4096 Apr 23 12:16 node2
drwxr-xr-x 8 advaitd domain^users 4096 Apr 23 12:16 node1
drwxr-xr-x 8 advaitd domain^users 4096 Apr 23 12:16 node5
drwxr-xr-x 8 advaitd domain^users 4096 Apr 23 12:16 node3
drwxr-xr-x 8 advaitd domain^users 4096 Apr 23 12:16 node6
drwxr-xr-x 8 advaitd domain^users 4096 Apr 23 12:16 node4
advaitd@desktop:~$

So in each of the above node directory we have a Cassandra software installed. Each of above node directory is considered as separate node and cluster is created.

Cassandra binary is running from each of the node directory. So we should be seeing 6 cassandra processes running on that host as shown below.

advaitd@desktop:~$ ps -aef | grep cassandra | grep -v grep | wc -l
6

I will be doing more learning and posting articles on Cassandra as well.

References:

http://cassandra.apache.org/

http://www.datastax.com/

http://en.wikipedia.org/wiki/Apache_Cassandra

https://github.com/pcmanus/ccm/issues

http://www.datastax.com/blog

http://docs.datastax.com/en/index.html

Oracle Performance Tuning – ASH Basics

Recently I got an opportunity to dig into performance of database. We were not facing any performance issue, but we wanted to understand whats happening under the hood. What are the bottlenecks and how do we prepare and scale our database further.

In future the load on the databases are going to be crazy and we want to make sure that we look into every consumer of our database and optimize the usage.

I went back into the basics of database performance tuning – ASH. Active session history is the beast which has loads of information and provides information at very granular level – “session”. We can dig into ASH data and generate different reports rolled-up against various entities like module, action, sql_id, user_id etc. We can determine how much each of these entities are using database resources.

DB Time:

One of the most fundamental measure of database performance is DB Time. This is the amount of time a session has spend on database activities. This is a combination of CPU Time + non-Idle wait times. This does not consider any idle wait time so even if session is done with its activity but just sitting idle doing nothing, that idle time is not considered part of DB Time.

This gives a very accurate measure of amount of database resources a session is burning.

Active Session History:

ASH or Active Session History provides snapshots of active sessions connected to database (status=ACTIVE in v$session) and loads of other information related to those sessions about what they were doing when snapshot was taken. To understand how we can extract this information and make sense about it, we need to understand how ASH is build and what that information means.

V$ACTIVE_SESSION_HISTORY

Active session history is exposed by database using a view V$ACTIVE_SESSION_HISTORY. This view stores samples taken every 1 sec. A background process MMNL (Memory manager lite) takes a snapshot from v$session and v$session_wait about the current state of active sessions and dumps the information in V$ACTIVE_SESSION_HISTORY. Note that only active sessions’s snapshot is taken.

V$ACTIVE_SESSION_HISTORY is a memory view stored in the form of circular buffer. It means that this view has size limit on its growth (approx 2MB / CPU). So when it reaches to that size it starts overwriting the oldest records in circular buffer. You can visualize the same using following figure

 

ash-architecture-and-advanced-usage-rmoug2014-7-638

 

 

 

 

 

 

 

 

So how these samples are useful to us in finding DB Time ?

Lets say a session has spend 10 sec of active time in database. It means DB time for that session is 10 sec. During this 10 sec of active time, we will have 10 samples of that session taken as interval time between snapshot is 1 sec. So if we just take a count(*) from V$ACTIVE_SESSION_HISTORY, we will see 10 records for that session. This means that to get DB Time of a session  we just have to take count(*) from V$ACTIVE_SESSION_HISTORY for that session.

Its little difficult to adjust to the fact that time parameter can be obtained using count(*) from a view. But this is because interval time between samples is 1 sec.

So DB Time is actually – [ Number of samples ] X [ Time interval between samples ]

Since time interval between samples is 1 sec, we can get DB time as number of samples which is count(*) from V$ACTIVE_SESSION_HISTORY

Again, calculating DB Time using count(*) from ASH is approximation as sessions which are taking fraction of times < 1 sec will not be counted if they happens between 2 samples and DB time of other sessions which ends just before next samples is taken will be counted until last samples when they were active. But those were very few cases and amount of DB time not measure in such cases will be very less compared to total DB time we are measuring. You can visualize the same using following figure.

 

Aas_ashDBA_HIST_ACTIVE_SESS_HISTORY

DBA_HIST_ACTIVE_SESS_HISTORY is a history table which maintains history of active sessions between AWR snapshots. Since V$ACTIVE_SESSION_HISTORY is a memory view and has limited space for growth, data from V$ACTIVE_SESSION_HISTORY flows to DBA_HIST_ACTIVE_SESS_HISTORY table.

DBA_HIST_ACTIVE_SESS_HISTORY table enables us to view active session data from the past. Data in DBA_HIST_ACTIVE_SESS_HISTORY is retained based on retention level set for AWR snapshots (30 days by default). So every time AWR snapshot happens, data from V$ACTIVE_SESSION_HISTORY goes into this table.

But NOT all data from V$ACTIVE_SESSION_HISTORY goes into DBA_HIST_ACTIVE_SESS_HISTORY. When AWR snapshot happens, it picks only 1 out of 10 samples from V$ACTIVE_SESSION_HISTORY into history table.

Complete visualization of active sessions looks like below

ash-architecture-and-advanced-usage-rmoug2014-7-638

 

 

 

 

 

 

 

 

Since AWR snapshot interval is 1 hour, it will take AWR snapshots every 1 hour and during that time 1 out of 10 samples form V$ACTIVE_SESSION_HISTORY will move to DBA_HIST_ACTIVE_SESS_HISTORY table.

So if want to calculate DB Time from DBA_HIST_ACTIVE_SESS_HISTORY table, we need to consider sample time as 10 sec instead of 1 sec.

DB Time = [ Number of samples ] X [ Time interval between samples ]
        = count(1) * 10

That’s why when we are calculating DB Time of a session or SQL from DBA_HIST_ACTIVE_SESS_HISTORY table, we need to take count(*) * 10.

Calculating DB Time

With this brief understanding of ASH, we can use these tables to find out DB time for a session and consolidate sessions to calculate DB Time for module/user etc.

If we can list modules in terms of % of DB time they are consuming in database, modules with high % of DB time are the one using most resources in database and has high impact to database. Same holds true at user level as well. We can find out which user is contributing highest DB time.

I have given couple of queries to find out % of DB Time different modules are contributing in database. This gives a fair idea about module wise resource consumption in database.

Note that I have randomized module names using DBMS_RANDOM.string(‘L’,TRUNC(DBMS_RANDOM.value(10,21))) to avoid exposing actual module name outside.


set line 999
set pagesize 999
set verify off
col module format a30;
BREAK ON REPORT
COLUMN DUMMY HEADING ''
compute sum of PCT_CPU on REPORT
with total_cpu as
(select   count(1)
 from     V$ACTIVE_SESSION_HISTORY
 where    SESSION_TYPE = 'FOREGROUND')
 select   module,
          count(1) "module_cpu",
          (select * from total_cpu) "total_cpu",
          round((count(1)/(select * from total_cpu))*100,2) PCT_CPU
from      V$ACTIVE_SESSION_HISTORY
where     SESSION_TYPE = 'FOREGROUND'
group by module order by PCT_CPU desc;

MODULE                         module_cpu  total_cpu    PCT_CPU
------------------------------ ---------- ---------- ----------
vvadcagpas                        29355     187065      15.69
cbkcbqchts                        26506     187065      14.17
zugkvhbajxpff                     24655     187065      13.18
tvfqhrkqzktiiceolrfd              10046     187065       5.37
paczxkndgsaqt                      6740     187065        3.6
pzqacdgxeqzbtqhalfc                6220     187065       3.33
nvqqscmayx                         5905     187065       3.16
zbirttdyoxxbslopnm                 5412     187065       2.89
bbztiihehigeqwxvk                  5051     187065        2.7
kvosekxfhkmfw                      4687     187065       2.51
aenyopwmok                         3977     187065       2.13
brfqfuwdyhyrej                     3256     187065       1.74
abruxykxhvqgfg                     3227     187065       1.73
zfrahazoqgdxubpni                  2759     187065       1.47
ddxinnevqd                         2633     187065       1.41
lructcsjkxhxatfidpf                2518     187065       1.35
naawhsfqvcaoathn                   2246     187065        1.2
sljzltjhhigepqfmcji                2142     187065       1.15
chvjvntynkjb                       2102     187065       1.12
zlzhwwhzbhfdtsszfedj               2005     187065       1.07
yvkuwzjkjhyeamco                   1610     187065        .86
fdjwspubzyhyxyzzkgyi               1578     187065        .84
risowlcnkizrbpgt                   1557     187065        .83
ywwdycosdpcfsflaubia               1539     187065        .82
pxlqrkmvreppum                     1496     187065         .8
amivlnlivqeesnqdkto                1404     187065        .75
jjaqyosqsc                         1347     187065        .72
yxkgfgktafvx                       1354     187065        .72
gqistslkwdw                        1328     187065        .71
bhuzvdpklocmnyimtzux               1272     187065        .68
uimycabvbjcczibrn                  1053     187065        .56
ksnvqslyker                         997     187065        .53
ofekwpklwpjtha                      975     187065        .52
wmkmovueneno                        904     187065        .48
umdtggtvdptkubfs                    795     187065        .42
dddegpqpwyalotxtoh                  761     187065        .41
pwycwyhavccwjuufqjx                 733     187065        .39
pxvcbhpfmeuwen                      729     187065        .39
nhezqmgkqccdcog                     716     187065        .38
ixtljofuazhpp                       628     187065        .34
mbotmyoairyqbdg                     600     187065        .32
aocabnswozdg                        586     187065        .31
qtdbdbuffprxj                       586     187065        .31
suiwwwtolcnkbnkcfwxw                560     187065         .3
btbdbtaztym                         562     187065         .3
xmvdgeczrot                         567     187065         .3
gkwzuwlgsh                          538     187065        .29
yhgmyqgqkvzmvbpsi                   463     187065        .25
obahkcsiafagmkxxu                   457     187065        .24
tcjoiyhzpuutbfhegn                  448     187065        .24
brlrkszrpmtw                        426     187065        .23
spzvvwzvoh                          435     187065        .23
ckvvarzrtpedhjl                     334     187065        .18
mnamshpswmsif                       329     187065        .18
gwvwypqrpwxkbugms                   293     187065        .16
clkuqearqhguezagwjlc                302     187065        .16
fskwtuctalnxhhu                     306     187065        .16
dyljhijlwyfmjy                      277     187065        .15
eqdifihzda                          255     187065        .14
nkrvwqrldd                          229     187065        .12
jkevokwkwokbnkbx                    231     187065        .12
niatevdadex                         198     187065        .11
qfjiifhiubvstyp                     200     187065        .11
acquoqslwxedulsmekbu                180     187065         .1
hjfimltalvopvuum                    156     187065        .08
bmyyhtzdqebo                        152     187065        .08
tkehpofgfe                          157     187065        .08
qnhgoqgstvq                         131     187065        .07
ddlloytpmyhpzpmk                    125     187065        .07
bvborwlump                          133     187065        .07
juirsmdjuejot                       124     187065        .07
saeoojlzfepjfybpclls                129     187065        .07
ppfdzcozyhxmf                       135     187065        .07
xbrwqdopemjqrhebglzs                115     187065        .06
zrfdsejwxgitsorat                   119     187065        .06
vsmsefmxnkoeb                        97     187065        .05
dvhoxhzayqyhsny                      98     187065        .05
cwjpiesdhpxvjs                      101     187065        .05
bvffncgtiojaj                        98     187065        .05
flfuciblxfzdcbo                      87     187065        .05
asucxewzdruwyfmg                     66     187065        .04
dztczkivnnbuzyuh                     67     187065        .04
hslcljfjmrhduw                       73     187065        .04
lywascpugxxjidx                      74     187065        .04
cawgdisaagltmie                      73     187065        .04
dpqbuvfkef                           68     187065        .04
tjpaushwfftmm                        54     187065        .03
ulgktcqvsazktojskfo                  56     187065        .03
ckoayyqtcxybkukgqvgl                 49     187065        .03
aqmphwbrngbotpxehyhe                 49     187065        .03
asqllnbivusyu                        60     187065        .03
skvrrqoshnirxbud                     59     187065        .03
vobtqhfewje                          58     187065        .03
xurrifyontk                          34     187065        .02
rliqgkwdwao                          42     187065        .02
ziwftblonh                           36     187065        .02
wzybbgmxkx                           30     187065        .02
vhumxxhimauc                         16     187065        .01
vssaefphdkebdx                       26     187065        .01
vjbjocafrwgdv                        11     187065        .01
xgcwcxrlzrjvysanlijw                 14     187065        .01
ztzxyfykhnxozdn                      20     187065        .01
btlbvffybkyjsn                       12     187065        .01
pmmqhsjuqhdaf                        22     187065        .01
kaumdvjqfcprmaeygr                   11     187065        .01
kadliemxdxhtzleqjr                   28     187065        .01
hlebszxkejjgkfbpmjew                 16     187065        .01
atwvjasykmlmldbipswh                 16     187065        .01
                                                     ----------
sum                                                      99.95

Similarly to check DB Time for sessions in past, we need to use DBA_HIST_ACTIVE_SESS_HISTORY table. Following query will provide % DB time used by individual modules calculated as % of total DB time between 2 AWR snapshots

set line 999
set pagesize 999
set verify off
BREAK ON REPORT
COLUMN DUMMY HEADING ''
compute sum of PCT_CPU on REPORT
with total_cpu as
(select   count(1)*10
from      dba_hist_active_sess_history
where     snap_id >= &begin_snap_id
and       snap_id < &end_snap_id
and       SESSION_TYPE = 'FOREGROUND')
select    module,
          count(1)*10 "module_cpu",
          (select * from total_cpu) "total_cpu",
          round((count(1)*10/(select * from total_cpu))*100,2) PCT_CPU
from      dba_hist_active_sess_history
where     snap_id >= &&begin_snap_id
and       snap_id < &&end_snap_id
and       SESSION_TYPE = 'FOREGROUND'
group by module order by PCT_CPU desc

 

 

Note that we have used count(*) * 10 in above query because 1 of 10 samples go to this table. Also, you need to specify begin_snap_id and end_snap_id and module DB Time usage during those snapshots will be calculated.

CPU Time

If we want to ignore waits and only focus on pure CPU used by modules, we can list modules as per CPU time they consume as % of total CPU time of database.

I think DB time is still better metric than CPU time because DB time covers waits as well. Waits slow down database.

We can calculate CPU time of a session by doing the same count(*) from V$ACTIVE_SESSION_HISTORY but we need to count only those samples during which session was actually using CPU. An ACTIVE session could be either using CPU to do some processing or it could be actively waiting on something (example IO when doing db file sequential read). Active session using CPU is represented by session_state = ‘ON CPU’. So we will count only those samples for session where session_state was ON CPU

Following query will give CPU time used by each module as % of total CPU time using V$ACTIVE_SESSION_HISTORY


set line 999
set pagesize 999
set verify off
BREAK ON REPORT
COLUMN DUMMY HEADING ''
compute sum of PCT_CPU on REPORT
with total_cpu as
(select count(1)
from  V$ACTIVE_SESSION_HISTORY
where   SESSION_TYPE != 'BACKGROUND'
and session_state = 'ON CPU')
select  module,
count(1) "module_cpu",
(select * from total_cpu) "total_cpu",
round((count(1)/(select * from total_cpu))*100,2) PCT_CPU
from  V$ACTIVE_SESSION_HISTORY
where SESSION_TYPE != 'BACKGROUND'
and   session_state = 'ON CPU'
group by module order by PCT_CPU desc

We can find similar CPU time for past sessions using DBA_HIST_ACTIVE_SESS_HISTORY table. Following query will provide CPU time used by module as % of total CPU time between 2 AWR snapshots


set line 999
set pagesize 999
set verify off
BREAK ON REPORT
COLUMN DUMMY HEADING ''
compute sum of PCT_CPU on REPORT
with total_cpu as
(select count(1)*10
from  dba_hist_active_sess_history
where   snap_id >= &&begin_snap_id
and   snap_id < &&end_snap_id
and   SESSION_TYPE != 'BACKGROUND'
and session_state = 'ON CPU')
select  module,
count(1)*10 "module_cpu",
(select * from total_cpu) "total_cpu",
round((count(1)*10/(select * from total_cpu))*100,2) PCT_CPU
from  dba_hist_active_sess_history
where   snap_id >= &&begin_snap_id
and   snap_id < &&end_snap_id
and   SESSION_TYPE != 'BACKGROUND'
and   session_state = 'ON CPU'
group by module order by PCT_CPU desc

Note, that in above query from DBA_HIST_ACTIVE_SESS_HISTORY, we have used count(*) * 10 because only 1 of 10 samples goes into this table.

I think for now above information is sufficient to give a basic idea about ASH.

I will be writing more articles on performance tuning using ASH as I dig further.

Hope this helps !!

“show slave hosts” on master not reporting hostname

If you have multiple MySQL slaves or complex topology, its difficult to remember what slaves are replicating from which master.

We can go to slave servers and check “show slave status” which list master host, but if we want to find out other way, we have can run “show slave hosts” from master.
This will list down all the slaves that are getting replicated from this master.

Only problem I faced was, “show slave hosts” was not reporting hostnames. They were empty.

root [fleetdb] >show slave hosts;
+-----------+------+------+-----------+--------------------------------------+
| Server_id | Host | Port | Master_id | Slave_UUID                           |
+-----------+------+------+-----------+--------------------------------------+
|         3 |      | 3306 |         2 | c21f28d2-c243-11e4-baf5-2c600c20dba4 |
+-----------+------+------+-----------+--------------------------------------+
1 row in set (0.01 sec)

After bit of RTFM, I came to know we need to set a parameter – “report_host” while starting slave. This will register slave hostname while connecting to master (for replication) and same will be shown in “show slave hosts”

This is a read-only parameter meaning, we need to bounce mysql instance after setting this parameter in /etc/my.cnf

So I set the parameter in /etc/my.cnf as below

report_host=slave-host1.mydb.example.com

We can specify any name here and same will be registered in master. But it make sense to provide actual host name.

Once I bounce mysql instance, my slave is started and it connects to master.
Now I check “show slave hosts” and it shows this host.

root [fleetdb] >show slave hosts;
+-----------+------------------------------+------+-----------+--------------------------------------+
| Server_id | Host                         | Port | Master_id | Slave_UUID                           |
+-----------+------------------------------+------+-----------+--------------------------------------+
|         3 | slave-host1.mydb.example.com | 3306 |         2 | c21f28d2-c243-11e4-baf5-2c600c20dba4 |
+-----------+------------------------------+------+-----------+--------------------------------------+
1 row in set (0.00 sec)

So we can easily know what slaves are connected to this master.

Hope this helps !!

Restoring Slave when GTID is enabled on master

This is a quick post on issues I faced while restoring a slave when GTID was enabled on master.

I have master created few days back and now I am trying to create a slave. I have GTID enabled on master.

Master status:

root [mysql] >show master status \G
*************************** 1. row ***************************
             File: bin_log.000002
         Position: 16682
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-59
1 row in set (0.00 sec)
root [mysql] >show variables like '%gtid%';
+--------------------------+-----------+
| Variable_name            | Value     |
+--------------------------+-----------+
| enforce_gtid_consistency | ON        |
| gtid_executed            |           |
| gtid_mode                | ON        |
| gtid_next                | AUTOMATIC |
| gtid_owned               |           |
| gtid_purged              |           |
+--------------------------+-----------+
6 rows in set (0.00 sec)

Step 1) Take backup from master

mysqldump -u root -p --databases deo fctest nosql --set-gtid-purged=off > /u01/backups/mysql_backup.sql

I used above command to take backup of master.

Note that I used –set-gtid-purged=off. If GTID is enabled on master and we are creating a slave, we want slave to start reading from the point when this backup is created. Because all previous information will be restored and available on slave as part of this backup restore.

If you want slave to start reading from this point and not from the beginning you should NOT set this variable to off. I will explain the significance of this variable later in the post.

Step 2) Install and create MySQL instance

Refer to my previous post https://avdeo.com/2014/02/18/installing-mysql/ for the same

Step 3) SCP backup file to slave server

slave-host.example.com$ scp master-host:/u01/backups/mysql_backup.sql /u01/backups/mysql_backup.sql

Step 4) Import backup into slave

slave-host.example.com$ mysql -u root -p </u01/backups/mysql_backup.sql

Step 5) Start Slave

We have to point slave to our master and then start slave.

When GTID is enabled, you can use MASTER_AUTO_POSITION=1 and slave should automatically start applying from correct GTID

mysql> CHANGE MASTER TO MASTER_HOST = 'master-host.example.com', MASTER_PORT = 3306, MASTER_USER = 'replicate', MASTER_PASSWORD = 'welcome', MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.12 sec)
mysql> start slave;
Query OK, 0 rows affected (0.03 sec)

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master-host.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin_log.000002
          Read_Master_Log_Pos: 16682
               Relay_Log_File: relay_log.000002
                Relay_Log_Pos: 547
        Relay_Master_Log_File: bin_log.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1133
                   Last_Error: Error 'Can't find any matching row in the user table' on query. Default database: 'mysql'. Query: 'SET PASSWORD FOR 'root'@'master-host.example.com'='*DF216F57F1F2066124E1AA5491D995C3CB57E4C2''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 341
              Relay_Log_Space: 17086
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1133
               Last_SQL_Error: Error 'Can't find any matching row in the user table' on query. Default database: 'mysql'. Query: 'SET PASSWORD FOR 'root'@'master-host.example.com'='*DF216F57F1F2066124E1AA5491D995C3CB57E4C2''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
                  Master_UUID: f7718b80-c237-11e4-baa8-a0369f370a52
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 150304 08:03:47
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-59
            Executed_Gtid_Set: c21f28d2-c243-11e4-baf5-2c600c20dba4:1-59,
f7718b80-c237-11e4-baa8-a0369f370a52:1
                Auto_Position: 1
1 row in set (0.00 sec)

Now, if we see above slave status output, its showing error in SQL thread. SQL that is failing is Query: ‘SET PASSWORD FOR ‘root’@’master-host.example.com’=’*DF216F57F1F2066124E1AA5491D995C3CB57E4C2”

This is the first DML that was run on master. Its failing because master-host.example.com entry is not present in mysql.user table.  This is because its a slave on slave host and so root user entry will be present corresponding to slave host only.

Never the less, it should not be running DMLs from the beginning because we have restored everything until GTID 1-59.

Significance of –set-gtid-purged=off

The reason its applying transactions from the beginning is because we used –set-gtid-purged=off while taking backup.

When we are using GTID, we need to know in which GTID is the master and set it on the slave. MySQL keeps two global variables with GTID numbers on it:

  • gtid_executed: it contains a representation of the set of all transaction logged in the binary log
  • gtid_purged: it contains a representation of the set of all transactions deleted from the binary log

So now, the process is the following:

  • take a backup from the master and store the value of gtid_executed
  • restore the backup on the slave and set gtid_purged with the value of gtid_executed from the master

If we don’t use –set-gtid-purged=off in mysqldump, these global variables are automatically set in dump file created by mysqldump. So that when we apply that dump file on slave it will set these variables and slave will start pulling from GTID 60 onwards (in our example).

Fixing the issue

So at this point we have a backup which does not have GTID information. Lets see how to fix this issue manually.

I tried setting GTID_NEXT to 60 by first executing the transaction on master and make GTID 60 on master.

mysql> set gtid_next = "f7718b80-c237-11e4-baa8-a0369f370a52:60";
Query OK, 0 rows affected (0.00 sec)

mysql> begin; commit;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> set gtid_next = "AUTOMATIC";
Query OK, 0 rows affected (0.00 sec)

mysql> stop slave;
Query OK, 0 rows affected (0.02 sec)

mysql> start slave;
Query OK, 0 rows affected (0.03 sec)

But this didn’t work.

We know that master is done until GTID 59 and we have those transaction in backup and slave also have them because backup was restored completely.

Lets try to set GTID_PURGED to remove those transactions from BINLOG

mysql> SET GLOBAL gtid_purged="f7718b80-c237-11e4-baa8-a0369f370a52:1:59";
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

Its saying that GTID_EXECUTED should be empty. Lets check whats the current value and try to set to empty

mysql> show global variables like 'gtid_executed';
+---------------+--------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                |
+---------------+--------------------------------------------------------------------------------------+
| gtid_executed | c21f28d2-c243-11e4-baf5-2c600c20dba4:1-59,
f7718b80-c237-11e4-baa8-a0369f370a52:1:59 |
+---------------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SET GLOBAL gtid_executed = "";
ERROR 1238 (HY000): Variable 'gtid_executed' is a read only variable

We cannot assign empty value to gtid_executed variable because its read-only and since we restored the backup gtid_executed is updated till the point this instance has executed the transactions.

Only way to fix this is using reset master on slave

When we issue reset master on slave, it will reset GTID_EXECUTED and make it null. After that we can set GTID_PURGED till 1-59 and slave can start applying from 60

mysql> reset master;
Query OK, 0 rows affected (0.07 sec)

mysql> show global variables like 'GTID_EXECUTED';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_executed |       |
+---------------+-------+
1 row in set (0.00 sec)
mysql> set global GTID_PURGED="f7718b80-c237-11e4-baa8-a0369f370a52:1-59";
Query OK, 0 rows affected (0.07 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master-host.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin_log.000002
          Read_Master_Log_Pos: 16953
               Relay_Log_File: relay_log.000004
                Relay_Log_Pos: 442
        Relay_Master_Log_File: bin_log.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 16953
              Relay_Log_Space: 1202
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
                  Master_UUID: f7718b80-c237-11e4-baa8-a0369f370a52
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-60
            Executed_Gtid_Set: f7718b80-c237-11e4-baa8-a0369f370a52:1-60
                Auto_Position: 1
1 row in set (0.00 sec)

This fixed the issue.

Hope this helps !!

References:

How to create/restore a slave using GTID replication in MySQL 5.6

 

Moving InnoDB Table Between Servers

This is a small article telling how to move a innoDB table from one MySQL server to another.

Moving MyISAM table from one server to another is very straight forward process. MyISAM table generates 3 files at file system level.

  • .frm file contains table structure/definition
  • .MYD file contains table data
  • .MYI file contains index data

We can simply copy these 3 files to another MySQL server installation under desired database and it will show up in “SHOW TABLES” command.

But things are different in InnoDB, because InnoDB also stores metadata in default tablespace represented by ibdata* datafiles in data directory.

InnoDB table has 2 types of files generated

  • .frm file contains table structure/definition
  • .ibd file contains data of table as well as index

Apart from above files, we also have metadata like table ID, database name and transaction information about tables in default tablespace (ibdata1 datafile).

This makes InnoDB table special and, simply copying .frm and .ibd files from source server to destination server will not migrate/copy InnoDB table to different server.

We need to do additional steps to complete the copy/migration. Please follow below steps to copy InnoDB table from one server to another.

 

Step 1) On destination database, create required database if not available.

So lets say, I want to move one table called flavors from fctest database on source server to destination server.

For that to happen, I have to make sure that on destination server, I have fctest database created. If its not created, create the same.

You need to do this, even if you want to move this table to some other existing database on destination server.

We have to first move the table to fctest database on destination (same as source) and then we will rename the same to desired database.

On destination server:

root [deo] >show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| deo                |
| menagerie          |
| mysql              |
| performance_schema |
| test               |
+--------------------+
6 rows in set (0.00 sec)

I don’t have fctest database. Creating the same.

root [deo] >create database fctest;
Query OK, 1 row affected (0.00 sec)

root [deo] >
root [deo] >use fctest;
Database changed
root [fctest] >

Step 2) Create empty table structure same as source table

Once you create the database fctest, create empty table flavors whose structure/definition is same as source.

You can get the definition from source using “SHOW CREATE TABLE” command

On Source server:

mysql> show create table flavors;
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                                       |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| flavors | CREATE TABLE `flavors` (
  `flavor_id` int(11) NOT NULL AUTO_INCREMENT,
  `build_type` varchar(255) NOT NULL DEFAULT '',
  `variant` varchar(255) NOT NULL DEFAULT '',
  `threads` varchar(255) NOT NULL DEFAULT '',
  `first_event_id` int(11) NOT NULL DEFAULT '0',
  `last_event_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`flavor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=30 DEFAULT CHARSET=latin1 |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Create the same on destination server

On Destination server:

root [fctest] >CREATE TABLE flavors (
    ->   flavor_id int(11) NOT NULL AUTO_INCREMENT,
    ->   build_type varchar(255) NOT NULL DEFAULT '',
    ->   variant varchar(255) NOT NULL DEFAULT '',
    ->   threads varchar(255) NOT NULL DEFAULT '',
    ->   first_event_id int(11) NOT NULL DEFAULT '0',
    ->   last_event_id int(11) DEFAULT NULL,
    ->   PRIMARY KEY (flavor_id))
    ->  ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec)
root [fctest] >show tables;
+------------------+
| Tables_in_fctest |
+------------------+
| flavors          |
+------------------+
1 row in set (0.00 sec)

Step 3) Discard tablespace on destination server

This step is required so that MySQL will discard current .ibd file generated for this table.

root [fctest] >ALTER TABLE flavors DISCARD TABLESPACE;
Query OK, 0 rows affected (0.00 sec)

Step 4) Copy .frm and .ibd file from source server to destination server

At this point copy the .frm and .ibb files from source server to destination server.

Before copying these files, make sure that no transactions are happening on this table (which you are migrating).

Safest way is to use “FLUSH TABLE <table_name> WITH READ LOCK” on source database which will flush table data and takes read only lock so that no DML will happen on the table.

While you keep that session active, copy the .frm and .ibd files to destination server.

Step 5) Import tablespace on destination server

Once you copy .frm and .ibd files to destination server, import the tablespace, which will make required metadata imported into default tablespace and accept the .ibd file.

root [fctest] >ALTER TABLE flavors IMPORT TABLESPACE;
Query OK, 0 rows affected (0.00 sec)
root [fctest] >show tables;
+------------------+
| Tables_in_fctest |
+------------------+
| flavors          |
+------------------+
1 row in set (0.00 sec)
root [fctest] >select count(1) from flavors;
+----------+
| count(1) |
+----------+
|       29 |
+----------+
1 row in set (0.00 sec)

Step 6) Rename the table to desired database

Since we wanted to move this table to deo database on destination server, we can simply run below rename command and it will move the tables to deo database.

root [fctest] >rename table fctest.flavors to deo.flavors;
Query OK, 0 rows affected (0.00 sec)

you can later drop fctest database on destination server.

Hope this helps !!

Disabling GTID in MySQL replication

There could be situations where we have to disable GTID in our replication setup.
We can follow below steps to do so. No need to change any GTID related parameters. We just have to change slave configuration and bounce slave. No need to touch master.

Step 1) Confirm that you are using GTID

If you check processlist on master, you can see GTID is enabled

master> show processlist \G
*************************** 1. row ***************************
     Id: 51
   User: replicate
   Host: master-1.db.example.com:43420
     db: NULL
Command: Binlog Dump GTID
   Time: 81316
  State: Master has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL

Check the “command” above and it says “Binlog Dump GTID

 

Step 2) Stop slave, reset master_auto_position to 0


mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

mysql> change master to master_auto_position=0;
Query OK, 0 rows affected (0.00 sec)

Step 3) Check slave status for master log file and position

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: master-1.db.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 5
              Master_Log_File: bin_log.000003
          Read_Master_Log_Pos: 191
               Relay_Log_File: relay_log.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: bin_log.000003
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
       ...
       ...
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 8c593930-b4da-11e3-9e2d-b499ba05473c:1-7,
cc4192b6-a60d-11e4-8301-80c16e2272cc:1-168
                Auto_Position: 0
1 row in set (0.01 sec)

 

Step 4) Use change master to point to master log file and position


mysql> change master to master_host='master-1.db.example.com' , master_port = 3306 , master_user = 'replicate' , master_password='welcome' , master_log_file = 'bin_log.000003' , master_log_pos = 191, master_connect_retry = 5;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                              |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1759 | Sending passwords in plain text without SSL/TLS is extremely insecure.                                                                                                                                                                                                               |
| Note  | 1760 | Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information. |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

 

Step 5) start slave


mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

show slave status will still show GTIDs but its not using GTIDs now

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master-1.db.example.com
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 5
              Master_Log_File: bin_log.000003
          Read_Master_Log_Pos: 465
               Relay_Log_File: relay_log.000002
                Relay_Log_Pos: 586
        Relay_Master_Log_File: bin_log.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
              ...
              ...
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 8c593930-b4da-11e3-9e2d-b499ba05473c:8
            Executed_Gtid_Set: 8c593930-b4da-11e3-9e2d-b499ba05473c:1-8,
cc4192b6-a60d-11e4-8301-80c16e2272cc:1-168
                Auto_Position: 0
1 row in set (0.00 sec)

 

We can confirm that GTID is not getting used by checking processlist on master

master>show processlist \G
*************************** 1. row ***************************
     Id: 1
   User: root
   Host: localhost
     db: deo
Command: Query
   Time: 0
  State: init
   Info: show processlist
*************************** 2. row ***************************
     Id: 3
   User: replicate
   Host: master-1.db.example.com:59631
     db: NULL
Command: Binlog Dump
   Time: 24
  State: Master has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL
2 rows in set (0.00 sec)

If you check “command” for row 2, you will notice GTID is gone.

Hope this helps !!

Semi-synchronous Replication in MySQL

If we talk about MySQL replication, we always default to asynchronous replication. By asynchronous, we mean master will keep doing transactions while they are getting replicated to slave via IO thread asynchronously. In this replication, its possible that slave will fall behind but that does not impact master in anyway.
Problem with this replication is data-loss. Example: In case slave is falling behind and master host crashes, some of the transactions may not get transfered/replicated to slave. Following 2 options exists in this scenario

  1. Wait for master host to come up. Which may take around 20-25 mins
  2. Promote slave to master with some data loss

Nether options sounds good on production database. That’s why we have semi-synchronous replication.

Purpose of semi-synchronous replication is twofold

  1. Reduce replication lag and keep slave in sync with master
  2. Ensure no data-loss in case master crashes

Semi-synchronous replication has been in place from MySQL 5.5 onwards in the form of plugin. We just have to install the plugin and then we can enable the same. But again this semi-synchronous replication available in MySQL 5.5 and 5.6 does not guarantee “no data loss” completely for the reasons we will see later but it keeps slave in sync with master.

MySQL 5.7 Oracle has made some enhanced semi-synchronous replication which they call “loss-less semi-synchronous replication”. This guarantees that there is no data loss in case master crashes.

We will take a look at how semi-synchronous replication works, how to install the plugin and whats the enhancement in MySQL 5.7 to prevent data loss.

How Semi-Synchronous replication works?

As discussed above, there are 2 different algorithm for semi-synchronous replication.

  1. Normal semi-synchronous replication available in mysql 5.5 and 5.6 – does not guarantee “no data loss”
  2. Loss-less semi-synchronous replication available in mysql 5.7 – guarantee “no data loss”

lets take a look at each and how it works

1) Normal semi-synchronous replication in MySQL 5.5 and 5.6

If we talk above true synchronous replication – In truly synchronous replication, when you commit a transaction, the commit does not complete on master until all replicas have also committed successfully and acknowledged to master.

In MySQL’s semi-synchronous replication when transaction is committed on master, its written to binary log and innodb transaction log and commit is complete. At this point the data has not even reached to any of the slave but on master commit is complete. The client issuing the commit is still on hold and is not told that commit has complete.
At this point slave IO thread pulls transaction data from binary logs and updated relay logs, flush relay logs to disk and acknowledges master that slave has received and written the data. Only after receiving ack from at least 1 slave, MySQL confirms the client that transaction is committed.

There is a problem with above functionality. MySQL is only delaying acknowledge of commit to client. In reality its committing the transaction on master way before that (before even sending data to slave). Since committing of transaction is done before waiting for ack from semi-sync slave, so the committed rows are visible from applications, even though semi-sync slaves may not have received the data. If master is crashed and none of the slaves received the data, the data is lost but applications may have seen them. This is called phantom reads, and in many cases it’s problematic.

2) Loss-less semi-synchronous replication in MySQL 5.7

loss-less semi-synchronous replication in MySQL 5.7 answers above problem. Only change that’s done in this semi-synchronous replication is that commit on primary is delayed until master receives acknowledgement from only of the semi-synchronous slave that data is received and written by the slave. So when committed data is visible from applications, one of the semi-sync slaves have received that. Phantom read risk is much smaller: if both master and the latest semi-sync slave are down at the same time, data is lost. But it’s much less likely to happen compared to normal semi-sync.

To avoid data loss and phantom reads, Normal Semi-sync can’t meet your expectations. Using Loss-Less Semi-sync is needed.
With Loss-Less Semi-Synchronous replication, committed data should be on one of the slaves, so you can recover from the latest slave. You can always do fast failover here.

The acknowledgement from the slave means, it has received the transaction and applied it to the relay log and flushed it to disk. It doesn’t mean the data has actually gone into the database.

Just to make this statement clearer, it means the transaction was written to the relay log, and the relay log was flushed to disk.
There is still the need for the SQL thread to pick it up and apply it to the database. It’s very unlikely not to complete this task, but that way we’re getting a bit more performance.

Implementing semi-synchronous replication

1) check if semi-synchronous plugin is installed using “show plugins;”


mysql> show plugins;
+----------------------------+----------+--------------------+---------+---------+
| Name                       | Status   | Type               | Library | License |
+----------------------------+----------+--------------------+---------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| mysql_native_password      | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| mysql_old_password         | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| sha256_password            | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| MRG_MYISAM                 | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MyISAM                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MEMORY                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| CSV                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| InnoDB                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| INNODB_TRX                 | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCKS               | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
...
...
| PERFORMANCE_SCHEMA         | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| BLACKHOLE                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| FEDERATED                  | DISABLED | STORAGE ENGINE     | NULL    | GPL     |
| ARCHIVE                    | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| partition                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
+----------------------------+----------+--------------------+---------+---------+
42 rows in set (0.01 sec)

You should be looking for “rpl_semi_sync_master” name.
In my case plugin was not installed.

Optionally you can also check system variables


mysql> show variables like 'rpl_semi_sync%';
Empty set (0.00 sec)

Nothing. That means semi-synchronous functionality is not installed.

2) Install semi-synchronous plugin

You need to install “semisync_master.so” on master and “semisync_slave.so” on slave

on Master:


mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.12 sec)

on Slave:


mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)

These .so plugin files are available in <mysql_install_dir>/lib/plugin directory.

“show plugins;” should show the plugin now


mysql> show plugins;
+----------------------------+----------+--------------------+--------------------+---------+
| Name                       | Status   | Type               | Library            | License |
+----------------------------+----------+--------------------+--------------------+---------+
| binlog                     | ACTIVE   | STORAGE ENGINE     | NULL               | GPL     |
| mysql_native_password      | ACTIVE   | AUTHENTICATION     | NULL               | GPL     |
| mysql_old_password         | ACTIVE   | AUTHENTICATION     | NULL               | GPL     |
| sha256_password            | ACTIVE   | AUTHENTICATION     | NULL               | GPL     |
| MRG_MYISAM                 | ACTIVE   | STORAGE ENGINE     | NULL               | GPL     |
...
...
| rpl_semi_sync_master       | ACTIVE   | REPLICATION        | semisync_master.so | GPL     |
+----------------------------+----------+--------------------+--------------------+---------+
43 rows in set (0.00 sec)

You should also see the variables


mysql> show variables like 'rpl_semi_syn%';
+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| rpl_semi_sync_master_enabled       | OFF   |
| rpl_semi_sync_master_timeout       | 10000 |
| rpl_semi_sync_master_trace_level   | 32    |
| rpl_semi_sync_master_wait_no_slave | ON    |
+------------------------------------+-------+
4 rows in set (0.00 sec)

3) Enable semi-synchronous replication

On master:


mysql> SET GLOBAL rpl_semi_sync_master_enabled = 1;
Query OK, 0 rows affected (0.03 sec)

mysql> SET GLOBAL rpl_semi_sync_master_timeout = 10000;
Query OK, 0 rows affected (0.00 sec)

rpl_semi_sync_master_timeout is the timeout in millisec. We are setting a timeout of 10 sec.

On slave:


mysql> SET GLOBAL rpl_semi_sync_slave_enabled = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> stop slave io_thread;
Query OK, 0 rows affected (0.00 sec)

mysql> start slave io_thread;
Query OK, 0 rows affected (0.00 sec)

we have to bounce IO_THREAD on slave so that it will connect back to master as semi-synchronous slave.

Monitoring Semi-synchronous replication

We have few status variables that we can monitor to check if semi-synchronous replication is working as expected or not.

If you check status variables on master, it shows following status variables.


mysql> SHOW STATUS LIKE 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

On replication slave we have only 1 variable Rpl_semi_sync_slave_status.

Among the variables mentioned above, 3 varaibles are important for us to watch

Rpl_semi_sync_master_status -> Whether semi-synchronous replication currently is operational on the master
Rpl_semi_sync_master_yes_tx -> The number of commits that were acknowledged successfully by a slave
Rpl_semi_sync_master_no_tx  -> The number of commits that were not acknowledged successfully by a slave

Currently we see that Rpl_semi_sync_master_status is ON, that means the semi-synchronous replication is working fine.

Lets create a table in “deo” database and insert few records.

On master:


mysql> use deo;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> create table test (col1 smallint not null);
Query OK, 0 rows affected (0.14 sec)

mysql> set autocommit=1;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test values (1),(2),(3);
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from test;
+------+
| col1 |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 1356  |
| Rpl_semi_sync_master_net_wait_time         | 4070  |
| Rpl_semi_sync_master_net_waits             | 3     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 1379  |
| Rpl_semi_sync_master_tx_wait_time          | 4139  |
| Rpl_semi_sync_master_tx_waits              | 3     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 3     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

If you see above, we have 3 transactions done (corrosponding to 3 inserts since autocommit was on) and all are replicated synchronously on slave.

Rpl_semi_sync_master_yes_tx has become 3 and Rpl_semi_sync_master_no_tx is still 0.

Lets stop replication IO thread so that changes in master will not get replicated on slave

on slave:


mysql> stop slave io_thread;
Query OK, 0 rows affected (0.00 sec)

Now, lets insert a record in test table in master

On master:


mysql> insert into test values (4);
Query OK, 1 row affected (10.01 sec)

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 1022  |
| Rpl_semi_sync_master_net_wait_time         | 4090  |
| Rpl_semi_sync_master_net_waits             | 4     |
| Rpl_semi_sync_master_no_times              | 1     |
| Rpl_semi_sync_master_no_tx                 | 1     |
| Rpl_semi_sync_master_status                | OFF   |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 1379  |
| Rpl_semi_sync_master_tx_wait_time          | 4139  |
| Rpl_semi_sync_master_tx_waits              | 3     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 3     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

If you see, Rpl_semi_sync_master_no_tx is updated to 1 and Rpl_semi_sync_master_status is updated to off.
It means that replication is changed back to asynchronous.

Also note that insert statement took 10 secs to complete, because semi-sync replication timeout was set to 10 sec (rpl_semi_sync_master_timeout = 10000)

We can now start the slave IO thread and it will be back to semi-synchronous replication

on slave:


mysql> start slave io_thread;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test;
+------+
| col1 |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
+------+
4 rows in set (0.00 sec)

On master:


mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 1626  |
| Rpl_semi_sync_master_net_wait_time         | 8131  |
| Rpl_semi_sync_master_net_waits             | 5     |
| Rpl_semi_sync_master_no_times              | 1     |
| Rpl_semi_sync_master_no_tx                 | 1     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 1379  |
| Rpl_semi_sync_master_tx_wait_time          | 4139  |
| Rpl_semi_sync_master_tx_waits              | 3     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 3     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

So after starting slave IO thread, it recovered immediately and semi-synchronous replication enabled automatically.

Reference:

http://yoshinorimatsunobu.blogspot.in/2014/04/semi-synchronous-replication-at-facebook.html
http://dev.mysql.com/doc/refman/5.5/en/replication-semisync.html

How Does Semisynchronous MySQL Replication Work?

Hope this help !!

Converting non-CDB database as PDB in existing CDB

This is a short article on converting existing non-container (non-cdb) 12c database as pluggable database (pdb) to existing container database (cdb).

Version of Non-CDB database = 12.1.0.1

Version of CDB database = 12.1.0.2

Following steps will plug-in non-cdb database into cdb as pdb and also will upgrade the version to 12.1.0.2

My non-cdb database name is deo12c

My cdb database name is deocdb

Step 1) Make sure that non-cdb database is at least 12.1.0.1 version. This is the first version available.

Step 2) Compatible is set to at least 12.0.0 in non-cdb database

Step 3) Open non-cdb database in read-only to create XML file for the PDB

Shutting down non-cdb database (deo12c)

SYS.DEO12C.PRIMARY>shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

Mounting the database and opening in read-only

SYS.DEO12C.PRIMARY>startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 4294967296 bytes
Fixed Size            3719064 bytes
Variable Size         4177526888 bytes
Database Buffers      100663296 bytes
Redo Buffers           13058048 bytes
Database mounted.
SYS.DEO12C.PRIMARY>alter database open read only;

Database altered.

SYS.DEO12C.PRIMARY>

Step 4) Create XML file for PDB

SYS.DEO12C.PRIMARY-RO>exec DBMS_PDB.DESCRIBE('/fs-a01-a/backups/deo12c/deopdb2.xml');

PL/SQL procedure successfully completed.

Step 5) Shutdown non-cdb database

SYS.DEO12C.PRIMARY-RO>SHUTDOWN IMMEDIATE
Database closed.
Database dismounted.
ORACLE instance shut down.
SYS.DEO12C.PRIMARY-RO>

Step 6) Plugin non-cdb database as PDB into CDB

Running following command on CDB database – deocdb

SYS.DEOCDB.PRIMARY>CREATE PLUGGABLE DATABASE deopdb2 USING '/fs-a01-a/backups/deo12c/deopdb2.xml' NOCOPY TEMPFILE REUSE;

Pluggable database created.

SYS.DEOCDB.PRIMARY>

Step 7) Convert the dictionary of new plugged in database to the PDB type

Change the container to new pluggable database

SYS.DEOCDB.PRIMARY>ALTER SESSION set container=deopdb2;

Session altered.

SYS.DEOCDB.PRIMARY>@$ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql

Step 8) Start new pluggable database

SYS.DEOCDB.PRIMARY>alter pluggable database deopdb2 open;

Warning: PDB altered with errors.

When we opened the pluggable database, we got a warning. You can check the details of warning by querying PDB_PLUG_IN_VIOLATIONS table

SYS.DEOCDB.PRIMARY>select name,cause,type,message,status from PDB_PLUG_IN_VIOLATIONS where name = 'DEOPDB2' and status != 'RESOLVED';

NAME                   CAUSE                  TYPE    MESSAGE                        STATUS
------------------------------ ------------------------------ --------- -------------------------------------------------- ---------
DEOPDB2                NATIONAL CHARACTER SET          ERROR    National character set mismatch: PDB national char PENDING
                                    acter set AL16UTF16. CDB national character set UT
                                    F8.

DEOPDB2                OPTION                  WARNING    Database option APS mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option CATJAVA mismatch: PDB installed ve PENDING
                                    rsion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option CONTEXT mismatch: PDB installed ve PENDING
                                    rsion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option DV mismatch: PDB installed version PENDING
                                     NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option JAVAVM mismatch: PDB installed ver PENDING
                                    sion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option OLS mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option ORDIM mismatch: PDB installed vers PENDING
                                    ion NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option OWM mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option SDO mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option XML mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                OPTION                  WARNING    Database option XOQ mismatch: PDB installed versio PENDING
                                    n NULL. CDB installed version 12.1.0.2.0.

DEOPDB2                APEX                  WARNING    APEX mismatch: PDB installed version NULL CDB inst PENDING
                                    alled version 4.2.5.00.08


13 rows selected.

These warnings can be ignored. These are basically coming because dba_registry of container database doesn’t match new pluggable database.

In my case container database has following registry

SYS.DEOCDB.PRIMARY>alter session set container=cdb$root;

Session altered.

SYS.DEOCDB.PRIMARY>col comp_name format a40;
SYS.DEOCDB.PRIMARY>SELECT comp_id,comp_name,version FROM dba_registry;

COMP_ID                COMP_NAME                VERSION
------------------------------ ---------------------------------------- ------------------------------
DV                   Oracle Database Vault            12.1.0.2.0
APEX                   Oracle Application Express        4.2.5.00.08
OLS                   Oracle Label Security            12.1.0.2.0
SDO                   Spatial                    12.1.0.2.0
ORDIM                   Oracle Multimedia            12.1.0.2.0
CONTEXT                Oracle Text                12.1.0.2.0
OWM                   Oracle Workspace Manager         12.1.0.2.0
XDB                   Oracle XML Database            12.1.0.2.0
CATALOG                Oracle Database Catalog Views        12.1.0.2.0
CATPROC                Oracle Database Packages and Types    12.1.0.2.0
JAVAVM                   JServer JAVA Virtual Machine        12.1.0.2.0
XML                   Oracle XDK                12.1.0.2.0
CATJAVA                Oracle Database Java Packages        12.1.0.2.0
APS                   OLAP Analytic Workspace            12.1.0.2.0
XOQ                   Oracle OLAP API                12.1.0.2.0
RAC                   Oracle Real Application Clusters     12.1.0.2.0

16 rows selected.

In my pluggable database, registry has only following components

SYS.DEOCDB.PRIMARY>alter session set container=deopdb2;

Session altered.

SYS.DEOCDB.PRIMARY>alter session set container=deopdb2;

Session altered.

SYS.DEOCDB.PRIMARY>SELECT comp_id,comp_name,version FROM dba_registry;

COMP_ID                COMP_NAME                VERSION
------------------------------ ---------------------------------------- ------------------------------
XDB                   Oracle XML Database            12.1.0.2.0
CATALOG                Oracle Database Catalog Views        12.1.0.2.0
CATPROC                Oracle Database Packages and Types    12.1.0.2.0
RAC                   Oracle Real Application Clusters     12.1.0.2.0

4 rows selected.

You can get rid of above warnings by installing the required components in your pluggable database and match it with container database.

References:

https://docs.oracle.com/database/121/UPGRD/upgrade.htm#CHDBEDDA

Hope this helps !!

MySQL Performance Schema

Performance schema, as name suggest is a tool for monitoring MySQL performance. It was implemented staring from MySQL 5.5 and improved further in MySQL 5.6. Performance Schema includes set of tables that gives information on how database is performing. It provides a way to inspect internal execution of the server at run time. Performance schema monitors every “event” the server does that takes time and has been instrumented so that timing information can be collected. It is implemented using the PERFORMANCE_SCHEMA storage engine and the performance_schema database.

Performance Schema focuses primarily on performance data. This differs from INFORMATION_SCHEMA, which serves for inspection of metadata. Having its own engine allows us to access information about server execution while having minimal impact on server performance. Tables created in performance_schema are persistent in memory. Information from those tables will be wiped out if the database is rebooted. Memory allocation for performance_schema table is all done at server startup, so there is no further memory reallocation or sizing, which greatly streamlines performance.

In this article we will see how different tables are organized and what kind of information each table store. We will also see how we can customize instrumentation in performance schema to gather required performance stats. Performance schema is a very big topic and there have been very exhaustive documentation on the same. I tried to make this article brief and cover major areas of performance schema. For further details readers can check out the document links provided in reference section.

Tables in Performance schema

We can categorize the tables in performance schema in 5 broad categories as described below.

1) Setup tables – These tables contains information about setting up monitoring/instrumentation. We can modify (update) these tables to customize what kind of performance data to be collected. These tables define who, what, how  to monitor, and where to save this data.

2) Current Events tables – These tables records performance data about the current events happening in database. This also depends on what kind of instrumentation we have enabled in setup tables. These tables contains most recently collected monitored events.

3) History events table – Performance data collected in current events table move to these history table after a period of time. So structure wise history tables are same as current events table but they contains historical data

4) Summary tables – These tables contacts performance data summarized over an entity. So performance data of events are summarized by various entities like account_id, host_name or by thread. These tables has aggregate value of events grouped by different entities.

5) Object Instances tables – These tables records what instances of objects (files, locks, mutexes, etc) are currently in use. Example if a mutex is currently in use, you will see an entry corresponding t that mutex in mutex_instances table.

Lets try to dig into each category and see what tables we have and what they store.

1) Setup tables

Most of the tables in performance_schema database is read only tables. There are very few tables which we can modify to adjust instrumentation. Setup tables are among those few tables which can be modified to define what kind of performance data we want to collect.

SETUP_ACTORS – This table defines who to monitor. We can insert username@hostname combination in this table that we want to monitor.

By default it has only 1 entry and every user is monitored

mysql> select * from setup_actors;
+------+------+------+
| HOST | USER | ROLE |
+------+------+------+
| %    | %    | %    |
+------+------+------+
1 row in set (0.00 sec)

 

 

SETUP_OBJECTS – This table defines which objects (table) to monitor. By details mysql does not monitor any activity in performance_schema, information_schema and MySQL databases. Everything else is monitored. If you are interested in monitoring only 1 or 2 databases, you can remove the entry of monitoring all databases and insert into databases that you want to monitor.

Default entries looks file following

mysql> select * from setup_objects;
+-------------+--------------------+-------------+---------+-------+
| OBJECT_TYPE | OBJECT_SCHEMA      | OBJECT_NAME | ENABLED | TIMED |
+-------------+--------------------+-------------+---------+-------+
| TABLE       | mysql              | %           | NO      | NO    |
| TABLE       | performance_schema | %           | NO      | NO    |
| TABLE       | information_schema | %           | NO      | NO    |
| TABLE       | %                  | %           | YES     | YES   |
+-------------+--------------------+-------------+---------+-------+
4 rows in set (0.02 sec)

 

SETUP_INSTRUMENTS – This table defines the “what” part – what to monitor. This is a major table in performance_schema database. This table has list of all instrumentation that is available in mysql. This table lists classes of instrumented objects for which events can be collected. Each instrument added to the source code provides a row for this table, even when the instrumented code is not executed. You can enable or disable each instrumentation in this table.

This table looks like following

mysql> select * from setup_instruments;
+---------------------------------------------------------------------------------------+---------+-------+
| NAME                                                                                  | ENABLED | TIMED |
+---------------------------------------------------------------------------------------+---------+-------+
| wait/synch/mutex/sql/PAGE::lock                                                       | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_sync                                           | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_active                                         | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_pool                                           | NO      | NO    |
| wait/synch/mutex/sql/LOCK_des_key_file                                                | NO      | NO    |
| wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_commit                                       | NO      | NO    |
...
...
...
| statement/com/Set option                                                              | YES     | YES   |
| statement/com/Fetch                                                                   | YES     | YES   |
| statement/com/Daemon                                                                  | YES     | YES   |
| statement/com/Binlog Dump GTID                                                        | YES     | YES   |
| statement/com/Error                                                                   | YES     | YES   |
| statement/abstract/Query                                                              | YES     | YES   |
| statement/abstract/new_packet                                                         | YES     | YES   |
| statement/abstract/relay_log                                                          | YES     | YES   |
| wait/io/socket/sql/server_tcpip_socket                                                | NO      | NO    |
| wait/io/socket/sql/server_unix_socket                                                 | NO      | NO    |
| wait/io/socket/sql/client_connection                                                  | NO      | NO    |
| idle                                                                                  | YES     | YES   |
+---------------------------------------------------------------------------------------+---------+-------+

Instrument names have multiple parts and form a hierarchy. An instrument name consists of a sequence of components separated by ‘/’ characters. Example names:

wait/io/file/myisam/log
wait/io/file/mysys/charset
wait/lock/table/sql/handler
wait/synch/cond/mysys/COND_alarm
wait/synch/cond/sql/BINLOG::update_cond
wait/synch/mutex/mysys/BITMAP_mutex
wait/synch/mutex/sql/LOCK_delete
wait/synch/rwlock/sql/Query_cache_query::lock
stage/sql/closing tables
stage/sql/Sorting result
statement/com/Execute
statement/com/Query
statement/sql/create_table
statement/sql/lock_tables

The instrument name space has a tree-like structure. The components of an instrument name from left to right provide a progression from more general to more specific. The number of components a name has depends on the type of instrument.

The interpretation of a given component in a name depends on the components to the left of it. For example, myisam appears in both of the following names, but myisam in the first name is related to file I/O, whereas in the second it is related to a synchronization instrument:

wait/io/file/myisam/log
wait/synch/cond/myisam/MI_SORT_INFO::cond

SETUP_CONSUMERS – This defines type of consumers for which event information is getting collected. In a way this defines is history information should be kept or not or just current event information should be collected and stored.

mysql> select * from setup_consumers;
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
| events_stages_current          | NO      |
| events_stages_history          | NO      |
| events_stages_history_long     | NO      |
| events_statements_current      | YES     |
| events_statements_history      | NO      |
| events_statements_history_long | NO      |
| events_waits_current           | NO      |
| events_waits_history           | NO      |
| events_waits_history_long      | NO      |
| global_instrumentation         | YES     |
| thread_instrumentation         | YES     |
| statements_digest              | YES     |
+--------------------------------+---------+
12 rows in set (0.00 sec)

If you enable history table for any of the event type, mysql will keep history of performance data collecetd for that event type.
Instrument data will be collected for only those consumers which are enabled in above table.

While we are talking above setup_consumers table, I would like to explain about 3 consumers mentioned in above tables

global_instrumentation – is the highest level consumer. If global_instrumentation is NO, it disables all instrumentation. All other settings are lower level and are not checked; it does not matter what they are set to. If global_instrumentation is YES, the Performance Schema maintains information for global states and also checks the thread_instrumentation consumer.

thread_instrumentation – is checked only if global_instrumentation is YES. Otherwise, if thread_instrumentation is NO, it disables thread-specific instrumentation and all lower-level settings are ignored. No information is maintained per thread and no individual events are collected in the current-events or event-history tables. If thread_instrumentation is YES, the Performance Schema maintains thread-specific information and also checks events_xxx_current consumers.

statement_digest – This consumer requires global_instrumentation to be YES or it is not checked. There is no dependency on the Statement Event consumers. This means you can obtain statistics per digest without having to collect statistics in events_statements_current, which is advantageous in terms of overhead.

Following figure show instrumentation hierarchy:

instru

 

 

 

 

 

 

 

 

SETUP_TIMERS – This table defines that is the unit of values for all events that are collecting performance data. For some time of event it could be timing information, in that case it defines that unit of time that is used for those events. If for few events, unit should be number of instances occurring then this table should be updated to reflect the same.

By default it defines following unit of measure for each type of event

mysql> select * from setup_timers;
+-----------+-------------+
| NAME      | TIMER_NAME  |
+-----------+-------------+
| idle      | MICROSECOND |
| wait      | CYCLE       |
| stage     | NANOSECOND  |
| statement | NANOSECOND  |
+-----------+-------------+
4 rows in set (0.00 sec)

CYCLE refers to the timer that is based on the CPU (processor) cycle counter

2) Current Events tables

Once you update setup tables to set correct monitoring in place, MySQL will start collecting performance data for the events for which instrumentation is enabled.
Following are the current tables

mysql> show tables like '%current%';
+------------------------------------------+
| Tables_in_performance_schema (%current%) |
+------------------------------------------+
| events_stages_current                    |
| events_statements_current                |
| events_waits_current                     |
+------------------------------------------+
3 rows in set (0.00 sec)

 

Each current table corresponds to high level instrument type. As we have seen above, instrument name is a tree type hierarchy structure and top of the tree is high level instrument type.
If we want to find out high level instrument name, we can use following SQL

mysql> select distinct substr(name,1,instr(name,'/')) from setup_instruments where name != 'idle';
+--------------------------------+
| substr(name,1,instr(name,'/')) |
+--------------------------------+
| wait/                          |
| stage/                         |
| statement/                     |
+--------------------------------+
3 rows in set (0.08 sec)

 

Statement – indicates a sql statement (like SELECT) or command (like CONNECT)
Stages  – is the stage of statement processing, the , like SORTING RESULT
Wait – is an instrumented wait event, like WAIT I/O/FILE or WAIT/LOCK
Idle – is when a connection is idle, that is,  a socket is idle if it is waiting for a request from the client

So corresponding to each of above instrument type, we have current tables.
These tables are populated they are enabled in setup_consumers tables. If you check above the output of setup_consumers table, you will see entries corresponding to these tables.

mysql> select * from setup_consumers where name like '%current%';
+---------------------------+---------+
| NAME                      | ENABLED |
+---------------------------+---------+
| events_stages_current     | NO      |
| events_statements_current | YES     |
| events_waits_current      | NO      |
+---------------------------+---------+
3 rows in set (0.01 sec)

 

So by default, instrumentation for current statements are enabled by default

EVENTS_STAGES_CURRENT

This table contains current stage events, one row per thread showing the current status of the thread’s most recent monitored stage event.
Every statement passes through various stages and we can enable/disable instrumentation for every stage. If you have enabled instrumentation for a stage and if your statement is currently in that stage, we will see a record in this table.
Again, if query taking less time, you will hardly see anything in this table for that thread. Advice is to enable history table.

mysql> select * from setup_instruments where name like 'stage%';
+---------------------------------------------------------------------------------------+---------+-------+
| NAME                                                                                  | ENABLED | TIMED |
+---------------------------------------------------------------------------------------+---------+-------+
| stage/sql/After create                                                                | NO      | NO    |
| stage/sql/allocating local table                                                      | NO      | NO    |
| stage/sql/preparing for alter table                                                   | NO      | NO    |
| stage/sql/altering table                                                              | NO      | NO    |
| stage/sql/committing alter table to storage engine                                    | NO      | NO    |
| stage/sql/Changing master                                                             | NO      | NO    |
...
...
...

 

EVENTS_STATEMENTS_CURRENT

This table has data about current statement events, one row per thread showing the current status of the thread’s most recent monitored statement event.
Any user runs a statement, performance data about that statement gets recorded in this table.
Please note that this table stores only 1 row per thread, so as soon as user runs another SQL in same connection, previous SQL will be gone.
That’s why its useful to enable history tables in setup_consumers.

This table has many important columns to store performance data like

TIMER_START, TIMER_END, TIMER_WAIT – Which defines how much time the SQL took
LOCK_TIME – defines how much time lock was held
ROWS_AFFECTED, ROWS_SENT, ROWS_EXAMINED – defines how much work was done
CREATED_TMP_DISK_TABLES, CREATED_TMP_TABLES – defines if temp tables where created and if they got spiled over on disk
SELECT_FULL_JOIN, SELECT_FULL_RANGE_JOIN, SELECT_RANGE, SELECT_RANGE_CHECK, SELECT_SCAN – defines access path

mysql> select * from performance_schema.events_statements_current where thread_id = 31 \G
*************************** 1. row ***************************
              THREAD_ID: 31
               EVENT_ID: 89
           END_EVENT_ID: 112
             EVENT_NAME: statement/sql/select
                 SOURCE: mysqld.cc:939
            TIMER_START: 351432249673000000
              TIMER_END: 351432249845000000
             TIMER_WAIT: 172000000
              LOCK_TIME: 73000000
               SQL_TEXT: select * from event
                 DIGEST: 4f58e5f352636c00d01edc576732852b
            DIGEST_TEXT: SELECT * FROM EVENT
         CURRENT_SCHEMA: deo
            OBJECT_TYPE: NULL
          OBJECT_SCHEMA: NULL
            OBJECT_NAME: NULL
  OBJECT_INSTANCE_BEGIN: NULL
            MYSQL_ERRNO: 0
      RETURNED_SQLSTATE: NULL
           MESSAGE_TEXT: NULL
                 ERRORS: 0
               WARNINGS: 0
          ROWS_AFFECTED: 0
              ROWS_SENT: 20
          ROWS_EXAMINED: 20
CREATED_TMP_DISK_TABLES: 0
     CREATED_TMP_TABLES: 0
       SELECT_FULL_JOIN: 0
 SELECT_FULL_RANGE_JOIN: 0
           SELECT_RANGE: 0
     SELECT_RANGE_CHECK: 0
            SELECT_SCAN: 1
      SORT_MERGE_PASSES: 0
             SORT_RANGE: 0
              SORT_ROWS: 0
              SORT_SCAN: 0
          NO_INDEX_USED: 1
     NO_GOOD_INDEX_USED: 0
       NESTING_EVENT_ID: NULL
     NESTING_EVENT_TYPE: NULL
1 row in set (0.00 sec)

EVENTS_WAITS_CURRENT

This table defines wait event information for current wait events, one row per thread showing the current status of the thread’s most recent monitored wait event.
If a thread has passed that wait event, it will remove that record from this table and insert another record about wait event that thread is experiencing right now.
Its bit difficult to use this view unless thread is experience waits currently and for longer time (enough to get hold of data from this table). Else its good idea to enable history tables.

mysql> select * from events_waits_current \G
*************************** 1. row ***************************
            THREAD_ID: 30
             EVENT_ID: 135
         END_EVENT_ID: 135
           EVENT_NAME: idle
               SOURCE: mysqld.cc:909
          TIMER_START: 351281970836000000
            TIMER_END: 351286937566000000
           TIMER_WAIT: 4966730000000
                SPINS: NULL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
           INDEX_NAME: NULL
          OBJECT_TYPE: NULL
OBJECT_INSTANCE_BEGIN: 0
     NESTING_EVENT_ID: NULL
   NESTING_EVENT_TYPE: NULL
            OPERATION: idle
      NUMBER_OF_BYTES: NULL
                FLAGS: NULL
1 row in set (0.00 sec)

You might see the same timing columns in this table as in events_statements_current table. But note that these timings column belongs to wait event. They represent how much was the wait time, whereas timings column in events_statements_current table represent how much was the time taken by the entire statement.

3) History events table

These tables as name says, stores historical information of the event that happened. As we have learned above that current table stores only current event that is happening for every thread. If the thread has passed that event, data in current table changes. So history tables are available in MySQL which will store history of passed event happened in a thread.
Corresponding to every current table, we have 2 history table _history and _history_long

mysql> show tables like '%history%';
+------------------------------------------+
| Tables_in_performance_schema (%history%) |
+------------------------------------------+
| events_stages_history                    |
| events_stages_history_long               |
| events_statements_history                |
| events_statements_history_long           |
| events_waits_history                     |
| events_waits_history_long                |
+------------------------------------------+
6 rows in set (0.00 sec)

_history table contains the most recent 10 records for every thread and _history_long table contains most recent 10,000 records for every thread.
Example if a thread is executing 20 SQL statements, events_statements_history will records latest 10 SQL statements executed by that thread, where as events_statements_history_long table will record all 20 sql statements. As new events are added to the table, older events are discarded if the table is full.  Events are not added to the table until they have ended.

Structure wise they are same as current table. They just store more data per thread.
You can enable history tables and history_long table from setup_consumers table

mysql> select * from setup_consumers where name like '%history%';
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
| events_stages_history          | NO      |
| events_stages_history_long     | NO      |
| events_statements_history      | NO      |
| events_statements_history_long | NO      |
| events_waits_history           | NO      |
| events_waits_history_long      | NO      |
+--------------------------------+---------+
6 rows in set (0.00 sec)

If you think that storing 10 records in history table and 10000 records in history_long table is less, you can increase how many records you want to store in history and history long for each of the instrument type table.
Following system level variables are available which can be set at start of instance which defines how many records to store in these tables

mysql> show variables like 'performance_schema_%history%';
+--------------------------------------------------------+-------+
| Variable_name                                          | Value |
+--------------------------------------------------------+-------+
| performance_schema_events_stages_history_long_size     | 10000 |
| performance_schema_events_stages_history_size          | 10    |
| performance_schema_events_statements_history_long_size | 10000 |
| performance_schema_events_statements_history_size      | 10    |
| performance_schema_events_waits_history_long_size      | 10000 |
| performance_schema_events_waits_history_size           | 10    |
+--------------------------------------------------------+-------+
6 rows in set (0.00 sec)

Note that changing these values will need mysql reboot to take affect and also increasing these values means that you will need more memory to store these values. If you increase number of records setting in system variables above, but there is no sufficient memory available to store those many records, MySQL will not be able to load history table.
You can check if this situation is happening by tracking some status variables in MySQL. The Performance Schema implements several status variables that provide information about instrumentation that could not be loaded or created due to memory constraints:

mysql> show status like 'performance%';
+-----------------------------------------------+-------+
| Variable_name                                 | Value |
+-----------------------------------------------+-------+
| Performance_schema_accounts_lost              | 0     |
| Performance_schema_cond_classes_lost          | 0     |
| Performance_schema_cond_instances_lost        | 0     |
| Performance_schema_digest_lost                | 0     |
| Performance_schema_file_classes_lost          | 0     |
| Performance_schema_file_handles_lost          | 0     |
| Performance_schema_file_instances_lost        | 0     |
| Performance_schema_hosts_lost                 | 0     |
| Performance_schema_locker_lost                | 0     |
| Performance_schema_mutex_classes_lost         | 0     |
| Performance_schema_mutex_instances_lost       | 0     |
| Performance_schema_rwlock_classes_lost        | 0     |
| Performance_schema_rwlock_instances_lost      | 0     |
| Performance_schema_session_connect_attrs_lost | 0     |
| Performance_schema_socket_classes_lost        | 0     |
| Performance_schema_socket_instances_lost      | 0     |
| Performance_schema_stage_classes_lost         | 0     |
| Performance_schema_statement_classes_lost     | 0     |
| Performance_schema_table_handles_lost         | 0     |
| Performance_schema_table_instances_lost       | 0     |
| Performance_schema_thread_classes_lost        | 0     |
| Performance_schema_thread_instances_lost      | 0     |
| Performance_schema_users_lost                 | 0     |
+-----------------------------------------------+-------+
23 rows in set (0.04 sec)

Each of the above status variables will get updated if corresponding performance schema table is not able to store records. We will see details about above variables in later section of this article.

 

4) Summary tables

Summary tables provide aggregated information for terminated events over time. The tables in this group summarize event data in different ways.
There are bunch of summary tables available in performance schema

 mysql> show tables like '%summary%';
+----------------------------------------------------+
| Tables_in_performance_schema (%summary%)           |
+----------------------------------------------------+
| events_stages_summary_by_account_by_event_name     |
| events_stages_summary_by_host_by_event_name        |
| events_stages_summary_by_thread_by_event_name      |
| events_stages_summary_by_user_by_event_name        |
| events_stages_summary_global_by_event_name         |
| events_statements_summary_by_account_by_event_name |
| events_statements_summary_by_digest                |
| events_statements_summary_by_host_by_event_name    |
| events_statements_summary_by_thread_by_event_name  |
| events_statements_summary_by_user_by_event_name    |
| events_statements_summary_global_by_event_name     |
| events_waits_summary_by_account_by_event_name      |
| events_waits_summary_by_host_by_event_name         |
| events_waits_summary_by_instance                   |
| events_waits_summary_by_thread_by_event_name       |
| events_waits_summary_by_user_by_event_name         |
| events_waits_summary_global_by_event_name          |
| file_summary_by_event_name                         |
| file_summary_by_instance                           |
| objects_summary_global_by_type                     |
| socket_summary_by_event_name                       |
| socket_summary_by_instance                         |
| table_io_waits_summary_by_index_usage              |
| table_io_waits_summary_by_table                    |
| table_lock_waits_summary_by_table                  |
+----------------------------------------------------+
25 rows in set (0.00 sec)

Each summary table has grouping columns that determine how to group the data to be aggregated, and summary columns that contain the aggregated values
I will give 1 liner for few tables to understand how data is stored in these tables.

Event Wait Summaries:

  •     events_waits_summary_global_by_event_name: Wait events summarized per event name
  •     events_waits_summary_by_instance: Wait events summarized per instance
  •     events_waits_summary_by_thread_by_event_name: Wait events summarized per thread and event name

Stage Summaries:

  •     events_stages_summary_by_thread_by_event_name: Stage waits summarized per thread and event name
  •     events_stages_summary_global_by_event_name: Stage waits summarized per event name

Statement Summaries:

  •     events_statements_summary_by_thread_by_event_name: Statement events summarized per thread and event name
  •     events_statements_summary_global_by_event_name: Statement events summarized per event name

4.1 Digest summary Table

In above summary table, you see a table – events_statements_summary_by_digest
So what exactly is digest table

As of MySQL 5.6.5, the Performance Schema maintains statement digest information. Digesting converts a SQL statement to normalized form and computes a hash value for the result.
Normalization permits statements that are similar to be grouped and summarized to expose information about the types of statements the server is executing and how often they occur.

We can enable digest summary by enabling the same in setup_consumers table

The statement event tables (events_statements_current, events_statements_history, and events_statements_history_long) have DIGEST and DIGEST_TEXT columns that contain digest MD5 values and the corresponding normalized statement text strings.

Normalizing a statement transforms the statement text to a more standardized string representation that preserves the general statement structure while removing information not essential to the structure.Object identifiers such as database and table names are preserved. Values and comments are removed, and whitespace is adjusted.

Consider these statements:

SELECT * FROM orders WHERE customer_id=10 AND quantity>20
SELECT * FROM orders WHERE customer_id = 20 AND quantity > 100

To normalize these statements, the Performance Schema replaces data values by ? and adjusts whitespace. Both statements yield the same normalized form and thus are considered “the same”:

SELECT * FROM orders WHERE customer_id = ? AND quantity > ?

The normalized statement contains less information but is still representative of the original statement. Other similar statements that have different comparison values have the same normalized form.

5) Object Instances tables

Instance tables records synchronization object instances. There are three types of synchronization objects: cond, mutex, and rwlock.

cond – represents condition. condition is a synchronization mechanism used in the code to signal that a specific event has happened, so that a thread waiting for this condition can resume work
mutex – mutex is a synchronization mechanism used in the code to enforce that only one thread at a given time can have access to some common resource. The resource is said to be “protected” by the mutex.
rwlock – synchronization mechanism used in the code to enforce that threads at a given time can have access to some common resource following certain rules. The resource is said to be “protected” by the rwlock. The access is either shared (many threads can have a read lock at the same time) or exclusive (only one thread can have a write lock at a given time).

Corresponding to this we have following 5 instances table

mysql> select TABLE_NAME from information_schema.tables where TABLE_NAME like '%instance%' and table_name not like '%summary%';
+------------------+
| TABLE_NAME       |
+------------------+
| cond_instances   |
| file_instances   |
| mutex_instances  |
| rwlock_instances |
| socket_instances |
+------------------+
5 rows in set (0.03 sec)

cond_instances – table lists all the conditions seen by the Performance Schema while the server executes
file_instances – table lists all the files seen by the Performance Schema when executing file I/O instrumentation. If a file on disk has never been opened, it will not be in file_instances. When a file is deleted from the disk, it is also removed from the file_instances table.
mutex_instances – table lists all the mutexes seen by the Performance Schema while the server executes.
rwlock_instances – table lists all the rwlock instances (read write locks) seen by the Performance Schema while the server executes.
socket_instances – table provides a real-time snapshot of the active connections to the MySQL server.

You can enable this instrumentation using setup_instruments table. Every instance tables mentioned above will get an entry if those events are enabled and executed.

setup_instruments table has many instruments as explained above. Enabling these instruments will populate specific instances table

 

+------------------------------------------------------------+---------+-------+
| NAME                                                       | ENABLED | TIMED |
+------------------------------------------------------------+---------+-------+
| wait/synch/mutex/sql/PAGE::lock                            | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_sync                | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_active              | NO      | NO    |
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_pool                | NO      | NO    |
| wait/synch/mutex/sql/LOCK_des_key_file                     | NO      | NO    |
| wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_commit            | NO      | NO    |
...
| wait/synch/rwlock/sql/Binlog_transmit_delegate::lock       | NO      | NO    |
| wait/synch/rwlock/sql/Binlog_relay_IO_delegate::lock       | NO      | NO    |
| wait/synch/rwlock/sql/LOCK_grant                           | NO      | NO    |
| wait/synch/rwlock/sql/LOGGER::LOCK_logger                  | NO      | NO    |
| wait/synch/rwlock/sql/LOCK_sys_init_connect                | NO      | NO    |
| wait/synch/rwlock/sql/LOCK_sys_init_slave                  | NO      | NO    |
...
| wait/synch/cond/sql/TC_LOG_MMAP::COND_active               | NO      | NO    |
| wait/synch/cond/sql/TC_LOG_MMAP::COND_pool                 | NO      | NO    |
| wait/synch/cond/sql/MYSQL_BIN_LOG::COND_done               | NO      | NO    |
| wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond             | NO      | NO    |
| wait/synch/cond/sql/MYSQL_BIN_LOG::prep_xids_cond          | NO      | NO    |
| wait/synch/cond/sql/MYSQL_RELAY_LOG::COND_done             | NO      | NO    |
...
| wait/io/file/sql/map                                       | YES     | YES   |
| wait/io/file/sql/binlog                                    | YES     | YES   |
| wait/io/file/sql/binlog_index                              | YES     | YES   |
| wait/io/file/sql/relaylog                                  | YES     | YES   |
| wait/io/file/sql/relaylog_index                            | YES     | YES   |
| wait/io/file/sql/casetest                                  | YES     | YES   |
...
| wait/io/socket/sql/server_tcpip_socket                     | NO      | NO    |
| wait/io/socket/sql/server_unix_socket                      | NO      | NO    |
| wait/io/socket/sql/client_connection                       | NO      | NO    |
+------------------------------------------------------------+---------+-------+

Lets take a example of mutex_instance table and understand how it works and how these tables play a role.

The mutex_instances table lists all the mutexes seen by the Performance Schema while the server executes. A mutex is a synchronization mechanism used in the code to enforce that only one thread at a given time can have access to some common resource. The resource is said to be “protected” by the mutex.

When two threads executing in the server (for example, two user sessions executing a query simultaneously) do need to access the same resource (a file, a buffer, or some piece of data), these two threads will compete against each other, so that the first query to obtain a lock on the mutex will cause the other query to wait until the first is done and unlocks the mutex.

For every mutex instrumented in the code, the Performance Schema provides the following information.

  •     The setup_instruments table lists the name of the instrumentation point, with the prefix wait/synch/mutex/.
  •     When some code creates a mutex, a row is added to the mutex_instances table. The OBJECT_INSTANCE_BEGIN column is a property that uniquely identifies the mutex.
  •     When a thread attempts to lock a mutex, the events_waits_current table shows a row for that thread, indicating that it is waiting on a mutex (in the EVENT_NAME column), and indicating which mutex is waited on (in the OBJECT_INSTANCE_BEGIN column).
  •     When a thread succeeds in locking a mutex:
    • events_waits_current shows that the wait on the mutex is completed (in the TIMER_END and TIMER_WAIT columns)
    • The completed wait event is added to the events_waits_history and events_waits_history_long tables
    • mutex_instances shows that the mutex is now owned by the thread (in the THREAD_ID column).
  •     When a thread unlocks a mutex, mutex_instances shows that the mutex now has no owner (the THREAD_ID column is NULL).
  •     When a mutex object is destroyed, the corresponding row is removed from mutex_instances.

By performing queries on both of the following tables, a monitoring application or a DBA can detect bottlenecks or deadlocks between threads that involve mutexes:

  •     events_waits_current, to see what mutex a thread is waiting for
  •     mutex_instances, to see which other thread currently owns a mutex

 

6) Other remaining tables

Apart from above 5 categories there are few other tables which records account level and thread level information.

account table – Connection statistics per client account. “Account” here means username + hostname combination


mysql> select * from accounts;
+------+-----------+---------------------+-------------------+
| USER | HOST      | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS |
+------+-----------+---------------------+-------------------+
| NULL | NULL      |                  20 |                26 |
| root | localhost |                   1 |                 7 |
+------+-----------+---------------------+-------------------+
2 rows in set (0.01 sec)

users table – Connection statistics per client user name


mysql> select * from users;
+------+---------------------+-------------------+
| USER | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS |
+------+---------------------+-------------------+
| NULL |                  20 |                26 |
| root |                   1 |                 7 |
+------+---------------------+-------------------+
2 rows in set (0.01 sec)

hosts table – Connection statistics per client host name


mysql> select * from hosts;
+-----------+---------------------+-------------------+
| HOST      | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS |
+-----------+---------------------+-------------------+
| NULL      |                  20 |                26 |
| localhost |                   1 |                 7 |
+-----------+---------------------+-------------------+
2 rows in set (0.00 sec)

host_cache – Information from the internal host cache.

performance_timers – Shows which event timers are available

This table shows available times in your mysql server. If a timer is not listed in this table it cannot be used. value of timers for various event types set in setup_timers should be from performance_timers table


mysql> select * from performance_timers;
+-------------+-----------------+------------------+----------------+
| TIMER_NAME  | TIMER_FREQUENCY | TIMER_RESOLUTION | TIMER_OVERHEAD |
+-------------+-----------------+------------------+----------------+
| CYCLE       |      2932537313 |                1 |             22 |
| NANOSECOND  |      1000000000 |             1000 |            264 |
| MICROSECOND |         1000000 |                1 |            116 |
| MILLISECOND |            1037 |                1 |            135 |
| TICK        |             101 |                1 |            254 |
+-------------+-----------------+------------------+----------------+
5 rows in set (0.00 sec)

threads – table contains a row for each server thread. Each row contains information about a thread and indicates whether monitoring is enabled for it

Performance Schema status variables

We have following performance_schema status variables


mysql> show status like '%performance%';
+-----------------------------------------------+-------+
| Variable_name                                 | Value |
+-----------------------------------------------+-------+
| Performance_schema_accounts_lost              | 0     |
| Performance_schema_cond_classes_lost          | 0     |
| Performance_schema_cond_instances_lost        | 0     |
| Performance_schema_digest_lost                | 0     |
| Performance_schema_file_classes_lost          | 0     |
| Performance_schema_file_handles_lost          | 0     |
| Performance_schema_file_instances_lost        | 0     |
| Performance_schema_hosts_lost                 | 0     |
| Performance_schema_locker_lost                | 0     |
| Performance_schema_mutex_classes_lost         | 0     |
| Performance_schema_mutex_instances_lost       | 0     |
| Performance_schema_rwlock_classes_lost        | 0     |
| Performance_schema_rwlock_instances_lost      | 0     |
| Performance_schema_session_connect_attrs_lost | 0     |
| Performance_schema_socket_classes_lost        | 0     |
| Performance_schema_socket_instances_lost      | 0     |
| Performance_schema_stage_classes_lost         | 0     |
| Performance_schema_statement_classes_lost     | 0     |
| Performance_schema_table_handles_lost         | 0     |
| Performance_schema_table_instances_lost       | 0     |
| Performance_schema_thread_classes_lost        | 0     |
| Performance_schema_thread_instances_lost      | 0     |
| Performance_schema_users_lost                 | 0     |
+-----------------------------------------------+-------+
23 rows in set (0.00 sec)

If an instrumentation could not be loaded because of memory constraints, corresponding status variable will get updated.
Non-zero values of these variables does not mean there is any issue with end user operation. Queries and DMLs will get executed as expected. Non-Zero values of these variables only mean corresponding instrumentation is not getting recorded because of less memory.

Names for these variables have several forms:

Performance_schema_accounts_lost – The number of times a row could not be added to the accounts table because it was full.
Performance_schema_xxx_classes_lost – How many instruments of type xxx could not be loaded.
Performance_schema_hosts_lost – The number of times a row could not be added to the hosts table because it was full.
Performance_schema_xxx_instances_lost – How many instances of object type xxx could not be created.
Performance_schema_xxx_handles_lost – How many instances of object type xxx could not be opened.
Performance_schema_locker_lost – How many events are “lost” or not recorded for various reason.
Performance_schema_session_connect_attrs_lost – The number of times a connection attribute string was larger than the reserved storage.
Performance_schema_users_lost – The number of times a row could not be added to the users table because it was full.

 

References:

http://manuales.guebs.com/mysql-5.6/performance-schema.html

http://dev.mysql.com/doc/refman/5.6/en/performance-schema-quick-start.html

A Visual Guide to the MySQL Performance Schema

http://www.databasejournal.com/features/mysql/an-overview-of-the-mysql-performance-schema.html