XA介绍

最近在Spring代码中看到’XA’这个字眼,一时没想起啥意思,也不知道XA是什么的缩写…

XA是什么

一开始我也不清楚XA是什么东西,感觉好牛逼,google一番,没有找到XA是什么东西的缩写,不过在wikipedia X/Open XA上找到下面的描述

In computing, the X/Open XA standard (short for “eXtended Architecture”) is a specification released in 1991[1] by X/Open (which later merged with The Open Group) for distributed transaction processing (DTP).

感觉XA是eXtended Architecture的缩写,说实话有点诡异。这段话也说明了:XA是由X/Open组织提出的分布式事务的规范。好吧,是我土鳖了,一说分布式事务其他的我就知道了,只是不知道XA的含义。不过为啥不直接叫:”distributed transaction specification”呢,多简洁清晰。

XA Goals

The goal of XA is to guarantee atomicity in “global transactions” that are executed across heterogeneous components. A transaction is a unit of work such as transferring money from one person to another. Distributed transactions update multiple data stores (such as databases, application servers, message queues, transactional caches, etc.) To guarantee integrity, XA uses a two-phase commit (2PC) to ensure that all of a transaction’s changes either take effect (commit) or do not (roll back), i.e., atomically.

XA Architecture

Specifically, XA describes the interface between a global transaction manager and a specific application. An application that wants to use XA engages an XA transaction manager using a library or separate service. The transaction manager tracks the participants in the transaction (i.e. the various data stores to which the application writes), and works with them to carry out the two-phase commit. In other words, the XA transaction manager is separate from an application’s interactions with servers. XA maintains a log of its decisions to commit or roll back, which it can use to recover in case of a system outage.[1]

Many software vendors support XA (meaning the software can participate in XA transactions), including a variety of relational databases and message brokers.[1]

说说我理解的XA

上面的XA gloalsXA Architecture是wikipedia上对XA的介绍,为了文章完整性我copy过来了,下面说说我的理解吧。

  • XA是分布式事务的规范,XA是”eXtended Architecture”的简写,至于为啥叫XA不叫“distributed transaction specification”这鬼知道,在我看来就是瞎jb起名。
  • XA协议属于 两阶段提交协议, 主流的关系型数据库产品都是实现了XA接口的。

  • XA规范主要定义了(全局)事务管理器(Transaction Manager)和(局部)资源管理器(Resource Manager)之间的接口, XA接口是双向的系统接口,在事务管理器 (TM)以及一个或多个资源管理器(RM)之 间形成通信桥梁

X/Open DTP模型架构图

  • X/Open DTP模型(1994)包括:
    • 应用程序(AP)
    • 事务管理器(TM)
    • 资源管理器(RM)
    • 通信资源管理器(CRM)

X/Open DTP模型架构图1
X/Open DTP模型架构图2

Mysql中的XA

mysql中的XA事务的官方文档地址:mysql XA Transactions

参考资料

数据库MHA架构介绍

MHA(Master High Availability)目前在MySQL高可用方面是一个相对成熟的解决方案,它由日本人youshimaton开发,是一套优秀的作为MySQL高可用性环境下故障切换和主从提升的高可用软件。在MySQL故障切换过程中,MHA能做到0~30秒之内自动完成数据库的故障切换操作,并且在进行故障切换的过程中,MHA能最大程度上保证数据库的一致性。

适用场景

目前MHA主要支持一主多从的架构,要搭建MHA,要求一个复制集群必须最少有3台数据库服务器,一主二从,即一台充当Master,一台充当备用Master,另一台充当从库。出于成本考虑,淘宝在此基础上进行了改造,目前淘宝开发的TMHA已经支持一主一从。

MAH组件

MHA由两部分组成:

  • MHA Manager(管理节点)
  • MHA Node(数据节点)。

MHA Manager

MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。主要用来运行一些工具,比如masterha_manager工具实现自动监控MySQL Master和实现master故障切换,其它工具实现手动实现master故障切换、在线mater转移、连接检查等等。一个Manager可以管理多 个master-slave集群

MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新的slave提升为新的master,然后将所有其他的slave重新指向新的master。整个故障转移过程对应用程序完全透明。

MHA Node

MHA Node运行在每台MySQL服务器上,类似于Agent。主要作用有:

  • 保存二进制日志
    如果能够访问故障master,会拷贝master的二进制日志
  • 应用差异中继日志
    从拥有最新数据的slave上生成差异中继日志,然后应用差异日志。
  • 清除中继日志
    在不停止SQL线程的情况下删除中继日志

工作原理简述

因为MHA Manager会不断的去探测master节点,因此当master出现故障时,通过对比slave之间I/O线程读取masterbinlog的位置,选取最接近的slave做为latestslave。 其它slave通过与latest slave对比生成差异中继日志。在latest slave上应用从master保存的binlog,同时将latest slave提升为master。最后在其它slave上应用相应的差异中继日志并开始从新的master开始复制。

在MHA实现Master故障切换过程中,MHA Node会试图访问故障的master(通过SSH),如果可以访问(不是硬件故障,比如InnoDB数据文件损坏等),会保存二进制文件,以最大程度保 证数据不丢失。MHA和半同步复制一起使用会大大降低数据丢失的危险。流程如下:

  • 从宕机崩溃的master保存二进制日志事件(binlog events)。
  • 识别含有最新更新的slave。
  • 应用差异的中继日志(relay log)到其它slave。
  • 应用从master保存的二进制日志事件(binlog events)。
  • 提升一个slave为新master并记录binlog file和position。
  • 使其它的slave连接新的master进行复制。
  • 完成切换manager主进程OFFLINE

pt-fingerprint 安装和基本使用

1
2
3
4
5
6
7
8
9
sudo wget percona.com/get/percona-toolkit.tar.gz

## 安装相关依赖
sudo yum install perl -y
sudo yum install perl-DBI -y
sudo yum install perl-DBD-MySQL -y
sudo yum install perl-Time-HiRes -y
sudo yum install perl-IO-Socket-SSL -y
sudo yum install perl-Digest-MD5.x86_64 -y

初步使用

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ ./pt-fingerprint --query "select a, b, c from users where id = 500"
select a, b, c from users where id = ?
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ ./pt-fingerprint --query "update test set a =1 , b=2 where id = 3"
update test set a =? , b=? where id = ?
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ ./pt-fingerprint --query "update test set a =1 , b=2 where id = 5"
update test set a =? , b=? where id = ?
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ ./pt-fingerprint --query "insert into a(id) values (1)"
insert into a(id) values(?+)
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ ./pt-fingerprint --query "insert into a(id) values (2)"
insert into a(id) values(?+)
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ ./pt-fingerprint --query "delete from test where id = 2"
delete from test where id = ?
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ ./pt-fingerprint --query "delete from test where id = 3"
delete from test where id = ?
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$

不过发现pt-fingerprint执行速度有点慢,我自己写了一个java程序,程序中调用命令行来执行pt-fingerprint命令,执行20批次,没个批次执行100次命令:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
@Test
public void execute() {
ShellCommandExecutor shellCommandExecutor = new ShellCommandExecutor();

for (int j = 0; j < 20; j++) {
long start = System.currentTimeMillis();
for (int i = 0; i < 100; i++) {
ShellCommandExecInputer commandExecInputer = new ShellCommandExecInputer();
commandExecInputer.setCommand("./pt-fingerprint --query \"select a, b, c from users where id = 500\"");
commandExecInputer.setCommandExecDir("/Users/rollenholt/Downloads/percona/percona-toolkit-3.0.13/bin");
ShellCommandExecResult execResult = shellCommandExecutor.execute(commandExecInputer);
Assert.assertTrue(execResult.isOk());
}
long end = System.currentTimeMillis();
System.out.println((end - start) );
}
}

输出如下(单位毫秒,每执行100次pt-fingerprint命令耗时)
5520
5477
4898
4714
4708
4742
4729
4710
4714
5127
4801
4733
5472
5429
4870
5077
4935
5226
5461
5008

为了排查程序写的搓的影响,在机器上尝试执行了一下,发现耗时也比较长

1
2
3
4
5
6
[$ /home/w/percona/percona-toolkit-3.0.13/bin]$ time ./pt-fingerprint --query "delete from test where id = 3"
delete from test where id = ?

real 0m0.042s
user 0m0.038s
sys 0m0.004s

在我们的业务常见下基本不可用,太慢了。

mysql binlog初步介绍

binlog 即二进制日志,它记录了数据库上的所有改变,并以二进制的形式保存在磁盘中;
它可以用来查看数据库的变更历史、数据库增量备份和恢复、Mysql的复制(主从数据库的复制)。

mysql binlog解析

binlog有三种格式:

  • Statement 基于SQL语句的复制(statement-based replication,SBR),
  • Row 基于行的复制(row-based replication,RBR),
  • Mixed 混合模式复制(mixed-based replication,MBR)。

在我这边mysql 5.7.20版本中默认是使用Row的, 而且默认情况下没有开启binlog

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.20 |
+-----------+
1 row in set (0.00 sec)

mysql>
mysql>
mysql> show variables like 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW |
+---------------+-------+
1 row in set (0.00 sec)

mysql> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin | OFF |
+---------------+-------+
1 row in set (0.00 sec)

mac环境下开启mysql的binlog

编辑my.cnf文件,对于我来说就是/usr/local/etc/my.cnf文件,在其中增加下面的内容:

1
2
3
log-bin = /Users/rollenholt/Downloads/mysql/binlog
binlog-format = ROW
server_id = 1

然后brew services restart mysql重启mysql,接下来我们就可以验证mysql的binlog已经开启了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
mysql> show variables like 'log_bin'
-> ;
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin | ON |
+---------------+-------+
1 row in set (0.00 sec)

mysql>
mysql> show binary logs;
+---------------+-----------+
| Log_name | File_size |
+---------------+-----------+
| binlog.000001 | 177 |
| binlog.000002 | 177 |
| binlog.000003 | 177 |
| binlog.000004 | 154 |
+---------------+-----------+
4 rows in set (0.00 sec)

mysql> show master status;
+---------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------+
| binlog.000004 | 154 | | | |
+---------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> show master logs;
+---------------+-----------+
| Log_name | File_size |
+---------------+-----------+
| binlog.000001 | 177 |
| binlog.000002 | 177 |
| binlog.000003 | 177 |
| binlog.000004 | 154 |
+---------------+-----------+
4 rows in set (0.00 sec)

同时查看我们配置的log-bin属性对于的目录下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
~/Downloads/mysql » ls
binlog.000001 binlog.000002 binlog.000003 binlog.000004 binlog.index
------------------------------------------------------------
~/Downloads/mysql » cat binlog.index
/Users/rollenholt/Downloads/mysql/binlog.000001
/Users/rollenholt/Downloads/mysql/binlog.000002
/Users/rollenholt/Downloads/mysql/binlog.000003
/Users/rollenholt/Downloads/mysql/binlog.000004
------------------------------------------------------------
~/Downloads/mysql » cat binlog.000001
_binR �[w{5.7.20-logR �[8


**4SVѶR �[#��_]�hu �[����% ------------------------------------------------------------
~/Downloads/mysql »

binlog相关的基本命令:

  • 查看是否开启binlog show variables like 'log_bin'
  • 获取binlog文件列表 show binary logs
  • 查看master上的binlog show master logs
  • 只查看第一个binlog文件的内容

    1
    2
    3
    4
    5
    6
    7
    8
    9
    mysql> show binlog events;
    +---------------+-----+----------------+-----------+-------------+---------------------------------------+
    | Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
    +---------------+-----+----------------+-----------+-------------+---------------------------------------+
    | binlog.000001 | 4 | Format_desc | 1 | 123 | Server ver: 5.7.20-log, Binlog ver: 4 |
    | binlog.000001 | 123 | Previous_gtids | 1 | 154 | |
    | binlog.000001 | 154 | Stop | 1 | 177 | |
    +---------------+-----+----------------+-----------+-------------+---------------------------------------+
    3 rows in set (0.01 sec)
  • 查看指定binlog文件的内容 show binlog events in 'binlog.000001'

  • 删除binlog
    • 使用linux命令删除binlog文件
    • 设置binlog的过期时间 使用variable expire_logs_days
    • 手动删除binlog
      • reset master;//删除master的binlog
      • reset slave; //删除slave的中继日志
      • purge master logs before ‘2012-03-30 17:20:00’; //删除指定日期以前的日志索引中binlog日志文件
      • purge master logs to ‘mysql-bin.000002’; //删除指定日志文件的日志索引中binlog日志文件
1
2
3
4
5
6
7
mysql> show variables like '%expire_logs_days%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| expire_logs_days | 0 |
+------------------+-------+
1 row in set (0.00 sec)
  • flush logs 刷新日志
    • 当停止或重启服务器时,服务器会把日志文件记入下一个日志文件,Mysql会在重启时生成一个新的日志文件,文件序号递增;此外,如果日志文件超过max_binlog_size(默认值1G)系统变量配置的上限时,也会生成新的日志文件(在这里需要注意的是,如果你正使用大的事务,二进制日志还会超过max_binlog_size,不会生成新的日志文件,事务全写入一个二进制日志中,这种情况主要是为了保证事务的完整性);日志被刷新时,新生成一个日志文件。

ERROR 1728 (HY000): Cannot load from mysql.procs_priv. The table is probably corrupted

今天在搞mysql binlog收集时,需要创建一个mysql用户,结果出现了:

ERROR 1728 (HY000): Cannot load from mysql.procs_priv. The table is probably corrupted异常

解决办法:

sudo mysql_upgrade -u root -p

注意后面的用户名和密码自己修改为自己的哈。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
~ » sudo mysql_upgrade -u root -p
Password:
Enter password:
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv OK
mysql.db OK
mysql.engine_cost OK
mysql.event OK
mysql.func OK
mysql.general_log OK
mysql.gtid_executed OK
mysql.help_category OK
mysql.help_keyword OK
mysql.help_relation OK
mysql.help_topic OK
mysql.innodb_index_stats OK
mysql.innodb_table_stats OK
mysql.ndb_binlog_index OK
mysql.plugin OK
mysql.proc OK
mysql.procs_priv OK
mysql.proxies_priv OK
mysql.server_cost OK
mysql.servers OK
mysql.slave_master_info OK
mysql.slave_relay_log_info OK
mysql.slave_worker_info OK
mysql.slow_log OK
mysql.tables_priv OK
mysql.time_zone OK
mysql.time_zone_leap_second OK
mysql.time_zone_name OK
mysql.time_zone_transition OK
mysql.time_zone_transition_type OK
mysql.user OK
The sys schema is already up to date (version 1.5.1).
Checking databases.
sys.sys_config OK
temp.hsc_biz_system_customer OK
temp.hsc_settlement_info OK
temp.hsc_settlement_invoice OK
temp.hsc_settlement_object OK
temp.invoice_config OK
test.worker_node OK
test.a OK
test.application OK
test.application_acl OK
test.b OK
test.big_table OK
test.certificate_info_tab OK
test.company_tab OK
test.connection_config OK
test.leader_election OK
test.test OK
test.test1 OK
test.test_emoij OK
test.user OK
Upgrade process completed successfully.
Checking if update is needed.
------------------------------------------------------------

然后重启mysql以后,在尝试创建用户,发现完美解决。

mysql sleep

今天看到了一个sql:

1
select count(*), sleep(5) from test

第一次看到这个sleep函数,所以专门研究了一波。这个函数的语法是:SLEEP(duration), 其中duration的单位是

Sleeps (pauses) for the number of seconds given by the duration argument, then returns 0. The duration may have a fractional part. If the argument is NULL or negative, SLEEP() produces a warning, or an error in strict SQL mode.

When sleep returns normally (without interruption), it returns 0:

简单的说他可以让sql在执行的时候sleep一段时间。

1
2
3
4
5
6
mysql> SELECT SLEEP(1000);
+-------------+
| SLEEP(1000) |
+-------------+
| 0 |
+-------------+

When SLEEP() is the only thing invoked by a query that is interrupted, it returns 1 and the query itself returns no error. This is true whether the query is killed or times out:

  • This statement is interrupted using KILL QUERY from another session:
1
2
3
4
5
6
mysql> SELECT SLEEP(1000);
+-------------+
| SLEEP(1000) |
+-------------+
| 1 |
+-------------+
  • This statement is interrupted by timing out:
1
2
3
4
5
6
mysql> SELECT /*+ MAX_EXECUTION_TIME(1) */ SLEEP(1000);
+-------------+
| SLEEP(1000) |
+-------------+
| 1 |
+-------------+

这里解释一下上面的语法,那个语法是Mysql Optimizer Hints

When SLEEP() is only part of a query that is interrupted, the query returns an error:

  • This statement is interrupted using KILL QUERY from another session:
1
2
mysql> SELECT 1 FROM t1 WHERE SLEEP(1000);
ERROR 1317 (70100): Query execution was interrupted
  • This statement is interrupted by timing out:
1
2
3
mysql> SELECT /*+ MAX_EXECUTION_TIME(1000) */ 1 FROM t1 WHERE SLEEP(1000);
ERROR 3024 (HY000): Query execution was interrupted, maximum statement
execution time exceeded

This function is unsafe for statement-based replication. A warning is logged if you use this function when binlog_format is set to STATEMENT.

参考资料

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×