Slony cluster deployment: problems I ran into, part 1

reddey | 2025-6-23 10:49 | Category: Operations in Practice

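I have recently been studying 从小工到专家 ("From Apprentice to Expert") by Tang Cheng, a well-known figure in the PG community. The book has a strong reputation and is recommended by many industry experts, a number of whom treat it as their introductory primer, so last week I bought a copy on Taobao. My overall impression: the content is detailed and covers almost everything, it includes material that other books do not mention, and in places it goes into real depth. The book was published in 2020 and the PG12 it uses is somewhat dated, but it is very well written. When you work through its experiments, use the same versions of the database and of the open-source tools as the book does.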

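Slony is an open-source cluster tool for PG that synchronizes data between nodes using logical replication. The name is Russian for "elephants"; the project itself was started by Jan Wieck.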

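First, my environment: the hosts run CentOS 7, the database is PG12, and I built Slony from source. It has to be installed on every host node, and the installation directory is /usr/local. slon_tools.conf is the configuration file. The problem I ran into is this: if a node is already running and you change the configuration file, how do you restart that node? Some readers may be at a loss here and resort to killing the process with kill -9 and then starting the node again.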

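The file /home/postgres/run/slony1/cluster01_node2.pid does contain the process PID, but if you kill the process this way you will find that slon restarts almost immediately. Slony is a mature open-source tool, and its developers must have thought long ago about how a node should be restarted. This problem nagged at me for a week, and I could not find an answer in the Slony community either. Today I opened the installation directory and it finally clicked.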

[postgres@pg02 local]$ cd bin
[postgres@pg02 bin]$ ls
coraenv               slonik_execute_script   slonik_unsubscribe_set
dbhome                slonik_failover         slonik_update_nodes
oraenv                slonik_init_cluster     slon_kill
slonik_add_node       slonik_merge_sets       slon_start
slonik_build_env      slonik_move_set         slon_status
slonik_create_set     slonik_print_preamble   slon_watchdog
slonik_drop_node      slonik_restart_node     slon_watchdog2
slonik_drop_sequence  slonik_store_node       slony_show_configuration
slonik_drop_set       slonik_subscribe_set
slonik_drop_table     slonik_uninstall_nodes

Judging by the names of all these scripts, slon_kill is the one that stops a node.

Let's open the script and see how the command is used.

[postgres@pg1 bin]$ cat slon_kill
#!/usr/bin/perl
#
# Kill all slon instances for the current cluster
# Author: Christopher Browne

use Getopt::Long;

# Defaults
$CONFIG_FILE = '/usr/local/etc/slon_tools.conf';
$SHOW_USAGE = 0;
$WATCHDOG_ONLY = 0;
$ONLY_NODE = 0;

# Read command-line options
GetOptions("config=s" => \$CONFIG_FILE,
           "help" => \$SHOW_USAGE,
           "w|watchdog" => \$WATCHDOG_ONLY,
           "only-node=i" => \$ONLY_NODE);

my $USAGE =
"Usage: slon_kill [--config file] [-w|--watchdog]

    --config file  Location of the slon_tools.conf file

    -w
    --watchdog     Only kill the watchdog process(es)

    Kills all running slon and slon_watchdog on this machine for every
    node in the cluster.

    --only-node=i  Only kill slon processes for the indicated node
";

if ($SHOW_USAGE) {
    print $USAGE;
    exit 0;
}

require '/home/postgres/soft/lib//slon-tools.pm';
require $CONFIG_FILE;

print "slon_kill.pl... Killing all slon and slon_watchdog instances for the cluster $CLUSTER_NAME\n";
print "1. Kill slon watchdogs\n";

$found="n";

# kill the watchdogs
if($ONLY_NODE) {
    kill_watchdog($ONLY_NODE);
} else {
    for my $nodenum (@NODES) {
        kill_watchdog($nodenum);
    }
}
if ($found eq 'n') {
    print "No watchdogs found\n";
}

unless ($WATCHDOG_ONLY) {
    print "\n2. Kill slon processes\n";

    # kill the slon daemons
    $found="n";

    if($ONLY_NODE) {
        kill_slon_node( $ONLY_NODE );
    } else {
        for my $nodenum (@NODES) {
            kill_slon_node( $nodenum );
        }
    }

    if ($found eq 'n') {
        print "No slon processes found\n";
    }
}

sub kill_watchdog($) {
    my ($nodenum) = @_;

    my $config_regexp = quotemeta( $CONFIG_FILE );

    my $command = ps_args() . "| egrep \"[s]lon_watchdog[2]? .*=$config_regexp node$nodenum \" | awk '{print \$2}' | sort -n";

    #print "Command:\n$command\n";
    open(PSOUT, "$command|");

    # read one PID per line from the ps pipeline
    while ($pid = <PSOUT>) {
        chomp $pid;
        if (!($pid)) {
            print "No slon_watchdog is running for the cluster $CLUSTER_NAME, node $nodenum!\n";
        } else {
            $found="y";
            kill 9, $pid;
            print "slon_watchdog for cluster $CLUSTER_NAME node $nodenum killed - PID [$pid]\n";
        }
    }
    close(PSOUT);
}

sub kill_slon_node($) {
    my ($nodenum) = @_;

    my $pid = get_pid($nodenum);

    #print "Command:\n$command\n";
    if (!($pid)) {
        print "No slon is running for the cluster $CLUSTER_NAME, node $nodenum!\n";
    } else {
        $found="y";
        kill 15, $pid;
        print "slon for cluster $CLUSTER_NAME node $nodenum killed - PID [$pid]\n";
    }
}
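
To make the second step of the script (the kill 15, $pid call) more concrete, here is a small shell sketch that sends SIGTERM to whatever PID is recorded in a pid file. stop_by_pidfile is a hypothetical helper written for this post, not part of Slony. The listing does not show how the script's own get_pid finds the PID, so reading it from the pid file is an assumption on my part.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slony): send SIGTERM to the process
# whose PID is stored in a pid file, similar in spirit to the
# `kill 15, $pid` step of slon_kill.
stop_by_pidfile() {
    pidfile="$1"
    if [ ! -r "$pidfile" ]; then
        echo "pid file not readable: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    if [ -z "$pid" ]; then
        echo "pid file is empty: $pidfile" >&2
        return 1
    fi
    # kill -0 sends no signal; it only checks that the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        kill -TERM "$pid"
        echo "sent SIGTERM to PID [$pid]"
    else
        echo "no running process for PID [$pid]"
    fi
}
```

On my node it would be called as stop_by_pidfile /home/postgres/run/slony1/cluster01_node2.pid. Note that slon_kill stops the watchdogs before it touches slon; if a watchdog is still running, I would expect slon to come back, just as it did after my kill -9.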

The part of the script that documents how to use the command is:

Usage: slon_kill [--config file] [-w|--watchdog]

    --config file  Location of the slon_tools.conf file

    -w
    --watchdog     Only kill the watchdog process(es)

    Kills all running slon and slon_watchdog on this machine for every
    node in the cluster.

    --only-node=i  Only kill slon processes for the indicated node

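Having read the script, we can see that running slon_kill with no options stops every node on the local machine. Because I deployed one node per host, slon_kill only stops the local node. You can also select a node with --only-node=i, where i is the node number. The script stops the watchdog before the slon process, which is presumably why slon kept coming back after a bare kill -9: the watchdog was still running. When I ran slon_kill, it reported that node 2 had been killed. I then started the node again and, as shown below, it picked up the latest configuration file.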
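
The restart procedure can be wrapped in a few lines of shell. This is only a sketch: restart_slon_node is a hypothetical function name, and it assumes slon_kill, slon_start and slon_status are on PATH and read the default slon_tools.conf.

```shell
#!/bin/sh
# Hypothetical wrapper: restart one slon node with the altperl scripts
# so that an edited slon_tools.conf is read again.
# Assumes slon_kill, slon_start and slon_status are on PATH.
restart_slon_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: restart_slon_node NODE_NUMBER" >&2
        return 2
    fi
    # Stop the watchdog and the slon process for this node only.
    slon_kill --only-node="$node" || return 1
    # Start slon again; the script output shows it starts a watchdog too.
    slon_start "$node" || return 1
    # Report whether slon is running for the node.
    slon_status "$node"
}
```

With one node per host, restart_slon_node 2 on pg02 would do the same thing as running the three commands by hand.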

[postgres@pg02 bin]$ slon_start 2
Invoke slon for node 2 - /home/postgres/soft/bin//slon -p /home/postgres/run/slony1/cluster01_node2.pid -s 1000 -d2 cluster01 'host=192.168.200.40 dbname=slave user=postgres port=5432 password=111' > /var/log/slony1/node2/slave-2025-06-23.log 2>&1 &
Slon successfully started for cluster cluster01, node node2
PID [52274]
Start the watchdog process as well...

To check the status of nodes 1 and 2, use the following commands.

[postgres@pg02 bin]$ slon_status
Usage: slon_status [--config file] node#

--config file Location of the slon_tools.conf file

[postgres@pg02 bin]$ slon_status 2
Slon is running for the 'cluster01' cluster on node2.
[postgres@pg02 bin]$ slon_status 1
slon is not running for cluster.

The Slony processes on this host look like this:

[postgres@pg02 bin]$ ps -ef | grep slony
postgres 52274 1 0 09:13 pts/0 00:00:00 /home/postgres/soft/bin//slon -p /home/postgres/run/slony1/cluster01_node2.pid -s 1000 -d2 cluster01 host=192.168.200.40 dbname=slave user=postgres port=5432 password=111
postgres 52526 51508 0 09:17 pts/0 00:00:00 grep --color=auto slony
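
grep slony only finds the slon process here because its pid-file path happens to contain "slony1", and it also catches the grep itself. A filter closer to the pattern used inside slon_kill may be more useful. The sketch below is my own, with list_slon_procs as a hypothetical name; I am assuming the watchdog appears in ps with slon_watchdog or slon_watchdog2 in its command line, as the egrep expression in the script suggests.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` style lines on stdin and keep only
# slon daemons and slon_watchdog / slon_watchdog2 processes.
# The pattern requires "slon" (optionally followed by _watchdog or
# _watchdog2) to be preceded by "/" or whitespace and followed by a
# space, so a "grep slony" line does not match.
list_slon_procs() {
    grep -E '(/|[[:space:]])slon(_watchdog2?)? '
}
```

It would be used as ps -ef | list_slon_procs, and piping the result through awk '{print $2}' gives just the PIDs.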

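Summary: when you deploy an environment by following someone else's article, you may hit technical problems that you have to face on your own. If you cannot solve one right away, set it aside for a while and try to look at it from the point of view of the architects and developers. They may well have already put the solution in some directory of the software, and it is up to you to go and find it.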
