The official greatsql.service file
[Unit]
Description=GreatSQL Server
Documentation=man:mysqld(8)
Documentation=http://dev.mysql.com/doc/refman/en/using-systemd.html
After=network.target
After=syslog.target
[Install]
WantedBy=multi-user.target
[Service]
# some limits are omitted in this article
User=mysql
Group=mysql
Type=notify
TimeoutSec=10
PermissionsStartOnly=true
ExecStartPre=/usr/local/GreatSQL-8.0.32-27-Linux-glibc2.28-x86_64/bin/mysqld_pre_systemd
ExecStart=/usr/local/GreatSQL-8.0.32-27-Linux-glibc2.28-x86_64/bin/mysqld $MYSQLD_OPTS
EnvironmentFile=-/etc/sysconfig/mysql
Restart=on-failure
RestartPreventExitStatus=1
Environment=MYSQLD_PARENT_PID=1
PrivateTmp=false
In the service file above, what are MYSQLD_OPTS and MYSQLD_PARENT_PID for? How are Type and ExecStart related? What is the logic for stopping the service? What happens when TimeoutSec expires?
[Service]
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
EnvironmentFile=-/data/conf/greatsql
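MYSQLD_OPTS is an environment variable used to pass extra command-line arguments to the mysqld process at startup. It suits scenarios where parameters need to be adjusted dynamically.
MYSQLD_OPTS can be set in the following ways
# set
systemctl set-environment MYSQLD_OPTS="--general_log=1"
# unset
systemctl unset-environment MYSQLD_OPTS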
[Service]
Environment=MYSQLD_OPTS=--general_log=1
EnvironmentFile=-/data/conf/greatsql
[Service]
Environment=LD_PRELOAD=/usr/local/jemalloc-5.3.0/lib/libjemalloc.so
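# overrides the earlier variable of the same name
Environment=LD_PRELOAD=/data/svr/greatsql/lib/mysql/libjemalloc.so
# an empty assignment clears all environment variables set so far
Environment=
If the same variable is set more than once, the later assignment overrides the earlier value. Assigning an empty string to this option resets the list of environment variables, and all earlier settings lose effect.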
[Service]
# the leading "-" means a missing file is not treated as an error
EnvironmentFile=-/etc/sysconfig/mysql
EnvironmentFile=-/data/conf/greatsql
# an empty assignment clears the list of files to read
EnvironmentFile=
$ cat /data/conf/greatsql
LD_PRELOAD=/data/svr/greatsql/lib/mysql/libjemalloc.so
LD_LIBRARY_PATH=/data/svr/greatsql/lib
TZ=CST
MYSQLD_OPTS=--general_log=1 --port=4307
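EnvironmentFile can be specified multiple times, and every matching file is read. Assigning an empty string to this option clears the list of files to read, and all earlier settings lose effect.
Files given by EnvironmentFile are read in order; a variable loaded later overrides an earlier one, and also overrides a variable of the same name set with Environment.
Environment and EnvironmentFile are resolved before the service's commands run. The variables are written directly into the service's environment and are visible to all subsequent commands (ExecStartPre, ExecStart, ExecStartPost).
If the file named by EnvironmentFile is generated at runtime, systemd still tries to read it. The file is read shortly before each command is executed, so content written by an earlier step (for example ExecStartPre) is seen by the commands that follow.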
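The override order described above can be sketched in shell. The helper below is a hypothetical illustration, not a systemd tool: it takes KEY=VALUE lines in the order systemd would load them (Environment= entries first, then each EnvironmentFile in turn) and prints the value that wins for each variable.

```shell
# effective_env: read KEY=VALUE lines from stdin in load order and print the
# final value of every variable (a later assignment replaces an earlier one).
# Hypothetical helper for illustration only; systemd does this internally.
effective_env() {
    awk '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        {
            eq = index($0, "=")
            if (eq == 0) next                   # not an assignment
            key = substr($0, 1, eq - 1)
            if (!(key in val)) order[++n] = key # remember first-seen order
            val[key] = substr($0, eq + 1)       # later value wins
        }
        END { for (i = 1; i <= n; i++) print order[i] "=" val[order[i]] }
    '
}

# Environment= entries from the unit first, then the EnvironmentFile contents.
printf '%s\n' \
    'MYSQLD_OPTS=--general_log=1' \
    'LD_PRELOAD=/usr/local/jemalloc-5.3.0/lib/libjemalloc.so' \
    'LD_PRELOAD=/data/svr/greatsql/lib/mysql/libjemalloc.so' \
    'MYSQLD_OPTS=--general_log=1 --port=4307' | effective_env
```

On a live system the effective values are better checked on the running process itself, since the sketch only models the ordering rule.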
systemd creates and strictly manages service processes through fork-exec plus cgroups, which ensures that every service process is its child.
Uses fork() + execve() to spawn the new process:
- fork(): creates a child process (a copy of the systemd parent).
- execve(): overwrites the child process with the target binary.
Assigns the process to a dedicated cgroup:
- Ensures all child processes remain within the same cgroup.
- Enables resource limits and process tracking.
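The fork-then-exec step can be observed from a plain shell. This is only a small demonstration of the mechanism, not of systemd itself, and it does not touch cgroups; the function name fork_exec_demo is made up for this sketch.

```shell
# fork_exec_demo: the calling shell forks a child to run "sh -c ..."; the
# child prints its PID, then exec replaces it with a second sh, which prints
# its PID again. Both lines show the same number because exec keeps the PID.
fork_exec_demo() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

fork_exec_demo
```

The same property is why the Main PID systemd records for a Type=simple or Type=notify service is the PID of the process it forked.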
[Service]
Type=simple
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
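If the command started by ExecStart runs in daemon mode, the daemon is created through an intermediate parent that exits almost immediately, and that intermediate parent is the child process systemd started. When this child exits, systemd drops it from its monitoring queue and kills all remaining processes of the service (how they are killed is controlled by KillMode).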
# KillMode=control-group
$ systemctl start db-4306
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Fri 2025-05-30 11:03:26 CST; 9s ago
Process: 1914 ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
Main PID: 1914 (code=exited, status=0/SUCCESS)
Jun 05 11:03:22 dbcluster-165 systemd[1]: Started db-4306 Server.
$ ps aux |grep 4306 |grep -v grep
With Type=simple and a daemonizing command, the service stops right after it starts by default.
[Service]
Type=forking
PIDFile=/data/dbdata/data4306/data/mysql.pid
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize $MYSQLD_OPTS
The following is a service started normally in forking mode
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2025-05-29 22:28:03 CST; 11s ago
Process: 24262 ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
Main PID: 24342 (mysqld)
Tasks: 54
CGroup: /system.slice/db-4306.service
└─24342 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --daemonize
May 29 22:28:01 dbcluster-165 systemd[1]: Starting db-4306 Server...
May 29 22:28:03 dbcluster-165 systemd[1]: Started db-4306 Server.
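The process started by ExecStart has PID 24262 and has already exited with status 0. It is the intermediate parent that exits right away while the daemon is being created. Main PID: 24342 (mysqld) is the main service process that systemd actually monitors.
If ExecStart is a foreground command, systemd keeps waiting for the process started by ExecStart to exit as the intermediate parent. While it waits, systemctl start hangs until the start timeout expires and the start fails.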
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: activating (start) since Fri 2025-05-30 17:25:01 CST; 52s ago
Main PID: 27683 (code=exited, status=0/SUCCESS); : 12646 (mysqld)
Tasks: 54
CGroup: /system.slice/db-4306.service
└─12646 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
May 30 17:25:01 dbcluster-165 systemd[1]: Starting db-4306 Server...
$ ps axj |grep 4306 |grep -v grep
18266 12640 12640 18266 pts/1 12640 S+ 0 0:00 systemctl start db-4306
1 12646 12646 12646 ? -1 Ssl 986 0:02 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
$ tailf /var/log/messages |grep db-4306
May 30 17:25:01 dbcluster-165 systemd: Starting db-4306 Server...
May 30 17:26:31 dbcluster-165 systemd: db-4306.service start operation timed out. Terminating.
May 30 17:26:32 dbcluster-165 systemd: Failed to start db-4306 Server.
May 30 17:26:32 dbcluster-165 systemd: Unit db-4306.service entered failed state.
May 30 17:26:32 dbcluster-165 systemd: db-4306.service failed.
May 30 17:26:32 dbcluster-165 systemd: db-4306.service holdoff time over, scheduling restart.
May 30 17:26:32 dbcluster-165 systemd: Stopped db-4306 Server.
May 30 17:26:32 dbcluster-165 systemd: Starting db-4306 Server...
With Type=forking and a foreground command, under Restart=on-failure, the start timeout causes the service to restart over and over.
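The two mismatches shown so far (Type=simple with a daemonizing command, Type=forking with a foreground command) can be caught with a rough lint before the unit is deployed. The sketch below is hypothetical and deliberately narrow: it only looks for the mysqld --daemonize flag on the ExecStart line, and the function name is an assumption of this example.

```shell
# check_type_vs_execstart UNIT_FILE: print a warning when Type= and the
# ExecStart= command line disagree about daemonizing. Only the mysqld
# --daemonize flag is recognized; other daemons use other flags.
check_type_vs_execstart() {
    svc_type=$(sed -n 's/^Type=//p' "$1" | tail -n 1)
    svc_type=${svc_type:-simple}      # systemd default when ExecStart is set
    if grep -q '^ExecStart=.*--daemonize' "$1"; then
        daemonizes=yes
    else
        daemonizes=no
    fi
    case "$svc_type:$daemonizes" in
        forking:no)
            echo "WARN: Type=forking but ExecStart stays in the foreground; start will time out" ;;
        simple:yes)
            echo "WARN: Type=simple but ExecStart daemonizes; the service stops right after start" ;;
        *)
            echo "OK: no mismatch detected for Type=$svc_type" ;;
    esac
}
```

Usage would be along the lines of check_type_vs_execstart /usr/lib/systemd/system/db-4306.service.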
[Service]
Type=notify
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
When the database is started with mysqld_safe, ps shows the mysqld process with many options on its command line
$ ps aux |grep 4306
mysql 8787 0.0 0.0 113316 1640 pts/1 S 08:59 0:00 /bin/sh /data/svr/greatsql/bin/mysqld_safe --defaults-file=/data/conf/greatsql4306.cnf
mysql 10424 0.6 3.0 1251912 499724 pts/1 Sl 08:59 0:16 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --basedir=/data/svr/greatsql --datadir=/data/dbdata/data4306/data --plugin-dir=/data/svr/greatsql/lib/plugin --log-error=/data/logs/error4306.log --open-files-limit=65535 --pid-file=/data/dbdata/data4306/data/mysql.pid --socket=/data/dbdata/data4306/data/mysql.sock --port=4306
$
mysqld_safe builds the command as follows
cmd="`mysqld_ld_preload_text`$NOHUP_NICENESS"
for i in "$ledir/$MYSQLD" "$defaults" "--basedir=$MY_BASEDIR_VERSION" \
"--datadir=$DATADIR" "--plugin-dir=$plugin_dir" "$USER_OPTION"
do
cmd="$cmd "`shell_quote_string "$i"`
done
cmd="$cmd $args"
# Avoid 'nohup: ignoring input' warning
test -n "$NOHUP_NICENESS" && cmd="$cmd < /dev/null"
log_notice "Starting $MYSQLD daemon with databases from $DATADIR"
For a systemd service, you can add an ExecStartPre that extracts the options to display from the file given by --defaults-file
[Service]
Type=notify
ExecStartPre=-/bin/bash -c "sed 's/_/-/g; s/ //g; s/#.*//' /data/conf/greatsql4306.cnf |grep -E '^(basedir|datadir|log-error|socket|port)=' |sed 's/^/--/' |tr '\n' ' ' |sed 's/^/MYSQLD_OPTS=/' > /data/conf/greatsql4306.env"
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
EnvironmentFile=-/data/conf/greatsql4306.env
# result after startup
$ systemctl -l status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2025-06-06 10:27:16 CST; 7h ago
Main PID: 12020 (mysqld)
Status: "Server is operational"
Tasks: 53
CGroup: /system.slice/db-4306.service
└─12020 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf --port=4306 --basedir=/data/svr/greatsql --datadir=/data/dbdata/data4306/data --pid-file=/data/dbdata/data4306/data/mysql.pid --socket=/data/dbdata/data4306/data/mysql.sock --log-error=/data/logs/error4306.log
Jun 06 10:27:14 dbcluster-165 systemd[1]: Starting db-4306 Server...
Jun 06 10:27:16 dbcluster-165 systemd[1]: Started db-4306 Server.
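One caveat with the ExecStartPre pipeline above: its first sed expression rewrites every underscore and space in the file, including those inside values, so a path such as /data/db_data would be altered. A more conservative sketch, assuming options are written as plain unquoted key=value lines, normalizes only the key. The function name cnf_to_opts is hypothetical, and the key list simply mirrors the one in the grep above.

```shell
# cnf_to_opts FILE: print selected options from a my.cnf-style file as one
# line of --key=value arguments. Only the key is normalized (underscores to
# dashes, blanks removed); the value is kept as written, trimmed of
# surrounding blanks. Assumes unquoted key=value lines without inline comments.
cnf_to_opts() {
    awk -F= '
        /^[ \t]*(#|;|\[|$)/ { next }            # comments, sections, blanks
        NF >= 2 {
            key = $1
            gsub(/[ \t]/, "", key)
            gsub(/_/, "-", key)
            if (key !~ /^(basedir|datadir|log-error|socket|port)$/) next
            val = substr($0, index($0, "=") + 1)
            sub(/^[ \t]+/, "", val); sub(/[ \t]+$/, "", val)
            printf "%s--%s=%s", sep, key, val
            sep = " "
        }
        END { print "" }
    ' "$1"
}

# Possible use from a wrapper script called by ExecStartPre:
#   printf 'MYSQLD_OPTS=%s\n' "$(cnf_to_opts /data/conf/greatsql4306.cnf)" \
#       > /data/conf/greatsql4306.env
```

Keeping the logic in a script file also avoids the layered quoting that an inline bash -c command needs inside a unit file.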
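For automatic start at boot, the database service may fail if the disk mount service is slow to start. The database service can be configured to start with a delay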
[Unit]
Description=GreatSQL Server
# in some environments After=/Requires=local-fs.target has no effect
After=network.target local-fs.target
Requires=local-fs.target
[Service]
Type=notify
Environment=LD_PRELOAD=/usr/local/jemalloc-5.3.0/lib/libjemalloc.so
# the approach used in this section
ExecStartPre=-/usr/bin/sleep 5
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf $MYSQLD_OPTS
The service log is as follows
$ systemctl start db-4306
$ systemctl -l status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2025-06-13 23:16:09 CST; 15s ago
Process: 3920 ExecStartPre=/usr/bin/sleep 5 (code=exited, status=0/SUCCESS)
Main PID: 4004 (mysqld)
Status: "Server is operational"
Tasks: 54
CGroup: /system.slice/db-4306.service
└─4004 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
Jun 13 23:16:02 dbcluster-165 systemd[1]: Starting db-4306 Server...
Jun 13 23:16:02 dbcluster-165 sleep[3920]: ERROR: ld.so: object '/usr/local/jemalloc-5.3.0/lib/libjemalloc.so' from LD_PRELOAD cannot be preloaded: ignored.
Jun 13 23:16:09 dbcluster-165 systemd[1]: Started db-4306 Server.
$ lsof -p 4004 |grep -i jem
mysqld 4004 mysql mem REG 8,2 10479400 846986 /usr/local/jemalloc-5.3.0/lib/libjemalloc.so.2
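Cause of the ERROR in systemctl status: ExecStartPre runs with the service's environment variables, and at that point the disk mount has not finished, so the .so file cannot be loaded (this ERROR can be ignored). As long as the delay is long enough for the mount to finish before ExecStart runs, the configured .so file is loaded normally.
In scenarios such as version upgrades, innodb_fast_shutdown=0 is usually set, which makes shutting down the database slower. If TimeoutStopSec is too small, ExecStop or SIGTERM may time out and trigger SIGKILL.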
[Service]
Type=notify
ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
Restart=on-failure
$ date;systemctl stop db-4306
Tue Jun 3 15:37:19 CST 2025
$ systemctl status db-4306
● db-4306.service - db-4306 Server
Loaded: loaded (/usr/lib/systemd/system/db-4306.service; enabled; vendor preset: disabled)
Active: failed (Result: signal) since Tue 2025-06-03 15:37:21 CST; 36s ago
Process: 26645 ExecStop=/usr/bin/sleep 60 (code=killed, signal=TERM)
Process: 26322 ExecStart=/data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf (code=killed, signal=KILL)
Main PID: 26322 (code=killed, signal=KILL)
Status: "Server shutdown in progress"
Jun 03 15:37:01 dbcluster-165 systemd[1]: Starting db-4306 Server...
Jun 03 15:37:03 dbcluster-165 systemd[1]: Started db-4306 Server.
Jun 03 15:37:19 dbcluster-165 systemd[1]: Stopping db-4306 Server...
Jun 03 15:37:20 dbcluster-165 systemd[1]: db-4306.service stopping timed out. Terminating.
Jun 03 15:37:21 dbcluster-165 systemd[1]: db-4306.service stop-sigterm timed out. Killing.
Jun 03 15:37:21 dbcluster-165 systemd[1]: db-4306.service: main process exited, code=killed, status=9/KILL
Jun 03 15:37:21 dbcluster-165 systemd[1]: Stopped db-4306 Server.
Jun 03 15:37:21 dbcluster-165 systemd[1]: Unit db-4306.service entered failed state.
Jun 03 15:37:21 dbcluster-165 systemd[1]: db-4306.service failed.
When systemctl stop times out (ExecStop, then SIGTERM), the service is killed with SIGKILL. systemd treats this as expected behavior and does not trigger a restart.
greatsql> shutdown;
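systemd applies full lifecycle control to every service it manages. When shutdown is executed from the client, the signal chain is: shutdown --> database service --> systemd service manager. systemd monitors the whole stop process.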
Jun 3 17:32:23 dbcluster-165 systemd: db-4306.service stop-sigterm timed out. Killing.
Jun 3 17:32:24 dbcluster-165 systemd: db-4306.service: main process exited, code=killed, status=9/KILL
Jun 3 17:32:24 dbcluster-165 systemd: Unit db-4306.service entered failed state.
Jun 3 17:32:24 dbcluster-165 systemd: db-4306.service failed.
Jun 3 17:32:24 dbcluster-165 systemd: db-4306.service holdoff time over, scheduling restart.
Jun 3 17:32:24 dbcluster-165 systemd: Stopped db-4306 Server.
Jun 3 17:32:24 dbcluster-165 systemd: Starting db-4306 Server...
Jun 3 17:32:25 dbcluster-165 systemd: Started db-4306 Server.
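When SIGTERM times out, SIGKILL is sent to force termination, and the exit cause is classified as an unclean signal. With Restart=on-failure, the service restarts after waiting RestartSec (100ms by default).
When a command-line shutdown times out (SIGTERM), the service is killed with SIGKILL, and a restart is then triggered.
To disable systemd's default stop behavior, the following settings can serve as a reference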
[Service]
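# this script must make sure that all processes are terminated
ExecStop=/path/to/your-stop-script
# disable systemd's default kill behavior
KillMode=none
# keep the stop timeout from intervening
TimeoutStopSec=0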
Configures whether the service shall be restarted when the service process exits, is killed, or a timeout is reached. The service process may be the main service process, but it may also be one of the processes specified with ExecStartPre=, ExecStartPost=, ExecStop=, ExecStopPost=, or ExecReload=. When the death of the process is a result of systemd operation (e.g. service stop or restart), the service will not be restarted. Timeouts include missing the watchdog "keep-alive ping" deadline and a service start, reload, and stop operation timeouts.
Restart settings/Exit causes | no | always | on-success | on-failure | on-abnormal | on-abort | on-watchdog |
---|---|---|---|---|---|---|---|
Clean exit code or signal | | X | X | | | | |
Unclean exit code | | X | | X | | | |
Unclean signal | | X | | X | X | X | |
Timeout | | X | | X | X | | |
Watchdog | | X | | X | X | | X |
A clean exit means an exit code of 0, or one of the signals SIGHUP, SIGINT, SIGTERM or SIGPIPE, and additionally, exit statuses and signals specified in SuccessExitStatus=.
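The table can be encoded as a small lookup, which is handy when reasoning about a given Restart= value. This is a sketch of the table only: the cause labels are names chosen here rather than systemd keywords, and it ignores the rule that a death caused by a systemd operation (such as systemctl stop) never leads to a restart, as well as RestartPreventExitStatus and RestartForceExitStatus.

```shell
# will_restart SETTING CAUSE: succeed (exit 0) if a service configured with
# Restart=SETTING is restarted after exiting for CAUSE, following the table
# above. CAUSE is one of: clean, unclean-exit, unclean-signal, timeout,
# watchdog (labels chosen for this sketch, not systemd keywords).
will_restart() {
    case "$1:$2" in
        always:*)                                           return 0 ;;
        on-success:clean)                                   return 0 ;;
        on-failure:unclean-exit|on-failure:unclean-signal)  return 0 ;;
        on-failure:timeout|on-failure:watchdog)             return 0 ;;
        on-abnormal:unclean-signal|on-abnormal:timeout)     return 0 ;;
        on-abnormal:watchdog)                               return 0 ;;
        on-abort:unclean-signal)                            return 0 ;;
        on-watchdog:watchdog)                               return 0 ;;
        *)                                                  return 1 ;;
    esac
}

# The SIGKILL after a timed-out client shutdown is an unclean signal:
will_restart on-failure unclean-signal && echo "Restart=on-failure restarts the service"
```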
greatsql> restart;
ERROR 3707 (HY000): Restart server failed (mysqld is not managed by supervisor process).
Cause: no supervisor (monitoring) process is configured (https://dev.mysql.com/doc/refman/8.0/en/restart.html)
$ ps aux |grep 4306 |grep -v grep
mysql 21055 0.5 3.1 2635940 505712 ? Ssl Jun03 7:31 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
$ cat /proc/21055/environ |tr '\0' '\n'
LANG=en_US.UTF-8
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
NOTIFY_SOCKET=/run/systemd/notify
HOME=/home/mysql
LOGNAME=mysql
USER=mysql
SHELL=/bin/bash
#echo "Running mysqld: [$cmd]"
cmd="env MYSQLD_PARENT_PID=$$ $cmd"
eval "$cmd"
[Service]
Environment=MYSQLD_PARENT_PID=1
$ systemctl restart db-4306
$ ps aux |grep 4306 |grep -v grep
mysql 11050 0.9 3.0 2635676 500004 ? Ssl 15:53 0:04 /data/svr/greatsql/bin/mysqld --defaults-file=/data/conf/greatsql4306.cnf
$ cat /proc/11050/environ |tr '\0' '\n' |grep MYSQLD_PARENT_PID
MYSQLD_PARENT_PID=1
greatsql> restart;
Query OK, 0 rows affected (0.00 sec)
If the restart command must work without restarting the database first, the environment variable can be changed on the fly with gdb; gdb blocks the database for a short time
$ gdb -p 21055
(gdb) call putenv("MYSQLD_PARENT_PID=1")
$2 = 0
(gdb) detach
Detaching from program: /data/svr/GreatSQL-8.0.32-26-Linux-glibc2.17-x86_64/bin/mysqld, process 21055
(gdb) quit
greatsql> restart;
Query OK, 0 rows affected (0.00 sec)
clone also needs MYSQLD_PARENT_PID to be set before it can restart the instance automatically.
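Before relying on restart or clone, it can help to check whether a running mysqld was started with the variable. A minimal sketch, assuming Linux /proc and permission to read the target's environ file; the function name proc_has_env is made up here.

```shell
# proc_has_env PID NAME: succeed if process PID was started with environment
# variable NAME set. Reads /proc/PID/environ (Linux only), which holds the
# environment passed at exec time, so a variable added afterwards with
# putenv() inside the process (the gdb approach) normally does not show up.
proc_has_env() {
    tr '\0' '\n' 2>/dev/null < "/proc/$1/environ" | grep -q "^$2="
}

# Example (PID taken from the ps output):
#   proc_has_env 11050 MYSQLD_PARENT_PID && echo "restart is available"
```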
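RestartSec: how long to wait before restarting once a restart condition is met
StartLimitInterval, StartLimitBurst: limit the number of starts (StartLimitBurst) allowed within a given period (StartLimitInterval)
RestartPreventExitStatus: exit status codes or signals for which the service is not restarted. Setting it to 1 is recommended for a GreatSQL service, so that the instance is not restarted after a serious error and waits for manual intervention
RestartForceExitStatus: exit status codes or signals that always trigger a restart. For example, with Restart=no and RestartForceExitStatus=16 there is no automatic restart on failure, but the restart command run from the client still restarts the instance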
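How StartLimitInterval and StartLimitBurst interact with repeated restarts can be modeled in a few lines. The function below is a simplified fixed-window model written for this article, not systemd's code: a window opens at the first start, up to StartLimitBurst starts are allowed inside it, and a start after the window has elapsed opens a new one.

```shell
# start_limit_hit INTERVAL BURST T1 T2 ...: given start times in seconds
# (ascending), print the first time at which a start would be refused under
# a simplified fixed-window model of StartLimitInterval/StartLimitBurst and
# return 0; return 1 if the limit is never hit.
start_limit_hit() {
    interval=$1; burst=$2; shift 2
    begin=; count=0
    for t in "$@"; do
        if [ -z "$begin" ] || [ "$t" -gt $((begin + interval)) ]; then
            begin=$t; count=1            # new window opens with this start
        elif [ "$count" -lt "$burst" ]; then
            count=$((count + 1))         # still within the burst allowance
        else
            echo "$t"; return 0          # burst exhausted inside the window
        fi
    done
    return 1
}

# Six quick starts with interval=10s and burst=5: the sixth is refused.
start_limit_hit 10 5 0 1 2 3 4 5
```

Under this model, restarts that are each separated by a long start timeout, as in the Type=forking example with a foreground command, never accumulate inside one window, which is consistent with that service restarting indefinitely.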