04_Online installation of a K3s cluster with an external MySQL database

I. Prepare the Linux nodes that will run Rancher Server

1. Before cloning the virtual machines, update the template to the latest packages:

yum update -y

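2. K3s installation requirements

Before installing K3s, make sure the installation requirements are met:
https://docs.rancher.cn/docs/k3s/installation/installation-requirements/_index/

Rancher UI fails to open because of an expired certificate:
https://www.mayanpeng.cn/archives/120.html
Fixing expired K3s certificates:
https://blog.51cto.com/3138583/2466781
https://www.yisu.com/zixun/4290.html
We recommend installing ntp (Network Time Protocol) to prevent certificate validation errors caused by clock drift between client and server.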

1. Host and IP plan

No.  IP            Hostname        Purpose                      OS
1    172.16.7.211  mariadb-master  MySQL database (primary)     CentOS 7.9
2    172.16.7.212  mariadb-slave   MySQL database (replica)     CentOS 7.9
3    172.16.7.213  harbor-master   Harbor registry (primary)    CentOS 7.9
4    172.16.7.214  harbor-slave    Harbor registry (replica)    CentOS 7.9
5    172.16.7.215  rancher-slb     Load balancer                CentOS 7.9
6    172.16.7.216  rancher-01      Rancher-master               CentOS 7.9
7    172.16.7.217  rancher-02      Rancher-master               CentOS 7.9
8    172.16.7.218  vip             Virtual IP for worker nodes  CentOS 7.9
9    172.16.7.219  worker-01       Worker node                  CentOS 7.9
10   172.16.7.220  worker-02       Worker node                  CentOS 7.9
11   172.16.7.221  worker-03       Worker node                  CentOS 7.9

2. CPU and memory requirements for a K3s high-availability installation

Deployment size  Clusters     Nodes         vCPUs  RAM     Database size
Small            Up to 150    Up to 1,500   2      8 GB    2 cores, 4 GB + 1000 IOPS
Medium           Up to 300    Up to 3,000   4      16 GB   2 cores, 4 GB + 1000 IOPS
Large            Up to 500    Up to 5,000   8      32 GB   2 cores, 4 GB + 1000 IOPS
X-Large          Up to 1,000  Up to 10,000  16     64 GB   2 cores, 4 GB + 1000 IOPS
XX-Large         Up to 2,000  Up to 20,000  32     128 GB  2 cores, 4 GB + 1000 IOPS

Contact Rancher if you need to manage 2,000+ clusters and/or 20,000+ nodes.

3. Edit the hosts file on each machine

# (1) /etc/hosts on rancher-slb
[root@rancher-slb ~]# cat /etc/hosts

172.16.7.215    rancher-slb  rancher.zyrox.com
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

# (2) /etc/hosts on rancher-01
[root@rancher-01 ~]# cat /etc/hosts

172.16.7.216  rancher-01
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.7.215    rancher.zyrox.com

# (3) /etc/hosts on rancher-02
[root@rancher-02 ~]# cat /etc/hosts

172.16.7.217   rancher-02
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.7.215   rancher.zyrox.com

# (4) /etc/hosts on worker-01
[root@worker-01 ~]# cat /etc/hosts

172.16.7.219    worker-01
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.7.215 rancher.zyrox.com

# (5) /etc/hosts on worker-02
[root@worker-02 ~]# cat /etc/hosts

172.16.7.220    worker-02
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.7.215    rancher.zyrox.com

# (6) /etc/hosts on worker-03
[root@worker-03 ~]# cat /etc/hosts

172.16.7.221    worker-03
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.7.215    rancher.zyrox.com

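Note:
If a worker node lacks the domain-to-IP mapping for rancher.zyrox.com, then when you later create a workload cluster and run the join script on that worker, the node does not appear in the Rancher UI and registration does not proceed.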
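
Since the same rancher.zyrox.com mapping has to be added on every node, a small helper can keep the edit idempotent. The following is only a sketch: the function name add_host_entry is hypothetical, and it assumes one hostname per call.

```shell
# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless an active (non-comment) line already lists NAME.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # awk exits 0 when NAME already appears as a hostname on some non-comment line.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s    %s\n' "$ip" "$name" >> "$hosts_file"
}
```

On a worker node this could be called as: add_host_entry /etc/hosts 172.16.7.215 rancher.zyrox.com. Running it a second time leaves the file unchanged.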

4. Disable related services

# (1) Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# (2) Turn off swap
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# (3) Disable SELinux
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
# (4) Stop and disable the mail service
systemctl stop postfix && systemctl disable postfix
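
The sed expression for swap only matches entries that have a literal space on both sides of "swap", and a second run would comment the same line twice. As a hedged alternative, the sketch below keys on the filesystem-type column instead; the function name comment_out_swap is hypothetical and it has only been reasoned through against simple fstab layouts.

```shell
# comment_out_swap FILE
# Prefixes every active swap entry in an fstab-style FILE with '#'.
comment_out_swap() {
    fstab=$1
    tmp=$(mktemp) || return 1
    # Field 3 of an fstab entry is the filesystem type; already-commented lines are kept as-is.
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$fstab" > "$tmp" &&
        cat "$tmp" > "$fstab"
    rm -f "$tmp"
}
```

A possible use is: swapoff -a && comment_out_swap /etc/fstab. Because commented lines are skipped, repeated runs do not stack extra '#' characters.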

5. Upgrade the kernel to 4.10+

Upload the kernel RPM package: kernel-ml-5.5.11-1.el7.elrepo.x86_64.rpm

rpm -ivh kernel-ml-5.5.11-1.el7.elrepo.x86_64.rpm
grub2-set-default 0
reboot
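
After the reboot it is worth confirming that the node actually booted the new kernel. The sketch below compares a release string in `uname -r` format against a minimum major.minor version; kernel_at_least is a hypothetical helper name, not part of any tool used here.

```shell
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds when RELEASE (in `uname -r` format) is at least MAJOR.MINOR.
kernel_at_least() {
    release=$1
    want_major=$2
    want_minor=$3
    major=${release%%.*}      # text before the first dot
    rest=${release#*.}        # text after the first dot
    minor=${rest%%[!0-9]*}    # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

For example: kernel_at_least "$(uname -r)" 4 10 && echo "kernel OK". If the check fails after installing the RPM, the GRUB default entry is the first thing to look at.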

II. Node OS tuning

1. Kernel tuning

echo "
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
net.ipv4.conf.all.forwarding=1
net.ipv4.neigh.default.gc_thresh1=4096
net.ipv4.neigh.default.gc_thresh2=6144
net.ipv4.neigh.default.gc_thresh3=8192
net.ipv4.neigh.default.gc_interval=60
net.ipv4.neigh.default.gc_stale_time=120

# See https://github.com/prometheus/node_exporter#disabled-by-default
kernel.perf_event_paranoid=-1

#sysctls for k8s node config
net.ipv4.tcp_slow_start_after_idle=0
net.core.rmem_max=16777216
fs.inotify.max_user_watches=524288
kernel.softlockup_all_cpu_backtrace=1

kernel.softlockup_panic=0

kernel.watchdog_thresh=30
fs.file-max=2097152
fs.inotify.max_user_instances=8192
fs.inotify.max_queued_events=16384
vm.max_map_count=262144
fs.may_detach_mounts=1
net.core.netdev_max_backlog=16384
net.ipv4.tcp_wmem=4096 12582912 16777216
net.core.wmem_max=16777216
net.core.somaxconn=32768
net.ipv4.tcp_max_syn_backlog=8096
net.ipv4.tcp_rmem=4096 12582912 16777216

net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1

kernel.yama.ptrace_scope=0
vm.swappiness=0

# Controls whether the PID is appended to core file names.
kernel.core_uses_pid=1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route=0
net.ipv4.conf.all.accept_source_route=0

# Promote secondary addresses when the primary address is removed
net.ipv4.conf.default.promote_secondaries=1
net.ipv4.conf.all.promote_secondaries=1

# Enable hard and soft link protection
fs.protected_hardlinks=1
fs.protected_symlinks=1

# Reverse-path filtering (source address validation)
# see details in https://help.aliyun.com/knowledge_detail/39428.html
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.default.arp_announce=2
net.ipv4.conf.lo.arp_announce=2
net.ipv4.conf.all.arp_announce=2

# see details in https://help.aliyun.com/knowledge_detail/41334.html
net.ipv4.tcp_max_tw_buckets=5000
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_fin_timeout=30
net.ipv4.tcp_synack_retries=2
kernel.sysrq=1

" >> /etc/sysctl.conf

Apply the sysctl.conf settings:

sysctl -p
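
Because the settings are appended with >>, running the block more than once writes every key again, and when a key appears several times the last assignment is the one that takes effect. The sketch below lists keys that are assigned more than once so the file can be cleaned up; duplicate_sysctl_keys is a hypothetical helper name.

```shell
# duplicate_sysctl_keys FILE
# Prints each key that is assigned more than once in a sysctl-style FILE.
duplicate_sysctl_keys() {
    # Strip comments, drop blank lines, reduce "key = value" to "key", then report repeats.
    sed -e 's/[[:space:]]*#.*$//' \
        -e '/^[[:space:]]*$/d' \
        -e 's/[[:space:]]*=.*$//' \
        -e 's/^[[:space:]]*//' "$1" |
        sort | uniq -d
}
```

Usage: duplicate_sysctl_keys /etc/sysctl.conf. Separately, if sysctl -p reports that the net.bridge.* keys do not exist, the br_netfilter kernel module usually has to be loaded first (modprobe br_netfilter).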

2. nofile

cat >> /etc/security/limits.conf <<EOF
* soft nofile 65535
* hard nofile 65536
EOF

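III. Docker tuning

1. Upload/download tuning

1.1 Raise the maximum number of concurrent image transfers

Setting max-concurrent-downloads and max-concurrent-uploads raises the number of image layers transferred in parallel, which shortens image pull and push times.

1.2 Configure a registry mirror

Setting registry-mirrors to a nearby mirror can greatly speed up image pulls.

1.3 Storage driver tuning

When configuring the Docker storage driver, prefer the newer overlay2 because it is more stable. OverlayFS is a modern union filesystem, similar to AUFS
but faster and simpler in implementation. Docker provides two storage drivers for OverlayFS: the legacy overlay and the newer overlay2.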

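1.4 Log file tuning

Containers generate large volumes of logs that can easily fill the disk. You can cap the size of each log file (max-size) and the number of log files (max-file) globally
to keep log disk usage under control. For example, set max-size to 30 MB and max-file to 10; the sample daemon.json below uses 100m and 3 instead, which gives the same 300 MB ceiling per container.
After changing the settings, run systemctl daemon-reload to reload the configuration, then run systemctl restart docker to restart Docker.
The limits apply after the restart to newly created containers; existing containers keep the logging settings they were created with. With max-size 30 MB and max-file 10, log storage works as follows:
While the log is below 30 MB, only one .log file is created to hold it.
Once the log exceeds 30 MB but stays below 300 MB (max-file x max-size), the content is spread across
.log, .log.1, .log.2 ... .log.n (n less than or equal to 9).
Once the log exceeds 300 MB (max-file x max-size), the oldest content is replaced by the most recent content,
file by file in order of file age across .log, .log.1, .log.2 ... .log.n.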
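
The ceiling described above is simple arithmetic: max-size multiplied by max-file, per container. A minimal sketch for estimating the worst case on a node follows; max_log_mb is a hypothetical helper and the container count is an assumption you supply.

```shell
# max_log_mb MAX_SIZE_MB MAX_FILE [CONTAINERS]
# Worst-case json-file log usage in MB; CONTAINERS defaults to 1.
max_log_mb() {
    echo $(( $1 * $2 * ${3:-1} ))
}
```

For instance, max_log_mb 30 10 and max_log_mb 100 3 both print 300, and max_log_mb 100 3 50 prints 15000, roughly 15 GB for 50 containers.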

cat > /etc/docker/daemon.json <<EOF
{
    "oom-score-adjust": -1000,
    "log-driver": "json-file",
    "log-opts": {"max-size": "100m","max-file": "3"},
    "max-concurrent-downloads": 10,
    "max-concurrent-uploads": 10,
    "registry-mirrors": ["https://7bezldxe.mirror.aliyuncs.com"],
    "storage-driver": "overlay2",
    "storage-opts": ["overlay2.override_kernel_check=true"]
}
EOF
systemctl daemon-reload && systemctl restart docker

1.5 docker.service tuning

On CentOS, docker.service is located at /usr/lib/systemd/system/docker.service by default. Edit it and add the following parameters.
Protect the Docker service from the OOM killer: OOMScoreAdjust=-1000
Set the iptables FORWARD chain policy to ACCEPT: ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT

[Service]
# Added line (systemd does not accept trailing comments on a setting line)
OOMScoreAdjust=-1000
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
# Added line
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT

Then reload systemd and restart Docker:

systemctl daemon-reload
systemctl restart docker
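
Edits made directly to /usr/lib/systemd/system/docker.service can be overwritten when the Docker package is upgraded. An alternative, sketched here and not part of the original procedure, is a systemd drop-in file. The helper name write_docker_dropin is hypothetical; the drop-in directory convention (/etc/systemd/system/docker.service.d/) is standard systemd behaviour.

```shell
# write_docker_dropin DIR
# Writes the two extra settings into DIR/override.conf as a systemd drop-in.
write_docker_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
OOMScoreAdjust=-1000
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
EOF
}
```

A possible invocation: write_docker_dropin /etc/systemd/system/docker.service.d && systemctl daemon-reload && systemctl restart docker. The packaged unit file then stays untouched.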

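IV. Set up the MySQL primary/replica servers and create the database

1. Install the MySQL primary/replica databases

For the installation procedure, see https://www.showdoc.com.cn/576518050103730?page_id=6006721209616520

2. Create the database

User / database: rancherk3s / rancherk3sdb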

create database rancherk3sdb DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
grant usage on rancherk3sdb.* to 'rancherk3s'@'%' identified by 'hz310012' with grant option;
grant all privileges on rancherk3sdb.* to 'rancherk3s'@'%';
flush privileges;
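
The account and database created here are what K3s later receives through --datastore-endpoint. As a sketch of how that string is assembled, the hypothetical helper below formats the pieces; it assumes the password contains no characters that need escaping, and port 6006 is taken from the install command later in this document.

```shell
# k3s_mysql_endpoint USER PASSWORD HOST PORT DATABASE
# Prints a K3s datastore endpoint of the form mysql://user:pass@tcp(host:port)/db
k3s_mysql_endpoint() {
    printf 'mysql://%s:%s@tcp(%s:%s)/%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

Example: k3s_mysql_endpoint rancherk3s hz310012 172.16.7.211 6006 rancherk3sdb. Before installing K3s it may also help to confirm the login works from a server node, for example with mysql -h 172.16.7.211 -P 6006 -u rancherk3s -p rancherk3sdb -e 'select 1'.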

V. Install the load balancer

For the load balancer installation, see: https://www.showdoc.com.cn/963349270507135?page_id=5826127227251464

1. Create the NGINX configuration

mv /etc/nginx/nginx.conf /etc/nginx/nginx.conf.bak
vim /etc/nginx/nginx.conf

Enter the following content:

worker_processes 4;
worker_rlimit_nofile 40000;

events {
    worker_connections 8192;
}

stream {
    upstream rancher_servers_http {
        least_conn;
        server 172.16.7.216:80 max_fails=3 fail_timeout=5s;
        server 172.16.7.217:80 max_fails=3 fail_timeout=5s;
    }
    server {
        listen 80;
        proxy_pass rancher_servers_http;
    }

    upstream rancher_servers_https {
        least_conn;
        server 172.16.7.216:443 max_fails=3 fail_timeout=5s;
        server 172.16.7.217:443 max_fails=3 fail_timeout=5s;
    }
    server {
        listen     443;
        proxy_pass rancher_servers_https;
    }

}
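
Once NGINX has been reloaded with this configuration (nginx -t checks the syntax first), a quick TCP probe can show whether the balancer and the backends accept connections. This is only a sketch: port_open is a hypothetical helper that relies on bash's /dev/tcp redirection and the coreutils timeout command being available.

```shell
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 3 seconds.
port_open() {
    # Inside bash -c, $0 is HOST and $1 is PORT.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example: port_open 172.16.7.215 443 && echo "load balancer reachable". The same check against 172.16.7.216 and 172.16.7.217 helps separate balancer problems from backend problems.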

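VI. Install the K3s cluster

1. Install Kubernetes and configure the K3s server

The official documentation installs K3s with containerd by default. If you prefer Docker, add the --docker flag.
Details: https://docs.rancher.cn/docs/k3s/installation/install-options/server-config/_index#agent-%E8%BF%90%E8%A1%8C%E6%97%B6

2. Installation via the China mirror

For high availability, run the following command on each server node; without that requirement, installing on any one node is enough. Here the command is run on rancher-01. If you prefer Docker as the container runtime, install with the --docker flag as follows: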

[root@rancher-01 ~]# curl -sfL http://rancher-mirror.cnrancher.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -s - server \
--docker --datastore-endpoint="mysql://rancherk3s:hz310012@tcp(172.16.7.211:6006)/rancherk3sdb"

Output:

[INFO]  Finding release for channel stable
[INFO]  Using v1.19.5+k3s2 as release
[INFO]  Downloading hash http://rancher-mirror.cnrancher.com/k3s/v1.19.5-k3s2/sha256sum-amd64.txt
[INFO]  Downloading binary http://rancher-mirror.cnrancher.com/k3s/v1.19.5-k3s2/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.ustc.edu.cn
 * extras: mirrors.ustc.edu.cn
 * updates: mirrors.ustc.edu.cn
base                                                                                                            | 3.6 kB  00:00:00     
extras                                                                                                          | 2.9 kB  00:00:00     
updates                                                                                                         | 2.9 kB  00:00:00     
updates/7/x86_64/primary_db                                                                                     | 4.0 MB  00:00:04     
Package yum-utils-1.1.31-54.el7_8.noarch already installed and latest version
Nothing to do
Loaded plugins: fastestmirror
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.ustc.edu.cn
 * extras: mirrors.ustc.edu.cn
 * updates: mirrors.ustc.edu.cn
rancher-k3s-common-stable                                                                                       | 2.9 kB  00:00:00     
rancher-k3s-common-stable/primary_db                                                                            | 1.8 kB  00:00:02     
Resolving Dependencies
--> Running transaction check
---> Package k3s-selinux.noarch 0:0.2-1.el7_8 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=======================================================================================================================================
 Package                      Arch                    Version                         Repository                                  Size
=======================================================================================================================================
Installing:
 k3s-selinux                  noarch                  0.2-1.el7_8                     rancher-k3s-common-stable                   13 k

Transaction Summary
=======================================================================================================================================
Install  1 Package

Total download size: 13 k
Installed size: 82 k
Downloading packages:
warning: /var/cache/yum/x86_64/7/rancher-k3s-common-stable/packages/k3s-selinux-0.2-1.el7_8.noarch.rpm: Header V4 RSA/SHA1 Signature, key ID e257814a: NOKEY
Public key for k3s-selinux-0.2-1.el7_8.noarch.rpm is not installed
k3s-selinux-0.2-1.el7_8.noarch.rpm                                                                              |  13 kB  00:00:03     
Retrieving key from https://rpm.rancher.io/public.key
Importing GPG key 0xE257814A:
 Userid     : "Rancher (CI) <ci@rancher.com>"
 Fingerprint: c8cf f216 4551 26e9 b9c9 18be 925e a29a e257 814a
 From       : https://rpm.rancher.io/public.key
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Installing : k3s-selinux-0.2-1.el7_8.noarch                                                                                      1/1 
  Verifying  : k3s-selinux-0.2-1.el7_8.noarch                                                                                      1/1 

Installed:
  k3s-selinux.noarch 0:0.2-1.el7_8                                                                                                     

Complete!
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /usr/bin/ctr
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
[INFO]  systemd: Starting k3s

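3. Confirm that K3s was set up successfully

To confirm that K3s has been set up, run the following command on any K3s server node:

k3s kubectl get nodes

You should then see two nodes with the master role: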

NAME         STATUS   ROLES    AGE   VERSION
rancher-01   Ready    master   24h   v1.19.5+k3s2
rancher-02   Ready    master   24h   v1.19.5+k3s2
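
When this check is scripted, the node table can be reduced to a count of Ready nodes. A minimal sketch follows; count_ready_nodes is a hypothetical helper that reads the default `kubectl get nodes` table from standard input.

```shell
# count_ready_nodes
# Reads `kubectl get nodes` output on stdin and prints how many nodes have STATUS "Ready".
count_ready_nodes() {
    # NR > 1 skips the header row; column 2 is STATUS.
    awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}
```

Usage: k3s kubectl get nodes | count_ready_nodes. For the two-server setup in this document the expected result is 2.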

4. Check the health of the cluster's pods

[root@rancher-02 ~]# k3s kubectl get pods --all-namespaces
NAMESPACE   NAME                                    READY STATUS           RESTARTS  AGE
kube-system metrics-server-7b4f8b595-xg66w          1/1   Running          0         18m
kube-system coredns-66c464876b-c4tv9                1/1   Running          0         18m
kube-system local-path-provisioner-7ff9579c6-cl5zm  1/1   Running          0         18m
kube-system helm-install-traefik-8rx7q              0/1   ImagePullBackOff 0         18m

You can also check the K3s state with the following command:

kubectl get all -n kube-system
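
In a longer listing, pods that are neither Running nor Completed are easy to miss. The sketch below filters the `--all-namespaces` table down to those pods; unhealthy_pods is a hypothetical helper name.

```shell
# unhealthy_pods
# Reads `kubectl get pods --all-namespaces` output on stdin and prints
# namespace/name for every pod whose STATUS is neither Running nor Completed.
unhealthy_pods() {
    # Columns: NAMESPACE NAME READY STATUS RESTARTS AGE
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 }'
}
```

Usage: k3s kubectl get pods --all-namespaces | unhealthy_pods. Applied to the listing above, it would print kube-system/helm-install-traefik-8rx7q.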

5. Image pull failure

The pod listing shows one image that failed to pull. To find the cause, inspect the pod's events:

k3s kubectl describe pod helm-install-traefik-8rx7q --namespace kube-system

Output:

Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  28m                default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
  Warning  FailedScheduling  28m                default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
  Normal   Scheduled         28m                default-scheduler  Successfully assigned kube-system/helm-install-traefik-8rx7q to rancher-01
  Warning  Failed            17m                kubelet            Error: ErrImagePull
  Warning  Failed            17m                kubelet            Failed to pull image "rancher/klipper-helm:v0.3.0": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/rancher/klipper-helm:v0.3.0": failed to copy: read tcp 172.16.7.216:49106->104.18.125.25:443: read: connection reset by peer
  Normal   BackOff           17m                kubelet            Back-off pulling image "rancher/klipper-helm:v0.3.0"
  Warning  Failed            17m                kubelet            Error: ImagePullBackOff
  Normal   Pulling           17m (x2 over 28m)  kubelet            Pulling image "rancher/klipper-helm:v0.3.0"

5.1 Solution

Method 1:

cat >> /etc/rancher/k3s/registries.yaml <<EOF
mirrors:
  "docker.io":
    endpoint:
      - "https://fogjl973.mirror.aliyuncs.com"
      - "https://registry-1.docker.io"
EOF

Restart the k3s service:

systemctl restart k3s

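5.2 Solution

Method 2:
Reference: https://docs.rancher.cn/docs/k3s/installation/airgap/_index/

First download the K3s image archive from http://mirror.cnrancher.com/, then copy it into the K3s images directory ($ARCH is the node architecture, for example amd64):

sudo mkdir -p /var/lib/rancher/k3s/agent/images/
sudo cp ./k3s-airgap-images-$ARCH.tar /var/lib/rancher/k3s/agent/images/

5.3 After applying method 1, check the cluster's pods again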

[root@rancher-01 k3s]# k3s kubectl get pod  --all-namespaces
NAMESPACE     NAME                                     READY   STATUS      RESTARTS   AGE
kube-system   metrics-server-7b4f8b595-xg66w           1/1     Running     0          50m
kube-system   local-path-provisioner-7ff9579c6-cl5zm   1/1     Running     0          50m
kube-system   coredns-66c464876b-c4tv9                 1/1     Running     0          50m
kube-system   helm-install-traefik-8rx7q               0/1     Completed   0          50m
kube-system   svclb-traefik-5rzth                      2/2     Running     0          6m1s
kube-system   svclb-traefik-bdlqb                      2/2     Running     0          6m1s
kube-system   traefik-5dd496474-lkr79                  1/1     Running     0          6m1s

K3s is now installed. Each K3s server node automatically generates a kubeconfig file
at /etc/rancher/k3s/k3s.yaml; this file contains credentials with full access to the cluster.

6. Edit the kubeconfig file

6.1 Create the config file

mkdir -p ~/.kube
cp /etc/rancher/k3s/k3s.yaml ~/.kube/config

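6.2 Edit the configuration file

Note: make this change only after Rancher has been installed; changing it earlier causes errors during the Rancher installation.
In this kubeconfig file the server parameter points to localhost (https://127.0.0.1:6443). Change it manually to the load balancer's DNS name, keeping port 6443. For this to work the load balancer must also forward port 6443; the NGINX configuration shown earlier defines only ports 80 and 443.
vim ~/.kube/config
Important: the server URL must start with https://

https://rancher.zyrox.com:6443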
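
The same edit can be made non-interactively. The following sketch rewrites the server entry with GNU sed (as shipped with CentOS) and refuses URLs that do not start with https://; set_kubeconfig_server is a hypothetical helper, and it assumes the URL contains no '|' or '&' characters.

```shell
# set_kubeconfig_server FILE URL
# Points every "server:" entry in a kubeconfig FILE at URL.
set_kubeconfig_server() {
    case $2 in
        https://*) ;;
        *) echo "server URL must start with https://" >&2; return 1 ;;
    esac
    # Keep the original indentation and key, replace only the value.
    sed -i "s|^\([[:space:]]*server:\).*|\1 $2|" "$1"
}
```

Example: set_kubeconfig_server ~/.kube/config https://rancher.zyrox.com:6443.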

Result: You can now manage your K3s cluster with kubectl. If you have multiple kubeconfig files, pass the file path to kubectl to select the one to use:

kubectl --kubeconfig ~/.kube/config get pods --all-namespaces

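6.3 Check the health of the cluster's pods

With the kubeconfig file in place, you can use kubectl to access the cluster from your local machine.
Check that all required pods and containers are healthy:

kubectl get pods --all-namespaces

Result: You have confirmed that kubectl can reach the cluster and that the K3s cluster is running correctly. Rancher Server can now be installed on the cluster.
This completes the K3s cluster installation.

Next step: install Rancher online on the K3s cluster (Helm method)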

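References:
1. Online installation of a K3s cluster with an external MySQL database
https://blog.csdn.net/zhoumengshun/article/details/108158988
2. Official documentation: online installation overview (K3s + RKE); this article follows the K3s part
https://docs.rancher.cn/docs/rancher2/installation/_index
3. Installing Rancher on K3s (Helm method)
https://blog.csdn.net/zhoumengshun/article/details/108160704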