Kubernetes Enterprise Container Orchestration in Practice: From Getting Started to Production Architecture
Author: Cloud-Native Architect
Tech stack: Kubernetes, Docker, Helm, Istio, Prometheus
Difficulty: ★★★★★ (Expert)
Estimated reading time: 80 minutes
Table of Contents
- Introduction: Why Kubernetes
- Kubernetes Architecture Deep Dive
- Deploying a Kubernetes Cluster
- Core Resource Objects
- Application Deployment in Practice
- Service Discovery and Load Balancing
- Storage Management
- Configuration and Secret Management
- Autoscaling
- Monitoring and Logging
- Security Hardening
- Production Best Practices
1. Introduction: Why Kubernetes
1.1 The Evolution of Container Orchestration
┌─────────────────────────────────────────────────────┐
│ A brief history of container orchestration          │
├─────────────────────────────────────────────────────┤
│ 2013-2014: manual deployment                        │
│  - Docker Compose (single host)                     │
│  - script-driven ops (Ansible, Puppet)              │
│  Pain: no self-healing, no elastic scaling          │
│                                                     │
│ 2014-2016: competing orchestrators                  │
│  - Docker Swarm (Docker native)                     │
│  - Kubernetes (backed by Google)                    │
│  - Mesos (Apache)                                   │
│                                                     │
│ 2016-2018: Kubernetes wins                          │
│  - CNCF graduated project (2018)                    │
│  - de facto standard (~78% market share)            │
│  - rich ecosystem (Helm, Istio, Prometheus)         │
│                                                     │
│ 2018-present: the cloud-native era                  │
│  - serverless Kubernetes                            │
│  - service mesh                                     │
│  - GitOps                                           │
└─────────────────────────────────────────────────────┘
1.2 Kubernetes vs Docker Swarm
| Feature | Kubernetes | Docker Swarm |
|---|---|---|
| Architectural complexity | High (control plane/node) | Low (manager/worker) |
| Learning curve | Steep | Gentle |
| Feature richness | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| Self-healing | Yes | Yes |
| Autoscaling | HPA/VPA | Manual |
| Service discovery | Built-in DNS | Built-in DNS |
| Load balancing | Advanced (Ingress) | Basic |
| Storage orchestration | CSI standard | Basic |
| Configuration management | ConfigMap/Secret | Config |
| Ecosystem maturity | Very mature | Moderate |
| Enterprise adoption | ~78% | ~15% |
Conclusion: Kubernetes is the first choice for production.
2. Kubernetes Architecture Deep Dive
2.1 Cluster Architecture
A Kubernetes cluster pairs a control plane (API Server, Scheduler, Controller Manager, etcd) with worker nodes that run the actual Pods; each component is detailed below.
2.2 Core Components
Control Plane:
- API Server: the cluster's single entry point; RESTful API; authentication and authorization; request validation
- Scheduler: Pod placement decisions; resource optimization; affinity/anti-affinity
- Controller Manager: node controller; replication controller; endpoints controller; service account controller
- etcd: distributed key-value store; holds all cluster state; strongly consistent (Raft protocol)
Worker Node:
- Kubelet: node agent; manages the Pod lifecycle; runs health checks
- Kube Proxy: network proxy; load balancing; iptables/IPVS modes
- Container Runtime: containerd, CRI-O, or Docker Engine (via cri-dockerd, since dockershim was removed in v1.24)
3. Deploying a Kubernetes Cluster
3.1 Deploying with kubeadm
Environment preparation:
# Run on all nodes
# Disable swap
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Configure kernel parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
# Install the container runtime (docker-ce pulls in containerd.io)
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce docker-ce-cli containerd.io
systemctl enable docker && systemctl start docker
# Since v1.24 the kubelet no longer talks to Docker directly (dockershim was
# removed); use containerd as the CRI runtime with the systemd cgroup driver
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
# Install kubeadm, kubelet, kubectl. Note: the legacy packages.cloud.google.com
# repo is frozen; v1.29 packages are published on the community repo pkgs.k8s.io
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
EOF
yum install -y kubelet kubeadm kubectl
systemctl enable --now kubelet
Initialize the master node:
# Run on the master node
kubeadm init \
--pod-network-cidr=10.244.0.0/16 \
--service-cidr=10.96.0.0/12 \
--kubernetes-version=v1.29.0 \
--apiserver-advertise-address=192.168.1.100
# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Install a network plugin (Calico; pin the manifest to the Calico release you intend to run)
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
Join the worker nodes:
# Run on each worker node (use the command printed by kubeadm init)
kubeadm join 192.168.1.100:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
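If the join command is lost, the `<hash>` value can be recomputed from the cluster CA at `/etc/kubernetes/pki/ca.crt` on the master (this is the recipe from the kubeadm docs). The sketch below generates a throwaway CA so it runs self-contained; on a real master you would point `-in` at the cluster's `ca.crt`:

```shell
# Self-contained demo: create a throwaway CA instead of the real cluster CA
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -subj "/CN=demo-ca" -days 1 2>/dev/null
# SHA-256 over the DER-encoded public key; this is the value after "sha256:"
openssl x509 -pubkey -in ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex \
  | awk '{print $NF}'
```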
Verify the cluster:
# List nodes
kubectl get nodes
# Output:
# NAME      STATUS   ROLES           AGE   VERSION
# master    Ready    control-plane   10m   v1.29.0
# worker1   Ready    <none>          5m    v1.29.0
# worker2   Ready    <none>          5m    v1.29.0
# Check control-plane health. `kubectl get componentstatuses` is deprecated
# since v1.19; query the API server's aggregated readiness endpoint instead:
kubectl get --raw='/readyz?verbose'
3.2 Highly Available Cluster Deployment
Architecture:
┌─────────────────────────────────────────────────────┐
│              Kubernetes HA cluster                  │
├─────────────────────────────────────────────────────┤
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐    │
│  │  Master 1   │ │  Master 2   │ │  Master 3   │    │
│  │ 192.168.1.10│ │ 192.168.1.11│ │ 192.168.1.12│    │
│  └──────┬──────┘ └──────┬──────┘ └──────┬──────┘    │
│         │               │               │           │
│         └───────────────┼───────────────┘           │
│                         │                           │
│             ┌───────────▼───────────┐               │
│             │     Load Balancer     │               │
│             │    (HAProxy/Nginx)    │               │
│             │  192.168.1.100:6443   │               │
│             └───────────┬───────────┘               │
│                         │                           │
│        ┌────────────────┼────────────────┐          │
│        │                │                │          │
│ ┌──────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐   │
│ │  Worker 1   │  │  Worker 2   │  │  Worker 3   │   │
│ │192.168.1.20 │  │192.168.1.21 │  │192.168.1.22 │   │
│ └─────────────┘  └─────────────┘  └─────────────┘   │
└─────────────────────────────────────────────────────┘
HAProxy configuration:
# /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local2
    maxconn 4000
defaults
    mode tcp
    log global
    option tcplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
frontend kubernetes
    bind *:6443
    option tcplog
    default_backend kubernetes-master
backend kubernetes-master
    option httpchk GET /healthz
    http-check expect status 200
    balance roundrobin
    # the API server serves TLS on 6443, so the HTTP health check must run over SSL
    server master1 192.168.1.10:6443 check check-ssl verify none
    server master2 192.168.1.11:6443 check check-ssl verify none
    server master3 192.168.1.12:6443 check check-ssl verify none
4. Core Resource Objects
4.1 Pod
Pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
    version: "1.25"
  annotations:
    description: "Nginx web server"
spec:
  containers:
  - name: nginx
    image: nginx:1.25.3-alpine
    ports:
    - containerPort: 80
      protocol: TCP
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
    env:
    - name: TZ
      value: "Asia/Shanghai"
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
    - name: config
      mountPath: /etc/nginx/conf.d
      readOnly: true
    livenessProbe:
      httpGet:
        path: /health
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 3
      failureThreshold: 3
  volumes:
  - name: html
    emptyDir: {}
  - name: config
    configMap:
      name: nginx-config
  restartPolicy: Always
  nodeSelector:
    disktype: ssd
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "web"
    effect: "NoSchedule"
4.2 Deployment
Deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: nginx
        version: "1.25"
    spec:
      containers:
      - name: nginx
        image: nginx:1.25.3-alpine
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
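With replicas: 3, maxSurge: 1, and maxUnavailable: 0, the controller may create at most one extra Pod during a rollout and must keep all three old Pods serving until a replacement is Ready. The bounds follow directly from the two fields:

```shell
replicas=3; max_surge=1; max_unavailable=0
echo "max Pods during rollout: $((replicas + max_surge))"
echo "min ready Pods during rollout: $((replicas - max_unavailable))"
```

maxUnavailable: 0 is the zero-downtime setting: capacity never drops below the declared replica count.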
Rolling updates:
# Update the image
kubectl set image deployment/nginx-deployment nginx=nginx:1.25.4
# Watch rollout progress
kubectl rollout status deployment/nginx-deployment
# View rollout history
kubectl rollout history deployment/nginx-deployment
# Roll back to the previous revision
kubectl rollout undo deployment/nginx-deployment
# Roll back to a specific revision
kubectl rollout undo deployment/nginx-deployment --to-revision=2
4.3 StatefulSet
StatefulSet definition:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        ports:
        - containerPort: 3306
          name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
4.4 DaemonSet
DaemonSet definition:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true
      hostPID: true
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.7.0  # pin a version rather than :latest in production
        ports:
        - containerPort: 9100
          hostPort: 9100
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
        - name: root
          mountPath: /rootfs
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: root
        hostPath:
          path: /
5. Application Deployment in Practice
5.1 Deploying a Microservice Application
Full YAML:
# Deploy the microservice
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
        version: v1
    spec:
      containers:
      - name: user-service
        image: myregistry/user-service:v1.0.0
        ports:
        - containerPort: 8080
        env:
        - name: DB_HOST
          value: "mysql.default.svc.cluster.local"
        - name: REDIS_HOST
          value: "redis.default.svc.cluster.local"
        - name: JAVA_OPTS
          value: "-Xms512m -Xmx1024m"
        resources:
          requests:
            cpu: "200m"
            memory: "512Mi"
          limits:
            cpu: "1000m"
            memory: "2Gi"
        livenessProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 5
---
# Expose the service
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
---
# Horizontal Pod autoscaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
5.2 Database Deployment
MySQL StatefulSet:
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
stringData:
  root-password: "secure-password-123"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        ports:
        - containerPort: 3306
          name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
        - name: MYSQL_DATABASE
          value: "appdb"
        - name: MYSQL_USER
          value: "appuser"
        - name: MYSQL_PASSWORD
          value: "apppassword"  # in production, reference a Secret as above
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2000m"
            memory: "4Gi"
        livenessProbe:
          exec:
            command:
            - mysqladmin
            - ping
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - mysqladmin
            - ping
          initialDelaySeconds: 10
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
      storageClassName: nfs-storage
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql
  ports:
  - port: 3306
    targetPort: 3306
  clusterIP: None  # headless Service: each Pod gets a stable DNS record
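Because clusterIP: None makes this a headless Service, each StatefulSet Pod receives a stable DNS record of the form `<pod>.<service>.<namespace>.svc.cluster.local`, which is how MySQL replicas can address each other by name. A quick sketch of the names the three mysql Pods get:

```shell
svc=mysql; ns=default
for i in 0 1 2; do
  echo "mysql-${i}.${svc}.${ns}.svc.cluster.local"
done
```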
6. Service Discovery and Load Balancing
6.1 Service Types
ClusterIP (default):
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
NodePort:
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080
  type: NodePort
LoadBalancer:
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
  type: LoadBalancer
6.2 Ingress
Ingress configuration:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - example.com
    secretName: example-tls
  rules:
  - host: example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
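The two Prefix rules route by longest matching prefix: requests under /api go to api-service, everything else falls through to /, backed by web-service. A shell sketch of that dispatch logic (illustrative only; the real matching happens inside the ingress controller):

```shell
# Prefix pathType: /api matches /api itself and any /api/... subpath
route() {
  case "$1" in
    /api|/api/*) echo "api-service" ;;
    *)           echo "web-service" ;;
  esac
}
route /api/users    # api-service
route /index.html   # web-service
```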
7. Storage Management
7.1 PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    path: /data/nfs
    server: 192.168.1.100
7.2 PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: nfs
  resources:
    requests:
      storage: 10Gi
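A claim is consumed by referencing it from a Pod's volumes section; once bound to the PV above, every mounting Pod sees the same NFS-backed directory. A minimal sketch (the pod name and mount path are illustrative):

```
apiVersion: v1
kind: Pod
metadata:
  name: nfs-demo
spec:
  containers:
  - name: app
    image: nginx:1.25.3-alpine
    volumeMounts:
    - name: shared
      mountPath: /usr/share/nginx/html
  volumes:
  - name: shared
    persistentVolumeClaim:
      claimName: nfs-pvc
```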
8. Configuration and Secret Management
8.1 ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database.properties: |
    db.host=mysql.default.svc.cluster.local
    db.port=3306
    db.name=appdb
  application.yml: |
    server:
      port: 8080
    spring:
      datasource:
        url: jdbc:mysql://${DB_HOST}:3306/appdb
        username: appuser
        password: apppassword
8.2 Secret
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
stringData:
  db-password: "secure-password"
  api-key: "your-api-key"
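stringData accepts plain text; the API server base64-encodes it into the data field on write, which is what `kubectl get secret -o yaml` shows. Keep in mind this is encoding, not encryption:

```shell
# What the API server stores for a stringData value:
echo -n "secure-password" | base64        # c2VjdXJlLXBhc3N3b3Jk
# And how to read it back:
echo "c2VjdXJlLXBhc3N3b3Jk" | base64 -d   # secure-password
```

Anyone with read access to the Secret can decode it, which is why etcd encryption at rest and RBAC on Secrets matter.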
9. Autoscaling
9.1 HPA (Horizontal Pod Autoscaler)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
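The HPA computes desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric) for each metric and takes the largest result. For example, 3 replicas averaging 140% CPU utilization against the 70% target above would be scaled to 6:

```shell
current_replicas=3
current_cpu=140   # observed average utilization (%)
target_cpu=70     # target from the manifest above
# integer ceiling of current_replicas * current_cpu / target_cpu
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "$desired"   # 6
```

The behavior block then throttles how fast that target is reached: scale-up may double or add 4 Pods per 15s (whichever is larger), while scale-down waits out a 300s stabilization window.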
10. Monitoring and Logging
10.1 Deploying Prometheus
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
spec:
  version: v2.45.0
  replicas: 2
  retention: 15d
  resources:
    requests:
      cpu: 500m
      memory: 2Gi
  serviceAccountName: prometheus
  serviceMonitorSelector: {}
  podMonitorSelector: {}
10.2 Grafana Dashboards
Dashboards worth importing (grafana.com IDs):
- Kubernetes Cluster: 6417
- Node Exporter Full: 1860
- Prometheus Stats: 2
11. Security Hardening
11.1 RBAC Configuration
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
11.2 Network Policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-nginx
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80
12. Production Best Practices
12.1 Resource Management
resources:
  requests:
    cpu: "500m"      # guaranteed for scheduling
    memory: "512Mi"
  limits:
    cpu: "1000m"     # hard ceiling
    memory: "1Gi"
12.2 Health Checks
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 60  # start checking 60s after container start
  periodSeconds: 10        # check every 10s
  timeoutSeconds: 5        # each check times out after 5s
  failureThreshold: 3      # restart the container after 3 failures
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3      # removed from Service endpoints after 3 failures
12.3 Label Conventions
metadata:
  labels:
    app: user-service        # application name
    version: v1.0.0          # version
    component: backend       # component type
    team: platform           # owning team
    environment: production  # environment
Copyright notice: this article is original; please credit the source when reposting.