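A WuKongIM multi-node cluster provides high availability, disaster recovery, and load balancing. It suits large applications with strict data-safety requirements.
Cluster characteristics
Advantages:
- High availability and strong disaster recovery
- Online scale-out
- Automatic real-time replication across replicas
- Load balancing
Disadvantages:
Cluster rule: WuKongIM follows the 2n+1 rule, where n is the number of nodes that may fail.
- To tolerate 1 failed machine: 3 machines are required (2×1+1=3)
- To tolerate 2 failed machines: 5 machines are required (2×2+1=5)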
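The 2n+1 rule can be checked with a little shell arithmetic. This is a minimal sketch: the function names are hypothetical, and the inverse formula assumes the rule applies to any cluster size, so that m nodes tolerate floor((m-1)/2) failures.

```shell
# nodes_required N: cluster size needed to tolerate N failed nodes (2n+1).
nodes_required() {
  echo $(( 2 * $1 + 1 ))
}

# failures_tolerated M: failures an M-node cluster can absorb, floor((M-1)/2).
# Assumption: derived by inverting the 2n+1 rule.
failures_tolerated() {
  echo $(( ($1 - 1) / 2 ))
}

nodes_required 1      # prints 3
nodes_required 2      # prints 5
failures_tolerated 4  # prints 1
```

Under this reading, a fourth node adds no extra fault tolerance over three, which is why odd cluster sizes are the usual choice.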
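Requirements
- Machines: 4 or more (in this example, 1 gateway plus 3 WuKongIM nodes)
- Operating system: Linux (Ubuntu recommended)
- Hardware: 2 cores / 4 GB RAM or 4 cores / 8 GB RAM
- Docker: 24.0.4 or later
Example server layout:
| Role | Name | Internal IP | External IP |
|---|---|---|---|
| Load balancing and monitoring | gateway | 10.206.0.2 | 119.45.33.109 |
| WuKongIM node | node1 (ID: 1) | 10.206.0.10 | 146.56.249.208 |
| WuKongIM node | node2 (ID: 2) | 10.206.0.12 | 129.211.171.99 |
| WuKongIM node | node3 (ID: 3) | 10.206.0.5 | 119.45.175.82 |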
Deployment steps
1. Install load balancing and monitoring
Create the installation directory on the gateway node:
mkdir ~/gateway
cd ~/gateway
Create the docker-compose.yml file:
version: '3.7'
services:
  prometheus:  # monitoring service
    image: registry.cn-shanghai.aliyuncs.com/wukongim/prometheus:v2.53.1
    volumes:
      - "./prometheus.yml:/etc/prometheus/prometheus.yml"
    ports:
      - "9090:9090"
  nginx:  # load balancer
    image: registry.cn-shanghai.aliyuncs.com/wukongim/nginx:1.27.0
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    ports:
      - "15001:15001"
      - "15100:15100"
      - "15200:15200"
      - "15300:15300"
      - "15172:15172"
Create the nginx.conf file (replace the IP addresses with your own):
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log notice;
pid /var/run/nginx.pid;
events {
    use epoll;
    worker_connections 4096;
    multi_accept on;
    accept_mutex off;
}
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    sendfile on;
    keepalive_timeout 65;
    # API load balancing
    upstream wukongimapi {
        server 10.206.0.10:5001;
        server 10.206.0.12:5001;
        server 10.206.0.5:5001;
    }
    # Demo load balancing
    upstream wukongimdemo {
        server 10.206.0.10:5172;
        server 10.206.0.12:5172;
        server 10.206.0.5:5172;
    }
    # Manager load balancing
    upstream wukongimanager {
        server 10.206.0.10:5300;
        server 10.206.0.12:5300;
        server 10.206.0.5:5300;
    }
    # WebSocket load balancing
    upstream wukongimws {
        server 10.206.0.10:5200;
        server 10.206.0.12:5200;
        server 10.206.0.5:5200;
    }
    # HTTP API forwarding
    server {
        listen 15001;
        location / {
            proxy_pass http://wukongimapi;
            proxy_connect_timeout 20s;
            proxy_read_timeout 60s;
        }
    }
    # Demo UI
    server {
        listen 15172;
        location / {
            proxy_pass http://wukongimdemo;
            proxy_connect_timeout 20s;
            proxy_read_timeout 60s;
        }
        location /login {
            rewrite ^ /chatdemo?apiurl=http://119.45.33.109:15001;
            proxy_pass http://wukongimdemo;
            proxy_connect_timeout 20s;
            proxy_read_timeout 60s;
        }
    }
    # Manager UI
    server {
        listen 15300;
        location / {
            proxy_pass http://wukongimanager;
            proxy_connect_timeout 60s;
            proxy_read_timeout 60s;
        }
    }
    # WebSocket forwarding
    server {
        listen 15200;
        location / {
            proxy_pass http://wukongimws;
            proxy_redirect off;
            proxy_http_version 1.1;
            proxy_read_timeout 180s;
            proxy_send_timeout 120s;
            proxy_connect_timeout 4s;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }
}
# TCP load balancing
stream {
    upstream wukongimtcp {
        server 10.206.0.10:5100;
        server 10.206.0.12:5100;
        server 10.206.0.5:5100;
    }
    server {
        listen 15100;
        proxy_connect_timeout 4s;
        proxy_timeout 120s;
        proxy_pass wukongimtcp;
    }
}
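The upstream blocks in this nginx.conf differ only in name and port, so they can be generated from a list of node IPs when the cluster changes. The sketch below assumes a hypothetical helper named make_upstream; it only prints text and does not modify nginx.conf.

```shell
# make_upstream NAME PORT IP...: print an nginx upstream block.
# Hypothetical helper, not part of WuKongIM or nginx.
make_upstream() {
  name=$1
  port=$2
  shift 2
  printf 'upstream %s {\n' "$name"
  for ip in "$@"; do
    printf '    server %s:%s;\n' "$ip" "$port"
  done
  printf '}\n'
}

# Reproduce the API upstream from the example configuration.
make_upstream wukongimapi 5001 10.206.0.10 10.206.0.12 10.206.0.5
```

Paste the output into the http block (or the stream block for the TCP upstream), then validate it with nginx -t before restarting.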
Create the prometheus.yml file (replace the IP addresses with your own):
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: wukongim1-trace-metrics
    static_configs:
      - targets: ['10.206.0.10:5300']
        labels:
          id: "1"
  - job_name: wukongim2-trace-metrics
    static_configs:
      - targets: ['10.206.0.12:5300']
        labels:
          id: "2"
  - job_name: wukongim3-trace-metrics
    static_configs:
      - targets: ['10.206.0.5:5300']
        labels:
          id: "3"
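Each scrape job above follows the same pattern: the node ID appears in the job name and the id label, and the target is the node's internal IP on port 5300. As a rough sketch, the entries could be generated from id@ip pairs; scrape_job is a hypothetical helper name.

```shell
# scrape_job ID IP: print one scrape_configs entry for a WuKongIM node.
# Hypothetical helper; the layout mirrors the example prometheus.yml.
scrape_job() {
  cat <<EOF
  - job_name: wukongim$1-trace-metrics
    static_configs:
      - targets: ['$2:5300']
        labels:
          id: "$1"
EOF
}

# Emit one job per node, splitting each "id@ip" pair at the "@".
for node in 1@10.206.0.10 2@10.206.0.12 3@10.206.0.5; do
  scrape_job "${node%%@*}" "${node#*@}"
done
```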
2. Install the WuKongIM nodes
Create the installation directory on every WuKongIM node:
mkdir ~/wukongim
cd ~/wukongim
Node 1 docker-compose.yml (replace the IP addresses with your own):
version: '3.7'
services:
  wukongim:
    image: registry.cn-shanghai.aliyuncs.com/wukongim/wukongim:v2
    environment:
      - "WK_MODE=release"
      - "WK_CLUSTER_NODEID=1"
      - "WK_INTRANET_TCPADDR=10.206.0.10:5100"
      - "WK_CLUSTER_APIURL=http://10.206.0.10:5001"
      - "WK_CLUSTER_SERVERADDR=10.206.0.10:11110"
      - "WK_EXTERNAL_WSADDR=ws://119.45.33.109:15200"
      - "WK_EXTERNAL_TCPADDR=119.45.33.109:15100"
      - "WK_TRACE_PROMETHEUSAPIURL=http://10.206.0.2:9090"
      - "WK_CLUSTER_INITNODES=1@10.206.0.10 2@10.206.0.12 3@10.206.0.5"
    healthcheck:
      test: "wget -q -Y off -O /dev/null http://localhost:5001/health > /dev/null 2>&1"
      interval: 10s
      timeout: 10s
      retries: 3
    restart: always
    volumes:
      - ./wukongim_data:/root/wukongim
    ports:
      - 11110:11110  # inter-node cluster communication port
      - 5001:5001    # internal API port
      - 5100:5100    # TCP port
      - 5200:5200    # WebSocket port
      - 5300:5300    # manager port
      - 5172:5172    # demo port
Node 2 docker-compose.yml (replace the IP addresses with your own):
version: '3.7'
services:
  wukongim:
    image: registry.cn-shanghai.aliyuncs.com/wukongim/wukongim:v2
    environment:
      - "WK_MODE=release"
      - "WK_CLUSTER_NODEID=2"
      - "WK_CLUSTER_APIURL=http://10.206.0.12:5001"
      - "WK_CLUSTER_SERVERADDR=10.206.0.12:11110"
      - "WK_EXTERNAL_WSADDR=ws://119.45.33.109:15200"
      - "WK_EXTERNAL_TCPADDR=119.45.33.109:15100"
      - "WK_INTRANET_TCPADDR=10.206.0.12:5100"
      - "WK_TRACE_PROMETHEUSAPIURL=http://10.206.0.2:9090"
      - "WK_CLUSTER_INITNODES=1@10.206.0.10 2@10.206.0.12 3@10.206.0.5"
    healthcheck:
      test: "wget -q -Y off -O /dev/null http://localhost:5001/health > /dev/null 2>&1"
      interval: 10s
      timeout: 10s
      retries: 3
    restart: always
    volumes:
      - ./wukongim_data:/root/wukongim
    ports:
      - 11110:11110
      - 5001:5001
      - 5100:5100
      - 5200:5200
      - 5300:5300
      - 5172:5172
Node 3 docker-compose.yml (replace the IP addresses with your own):
version: '3.7'
services:
  wukongim:
    image: registry.cn-shanghai.aliyuncs.com/wukongim/wukongim:v2
    environment:
      - "WK_MODE=release"
      - "WK_CLUSTER_NODEID=3"
      - "WK_CLUSTER_APIURL=http://10.206.0.5:5001"
      - "WK_CLUSTER_SERVERADDR=10.206.0.5:11110"
      - "WK_EXTERNAL_WSADDR=ws://119.45.33.109:15200"
      - "WK_EXTERNAL_TCPADDR=119.45.33.109:15100"
      - "WK_INTRANET_TCPADDR=10.206.0.5:5100"
      - "WK_TRACE_PROMETHEUSAPIURL=http://10.206.0.2:9090"
      - "WK_CLUSTER_INITNODES=1@10.206.0.10 2@10.206.0.12 3@10.206.0.5"
    healthcheck:
      test: "wget -q -Y off -O /dev/null http://localhost:5001/health > /dev/null 2>&1"
      interval: 10s
      timeout: 10s
      retries: 3
    restart: always
    volumes:
      - ./wukongim_data:/root/wukongim
    ports:
      - 11110:11110
      - 5001:5001
      - 5100:5100
      - 5200:5200
      - 5300:5300
      - 5172:5172
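The three node files differ only in the node ID and the node's own internal IP; the external addresses, the Prometheus URL, and WK_CLUSTER_INITNODES are identical on every node. A small sketch of the per-node values, using a hypothetical helper named node_env, makes the pattern explicit:

```shell
# node_env ID IP: print the environment entries that are specific to one node.
# Hypothetical helper; ports are the ones used in the example compose files.
node_env() {
  printf 'WK_CLUSTER_NODEID=%s\n' "$1"
  printf 'WK_INTRANET_TCPADDR=%s:5100\n' "$2"
  printf 'WK_CLUSTER_APIURL=http://%s:5001\n' "$2"
  printf 'WK_CLUSTER_SERVERADDR=%s:11110\n' "$2"
}

# Values for node 2 in the example layout.
node_env 2 10.206.0.12
```

Comparing this output against each node's docker-compose.yml is a quick way to catch a copied file whose ID or IP was not updated.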
3. Start the services
Startup order:
- Start load balancing and monitoring first:
# On the gateway node
cd ~/gateway
docker-compose up -d
- Then start all WuKongIM nodes:
# On each WuKongIM node
cd ~/wukongim
docker-compose up -d
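Containers can take a moment to become healthy after docker-compose up -d returns. One possible approach is to poll the same /health endpoint that the compose healthcheck uses. The sketch below assumes a hypothetical wait_until helper and that curl is installed on the node; the real call is left commented out so the snippet runs without a live cluster.

```shell
# wait_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds after each failure. Returns 0 on success, 1 otherwise.
wait_until() {
  tries=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example, run on a WuKongIM node: poll the local health endpoint
# up to 30 times, 2 seconds apart.
# wait_until 30 2 curl -fsS -o /dev/null http://localhost:5001/health
```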
4. Verify the deployment
Check service status:
# Check container status
docker-compose ps
# Follow the logs
docker-compose logs -f
Verify cluster status:
# List the cluster nodes
curl http://119.45.33.109:15001/cluster/nodes
# Check health
curl http://119.45.33.109:15001/health
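A health request through the load balancer reaches only one node per call, so it can help to probe every node directly on its internal API port. This is a hedged sketch: count_healthy and probe_health are hypothetical helpers, and the probe assumes that each node serves /health on port 5001, as the compose healthcheck suggests.

```shell
# count_healthy PROBE IP...: count the IPs for which `PROBE IP` succeeds.
count_healthy() {
  probe=$1
  shift
  n=0
  for ip in "$@"; do
    if "$probe" "$ip"; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# probe_health IP: succeed if the node answers its health endpoint within 3 seconds.
probe_health() {
  curl -fsS -o /dev/null --max-time 3 "http://$1:5001/health"
}

# Example, run from a machine inside the private network:
# count_healthy probe_health 10.206.0.10 10.206.0.12 10.206.0.5
```

With three nodes, a result of 3 means every node is up; a lower count identifies how many nodes need attention before checking their logs.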
Access the services:
Configuration reference
Key environment variables
| Variable | Description | Example value |
|---|---|---|
| WK_CLUSTER_NODEID | Node ID | 1, 2, 3 |
| WK_CLUSTER_APIURL | Node API address | http://10.206.0.10:5001 |
| WK_CLUSTER_SERVERADDR | Inter-node communication address | 10.206.0.10:11110 |
| WK_CLUSTER_INITNODES | Initial node list | 1@10.206.0.10 2@10.206.0.12 3@10.206.0.5 |
| WK_EXTERNAL_WSADDR | External WebSocket address | ws://119.45.33.109:15200 |
| WK_EXTERNAL_TCPADDR | External TCP address | 119.45.33.109:15100 |
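Ports
| Port | Description | Access |
|---|---|---|
| 5001 | HTTP API | Internal network |
| 5100 | TCP connections | Client connections |
| 5200 | WebSocket | Client connections |
| 5300 | Manager UI | Web access |
| 5172 | Demo UI | Web access |
| 11110 | Cluster communication | Between nodes |
| 15001 | Load-balanced API | External network |
| 15100 | Load-balanced TCP | External network |
| 15200 | Load-balanced WebSocket | External network |
| 15300 | Load-balanced Manager UI | External network |
| 15172 | Load-balanced Demo UI | External network |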
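Troubleshooting
Common issues
A node cannot join the cluster:
# Check network connectivity
ping 10.206.0.10
# Check that the cluster port is open
telnet 10.206.0.10 11110
# View the node logs
docker-compose logs wukongim
The load balancer is unreachable:
# Validate the nginx configuration
docker-compose exec nginx nginx -t
# Restart nginx
docker-compose restart nginx
Monitoring data looks wrong:
# Check the Prometheus scrape targets
curl http://119.45.33.109:9090/api/v1/targets
# Restart the monitoring service
docker-compose restart prometheus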
Viewing logs
# View logs for all services
docker-compose logs
# View logs for a specific service
docker-compose logs wukongim
docker-compose logs nginx
docker-compose logs prometheus
# Follow logs in real time
docker-compose logs -f wukongim
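Scaling out
To add a new node to an existing cluster:
- Create the configuration file on the new node
- Assign a new, unique node ID
- Update WK_CLUSTER_INITNODES to include the new node
- Start the service on the new node
- Update the load balancer configuration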
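The WK_CLUSTER_INITNODES update in the steps above can be sketched as a small shell function. The helper name add_init_node and the new node's address 10.206.0.20 are assumptions for illustration only; the function just builds the new value and refuses a node ID that is already taken.

```shell
# add_init_node "LIST" ID IP: append ID@IP to a space-separated node list.
# Fails with status 1 if ID is already present in LIST.
add_init_node() {
  for entry in $1; do
    if [ "${entry%%@*}" = "$2" ]; then
      echo "node ID $2 is already in use" >&2
      return 1
    fi
  done
  echo "$1 $2@$3"
}

# Add a hypothetical fourth node to the example cluster.
add_init_node "1@10.206.0.10 2@10.206.0.12 3@10.206.0.5" 4 10.206.0.20
# prints: 1@10.206.0.10 2@10.206.0.12 3@10.206.0.5 4@10.206.0.20
```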