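Intranet Offline Deployment

Intranet machines must have docker, docker-compose, and iptables installed.

When a machine on the intranet can reach the internet

Set up a proxy server on the internet-connected machine

On the internet-connected machine, configure nginx to proxy the package sources; see install/kubernetes/nginx-https/apt-yum-pip-source.conf.

Start the nginx proxy

The proxy must listen on ports 80 and 443 on the host: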

docker run --name proxy-repo -d --restart=always --network=host -v $PWD/nginx-https/apt-yum-pip-source.conf:/etc/nginx/nginx.conf nginx

Configure hosts on the intranet machines

hosts

<egress-server-IP>    mirrors.aliyun.com
<egress-server-IP>    ccr.ccs.tencentyun.com
<egress-server-IP>    registry-1.docker.io
<egress-server-IP>    auth.docker.io
<egress-server-IP>    hub.docker.com
<egress-server-IP>    www.modelscope.cn
<egress-server-IP>    modelscope.oss-cn-beijing.aliyuncs.com
<egress-server-IP>    archive.ubuntu.com
<egress-server-IP>    security.ubuntu.com
<egress-server-IP>    cloud.r-project.org
<egress-server-IP>    deb.nodesource.com
<egress-server-IP>    docker-76009.sz.gfp.tencent-cloud.com

After adding new hosts entries, restart kubelet: docker restart kubelet
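The hosts entries listed above all map a proxied domain to the same egress server, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of the project: `render_hosts` is a hypothetical helper name and `192.0.2.10` is a placeholder address.

```python
# Hypothetical helper: render hosts-file lines that point every proxied
# domain at the egress (proxy) server. The domain list mirrors the one above.
PROXIED_DOMAINS = [
    "mirrors.aliyun.com",
    "ccr.ccs.tencentyun.com",
    "registry-1.docker.io",
    "auth.docker.io",
    "hub.docker.com",
    "www.modelscope.cn",
    "modelscope.oss-cn-beijing.aliyuncs.com",
    "archive.ubuntu.com",
    "security.ubuntu.com",
    "cloud.r-project.org",
    "deb.nodesource.com",
    "docker-76009.sz.gfp.tencent-cloud.com",
]


def render_hosts(egress_ip, domains=PROXIED_DOMAINS):
    """Return hosts-file text with one '<ip>    <domain>' line per domain."""
    return "".join(f"{egress_ip}    {domain}\n" for domain in domains)


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; substitute the real egress server address.
    print(render_hosts("192.0.2.10"), end="")
```

The output can be appended to the hosts file on each intranet machine; the same lines, indented, fit inside a CoreDNS hosts block.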

If the proxy machine cannot bind ports 80 and 443, try forwarding the traffic with iptables.

iptables

sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -d mirrors.aliyun.com -j DNAT --to-destination <egress-server-IP>:<egress-server-port>

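Configure domain resolution in k8s

In k8s, edit the coredns ConfigMap in the kube-system namespace and add a host mapping for each address that needs to be reachable.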

{
	"Corefile": ".:53 {
		    errors
		    health {
		      lameduck 5s
		    }
		    ready
		    kubernetes cluster.local in-addr.arpa ip6.arpa {
		      pods insecure
		      fallthrough in-addr.arpa ip6.arpa
		    }
		    # custom hosts
		    hosts {
		      <egress-server-IP>    mirrors.aliyun.com
		      <egress-server-IP>    ccr.ccs.tencentyun.com
		      <egress-server-IP>    registry-1.docker.io
		      <egress-server-IP>    auth.docker.io
		      <egress-server-IP>    hub.docker.com
		      <egress-server-IP>    www.modelscope.cn
		      <egress-server-IP>    modelscope.oss-cn-beijing.aliyuncs.com
		      <egress-server-IP>    archive.ubuntu.com
		      <egress-server-IP>    security.ubuntu.com
		      <egress-server-IP>    cloud.r-project.org
		      <egress-server-IP>    deb.nodesource.com
		      <egress-server-IP>    docker-76009.sz.gfp.tencent-cloud.com
		      fallthrough
		    }
		    prometheus :9153
		    forward . \"/etc/resolv.conf\"
		    cache 30
		    loop
		    reload
		    loadbalance
		} # STUBDOMAINS - Rancher specific change
		"
}

Restart the coredns pods.
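Once the hosts entries and the coredns mapping are in place, it can help to confirm that each domain really resolves to the egress server. The sketch below is one way to check this from a host or from inside a pod that has Python available; `check_resolution` is a hypothetical name, and the resolver is passed in as a parameter so the logic can be exercised without network access.

```python
import socket


def check_resolution(domains, expected_ip, resolve=socket.gethostbyname):
    """Return {domain: resolved_ip_or_None} for domains that do NOT resolve to expected_ip.

    `resolve` defaults to the system resolver; an empty result means every
    domain maps to the expected address.
    """
    mismatches = {}
    for domain in domains:
        try:
            ip = resolve(domain)
        except OSError:  # socket.gaierror is a subclass of OSError
            ip = None
        if ip != expected_ip:
            mismatches[domain] = ip
    return mismatches
```

For example, `check_resolution(["mirrors.aliyun.com"], "<egress-server-IP>")` returns an empty dict when the mapping has taken effect, and otherwise shows the address that was actually resolved.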

Using the proxied domains inside containers

Configure an HTTPS index for pip:

pip3 config set global.index-url https://mirrors.aliyun.com/pypi/simple

Configure an HTTPS source for apt: edit /etc/apt/sources.list

Ubuntu 20.04

deb https://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse

deb https://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse

deb https://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse

deb https://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse

deb https://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
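The list above repeats one pattern for five suites, so it can be rendered from the release codename. This is a sketch under stated assumptions: `render_sources` is a hypothetical helper, and using a codename other than focal assumes the mirror carries the same suites and components for that release.

```python
# Suite suffixes and components taken from the Ubuntu 20.04 list above.
SUITE_SUFFIXES = ["", "-updates", "-backports", "-security", "-proposed"]
COMPONENTS = "main restricted universe multiverse"


def render_sources(codename="focal", mirror="https://mirrors.aliyun.com/ubuntu/"):
    """Return sources.list text with a deb and deb-src line for every suite."""
    blocks = []
    for suffix in SUITE_SUFFIXES:
        suite = codename + suffix
        blocks.append(
            f"deb {mirror} {suite} {COMPONENTS}\n"
            f"deb-src {mirror} {suite} {COMPONENTS}"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(render_sources(), end="")
```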

Configure an HTTPS source for yum: download the Aliyun repo file

wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-8.repo

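Intranet machines with no internet access at all

Install dependencies and data

On a machine with internet access, download the following files, then copy the offline directory to the intranet machine.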

mkdir offline
cd offline
wget https://cube-studio.oss-cn-hangzhou.aliyuncs.com/install/kubectl
wget https://cube-studio.oss-cn-hangzhou.aliyuncs.com/harbor/harbor-offline-installer-v2.3.4.tgz
# Download models
wget https://docker-76009.sz.gfp.tencent-cloud.com/github/cube-studio/inference/resnet50.onnx
wget https://docker-76009.sz.gfp.tencent-cloud.com/github/cube-studio/inference/resnet50-torchscript.pt
wget https://docker-76009.sz.gfp.tencent-cloud.com/github/cube-studio/inference/resnet50.mar
wget https://docker-76009.sz.gfp.tencent-cloud.com/github/cube-studio/inference/tf-mnist.tar.gz

# Training and annotation datasets
wget https://cube-studio.oss-cn-hangzhou.aliyuncs.com/pipeline/coco_data_sample.zip
wget https://docker-76009.sz.gfp.tencent-cloud.com/github/cube-studio/aihub/deeplearning/cv-tinynas-object-detection-damoyolo/dataset/coco2014.zip
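Because the downloaded files are then carried to the intranet machine by hand, a checksum manifest is one way to confirm that nothing was truncated or corrupted in transit. The following sketch is an optional addition, not a step from the project; the function names and the `SHA256SUMS` file name are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(directory, manifest_name="SHA256SUMS"):
    """Write '<hash>  <name>' lines for every regular file directly in directory."""
    directory = Path(directory)
    lines = [
        f"{sha256_of(p)}  {p.name}"
        for p in sorted(directory.iterdir())
        if p.is_file() and p.name != manifest_name
    ]
    (directory / manifest_name).write_text("".join(line + "\n" for line in lines))


def verify_manifest(directory, manifest_name="SHA256SUMS"):
    """Return the names of files that are missing or whose hash differs."""
    directory = Path(directory)
    bad = []
    for line in (directory / manifest_name).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split("  ", 1)
        path = directory / name
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(name)
    return bad
```

A possible workflow is to call `write_manifest("offline")` on the internet-connected machine after the downloads finish, and `verify_manifest("offline")` on the intranet machine after the copy; an empty list means every file arrived intact.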

On the machine without internet access

1. Install kubectl

chmod +x kubectl  && cp kubectl /usr/bin/ && mv kubectl /usr/local/bin/

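2. Install the intranet image registry

See install/kubernetes/harbor/readme.md.

Then create the cube-studio and rancher projects, which hold the cube-studio base images and the rancher base images respectively.

On every machine, add the intranet private registry to Docker's insecure-registries setting. This can be skipped if the registry is served over HTTPS.

See: install/kubernetes/rancher/install_docker.md

3. Move the data downloaded earlier into the personal directory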

cp -r offline /data/k8s/kubeflow/pipeline/workspace/admin/

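Transferring images to the intranet

Transferring the rancher images

Set the intranet registry address in install/kubernetes/rancher/all_image.py, then run it to export the push and pull scripts.

On the internet-connected machine, run pull_rancher_images.sh to push the images to the intranet registry, or rancher_image_save.sh to save the images as compressed files that are then imported on the intranet machines.

On the machines without internet access, run on every machine either pull_rancher_harbor.sh to pull the images from the intranet registry, or rancher_image_load.sh to load them from the compressed files.

Deploying k8s on the intranet

With the images in place, k8s can be deployed on the intranet with rancher in the same way as usual.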

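Transferring the cube-studio base images

Set the intranet registry address in all_image.py, then run it to export the push and pull scripts.

On the internet-connected machine, run push_harbor.sh to push the images to the intranet registry, or image_save.sh to save the images as compressed files that are then imported on the intranet machines.

On the machines without internet access, run on every machine either pull_harbor.sh to pull the images from the intranet registry, or image_load.sh to load them from the compressed files.

Building the intranet version of cube-studio

On the internet-connected machine, rebuild the frontend and backend images and push them to the intranet registry.

Run the following from the install/kubernetes directory to replace the image references with intranet ones: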

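import os

# Intranet registry prefix that replaces the public image prefixes.
cube_repo = '<intranet-registry-ip>:<intranet-registry-port>/cube-studio/'


def fix_file(file_path):
    """Rewrite image references in a file, or in every file directly inside a directory."""
    if os.path.isdir(file_path):
        file_paths = [os.path.join(file_path, one) for one in os.listdir(file_path)]
    else:
        file_paths = [file_path]

    for path in file_paths:
        if not os.path.isfile(path):
            continue  # skip subdirectories, which cannot be opened as files
        with open(path, mode='r') as f:
            content = f.read()
        content = content.replace('ccr.ccs.tencentyun.com/cube-studio/', cube_repo)  # self-built images
        content = content.replace('docker:23.0.4', cube_repo + 'docker:23.0.4')  # docker image
        content = content.replace('python:', cube_repo + 'python:')  # python images
        with open(path, mode='w') as f:
            f.write(content)


# Run this once only: the docker and python replacements add the prefix again on every run.
fix_file('cube/overlays/config/config.py')
fix_file('../../myapp/init')
fix_file('../../myapp/init-en')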
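The plain string replacements above prefix `docker:23.0.4` and `python:` every time they run, so a second run would produce a doubled registry prefix. The sketch below shows one possible variant that can be re-run safely: it only prefixes a reference that is not already part of a longer image path. `localize_images` is a hypothetical name, and the stricter matching is an assumption that differs from the script above (for example, `library/python:3.9` is left untouched).

```python
import re

# Match a bare image reference: not preceded by a character that would make
# it part of a longer name or path (letters, digits, '_', '.', '/', ':', '-').
BARE_IMAGE = re.compile(r"(?<![\w./:-])(docker:23\.0\.4|python:)")


def localize_images(content, cube_repo):
    """Point image references in `content` at the intranet registry prefix."""
    # Self-built images: swap the public registry prefix for the intranet one.
    content = content.replace("ccr.ccs.tencentyun.com/cube-studio/", cube_repo)
    # Bare docker/python images: add the prefix only where it is missing.
    return BARE_IMAGE.sub(lambda m: cube_repo + m.group(1), content)
```

Because an already prefixed reference is preceded by `/`, applying `localize_images` to its own output leaves the text unchanged.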

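From the project root (replace 192.168.3.7:88 in the commands with your intranet registry address):

Build the frontend: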
docker build --network=host -t 192.168.3.7:88/cube-studio/kubeflow-dashboard-frontend-enterprise:2024.01.01-offline -f install/docker/dockerFrontend/Dockerfile .
docker push 192.168.3.7:88/cube-studio/kubeflow-dashboard-frontend-enterprise:2024.01.01-offline
Build the backend:
docker build --network=host -t 192.168.3.7:88/cube-studio/kubeflow-dashboard-enterprise:2024.01.01-offline --build-arg TARGETARCH=amd64 -f install/docker/Dockerfile .
docker push 192.168.3.7:88/cube-studio/kubeflow-dashboard-enterprise:2024.01.01-offline

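Deploying cube-studio on the intranet

1. In init_node.sh, change pull_images.sh to pull_harbor.sh so that images are pulled from the intranet registry. The script must be run on every machine.

2. Skip the kubectl download by commenting out the following: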

ARCH=$(uname -m)

if [ "$ARCH" = "x86_64" ]; then
  wget https://cube-studio.oss-cn-hangzhou.aliyuncs.com/install/kubectl && chmod +x kubectl  && cp kubectl /usr/bin/ && mv kubectl /usr/local/bin/
elif [ "$ARCH" = "aarch64" ]; then
  wget -O kubectl https://cube-studio.oss-cn-hangzhou.aliyuncs.com/install/kubectl-arm64 && chmod +x kubectl  && cp kubectl /usr/bin/ && mv kubectl /usr/local/bin/
fi

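3. Change the cube-studio image to the intranet image.

vi cube/overlays/kustomization.yml
Edit newName and newTag at the bottom of the file.

4. Copy the k8s config file and deploy cube-studio. The procedure is the same as for an internet-connected deployment; see: Deployment / Single-node deployment.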

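Intranet adjustments in the web UI

1. In the web UI, change hubsecret to the account and password of the intranet registry.

2. In the built-in object-detection pipeline, change the start command of the first task (data download) to: cp offline/coco_data_sample.zip ./ && ...

3. In the built-in inference services, change the wget https://xxxx/xx/.zip part of the start command to cp /mnt/admin/offline/xx.zip ./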
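The command rewrite in item 3 follows a fixed pattern: each `wget <url>` becomes a `cp` of the same file name from the offline directory. A minimal sketch of that substitution is shown below; `to_offline_command` is a hypothetical helper, and it assumes plain `wget <url>` invocations without extra options and URLs without query strings.

```python
import re

# 'wget' followed by a URL; the last path segment is captured as the file name.
WGET_URL = re.compile(r"wget\s+https?://\S+/(?P<name>[^/\s]+)")


def to_offline_command(command, offline_dir="/mnt/admin/offline"):
    """Replace every 'wget <url>' in a start command with a copy from offline_dir."""
    return WGET_URL.sub(
        lambda m: f"cp {offline_dir}/{m.group('name')} ./", command
    )
```

For instance, a start command beginning with `wget https://example.com/models/resnet50.onnx && ...` would become `cp /mnt/admin/offline/resnet50.onnx ./ && ...`; here example.com is a placeholder host.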

Feel free to share this article.

Open-source demo address:

http://39.96.177.55:8888/
