Setting up the environment for StyleGAN2
Overview
I did not expect to lose half a day to environment problems again, so here is a quick record.
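The key point is to install the correct CUDA toolkit manually. Installing CUDA through conda does not work here: that install is incomplete, and we need a full toolkit to compile the custom operators.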
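A small complaint about the README of https://github.com/eladrich/pixel2style2pixel: it never says which CUDA version is needed. I only found out by going back to the original project that the CUDA version has to be CUDA 10.1/10.2.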
Steps
Install CUDA 10.1 manually:
chmod u+x cuda_10.1.105_418.39_linux.run
sudo ./cuda_10.1.105_418.39_linux.run
Then set the environment variables in ~/.bashrc (vi ~/.bashrc):
export CUDA_HOME=/usr/local/cuda-10.1
export PATH="${CUDA_HOME}/bin:$PATH"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${CUDA_HOME}/extras/CUPTI/lib64:$LD_LIBRARY_PATH"
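After opening a new shell, it may be worth confirming that the toolkit on PATH is really the one just installed. The following is only a sketch: parse_cuda_release is a hypothetical helper written for this note, and it assumes nvcc prints a line of the form "Cuda compilation tools, release 10.1, V10.1.105".

```bash
#!/bin/sh
# Sketch: confirm the shell sees the manually installed toolkit.
# parse_cuda_release is a hypothetical helper, not part of CUDA.

# Read nvcc's version text on stdin and print only the release number,
# e.g. "10.1" from "Cuda compilation tools, release 10.1, V10.1.105".
parse_cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

echo "CUDA_HOME=${CUDA_HOME:-<unset>}"
if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc path: $(command -v nvcc)"
    echo "nvcc release: $(nvcc --version | parse_cuda_release)"
else
    echo "nvcc not found on PATH; re-check ~/.bashrc and open a new shell"
fi
```

If the reported path is not under /usr/local/cuda-10.1, another toolkit is probably shadowing this one earlier in PATH.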
Install cuDNN:
sudo dpkg -i ./libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
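To double-check that the cuDNN package was registered and targets the same CUDA version, something like the sketch below could be used. It assumes the package name is libcudnn7 (inferred from the .deb file name), and cudnn_cuda_suffix is a hypothetical helper, not a real tool.

```bash
#!/bin/sh
# cudnn_cuda_suffix is a hypothetical helper for this note.
# It prints the CUDA version embedded in a package version string
# such as "7.6.5.32-1+cuda10.1"; prints nothing if there is no such suffix.
cudnn_cuda_suffix() {
    printf '%s\n' "$1" | sed -n 's/.*+cuda\([0-9][0-9.]*\)$/\1/p'
}

# Assumption: the package is named libcudnn7, as the .deb file name suggests.
if command -v dpkg-query >/dev/null 2>&1 &&
   ver=$(dpkg-query -W -f='${Version}' libcudnn7 2>/dev/null); then
    echo "libcudnn7 version: $ver"
    echo "built for CUDA: $(cudnn_cuda_suffix "$ver")"
else
    echo "libcudnn7 is not registered with dpkg"
fi
```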
Create a conda environment and install the dependencies:
conda create --name pytorch1.6 python=3.6.7 -y
conda activate pytorch1.6
pip install matplotlib opencv-python
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install tensorflow-gpu==1.15 # note: the tensorflow installed here cannot actually use the GPU, but that does not matter
conda install ninja=1.10.0
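Before building the custom operators, a quick look at what PyTorch itself reports can save a failed compile. This is a minimal sketch to run inside the activated environment; have_tool is a hypothetical helper, while torch.version.cuda, torch.cuda.is_available() and torch.utils.cpp_extension.CUDA_HOME are existing PyTorch names.

```bash
#!/bin/sh
# Run inside the activated conda environment.
# have_tool is a hypothetical helper: succeeds if the command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool ninja; then
    echo "ninja: $(ninja --version)"
else
    echo "ninja: missing"
fi

if have_tool python && python -c 'import torch' >/dev/null 2>&1; then
    python - <<'EOF'
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("torch:", torch.__version__)
print("built against CUDA:", torch.version.cuda)
print("GPU visible:", torch.cuda.is_available())
# Toolkit root that PyTorch will use when compiling extensions.
print("toolkit used for extensions:", CUDA_HOME)
EOF
else
    echo "torch is not importable from this python"
fi
```

The hope is that the CUDA version PyTorch was built against and the toolkit it will compile with both point at 10.1; a mismatch there is one likely source of build errors.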
It finally runs 😅
Errors encountered along the way
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1
RuntimeError: Error building extension 'fused'
FAILED: fused_bias_act_kernel.cuda.o
error: no suitable constructor exists to convert from "c10::ScalarType" to "at::Type"
note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(input);
^
ninja: build stopped: subcommand failed.
nvcc fatal : Unsupported gpu architecture 'compute_75'
<char16_t>; _Alloc = std::allocator<char16_t>]’ without object __p->_M_set_sharable();
/bin/sh: 1: :/usr/local/cuda/bin/nvcc: not found
#define CHECK_CUDA(x) TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
error: ‘TORCH_CHECK’ was not declared in this scope
#define CHECK_CUDA(x) TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(input);
^
ninja: build stopped: subcommand failed
RuntimeError: Ninja is required to load C++ extensions
Those are the assorted errors I ran into just to get one environment working.
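One of the messages, "/bin/sh: 1: :/usr/local/cuda/bin/nvcc: not found", has a stray colon in front of the path. A plausible reading (an assumption, the log does not say so) is that the CUDA home was set to a PATH-like string rather than a single directory. A small check along these lines could catch that before a rebuild; check_cuda_home is a hypothetical helper written for this note.

```bash
#!/bin/sh
# check_cuda_home is a hypothetical helper for this note.
# It prints "ok" if the given directory looks like a usable toolkit root,
# otherwise a short reason, and returns non-zero on failure.
check_cuda_home() {
    case "$1" in
        "")  echo "empty"; return 1 ;;
        *:*) echo "contains ':'"; return 1 ;;
    esac
    if [ -x "$1/bin/nvcc" ]; then
        echo "ok"
    else
        echo "no executable bin/nvcc"
        return 1
    fi
}

check_cuda_home "${CUDA_HOME:-}" || echo "fix CUDA_HOME before rebuilding"
```

For example, check_cuda_home ":/usr/local/cuda" reports the colon, while a correctly installed /usr/local/cuda-10.1 should report ok.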