RuntimeError: Error building extension 'fused' / FAILED: fused_bias_act_kernel.cuda.o / ninja: build stopped: subcommand failed.

The problem is as follows:
RuntimeError: Error building extension 'fused': [1/3] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /autodl-tmp/Run_dir/20210817172013/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c /autodl-tmp/Run_dir/20210817172013/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_86'
[2/3] c++ -MMD -MF fused_bias_act.o.d -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/TH -isystem /root/miniconda3/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp -o fused_bias_act.o
In file included from /root/miniconda3/lib/python3.8/site-packages/torch/include/c10/core/DeviceType.h:8:0,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/c10/core/Device.h:3,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/c10/core/Allocator.h:6,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/ATen.h:7,
                 from /autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:2:
/autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp: In function 'at::Tensor fused_bias_act(const at::Tensor&, const at::Tensor&, const at::Tensor&, int, int, float, float)':
/autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:11:22: warning: 'at::DeprecatedTypeProperties& at::Tensor::type() const' is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
   TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
                      ^
/autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:15:3: note: in expansion of macro 'CHECK_CUDA'
   CHECK_CUDA(x);
   ^~~~~~~~~~
/autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:22:3: note: in expansion of macro 'CHECK_INPUT'
   CHECK_INPUT(input);
   ^~~~~~~~~~~
In file included from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/Tensor.h:3:0,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/Context.h:4,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/ATen.h:9,
                 from /autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:2:
/root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/core/TensorBody.h:277:30: note: declared here
   DeprecatedTypeProperties & type() const {
                              ^~~~
In file included from /root/miniconda3/lib/python3.8/site-packages/torch/include/c10/core/DeviceType.h:8:0,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/c10/core/Device.h:3,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/c10/core/Allocator.h:6,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/ATen.h:7,
                 from /autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:2:
/autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:11:22: warning: 'at::DeprecatedTypeProperties& at::Tensor::type() const' is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
   TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
                      ^
/autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:15:3: note: in expansion of macro 'CHECK_CUDA'
   CHECK_CUDA(x);
   ^~~~~~~~~~
/autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:23:3: note: in expansion of macro 'CHECK_INPUT'
   CHECK_INPUT(bias);
   ^~~~~~~~~~~
In file included from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/Tensor.h:3:0,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/Context.h:4,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/ATen.h:9,
                 from /autodl-tmp/Run_dir/20210817172013/op/fused_bias_act.cpp:2:
/root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/core/TensorBody.h:277:30: note: declared here
   DeprecatedTypeProperties & type() const {
                              ^~~~
ninja: build stopped: subcommand failed.

Fixing this one nearly drove me crazy. Why?
Because none of the solutions I could find online worked.

  • First, the very first error complained about the "compiled C++/cuda extension".
    The fix for that was pip install ninja.
    No problem there.

  • Then, on the next run, it failed again with FAILED: fused_bias_act_kernel.cuda.o
    ninja: build stopped: subcommand failed.
    Most of the advice I found said to switch PyTorch versions, check whether the CUDA paths exist, reconfigure the CUDA environment, and so on. I tried all of it, with no luck.
    Then I happened to notice that the error output contains this line:
    FAILED: fused_bias_act_kernel.cuda.o
    which means the .cu file itself failed to compile, so the cause had to be the CUDA environment or the nvcc setup; I checked and reconfigured both.
    Later I paid attention to this line and searched for it as well:
    nvcc fatal : Unsupported gpu architecture 'compute_86'
    After a lot of digging I found this blog post:
    https://blog.csdn.net/cwm19950318/article/details/111287797
    and it solved the problem. I was running on an RTX 3090 at the time, whose compute capability (8.6) is too new for the installed CUDA toolkit: nvcc only gained compute_86 support in CUDA 11.1. I tried the suggested workaround and it worked (a quick way to check your GPU and toolkit versions is sketched right after this list).
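As referenced above, here is a minimal diagnostic sketch, using only standard torch and subprocess calls (nothing here is specific to the failing repo). It compares the compute capability the driver reports against the nvcc on your PATH; if nvcc reports a release older than 11.1, it cannot generate compute_86 code:

import subprocess
import torch

# Compute capability of GPU 0 as reported by the driver,
# e.g. (8, 6) on an RTX 3090.
major, minor = torch.cuda.get_device_capability(0)
print(f"GPU compute capability: {major}.{minor}")

# CUDA version this PyTorch build was compiled against (may differ from nvcc's).
print(f"PyTorch built with CUDA: {torch.version.cuda}")

# Toolkit version of the nvcc that torch.utils.cpp_extension will invoke;
# releases older than 11.1 do not know about compute_86.
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)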

The fix is:

vim ~/.bashrc

and append this line at the end:

export TORCH_CUDA_ARCH_LIST="7.5"

Targeting a lower compute capability makes the error go away. Run source ~/.bashrc (or open a new shell) so the variable takes effect before rebuilding.
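If you would rather not edit ~/.bashrc, the same variable can also be set from Python before the extension is built. This is a minimal sketch, assuming the op/ source layout from the log above (adjust the paths to your repo); the optional "+PTX" suffix additionally embeds PTX so the driver can JIT-compile the kernel for newer GPUs such as the 3090:

import os

# Must be set before torch.utils.cpp_extension starts compiling.
# Plain "7.5" is what worked for the author; "7.5+PTX" also embeds PTX
# that the driver can JIT-compile for sm_86 GPUs.
os.environ["TORCH_CUDA_ARCH_LIST"] = "7.5+PTX"

from torch.utils.cpp_extension import load

# Source paths taken from the build log above; adjust as needed.
fused = load(
    name="fused",
    sources=["op/fused_bias_act.cpp", "op/fused_bias_act_kernel.cu"],
)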
Of course, your CUDA and cuDNN versions still have to match your PyTorch build; how to configure the environment is easy to find online, there are plenty of guides.
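As a quick check, these standard torch APIs (nothing specific to this repo) report the versions your build actually uses:

import torch

print(torch.__version__)               # PyTorch version
print(torch.version.cuda)              # CUDA version PyTorch was built with
print(torch.backends.cudnn.version())  # cuDNN version, e.g. 7605 for 7.6.5
print(torch.cuda.is_available())       # True only if driver and runtime line up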
