
I am trying to train a model on Ubuntu 18, following the TensorFlow GPU documentation: https://www.tensorflow.org/install/gpu. My setup is Ubuntu 18, CUDA 11, and tensorflow-gpu 1.13. I ran into this problem:

2021-02-03 13:16:00.755944: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.756245: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.756534: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.756834: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757106: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757389: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757674: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-02-03 13:16:00.757800: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2021-02-03 13:16:00.757899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-02-03 13:16:00.757992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-02-03 13:16:00.758088: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-02-03 13:16:01.201726: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.

From the errors I can see that the CUDA files were not found, and after checking, there are indeed no such files on the system.
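One way to confirm which libraries the dynamic loader cannot find is to try opening each name from the log directly. The following is only a minimal sketch: the library names are copied from the dlopen messages above, while the helper names `can_dlopen` and `check_libs` are made up for illustration.

```python
import ctypes

# Library names copied from the "Could not dlopen library" lines in the log.
REQUIRED_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
    "libcudnn.so.7",
]


def can_dlopen(name):
    """Return True if the dynamic loader can find and open `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the shared object is missing or cannot be loaded.
        return False


def check_libs(names=REQUIRED_LIBS):
    """Map each library name to whether it could be opened."""
    return {name: can_dlopen(name) for name in names}


if __name__ == "__main__":
    for name, ok in check_libs().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If every entry is reported as missing on a machine with CUDA 11 installed, that suggests a version mismatch rather than a broken installation, since the `.so.10.0` suffix refers to CUDA 10.0.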


1 Answer


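The problem was the TensorFlow version. As the log shows, 1.13 looks for the CUDA 10.0 libraries (libcudart.so.10.0 and so on) and cuDNN 7, which a CUDA 11 installation does not provide. I updated TensorFlow to 2.4 and it now works.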
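After an upgrade like this, a quick check that TensorFlow actually registers the GPU can be useful. This is a hedged sketch, assuming TensorFlow 2.1 or later (where `tf.config.list_physical_devices` is available); the helper name `visible_gpus` is hypothetical.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see.

    Returns None if TensorFlow is not installed in this environment.
    Assumes TensorFlow 2.1+ for tf.config.list_physical_devices.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed")
    elif gpus:
        print("GPUs visible to TensorFlow:", gpus)
    else:
        print("No GPU registered; check the log for dlopen errors")
```

An empty list here would mean TensorFlow still skipped GPU registration, in which case the startup log should again name the libraries it failed to open.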

Answered 2021-02-03T14:26:02.427