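System information
- What is the top-level directory of the model you are using: models/tree/master/research/tensorrt
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): GNU/Linux 4.4.0-128-generic x86_64
- TensorFlow installed from (source or binary): via Docker image nvcr.io/nvidia/tensorflow:18.07-py3
- TensorFlow version (use command below): 1.18
- TensorRT version: 4.0.1
- Bazel version (if compiling from source): N/A
- CUDA/cuDNN version: 9.0.176 / 7.1.4
- GPU model and memory: Tesla V100-SXM2-16GB
- Exact command to reproduce: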
python3 tensorrt.py --frozen_graph=resnetv2_imagenet_frozen_graph.pb --image_file=image.jpg --native --fp32 --fp16 --int8 --output_dir=output_trt
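Describe the problem
I am running the tensorrt.py example from the container image, but it fails with the following error:
python3: helpers.cpp:56: nvinfer1::DimsCHW nvinfer1::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.
Without the --int8 flag, the test works fine.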
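Going only by the assertion text, getCHW appears to have been handed a tensor shape with fewer than three dimensions during the INT8 run. The snippet below is a minimal Python sketch of that kind of check, not TensorRT's actual code: the name get_chw and the choice to return the trailing three dimensions are assumptions made for illustration.

```python
def get_chw(dims):
    """Hypothetical stand-in for nvinfer1::getCHW (illustration only).

    Assumption: the real helper extracts (C, H, W) from the trailing
    three dimensions; only the `nbDims >= 3` check is taken from the
    error message itself.
    """
    dims = tuple(dims)
    # Mirrors the failing assertion `d.nbDims >= 3`.
    if len(dims) < 3:
        raise ValueError(
            "expected at least 3 dims, got %d: %r" % (len(dims), dims)
        )
    return dims[-3:]


if __name__ == "__main__":
    # An image-like shape passes the check.
    print(get_chw((3, 224, 224)))
    # A 1-D shape (for example a flattened logits vector) does not.
    try:
        get_chw((1001,))
    except ValueError as err:
        print("check failed:", err)
```

If this reading is right, a tensor of rank 1 or 2 somewhere in the converted INT8 segment would be enough to trigger the failure, but I have not confirmed which tensor that is.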
Source code / logs
Lots of output from successfully running the native graph, FP32 graph, and FP16 graph, followed by:
Running INT8 graph
2018-08-14 19:29:51.458632: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 4
2018-08-14 19:29:52.261613: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:419] MULTIPLE tensorrt candidate conversion: 2
2018-08-14 19:29:52.539209: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3087] Max batch size= 128 max workspace size= 2123190784
2018-08-14 19:29:52.539473: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3101] finished op preparation
2018-08-14 19:29:52.539645: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3109] OK
2018-08-14 19:29:52.539655: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3110] finished op building
2018-08-14 19:29:52.568280: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3087] Max batch size= 128 max workspace size= 24292802
2018-08-14 19:29:52.568346: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3101] finished op preparation
2018-08-14 19:29:52.568374: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3109] OK
2018-08-14 19:29:52.568383: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:3110] finished op building
INFO:tensorflow:Starting execution
2018-08-14 19:29:56.151230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0, 1, 2, 3
2018-08-14 19:29:56.151342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-14 19:29:56.151354: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0 1 2 3
2018-08-14 19:29:56.151380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N Y Y Y
2018-08-14 19:29:56.151389: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 1: Y N Y Y
2018-08-14 19:29:56.151397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 2: Y Y N Y
2018-08-14 19:29:56.151405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 3: Y Y Y N
2018-08-14 19:29:56.152301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8080 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1a:00.0, compute capability: 7.0)
2018-08-14 19:29:56.152492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 8080 MB memory) -> physical GPU (device: 1, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1c:00.0, compute capability: 7.0)
2018-08-14 19:29:56.152650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 8080 MB memory) -> physical GPU (device: 2, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1d:00.0, compute capability: 7.0)
2018-08-14 19:29:56.152773: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 8080 MB memory) -> physical GPU (device: 3, name: Tesla V100-SXM2-16GB, pci bus id: 0000:1e:00.0, compute capability: 7.0)
INFO:tensorflow:Starting Warmup cycle
python3: helpers.cpp:56: nvinfer1::DimsCHW nvinfer1::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.