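You are using ImageAI: https://github.com/OlafenwaMoses/ImageAI
Although it is not deprecated, the last commit to that repository dates from January 2019.
It also integrates outdated networks into its framework
(for example keras-retinanet, which is deprecated).
That said, I will answer your last question:
"Are there other ways to use pre-trained object detection models?"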
Yes, there are. tensorflow and pytorch, currently the main deep learning libraries, both offer them.
For example, pytorch has a few detection models implemented in torchvision.models.detection:
https://github.com/pytorch/vision/tree/master/torchvision/models/detection
Note 1: To install pytorch, run the following in your conda environment:
conda install torchvision -c pytorch
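Note 2: The code that follows was put together by combining the docstrings in:
https://github.com/pytorch/vision/blob/master/torchvision/models/detection/retinanet.py
and this tutorial:
https://debuggercafe.com/faster-rcnn-object-detection-with-pytorch/
I suggest you take a look at both of them as well.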
import cv2
import requests
import torchvision
import numpy as np
from torchvision import transforms
from PIL import Image
from io import BytesIO
coco_names = [
'__background__', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A', 'stop sign',
'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack', 'umbrella', 'N/A', 'N/A',
'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table',
'N/A', 'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A', 'book',
'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'
]
COLORS = np.random.uniform(0, 255, size=(len(coco_names), 3))
# read an image from the internet
url = "https://raw.githubusercontent.com/fizyr/keras-retinanet/master/examples/000000008021.jpg"
response = requests.get(url)
image = Image.open(BytesIO(response.content)).convert("RGB")
# create a retinanet inference model
model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True, score_thresh=0.3)
model.eval()
# predict detections in the input image
image_as_tensor = transforms.Compose([transforms.ToTensor(), ])(image)
outputs = model(image_as_tensor.unsqueeze(0))
# post-process the detections ( filter them out by score )
detection_threshold = 0.5
pred_classes = [coco_names[i] for i in outputs[0]['labels'].cpu().numpy()]
pred_scores = outputs[0]['scores'].detach().cpu().numpy()
pred_bboxes = outputs[0]['boxes'].detach().cpu().numpy()
boxes = pred_bboxes[pred_scores >= detection_threshold].astype(np.int32)
classes = pred_classes
labels = outputs[0]['labels']
# draw predictions
# PIL gives RGB, OpenCV draws and displays in BGR
image = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
for i, box in enumerate(boxes):
    color = COLORS[labels[i]]
    cv2.rectangle(image, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), color, 2)
    cv2.putText(image, classes[i], (int(box[0]), int(box[1] - 5)), cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2,
                lineType=cv2.LINE_AA)
cv2.imshow('Image', image)
cv2.waitKey(0)
Output: the input image is shown in a window with the detected boxes and class names drawn on it.
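
One thing to be aware of in the post-processing step: only the boxes are filtered by score, while `classes` and `labels` are left unfiltered, so `classes[i]` and `labels[i]` only line up with `boxes[i]` as long as the detections come back ordered by decreasing score. If you would rather not depend on that, here is a minimal sketch that filters all three together. It assumes plain Python lists (for example obtained with `.tolist()` on the tensors), and the helper name `filter_detections` is my own, not part of torchvision.

```python
def filter_detections(boxes, scores, labels, threshold=0.5):
    """Keep only detections whose score is >= threshold.

    boxes, scores and labels are parallel sequences (one entry per
    detection). Returns three lists that are still aligned with each
    other, whatever order the detections arrived in.
    """
    kept = [
        (box, score, label)
        for box, score, label in zip(boxes, scores, labels)
        if score >= threshold
    ]
    kept_boxes = [item[0] for item in kept]
    kept_scores = [item[1] for item in kept]
    kept_labels = [item[2] for item in kept]
    return kept_boxes, kept_scores, kept_labels


# Small made-up example (these values are illustrative, not model output).
example_boxes = [[0, 0, 10, 10], [5, 5, 20, 20], [1, 1, 2, 2]]
example_scores = [0.9, 0.4, 0.6]
example_labels = [1, 3, 18]
kept_boxes, kept_scores, kept_labels = filter_detections(
    example_boxes, example_scores, example_labels, threshold=0.5
)
```

With the code in this answer you could call it as `filter_detections(outputs[0]['boxes'].tolist(), outputs[0]['scores'].tolist(), outputs[0]['labels'].tolist(), detection_threshold)` and then look up `coco_names[label]` for each kept label.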
