python - Extracting OCR text from user interface images

Question

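I am currently using Pytesseract to extract text from images of sites such as Amazon, eBay, and other e-commerce pages in order to observe certain patterns. I don't want to use a web crawler, because the goal is to identify patterns in the text as it appears on such sites. An example image is shown below: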

However, every site looks different, so template matching doesn't help either. The background colors of the images also vary.

The code below gives me about 40% accuracy. But if I crop the image into smaller pieces, it returns all of the text correctly.

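Is there a way to take an image, crop it into multiple parts, and then extract the text from each part? Preprocessing the image did not help. I have tried rescaling, noise removal, deskewing, skewing, adaptive thresholding, grayscale conversion, Otsu thresholding, and so on, but I don't know what else to do.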

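try:
    from PIL import Image
except ImportError:
    # Fallback for the legacy standalone PIL package.
    import Image
import pytesseract


def ocr_processing(filename):
    """
    Open the file with Pillow and run Tesseract on it through pytesseract.

    Returns Tesseract's TSV output as a string: one row per detected
    element, with its bounding box, confidence, and text.
    """
    # --psm 6 tells Tesseract to assume a single uniform block of text.
    data = pytesseract.image_to_data(
        Image.open(filename), lang='eng', config='--psm 6')
    # Alternative: plain text output with sparse-text segmentation.
    # data = pytesseract.image_to_string(
    #     Image.open(filename), lang='eng', config='--psm 11')
    return data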
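
The cropping idea described above could be sketched roughly as follows. This is only an illustration, not something from the original post: the helper names `tile_boxes` and `ocr_tiles`, the 800-pixel tile size, and the 100-pixel overlap are all assumptions chosen for the example. The overlap is there so that a word lying on a tile boundary appears whole in at least one tile, at the cost of some text being reported twice.

```python
def tile_boxes(width, height, tile=800, overlap=100):
    """Return (left, upper, right, lower) crop boxes that cover the image.

    Hypothetical helper: neighbouring tiles share `overlap` pixels.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap
    boxes = []
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            # Clamp the tile to the image bounds at the right/bottom edges.
            boxes.append((left, top,
                          min(left + tile, width),
                          min(top + tile, height)))
    return boxes


def ocr_tiles(filename, tile=800, overlap=100, config="--psm 6"):
    """Run Tesseract on each tile and return a list of (box, text) pairs."""
    # Imported here so tile_boxes can be used without Tesseract installed.
    from PIL import Image
    import pytesseract

    image = Image.open(filename)
    results = []
    for box in tile_boxes(image.width, image.height, tile, overlap):
        text = pytesseract.image_to_string(
            image.crop(box), lang="eng", config=config)
        results.append((box, text))
    return results
```

Text inside the overlap regions may be returned by two neighbouring tiles, so the results would still need de-duplication before looking for patterns.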

score 1 · Accepted Answer

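If you have a lot of text and want to detect it with OCR (as in the image above), then, just as a suggestion, keras-ocr is a very good choice. It works much better than pytesseract or than using EAST alone. This was suggested in the comments section. It was able to pick up 98.99% of the text correctly.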

Here is the link to the keras-ocr documentation: https://keras-ocr.readthedocs.io/en/latest/
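
For reference, a minimal sketch of running keras-ocr on a single screenshot might look like the following. It relies on the library's pretrained `Pipeline`, which downloads model weights on first use. The helpers `words_in_reading_order` and `recognize_words`, and the 10-pixel `line_tolerance`, are assumptions made for this example and are not part of keras-ocr; they only group the detected words into rough lines.

```python
def words_in_reading_order(predictions, line_tolerance=10):
    """Group (word, box) pairs into lines, top to bottom, left to right.

    Hypothetical helper: `box` is a sequence of four (x, y) corner points,
    which is the shape keras-ocr returns for each recognized word.
    """
    items = []
    for word, box in predictions:
        top = min(float(p[1]) for p in box)
        left = min(float(p[0]) for p in box)
        items.append((top, left, word))
    items.sort()

    lines, current, current_top = [], [], None
    for top, left, word in items:
        # Start a new line when this word sits clearly below the current one.
        if current and top - current_top > line_tolerance:
            lines.append(current)
            current = []
        if not current:
            current_top = top
        current.append((left, word))
    if current:
        lines.append(current)
    return [" ".join(w for _, w in sorted(line)) for line in lines]


def recognize_words(filename):
    """Detect and recognize text in one image with keras-ocr."""
    # Imported here because loading keras-ocr pulls in TensorFlow.
    import keras_ocr

    pipeline = keras_ocr.pipeline.Pipeline()   # pretrained detector + recognizer
    image = keras_ocr.tools.read(filename)     # accepts a file path or URL
    predictions = pipeline.recognize([image])[0]
    return words_in_reading_order(predictions)
```

The line grouping is deliberately crude; for pages with several columns it would likely need a smarter layout step.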
