Tesseract-ocr ->iterator.WordFontAttributes() does not work

Maia07 · (This post was last modified: Sep-01-2018, 02:48 PM by Gribouillis.)

I'm processing each image in my directory; my goal is to extract the text and then inspect the text attributes. The text extraction works, but when I try to read the text attributes with the PyTessBaseAPI() API, it fails for some of my images: the text attributes are not recognized and the Python shell prints "=============================== RESTART: Shell ===============================".

Here is the code:

import io
import os

import cv2
from PIL import Image, ImageFilter
from pytesseract import image_to_string
from tesserocr import PyTessBaseAPI

# contours, image_file, padding and readimage() are defined elsewhere (not shown)
for i, cnt in enumerate(contours):
    x, y, w, h = cv2.boundingRect(cnt)

    x = x - 3
    y = y - 3

    # skip boxes that would start outside the image
    if x < 0 or y < 0:
        continue

    cropped = image_file[y: y+h+padding, x: x+w+padding]
    # make the image bigger so the text is recognized better
    cropped = cv2.resize(cropped, (0, 0), fx=4.0, fy=4.0)
    # convert the NumPy array to a PIL image
    im = Image.fromarray(cropped.astype('uint8'), 'RGB')
    im = im.filter(ImageFilter.SHARPEN())

    text = image_to_string(im)
    #print(text)
    if text != "":
        #print("OCR Output : " + image_to_string(im))
        cv2.imwrite("img_text/cropped"+str(i)+".png", cropped)

        path = os.path.abspath("img_text/cropped"+str(i)+".png")
        with PyTessBaseAPI() as api:
            #img = Image.open(path, mode='r')
            data = readimage(path)  # renamed from `bytes` to avoid shadowing the builtin
            img = Image.open(io.BytesIO(data))
            api.SetImage(img)
            api.Recognize()  # required to get a result from the next line
            iterator = api.GetIterator()
            #print(iterator.WordFontAttributes())
            attrs = iterator.WordFontAttributes()  # renamed from `dict`
            #print(attrs['font_name'])
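
One thing I could try is guarding the call, so WordFontAttributes() is only invoked when the iterator actually points at a word. Below is a minimal sketch, assuming the object returned by api.GetIterator() exposes Empty(level) and Next(level) the way tesserocr's page iterator does. The helper name collect_word_font_attributes is hypothetical, and I have not confirmed that a missing word is what triggers the shell restart.

```python
def collect_word_font_attributes(iterator, word_level):
    """Return the font attributes of every word the iterator can reach.

    Assumptions: `iterator` behaves like the result of api.GetIterator()
    (it may be None when nothing was recognized), and `word_level` is the
    word-level constant, e.g. tesserocr's RIL.WORD.
    """
    results = []
    if iterator is None:
        return results
    while True:
        # Only ask for attributes when a word is actually present.
        if not iterator.Empty(word_level):
            attrs = iterator.WordFontAttributes()
            if attrs:  # skip None or empty results
                results.append(attrs)
        # Next() returns False once there are no more words.
        if not iterator.Next(word_level):
            break
    return results
```

Inside the `with PyTessBaseAPI() as api:` block, after api.Recognize(), this would be called as collect_word_font_attributes(api.GetIterator(), RIL.WORD), with RIL imported from tesserocr.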

Does anybody know what I'm doing wrong here?

Thanks
