Best OCR app or tool for scanning MSX listings

By Randam

Paragon (1425)

21-12-2020, 21:24

Does anyone know which app or OCR tool works best for scanning MSX (BASIC) listings? I have some Japanese books and magazines with listings in them and would like to do it the easy way, rather than typing everything out.

Any suggestions?

By CASDuino

Master (253)

21-12-2020, 22:20

I've been using Acrobat for years for my OCR projects, but with the MSX Computing listings in particular it has difficulty with the fonts and with recognising the dot-matrix print effect they use.

By gdx

Enlighted (5017)

22-12-2020, 01:37

This is what I get with an online OCR tool:


With a better scan you can get a better result, but MSX-specific characters will not be recognized by most OCR tools.

By RamonMSX

Expert (125)

22-12-2020, 07:39

If you are looking for the listings from MSX-Fan magazine, most of them have been published on cartridge or disk as the MSX-Fan FANDOM Collections. See here: https://www.generation-msx.nl/company/tokuma-shoten-intermedia/120/software/

By jepmsx

Expert (110)

22-12-2020, 10:33

I have never tried it on MSX listings, but at work I use Tesseract 4.

The key to good results with Tesseract is the binarization of the image. For that I use an old program, https://github.com/chriswolfvision/local_adaptive_binarization, and then I pass the binarized image to Tesseract.
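
For anyone who wants to try the binarize-then-OCR idea without building that program, here is a rough Python sketch of the two steps. It is not the linked tool: it uses Sauvola's local threshold as a stand-in, and the window size, `k`, `r`, the PGM intermediate file and the `--psm 6` page mode are all assumptions to tune for your scans. The function names are made up for this example. It only needs the standard library plus a `tesseract` binary on the PATH.

```python
import math
import shutil
import subprocess


def sauvola_binarize(gray, window=15, k=0.5, r=128.0):
    """Binarize a grayscale image given as rows of ints (0..255).

    Returns rows of 0 (ink) or 255 (paper). Each pixel is compared with a
    local threshold T = mean * (1 + k * (std / r - 1)) computed over a
    window around it (Sauvola's rule; k and r are assumed defaults).
    """
    height, width = len(gray), len(gray[0])
    half = window // 2
    # Integral images of values and squared values, padded with a zero
    # row and column, so that any window sum costs four lookups.
    s1 = [[0] * (width + 1) for _ in range(height + 1)]
    s2 = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            v = gray[y][x]
            s1[y + 1][x + 1] = v + s1[y][x + 1] + s1[y + 1][x] - s1[y][x]
            s2[y + 1][x + 1] = v * v + s2[y][x + 1] + s2[y + 1][x] - s2[y][x]

    out = []
    for y in range(height):
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        row = []
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            n = (y1 - y0) * (x1 - x0)
            total = s1[y1][x1] - s1[y0][x1] - s1[y1][x0] + s1[y0][x0]
            total_sq = s2[y1][x1] - s2[y0][x1] - s2[y1][x0] + s2[y0][x0]
            mean = total / n
            std = math.sqrt(max(total_sq / n - mean * mean, 0.0))
            threshold = mean * (1 + k * (std / r - 1))
            row.append(0 if gray[y][x] <= threshold else 255)
        out.append(row)
    return out


def write_pgm(path, image):
    """Save rows of 0..255 ints as a binary PGM (P5) file."""
    height, width = len(image), len(image[0])
    with open(path, "wb") as f:
        f.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in image:
            f.write(bytes(row))


def ocr_with_tesseract(path):
    """Run the tesseract CLI on an image file and return the text.

    Assumes tesseract is installed; --psm 6 treats the page as a single
    uniform block of text, which is a guess for program listings.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    result = subprocess.run(
        ["tesseract", path, "stdout", "--psm", "6"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Typical use would be to load the scan as a grayscale pixel grid with whatever image library you have, call `sauvola_binarize`, save the result with `write_pgm`, and pass that file to `ocr_with_tesseract`. A larger window usually helps with dot-matrix print, since it smooths over the gaps between dots, but that needs testing on real pages.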