More Tests

So, after yesterday's bleak attempt, I decided to test the system on a random image of Bengali script: a full-page excerpt, picked from the results of a simple Google image search for "Bengali text".

The result was terrible.

The noise level in the image was very high compared to the previous test sample, and the font was obviously different. Still, Tesseract did manage to identify the language and a few of the characters.

The work on the pre-processing algorithm is going well, and I can start coding as soon as I learn more about the imaging library. I'll go through the previous work once again to look for more help in this regard.
However, I still haven't figured out a way to make Notepad++ display Bengali characters.
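
As a rough illustration of what a pre-processing pass for a noisy scan can look like, here is a minimal sketch in plain Python. It is not the algorithm being worked on here, just an assumed, common pairing: a 3x3 median filter to knock out speckle noise, followed by a fixed-threshold binarization. The function names (`median_filter`, `binarize`, `preprocess`), the list-of-rows grayscale input, and the threshold of 128 are all assumptions made for the example; a real version would read and write pixels through an imaging library.

```python
from statistics import median


def median_filter(pixels):
    """Replace each pixel with the median of its 3x3 neighbourhood.

    `pixels` is a list of rows of grayscale values (0 = black, 255 = white).
    Coordinates outside the image are clamped to the nearest edge pixel.
    """
    height, width = len(pixels), len(pixels[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                pixels[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(median(window))
        filtered.append(row)
    return filtered


def binarize(pixels, threshold=128):
    """Map every pixel to 0 (ink) or 255 (paper) using a fixed threshold."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


def preprocess(pixels, threshold=128):
    """Denoise first, then binarize (hypothetical pipeline for illustration)."""
    return binarize(median_filter(pixels), threshold)
```

Fed a small white patch with a single dark speck in the middle, `preprocess` should hand back an all-white patch, since an isolated pixel never wins the median vote. A fixed threshold is the crudest possible choice; unevenly lit scans would likely need something adaptive.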

So Long.

BookWorm - the BongOCR

Saturday, June 15, 2013
