Thursday, July 18, 2013

some quick training

Over the last two days, I decided to follow my impulse and try generating some training data. I now have a good grasp of the Tesseract commands for generating the required files, but I still need a deeper understanding of the clustering part. Amazingly, the long and tedious training procedure in the original documentation doesn't look so difficult after staring at it for almost 3 months.
However, no miracle happened when I tested with the new data: it managed to detect the language correctly, but the accuracy wasn't worth mentioning.
As a next step, I will generate some training data with the WhiteWashing algorithm and see how it fares.
But now, off to documentation.
So Long.
