Hello everyone,
I’m currently trying to use the R package tesseract. So far my results are alright, but I would like to improve the accuracy by setting some of the available parameters.
This leads to my problem: I do not understand how to specify some of these parameters correctly.
In particular, I’d like to specify the following parameters:
- tessedit_char_whitelist
- user_patterns_file
- user_words_file
I can set tessedit_char_whitelist correctly, but I have problems with the other two parameters. I do not understand how to pass the patterns file and the words file to tesseract from R. My idea was to supply a character vector that serves as the list of patterns or words I’d like tesseract to use. However, this does not seem to work.
If I understand the documentation correctly, I have to supply a patterns file and a words file as two separate files. I do not understand how to do that.
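To make this concrete, here is a minimal sketch of what I think the documentation intends. It rests on assumptions I have not been able to confirm: that both parameters expect a path to a plain-text file with one entry per line (not a character vector), and that the options list of tesseract() forwards them to the engine early enough to take effect. The example entries, the file names, and the helper build_engine() are made up for illustration.

```r
# Hypothetical example entries, one per line in the resulting files.
words    <- c("invoice", "shipment")
patterns <- c("\\d\\d\\d-\\d\\d\\d\\d")  # written to disk as \d\d\d-\d\d\d\d

# Write each character vector to its own plain-text file.
words_path    <- file.path(tempdir(), "user_words.txt")
patterns_path <- file.path(tempdir(), "user_patterns.txt")
writeLines(words, words_path)
writeLines(patterns, patterns_path)

# Build an engine whose options point at the two files (paths, not vectors).
# Assumption: the R wrapper passes these options on to Tesseract unchanged.
build_engine <- function(words_file, patterns_file, whitelist) {
  tesseract::tesseract(
    language = "eng",
    options = list(
      tessedit_char_whitelist = whitelist,
      user_words_file = words_file,
      user_patterns_file = patterns_file
    )
  )
}

# Intended usage ("scan.png" is a placeholder for my real input):
# engine <- build_engine(words_path, patterns_path,
#                        "abcdefghijklmnopqrstuvwxyz0123456789-")
# text <- tesseract::ocr("scan.png", engine = engine)
```

Is this the intended approach, or do the two files have to be supplied in some other way, for example through a config file?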
I would be grateful for any suggestions and help. Unfortunately, I cannot share the PDF file for data privacy reasons.
Thank you very much.
Sincerely,
Albert