TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text
Medhat, Fady; Mohammadi, Mahnaz; Jaf, Sardar; Willcocks, Chris; Breckon, Toby; Matthews, Peter; McGough, Andrew Stephen; Theodoropoulos, Georgios; Obara, Boguslaw
Authors
Fady Medhat
Mahnaz Mohammadi
Sardar Jaf
Dr Chris Willcocks christopher.g.willcocks@durham.ac.uk
Associate Professor
Toby Breckon
Peter Matthews
Andrew Stephen McGough
Georgios Theodoropoulos
Boguslaw Obara
Abstract
Text recognition of scanned documents usually depends on the type of text, handwritten or machine-printed. Accordingly, recognition involves classifying the text category before deciding which recognition method to apply. This becomes more challenging when a document contains both handwritten and machine-printed text. In this work, we present a generic process flow for text recognition in scanned documents containing mixed handwritten and machine-printed text without the need to classify text in advance. We have realized the proposed process flow using several open-source image processing and text recognition packages. Organizations such as defense handle volumes of text documents that humans cannot process at the required speed without considerable automation; the proposed workflow is intended to handle them efficiently and effectively. The evaluation was performed using a specially developed variant of the IAM handwriting database, presented in this work, on which we achieved an average transcription accuracy of nearly 80% for pages containing both printed and handwritten text.
Citation
Medhat, F., Mohammadi, M., Jaf, S., Willcocks, C., Breckon, T., Matthews, P., McGough, A. S., Theodoropoulos, G., & Obara, B. (2018, December). TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text. Presented at IEEE International Conference on Big Data, Seattle, WA, USA
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | IEEE International Conference on Big Data |
Start Date | Dec 10, 2018 |
End Date | Dec 13, 2018 |
Acceptance Date | Nov 8, 2018 |
Publication Date | Dec 1, 2018 |
Deposit Date | Nov 10, 2018 |
Publisher | Institute of Electrical and Electronics Engineers |
Public URL | https://durham-repository.worktribe.com/output/1143586 |
Publisher URL | http://cci.drexel.edu/bigdata/bigdata2018/index.html |