TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text

Medhat, Fady; Mohammadi, Mahnaz; Jaf, Sardar; Willcocks, Chris; Breckon, Toby; Matthews, Peter; McGough, Andrew Stephen; Theodoropoulos, Georgios; Obara, Boguslaw

Authors

Fady Medhat

Mahnaz Mohammadi

Sardar Jaf

Chris Willcocks

Toby Breckon

Peter Matthews

Andrew Stephen McGough

Georgios Theodoropoulos

Boguslaw Obara



Abstract

Text recognition of scanned documents usually depends on the type of text, handwritten or machine-printed. Accordingly, recognition involves classifying the text category first and then choosing the recognition method to apply. This becomes more challenging when a document contains both handwritten and machine-printed text. In this work, we present a generic process flow for text recognition in scanned documents containing mixed handwritten and machine-printed text, without the need to classify text in advance. We have realized the proposed process flow using several open-source image processing and text recognition packages. Organizations such as defense handle text documents at a speed and volume that humans cannot process without considerable automation; the proposed process flow will handle such workloads efficiently and effectively. The evaluation was performed using a specially developed variant of the IAM handwriting database, presented in this work, on which we achieved an average transcription accuracy of nearly 80% for pages containing both printed and handwritten text.

Citation

Medhat, F., Mohammadi, M., Jaf, S., Willcocks, C., Breckon, T., Matthews, P., McGough, A. S., Theodoropoulos, G., & Obara, B. (2018, December). TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text. Presented at IEEE International Conference on Big Data, Seattle, WA, USA

Presentation Conference Type: Conference Paper (published)
Conference Name: IEEE International Conference on Big Data
Start Date: Dec 10, 2018
End Date: Dec 13, 2018
Acceptance Date: Nov 8, 2018
Publication Date: Dec 1, 2018
Deposit Date: Nov 10, 2018
Publisher: Institute of Electrical and Electronics Engineers
Public URL: https://durham-repository.worktribe.com/output/1143586
Publisher URL: http://cci.drexel.edu/bigdata/bigdata2018/index.html