Please use this identifier to cite or link to this item: http://hdl.handle.net/1959.13/44962
- An effective character extraction algorithm for optical character recognition
- The University of Newcastle. Faculty of Science & Information Technology, School of Design, Communication and Information Technology
- This paper introduces an effective character extraction algorithm that can be used for optical character recognition (OCR). Using both geometrical and colour information, the algorithm extracts text from colour document images that contain mixed text and pictures. It consists of three components: adaptive k-means clustering, binary morphological processing, and shape- and space-related refinement. When the algorithm is used as a plug-in pre-processing stage for an OCR system, the performance of the system can be improved. A character recognition experiment was conducted with a commercial OCR package. It showed that our algorithm can improve the character recognition rate on complex documents from 73.1% to 95.5% on average.
- Asia-Pacific Workshop Visual Information Processing. Proceeding for the Asia-Pacific Workshop Visual Information Processing (Beijing, China 7-9th November, 2006) p. 219-223
optical character recognition (OCR)
- Resource Type
- conference paper
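
The abstract above names three components but gives no implementation detail. As a rough illustration of the first two only, the following is a minimal sketch under stated assumptions, not the authors' method: it uses plain fixed-k k-means (not the adaptive variant), a 3x3 binary opening as the morphological step, and it assumes that text belongs to the darkest colour cluster. The shape- and space-related refinement is omitted. All function names (`kmeans_colours`, `erode`, `dilate`, `text_mask`) are hypothetical.

```python
# Illustrative sketch only: fixed-k k-means plus a 3x3 binary opening.
# It is NOT the paper's adaptive clustering or its shape/space refinement.

def kmeans_colours(pixels, centres, iterations=10):
    """Cluster RGB tuples around the given initial centres (Lloyd's algorithm)."""
    centres = [tuple(float(v) for v in c) for c in centres]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(len(centres)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(px, centres[k])))
            for px in pixels
        ]
        # Update step: move each centre to the mean of its members.
        for k in range(len(centres)):
            members = [px for px, lab in zip(pixels, labels) if lab == k]
            if members:
                centres[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centres


def erode(mask):
    """3x3 binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]


def dilate(mask):
    """3x3 binary dilation; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]


def text_mask(image, centres):
    """Binary mask of the darkest colour cluster, cleaned by an opening."""
    h, w = len(image), len(image[0])
    pixels = [px for row in image for px in row]
    labels, centres = kmeans_colours(pixels, centres)
    # Assumption: the text is the cluster with the lowest total intensity.
    darkest = min(range(len(centres)), key=lambda k: sum(centres[k]))
    mask = [[int(labels[y * w + x] == darkest) for x in range(w)]
            for y in range(h)]
    # Opening (erosion then dilation) removes isolated specks.
    return dilate(erode(mask))
```

Here `image` is a list of rows of `(r, g, b)` tuples and `centres` is a list of initial cluster colours, for example `text_mask(image, [(0, 0, 0), (255, 255, 255)])`. Note that a 3x3 opening also removes strokes thinner than three pixels, so a real pre-processing stage would need a structuring element chosen for the scan resolution.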