
Super-Resolution for Text Images

Scanning a paper document and converting (digitizing) it to digital image data such as TIFF or JPEG reduces paper usage and streamlines workflows. In practice, however, documents are usually digitized at low resolution, for reasons such as reducing the cost of fax transmission, saving the hard disk or memory space needed to store the image data, and limiting the file size of email attachments. Low-resolution digitization degrades character quality, because some of the information carried by the characters in the paper document is lost.

To solve this problem, Fuji Xerox has developed super-resolution for text images, a technology that enhances character quality by recovering information from the original paper document. The goal is to raise the quality of digitized characters to the level of the paper document (Fig. 1).

Fig. 1 Goal of Super-Resolution for Text Images

Super-resolution for text images enhances character quality, producing clear, easy-to-read documents. The accuracy of character recognition is also expected to improve.

Super-resolution is a technology that uses multiple similar images to restore high-resolution information that was lost through digitization. It generally uses several video frames (images acquired with a short time lag) as the similar images. Acquiring similar images from documents is difficult, however, since each sheet of a document usually has different content.
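The multi-image idea can be illustrated with a minimal shift-and-add sketch. It is not the method described in this article: it assumes idealized point sampling, known integer sub-pixel offsets, and no noise, and the function names `downsample` and `shift_and_add` are hypothetical. Each low-resolution image samples the same scene at a different offset, so placing the samples back on a finer grid recovers detail that no single image contains.

```python
def downsample(image, factor, dy, dx):
    """Simulate low-resolution capture (assumption: ideal point sampling).

    Keeps every `factor`-th pixel, starting at offset (dy, dx).
    """
    return [row[dx::factor] for row in image[dy::factor]]


def shift_and_add(observations, factor, height, width):
    """Place each low-resolution sample back on the high-resolution grid.

    observations: list of (low_res_image, dy, dx) with known offsets.
    Overlapping samples are averaged; grid cells that no observation
    covers are returned as None.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for low, dy, dx in observations:
        for i, row in enumerate(low):
            for j, value in enumerate(row):
                y, x = i * factor + dy, j * factor + dx
                total[y][x] += value
                count[y][x] += 1
    return [
        [total[y][x] / count[y][x] if count[y][x] else None
         for x in range(width)]
        for y in range(height)
    ]


# A 4x4 "scene" observed four times at half resolution, each time with a
# different offset.
high_res = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
observations = [
    (downsample(high_res, 2, dy, dx), dy, dx)
    for dy in (0, 1) for dx in (0, 1)
]
restored = shift_and_add(observations, 2, 4, 4)
```

With all four offsets the grid is fully covered, while a single observation fills only a quarter of it, which is why several similar images are needed.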

A given character, though, usually appears several times within a document. Our super-resolution for text images leverages this characteristic and performs super-resolution on individual characters extracted from the document, as shown in Fig. 2. As a result, super-resolution is effective for characters that appear more than once, even in a single-sheet document.

Fig. 2 Super-Resolution for Text Images of Fuji Xerox

Fig. 3 below shows the flow of super-resolution for text images.

  1. Extract individual characters from the input image data.
  2. Select similar characters. This step requires a method for selecting only similar characters according to specific rules. For example, characters in different fonts are not treated as similar, whereas characters in the same font are treated as similar even when their colors differ.
  3. Perform super-resolution on each of the selected similar characters.
  4. Replace the original characters with the super-resolved characters, and create the output image data.
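Step 2 can be sketched as follows. This is a hypothetical illustration, not Fuji Xerox's actual selection rules. It assumes the characters have already been extracted as equally sized grayscale grids. It normalizes intensity so that color or contrast differences do not affect matching, then groups glyphs whose mean absolute difference falls below an arbitrary threshold. The names `normalize`, `distance`, and `group_similar`, and the threshold value, are assumptions made for this example.

```python
def normalize(glyph):
    """Rescale intensities to [0, 1] so ink color/contrast does not matter."""
    flat = [p for row in glyph for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero for a uniform glyph
    return [[(p - lo) / span for p in row] for row in glyph]


def distance(a, b):
    """Mean absolute difference between two equally sized glyphs."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def group_similar(glyphs, threshold=0.1):
    """Greedy grouping of glyph indices by shape similarity.

    Each glyph joins the first group whose first member is within
    `threshold` (an arbitrary value chosen for this sketch); otherwise
    it starts a new group.
    """
    normalized = [normalize(g) for g in glyphs]
    groups = []
    for idx, glyph in enumerate(normalized):
        for group in groups:
            if distance(normalized[group[0]], glyph) <= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


# Two "plus" shapes drawn with different intensities, and one vertical bar.
plus_dark = [[0, 255, 0], [255, 255, 255], [0, 255, 0]]
bar = [[0, 255, 0], [0, 255, 0], [0, 255, 0]]
plus_light = [[0, 120, 0], [120, 120, 120], [0, 120, 0]]
groups = group_similar([plus_dark, bar, plus_light])
```

Here the two plus shapes land in one group despite their different intensities, and the bar forms its own group. Each group with more than one member would then be passed to step 3.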

Fig. 3 Flow of Super-Resolution for Text Images
