Image Pre-Processing

Image pre-processing is a critical step in optimising document processing workflows, especially for scanned or uploaded documents. This support document explains Image Quality, Orientation Correction, and OCR on Digital Documents, and how each affects document quality, storage, and processing.

Image Quality and DPI: A Brief Overview

Dots Per Inch (DPI) measures the density of dots in a printed or scanned image and is the usual measure of its resolution. A higher DPI value means more dots per inch, and therefore finer detail and better image quality. In document processing, the DPI value directly affects the clarity and readability of scanned or uploaded documents.
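
As a rough illustration of what a DPI value means in pixels, the sketch below converts a page size in inches to pixel dimensions. The function name and the US Letter page size are assumptions chosen for the example; they are not taken from the application.

```python
def pixel_dimensions(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Return the (width, height) in pixels of a page captured at the given DPI."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return round(width_in * dpi), round(height_in * dpi)


# US Letter is 8.5 x 11 inches (assumed page size, for illustration only).
print(pixel_dimensions(8.5, 11, 200))  # (1700, 2200)
print(pixel_dimensions(8.5, 11, 300))  # (2550, 3300)
```

Moving from 200 to 300 DPI multiplies each side by 1.5, so every character on the page is described by more pixels.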

Importance of Image Quality in Document Processing
Image quality plays a pivotal role in accurate data extraction and overall workflow efficiency. High-quality images with higher DPI values provide several advantages:

Improved Optical Character Recognition (OCR): Higher-DPI images improve OCR accuracy by capturing finer text details.
Enhanced Data Extraction: Higher-quality images lead to better recognition of handwriting, signatures, and intricate details, improving data extraction precision.
Reduced Error Rates: Clearer images reduce the likelihood of misinterpreted or incorrectly extracted data, minimising processing errors.
Enhanced Archiving: High-quality images contribute to better archival and compliance practices, preserving documents in their truest form.

Orientation Correction: Simplifying Document Alignment

Orientation Correction automatically adjusts the orientation of uploaded documents. Manual orientation adjustments are time-consuming and error-prone; this feature removes the need for them and ensures that documents are correctly aligned for accurate data extraction.
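
This guide does not describe how the application detects or corrects orientation. Purely as an illustration of the idea, the sketch below assumes the page's rotation has already been detected as a multiple of 90 degrees and rotates a small pixel grid back upright. Both function names are hypothetical.

```python
def correction_angle(detected_rotation: int) -> int:
    """Clockwise rotation (in degrees) that brings a rotated page upright.

    detected_rotation is how far the page is currently rotated clockwise.
    This sketch only handles multiples of 90 degrees.
    """
    if detected_rotation % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    return (360 - detected_rotation) % 360


def rotate_clockwise(grid: list[list[int]], degrees: int) -> list[list[int]]:
    """Rotate a 2D pixel grid clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    for _ in range((degrees // 90) % 4):
        # Reversing the row order and transposing rotates 90 degrees clockwise.
        grid = [list(row) for row in zip(*grid[::-1])]
    return grid


page = [[1, 2, 3],
        [4, 5, 6]]
sideways = rotate_clockwise(page, 90)  # simulate a page uploaded sideways
upright = rotate_clockwise(sideways, correction_angle(90))
print(upright == page)  # True
```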

OCR on Digital Documents: Extracting Text from Images

OCR on Digital Documents offers two capabilities:

Extracting Text from Logos and Image Blocks: This feature enables the extraction of text content from logos, headers, and image blocks present in digital documents. By capturing textual information from these graphical elements, you can enhance the accuracy and completeness of data extraction.
Complete OCR on Digital Documents: This feature performs OCR on the entire digital document, converting images with embedded text into machine-readable text. This improves text searchability, facilitates data extraction, and enhances overall document usability.
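
The difference between the two capabilities can be pictured as a choice of which parts of a page are sent to OCR. The sketch below is an assumption-laden illustration: the Block type, its kind values, and the blocks_to_ocr function are hypothetical and do not reflect the application's internals.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str     # "text", "logo", or "image" (hypothetical block kinds)
    content: str  # embedded text for "text" blocks, a placeholder otherwise


def blocks_to_ocr(blocks: list[Block], complete: bool) -> list[Block]:
    """Pick the blocks that would be sent to OCR.

    complete=False models text extraction from logos and image blocks only.
    complete=True models complete OCR over every block in the document.
    """
    if complete:
        return list(blocks)
    return [b for b in blocks if b.kind in ("logo", "image")]


# Sample page with made-up content, for illustration only.
page = [Block("logo", "<pixels>"), Block("text", "Sample text"), Block("image", "<pixels>")]
print([b.kind for b in blocks_to_ocr(page, complete=False)])  # ['logo', 'image']
print(len(blocks_to_ocr(page, complete=True)))                # 3
```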

How to Configure Image Pre-Processing

Step 1: Access Document Types Page

Start by navigating to the "Document Types" page in your application. Use the search function to locate the document type whose pre-processing settings you want to change.

Step 2: Access Document Type Settings

Click on the settings icon associated with the target document type to access its settings.

Step 3: Pre-Processing Section

Within the document type settings, locate the "Pre-Processing" section.

Step 4: Image Pre-Processing: Image Quality for Storage (dpi)

In the "Image Pre-Processing" subsection, you'll find the "Image Quality for Storage (dpi)" option. By default, this value is set to 200 dpi, and you can adjust it to suit your needs. Higher dpi values preserve more detail but also produce larger stored images.
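
To get a feel for how this setting affects storage, the sketch below estimates the uncompressed size of a single page at two dpi values. The function name, the US Letter page size, and the 8-bit greyscale assumption are illustrative only. Stored files are normally compressed, so real sizes will be smaller; the point is the ratio between the two values.

```python
def uncompressed_bytes(width_in: float, height_in: float, dpi: int,
                       bytes_per_pixel: int = 1) -> int:
    """Estimate the raw size of one page image (1 byte per pixel = 8-bit greyscale)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bytes_per_pixel


# US Letter page (8.5 x 11 inches), assumed for illustration.
at_default = uncompressed_bytes(8.5, 11, 200)
at_higher = uncompressed_bytes(8.5, 11, 300)
print(at_default)              # 3740000
print(at_higher)               # 8415000
print(at_higher / at_default)  # 2.25
```

Because pixel count grows with the square of the dpi value, raising the setting from 200 to 300 dpi roughly doubles the raw data per page.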

Step 5: Image Pre-Processing: Orientation Correction

Under "Image Pre-Processing," you'll also find "Orientation Correction." This feature is enabled by default, so the orientation of uploaded documents is adjusted automatically. If you prefer manual control, you can disable this option.

Step 6: Image Pre-Processing: OCR on Digital Documents

In the same "Image Pre-Processing" subsection, locate the "OCR on Digital Documents" option. Click the dropdown to reveal the available options:
1. Logos and Image Block: Selecting this option enables text extraction from logos and image blocks on digital documents.
2. Complete OCR: This option allows comprehensive text extraction from the entirety of digital documents.
3. None: Choosing this option disables text extraction from logos and images on digital documents.
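
The three settings covered in Steps 4 to 6 can be summarised as a small configuration record. The sketch below is only a way to visualise them: the class and field names are hypothetical, and the application is configured through its settings page, not through code. The defaults shown are the ones stated in this guide; no default is given here for the OCR option, so the sketch requires it explicitly.

```python
from dataclasses import dataclass
from enum import Enum


class OcrOnDigitalDocuments(Enum):
    LOGOS_AND_IMAGE_BLOCK = "Logos and Image Block"
    COMPLETE_OCR = "Complete OCR"
    NONE = "None"


@dataclass
class ImagePreProcessingSettings:
    ocr_on_digital_documents: OcrOnDigitalDocuments  # no default stated in this guide
    image_quality_dpi: int = 200                     # default from Step 4
    orientation_correction: bool = True              # enabled by default (Step 5)

    def __post_init__(self) -> None:
        if self.image_quality_dpi <= 0:
            raise ValueError("image_quality_dpi must be positive")


settings = ImagePreProcessingSettings(OcrOnDigitalDocuments.COMPLETE_OCR)
print(settings.image_quality_dpi, settings.orientation_correction)  # 200 True
```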

Congratulations, you've successfully configured image pre-processing settings to optimise data extraction within the chosen document type!