Best scan settings for boosting text recognition

There are plenty of paper documents that need scanning, such as incoming A/P invoices. Here are some quick setting adjustments you can make, so that DocuWare can read the data in the best possible way – and you’ll never have to rescan.

Whether it’s for invoices, delivery slips or signed contracts: many paper documents need to be scanned before they are digitally transferred into DocuWare. This is usually handled centrally or via a departmental scanner. So it’s no wonder that the better the scan quality, the better the data that can be captured from the resulting digital documents.

How quickly and accurately data is captured from a scan depends largely on the scan quality and readability of the document. You can improve this enormously with just a few setting changes. Read on…

Which file format?

When it comes to picking a file format, it’s important to know what will ultimately happen with a scanned or digitized document. There are documents like incoming invoices or delivery slips, from which DocuWare will read data after scanning in order to process them further or incorporate them in a workflow. And there are other documents – like blueprints – that will simply be stored in DocuWare so that they can be accessed via the DocuWare client as needed or sent as an email attachment.

If you want to process data from a document – for example to read text and barcodes and use this information for indexing or in workflows – a document must be captured in a PDF or PDF/A format. So be sure to select this setting when scanning.

If a document only needs to be displayed or possibly forwarded by email, all common file formats are available. In addition to PDF and PDF/A, this list includes PNG, JPEG and TIF. If it only needs to be archived, it does not matter which format you select when scanning. However, to make sure such documents also display well on screen, you can take a few additional steps to optimize scan quality. More on that…
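The format rule above can be written down as a tiny lookup. This is only a sketch of the rule of thumb: the function and constant names are made up for illustration and are not part of any DocuWare API.

```python
# Hypothetical helper, not a DocuWare API: it only encodes the rule of
# thumb from the text (data capture needs PDF or PDF/A).
DATA_CAPTURE_FORMATS = ("PDF", "PDF/A")
DISPLAY_FORMATS = ("PDF", "PDF/A", "PNG", "JPEG", "TIF")


def allowed_scan_formats(needs_data_capture: bool) -> tuple:
    """Return the file formats that fit the document's intended use."""
    if needs_data_capture:
        # Text and barcodes are to be read for indexing or workflows.
        return DATA_CAPTURE_FORMATS
    # Display, email forwarding or archiving only.
    return DISPLAY_FORMATS


print(allowed_scan_formats(True))   # e.g. an incoming invoice
print(allowed_scan_formats(False))  # e.g. a blueprint
```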

Color mode as needed

For some document types, the "Color" setting can be useful – for example, when contracts are signed using different pen colors to reflect different rights/signers or when a blueprint’s detail has to be viewed in color. In most cases, however, black-and-white or gray-scale mode is completely sufficient. These scans also require less storage than color scans, which is a great advantage when you scan in high volume.
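To get a feel for why color mode affects storage, here is some rough arithmetic for a single uncompressed page. It assumes an A4 page (8.27 x 11.69 inches), a fixed resolution of 300 dpi, and the common bit depths of 1, 8 and 24 bits per pixel for black-and-white, gray-scale and color. Real scan files are compressed, so actual sizes will be smaller and will vary with content; only the rough proportions at the same resolution are the point here.

```python
# Assumed bit depths per pixel; typical values, not taken from DocuWare.
BITS_PER_PIXEL = {"black_and_white": 1, "grayscale": 8, "color": 24}


def raw_page_size_bytes(dpi, color_mode, width_in=8.27, height_in=11.69):
    """Uncompressed size of one scanned page in bytes (A4 by default)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * BITS_PER_PIXEL[color_mode] // 8


for mode in BITS_PER_PIXEL:
    size_mb = raw_page_size_bytes(300, mode) / 1_000_000
    print(f"{mode}: about {size_mb:.1f} MB uncompressed at 300 dpi")
```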

Finding the right resolution

A document’s resolution is essential for optimal indexing. This number reflects the dot density of the file and is measured in dpi, "dots per inch." For optimal results, a resolution of at least 300 dpi is recommended. If the resolution is too low, optical character recognition (OCR) errors occur more easily. For example, an "i" or "!" is captured as an "I," or a "w" is read as two "v"s.

When differentiated by color mode, we recommend 300 or 400 dpi for black and white scans. For gray-scale and color scans, 150 to 300 dpi is often sufficient.

Another factor is the font size of a document. Here’s a good rule of thumb: the smaller the font, the higher the resolution should be.
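The recommendations above can be captured in a small check. The table values are the ones quoted in the text; the names are hypothetical and the function only tests against the lower bound, since small fonts may call for the upper end of a range or more.

```python
# Recommended ranges (min dpi, max dpi) per color mode, as quoted above.
RECOMMENDED_DPI = {
    "black_and_white": (300, 400),
    "grayscale": (150, 300),
    "color": (150, 300),
}


def meets_recommendation(color_mode, dpi):
    """True if dpi reaches the lower bound recommended for the mode."""
    minimum, _maximum = RECOMMENDED_DPI[color_mode]
    return dpi >= minimum


print(meets_recommendation("black_and_white", 200))
print(meets_recommendation("grayscale", 200))
```

The same 200 dpi setting falls short of the recommendation in black-and-white mode but meets it in gray-scale mode, which is why the resolution should be chosen together with the color mode.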

Figure: Scan settings in DocuWare. In black-and-white mode, the letters "I l t i" are only clearly captured at 300 dpi and above. In gray-scale mode, the characters are already clearly recognizable at 75 dpi, but the scan also includes image information such as the structure of the paper, which will likely enlarge the file and take up unnecessary storage space.


Best conditions for quick and accurate capture

Find the best settings and you will get the best possible result in the shortest possible time when importing scanned documents into DocuWare. If the system can’t decide during the import process whether something is a 1 or an L, for example, your processes will slow down and performance will suffer. So it's worthwhile to choose the right scan settings and give your system the best possible data quality for managing your documents.

Read more on automatically importing scanned documents into DocuWare. If you mostly scan barcodes, read how to recognize barcodes even more precisely.