OCR and Invoice Capturing: Tips & Tricks for better Recognition Rates

Would you like to know how to reach your optimum level of automation in invoice capturing with the BLU DELTA AI system, quickly and easily? Here are some of the most helpful tips & tricks for further optimizing your invoice workflow.

The Quality is Crucial!

Before the BLU DELTA AI can interpret the words and blocks of numbers in a document and make predictions about them, the pixels must first be converted into exactly these letters and numbers. This is done by OCR (Optical Character Recognition, i.e. text recognition).

The OCR system looks at every pixel of the document and interprets characters from it. The BLU DELTA AI then continues to work with both the pixels and these recognized characters, so the result of our document capture services depends heavily on the quality of the image received.

Improve OCR

Photos from cameras (including mobile phone cameras), for example, give OCR a worse basis than scans, since the viewing angle, shadows and other factors are less favorable.

We therefore recommend that you always use a scanner with a scan resolution of 300 dpi.
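If you only have the image file and not the scanner settings, the resolution can be estimated from the pixel dimensions and the physical page size. The following is a minimal sketch of that arithmetic, assuming an A4 page (210 mm x 297 mm); the function names and the tolerance of 1 dpi for rounding are illustrative assumptions, not part of the BLU DELTA service.

```python
MM_PER_INCH = 25.4


def effective_dpi(width_px, height_px, page_width_mm=210.0, page_height_mm=297.0):
    """Estimate the scan resolution of a full-page image.

    Assumes the image covers the whole page (A4 by default).
    Returns the lower of the horizontal and vertical dpi values.
    """
    dpi_x = width_px / (page_width_mm / MM_PER_INCH)
    dpi_y = height_px / (page_height_mm / MM_PER_INCH)
    return min(dpi_x, dpi_y)


def meets_recommendation(width_px, height_px, target_dpi=300.0, tolerance=1.0):
    """Check an A4 page image against the recommended 300 dpi.

    The tolerance (an assumption) absorbs rounding of pixel counts.
    """
    return effective_dpi(width_px, height_px) >= target_dpi - tolerance


if __name__ == "__main__":
    # A4 page scanned to 2480 x 3508 pixels
    print(round(effective_dpi(2480, 3508)))
    # A4 page scanned to 1240 x 1754 pixels
    print(meets_recommendation(1240, 1754))
```

For a page format other than A4, pass the actual page width and height in millimeters to `effective_dpi`.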

Create Optimal Conditions for Invoice Processing

With dark and colored backgrounds, as in these examples, OCR finds it particularly difficult to assign the pixels to individual letters.

OCR text recognition examples

[Text recognition has found letters in the green areas. At the top right are two photo excerpts that are correspondingly blurred; at the bottom right are a few examples of background surfaces and a blurred scan.]

So if you send your own outgoing invoices through the BLU DELTA service, please avoid such backgrounds and use layout elements without a background fill instead. If your customers also use an OCR system, this small redesign will make their everyday office life much easier as well.
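When redesigning an invoice template, it can help to put a number on how strongly the text stands out from its background. The sketch below uses the contrast ratio defined in the WCAG accessibility guidelines. This is a swapped-in measure intended for human readability, not something BLU DELTA prescribes, and no particular threshold for OCR is implied. The function names are illustrative.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel value to linear light (WCAG 2.1 formula)."""
    c = channel_8bit / 255.0
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color with 8-bit channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(text_rgb, background_rgb):
    """WCAG contrast ratio between two colors, from 1 (identical) to 21."""
    l1 = relative_luminance(text_rgb)
    l2 = relative_luminance(background_rgb)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    black = (0, 0, 0)
    white = (255, 255, 255)
    dark_blue = (0, 40, 100)  # hypothetical colored table background
    print(contrast_ratio(black, white))
    print(contrast_ratio(black, dark_blue))
```

Black text on a white background gives the maximum ratio, while black text on a dark fill scores far lower, which matches the advice to avoid dark and colored backgrounds.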

Sharing Known Values

Due to the structure of some documents, values such as sender and recipient can occasionally be mixed up. This part of the article explains how to pass on values you already know, so that you always get reliable, correct information:

Using Property Store for VatIDs

A DetectInvoiceRequest contains, among other things, a property store. This is a structure in which you can send known values as key-value pairs, in the form of a dictionary whose values are objects.

Enter the VatID of the invoice recipient as ReceiverVatId, so the system knows that any other VatID on the document must belong to the sender.

An example could be:

{
    "Filter": 0,
    "Invoice": "InvoiceContentAsBase64EncodedString",
    "PropertyStore": {
        "ReceiverVatId": "DE169838187"
    },
    "CreateResultPdf": true
}
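A request body like the one above could be assembled in code roughly as follows. This is a minimal sketch: the field names and the value of "Filter" are copied from the example, while the helper name `build_detect_invoice_request` is hypothetical. Sending the request is deliberately left out, because the endpoint and authentication are described in the REST API documentation rather than here.

```python
import base64
import json


def build_detect_invoice_request(invoice_bytes, receiver_vat_id, create_result_pdf=True):
    """Build the JSON-serializable body of a DetectInvoiceRequest.

    Field names follow the example in this article; "Filter": 0 is
    copied from it unchanged. The helper itself is hypothetical.
    """
    return {
        "Filter": 0,
        # The invoice file is sent as a Base64-encoded string.
        "Invoice": base64.b64encode(invoice_bytes).decode("ascii"),
        # Known values go into the property store as key-value pairs.
        "PropertyStore": {"ReceiverVatId": receiver_vat_id},
        "CreateResultPdf": create_result_pdf,
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice, read the scanned invoice file instead.
    payload = build_detect_invoice_request(b"%PDF-1.4 placeholder", "DE169838187")
    print(json.dumps(payload, indent=4))
```

Keeping the payload construction in one small function means the known recipient VatID is set in a single place for every invoice you submit.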

The sender's VatID can also be provided. This is helpful if you capture your own outgoing invoices with BLU DELTA, because the system then knows that the sender VatID will always be yours.

The language can also be specified, so the system does not have to consider every possible letter for each character. Especially with accented characters, knowing the language of the document is a great help for the OCR and AI.

Detailed information on using the BLU DELTA API can be found in our online REST API documentation BLU DELTA Invoice Capturing Ressources.

With these tips, you create the best possible basis for our OCR and AI to process your documents, and thus avoid unnecessary manual processing.

If you have any further questions or concerns, our support team is always available (bludelta-support@blumatix.com).