To err is human! But how much? And what does that have to do with AI?

16 September 2020

Do you know the human error rate of your receipt capturing? Admittedly, who likes to measure the error rate of their colleagues in accounting? For us, as developers of an AI for receipt and invoice capturing, knowing this figure is essential. This article explains why, and how high the error rate is among our own data capturers.

AI and human error

A Google search for published human error rates to compare with our own figures quickly revealed one thing: virtually every provider of software for automating receipt capturing and accounting mentions human error as a source of mistakes, but none gives a concrete figure. This also matches our experience from discussions with customers. In invoice capturing, human error is acknowledged but not measured.

The situation is different with machines. The tolerance for error is low, and the motto is rather: it all starts with the KPI (Key Performance Indicator). However, one must not forget that the machine is either created by humans or, as with artificial intelligence, very often learns from humans. Our BLU DELTA Invoice Capturing Intelligence therefore does not operate independently of the human error rate. On the contrary, the human error rate is decisive for the quality of our AI product in several respects (see also Benefits of AI).

The human error rate as an anchor point for AI

But how many errors are permissible? In the development of AI systems, the human error rate gives us a useful reference point. If our algorithm has an error rate of 10% while the human error rate is 2%, the gap of 8 percentage points is an important indication of the current quality of the AI and of how much room remains for optimisation (see also Andrew Ng's Machine Learning Yearning, chapter 33). After all, our AI should of course be better than humans!

Artificial intelligence “learns” from human error

We train our AI with data collected by our data capturers (also known as labellers). The quality and representativeness of the data for the use case play a decisive role: to make the error rate of our AI better than that of humans, we need data that contain as few human errors as possible. At BLU DELTA, we have defined our own rules and developed systems to ensure this.

Our human error rate: 2% and 7%

Receipt capturing is about recognising the correct characteristics of an invoice. The data capturer or accountant transfers the characteristic they are looking for (e.g. the gross total amount) from a scan or image of the invoice and enters it into a data template. This data is then recorded in our labelling tool and processed for further measurements or training. The correctness of the data is checked in a multi-stage process, whereby difficult or unclear entries are additionally screened.

There are different levels of difficulty. The gross total is often a prominently highlighted amount, accompanied by a currency and usually found in one unambiguous position on the invoice. For this type of characteristic, we measured an average error rate of 2% among our data capturers. Other characteristics are significantly more difficult: they are not always present, their format is less strictly defined, and their position on the invoice is virtually arbitrary. Here we measured an error rate of 7%.

To measure the error rate for the easy, clearly recognisable characteristics, we examined 1052 characteristic records from 9 data capturers. For the difficult-to-recognise characteristics, we analysed 231 characteristic records from 2 data capturers.
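The sample sizes above differ considerably, which affects how precisely each error rate is known. As a rough illustration, the following Python sketch computes a 95% Wilson score interval for both measurements. Note the assumptions: the article reports only the rounded rates and the record counts, so the error counts used here are back-calculated from those rounded figures, and the function name wilson_interval is our own choice for this example. It is not a description of how BLU DELTA evaluates its data.

```python
import math


def wilson_interval(errors: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# ASSUMPTION: error counts are not given in the article; they are
# back-calculated from the rounded rates (2% and 7%) and the record counts.
easy_errors = round(0.02 * 1052)  # clearly recognisable characteristics
hard_errors = round(0.07 * 231)   # difficult-to-recognise characteristics

easy_low, easy_high = wilson_interval(easy_errors, 1052)
hard_low, hard_high = wilson_interval(hard_errors, 231)

print(f"easy: {easy_low:.1%} to {easy_high:.1%}")
print(f"hard: {hard_low:.1%} to {hard_high:.1%}")
```

Under these assumed counts, the interval for the difficult characteristics is noticeably wider than the one for the easy characteristics, because it rests on far fewer records. The 7% figure is therefore the less certain of the two.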

A guide to the quality of BLU DELTA models

From this, we derive a guideline for our BLU DELTA AI models: the error rate should be below 2% for clearly recognisable characteristics and below 7% for hard-to-recognise characteristics.
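The guideline can be expressed as a simple check against the two measured human error rates. The sketch below is illustrative only: the category labels "easy" and "hard" and the function names meets_guideline and gap_to_human are hypothetical and chosen for this example. The thresholds are the 2% and 7% figures reported in this article.

```python
# Human error rates measured in this article, used as upper bounds for the model.
# The keys "easy" and "hard" are hypothetical labels for the two categories.
HUMAN_ERROR_RATE = {"easy": 0.02, "hard": 0.07}


def meets_guideline(model_error_rate: float, difficulty: str) -> bool:
    """Return True if the model's error rate is strictly below the human rate."""
    return model_error_rate < HUMAN_ERROR_RATE[difficulty]


def gap_to_human(model_error_rate: float, difficulty: str) -> float:
    """Gap between model and human error rate in percentage points.

    A positive value means the model is still worse than the human baseline.
    """
    return round((model_error_rate - HUMAN_ERROR_RATE[difficulty]) * 100, 2)


# Example from earlier in the article: a model with a 10% error rate
# compared against a human error rate of 2%.
print(meets_guideline(0.10, "easy"))  # False
print(gap_to_human(0.10, "easy"))     # 8.0
```

A model that reaches, say, a 5% error rate on hard-to-recognise characteristics would pass the check for that category, while the same rate on clearly recognisable characteristics would not.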

BLU DELTA is a product for the automated capture of financial documents. Partners, as well as our customers' finance departments, accounts payable clerks and tax consultants, can use BLU DELTA AI and Cloud to immediately relieve their employees of the time-consuming and mostly manual entry of documents.

Blumatix Intelligence GmbH has set itself the goal of making strenuous everyday work easier with artificial intelligence and of always creating added value for everyone from shared intelligence.

Christian Weiler

Author: Christian Weiler is a former General Manager of a global IT company based in Seattle, USA. Since 2016, he has been increasingly active in various roles in the field of artificial intelligence, and he has been part of the management team of Blumatix Intelligence GmbH since 2018.
Contact: c.weiler@blumatix.com