Data Capture – Step 1: Defining Essential Information
In the first step, examine 5-10 examples of the desired document type closely and mark the data elements you absolutely need. Pay particular attention to information that serves as a reference for finding the corresponding records in your internal database, such as orders, senders, or recipients. Typical examples include IBAN, VAT ID, Purchase Order ID, and addresses.
You should also decide which data already provides value in the first step and which data can be added through further training in a later project.
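One way to record the outcome of this step is a short field list that notes, for each element, whether it is a lookup key into your own database and whether it belongs to the first project or a later one. The sketch below is purely illustrative: the `FieldSpec` structure, the field names, and the phase numbers are assumptions made for this example and are not part of any BLU DELTA interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One data element marked on the sample documents (hypothetical structure)."""
    name: str
    is_lookup_key: bool  # True if the value is used to find a record in your database
    phase: int           # 1 = needed in the first project, 2 = can wait for a later one


# Illustrative field list for an invoice-like document; all names are assumptions.
FIELDS = [
    FieldSpec("iban", is_lookup_key=True, phase=1),
    FieldSpec("vat_id", is_lookup_key=True, phase=1),
    FieldSpec("purchase_order_id", is_lookup_key=True, phase=1),
    FieldSpec("sender_address", is_lookup_key=False, phase=1),
    FieldSpec("delivery_terms", is_lookup_key=False, phase=2),
]


def fields_for_phase(fields, phase):
    """Return the names of all fields scheduled for the given project phase."""
    return [f.name for f in fields if f.phase == phase]


def lookup_keys(fields):
    """Return the names of fields that reference records in the internal database."""
    return [f.name for f in fields if f.is_lookup_key]
```

A list like this makes it easy to see at a glance which elements must be in the first model and which ones exist mainly to match a document to an existing record.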
Data Capture – Step 2: Generating Training Data
Based on this information, our experts can provide a quote and immediately begin implementation. If you have historical data for training, you can upload it directly through our Learn API while we configure auto-training in the background. This way, you will already have an initial model accessible through an endpoint assigned by us.
Data Capture – Step 3: Benchmark Analysis and Expansion of Training Data
During training, our data management team creates a benchmark that forms the basis for quality measurement. This allows the quality of your AI models to be measured automatically and continuously. If the quality of the first model is not sufficient, our data management team will expand and improve your training data until the desired accuracy is achieved.
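To make the idea of a benchmark concrete, the following sketch computes a per-field exact-match accuracy over a set of documents with known correct values and lists the fields that are still below a target. It is a simplified illustration, not the measurement BLU DELTA actually uses: the function names, the exact-match criterion, and the sample values are all assumptions.

```python
from collections import defaultdict


def field_accuracy(predictions, ground_truth):
    """Compute per-field exact-match accuracy over a benchmark set.

    Both arguments are lists of dicts (one dict per document, in the same
    order) that map a field name to the extracted or expected value.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, expected in zip(predictions, ground_truth):
        for field, expected_value in expected.items():
            total[field] += 1
            if predicted.get(field) == expected_value:
                correct[field] += 1
    return {field: correct[field] / total[field] for field in total}


def fields_below_target(accuracy, target):
    """List the fields whose accuracy is still under the desired target."""
    return sorted(field for field, value in accuracy.items() if value < target)


# Tiny made-up benchmark: two documents, two fields each.
truth = [{"iban": "AT1", "vat_id": "U1"}, {"iban": "AT2", "vat_id": "U2"}]
preds = [{"iban": "AT1", "vat_id": "U1"}, {"iban": "AT2", "vat_id": "X"}]

accuracy = field_accuracy(preds, truth)
weak_fields = fields_below_target(accuracy, target=0.9)
```

Fields that show up in `weak_fields` are the ones for which the training data would be expanded first.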
Data Capture – Step 4: Integration of the New Service
You now have access to an AI service at the URL “https://capture.bludelta.de/[YOUR DOCUMENT TYPE]/v1” that captures your documents and returns structured information from PDFs and images. To use it, integrate this functionality into your existing workflows, DMS, or ERP systems.
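In an integration, the endpoint address is typically assembled once from the document type. The sketch below only builds the URL from the pattern given above; the helper name `capture_endpoint` and the percent-encoding of the document type are assumptions. The request method, authentication, and payload format are not described in this text and are deliberately left out.

```python
from urllib.parse import quote

BASE_URL = "https://capture.bludelta.de"


def capture_endpoint(document_type, version="v1"):
    """Build the capture URL for a document type, following the pattern above."""
    # Percent-encode the document type so it is safe as a single path segment.
    return f"{BASE_URL}/{quote(document_type, safe='')}/{version}"
```

The actual identifier to pass as `document_type` is the one assigned to your project, so treat any value used in testing as a placeholder.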
Best Practice: Feedback from Your Workflow
BLU DELTA AI returns the data capture results and indicates which values you can rely on and where uncertainties exist. You can use this information to capture data fully automatically, without manual review (dark processing), or to display “uncertain information” in your workflow for correction.
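A workflow can act on this by splitting each result into values that are accepted automatically and values that are shown to a user. The following is a minimal sketch under stated assumptions: the result shape (a value plus a `confidence` between 0 and 1 per field) and the threshold of 0.9 are hypothetical and may not match the actual response format.

```python
def split_by_confidence(fields, threshold=0.9):
    """Separate extracted fields into auto-accepted values and values needing review.

    `fields` maps a field name to {"value": ..., "confidence": float in [0, 1]}.
    This shape is an assumption made for the example.
    """
    accepted, needs_review = {}, {}
    for name, item in fields.items():
        if item["confidence"] >= threshold:
            accepted[name] = item["value"]
        else:
            needs_review[name] = item["value"]
    return accepted, needs_review


# Made-up result for one document.
result = {
    "iban": {"value": "AT1", "confidence": 0.98},
    "vat_id": {"value": "U2", "confidence": 0.61},
}
accepted, needs_review = split_by_confidence(result)
```

A document whose `needs_review` dictionary is empty can pass through without manual handling; anything else goes to the correction step.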
You can send corrections back to us through our Learn API. This feeds a continuous, automated cycle in which the model is trained, improved, measured, and seamlessly deployed.
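Before corrections can be sent anywhere, the workflow has to work out what the user actually changed. The sketch below compares the extracted values with the values confirmed by a user and keeps only the differences. The function name and the record layout are assumptions for illustration; the payload format expected by the Learn API is not described here.

```python
def collect_corrections(extracted, confirmed):
    """Return only the fields whose confirmed value differs from the extracted one.

    Both arguments map a field name to a value. Fields that are missing from
    `extracted` but present in `confirmed` count as corrections as well.
    """
    return {
        name: {"extracted": extracted.get(name), "corrected": value}
        for name, value in confirmed.items()
        if extracted.get(name) != value
    }


# Made-up example: one wrong value and one field the model missed.
extracted = {"iban": "AT1", "vat_id": "X"}
confirmed = {"iban": "AT1", "vat_id": "U2", "purchase_order_id": "4711"}
corrections = collect_corrections(extracted, confirmed)
```

Keeping only the changed fields makes it clear which values were wrong and avoids resubmitting data the model already captured correctly.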
The most time-consuming task in this process can be the creation of training data. If historical data is missing or insufficient, the BLU DELTA Data Management Team will need to supplement it. Once this is done, training and measuring the model become a mere formality.
In other words, if you have high-quality historical data, we can provide a model in approximately 2-3 weeks. If you lack historical data, depending on the complexity of the elements to be extracted, you should expect about 4-6 weeks.
For questions regarding the duration or the exact approach to your specific problem, our BLU DELTA AI experts can provide answers.
Take advantage of a free consultation with one of our experts.
If you’d like to learn more about data capture with BLU DELTA AI, we look forward to hearing from you.