Work with Custom Invoice Extraction

Overview

The Custom Invoice Extraction feature is similar to the Invoice Extraction feature. Both use a set of default fields.

Customization means that you can add non-default fields to the extraction. You can also retrain the default fields.

Considerations

Open Documents

Open AI from the navigation menu and select Documents.

Document Overview Page

The Document Overview page displays all of the saved document extractions.

This includes Invoices, Fixed Forms, and Custom Invoices.

You can view the following:

  • Document extraction name - The name given to identify the document model.
  • Number of versions - The number of versions of the document extraction.
  • Created - The date the document extraction was created.
  • Settings:
    • Edit - Open the extraction in Document Studio to edit it.
    • Clone - Copy the document extraction.
    • Delete - Delete the document extraction.
    • Configuration - Edit the name of the extraction.

Create a Custom Invoice Extraction

To create a new Custom Invoice Extraction:

  1. After opening Documents in Hero Platform_, click Create Document Model.

  2. Enter a name for the Invoice extraction and click Next.

  3. Select Custom Invoices and click Next.

  4. Select your training data Input, which was saved in the Data Store during the data preparation stage of the process.

    A RobinSkill Input is required to access data in the Data Store.


    Click Save and Preview to enter the Document Studio.

    Languages supported

    Text type     Language              Note
    Typed Text    Multiple Languages    Supported

    Other

    Documents in other languages can be trained. The model’s accuracy depends on the number of labeled documents.

    Automation Hero recommends labeling 1,000 to 2,000 documents, or using the context aware training feature to improve results with a smaller set of labeled documents (20-50).


  5. The Document Studio displays a training document with the located pre-trained fields marked.

    Only the first five input documents are displayed in the Document Studio. When training the extraction, all submitted documents are used for training.



  6. Click Save and Train in the toolbar after reviewing the sample results.
    • Training is available when new data has been added or a change has been made to a previous version.
    • A specific version can be selected from the drop-down menu at the top of the screen.

    • When saving, choose between saving to the current version or creating a new version.

    Click Start training to begin training and then add the Custom Invoice Extraction to Hero Platform_.

Information Tabs in Custom Invoice Extraction

Available Fields

Custom Fields and Standard Fields

Displays the names of the user-created Custom Fields and the Invoice Extractor Standard Fields.

Sample Documents

Displays the first five invoice documents from the training data Input. 

Use the arrow icons on the side of the Document Studio to preview the model's performance.

After training has been completed, the custom fields are marked on the sample documents. 

Remove sample documents by clicking the trash icon.

Metrics

Displays the start and end time for the training of the model.

Click Download files to download a zip file containing two log files: training.log and metrics.json.

  • The metrics.json file contains the data displayed in the metric charts below.
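
If you want to inspect the downloaded metrics outside of the Document Studio, the archive can be read with standard Python tooling. The following is a minimal sketch: the archive name training_files.zip and the function name are hypothetical, and because the layout of metrics.json is not described here, the function simply returns whatever JSON the file contains.

```python
import json
import zipfile


def load_training_metrics(zip_path):
    """Return the parsed contents of metrics.json from the downloaded archive."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Compare the file name only, in case the archive nests files in a folder.
            if name.rsplit("/", 1)[-1] == "metrics.json":
                with archive.open(name) as handle:
                    return json.load(handle)
    raise FileNotFoundError(f"metrics.json not found in {zip_path}")


# Usage (hypothetical path to the downloaded archive):
# metrics = load_training_metrics("training_files.zip")
# print(json.dumps(metrics, indent=2))
```

Printing the result with indentation is a quick way to see which fields and values the charts are built from.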

Charts:

The metric charts show the following information for each trained field:

  • Precision - How often the extractor is correct when it identifies a value.
  • Recall - How many of the known values the extractor identifies.
  • F1-score - A measure of overall performance that combines precision and recall.

In general, higher values are better.
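
To make the three metrics concrete, the sketch below computes them from raw counts using the standard textbook definitions, with the F1-score as the harmonic mean of precision and recall. How the platform computes the chart values internally is not stated here, so treat this as an illustration only; the function name and the example counts are made up.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1-score for one field from raw counts."""
    predicted = true_positives + false_positives  # values the extractor returned
    actual = true_positives + false_negatives     # values that were really there
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Example: 90 values extracted correctly, 10 extracted incorrectly, 30 missed.
precision, recall, f1 = precision_recall_f1(90, 10, 30)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=0.90 recall=0.75 f1=0.82
```

In this example the extractor is usually right when it returns a value (high precision) but misses a quarter of the known values (lower recall), and the F1-score falls between the two.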

Settings

Make this a context aware document extractor

Background

Automation Hero has created features called "Context Awareness". These features let you connect and use data from your existing data sources. This data can help our AI make better decisions about the information it detects. 

These features can be used to enhance the training speed and accuracy of document extraction models.

The context aware feature for the Custom Invoice Extractor reduces the number of labeled sample documents needed to produce a highly accurate trained model. It does this by using a source of sample values that you supply.

Automation Hero recommends this feature when possible to save time and increase accuracy.

Benefits
  • Quick data preparation process from a reduced number of labeled documents. 
  • Fewer possibilities of human error in the data labeling process.
  • Improved accuracy through deeper learning of the expected field values.

This training method makes creating a data extraction model both faster and easier.

Requirements 
  • An Input containing 1000+ sample values for each field.
    • Each custom labeled field needs sample values for training the model.
  • A minimum of 15 labeled sample documents.
    • Automation Hero recommends increasing the number of accurately labeled sample documents to give the AI a better understanding of what it is looking for.
  • Accurate labels. Because the context aware feature builds the model from both the source values and the supplied labeled sample documents, the labeled sample documents must be accurate to produce accurate results.

Usage

Select Yes to enable this feature. This feature is disabled by default.

Click Save and Train in the Document Studio.

Before training begins, a pop-up box is displayed with the model's custom fields.

Next to each custom field in the model, select the Input and the corresponding field name. Each source field should contain 1000+ sample values.

When complete, click Start training to begin training the custom invoice extraction model.
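
Before starting a context aware training run, it can help to check your data against the stated requirements (1000+ sample values per custom field and at least 15 labeled sample documents). The sketch below shows one way to do that outside the platform. It is not a platform API: the function name, the dictionary layout, and the field name po_number are hypothetical, and ignoring blank values is an assumption.

```python
MIN_SAMPLE_VALUES = 1000      # per custom field, from the requirements
MIN_LABELED_DOCUMENTS = 15    # minimum labeled sample documents


def check_context_aware_requirements(sample_values_by_field, labeled_document_count):
    """Return a list of problems; an empty list means both requirements are met."""
    problems = []
    if labeled_document_count < MIN_LABELED_DOCUMENTS:
        problems.append(
            f"Only {labeled_document_count} labeled documents; "
            f"at least {MIN_LABELED_DOCUMENTS} are required."
        )
    for field, values in sample_values_by_field.items():
        # Assumption: empty or whitespace-only entries do not count as samples.
        usable = [v for v in values if v is not None and str(v).strip()]
        if len(usable) < MIN_SAMPLE_VALUES:
            problems.append(
                f"Field '{field}' has {len(usable)} usable sample values; "
                f"at least {MIN_SAMPLE_VALUES} are required."
            )
    return problems


# Usage with a hypothetical custom field that has too few sample values.
for problem in check_context_aware_requirements(
    {"po_number": ["PO-1", "PO-2", ""]}, labeled_document_count=20
):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets you fix every gap in the source data in a single pass.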

Languages

Select the language used for values in the typed text fields created for the model.

List of supported languages.

Custom Invoice Training Status

Status            Definition
Ready             Training is complete.
Needs training    A model version has been created but has not yet been trained.
Training          The model is in the process of being trained.
Error             The training of the model has crashed and was not completed.
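
If you track model versions in your own scripts or reports, the four statuses map naturally onto an enumeration. The sketch below is illustrative only: the class and function names are hypothetical, and the rule that only a Ready model can be used in a Flow is inferred from the requirement that an extraction be saved and trained before use.

```python
from enum import Enum


class TrainingStatus(Enum):
    """The four training statuses shown for a Custom Invoice Extraction."""
    READY = "Ready"                    # training is complete
    NEEDS_TRAINING = "Needs training"  # version created but not yet trained
    TRAINING = "Training"              # training in progress
    ERROR = "Error"                    # training crashed and was not completed


def can_be_used_in_flow(status):
    """Assumption: only a fully trained (Ready) model version is usable in a Flow."""
    return status is TrainingStatus.READY


# Look up a status by its displayed label and check it.
print(can_be_used_in_flow(TrainingStatus("Ready")))           # prints: True
print(can_be_used_in_flow(TrainingStatus("Needs training")))  # prints: False
```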

Use a Custom Invoice Extraction in a Flow

After a Custom Invoice Extraction has been saved and trained, it can be used as a function in a Flow.

To use a Custom Invoice Extraction in a Flow:

  1. Open and start creating a Flow in the Flow Studio.
  2. View the Document functions in the element browser.
  3. Click and drag the Custom Invoice Extraction from the element browser onto the Flow Studio canvas.
  4. Connect the Custom Invoice Extraction using a cable from an element in the Flow.
  5. Confirm or select a version of the Custom Invoice Extraction model.
  6. Add Input documents.
  7. Configure or review the fields for the Custom Invoice Extraction model's Docker container deployment.

    1. Capture logs - Select if the Docker function should capture logs.
    2. RAM - Adjust the sliding bar for memory (RAM) allocation for the function.
    3. vCPU - Adjust the sliding bar for CPU consumption (in cores).
    4. Attempt timeout(s) - Enter the timeout setting (in seconds).
    5. Initial Delay - Enter the time, in seconds, between when the container starts and when the Flow begins to use it.
    6. Retry attempts - Enter the maximum number of retry attempts before failing.

    Automation Hero recommends leaving the Docker container settings at the default levels unless problems arise. 

    For example, raising the default settings may help when the documents being processed are very large.

  8. Configure the fields for the Custom Invoice Extraction.
  9. Click OK to finish adding the Custom Invoice Extraction to the Flow.
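
The Docker container settings in step 7 can be easier to reason about when you see how an initial delay, a per-attempt timeout, and a retry limit typically interact. The sketch below is a generic illustration, not the platform's implementation: the class, the function, and all default values are hypothetical, and it assumes that "retry attempts" means retries after one initial attempt.

```python
import time
from dataclasses import dataclass


@dataclass
class ContainerSettings:
    """Mirror of the settings in step 7; every default here is a made-up value."""
    capture_logs: bool = False
    ram_mb: int = 2048
    vcpu: float = 1.0
    attempt_timeout_s: float = 60.0
    initial_delay_s: float = 5.0
    retry_attempts: int = 3


def call_with_retries(call, settings, sleep=time.sleep):
    """Wait for the initial delay, then try call(timeout) until it succeeds.

    Assumption: one initial attempt plus up to retry_attempts retries.
    """
    sleep(settings.initial_delay_s)  # give the container time to start
    last_error = None
    for _ in range(1 + settings.retry_attempts):
        try:
            return call(settings.attempt_timeout_s)
        except Exception as error:  # a timeout or failure counts as one attempt
            last_error = error
    raise RuntimeError(
        f"failed after {1 + settings.retry_attempts} attempts"
    ) from last_error
```

Under these assumptions, the worst-case wait before the function gives up is roughly the initial delay plus the attempt timeout multiplied by the number of attempts, which is one reason very large documents may call for a longer timeout rather than more retries.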