Version: 2024 R1

OCR AI Projects

The node allows you to create projects used for recognizing and extracting the text layer of documents added to WEBCON BPS workflow instances. With the AI module, the extracted information can then be distributed to the proper form fields.

To add a new project, click the New button. Alternatively, you can import an existing project: click the Import button and choose the .zip file containing the project.

Project preview

The tab enables you to modify the basic OCR AI project settings.

1. Project name

The user-defined project name.

2. ID

The project identifier.

3. DLL file name

Name of the Dynamic-Link Library file that contains the project configuration.

4. DLL file version

Version of the Dynamic-Link Library file that contains the project configuration.

5. Fields list

The section lists the names and IDs of the fields in the OCR AI project. Fields cannot be manually added to or removed from this list – it is created during the import of the .zip package.
To add your own fields to the project, go to the Custom fields list section.

6. Custom fields list

The list contains fields added by clicking the Add button. Clicking it opens the Create custom OCR AI field window, which allows you to define your own field in the OCR AI project.

Create custom OCR AI field

  • Name – defines the name of the field. The name identifies the field in the system logs and in other configuration sections of Designer Studio. A valid name can contain only alphanumeric characters, spaces, and underscores.
  • ID – the field identifier used internally within the system. It is generated from the field name. It should be unique within the project.
  • Type – selecting the OCR AI custom field type can improve the quality and speed of recognition for values that belong to the chosen type.
  • Enable blocks merging – recognition of phrases composed of several text blocks. After selecting the checkbox, the OCR AI module determines the number of blocks a phrase should be composed of based on the learning data. For example, if the learning data contains two sequences of blocks that contain the company's name:
    [WEBCON] [sp.] [z o.o.] and
    [WEBCON] [sp.] [z] [o.] [o.],
    the OCR AI module will conclude that the target phrase can be composed of 3–5 text blocks.
    The OCR AI engine also analyzes all single-block phrases. This enables it to analyze and assess whether a phrase is correct based on the combination of all blocks.
  • Field format – the expected format of the value which is to be found in the document.
    • Unrestricted – available only for Unrestricted fields; no filtering is applied. Recommended when a field does not have a clearly defined format or when the format is expected to change often.
    • Default – predefined format available for the Date and Amount fields. Recommended when the field is expected to contain standard dates and currency amounts.
    • Detect automatically – the field format is recognized automatically based on the learning data. Recommended when the field is expected to contain data in various formats.
    • Use regular expression – the format is defined by the user. Recommended when the field's format can be described with a regular expression.
  • Regular expression – the field is active if the Use regular expression option has been selected in the Field format option. The expression specifies the expected format of the target value.
    The most common regular expressions:
    • sequence of any 6 characters (e.g. "WEBCON"): .{6},
    • sequence of digits (e.g. "1234"): [0-9]+,
    • sequence of digits with common separators (e.g. "234-234-55-33"): [-:\s0-9]+,
    • decimal amounts (e.g. "1 233,10" or "2.244,90"): ^[0-9\s\.\,]+[\.,][0-9]{2}$,
    • text containing a given word (e.g. "FV/ABC/23/201"): .*ABC.*,
    • text starting with a given word (e.g. "2015/FV/23/0001"): 2015.*,
    • text without digits and without given letters (here A, B, and C): ^[^0-9ABC]*$,
    • zip code in the NN-NNN format (e.g. "12-345"): ^[0-9]{2}-[0-9]{3}$.
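
Before entering an expression in the Regular expression field, it can help to check it against sample values. The following is a minimal Python sketch of such a check. The dictionary labels and the `matches` helper are illustrative names, and whole-value matching with `re.fullmatch` is an assumption; the OCR AI engine may apply expressions differently, and its regular expression dialect may differ from Python's in details.

```python
import re

# Patterns taken from the list above; the keys are illustrative labels only.
PATTERNS = {
    "any_6_chars": r".{6}",
    "digits": r"[0-9]+",
    "digits_with_separators": r"[-:\s0-9]+",
    "decimal_amount": r"^[0-9\s\.\,]+[\.,][0-9]{2}$",
    "contains_ABC": r".*ABC.*",
    "starts_with_2015": r"2015.*",
    "no_digits_or_ABC": r"^[^0-9ABC]*$",
    "zip_code": r"^[0-9]{2}-[0-9]{3}$",
}


def matches(label: str, value: str) -> bool:
    """Return True if the whole value matches the pattern.

    Whole-value matching is an assumption made for this sketch.
    """
    return re.fullmatch(PATTERNS[label], value) is not None


if __name__ == "__main__":
    samples = [
        ("any_6_chars", "WEBCON"),
        ("digits", "1234"),
        ("digits_with_separators", "234-234-55-33"),
        ("decimal_amount", "1 233,10"),
        ("decimal_amount", "2.244,90"),
        ("contains_ABC", "FV/ABC/23/201"),
        ("starts_with_2015", "2015/FV/23/0001"),
        ("zip_code", "12-345"),
    ]
    for label, value in samples:
        print(f"{label!r} vs {value!r}: {matches(label, value)}")
```

Running the script prints one line per sample, so a pattern that rejects its intended value is easy to spot before the project is used for learning.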

Version management

The tab allows you to restore a previous network version, e.g. after incorrect learning. It is also possible to restore only selected fields from previous versions.

1. Custom networks

The section lists all the networks dedicated to a given project. Selecting the Common network allows you to manage the project versions used when no distinguisher is found. New custom networks are created during additional network learning for a given distinguisher. To delete a custom network, click the Remove button available next to the list.

2. Version preview

The section contains the following fields:

  • Name – the name of the custom network selected in the Custom networks section,
  • Distinguisher – the distinguisher defined for the selected custom network. The Universal network does not have a distinguisher, as it is used only when no distinguisher is found.
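
The relationship between distinguishers and networks described above can be pictured as a lookup with a fallback. The sketch below is only an illustration of that idea, not the module's actual implementation: the `Network` class, the `select_network` function, and the sample distinguisher values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Network:
    name: str
    distinguisher: Optional[str]  # None marks the network used as the fallback


def select_network(networks: Iterable[Network], found: Optional[str]) -> Network:
    """Return the custom network matching the found distinguisher.

    If no distinguisher was found, or no custom network matches it,
    the network without a distinguisher is returned instead.
    """
    fallback = None
    for network in networks:
        if network.distinguisher is None:
            fallback = network
        elif found is not None and network.distinguisher == found:
            return network
    if fallback is None:
        raise LookupError("no network without a distinguisher is defined")
    return fallback


# Hypothetical project with one fallback network and two custom networks.
project_networks = [
    Network("Fallback", None),
    Network("Supplier A", "1234567890"),
    Network("Supplier B", "0987654321"),
]
```

For example, `select_network(project_networks, "1234567890")` returns the "Supplier A" network, while `select_network(project_networks, None)` returns the fallback network.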

3. Version management

The section enables you to select a network and specify fields to be restored.

  • Choose version – a drop-down list for selecting the version of the network, or of its individual fields, to be restored.
  • Fields list – the table contains field names and identifiers. By using the selection buttons, you can select individual fields to be restored in the current version.
  • Restore version – the button creates a new version of the custom network which is:
    • a copy of the version selected in the Choose version field (when no checkbox is selected in the Fields list table), or
    • a copy of the most recent version, except for the selected fields, which are copied from the version selected in the Choose version field.

In both scenarios, a new network version is created which is either a complete or partial copy of the selected version.
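
The two restore scenarios can be summarized in a short Python sketch. It models each network version as a dictionary that maps a field ID to that field's learned state. This representation, the `restore_version` function, and the field IDs are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence

Version = Dict[str, str]  # field ID -> learned state (simplified to a string)


def restore_version(
    versions: List[Version],
    chosen_index: int,
    selected_fields: Sequence[str] = (),
) -> List[Version]:
    """Return the version history with one new version appended.

    With no fields selected, the new version is a full copy of the chosen
    version. Otherwise it is a copy of the most recent version in which
    only the selected fields are taken from the chosen version.
    """
    chosen = versions[chosen_index]
    current = versions[-1]
    if not selected_fields:
        new_version = dict(chosen)
    else:
        new_version = dict(current)
        for field_id in selected_fields:
            new_version[field_id] = chosen[field_id]
    return versions + [new_version]


# Hypothetical history: version 0 is older, version 1 is the most recent.
history = [
    {"invoice_number": "state-v0", "gross_amount": "state-v0"},
    {"invoice_number": "state-v1", "gross_amount": "state-v1"},
]

full_restore = restore_version(history, 0)
partial_restore = restore_version(history, 0, ["gross_amount"])
```

In this sketch, `full_restore[-1]` equals version 0 entirely, whereas `partial_restore[-1]` keeps `invoice_number` from version 1 and takes `gross_amount` from version 0. In both cases the history grows by one entry and the earlier versions remain unchanged.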

Usages

The tab presents the actions and action templates (grouped by process, workflow, and step) in which a given OCR AI project is used. An OCR AI project can be removed only if it is not used in any action or action template.
