OCR, data extraction and document classification
Many documents that organizations handle in a process automation and electronic document workflow system still arrive on paper. To be managed electronically, they must first be digitized. Bosflow supports converting paper documents to digital form and registering them in the system. During digitization, we use the system’s built-in OCR module.
The module combines OCR technology with functions built on it, such as data extraction and content classification.
- Paper documents are scanned and processed by the Bosflow OCR module, which reads their textual content.
- The extracted text content is indexed by the search engine, enabling documents to be easily found by their contents.
- Data extraction algorithms pull key attributes from the recognized text. These may include dates, amounts, tax identification numbers (NIP) and other values defined for specific document types.
- The system verifies the extracted values and automatically fills the corresponding fields of electronic forms used in business processes.
- Based on the document’s textual content and extracted attribute values, each document is automatically assigned a document class, which determines the appropriate business process path for it.
- Document handling, the types of extracted attributes and the classification methods are adapted individually to the needs of each business process.
Typical documents such as invoices are supported in the standard version of the Bosflow OCR module.
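The extraction, verification and classification steps listed above can be illustrated with a minimal sketch. It is not Bosflow's implementation: the function names, the regular expressions, the "Total due" label and the classification rule are all assumptions made for illustration. The only domain fact it relies on is the standard NIP control digit (a weighted sum of the first nine digits, modulo 11).

```python
import re
from datetime import date
from decimal import Decimal

# Standard weights for the NIP control digit.
NIP_WEIGHTS = (6, 5, 7, 2, 3, 4, 5, 6, 7)

# Hypothetical patterns; a real deployment would define them per document type.
NIP_RE = re.compile(r"NIP[:\s]*((?:[0-9][\s-]?){9}[0-9])")
DATE_RE = re.compile(r"\b([0-9]{4})-([0-9]{2})-([0-9]{2})\b")
TOTAL_RE = re.compile(r"Total due[:\s]*([0-9]+[.,][0-9]{2})")


def is_valid_nip(nip: str) -> bool:
    """Verify a 10-digit NIP (no separators) against its control digit."""
    if not re.fullmatch(r"[0-9]{10}", nip):
        return False
    digits = [int(ch) for ch in nip]
    # zip stops after nine weights, so only the first nine digits are summed.
    checksum = sum(w * d for w, d in zip(NIP_WEIGHTS, digits)) % 11
    return checksum == digits[9]  # a checksum of 10 can never match


def process_document(text: str) -> dict:
    """Extract a few attributes from OCR text, verify them, and assign a class."""
    fields = {"nip": None, "issue_date": None, "total": None}

    nip_match = NIP_RE.search(text)
    if nip_match:
        cleaned = re.sub(r"[\s-]", "", nip_match.group(1))
        if is_valid_nip(cleaned):  # keep only values that pass verification
            fields["nip"] = cleaned

    date_match = DATE_RE.search(text)
    if date_match:
        try:
            fields["issue_date"] = date(*map(int, date_match.groups()))
        except ValueError:
            pass  # an OCR misread such as month 13 is left empty

    total_match = TOTAL_RE.search(text)
    if total_match:
        fields["total"] = Decimal(total_match.group(1).replace(",", "."))

    # Illustrative rule: a keyword plus verified attributes selects the class.
    is_invoice = (
        "invoice" in text.lower()
        and fields["nip"] is not None
        and fields["total"] is not None
    )
    fields["document_class"] = "invoice" if is_invoice else "unclassified"
    return fields


sample = (
    "Invoice no. 12/2024\n"
    "Issue date: 2024-03-15\n"
    "Seller NIP: 123-456-32-18\n"
    "Total due: 1230.50 PLN"
)
print(process_document(sample))
```

In this sketch a value that fails verification is left empty rather than copied into the form, so a downstream process could route such documents to manual review.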
What are the benefits of implementing the Bosflow system? 
- Average invoice circulation time: 14 days before, 3 days after
- Search for related documents: 15 min before, 5 min after
- Finding out who has the document: 10 min before, 1 min after
- Checking the invoice amount against contractual and budgetary limits: 5 min before, 1 min after
- Checking the amount of upcoming liabilities from documents in circulation: not feasible before, 1 min after