Authors: Giulia De Poli, Davide Cui, Manuela Bazzarelli, Leone De Marco, Matteo Bregonzio
Identity validation and data gathering through physical documents (papers and cards) are still widely adopted in several sectors, including retail banking, insurance, and public administration. These tasks generally consist of manually transcribing relevant fields from physical documents into a dedicated software application. The reliance on manual work has several causes, such as the sensitive nature of these operations and the lack of suitable digital alternatives; however, the process remains time-consuming, expensive, and prone to human error.
For these reasons, there is still a strong need to automate physical document interpretation and data extraction, particularly in cases where documents can be rapidly digitised (scanned). We therefore devised an A.I.-based pipeline to automatically acquire, process, and validate digitised documents and the related manually extracted fields. Our solution acts as a support layer for human agents, enabling faster document processing and drastically reducing operational errors. The proposed pipeline leverages Cloud infrastructure for scalability and employs several A.I. techniques, from Computer Vision to Convolutional Neural Networks.
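To make the validation step concrete, the following is a minimal sketch of how a manually transcribed field could be checked against the value read automatically from the scanned document. It is not the pipeline's actual implementation: the function names, the field names, the use of Python's standard `difflib` for string similarity, and the 0.9 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Uppercase and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(value.upper().split())


def validate_fields(
    manual: dict[str, str],
    extracted: dict[str, str],
    threshold: float = 0.9,  # hypothetical cut-off, not taken from the paper
) -> dict[str, bool]:
    """Return, for each manually entered field, whether it agrees with the
    automatically extracted value (similarity ratio at or above threshold)."""
    result = {}
    for name, typed in manual.items():
        # A field the extractor did not return is compared against "".
        read = extracted.get(name, "")
        score = SequenceMatcher(None, normalize(typed), normalize(read)).ratio()
        result[name] = score >= threshold
    return result


# Hypothetical example: the operator typed a document number that differs
# by one character from what was read off the scan.
manual_entry = {"surname": "Rossi", "document_number": "CA12345AB"}
extracted_entry = {"surname": "ROSSI ", "document_number": "CA12845AB"}
flags = validate_fields(manual_entry, extracted_entry)
```

In a sketch like this, fields flagged as mismatching would be routed back to the human agent for review rather than corrected automatically, which matches the support-layer role described above.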
Although accuracy may vary depending on the document type and the specific field to be recognised, we observed promising experimental results. On average, the pipeline reached 92.6% accuracy on document recognition tasks and 81.1% on field extraction tasks. Furthermore, we foresee a significant reduction in operational time and errors.
We presented the following paper at the 2021 International Conference on Decision Support System Technology.