FEC - School of Computing
Prof Cathal Gurrin

DocExtractNet – A Deep Learning Framework for Accurate Receipt Data Extraction

In the evolving landscape of business automation, extracting reliable information from financial documents like receipts remains a persistent challenge. From poor image quality to diverse layouts and handwritten annotations, these documents resist easy interpretation by traditional systems.

To address these hurdles, Prof Cathal Gurrin, working with researchers from Wuhan and Hangzhou in China, has developed DocExtractNet, a framework that builds upon LayoutLMv3 to streamline and enhance receipt data extraction.

This innovative approach not only boosts accuracy across multiple benchmarks but also holds significant promise for improving the speed and reliability of financial workflows.

The paper introduces DocExtractNet as a framework for information extraction from business documents, with a particular focus on the challenges of processing receipts. The researchers highlight several obstacles to extracting critical information from receipts: variable scanned image quality, complex and diverse formats, handwritten elements, and noise.

Built upon the base model LayoutLMv3, DocExtractNet aims to provide an efficient solution for automating financial processes and supporting timely business decisions by tackling these issues head-on.

DocExtractNet employs several innovative strategies to achieve its goals. The framework incorporates an ImageEnhance method to process image features and improve recognition accuracy, particularly for low-quality images. It also uses a PrecisionHints strategy to help supplement missing key-value pairs within the text data, thereby improving overall data integrity. Furthermore, the CrossModalFusion method is utilised to combine features from both the image and text modalities, enabling the model to better understand and extract the required information. 
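To make the three roles more concrete, here is a toy Python sketch of how such stages could fit together. It is not the paper's implementation: the function names `enhance_image`, `apply_hints`, and `fuse` are hypothetical, and each body is a deliberately simple stand-in (contrast stretching, filling empty fields from a hint dictionary, and feature concatenation) chosen only to illustrate the idea behind each stage.

```python
from typing import Dict, List, Optional


def enhance_image(pixels: List[List[int]]) -> List[List[int]]:
    """Hypothetical stand-in for ImageEnhance: min-max contrast stretch to 0..255."""
    flat = [p for row in pixels for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in pixels]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in pixels]


def apply_hints(
    fields: Dict[str, Optional[str]], hints: Dict[str, str]
) -> Dict[str, str]:
    """Hypothetical stand-in for PrecisionHints: fill empty fields from hints.

    Values that were already extracted always win over hinted values.
    """
    merged = dict(hints)
    merged.update({key: value for key, value in fields.items() if value})
    return merged


def fuse(image_features: List[float], text_features: List[float]) -> List[float]:
    """Hypothetical stand-in for CrossModalFusion: plain concatenation."""
    return image_features + text_features


# A faded 2x2 grayscale scan is stretched to use the full intensity range.
enhanced = enhance_image([[50, 100], [150, 200]])

# The extractor missed the date; a hint supplies the missing key-value pair.
completed = apply_hints({"total": "12.50", "date": None}, {"date": "2024-01-05"})

# Image and text features are combined into one vector for a downstream model.
joint = fuse([0.1, 0.2], [0.3])
```

In the real framework these steps operate on learned LayoutLMv3 representations rather than raw lists; the sketch only shows the division of labour between the image, text, and fusion stages.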

Experimental results presented in the paper show DocExtractNet reaching F1 scores of 97.07% on Finance-Receipts, 91.80% on FUNSD, and 97.38% on CORD, outperforming other models on receipt information extraction tasks while also improving processing efficiency.
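For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision (how many extracted fields are correct) and recall (how many true fields were found). The short Python sketch below computes it over sets of (field, value) pairs. The exact-match criterion and the example receipt fields are assumptions made for illustration; the paper's evaluation protocol may differ.

```python
from typing import Set, Tuple

Entity = Tuple[str, str]  # (field label, extracted text)


def f1_score(predicted: Set[Entity], gold: Set[Entity]) -> float:
    """Exact-match F1 between predicted and ground-truth entities."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: three true fields on a receipt.
gold = {("total", "12.50"), ("date", "2024-01-05"), ("vendor", "ACME")}

# The model gets two right, misreads the vendor, and invents a tax field.
predicted = {
    ("total", "12.50"),
    ("date", "2024-01-05"),
    ("vendor", "ACNE"),
    ("tax", "1.00"),
}

score = f1_score(predicted, gold)  # precision 2/4, recall 2/3
```

Because the harmonic mean is pulled towards the lower of the two values, a model cannot reach a high F1 by extracting very few fields accurately or by extracting many fields carelessly.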

Read the full paper here: https://www.sciencedirect.com/science/article/pii/S0306457324004059