
Doctoral project: Document Classification and Entity Extraction

Many aspects of accounting are difficult to automate fully because so much of
the information involved, such as invoices and receipts, is unstructured. The
aim of this project is to apply state-of-the-art AI and machine learning
technology to the field of accounting.

Project information

Doctoral student
Nemi Pelgrom
Supervisor
Morgan Ericsson
Assistant supervisors
Jonas Nordqvist, Håkan Grahn (BTH)
Participant organizations
Linnaeus University, Fortnox
Financier
Fortnox, KK-stiftelsen (Industriforskarskolan för Data Intensive Applications + (DIA+))
Timetable
September 2023 – September 2028
Subject
Computer and information science (Department of Computer Science and Media Technology, Faculty of Technology)
Research group
Data Intensive Software Technologies and Applications (DISTA)
Linnaeus University Centre
Linnaeus University Centre for Data Intensive Sciences and Applications (DISA)

More about the project

The research explores the feasibility of automated decision support systems for unstructured information, such as the documents used in accounting. Building such systems requires incorporating advanced models and techniques from several domains of AI/ML.

The research tasks may involve, but are not limited to, the following problems:

  • Document classification: documents such as invoices and receipts need to be automatically assessed and classified, which may require models suited to both image and textual data.
  • Entity extraction: the ability to discern and identify important information in unstructured documents is crucial for the automation described above.
  • Robustness and reliability analysis: to establish that the methods are sound, their robustness needs to be investigated.
  • Conditional learning: any system designed to assist humans offers an opportunity to improve the AI/ML models by incorporating a priori knowledge and logical reasoning from domain experts. This approach, known as conditional learning, lets the methods make effective use of that expertise. For it to work, the methods employed must be ones that can be conditioned on such knowledge.

The doctoral project is carried out within the research group Data Intensive Software Technologies and Applications (DISTA) and the Linnaeus University Centre for Data Intensive Sciences and Applications (DISA).