An Integrated Approach for Table Detection and Structure Recognition

  • Hai-Hong Phan Le Quy Don Technical University
  • Dai Duong Ngo Le Quy Don Technical University
Keywords: Document structure analysis, table detection, table structure recognition, hydrometeorology

Abstract

Detecting tables and recognizing their structure is an important
problem in document digitization. Although current deep learning
techniques have brought considerable progress, table structure
recognition remains difficult, especially when digitizing
real-world documents. This paper proposes a solution for
digitizing tabular documents that uses a Cascade R-CNN HRNet
network to detect and classify tables, and integrates image
processing algorithms to improve table data recognition results.
The proposed algorithm proved effective on real data, namely the
record books of hydrometeorological stations, which contain tables
with both simple and complex structures, achieving over 98%
accuracy.

Author Biographies

Hai-Hong Phan, Le Quy Don Technical University

Hai-Hong Phan, Ngo Dai Duong

Le Quy Don Technical University

*hongpth@lqdtu.edu.vn

Dai Duong Ngo, Le Quy Don Technical University

Ngo Dai Duong is currently pursuing a
master's degree at Le Quy Don Technical
University. He received the B.S. degree
in Information Technology from Le Quy
Don Technical University in 2014. His research focuses on computer vision, document analysis, and machine learning.
Email: daiduong28789@hotmail.com

Published
2021-05-31