PubLayNet: Largest Dataset Ever for Document Layout Analysis

Zhong, Xu; Tang, Jianbin; Yepes, Antonio Jimeno

doi:10.1109/icdar.2019.00166

Article | Sep 1, 2019 | Closed access

Xu Zhong, Jianbin Tang, Antonio Jimeno Yepes

IBM Research - Australia

Indexed in Crossref

Abstract

Recognizing the layout of unstructured digital documents is an important step when parsing them into a structured, machine-readable format for downstream applications. Deep neural networks developed for computer vision have proven effective at analyzing the layout of document images. However, the document layout datasets that are currently publicly available are several orders of magnitude smaller than established computer vision datasets. Consequently, models have to be trained by transfer learning from a base model that is pre-trained on a traditional computer vision dataset. In this paper, we develop the PubLayNet dataset for document layout analysis by automatically matching the XML representations and…

Citation impact

Total citations: 467

FWCI: 18.57
Percentile: 100%
References: 27

Authors: 3

Topics & keywords

Keywords

Computer science
Parsing
Document layout analysis
XML
Information retrieval
Artificial intelligence
Domain (mathematical analysis)
Document Structure Description