Article · Sep 1, 2019 · Closed access

PubLayNet: Largest Dataset Ever for Document Layout Analysis

IBM Research - Australia

Indexed in Crossref

Abstract

Recognizing the layout of unstructured digital documents is an important step when parsing them into a structured, machine-readable format for downstream applications. Deep neural networks developed for computer vision have proven effective for analyzing the layout of document images. However, the document layout datasets that are currently publicly available are several orders of magnitude smaller than established computer vision datasets. Models therefore have to be trained by transfer learning from a base model that is pre-trained on a traditional computer vision dataset. In this paper, we develop the PubLayNet dataset for document layout analysis by automatically matching the XML representations and…

Citation impact

Total citations: 467
FWCI: 18.57
Percentile: 100%
References: 27

Authors

3

Topics & keywords

Keywords
  • Computer science
  • Parsing
  • Document layout analysis
  • XML
  • Information retrieval
  • Artificial intelligence
  • Domain (mathematical analysis)
  • Document Structure Description