LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Xu, Yiheng; Li, Minghao; Cui, Lei; Huang, Shaohan; Wei, Furu; Zhou, Ming

doi:10.1145/3394486.3403172

Article · Aug 20, 2020 · Green OA

Harbin Institute of Technology · Beihang University · Microsoft Research Asia (China)

Indexed in: arXiv, Crossref

Abstract

Pre-training techniques have been verified successfully in a variety of NLP tasks in recent years. Despite the widespread use of pre-training models for NLP applications, they almost exclusively focus on text-level manipulation, while neglecting layout and style information that is vital for document image understanding. In this paper, we propose the LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents. Furthermore, we also leverage image features to incorporate words' visual information into LayoutLM. To the best of…

Citation impact

Total citations: 575

FWCI: 22.66
Percentile: 100%
References: 16

Authors (6)

Yiheng Xu (corresponding author), Harbin Institute of Technology
Minghao Li, Beihang University
Lei Cui, Microsoft Research Asia (China)
Shaohan Huang, Microsoft Research Asia (China)
Furu Wei, Microsoft Research Asia (China)

Topics & keywords

Keywords

Leverage (statistics)
Document layout analysis
Document image processing
Focus (optics)
Image (mathematics)
Variety (cybernetics)
Information extraction
Historical document