Article · Aug 20, 2020 · Green OA

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei

Harbin Institute of Technology · Beihang University · Microsoft Research Asia (China)

Indexed in: arXiv, Crossref

Abstract

Pre-training techniques have been verified successfully in a variety of NLP tasks in recent years. Despite the widespread use of pre-training models for NLP applications, they almost exclusively focus on text-level manipulation, while neglecting layout and style information that is vital for document image understanding. In this paper, we propose the LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents. Furthermore, we also leverage image features to incorporate words' visual information into LayoutLM. To the best of…
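
The abstract states that text and layout are modeled jointly but, as excerpted here, does not say how. One common way to do this is to add position embeddings derived from each word's bounding box to its word embedding. The sketch below illustrates that idea under several assumptions that are not taken from the abstract: each word comes with a box `(x0, y0, x1, y1)` normalized to integers in 0..1000, the tables are random stand-ins for learned weights, and the names `embed_token`, `embed_document`, `VOCAB`, and `HIDDEN` are hypothetical.

```python
import random

HIDDEN = 8          # illustrative embedding width (assumption)
MAX_COORD = 1001    # coordinates assumed normalized to integers in 0..1000


def make_table(rows, width, seed):
    """Build a randomly initialized table (stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(rows)]


# Toy vocabulary, purely illustrative.
VOCAB = {"[UNK]": 0, "invoice": 1, "total": 2, "42.00": 3}
word_table = make_table(len(VOCAB), HIDDEN, seed=0)
x_table = make_table(MAX_COORD, HIDDEN, seed=1)  # shared by left and right edges
y_table = make_table(MAX_COORD, HIDDEN, seed=2)  # shared by top and bottom edges


def embed_token(word, box):
    """Sum the word embedding with one embedding per bounding-box coordinate."""
    x0, y0, x1, y1 = box
    if not all(0 <= c < MAX_COORD for c in box):
        raise ValueError("box coordinates must lie in 0..1000")
    parts = [
        word_table[VOCAB.get(word, VOCAB["[UNK]"])],
        x_table[x0], y_table[y0], x_table[x1], y_table[y1],
    ]
    # Element-wise sum across the five embedding vectors.
    return [sum(values) for values in zip(*parts)]


def embed_document(tokens):
    """Embed a list of (word, box) pairs into layout-aware vectors."""
    return [embed_token(word, box) for word, box in tokens]


doc = [("invoice", (50, 40, 210, 70)), ("total", (600, 900, 690, 930))]
vectors = embed_document(doc)
```

In this sketch the same word placed at two different positions on the page receives two different vectors, which is the property a layout-aware encoder needs before any Transformer layers are applied.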

Citation impact

Total citations: 575
FWCI: 22.66
Percentile: 100%
References: 16

Authors

6
  • Yiheng Xu (corresponding author), Harbin Institute of Technology
  • Minghao Li, Beihang University
  • Lei Cui, Microsoft Research Asia (China)
  • Shaohan Huang, Microsoft Research Asia (China)
  • Furu Wei, Microsoft Research Asia (China)

Topics & keywords

Keywords
  • Leverage (statistics)
  • Document layout analysis
  • Document image processing
  • Focus (optics)
  • Image (mathematics)
  • Variety (cybernetics)
  • Information extraction
  • Historical document