Article · Frontiers in Artificial Intelligence · Mar 14, 2023 · Gold OA

COVID-Twitter-BERT: A natural language processing model to analyse COVID-19 content on Twitter

École Polytechnique Fédérale de Lausanne · Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunitat Valenciana

Indexed in: Crossref, DOAJ, PubMed

Abstract

Introduction

This study presents COVID-Twitter-BERT (CT-BERT), a transformer-based model pre-trained on a large corpus of COVID-19-related Twitter messages. CT-BERT is designed for COVID-19 content, particularly from social media, and can be used for natural language processing tasks such as classification, question answering, and chatbots. The paper evaluates CT-BERT on several classification datasets and compares it with its base model, BERT-LARGE.

Methods

CT-BERT is evaluated on five classification datasets, one of which is in the target domain. Its performance on each dataset is compared with that of its base model, BERT-LARGE, to measure the marginal improvement. The authors also describe the training process and the model's technical specifications in detail.
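The comparison against a base model can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes that "marginal improvement" means the share of the base model's remaining error that the domain-specific model removes, and the function name `marginal_improvement` and the example scores are hypothetical placeholders, not reported results.

```python
def marginal_improvement(baseline: float, candidate: float) -> float:
    """Share of the baseline's remaining error removed by the candidate.

    Assumed definition (not confirmed by the abstract):
        (candidate - baseline) / (1 - baseline)
    Both scores are fractions in [0, 1], for example F1 or accuracy.
    """
    if not 0.0 <= baseline < 1.0:
        raise ValueError("baseline must be in [0, 1)")
    if not 0.0 <= candidate <= 1.0:
        raise ValueError("candidate must be in [0, 1]")
    return (candidate - baseline) / (1.0 - baseline)


if __name__ == "__main__":
    # Placeholder scores for illustration only; these are NOT results from the paper.
    placeholder_scores = {
        "dataset_a": (0.80, 0.85),  # (base model score, domain model score)
        "dataset_b": (0.90, 0.92),
    }
    for name, (base, domain) in placeholder_scores.items():
        gain = marginal_improvement(base, domain)
        print(f"{name}: marginal improvement = {gain:.1%}")
```

Normalizing by the remaining error, rather than reporting the raw score difference, keeps gains comparable across datasets on which the base model already performs very differently.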

Citation impact

  • Total citations: 184
  • FWCI (Field-Weighted Citation Impact): 80.90
  • Percentile: 100%
  • References: 16

Authors

3

Topics & keywords

Keywords
  • Coronavirus disease 2019 (COVID-19)
  • Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
  • 2019-20 coronavirus outbreak
  • Content (measure theory)
  • Computer science
  • Pandemic
  • Natural language processing
  • Virology
UN Sustainable Development Goals
  • Quality Education

Funding