Article · Artificial Intelligence Review · Oct 17, 2025 · Hybrid OA

Safeguarding large language models: a survey

University of Liverpool · Université Stendhal – Grenoble 3 · +4 more institutions

Indexed in: Crossref, PubMed

Abstract

In the burgeoning field of Large Language Models (LLMs), developing a robust safety mechanism, colloquially known as "safeguards" or "guardrails", has become imperative to ensure the ethical use of LLMs within prescribed boundaries. This article provides a systematic literature review on the current status of this critical mechanism. It discusses its major challenges and how it can be enhanced into a comprehensive mechanism dealing with ethical issues in various contexts. First, the paper elucidates the current landscape of safeguarding mechanisms that major LLM service providers and the open-source community employ. This is followed by the techniques to evaluate, analyze, and enhance some (un)desirable…

Citation impact

Total citations: 42
FWCI: 78.66
Percentile: 100%
References: 96

Authors

12 authors

Topics & keywords

Keywords
  • Safeguarding
  • Field (mathematics)
  • Service (business)
  • Mechanism (biology)
  • Service provider
  • Work (physics)

Funding