Observing a Baseline Phase Transition in LLM Safety Behavior: Supplementary Evidence for Environmental Design — Paper 5-1 in the Buddys Architecture Series

Steinhauser (Czechia)

Indexed indatacite

Abstract

Seven days after Paper 5 demonstrated that a 92-character value injection produces a sharp phase transition from action (A=93%) to restraint (C=100%) on a trilemma probe, we attempted replication across six models from three providers (Anthropic, OpenAI, Google). The replication failed—not because the effect disappeared, but because the baselines themselves had undergone the same phase transition. All five testable models shifted toward C without any system prompt: Claude Opus and Sonnet reached C=100%, GPT-4o and GPT-4o-mini reached C=100%, and Gemini 2.5 Flash shifted partially to C=13%. Claude Haiku 3.5 was retired from the API entirely. This industry-wide convergence observed within a one-week window…

Citation impact

5
total citations
FWCI
Percentile
References
8
Too recent for citation history.

Authors

1

Topics & keywords

Keywords
  • Offset (computer science)
  • Attractor
  • Series (stratigraphy)
  • Phase transition
  • Baseline (sea)
  • Robustness (evolution)
  • Reboot
No related works found for this paper.