The Representational Budget: Scale, RL, and Multimodal Alignment Compete for Geometric Potential in Transformers

Zhang, Jing

doi:10.5281/zenodo.19585083

Preprint, Zenodo (CERN European Organization for Nuclear Research), Apr 15, 2026, Green OA

Indexed in DataCite

Abstract

We introduce the spectral slope S(ℓ)—the log-linear decay rate of PCA eigenvalues computed from hidden-state representations at layer ℓ—as a cheap, per-layer diagnostic scalar for Transformer geometry. Across four rounds of experiments on 13 models from 5 architecture families (0.6B–30B parameters, dense and MoE, with varying RL intensity and modality count), we find that (1) per-layer spectral expansion ΔS/L decays monotonically with log N within the Qwen3 family (r=−0.968, p=0.007); (2) output-layer participation ratio PR tracks RL training intensity from 13.3 (base) to 4.3 (extreme RL); (3) chain-of-thought reasoning reverses RL-induced compression at runtime; (4) MoE routing increases aggregate spectral…
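The two diagnostics named in the abstract can be illustrated with a minimal sketch. This is not the author's code, and the record does not give the exact fitting procedure. The sketch assumes that "log-linear decay rate" means the least-squares slope of log(eigenvalue) against rank index (a log-log fit is another plausible reading), and that PR is the standard participation ratio, (Σλ)² / Σλ². The function names and the `top_k` cutoff are hypothetical.

```python
import numpy as np


def pca_eigenvalues(hidden):
    """PCA eigenvalues of hidden states (n_tokens x d_model), in descending order."""
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data, divided by (n - 1),
    # equal the eigenvalues of the sample covariance matrix.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return singular_values ** 2 / (hidden.shape[0] - 1)


def spectral_slope(eigvals, top_k=50, eps=1e-12):
    """Slope of log(eigenvalue) vs. rank index (assumed definition of S)."""
    lam = np.asarray(eigvals, dtype=float)[:top_k]  # top_k is a hypothetical cutoff
    lam = lam[lam > eps]                            # drop numerically zero modes
    ranks = np.arange(1, lam.size + 1)
    slope, _intercept = np.polyfit(ranks, np.log(lam), 1)
    return slope


def participation_ratio(eigvals):
    """Standard participation ratio: (sum of eigenvalues)^2 / sum of squares."""
    lam = np.asarray(eigvals, dtype=float)
    return lam.sum() ** 2 / (lam ** 2).sum()


if __name__ == "__main__":
    # Synthetic stand-in for one layer's hidden states: 512 tokens, width 64.
    rng = np.random.default_rng(0)
    hidden = rng.normal(size=(512, 64)) * np.exp(-0.05 * np.arange(64))
    lam = pca_eigenvalues(hidden)
    print("S  =", spectral_slope(lam))
    print("PR =", participation_ratio(lam))
```

Under these assumptions, a more negative slope means variance is concentrated in fewer principal directions, and a PR near 1 means a single direction dominates while a PR near the hidden width means variance is spread evenly.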

Citation impact

Total citations: 6

FWCI: —
Percentile: —
References: 0


Authors

Jing Zhang (corresponding author)

Topics & keywords

Keywords

Transformer
Eigenvalues and eigenvectors
Spectral shape analysis
Monotonic function
Scalar (mathematics)
Topology (electrical circuits)
Pattern recognition (psychology)
