
Paper deep dive

Spectral Superposition: A Theory of Feature Geometry

Georgi Ivanov, Narmeen Oozeer, Shivam Raval, Tasana Pejovic, Shriyash Upadhyay, Amir Abdullah

Year: 2025 · Venue: arXiv preprint · Area: Mechanistic Interp. · Type: Theoretical · Embeddings: 152

Intelligence

Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 94%

Last extracted: 3/11/2026, 12:44:40 AM

Summary

The paper introduces a spectral theory of feature geometry for neural networks, utilizing the frame operator (F = WWᵀ) to analyze how features allocate norm across eigenspaces. This approach moves beyond pairwise interaction analysis to capture global geometric structures, enabling the diagnosis of feature localization and structural interference in both toy models and arbitrary weight matrices.

Entities (5)

Frame Operator · mathematical-object · 98%
Gram Matrix · mathematical-object · 98%
Spectral Measure · analytical-tool · 95%
Superposition · neural-network-phenomenon · 95%
Bose-Mesner Algebra · mathematical-framework · 92%

Relation Signals (3)

Gram Matrix is related to Frame Operator

confidence 98% · the nonzero eigenspaces of M and F are in one-to-one correspondence.

Frame Operator captures Global Feature Geometry

confidence 95% · spectral methods capture the global geometry (“how do all features interact?”).

Capacity Saturation forces Spectral Localization

confidence 92% · capacity saturation forces spectral localization: features collapse onto single eigenspaces

Cypher Suggestions (2)

Map the relationship between phenomena and their effects · confidence 90% · unvalidated

MATCH (p:Phenomenon)-[r:FORCES]->(e:Effect) RETURN p.name, r.relation, e.name

Find all mathematical objects used to analyze feature geometry · confidence 85% · unvalidated

MATCH (e:Entity {entity_type: 'Mathematical Object'})-[:RELATED_TO]->(f:FeatureGeometry) RETURN e

Abstract

Abstract: Neural networks represent more features than they have dimensions via superposition, forcing features to share representational space. Current methods decompose activations into sparse linear features but discard geometric structure. We develop a theory for studying the geometric structure of features by analyzing the spectra (eigenvalues, eigenspaces, etc.) of weight-derived matrices. In particular, we introduce the frame operator $F = WW^\top$, which gives us a spectral measure that describes how each feature allocates norm across eigenspaces. While previous tools could describe the pairwise interactions between features, spectral methods capture the global geometry ("how do all features interact?"). In toy models of superposition, we use this theory to prove that capacity saturation forces spectral localization: features collapse onto single eigenspaces, organize into tight frames, and admit discrete classification via association schemes, classifying all geometries from prior work (simplices, polygons, antiprisms). The spectral measure formalism applies to arbitrary weight matrices, enabling diagnosis of feature localization beyond toy settings. These results point toward a broader program: applying operator theory to interpretability.

Tags

ai-safety (imported, 100%) · mechanistic-interp (suggested, 92%) · theoretical (suggested, 88%)

Links

PDF not stored locally. Use the link above to view on the source site.

Full Text

151,293 characters extracted from source content.


Spectral Superposition: A Theory of Feature Geometry

Georgi Ivanov, Narmeen Oozeer, Shivam Raval, Tasana Pejovic, Shriyash Upadhyay, Amir Abdullah

Abstract. Neural networks represent more features than they have dimensions via superposition, forcing features to share representational space. Current methods decompose activations into sparse linear features but discard geometric structure. We develop a theory for studying the geometric structure of features by analyzing the spectra (eigenvalues, eigenspaces, etc.) of weight-derived matrices. In particular, we introduce the frame operator $F = WW^\top$, which gives us a spectral measure that describes how each feature allocates norm across eigenspaces. While previous tools could describe the pairwise interactions between features, spectral methods capture the global geometry ("how do all features interact?"). In toy models of superposition, we use this theory to prove that capacity saturation forces spectral localization: features collapse onto single eigenspaces, organize into tight frames, and admit discrete classification via association schemes, classifying all geometries from prior work (simplices, polygons, antiprisms). The spectral measure formalism applies to arbitrary weight matrices, enabling diagnosis of feature localization beyond toy settings. These results point toward a broader program: applying operator theory to interpretability.

Keywords: Machine Learning, Feature Geometry

1 Introduction

[Figure 1 diagram: a representation in the model is split by current methods into separate concept vectors (R, P, S, H, T); the spectral measure instead bins features into independent activation subspaces (rank 1: H, T; rank 2: R, P, S), and the spectral signature recovers the geometry.]

Figure 1: A diagram showing the concepts of "rock (R) / paper (P) / scissors (S)" and "heads (H) / tails (T)" embedded in a three-dimensional space.
This is an example of structural interference, where a model causes concepts to interfere because they are related (the hand-game features interfere with each other and the coin features interfere with each other, based on the relationships between the objects in each set, but the two sets do not interfere with each other). Current methods, which only extract features, will create a basis with 5 vectors. We capture the relationships between features in activation space using the spectral measure to bin features by the eigenspaces they occupy. When each feature is spectrally localized (occupies a single eigenspace), we can recover the full geometry by covering it with an additional spectral signature. These tools from operator theory let us explore the global character of interference: not just how pairs of features interact, but how all features relate.

Neural networks represent far more features than they have dimensions in their activation space, a phenomenon known as superposition (Elhage et al., 2022; Chan, 2024). When a model needs to encode $f$ features in $d$ dimensions with $f > d$, it cannot give each feature its own orthogonal direction. Instead, features must share representational space, compressed into overlapping directions that exploit the high-dimensional geometry available to them. A direct consequence of superposition is feature interference: multiple features compete for the same representational directions, so modifying one inevitably perturbs others. In practice, activation steering methods (Turner et al., 2024; Rimsky et al., 2024; Chalnev et al., 2024) often produce unintended side effects, where pushing on one concept inadvertently shifts others (Siu et al., 2025; Raedler et al., 2025; Nguyen et al., 2025a). Current interpretability approaches address interference by decomposing activations into sparse, independent linear features.
Sparse autoencoders (SAEs) and related methods attempt to recover a dictionary of features that activate sparsely across inputs, effectively trying to undo the compression that superposition creates. The hope is to find a basis where each feature corresponds to a single, interpretable direction. However, this approach discards something important. There are at least two distinct reasons why a model might cause features to share dimensions:

• Incidental interference. Some features are anti-correlated in when they activate: they rarely or never co-occur in the same input (Lecomte et al., 2023). If "marine biology" and "abstract algebra" are both sparse features that almost never appear together, the model can efficiently reuse the same dimensions for both, disambiguating by context. This kind of sharing is a compression trick we would like to undo: the features are conceptually independent, just packed together for efficiency.

• Structural interference. Other features share dimensions because there is genuine structure in the data that makes this natural. The days of the week might be encoded in a shared subspace precisely because the model needs to represent relationships between them: their cyclic structure, their ordering, the fact that Monday is "between" Sunday and Tuesday (Engels et al., 2024). Here, the geometry is not an artifact to be removed but a meaningful representation of how concepts relate.

A growing body of work demonstrates that models do encode meaningful geometric relationships between features, showing evidence of structural sharing that sparse linear decomposition discards. Models solving modular arithmetic trace helices in representation space (Nanda and Chan, 2025). Spatial reasoning tasks induce geometric embeddings (Gurnee and Tegmark, 2024). The Evo 2 DNA foundation model organizes biological species according to their phylogenetic relationships on a curved manifold (Nguyen et al., 2025b).
Similar geometric organization has been observed across domains and architectures (Park et al., 2025; Anthropic, 2025; Bhaskara et al., 2025; Google, 2025; Conmy et al., 2025; Pan et al., 2025; Nguyen and Mo, 2025). These findings suggest that geometry is not an incidental artifact but a core mechanism by which features coexist and interact. When features are embedded with non-trivial geometry, treating them as independent directions loses information.

Consider a stylized example (Figure 1) of how models might represent game-playing: five features representing "rock," "paper," "scissors," "heads," and "tails," embedded in three dimensions. The rock-paper-scissors features form a triangle in a shared plane; the heads-tails features lie along a separate axis. This is structural interference: the hand-game features share a subspace because their relationships matter, and likewise for the coin features, but the two groups remain orthogonal. Current feature extraction methods return five individual vectors. The geometric structure (which features share a subspace, which are orthogonal, how they are arranged) is lost. We want to recover this global structure: not just features in isolation, but how they organize collectively into sub-geometries that may carry functional meaning. Prakash et al. (2025) show that inducing refusal in Gemma2-2B-Instruct requires ablating 2,538 SAE features, and that "backup" features activate when others are suppressed, suggesting that model functionality lives in collective geometric structure, not individual directions.

Our contributions are as follows:

• We motivate the use of spectral theory (eigenspaces, etc.) to study interference by developing an intuition through our game-playing example: interference only occurs when features share subspaces of activation space (the eigenspaces).
• We develop a spectral theory of superposition, providing generalizable tools that can be applied in both toy and realistic settings to study feature geometry.

• We apply these tools to the canonical toy model of superposition (Elhage et al., 2022), fully characterizing the geometry of features that form in this setting.

2 Motivating Spectral Methods

In this section, we motivate the study of feature geometry using spectral theory by looking more closely at our games-of-chance example (Figure 2). There are three notable features of this example:

1. The geometric structure is permutation-invariant. Permuting feature vectors within clusters does not change the global structure and interaction.

2. The HT and RPS subspaces are independent. No coin toss would provide a better winning chance at RPS and vice versa, and thus the concepts are encoded in orthogonal subspaces.

3. The geometric arrangement of features encodes precise symmetries. In RPS: the cyclic rule that rock beats scissors, scissors beat paper, paper beats rock. In HT: the binary opposition of heads or tails.

The classic way of studying interference between features is the Gram matrix $M = W^\top W$, where each entry $M_{ij} = \langle W_i, W_j \rangle = W_i^\top W_j$ is the Euclidean inner product between column vectors. We could try to use this method to determine which vectors interact and which don't.

$$W = \begin{bmatrix} W_1 & \cdots & W_5 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} & -1 & 0 & 0 \\ \tfrac{\sqrt{3}}{2} & -\tfrac{\sqrt{3}}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 \end{bmatrix}$$

Figure 2: The games-of-chance example in more detail, both geometrically and as a matrix. Here, $W_\triangle = [W_1, W_2, W_3]$ represents the feature concepts for "rock", "paper", and "scissors", respectively, arranged as an equilateral triangle in the $xy$-plane. Similarly, $W_D = [W_4, W_5]$ corresponds to "heads" and "tails" and forms an antipodal pair along the $z$-axis.
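As a concrete check of this setup, the following minimal NumPy sketch (our illustration, not code from the paper) builds the Figure 2 matrix $W$, forms the Gram matrix and frame operator, and confirms the block structure and the shared nonzero spectrum:

```python
import numpy as np

s3 = np.sqrt(3) / 2
# Columns: rock, paper, scissors (equilateral triangle in the xy-plane),
# then heads, tails (antipodal pair on the z-axis) -- the W from Figure 2.
W = np.array([
    [0.5,  0.5, -1.0, 0.0,  0.0],
    [s3,  -s3,   0.0, 0.0,  0.0],
    [0.0,  0.0,  0.0, 1.0, -1.0],
])

M = W.T @ W   # Gram matrix (5x5, index space)
F = W @ W.T   # frame operator (3x3, activation space)

# With this indexing, M is block-diagonal: a 3x3 triangle block and a 2x2 digon block.
print(np.round(M, 3))

# The nonzero eigenvalues of M and F coincide: 3/2 (triangle) and 2 (digon).
print(np.round(np.linalg.eigvalsh(M), 6))
print(np.round(np.linalg.eigvalsh(F), 6))
```

Note that $M$ carries two extra zero eigenvalues (five features in three dimensions), while $F$ has none here because the features span all of activation space.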
Given how we have written W (Figure 2), this would work: direct matrix multiplication reveals a block matrix $M = M_\triangle \oplus M_D = (W_\triangle^\top W_\triangle) \oplus (W_D^\top W_D)$. However, this clean block-diagonal structure is an artifact of indexing choice. Only permutations (A.1, via matrices $P_\gamma$) within the local feature clusters $\Omega_\triangle = \{1, 2, 3\}$ and $\Omega_D = \{4, 5\}$ leave the block decomposition intact. Imagine that we rearrange the column vectors or rotate the matrix; this structure would no longer be visually evident in W or in M, but the underlying geometry would remain unchanged. Our goal in this section is to find a way to express the global symmetries of features in W (to capture the geometry) in a way that is basis-invariant. This is precisely what spectral analysis allows us to do.

2.1 Geometric Invariance

At a high level, our argument proceeds as follows: geometric structure induces symmetry constraints; these constraints can be expressed as algebraic relations among matrices; and these relations are fully captured by spectral data. In this subsection, we assume the geometry is known and show how spectral structure encodes it. In Subsection 2.2, we reverse the direction: given only the weight matrix, we recover the geometry from spectra in activation space.

We established that we need to obtain information about interference in a permutation-invariant way. As such, consider arbitrary column labels $\Omega_\triangle = \{i_1, i_2, i_3\}$ and $\Omega_D = \{i_4, i_5\}$ with $W_\triangle = [W_{i_1}, W_{i_2}, W_{i_3}]$ and $W_D = [W_{i_4}, W_{i_5}]$. The group that preserves the equilateral triangle is the dihedral group $D_3 \cong S_3$. We can denote these elements in permutation notation $\sigma = (i\ j\ k)$, where $i \to j \to k \to i$ represents the order in which the sub-indices get switched around (Figure 3).

Figure 3: The elements of $D_3$ in permutation notation.
We can express the group action in terms of adjacency matrices over $\Omega \times \Omega$, each representing the stable orbits (the trajectory of a single index after applying every possible permutation $\sigma \in D_3$). In our context, every vertex is accessible within 1 step. As such, outside of the trivial adjacency matrix $A_0 = I_3$, representing the stationary orbit $O_0 = \mathrm{diag}(\Omega)$, we only have $A_1 = J_3 - I_3$ (Pf. in A.3.1), where $J_3 \in \mathbb{R}^{3 \times 3}$ is the matrix of all ones. We will focus on $\mathcal{A}_\triangle = \mathrm{span}\{A_0, A_1\}$, called the Bose-Mesner algebra.

Why work with this algebra rather than simply diagonalizing $M_\triangle$ directly? The answer is that $\mathcal{A}_\triangle$ captures exactly the matrices that commute with all symmetry operations: it is the centralizer of $D_3$ (Pf. in A.1). Any $D_3$-invariant operator, including $M_\triangle$, must lie in this algebra and hence decompose over its spectral projectors $S_0, S_1$. This is more powerful than generic diagonalization: it tells us not just the eigenvalues, but why they arise from symmetry.

The Bose-Mesner algebra is a central object in the theory of association schemes (Bailey, 2004) and comes equipped with its own spectral theory over strata $\mathbb{R}^{\Omega_\triangle} \cong \mathbb{R}^3 \cong V_0 \oplus V_1$ (Lemma A.5). The corresponding stratum projectors $S_0$ and $S_1$ serve as a more convenient basis of $\mathcal{A}$. There are orthogonality relations connecting them (Lemma A.6):

$$A_i = \sum_{e=0}^{1} C(i, e)\, S_e, \qquad S_e = \sum_{i=0}^{1} D(e, i)\, A_i \qquad (2.1)$$

The $C(i, e)$ entries are the eigenvalues of $A_i$ on $V_e$, forming a character table $C \in \mathbb{R}^{2 \times 2}$. The values of $D(e, i)$ comprise a matrix that is the inverse of C. The same lemma states that $S_0 = |\Omega_\triangle|^{-1} J = J/3$ and provides explicit values for some of the coefficients, such as $C(0,0) = C(0,1) = 1$ and $C(1,0) = a_1 = 2$. Here $a_i$ is the non-trivial orbit's degree, representing the number of neighbors an index can reach under the group action.
By using the latter information along with (2.1), we can explicitly calculate $S_1 = A_0 - S_0 = I_3 - \tfrac{1}{3} J_3$. With both projection matrices we recover $V_0 = \mathrm{span}(\mathbf{1}_3)$ and $V_1 = \mathbf{1}_3^{\perp}$. Lastly, by solving $A_1 = 2 S_0 + C(1,1) S_1$, we complete the character table with $C(1,1) = -1$, and D along with it per Lemma A.6:

$$C = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}, \qquad D = \frac{1}{3} \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$$

Using the characters and the fact that $M_\triangle$ is $D_3$-invariant (Pf. in A.3.1), we can express it over the basis $A_i$ as:

$$M_\triangle = I - \tfrac{1}{2}(J - I) = \tfrac{3}{2} I - \tfrac{1}{2} J = \tfrac{3}{2} S_1$$

Now, let us proceed with the digon, whose group is $\Gamma \cong S_2 \cong C_2$. This is the cyclic group of two elements $\{e, s\}$, namely the identity $e$ and an element $s$ of order 2, meaning that $s^2 = e$. This visually represents swapping the two nodes of the digon (i.e., going from heads-tails to tails-heads). Per the fact that this is a simplex (i.e., one can reach from one node every other node), we have the same number of orbits/associate classes, $O_0 = \mathrm{diag}(\Omega_D)$ and $O_1 = \{(4,5); (5,4)\}$, with adjacency matrices $A_0' = I_2$ and $A_1' = J_2 - I_2$. Similarly, Lemma A.5 for $\mathcal{A}_D = \mathrm{span}(A_0', A_1')$ yields $\mathbb{R}^{\Omega_D} \cong \mathbb{R}^2 = U_0 \oplus U_1$, and by Lemma A.6, $S_0' = \tfrac{1}{2} J_2$ and $S_1' = I - \tfrac{1}{2} J$.

As illustrated below, the spaces $\mathbb{R}^{\Omega_\triangle}$ (left) and $\mathbb{R}^{\Omega_D}$ (right) decompose into a centroid subspace (red arrows) and a difference subspace (the shaded plane/yellow diagonal, corresponding to $S_1$ and $S_1'$). The identities $M_\triangle = \tfrac{3}{2} S_1$ and $M_D = 2 S_1'$ show that, within each symmetric cluster, the Gram operator acts as a projector that annihilates the centroid with eigenvalue 0 and scales the difference subspace by $\lambda_\triangle = \tfrac{3}{2}$ or $\lambda_D = 2$, respectively. This immediately implies an intrinsic rank drop: $\mathrm{rank}(M_\triangle) = 2$ and $\mathrm{rank}(M_D) = 1$.
Figure 4: Association Scheme Strata

However, this description is not visible at the level of Gram entries once feature indices are scrambled. What survives the index permutation is only the spectral content (eigenvalues/eigenspaces up to relabeling), which is why we next pass to an activation-space operator that is itself invariant under relabeling.

2.2 Spectral Bridge

Section 2.1 showed that known geometry is encoded in spectral data. But in practice, we observe only the weight matrix W, not the underlying feature structure. How can we recover geometry without knowing it in advance? The key is to work in activation space rather than index space. The Gram matrix $M = W^\top W$ transforms under index permutations as $M \mapsto P^\top M P$, scrambling its entries. The frame operator $F = W W^\top$, by contrast, is invariant: $(WP)(WP)^\top = W W^\top = F$. This makes F the natural object for basis-invariant geometric analysis.

Up to now we have described the underlying symmetry as spectral modes of the Gram M (index-dependent) in the abstract concept/index space $\mathbb{R}^{\Omega_\triangle} \oplus \mathbb{R}^{\Omega_D}$. However, the geometry that actually determines interference and capacity lives in activation space $\mathbb{R}^3$. Scrambling feature indices corresponds to right-multiplying W by a permutation matrix $P_\gamma$, i.e., $W \to W P_\gamma$ (Pf. in A.1). This conjugates the Gram, $M \to P^\top M P$, obscuring the local block structure in the entries. Thus, if we want a genuinely label-free description of geometry-informed interference, we must pass to an activation-space object that is invariant under such relabelings and preserves the spectral information we uncovered earlier. The canonical such object is the frame operator $F := W W^\top \in \mathbb{R}^{3 \times 3}$, which remains unchanged under index permutations: $(W P_\gamma)(W P_\gamma)^\top = W P_\gamma P_\gamma^\top W^\top = W W^\top = F$.
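The invariance claim is easy to verify numerically. A small NumPy sketch (our illustration, with an arbitrary random weight matrix and a hand-picked relabeling):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))   # any weight matrix: 5 features in 3 dims
perm = [1, 2, 0, 4, 3]            # an arbitrary relabeling of the 5 features
P = np.eye(5)[:, perm]            # permutation matrix, so W @ P = W[:, perm]

M, M_perm = W.T @ W, (W @ P).T @ (W @ P)
F, F_perm = W @ W.T, (W @ P) @ (W @ P).T

print(np.allclose(M_perm, P.T @ M @ P))   # Gram is conjugated: entries scrambled
print(np.allclose(F, F_perm))             # frame operator is exactly invariant
```

Both checks print True: the Gram matrix gets conjugated by the permutation while the frame operator is bit-for-bit unchanged.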
And while gluing back together $M_\triangle$ and $M_D$ in the concept space is non-trivial due to the arbitrary indexing, the process becomes straightforward for the frame operator $F = \sum_{i=1}^{n} W_i W_i^\top$:

$$F = \sum_{i \in \Omega_\triangle} W_i W_i^\top + \sum_{i \in \Omega_D} W_i W_i^\top = F_\triangle + F_D \qquad (2.2)$$

We use the frame operator because of a precise spectral correspondence between M and F. Let $M = \sum_e \lambda_e E_e$ be the Gram's spectral decomposition (B.1). Then, by the Spectral Correspondence Lemma B.3, M and F share the same non-zero eigenvalues $\lambda_e > 0$, and the primitive idempotents $E_e$ of M lift to

$$P_e = \lambda_e^{-1} W E_e W^\top. \qquad (2.3)$$

We can apply this lemma to derive a corollary establishing the precise relationship between the local cluster projectors and the frame's decomposition (Corollary 6): $F = 2 P_D + \tfrac{3}{2} P_\triangle$, where $P_D$ and $P_\triangle$ are projectors onto the subspaces along which the digon and triangle live, respectively. Furthermore, for $C \in \{\triangle, D\}$, we have the following consequence:

$$\frac{1}{\lambda_C} = \frac{\dim(U_C)}{\sum_{i \in \Omega_C} \|W_i\|^2} = \frac{\mathrm{rank}(P_C)}{|\Omega_C|},$$

namely that the fractional dimensionality $D_i = d/p$, denoting the fraction of features that share a dimension, is recoverable from the spectral signature. In the following section, we provide a detailed analysis of the meaning of our result and of how the ideas applied here generalize to an arbitrary weight-matrix setting.

3 Generalizable Tools

In the previous section, we showed that the geometry can be recovered from the frame operator, which lives in the latent space and can encode structured representations. In this section, we focus on a set of tools from the previous section that generalize to arbitrary weight matrices. In particular, we show that the arrangement of features along eigenspaces determines their interaction patterns (given by the spectral bridge), the capacity consumed by a given feature (fractional dimensionality), as well as how localized the feature is.
Spectral Bridge: The spectral bridge introduced in Section 2 generalizes to arbitrary weight matrices W and classifies the intertwining of the input space (via the Gram matrix M) and the activation space (via the frame operator F). We are able to recover the geometry from the spectral behavior of F, classifying it by its invariance with the help of association schemes (Appendix A.1). More specifically, W intertwines the decompositions of M and F. As a result, the nonzero eigenspaces of M and F are in one-to-one correspondence. An equally important consequence of the decomposition of the frame operator into orthogonal subspaces is that feature interactions are confined to shared spectral subspaces: components supported in the same induced eigenspace can interact, components supported in different eigenspaces are non-interacting, and directions orthogonal to $\mathrm{Im}(W)$ are inert, meaning they are annihilated by $W^\top$ and therefore do not participate in feature interactions or capacity usage.

Spectral Measure: While the global Empirical Spectral Density (Martin and Mahoney, 2021a) describes where capacity exists, the collection of spectral measures describes how individual features use that capacity. Two models may therefore exhibit similar heavy-tailed global spectra while possessing very different patterns of feature localization, interference, and effective dimensionality, which are revealed by the spectral measure. The primary object of study here is the per-feature dimensionality, defined in toy models (Elhage et al., 2022) to study interactions between features, encoded as:

$$D_i = \frac{\|W_i\|^4}{\sum_{j=1}^{n} (W_i^\top W_j)^2} \qquad (3.1)$$

It represents the degree of superposition with other features: $D_i \approx 1$ when the feature vector $W_i$ has its own dedicated dimension (i.e., it is approximately orthogonal to $W_j$ for $i \neq j$), and $0 \le D_i \ll 1$ when it co-interacts with many others in superposition.
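Equation 3.1 can be computed from the Gram matrix alone. A hedged NumPy sketch applying it to the Figure 2 example (the helper name `fractional_dimensionality` is ours, not the paper's):

```python
import numpy as np

def fractional_dimensionality(W):
    """Per-feature D_i = ||W_i||^4 / sum_j (W_i . W_j)^2 (Eq. 3.1).

    Columns of W are features; diag(M)^2 gives ||W_i||^4 and
    diag(M @ M) gives sum_j (W_i^T W_j)^2.
    """
    M = W.T @ W
    return np.diag(M) ** 2 / np.diag(M @ M)

s3 = np.sqrt(3) / 2
W = np.array([[0.5, 0.5, -1, 0, 0],
              [s3, -s3, 0, 0, 0],
              [0, 0, 0, 1, -1.0]])

D = fractional_dimensionality(W)
print(np.round(D, 4))     # triangle features: 2/3 each; digon features: 1/2 each
print(round(D.sum(), 6))  # 3.0 -- this hand-built example saturates rank(W) = 3
```

Note that the triangle features each get $D_i = 2/3$ (three features sharing a 2D plane) and the digon features get $1/2$ (two features sharing one axis), matching the $\mathrm{rank}(P_C)/|\Omega_C|$ formula above.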
The current form hides some of the more interesting relations. We can rewrite it as an expression of the Gram M and the frame operator F as follows (Lemmas B.9 and B.10):

$$D_i = \frac{(M_{ii})^2}{(M^2)_{ii}} = \frac{1}{\kappa_i} \|W_i\|^2, \qquad (3.2)$$

where $\kappa_i = \frac{W_i^\top F W_i}{\|W_i\|^2}$ is the Rayleigh quotient of F at $W_i$. Note that $\|W_i\| > 0$ at initialization with probability 1 and, under the capacity saturation we observe, it is stable with respect to rank collapse (after noticing that $\epsilon u$ with $\epsilon \to 0$ preserves the bound B.13). We can view the first, Gram form as a signal-to-noise ratio derived from pairwise correlations, while the second recasts this as a spectral problem.

We now introduce the spectral measure, whose role is to describe how features allocate their mass across eigenmodes. Consider the weight matrix, with associated Gram and frame operators M and F. As shown earlier, all nontrivial interaction structure in activation space is captured by the spectral decomposition (Equation 2.3), where the $P_e(t)$ are mutually orthogonal projectors corresponding to independent spectral modes. To describe how an individual feature participates in this structure, for each feature vector $W_i$ we define

$$p_{i,e} := \frac{\|P_e W_i\|^2}{\|W_i\|^2},$$

which measures the fraction of the feature's squared norm supported in spectral mode e. These coefficients are nonnegative, and Lemma B.11 shows that they sum to one; they therefore form a probability distribution. This motivates the definition of the spectral measure

$$\mu_i(t) := \sum_{e \in \mathcal{E}_+} p_{i,e}(t)\, \delta_{\lambda_e(t)}. \qquad (3.3)$$

The spectral measure provides a compact description of how a feature allocates its mass across interaction modes. Moreover, operator-level quantities can now be interpreted probabilistically. Lemma B.2 shows that all moments of $\mu_i$ are accessible via powers of the frame operator.
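A possible NumPy implementation of the weights $p_{i,e}$ (the function and its eigenvalue-grouping tolerances are our own sketch, not the paper's code):

```python
import numpy as np

def spectral_measure(W, tol=1e-9):
    """For each column W_i of W, return p_{i,e} = ||P_e W_i||^2 / ||W_i||^2
    over the distinct positive eigenvalues of F = W W^T."""
    F = W @ W.T
    lam, U = np.linalg.eigh(F)
    lam, U = lam[lam > tol], U[:, lam > tol]

    # Group numerically equal eigenvalues: one projector per eigenspace.
    levels = []
    for lv in lam:
        if not any(abs(lv - known) < 1e-6 for known in levels):
            levels.append(float(lv))

    rows = []
    for lv in levels:
        cols = np.abs(lam - lv) < 1e-6
        P = U[:, cols] @ U[:, cols].T            # projector onto this eigenspace
        rows.append(np.sum((P @ W) ** 2, axis=0))  # ||P_e W_i||^2 per feature

    p = np.array(rows) / np.sum(W ** 2, axis=0)  # shape (n_eigenspaces, n_features)
    return np.array(levels), p

s3 = np.sqrt(3) / 2
W = np.array([[0.5, 0.5, -1, 0, 0],
              [s3, -s3, 0, 0, 0],
              [0, 0, 0, 1, -1.0]])
levels, p = spectral_measure(W)
print(np.round(levels, 3))  # distinct eigenvalues: 1.5 (triangle) and 2.0 (digon)
print(np.round(p, 3))       # each column is a point mass: fully localized features
```

For this hand-built example every column of `p` is a Dirac mass, i.e. each feature sits entirely inside one eigenspace of F.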
In particular, the Rayleigh quotient admits the representation

$$\kappa_i = \frac{W_i^\top F W_i}{\|W_i\|^2} = \mathbb{E}_{\mu_i}[\lambda].$$

Finally, fractional dimensionality can be written as

$$D_i(t) = \frac{\|W_i(t)\|^2}{\mathbb{E}_{\mu_i(t)}[\lambda]}, \qquad (3.4)$$

making explicit that effective dimensionality is governed by how a feature distributes its mass across the spectrum.

While the ESD provides a global summary of how capacity is distributed across spectral scales, it does not describe how individual features participate in this structure. To address this, our framework introduces a per-feature spectral measure. For each feature vector $W_i$, the coefficients $p_{i,e}$ in Equation 3.3 form a probability distribution over spectral modes and quantify the fraction of the feature's squared norm supported in each eigenspace. This construction allows us to distinguish between two regimes at the level of individual features. A feature is said to be delocalized if its spectral measure $\mu_i$ spreads mass across many eigenvalues, indicating participation in multiple interaction modes. Conversely, a feature is localized if $\mu_i$ concentrates on a small number of eigenvalues, or collapses to a single Dirac mass. In the extreme case $\mu_i = \delta_\lambda$, the feature lies entirely within a single eigenspace of the frame operator.

4 Solving Toy Models of Superposition

In this section, we show how some of the machinery from the spectral framework developed above can be used to exhaustively explain superposition in toy models. We begin by empirically observing that trained models operate near capacity saturation, motivating a regime in which fractional dimensionality exhausts the rank of the weight matrix. Under this assumption, we prove that all features must spectrally localize, collapsing onto individual eigenspaces of the frame operator (Theorem 1).
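The identity in Equation 3.4 can be sanity-checked against the direct formula (3.1) on a generic random matrix; a short NumPy sketch (our own, with arbitrary dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 20))   # 20 features in 8 dims, generic position

# Direct form (Eq. 3.1 / 3.2): D_i from the Gram matrix.
M = W.T @ W
D_direct = np.diag(M) ** 2 / np.diag(M @ M)

# Spectral form (Eq. 3.4): D_i = ||W_i||^2 / E_{mu_i}[lambda],
# where the first moment is the Rayleigh quotient of F at W_i.
F = W @ W.T
norms2 = np.sum(W ** 2, axis=0)
rayleigh = np.einsum('di,de,ei->i', W, F, W) / norms2   # kappa_i = E[lambda]
D_spectral = norms2 / rayleigh

print(np.allclose(D_direct, D_spectral))   # True: both forms agree
```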
Once localized, features no longer trade capacity across eigenspaces: fractional dimensionality becomes linearly determined by feature norm and spectral scale (Corollary 2), and features within each eigenspace organize into tight frames (Theorem 3). This tight-frame structure makes feature geometry identifiable via association schemes (Theorem C), enabling discrete geometric classification. Finally, we connect these static results to training dynamics, showing how gradient flow induces spectral mass transport and eigenvalue drift, and why capacity-saturated tight-frame configurations emerge as stable fixed points. Together, these results provide a complete picture of how superposition resolves into structured, stable geometry under spectral constraints.

The "Toy Models of Superposition" (TMS) paper by Anthropic (Elhage et al., 2022) is one of the seminal technical contributions to the understanding of polysemanticity. The phenomenon is explored by studying synthetic input vectors x that simulate the properties of the underlying features. Each dimension $x_i$ has an associated sparsity $S_i$ (i.e., $\mathbb{P}(x_i = 0) = S_i$ and $x_i \sim \mathrm{Unif}(0, 1)$ otherwise) and importance $I_i$. Define the projection as a linear map $h = W x$, where each column $W_i$ corresponds to the direction in the lower-dimensional space that represents a feature $x_i$. To recover the original vector, we use the transpose $W^\top$, giving the reconstruction $x'$. With the addition of the ReLU nonlinearity, the model becomes $x' = \mathrm{ReLU}(W^\top h + b) = \mathrm{ReLU}(W^\top W x + b)$.

Experimental Setup. To investigate the emergence of superposition in toy models, we conduct a high-throughput parameter sweep of 3,200 experiments, using mean squared error (MSE) weighted by feature importance, $L = \sum_x \sum_i I_i (x_i - x_i')^2$, as in the original paper (Elhage et al., 2022).
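A minimal NumPy sketch of the TMS data model, forward pass, and loss as described above (the specific dimension and sparsity values here are illustrative stand-ins, not the paper's sweep settings):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, S = 1024, 64, 0.9   # features, hidden dims, sparsity (illustrative values)

def sample_batch(batch_size):
    """TMS data model: each x_i is 0 with probability S, else Unif(0, 1)."""
    x = rng.uniform(0.0, 1.0, size=(batch_size, n))
    x[rng.uniform(size=(batch_size, n)) < S] = 0.0
    return x

W = 0.1 * rng.standard_normal((m, n))   # columns W_i are feature directions
b = np.zeros(n)

def forward(x):
    h = x @ W.T                        # h = W x, batched over rows of x
    return np.maximum(h @ W + b, 0.0)  # x' = ReLU(W^T h + b)

def loss(x, importance=1.0):
    """Importance-weighted MSE: L = sum_x sum_i I_i (x_i - x'_i)^2."""
    return float(np.sum(importance * (x - forward(x)) ** 2))

x = sample_batch(32)
print(forward(x).shape, loss(x) >= 0.0)
```

Training this model (e.g. by gradient descent on `loss`) is what produces the weight matrices whose frame-operator spectra are analyzed below; the sketch only fixes the forward computation.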
We fix the input dimension to $n = 1024$ features and linearly sample 32 discrete values for the hidden dimension $m \in [16, 512]$ and 50 discrete values for sparsity $S \in [0.0, 0.99]$ (2 seeds per run). We set uniform importance $I_i = I = 1$ and sparsity $S_i = S$, but discuss in Appendix G our results for non-uniform sparsity.

4.1 Capacity Saturation is Spectral Measure Localization

Our first important empirical observation (E1) is that the toy models utilize approximately all of their hidden-dimension capacity: $\sum_i D_i \approx m$. The following plot shows the sum of individual feature fractional dimensionalities (as defined in 3.2.2) over all 3,200 training runs:

Figure 5: Capacity Saturation across 3,200 runs

This is the most important fact that we use to justify our theoretical analysis and later predictions. It provides a significant reduction in the complexity of training dynamics, namely by forcing all of the individual features to localize to exactly one eigenspace of the frame operator, characterized by the corresponding projector $P_k$:

Theorem 1 (Spectral Localization). Assume the model saturates the fractional dimensionality capacity bound, i.e., $\sum_{i=1}^{n} D_i = \mathrm{rank}(W) = m$. Then, for every feature i, the spectral measure collapses to a single Dirac mass $\mu_i = \delta_{\lambda_k}$, centered at some eigenvalue $\lambda_k > 0$. This is equivalent to $F W_i = \lambda_k W_i$.

Proof. Define the leverage scores $\ell_i := W_i^\top F^+ W_i$, so $\sum_i \ell_i = \mathrm{tr}(W^\top F^+ W) = \mathrm{tr}(F^+ F) = \mathrm{rank}(W)$. We can apply Lemma B.13 to each feature $W_i \in \mathrm{Im}(F)$ for the Cauchy-Schwarz (CS) bound with $a := F^{1/2} W_i$ and $b := F^{+/2} W_i$:

$$\|W_i\|^4 \le (W_i^\top F W_i)(W_i^\top F^+ W_i).$$

After rearrangement, we get the per-feature upper bound on fractional dimensionality $D_i = \|W_i\|^4 / (W_i^\top F W_i) \le \ell_i$. By summing over all features we get the bound $\sum_i D_i \le \sum_i \ell_i = \mathrm{rank}(W)$.
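The leverage-score bound used here, $D_i \le \ell_i$ with $\sum_i \ell_i = \mathrm{rank}(W)$, holds for any weight matrix and can be checked numerically; a NumPy sketch on a generic random matrix (our illustration, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((6, 15))   # m = 6 hidden dims, n = 15 features

# Fractional dimensionalities D_i via the Gram form (Eq. 3.2).
M = W.T @ W
D = np.diag(M) ** 2 / np.diag(M @ M)

# Leverage scores l_i = W_i^T F^+ W_i via the frame operator's pseudoinverse.
F = W @ W.T
F_pinv = np.linalg.pinv(F)
lev = np.einsum('di,de,ei->i', W, F_pinv, W)

print(np.all(D <= lev + 1e-9))                        # per-feature CS bound holds
print(np.isclose(lev.sum(), np.linalg.matrix_rank(F)))  # sum_i l_i = rank(W)
print(round(D.sum(), 3), "<=", round(lev.sum(), 3))
```

For a generic random W the bound is strict (features are not localized); the theorem's content is that saturation, $\sum_i D_i = \mathrm{rank}(W)$, forces equality in every one of these per-feature bounds.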
This makes the interpretation of $\sum_i D_i = \mathrm{rank}(W) = m$ one of saturating all of the fractional dimensionality bounds, i.e. $D_i = \ell_i$ for every $i$. By Lemma B.16 and the finite-atomic nature of $\mu_i$ (formalized in Lemma B.14),

$\kappa_i = \frac{W_i^\top F W_i}{\|W_i\|^2} = \mathbb{E}_{\mu_i}[\lambda] \quad \text{and} \quad \frac{\ell_i}{\|W_i\|^2} = \frac{W_i^\top F^+ W_i}{\|W_i\|^2} = \mathbb{E}_{\mu_i}[\lambda^+].$

Thus $D_i = \|W_i\|^2 / \mathbb{E}_{\mu_i}[\lambda]$. Even though we have now re-interpreted both $D_i$ and $\ell_i$ through the spectral measure, the former Cauchy–Schwarz bound lives in $(\mathbb{R}^m, \langle \cdot, \cdot \rangle)$, whereas the latter lives in $L^2(\mu_i)$ (Lemma B.14), which is feature-dependent. The fundamental mismatch occurs because feature vectors $W_i$ live in $\mathbb{R}^m$, whereas the quantities controlling capacity and interference are spectral averages over eigenvalues of $F$. This is further validation of the need to consider $W$ as an intertwining operator. The way to bridge the two is to define the cyclic subspace $K_{W_i} := \mathrm{span}\{W_i, F W_i, F^2 W_i, \dots\}$, which is the smallest $F$-invariant slice of $\mathbb{R}^m$ that captures all quadratic forms $W_i^\top h(F) W_i$ and the vectors $F^{1/2} W_i, F^{+/2} W_i$. Per the cyclic-space isometry $U_{W_i} : L^2(\mu_i) \to K_{W_i}$ (Lemma B.16), the functions $u(\lambda) = \lambda^{1/2}$ and $v(\lambda) = (\lambda^+)^{1/2}$ correspond exactly to the vectors $F^{1/2} W_i / \|W_i\|$ and $F^{+/2} W_i / \|W_i\|$. Hence, Cauchy–Schwarz in $L^2(\mu_i)$ is literally the Euclidean Cauchy–Schwarz on $K_{W_i} \subseteq \mathbb{R}^m$, making the equality $D_i = \ell_i$ equivalent to $\mathbb{E}_{\mu_i}[\lambda]\,\mathbb{E}_{\mu_i}[\lambda^+] = 1$, since $\langle u, v \rangle_{L^2(\mu_i)} = 1$ (Lemma B.15). Equality holds if and only if $\lambda^{1/2} = c\,(\lambda^+)^{1/2}$ $\mu_i$-a.s., i.e. if $\lambda$ is $\mu_i$-a.s. constant. Therefore, $\mu_i$ collapses to a Dirac mass $\delta_{\lambda(i)}$, meaning $W_i$ lies entirely in a single positive-eigenvalue eigenspace of $F$; equivalently, $F W_i = \lambda(i) W_i$. ∎

This result is in fact quite strong.
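The two quantities at the heart of the proof are easy to compute numerically. The sketch below (our own helper, assuming only the definitions $D_i = \|W_i\|^4/(W_i^\top F W_i)$ and $\ell_i = W_i^\top F^+ W_i$ from above) checks the per-feature bound $D_i \le \ell_i$ and the sum rule $\sum_i \ell_i = \mathrm{rank}(W)$.

```python
import numpy as np

def frac_dims_and_leverages(W):
    """D_i = ||W_i||^4 / (W_i^T F W_i) and l_i = W_i^T F^+ W_i, with F = W W^T."""
    F = W @ W.T
    F_pinv = np.linalg.pinv(F)
    D = np.array([(w @ w) ** 2 / (w @ F @ w) for w in W.T])
    ell = np.array([w @ F_pinv @ w for w in W.T])
    return D, ell
```

For a generic $W$ the bound is strict; it becomes an equality exactly when each column is an eigenvector of $F$, which is the content of Theorem 1.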
It provides us with an immediate handle on the global geometry of features. The first corollary (Pf. in C) is that if Spectral Localization holds, fractional dimensionality scales linearly with the feature norm, with slope equal to the inverse eigenvalue $\lambda_k$:

Corollary 2 (Projective Linearity). Assume Spectral Localization holds. Then the fractional dimensionality $D_i$ of any feature is linearly determined by its squared feature norm $\|W_i\|^2$, with slope equal to the reciprocal of the eigenvalue $\lambda_e$ of the subspace it occupies:

$D_i = \lambda_e^+ \|W_i\|^2.$

Since the model saturates its capacity, and hence its rank (as proved above), with this information we can recover the exact density (and hence fractional dimensionality) of features living in that eigenspace. To investigate the validity of the Projective Linearity corollary across the entire training sweep, we analyzed the geometric stability of feature clusters from 3,200 experimental runs (3,276,800 total features). For each feature $w_i$, we computed its spectral measure weights $p_{i,e} = \|P_e w_i\|^2 / \|w_i\|^2$. We quantified the degree of Spectral Localization for each cluster $C$ by the mean maximum spectral mass of its constituent features, defined as $\mathcal{L}_C = \frac{1}{|C|} \sum_{i \in C} \max_e (p_{i,e})$. We plot this against the linear coefficient of determination $R^2$.
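The spectral weights $p_{i,e}$ and the mean-max localization metric can be computed directly from $W$. The sketch below is our own (the grouping tolerance and helper names are assumptions, not from the paper); it groups numerically equal eigenvalues of $F$ into eigenspaces before projecting.

```python
import numpy as np

def spectral_masses(W, tol=1e-8):
    """p[i, e] = ||P_e w_i||^2 / ||w_i||^2 over distinct positive eigenspaces of F = W W^T."""
    evals, vecs = np.linalg.eigh(W @ W.T)
    keep = evals > tol
    evals, vecs = evals[keep], vecs[:, keep]
    # group numerically equal eigenvalues into a single eigenspace each
    groups = []
    for k, lam in enumerate(evals):
        if groups and np.isclose(lam, groups[-1][0], rtol=1e-6):
            groups[-1][1].append(k)
        else:
            groups.append([lam, [k]])
    p = np.zeros((W.shape[1], len(groups)))
    for e, (_, cols) in enumerate(groups):
        V = vecs[:, cols]  # orthonormal basis of eigenspace e
        p[:, e] = np.sum((V.T @ W) ** 2, axis=0) / np.sum(W ** 2, axis=0)
    return np.array([g[0] for g in groups]), p

def mean_max_mass(p):
    """Localization metric L_C: mean over features of the max spectral mass."""
    return float(p.max(axis=1).mean())
```

For a perfectly localized configuration, e.g. three unit vectors forming a triangle in the plane (where $F = \tfrac{3}{2} I_2$), every feature carries all of its mass in one eigenspace and the metric equals 1.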
The error bars correspond to the absolute error $|k\lambda - 1|$ for each cluster:

Figure 6: Projective Linearity test. The y-axis shows the coefficient of determination $R^2$ of the $\lambda_k^{-1}$ linear fit; error bars give the absolute error $|k\lambda - 1|$; the x-axis shows the mean-max metric for spectral measure localization.

The gradient coloring on the error bars reveals a strong relationship between the eigenvalue-slope fit and feature sparsity: lower sparsity forces features into tighter configurations (and hence eigenspaces) due to stronger interference, whereas high-sparsity features are more diffuse across the spectrum. This observation aligns with our perturbative analysis for small deviations $\sigma_i = 1 - D_i / \ell_i$ from the capacity bound (Appendix D). It also explains the two peaks in the conservation law, which are clearly stratified across sparsity. While slack is minimal across all runs, the slight deviation in the high-sparsity regime results in broad delocalization of a few features. Spectral localization also has a direct consequence for the type of geometry the model learns (Pf. in C):

Theorem 3 (Decomposition into Tight Frames). Assume Spectral Localization holds for all features $i \in \{1, \dots, n\}$. Let $\Lambda = \{\lambda_1, \dots, \lambda_K\}$ be the set of distinct eigenvalues of $F = WW^\top$. Then the feature set partitions into disjoint clusters $C_k = \{i \mid \lambda(i) = \lambda_k\}$ and $F = \bigoplus_{k=1}^{K} \lambda_k P_{V_k}$, where $V_k = \mathrm{span}\{W_i \mid i \in C_k\}$. Furthermore, within each cluster $C_k$, the sub-matrix of weights $W_{C_k}$ forms a tight frame: $\sum_{i \in C_k} W_i W_i^\top = \lambda_k I_k$ on $V_k$, with constant $\lambda_k$.

This result provides a complete description of the geometries learned in TMS (Elhage et al., 2022), including all the simplices, polygons, and the square antiprism. It is important to note that for any shape other than the simplex, the tight-frame characterization is not unique.
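The tight-frame condition of Theorem 3 is directly checkable for any candidate cluster. Below is a small sketch (the helper name `tight_frame_constant` and its SVD-based projector are our own choices): it tests whether $\sum_i v_i v_i^\top$ is a scalar multiple of the projector onto the cluster's span.

```python
import numpy as np

def tight_frame_constant(V, tol=1e-8):
    """If the columns of V form a tight frame on their span, return the frame
    constant lambda with sum_i v_i v_i^T = lambda * P_span; else return None."""
    S = V @ V.T                           # cluster frame operator
    r = np.linalg.matrix_rank(V, tol=tol)
    lam = np.trace(S) / r                 # candidate constant
    U = np.linalg.svd(V)[0][:, :r]        # orthonormal basis of span(V)
    return lam if np.allclose(S, lam * (U @ U.T), atol=1e-6) else None
```

A regular pentagon of unit vectors in the plane is a tight frame with constant $p/d = 5/2$; two generic vectors are not.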
Nonetheless, we can use the Spectral Correspondence (Lemma B.3) to provide a method of classifying the exact geometry by its spectral signature in terms of association schemes (Pf. in C):

Theorem 4 (Spectral Identification of Geometry). Assume Spectral Localization holds, so that per Theorem 3 a feature cluster $C_k$ forms a tight frame. Let $M_k = W_{C_k}^\top W_{C_k}$ be the local Gram matrix. Then the geometry of cluster $C_k$ is an instance of an association scheme $\mathcal{A}$ if and only if the eigenspaces of the Gram matrix $M_k$ coincide with the canonical strata $\{W_0, W_1, \dots, W_s\}$ of the algebra $\mathcal{A}$.

This theorem has a direct corollary for the simplex algebra classification in A.3.1, namely that if the tight frame is made up of $d+1$ features in $d$ dimensions (with frame constant $\frac{d+1}{d} I_d$), then it must be a simplex geometry (Pf. in C):

Corollary 5 (Simplex Identification). The geometry of a cluster of features $C_k$ is a simplex if and only if the corresponding Gram matrix $M_k$ has exactly two eigenspaces, $W_0 = \ker(M_k) = \mathrm{span}(\mathbf{1})$ and $W_1 = \mathrm{Im}(M_k) = \mathbf{1}^\perp$, with $\Lambda(W_0) = 0$ and $\Lambda(W_1) = \lambda_k = \frac{|C_k|}{\dim(V_k)}$.

We provide an analysis of why these features persist during training in 17, hypothesizing that they remain stable under gradient descent.

5 Limitations and Conclusion

Interpretability seeks to understand neural networks in terms of their internal structure. Current methods focus on identifying features: sparse, interpretable directions in activation space. This paper argues that the geometry relating features is equally important, and that this geometry can be recovered using classical mathematical tools. By studying the eigenspaces of weight-induced operators, rather than individual feature vectors, we preserve geometric relationships between features, including shared subspaces, symmetry, and structured interference, all of which are discarded by sparse or linear analyses.
The framework converts questions about pairwise relationships into questions about global structure. Instead of asking "how does feature i relate to feature j," we ask "what is the geometry of the feature space, and where do i and j sit in it?" This is the difference between knowing the pairwise distances between cities and having a map. The map presented by this paper is still limited. For the spectral measure to completely characterize interference, models need to exhibit spectral localization (suggested by HT-SR phenomenology, but not yet directly measured). To recover the geometry fully, we require capacity saturation, which we empirically verify only in toy models. And our toy-model-specific results are particularly clean primarily because the gradient is mediated only through the Gram matrix, since the toy model is an autoencoder. Despite these limitations, weight-induced operators, rather than individual feature vectors, appear to be the appropriate units for analyzing superposition. We take this to be an argument for using operator theory to study why models behave the way they do. Operator theory has seen widespread success in studying complex physical systems. The spectral theorem's role in quantum mechanics, decomposing states into eigenbases of observables, is analogous to our spectral localization result, which decomposes features into eigenspaces of the frame operator. Representation theory's classification of symmetry actions parallels our use of association schemes to discretely classify feature geometries. These are not mere analogies: the mathematical structures are identical, suggesting that interpretability may benefit from the same tools that made complex physical systems tractable. More fundamentally, by looking at how different spaces are intertwined, operator theory is designed to provide maps instead of mere pairwise comparisons.
By analyzing the components of models in operator-theoretic terms, we hope to move from the study of mere features to that of model behavior.

6 Impact statement

This paper proposes a novel theoretical framework for analyzing superposition in neural networks using spectral tools. The primary contribution is methodological: it introduces basis-invariant diagnostics for classifying feature geometry and interference in learned representations. The work is foundational and does not introduce new model capabilities, datasets, or applications. The results are intended to support research in interpretability and representation analysis. While improved understanding of internal representations may indirectly inform downstream methods, this work itself poses minimal risk and has no direct societal impact.

References

M. Adler and N. Shavit (2024). On the complexity of neural computation in superposition. arXiv:2409.15318.
M. Aizenman and S. Molchanov (1993). Localization at large disorder and at extreme energies: an elementary derivation. Communications in Mathematical Physics 157:245-278.
Anthropic (2025). When models manipulate manifolds: the geometry of a counting task. Internal presentation/research note.
M. Atiyah and P. Sutcliffe (2003). Polyhedra in physics, chemistry and geometry. arXiv:math-ph/0303071.
R. A. Bailey (2004). Association Schemes: Designed Experiments, Algebra and Combinatorics. Cambridge University Press.
A. R. Bhaskara, P. Kr. Yadama, T. Pell, and D. J. Sutherland (2025). Projecting assumptions: the duality between sparse autoencoders and concept geometry. arXiv preprint.
S. Chalnev, M. Siu, and A. Conmy (2024). Improving steering vectors by targeting sparse autoencoder features. arXiv:2411.02193.
L. Chan (2024). Superposition is not "just" neuron polysemanticity. AI Alignment Forum.
A. Conmy, T. Pell, P. Kr. Yadama, and D. J. Sutherland (2025). From flat to hierarchical: extracting sparse representations with matching pursuit. arXiv preprint.
M. Dreyer, E. Purelku, J. Vielhaben, W. Samek, and S. Lapuschkin (2024). PURE: turning polysemantic neurons into pure features by identifying relevant circuits. CVPR 2024 Workshops; arXiv:2404.06453.
J. T. Edwards and D. J. Thouless (1972). Numerical studies of localization in disordered systems. Journal of Physics C: Solid State Physics.
N. Elhage, T. Hume, C. Olsson, N. Schiefer, T. Henighan, S. Kravec, Z. Hatfield-Dodds, R. Lasenby, D. Drain, C. Chen, R. Grosse, S. McCandlish, J. Kaplan, D. Amodei, M. Wattenberg, and C. Olah (2022). Toy models of superposition. arXiv:2209.10652.
J. Engels, I. Liao, E. J. Michaud, and M. Tegmark (2024). Not all language model features are one-dimensionally linear. arXiv:2405.14860.
C. Godsil and G. Royle (2001). Algebraic Graph Theory. Springer.
Google (2025). Deep sequence models tend to memorize geometrically. Google Research blog/internal report.
W. Gurnee and M. Tegmark (2024). Language models represent space and time. arXiv:2310.02207.
R. Hesse, J. Fischer, S. Schaub-Meyer, and S. Roth (2025). Disentangling polysemantic channels in convolutional neural networks. CVPR 2025 Workshops; arXiv:2504.12939.
L. Hollard, L. Mohimont, N. Gaveau, and L. Steffenel (2025). Exploring superposition and interference in state-of-the-art low-parameter vision models. arXiv:2507.15798.
D. Hundertmark (2008). Analysis and stochastics of growth processes and interface models.
T. Katō (1980). Perturbation Theory for Linear Operators. Springer-Verlag.
D. Klindt, C. O'Neill, P. Reizinger, H. Maurer, and N. Miolane (2025). From superposition to sparse codes: interpretable representations in neural networks. arXiv:2503.01824.
V. Lecomte, K. Thaman, R. Schaeffer, N. Bashkansky, T. Chow, and S. Koyejo (2023). What causes polysemanticity? An alternative origin story of mixed selectivity from incidental causes. arXiv:2312.03096.
C. Martin and M. Mahoney (2020). Heavy-tailed universality predicts trends in accuracies for very large pre-trained deep neural networks. arXiv:1901.08278.
C. Martin and M. Mahoney (2021a). Implicit self-regularization in deep neural networks: evidence from random matrix theory and implications for learning. Nature Communications.
C. Martin and M. Mahoney (2021b). Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data. Nature Communications.
A. D. Mirlin, Y. V. Fyodorov, F. Dittes, J. Quezada, and T. H. Seligman (1996). Transition from localized to extended eigenstates in the ensemble of power-law random banded matrices. Physical Review E 54, 3221.
N. Nanda and L. Chan (2025). Language models use trigonometry to do addition. Mechanistic interpretability research note.
D. Nguyen, A. Prasad, E. Stengel-Eskin, and M. Bansal (2025a). Multi-attribute steering of language models via targeted intervention. arXiv:2502.12446.
E. Nguyen, M. Poli, A. Thomas, B. Hie, S. Quake, and C. Ré (2025b). Finding the tree of life in Evo 2. Arc Institute/Stanford University.
T. M. Nguyen and R. Mo (2025). Angular steering: behavior control via rotation in activation space. National University of Singapore.
A. Pan, Y. Chen, Q. Chen, X. Fang, W. Fu, and X. Jin (2025). The hidden dimensions of LLM alignment: a multi-dimensional safety analysis. MIT/Safety Research.
K. Park, J. Kim, A. Jiang, and V. Veitch (2025). The geometry of categorical and hierarchical concepts in large language models. University of Chicago.
L. Pertl, H. Xuanyuan, and P. Liò (2025). Superposition in graph neural networks. arXiv:2509.00928.
N. Prakash, Y. W. Jie, A. Abdullah, R. Satapathy, E. Cambria, and R. K. W. Lee (2025). Beyond "I'm sorry, I can't": dissecting large language model refusal. arXiv:2509.09708.
J. B. Raedler, W. Li, A. M. Taliotis, M. Goyal, S. Swaroop, and W. Pan (2025). The necessity for intervention fidelity: unintended side effects when steering LLMs. In ICML 2025 Workshop on Reliable and Responsible Foundation Models.
N. Rimsky, N. Gabrieli, J. Schulz, M. Tong, E. Hubinger, and A. Turner (2024). Steering Llama 2 via contrastive activation addition. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 15504-15522.
V. Siu, N. Crispino, D. Park, N. W. Henry, Z. Wang, Y. Liu, D. Song, and C. Wang (2025). SteeringSafety: a systematic safety evaluation framework of representation steering in LLMs. arXiv:2509.13450.
S. Smale (1998). Mathematical problems for the next century. Mathematical Intelligencer 20.
T. Tao (2012). Topics in Random Matrix Theory. American Mathematical Society.
A. M. Turner, L. Thiergart, G. Leech, D. Udell, U. Mini, and M. MacDiarmid (2024). Activation addition: steering language models without optimization. arXiv:2308.10248.

Appendix A Feature Geometry

A.1 Orbit Algebra Derivation

Let us fix a feature cluster $C$ of size $|C| = p$. We index the set of vertices $V = \{1, \dots, p\}$.
Per the polytope geometry, we can assume the existence of an abstract symmetry group $\Gamma \subseteq S_p$ (Cayley's theorem) that acts transitively on $V$, i.e. $\forall i, j \in V,\ \exists \gamma \in \Gamma$ s.t. $\gamma \cdot i = j$. In simple terms, this means that every vertex can be reached from any other vertex as a starting point. This can be said quite succinctly using the notion of the "orbit" of a given vertex $i$, defined as $\Gamma \cdot i = \{\gamma \cdot i \mid \gamma \in \Gamma\}$: the set of all vertices accessible from the starting point $i$ via elements $\gamma$. Transitivity is then equivalent to the group having a single orbit. To return to the domain of linear algebra, we define $P_\gamma \in \mathbb{R}^{p \times p}$ to be the permutation matrix representation of $\gamma$:

$(P_\gamma)_{ij} = \delta_{i, \gamma(j)} = \begin{cases} 1 & \text{if } i = \gamma(j) \\ 0 & \text{otherwise} \end{cases}$ (A.1)

We have the following neat fact about permutation matrices:

Lemma A.1 (Orthogonality of Permutation Matrices). For every $\gamma \in \Gamma$, $P_\gamma^{-1} = P_\gamma^\top$.

Proof. By definition (A.1), we have $(P_\gamma^\top)_{ij} = (P_\gamma)_{ji} = \delta_{j, \gamma(i)}$, hence:

$(P_\gamma P_\gamma^\top)_{ij} = \sum_{k=1}^p (P_\gamma)_{ik} (P_\gamma^\top)_{kj} = \sum_{k=1}^p \delta_{i, \gamma(k)} \delta_{j, \gamma(k)} = \delta_{ij} = (I_p)_{ij},$

where $I_p \in \mathbb{R}^{p \times p}$ is the identity matrix. The same computation shows $P_\gamma^\top P_\gamma = I_p$. ∎

This leads directly to the consequence that

$P_\Gamma = \{P_\gamma \in \mathbb{R}^{p \times p} : \gamma \in \Gamma\} \subseteq O(p),$

where $O(p)$ is the $p$-dimensional orthogonal group, i.e. the group of $p \times p$ orthogonal matrices. This is the over-arching "parent" symmetry that contains all of the polytopal geometry symmetries:

$P_{D_n} \subset O(2) \subset O(3) \subset \dots \subset O(n-1) \subset O(n).$

This matrix formalism is also what allows us to rigorously define "stability" of cluster configurations using the centralizer (commutant):

$A_\Gamma := \{A \in \mathbb{R}^{p \times p} : P_\gamma A P_\gamma^\top = A,\ \forall \gamma \in \Gamma\}$ (A.2)

This is the set of all $\mathbb{R}^{p \times p}$ matrices that are invariant under conjugation by any permutation $P_\gamma$.
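Lemma A.1 and the centralizer condition can be sanity-checked numerically. The helper below is our own (γ is given as a tuple of images, an assumption about encoding, not the paper's notation).

```python
import numpy as np

def perm_matrix(gamma):
    """(P_gamma)_{ij} = 1 iff i = gamma(j); gamma maps index j to gamma[j]."""
    p = len(gamma)
    P = np.zeros((p, p))
    for j in range(p):
        P[gamma[j], j] = 1.0
    return P
```

For any permutation, $P_\gamma P_\gamma^\top = I$, and conjugation preserves matrices such as $J - I$ whose entries are constant on orbits.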
In fact, more than that, it is an algebra. To see why, we shift attention to the set of ordered pairs $V \times V$. It carries a canonical group action of $\Gamma$, defined as

$\pi : \Gamma \times (V \times V) \to V \times V : \gamma \cdot (i,j) = (\gamma(i), \gamma(j)),$

which naturally partitions the set of all $|V \times V| = p^2$ pairs into disjoint orbits $O_1, \dots, O_R$, where $R + 1 = r(\Gamma)$ is the rank of the permutation group. We define an orbital matrix $A_r$ for every orbit $O_r$ as follows:

$(A_r)_{ij} = \begin{cases} 1 & \text{if } (i,j) \in O_r \\ 0 & \text{otherwise} \end{cases}$ (A.3)

Since our group $\Gamma$ is transitive on $V$, the diagonal $O_0 = \{(i,i)\}$ is canonically an orbital, with the identity matrix $I_p$ as its representation.

Lemma A.2 (Centralizer Membership). A matrix $X$ is in $A_\Gamma$ if and only if its entries are constant on the orbits of $\Gamma$ on $V \times V$.

Proof. ($\Rightarrow$) Let $\gamma \in \Gamma$ and $X \in A_\Gamma$ be arbitrary. By definition (A.2):

$P_\gamma X P_\gamma^\top = X \iff P_\gamma X = X P_\gamma \quad \text{(Lemma A.1)}$

By definition (A.1), for all $i, j$ the LHS is

$\sum_{k=1}^p (P_\gamma)_{ik} X_{kj} = \sum_{k=1}^p \delta_{i, \gamma(k)} X_{kj} = X_{\gamma^{-1}(i), j}$

and similarly the RHS is

$\sum_{k=1}^p X_{ik} (P_\gamma)_{kj} = \sum_{k=1}^p X_{ik} \delta_{k, \gamma(j)} = X_{i, \gamma(j)}.$

Hence $X_{i, \gamma(j)} = X_{\gamma^{-1}(i), j}$. Since this holds for all $i, j$, setting $i = \gamma(k)$ gives

$X_{\gamma^{-1}(\gamma(k)), j} = X_{k, j} = X_{\gamma(k), \gamma(j)}.$

And since $\gamma$ was arbitrary, this completes the forward direction. The opposite direction ($\Leftarrow$) follows by the same computation, comparing arbitrary $i, j$ entries of the LHS and RHS above. ∎

Lemma A.3 (Basis of the Centralizer). $\{A_0, \dots, A_R\}$ is a basis of $A_\Gamma$.

Proof.
The preceding lemma directly implies that $A_r \in A_\Gamma$ for all $r \in \{0, \dots, R\}$, since if $(i,j) \in O_r$, then by definition of the orbit

$\forall \gamma \in \Gamma,\ \gamma \cdot (i,j) = (\gamma(i), \gamma(j)) \in O_r,$

implying that $(A_r)_{i,j} = 1 = (A_r)_{\gamma(i), \gamma(j)}$; similarly, for all $(i,j) \notin O_r$, $(A_r)_{i,j} = 0 = (A_r)_{\gamma(i), \gamma(j)}$. This in fact extends to $\mathrm{span}\{A_0, \dots, A_R\} \subseteq A_\Gamma$, since the orbits are disjoint:

$X = c_0 A_0 + \dots + c_R A_R \iff X_{ij} = c_r = X_{\gamma(i), \gamma(j)},$ (A.4)

where $(i,j) \in O_r$. Substituting an arbitrary $A_r$ for $X$ in (A.4) implies that $\{A_0, \dots, A_R\}$ are linearly independent. Furthermore, since any invariant matrix $X \in A_\Gamma$ must be constant on these orbits, (A.4) and the preceding lemma imply that $X$ can be uniquely written as

$X = \sum_{r=0}^R c_r A_r,$

which completes the other inclusion $A_\Gamma \subseteq \mathrm{span}\{A_0, \dots, A_R\}$, concluding that $\{A_0, \dots, A_R\}$ is indeed a basis. ∎

The last property we will need is the algebraic closure of the centralizer under matrix multiplication:

Lemma A.4 (Algebraic Closure). If $X, Y \in A_\Gamma$, then $XY \in A_\Gamma$.

Proof. Let $X, Y \in A_\Gamma$. By the membership definition, $P_\gamma X = X P_\gamma$ and $P_\gamma Y = Y P_\gamma$ for every $\gamma$. Multiplying $XY$ by $P_\gamma$:

$P_\gamma (XY) = (P_\gamma X) Y = (X P_\gamma) Y = X (P_\gamma Y) = X (Y P_\gamma) = (XY) P_\gamma.$

Hence $XY \in A_\Gamma$. ∎

We can apply the latter two lemmas to the product of any two orbital matrices,

$(A_r A_s)_{ij} = \sum_k (A_r)_{ik} (A_s)_{kj},$

the individual entries of which count the number of "paths" of length 2 from $i$ to $j$ whose first step is of orbital type $r$ and whose second step is of orbital type $s$:

$A_r A_s \in \mathrm{span}\{A_0, \dots, A_R\} \ \Rightarrow\ A_r A_s = \sum_{u=0}^R c_{rs}^u A_u$ (A.5)

$c_{rs}^u = |\{k \in V \mid (i,k) \in O_r \text{ and } (k,j) \in O_s\}|$ (A.6)

for any $(i,j) \in O_u$.
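For the simplest transitive case, $\Gamma = S_p$ with the two orbitals $A_0 = I$ and $A_1 = J - I$, the closure (A.5) and the intersection numbers (A.6) can be checked numerically. This is our own illustrative script, under the assumption that $p = 5$ is representative.

```python
import numpy as np

p = 5
I, J = np.eye(p), np.ones((p, p))
A0, A1 = I, J - I   # diagonal and off-diagonal orbital matrices for S_p

# closure (A.5): products stay in span{A0, A1}, with intersection numbers (A.6)
assert np.allclose(A0 @ A1, A1)                           # c_01^1 = 1
assert np.allclose(A1 @ A1, (p - 1) * A0 + (p - 2) * A1)  # c_11^0 = p-1, c_11^1 = p-2

# invariance under an arbitrary permutation (Lemma A.2)
perm = np.random.default_rng(0).permutation(p)
P = np.eye(p)[perm]
assert np.allclose(P @ A1 @ P.T, A1)
```

The same identities reappear in the simplex computation of A.3.1 below.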
In fact, notice that if we choose $(i,j) \in O_u$, then per the invariance of Lemma A.4 this count must be independent of the specific choice of $(i,j)$ within that orbit. These structure constants are known as the intersection numbers of the association scheme $\{A_0, \dots, A_R\}$ [Bailey, 2004], as the structure is known in algebraic graph theory [Godsil and Royle, 2001]. To explain the "sticky points" on the scatter plot described in Toy Models of Superposition [Elhage et al., 2022], we need to show that once a matrix enters such a regime, it becomes "stable" with respect to the group symmetry of the given polygon.

A.2 Association Schemes

Every association scheme $\{A_0, \dots, A_s\}$ on a set $\Omega$ with $s$ associate classes generates a Bose–Mesner algebra:

$\mathcal{A} = \left\{ \sum_{i=0}^s \mu_i A_i : \mu_0, \dots, \mu_s \in \mathbb{R} \right\}$

We have the following neat spectral theory, stated as Theorem 2.6 in [Bailey, 2004]:

Lemma A.5. Let $\{A_0, A_1, \dots, A_s\}$ be the adjacency matrices of an association scheme on $\Omega$ and let $\mathcal{A}$ be its Bose–Mesner algebra. Then $\mathbb{R}^\Omega$ has $s+1$ orthogonal subspaces $W_0, W_1, \dots, W_s$ (strata) with orthogonal projectors $S_0, S_1, \dots, S_s$, such that:

(i) $\mathbb{R}^\Omega = W_0 \oplus W_1 \oplus \dots \oplus W_s$;
(ii) each of $W_0, W_1, \dots, W_s$ is a sub-eigenspace of every matrix in $\mathcal{A}$;
(iii) for $i = 0, 1, \dots, s$, the adjacency matrix $A_i$ is a linear combination of $S_0, S_1, \dots, S_s$ (note that $S_0 = \frac{1}{|\Omega|} J$);
(iv) for $e = 0, 1, \dots, s$, the stratum projector $S_e$ is a linear combination of $A_0, A_1, \dots, A_s$.

For $i \in \mathcal{K}$ (the index set of the association scheme) and $e \in \mathcal{E}$ (the index set of the stratum projectors), we define $C_{(i,e)}$ to be the eigenvalue of $A_i$ on $W_e$ and $D_{(e,i)}$ to be the coefficients in the following expansions:

$A_i = \sum_{e \in \mathcal{E}} C_{(i,e)} S_e, \qquad S_e = \sum_{i \in \mathcal{K}} D_{(e,i)} A_i$

The following facts hold (Lemma 2.9, Theorem 2.12, Corollary 2.14 and Corollary 2.15 from [Bailey, 2004]):

Lemma A.6.
The matrices $C \in \mathbb{R}^{\mathcal{K} \times \mathcal{E}}$ and $D \in \mathbb{R}^{\mathcal{E} \times \mathcal{K}}$ satisfy:

(i) $C$ and $D$ are mutual inverses, and

$C_{(0,e)} = 1, \quad C_{(i,0)} = a_i, \quad D_{(0,i)} = \frac{1}{|\Omega|}, \quad D_{(e,0)} = \frac{d_e}{|\Omega|},$

where $a_i = p_{ii}^0$ is the valency of the $i$-th associate class and $d_e = \mathrm{tr}(S_e)$;

(ii) $\sum_e C_{(i,e)} C_{(j,e)} d_e = \delta_{ij}\, a_i |\Omega|$, where $\delta_{ij}$ is the Kronecker delta, i.e. $\delta_{ij} = 1$ if $i = j$ and $0$ otherwise;

(iii) $\sum_{i \in \mathcal{K}} \frac{1}{a_i} C_{(i,e)} C_{(i,f)} = \delta_{ef}\, \frac{|\Omega|}{d_e}$;

(iv) $\sum_{i \in \mathcal{K}} D_{(e,i)} D_{(f,i)} a_i = \delta_{ef}\, \frac{d_e}{|\Omega|}$;

(v) $\sum_e \frac{1}{d_e} D_{(e,i)} D_{(e,j)} = \frac{\delta_{ij}}{|\Omega|\, a_i}$.

A.3 Algebra Classification

A.3.1 Simplex algebra

A defining property of any simplex is its two-valued inner product: for unit vectors $\{x_i\}_{i \in V} \subset \mathbb{R}^{p-1}$, we have $X_{ii} = \langle x_i, x_i \rangle = 1$ for all $i$ and $X_{ij} = \langle x_i, x_j \rangle = \frac{1}{1-p}$ for $i \neq j$. We choose $\Gamma = S_p$ as the group. By the definition of $X$, its entry values depend only on whether its two indices are equal. And since $\gamma$ is a bijection, $\gamma^{-1}(i) = \gamma^{-1}(j) \iff i = j$, hence per the centralizer definition:

$(P_\gamma X P_\gamma^\top)_{ij} = X_{\gamma^{-1}(i), \gamma^{-1}(j)} = \begin{cases} 1 & i = j \\ \frac{1}{1-p} & i \neq j \end{cases} = X_{ij}$

The group action on $V \times V$ splits into two orbitals: outside the canonical diagonal $O_0$, define the other orbital as

$O_{\mathrm{off}} = \{(i,j) \in V \times V : i \neq j\}.$

By the injectivity of $\gamma$, $i \neq j$ implies $\gamma(i) \neq \gamma(j)$, hence $\gamma \cdot (i,j) \in O_{\mathrm{off}}$. Now take any $(i,j)$ and $(k,\ell)$ in $O_{\mathrm{off}}$, so $i \neq j$ and $k \neq \ell$. We can explicitly construct $\gamma \in S_p$ with $\gamma(i) = k$ and $\gamma(j) = \ell$: define $\gamma$ on $\{i,j\}$ by $\gamma(i) = k$, $\gamma(j) = \ell$. Because $k \neq \ell$, this is injective on $\{i,j\}$. Extend $\gamma$ to a bijection on all of $V$ by mapping the remaining $p-2$ elements of $V \setminus \{i,j\}$ bijectively onto $V \setminus \{k,\ell\}$ (possible because these sets have the same finite cardinality). This extension is a permutation $\gamma \in S_p$.
Then $\gamma \cdot (i,j) = (k,\ell)$. Thus, there are exactly two orbitals, i.e. $r(\Gamma) = 2$ and $R = 1$. Per Lemma A.3, every $\Gamma$-invariant matrix has the form

$A = aI + c(J - I),$

where $J$ is the all-ones matrix. With $A_0 = I$ and $A_1 = J - I$, we can also compute the intersection numbers:

$A_0 A_0 = A_0, \quad A_0 A_1 = A_1, \quad A_1 A_0 = A_1$

$A_1^2 = (J - I)^2 = J^2 - 2J + I = pJ - 2J + I = (p-2)J + I = (p-2) A_1 + (p-1) A_0$

Hence, the only non-zero intersection numbers are:

$c_{00}^0 = 1, \quad c_{01}^1 = c_{10}^1 = 1, \quad c_{11}^0 = p-1, \quad c_{11}^1 = p-2$

Using A.6(i) to get $C_{(0,0)} = 1$, $C_{(0,1)} = 1$, $C_{(1,0)} = a_1 = c_{11}^0 = p - 1$, we can explicitly calculate the projectors:

$A_0 = S_0 + S_1, \qquad A_1 = (p-1) S_0 + C_{(1,1)} S_1,$

where, since $S_0 = \frac{1}{p} J$ [A.5(iii)], we have $S_1 = I - \frac{1}{p} J$. Hence:

$J - I = \frac{p-1}{p} J + C_{(1,1)} \left[ I - \frac{1}{p} J \right]$

$\frac{C_{(1,1)} + 1}{p} J = (1 + C_{(1,1)}) I,$

meaning that $C_{(1,1)} = -1$. Using B.3, we can transfer the primitive idempotents we found to the feature latent space $\mathbb{R}^m$:

$W S_0 W^\top = \frac{1}{p} W J W^\top = \frac{1}{p} W \mathbf{1} \mathbf{1}^\top W^\top = \frac{1}{p} (W\mathbf{1})(W\mathbf{1})^\top = \frac{1}{p} \left( \sum_{i=1}^p W_i \right) \left( \sum_{i=1}^p W_i \right)^{\top}$

Per the definition of the simplex Gram matrix, we have

$\langle W_i, W_i \rangle = 1, \qquad \langle W_i, W_j \rangle = \frac{1}{1-p} = -\frac{1}{p-1}.$

Then, for arbitrary $k$:

$\left\langle W_k, \sum_{i=1}^p W_i \right\rangle = \sum_{i=1}^p \langle W_k, W_i \rangle = \langle W_k, W_k \rangle + \sum_{i \neq k} \langle W_k, W_i \rangle = 1 + (p-1) \cdot \frac{-1}{p-1} = 0$

This means in fact that

$\left\langle \sum_{k=1}^p W_k, \sum_{i=1}^p W_i \right\rangle = \sum_{k=1}^p \left\langle W_k, \sum_{i=1}^p W_i \right\rangle = 0 \iff W S_0 W^\top = 0$

From the same definition,

$M = I - \frac{1}{p-1}(J - I) = \frac{p}{p-1} I - \frac{1}{p-1} J = \frac{p}{p-1} \left( I - \frac{1}{p} J \right) = \frac{p}{p-1} S_1$

Lemma A.7 (Digon). Consider the setting from Section 2. We can derive the exact association scheme of the digon in the same way we did for the triangle (and, in fact, the simplex above).

Proof.
Here, $\Gamma \cong S_2 \cong C_2$ is the cyclic group of two elements $\{e, s\}$, namely the identity $e$ and an element $s$ of order 2, meaning that $s^2 = e$. This visually represents swapping the two nodes of the digon (i.e. going from heads-tails to tails-heads). Per the fact that this is a simplex (i.e. one can reach every node from any other), we have the same number of orbits/associate classes, $O_0 = \mathrm{diag}(\Omega_D)$ and $O_1 = \{(4,5), (5,4)\}$, with adjacency matrices $A_0' = I_2$ and $A_1' = J_2 - I_2$. Similarly, Lemma A.5 for $\mathcal{A}_D = \mathrm{span}(A_0', A_1')$ yields $\mathbb{R}^{\Omega_D} \cong \mathbb{R}^2 = U_0 \oplus U_1$, and by Lemma A.6, $S_0' = \frac{1}{2} J$, $S_1' = I - \frac{1}{2} J$, and:

$C' = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad D' = \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$

$M_D = I - (J - I) = 2I - J = 2 S_1'$ ∎

A.4 Tight Frame

Lemma A.8. Let $w_1, \dots, w_p \in \mathbb{R}^d$ be unit vectors forming a tight frame:

$\sum_{j=1}^p w_j w_j^\top = \frac{p}{d} I_d$

Let $P_{ij} = \langle w_i, w_j \rangle$ be their $p \times p$ Gram matrix. Then, for every $i$:

$(P^2)_{ii} = \frac{p}{d} \ \Rightarrow\ D_i = \frac{1}{(P^2)_{ii}} = \frac{d}{p}$

Proof. $(P^2)_{ii} = \sum_j (w_i^\top w_j)^2 = w_i^\top \left( \sum_j w_j w_j^\top \right) w_i = w_i^\top \frac{p}{d} I w_i = \frac{p}{d}$, and $P_{ii} = 1$ since the $w_i$ are unit vectors. ∎

The latter result is what allows us to quantify the observed cluster values by the inverse fractional dimensions. More specifically, notice that per Lemma A.8 we can categorize the geometries observed in toy models of superposition [Elhage et al., 2022], expressing their corresponding fractional dimensionalities using the tight frame parameters $(p, d)$:

• $(2,1)$ - Digon with $D_i = 1/2$
• $(3,2)$ - Triangle with $D_i = 2/3$
• $(4,3)$ - Tetrahedron with $D_i = 3/4$
• $(5,2)$ - Pentagon with $D_i = 2/5$
• $(8,3)$ - Square antiprism with $D_i = 3/8$

However, it is important to note that this alone is not sufficient to distinguish between the different types of symmetries.
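Lemma A.8 can be checked directly on regular polygons, the $(p, 2)$ entries of the table above. This is our own sketch; `frac_dims_from_gram` just rewrites $D_i = \|w_i\|^4 / (w_i^\top F w_i)$ in Gram-matrix terms.

```python
import numpy as np

def frac_dims_from_gram(P):
    """Gram-side fractional dimensionality: D_i = P_ii^2 / (P^2)_ii."""
    return np.diag(P) ** 2 / np.diag(P @ P)

def polygon(p):
    """p unit vectors equally spaced on the circle (a (p, 2) tight frame)."""
    ang = 2 * np.pi * np.arange(p) / p
    return np.stack([np.cos(ang), np.sin(ang)])

for p, want in [(3, 2 / 3), (5, 2 / 5)]:
    W = polygon(p)
    assert np.allclose(frac_dims_from_gram(W.T @ W), want)  # D_i = d / p
```

The triangle and pentagon reproduce $D_i = 2/3$ and $D_i = 2/5$ exactly, as the lemma predicts.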
For example, the square antiprism and cube both have (p,d)=(8,3)(p,d)=(8,3), but the former is the one that remains stable as is observed both in the original paper and the more general Thomson problem [Atiyah and Sutcliffe, 2003]. Appendix B Spectral Theory Lemma B.1 (Spectral decomposition of the Gram operator [Katō, 1980], Thm. 2.10). Let W∈ℝm×nW ^m× n and let M:=W⊤​W∈ℝn×nM:=W W ^n× n be the Gram matrix. Then M is symmetric positive semidefinite and admits an orthogonal spectral resolution M=∑e∈ℰλe​Ee,M\;=\; _e _e\,E_e, where λe∈ℝ≥0 _e _≥ 0 are the distinct eigenvalues and Ee∈ℝn×nE_e ^n× n are the corresponding orthogonal spectral projectors. Concretely, Ee⊤=Ee,Ee2=Ee,Ee​Ef=0​(e≠f),∑e∈ℰEe=In,E_e =E_e, E_e^2=E_e, E_eE_f=0\ (e≠ f), _e E_e=I_n, and M​Ee=Ee​M=λe​Ee(∀e∈ℰ).ME_e=E_eM= _eE_e (∀ e ). If 0 is an eigenvalue, we denote it by λ0=0 _0=0 and E0E_0 is the orthogonal projector onto ker⁡(M) (M). Lemma B.2 (Frame and Gram Kernels). The following are direct consequences of the orthogonal decomposition of the image and the kernel: (i) ker⁡(W​W⊤)=ker⁡(W⊤) (W )= (W ) (i) ker⁡(W⊤​W)=ker⁡(W) (W W)= (W) Proof. (⇒)​ker⁡(W⊤)⊆ker⁡(W​W⊤)( ) (W ) (W ). Let y∈ker⁡(W⊤)y∈ (W ) be arbitrary. Then, W⊤​y=0⇒W​W⊤​y=W​0=0⇒y∈ker⁡(W​W⊤)W y=0 W y=W0=0 y∈ (W ) (⇐)( ) Let y∈ker⁡(W​W⊤)y∈ (W ) be arbitrary. Then W​W⊤​y=0W y=0. Since the RHS is 0, we can multiply the left by y⊤y , s..t, we get: y⊤W⊤y=y⊤0=0⇒(W⊤y)⊤W⊤x=||W⊤y∥=0,y W y=y 0=0 (W y) W x=||W y\|=0, which by the definition of the norm yields W⊤​y=0W y=0, i.e. y∈ker⁡(W⊤)y∈ (W ). With this we have completed the equality ker⁡(W⊤)=ker⁡(W​W⊤) (W )= (W ) from both sides. The proof for (i​i)(i) is identical after switching W⊤W and W around. ∎ Lemma B.3 (Spectral Correspondence). Let M=WT​W=∑eλe​EeM=W^TW= _e _eE_e be the spectral decomposition of the Gram matrix. Define Pe:=λe+​W​Ee​W⊤∈ℝm×mP_e:= _e^+WE_eW ^m× m for each e=e:λe>0e=\e: _e>0\. Then: (i) PeP_e symmetric Pe⊤=PeP_e =P_e and idempotent Pe2=PeP_e^2=P_e (i.e. 
an orthogonal projector.
(ii) $P_e P_f = 0$ for $e \neq f$.
(iii) The frame operator decomposes as $F = WW^\top = \sum_e \lambda_e P_e$.
(iv) $\mathrm{Im}(P_e) = W(\mathrm{Im}(E_e))$.

Proof. (i) Symmetry is immediate from the symmetry of $E_e$:
$$P_e^\top = \lambda_e^+ (W E_e W^\top)^\top = \lambda_e^+ W E_e^\top W^\top = \lambda_e^+ W E_e W^\top = P_e.$$
Using idempotence of $E_e$, $W^\top W = M$, and $M E_e = \lambda_e E_e$, we get:
$$P_e^2 = (\lambda_e^+ W E_e W^\top)(\lambda_e^+ W E_e W^\top) = (\lambda_e^+)^2 W E_e M E_e W^\top = \lambda_e^+ W E_e W^\top = P_e.$$
As such, since $P_e$ is symmetric and idempotent, it is an orthogonal projector onto $\mathrm{Im}(P_e)$ [Katō, 1980].
(ii) Computing:
$$P_e P_f = (\lambda_e^+ W E_e W^\top)(\lambda_f^+ W E_f W^\top) = \lambda_e^+ \lambda_f^+ W E_e M E_f W^\top = \lambda_e^+ \lambda_f^+ W E_e \Big( \sum_g \lambda_g E_g \Big) E_f W^\top = \lambda_e^+ W E_e E_f W^\top = 0,$$
since $E_g E_f = 0$ unless $g = f$, and $E_e E_f = 0$ for $e \neq f$. Since each $P_e$ is also symmetric, this implies $\mathrm{Im}(P_e) \perp \mathrm{Im}(P_f)$ for $e \neq f$.
(iii) Rewrite, for $\lambda_e > 0$, $\lambda_e P_e = W E_e W^\top$, and observe that $W E_0 = 0$ by B.2(ii):
$$\sum_e \lambda_e P_e = \sum_e W E_e W^\top = W \Big( \sum_e E_e \Big) W^\top = W (I_n - E_0) W^\top = WW^\top - W E_0 W^\top = WW^\top.$$
(iv) ($\subseteq$) Let $y \in \mathbb{R}^m$ be arbitrary. Then $P_e y = \lambda_e^+ W E_e W^\top y$. Let $u := E_e W^\top y$. Then $u \in \mathrm{Im}(E_e)$ and
$$P_e y = \lambda_e^+ W u \in W(\mathrm{Im}(E_e)),$$
hence $\mathrm{Im}(P_e) \subseteq W(\mathrm{Im}(E_e))$, since $y$ was arbitrary.
($\supseteq$) Let $u \in \mathrm{Im}(E_e)$ be arbitrary. Then $E_e u = u$ and $M u = \sum_f \lambda_f E_f u = \lambda_e u$. Evaluating $P_e$ on $W u$, we have:
$$P_e (W u) = \lambda_e^+ W E_e W^\top W u = \lambda_e^+ W E_e M u = \lambda_e^+ W E_e (\lambda_e u) = W (E_e u) = W u.$$
With this we complete $W(\mathrm{Im}(E_e)) \subseteq \mathrm{Im}(P_e)$, and hence the equality $\mathrm{Im}(P_e) = W(\mathrm{Im}(E_e))$. ∎

There is an immediate corollary of B.3(iv), namely $\dim \mathrm{Im}(P_e) = \dim \mathrm{Im}(E_e)$.
The proof is simply the observation that $W$ is injective on $\mathrm{Im}(E_e)$: if $u \in \mathrm{Im}(E_e)$ and $W u = 0$, then $\lambda_e u = M u = W^\top W u = 0$ with $\lambda_e > 0$, so $u = 0$.

Lemma B.4 (Spectral intertwining). Define $P_e := \lambda_e^+ W E_e W^\top \in \mathbb{R}^{m \times m}$ for each $e \in \mathcal{E}_+ = \{e : \lambda_e > 0\}$, where $M = W^\top W = \sum_e \lambda_e E_e$ is the spectral decomposition of the Gram operator. Then:
(i) $P_e W = W E_e$ and $W^\top P_e = E_e W^\top$.
(ii) $F P_e = P_e F = \lambda_e P_e$.
(iii) $\mathbb{R}^m = \big( \bigoplus_e \mathrm{Im}(P_e) \big) \oplus \ker(W^\top)$.

Proof. (i)
$$P_e W = \lambda_e^+ W E_e W^\top W = \lambda_e^+ W E_e M = \lambda_e^+ W (\lambda_e E_e) = W E_e,$$
$$W^\top P_e = \lambda_e^+ W^\top W E_e W^\top = \lambda_e^+ M E_e W^\top = E_e W^\top.$$
(ii)
$$F P_e = WW^\top \lambda_e^+ W E_e W^\top = \lambda_e^+ W (W^\top W) E_e W^\top = \lambda_e^+ W M E_e W^\top = \lambda_e^+ W (\lambda_e E_e) W^\top = \lambda_e P_e,$$
$$P_e F = \lambda_e^+ W E_e W^\top WW^\top = \lambda_e^+ W E_e M W^\top = \lambda_e P_e.$$
(iii) Let $M^+ = \sum_e \lambda_e^+ E_e$ be the Moore–Penrose pseudoinverse of $M$. Then
$$\sum_e P_e = \sum_e \lambda_e^+ W E_e W^\top = W M^+ W^\top,$$
and $W M^+ W^\top$ is the orthogonal projector onto $\mathrm{Im}(W) = \ker(W^\top)^\perp$ by B.2. ∎

Corollary 6 (Triangle–digon case). Let the setting be as in Section 2. Then
$$F = 2 P_D + \tfrac{3}{2} P_\triangle,$$
where $P_D$ and $P_\triangle$ are the projectors onto the subspaces along which the digon and the triangle live, respectively.

Proof. Define $U_\triangle := \mathrm{span}\{W_i : i \in \Omega_\triangle\} \subset \mathbb{R}^3$ and $U_D := \mathrm{span}\{W_i : i \in \Omega_D\} \subset \mathbb{R}^3$. To tie this back to the spectral descriptions $M_\triangle$ and $M_D$ we found (which are equivalent, so we can generalize to an arbitrary cluster $C$), define the injection operator from a cluster $C \in \{\triangle, D\}$ into the total concept space, $E_C : \mathbb{R}^{\Omega_C} \to \mathbb{R}^{\Omega}$, where $(E_C x)_i = \sum_{j \in \Omega_C} x_j \delta_{ij}$ (Fig B).
Figure 7: $W$ is an intertwining operator of spectral modes.

Then, if $W_C := W E_C \in \mathbb{R}^{3 \times |\Omega_C|}$, we have $M_C = W_C^\top W_C$ and $F_C = W_C W_C^\top = \sum_{i \in \Omega_C} W_i W_i^\top$, since summation over a set is permutation-stable. Furthermore, our spectral results transfer directly: $M_C = \lambda_C S_1^{(C)}$, with $S_0^{(C)} = \frac{1}{|\Omega_C|} \mathbb{1}\mathbb{1}^\top$ and $S_1^{(C)} = I - S_0^{(C)}$. Since the centroid is sent to the kernel, $\mathrm{Im}(S_0^{(C)}) \subseteq \ker(M_C)$, so $M_C \mathbb{1} = \lambda_C S_1^{(C)} \mathbb{1} = 0$. There is a direct consequence for the injected matrices: multiplying by $\mathbb{1}^\top$ on the left yields
$$0 = \mathbb{1}^\top M_C \mathbb{1} = \mathbb{1}^\top W_C^\top W_C \mathbb{1} = \|W_C \mathbb{1}\|^2 = \Big\| \sum_{i \in \Omega_C} W_i \Big\|^2.$$
If $M_C x = \lambda_C x$ with $\lambda_C > 0$, then
$$F_C W_C x = W_C M_C x = \lambda_C (W_C x).$$
So $W_C$ maps the $\lambda_C$-eigenspace of $M_C$ into the $\lambda_C$-eigenspace of $F_C$. Now, since $M_C = \lambda_C S_1^{(C)}$, the $\lambda_C$-eigenspace in concept space is $\mathrm{Im}(S_1^{(C)})$ (the "difference" subspace), and the $0$-eigenspace is $\mathrm{Im}(S_0^{(C)}) = \mathrm{span}(\mathbb{1})$, which we just showed is killed by $W_C$. Therefore the only nontrivial activation geometry produced by the cluster is
$$U_C := W_C \big( \mathrm{Im}(S_1^{(C)}) \big) = \mathrm{Im}(W_C) \subseteq \mathbb{R}^3,$$
and $F_C$ acts as $\lambda_C$ on $U_C$ and as $0$ on $U_C^\perp$. Equivalently, $F_C = \lambda_C P_C$, where $P_C$ is the orthogonal projector onto $U_C$. Lastly, define
$$P_C := \lambda_C^{-1} W_C S_1^{(C)} W_C^\top.$$
Then $P_C$ is a projector (idempotent and symmetric) by a direct computation using $M_C S_1^{(C)} = \lambda_C S_1^{(C)}$:
$$P_C^2 = \lambda_C^{-2} W_C S_1 W_C^\top W_C S_1 W_C^\top = \lambda_C^{-2} W_C S_1 M_C S_1 W_C^\top = \lambda_C^{-1} W_C S_1 W_C^\top = P_C.$$
Then clearly
$$F_C = W_C W_C^\top = W_C (S_0 + S_1) W_C^\top = W_C S_1 W_C^\top = \lambda_C P_C,$$
because $W_C S_0 = 0$ from centroid cancellation. With this, we have recovered (2.2).
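The decomposition $F = 2 P_D + \tfrac{3}{2} P_\triangle$ can be checked numerically. A minimal sketch, assuming a hypothetical triangle-plus-digon configuration (three unit vectors at $120°$ in the $x$-$y$ plane plus two antipodal unit vectors along $z$; the specific $W$ of Section 2 may differ by a rotation):

```python
import numpy as np

# Hypothetical configuration matching Corollary 6's cluster structure:
# a triangle cluster spanning the x-y plane and a digon cluster on the z-axis.
tri = np.array([[np.cos(a), np.sin(a), 0.0] for a in 2 * np.pi * np.arange(3) / 3]).T
dig = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]).T
W = np.hstack([tri, dig])                       # shape (3, 5)

F = W @ W.T                                     # frame operator
P_tri = np.diag([1.0, 1.0, 0.0])                # projector onto the triangle plane
P_dig = np.diag([0.0, 0.0, 1.0])                # projector onto the digon axis
assert np.allclose(F, 1.5 * P_tri + 2.0 * P_dig)

# Each cluster is a tight frame for its subspace: W_C W_C^T = lambda_C P_C,
# with lambda_C = p/d (3/2 for the triangle, 2 for the digon).
assert np.allclose(tri @ tri.T, 1.5 * P_tri)
assert np.allclose(dig @ dig.T, 2.0 * P_dig)
```

Each cluster's centroid also cancels ($\sum_{i \in \Omega_C} W_i = 0$), matching the kernel computation above.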
Now we can use all of the spectral information to give both the global and local geometric classification that would otherwise have been inaccessible from an arbitrarily scrambled Gram matrix. Since $F_C = \lambda_C P_C$, we have:
$$\mathrm{tr}(F_C) = \lambda_C \, \mathrm{tr}(P_C) = \lambda_C \, \mathrm{rank}(P_C) = \lambda_C \dim(U_C).$$
But also $\mathrm{tr}(F_C) = \sum_{i \in \Omega_C} \|W_i\|^2$, which for our unit-norm features equals $|\Omega_C|$, i.e.
$$\frac{1}{\lambda_C} = \frac{\dim(U_C)}{\sum_{i \in \Omega_C} \|W_i\|^2} = \frac{\mathrm{rank}(P_C)}{|\Omega_C|},$$
and as such we recover our main diagram 2. ∎

B.1 Perturbation Theory

Lemma B.5 (Analytic expansion of the Gram operator). Let $x \mapsto M(x) \in \mathbb{R}^{n \times n}$ be analytic in a neighborhood of $x = 0$, with $M(0) = M$. Then there exist matrices $M^{(k)} \in \mathbb{R}^{n \times n}$ such that
$$M(x) = M + x M^{(1)} + x^2 M^{(2)} + \cdots, \qquad M^{(k)} = \frac{1}{k!} \left. \frac{d^k}{dx^k} M(x) \right|_{x=0}.$$
In our setting one may take $M(x) = W(x)^\top W(x)$ and $F(x) = W(x) W(x)^\top$. When $M(\cdot)$ follows the Gram flow $\dot{M} = -\{M, \Phi\}$ and the derivative exists at the snapshot, one may identify $M^{(1)}$ with $\dot{M}(0) = -\{M, \Phi\}$. Reference: [Katō, 1980], Ch. I, §1, Eq. (1.2).

Lemma B.6 (Reduced resolvent of an eigenvalue of $M$). Fix an eigenvalue $\lambda$ of $M$ and let $E$ be the corresponding spectral projector. The reduced resolvent of $M$ at $\lambda$ is the operator $S_\lambda$ that inverts $M - \lambda I_n$ on the invariant complement $(I_n - E)\mathbb{R}^n$. Equivalently, $S_\lambda$ is characterized by
$$(M - \lambda I_n) S_\lambda = S_\lambda (M - \lambda I_n) = I_n - E, \qquad S_\lambda E = E S_\lambda = 0.$$
(For symmetric $M = \sum_f \lambda_f E_f$, one may write explicitly $S_\lambda = \sum_{f : \lambda_f \neq \lambda} (\lambda_f - \lambda)^{-1} E_f$.) Reference: [Katō, 1980], Ch. I, §5.3, Eqs. (5.26)–(5.29) (definition via the holomorphic part of the resolvent and inverse-on-complement identities).

Lemma B.7 (First-order drift of Gram eigenprojections).
Let $M(x) = M + x M^{(1)} + O(x^2)$ be an analytic perturbation, and let $E(x)$ be the spectral projector corresponding to an isolated eigenvalue $\lambda$ of $M$ (equivalently, an isolated $\lambda$-group). Let $E := E(0)$ and let $S_\lambda$ be the reduced resolvent of $M$ at $\lambda$. Then the first Taylor coefficient $E^{(1)} := \left. \frac{d}{dx} E(x) \right|_{x=0}$ satisfies
$$E^{(1)} = -E \, M^{(1)} S_\lambda - S_\lambda M^{(1)} E.$$
In particular, along the Gram flow $\dot{M} = -\{M, \Phi\}$, the instantaneous eigenprojection drift is obtained by substituting $M^{(1)} = \dot{M}$. Reference: [Katō, 1980], Ch. I, §2, Eq. (2.14) (semisimple case; automatically satisfied for symmetric $M$).

Lemma B.8 (First-order drift of Gram eigenvalues). Let $M(x) = M + x M^{(1)} + O(x^2)$ be analytic and let $\lambda$ be an eigenvalue of $M$ with spectral projector $E$ of rank $d = \mathrm{tr}(E)$. Let $\bar{\lambda}(x)$ denote the weighted mean of the eigenvalues of $M(x)$ bifurcating from $\lambda$ (equivalently, $\bar{\lambda}(x) = \frac{1}{d} \mathrm{tr}(M(x) E(x))$, where $E(x)$ is the associated spectral projector). Then
$$\bar{\lambda}^{(1)} := \left. \frac{d}{dx} \bar{\lambda}(x) \right|_{x=0} = \frac{1}{d} \, \mathrm{tr}\big( M^{(1)} E \big).$$
Along the Gram flow, $\dot{\bar{\lambda}}(t) = \frac{1}{d} \mathrm{tr}(\dot{M}(t) E(t))$ with $\dot{M}(t) = -\{M(t), \Phi(t)\}$. If $\lambda$ is simple with unit eigenvector $u$, this reduces to $\lambda^{(1)} = u^\top M^{(1)} u$. Reference: [Katō, 1980], Ch. I, §2, Eq. (2.32) (in this PDF's numbering).

B.2 Spectral Measure

Overview: Lemma B.9 recasts the dimensionality $D_i$ of a feature $i$, extracted from the weights, as a function of the Gram matrix. Lemma B.10 gives a corresponding expression for $D_i$ in terms of the feature norm scaled by the inverse of the Rayleigh quotient of the frame operator (which determines the range of eigenvalues the frame operator can take).

Lemma B.9. Let $W_i \in \mathbb{R}^m \setminus \{0\}$ be column $i$ of the weight matrix $W$ and $M = W^\top W$. Then:
$$D_i = \frac{M_{ii}^2}{(M^2)_{ii}} \tag{A1}$$

Proof.
By definition of the Gram matrix, $M_{ij} = \langle W_i, W_j \rangle$, i.e. $M_{ii} = \|W_i\|^2$ and $(M^2)_{ii} = \sum_j M_{ij} M_{ji} = \sum_j (W_i^\top W_j)^2$. Hence, with $\hat{W}_i := W_i / \|W_i\|$:
$$D_i = \frac{\|W_i\|^2}{\sum_j (\hat{W}_i^\top W_j)^2} = \frac{\|W_i\|^2}{\frac{1}{\|W_i\|^2} \sum_j (W_i^\top W_j)^2} = \frac{\|W_i\|^4}{\sum_j (W_i^\top W_j)^2} = \frac{M_{ii}^2}{(M^2)_{ii}}. \qquad \blacksquare$$

Lemma B.10. Let $W_i \in \mathbb{R}^m \setminus \{0\}$ be column $i$ of $W \in \mathbb{R}^{m \times n}$ and let $D_i$ be defined as in (A1). Then:
$$D_i = \frac{1}{\kappa_i} \|W_i\|^2, \qquad \kappa_i = \frac{W_i^\top F W_i}{\|W_i\|^2}.$$

Proof. Rewriting, for every $i$:
$$(M^2)_{ii} = e_i^\top M^2 e_i = e_i^\top (W^\top W)(W^\top W) e_i = (W e_i)^\top F (W e_i) = W_i^\top F W_i.$$
Hence, using (A1),
$$D_i = \frac{M_{ii}^2}{(M^2)_{ii}} = \frac{\|W_i\|^4}{W_i^\top F W_i} = \frac{\|W_i\|^2}{W_i^\top F W_i / \|W_i\|^2} = \frac{1}{\kappa_i} \|W_i\|^2. \qquad \blacksquare$$

Having expressed the fractional dimensionality as a functional of both $M = W^\top W$ and $F = WW^\top$ (B.9, B.10), we can use B.3 to restate it in purely spectral terms. Since all subsequent analysis is dynamic, we can WLOG apply the previous results to any specific temporal snapshot $W(t)$ of the weight matrix and its corresponding Gram and frame operators $M(t) = W(t)^\top W(t)$ and $F(t) = W(t) W(t)^\top$. Using the spectral resolution provided by B.3(iii), $F(t) = \sum_{e \in \mathcal{E}_+} \lambda_e(t) P_e(t)$, we can define the per-feature spectral measure $\mu_i(t)$ using the spectral weights:

Lemma B.11 (Spectral measure). Fix a snapshot of the weight matrix $W \in \mathbb{R}^{m \times n}$ and let $M := W^\top W$ and $F := WW^\top$. Let $M = \sum_{e \in \mathcal{E}} \lambda_e E_e$ be the orthogonal spectral resolution from Lemma B.1, and for each $e \in \mathcal{E}_+ := \{e : \lambda_e > 0\}$ let $P_e := \lambda_e^+ W E_e W^\top$ be the induced activation-space projector from Lemma B.3.
For any feature (column) $W_i \neq 0$, define
$$p_{i,e} := \frac{\|P_e W_i\|^2}{\|W_i\|^2} \quad (e \in \mathcal{E}_+), \qquad \mu_i := \sum_{e \in \mathcal{E}_+} p_{i,e} \, \delta_{\lambda_e}.$$
Then:
1. $p_{i,e} \geq 0$ for all $e \in \mathcal{E}_+$ and $\sum_{e \in \mathcal{E}_+} p_{i,e} = 1$. In particular, $\mu_i$ is a well-defined (finite atomic) probability measure supported on $\{\lambda_e : e \in \mathcal{E}_+\} \subset (0, \infty)$. Equivalently, for any Borel set $B \subset \mathbb{R}$,
$$\mu_i(B) = \sum_{e \in \mathcal{E}_+ : \, \lambda_e \in B} p_{i,e}.$$
2. The weights admit the equivalent closed form
$$p_{i,e} = \frac{\lambda_e \, (E_e)_{ii}}{\|W_i\|^2} \quad (e \in \mathcal{E}_+).$$

Proof. We use only Lemmas B.1–B.4.
(1) Since each $P_e$ is an orthogonal projector (Lemma B.3(i)), $\|P_e W_i\|^2 \geq 0$, hence $p_{i,e} \geq 0$. To show the weights sum to $1$, first note that $W_i = W e_i \in \mathrm{Im}(W)$. By Lemma B.4(iii),
$$\mathbb{R}^m = \Big( \bigoplus_{e \in \mathcal{E}_+} \mathrm{Im}(P_e) \Big) \oplus \ker(W^\top),$$
and $\sum_{e \in \mathcal{E}_+} P_e$ is the orthogonal projector onto $\mathrm{Im}(W) = \ker(W^\top)^\perp$. Therefore $\sum_{e \in \mathcal{E}_+} P_e W_i = W_i$. Because the projectors are mutually orthogonal (Lemma B.3(ii)), the vectors $\{P_e W_i\}_{e \in \mathcal{E}_+}$ are pairwise orthogonal, so
$$\|W_i\|^2 = \Big\| \sum_{e \in \mathcal{E}_+} P_e W_i \Big\|^2 = \sum_{e \in \mathcal{E}_+} \|P_e W_i\|^2.$$
Dividing by $\|W_i\|^2 > 0$ yields $\sum_{e \in \mathcal{E}_+} p_{i,e} = 1$. Hence $\mu_i = \sum_{e \in \mathcal{E}_+} p_{i,e} \delta_{\lambda_e}$ has total mass $1$, so it is a probability measure; its action on Borel sets is the standard one for finite atomic measures.
(2) Since $P_e$ is an orthogonal projector (Lemma B.3(i)), $\|P_e W_i\|^2 = W_i^\top P_e W_i$. Using $P_e = \lambda_e^+ W E_e W^\top$ (Lemma B.3) and $W_i = W e_i$, we compute
$$W_i^\top P_e W_i = e_i^\top W^\top (\lambda_e^+ W E_e W^\top) W e_i = \lambda_e^+ \, e_i^\top (W^\top W) E_e (W^\top W) e_i = \lambda_e^+ \, e_i^\top M E_e M e_i.$$
By Lemma B.1, $M E_e = \lambda_e E_e$ and $E_e M = \lambda_e E_e$, so $M E_e M = \lambda_e^2 E_e$. Thus
$$W_i^\top P_e W_i = \lambda_e^+ \lambda_e^2 \, e_i^\top E_e e_i = \lambda_e (E_e)_{ii}.$$
Dividing by $\|W_i\|^2$ gives the claimed formula for $p_{i,e}$. ∎

Lemma B.12 (Moments of the spectral measure). Fix $i$ with $W_i \neq 0$ and let $\mu_i = \sum_{e \in \mathcal{E}_+} p_{i,e} \delta_{\lambda_e}$ be as in Lemma B.11. Then for every integer $r \geq 1$,
$$\mathbb{E}_{\mu_i}[\lambda^r] := \int \lambda^r \, d\mu_i(\lambda) = \frac{W_i^\top F^r W_i}{\|W_i\|^2}.$$
Moreover, writing the Moore–Penrose pseudoinverse as $F^+ = \sum_{e \in \mathcal{E}_+} \lambda_e^+ P_e$,
$$\frac{W_i^\top F^+ W_i}{\|W_i\|^2} = \int \lambda^+ \, d\mu_i(\lambda) = \mathbb{E}_{\mu_i}[\lambda^+], \qquad \text{where } \lambda^+ := \frac{1}{\lambda} \text{ on } (0, \infty).$$

Proof. By Lemma B.3(iii), $F = \sum_{e \in \mathcal{E}_+} \lambda_e P_e$, and by Lemma B.3(ii) the projectors satisfy $P_e P_f = \delta_{ef} P_e$ (orthogonal idempotents). Therefore, for any integer $r \geq 1$,
$$F^r = \Big( \sum_{e \in \mathcal{E}_+} \lambda_e P_e \Big)^r = \sum_{e \in \mathcal{E}_+} \lambda_e^r P_e,$$
since all mixed products $P_{e_1} \cdots P_{e_r}$ vanish unless $e_1 = \cdots = e_r$. Hence
$$W_i^\top F^r W_i = \sum_{e \in \mathcal{E}_+} \lambda_e^r \, W_i^\top P_e W_i = \sum_{e \in \mathcal{E}_+} \lambda_e^r \, \|P_e W_i\|^2,$$
where we used that $P_e$ is an orthogonal projector (Lemma B.3(i)), so $W_i^\top P_e W_i = \|P_e W_i\|^2$. Dividing by $\|W_i\|^2$ and recalling $p_{i,e} = \|P_e W_i\|^2 / \|W_i\|^2$ gives
$$\frac{W_i^\top F^r W_i}{\|W_i\|^2} = \sum_{e \in \mathcal{E}_+} p_{i,e} \lambda_e^r = \int \lambda^r \, d\mu_i(\lambda) = \mathbb{E}_{\mu_i}[\lambda^r].$$
The pseudoinverse identity is identical: by definition $F^+ = \sum_{e \in \mathcal{E}_+} \lambda_e^+ P_e$, so
$$\frac{W_i^\top F^+ W_i}{\|W_i\|^2} = \sum_{e \in \mathcal{E}_+} p_{i,e} \lambda_e^+ = \int \lambda^+ \, d\mu_i(\lambda) = \mathbb{E}_{\mu_i}[\lambda^+]. \qquad \blacksquare$$

This allows us to readily reformulate the Rayleigh quotient as the first moment of the spectral measure $\mu_i(t)$:

Corollary 7.
$$\kappa_i = \mathbb{E}_{\mu_i(t)}[\lambda].$$
This means we have the following dynamic law for the Rayleigh quotient throughout training:
$$\dot{\kappa}_i(t) = \sum_k \dot{\lambda}_k(t) \, p_{ik}(t) + \sum_k \lambda_k(t) \, \dot{p}_{ik}(t),$$
where the first term represents the eigenvalue drift and the second term is the mass transport of features across eigenspaces (e.g. eigenhopping).

Lemma B.13 (Cyclic space, fractional powers, and pseudoinverse Cauchy–Schwarz). Let $F \in \mathbb{R}^{m \times m}$ be symmetric PSD with spectral decomposition $F = \sum_{k=0}^{K} \lambda_k P_k$, $\lambda_k \geq 0$, where the $P_k$ are orthogonal projectors, $P_k P_\ell = \delta_{k\ell} P_k$, and $\sum_k P_k = I$. Let
$$P := P_{\mathrm{Im}(F)} = \sum_{\lambda_k > 0} P_k = I - P_0, \qquad F^{1/2} := \sum_k \lambda_k^{1/2} P_k, \qquad F^{+/2} := \sum_k (\lambda_k^+)^{1/2} P_k.$$
For $x \in \mathbb{R}^m$, define the cyclic subspace
$$\mathcal{K}_x := \mathrm{span}\{x, Fx, F^2 x, \dots\}.$$
Then:
1. $\mathcal{K}_x = \mathrm{span}\{P_k x : P_k x \neq 0\}$. In particular, $\dim(\mathcal{K}_x) \leq m$ and there exists $r \leq m$ such that $\mathcal{K}_x = \mathrm{span}\{x, Fx, \dots, F^{r-1} x\}$.
2. $F^{1/2} x \in \mathcal{K}_x$ and $F^{+/2} x \in \mathcal{K}_x$.
3. For all $x \in \mathbb{R}^m$,
$$(x^\top P x)^2 \leq (x^\top F x)(x^\top F^+ x),$$
and if $x \in \mathrm{Im}(F)$ (equivalently $Px = x$), then
$$(x^\top x)^2 \leq (x^\top F x)(x^\top F^+ x).$$
Moreover, equality holds iff $F^{1/2} x$ and $F^{+/2} x$ are linearly dependent, equivalently iff $F x = c \, P x$ for some $c \in \mathbb{R}$ (and iff $F x = c x$ when $x \in \mathrm{Im}(F)$).

Proof. (1) For every integer $t \geq 0$,
$$F^t x = \sum_{k=0}^{K} \lambda_k^t \, P_k x,$$
so $\mathcal{K}_x \subseteq \mathrm{span}\{P_k x : P_k x \neq 0\}$. For the reverse inclusion, let $S := \{k : P_k x \neq 0\}$ and let $\{\lambda_k\}_{k \in S}$ be the distinct eigenvalues present in $x$. For each fixed $j \in S$, choose a polynomial $q_j$ of degree $\leq |S| - 1$ such that $q_j(\lambda_j) = 1$ and $q_j(\lambda_k) = 0$ for all $k \in S \setminus \{j\}$ (Lagrange interpolation on the finite set $\{\lambda_k\}_{k \in S}$).
Then
$$q_j(F) x = \sum_{k \in S} q_j(\lambda_k) \, P_k x = P_j x,$$
and since $q_j(F) x$ is a linear combination of $\{F^t x\}_{t=0}^{|S|-1}$, we have $P_j x \in \mathcal{K}_x$. Thus $\mathrm{span}\{P_k x : P_k x \neq 0\} \subseteq \mathcal{K}_x$, proving equality. The stabilization $\mathcal{K}_x = \mathrm{span}\{x, Fx, \dots, F^{r-1} x\}$ follows with $r := \dim(\mathcal{K}_x) \leq |S| \leq m$.
(2) Using the spectral expansions,
$$F^{1/2} x = \sum_k \lambda_k^{1/2} P_k x, \qquad F^{+/2} x = \sum_{\lambda_k > 0} \lambda_k^{-1/2} P_k x,$$
and each $P_k x \in \mathcal{K}_x$ by (1), so both vectors lie in $\mathcal{K}_x$.
(3) Note $F^{1/2} F^{+/2} = F^{+/2} F^{1/2} = P$. Let $a := F^{1/2} x$ and $b := F^{+/2} x$. Then
$$\langle a, b \rangle = x^\top F^{1/2} F^{+/2} x = x^\top P x, \qquad \|a\|^2 = x^\top F x, \qquad \|b\|^2 = x^\top F^+ x.$$
Cauchy–Schwarz in $\mathbb{R}^m$ yields $(x^\top P x)^2 = \langle a, b \rangle^2 \leq \|a\|^2 \|b\|^2 = (x^\top F x)(x^\top F^+ x)$. If $x \in \mathrm{Im}(F)$ then $P x = x$ and the stated simplification follows. Equality in Cauchy–Schwarz holds iff $a$ and $b$ are linearly dependent, i.e. $F^{1/2} x = c \, F^{+/2} x$. Left-multiplying by $F^{1/2}$ gives $F x = c \, F^{1/2} F^{+/2} x = c \, P x$. If $P x = x$, this reduces to $F x = c x$. ∎

Lemma B.14 ($L^2(\mu)$ for finite atomic measures). Let $\mu = \sum_{k=1}^{s} p_k \, \delta_{\lambda_k}$ be a probability measure on $\mathbb{R}$ with $p_k > 0$ and $\sum_k p_k = 1$. Then $L^2(\mu)$ can be identified with $\mathbb{R}^s$ via $f \mapsto (f(\lambda_1), \dots, f(\lambda_s))$, with inner product
$$\langle f, g \rangle_{L^2(\mu)} = \int f(\lambda) g(\lambda) \, d\mu(\lambda) = \sum_{k=1}^{s} p_k \, f(\lambda_k) g(\lambda_k),$$
and norm $\|f\|_{L^2(\mu)}^2 = \sum_{k=1}^{s} p_k \, |f(\lambda_k)|^2$. In particular, every function on $\{\lambda_1, \dots, \lambda_s\}$ is in $L^2(\mu)$, and $L^2(\mu)$ is a finite-dimensional Hilbert space.

Proof. Since $\mu$ is supported on the finite set $\{\lambda_1, \dots, \lambda_s\}$ with strictly positive masses $p_k$, two functions $f, g$ are equal $\mu$-a.s. iff $f(\lambda_k) = g(\lambda_k)$ for all $k$.
Thus an $L^2(\mu)$ equivalence class is uniquely specified by the vector of its values $(f(\lambda_1), \dots, f(\lambda_s)) \in \mathbb{R}^s$. Moreover,
$$\int |f(\lambda)|^2 \, d\mu(\lambda) = \sum_{k=1}^{s} p_k \, |f(\lambda_k)|^2 < \infty$$
for every such vector (finite sum), so every function on the support belongs to $L^2(\mu)$. The formula for the inner product is immediate from the definition of integration against an atomic measure. Completeness follows because $L^2(\mu)$ is finite-dimensional. ∎

Lemma B.15 (Cauchy–Schwarz in $L^2(\mu)$). Let $(X, \mathcal{B}, \mu)$ be a probability space and let $u, v \in L^2(\mu)$. Then
$$|\langle u, v \rangle_{L^2(\mu)}|^2 \leq \|u\|_{L^2(\mu)}^2 \, \|v\|_{L^2(\mu)}^2.$$
If $\|v\|_{L^2(\mu)} > 0$, equality holds iff there exists $c \in \mathbb{R}$ such that $u = c v$ $\mu$-a.s. (If $\|v\|_{L^2(\mu)} = 0$, then $v = 0$ $\mu$-a.s. and the inequality is trivial.)

Proof. If $\|v\|_2 = 0$ the claim is immediate. Otherwise define
$$c := \frac{\langle u, v \rangle}{\|v\|_2^2}.$$
Then
$$0 \leq \|u - c v\|_2^2 = \|u\|_2^2 - 2c \langle u, v \rangle + c^2 \|v\|_2^2 = \|u\|_2^2 - \frac{\langle u, v \rangle^2}{\|v\|_2^2},$$
which rearranges to $\langle u, v \rangle^2 \leq \|u\|_2^2 \|v\|_2^2$. Equality holds iff $\|u - c v\|_2^2 = 0$, i.e. $u = c v$ $\mu$-a.s. ∎

Lemma B.16 (Cyclic isometry $L^2(\mu_x) \simeq \mathcal{K}_x$). Let $F \in \mathbb{R}^{m \times m}$ be symmetric PSD with spectral decomposition $F = \sum_{k=0}^{K} \lambda_k P_k$. Fix $x \in \mathrm{Im}(F) \setminus \{0\}$, and define the (positive-spectrum) spectral weights and measure
$$p_k := \frac{\|P_k x\|^2}{\|x\|^2} \quad (\lambda_k > 0), \qquad \mu_x := \sum_{\lambda_k > 0} p_k \, \delta_{\lambda_k}.$$
For $f$ defined on $\mathrm{supp}(\mu_x)$, define the functional calculus on $\mathrm{Im}(F)$ by
$$f(F) := \sum_{\lambda_k > 0} f(\lambda_k) \, P_k.$$
Define $U_x : L^2(\mu_x) \to \mathcal{K}_x$ by
$$U_x(f) := \frac{1}{\|x\|} \, f(F) x.$$
Then $U_x$ is a well-defined unitary isomorphism (linear, bijective, and inner-product preserving).
In particular, for all $f, g \in L^2(\mu_x)$,
$$\langle U_x(f), U_x(g) \rangle_{\mathbb{R}^m} = \int f(\lambda) g(\lambda) \, d\mu_x(\lambda),$$
and for any real-valued $h$ on $\mathrm{supp}(\mu_x)$,
$$\frac{x^\top h(F) x}{\|x\|^2} = \int h(\lambda) \, d\mu_x(\lambda).$$

Proof. Well-definedness. If $f = g$ $\mu_x$-a.s., then $f(\lambda_k) = g(\lambda_k)$ for every atom with $p_k > 0$. If $p_k = 0$ then $P_k x = 0$, so changing $f(\lambda_k)$ does not change $f(F) x$. Hence $f(F) x = g(F) x$ and $U_x$ is well-defined on $L^2(\mu_x)$ equivalence classes.

Isometry. Using orthogonality $P_k P_\ell = \delta_{k\ell} P_k$,
$$\langle U_x(f), U_x(g) \rangle = \frac{1}{\|x\|^2} \langle f(F) x, g(F) x \rangle = \frac{1}{\|x\|^2} \Big\langle \sum_{\lambda_k > 0} f(\lambda_k) P_k x, \; \sum_{\lambda_\ell > 0} g(\lambda_\ell) P_\ell x \Big\rangle = \frac{1}{\|x\|^2} \sum_{\lambda_k > 0} f(\lambda_k) g(\lambda_k) \, \|P_k x\|^2 = \sum_{\lambda_k > 0} p_k f(\lambda_k) g(\lambda_k) = \int f g \, d\mu_x.$$

Surjectivity. By Lemma B.13(1), $\mathcal{K}_x = \mathrm{span}\{P_k x : P_k x \neq 0\} = \mathrm{span}\{P_k x : p_k > 0\}$. Given $y = \sum_{p_k > 0} a_k P_k x \in \mathcal{K}_x$, define $f$ on $\mathrm{supp}(\mu_x)$ by $f(\lambda_k) := a_k \|x\|$. Then $f \in L^2(\mu_x)$ (finite support) and
$$U_x(f) = \frac{1}{\|x\|} \sum_{p_k > 0} f(\lambda_k) P_k x = \sum_{p_k > 0} a_k P_k x = y.$$
Injectivity follows from the isometry: if $U_x(f) = 0$ then $\|f\|_{L^2(\mu_x)}^2 = \|U_x(f)\|^2 = 0$, hence $f = 0$ $\mu_x$-a.s. Thus $U_x$ is unitary.

Quadratic form identity. For any $h$ on $\mathrm{supp}(\mu_x)$,
$$x^\top h(F) x = \sum_{\lambda_k > 0} h(\lambda_k) \, x^\top P_k x = \sum_{\lambda_k > 0} h(\lambda_k) \, \|P_k x\|^2,$$
and dividing by $\|x\|^2$ gives $\int h \, d\mu_x$. ∎

Appendix C Theorem Proofs and Corollaries

Corollary 2 (Projective Linearity). Assume Spectral Localization holds.
Then the fractional dimensionality $D_i$ of any feature is linearly determined by its feature norm ($\propto \|W_i\|^2$), with slope equal to the reciprocal of the eigenvalue $\lambda_e$ of the subspace it occupies: $D_i = \lambda_e^+ \|W_i\|^2$.

Proof of Corollary 2 (Projective Linearity). By Lemma B.10,
$$D_i = \frac{1}{\kappa_i} \|W_i\|^2, \qquad \kappa_i = \frac{W_i^\top F W_i}{\|W_i\|^2}.$$
Under Spectral Localization (Theorem 1), $F W_i = \lambda_e W_i$ for some eigenvalue $\lambda_e > 0$ of $F$. Hence
$$W_i^\top F W_i = \lambda_e \, W_i^\top W_i = \lambda_e \|W_i\|^2,$$
so $\kappa_i = \lambda_e$ and therefore
$$D_i = \frac{\|W_i\|^2}{\lambda_e} = \lambda_e^+ \|W_i\|^2,$$
where $\lambda_e^+ = 1/\lambda_e$ since $\lambda_e > 0$ in Theorem 1. ∎

Theorem 3 (Decomposition into Tight Frames). Assume Spectral Localization (Theorem 1) holds for all features $i \in \{1, \dots, n\}$. Let $\Lambda = \{\lambda_1, \dots, \lambda_r\}$ be the distinct positive eigenvalues of $F = WW^\top$, and define the index clusters and their spans
$$C_k := \{\, i : \lambda(i) = \lambda_k \,\}, \qquad V_k := \mathrm{span}\{\, W_i : i \in C_k \,\}.$$
Then $\{C_k\}_{k=1}^{r}$ partitions $\{1, \dots, n\}$, the subspaces $V_k$ are pairwise orthogonal, and
$$F = \sum_{k=1}^{r} \lambda_k \, P_{V_k} \quad (\text{on } \mathrm{Im}(W); \text{ plus } 0 \text{ on } \ker(F)).$$
Moreover, within each cluster $C_k$,
$$\sum_{i \in C_k} W_i W_i^\top = \lambda_k \, I \quad \text{on } V_k,$$
i.e. $(W_{C_k})$ forms a tight frame for $V_k$ with frame constant $\lambda_k$.

Proof. By Theorem 1, each $W_i$ is an eigenvector of $F$ with eigenvalue $\lambda(i) > 0$, i.e. $F W_i = \lambda(i) W_i$. Fix distinct eigenvalues $\lambda_k \neq \lambda_\ell$. Since $F$ is symmetric PSD, eigenspaces for distinct eigenvalues are orthogonal, hence $W_i \perp W_j$ whenever $\lambda(i) \neq \lambda(j)$. Therefore the sets $C_k$ are disjoint and partition $\{1, \dots, n\}$, and the spans $V_k$ are pairwise orthogonal. For any $v \in V_k$, write $v = \sum_{i \in C_k} \alpha_i W_i$. Then
$$F v = \sum_{i \in C_k} \alpha_i F W_i = \sum_{i \in C_k} \alpha_i \lambda_k W_i = \lambda_k v,$$
so $V_k \subseteq \mathrm{Im}(P_k)$.
Conversely, since $F = WW^\top$, we have $\mathrm{Im}(F) = \mathrm{Im}(W) = \mathrm{span}\{W_i\}_i$. Any eigenvector $v$ of $F$ with eigenvalue $\lambda_k > 0$ lies in $\mathrm{Im}(F)$, hence decomposes uniquely as $v = \sum_{\ell=1}^{r} v_\ell$ with $v_\ell \in V_\ell$ (orthogonal direct sum). Applying $F$ gives
$$F v = \sum_{\ell=1}^{r} \lambda_\ell v_\ell, \qquad \text{but also} \qquad F v = \lambda_k v = \sum_{\ell=1}^{r} \lambda_k v_\ell,$$
so $(\lambda_\ell - \lambda_k) v_\ell = 0$ for all $\ell$, forcing $v_\ell = 0$ for $\ell \neq k$. Thus
$$\mathrm{Im}(P_k) \cap \mathrm{Im}(W) = V_k,$$
and $F$ acts as $\lambda_k I$ on $V_k$. This yields
$$F = \sum_{k=1}^{r} \lambda_k P_{V_k} \quad \text{on } \mathrm{Im}(W),$$
and $F = 0$ on $\ker(F) = \mathrm{Im}(W)^\perp$. For the tight-frame claim, define
$$F_k := \sum_{i \in C_k} W_i W_i^\top.$$
Take $v \in V_k$. For any $j \notin C_k$, orthogonality of distinct eigenspaces gives $W_j^\top v = 0$, hence
$$F v = \sum_{i=1}^{n} W_i W_i^\top v = \sum_{i \in C_k} W_i W_i^\top v = F_k v.$$
But also $F v = \lambda_k v$ for $v \in V_k$, so $F_k v = \lambda_k v$ for all $v \in V_k$, i.e. $F_k = \lambda_k I$ on $V_k$. This is the tight-frame condition with frame constant $\lambda_k$. ∎

Theorem 4 (Spectral Identification of Geometry). Assume Spectral Localization (Theorem 1) holds, so that each cluster $C_k$ forms a tight frame as in Theorem 3. Let $M_k := W_{C_k}^\top W_{C_k}$ be the local Gram matrix. Then the geometry of cluster $C_k$ is an instance of an association scheme algebra $\mathcal{A}$ if and only if the eigenspace decomposition of $M_k$ aligns with the canonical strata $(W_0, \dots, W_s)$ of $\mathcal{A}$.

Proof. We use standard association-scheme facts (Appendix A.2). If $(A_0, \dots, A_s)$ are adjacency matrices of an association scheme on $\Omega$, its Bose–Mesner algebra $\mathcal{A}$ admits a canonical orthogonal decomposition $\mathbb{R}^\Omega = W_0 \oplus \cdots \oplus W_s$ with orthogonal projectors $(S_0, \dots, S_s)$, and every $X \in \mathcal{A}$ preserves each stratum $W_e$. Moreover, $(S_e)$ form a basis of mutually orthogonal idempotents for $\mathcal{A}$, so each $X \in \mathcal{A}$ has a unique expansion
$$X = \sum_{e=0}^{s} \theta_e S_e,$$
and acts as the scalar $\theta_e$ on $W_e = \mathrm{Im}(S_e)$.
($\Rightarrow$) If the cluster geometry is an instance of $\mathcal{A}$, then the defining invariant matrix (here $M_k$) lies in $\mathcal{A}$. Thus $M_k = \sum_e \theta_e S_e$, so each stratum $W_e$ is an invariant subspace on which $M_k$ acts as a scalar. Equivalently, the spectral decomposition of $M_k$ is governed by the stratum decomposition.
($\Leftarrow$) Conversely, suppose the eigenspace decomposition of $M_k$ aligns with the strata, i.e. $M_k$ is block-scalar on $\mathbb{R}^\Omega = \bigoplus_e W_e$. Then there exist scalars $\theta_e$ such that $M_k|_{W_e} = \theta_e I$ for all $e$. Therefore $M_k = \sum_e \theta_e S_e \in \mathcal{A}$, since $S_e \in \mathcal{A}$ and $\mathcal{A}$ is a linear space. Hence the cluster geometry is an instance of the association scheme algebra. ∎

Corollary 5 (Simplex Identification). A cluster $C_k$ is a simplex geometry if and only if its Gram matrix $M_k$ has exactly two invariant subspaces
$$W_0 = \ker(M_k) = \mathrm{span}(\mathbb{1}), \qquad W_1 = \mathrm{Im}(M_k) = \mathbb{1}^\perp,$$
with eigenvalues $\Lambda(W_0) = 0$ and $\Lambda(W_1) = \lambda_k = \frac{|C_k|}{\dim(V_k)}$.

Proof. In the simplex association scheme (Appendix A.3.1), the Bose–Mesner algebra has exactly two strata: the constant-vector space $W_0 = \mathrm{span}(\mathbb{1})$ and its orthogonal complement $W_1 = \mathbb{1}^\perp$, with projectors $S_0 = \frac{1}{p} J$ and $S_1 = I - \frac{1}{p} J$, where $p = |C_k|$. By Theorem 4, being an instance of the simplex scheme is equivalent to having spectral/invariant decomposition exactly along these two strata. For the eigenvalues: in the simplex model the vectors are unit-norm and centered (sum to zero), and Appendix A.3.1 yields
$$M_k = \frac{p}{p-1} \, S_1.$$
Thus $M_k \mathbb{1} = 0$ and $M_k$ acts as the scalar $\frac{p}{p-1}$ on $\mathbb{1}^\perp$. Under the tight-frame parameterization of a unit tight frame of $p$ vectors spanning a $d$-dimensional subspace, the frame constant is $p/d$ (trace identity; cf. Lemma A.7). For a simplex, $p = d + 1$, so
$$\frac{p}{p-1} = \frac{d+1}{d} = \frac{p}{d} = \frac{|C_k|}{\dim(V_k)},$$
matching the stated $\lambda_k$. ∎

Appendix D Perturbative case

Theorem 8 (Defect decomposition: near-saturation ⇔ small aggregate slack).
Let $W \in \mathbb{R}^{m \times n}$ with columns $w_1, \dots, w_n$, and let the frame operator be $F := WW^\top$. Let $r := \mathrm{rank}(W) = \mathrm{rank}(F)$. Define leverage scores $\ell_i := w_i^\top F^+ w_i$ and fractional dimensionalities $D_i := \|w_i\|^4 / (w_i^\top F w_i)$ (for $w_i \neq 0$). Define the relative slack $\sigma_i := 1 - D_i / \ell_i \in [0, 1]$. Then:
1. (Capacity budget) $\sum_{i=1}^{n} \ell_i = r$.
2. (Pointwise bound) $D_i \leq \ell_i$ for all $i$, with equality iff $\mu_i$ is a Dirac mass (equivalently: $w_i$ is an eigenvector of $F$).
3. (Exact defect identity)
$$r - \sum_{i=1}^{n} D_i = \sum_{i=1}^{n} (\ell_i - D_i) = \sum_{i=1}^{n} \ell_i \, \sigma_i.$$
In particular, if $r = m$ then $m - \sum_i D_i = \sum_i \ell_i \sigma_i$.

Proof. (1) Using $\ell_i = w_i^\top F^+ w_i$ and cyclicity of the trace,
$$\sum_{i=1}^{n} \ell_i = \sum_{i=1}^{n} w_i^\top F^+ w_i = \mathrm{tr}\Big( \sum_{i=1}^{n} F^+ w_i w_i^\top \Big) = \mathrm{tr}\Big( F^+ \sum_{i=1}^{n} w_i w_i^\top \Big) = \mathrm{tr}(F^+ F).$$
For symmetric PSD $F$, the matrix $F^+ F$ is the orthogonal projector onto $\mathrm{Im}(F)$, hence $\mathrm{tr}(F^+ F) = \mathrm{rank}(F) = r$.
(2) Fix $i$ and set $x := w_i$. Since $x$ is a column of $W$, we have $x \in \mathrm{Im}(W) = \mathrm{Im}(F)$. Let $a := F^{1/2} x$ and $b := F^{+1/2} x$. By Cauchy–Schwarz,
$$\langle a, b \rangle^2 \leq \|a\|^2 \|b\|^2.$$
Compute each term:
$$\langle a, b \rangle = x^\top F^{1/2} F^{+1/2} x = x^\top (F^+ F)^{1/2} x = x^\top \Pi_{\mathrm{Im}(F)} x = \|x\|^2,$$
because $x \in \mathrm{Im}(F)$. Also $\|a\|^2 = x^\top F x$ and $\|b\|^2 = x^\top F^+ x = \ell_i$. Thus
$$\|x\|^4 \leq (x^\top F x)(x^\top F^+ x) \;\Longrightarrow\; \frac{\|x\|^4}{x^\top F x} \leq x^\top F^+ x \;\Longrightarrow\; D_i \leq \ell_i.$$
Equality holds in Cauchy–Schwarz iff $a$ and $b$ are linearly dependent, i.e. $F^{1/2} x = c \, F^{+1/2} x$ for some $c \in \mathbb{R}$. Multiplying by $F^{1/2}$ gives $F x = c \, \Pi_{\mathrm{Im}(F)} x = c x$, so $x$ is an eigenvector of $F$ (with eigenvalue $c > 0$, since $x \in \mathrm{Im}(F)$ and $x \neq 0$).
Equivalently, $x$ lies entirely in a single positive-eigenvalue eigenspace, which is the statement that the feature spectral measure $\mu_i$ is a Dirac mass.
(3) By definition $\ell_i \sigma_i = \ell_i - D_i$, hence
$$\sum_{i=1}^{n} \ell_i \sigma_i = \sum_{i=1}^{n} (\ell_i - D_i) = \Big( \sum_{i=1}^{n} \ell_i \Big) - \sum_{i=1}^{n} D_i.$$
Using (1), $\sum_i \ell_i = r$, yielding $r - \sum_i D_i = \sum_i \ell_i \sigma_i$. The full-rank specialization is the same identity with $r = m$. ∎

Theorem 9 (Delocalized residue bound under $\varepsilon$-saturation). Assume $\sum_{i=1}^{n} D_i \geq (1 - \varepsilon) r$. Then for any $\tau \in (0, 1)$,
$$\frac{1}{r} \sum_{i : \sigma_i \geq \tau} \ell_i \leq \frac{\varepsilon}{\tau}.$$

Proof. From Theorem 8(3),
$$\sum_{i=1}^{n} \ell_i \sigma_i = r - \sum_{i=1}^{n} D_i \leq \varepsilon r.$$
On the index set $S_\tau := \{i : \sigma_i \geq \tau\}$ we have $\ell_i \sigma_i \geq \tau \ell_i$, so
$$\varepsilon r \geq \sum_{i=1}^{n} \ell_i \sigma_i \geq \sum_{i \in S_\tau} \ell_i \sigma_i \geq \tau \sum_{i \in S_\tau} \ell_i.$$
Divide by $\tau r$ to obtain $(1/r) \sum_{i : \sigma_i \geq \tau} \ell_i \leq \varepsilon / \tau$. ∎

Theorem 10 (Eigenvector residual equals eigenvalue variance). Let $F = \sum_{k : \lambda_k > 0} \lambda_k P_k$ be the spectral decomposition of $F$ on $\mathrm{Im}(F)$. For $w_i \neq 0$, define weights $p_{ik} := \|P_k w_i\|^2 / \|w_i\|^2$ and the associated spectral measure $\mu_i := \sum_{k : \lambda_k > 0} p_{ik} \, \delta_{\lambda_k}$. Let $\kappa_i := \mathbb{E}_{\mu_i}[\lambda]$. Then
$$\frac{\|F w_i - \kappa_i w_i\|^2}{\|w_i\|^2} = \mathrm{Var}_{\mu_i}(\lambda) = \mathbb{E}_{\mu_i}\big[ (\lambda - \kappa_i)^2 \big].$$
In particular,
$$CV_i := \frac{\sqrt{\mathrm{Var}_{\mu_i}(\lambda)}}{\mathbb{E}_{\mu_i}[\lambda]} = \frac{\|F w_i - \kappa_i w_i\|}{\kappa_i \|w_i\|}$$
vanishes iff $w_i$ is an eigenvector of $F$ (equivalently, $\mu_i$ is Dirac).

Proof. Write $w_i = \sum_k P_k w_i$. Then
$$F w_i = \sum_k \lambda_k P_k w_i, \qquad \kappa_i w_i = \sum_k \kappa_i P_k w_i,$$
so
$$F w_i - \kappa_i w_i = \sum_k (\lambda_k - \kappa_i) \, P_k w_i.$$
Using orthogonality of distinct spectral subspaces,
$$\|F w_i - \kappa_i w_i\|^2 = \sum_k (\lambda_k - \kappa_i)^2 \, \|P_k w_i\|^2.$$
Divide by $\|w_i\|^2$ to get
\|Fw_i- _iw_i\|^2\|w_i\|^2= _kp_ik( _k- _i)^2=E_ _i [(λ- _i)^2 ]=Var_ _i(λ). Variance is zero iff λ is μi _i-a.s. constant, i.e. μi _i is Dirac on some λ​(i)>0λ(i)>0, equivalently wiw_i lies entirely in a single eigenspace, i.e. F​wi=λ​(i)​wiFw_i=λ(i)w_i. The formula for CViCV_i is immediate by dividing by κi2 _i^2 and taking square-roots. ∎ Theorem 11 (Band-localization ⇒ quasi-tightness (Kantorovich bound)). Fix i and suppose μi _i is supported on [λi−,λi+]⊂(0,∞)[ _i^-, _i^+]⊂(0,∞). Let κi⋆:=λi+/λi−≥1 _i := _i^+/ _i^-≥ 1. Then κi​hi=μi​[λ]​μi​[λ−1]≤(κi⋆+1)24​κi⋆, _i\,h_i=E_ _i[λ]\;E_ _i[λ^-1]\;≤\; ( _i +1)^24 _i , and consequently σi=1−Diℓi≤1−4​κi⋆(κi⋆+1)2. _i=1- D_i _i≤ 1- 4 _i ( _i +1)^2. Equivalently, with ωi:=(λi+−λi−)/(λi++λi−)∈[0,1) _i:=( _i^+- _i^-)/( _i^++ _i^-)∈[0,1), σi≤ωi2. _i≤ _i^2. Proof. Step 1 (Kantorovich inequality in the scalar form used here). Let X be any random variable supported on [a,b]⊂(0,∞)[a,b]⊂(0,∞). Pointwise on [a,b][a,b], (x−a)​(b−x)≥0⟺x2−(a+b)​x+a​b≤0⟺x+a​bx≤a+b.(x-a)(b-x)≥ 0\; \;x^2-(a+b)x+ab≤ 0\; \;x+ abx≤ a+b. Taking expectations gives ​[X]+a​b​[X−1]≤a+b.E[X]+ab\,E[X^-1]≤ a+b. By AM–GM, ​[X]+a​b​[X−1]≥2​a​b​[X]​[X−1]E[X]+ab\,E[X^-1]≥ 2 ab\,E[X]E[X^-1], hence 2​a​b​[X]​[X−1]≤a+b⟹​[X]​[X−1]≤(a+b)24​a​b.2 ab\,E[X]E[X^-1]≤ a+b\; \;E[X]E[X^-1]≤ (a+b)^24ab. Step 2 (apply to λ∼μiλ _i). Take X=λX=λ, a=λi−a= _i^-, b=λi+b= _i^+. Then μi​[λ]​μi​[λ−1]≤(λi−+λi+)24​λi−​λi+=(κi⋆+1)24​κi⋆.E_ _i[λ]E_ _i[λ^-1]≤ ( _i^-+ _i^+)^24 _i^- _i^+= ( _i +1)^24 _i . This is the first displayed inequality. Step 3 (convert to a slack bound). Using the spectral weights pi​kp_ik, one has wi⊤​F​wi=∑kλk​‖Pk​wi‖2=‖wi‖2​μi​[λ],wi⊤​F+​wi=∑kλk−1​‖Pk​wi‖2=‖wi‖2​μi​[λ−1].w_i Fw_i= _k _k\|P_kw_i\|^2=\|w_i\|^2\,E_ _i[λ], w_i F^+w_i= _k _k^-1\|P_kw_i\|^2=\|w_i\|^2\,E_ _i[λ^-1]. Hence κi=μi​[λ] _i=E_ _i[λ] and ℓi=‖wi‖2​hi _i=\|w_i\|^2h_i with hi:=μi​[λ−1]h_i:=E_ _i[λ^-1]. Also Di=‖wi‖4/(wi⊤​F​wi)=‖wi‖2/κiD_i=\|w_i\|^4/(w_i Fw_i)=\|w_i\|^2/ _i. 
Therefore Diℓi=‖wi‖2/κi‖wi‖2​hi=1κi​hi=1μi​[λ]​μi​[λ−1]. D_i _i= \|w_i\|^2/ _i\|w_i\|^2h_i= 1 _ih_i= 1E_ _i[λ]E_ _i[λ^-1]. Combining with the upper bound on κi​hi _ih_i yields Diℓi≥4​κi⋆(κi⋆+1)2⟹σi=1−Diℓi≤1−4​κi⋆(κi⋆+1)2. D_i _i≥ 4 _i ( _i +1)^2 _i=1- D_i _i≤ 1- 4 _i ( _i +1)^2. Finally, with ωi=(λi+−λi−)/(λi++λi−) _i=( _i^+- _i^-)/( _i^++ _i^-) and κi⋆=λi+/λi− _i = _i^+/ _i^-, 1−4​κi⋆(κi⋆+1)2=(κi⋆−1)2(κi⋆+1)2=(λi+−λi−λi++λi−)2=ωi2.1- 4 _i ( _i +1)^2= ( _i -1)^2( _i +1)^2= ( _i^+- _i^- _i^++ _i^- )^2= _i^2. ∎ Theorem 12 (A three-class taxonomy consistent with near-saturation). Theorem 11 and Corollary 9 justify: 1. Localized (Dirac) features: σi=0 _i=0 (equivalently CVi=0CV_i=0), i.e. μi _i is Dirac and wiw_i is an eigenvector. 2. Band-localized features: ωi≪1 _i 1, hence σi≤ωi2≪1 _i≤ _i^2 1 though μi _i need not be Dirac. 3. Broadband (delocalized-residue) features: ωi=Θ​(1) _i= (1) and CViCV_i nontrivial; these incur σi>0 _i>0, but under ε -saturation their total leverage mass is small. Proof. (1) By Theorem 8(2), σi=0 _i=0 iff Di=ℓiD_i= _i iff wiw_i is an eigenvector, equivalently μi _i is Dirac. Lemma 10 shows this is also equivalent to CVi=0CV_i=0. (2) If μi _i is supported on a narrow band with ωi≪1 _i 1, Theorem 11 gives σi≤ωi2≪1 _i≤ _i^2 1, capturing “perturbations” of perfect localization. (3) If ωi=Θ​(1) _i= (1) (broad support), then typically Varμi​(λ)Var_ _i(λ) is non-negligible, hence CViCV_i is nontrivial by Lemma 10, and σi>0 _i>0 unless μi _i collapses to Dirac. Under ε -saturation, Corollary 9 bounds the leverage-weighted mass of any set with σi≥τ _i≥τ, which forces the broadband/high-slack population to occupy only a small fraction of the leverage budget. ∎ Empirical interpretation in the perturbative regime. W stays full-rank across sparsity, while ∑iDi/m _iD_i/m falls slightly below 11 in a sparsity-dependent way. 
In the full-rank regime r=mr=m, Theorem 8 forces the identity m−∑iDi=∑iℓi​σi,m- _iD_i= _i _i _i, so the observed deviation from saturation cannot be attributed to rank defect; it an aggregate slack term. At the same time, the paper observes that higher sparsity leads to features becoming “more diffuse across the spectrum” (i.e. less spectrally localized). Lemma 10 formalizes “diffuse across the spectrum” as larger eigenvalue spread Varμi​(λ)Var_ _i(λ) (equivalently larger CViCV_i), which is precisely the obstruction to being an eigenvector/Dirac. This explains why increasing sparsity can correlate with stronger delocalization diagnostics even when global capacity usage remains close to saturated. The apparent coexistence of “minimal slack” with “some delocalization” is resolved by Corollary 9: if ∑iDi≥(1−ε)​r _iD_i≥(1- )r with small ε (near-saturation), then for any fixed delocalization threshold τ, 1r​∑i:σi≥τℓi≤ετ. 1r _i: _i≥τ _i≤ τ. Thus, near-saturation does not forbid delocalized/broadband features; it forbids them from carrying much total leverage mass. Empirically, the leverage-weighted tail mass of high-slack features remains small across sparsity, matching this prediction: delocalization is allowed, but it is confined to a small “residual” subset in leverage (Appendix I). Finally, Theorem 11 gives a clean perturbative mechanism: if most features remain band-localized with small relative half-width ωi≪1 _i 1, then they automatically have σi≤ωi2≪1 _i≤ _i^2 1, so they contribute negligibly to aggregate slack. The sparsity-driven delocalization can then be understood as an increasing broadening of μi _i for a minority of features (broadband residue), which raises their CViCV_i and σi _i without materially changing the global budget because their total leverage weight is small. 
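All of the quantities above are directly computable for any weight matrix. The following minimal NumPy sketch (variable names are ours, not from any released code) checks the capacity budget and defect identity of Theorem 8, the variance identity of Theorem 10, and the Kantorovich slack bound of Theorem 11 on a random $W$:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 9
W = rng.normal(size=(m, n))

F = W @ W.T                       # frame operator F = W W^T
F_pinv = np.linalg.pinv(F)        # Moore-Penrose pseudoinverse F^+
r = np.linalg.matrix_rank(F)

cols = W.T                        # rows of W.T are the columns w_i of W
ell = np.array([w @ F_pinv @ w for w in cols])                 # leverage scores
D = np.array([np.dot(w, w) ** 2 / (w @ F @ w) for w in cols])  # fractional dims
sigma = 1.0 - D / ell                                          # relative slack

# Theorem 8: capacity budget, pointwise bound, exact defect identity.
assert np.isclose(ell.sum(), r)
assert np.all(D <= ell + 1e-10)
assert np.isclose(r - D.sum(), (ell * sigma).sum())

# Spectral measure of feature 0 (Theorems 10 and 11).
lam, U = np.linalg.eigh(F)        # generically F is full rank here, lam > 0
w = cols[0]
p = (U.T @ w) ** 2 / np.dot(w, w) # spectral weights p_{ik}
kappa = p @ lam                   # E_{mu_i}[lambda]
var = p @ (lam - kappa) ** 2      # Var_{mu_i}(lambda)

# Theorem 10: eigenvector residual equals eigenvalue variance.
resid = np.sum((F @ w - kappa * w) ** 2) / np.dot(w, w)
assert np.isclose(resid, var)

# Theorem 11: sigma_i <= omega_i^2 for the support band of mu_i.
supp = lam[p > 1e-12]
omega = (supp.max() - supp.min()) / (supp.max() + supp.min())
assert sigma[0] <= omega ** 2 + 1e-12
```

Note that the defect identity holds exactly (up to floating point) for any $W$, saturated or not; it is the saturation assumption that makes the aggregate slack term small.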
Appendix E Localization vs Delocalization

E.1 HT-SR

Empirical work by Charles Martin and Michael Mahoney has demonstrated that during training, models "self-regularize" as their generalization capabilities increase. The resulting phenomenology of Heavy-Tailed Self-Regularization (HT-SR) [Martin and Mahoney, 2021a, b, 2020] is inspired by Random Matrix Theory (RMT), and its object of study is the empirical spectral density (ESD) of the layer weight matrices. More specifically, consider the real-valued weight matrices $W_l \in \mathbb{R}^{N\times M}$ with singular value decomposition (SVD) $W_l = U\Sigma V^*$. Here $\nu_i = \Sigma_{ii}$ is the $i$th singular value, $p_i = \nu_i^2/\sum_i \nu_i^2$ is the spectral probability distribution over singular values, and $X_l = \frac1N W_l^\top W_l$ is the sample covariance matrix. By computing its eigenvalues $Xv_i = \lambda_i v_i$, where $\lambda_i = \nu_i^2$ for $i = 1,\dots,M$, the authors identify a gradual phase transition of the ESD away from Marchenko–Pastur (MP)/Tracy–Widom (TW) spectral statistics toward $P_{W_{ij}}(x) \sim x^{-(1+\mu)}$ with $\mu > 0$, i.e. Power-Law (PL)/Heavy-Tailed (HT) statistics. While HT-SR's focus is exclusively on eigenvalue behavior, we argue that this observation is the surface-level signature of the Anderson localization research program in mathematical physics [Hundertmark, 2008].

E.2 Anderson Localization

The literature on the spectral behavior of random operators and disordered systems classifies the Tracy–Widom spectral distribution as the eigenvalue statistics of the Gaussian universality class with delocalized eigenvectors, such as the GUE [Tao, 2012], whereas Power-Law/Heavy-Tailed spectral statistics characterize random matrices with localized eigenvectors [Aizenman and Molchanov, 1993, Mirlin et al., 1996, Edwards and Thouless, 1972].

Appendix F Gradient Flow

F.1 Gradient Flow Dynamics

All of our derivations so far have been made under the assumption that the model has saturated its capacity, which makes the results a static description.
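The ESD pipeline described above can be sketched in a few lines of NumPy. A Gaussian i.i.d. layer is used here for concreteness, so the ESD is Marchenko–Pastur-like rather than heavy-tailed; note that with the $1/N$ normalization of $X_l$, the eigenvalues come out as $\nu_i^2/N$:

```python
import numpy as np

rng = np.random.default_rng(42)
N, M = 400, 100
W_l = rng.normal(size=(N, M))     # i.i.d. Gaussian layer: MP statistics, no heavy tail

nu = np.linalg.svd(W_l, compute_uv=False)   # singular values nu_i
p = nu**2 / np.sum(nu**2)                   # spectral probability distribution p_i
X_l = W_l.T @ W_l / N                       # sample covariance matrix
lam = np.linalg.eigvalsh(X_l)               # its eigenvalues

assert np.isclose(p.sum(), 1.0)
# With the 1/N normalization, the eigenvalues of X_l are nu_i^2 / N.
assert np.allclose(np.sort(lam), np.sort(nu**2 / N))

# The bulk stays near the Marchenko-Pastur support [(1-sqrt(q))^2, (1+sqrt(q))^2], q = M/N.
q = M / N
assert lam.max() < (1 + np.sqrt(q))**2 * 1.3
```

Replacing the Gaussian initializer with the weights of a trained layer and fitting a power-law exponent to the tail of `lam` is the basic HT-SR diagnostic.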
If we are to provide a description of the dynamics, we need to be cautious: explicitly solving similar systems from mathematical physics [Atiyah and Sutcliffe, 2003, Smale, 1998] remains an open problem. That said, the tools introduced in Section 3 give us enough of a handle on the dynamics to show that our initial capacity saturation assumption is well-founded.

Theorem 13 (Gradient Flow). Let $M = W^\top W$ be the Gram matrix of the autoencoder and $\Phi = \operatorname{Sym}(\nabla_M L)$ be the gradient kernel, with explicit form (derived in F.2)
$$\Phi = -\mathbb{E}_x\big[\delta(x)x^\top + x\,\delta(x)^\top\big], \qquad \delta(x) = I \odot (x - x') \odot \mathbf{1}(Mx + b > 0).$$
Then:
$$\dot M = -\{M,\Phi\} = -(M\Phi + \Phi M) \tag{F.1}$$

Corollary 14 (Projector Dynamics).
$$\dot E_e = \sum_{f\neq e} \frac{\lambda_e + \lambda_f}{\lambda_f - \lambda_e}\,\big(E_e\Phi E_f + E_f\Phi E_e\big) \tag{F.2}$$

Corollary 15 (Eigenvalue Drift).
$$\dot\lambda_e = -2\lambda_e\,\frac{\operatorname{tr}(E_e\Phi E_e)}{\dim(E_e)} \tag{F.3}$$

Corollary 16 (Spectral Mass Transport). Define $q_{i,e}(t) = (E_e(t))_{ii} = \lambda_e^+\,p_{i,e}\,M_{ii}$, representing the mass of feature $i$ supported on eigenspace $e$. Its time evolution is given by:
$$\dot q_{i,e} = \sum_{f\neq e} T^{(i)}_{e\to f} = \sum_{f\neq e} 2\Big(\frac{\lambda_e + \lambda_f}{\lambda_f - \lambda_e}\Big)(E_e\Phi E_f)_{ii} \tag{F.4}$$

Corollary 17 (Stability of Capacity Saturation). A configuration is a spectral fixed point if the gradient kernel $\Phi$ commutes with every spectral projector of the Gram operator: $[E_e,\Phi] = 0$ for all $e$. This condition is satisfied by tight frames under uniform sparsity.

Let $W \in \mathbb{R}^{m\times n}$ be the weight matrix and $M = W^\top W \in \mathbb{R}^{n\times n}$ its Gram matrix. The standard gradient flow on weights is $\dot W = -\nabla_W L$. We assume gradients are defined using the ordinary dot product on the parameter coordinates, i.e. treating the parameters as living in flat $\mathbb{R}^{mn}$ with no curvature or weighting. Using the Frobenius inner product $\langle A,B\rangle_F = \operatorname{tr}(A^\top B)$, we have:
$$dL = \sum_{a,b} \frac{\partial L}{\partial W_{ab}}\,dW_{ab} = \operatorname{tr}\big((\nabla_W L)^\top dW\big) = \langle \nabla_W L, dW\rangle_F.$$
Let us derive the induced flow on $M$. First, compute the differential: $dM = d(W^\top W) = dW^\top W + W^\top dW$. Then
$$\langle \nabla_M L, dM\rangle_F = \operatorname{tr}\big((\nabla_M L)^\top dM\big) = \operatorname{tr}\big((\nabla_M L)^\top dW^\top W\big) + \operatorname{tr}\big((\nabla_M L)^\top W^\top dW\big).$$
To convert this into a Frobenius inner product with $dW$ on the right, we use cyclicity of the trace together with $\operatorname{tr}(A) = \operatorname{tr}(A^\top)$. For the first term:
$$\operatorname{tr}\big((\nabla_M L)^\top dW^\top W\big) = \operatorname{tr}\big(W^\top dW\,\nabla_M L\big) = \operatorname{tr}\big((\nabla_M L)\,W^\top dW\big) = \operatorname{tr}\big((W(\nabla_M L)^\top)^\top dW\big) = \langle W(\nabla_M L)^\top, dW\rangle_F.$$
Similarly, for the second term:
$$\operatorname{tr}\big((\nabla_M L)^\top W^\top dW\big) = \operatorname{tr}\big((W\nabla_M L)^\top dW\big) = \langle W\nabla_M L, dW\rangle_F.$$
Hence
$$dL = \langle W(\nabla_M L)^\top + W\nabla_M L,\; dW\rangle_F = \langle \nabla_W L, dW\rangle_F \;\Longrightarrow\; \nabla_W L = W\big(\nabla_M L + (\nabla_M L)^\top\big).$$
As such, writing $\Psi := \nabla_M L$, the induced flow on $M$ is
$$\dot M = \dot W^\top W + W^\top \dot W = -(\nabla_W L)^\top W - W^\top(\nabla_W L) = -(\Psi + \Psi^\top)M - M(\Psi + \Psi^\top).$$
This shows that only the symmetrized gradient drives the Gram flow (as expected from the Frobenius inner product), and it tells us to define the gradient kernel as
$$\Phi := 2\operatorname{Sym}(\Psi) = \Psi + \Psi^\top.$$
As such, the induced gradient flow becomes
$$\dot M = -\{M,\Phi\} = -(\Phi M + M\Phi) \tag{F.5}$$
driven by the gradient kernel $\Phi$.

F.2 Explicit Form

Recall the MSE loss function:
$$L = \frac12\sum_x p(x)\sum_i I_i\Big(x_i - \operatorname{ReLU}\Big(\sum_k M_{ik}x_k + b_i\Big)\Big)^2.$$
By the chain rule, with $u_i = \sum_k M_{ik}x_k + b_i$ and $x'_i = \operatorname{ReLU}(u_i)$:
$$\frac{\partial L}{\partial M_{ij}} = \sum_l \frac{\partial L}{\partial x'_l}\cdot\frac{\partial x'_l}{\partial u_l}\cdot\frac{\partial u_l}{\partial M_{ij}},$$
where
$$\frac{\partial L}{\partial x'_l} = -I_l(x_l - x'_l), \qquad \frac{\partial x'_l}{\partial u_l} = \mathbf{1}(u_l > 0), \qquad \frac{\partial u_l}{\partial M_{ij}} = \frac{\partial}{\partial M_{ij}}\Big(\sum_k M_{lk}x_k + b_l\Big),$$
and the last factor is nonzero only when $l = i$ (in which case it equals $x_j$). As such, in expectation form,
$$\frac{\partial L}{\partial M_{ij}} = -\mathbb{E}_x\Big[I_i(x_i - x'_i)\cdot\mathbf{1}\Big(\sum_k M_{ik}x_k + b_i > 0\Big)\cdot x_j\Big].$$
To recover the matrix form, define the effective error vector
$$\delta(x) = I \odot (x - x') \odot \mathbf{1}(Mx + b > 0),$$
where $\odot$ is the Hadamard product. Observe that the element $(\nabla_M L)_{ij}$ is the expectation of the product of the $i$th error term $\delta_i$ and the $j$th input $x_j$. Hence:
$$\nabla_M L = -\mathbb{E}_x\big[\delta(x)x^\top\big], \qquad \Phi = -\mathbb{E}_x\big[\delta(x)x^\top + x\,\delta(x)^\top\big] = -\sum_x p(x)\big[\delta(x)x^\top + x\,\delta(x)^\top\big].$$
The sparsity regimes follow from decomposing $\Phi$ into its diagonal (feature benefit) components $\Phi_{ii}$ and off-diagonal (interference) components $\Phi_{ij}$, $i\neq j$. More specifically,
$$\Phi_{ij} = -\sum_x p(x)\big(\delta_i(x)x_j + x_i\delta_j(x)\big) \;\Longrightarrow\; \Phi_{ii} = -2\sum_x p(x)\,\delta_i(x)x_i = -2\sum_x p(x)\big[I_i(x_i - x'_i)\cdot\mathbf{1}(u_i > 0)\cdot x_i\big].$$

F.3 Association Scheme Reduction

Lemma F.1 ($\Gamma$-invariance of $M, G, \Phi$). Per Lemmas 4.5 and 4.7, we have the following expansions over the association scheme:
$$M(t) = \sum_{r=0}^R \theta_r(t)A_r, \qquad \Phi(t) = \sum_{r=0}^R \varphi_r(t)A_r.$$
Equating coefficients in the induced gradient flow equation (F.1) and substituting directly:
$$M\Phi = \Big(\sum_r \theta_r A_r\Big)\Big(\sum_s \varphi_s A_s\Big) = \sum_{r,s}\theta_r\varphi_s(A_rA_s) = \sum_{r,s}\theta_r\varphi_s\sum_u c_{rs}^u A_u = \sum_u\Big(\sum_{r,s}\theta_r\varphi_s c_{rs}^u\Big)A_u,$$
$$\Phi M = \sum_{r,s}\varphi_r\theta_s(A_rA_s) = \sum_u\Big(\sum_{r,s}\varphi_r\theta_s c_{rs}^u\Big)A_u,$$
$$\dot M = -(M\Phi + \Phi M) = -\sum_u\Big(\sum_{r,s}\theta_r\varphi_s c_{rs}^u + \sum_{r,s}\varphi_r\theta_s c_{rs}^u\Big)A_u$$
$$\Longrightarrow\; \dot\theta_u = -\sum_{r,s}\theta_r\varphi_s c_{rs}^u - \sum_{r,s}\varphi_r\theta_s c_{rs}^u = -\sum_{r,s}\theta_r\varphi_s\big(c_{rs}^u + c_{sr}^u\big) \tag{F.6}$$

Theorem 18 (Gradient Flow (Theorem 6)).
Let $M = W^\top W$ and define the (symmetric) gradient kernel $\Phi = \operatorname{Sym}(\nabla_M L)$ with explicit form
$$\Phi = -\mathbb{E}_x\big[\delta(x)x^\top + x\,\delta(x)^\top\big], \qquad \delta(x) = I \odot (x - x') \odot \mathbf{1}(Mx + b > 0),$$
as in Appendix F.2. Then the induced Gram flow is
$$\dot M = -\{M,\Phi\} = -(M\Phi + \Phi M).$$

Proof. Let $W(t)$ follow the weight-space gradient flow $\dot W = -\nabla_W L$. Differentiating $M = W^\top W$:
$$\dot M = \dot W^\top W + W^\top \dot W = -(\nabla_W L)^\top W - W^\top(\nabla_W L).$$
To express $\nabla_W L$ in terms of $\nabla_M L$, use $dM = dW^\top W + W^\top dW$ and Frobenius inner products:
$$dL = \langle \nabla_M L, dM\rangle = \langle \nabla_M L, dW^\top W + W^\top dW\rangle = \langle W\big(\nabla_M L + (\nabla_M L)^\top\big), dW\rangle.$$
Hence
$$\nabla_W L = W\big(\nabla_M L + (\nabla_M L)^\top\big) = W\Phi, \qquad \Phi = \nabla_M L + (\nabla_M L)^\top = 2\operatorname{Sym}(\nabla_M L).$$
Substituting into $\dot M$ gives
$$\dot M = -(W\Phi)^\top W - W^\top(W\Phi) = -\big(\Phi W^\top W + W^\top W\Phi\big) = -(\Phi M + M\Phi) = -\{M,\Phi\}.$$
The explicit form of $\Phi$ is computed in Appendix F.2, yielding the stated expression. ∎

Theorem 19 (Projector Dynamics (Corollary 7)). Let $M = \sum_e \lambda_e E_e$ be the spectral decomposition of $M$, with $E_e$ the orthogonal spectral projectors. Along the Gram flow $\dot M = -\{M,\Phi\}$,
$$\dot E_e = \sum_{f\neq e} \frac{\lambda_e + \lambda_f}{\lambda_f - \lambda_e}\big(E_e\Phi E_f + E_f\Phi E_e\big).$$

Proof. Apply Kato's first-order eigenprojection drift formula (Lemma B.7) with $M^{(1)} = \dot M$. For an isolated eigenvalue $\lambda_e$ with projector $E_e$, Lemma B.7 gives
$$\dot E_e = -E_e\dot M\,S_{\lambda_e} - S_{\lambda_e}\dot M\,E_e,$$
where for symmetric $M$ the reduced resolvent is (Lemma B.6)
$$S_{\lambda_e} = \sum_{f\neq e}\frac{1}{\lambda_f - \lambda_e}E_f.$$
Substituting $\dot M = -\{M,\Phi\} = -(M\Phi + \Phi M)$ gives
$$\dot E_e = E_e(M\Phi + \Phi M)S_{\lambda_e} + S_{\lambda_e}(M\Phi + \Phi M)E_e.$$
Using $ME_f = \lambda_f E_f$ and $E_e M = \lambda_e E_e$, for $f\neq e$ we have
$$E_e(M\Phi + \Phi M)E_f = E_e M\Phi E_f + E_e\Phi ME_f = (\lambda_e + \lambda_f)\,E_e\Phi E_f.$$
Therefore
$$E_e(M\Phi + \Phi M)S_{\lambda_e} = \sum_{f\neq e}\frac{\lambda_e + \lambda_f}{\lambda_f - \lambda_e}\,E_e\Phi E_f,$$
and similarly
$$S_{\lambda_e}(M\Phi + \Phi M)E_e = \sum_{f\neq e}\frac{\lambda_e + \lambda_f}{\lambda_f - \lambda_e}\,E_f\Phi E_e.$$
Adding yields the stated formula. ∎

Theorem 20 (Eigenvalue Drift (Corollary 8)). Along the Gram flow $\dot M = -\{M,\Phi\}$, the eigenvalues satisfy
$$\dot\lambda_e = -2\lambda_e\,\frac{\operatorname{tr}(E_e\Phi E_e)}{\dim(E_e)}.$$

Proof. Lemma B.8 gives the first-order eigenvalue drift for eigenvalue $\lambda_e$ with spectral projector $E_e$ of rank $d_e = \operatorname{tr}(E_e)$:
$$\dot\lambda_e = \frac{1}{d_e}\operatorname{tr}(\dot M E_e).$$
Now $\dot M = -(M\Phi + \Phi M)$, so $\operatorname{tr}(\dot M E_e) = -\operatorname{tr}(M\Phi E_e) - \operatorname{tr}(\Phi ME_e)$. Using cyclicity and $ME_e = \lambda_e E_e$,
$$\operatorname{tr}(M\Phi E_e) = \operatorname{tr}(\Phi E_e M) = \lambda_e\operatorname{tr}(\Phi E_e), \qquad \operatorname{tr}(\Phi ME_e) = \operatorname{tr}(\Phi\lambda_e E_e) = \lambda_e\operatorname{tr}(\Phi E_e).$$
Also $\operatorname{tr}(\Phi E_e) = \operatorname{tr}(E_e\Phi E_e)$, since $E_e$ is idempotent and the trace is cyclic. Hence
$$\operatorname{tr}(\dot M E_e) = -2\lambda_e\operatorname{tr}(E_e\Phi E_e),$$
and dividing by $d_e = \dim(E_e)$ gives the claim. ∎

Theorem 21 (Spectral Mass Transport (Corollary 9)). Define
$$q_{i,e}(t) := (E_e(t))_{ii} = \lambda_e^+\,p_{i,e}(t)\,M_{ii}(t),$$
interpreted as the "mass" of coordinate/feature $i$ supported in eigenspace $e$. Then
$$\dot q_{i,e} = \sum_{f\neq e} T^{(i)}_{e\to f}, \qquad T^{(i)}_{e\to f} := 2\Big(\frac{\lambda_e + \lambda_f}{\lambda_f - \lambda_e}\Big)(E_e\Phi E_f)_{ii}.$$

Proof. The identity $q_{i,e} = (E_e)_{ii} = \lambda_e^+\,p_{i,e}\,M_{ii}$ is given by Lemma B.11(2), together with $M_{ii} = \|W_i\|^2$. Differentiating $q_{i,e} = (E_e)_{ii}$ gives $\dot q_{i,e} = (\dot E_e)_{ii}$. Substitute Corollary 7. Using symmetry of $\Phi$ and of the $E_e$, note that for $f\neq e$,
$$(E_f\Phi E_e)_{ii} = e_i^\top E_f\Phi E_e e_i = (E_fe_i)^\top\Phi(E_ee_i) = (E_ee_i)^\top\Phi(E_fe_i) = (E_e\Phi E_f)_{ii}.$$
Thus the paired terms add, giving
$$\dot q_{i,e} = \sum_{f\neq e} 2\Big(\frac{\lambda_e + \lambda_f}{\lambda_f - \lambda_e}\Big)(E_e\Phi E_f)_{ii}.$$
Defining $T^{(i)}_{e\to f}$ as in the statement yields the transport form. ∎

Let $\Gamma \le S_{n_i}$ act on the indices of the block $\Omega_i$ and let $\{P_\gamma\}_{\gamma\in\Gamma}$ be the associated permutation matrices. Define the commutant (centralizer)
$$A_\Gamma := \big\{A \in \mathbb{R}^{n_i\times n_i} : P_\gamma AP_\gamma^\top = A \;\;\forall\gamma\in\Gamma\big\}.$$
Equivalently, $A_\Gamma = \{A : P_\gamma A = AP_\gamma \;\forall\gamma\}$.

Lemma F.2 (Gradient equivariance under $\Gamma$-conjugation). Assume $L : \mathbb{R}^{n_i\times n_i}\to\mathbb{R}$ is Fréchet differentiable (with respect to the Frobenius inner product) and satisfies
$$L(P_\gamma MP_\gamma^\top) = L(M) \quad \forall\gamma\in\Gamma,\;\forall M.$$
Then the gradient is $\Gamma$-equivariant:
$$\nabla L(P_\gamma MP_\gamma^\top) = P_\gamma\big(\nabla L(M)\big)P_\gamma^\top \quad \forall\gamma\in\Gamma,\;\forall M.$$

Proof. Fix $\gamma\in\Gamma$ and define $\Psi(M) := P_\gamma MP_\gamma^\top$. The invariance hypothesis is $L(\Psi(M)) = L(M)$ for all $M$. Differentiate at $M$ in an arbitrary direction $H$:
$$\frac{d}{d\varepsilon}\Big|_{\varepsilon=0} L\big(\Psi(M + \varepsilon H)\big) = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} L(M + \varepsilon H).$$
Since $\Psi(M + \varepsilon H) = \Psi(M) + \varepsilon\big(P_\gamma HP_\gamma^\top\big)$, the definition of the gradient gives
$$\big\langle \nabla L(\Psi(M)),\,P_\gamma HP_\gamma^\top\big\rangle_F = \big\langle \nabla L(M),\,H\big\rangle_F.$$
Using invariance of the Frobenius inner product under orthogonal conjugation, $\langle A, P_\gamma HP_\gamma^\top\rangle_F = \langle P_\gamma^\top AP_\gamma, H\rangle_F$, we obtain
$$\big\langle P_\gamma^\top\nabla L(\Psi(M))P_\gamma,\,H\big\rangle_F = \big\langle \nabla L(M),\,H\big\rangle_F \quad \forall H.$$
Hence $P_\gamma^\top\nabla L(P_\gamma MP_\gamma^\top)P_\gamma = \nabla L(M)$, which is equivalent to the claim. ∎

Lemma F.3 ($A_\Gamma$-membership of the symmetrized gradient). Assume the hypotheses of Lemma F.2. If $M\in A_\Gamma$, then $\nabla L(M)\in A_\Gamma$. Consequently,
$$\Phi(M) := \operatorname{Sym}(\nabla L(M)) := \tfrac12\big(\nabla L(M) + \nabla L(M)^\top\big) \in A_\Gamma.$$

Proof. If $M\in A_\Gamma$, then $P_\gamma MP_\gamma^\top = M$ for all $\gamma$. Lemma F.2 yields
$$\nabla L(M) = \nabla L(P_\gamma MP_\gamma^\top) = P_\gamma\big(\nabla L(M)\big)P_\gamma^\top,$$
so $\nabla L(M)\in A_\Gamma$. Since $P_\gamma AP_\gamma^\top = A$ implies $P_\gamma A^\top P_\gamma^\top = A^\top$, we also have $\nabla L(M)^\top\in A_\Gamma$.
Because $A_\Gamma$ is a linear subspace, the average $\operatorname{Sym}(\nabla L(M)) = \tfrac12\big(\nabla L(M) + \nabla L(M)^\top\big)$ belongs to $A_\Gamma$. ∎

Algebraic closure of $A_\Gamma$. $A_\Gamma$ is a matrix algebra: if $X, Y\in A_\Gamma$ then $X + Y\in A_\Gamma$, $\alpha X\in A_\Gamma$ for all $\alpha\in\mathbb{R}$, and $XY\in A_\Gamma$. Moreover, if $X\in A_\Gamma$ then $X^\top\in A_\Gamma$, hence $\operatorname{Sym}(X)\in A_\Gamma$.

Proof. Linearity is immediate from the defining relation $P_\gamma(\cdot)P_\gamma^\top = (\cdot)$. For products, if $X, Y\in A_\Gamma$, then for every $\gamma\in\Gamma$,
$$P_\gamma(XY)P_\gamma^\top = \big(P_\gamma XP_\gamma^\top\big)\big(P_\gamma YP_\gamma^\top\big) = XY,$$
so $XY\in A_\Gamma$. For the transpose, if $P_\gamma XP_\gamma^\top = X$, then taking transposes gives $P_\gamma X^\top P_\gamma^\top = X^\top$ (since $P_\gamma^\top = P_\gamma^{-1}$), so $X^\top\in A_\Gamma$. Finally, $\operatorname{Sym}(X) = \tfrac12(X + X^\top)$ belongs to $A_\Gamma$ by linearity. ∎

Lemma F.4 (Invariant-block closure of the Gram flow). Assume $\Phi(t)\in A_\Gamma$ for all $t$ and consider the Gram flow
$$\dot M(t) = -\big(M(t)\Phi(t) + \Phi(t)M(t)\big).$$
If $M(t_0)\in A_\Gamma$ at some time $t_0$, then $M(t)\in A_\Gamma$ for all $t\ge t_0$.

Proof. Since $M(t_0), \Phi(t_0)\in A_\Gamma$ and $A_\Gamma$ is an algebra (see above), we have $M(t_0)\Phi(t_0)\in A_\Gamma$ and $\Phi(t_0)M(t_0)\in A_\Gamma$, hence $\dot M(t_0)\in A_\Gamma$. More generally, whenever $M(t)\in A_\Gamma$ we get $\dot M(t)\in A_\Gamma$ by the same closure, so the vector field is tangent to the linear subspace $A_\Gamma$ everywhere on $A_\Gamma$. Therefore the solution starting at $M(t_0)\in A_\Gamma$ remains in $A_\Gamma$ for all $t\ge t_0$. ∎

Lemma F.5 (Invariant-block closure of the Gram flow: alternative proof via uniqueness). Under the same hypotheses as Lemma F.4, if $M(t_0)\in A_\Gamma$ then $M(t)\in A_\Gamma$ for all $t\ge t_0$.

Proof. Fix any $\gamma\in\Gamma$ and define $\tilde M(t) := P_\gamma M(t)P_\gamma^\top$. Then
$$\dot{\tilde M}(t) = P_\gamma\dot M(t)P_\gamma^\top = -\big(\tilde M(t)\,\tilde\Phi(t) + \tilde\Phi(t)\,\tilde M(t)\big),$$
where $\tilde\Phi(t) := P_\gamma\Phi(t)P_\gamma^\top$. Since $\Phi(t)\in A_\Gamma$, we have $\tilde\Phi(t) = \Phi(t)$ for all $t$, hence
$$\dot{\tilde M}(t) = -\big(\tilde M(t)\Phi(t) + \Phi(t)\tilde M(t)\big).$$
Moreover, because $M(t_0)\in A_\Gamma$, we have $\tilde M(t_0) = P_\gamma M(t_0)P_\gamma^\top = M(t_0)$. Thus $M$ and $\tilde M$ solve the same ODE with the same initial condition at $t_0$. By uniqueness of solutions to linear matrix ODEs, $\tilde M(t) = M(t)$ for all $t\ge t_0$. Therefore $P_\gamma M(t)P_\gamma^\top = M(t)$ for all $\gamma\in\Gamma$ and all $t\ge t_0$, i.e. $M(t)\in A_\Gamma$ for all $t\ge t_0$. ∎

Theorem 22 (Stability of Capacity Saturation (Corollary 10)). A configuration is a spectral fixed point if the gradient kernel $\Phi$ commutes with every Gram spectral projector:
$$[E_e,\Phi] = 0 \quad \forall e.$$
In particular, tight-frame geometries under uniform sparsity satisfy this condition (hence are spectral fixed points).

Proof. If $[E_e,\Phi] = 0$ for all $e$, then for $f\neq e$,
$$E_e\Phi E_f = E_eE_f\Phi = 0,$$
since commuting gives $\Phi E_f = E_f\Phi$ and orthogonality gives $E_eE_f = 0$. Plugging into Corollary 7 yields $\dot E_e = 0$ for all $e$. Consequently, Corollary 9 gives $\dot q_{i,e} = 0$ for all $i, e$: there is no spectral mass transport between eigenspaces, so the spectral decomposition is dynamically stable.

For the tight-frame/uniform-sparsity claim: Appendix F.3 (Lemma F.1) expands both $M(t)$ and $\Phi(t)$ in the same association-scheme (orbital) basis $\{A_r\}$:
$$M(t) = \sum_{r=0}^R \theta_r(t)A_r, \qquad \Phi(t) = \sum_{r=0}^R \varphi_r(t)A_r.$$
When the centralizer is a Bose–Mesner algebra (commutative association scheme), all $A_r$ commute, hence $M(t)$ and $\Phi(t)$ commute. In a symmetric commuting family, $\Phi$ commutes with the spectral projectors $\{E_e\}$ of $M$ (simultaneous diagonalization), so $[E_e,\Phi] = 0$ for all $e$, i.e. the configuration is a spectral fixed point. ∎
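The induced Gram flow and the eigenvalue-drift formula can be sanity-checked numerically. The sketch below uses a simple smooth surrogate loss $L(M) = \tfrac12\|M - A\|_F^2$ (our choice, in place of the ReLU reconstruction loss) so that $\nabla_M L = M - A$ is available in closed form; the flow identities themselves do not depend on which differentiable $L(M)$ is used:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 5
W = rng.normal(size=(m, n))
A = rng.normal(size=(n, n)); A = (A + A.T) / 2   # fixed symmetric target (assumption)

M = W.T @ W
Psi = M - A                                      # nabla_M L for L = 0.5 ||M - A||_F^2
Phi = Psi + Psi.T                                # gradient kernel Phi = 2 Sym(nabla_M L)
grad_W = W @ Phi                                 # nabla_W L = W (Psi + Psi^T)

h = 1e-7
W_next = W - h * grad_W                          # one Euler step of dW/dt = -nabla_W L
M_dot_fd = (W_next.T @ W_next - M) / h           # finite-difference dM/dt
M_dot_th = -(M @ Phi + Phi @ M)                  # Gram flow: dM/dt = -(M Phi + Phi M)
assert np.allclose(M_dot_fd, M_dot_th, atol=1e-3)

# Eigenvalue drift for a simple spectrum: lam_e' = -2 lam_e (u_e^T Phi u_e).
lam, U = np.linalg.eigh(M)
lam_next = np.linalg.eigh(M + h * M_dot_th)[0]
lam_dot_fd = (lam_next - lam) / h
lam_dot_th = -2 * lam * np.einsum('ie,ij,je->e', U, Phi, U)  # -2 lam_e u_e^T Phi u_e
pos = lam > 1e-8                                 # ignore the rank-deficient zero modes
assert np.allclose(lam_dot_fd[pos], lam_dot_th[pos], atol=1e-3)
```

With $\dim(E_e) = 1$ for the simple positive eigenvalues here, $\operatorname{tr}(E_e\Phi E_e) = u_e^\top\Phi u_e$, which is the quantity the `einsum` computes.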
Appendix G Non-uniform Sparsity

We observe the same type of projective linearity as in the uniform-sparsity case.

Figures 8-14: plots for the non-uniform sparsity setting.

Appendix H Related Work

Before we begin our rigorous examination of the latter model, let us provide context for other relevant work on polysemanticity:

• Lecomte et al. (2023/2024) [Lecomte et al., 2023]: Incidental Polysemanticity. Lecomte et al. argue that polysemanticity can arise incidentally, even when capacity suffices for monosemantic codes. Using theory and experiments, they show that random initialization, regularization, and neural noise can fortuitously align multiple features to one neuron early in training, after which gradient dynamics reinforce the overlap. This complements capacity-driven superposition: polysemanticity can be a stable attractor without being strictly necessary for performance.

• Adler & Shavit (2024/2025) [Adler and Shavit, 2024]: Complexity of Computing in Superposition. Adler and Shavit develop complexity-theoretic bounds for computing with superposed features, proving that a broad class of tasks (e.g., permutations, pairwise logic) needs at least $\Omega(\sqrt{m'\log m'})$ neurons and $\Omega(m'\log m')$ parameters to compute $m'$ outputs in superposition. They also provide near-matching constructive upper bounds (e.g., pairwise AND with $O(\sqrt{m'\log m'})$ neurons), revealing large but not unbounded efficiency gains. The results distinguish "representing" features in superposition (much cheaper) from "computing" with them (provably costlier), setting principled limits that any interpretability or compression method must respect.

• Klindt et al. (2025) [Klindt et al., 2025]: From Superposition to Sparse Codes. Klindt et al.
propose a principled route from superposed activations to interpretable factors by leveraging three components: identifiability (classification representations recover latent features up to an invertible linear transform), sparse coding/dictionary learning to find a concept-aligned basis, and quantitative interpretability metrics. In this view, deep nets often linearly overlay concepts; post-hoc sparse coding can invert the mixing and yield monosemantic directions without retraining the model.

• Hollard et al. (2025) [Hollard et al., 2025]: Superposition in Low-Parameter Vision Models. Hollard et al. study modern sub-1.5M-parameter CNNs and show that bottleneck designs and superlinear activations exacerbate interference (feature overlap) in feature maps, limiting accuracy scaling. By systematically varying bottleneck structures, they identify design choices that reduce interference and introduce a "NoDepth Bottleneck" that improves ImageNet scaling within tight parameter budgets.

• Pertl et al. (2025) [Pertl et al., 2025]: Superposition in GNNs. Pertl et al. analyze superposition in graph neural networks by extracting feature directions at node and graph levels and studying their basis-invariant overlaps. They observe a width-driven phase pattern in overlap, topology-induced mixing at the node level, and mitigation via sharper pooling that increases axis alignment; shallow models can fall into metastable low-rank embeddings.

• Hesse et al. (2025) [Hesse et al., 2025]: Disentangling Polysemantic Channels in CNNs. Hesse et al. present an algorithmic surgery that splits a polysemantic CNN channel into multiple channels, each responding to a single concept, by exploiting the distinct upstream activation patterns that feed the mixed unit. The method rewires a pretrained network (without retraining) to produce explicit, monosemantic channels, improving the clarity of feature visualizations and enabling standard mechanistic tools.
Unlike "virtual" decompositions, this yields concrete network components aligned to concepts, demonstrating a practical path to reducing superposition post hoc.

• Dreyer et al. (2024) [Dreyer et al., 2024]: PURE: Circuits-Based Decomposition. Dreyer et al. introduce PURE, a post-hoc circuits method that decomposes a polysemantic neuron into multiple virtual monosemantic units by identifying the distinct upstream subgraphs (circuits) responsible for each concept. The approach improves concept-level visualizations (e.g., via CLIP-based evaluation) and does not modify weights, instead reattributing behavior across disjoint computation paths. PURE shows that much polysemanticity reflects circuit superposition; separating circuits yields purer conceptual units without architectural changes.

Appendix I Plots

Figures 15-18: additional plots. Figures 19-39: Jump, Rayleigh, and Volatility diagnostics for models m96, m112-0, m112-2, m112-4, m160, m208, and m256.