
Paper deep dive

Exploring Collatz Dynamics with Human-LLM Collaboration

Edward Y. Chang

Year: 2026 | Venue: arXiv preprint | Area: math.DS | Type: Preprint | Embeddings: 62

Abstract

Abstract: We investigate structural properties of the Collatz iteration through two phenomena observed in large computational exploration: modular scrambling of residue classes and a burst–gap decomposition of trajectories. We prove several structural results, including a modular scrambling lemma showing that the gap-return map acts as an exact bijection on high bits, a persistent exit lemma characterizing gap structure after persistent states, and a decay property for known portions of binary representations under gap-return dynamics. We further prove that, in the modular model, gap lengths and $2$-adic valuations follow geometric distributions, while persistent run lengths are geometric with expected burst length $E[B]=2$; together these predict strict orbit contraction. These results suggest a conditional framework in which convergence would follow from suitable orbitwise hypotheses on burst and gap lengths, which in turn are suggested by an orbit equidistribution conjecture. However, the key hypotheses remain open, and the framework is exploratory rather than a complete reduction. The paper also documents the human-LLM collaboration through which these observations were developed.

Tags

ai-safety (imported, 100%) | mathds (suggested, 92%) | preprint (suggested, 88%)

Links


Intelligence

Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 95%

Last extracted: 3/13/2026, 12:58:26 AM

Summary

The paper investigates the structural properties of the Collatz iteration (3n+1 problem) through computational exploration, focusing on modular scrambling and a burst-gap decomposition of trajectories. It proves several structural results, including a 1/4 persistent-transition law and a persistent exit lemma, and proposes a conditional framework for convergence based on orbitwise hypotheses regarding burst and gap lengths. The research was conducted through a structured human-LLM collaboration.

Entities (5)

Collatz conjecture · mathematical-conjecture · 100%
Edward Y. Chang · researcher · 100%
Syracuse map · mathematical-function · 100%
1/4 Persistent-Transition Law · mathematical-theorem · 95%
Persistent Exit Lemma · mathematical-lemma · 95%

Relation Signals (3)

Edward Y. Chang authored Exploring Collatz Dynamics with Human-LLM Collaboration

confidence 100% · Exploring Collatz Dynamics with Human–LLM Collaboration Edward Y. Chang

Syracuse map reformulates Collatz conjecture

confidence 90% · A common analytical reformulation is the Syracuse map, or odd-to-odd iteration

1/4 Persistent-Transition Law proves Collatz conjecture

confidence 80% · The conjecture reduces to showing that actual orbits cannot systematically concentrate on the persistent class beyond the rho_crit threshold.

Cypher Suggestions (2)

Retrieve the author of the paper. · confidence 95% · unvalidated

MATCH (a:Entity {entity_type: 'Researcher'})-[:AUTHORED]->(p:Paper {title: 'Exploring Collatz Dynamics with Human-LLM Collaboration'}) RETURN a.name

Find all mathematical lemmas mentioned in the paper. · confidence 90% · unvalidated

MATCH (e:Entity {entity_type: 'Mathematical Lemma'}) RETURN e.name

Full Text

61,875 characters extracted from source content.


Exploring Collatz Dynamics with Human–LLM Collaboration

Edward Y. Chang
Stanford University
echang@cs.stanford.edu

This work was facilitated through a structured human–LLM collaboration; see the Methodology note in Section 9.

Abstract

We investigate structural properties of the Collatz iteration through two phenomena observed in large computational exploration: modular scrambling of residue classes and a burst–gap decomposition of trajectories. We prove several structural results, including a modular scrambling lemma showing that the gap-return map acts as an exact bijection on high bits, a persistent exit lemma characterizing gap structure after persistent states, and a decay property for known portions of binary representations under gap-return dynamics. We further prove that, in the modular model, gap lengths and $2$-adic valuations follow geometric distributions, while persistent run lengths are geometric with expected burst length $E[B]=2$; together these predict strict orbit contraction. These results suggest a conditional framework in which convergence would follow from suitable orbitwise hypotheses on burst and gap lengths, which in turn are suggested by an orbit equidistribution conjecture. However, the key hypotheses remain open, and the framework is exploratory rather than a complete reduction. The paper also documents the human–LLM collaboration through which these observations were developed.

1 Introduction

The Collatz conjecture, also known as the $3n+1$ problem, is one of the simplest unsolved problems in number theory. Starting from a positive integer $n$, the Collatz map iterates by

$$T(n)=\begin{cases} n/2 & \text{if } n \text{ is even},\\ 3n+1 & \text{if } n \text{ is odd}.\end{cases}$$

The conjecture asserts that every positive integer eventually reaches the cycle $1\to 4\to 2\to 1$ under repeated iteration. The problem was first posed by Lothar Collatz in 1937 and has since become one of the most widely studied problems in elementary number theory.
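As a concrete illustration of the definition above, the following minimal Python sketch iterates the Collatz map until the orbit first reaches $1$. The function names `collatz_step` and `collatz_orbit` and the `max_steps` cutoff are illustrative assumptions, not part of the paper.

```python
def collatz_step(n: int) -> int:
    """One application of the Collatz map: n/2 if n is even, 3n+1 if n is odd."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_orbit(n: int, max_steps: int = 10_000) -> list[int]:
    """Iterate from n until the orbit first hits 1.

    max_steps is a hypothetical safety cutoff; the conjecture itself
    asserts that the loop always terminates for n >= 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    orbit = [n]
    while orbit[-1] != 1 and len(orbit) <= max_steps:
        orbit.append(collatz_step(orbit[-1]))
    return orbit


print(collatz_orbit(7))
# [7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
```

Once an orbit reaches $1$ it stays on the cycle $1\to 4\to 2\to 1$, which is why the sketch stops at the first occurrence of $1$.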
Surveys of the extensive literature include the work of Lagarias [9], the monograph of Wirsching [14], and the collection edited by Lagarias [10]. Early analytic investigations include Terras’s study of stopping times [13] and Everett’s dynamical reformulation of the iteration [5]. Probabilistic models of the dynamics have also been explored, for example by Borovkov and Pfeifer [4].

Despite its simple definition, the conjecture has resisted proof for decades. Extensive computational verification has confirmed convergence for all integers up to at least $2^{68}$ [2], yet a complete mathematical explanation of the global dynamics remains unknown. The strongest analytic progress to date is due to Tao [12], who proved that almost all Collatz orbits attain almost bounded values. A central difficulty in the problem is the gap between distributional or almost-everywhere statements and pointwise guarantees for every orbit; many approaches, including Tao’s, encounter this barrier in different forms.

A common analytical reformulation is the Syracuse map, or odd-to-odd iteration, which maps each odd integer to the next odd integer reached under the Collatz process. This formulation isolates the multiplicative component of the dynamics and has proved useful in several analytical approaches to the problem.

Many structural features of the Collatz map arise from the interaction between powers of two and powers of three. This places the problem in proximity to classical questions concerning exponential Diophantine equations and linear forms in logarithms; see Baker [1] and Evertse [6]. From a dynamical perspective, statistical viewpoints on the iteration are naturally related to classical ergodic ideas [3] and to the theory of positive operators originating in the work of Perron [11] and Frobenius [7].

Several modern approaches analyze Collatz dynamics using probabilistic or statistical models.
These models often suggest a negative average drift in the logarithm of orbit values, consistent with eventual convergence to the trivial cycle. However, translating statistical behavior of typical trajectories into pointwise guarantees for every orbit remains the central difficulty.

The present work examines structural features of Collatz dynamics from a complementary perspective. Computational exploration repeatedly reveals two recurring phenomena:

• Modular scrambling. Residue classes modulo powers of two appear to disperse rapidly under the odd-to-odd iteration, suggesting a form of mixing behavior in the dynamics.
• Burst–gap structure. Collatz trajectories naturally decompose into alternating phases of multiplicative growth (“bursts”) and rapid contraction through powers of two (“gaps”).

These observations suggest a structural picture in which trajectories wander through residue classes while undergoing alternating phases of expansion and contraction. Understanding the statistical properties of these phases may shed light on the long-term behavior of the system.

The results presented here document several structural properties related to these phenomena, supported by computational experiments and analytical arguments. While these results do not resolve the Collatz conjecture, they highlight structural patterns that may contribute to a deeper understanding of the dynamics.

Finally, this work also records the methodology used in the exploration. The investigation involved sustained collaboration between a human researcher and large language models. The human directed the conceptual structure of the investigation, while the language models assisted in large-scale exploration, symbolic manipulation, and computational verification. This process illustrates a potential new mode of mathematical experimentation combining human intuition with machine-scale exploration.
The remainder of the paper develops the structural observations outlined above and discusses their implications for the broader Collatz problem.

2 Notation and preliminaries

We collect the formal definitions and basic facts used throughout the paper.

Definition 2.1 (Standard Collatz map). The standard Collatz map $C:\mathbb{N}\to\mathbb{N}$ is $C(n)=n/2$ if $n$ is even, and $C(n)=3n+1$ if $n$ is odd.

Conjecture 2.2 (Collatz conjecture). For every $N_0\ge 1$, the orbit $N_0, C(N_0), C^2(N_0),\dots$ eventually reaches $1$.

Definition 2.3 (Syracuse map). For an odd integer $n\ge 1$, define the Syracuse map $T(n):=(3n+1)/2^{v_2(3n+1)}$, where $v_2(m):=\max\{j : 2^j \mid m\}$ is the $2$-adic valuation. Write $n=2^k m-1$ with $m$ odd, $k=v_2(n+1)\ge 1$. We call $k$ the odd-run length and $m$ the multiplier. Here $m$ is the odd multiplier associated with the odd integer $n$.

Definition 2.4 (Persistent and safe states). A state $(k,\mu)$ where $\mu = m \bmod 8$ and $k\ge 2$ is persistent if $3^k\mu\equiv 7 \pmod{8}$. A state with $k=1$ (or a persistent state whose certified worst-case drift $\bar{w}=k\log_2 3-(k+e_{\min})<0$) is safe. The persistent class $P$ is the set of odd integers $n$ such that $(k, m \bmod 8)$ is persistent.

Definition 2.5 (Burst-gap decomposition). Given a Syracuse orbit $x_0,x_1,x_2,\dots$, a burst is a maximal run of consecutive epochs with $k_t\ge 2$ (states with odd-run length at least $2$). A gap is a maximal run with $k_t=1$ (safe iterates). The orbit decomposes as an alternating sequence:

$$\underbrace{L_1}_{\text{burst}}\;\underbrace{G_1}_{\text{gap}}\;\underbrace{L_2}_{\text{burst}}\;\underbrace{G_2}_{\text{gap}}\;\cdots$$

where $L_i$ is the length of the $i$-th burst and $G_i$ the length of the $i$-th gap.

[Figure 1 image: timeline over time $t$ showing bursts $L_1,\dots,L_4$ and gaps $G_1,\dots,G_3$; legend: burst ($k\ge 2$), gap: safe ($k=1$); annotation: Persistent Exit Lemma.]

Figure 1: The burst-gap decomposition of a Collatz orbit.
Bursts (red) are maximal runs of iterates with $k\ge 2$; gaps (green) are runs of safe iterates ($k=1$). The Persistent Exit Lemma (Lemma 4.4) shows that when a burst ends at a persistent state, the subsequent gap has length exactly $1$.

Definition 2.6 (Gap-return map). The gap-return map is the composite $T^{(g)}:=T^g$ restricted to $n\equiv 7\pmod{16}$ (the entry condition for a burst), where $g$ is the gap length. For $n$ in a fixed residue class $a \bmod 2^{M'}$:

$$T^{(g)}(n)=\frac{3^g n+c_g}{2^V},$$

where $V=\sum_{i=1}^{g}v_i$ is the total number of halvings and $c_g$ is a correction term determined by the halving pattern (hence by $a \bmod 2^{M'}$).

Definition 2.7 (Critical persistent frequency). Define

$$\rho_{\mathrm{crit}}:=\frac{w_S^-}{w_P^+ + w_S^-}\approx 0.539,$$

where $w_P^+=\sup_{x\in P}\bar{w}(x)$ is the worst-case certified weight from a persistent state, and $w_S^-=\inf_{x\notin P}(-\bar{w}(x))>0$ is the best guaranteed descent from a safe state. If $\limsup N_P(T)/T<\rho_{\mathrm{crit}}$ for an orbit, then the orbit’s long-run mean drift is negative, forcing convergence.

We also record a simple but essential fact about modular arithmetic:

Lemma 2.8 (Bijection lemma). Let $a$ be an odd integer and $K\ge 1$. Then the map $\delta\mapsto a\cdot\delta \bmod 2^K$ is a bijection on $\{0,1,\dots,2^K-1\}$.

Proof. Since $\gcd(a,2^K)=1$ (as $a$ is odd), multiplication by $a$ is an automorphism of $\mathbb{Z}/2^K\mathbb{Z}$. ∎

3 The $1/4$ Persistent-Transition Law

This section proves that the uniform-lift persistent-to-persistent transition probability is exactly $1/4$. This is a structural fact about the arithmetic of the $3n+1$ map at the modular level.

Theorem 3.1 ($1/4$ Persistent-Transition Law). For every persistent state $(k,\mu \bmod 8)$ with $k\ge 2$, the fraction of admissible lifts $m\equiv\mu\pmod{8}$ whose successor under the odd-to-odd map $T$ is again persistent equals exactly $\tfrac14$. That is,

$$\lim_{N\to\infty}\frac{\#\{m\le N:\ m\equiv\mu\ (8),\ k'\ge 2,\ 3^{k'}m'\equiv 7\ (8)\}}{\#\{m\le N:\ m\equiv\mu\ (8)\}}=\frac14.$$
Proof. We compute directly. Write $P=3^k m$. Since the state is persistent, $P\equiv 7\pmod{8}$, so $P-1\equiv 6\pmod{8}$ and $e:=v_2(P-1)=1$. The successor odd integer is $n'=(P-1)/2$, which satisfies $n'\equiv 3\pmod{4}$. The successor’s odd-run length is $k'=v_2(n'+1)=v_2((P+1)/2)$. Since $P\equiv 7\pmod{8}$, we have $P+1\equiv 0\pmod{8}$, so $(P+1)/2\equiv 0\pmod{4}$, giving $k'\ge 2$. Precisely, $k'=j\ge 2$ iff $v_2(P+1)=j+1$, i.e. $P\equiv 2^{j+1}-1\pmod{2^{j+2}}$. The successor multiplier is $m'=(P+1)/2^{j+1}$, which is odd.

The successor state $(j,m')$ is persistent iff $3^j m'\equiv 7\pmod{8}$. Since $3^j \bmod 8$ cycles as $3,1,3,1,\dots$ for $j=1,2,3,4,\dots$, this reduces to:

$$m'\equiv\begin{cases} 7 \pmod{8} & \text{if } j \text{ is even},\\ 5 \pmod{8} & \text{if } j \text{ is odd}.\end{cases}$$

In either case, exactly one of the four residue classes $\{1,3,5,7\} \pmod{8}$ satisfies the condition. As $P$ ranges over integers $\equiv 2^{j+1}-1\pmod{2^{j+2}}$, the four lifts to mod $2^{j+4}$ produce $m'=(P+1)/2^{j+1}\in\{1,3,5,7\}\pmod{8}$, which is exactly uniformly distributed. Therefore, for each fixed $j\ge 2$: $\Pr[\text{successor persistent}\mid k'=j]=1/4$. Since this holds for every $j$,

$$\Pr[\text{successor persistent}]=\sum_{j\ge 2}\frac14\cdot\Pr[k'=j]=\frac14.\qquad ∎$$

Corollary 3.2 (Geometric persistent excursions). In the uniform-lift model, persistent run lengths follow a $\mathrm{Geometric}(3/4)$ distribution with mean $4/3$. Burst lengths (total epochs with $k\ge 2$ in a maximal persistent run) have expected value $2$.

Proof. By Theorem 3.1, the one-step persistence probability is $1/4$ regardless of $k'$. Successive applications give $\Pr[\text{run}\ge L]=(1/4)^{L-1}$, so the run length is $\mathrm{Geometric}(3/4)$ with mean $1/(3/4)=4/3$.
Since each persistent epoch contributes one unit to the burst length and the next epoch (when the burst ends) adds a transition count, the expected burst length is $E[L]=4/3+2/3=2$. ∎

Remark 3.3 (Significance). The $1/4$ law is an exact structural result, not a heuristic approximation. It shows that at the modular level, the $3n+1$ map has a built-in exit mechanism from persistent states: exactly $3/4$ of lifts escape to a safe state at each step. The conjecture reduces to showing that actual orbits cannot systematically concentrate on the persistent class beyond the $\rho_{\mathrm{crit}}$ threshold.

4 The convergence chain

This section develops a conditional convergence chain linking the Collatz conjecture to two statistical hypotheses on orbit structure: a mean burst bound and a mean gap bound. The individual links in the chain (Entry–Occupancy Equivalence, Entry Bound, and the Burst-Gap Criterion) are each proved unconditionally, but the hypotheses they require remain open.

4.1 Entry–Occupancy Equivalence

Theorem 4.1 (Entry–Occupancy Equivalence). Let $x_0,x_1,x_2,\dots$ be the Syracuse orbit, and let $P$ denote the persistent class. Define

$$N_P(T):=\#\{0\le t<T:\ x_t\in P\},\qquad E_P(T):=\#\{0\le t<T:\ x_{t+1}\in P\}.$$

Then $\limsup_{T\to\infty}E_P(T)/T=\limsup_{T\to\infty}N_P(T)/T$.

Proof. Relabelling $s=t+1$: $E_P(T)=\#\{1\le s\le T:\ x_s\in P\}$, while $N_P(T+1)=\#\{0\le s\le T:\ x_s\in P\}$. These differ by at most $\mathbf{1}_P(x_0)$, so

$$|E_P(T)/T-N_P(T+1)/T|\le 1/T\to 0.$$

Since $N_P(T+1)/T=\frac{N_P(T+1)}{T+1}\cdot\frac{T+1}{T}$ and $(T+1)/T\to 1$, the lim sup values coincide. ∎

Remark 4.2. This equivalence is elementary but essential: it allows us to reason about persistent occupancy (a static count) via persistent entries (a dynamic transition count), the latter being more directly related to the burst-gap structure.
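The two exact statements so far, the $1/4$ law (Theorem 3.1) and the entry/occupancy relation (Theorem 4.1), lend themselves to a quick numerical spot check. The sketch below is illustrative only: all function names are hypothetical, `epoch_successor` follows the computation $n'=(P-1)/2^{v_2(P-1)}$ with $P=3^k m$ used in the proof of Theorem 3.1, and the choice of 4096 lifts is an arbitrary assumption.

```python
def v2(x: int) -> int:
    """2-adic valuation of a positive integer."""
    return (x & -x).bit_length() - 1


def syracuse(n: int) -> int:
    """Odd-to-odd Syracuse map T(n) = (3n+1) / 2^v2(3n+1) (Definition 2.3)."""
    y = 3 * n + 1
    return y >> v2(y)


def is_persistent(n: int) -> bool:
    """Definition 2.4: n = 2^k * m - 1 with k >= 2 and 3^k * m = 7 (mod 8)."""
    k = v2(n + 1)
    m = (n + 1) >> k
    return k >= 2 and (pow(3, k, 8) * m) % 8 == 7


def epoch_successor(k: int, m: int) -> int:
    """Successor from the proof of Theorem 3.1: (P-1)/2^v2(P-1), P = 3^k * m."""
    p = 3**k * m
    return (p - 1) >> v2(p - 1)


def persistent_fraction(k: int, mu: int, lifts: int = 4096) -> float:
    """Fraction of lifts m = mu + 8r (0 <= r < lifts) with a persistent successor."""
    hits = sum(is_persistent(epoch_successor(k, mu + 8 * r)) for r in range(lifts))
    return hits / lifts


def entry_and_occupancy(x0: int, horizon: int) -> tuple[int, int]:
    """Return (E_P(T), N_P(T)) for the Syracuse orbit of odd x0 with T = horizon."""
    orbit = [x0]
    for _ in range(horizon):
        orbit.append(syracuse(orbit[-1]))
    n_p = sum(is_persistent(x) for x in orbit[:horizon])
    e_p = sum(is_persistent(x) for x in orbit[1:horizon + 1])
    return e_p, n_p


# (2, 7) and (3, 5) are persistent states: 9*7 = 63 and 27*5 = 135 are both 7 mod 8.
for k, mu in [(2, 7), (3, 5)]:
    print(k, mu, persistent_fraction(k, mu))  # both values are close to 0.25

print(entry_and_occupancy(27, 100))  # the two counts differ by at most 1
```

For a finite number of lifts the fraction is only approximately $1/4$; the limit in Theorem 3.1 is what is exact. The entry and occupancy counts differ by at most one for any horizon, which is the content of the relabelling step in the proof of Theorem 4.1.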
4.2 Entry bound implies convergence

Theorem 4.3 (Entry bound implies convergence). If there exists $p_*<\rho_{\mathrm{crit}}$ such that every orbit satisfies $\limsup_{T\to\infty}E_P(T)/T\le p_*$, then every orbit of the Collatz map converges.

Proof. By Theorem 4.1, $\limsup N_P(T)/T\le p_*<\rho_{\mathrm{crit}}$. For every orbit prefix of length $T$, the cumulative certified drift satisfies

$$\frac1T\sum_{t=0}^{T-1}\bar{w}(x_t)\;\le\;\frac{N_P(T)}{T}\,w_P^+-\Bigl(1-\frac{N_P(T)}{T}\Bigr)w_S^-.$$

This is because each persistent epoch contributes at most $w_P^+$ to the drift, while each safe epoch contributes at most $-w_S^-<0$. Taking lim sup and using $\limsup N_P(T)/T\le p_*<\rho_{\mathrm{crit}}=w_S^-/(w_P^++w_S^-)$:

$$\limsup_{T\to\infty}\frac1T\sum_{t=0}^{T-1}\bar{w}(x_t)\;\le\;p_*w_P^+-(1-p_*)w_S^-\;<\;0.$$

The long-run mean certified drift is strictly negative. By the standard descent argument (see, e.g., [14, 9]): a negative mean drift implies that $\log_2 x_t\to-\infty$, so the orbit descends below any fixed threshold, hence converges to the cycle $\{1\}$. ∎

4.3 The Persistent Exit Lemma

Lemma 4.4 (Persistent Exit Lemma). Let $x_0,x_1,\dots$ be a Syracuse orbit with burst-gap decomposition.

1. Every burst terminates at a state with $k_t=2$.
2. When the final burst state is persistent (i.e., $3^2\mu\equiv 7\pmod{8}$, equivalently $m_t\equiv 7\pmod{8}$), the subsequent gap has length exactly $1$: the first gap iterate’s Syracuse successor re-enters a burst immediately.
3. Under the uniform-lift model, the $1/4$ Persistent-Transition Law (Theorem 3.1) gives $\Pr[\text{successor is persistent}]=\tfrac14$ at each persistent state.

Proof. Part (1): Bursts terminate at $k=2$.
If $k_t\ge 3$, then $x_t\equiv 7\pmod{8}$ (since $x_t=2^{k_t}m_t-1$ with $2^{k_t}m_t\equiv 0\pmod{8}$ for $k_t\ge 3$), and

$$T(x_t)=\frac{3x_t+1}{2}=3\cdot 2^{k_t-1}m_t-1,$$

since $v_2(3x_t+1)=v_2(3\cdot 2^{k_t}m_t-2)=1$ for $k_t\ge 2$. The successor has $k_{t+1}=k_t-1\ge 2$, so the burst continues. Therefore the burst can only end when $k_t=2$.

Part (2): Persistent final state ⇒ gap length $1$. Let $x_t$ be the final state of a burst with $k_t=2$ and $m_t\equiv 7\pmod{8}$ (persistent). Write $x_t=4m_t-1$.

The first gap iterate. $3x_t+1=12m_t-2=2(6m_t-1)$. Since $m_t$ is odd, $6m_t-1$ is odd, so $v_2(3x_t+1)=1$. Thus $x_{t+1}=6m_t-1$ with $k_{t+1}=1$ (safe).

The second iterate. $3x_{t+1}+1=18m_t-2=2(9m_t-1)$. Since $m_t\equiv 7\pmod{8}$, we have $9m_t-1\equiv 62\equiv 6\pmod{8}$, giving $v_2(9m_t-1)=1$. Therefore $v_2(3x_{t+1}+1)=2$, so

$$x_{t+2}=\frac{18m_t-2}{4}=\frac{9m_t-1}{2}.$$

Now $k(x_{t+2})=v_2(x_{t+2}+1)=v_2\!\left(\frac{9m_t+1}{2}\right)$. Writing $m_t=7+8r$, we get $9m_t+1=64+72r=8(8+9r)$, so $v_2(9m_t+1)\ge 3$, hence $k(x_{t+2})=v_2\!\left(\frac{9m_t+1}{2}\right)\ge 2$. Since $k(x_{t+2})\ge 2$, the state $x_{t+2}$ begins a new burst. The gap consists of the single iterate $x_{t+1}$, so $G_i=1$.

Part (3) is a restatement of Theorem 3.1. ∎

Remark 4.5 (Gaps of length $2$ do occur). An earlier version of this paper claimed that no gap has length exactly $2$. This claim is false: computational verification shows that gaps of length $2$ constitute approximately $19\%$ of all gaps across typical orbits. The smallest counterexample is $n_0=3$: the orbit $3\to 5\to 1\to\cdots$ has burst $\{3\}$ ($k=2$, $m=1$, non-persistent), followed by gap $\{5,1\}$ of length $2$.
Another example is $n_0=71$: the burst $\{71,107\}$ ends at the non-persistent state $107$ ($k=2$, $m=27$, $27 \bmod 8=3$), followed by gap $\{161,121\}$ of length $2$. The error arose from assuming that every burst ends at a persistent state. In fact, bursts whose final state has $k_t=2$ but $m_t\not\equiv 7\pmod{8}$ are non-persistent, and the gap structure following such states is unconstrained. The Persistent Exit Lemma correctly identifies the subcase where $G_i=1$ is guaranteed.

Lemma 4.6 (Modular gap distribution). Let $n$ be an odd integer in a gap step ($v_2(3n+1)=1$, i.e. $n\equiv 3\pmod{4}$). Over the uniform distribution on integers $n\equiv 3\pmod{4}$ modulo $2^{2+L}$ (for any $L\ge 1$), the gap length $G$ satisfies

$$\Pr(G=g)=2^{-g}\ \text{ for } 1\le g\le L,\qquad \Pr(G>L)=2^{-L}.$$

Equivalently, the memoryless property $\Pr(G\ge g+1\mid G\ge g)=\tfrac12$ holds for every $g\ge 1$. In particular, $G\sim\mathrm{Geometric}(1/2)$ with $E[G]=2$.

Proof. We show that each successive step of the gap is decided by exactly one bit of $n$, with the two outcomes equally likely.

Step 1 (bit at position $2$). For $n\equiv 3\pmod{4}$, the Syracuse step gives $T(n)=(3n+1)/2$. Split on $n \bmod 8$:

• $n\equiv 3\pmod{8}$ (write $n=8k+3$): $T(n)=(24k+10)/2=12k+5\equiv 1\pmod{4}$, so $v_2(3\,T(n)+1)\ge 2$. The gap ends (burst starts).
• $n\equiv 7\pmod{8}$ (write $n=8k+7$): $T(n)=(24k+22)/2=12k+11\equiv 3\pmod{4}$, so $v_2(3\,T(n)+1)=1$. The gap continues.

Among integers $\equiv 3\pmod{4}$, exactly half fall in each case, so $\Pr(G\ge 2)=1/2$.

Inductive step (bit at position $2+g$). For the continuing case ($n\equiv 7 \bmod 8$), $T(n)\equiv 3\pmod{4}$ is again a gap step. Whether the gap continues at step $g+1$ depends on $T^{(g)}(n) \bmod 8$, which is determined by one additional bit of $n$ at the next modular depth.
Since the gap-step map $n\mapsto(3n+1)/2$ is injective (with inverse $n=(2m-1)/3$), the two sub-classes at each depth have equal size. By induction, $\Pr(G\ge g+1\mid G\ge g)=1/2$ for all $g\ge 1$, giving the geometric distribution. ∎

Lemma 4.7 (Modular valuation distribution). For odd $n$ in a burst step ($n\equiv 1\pmod{4}$), the $2$-adic valuation $k=v_2(3n+1)$ satisfies, over uniform lifts at each successive modular depth,

$$\Pr(k=j)=2^{-(j-1)}\quad\text{for } j\ge 2.$$

In particular, $E[k\mid k\ge 2]=3$.

Proof. For $n\equiv 1\pmod{4}$, we have $3n+1\equiv 0\pmod{4}$. Split on the next bit:

• $n\equiv 1\pmod{8}$ (write $n=8j+1$): $3n+1=4(6j+1)$ with $6j+1$ odd, so $v_2=2$.
• $n\equiv 5\pmod{8}$ (write $n=8j+5$): $3n+1=8(3j+2)$, so $v_2\ge 3$.

Half the lifts give $k=2$; the other half give $k\ge 3$. For the $k\ge 3$ case, the same splitting applies at the next depth: half give $k=3$, half give $k\ge 4$, and so on. By induction, $\Pr(k=j)=2^{-(j-1)}$ for $j\ge 2$, giving $E[k]=\sum_{j\ge 2}j\cdot 2^{-(j-1)}=3$. ∎

Corollary 4.8 (Convergence prediction under equidistribution). Under the equidistributed modular model, the expected log-contraction per burst-gap cycle is strictly negative:

$$E[B]\,\bigl(\log 3-E[k]\log 2\bigr)+E[G]\,\log\tfrac32\;=\;2\,(\log 3-3\log 2)+2\log\tfrac32\;\approx\;-1.15\;<\;0,$$

where $E[B]=2$ (Corollary 3.2), $E[G]=2$ (Lemma 4.6), and $E[k\mid k\ge 2]=3$ (Lemma 4.7). Thus, even though gaps can have arbitrary length (Remark 4.5), the deeper contraction during burst steps ($E[k]=3$ rather than the minimum $k=2$) more than compensates for the longer gaps. The false Gap Lemma ($G_i=1$ always) was unnecessary: the geometric gap distribution $E[G]=2$ suffices for convergence under equidistribution.

4.4 The Burst-Gap Criterion

Theorem 4.9 (Burst-Gap Criterion). Let $x_0,x_1,\dots$ be a Syracuse orbit with burst-gap decomposition $(L_1,G_1,L_2,G_2,\dots)$.
Assume:

Hypothesis A (Orbitwise Mean Gap). $\frac1n\sum_{i=1}^{n}G_i\ge g_*-\varepsilon_n$ with $\varepsilon_n\to 0$, for some constant $g_*>\frac{2(1-\rho_{\mathrm{crit}})}{\rho_{\mathrm{crit}}}\approx 1.71$.

Hypothesis B (Mean Burst Bound). There exists a finite constant $C(n_0)$ such that $\sum_{i=1}^{n}L_i\le 2n+C(n_0)$ for all $n\ge 1$.

Then $\limsup_{T\to\infty}N_{\ge 2}(T)/T<\rho_{\mathrm{crit}}$, where $N_{\ge 2}(T):=\#\{0\le t<T:\ k_t\ge 2\}$, and the orbit converges by Theorem 4.3.

Proof. Write $S_n:=\sum_{i=1}^{n}L_i$ for the total burst time and $T_n:=\sum_{i=1}^{n}(L_i+G_i)$ for the time of the $n$-th burst-gap boundary.

Step 1. From Hypothesis A:

$$\sum_{i=1}^{n}G_i\;\ge\;g_*\,n-o(n).$$

Step 2. From Hypothesis B: $S_n\le 2n+C$. Therefore

$$T_n=S_n+\sum_{i=1}^{n}G_i\ge S_n+g_*\,n-o(n).$$

Using $S_n\le 2n+C$ gives $n\ge(S_n-C)/2$, so

$$T_n\ge S_n+g_*\cdot\frac{S_n-C}{2}-o(n)=S_n\Bigl(1+\frac{g_*}{2}\Bigr)-O(1)-o(n).$$

Step 3 (Interpolation). Fix $T$ and choose $n$ with $T_n\le T<T_{n+1}$. Then $N_{\ge 2}(T)\le S_{n+1}\le 2(n+1)+C$ and $T\ge T_n\ge S_n(1+g_*/2)-O(1)$. Since $S_{n+1}\le S_n+L_{n+1}$ and $L_{n+1}$ contributes at most $O(1)$ relative to $T$, we obtain

$$\frac{N_{\ge 2}(T)}{T}\;\le\;\frac{S_n+O(1)}{S_n(1+g_*/2)-O(1)}\;\to\;\frac{2}{2+g_*}\quad\text{as } T\to\infty.$$

The condition $g_*>2(1-\rho_{\mathrm{crit}})/\rho_{\mathrm{crit}}$ is equivalent to $2/(2+g_*)<\rho_{\mathrm{crit}}$, so convergence follows from Theorem 4.3. For example, with $g_*=2$ (the expected gap under equidistribution), we get $N_{\ge 2}(T)/T\to\tfrac12<0.539\approx\rho_{\mathrm{crit}}$. ∎

Remark 4.10 (Role of the two hypotheses). Both Hypothesis A and Hypothesis B are open orbitwise conjectures. Under equidistribution, the expected burst length is $2$ (Corollary 3.2) and the expected gap length is $2$ (Lemma 4.6).
Both hypotheses follow from the Orbit Equidistribution Conjecture (Theorem 7.3): Hypothesis B via the coupling inequality at fixed modulus (Step 2), and Hypothesis A via the growing-moduli tail control (Step 2′). The Persistent Exit Lemma provides structural support: it shows that when a burst ends at a persistent state, the subsequent gap has length exactly $1$. More generally, the Modular Gap Distribution Lemma (Lemma 4.6) proves that gap length is $\mathrm{Geometric}(1/2)$ with $E[G]=2$, and Corollary 4.8 shows that this suffices for convergence under equidistribution.

5 The Scrambling Lemma

This is the algebraic core of the paper. We show that the gap-return map introduces zero carries between the known and unknown parts of an integer, yielding an exact bijection on high bits.

5.1 Statement and proof

Theorem 5.1 (Scrambling Lemma). Let $n\equiv a\pmod{2^{M'}}$ with $n\equiv 7\pmod{16}$, where $M'$ is chosen so that the halving pattern $(v_1,\dots,v_g)$ of the gap-return is constant on the class $a \bmod 2^{M'}$. Write $n=a+\delta\cdot 2^{M'}$ with $\delta\ge 0$. Then the gap-return satisfies

$$T^{(g)}(n)=\frac{3^g a+c_g}{2^V}+3^g\cdot\delta\cdot 2^{M'-V},\tag{1}$$

where $g$, $V=\sum_{i=1}^{g}v_i$, and $c_g$ depend only on $a$ (not on $\delta$). Since $\gcd(3^g,2)=1$, the map $\delta\mapsto 3^g\cdot\delta \bmod 2^{K-M'}$ is a bijection on $\{0,1,\dots,2^{K-M'}-1\}$ (Lemma 2.8). Therefore:

1. The bits of $T^{(g)}(n)$ at positions $\ge M'-V$ are an exact bijection of the free parameter $\delta$.
2. If $\delta$ is uniformly distributed on $\{0,1,\dots,2^{K-M'}-1\}$, then each bit of $T^{(g)}(n)$ at position $j\ge M'-V$ is an exactly unbiased coin flip, independently of $a$.

Proof. The gap-return computes $T^{(g)}(n)=(3^g n+c_g)/2^V$, where the correction $c_g$ arises from the iterated $3n+1$ steps.
Precisely, if we write the $g$-step iteration as

$$T^{(g)}(n)=\frac{3^g n+\sum_{i=0}^{g-1}3^{g-1-i}\cdot 2^{s_i}}{2^V}$$

for certain shift terms $s_i$ determined by the halving pattern, then $c_g=\sum_{i=0}^{g-1}3^{g-1-i}\cdot 2^{s_i}$ depends only on the halving pattern, hence only on $a \bmod 2^{M'}$.

Now substitute $n=a+\delta\cdot 2^{M'}$:

$$3^g n+c_g=3^g(a+\delta\cdot 2^{M'})+c_g=(3^g a+c_g)+3^g\cdot\delta\cdot 2^{M'}.\tag{2}$$

This is the key algebraic step. The two summands in (2) interact as follows:

No carry interaction. The second summand $3^g\cdot\delta\cdot 2^{M'}$ is an exact multiple of $2^{M'}$: it contributes only to bit positions $\ge M'$. The first summand $3^g a+c_g$ is a fixed integer whose bits below position $M'$ are fully determined by $a$. Therefore the addition in (2) produces zero carries between positions $<M'$ and positions $\ge M'$. The crucial observation is that this is a consequence of linearity: multiplication by $3^g$ distributes over addition, so $3^g(a+\delta\cdot 2^{M'})=3^g a+3^g\delta\cdot 2^{M'}$ exactly. There is no nonlinear “mixing” of bits.

Division by $2^V$. Dividing (2) by $2^V$:

$$T^{(g)}(n)=\frac{3^g a+c_g}{2^V}+3^g\cdot\delta\cdot 2^{M'-V}.$$

The first term is a fixed integer (since $2^V\mid 3^g a+c_g$ by the halving pattern), independent of $\delta$. The second term contributes to bits at positions $\ge M'-V$ through the map $\delta\mapsto 3^g\cdot\delta$. Since $3^g$ is odd, this is a bijection on $\mathbb{Z}/2^{K-M'}\mathbb{Z}$ by Lemma 2.8. ∎

Remark 5.2 (Why this is not obvious). The naive concern is that the iterated $3n+1$ computation produces carries that propagate from low bits to high bits, destroying any independence. The Scrambling Lemma shows this fear is unfounded: the linearity of multiplication ensures that unknown bits contribute through an additive shift by an exact power of $2$.
Thus the carries that arise in the known portion of the computation cannot propagate into the unknown suffix determined by $\delta$, where $\delta$ parameterizes the free bits in the representation $n=n_0+2^M\delta$.

5.2 The pattern-determination bound

The following bound controls how many bits are needed to determine the halving pattern:

Proposition 5.3 (Pattern-determination bound). The modulus $M'$ required to fix the halving pattern satisfies

$$M'-M\;\le\;\max(0,\,g-2),\tag{3}$$

and, combined with $V\ge g+1$:

$$M'-V\;\le\;M-3.\tag{4}$$

This bound is independent of the gap length $g$.

Proof. We establish (3) using two facts:

1. The first two halvings satisfy $v_1=v_2=1$ unconditionally: the entry condition $n\equiv 7\pmod{16}$ forces $3n+1\equiv 22\pmod{48}$, giving $v_1=1$. The next iterate $(3n+1)/2\equiv 11\pmod{24}$ gives $(3\cdot 11+1)/2=17$, confirming $v_2=1$. (For general $n\equiv 7\pmod{16}$, the computation of $v_1$ and $v_2$ requires only the bits of $n$ up to position $3$, which are determined by $n \bmod 16=7$.)
2. Each subsequent halving $v_i$ ($i\ge 3$) depends on at most one additional bit of $n$ beyond those already consumed by the first $i-1$ halvings.

This gives $M'\le M+(g-2)$ additional bits for $g\ge 3$, and $M'=M$ for $g\le 2$. For the total halvings: $V=\sum_{i=1}^{g}v_i\ge g+1$ since $v_1=v_2=1$ and $v_i\ge 1$ for all $i$. Therefore:

$$M'-V\le M+(g-2)-(g+1)=M-3.$$

This is independent of $g$. ∎

Remark 5.4. The bound $M'-V\le M-3$ means that each gap-return reduces the known zone by at least $3$ bits, regardless of the gap length. This is the quantitative content of the “scrambling”: longer gaps do not help the known zone grow, because the additional bits needed to determine the halving pattern are compensated by the additional halvings.

We now summarize the relationships among the components established above. Figure 2 shows which results are proved unconditionally and which inputs remain open conjectures.
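The linearity identity (1) behind the Scrambling Lemma can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's procedure: it applies $g$ plain Syracuse steps in place of the gap-return, and it takes $M'=V+1$, a choice that is sufficient (though not necessarily minimal) to freeze the halving pattern on the class $a \bmod 2^{M'}$. All function names are hypothetical.

```python
def v2(x: int) -> int:
    """2-adic valuation of a positive integer."""
    return (x & -x).bit_length() - 1


def syracuse_steps(n: int, g: int) -> tuple[int, int]:
    """Apply g Syracuse steps to odd n; return (result, total halvings V)."""
    total = 0
    for _ in range(g):
        y = 3 * n + 1
        v = v2(y)
        n, total = y >> v, total + v
    return n, total


def check_linearity(a: int, g: int, deltas: range) -> bool:
    """Check T^g(a + delta*2^M') == T^g(a) + 3^g * delta * 2^(M'-V) for each delta."""
    base, big_v = syracuse_steps(a, g)
    m_prime = big_v + 1  # assumption: V+1 low bits suffice to fix the halving pattern
    for delta in deltas:
        n = a + delta * 2**m_prime
        image, v_n = syracuse_steps(n, g)
        if v_n != big_v or image != base + 3**g * delta * 2 ** (m_prime - big_v):
            return False
    return True


def odd_multiplier_is_bijection(g: int, bits: int) -> bool:
    """Lemma 2.8 for a = 3^g: delta -> 3^g * delta mod 2^bits hits every residue once."""
    modulus = 2**bits
    return len({(3**g * d) % modulus for d in range(modulus)}) == modulus


print(check_linearity(7, 5, range(64)))    # True
print(odd_multiplier_is_bijection(5, 8))   # True
```

For example, with $a=7$ and $g=2$ the pattern is $(1,1)$, so $V=2$ and $M'=3$; taking $\delta=1$ gives $n=15\to 23\to 35 = 17+3^2\cdot 1\cdot 2$, in agreement with (1).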
[Figure 2 diagram. Box labels recovered from the figure: Collatz Conjecture (every orbit reaches $1$); Entry Bound $\Rightarrow$ Convergence; Entry–Occupancy Equivalence; $1/4$ Law (Persistent $\to$ Persistent); Burst–Gap Criterion; Persistent Exit (gap $= 1$ after persistent end); Mean Gap Bound ($\frac{1}{n}\sum G_i \ge g_*$); Mean Burst Bound ($\sum L_i \le 2n + C$); Known-Zone Decay; Scrambling Lemma; $E[L] = 2$; Orbit Equidistribution Conjecture. Legend: Proved, Open.]

Figure 2: Architecture of the conditional framework. Green boxes denote results proved in this paper. Red boxes denote open components. Solid arrows represent proved implications; dashed arrows denote steps depending on open inputs. The conditional framework is intended to reduce the Collatz conjecture to the Orbit Equidistribution Conjecture together with the orbitwise regularity needed to supply the mean burst and mean gap bounds.

As shown in Figure 2, the conditional framework would reduce the Collatz conjecture to the Orbit Equidistribution Conjecture, which would supply two inputs to the Burst-Gap Criterion: the mean burst bound (Hypothesis B) and the mean gap bound (Hypothesis A). The remaining components in the chain are proved unconditionally, but the two hypotheses remain open.

6 Known-Zone Decay

We iterate the Scrambling Lemma to show that the known zone shrinks to zero in $\lceil M/3 \rceil$ steps.

Theorem 6.1 (Known-Zone Decay). Starting from a class $a \bmod 2^M$ with $M \ge 4$, let $T^k(n)$ denote $k$ iterated gap-returns. Define the known zone $Z_k$ as the number of low-order bits of $T^k(n)$ that are determined by the starting class $a$. Then:

1. $Z_0 = M$.

2. $Z_{k+1} \le \max(0,\, Z_k - 3)$ for each $k$.

3. After $\lceil M/3 \rceil$ gap-returns, $Z_k = 0$: all bits of $T^k(n)$ above bit $0$ are exactly uniformly distributed, independent of the starting class.

Proof. We proceed by induction on $k$.

Base case: $Z_0 = M$ by definition.

Inductive step: Suppose at step $k$, the iterate $T^k(n)$ is known modulo $2^{Z_k}$ (i.e. the low $Z_k$ bits are determined by the starting class $a$).
Apply the Scrambling Lemma (Theorem 5.1) with the known modulus $M = Z_k$: the gap-return produces $T^{k+1}(n)$ with known zone $Z_{k+1} = M'_k - V_k$, where:

• $M'_k \le Z_k + (g_k - 2)$ by Proposition 5.3, where $g_k$ is the $k$-th gap length.

• $V_k \ge g_k + 1$ (total halvings in the $k$-th gap-return).

Therefore:

$$Z_{k+1} = M'_k - V_k \;\le\; Z_k + (g_k - 2) - (g_k + 1) \;=\; Z_k - 3.$$

Since $Z_k$ decreases by at least $3$ per step and $Z_k \ge 0$ by definition:

$$Z_{k+1} \;\le\; \max(0,\, Z_k - 3).$$

Termination: Starting from $Z_0 = M$, after $k = \lceil M/3 \rceil$ steps:

$$Z_k \;\le\; M - 3\lceil M/3 \rceil \;\le\; 0.$$

At this point, $Z_k = 0$: no bits of $T^k(n)$ are determined by the starting class. The bits at positions $\ge 0$ are an exact bijection of the free parameters accumulated through $k$ gap-returns, and if those parameters are uniformly distributed, so are the output bits. ∎

[Figure 3 plot: known zone $Z_k$ (vertical axis, $0$ to $12$) against gap-return $k$ (horizontal axis, $0$ to $5$). Curves: worst-case $Z_k \le M - 3k$, typical orbit, best case. Annotations: “$\ge 3$” and “all bits free ($Z_k = 0$)”.]

Figure 3: Known-Zone Decay for $M = 12$. The known zone $Z_k$ decreases by at least $3$ per gap-return. The worst-case bound $Z_k \le M - 3k$ reaches zero after $\lceil M/3 \rceil = 4$ gap-returns. Empirically, typical orbits reach $Z_k = 0$ in $1$–$3$ steps.

Remark 6.2 (Computational verification). For $M = 12$: the bound gives $Z_k = 0$ after $4$ gap-returns. Across all $239$ testable residue classes $a \bmod 2^{12}$ with $a \equiv 7 \pmod{16}$ (the remaining $17$ classes reach $1$ before a gap-return), the minimum shrinkage $V - (M' - M)$ is $3$, confirming (4). Empirically, the known zone reaches $0$ in $1$–$3$ gap-returns for all tested classes.

Remark 6.3 (Relation to mixing properties). The Known-Zone Decay is an exact mixing statement: the known zone shrinks by $\ge 3$ bits per step, with no error term. This exactness follows from the algebraic structure of the gap-return (the Scrambling Lemma), and is a local property of the map rather than a global statement about orbits.
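Two of the arithmetic ingredients above can be checked directly. The following is a minimal sketch, not the verification scripts referred to in Remark 6.2: it confirms that every class $a \equiv 7 \pmod{16}$ modulo $2^{12}$ has $v_1 = v_2 = 1$ under the odd-to-odd step $n \mapsto (3n+1)/2^{v}$ (fact 1 in the proof of Proposition 5.3), and it iterates the worst-case recursion $Z_{k+1} = \max(0, Z_k - 3)$ of Theorem 6.1. The helper names `nu2`, `first_two_valuations`, and `worst_case_steps` are hypothetical, and the sketch deliberately does not implement the gap-return map itself.

```python
def nu2(x: int) -> int:
    """2-adic valuation of a positive integer x."""
    return (x & -x).bit_length() - 1


def first_two_valuations(n: int) -> tuple:
    """Valuations (v1, v2) of the first two odd-to-odd steps n -> (3n+1)/2^v."""
    v1 = nu2(3 * n + 1)
    m = (3 * n + 1) >> v1          # next odd iterate
    v2 = nu2(3 * m + 1)
    return v1, v2


def worst_case_steps(M: int) -> int:
    """Steps needed for Z_{k+1} = max(0, Z_k - 3) to reach 0 from Z_0 = M."""
    z, k = M, 0
    while z > 0:
        z = max(0, z - 3)
        k += 1
    return k


# All residue classes a mod 2^12 with a ≡ 7 (mod 16), taken as representatives.
classes = [a for a in range(2**12) if a % 16 == 7]
all_ones = all(first_two_valuations(a) == (1, 1) for a in classes)
print(len(classes), all_ones)
print([worst_case_steps(M) for M in (4, 12, 13)])
```

Under these assumptions the run reports 256 classes, all with $v_1 = v_2 = 1$, and worst-case step counts 2, 4, 5 for $M = 4, 12, 13$, which agree with $\lceil M/3 \rceil$. The count 256 also matches the $239 + 17$ classes of Remark 6.2.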
7 Conditional convergence and reduction to orbit equidistribution

We now state the conditional convergence theorem and formulate the Orbit Equidistribution Conjecture, which would supply the two open hypotheses required by the Burst-Gap Criterion.

7.1 The conditional convergence theorem

Theorem 7.1 (Conditional convergence). If, for every odd starting value $n_0$, the burst lengths $L_1, L_2, \ldots$ and gap lengths $G_1, G_2, \ldots$ of the orbit satisfy

(a) $\frac{1}{n}\sum_{i=1}^{n} G_i \;\ge\; g_* - \varepsilon_n$ with $\varepsilon_n \to 0$ for some $g_* > \frac{2(1-\rho_{\mathrm{crit}})}{\rho_{\mathrm{crit}}} \approx 1.71$, (Hypothesis A)

(b) $\sum_{i=1}^{n} L_i \;\le\; 2n + C(n_0)$ for some finite constant $C(n_0)$, (Hypothesis B)

then every Collatz orbit converges to $1$.

Proof. The argument chains four deterministic results with the two assumed hypotheses:

Step 1 (Hypotheses A and B). Both hypotheses are assumed to hold for the orbit. Hypothesis A is a mean gap condition; Hypothesis B is a mean burst condition. Neither is proved unconditionally, but both follow from the Orbit Equidistribution Conjecture (Theorem 7.3). The Persistent Exit Lemma (Lemma 4.4) provides structural support: when a burst ends at a persistent state ($m_t \equiv 7 \pmod 8$), the subsequent gap has length exactly $1$. More generally, the Modular Gap Distribution Lemma (Lemma 4.6) proves that gap length is $\mathrm{Geometric}(1/2)$ with $E[G] = 2$ in the equidistributed model, with each continuation decided by a single fresh bit.

Step 2 (Burst-Gap Criterion). By Theorem 4.9, Hypotheses A and B imply

$$\limsup_{T \to \infty} \frac{N_{\ge 2}(T)}{T} \;\le\; \frac{2}{2 + g_*}.$$

With $g_* = 2$ (the equidistribution value), this gives $N_{\ge 2}(T)/T \to \tfrac{1}{2}$.

Step 3 (Entry–Occupancy). By Theorem 4.1, $\limsup E_P(T)/T = \limsup N_P(T)/T \le 2/(2 + g_*)$. This is the elementary relabelling argument.

Step 4 (Entry bound). Since $2/(2 + g_*) < \rho_{\mathrm{crit}} \approx 0.539$ (by the hypothesis on $g_*$), Theorem 4.3 yields convergence.
The proof uses the certified-drift framework: a persistent occupancy rate below $\rho_{\mathrm{crit}}$ ensures negative mean drift, which forces the orbit below any threshold. ∎

7.2 The Orbit Equidistribution Conjecture

Conjecture 7.2 (Orbit Equidistribution Conjecture). For every odd $n_0$, the sequence of gap-return residues $a_i = T^i(n_0) \bmod 2^M$ is equidistributed modulo $2^M$, uniformly in $M$: there exists a function $M(N) \to \infty$ such that

$$\|\mu_{\mathrm{orb},N} - \mu_U\|_{TV} \to 0 \quad \text{as } N \to \infty \quad \text{modulo } 2^{M(N)},$$

where $\mu_{\mathrm{orb},N} = \frac{1}{N}\sum_{i=1}^{N} \delta_{a_i}$ and $\mu_U$ is the uniform distribution on the admissible residue classes modulo $2^{M(N)}$. In particular, this implies fixed-modulus equidistribution (take $M(N) = M$ constant) and provides the tail control needed to pass from truncated to full orbitwise means (see Theorem 7.3).

7.3 Conditional reduction from orbit equidistribution

Theorem 7.3 (Conditional reduction from orbit equidistribution). The Orbit Equidistribution Conjecture (Conjecture 7.2) implies the Collatz conjecture.

Proof. Assume Conjecture 7.2 holds for an orbit starting at $n_0$.

Step 1: Distributional burst length. The Scrambling Lemma (Theorem 5.1) shows that the gap-return map preserves the uniform distribution on residue classes: if $\delta$ is uniform, so is $3^g \delta$. By the $1/4$ Persistent-Transition Law (Theorem 3.1), the expected burst length under the uniform distribution is $E_{\mu_U}[L] = 2$ (Corollary 3.2).

Step 1′: Distributional gap length. By the Modular Gap Distribution Lemma (Lemma 4.6), each gap step continues with probability exactly $\tfrac{1}{2}$, determined by one fresh bit at the next modular depth. Gap length is therefore $\mathrm{Geometric}(1/2)$ with $E_{\mu_U}[G] = 2$.

Step 2: From distributional to pointwise (burst length).
If the orbit’s gap-return residues equidistribute modulo $2^M$ (Conjecture 7.2), then by the coupling inequality, for any bounded measurable function $f$:

$$\left| \frac{1}{N}\sum_{i=1}^{N} f(a_i) - E_{\mu_U}[f] \right| \;\le\; \|f\|_\infty \cdot \|\mu_{\mathrm{orb},N} - \mu_U\|_{TV} + o(1).$$

For burst length, the truncated function $f_K = \min(L, K)$ is bounded and determined by $a_i \bmod 2^{M(K)}$ for some finite $M(K)$. The coupling inequality gives $\frac{1}{N}\sum \min(L_i, K) \to E_{\mu_U}[\min(L, K)]$ for each fixed $K$. Since burst lengths have geometric tails by the $1/4$ Law (Corollary 3.2), taking $K \to \infty$ yields $\frac{1}{n}\sum L_i \to 2$.

Step 2′: From distributional to pointwise (gap length). The same truncation strategy applies in principle: for each fixed $K$, the event $\{G_i \ge k\}$ (for $k \le K$) depends on finitely many bits, so the truncated gap $\min(G_i, K)$ is determined by the residue class modulo $2^{M'(K)}$. Equidistribution modulo $2^{M'(K)}$ then gives $\frac{1}{N}\sum \min(G_i, K) \to E_{\mu_U}[\min(G, K)]$. However, passing to $K \to \infty$ requires tail control: one must show

$$\lim_{K \to \infty}\; \limsup_{N \to \infty}\; \frac{1}{N}\sum_{i=1}^{N} G_i\, \mathbf{1}_{G_i > K} \;=\; 0. \tag{5}$$

Under the uniform model, gap lengths are geometric with parameter $\tfrac{1}{2}$ (Step 1′), so $E_{\mu_U}[G\, \mathbf{1}_{G > K}] \to 0$ as $K \to \infty$. The growing-moduli form of the Orbit Equidistribution Conjecture (Conjecture 7.2) provides equidistribution at depth $M(N) \to \infty$, which controls the truncated means $\min(G_i, K)$ for all $K \le M(N)$ simultaneously, yielding (5). We therefore obtain $\frac{1}{n}\sum G_i \to 2$.

Step 3: Both hypotheses. Under the above assumptions,

$$\frac{1}{n}\sum_{i=1}^{n} L_i \;\to\; 2 \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^{n} G_i \;\to\; 2,$$

which are Hypotheses B and A of Theorem 7.1 (with $g_* = 2 > 1.71$ and $C(n_0)$ depending on the rate of equidistribution). Convergence follows. ∎

Remark 7.4 (The distributional-vs-pointwise gap).
The Scrambling Lemma proves a distributional result: if $n$ is drawn uniformly from a residue class, the bits of $T(n)$ are exactly uniform. The conjecture requires a pointwise result: that each individual orbit equidistributes. This is a pointwise genericity question, analogous to proving that a given number is normal in base $2$, not an almost-everywhere result. The map property (distributional uniformity) does not automatically imply the orbit property (pointwise equidistribution), just as knowing that the digit-frequency map preserves uniformity does not prove that a specific number like $\sqrt{2}$ is normal. Tao’s theorem [12] establishes almost-all convergence (in logarithmic density), which is a probabilistic rather than pointwise statement. Both approaches thus face the same fundamental barrier: the gap between distributional/almost-all and every-orbit results.

Remark 7.5 (What is known toward the conjecture). Several lines of evidence support the Orbit Equidistribution Conjecture:

1. The gap-return map preserves the uniform distribution (Scrambling Lemma, proved unconditionally).

2. The Known-Zone Decay shows that the map is strongly mixing at the residue-class level: after $\lceil M/3 \rceil$ applications, all dependence on the starting class is eliminated.

3. Almost-all equidistribution follows from Tao [12] (logarithmic density).

4. All empirical orbits tested (up to $2^{2000}$) satisfy equidistribution within sampling noise.

8 Position relative to prior work

The strongest rigorous progress on the Collatz conjecture remains Tao’s almost-all result [12]. The present work does not approach that level of mathematical strength. Its contribution is instead exploratory: it records several structural observations about burst–gap dynamics and modular scrambling, together with a transparent account of the human–LLM collaboration through which these observations were developed.
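As a numerical companion to the constants used in Section 7, the following minimal sketch evaluates the threshold $2(1-\rho_{\mathrm{crit}})/\rho_{\mathrm{crit}}$ of Theorem 7.1, the occupancy bound $2/(2+g_*)$, and the truncated and tail means of a $\mathrm{Geometric}(1/2)$ gap length that appear in Step 2′ of Theorem 7.3. It takes $\rho_{\mathrm{crit}} \approx 0.539$ as quoted in the text, and it assumes the gap length is supported on $\{1, 2, \ldots\}$ with $\Pr(G = j) = 2^{-j}$, the convention consistent with $E[G] = 2$. The function names are hypothetical, and the closed form used for the tail mean is a standard geometric-series identity rather than a formula stated in the paper.

```python
from fractions import Fraction

RHO_CRIT = 0.539  # approximate value quoted in Theorem 7.1 (taken as given)


def gap_threshold(rho: float) -> float:
    """Value of g_* solving 2 / (2 + g_*) = rho, i.e. 2(1 - rho) / rho."""
    return 2 * (1 - rho) / rho


def occupancy_bound(g_star: float) -> float:
    """Upper bound 2 / (2 + g_*) on the persistent occupancy rate."""
    return 2 / (2 + g_star)


def truncated_mean(K: int) -> Fraction:
    """E[min(G, K)] as the sum of Pr(G >= k) = 2^-(k-1) for k = 1..K."""
    return sum(Fraction(1, 2 ** (k - 1)) for k in range(1, K + 1))


def tail_mean(K: int) -> Fraction:
    """E[G * 1_{G > K}] via the closed form (K + 2) / 2^K (assumed model)."""
    return Fraction(K + 2, 2 ** K)


print(round(gap_threshold(RHO_CRIT), 2))      # threshold on g_*
print(occupancy_bound(2.0) < RHO_CRIT)        # equidistribution value g_* = 2
for K in (1, 4, 16):
    print(K, float(truncated_mean(K)), float(tail_mean(K)))
```

In this model the truncated mean increases to $2$ and the tail mean decays like $K \cdot 2^{-K}$; condition (5) asks that the orbit averages reproduce this decay, which is exactly what remains open.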
9 Note on LLM-assisted research methodology

This section is included in the interest of transparency and intellectual honesty.

9.1 The collaboration

This paper was produced through an extended collaboration between a human researcher (the author) and large language models (LLMs), specifically Anthropic’s Claude 4.6 and OpenAI’s GPT 5.4 Thinking.

The author served as the intellectual architect and moderator: setting proof directions, formulating research questions, evaluating mathematical claims, directing the exploration, identifying errors, and making all judgment calls regarding correctness and significance.

Claude 4.6 served as the primary computational and expository partner: performing algebraic calculations, writing verification code, drafting proofs, and iterating through approaches at speed. The moderator and Claude worked in a tight loop, exchanging ideas and formulating alternative approaches at each step. The moderator made the decision at every juncture about which option to explore next.

GPT 5.4 Thinking served as a validator partner, verifying both the moderator’s instructions and Claude 4.6’s output on specific local issues rather than addressing the global proof structure.

9.2 The orchestration methodology

A critical aspect of this collaboration was the moderator’s orchestration discipline. Following the spirit of the author’s SagaLLM framework [chang2025sagallm], the moderator ensured that all intermediate results were saved at each step. Over hundreds of iterations, this made it possible to undo and redo effectively without repeating the same errors or stepping into dead-end paths. When a line of reasoning stalled, the moderator would instruct Claude to investigate a cold path from different angles. The “transaction property,” orchestrated by the moderator, became a key lesson: each exploration step was treated as a reversible transaction, allowing the research to backtrack cleanly when necessary.
By contrast, when other LLMs were allowed to derive proofs autonomously, the process quickly became difficult to control. This is why GPT was used primarily as a validator for checking local issues rather than as a driver of the global proof architecture.

9.3 What the LLMs contributed

• Rapid exploration of proof strategies (nine routes were investigated before the Scrambling Lemma emerged).

• Computational verification: Python scripts tested the Scrambling Lemma across all $256$ residue classes modulo $2^{12}$, verified the $1/4$ law across thousands of orbits, and checked the Known-Zone Decay bound.

• Algebraic manipulation and drafting of proofs, particularly the carry-free decomposition that forms the core of Theorem 5.1.

• Identification of the distributional-vs-pointwise gap: the LLM flagged that its earlier claim of “unconditional convergence” was an overstatement before external peer review confirmed the same issue.

9.4 What the human contributed

• The overall research direction: choosing to study the Collatz conjecture and framing the attack through persistent occupancy and burst-gap analysis.

• Key conceptual redirections: instructing the LLM to work on the Syracuse map (Collatz$^*$, the odd-to-odd iteration) rather than the classical Collatz map, and to analyze orbits rather than sequences. The moderator also directed investigation of reverse sequences, suggested alternative convergence criteria (“underwater,” “trunk,” and stratified-orbit arguments), and proposed energy-compensation ideas. Not all of these survived into the final proof, but several, notably the Syracuse framing and the burst-gap orbit analysis, became foundational.

• The orchestration: deciding which routes to pursue, which to abandon, and when to synthesize results.

• Quality control: demanding rigorous proofs, identifying overclaims, and insisting on intellectual honesty when the unconditional proof attempt failed.
9.5 Contribution attribution

Table 1 summarizes the nine most significant contributions in this work, categorized by importance level and attributed to the primary contributor. Importance levels are color-coded: Critical denotes mathematical correctness fixes without which the paper would contain errors; High denotes new mathematical content or substantial structural additions.

Table 1: Key contributions by importance. Critical = correctness fix, High = new content.

# | Level | Contribution | Primary source
1 | Critical | Fix $\nu_2(9m_t + 1) \ge 6$ overclaim $\to$ $\ge 3$ with explicit computation | Moderator
2 | Critical | Fix Theorem 7.3 statement/proof mismatch (tail-control gap) | Moderator
3 | Critical | Fix Gap Distribution Lemma scope ($\bmod\; 2^m$, $m > 2$ overconstrained) | Validation (Claude)
4 | High | Gap Distribution Lemma: $G \sim \mathrm{Geom}(1/2)$, $E[G] = 2$ | Moderator (direction) + Claude (proof)
5 | High | Valuation Distribution Lemma: $\Pr(k = j) = 2^{-(j-1)}$, $E[k] = 3$ | Claude
6 | High | Convergence Corollary: net log-contraction $\approx -1.15$ | Claude
7 | High | Growing-moduli Orbit Equidistribution Conjecture (Option B) | Moderator (choice) + Claude (formulation)
8 | High | Case study: the false Gap Lemma (failure modes & lessons) | Moderator (direction) + Claude (draft)
9 | High | Writing and claim inconsistencies | GPT

9.6 Case study: the false Gap Lemma

An earlier version of this paper contained a lemma claiming that gap length is never $2$ ($G_i \ne 2$) after a persistent exit. Had it been true, the result would have reduced the Collatz conjecture to a single equidistribution conjecture with no additional tail-control hypothesis. The lemma was false. The episode is documented here because it illustrates several failure modes of human–LLM collaboration that are instructive for the methodology.

How the error arose. The LLM produced a proof that when a burst ends at a persistent state ($m_t \equiv 7 \pmod 8$), the subsequent gap has length exactly $1$.
This part is correct and survives as the Persistent Exit Lemma (Lemma 4.4). The error was in generalizing: the LLM’s proof implicitly assumed that every burst ends at a persistent state. In fact, bursts can also end at non-persistent states ($k_t = 2$ with $m_t \not\equiv 7 \pmod 8$), and the gap structure after such exits is unconstrained. The false lemma $G_i \ne 2$ was a pattern-match from the persistent case to the general case.

How it survived. The false lemma survived multiple rounds of LLM-assisted proofreading, validator checks by a second model, and initial peer feedback. Three factors contributed:

1. Confirmation bias in validation. When asked to “verify the Gap Lemma,” both models checked the algebraic steps within the persistent-exit proof rather than questioning the scope of the claim. The proof was correct for the case it addressed; the error was in the unstated assumption that this case was exhaustive.

2. Sycophantic momentum. Once the lemma appeared in the proof chain, the models treated it as established and built further arguments on top of it. Neither model spontaneously revisited the lemma’s premises during later rounds of proofreading.

3. Insufficient adversarial testing. The computational verification tested the Persistent Exit Lemma at the modular level (where it holds) but did not separately enumerate gap lengths across all orbit segments to check whether $G_i = 2$ actually occurs.

How it was found. The moderator, during a careful re-reading of the full proof chain, asked whether every burst necessarily ends at a persistent state. A targeted computational check immediately produced counterexamples: the orbit starting at $n_0 = 3$ has a gap of length $2$, and approximately $19\%$ of all gaps in typical orbits have length $2$ or more.

Remediation. The correction proceeded in three stages:

1. Honest downgrading. The false lemma was removed.
All claims that depended on it were weakened to conditional statements, and the paper’s title and abstract were revised to reflect exploratory rather than reductive framing.

2. Going one level deeper. By tracking one additional bit of modular depth at each gap step, we proved the Modular Gap Distribution Lemma (Lemma 4.6): gap length is $\mathrm{Geometric}(1/2)$ with $E[G] = 2$. This is weaker than $G_i = 1$ always, but it is true and sufficient: combined with the Modular Valuation Distribution (Lemma 4.7, $E[k] = 3$), the net log-contraction per burst-gap cycle is $-1.15$, comfortably negative (Corollary 4.8).

3. Strengthening the conjecture. The Orbit Equidistribution Conjecture was upgraded from fixed-modulus to growing-moduli form (Conjecture 7.2), which supplies the tail control that the false lemma had made unnecessary. Theorem 7.3 now states a clean implication from one conjecture.

Lessons. The episode yields four concrete lessons for human–LLM mathematical collaboration:

1. Transactional discipline. The moderator’s practice of saving intermediate states made it possible to surgically remove the false lemma and its downstream consequences without starting over.

2. Scope-checking over step-checking. Validators should be directed to question the scope of a claim (“does this cover all cases?”) rather than only the steps of a proof. The algebraic steps were correct; the implicit universality was not.

3. Independent computational falsification. A simple enumeration of gap lengths across actual orbits would have caught the error immediately. Computational checks should be designed to falsify claims, not merely to confirm the cases the proof addresses.

4. Domain expertise as bottleneck. The error was caught by the moderator’s targeted question, not by any automated process. Current LLMs do not spontaneously generate adversarial queries against their own outputs with sufficient rigor.
This gap is the single most important limitation of autonomous LLM proof search.

9.7 Broader implications

This work is, to the author’s knowledge, an early example of human–LLM collaborative research applied to a longstanding open problem in mathematics. The author served primarily as a “moderator” guiding an LLM through a mathematical research program that would ordinarily require graduate-level number theory expertise.

The methodology, human orchestration of LLM capabilities, is explored in the author’s books The Path to AGI, Volumes 1 and 2 [PathAGIV1Chang2025, PathAGIV2Chang2025], which examine how effective human–AI collaboration can amplify both parties’ capabilities beyond what either achieves alone. Specifically, the orchestration provides three rigorous methodologies to mitigate common LLM limitations: (1) context loss, (2) sycophancy, and (3) reasoning errors. Additional features such as reducing hallucination, maintaining strong debate synergy, and preserving reasoning quality are discussed in those volumes. Details of the methodology applied in this work will be documented in a separate report.

Terence Tao has noted in public remarks that he now uses LLMs extensively in his own research workflow. The present work illustrates how the combination of human mathematical intuition and LLM computational power can accelerate the exploration–verification–correction cycle without replacing human insight. The significance of this work lies in illustrating a new mode of mathematical research in which the human contribution is primarily architectural and conceptual while the LLM contribution is exploratory and expository.

Indeed, Donald Knuth’s recent “Shock! Shock!” note [8] demonstrates the point vividly. Claude Opus 4.6 solved an open Hamiltonian-cycle problem that Knuth had investigated for weeks, producing a construction after 31 systematic explorations in roughly one hour. Knuth then supplied the rigorous proof.
The episode illustrates an emerging division of labor: the LLM rapidly explores large search spaces and proposes candidate structures, while the human mathematician interprets the result and establishes the formal proof.

This pattern is consistent with observations made by Terence Tao regarding the evolving relationship between mathematics and artificial intelligence. Human mathematicians contribute deep intuition, the perception of abstract structure, and the ability to reformulate problems within entirely new conceptual frameworks. Artificial intelligence, by contrast, excels at large-scale exploration: testing vast numbers of possibilities, identifying patterns across enormous computational spaces, and rapidly evaluating candidate hypotheses.

The Collatz exploration described in this work illustrates this complementary dynamic. The LLM agents performed extensive computational exploration, generating and testing multiple analytical formulations, examining modular dynamics, and identifying structural phenomena such as carry propagation and ghost-cycle behavior. These explorations clarified several structural properties of the iteration. However, they also repeatedly encountered the same fundamental obstruction: the difficulty of converting distributional information about “typical” orbit behavior into pointwise guarantees for every orbit.

One structural pattern revealed by this exploration is that the Collatz map lies at the intersection of two multiplicative worlds: powers of two and powers of three. Many observed phenomena, including drift behavior, valuation patterns, modular carry propagation, and near-resonances between $3^L$ and $2^S$, appear to arise from the interaction between these two arithmetic structures. This suggests that further progress may require a conceptual reframing that treats these structures simultaneously rather than separately.
The broader methodological lesson is that effective collaboration between human intuition and machine-scale exploration can accelerate mathematical research. In this mode of research, the human contribution is primarily architectural and conceptual, while the LLM contribution is exploratory and expository. The combination is complementary rather than substitutive. If future LLM systems acquire the ability to guide their own exploration with the discipline and reliability that currently require human moderation, the practice of mathematical research may change significantly.

Acknowledgements

The author acknowledges the use of Anthropic’s Claude (Claude Opus 4.6) and OpenAI’s GPT 5.4 Thinking as collaborative research tools throughout the development of this work. All mathematical claims were verified through a combination of formal proofs and computational checks. The distributional-vs-pointwise gap was identified during the research process and later confirmed by external peer review.

References

[1] A. Baker (1966) Linear forms in the logarithms of algebraic numbers. Mathematika 13, pp. 204–216.

[2] D. Bárina (2021) Convergence verification of the Collatz problem. The Journal of Supercomputing 77, pp. 2681–2688.

[3] G. D. Birkhoff (1931) Proof of the ergodic theorem. Proceedings of the National Academy of Sciences of the USA 17, pp. 656–660.

[4] K. Borovkov and D. Pfeifer (2001) Estimates for the Syracuse problem via a probabilistic model. Theory of Probability and Its Applications 45 (2), pp. 300–310.

[5] C. J. Everett (1977) Iteration of the number-theoretic function $f(2n) = n$, $f(2n+1) = 3n+2$. Advances in Mathematics 25, pp. 42–45.

[6] J. Evertse (1984) On sums of S-units and linear recurrences. Compositio Mathematica 53, pp. 225–244.

[7] F. Frobenius (1912) Über Matrizen aus nicht negativen Elementen.
Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften zu Berlin, pp. 456–477.

[8] D. E. Knuth (2026-02) Claude’s cycles. Stanford Computer Science Department. Revised March 2026. Available at https://cs.stanford.edu/~knuth/papers/claude-cycles.pdf

[9] J. C. Lagarias (1985) The $3x+1$ problem and its generalizations. American Mathematical Monthly 92 (1), pp. 3–23.

[10] J. C. Lagarias (Ed.) (2010) The ultimate challenge: the $3x+1$ problem. American Mathematical Society, Providence, RI.

[11] O. Perron (1907) Zur Theorie der Matrizen. Mathematische Annalen 64, pp. 248–263.

[12] T. Tao (2020) Almost all orbits of the Collatz map attain almost bounded values. Forum of Mathematics, Pi 8, e14.

[13] R. Terras (1976) A stopping time problem on the positive integers. Acta Arithmetica 30, pp. 241–252.

[14] G. J. Wirsching (1998) The dynamical system generated by the $3n+1$ function. Lecture Notes in Mathematics, Vol. 1681, Springer, Berlin.