Instant research discovery
Search and browse ingested papers with intelligence signals and fast filtering.
| Paper | Authors | Year | Area | Tags | Intel | Citations |
|---|---|---|---|---|---|---|
| Compressibility Measures Complexity: Minimum Description Length Meets Singular Learning Theory | Jesse Hoogland, Einar Urdshals, Edmund Lau, Daniel Murfet | 2025 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (95%) | 3 |
| Data Shapley in One Training Run | Ruoxi Jia, Jiachen T. Wang, Prateek Mittal, Dawn Song | 2025 | Training Dynamics | theoretical, ai-safety, training-dynamics | E5 / R3 (95%) | 48 |
| Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient | Jesse Hoogland, Zach Furman, George Wang, Daniel Murfet | 2025 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (93%) | 24 |
| Evolution of Concepts in Language Model Pre-Training | Xuyang Ge, Zhengfu He, Yunhua Zhou, Jiaxing Wu | 2025 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (94%) | 2 |
| Generalization to Political Beliefs from Fine-Tuning on Sports Team Preferences | Owen Terry | 2025 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (93%) | - |
| Learning Dynamics of LLM Finetuning | Yi Ren, Danica J. Sutherland | 2025 | Training Dynamics | theoretical, alignment-training, ai-safety, training-dynamics | E5 / R3 (94%) | 67 |
| Learning from Negative Examples: Why Warning-Framed Training Data Teaches What It Warns Against | Tsogt-Ochir Enkhbayar | 2025 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (96%) | - |
| Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model | Wei Hu, Zhiwei Xu, Zhiyu Ni, Yixin Wang | 2025 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (95%) | 6 |
| The Local Learning Coefficient: A Singularity-Aware Complexity Measure | Zach Furman, Susan Wei, Edmund Lau, George Wang | 2025 | Training Dynamics | theoretical, ai-safety, training-dynamics | E5 / R3 (96%) | 34 |
| Using physics-inspired Singular Learning Theory to understand grokking & other phase transitions in modern neural networks | Anish Lakkapragada | 2025 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (95%) | - |
| AI Models Collapse When Trained on Recursively Generated Data | Ilia Shumailov, Ross Anderson, Zakhar Shumaylov, Yarin Gal | 2024 | Training Dynamics | empirical, ai-safety, training-dynamics | E6 / R3 (97%) | 427 |
| Fine-tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking | David Bau, Nikhil Prakash, Tamar Rott Shaham, Tal Haklay | 2024 | Training Dynamics | empirical, ai-safety, training-dynamics | E7 / R4 (95%) | 101 |
| Learning and Unlearning of Fabricated Knowledge in Language Models | Mark Sandler, Chen Sun, Nolan Andrew Miller, Max Vladymyrov | 2024 | Training Dynamics | empirical, ai-safety, training-dynamics | E4 / R3 (94%) | 5 |
| Loss of Plasticity in Deep Continual Learning | Parash Rahman, J. Fernando Hernandez-Garcia, A. Rupam Mahmood, Shibhansh Dohare | 2024 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (97%) | 40 |
| Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs | Ravid Shwartz-Ziv, Angelica Chen, Matthew L. Leavitt, Kyunghyun Cho | 2024 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (97%) | 109 |
| The Developmental Landscape of In-Context Learning | Jesse Hoogland, Matthew Farrugia-Roberts, Susan Wei, George Wang | 2024 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (96%) | 19 |
| Are Emergent Abilities of Large Language Models a Mirage? | Rylan Schaeffer, Brando Miranda, Sanmi Koyejo | 2023 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (96%) | 591 |
| Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition | Susan Wei, Edmund Lau, Jake Mendel, Zhongtian Chen | 2023 | Training Dynamics | theoretical, ai-safety, training-dynamics | E5 / R3 (95%) | 22 |
| Emergent and Predictable Memorization in Large Language Models | Hailey Schoelkopf, Quentin Anthony, Edward Raff, Shivanshu Purohit | 2023 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (96%) | 173 |
| Explaining Grokking Through Circuit Efficiency | János Kramár, Vikrant Varma, Rohin Shah, Ramana Kumar | 2023 | Training Dynamics | empirical, ai-safety, training-dynamics | E6 / R3 (93%) | 80 |
| Progress Measures for Grokking via Mechanistic Interpretability | Lawrence Chan, Tom Lieberum, Jess Smith, Neel Nanda | 2023 | Training Dynamics | empirical, ai-safety, training-dynamics, interpretability | E5 / R3 (95%) | 680 |
| Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | Hailey Schoelkopf, Quentin Anthony, Edward Raff, Shivanshu Purohit | 2023 | Training Dynamics | ai-safety, training-dynamics, tool, interpretability | E5 / R3 (99%) | 1708 |
| Studying Large Language Model Generalization with Influence Functions | Alex Tamkin, Nicholas Joseph, Benoit Steiner, Karina Nguyen | 2023 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (94%) | 286 |
| The Reversal Curse: LLMs trained on 'A is B' fail to learn 'B is A' | Max Kaufmann, Meg Tong, Mikita Balesni, Owain Evans | 2023 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (96%) | 425 |
| Transformers Learn In-Context by Gradient Descent | Johannes von Oswald, Alexander Mordvintsev, Eyvind Niklasson, Max Vladymyrov | 2023 | Training Dynamics | theoretical, ai-safety, training-dynamics | E5 / R3 (95%) | 677 |
| Emergent Abilities of Large Language Models | William Fedus, Tatsunori Hashimoto, Yi Tay, Jeff Dean | 2022 | Training Dynamics | empirical, ai-safety, training-dynamics | E7 / R3 (96%) | 3244 |
| Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets | Harri Edwards, Alethea Power, Igor Babuschkin, Yuri Burda | 2022 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (95%) | 526 |
| Inverse Scaling Can Become U-Shaped | Yi Tay, Jason Wei, Quoc V. Le, Najoung Kim | 2022 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R3 (95%) | 79 |
| Quantifying Memorization Across Neural Language Models | Nicholas Carlini, Katherine Lee, Matthew Jagielski, Daphne Ippolito | 2022 | Training Dynamics | empirical, ai-safety, training-dynamics | E5 / R4 (96%) | 805 |
| Scaling Laws and Interpretability of Learning from Repeated Data | Tom Conerly, Nicholas Joseph, Dawn Drain, Tom Henighan | 2022 | Training Dynamics | empirical, ai-safety, training-dynamics, interpretability | E5 / R3 (94%) | 148 |
Showing 30 of 33 papers on page 1.