Instant research discovery
Search and browse ingested papers with intelligence signals and fast filtering.
| Paper | Authors | Year | Area | Tags | Intel | Citations |
|---|---|---|---|---|---|---|
| AIR: Improving Agent Safety through Incident Response | Jun Sun, Junjie Chen, Zibo Xiao | 2026 | Agent Safety | empirical, agent-safety, ai-safety | E5 / R3 (95%) | - |
| Agentic Uncertainty Reveals Agentic Overconfidence | Jean Kaddour, Leo Richter, Srijan Patel, Pasquale Minervini | 2026 | Agent Safety | empirical, agent-safety, ai-safety, adversarial-robustness | E5 / R3 (96%) | - |
| Authenticated Workflows: A Systems Approach to Protecting Agentic AI | Mohan Rajagopalan, Vinay Rao | 2026 | Agent Safety | theoretical, agent-safety, ai-safety | E5 / R4 (96%) | - |
| Institutional AI: Governing LLM Collusion in Multi-Agent Cournot Markets via Public Governance Graphs | Daniele Nardi, Piercosma Bisconti, Vincenzo Suriani, Marcantonio Bracale Syrnikov | 2026 | Agent Safety | empirical, agent-safety, ai-safety | E5 / R4 (92%) | 1 |
| LPS-Bench: Benchmarking Safety Awareness of Computer-Use Agents in Long-Horizon Planning under Benign and Adversarial Scenarios | Xia Hu, Ge Gao, Chujia Hu, Dongrui Liu | 2026 | Agent Safety | agent-safety, ai-safety, adversarial-robustness, benchmark | E4 / R3 (95%) | 1 |
| Modular Safety Guardrails Are Necessary for Foundation-Model-Enabled Robots in the Real World | Davood Soleymanzadeh, Yi Ding, Minghui Zheng, Joonkyung Kim | 2026 | Agent Safety | agent-safety, ai-safety, position | E5 / R4 (93%) | - |
| RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments | Zhiqiang Lin, Zeyi Liao, Eric Fosler-Lussier, Yu Su | 2026 | Agent Safety | agent-safety, ai-safety, adversarial-robustness, benchmark | E6 / R3 (95%) | 12 |
| The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies | Chaozhuo Li, Xi Zhang, Chenxu Wang, Songyang Liu | 2026 | Agent Safety | empirical, alignment-training, agent-safety, ai-safety | E4 / R3 (95%) | - |
| The Shadow Self: Intrinsic Value Misalignment in Large Language Model Agents | Qian Wang, Yuan Yang, Ziyao Liu, Kwok-Yan Lam | 2026 | Agent Safety | alignment-training, agent-safety, ai-safety, benchmark | E4 / R3 (96%) | - |
| Toward Constitutional Autonomy in AI Systems: A Theoretical Framework for Aligned Agentic Intelligence | William Torgbi Agbemabiese | 2026 | Agent Safety | theoretical, alignment-training, agent-safety, ai-safety | E4 / R3 (93%) | 1 |
| TrajAD: Trajectory Anomaly Detection for Trustworthy LLM Agents | Yang Yu, Chong Zhang, Yong Wang, Yibing Liu | 2026 | Agent Safety | empirical, agent-safety, ai-safety | E5 / R3 (96%) | - |
| Trustworthy Agentic AI Requires Deterministic Architectural Boundaries | Manish Bhattarai, Minh Vu | 2026 | Agent Safety | theoretical, alignment-training, agent-safety, ai-safety | E5 / R4 (97%) | - |
| A Behavioural and Representational Evaluation of Goal-Directedness in Language Model Agents | Moksh Nirvaan, Raghu Arghal, Fade Chen, Angelos Nalmpantis | 2025 | Agent Safety | empirical, agent-safety, ai-safety, safety-evaluation | E4 / R3 (94%) | - |
| A Sketch of an AI Control Safety Case | Geoffrey Irving, Buck Shlegeris, Tomek Korbak, Joshua Clymer | 2025 | Agent Safety | agent-safety, ai-safety, position, safety-evaluation | E6 / R4 (93%) | 22 |
| A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures | Dezhang Kong, Xuan Liu, Yuyuan Li, Zhenhua Xu | 2025 | Agent Safety | agent-safety, ai-safety, survey | E5 / R3 (97%) | 35 |
| A Survey on Agentic Security: Applications, Threats and Defenses | Asif Shahriar, Sadif Ahmed, Farig Sadeque, Md Nafiu Rahman | 2025 | Agent Safety | agent-safety, ai-safety, adversarial-robustness, survey | E6 / R4 (96%) | 6 |
| A Survey on Autonomy-Induced Security Risks in Large Model-Based Agents | Chang Liu, Jun Zhu, Jun Luo, Hang Su | 2025 | Agent Safety | agent-safety, ai-safety, survey | E5 / R3 (95%) | 12 |
| A Survey on Trustworthy LLM Agents: Threats and Countermeasures | Qingsong Wen, Shilong Wang, Bo An, Linsey Pang | 2025 | Agent Safety | agent-safety, ai-safety, survey | E5 / R4 (95%) | 55 |
| A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron? | Yongjiang Wu, Jen-tse Huang, Wenxuan Wang, Ada Chen | 2025 | Agent Safety | agent-safety, ai-safety, survey | E5 / R3 (95%) | 15 |
| ACE: A Security Architecture for LLM-Integrated App Systems | William Robertson, Cristina Nita-Rotaru, Alina Oprea, Evan Li | 2025 | Agent Safety | empirical, agent-safety, ai-safety | E5 / R3 (95%) | 16 |
| AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions | Xianglong Liu, Dacheng Tao, Jiakai Wang, Siyuan Liang | 2025 | Agent Safety | agent-safety, ai-safety, adversarial-robustness, benchmark | E5 / R4 (99%) | 13 |
| AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection | Shenghong Dai, Muhao Chen, Suman Banerjee, Chaowei Xiao | 2025 | Agent Safety | empirical, agent-safety, ai-safety | E7 / R4 (95%) | 29 |
| AI Kill Switch for Malicious Web-based LLM Agents | Sechan Lee, Sangdon Park | 2025 | Agent Safety | empirical, agent-safety, ai-safety | E5 / R4 (96%) | - |
| AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery | Rupeng Zhang, Junjie Wang, Qing Wang, Xiaojun Jia | 2025 | Agent Safety | empirical, agent-safety, ai-safety | E5 / R3 (96%) | 4 |
| Adapting Insider Risk mitigations for Agentic Misalignment: an empirical study | Francesca Gomez | 2025 | Agent Safety | empirical, alignment-training, agent-safety, ai-safety | E6 / R3 (96%) | - |
| AgentAlign: Navigating Safety Alignment in the Shift from Informative to Agentic Large Language Models | Lu Yin, Jinchuan Zhang, Songlin Hu, Yan Zhou | 2025 | Agent Safety | empirical, alignment-training, agent-safety, ai-safety | E5 / R3 (96%) | 4 |
| AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement | J Rosser, Jakob Foerster | 2025 | Agent Safety | empirical, agent-safety, ai-safety, adversarial-robustness | E5 / R4 (94%) | 6 |
| AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents | Arman Zharmagambetov, Maya Pavlova, Kamalika Chaudhuri, Chuan Guo | 2025 | Agent Safety | agent-safety, ai-safety, safety-evaluation, benchmark | E6 / R3 (97%) | 31 |
| AgentGuardian: Learning Access Control Policies to Govern AI Agent Behavior | David Mimran, Nadya Abaev, Denis Klimov, Gerard Levinov | 2025 | Agent Safety | empirical, agent-safety, ai-safety | E4 / R3 (96%) | 2 |
| AgentLeak: A Full-Stack Benchmark for Privacy Leakage in Multi-Agent LLM Systems | Faouzi El Yagoubi, Ranwa Al Mallah, Godwin Badu-Marfo | 2025 | Agent Safety | agent-safety, ai-safety, benchmark | E5 / R3 (95%) | - |
Showing 30 of 195 papers on page 1.