Instant research discovery
Search and browse ingested papers with intelligence signals and fast filtering.
| Paper | Authors | Year | Area | Tags | Intel | Citations |
|---|---|---|---|---|---|---|
| AI Act Evaluation Benchmark: An Open, Transparent, and Reproducible Evaluation Dataset for NLP and RAG Systems | Athanasios Davvetas, Michael Papademas, Vangelis Karkaletsis, Xenia Ziouvelou | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | E4 / R3 (95%) | - |
| Alignment Verifiability in Large Language Models: Normative Indistinguishability under Behavioral Evaluation | Igor Santos-Grueiro | 2026 | Deception & Failure | theoretical, alignment-training, ai-safety, deception-failure, safety-evaluation | E4 / R2 (94%) | 1 |
| An Agentic Evaluation Framework for AI-Generated Scientific Code in PETSc | Murat Keceli, Satish Balay, Hong Zhang, Junchao Zhang | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | - | - |
| An Interactive Multi-Agent System for Evaluation of New Product Concepts | Bin Xuan, Ruo Ai, Hakyeon Lee | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | E6 / R4 (96%) | - |
| Attack Selection Reduces Safety in Concentrated AI Control Settings against Trusted Monitoring | Tyler Tracy, Joachim Schaeffer, Arjun Khandelwal | 2026 | Safety Evaluation | empirical, ai-safety, safety-evaluation | E6 / R3 (94%) | - |
| AutoControl Arena: Synthesizing Executable Test Environments for Frontier AI Risk Evaluation | Xudong Pan, Fazl Barez, Changyi Li, Min Yang | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | E5 / R3 (95%) | - |
| Benchmarking PDF Parsers on Table Extraction with LLM-based Semantic Evaluation | Janis Keuper, Pius Horn | 2026 | cs.CV | ai-safety, cscv, safety-evaluation, preprint | - | - |
| Beyond Suffixes: Token Position in GCG Adversarial Attacks on Large Language Models | Fadi Hassan, Hicham Eddoubi, Umar Faruk Abdullahi | 2026 | Adversarial Robustness | empirical, ai-safety, adversarial-robustness, safety-evaluation | E7 / R3 (98%) | - |
| CLAIRE: Compressed Latent Autoencoder for Industrial Representation and Evaluation -- A Deep Learning Framework for Smart Manufacturing | Mohammadhossein Ghahramani, Mengchu Zhou | 2026 | cs.LG | ai-safety, cslg, safety-evaluation, preprint | E5 / R3 (94%) | - |
| CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation | Sung Eun Kim, Thibault Heintz, Pranav Rajpurkar, Mona Alhammad | 2026 | cs.CL | cscl, ai-safety, safety-evaluation, preprint | E5 / R3 (94%) | - |
| CSSBench: Evaluating the Safety of Lightweight LLMs against Chinese-Specific Adversarial Patterns | Qiankun Li, Kun Wang, Shilinlu Yan, Zhenhong Zhou | 2026 | Safety Evaluation | ai-safety, adversarial-robustness, safety-evaluation, benchmark | E6 / R5 (96%) | - |
| CUAAudit: Meta-Evaluation of Vision-Language Models as Auditors of Autonomous Computer-Use Agents | Oleksandr Kosovan, Marta Sumyk | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | - | - |
| ClawTrap: A MITM-Based Red-Teaming Framework for Real-World OpenClaw Security Evaluation | Haochen Zhao, Shaoyang Cui | 2026 | cs.CR | ai-safety, cscr, safety-evaluation, preprint | - | - |
| CoMAI: A Collaborative Multi-Agent Framework for Robust and Equitable Interview Evaluation | Zhiwei Xu, Liangyi Yin, Ruihao Yu, Bin Zhang | 2026 | cs.MA | ai-safety, safety-evaluation, csma, preprint | - | - |
| CoTJudger: A Graph-Driven Framework for Automatic Evaluation of Chain-of-Thought Efficiency and Redundancy in LRMs | Ge Zhang, Hamid Alinejad-Rokny, Zhoufutu Wen, Yizhi Li | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | E4 / R3 (95%) | - |
| Constructing Safety Cases for AI Systems: A Reusable Template Framework | Jieshan Chen, Md Shamsujjoha, Sung Une Lee, Liming Dong | 2026 | Safety Evaluation | ai-safety, survey, safety-evaluation | E5 / R4 (97%) | - |
| DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models | Yutong Zhang, Ruofan Liao, Qi Cao, Yu Zheng | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | - | - |
| Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients | Jerry Liu, Abby Pandya, Nerissa Ambers, Stephen P Ma | 2026 | cs.CY | ai-safety, cscy, safety-evaluation, preprint | - | - |
| Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails | Gregory N. Frank | 2026 | cs.LG | alignment-training, ai-safety, cslg, safety-evaluation, preprint | - | - |
| Do Large Language Models Reflect Demographic Pluralism in Safety? | Rafiq Ali, Sushant Kumar Ray, Usman Naseem, Abdullah Mohammad | 2026 | Safety Evaluation | ai-safety, safety-evaluation, benchmark | E6 / R4 (95%) | - |
| Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context | Tek Raj Chhetri, Ashish Pandey | 2026 | cs.CL | cscl, ai-safety, safety-evaluation, preprint | E4 / R2 (97%) | - |
| Efficient Policy Learning with Hybrid Evaluation-Based Genetic Programming for Uncertain Agile Earth Observation Satellite Scheduling | Junhua Xue, Yuning Chen | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | E5 / R4 (97%) | - |
| Ethical Risks in Deploying Large Language Models: An Evaluation of Medical Ethics Jailbreaking | Chengze Yan, Yunlou Fan, Jiacheng Ji, Hanhui Xu | 2026 | Adversarial Robustness | empirical, ai-safety, adversarial-robustness, safety-evaluation | E5 / R3 (96%) | - |
| Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI | Enrico Coiera, Farah Magrabi, David Fraile Navarro | 2026 | cs.HC | ai-safety, cshc, safety-evaluation, preprint | - | - |
| Expected Harm: Rethinking Safety Evaluation of (Mis)Aligned LLMs | Zhi Rui Tam, Yen-Shan Chen, Yun-Nung Chen, Cheng-Kuang Wu | 2026 | Safety Evaluation | empirical, ai-safety, adversarial-robustness, safety-evaluation | E5 / R3 (95%) | - |
| From Data to Behavior: Predicting Unintended Model Behaviors Before Training | Huajun Chen, Zhenqian Xu, Junfeng Fang, Ningyu Zhang | 2026 | Safety Evaluation | empirical, ai-safety, safety-evaluation | E5 / R3 (94%) | - |
| From Helpfulness to Toxic Proactivity: Diagnosing Behavioral Misalignment in LLM Agents | Sen Su, Fanyu Meng, Zhenhong Zhou, Zhengshuo Gong | 2026 | Safety Evaluation | alignment-training, ai-safety, safety-evaluation, benchmark | E5 / R3 (94%) | - |
| GNNs for Time Series Anomaly Detection: An Open-Source Framework and a Critical Evaluation | Marcelo Fiori, Federico Larroca, Gonzalo Chiarlone, Gastón García González | 2026 | cs.LG | ai-safety, cslg, safety-evaluation, preprint | E7 / R3 (96%) | - |
| Hospitality-VQA: Decision-Oriented Informativeness Evaluation for Vision-Language Models | Seungduk Kim, Jeongwoo Lee, Eungyeol Han, Baek Duhyeong | 2026 | cs.AI | ai-safety, csai, safety-evaluation, preprint | E7 / R3 (94%) | - |
| ICE: Intervention-Consistent Explanation Evaluation with Statistical Grounding for LLMs | Pavan Chakraborty, Abhinaba Basu | 2026 | cs.CL | cscl, ai-safety, safety-evaluation, preprint | - | - |
Showing 30 of 641 papers on page 1.