Instant research discovery

Search and browse ingested papers with intelligence signals and fast filtering.

PaperIntel
AI Act Evaluation Benchmark: An Open, Transparent, and Reproducible Evaluation Dataset for NLP and RAG Systems

Athanasios Davvetas, Michael Papademas, Vangelis Karkaletsis, Xenia Ziouvelou

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

E4 / R3 (95%)
Alignment Verifiability in Large Language Models: Normative Indistinguishability under Behavioral Evaluation

Igor Santos-Grueiro

Year: 2026 | Area: Deception & Failure | Citations: 1

Tags: theoretical, alignment-training, ai-safety, deception-failure, safety-evaluation

E4 / R2 (94%)
An Agentic Evaluation Framework for AI-Generated Scientific Code in PETSc

Murat Keceli, Satish Balay, Hong Zhang, Junchao Zhang

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

-
An Interactive Multi-Agent System for Evaluation of New Product Concepts

Bin Xuan, Ruo Ai, Hakyeon Lee

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

E6 / R4 (96%)
Attack Selection Reduces Safety in Concentrated AI Control Settings against Trusted Monitoring

Tyler Tracy, Joachim Schaeffer, Arjun Khandelwal

Year: 2026 | Area: Safety Evaluation | Citations: -

Tags: empirical, ai-safety, safety-evaluation

E6 / R3 (94%)
AutoControl Arena: Synthesizing Executable Test Environments for Frontier AI Risk Evaluation

Xudong Pan, Fazl Barez, Changyi Li, Min Yang

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

E5 / R3 (95%)
Benchmarking PDF Parsers on Table Extraction with LLM-based Semantic Evaluation

Janis Keuper, Pius Horn

Year: 2026 | Area: cs.CV | Citations: -

Tags: ai-safety, cscv, safety-evaluation, preprint

-
Beyond Suffixes: Token Position in GCG Adversarial Attacks on Large Language Models

Fadi Hassan, Hicham Eddoubi, Umar Faruk Abdullahi

Year: 2026 | Area: Adversarial Robustness | Citations: -

Tags: empirical, ai-safety, adversarial-robustness, safety-evaluation

E7 / R3 (98%)
CLAIRE: Compressed Latent Autoencoder for Industrial Representation and Evaluation -- A Deep Learning Framework for Smart Manufacturing

Mohammadhossein Ghahramani, Mengchu Zhou

Year: 2026 | Area: cs.LG | Citations: -

Tags: ai-safety, cslg, safety-evaluation, preprint

E5 / R3 (94%)
CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation

Sung Eun Kim, Thibault Heintz, Pranav Rajpurkar, Mona Alhammad

Year: 2026 | Area: cs.CL | Citations: -

Tags: cscl, ai-safety, safety-evaluation, preprint

E5 / R3 (94%)
CSSBench: Evaluating the Safety of Lightweight LLMs against Chinese-Specific Adversarial Patterns

Qiankun Li, Kun Wang, Shilinlu Yan, Zhenhong Zhou

Year: 2026 | Area: Safety Evaluation | Citations: -

Tags: ai-safety, adversarial-robustness, safety-evaluation, benchmark

E6 / R5 (96%)
CUAAudit: Meta-Evaluation of Vision-Language Models as Auditors of Autonomous Computer-Use Agents

Oleksandr Kosovan, Marta Sumyk

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

-
ClawTrap: A MITM-Based Red-Teaming Framework for Real-World OpenClaw Security Evaluation

Haochen Zhao, Shaoyang Cui

Year: 2026 | Area: cs.CR | Citations: -

Tags: ai-safety, cscr, safety-evaluation, preprint

-
CoMAI: A Collaborative Multi-Agent Framework for Robust and Equitable Interview Evaluation

Zhiwei Xu, Liangyi Yin, Ruihao Yu, Bin Zhang

Year: 2026 | Area: cs.MA | Citations: -

Tags: ai-safety, safety-evaluation, csma, preprint

-
CoTJudger: A Graph-Driven Framework for Automatic Evaluation of Chain-of-Thought Efficiency and Redundancy in LRMs

Ge Zhang, Hamid Alinejad-Rokny, Zhoufutu Wen, Yizhi Li

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

E4 / R3 (95%)
Constructing Safety Cases for AI Systems: A Reusable Template Framework

Jieshan Chen, Md Shamsujjoha, Sung Une Lee, Liming Dong

Year: 2026 | Area: Safety Evaluation | Citations: -

Tags: ai-safety, survey, safety-evaluation

E5 / R4 (97%)
DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models

Yutong Zhang, Ruofan Liao, Qi Cao, Yu Zheng

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

-
Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients

Jerry Liu, Abby Pandya, Nerissa Ambers, Stephen P Ma

Year: 2026 | Area: cs.CY | Citations: -

Tags: ai-safety, cscy, safety-evaluation, preprint

-
Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails

Gregory N. Frank

Year: 2026 | Area: cs.LG | Citations: -

Tags: alignment-training, ai-safety, cslg, safety-evaluation, preprint

-
Do Large Language Models Reflect Demographic Pluralism in Safety?

Rafiq Ali, Sushant Kumar Ray, Usman Naseem, Abdullah Mohammad

Year: 2026 | Area: Safety Evaluation | Citations: -

Tags: ai-safety, safety-evaluation, benchmark

E6 / R4 (95%)
Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context

Tek Raj Chhetri, Ashish Pandey

Year: 2026 | Area: cs.CL | Citations: -

Tags: cscl, ai-safety, safety-evaluation, preprint

E4 / R2 (97%)
Efficient Policy Learning with Hybrid Evaluation-Based Genetic Programming for Uncertain Agile Earth Observation Satellite Scheduling

Junhua Xue, Yuning Chen

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

E5 / R4 (97%)
Ethical Risks in Deploying Large Language Models: An Evaluation of Medical Ethics Jailbreaking

Chengze Yan, Yunlou Fan, Jiacheng Ji, Hanhui Xu

Year: 2026 | Area: Adversarial Robustness | Citations: -

Tags: empirical, ai-safety, adversarial-robustness, safety-evaluation

E5 / R3 (96%)
Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI

Enrico Coiera, Farah Magrabi, David Fraile Navarro

Year: 2026 | Area: cs.HC | Citations: -

Tags: ai-safety, cshc, safety-evaluation, preprint

-
Expected Harm: Rethinking Safety Evaluation of (Mis)Aligned LLMs

Zhi Rui Tam, Yen-Shan Chen, Yun-Nung Chen, Cheng-Kuang Wu

Year: 2026 | Area: Safety Evaluation | Citations: -

Tags: empirical, ai-safety, adversarial-robustness, safety-evaluation

E5 / R3 (95%)
From Data to Behavior: Predicting Unintended Model Behaviors Before Training

Huajun Chen, Zhenqian Xu, Junfeng Fang, Ningyu Zhang

Year: 2026 | Area: Safety Evaluation | Citations: -

Tags: empirical, ai-safety, safety-evaluation

E5 / R3 (94%)
From Helpfulness to Toxic Proactivity: Diagnosing Behavioral Misalignment in LLM Agents

Sen Su, Fanyu Meng, Zhenhong Zhou, Zhengshuo Gong

Year: 2026 | Area: Safety Evaluation | Citations: -

Tags: alignment-training, ai-safety, safety-evaluation, benchmark

E5 / R3 (94%)
GNNs for Time Series Anomaly Detection: An Open-Source Framework and a Critical Evaluation

Marcelo Fiori, Federico Larroca, Gonzalo Chiarlone, Gastón García González

Year: 2026 | Area: cs.LG | Citations: -

Tags: ai-safety, cslg, safety-evaluation, preprint

E7 / R3 (96%)
Hospitality-VQA: Decision-Oriented Informativeness Evaluation for Vision-Language Models

Seungduk Kim, Jeongwoo Lee, Eungyeol Han, Baek Duhyeong

Year: 2026 | Area: cs.AI | Citations: -

Tags: ai-safety, csai, safety-evaluation, preprint

E7 / R3 (94%)
ICE: Intervention-Consistent Explanation Evaluation with Statistical Grounding for LLMs

Pavan Chakraborty, Abhinaba Basu

Year: 2026 | Area: cs.CL | Citations: -

Tags: cscl, ai-safety, safety-evaluation, preprint

-

Showing 30 of 641 papers on page 1.