
Paper deep dive

Between Generation and Judgment: A Cloud-Native Framework for Adversarial Evaluation of LLM Alignment

Diego E. G. Caetano De Oliveira, C. Miers, Marcos A. Simplicio, Victor Takashi Hayashi

Year: 2025 · Venue: IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2025) · Area: Safety Evaluation · Type: Tool · Embeddings: 1

Models: DeepSeek-V3, GPT-4o, Llama 3.3 70B Instruct, Mixtral 8x7B Instruct

Abstract

We present a cloud-native, end-to-end pipeline that unifies automated attack generation with a modular LLM-as-a-Judge, supports static/adaptive corpora across open and proprietary models, enables calibrated multi-judge consensus, and emits audit-ready cost/latency telemetry, addressing gaps from non-cloud-native scripts, decoupled attack/judgment, and single-judge bias. Evaluated under two budgets (MAX_PROMPTS = 10, 32) on three targets (Llama 3.3 70B Instruct, Mixtral 8x7B Instruct, DeepSeek-V3), ASR spans 0-37.5% (10) and 0-26.6% (32); judge accuracy ranges from 92.93% (open Llama 3.3 70B Instruct) to 98.94% (proprietary GPT-4o); mean judgment latency is 1.2-5.6 s (GPT-4o to DeepSeek-V3); and unit cost is $0.10-$2.18 per 1k adjudications (Mixtral 8x7B to GPT-4o). These results motivate a tiered policy, with lightweight open-family judges for high-throughput triage and cross-family/ensemble judges for low-confidence cases, offering a practical blueprint for continuous, auditable adversarial evaluation of LLM alignment at cloud scale.

Tags

adversarial-robustness (suggested, 80%) · ai-safety (imported, 100%) · alignment-training (suggested, 80%) · safety-evaluation (suggested, 80%) · tool (suggested, 88%)

Links

Intelligence

Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 95%

Last extracted: 3/11/2026, 12:38:53 AM

Summary

The paper introduces a cloud-native framework for the adversarial evaluation of LLM alignment, integrating automated attack generation with a modular LLM-as-a-Judge system. The framework supports multi-judge consensus, provides audit-ready telemetry, and addresses biases found in single-judge setups. Empirical evaluation on models such as Llama 3.3, Mixtral, and DeepSeek-V3 quantifies the framework's cost, latency, and accuracy trade-offs, motivating a tiered judging policy for scalable evaluation.
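The tiered judging policy described in the abstract can be sketched as a simple confidence-based router. This is a hypothetical illustration only: the judge names, the keyword heuristic standing in for real judge models, and the 0.8 escalation threshold are assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    label: str         # "attack_success" or "attack_failure"
    confidence: float  # judge's self-reported confidence in [0, 1]
    judge: str         # which judge tier produced the verdict

def triage_judge(response: str) -> Verdict:
    # Stand-in for a lightweight open-family judge (e.g. a Llama-family model).
    # A trivial keyword heuristic is used here purely for illustration.
    refused = "refuse" in response.lower()
    conf = 0.95 if refused else 0.6
    label = "attack_failure" if refused else "attack_success"
    return Verdict(label, conf, "open-triage")

def escalation_judge(response: str) -> Verdict:
    # Stand-in for a stronger cross-family judge (e.g. GPT-4o in the paper's setup).
    refused = "refuse" in response.lower()
    label = "attack_failure" if refused else "attack_success"
    return Verdict(label, 0.99, "strong-escalation")

def tiered_judge(response: str, threshold: float = 0.8) -> Verdict:
    # Route: accept the cheap triage verdict when it is confident enough;
    # escalate only low-confidence cases to the expensive judge.
    v = triage_judge(response)
    if v.confidence >= threshold:
        return v
    return escalation_judge(response)
```

The design choice mirrors the paper's cost argument: most adjudications stay on the cheap tier, so the roughly 20x unit-cost gap between the cheapest and most expensive judges is paid only for the ambiguous minority of cases.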

Entities (5)

DeepSeek-V3 · llm · 100%
GPT-4o · llm · 100%
Llama 3.3 70B Instruct · llm · 100%
Mixtral 8x7B Instruct · llm · 100%
Cloud-Native Framework · software-framework · 90%

Relation Signals (4)

Cloud-Native Framework evaluates Llama 3.3 70B Instruct

confidence 95% · Evaluated under two budgets on three targets: Llama 3.3 70B Instruct

Cloud-Native Framework evaluates Mixtral 8x7B Instruct

confidence 95% · Evaluated under two budgets on three targets: Mixtral 8x7B Instruct

Cloud-Native Framework evaluates DeepSeek-V3

confidence 95% · Evaluated under two budgets on three targets: DeepSeek-V3

GPT-4o acts as LLM-as-a-Judge

confidence 90% · judge accuracy ranges from 92.93% (open Llama 3.3 70B Instruct) to 98.94% (proprietary GPT-4o)

Cypher Suggestions (2)

Find all LLMs evaluated by the framework · confidence 90% · unvalidated

MATCH (f:Framework {name: 'Cloud-Native Framework'})-[:EVALUATES]->(m:LLM) RETURN m.name

Identify models acting as judges · confidence 90% · unvalidated

MATCH (m:LLM)-[:ACTS_AS]->(r:Role {name: 'LLM-as-a-Judge'}) RETURN m.name

Full Text

850 characters extracted from source content.


Between Generation and Judgment: A Cloud-Native Framework for Adversarial Evaluation of LLM Alignment | IEEE Conference Publication | IEEE Xplore