← Back to papers

Paper deep dive

Scaling Responsible Generative AI: Automating Red Teaming of LLM Applications

Adison Goh, Benjamin Chee, Matteo Vagnoli, Luca Baldassarre, Akshay Narayan

Year: 2025Venue: 2025 IEEE Conference on Artificial Intelligence (CAI)Area: Safety EvaluationType: ToolEmbeddings: 1

Intelligence

Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 94%

Last extracted: 3/11/2026, 1:12:08 AM

Summary

This paper presents an automated red teaming framework for Large Language Models (LLMs) to identify and mitigate risks. It introduces a taxonomy of 48 LLM-associated risks, an adversarial prompt generation pipeline with five generator types, and an LLM-as-a-judge evaluation system, demonstrating significant reductions in manual evaluation time and improved scalability for LLM deployment.

Entities (4)

Large Language Models · technology · 100%Red-teaming · process · 98%Adversarial Prompt Generation Pipeline · system · 95%LLM-as-a-judge · evaluation-method · 95%

Relation Signals (2)

LLM-as-a-judge evaluates Large Language Models

confidence 95% · we implement a LLM-as-a-judge evaluation system to streamline testing.

Adversarial Prompt Generation Pipeline identifies LLM-associated risks

confidence 90% · we introduce an automated adversarial prompt generation pipeline involving five types of generators that cover diverse AI risks

Cypher Suggestions (2)

Map the relationship between evaluation methods and LLM risks · confidence 90% · unvalidated

MATCH (e:EvaluationMethod)-[:EVALUATES]->(l:Technology), (l)-[:HAS_RISK]->(r:Risk) RETURN e.name, l.name, r.name

Find all components of the automated red teaming framework · confidence 85% · unvalidated

MATCH (s:System)-[:COMPRISES_OR_USES]->(c) RETURN s.name, c.name

Abstract

Large Language Models (LLMs) present both significant business potential and substantial risks. This paper addresses the critical need for robust and scalable red teaming processes to identify and mitigate risks before LLM solution deployment. First, we define a comprehensive set of 48 LLM-associated risks reflecting emerging AI threats. Additionally, we introduce an automated adversarial prompt generation pipeline involving five types of generators that cover diverse AI risks at varying complexity levels. Finally, we implement a LLM-as-a-judge evaluation system to streamline testing. When applied to red-team two finance-based LLM applications, our approach achieved perfect recall in identifying failed outputs while reducing manual evaluation by over 90%. The developed features halved the time required for red teaming exercises, enhancing scalability and thoroughness while maintaining effectiveness. This work enhances LLM safety, accelerates deployment, and adds business value through improved risk management and responsible AI use.

Tags

adversarial-robustness (suggested, 80%)ai-safety (imported, 100%)red-teaming (suggested, 80%)safety-evaluation (suggested, 80%)tool (suggested, 88%)

Links

Full Text

826 characters extracted from source content.

Expand or collapse full text

Scaling Responsible Generative AI: Automating Red Teaming of LLM Applications | IEEE Conference Publication | IEEE Xplore IEEE Account Change Username/Password Update Address Purchase Details Payment Options Order History View Purchased Documents Profile Information Communications Preferences Profession and Education Technical Interests Need Help? US & Canada: +1 800 678 4333 Worldwide: +1 732 981 0060 Contact & Support About IEEE Xplore Contact Us Help Accessibility Terms of Use Nondiscrimination Policy Sitemap Privacy & Opting Out of Cookies A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.© Copyright 2026 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.