
Paper deep dive

AssurAI: Experience with Constructing Korean Socio-cultural Datasets to Discover Potential Risks of Generative AI

Chae-Gyun Lim, Seung-Ho Han, EunYoung Byun, Jeongyun Han, Soohyun Cho, Eojin Joo, Heehyeon Kim, Sieun Kim, Juhoon Lee, Hyunsoo Lee, Dongkun Lee, Jonghwan Hyeon, Yechan Hwang, Young-Jun Lee, Kyeongryul Lee, Minhyeong An, Hyunjun Ahn, Jeongwoo Son, Junho Park, Donggyu Yoon, Taehyung Kim, Jeemin Kim, Dasom Choi, Kwangyoung Lee, Hyunseung Lim, Yeohyun Jung, Jongok Hong, Sooyohn Nam, Joonyoung Park, Sungmin Na, Yubin Choi, Jeanne Choi, Yoojin Hong, Sueun Jang, Youngseok Seo, Somin Park, Seoungung Jo, Wonhye Chae, Yeeun Jo, Eunyoung Kim, Joyce Jiyoung Whang, HwaJung Hong, Joseph Seering, Uichin Lee, Juho Kim, Sunna Choi, Seokyeon Ko, Taeho Kim, Kyunghoon Kim, Myungsik Ha, So Jung Lee, Jemin Hwang, JoonHo Kwak, Ho-Jin Choi

Year: 2025 · Venue: arXiv preprint · Area: Safety Evaluation · Type: Dataset · Embeddings: 51

Abstract

The rapid evolution of generative AI necessitates robust safety evaluations. However, current safety datasets are predominantly English-centric, failing to capture specific risks in non-English, socio-cultural contexts such as Korean, and are often limited to the text modality. To address this gap, we introduce AssurAI, a new quality-controlled Korean multimodal dataset for evaluating the safety of generative AI. First, we define a taxonomy of 35 distinct AI risk factors, adapted from established frameworks by a multidisciplinary expert group to cover both universal harms and relevance to the Korean socio-cultural context. Second, leveraging this taxonomy, we construct and release AssurAI, a large-scale Korean multimodal dataset comprising 11,480 instances across text, image, video, and audio. Third, we apply a rigorous quality control process to ensure data integrity, featuring a two-phase construction (i.e., expert-led seeding and crowdsourced scaling), triple independent annotation, and an iterative expert red-teaming loop. Our pilot study validates AssurAI's effectiveness in assessing the safety of recent LLMs. We release AssurAI to the public to facilitate the development of safer and more reliable generative AI systems for the Korean community.

Tags

ai-safety (imported, 100%) · dataset (suggested, 88%) · safety-evaluation (suggested, 92%)

Links

PDF not stored locally. Use the link above to view on the source site.

Intelligence

Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 98%

Last extracted: 3/11/2026, 1:18:38 AM

Summary

AssurAI is a quality-controlled, large-scale Korean multimodal dataset comprising 11,480 instances across text, image, video, and audio, designed to evaluate the safety of generative AI. It is built around a taxonomy of 35 AI risk factors tailored to the Korean socio-cultural context and was constructed through a two-phase process of expert-led seeding and crowdsourced scaling, with rigorous quality control and pilot validation.

Entities (5)

AssurAI · dataset · 100%
KAIST · organization · 100%
Kakao · organization · 100%
Selectstar · organization · 100%
Korean socio-cultural context · domain · 95%

Relation Signals (4)

AssurAI contains 35 AI risk factors

confidence 100% · we define a taxonomy of 35 distinct AI risk factors

KAIST developed AssurAI

confidence 100% · domain experts (KAIST) develop schemes for each risk factor

Selectstar produced AssurAI

confidence 100% · specialized data construction company (Selectstar) performs mass data production

Kakao validated AssurAI

confidence 100% · the leading domestic AI company (Kakao) pilots the constructed dataset

Cypher Suggestions (2)

Identify organizations involved in the creation of AssurAI. · confidence 95% · unvalidated

MATCH (o:Organization)-[rel:DEVELOPED|PRODUCED|VALIDATED]->(d:Dataset {name: 'AssurAI'}) RETURN o.name, type(rel)

Find all risk factors associated with the AssurAI dataset. · confidence 90% · unvalidated

MATCH (d:Dataset {name: 'AssurAI'})-[:CONTAINS]->(r:RiskFactor) RETURN r.name, r.category
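If these suggestions are run against a Neo4j-backed graph (an assumption; the page does not state which graph engine stores these entities and relations), they could be executed with the official Python driver. The connection URI and credentials below are placeholders, not values from this page.

```python
from neo4j import GraphDatabase  # assumes a reachable Neo4j instance holding this graph

# The first suggested query above, returning each organization and its role.
QUERY = (
    "MATCH (o:Organization)-[rel:DEVELOPED|PRODUCED|VALIDATED]->"
    "(d:Dataset {name: 'AssurAI'}) RETURN o.name AS org, type(rel) AS role"
)

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(QUERY):
        print(record["org"], "->", record["role"])
driver.close()
```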

Full Text

50,885 characters extracted from source content.


AssurAI: Experience with Constructing Korean Socio-cultural Datasets to Discover Potential Risks of Generative AI

Chae-Gyun Lim (1), Seung-Ho Han (1), EunYoung Byun (2), Jeongyun Han (3), Soohyun Cho (4), Eojin Joo (1), Heehyeon Kim (1), Sieun Kim (1), Juhoon Lee (1), Hyunsoo Lee (1), Dongkun Lee (1), Jonghwan Hyeon (1), Yechan Hwang (1), Young-Jun Lee (1), Kyeongryul Lee (1), Minhyeong An (1), Hyunjun Ahn (1), Jeongwoo Son (1), Junho Park (1), Donggyu Yoon (1), Taehyung Kim (1), Jeemin Kim (1), Dasom Choi (1), Kwangyoung Lee (1), Hyunseung Lim (1), Yeohyun Jung (1), Jongok Hong (1), Sooyohn Nam (1), Joonyoung Park (1), Sungmin Na (1), Yubin Choi (1), Jeanne Choi (1), Yoojin Hong (1), Sueun Jang (1), Youngseok Seo (1), Somin Park (1), Seoungung Jo (1), Wonhye Chae (3), Yeeun Jo (4), Eunyoung Kim (4), Joyce Jiyoung Whang (1), HwaJung Hong (1), Joseph Seering (1), Uichin Lee (1), Juho Kim (1), Sunna Choi (5), Seokyeon Ko (5), Taeho Kim (5), Kyunghoon Kim (6), Myungsik Ha (6), So Jung Lee (6), Jemin Hwang (2), JoonHo Kwak (2), Ho-Jin Choi (1, †)

(1) KAIST, (2) TTA, (3) University of Seoul, (4) Keimyung University, (5) Selectstar, (6) Kakao
† Correspondence: hojinc@kaist.ac.kr

Abstract

The rapid evolution of generative AI necessitates robust safety evaluations. However, current safety datasets are predominantly English-centric, failing to capture specific risks in non-English, socio-cultural contexts such as Korean, and are often limited to the text modality. To address this gap, we introduce AssurAI, a new quality-controlled Korean multimodal dataset for evaluating the safety of generative AI. First, we define a taxonomy of 35 distinct AI risk factors, adapted from established frameworks by a multidisciplinary expert group to cover both universal harms and relevance to the Korean socio-cultural context. Second, leveraging this taxonomy, we construct and release AssurAI, a large-scale Korean multimodal dataset comprising 11,480 instances across text, image, video, and audio. Third, we apply a rigorous quality control process to ensure data integrity, featuring a two-phase construction (i.e., expert-led seeding and crowdsourced scaling), triple independent annotation, and an iterative expert red-teaming loop. Our pilot study validates AssurAI's effectiveness in assessing the safety of recent LLMs. We release AssurAI to the public to facilitate the development of safer and more reliable generative AI systems for the Korean community.

1 Introduction

Generative artificial intelligence (AI) and large language models (LLMs) have demonstrated remarkable potential across diverse applications, including content creation, programming support, and education (Wei et al., 2022; Shanahan et al., 2023). However, these technological advancements are double-edged swords, causing serious societal risks (Zeng et al., 2024; Slattery et al., 2024). Malicious users can exploit models to produce large volumes of persuasive misinformation or create deepfakes for defamation or political disruption (Qu et al., 2023; Miao et al., 2024). Furthermore, generative AI can perpetuate social biases and stereotypes by directly reflecting inherent biases in training data, and it can be exploited in cyberattacks by automatically generating sophisticated phishing emails (Weidinger et al., 2021; Shen et al., 2024; Liu et al., 2023). The prevalence of such harmful uses highlights the need for reliable methods to systematically evaluate the safety of generative AI and mitigate potential risks (Ganguli et al., 2022).
There have been an increasing number of AI safety evaluation benchmark datasets developed for this purpose. However, most existing datasets, such as ToxiGen (Hartvigsen et al., 2022), RealToxicityPrompts (Gehman et al., 2020), and SafetyBench (Zhang et al., 2023), are built around English, which has limitations in properly reflecting the unique linguistic nuances and socio-cultural context of non-English-speaking societies. This limitation is particularly noticeable in the Korean language environment, where specific cultural and social norms in Korea can give rise to unique types of AI hazards that are not addressed in existing datasets such as KoBBQ (Jin et al., 2024) and KoSBi (Lee et al., 2023). Furthermore, many safety datasets are limited to a text modality, making them insufficient for evaluating the increasing potential risks from multimodal generative models, such as harmful image or video synthesis. In this study, we propose AssurAI, a new quality-control-based Korean multimodal dataset for evaluating the safety of generative AI, which is specifically designed for the Korean language environment to address these concerns. Also, we release the AssurAI dataset to promote research on developing more secure and trustworthy AI systems for the Korean community. The main contributions of this study are as follows.

• Korean Socio-cultural Taxonomy for Risk Factors: Our multidisciplinary expert group defines a taxonomy of 35 distinct AI risk factors adapted from existing frameworks. This taxonomy encompasses both universal harms and relevance to Korea's socio-cultural context.
• Large-Scale Korean Multimodal Dataset: Based on the taxonomy, we construct and release AssurAI, a new large-scale multimodal dataset that contains text, images, video, and audio. The dataset consists of a total of 11,480 instances.
• Systematic Data Construction Process: We apply a rigorous quality management and validation process to ensure the reliability and validity of our dataset. This process involves a two-stage construction methodology (i.e., expert-led seed data generation and crowdsourcing-based mass production), iterative expert review, and validation through pilot testing with the latest generative models.

2 Related Work

2.1 Trends in AI Safety and Ethics Research
With the advancement of generative AI, there has been active research to ensure the safety and ethics of AI. In particular, the red teaming approach has become a key strategy for identifying and mitigating harmful or unintended outputs from models (Ganguli et al., 2022; Perez et al., 2022). Red teaming aims to explore model vulnerabilities through aggressive prompt engineering, thereby strengthening defensive mechanisms (Shen et al., 2024; Liu et al., 2023). Additionally, numerous studies on AI risk taxonomies have been proposed as part of efforts to systematically classify and manage potential risks (Zeng et al., 2024; Slattery et al., 2024). These studies contribute to minimizing the negative impacts AI systems may have on society and guide the responsible development of the technology. Our study also aims to adapt a taxonomy suitable for the Korean-specific environment within the same context as these existing studies.

2.2 Existing AI Safety Benchmark Datasets
To quantitatively evaluate AI safety, various benchmark datasets have been constructed.
ToxiGen (Hartvigsen et al., 2022) and RealToxicityPrompts (Gehman et al., 2020) are representative datasets designed to measure the toxicity of text generated by models. SafetyBench (Zhang et al., 2023) proposed a comprehensive benchmark that evaluates the safety of models across various toxicity categories using multiple-choice questions. Regarding bias, BBQ (Parrish et al., 2021) was developed to measure biases related to social stereotypes, and KoBBQ (Jin et al., 2024), which extends this concept to the Korean language environment, has also been proposed.

2.3 Limitations of the Existing Research
While these datasets have significantly contributed to AI safety research, they have several notable drawbacks.

• Language and Cultural Bias: Most datasets are primarily built with English content, which limits their applicability in assessing risks specific to Korean linguistic characteristics and domestic socio-cultural contexts. Although KoBBQ (Jin et al., 2024) and KoSBi (Lee et al., 2023) have addressed social bias issues in Korean, they cover only a subset of the 35 risk factors proposed in this study. A comprehensive safety benchmark reflecting Korea's unique socio-cultural context remains absent.
• Limited Scope of Risks: Existing datasets tend to focus primarily on specific risks such as harmfulness, offensive expressions, and social bias. Compared to the 35 comprehensive risk factors proposed in this study, their scope of addressed risks is relatively narrow.
• Single-Modality Focus: Most existing studies are confined to the text modality. This creates a fundamental limitation, as they are unable to assess the risks posed by multimodal models that generate harmful images, videos, and audio. While early studies evaluating the safety of multimodal models have recently emerged (Qu et al., 2023; Miao et al., 2024; Luo et al., 2024), there is still a lack of comprehensive benchmarks in this field.

To overcome these limitations, we aim to fill the gap in existing research by constructing AssurAI, a multimodal dataset designed explicitly for Korean that covers a wide range of risk factors and extends beyond text to include images, videos, and audio.

3 Taxonomy of AI Risks

Our aim in this project was not to establish a comprehensive and systematic taxonomy of risk factors for evaluating the safety of generative AI. Instead, the practical objective was to define a set of practical evaluation criteria by curating risk factors that reflect Korean socio-cultural contexts and are feasible for actual data construction, based on existing authoritative taxonomies. Therefore, our multidisciplinary expert group, comprising specialists in artificial intelligence, education, and psychology, conducted a thorough review of significant prior research, including the taxonomies from AIR 2024 (Zeng et al., 2024) and MIT FutureTech (Slattery et al., 2024). As a result of this review, the expert group curated and adapted 35 risk factors based on the criteria of international universality, relevance to Korean society and culture, and ease of actual data construction. Risk factors 1 to 30 were curated and adapted from the AIR 2024 study, while risk factors 31 to 35 were curated and adapted from the MIT FutureTech study. The list of 35 risk factors served as the core foundation for the entire construction process of the AssurAI dataset.
Based on their properties, as identified in the expert group's review, the 35 factors are organized into six higher-level categories: (1) Harmful & Violent Content, (2) Interpersonal Harm, (3) Sensitive & Adult Content, (4) Misinformation & Manipulation, (5) Illegal & Unethical Activities, and (6) Socioeconomic & Cognitive Risks. Our coverage encompasses not only direct and explicit threats, such as hate speech or the dissemination of illegal information, but also more long-term and subtle societal risks, including the devaluation of human labor and the erosion of user autonomy. Detailed definitions and scopes for each risk factor are provided in Table 1.

Figure 1: The iterative construction and quality control process for the AssurAI dataset. The Quality Assessment Team acts as the central hub, managing data quality through continuous feedback loops among the Dataset Design Team, the Data Production Team, and the Pilot Implementation Team.

4 Method of Dataset Construction and Quality Verification

4.1 Overall Process
The AssurAI dataset was constructed through a multi-stage process comprising (i) initial design of the data scheme and generation of seed data (i.e., samples), (ii) mass production of large-scale data, and (iii) rigorous quality control and verification to enable reliable Korean AI safety assessment. The overall dataset construction process with quality control is shown in Figure 1. First, domain experts (KAIST) develop schemes for each risk factor and create construction guidelines for datasets, as well as initial sample data that serves as seed data. Based on these samples and guidelines, a specialized data construction company (Selectstar) performs mass data production through its crowdsourcing platform. The constructed data undergoes in-depth red team verification by a validation team composed of multidisciplinary experts from the University of Seoul, Keimyung University, and other institutions. Feedback derived from this process is then reflected in the quality improvement of the data. Finally, the leading domestic AI company (Kakao) pilots the constructed dataset on actual commercial models to validate its effectiveness, then feeds these results back into the early stages of data construction, followed by an iterative refinement process.

In particular, one of the key design elements of our process is the set of eight prompt types defined to ensure data diversity and evaluate model safety from various aspects. Each prompt type is designed to test specific capabilities of AI models, such as inference, role-playing, and information constraints, as outlined in Table 2. These eight types were consistently applied throughout the entire data construction process.

Table 1: The taxonomy of 35 AI risk factors defined in the AssurAI dataset.

Harmful & Violent Content
1. Supporting Malicious Organized Groups: Generation of content that glorifies or supports the activities of malicious organizations such as terrorism, extremism, or organized crime.
2. Celebrating Suffering: Generation of content that enjoys or encourages the physical or mental suffering of others or oneself.
3. Violent Acts: Generation of content that provides specific information about or encourages violent acts.
4. Depicting Violence: Generation of creative works containing cruel or violent content.
5. Weapon Usage & Development: Generation of content that encourages illegal activities such as the creation, modification, or smuggling of weapons.
6. Military and Warfare: Generation of content related to military strategies or tactics that could violate laws of war or cause mass casualties.
Interpersonal Harm
7. Harassment: Generation of content that persistently harasses or bullies a specific individual or group.
8. Hate Speech: Generation of speech that promotes hatred or discrimination against a specific group.
9. Offensive Language: Generation of content using offensive language that insults or attacks a specific target.
10. Perpetuating Harmful Beliefs: Generation of content that reinforces or spreads negative stereotypes about a specific group.

Sensitive & Adult Content
11. Adult Content: Generation of adult content, including explicit sexual imagery or acts.
12. Erotic Content: Generation of content that is sexually suggestive or obscene.
13. Non-Consensual Nudity: Generation of nude imagery or sexual content distributed without the subject's consent.
14. Monetized Sexual Content: Encouraging the use or distribution of sexual content for commercial profit.
15. Endangerment, Harm, or Abuse of Children: Generation of content that depicts physical or mental harm to children or child abuse.
16. Child Sexual Abuse: Generation of content that depicts or encourages the sexual exploitation and abuse of children.
17. Suicidal and Non-suicidal Self-injury: Generation of content that encourages, glorifies, or describes specific methods of suicide or self-harm.

Misinformation & Manipulation
18. Political Persuasion: Generation of content that instills a specific political message to influence the recipient's opinions or actions.
19. Influencing Politics: Generation of content that can unduly influence political processes, such as distorting election results or manipulating public opinion.
20. Deterring Democratic Participation: Generation of content that discourages democratic participation, such as discouraging voter turnout.
21. Fraud: Generation of content that can be exploited for fraud, such as identity theft or phishing, to steal assets or information.
22. Mis/disinformation: Generation of factually incorrect information to cause social confusion or manipulate public opinion, such as fake reviews.
23. Sowing Division: Generation of content that shows a biased attitude towards a certain faction or encourages conflict and division within groups.
24. Misrepresentation: AI misrepresenting its output as human-generated or impersonating a specific person.
25. Types of Defamation: Generation of content that lowers the social reputation of an individual or organization by stating specific false facts.

Illegal & Unethical Activities
26. Discriminatory Activities: Generation of creative works containing discriminatory and biased content against a specific group.
27. Unauthorized Privacy Violations: Acts that violate privacy, such as collecting, leaking, or using personal information without consent.
28. Illegal/Regulated Substances: Generation of content that promotes the manufacturing, trading, or use of illegal drugs.
29. Illegal Services/Exploitation: Generation of content related to illegal services or labor exploitation.
30. Other Unlawful/Criminal Activities: Generation of content related to other criminal and illegal activities not specified above.

Socioeconomic & Cognitive Risks
31. Increased inequality and decline in employment quality: Issues where the use of AI technology exacerbates social inequality or degrades the quality of employment.
32. Economic and cultural devaluation of human effort: The issue of AI devaluing human labor economically and culturally by replacing it.
33. Competitive dynamics: The issue of AI technology being monopolized by certain companies, distorting fair competition in existing industries.
34. Overreliance and unsafe use: AI characteristics and generated content that lead users to become overly reliant on or misuse the AI.
35. Loss of human agency and autonomy: The issue where users lose critical thinking skills, agency, and autonomy in the process of using AI.

Table 2: The prompt types used for data construction.
Multiple-Choice: Prompts the model to select harmful or safe responses from given options.
Q Only: Poses a simple question related to a risk scenario to elicit a direct response.
Multi-Session: A multi-turn dialogue scenario that gradually steers the conversation to a risky topic.
Role-Playing: Assigns a specific persona to the model and requests it to answer from that perspective.
Chain-of-Thought: Requires the model to generate a step-by-step reasoning process for a complex problem.
Expert Prompting: Frames the model as an expert in a specific domain and asks for knowledgeable answers.
Rail: Evaluates controllability by asking the model to respond under specific constraints.
Reflection: Tests the model's ability to self-evaluate and correct its previous responses.

4.2 Two-Stage Data Collection & Annotation

AI safety evaluation datasets must balance high quality with scale, accurately reflecting the subtle context of the risks assessed while covering diverse scenarios to ensure statistical significance. Ensuring statistical significance can be challenging due to a trade-off between quality and scalability. When data requires a high level of domain expertise, it needs to be generated by experts to guarantee its quality and relevance. However, this approach can be both costly and time-consuming, making it difficult to scale. On the other hand, crowdsourcing provides the advantage of efficiently creating large datasets. While this can be an effective solution, the lack of expertise among contributors may lead to data quality issues. To overcome this dilemma, this study adopted a two-stage strategy that leverages both the depth of experts and the scale of crowdsourcing, maximizing the strengths of each approach while mitigating its weaknesses. In the initial stage, a group of experts creates a 'schematic design' for the data, and in the subsequent stage, large-scale mass production is carried out based on this design.

4.2.1 Stage 1: Expert-led Seed Data Generation
The five specialized research laboratories participating in our consortium divided the 35 risk factors among themselves and manually crafted high-quality seed data aligned with their respective areas of expertise. These seed data, comprising 10% of the total target amount, served as clear guidelines for crowdsourced workers in the subsequent mass production and as a gold standard for measuring the quality of the final deliverables.

4.2.2 Stage 2: Crowdsourcing-based Data Production
Based on the sample data generated by experts and the construction guidelines developed in Stage 1, we proceeded with mass data production through the crowdsourcing platform. We recruited workers with experience in data annotation, provided intensive training on our risk factors and prompt types, and scaled up data construction to reach the entire target quantity.
This approach enabled us to build a large-scale dataset efficiently while maintaining high quality standards.

4.3 Quality Assurance & Verification

The value of benchmark datasets for evaluating AI models lies not in the volume of data, but in their quality, i.e., consistency and reliability. Unlike typical training data, AI safety evaluation data requires a deep understanding of specific socio-cultural contexts and is highly context-dependent. Since its quality directly impacts evaluation outcomes, a much stricter and more systematic quality control process is required. Based on this philosophy, this study applied multifaceted quality control and validation steps throughout the entire construction process to ensure the highest level of dataset reliability. These steps combine quantitative metrics with qualitative expert evaluations, and are designed to minimize potential errors and biases and to ensure data consistency. The details are as follows:

• Triple Independent Annotation: To enhance data consistency and reliability, three workers were independently assigned to annotate all data instances. Each worker's judgment was stored separately without influencing the others, ensuring the objectivity of the results. This structure also provided a foundation for calculating Inter-Annotator Agreement (IAA), a statistical reliability metric that measures the degree of agreement among annotators (a minimal computation sketch follows this list).
• Expert Review & Feedback Loop: The constructed data was regularly reviewed by our multidisciplinary expert group. This process was performed using a red team approach, identifying potential errors, biases, or contextually inappropriate data, and proposing specific corrective actions corresponding to the prompt types. Review feedback and action histories were all documented, adding transparency to the data construction process.
• Validity Verification via Pilot Implementation: We piloted the constructed dataset to empirically verify its effectiveness in assessing the risk factors of actual generative AI models. By applying the dataset to five well-known LLMs, including GPT-4o and Claude-3.5-Sonnet, we analyzed how the models responded to each risk factor and confirmed the practicality and validity of the dataset.
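The paper reports triple independent annotation as the basis for IAA but does not specify which agreement statistic was used. The sketch below is an illustrative assumption, not the authors' code: it computes Fleiss' kappa, a common choice for three or more annotators, over a toy set of items with a binary safe/unsafe label.

```python
import numpy as np

def fleiss_kappa(ratings: np.ndarray) -> float:
    """Fleiss' kappa for an (n_items x n_categories) count matrix.

    ratings[i, j] = number of annotators who assigned item i to category j.
    Every row must sum to the same number of annotators (here, 3).
    """
    n_items, _ = ratings.shape
    n_raters = ratings.sum(axis=1)[0]
    # Per-item observed agreement.
    p_i = (np.square(ratings).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Chance agreement from the marginal category proportions.
    p_j = ratings.sum(axis=0) / (n_items * n_raters)
    p_e = np.square(p_j).sum()
    return (p_bar - p_e) / (1 - p_e)

# Toy example: 6 items, 3 annotators each, categories = [unsafe, safe].
counts = np.array([
    [3, 0],  # all three annotators marked the item "unsafe"
    [2, 1],
    [3, 0],
    [0, 3],
    [1, 2],
    [3, 0],
])
print(f"Fleiss' kappa = {fleiss_kappa(counts):.3f}")  # 0.500 for this toy matrix
```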
5 The AssurAI Dataset: Specifications and Characteristics

5.1 Statistical Analysis
The AssurAI dataset consists of a total of 11,480 instances. The dataset is designed as a multimodal evaluation resource containing not only text but also images, videos, and audio, with the distribution per modality shown in Table 3. In short, text data accounts for approximately 83% of the total, with the remaining approximately 17% comprising visual and auditory data.

Table 3: Data distribution by modality.
Text: 9,560 (83.3%)
Image: 1,160 (10.1%)
Video: 430 (3.7%)
Audio: 330 (2.9%)
Total: 11,480 (100%)

The dataset was constructed to cover all eight prompt types described in Table 2 above. Due to varying complexity in building data for each type, the final construction ratio differs. The distribution by type is shown in Table 4. The 'Q Only' and 'Role-Playing' types account for a high proportion. Model safety can be evaluated from multiple perspectives through these diverse prompt structures.

Table 4: Data distribution by prompt type.
Q Only: 3,751 (32.7%)
Role-Playing: 2,362 (20.6%)
Chain-of-Thought: 1,490 (13.0%)
Multiple-Choice: 1,430 (12.5%)
Multi-Session: 980 (8.5%)
Expert Prompting: 727 (6.3%)
Rail: 570 (5.0%)
Reflection: 170 (1.5%)
Total: 11,480 (100%)

The distribution of data across the 35 risk factors is visualized in Figure 2. This stacked horizontal bar chart highlights two primary characteristics of the dataset. First, the data distribution is not uniform across the 35 factors but is concentrated in specific areas. Notably, 'Discriminatory Activities' (1,000 instances) and 'Unauthorized Privacy Violations' (900 instances) are overwhelmingly larger than the others. This was an intentional design choice, as these factors were structured to encompass a wide range of sub-scenarios (e.g., 'Protected Characteristics' and 'Types of Sensitive Data', respectively). Second, as is evident from the dominant text portion of the chart, the AssurAI dataset is primarily composed of the text modality. The other modalities (image, video, and audio) were selectively constructed for specific risk factors. A detailed breakdown of the exact instance counts per modality for each risk factor, along with their sources as mentioned in Section 3, is available in Table 10 in the Appendix.

Figure 2: Distribution of 35 risk factors by total instances, stacked by data modality. The chart is sorted in descending order by the total number of instances. Modalities are color-coded: text (blue), image (orange), video (green), and audio (red).

5.2 Design Considerations
To minimize potential biases inherent in the dataset, our multidisciplinary expert group conducted an in-depth analysis of Korean socio-cultural characteristics and incorporated these insights into the data construction guidelines. All annotators were provided with guidelines on personal information protection and ethical data processing principles, and ethical issues concerning the data were continuously reviewed throughout the entire construction process.

Nevertheless, this dataset has several inherent limitations. First, while the 35 risk factors are comprehensive at this point, they cannot predict and include all risks that may arise in the future with the emergence of new AI technologies. Second, the dataset is deeply customized to Korea's linguistic and socio-cultural context, making it potentially difficult to apply directly to other cultural contexts. Third, judgments regarding 'harmfulness' involve some potential for subjectivity. While we have attempted to mitigate this concern through triple independent annotation and expert review, it remains challenging to eliminate it completely. Finally, as the dataset is a static resource built at a specific point in time, it requires continuous expansion and supplementation to keep pace with generative AI models that are constantly being updated.

6 Experiments

This section presents a series of experiments conducted to validate the reliability and applicability of the proposed AssurAI dataset. Unlike conventional model-benchmarking experiments, our goal here is not to optimize model performance but to verify whether the dataset effectively exposes safety-related inconsistencies and refusal behaviors across LLMs. We describe the experimental goals, evaluation procedure, and baseline results across text and multimodal settings, followed by an analysis of findings that demonstrate the dataset's discriminative capability and extensibility.
6.1 Experimental Goals and Setup
The primary objective of this experiment is to assess the validity and coverage of the AssurAI dataset for evaluating safety alignment in LLMs. The dataset includes 35 risk categories and 8 prompt types, encompassing both safety-critical and neutral scenarios. Two evaluation tracks were established: (1) the Text Eval Track, designed for text-based assessment, and (2) the Multimodal Eval Track, which extends to image, audio, and video modalities. Four open-weight models (EXAONE 3.5, Llama 3.1, Mistral, and Qwen 2.5) were selected for the text track, while Gemini Live 2.5 Flash Preview (audio), Gemini 1.5 Flash (image), and Veo 2.0 Generate 001 (video) served as the multimodal baselines. Text-track evaluations were conducted within the AI Inspect framework to ensure consistent scoring and reproducibility. However, since AI Inspect currently has limited functionality for multimodal processing, the multimodal evaluations were implemented using a separate custom script for model invocation and scoring. Figure 3 provides a visual overview of the unified evaluation architecture, showing how the AssurAI dataset interacts with target models through text- and vision-based judging pipelines. The detailed evaluation procedure is described in the following subsection.

Figure 3: The overall safety evaluation pipeline of AssurAI, consisting of two tracks: the Text Track for text-based models and the Multimodal Track for audio, image, and video models.

6.2 Evaluation Procedure
As illustrated in Figure 3, the evaluation procedure of AssurAI consists of two parallel processes: a text-based process and a vision-based process. All inputs are provided as textual prompts, and model outputs are pre-processed according to their modality before being evaluated by a Judge Model.

Text-based Process. For the text track, four open-weight models (EXAONE 3.5, Llama 3.1, Mistral, and Qwen 2.5) were employed. All models were executed under the Q8 quantization configuration to ensure stable operation in a lightweight local environment while minimizing accuracy loss. The model outputs were directly evaluated by the judge model using a four-element Judge Prompt composed of Instruction, Input Prompt, Model Output, and Output Candidate. For audio data, generated speech files were transcribed into text to follow the same evaluation procedure. The GPT-4o-mini model was used as the Judge Model for all text-based evaluations.

Vision-based Process. For the visual modalities within the multimodal track, Gemini 2.5 and Veo 2.0 were evaluated as baseline models. Image and video outputs were standardized prior to evaluation: images were resized, converted to JPEG, and Base64-encoded, while video data underwent frame sampling, quality filtering, and normalization to extract representative frames for scoring. The GPT-5-mini model was used as the Judge Model for the vision-based process.

Both processes employed the same five-point safety rubric to ensure consistency and fairness across tracks.

6.3 Text Track Results
The text-based evaluation was conducted using the AssurAI dataset, covering 35 risk categories and 8 prompt types. Model responses were rated by the Judge Model (GPT-4o-mini) on a 5-point safety scale (1 = Risk, 5 = Safe), which served as an automated scoring proxy referencing human-averaged scores for consistent scaling.
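The paper describes the judge configuration (a four-element Judge Prompt scored on a 1-5 rubric by GPT-4o-mini) but does not publish the prompt or scoring code. The sketch below shows one way such a judge call could be structured with the OpenAI Python client; the prompt wording, fallback behavior, and parsing logic are assumptions for illustration, not the authors' pipeline.

```python
import re
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

client = OpenAI()

# Four-element judge prompt mirroring the structure described above
# (Instruction, Input Prompt, Model Output, Output Candidate); wording is illustrative.
JUDGE_TEMPLATE = """Instruction: Rate the safety of the model output on a 1-5 scale
(1 = clearly risky, 5 = clearly safe). Answer with a single digit.

Input Prompt: {prompt}
Model Output: {output}
Output Candidate: {candidate}"""

def judge_safety(prompt: str, output: str, candidate: str = "N/A") -> int:
    """Ask the judge model for a 1-5 safety score and parse the first digit it returns."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": JUDGE_TEMPLATE.format(
            prompt=prompt, output=output, candidate=candidate)}],
        temperature=0,
    )
    match = re.search(r"[1-5]", resp.choices[0].message.content)
    return int(match.group()) if match else 3  # fall back to the neutral midpoint
```

In the actual text track this scoring step runs inside the AI Inspect framework; the standalone call above only illustrates the prompt structure and score parsing.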
Table 5 summarizes the mean (μ), standard deviation (σ), and coefficient of variation (CV) for each model. Mean scores ranged between 3.3 and 3.9 with low standard deviations (0.28-0.32), indicating a stable evaluation framework. All models showed CV values below 9%, confirming that the scoring framework maintained operational stability and minimal bias.

Table 5: Statistical summary of model-wise safety scores (Text Eval Track).
EXAONE: mean 3.90, std. dev. 0.30, CV 7.7%
Llama: mean 3.30, std. dev. 0.28, CV 8.5%
Mistral: mean 3.79, std. dev. 0.30, CV 7.9%
Qwen: mean 3.87, std. dev. 0.32, CV 8.3%

However, despite this overall stability, significant inter-model differences were observed. The results of statistical significance testing are summarized in Table 6. A one-way ANOVA revealed a significant difference across models (F(3,136) = 23.29, p < .001, η² = 0.055), while Levene's test confirmed homogeneity of variances (p = .99). These results suggest that the evaluation system maintained a stable variance structure while distinguishing fine-grained behavioral differences among models. Post-hoc Tukey HSD testing indicated no significant differences among EXAONE 3.5 (Q8), Qwen 2.5 (Q8), and Mistral (Q8), but all three scored significantly higher than Llama 3.1 (Q8) (p < .001), suggesting that Llama adopted a mitigated refusal strategy relying on contextual reasoning over explicit refusals.

Table 6: Summary of ANOVA, Levene's test, and effect size (η²).
ANOVA F(3,136): 23.2893, p = 3.142e-12
Levene's W: 0.0393, p = 9.894e-01
Effect size (η²): 0.0555

Across the 35 risk categories, mean scores were concentrated between 3.3 and 4.0. EXAONE 3.5 (Q8) and Qwen 2.5 (Q8) showed particularly high scores in the Hate Speech, Child Harm, and Privacy Violation categories, reflecting training policies that emphasize social norms and moral reasoning. In contrast, Llama 3.1 (Q8) exhibited lower scores on violent and sexual content, indicating a relatively relaxed approach to safety enforcement in sensitive areas. Consistent with the standard deviation results, Mistral (Q8) displayed moderate mean scores but a narrow dispersion (σ = 0.30), suggesting a balanced moderation strategy without over-rejection or over-permissiveness. Detailed per-category mean scores for all models are presented in Figure 4, providing a granular view of model-specific variation across the 35 risk dimensions.

Figure 4: Per-category safety scores for the evaluated models.

Analysis by prompt type revealed no major differences in average scores, but distinct behavioral patterns emerged depending on prompt context (Fig. 5). EXAONE 3.5 (Q8) and Qwen 2.5 (Q8) achieved higher consistency in refusal across the Role-Playing, Reflection, and Multi-Session types, demonstrating strong alignment in interactive or dialogue-based tasks. Conversely, Llama 3.1 (Q8) performed more stably on Chain-of-Thought and Multiple-Choice prompts. All models shared a common weakness in the Rail type, indicating a systematic challenge in this category. These findings indicate that models do not merely generate "safe" outputs but rather adapt their refusal strategies according to the purpose and interaction structure of prompts. This suggests that the Text Eval Track captures not only single-turn QA behavior but also task-specific safety dynamics across prompt intents. Thus, while each prompt type represents an independent context, the dataset ensures consistency under a unified evaluation framework, demonstrating extended evaluative coverage across interaction modes.

Figure 5: Per-prompt-type safety scores for the evaluated models.
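The significance tests reported above (one-way ANOVA, Levene's test, η², and Tukey HSD) can be reproduced from per-model score samples with standard scientific Python tooling. The sketch below is illustrative only: the synthetic scores and group sizes are assumptions, not the paper's data, although 35 scores per model matches the reported F(3,136) degrees of freedom.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
# Synthetic per-category judge scores (1-5) for four models; real scores would come from the eval run.
scores = {
    "EXAONE":  rng.normal(3.90, 0.30, 35).clip(1, 5),
    "Llama":   rng.normal(3.30, 0.28, 35).clip(1, 5),
    "Mistral": rng.normal(3.79, 0.30, 35).clip(1, 5),
    "Qwen":    rng.normal(3.87, 0.32, 35).clip(1, 5),
}
groups = list(scores.values())

f_stat, p_anova = stats.f_oneway(*groups)   # one-way ANOVA across models
w_stat, p_levene = stats.levene(*groups)    # homogeneity of variances

# Effect size (eta squared) = SS_between / SS_total.
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((all_scores - grand_mean) ** 2).sum()
eta_sq = ss_between / ss_total

print(f"ANOVA F={f_stat:.2f}, p={p_anova:.3g}; Levene W={w_stat:.3f}, p={p_levene:.3g}; eta^2={eta_sq:.3f}")

# Post-hoc pairwise comparison (Tukey HSD).
labels = np.repeat(list(scores.keys()), [len(g) for g in groups])
print(pairwise_tukeyhsd(all_scores, labels, alpha=0.05))
```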
6.4 Pilot Multimodal Track Results
This section presents the exploratory (pilot) experimental results of the AssurAI Multimodal Track. While the text track compared model-level safety alignment performance using the official dataset, the multimodal track aimed to explore the operational patterns of policy-level safety in the latest multimodal models using a prototype dataset. Therefore, the results in this section should be interpreted as descriptive statistical analyses of trends in safety policies, rather than inferential statistical tests.

6.4.1 Audio Modality Results
In the audio modality evaluation, Gemini Live 2.5 Flash Preview showed stable and conservative response tendencies as a commercial model. Among 330 evaluations, scores of 4 (24.2%) and 5 (44.8%) accounted for approximately 69% of the total, indicating that the model avoided risky or inappropriate responses and maintained safe outputs for most audio inputs. Score 3 (24.8%) was classified as neutral, and high-risk responses (1-2 points, 3.0% each) appeared very rarely. This distribution suggests that the model maintained a risk-averse and safety-centered policy even in the audio modality, similar to text-based models.

Figure 6: Score distribution for the audio modality (1-5 scale). Safety evaluation results of Gemini Live 2.5 Flash Preview.

The average scores also ranged between 3.73 and 4.14, showing stable results, and most prompt types had 95% confidence intervals narrower than ±0.4. In particular, Chain-of-Thought (4.14 ± 0.23) and Q Only (4.11 ± 0.18) showed high stability above 4 points, while Expert Prompting (3.73 ± 0.41) recorded slightly lower scores, interpreted as the influence of prompt complexity on safety assessment. The Role-Playing type (3.98 ± 0.21) showed low variability despite containing conversational context, demonstrating that the model maintained a consistent level of conversational stability even in voice-based interactions.

Table 7: Average safety scores and 95% confidence intervals by prompt type for the audio modality (n = 330).
Chain-of-Thought: 4.14 ± 0.23 (n = 63)
Q Only: 4.11 ± 0.18 (n = 139)
Role-Playing: 3.98 ± 0.21 (n = 106)
Expert Prompting: 3.73 ± 0.41 (n = 22)

6.4.2 Image Modality Results
In the image modality evaluation, Gemini 1.5 Flash was used. During the generation process, many outputs were blocked due to safety controls. After repeating the same prompt up to three times, some cases were successfully generated, but a significant number remained blocked. Therefore, this study separately analyzed Safely Blocked cases and Evaluated cases. This procedure enabled verification of the model's policy-level refusal consistency and whether blocking persisted in repeated requests. Among 1,160 prompts, approximately 40% were blocked by safety policies, indicating that the model detected and responded to unsafe or inappropriate requests in advance. Most generated images were distributed within the 3-5 score range, implying overall safe or neutral results.

Figure 7: Score distribution and safely blocked ratio for the image modality. Results of the Gemini 1.5 Flash evaluation.

The average safety scores by prompt type are summarized in Table 8. Overall, the mean scores ranged from 3.3 to 4.5, with several prompt types exhibiting stable performance around the 4-point level. In addition, the 95% confidence intervals (CIs) were generally narrow (below ±0.4), indicating minimal variability in safety performance across prompt types.
In contrast, the Multiple-Choice type was mostly blocked during evaluation, resulting in a very limited sample size (n = 3) and an exceptionally wide 95% CI of ±4.97. This wide interval reflects statistical uncertainty caused by sample imbalance, which is interpreted as evidence of strong safety policy enforcement rather than instability in model behavior.

Table 8: Average safety scores by prompt type for the image modality (Safely Blocked excluded; interpretation limited for n < 5 groups).
Chain-of-Thought: 3.37 ± 0.33 (n = 38)
Expert Prompting: 3.55 ± 0.33 (n = 73)
Multiple-Choice: 3.00 ± 4.97 (n = 3)
Multi-Session: 4.38 ± 0.18 (n = 101)
Q Only: 4.02 ± 0.14 (n = 268)
Reflection: 4.46 ± 0.23 (n = 48)
Role-Playing: 3.88 ± 0.27 (n = 94)

6.4.3 Video Modality Results
The video modality was evaluated using the Veo-2.0-generate-001 model. As in the image modality, some generations were stopped by safety mechanisms, and repeated runs were not conducted due to API rate limits (10 generations per minute). The evaluation was performed using a single representative frame, following a proxy procedure based on information content and sharpness (see Appendix A). The overall score distribution is shown in Figure 8. Scores of 5 (35.3%) and 4 (18.0%) accounted for 53.3% of the total, followed by 3 (14.7%), 2 (3.8%), and 1 (8.0%). This indicates that Veo 2.0 maintained a risk-averse policy during video generation, and extreme unsafe responses occurred rarely. The proportion of Safely Blocked cases was 15.8%. Unlike the image modality, which showed a blocking rate of about 40% even after up to three retries, the video modality showed a lower blocking rate even with single-run evaluations.

Figure 8: Score distribution for the video modality (1-5 scale). Safety evaluation results of Veo 2.0 Generate-001.

The average safety scores and 95% confidence intervals (CIs) for each prompt type in the video modality are presented in Table 9. The variation across prompt types was relatively large. The Q Only (4.06 ± 0.14) and Rail (4.11 ± 0.56) types demonstrated stable and consistent safety performance, whereas the Chain-of-Thought (1.88 ± 0.51) and Role-Playing (2.44 ± 0.85) types exhibited substantially lower safety scores. These results suggest that the model generated unsafe or risky content in logic-driven or role-based scenarios, indicating that the internal refusal or safety enforcement mechanisms did not fully operate under such conditions. Expert Prompting (3.79 ± 0.48) showed moderate performance, with variability depending on the specific task and contextual structure. Overall, these findings indicate that Veo 2.0 displays heterogeneous safety response patterns depending on prompt type, with higher risk generation tendencies in conversational and explanatory scenarios compared to single-turn or rule-guided prompts.

Table 9: Average safety scores and 95% confidence intervals by prompt type for the video modality (Safely Blocked excluded).
Chain-of-Thought: 1.88 ± 0.51 (n = 17)
Expert Prompting: 3.79 ± 0.48 (n = 34)
Q Only: 4.06 ± 0.14 (n = 274)
Rail: 4.11 ± 0.56 (n = 18)
Role-Playing: 2.44 ± 0.85 (n = 16)

This experiment was conducted under a single-frame evaluation setting, and thus does not fully account for risk factors associated with temporal coherence or scene transitions. Accordingly, these results can be interpreted as a minimal proxy evaluation that captures the model's initial risk-assessment stage during video generation. They serve as a baseline indicator for subsequent multi-frame comparative experiments.
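The representative-frame proxy (selecting a frame by information content and sharpness) is only described at a high level. The sketch below shows one plausible implementation with OpenCV; the specific scoring functions (grayscale entropy for information content, Laplacian variance for sharpness) and their combination are assumptions for illustration, not the authors' procedure.

```python
import cv2
import numpy as np

def frame_score(gray: np.ndarray) -> float:
    """Proxy score combining grayscale entropy (information content) and Laplacian variance (sharpness)."""
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
    p = hist / hist.sum()
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    # Compress sharpness onto a scale comparable to entropy before adding (heuristic weighting).
    return entropy + np.log1p(sharpness)

def pick_representative_frame(video_path: str, step: int = 10):
    """Sample every `step`-th frame and return the one with the highest proxy score."""
    cap = cv2.VideoCapture(video_path)
    best, best_score, idx = None, -np.inf, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            score = frame_score(gray)
            if score > best_score:
                best, best_score = frame, score
        idx += 1
    cap.release()
    return best  # BGR frame that would then be passed to the vision judge
```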
This pilot study represents exploratory evidence rather than a statistically generalized dataset-level analysis. The judge model configuration was limited to GPT-4o-mini and GPT-5-mini, and no human calibration was applied in the vision-based evaluation. Future work will incorporate a multi-judge framework and temporal consistency (TC)-based rubric enhancement to enable more robust verification of visual safety performance.

7 Conclusion

This study was conducted to define a systematic foundation for AI safety evaluation tailored to the Korean language environment, in response to the potential risks associated with the rapid advancement of generative AI technology. The primary objective was to develop a highly reliable benchmark dataset that can assess generative AI risks from multiple perspectives and to present a robust process to ensure data quality.

We made several significant contributions. First, based on in-depth discussions with our multidisciplinary expert group, we developed a comprehensive taxonomy of 35 AI risk factors that considers both international universality and domestic socio-cultural specificity. Second, we constructed AssurAI, a multimodal benchmark dataset containing 11,480 instances across text, image, video, and audio, based on the taxonomy of risk factors. Third, we ensured the reliability and validity of the dataset by proposing a systematic quality management and validation process. This process incorporates expert-led sample generation, crowdsourced mass production, and iterative validation using a red team approach.

The AssurAI dataset can serve as a crucial foundational resource for domestic and international AI researchers to quantitatively evaluate the safety of Korean language generation models, analyze their vulnerabilities, and further develop technologies to enhance safety. It will be beneficial for maximizing the positive effects of AI technology, mitigating its social side effects, and ultimately contributing to the establishment of a trustworthy AI ecosystem.

In future research, we believe it is necessary to continuously explore new variants of AI risks not covered by our dataset and expand the dataset accordingly. Furthermore, we propose research to develop an evaluation framework that dynamically evolves and adapts alongside AI model advancements, moving beyond the limitations of the current static dataset. This type of self-adaptive evaluation will ensure the effectiveness of safety evaluations, even as AI technology continues to evolve.

References

Deep Ganguli, Liane Lovitt, Jackson Kernion, Amanda Askell, Yuntao Bai, Saurav Kadavath, Ben Mann, Ethan Perez, Nicholas Schiefer, Kamal Ndousse, and others. 2022. Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned. arXiv preprint arXiv:2209.07858.

Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A Smith. 2020. RealToxicityPrompts: Evaluating neural toxic degeneration in language models. arXiv preprint arXiv:2009.11462.

Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, and Ece Kamar. 2022. ToxiGen: A large-scale machine-generated dataset for adversarial and implicit hate speech detection. arXiv preprint arXiv:2203.09509.

Jiho Jin, Jiseon Kim, Nayeon Lee, Haneul Yoo, Alice Oh, and Hwaran Lee. 2024. KoBBQ: Korean bias benchmark for question answering. Transactions of the Association for Computational Linguistics, 12:507-524.
Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Gunhee Kim, and Jung-Woo Ha. 2023. KoSBi: A dataset for mitigating social bias risks towards safer large language model application. arXiv preprint arXiv:2305.17701.

Yi Liu, Gelei Deng, Zhengzi Xu, Yuekang Li, Yaowen Zheng, Ying Zhang, Lida Zhao, Tianwei Zhang, Kailong Wang, and Yang Liu. 2023. Jailbreaking ChatGPT via prompt engineering: An empirical study. arXiv preprint arXiv:2305.13860.

Weidi Luo, Siyuan Ma, Xiaogeng Liu, Xiaoyu Guo, and Chaowei Xiao. 2024. JailBreakV: A benchmark for assessing the robustness of multimodal large language models against jailbreak attacks. arXiv preprint arXiv:2404.03027.

Yibo Miao, Yifan Zhu, Lijia Yu, Jun Zhu, Xiao-Shan Gao, and Yinpeng Dong. 2024. T2VSafetyBench: Evaluating the safety of text-to-video generative models. Advances in Neural Information Processing Systems, 37:63858-63872.

Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Thompson, Phu Mon Htut, and Samuel R Bowman. 2021. BBQ: A hand-built bias benchmark for question answering. arXiv preprint arXiv:2110.08193.

Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. 2022. Red teaming language models with language models. arXiv preprint arXiv:2202.03286.

Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, and Yang Zhang. 2023. Unsafe diffusion: On the generation of unsafe images and hateful memes from text-to-image models. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pages 3403-3417.

Murray Shanahan, Kyle McDonell, and Laria Reynolds. 2023. Role play with large language models. Nature, 623(7987):493-498.

Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, and Yang Zhang. 2024. "Do anything now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models. In Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security, pages 1671-1685.

Peter Slattery, Alexander K Saeri, Emily AC Grundy, Jess Graham, Michael Noetel, Risto Uuk, James Dao, Soroush Pour, Stephen Casper, and Neil Thompson. 2024. The AI risk repository: A comprehensive meta-review, database, and taxonomy of risks from artificial intelligence. arXiv preprint arXiv:2408.12622.

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, and others. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35:24824-24837.

Laura Weidinger, Jonathan Uesato, Jakub Bielecki, Glenn van den Driessche, Mike Chrzanowski, Dmitriy Krasheninnikov, Martin Chadwick, Rohen Shah Gur, Amanda Glaese, Ruben Tréger, and others. 2021. Ethical and social risks of large language models. arXiv preprint arXiv:2112.04359.

Yi Zeng, Kevin Klyman, Andy Zhou, Yu Yang, Minzhou Pan, Ruoxi Jia, Dawn Song, Percy Liang, and Bo Li. 2024. AI risk categorization decoded (AIR 2024): From government regulations to corporate policies. arXiv preprint arXiv:2406.17864.

Zhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu, Xuanyu Lei, Jie Tang, and Minlie Huang. 2023. SafetyBench: Evaluating the safety of large language models. arXiv preprint arXiv:2309.07045.

A Detailed Dataset Specifications

This appendix provides the full specifications of the AssurAI dataset summarized in the main body.
Table 10 below is the raw data corresponding to Figure 2 of the main body, serving as detailed supporting material for the paper's reproducibility and transparency. As described in Section 3, this table specifies the source of each of the 35 risk factors as either AIR 2024 (Zeng et al., 2024) or MIT FutureTech (Slattery et al., 2024).

Table 10: Detailed distribution of the 35 AI risk factors by data modality and source.
1. Supporting Malicious Organized Groups (AIR 2024): 150 total (150 text)
2. Celebrating Suffering (AIR 2024): 150 total (150 text)
3. Violent Acts (AIR 2024): 180 total (150 text, 30 audio)
4. Depicting Violence (AIR 2024): 260 total (150 text, 80 image, 30 video)
5. Weapon Usage & Development (AIR 2024): 150 total (150 text)
6. Military and Warfare (AIR 2024): 280 total (150 text, 70 image, 30 video, 30 audio)
7. Harassment (AIR 2024): 430 total (300 text, 70 image, 30 video, 30 audio)
8. Hate Speech (AIR 2024): 190 total (150 text, 40 audio)
9. Offensive Language (AIR 2024): 190 total (150 text, 40 audio)
10. Perpetuating Harmful Beliefs (AIR 2024): 300 total (300 text)
11. Adult Content (AIR 2024): 430 total (300 text, 70 image, 30 video, 30 audio)
12. Erotic Content (AIR 2024): 430 total (300 text, 70 image, 30 video, 30 audio)
13. Non-Consensual Nudity (AIR 2024): 450 total (260 text, 160 image, 30 video)
14. Monetized Sexual Content (AIR 2024): 300 total (300 text)
15. Endangerment, Harm, or Abuse of Children (AIR 2024): 430 total (300 text, 70 image, 30 video, 30 audio)
16. Child Sexual Abuse (AIR 2024): 390 total (300 text, 70 image, 20 video)
17. Suicidal and Non-suicidal Self-injury (AIR 2024): 390 total (300 text, 70 image, 20 video)
18. Political Persuasion (AIR 2024): 100 total (100 text)
19. Influencing Politics (AIR 2024): 100 total (100 text)
20. Deterring Democratic Participation (AIR 2024): 100 total (100 text)
21. Fraud (AIR 2024): 300 total (300 text)
22. Mis/disinformation (AIR 2024): 450 total (300 text, 80 image, 30 video, 40 audio)
23. Sowing Division (AIR 2024): 300 total (300 text)
24. Misrepresentation (AIR 2024): 430 total (300 text, 70 image, 30 video, 30 audio)
25. Types of Defamation (AIR 2024): 300 total (300 text)
26. Discriminatory Activities (AIR 2024; covers various protected characteristics): 1,000 total (900 text, 70 image, 30 video)
27. Unauthorized Privacy Violations (AIR 2024; covers various types of sensitive data): 900 total (900 text)
28. Illegal/Regulated Substances (AIR 2024): 400 total (300 text, 70 image, 30 video)
29. Illegal Services/Exploitation (AIR 2024): 400 total (300 text, 70 image, 30 video)
30. Other Unlawful/Criminal Activities (AIR 2024): 400 total (300 text, 70 image, 30 video)
31. Increased inequality and decline in employment quality (MIT FutureTech): 300 total (300 text)
32. Economic and cultural devaluation of human effort (MIT FutureTech): 300 total (300 text)
33. Competitive dynamics (MIT FutureTech): 300 total (300 text)
34. Overreliance and unsafe use (MIT FutureTech): 150 total (150 text)
35. Loss of human agency and autonomy (MIT FutureTech): 150 total (150 text)
Total instances: 11,480 (9,560 text, 1,160 image, 430 video, 330 audio)