Instant research discovery
Search and browse ingested papers with intelligence signals and fast filtering.
| Paper | Authors | Year | Area | Tags | Intel | Citations |
|---|---|---|---|---|---|---|
| AlphaFlowTSE: One-Step Generative Target Speaker Extraction via Conditional AlphaFlow | Zihan Qian, Lin Li, Duojia Li, Haizhou Li | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| Are Audio-Language Models Listening? Audio-Specialist Heads for Adaptive Audio Steering | Lenny Aharon, Ethan Fetaya, Neta Glazer | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R3 (96%) | - |
| Benchmarking Language Modeling for Lossless Compression of Full-Fidelity Audio | Phillip Long, Chris Donahue, Zachary Novack | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R3 (95%) | - |
| Causal Prosody Mediation for Text-to-Speech: Counterfactual Training of Duration, Pitch, and Energy in FastSpeech2 | Suvendu Sekhar Mohanty | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| Diffusion Models for Joint Audio-Video Generation | Alejandro Paredes La Torre | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| Disentangling Reasoning in Large Audio-Language Models for Ambiguous Emotion Prediction | Jean Honorio, Abhirup Ghosh, Xiaofeng Yu, Hong Jia | 2026 | cs.SD | ai-safety, cssd, preprint | E7 / R3 (95%) | - |
| Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR | Sandipana Dowerah, Atharva Kulkarni, Ajinkya Kulkarni, Mathew Magimai Doss | 2026 | cs.SD | ai-safety, cssd, preprint | E6 / R3 (97%) | - |
| EDMFormer: Genre-Specific Self-Supervised Learning for Music Structure Segmentation | Krish Patel, Sahal Sajeer, Oscar Chung, Joel Song Bae | 2026 | cs.SD | ai-safety, cssd, preprint | E6 / R4 (96%) | - |
| Evolution Strategy-Based Calibration for Low-Bit Quantization of Speech Models | Lucas Rakotoarivony | 2026 | cs.SD | ai-safety, cssd, preprint | E6 / R4 (96%) | - |
| Fish Audio S2 Technical Report | Xingwei Liu, Yuxuan Wang, Dawei Han, Tianyu Li | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R4 (94%) | - |
| Gender Fairness in Audio Deepfake Detection: Performance and Disparity Analysis | Anderson R. Avila, Shruti Kshirsagar, Aishwarya Fursule | 2026 | cs.SD | ai-safety, cssd, preprint | E7 / R3 (96%) | - |
| MOSS-TTS Technical Report | Yiyang Zhang, Ke Chen, Yitian Gong, Ruixiao Li | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| MUGEN: Evaluating and Improving Multi-audio Understanding of Large Audio-Language Models | Yen-Ting Piao, Chih-Kai Yang, Yu-Kai Guo, Hung-Wei Chen | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R3 (99%) | - |
| Physics-Informed Neural Engine Sound Modeling with Differentiable Pulse-Train Synthesis | Robin Doerfler, Lonce Wyse | 2026 | cs.SD | ai-safety, cssd, preprint | E4 / R3 (95%) | - |
| Probabilistic Verification of Voice Anti-Spoofing Models | Alexandr Kozodaev, Oleg Kiriukhin, Mikhail Pautov, Oleg Y. Rogov | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| Prosodic Boundary-Aware Streaming Generation for LLM-Based TTS with Streaming Text Input | Tianrui Wang, Yizhou Peng, Changsong Liu, Eng Siong Chng | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R3 (96%) | - |
| RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering | Tong Xia, Gaia A. Bertolino, Domenico Talia, Yuwei Zhang | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R3 (96%) | - |
| SCENEBench: An Audio Understanding Benchmark Grounded in Assistive and Industrial Use Cases | Angelina Wang, Laya Iyer, Sanmi Koyejo | 2026 | cs.SD | ai-safety, cssd, preprint | E7 / R5 (99%) | - |
| Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation | Laureano Moro-Velazquez, Najim Dehak, Thomas Thebaud, Jesus Villalba-Lopez | 2026 | cs.SD | ai-safety, cssd, safety-evaluation, preprint | - | - |
| Targeted Speaker Poisoning Framework in Zero-Shot Text-to-Speech | Thanathai Lertpetchpun, Shrikanth Narayanan, Thanapat Trachu, Sai Praneeth Karimireddy | 2026 | cs.SD | ai-safety, cssd, preprint | E6 / R4 (95%) | - |
| TimberAgent: Gram-Guided Retrieval for Executable Music Effect Control | Shengli Zhang, Shihao He, Yihan Xia, Fang Liu | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R3 (95%) | - |
| Toward Complex-Valued Neural Networks for Waveform Generation | Deok-Hyeon Cho, Hyung-Seok Oh, Seong-Whan Lee, Seung-Bin Kim | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| Towards Robust Speech Deepfake Detection via Human-Inspired Reasoning | Dmitrii Tarasov, Oleg Kiriukhin, Artem Dvirniak, Artem Iudin | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| VoiceSHIELD-Small: Real-Time Malicious Speech Detection and Transcription | Ubaid Abbas, Sugandha Sharma, Sumit Ranjan, Puneeth N Ail | 2026 | cs.SD | ai-safety, cssd, preprint | E4 / R3 (97%) | - |
| VoxEmo: Benchmarking Speech Emotion Recognition with Speech LLMs | Thomas Hain, Huang-Cheng Chou, Shrikanth Narayanan, Hezhao Zhang | 2026 | cs.SD | ai-safety, cssd, preprint | E5 / R3 (95%) | - |
| When Fine-Tuning Fails and when it Generalises: Role of Data Diversity and Mixed Training in LLM-based TTS | Aditya Choudhary, Anupam Purwar | 2026 | cs.SD | ai-safety, cssd, preprint | - | - |
| Whisper-CD: Accurate Long-Form Speech Recognition using Multi-Negative Contrastive Decoding | Yoonji Park, Hoseong Ahn, Kyuhong Shim, Jeongyun Chae | 2026 | cs.SD | ai-safety, cssd, preprint | E7 / R3 (96%) | - |
Showing 27 of 27 papers on page 1.