SPY Lab
Blog
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
Nov 18, 2024
Persistent Pre-Training Poisoning of LLMs
Oct 18, 2024
Gradient-based Jailbreak Images for Multimodal Fusion Models
Oct 7, 2024
Adversarial Search Engine Optimization for Large Language Models
Jun 26, 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Jun 23, 2024
AI Risk Management Should Incorporate Both Safety and Security
May 1, 2024
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
Apr 24, 2024
Scalable Extraction of Training Data from (Production) Language Models
Nov 28, 2023
Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Dec 13, 2022