SPY Lab
Blog
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
Nov 18, 2024
Persistent Pre-Training Poisoning of LLMs
Oct 18, 2024
Gradient-based Jailbreak Images for Multimodal Fusion Models
Oct 7, 2024
Adversarial Search Engine Optimization for Large Language Models
Jun 26, 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Jun 23, 2024
AI Risk Management Should Incorporate Both Safety and Security
May 1, 2024
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
Apr 24, 2024
Scalable Extraction of Training Data from (Production) Language Models
Nov 28, 2023
Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Dec 13, 2022