SPY Lab
Publications
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Jun 19, 2024
Extracting Training Data From Document-Based VQA Models
Jun 1, 2024
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Jun 1, 2024
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
Jun 1, 2024
Stealing part of a production language model
May 11, 2024
Universal Jailbreak Backdoors from Poisoned Human Feedback
May 7, 2024
Privacy Side Channels in Machine Learning Systems
May 1, 2024
Evading Black-box Classifiers Without Breaking Eggs
Apr 13, 2024
Evaluating Superhuman Models with Consistency Checks
Apr 8, 2024
Poisoning Web-Scale Training Datasets is Practical
Apr 8, 2024
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation
Dec 17, 2023
Students Parrot Their Teachers: Membership Inference on Model Distillation
Dec 17, 2023
Are aligned neural networks adversarially aligned?
Dec 10, 2023
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy
Sep 11, 2023
Extracting Training Data from Diffusion Models
Aug 11, 2023
Tight Auditing of Differentially Private Machine Learning
Aug 11, 2023
A law of adversarial risk, interpolation, and label noise
May 1, 2023
A Light Recipe To Train Robust Vision Transformers
Feb 8, 2023
Red-Teaming the Stable Diffusion Safety Filter
Dec 9, 2022
Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets
Nov 1, 2022