The Secure and Private AI (SPY) Lab conducts research on the security, privacy, and trustworthiness of machine learning systems.
We often approach these problems from an adversarial perspective, designing attacks that probe a system's worst-case performance in order to understand and ultimately improve its safety.
| Date | News |
|---|---|
| May 12, 2025 | AgentDojo, a benchmark from our group for evaluating the robustness of AI agents, won first prize in the SafeBench competition. |
| May 10, 2025 | Two papers from our group were accepted to ICML 2025 as spotlights! Check our publications page for details. |
| Feb 5, 2025 | Six papers from our group were accepted to ICLR 2025! Check our publications page for details. Our paper Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI received a spotlight, and Consistency Checks for Language Model Forecasters will have an oral presentation! See you in Singapore 🇸🇬 |
| Nov 4, 2024 | Our paper showing how unlearning methods fail to remove knowledge from LLMs received a spotlight and an oral presentation at the SoLaR Workshop at NeurIPS 2024. |