To achieve truly autonomous AI agents that can handle complex tasks on our behalf, we must first trust them with our sensitive data while ensuring they learn from and use this information responsibly. Yet current language models frequently expose private information and reproduce copyrighted content in unexpected ways, highlighting why we need to move beyond simplistic “good” or “bad” blanket rules toward context-aware systems. In this talk, we will examine data exposure through membership inference attacks, develop controlled generation methods to protect information, and design privacy frameworks grounded in contextual integrity theory. Looking ahead, we will explore emerging directions in formalizing semantic privacy, developing dynamic data controls, and creating evaluation frameworks that bridge technical capabilities with human needs.
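To give a flavor of the membership inference attacks discussed in the talk, below is a minimal sketch of a loss-threshold attack against a causal language model. It is not the speaker's method; the model name, threshold value, and function names are illustrative assumptions, and in practice the threshold would be calibrated (for example, against a reference model or held-out data).

```python
# Minimal sketch of a loss-threshold membership inference attack.
# Assumes a Hugging Face causal LM; "gpt2" and the threshold are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative choice; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def sequence_loss(text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

def is_likely_member(text: str, threshold: float = 3.0) -> bool:
    """Flag `text` as a likely training-set member if its loss is unusually low.
    The fixed threshold here is a placeholder; real attacks calibrate it,
    e.g., with a reference model or a likelihood-ratio test."""
    return sequence_loss(text) < threshold

# Example: lower loss suggests the model has seen (or memorized) similar text.
print(is_likely_member("The quick brown fox jumps over the lazy dog."))
```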
Niloofar Mireshghallah is a post-doctoral scholar at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. She received her Ph.D. from the CSE department at UC San Diego in 2023. Her research interests include privacy in machine learning, natural language processing, and generative AI and law. She is a recipient of the National Center for Women & IT Collegiate Award in 2020, a finalist for the Qualcomm Innovation Fellowship in 2021, and a recipient of the 2022 Rising Stars in Adversarial ML award; she has also been named a Rising Star in EECS.