Hi there! I am Xiangyu Qi (漆翔宇), a third-year Ph.D. candidate in the Department of Electrical and Computer Engineering at Princeton University, where I am advised by Prof. Prateek Mittal and Prof. Peter Henderson. Before that, I received my B.S. degree from the Department of Computer Science at Zhejiang University (June 2021). I also worked as a research intern with Prof. Bo Li at the Secure Learning Lab.

My current research focuses on Machine Learning Safety and Security, with two main objectives: (1) to decipher the fundamental vulnerabilities prevalent in ML systems, and (2) to devise strategies that counter these vulnerabilities, thereby contributing to the development of robust and trustworthy ML systems. Around these two objectives, my research has covered multiple threads of Adversarial Machine Learning (Adv ML), including adversarial examples [2,6] and data poisoning and backdoor attacks [3,4,5]. As the field evolves with the introduction of large-scale foundation models and a concerted push toward AGI, my recent work [1,2] has also expanded to the tangible safety and security challenges of AI alignment, with the ultimate objective of spurring robust and practical solutions that contribute to effective alignment infrastructure.

If you share similar interests, please feel free to reach out via xiangyuqi@princeton.edu. I am happy to chat and open to exploring opportunities for collaboration.


Selected Research




  • [2] (AAAI 2024 | oral presentation) Visual Adversarial Examples Jailbreak Aligned Large Language Models

    Xiangyu Qi*, Kaixuan Huang*, Ashwinee Panda, Peter Henderson, Mengdi Wang, Prateek Mittal

    The GPT-4V(ision) system card cited this paper to underscore the emerging threat vector of multimodal jailbreaking.

    [Code]

    Highlights:

    • Multimodality unavoidably expands the attack surface, making systems more vulnerable to adversarial attacks.
    • Visual adversarial examples (which remain unsolved after a decade of research) can pose a fundamental adversarial challenge to AI alignment; a minimal attack sketch follows this list.
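
    The sketch below illustrates the general attack idea only and is not the paper's exact recipe: a PGD-style loop optimizes an L_inf-bounded perturbation on the image input of a vision-language model so that the model assigns high likelihood to a chosen target continuation. `vlm_target_loss` is a hypothetical callable standing in for the model's loss on that target text given the image.

```python
# Minimal PGD-style sketch (illustrative; `vlm_target_loss` is a hypothetical
# stand-in for a vision-language model's loss on a chosen target continuation).
import torch

def visual_adversarial_example(image: torch.Tensor, vlm_target_loss,
                               eps: float = 16 / 255, alpha: float = 1 / 255,
                               steps: int = 100) -> torch.Tensor:
    x_orig = image.detach()
    x_adv = x_orig.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = vlm_target_loss(x_adv)                  # loss of the target continuation
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()                 # descend the loss
            x_adv = x_orig + (x_adv - x_orig).clamp(-eps, eps)  # project to the L_inf ball
            x_adv = x_adv.clamp(0.0, 1.0)                       # keep a valid image
    return x_adv.detach()
```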




  • [4] (ICLR 2023) Revisiting the Assumption of Latent Separability for Backdoor Defenses

    Xiangyu Qi*, Tinghao Xie*, Yiming Li, Saeed Mahloujifar, Prateek Mittal

    [Code]

    Highlight: Latent separability between clean samples and backdoor poison samples is pervasive and is often taken as a default assumption when designing defenses. We show that this assumption is not necessarily true: we design adaptive backdoor poisoning attacks that suppress the latent separation (see the sketch below).
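
    The data-level sketch below is illustrative (function names and rates are assumptions, not the paper's exact construction): alongside the usual payload samples (trigger applied, label flipped to the target class), a comparable number of regularization samples carry the trigger but keep their correct labels, which discourages the trained model from forming a cleanly separated latent cluster for backdoor samples.

```python
# Illustrative construction of an "adaptive" poisoned training set.
# `apply_trigger`, `payload_rate`, and `regularization_rate` are hypothetical
# names/values, not the paper's exact parameters.
import random

def build_adaptive_poison_set(dataset, apply_trigger, target_class,
                              payload_rate=0.003, regularization_rate=0.003):
    poisoned = []
    for x, y in dataset:
        r = random.random()
        if r < payload_rate:
            poisoned.append((apply_trigger(x), target_class))  # payload: trigger + target label
        elif r < payload_rate + regularization_rate:
            poisoned.append((apply_trigger(x), y))             # regularization: trigger + true label
        else:
            poisoned.append((x, y))                            # untouched clean sample
    return poisoned
```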



  • [5] (CVPR 2022 | oral presentation, 4.2%) Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks

    Xiangyu Qi*, Tinghao Xie*, Ruizhe Pan, Jifeng Zhu, Yong Yang, Kai Bu

    [Code]

    Highlight: Given any neural network instance of a given architecture (regardless of its specific weight values), we can embed a backdoor into that model instance by replacing a very narrow subnet of it with a malicious backdoor subnet; a conceptual sketch follows below.
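
    The sketch below conveys the idea on a tiny MLP and is not the paper's construction (the weight values and the single-unit "subnet" are illustrative assumptions): one hidden unit is overwritten to act as a trigger detector, and its outgoing weights are rewired so that firing it pushes the prediction toward the target class; nothing in the procedure depends on the victim model's existing weight values.

```python
# Conceptual subnet-replacement sketch on a Linear -> ReLU -> Linear MLP
# (illustrative values; not the paper's construction).
import torch
import torch.nn as nn

def inject_backdoor_subnet(mlp: nn.Sequential, trigger: torch.Tensor,
                           target_class: int, unit: int = 0, boost: float = 20.0):
    fc1, fc2 = mlp[0], mlp[2]          # assumes the layout Linear -> ReLU -> Linear
    t = trigger.flatten()
    with torch.no_grad():
        # The chosen hidden unit fires (post-ReLU) only when the input strongly aligns with the trigger.
        fc1.weight[unit] = t
        fc1.bias[unit] = -0.9 * t.dot(t)
        # Disconnect the unit from all classes, then wire it to the target class.
        fc2.weight[:, unit] = 0.0
        fc2.weight[target_class, unit] = boost
    return mlp

# Example usage on a randomly initialized model:
# mlp = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
# trigger = torch.zeros(784).index_fill_(0, torch.arange(16), 1.0)
# inject_backdoor_subnet(mlp, trigger, target_class=7)
```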