🎈 About Me

I am currently a third-year Ph.D. student in the AI Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou), supervised by Prof. Li Liu and Prof. Hui Xiong. Previously, I received my master's degree from City University of Hong Kong, advised by Prof. Xiangyu Zhao.

Research Interests

  • AI Security / Safety
  • Red-teaming Agents
  • Backdoor Learning
  • Recommendation

📝 Selected Papers

(* indicates equal contribution, # indicates corresponding author)

Publications

NeurIPS 2025 D&B

BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model GitHub

Weilin Lin*, Nanjun Zhou*, Yanyun Wang, Jianze Li, Hui Xiong, Li Liu#

  • The first comprehensive benchmark for backdoor learning on diffusion models.
  • Proposes a unified attack formulation and a systematic target taxonomy.
  • Supports 9 diffusion backdoor attacks, 5 defense methods, and 3 visualization tools.

The Thirty-Ninth Annual Conference on Neural Information Processing Systems Datasets & Benchmarks Track (NeurIPS D&B), San Diego, California, USA, 2025

AAAI 2025 Oral

Fusing Pruned and Backdoored Models: Optimal Transport-based Data-free Backdoor Mitigation GitHub | Oral Presentation (4.6%)

Weilin Lin, Li Liu#, Jianze Li, Hui Xiong

  • One of the few data-free defense strategies against backdoor attacks.
  • First adaptation of optimal transport (OT) and model fusion to backdoor defense.

The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), Philadelphia, Pennsylvania, USA, 2025

NeurIPS 2024

Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness GitHub

Weilin Lin, Li Liu#, Shaokui Wei, Jianze Li, Hui Xiong

  • New insights into unlearning weight changes and backdoor activeness.
  • Proposes an effective defense strategy using reinitialization and fine-tuning.

Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, 2024

Preprints

arXiv 2025

SARSteer: Safeguarding Large Audio Language Models via Safe-Ablated Refusal Steering

Weilin Lin, Jianze Li, Hui Xiong, Li Liu#

  • The first inference-time defense framework specifically designed for Large Audio Language Models (LALMs).
  • Proposes text-derived refusal steering to enforce refusal without manipulating audio inputs.
  • Introduces decomposed safety space ablation to mitigate over-refusal on benign speech queries.

arXiv preprint, 2025

📖 Education

  • Ph.D., Artificial Intelligence, The Hong Kong University of Science and Technology (Guangzhou), September 2023 - Present
  • M.Sc., Multimedia Information Technology, City University of Hong Kong, August 2021 - October 2022
  • B.S., Electronic Information Science and Technology, South China Normal University, September 2017 - July 2021

🧭 Experience

  • 2025.12 - Present, Research Intern, WeChat Security Team, Tencent. Research on Red Teaming Agents.
  • 2023.03 - 2023.09, Research Assistant, The Hong Kong University of Science and Technology (Guangzhou). Advisor: Prof. Li Liu.
  • 2022.07 - 2023.03, Research Assistant, USAIL Group, HKUST Fok Ying Tung Research Institute / The Hong Kong University of Science and Technology (Guangzhou). Advisor: Prof. Hao Liu.
  • 2021.09 - 2022.10, Research Assistant, Applied Machine Learning Lab (AML Lab), City University of Hong Kong. Advisor: Prof. Xiangyu Zhao.

🎖 Honors and Awards

  • Outstanding Paper Award (Second Prize), The 15th Guangdong-Hong Kong-Macao Conference on Image and Graphics, December 2025. Awarded for the paper “BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model”.
  • Best Student Paper Award Finalist at ICSR 2024.
  • Full Postgraduate Scholarship, HKUST(GZ).

🍀 Services

Reviewer/External Reviewer

  • The International Conference on Learning Representations (ICLR), 2026
  • The International Conference on Machine Learning (ICML), 2025, 2026
  • Annual Conference on Neural Information Processing Systems (NeurIPS), 2024, 2025
  • Annual AAAI Conference on Artificial Intelligence (AAAI), 2024, 2025, 2026
  • International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2024