📝 Selected Papers

(* indicates equal contribution, # indicates corresponding author)

Publications

ICML 2026

SARSteer: Safeguarding Large Audio Language Models via Safe-Ablated Refusal Steering

Weilin Lin, Jianze Li, Hui Xiong, Li Liu#

  • The first inference-time defense framework specifically designed for Large Audio Language Models (LALMs).
  • Proposes text-derived refusal steering to enforce refusal without manipulating audio inputs.
  • Introduces decomposed safety space ablation to mitigate over-refusal on benign speech queries.

The Forty-Third International Conference on Machine Learning (ICML), Seoul, South Korea, 2026

NeurIPS 2025 D&B

BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model GitHub

Weilin Lin*, Nanjun Zhou*, Yanyun Wang, Jianze Li, Hui Xiong, Li Liu#

  • The first comprehensive benchmark for backdoor learning on diffusion models.
  • Proposes a unified attack formulation and a systematic target taxonomy.
  • Supports 9 diffusion backdoor attacks, 5 defense methods, and 3 visualization tools.

The Thirty-Ninth Annual Conference on Neural Information Processing Systems Datasets & Benchmarks Track (NeurIPS D&B), San Diego, California, USA, 2025

AAAI 2025 Oral

Fusing Pruned and Backdoored Models: Optimal Transport-based Data-free Backdoor Mitigation GitHub Oral Presentation (4.6%)

Weilin Lin, Li Liu#, Jianze Li, Hui Xiong

  • One of the few data-free defense strategies against backdoor attacks.
  • The first adaptation of optimal transport (OT) and model fusion to backdoor defense.

The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), Philadelphia, Pennsylvania, USA, 2025

NeurIPS 2024

Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness GitHub

Weilin Lin, Li Liu#, Shaokui Wei, Jianze Li, Hui Xiong

  • Offers new insights into unlearning weight changes and backdoor activeness.
  • Proposes an effective defense strategy using reinitialization and fine-tuning.

The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, 2024

Preprints