✨ GateBreaker: Gate-Guided Attacks on Mixture-of-Expert LLMs
📝 Summary:
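GateBreaker is presented as the first framework to compromise the safety alignment of Mixture-of-Experts (MoE) LLMs by identifying and disabling ~3% of safety neurons in expert layers. Doing so raises the attack success rate from 7.4% to 64.9% across eight LLMs, and the attack generalizes to vision-language models (VLMs), which suggests that safety behavior in MoE models is concentrated in a small set of neurons and that this vulnerability transfers across models.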
🔹 Publication Date: Dec 24, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21008
• PDF: https://arxiv.org/pdf/2512.21008
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#LLM #AIsecurity #MoELLMs #AIvulnerability #GateBreaker