✨Sliding Window Attention Adaptation
📝 Summary:
Sliding Window Attention Adaptation (SWAA) allows pretrained LLMs to use efficient sliding window attention for long contexts without retraining. SWAA combines five adaptation methods, with specific synergistic combinations effectively recovering the original long-context performance.
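For intuition, here is a minimal NumPy sketch of plain sliding window attention (the mechanism SWAA adapts models to, not the paper's adaptation recipe itself): each query attends only to the most recent `window` positions, so attention cost grows linearly with sequence length.

```python
# Minimal sketch of (causal) sliding window attention for one head.
# This illustrates the attention pattern only; it is NOT the SWAA method.
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: True where query position i may attend to key position j."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    # causal (j <= i) AND within the last `window` positions
    return (j <= i) & (j > i - window)

def sliding_window_attention(q, k, v, window: int) -> np.ndarray:
    """Masked scaled dot-product attention; q, k, v are (seq_len, d)."""
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    mask = sliding_window_mask(seq_len, window)
    scores = np.where(mask, scores, -np.inf)   # block out-of-window keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

A model pretrained with full attention expects every past token to be visible, which is why naively switching to this mask degrades long-context quality; SWAA's adaptation methods aim to close that gap without retraining.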
🔹 Publication Date: Published on Dec 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10411
• PDF: https://arxiv.org/pdf/2512.10411
🔹 Models citing this paper:
• https://huggingface.co/yuyijiong/Qwen3-SWA-adaptation
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/yuyijiong/LongMemEval_24k
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#LLMs #SlidingWindowAttention #LongContextAI #NLP #AIResearch