thu-ml/SageAttention
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Language: Python
#attention #inference_acceleration #llm #quantization
Stars: 145 Issues: 6 Forks: 3
https://github.com/thu-ml/SageAttention
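A minimal usage sketch of how a quantized attention kernel like this is typically dropped in place of FlashAttention2 / scaled-dot-product attention. The import path and the sageattn signature are assumptions based on the repo's description, not confirmed by this post; check the README for the actual API.

```python
# Hedged sketch: assumes the package exposes a drop-in kernel
# `sageattn(q, k, v, is_causal=...)`; exact names may differ (see README).
import torch
from sageattention import sageattn  # assumed import path

# Toy tensors in (batch, heads, seq_len, head_dim), fp16 on GPU,
# the layout FlashAttention-style kernels usually expect.
q = torch.randn(1, 8, 1024, 64, dtype=torch.float16, device="cuda")
k = torch.randn(1, 8, 1024, 64, dtype=torch.float16, device="cuda")
v = torch.randn(1, 8, 1024, 64, dtype=torch.float16, device="cuda")

# Quantized attention used as a replacement for
# torch.nn.functional.scaled_dot_product_attention / FlashAttention2.
out = sageattn(q, k, v, is_causal=False)
print(out.shape)  # (1, 8, 1024, 64)
```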
GitHub - thu-ml/SageAttention: [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x and 3-11x…
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image...