ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs

📝 Summary:
UniQL unifies quantization and low-rank compression to deploy LLMs on mobile devices. It reduces memory by 4x-5.7x and improves token throughput by 2.7x-3.4x, maintaining accuracy across various model types.
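A rough back-of-the-envelope sketch of why combining the two techniques compresses further than quantization alone. The layer size, rank, and bit widths below are hypothetical and are not taken from the paper; the arithmetic also ignores the overhead of quantization scales and zero points.

```python
def dense_bytes(m, n, bits=16):
    """Bytes for a full m x n weight matrix stored at `bits` bits per value."""
    return m * n * bits / 8

def low_rank_bytes(m, n, rank, bits):
    """Bytes for the same layer stored as two factors, (m x rank) and (rank x n)."""
    return rank * (m + n) * bits / 8

# Hypothetical 4096 x 4096 layer (assumption, for illustration only)
m = n = 4096
base = dense_bytes(m, n)                      # 16-bit baseline
ratio_quant = base / dense_bytes(m, n, 4)     # 4-bit quantization only
ratio_both = base / low_rank_bytes(m, n, rank=1536, bits=4)  # low-rank + 4-bit

print(f"quantization only: {ratio_quant:.2f}x")
print(f"quantization + low-rank: {ratio_both:.2f}x")
```

Lowering the rank trades accuracy for memory, which is the kind of knob an adaptive edge deployment could turn at run time.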

🔹 Publication Date: Published on Dec 3, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03383
• PDF: https://arxiv.org/pdf/2512.03383
• Project Page: https://hychiang.info/projects/uniql/
• Github: https://github.com/enyac-group/UniQL

==================================

For more data science resources:
https://t.me/DataScienceT

#LLMs #EdgeAI #Quantization #ModelCompression #DeepLearning
SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs

📝 Summary:
SignRoundV2 is a post-training quantization framework for LLMs. It uses a sensitivity metric for bit allocation and pre-tuning for scales to achieve competitive accuracy even at 2-bit quantization, closing the gap with full-precision models.
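To illustrate what sensitivity-guided bit allocation can look like, here is a minimal greedy sketch. The summary does not spell out the paper's sensitivity metric, so the scores, layer names, and the greedy rule below are all assumptions standing in for the real method.

```python
def allocate_bits(sensitivity, sizes, avg_budget, low=2, high=4):
    """Greedy mixed-precision plan: give `high` bits to the most sensitive
    layers while the size-weighted average bit width stays within budget."""
    total = sum(sizes.values())
    bits = {name: low for name in sizes}
    used = low * total  # total bits spent with every layer at `low`
    # Visit layers from most to least sensitive
    for name in sorted(sizes, key=lambda k: sensitivity[k], reverse=True):
        extra = (high - low) * sizes[name]
        if used + extra <= avg_budget * total:
            bits[name] = high
            used += extra
    return bits

# Hypothetical sensitivity scores and parameter counts (not from the paper)
sens = {"attn.q": 0.9, "attn.k": 0.2, "mlp.up": 0.7, "mlp.down": 0.1}
sizes = {"attn.q": 100, "attn.k": 100, "mlp.up": 300, "mlp.down": 300}
plan = allocate_bits(sens, sizes, avg_budget=3.0)
print(plan)
```

The two most sensitive layers end up at 4 bits and the rest stay at 2 bits, for an average of exactly 3 bits per weight.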

🔹 Publication Date: Published on Dec 4, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04746
• PDF: https://arxiv.org/pdf/2512.04746

#LLMs #Quantization #DeepLearning #AI #MachineLearning
SQ-format: A Unified Sparse-Quantized Hardware-friendly Data Format for LLMs

📝 Summary:
The SQ-format is a unified sparse-quantized data format for LLM post-training quantization. It improves accuracy and efficiency balance by combining sparse and low-precision matrix multiplications. This enables better performance and throughput, especially for outlier activations, supporting next...
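A toy sketch of the general idea of pairing a sparse high-precision path with a dense low-precision path. This is not the SQ-format layout itself; the function names, the top-k outlier rule, and the 4-bit symmetric quantizer are assumptions chosen to keep the example small.

```python
def split_outliers(x, k):
    """Keep the k largest-magnitude entries exactly as {index: value};
    return them plus the remaining vector with those entries zeroed."""
    top = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k]
    sparse = {i: x[i] for i in top}
    dense = [0.0 if i in sparse else v for i, v in enumerate(x)]
    return sparse, dense

def quantize_int(v, bits=4):
    """Symmetric per-vector quantization to signed integers; returns (ints, scale)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(e) for e in v) / qmax or 1.0  # fall back to 1.0 for all-zero input
    return [round(e / scale) for e in v], scale

def sq_dot(x, w, k=1, bits=4):
    """Dot product computed as low-precision dense part + exact sparse part."""
    sparse, dense = split_outliers(x, k)
    q, scale = quantize_int(dense, bits)
    low = scale * sum(qi * wi for qi, wi in zip(q, w))
    high = sum(v * w[i] for i, v in sparse.items())
    return low + high

# Hypothetical activation vector with one large outlier
x = [0.5, -0.25, 40.0, 0.75]
w = [1.0, 2.0, 0.5, -1.0]
exact = sum(a * b for a, b in zip(x, w))
err_with_split = abs(sq_dot(x, w, k=1) - exact)
err_without = abs(sq_dot(x, w, k=0) - exact)
print(err_with_split, err_without)
```

Pulling the single outlier into the sparse path keeps the quantization scale small for everything else, so the error drops compared with quantizing the whole vector at once.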

🔹 Publication Date: Published on Dec 5, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05409
• PDF: https://arxiv.org/pdf/2512.05409

#LLMs #Quantization #SparseML #HardwareAcceleration #AIResearch
Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i}

📝 Summary:
Fairy2i converts pre-trained real-valued LLMs to a complex form, enabling efficient low-bit quantization while reusing existing checkpoints. It achieves near full-precision performance for LLaMA-2 7B at 2-bit, significantly outperforming real-valued binary methods.
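The set {±1, ±i} has four elements, so each weight fits in 2 bits. A minimal sketch of projecting a complex weight onto that codebook by phase is shown below; the paper's real-to-complex conversion and its scaling scheme are not described in the summary, so this covers only the codebook lookup and the function name is hypothetical.

```python
import cmath
import math

CODEBOOK = (1, 1j, -1, -1j)  # the four fourth roots of unity

def nearest_unit(w):
    """Map a complex weight to the closest of {+1, +i, -1, -i} by its phase."""
    # Phase lies in (-pi, pi]; each codebook entry owns a quarter-turn sector
    k = round(cmath.phase(w) / (math.pi / 2)) % 4
    return CODEBOOK[k]

examples = [0.9 + 0.1j, -0.2 + 0.8j, -1 - 0.1j, 0.1 - 0.7j]
print([nearest_unit(w) for w in examples])
print("bits per weight:", math.log2(len(CODEBOOK)))
```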

🔹 Publication Date: Published on Dec 2, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02901
• PDF: https://arxiv.org/pdf/2512.02901
• Github: https://github.com/PKULab1806/Fairy2i-W2

🔹 Models citing this paper:
https://huggingface.co/PKU-DS-LAB/Fairy2i-W2

#LLM #Quantization #ModelCompression #DeepLearning #AIResearch
BitNet b1.58 2B4T Technical Report

📝 Summary:
BitNet b1.58 2B4T is the first open-source, native 1-bit Large Language Model at the 2-billion-parameter scale, trained on 4 trillion tokens. It performs on par with leading open-weight, full-precision LLMs of similar size while substantially reducing memory footprint, energy consumption, and decoding latency. The model weights are openly released for research.
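The "1.58" in the name comes from ternary weights: three possible values carry log2(3) ≈ 1.58 bits. Below is a minimal sketch of absmean ternary quantization as described for BitNet b1.58 (scale by the mean absolute weight, then round and clip to {-1, 0, 1}). The sample weights are made up, and this per-list version is a simplification of what a real kernel would do per weight matrix.

```python
import math

def absmean_ternary(weights, eps=1e-8):
    """Scale by the mean absolute value, then round and clip to {-1, 0, 1}."""
    gamma = sum(abs(w) for w in weights) / len(weights)
    q = [max(-1, min(1, round(w / (gamma + eps)))) for w in weights]
    return q, gamma

# Hypothetical weights, for illustration only
q, gamma = absmean_ternary([0.8, -0.05, -1.2, 0.3])
bits_per_weight = math.log2(3)
print(q, round(gamma, 4), round(bits_per_weight, 2))
```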

🔹 Publication Date: Published on Apr 16, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.12285
• PDF: https://arxiv.org/pdf/2504.12285
• Github: https://github.com/microsoft/BitNet

🔹 Models citing this paper:
https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf
https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16

🔹 Spaces citing this paper:
https://huggingface.co/spaces/suayptalha/Chat-with-Bitnet-b1.58-2B-4T
https://huggingface.co/spaces/aizip-dev/SLM-RAG-Arena
https://huggingface.co/spaces/Tonic/Native_1-bit_LLM

#LLM #AI #Quantization #OpenSourceAI #DeepLearning
BitNet Distillation

📝 Summary:
BitNet Distillation fine-tunes LLMs to 1.58-bit precision using SubLN, attention distillation, and continual pre-training. It achieves performance comparable to full-precision models, offering up to 10x memory savings and 2.65x faster inference on CPUs.
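For readers unfamiliar with attention distillation, here is a generic sketch: the student is penalized when its attention distributions drift from the teacher's, measured by KL divergence. The paper's exact loss is not given in the summary, so treat the function names and the row-wise KL formulation below as assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over one row of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def kl_div(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def attention_distill_loss(teacher_scores, student_scores):
    """Mean KL(teacher || student) over query rows of raw attention scores."""
    rows = zip(teacher_scores, student_scores)
    losses = [kl_div(softmax(t), softmax(s)) for t, s in rows]
    return sum(losses) / len(losses)

# Hypothetical 2 x 3 attention score matrices (2 queries, 3 keys)
teacher = [[2.0, 0.5, -1.0], [0.0, 1.0, 0.0]]
student = [[1.0, 1.0, -1.0], [0.5, 0.5, 0.0]]
print(attention_distill_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's attention exactly and grows as the two distributions diverge.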

🔹 Publication Date: Published on Oct 15, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.13998
• PDF: https://arxiv.org/pdf/2510.13998
• Github: https://github.com/microsoft/BitNet

#LLM #Quantization #ModelCompression #DeepLearning #AI