ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models

📝 Summary:
QTSplus is a query-aware token selector for long-video multimodal language models. It dynamically selects the most important visual tokens based on a text query, significantly compressing vision data and reducing latency. This method maintains overall accuracy and enhances temporal understanding ...
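To make the idea of query-aware token selection concrete, here is a minimal sketch. It is not the QTSplus algorithm: the summary above does not describe how tokens are scored, so this example assumes plain cosine similarity between each visual token and a single query embedding, followed by a fixed keep ratio. The names `select_tokens`, `cosine`, and `keep_ratio` are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_tokens(visual_tokens, query_embedding, keep_ratio=0.25):
    """Return indices of the visual tokens most similar to the query.

    Assumption (not from the paper): relevance is cosine similarity and the
    budget is a fixed fraction of the input. Indices are returned in their
    original order so the temporal sequence of the video is preserved.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    budget = max(1, math.ceil(len(visual_tokens) * keep_ratio))
    scores = [cosine(token, query_embedding) for token in visual_tokens]
    # Rank token indices from most to least relevant to the query.
    ranked = sorted(range(len(visual_tokens)), key=lambda i: scores[i], reverse=True)
    # Keep the top-scoring tokens, then restore temporal order.
    return sorted(ranked[:budget])


# Toy example: four 2-D "visual tokens" and a query pointing along the first axis.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
query = [1.0, 0.0]
kept = select_tokens(tokens, query, keep_ratio=0.5)
print(kept)
```

Only the kept tokens would then be passed to the language model, which is where the compression and latency savings mentioned in the summary would come from.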

🔹 Publication Date: Nov 14

🔹 Paper Links:
• Hugging Face Collection: https://huggingface.co/collections/AlpachinoNLP/qtsplus
• PDF: https://arxiv.org/pdf/2511.11910
• Project Page: https://qtsplus.github.io/
• Github: https://github.com/Siyou-Li/QTSplus

🔹 Models citing this paper:
https://huggingface.co/AlpachinoNLP/QTSplus-3B
https://huggingface.co/AlpachinoNLP/QTSplus-3B-FT

🔹 Spaces citing this paper:
https://huggingface.co/spaces/AlpachinoNLP/QTSplus-3B

==================================

For more data science resources:
https://t.me/DataScienceT

#MultimodalAI #VideoAI #LLM #Tokenization #ComputerVision