✨TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models
📝 Summary:
TabTune is a unified library that standardizes the workflow for tabular foundation models. It provides consistent access to state-of-the-art models, diverse adaptation strategies, and integrated evaluation for performance, calibration, and fairness.
🔹 Publication Date: Published on Nov 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02802
• PDF: https://arxiv.org/pdf/2511.02802
• Github: https://github.com/Lexsi-Labs/TabTune
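The workflow the summary describes (pick a model, pick an adaptation strategy, get a unified evaluation report) can be sketched as below. This is a minimal illustrative sketch only: the class and method names are hypothetical stand-ins, not TabTune's documented API, so check the GitHub repo above for the real interface.

```python
# Hypothetical sketch of a unified tabular-FM workflow.
# Names are illustrative placeholders, NOT TabTune's actual API.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.calibration import calibration_curve

class TabularFMPipeline:
    """Stand-in wrapper bundling model choice, an adaptation strategy,
    and evaluation of performance plus calibration in one object."""

    def __init__(self, model_name: str, adaptation: str = "zero_shot"):
        self.model_name = model_name              # e.g. a tabular foundation model id
        self.adaptation = adaptation              # e.g. "zero_shot", "finetune", "peft"
        self._model = LogisticRegression(max_iter=1000)  # placeholder learner

    def fit(self, X, y):
        # A real library would dispatch to in-context inference, full
        # fine-tuning, or parameter-efficient tuning of the chosen model here.
        self._model.fit(X, y)
        return self

    def evaluate(self, X, y) -> dict:
        proba = self._model.predict_proba(X)[:, 1]
        preds = (proba >= 0.5).astype(int)
        frac_pos, mean_pred = calibration_curve(y, proba, n_bins=10)
        ece = float(np.mean(np.abs(frac_pos - mean_pred)))  # crude, unweighted calibration error
        return {"accuracy": accuracy_score(y, preds), "ece": ece}

# Usage with any binary tabular dataset (X: features, y: 0/1 labels):
# report = TabularFMPipeline("tabpfn-like-model").fit(X_train, y_train).evaluate(X_test, y_test)
```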
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#TabularData #FoundationModels #MachineLearning #DataScience #AIResearch
✨DINOv3
📝 Summary:
DINOv3 is a self-supervised vision model excelling across tasks. It scales dataset and model size, prevents dense feature degradation via Gram anchoring, and uses post-hoc strategies for flexibility. This versatile foundation model outperforms specialized state-of-the-art models without fine-tuning.
🔹 Publication Date: Published on Aug 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10104
• Hugging Face Collection: https://huggingface.co/collections/facebook/dinov3
• PDF: https://arxiv.org/pdf/2508.10104
• Project Page: https://ai.meta.com/blog/dinov3-self-supervised-vision-model/
• Github: https://github.com/facebookresearch/dinov3
🔹 Models citing this paper:
• https://huggingface.co/facebook/dinov3-vit7b16-pretrain-lvd1689m
• https://huggingface.co/facebook/dinov3-vitb16-pretrain-lvd1689m
• https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m
✨ Datasets citing this paper:
• https://huggingface.co/datasets/zhuangzhe1229/test_dataset
• https://huggingface.co/datasets/simon123905/vitl
✨ Spaces citing this paper:
• https://huggingface.co/spaces/atalaydenknalbant/DINOv3
• https://huggingface.co/spaces/manu02/DINOv3-Interactive-Patch-Cosine-Similarity
• https://huggingface.co/spaces/merve/dinov3-viz
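The checkpoints listed above live on the Hugging Face Hub, so dense features can be pulled roughly as in the sketch below. This assumes a recent transformers release with DINOv3 support; otherwise the official GitHub repo provides loading code.

```python
# Sketch: extract DINOv3 features via the Hugging Face Hub.
# Assumes a transformers version with DINOv3 support; model id from the links above.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

model_id = "facebook/dinov3-vitb16-pretrain-lvd1689m"
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Per-patch (dense) features for segmentation/correspondence probes; the model
# may also expose a pooled global embedding for retrieval or linear heads.
patch_features = outputs.last_hidden_state   # (1, num_tokens, hidden_dim)
print(patch_features.shape)
```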
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#DINOv3 #SelfSupervisedLearning #ComputerVision #FoundationModels #AI
✨OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation
📝 Summary:
OlmoEarth is a novel multimodal spatio-temporal foundation model for Earth observation data. It employs new self-supervised learning methods to achieve state-of-the-art performance on many tasks. It is deployed as a platform for non-profits and NGOs.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13655
• PDF: https://arxiv.org/pdf/2511.13655
• Project Page: https://olmoearth.allenai.org/
• Github: https://github.com/allenai/olmoearth_pretrain
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#EarthObservation #FoundationModels #AI #RemoteSensing #SelfSupervisedLearning
✨Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
📝 Summary:
Kandinsky 5.0 is a family of state-of-the-art foundation models for high-resolution image and video generation. It includes Lite and Pro versions at different parameter scales and uses advanced training techniques for superior quality and speed. This publicly available framework aims to advance generative modeling research.
🔹 Publication Date: Published on Nov 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14993
• PDF: https://arxiv.org/pdf/2511.14993
• Project Page: https://kandinskylab.ai/
• Github: https://github.com/kandinskylab/kandinsky-5
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#FoundationModels #ImageGeneration #VideoGeneration #AI #DeepLearning
✨Medal S: Spatio-Textual Prompt Model for Medical Segmentation
📝 Summary:
Medal S is a medical segmentation foundation model using spatio-textual prompts for efficient, high-accuracy multi-class segmentation across diverse modalities. It uniquely aligns volumetric prompts with text embeddings and processes masks in parallel, significantly outperforming prior methods.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13001
• PDF: https://arxiv.org/pdf/2511.13001
• Github: https://github.com/yinghemedical/Medal-S
🔹 Models citing this paper:
• https://huggingface.co/spc819/Medal-S-V1.0
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#MedicalSegmentation #FoundationModels #AI #DeepLearning #ComputerVision
✨Scaling Spatial Intelligence with Multimodal Foundation Models
📝 Summary:
SenseNova-SI is a scaled multimodal foundation model that achieves superior spatial intelligence. Trained on 8 million diverse data samples, it delivers unprecedented performance on various spatial benchmarks. The models are publicly released to foster further research.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13719
• PDF: https://arxiv.org/pdf/2511.13719
• Project Page: https://huggingface.co/sensenova/SenseNova-SI-1.1-InternVL3-8B
• Github: https://github.com/OpenSenseNova/SenseNova-SI
🔹 Models citing this paper:
• https://huggingface.co/sensenova/SenseNova-SI-InternVL3-8B
• https://huggingface.co/sensenova/SenseNova-SI-InternVL3-2B
• https://huggingface.co/sensenova/SenseNova-SI-1.1-InternVL3-2B
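The released checkpoints are InternVL3 fine-tunes, so loading presumably follows the InternVL convention on the Hub. The snippet below is a sketch under that assumption; the trust_remote_code path and chat interface mirror InternVL3-style usage and may differ for these specific repos.

```python
# Sketch: loading a SenseNova-SI checkpoint, assuming it follows the
# InternVL3 remote-code convention on the Hugging Face Hub (unverified here).
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "sensenova/SenseNova-SI-1.1-InternVL3-2B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval()

# InternVL-style chat call with an image and a spatial-reasoning question;
# `load_image` and `chat` come from the repo's remote code in InternVL3-style models.
# pixel_values = load_image("scene.jpg").to(torch.bfloat16)
# question = "<image>\nWhich object is closer to the camera, the chair or the lamp?"
# print(model.chat(tokenizer, pixel_values, question,
#                  generation_config=dict(max_new_tokens=64)))
```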
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#MultimodalAI #FoundationModels #SpatialIntelligence #ComputerVision #AI
✨MiMo-Embodied: X-Embodied Foundation Model Technical Report
📝 Summary:
MiMo-Embodied is the first cross-embodied foundation model. It achieves state-of-the-art performance in both autonomous driving and embodied AI, demonstrating positive transfer through multi-stage learning and fine-tuning.
🔹 Publication Date: Published on Nov 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16518
• PDF: https://arxiv.org/pdf/2511.16518
• Github: https://github.com/XiaomiMiMo/MiMo-Embodied
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#FoundationModels #EmbodiedAI #AutonomousDriving #AI #Robotics
✨SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking
📝 Summary:
SAM2S is a foundation model enhancing interactive video object segmentation in surgery. It leverages a new large benchmark, robust memory, and temporal learning to achieve superior accuracy (80.42 J&F) and real-time performance in surgical video analysis.
🔹 Publication Date: Published on Nov 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16618
• PDF: https://arxiv.org/pdf/2511.16618
• Project Page: https://jinlab-imvr.github.io/SAM2S
• Github: https://github.com/jinlab-imvr/SAM2S
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#SurgicalAI #MedicalImaging #ComputerVision #FoundationModels #DeepLearning
✨Pillar-0: A New Frontier for Radiology Foundation Models
📝 Summary:
Pillar-0 is a new radiology foundation model pretrained on diverse CT/MRI scans, utilizing RATE for scalable label extraction. It significantly outperforms existing models across various radiology tasks and extends to new applications like lung cancer risk prediction and brain hemorrhage detection.
🔹 Publication Date: Published on Nov 21
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17803
• PDF: https://arxiv.org/pdf/2511.17803
• Github: https://github.com/YalaLab/rate-evals
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#Radiology #FoundationModels #AI #MedicalImaging #MachineLearning
✨Language Model Council: Benchmarking Foundation Models on Highly Subjective Tasks by Consensus
📝 Summary:
Benchmarking LLMs on subjective tasks like emotional intelligence is challenging. The Language Model Council (LMC) uses a democratic process with 20 LLMs to formulate, administer, and evaluate tests. This yields more robust, less biased rankings that align better with human leaderboards.
🔹 Publication Date: Published on Jun 12, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2406.08598
• PDF: https://arxiv.org/pdf/2406.08598
• Github: https://github.com/llm-council/llm-council
✨ Datasets citing this paper:
• https://huggingface.co/datasets/llm-council/emotional_application
✨ Spaces citing this paper:
• https://huggingface.co/spaces/llm-council/llm-council
• https://huggingface.co/spaces/llm-council/sandbox
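The council's consensus step (every judge model ranks every candidate, and per-judge ranks are aggregated into one leaderboard) can be illustrated with a small routine like the one below. It shows the general idea only, not the paper's exact scoring or aggregation rule.

```python
# Sketch of council-style consensus ranking: each judge LLM scores every
# candidate, scores become per-judge ranks, and mean rank decides the order.
# Generic illustration only, not the paper's exact aggregation rule.
from collections import defaultdict

def council_rank(judge_scores: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """judge_scores[judge][candidate] = score given by `judge` to `candidate`.
    Returns candidates sorted by mean rank (lower is better)."""
    rank_sums = defaultdict(float)
    for judge, scores in judge_scores.items():
        ordered = sorted(scores, key=scores.get, reverse=True)  # this judge's ranking
        for rank, candidate in enumerate(ordered, start=1):
            rank_sums[candidate] += rank
    mean_ranks = {c: s / len(judge_scores) for c, s in rank_sums.items()}
    return sorted(mean_ranks.items(), key=lambda kv: kv[1])

# Toy example: three judge LLMs scoring two candidates on a subjective task.
scores = {
    "judge_a": {"model_x": 8.0, "model_y": 6.5},
    "judge_b": {"model_x": 7.0, "model_y": 7.5},
    "judge_c": {"model_x": 9.0, "model_y": 5.0},
}
print(council_rank(scores))  # model_x wins with the lower mean rank
```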
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#LLM #Benchmarking #AIEvaluation #FoundationModels #ConsensusAI
✨MedSAM3: Delving into Segment Anything with Medical Concepts
📝 Summary:
MedSAM-3 is a text-promptable medical segmentation model fine-tuned from SAM 3 using semantic concept labels. It enables precise, open-vocabulary text-based segmentation of anatomical structures and integrates MLLMs for advanced reasoning. This approach significantly outperforms existing models.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19046
• PDF: https://arxiv.org/pdf/2511.19046
• Github: https://github.com/Joey-S-Liu/MedSAM3
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#MedicalAI #ImageSegmentation #DeepLearning #MLLMs #FoundationModels
✨Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
📝 Summary:
Z-Image is an efficient 6B-parameter diffusion transformer achieving state-of-the-art image generation with significantly reduced computational cost. It enables sub-second inference and consumer hardware compatibility, challenging the scale-at-all-costs paradigm.
🔹 Publication Date: Published on Nov 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.22699
• PDF: https://arxiv.org/pdf/2511.22699
• Project Page: https://tongyi-mai.github.io/Z-Image-blog/
• Github: https://github.com/Tongyi-MAI/Z-Image
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#ImageGeneration #DiffusionModels #EfficientAI #FoundationModels #MachineLearning
✨From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence
📝 Summary:
This paper provides a practical guide to code LLMs, covering their lifecycle from data to deployment. It examines techniques, analyzes various models, and discusses real-world challenges like correctness and security. Experiments on pre-training and fine-tuning are included.
🔹 Publication Date: Published on Nov 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18538
• PDF: https://arxiv.org/pdf/2511.18538
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#CodeLLMs #AI #MachineLearning #SoftwareEngineering #FoundationModels
✨OmniFusion: Simultaneous Multilingual Multimodal Translations via Modular Fusion
📝 Summary:
OmniFusion is a multimodal translation system integrating pretrained foundation models with LLMs via a novel fusion strategy. It enables simultaneous multilingual translation using audio and visual inputs, reducing latency and improving quality over cascaded systems.
🔹 Publication Date: Published on Nov 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.00234
• PDF: https://arxiv.org/pdf/2512.00234
• Github: https://github.com/saikoneru/OmniFusion
🔹 Models citing this paper:
• https://huggingface.co/skoneru/OmniFusion
• https://huggingface.co/skoneru/OmniFusion_v2
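The modular-fusion idea, projecting pretrained audio and visual encoder outputs into the LLM's embedding space so translation conditions on both streams, is sketched generically below. Dimensions and the single linear projection per modality are illustrative assumptions, not OmniFusion's actual architecture.

```python
# Generic modular-fusion sketch: project encoder features into the LLM embedding
# space and prepend them to the text embeddings. Sizes are illustrative only.
import torch
import torch.nn as nn

class ModalityFusion(nn.Module):
    def __init__(self, audio_dim=1024, visual_dim=768, llm_dim=4096):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, llm_dim)    # maps audio encoder states
        self.visual_proj = nn.Linear(visual_dim, llm_dim)  # maps visual encoder states

    def forward(self, audio_feats, visual_feats, text_embeds):
        # audio_feats: (B, Ta, audio_dim), visual_feats: (B, Tv, visual_dim),
        # text_embeds: (B, Tt, llm_dim) from the LLM's own embedding layer.
        prefix = torch.cat(
            [self.audio_proj(audio_feats), self.visual_proj(visual_feats)], dim=1
        )
        # The LLM then attends over [audio | visual | text] as one sequence.
        return torch.cat([prefix, text_embeds], dim=1)

fusion = ModalityFusion()
out = fusion(torch.randn(1, 50, 1024), torch.randn(1, 16, 768), torch.randn(1, 12, 4096))
print(out.shape)  # torch.Size([1, 78, 4096])
```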
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#MultimodalAI #LLMs #MachineTranslation #FoundationModels #AIResearch
✨LFM2 Technical Report
📝 Summary:
LFM2 is a family of compact foundation models designed for efficient on-device deployment. It uses hardware-in-the-loop architecture search and advanced training to achieve high performance across diverse tasks, including multimodal applications.
🔹 Publication Date: Published on Nov 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.23404
• PDF: https://arxiv.org/pdf/2511.23404
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#FoundationModels #EdgeAI #MultimodalAI #AIResearch #MachineLearning
✨Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation
📝 Summary:
Echo-4o-Image is a 180K-sample synthetic dataset generated with GPT-4o. It enhances image generation by covering rare scenarios and providing clean text-to-image supervision. This improves model performance and transferability across various foundation models.
🔹 Publication Date: Published on Aug 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.09987
• PDF: https://arxiv.org/pdf/2508.09987
• Project Page: https://yejy53.github.io/Echo-4o/
• Github: https://yejy53.github.io/Echo-4o
✨ Datasets citing this paper:
• https://huggingface.co/datasets/Yejy53/Echo-4o-Image
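Since the dataset above is hosted on the Hub, it can presumably be loaded with the datasets library as below; split and column names are not given in this post, so inspect the object before relying on specific fields.

```python
# Sketch: loading Echo-4o-Image from the Hugging Face Hub.
# Split/column names are not documented in this post, so inspect `ds` first.
from datasets import load_dataset

ds = load_dataset("Yejy53/Echo-4o-Image")  # dataset id from the link above
print(ds)                                  # shows available splits and columns

# first_split = next(iter(ds.values()))
# print(first_split[0])                    # e.g. a prompt and a synthetic image
```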
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#ImageGeneration #GPT4o #SyntheticData #AIResearch #FoundationModels
✨The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation
📝 Summary:
This paper highlights the gap between SAM2 and SAM3. SAM2 uses spatial prompts for geometric segmentation, but SAM3 is a concept-driven multimodal model with a unified vision-language architecture. SAM3 represents a new class of foundation model for concept-driven segmentation.
🔹 Publication Date: Published on Dec 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06032
• PDF: https://arxiv.org/pdf/2512.06032
• Github: https://github.com/Applied-AI-Research-Lab/The-SAM2-to-SAM3-Gap-in-the-Segment-Anything-Model-Family
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#ImageSegmentation #FoundationModels #ComputerVision #MultimodalAI #AIResearch
✨SAM Audio: Segment Anything in Audio
📝 Summary:
SAM Audio is a foundation model for general audio separation. It unifies text, visual, and temporal-span prompts, achieving state-of-the-art performance across diverse audio types. It also introduces a new real-world separation benchmark.
🔹 Publication Date: Published on Dec 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.18099
• PDF: https://arxiv.org/pdf/2512.18099
• Project Page: https://ai.meta.com/samaudio/
• Github: https://github.com/facebookresearch/sam-audio
🔹 Models citing this paper:
• https://huggingface.co/facebook/sam-audio-large
• https://huggingface.co/facebook/sam-audio-small
• https://huggingface.co/facebook/sam-audio-base
✨ Spaces citing this paper:
• https://huggingface.co/spaces/lpeterl/sam-audio-webui
• https://huggingface.co/spaces/Arrcttacsrks/SAM-Audio-Demo
• https://huggingface.co/spaces/chippie1/SAM-Audio-Demo
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AudioSeparation #FoundationModels #AI #DeepLearning #SAMAudio
✨A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
📝 Summary:
This survey reviews self-evolving AI agents that adapt to dynamic environments via automatic enhancement from interaction data. It proposes a unified framework and systematically reviews current techniques, addressing evaluation, safety, and ethics.
🔹 Publication Date: Published on Aug 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.07407
• PDF: https://arxiv.org/pdf/2508.07407
• Project Page: https://huggingface.co/spaces/X-iZhang/Awesome-Self-Evolving-Agents
• Github: https://github.com/EvoAgentX/Awesome-Self-Evolving-Agents
✨ Spaces citing this paper:
• https://huggingface.co/spaces/X-iZhang/Awesome-Self-Evolving-Agents
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#SelfEvolvingAI #AIAgents #FoundationModels #LifelongLearning #ArtificialIntelligence
✨Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding
📝 Summary:
Omni-Weather is a new multimodal foundation model that unifies weather generation and understanding in a single architecture. It uses shared self-attention and a Chain-of-Thought dataset for interpretable, high-quality outputs, achieving state-of-the-art performance.
🔹 Publication Date: Published on Dec 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21643
• PDF: https://arxiv.org/pdf/2512.21643
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#WeatherGeneration #FoundationModels #MultimodalAI #AIResearch #DeepLearning