✨PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation
📝 Summary:
Recent advances in text-to-video (T2V) generation have achieved good visual quality, yet synthesizing videos that faithfully follow physical laws remains an open challenge. Existing methods mainly bas...
🔹 Publication Date: Published on Dec 31, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24551
• PDF: https://arxiv.org/pdf/2512.24551
• Project Page: https://caiyuanhao1998.github.io/project/PhyGDPO/
• Github: https://github.com/caiyuanhao1998/Open-PhyGDPO
==================================
For more data science resources:
✓ https://t.me/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Scaling Open-Ended Reasoning to Predict the Future
📝 Summary:
This work trains language models for open-ended future prediction using a new dataset synthesized from news. The resulting OpenForecaster 8B model matches larger proprietary models in accuracy, calibration, and consistency. All resources are open-sourced.
🔹 Publication Date: Published on Dec 31, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.25070
• PDF: https://arxiv.org/pdf/2512.25070
• Project Page: https://www.openforecaster.github.io
• Github: https://github.com/OpenForecaster/scaling-forecasting-training
==================================
#LLMs #FuturePrediction #AI #OpenSourceAI #MachineLearning
✨Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
📝 Summary:
This paper introduces RISE, an unsupervised framework using sparse auto-encoders to discover and control LLM reasoning behaviors. It identifies interpretable reasoning vectors like reflection and backtracking, enabling targeted interventions and discovery of novel behaviors without retraining.
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23988
• PDF: https://arxiv.org/pdf/2512.23988
==================================
#LLM #AI #MachineLearning #AIReasoning #Interpretability
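The "targeted interventions without retraining" in the RISE summary above can be pictured as activation steering: nudging a hidden state along a fixed direction at inference time. The sketch below is a minimal illustration under that assumption, not the paper's implementation. The names `steer` and `projection`, the toy vectors, and the strength `alpha` are all hypothetical; `direction` stands in for a vector that a sparse autoencoder might associate with a behavior such as reflection.

```python
import math
from typing import List

def unit(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def projection(hidden: List[float], direction: List[float]) -> float:
    """Component of `hidden` along the (normalized) direction."""
    return sum(h * d for h, d in zip(hidden, unit(direction)))

def steer(hidden: List[float], direction: List[float], alpha: float) -> List[float]:
    """Shift `hidden` by `alpha` along the direction; model weights stay untouched."""
    return [h + alpha * d for h, d in zip(hidden, unit(direction))]

# Toy values, purely illustrative (hypothetical, not taken from the paper).
hidden = [1.0, 2.0, 2.0]
direction = [0.0, 3.0, 4.0]

steered = steer(hidden, direction, alpha=2.0)
print(projection(hidden, direction), projection(steered, direction))
```

A positive `alpha` amplifies the behavior tied to the direction and a negative one suppresses it; the projection along the direction changes by exactly `alpha`, while components orthogonal to it are left alone.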
✨Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
📝 Summary:
The Agentic Learning Ecosystem (ALE) is new infrastructure that streamlines LLM agent development for real-world tasks. ALE comprises ROLL for optimization, ROCK for sandboxing, and iFlow CLI for context. The ROME agent, built with ALE, shows strong benchmark performance.
🔹 Publication Date: Published on Dec 31, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24873
• PDF: https://arxiv.org/pdf/2512.24873
==================================
#AIAgents #LLMDevelopment #AgenticLearning #AIArchitecture #MachineLearning
✨Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking
📝 Summary:
Complex reasoning problems often involve implicit spatial, geometric, and structural relationships that are not explicitly encoded in text. While recent reasoning models have achieved strong performan...
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24297
• PDF: https://arxiv.org/pdf/2512.24297
• Github: https://github.com/chenmeiqii/FIGR
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Pretraining Frame Preservation in Autoregressive Video Memory Compression
📝 Summary:
We present PFP, a neural network structure to compress long videos into short contexts, with an explicit pretraining objective to preserve the high-frequency details of single frames at arbitrary temp...
🔹 Publication Date: Published on Dec 29, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23851
• PDF: https://arxiv.org/pdf/2512.23851
• Github: https://github.com/lllyasviel/PFP
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Factorized Learning for Temporally Grounded Video-Language Models
📝 Summary:
Video-language models struggle with temporal grounding when grounding and textual response are handled as coupled tasks. The D^2VLM framework decouples grounding from textual response using evidence tokens. Factorized preference optimization explicitly optimizes temporal grounding for both tasks.
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24097
• PDF: https://arxiv.org/pdf/2512.24097
• Project Page: https://github.com/nusnlp/d2vlm
• Github: https://github.com/nusnlp/d2vlm
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
📝 Summary:
This paper presents JavisGPT, the first unified multimodal large language model (MLLM) for Joint Audio-Video (JAV) comprehension and generation. JavisGPT adopts a concise encoder-LLM-decoder architect...
🔹 Publication Date: Published on Dec 28, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22905
• PDF: https://arxiv.org/pdf/2512.22905
• Project Page: https://javisverse.github.io/JavisGPT-page/
• Github: https://github.com/JavisVerse/JavisGPT
🔹 Models citing this paper:
• https://huggingface.co/JavisVerse/JavisGPT-v0.1-7B-Instruct
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/JavisVerse/MM-PreTrain
• https://huggingface.co/datasets/JavisVerse/JavisUnd-Eval
• https://huggingface.co/datasets/JavisVerse/AV-FineTune
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems
📝 Summary:
The rapid advancement of autonomous systems, including self-driving vehicles and drones, has intensified the need to forge true Spatial Intelligence from multi-modal onboard sensor data. While foundat...
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24385
• PDF: https://arxiv.org/pdf/2512.24385
• Github: https://github.com/worldbench/awesome-spatial-intelligence
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Valori: A Deterministic Memory Substrate for AI Systems
📝 Summary:
Valori introduces a deterministic AI memory substrate using fixed-point arithmetic, ensuring bit-identical results across platforms. This eliminates non-determinism from floating-point operations in vector embeddings and search, making AI systems trustworthy and verifiable.
🔹 Publication Date: Published on Dec 25, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22280
• PDF: https://arxiv.org/pdf/2512.22280
• Project Page: https://valori.systems/
• Github: https://github.com/varshith-Git/Valori-Kernel
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
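The fixed-point idea in the Valori summary above can be illustrated with a small sketch. Floating-point addition is not associative, so the same dot product can change with accumulation order; integer arithmetic on quantized values does not have that problem. This is only an illustration of the general technique, not Valori's code: the Q16.16 layout and the names `to_fixed`, `fixed_dot`, and `float_dot` are assumptions made for the example.

```python
from typing import List

FRACTION_BITS = 16           # hypothetical Q16.16 layout, chosen for illustration
SCALE = 1 << FRACTION_BITS

def to_fixed(x: float) -> int:
    """Quantize a float to a fixed-point integer (round to nearest)."""
    return round(x * SCALE)

def fixed_dot(a: List[int], b: List[int]) -> int:
    """Dot product over fixed-point integers; the result carries scale SCALE**2."""
    return sum(x * y for x, y in zip(a, b))

def float_dot(a: List[float], b: List[float]) -> float:
    """Plain left-to-right floating-point dot product."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

query = [1.0, 1.0, 1.0]
doc = [0.1, 0.2, 0.3]

# Same vectors, two accumulation orders.
forward = float_dot(query, doc)
backward = float_dot(query[::-1], doc[::-1])
float_order_matters = forward != backward

q_fixed = [to_fixed(x) for x in query]
d_fixed = [to_fixed(x) for x in doc]
fixed_forward = fixed_dot(q_fixed, d_fixed)
fixed_backward = fixed_dot(q_fixed[::-1], d_fixed[::-1])

print("float results differ by order:", float_order_matters)
print("fixed results identical:", fixed_forward == fixed_backward)
```

The trade-off is quantization error: values are rounded once on the way in, and every later integer operation is exact and order-independent, which is what makes bit-identical similarity scores possible across platforms.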