🤖🧠 LongCat-Video: Meituan’s Groundbreaking Step Toward Efficient Long Video Generation with AI
🗓️ 04 Nov 2025
AI News & Trends
In the rapidly advancing field of generative AI, the ability to create realistic, coherent, and high-quality videos from text or images has become one of the most sought-after goals. Meituan, one of the leading technology innovators in China, has made a remarkable stride in this domain with its latest open-source model, LongCat-Video. Designed as ...
#LongCatVideo #Meituan #GenerativeAI #VideoGeneration #AIInnovation #OpenSource
✨olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models
Summary:
olmOCR is an open-source toolkit that uses a fine-tuned vision language model to convert PDFs into clean, structured text. It enables large-scale, cost-effective extraction of trillions of tokens for training language models.
🔹 Publication Date: Published on Feb 25, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.18443
• PDF: https://arxiv.org/pdf/2502.18443
• GitHub: https://github.com/allenai/olmocr
✨ Datasets citing this paper:
• https://huggingface.co/datasets/davanstrien/test-olmocr2
• https://huggingface.co/datasets/davanstrien/newspapers-olmocr2
• https://huggingface.co/datasets/stckmn/ocr-output-Directive017-1761355297
==================================
For more data science resources:
https://t.me/DataScienceT
#OCR #VLMs #LLM #DataExtraction #OpenSource
✨MinerU: An Open-Source Solution for Precise Document Content Extraction
Summary:
MinerU is an open-source tool that provides high-precision document content extraction. It uses fine-tuned models and pre/postprocessing rules to consistently achieve high performance across diverse document types.
🔹 Publication Date: Published on Sep 27, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2409.18839
• PDF: https://arxiv.org/pdf/2409.18839
• GitHub: https://github.com/opendatalab/MinerU
✨ Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test
• https://huggingface.co/spaces/Echo9k/PDF_reader
#DocumentExtraction #OpenSource #DataScience #NLP #AI
🤖🧠 Krea Realtime 14B: Redefining Real-Time Video Generation with AI
🗓️ 05 Nov 2025
AI News & Trends
The field of artificial intelligence is undergoing a remarkable transformation, and one of the most exciting developments is the rise of real-time video generation. From cinematic visual effects to immersive virtual environments, AI is rapidly blurring the boundaries between imagination and reality. At the forefront of this innovation stands Krea Realtime 14B, an advanced open-source ...
#AI #RealTimeVideo #ArtificialIntelligence #OpenSource #VideoGeneration #KreaRealtime14B
🤖🧠 FIBO: The First JSON-Native, Open-Source Text-to-Image Model Built for Real-World Control and Accuracy
🗓️ 07 Nov 2025
AI News & Trends
The world of generative AI has evolved rapidly with text-to-image tools enabling creators, marketers, designers and enterprises to bring ideas to life with unprecedented ease. However, most existing models have a clear limitation: they prioritize imagination at the cost of control. Whether producing inconsistent styles, unpredictable lighting or drifting away from user prompts, traditional models ...
#FIBO #TextToImage #GenerativeAI #OpenSource #JSONNative #RealWorldControl
✨OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Summary:
OmniVinci is an open-source omni-modal LLM that improves cross-modal understanding for audio, vision, and robotics. It features innovative architecture for better embedding alignment and temporal capture, along with efficient data curation. OmniVinci outperforms competitors while using significan...
🔹 Publication Date: Published on Oct 17, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.15870
• PDF: https://arxiv.org/pdf/2510.15870
• Explainer: https://arxivexplained.com/papers/omnivinci-enhancing-architecture-and-data-for-omni-modal-understanding-llm
• Project Page: https://nvlabs.github.io/OmniVinci/
• GitHub: https://github.com/NVlabs/OmniVinci
🔹 Models citing this paper:
• https://huggingface.co/nvidia/omnivinci
#LLM #MultimodalAI #Robotics #DeepLearning #OpenSource
🤖🧠 Meilisearch: The Lightning-Fast, AI-Ready Search Engine for Modern Applications
🗓️ 08 Nov 2025
AI News & Trends
Search is no longer a luxury feature. Today’s users expect instant, relevant results across e-commerce platforms, SaaS tools, media libraries and knowledge systems. With AI-powered experiences becoming the new standard, developers need search infrastructure that is fast, flexible, developer-friendly and ready for hybrid semantic search. This is where Meilisearch stands out. Meilisearch is an open-source, ...
#Meilisearch #AIReadySearch #LightningFast #SearchEngine #ModernApplications #OpenSource
✨UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist
Summary:
UniVA is an open-source multi-agent framework that unifies video understanding, segmentation, editing, and generation. It uses a Plan-and-Act architecture with hierarchical memory to enable complex, iterative video workflows. This system aims to advance agentic video intelligence.
🔹 Publication Date: Published on Nov 11, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.08521
• PDF: https://arxiv.org/pdf/2511.08521
• Project Page: https://univa.online/
• GitHub: https://github.com/univa-agent/univa
#VideoAI #AIagents #GenerativeAI #ComputerVision #OpenSource
🤖🧠 Steel Browser: The Open-Source Browser API Powering AI Agents and Automation
🗓️ 16 Nov 2025
AI News & Trends
The evolution of artificial intelligence has ushered in a new era of automation where AI agents can perform complex digital tasks with minimal human intervention. However, one of the biggest challenges for developers building these systems is browser automation: managing sessions, proxies, cookies and debugging environments. This is where Steel Browser comes into play. Steel ...
#SteelBrowser #OpenSource #BrowserAutomation #AIAgents #WebScraping #DigitalAutomation
🤖🧠 Skyvern: The Future of Browser Automation Powered by AI and Computer Vision
🗓️ 16 Nov 2025
AI News & Trends
In today’s fast-evolving digital landscape, automation plays a crucial role in enhancing productivity, efficiency and innovation. Yet, traditional browser automation tools often struggle with complexity, maintenance and reliability. They rely heavily on DOM parsing, XPaths and rigid scripts that easily break when websites change their layout. Enter Skyvern, an open-source, AI-driven browser automation platform developed ...
#Skyvern #BrowserAutomation #AIDriven #ComputerVision #OpenSource #WebAutomation
✨P1: Mastering Physics Olympiads with Reinforcement Learning
Summary:
P1 is a family of open-source physics reasoning models trained via reinforcement learning. P1-235B-A22B achieved gold-medal performance at IPhO 2025 and won 12 other competitions. These models also show strong generalizability on other reasoning tasks.
🔹 Publication Date: Published on Nov 17, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13612
• PDF: https://arxiv.org/pdf/2511.13612
• Project Page: https://prime-rl.github.io/P1/
• GitHub: https://github.com/PRIME-RL/P1
#ReinforcementLearning #Physics #AI #MachineLearning #OpenSource
✨Instella: Fully Open Language Models with Stellar Performance
Summary:
Instella is a family of fully open language models trained on open data. It achieves state-of-the-art results among fully open models and competes with leading open-weight LLMs. Specialized variants for long context and math reasoning are also offered.
🔹 Publication Date: Published on Nov 13, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10628
• PDF: https://arxiv.org/pdf/2511.10628
• GitHub: https://github.com/AMD-AGI/Instella
🔹 Models citing this paper:
• https://huggingface.co/amd/AMD-OLMo
• https://huggingface.co/amd/Instella-3B-Instruct
• https://huggingface.co/amd/Instella-3B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/amd/Instella-Long
• https://huggingface.co/datasets/amd/Instella-GSM8K-synthetic
✨ Spaces citing this paper:
• https://huggingface.co/spaces/DexterSptizu/AMD-OLMo-1B
• https://huggingface.co/spaces/universeofml/DeepFocusTrain
#LLMs #OpenSource #AI #MachineLearning #NLP
✨OpenUS: A Fully Open-Source Foundation Model for Ultrasound Image Analysis via Self-Adaptive Masked Contrastive Learning
Summary:
OpenUS is an open-source ultrasound foundation model built on a large public dataset. It uses a vision Mamba backbone and a novel self-adaptive masking framework to enhance pre-training, enabling label-efficient fine-tuning for various ultrasound tasks.
🔹 Publication Date: Published on Nov 14, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11510
• PDF: https://arxiv.org/pdf/2511.11510
• GitHub: https://github.com/XZheng0427/OpenUS
#OpenSource #FoundationModel #UltrasoundAI #MachineLearning #MedicalImaging
✨HunyuanVideo 1.5 Technical Report
Summary:
HunyuanVideo 1.5 is a lightweight, open-source video generation model achieving state-of-the-art visual quality and motion coherence. It employs an advanced DiT architecture with SSTA and an efficient video super-resolution network, enabling high-quality video creation on consumer GPUs.
🔹 Publication Date: Published on Nov 24, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18870
• PDF: https://arxiv.org/pdf/2511.18870
• GitHub: https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5
#VideoGeneration #AI #DeepLearning #OpenSource #DiffusionModels
✨GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
Summary:
GigaEvo is an open-source framework for LLM-guided evolutionary computation, providing modular tools for complex optimization. It enhances reproducibility of AlphaEvolve-inspired methods with detailed implementations, validated on challenging problems like Heilbronn triangle placement.
🔹 Publication Date: Published on Nov 17, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17592
• PDF: https://arxiv.org/pdf/2511.17592
• Project Page: https://airi-institute.github.io/gigaevo-cover/
• GitHub: https://github.com/FusionBrainLab/gigaevo-core
#LLM #EvolutionaryAlgorithms #Optimization #OpenSource #AI
✨SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications
Summary:
SWE-SQL introduces BIRD-CRITIC, a new benchmark for SQL issue debugging, and Six-Gym, a training environment using f-Plan Boosting. The authors' open-source Bird-Fixer agent surpasses proprietary LLMs like GPT-4.1 in performance, democratizing advanced SQL-debugging capabilities.
🔹 Publication Date: Published on Jun 23, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.18951
• PDF: https://arxiv.org/pdf/2506.18951
• Project Page: https://bird-critic.github.io
• GitHub: https://github.com/bird-bench/BIRD-CRITIC-1
✨ Datasets citing this paper:
• https://huggingface.co/datasets/birdsql/bird-critic-1.0-flash-exp
• https://huggingface.co/datasets/birdsql/bird-critic-1.0-open
• https://huggingface.co/datasets/birdsql/bird-critic-1.0-postgresql
#SQL #LLM #AI #Debugging #OpenSource
✨SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories
Summary:
SWE-Bench++ is an automated framework generating scalable, multilingual, repository-level coding tasks from live GitHub pull requests. It overcomes manual curation limits and static datasets, offering a benchmark to evaluate and improve code generation models across 11 languages.
🔹 Publication Date: Published on Dec 19, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17419
• PDF: https://arxiv.org/pdf/2512.17419
• Project Page: https://research.turing.com/swebench
#SoftwareEngineering #CodeGeneration #AIBenchmarking #MachineLearning #OpenSource
✨Simulstream: Open-Source Toolkit for Evaluation and Demonstration of Streaming Speech-to-Text Translation Systems
Summary:
Simulstream is an open-source toolkit for evaluating and demonstrating streaming speech-to-text translation. It supports long-form audio, incremental decoding, and re-translation, and offers an interactive demo interface.
🔹 Publication Date: Published on Dec 19, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17648
• PDF: https://arxiv.org/pdf/2512.17648
• Project Page: https://pypi.org/project/simulstream/
#SpeechToText #MachineTranslation #NLP #OpenSource #StreamingAI
✨NVIDIA Nemotron 3: Efficient and Open Intelligence
Summary:
NVIDIA introduces Nemotron 3, a family of models with strong agentic, reasoning, and conversational capabilities. They feature a hybrid Mamba-Transformer MoE architecture for high throughput and long context, plus advanced post-training for tool use. The models will be openly released.
🔹 Publication Date: Published on Dec 24, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20856
• PDF: https://arxiv.org/pdf/2512.20856
#AI #LLM #DeepLearning #NVIDIA #OpenSource
✨SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence
Summary:
SciEvalKit is an open-source toolkit for evaluating AI models in science. It assesses scientific intelligence across diverse domains and competencies using expert-grade benchmarks and a flexible pipeline. This provides a standardized platform for scientific AI evaluation.
🔹 Publication Date: Published on Dec 26, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22334
• PDF: https://arxiv.org/pdf/2512.22334
#AIevaluation #ScientificAI #OpenSource #AIBenchmarks #AIResearch