Papers.Data.Code – Telegram

Papers.Data.Code

@papersdatacode

18 subscribers

99 links

Only meaningful ML signals: papers, repos & datasets. Selected, not collected. 3–4 posts/day. 📄💻📊
papers.data.code@gmail.com

Papers.Data.Code

📄 Paper #Paper #CV #VideoGeneration #DiffusionModels

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
👤 Yuchao Gu, Guian Fang, Yuxin Jiang et al.

🎯 Task
Any-step video generation

💡 Idea
Instead of endpoint consistency maps for fixed few-step sampling, it learns arbitrary-time flow-map transitions along the full ODE path, then uses shortcut backward simulation for on-policy distillation to cut discretization error and causal exposure bias.

✨ Why it's interesting
On 14B T2V, it scores 84.05 VBench at 4 NFEs and 84.41 at 32, beating Krea-Realtime-14B (83.25) and rCM-14B (83.73), both at 4 NFEs.

💻 Repo
⭐ NVlabs/AnyFlow — 202 stars

🔗 paper

via @Papers.Data.Code

6 views · 08:00

Papers.Data.Code

📄 Paper #Paper #LLM #ReinforcementLearning #AgentTraining

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
👤 Gaotang Li, Bhavana Dalvi Mishra, Zifeng Wang et al.

🎯 Task
Long-form deep research agent training

💡 Idea
Instead of using rubrics only to score final answers, RubricEM uses them to structure execution, reward each stage, and store experience. It decomposes research into Plan/Research/Review/Answer, applies stagewise GRPO for denser credit, and jointly trains a reflection policy as reusable memory.
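
The stagewise credit idea can be illustrated with a minimal Python sketch. It assumes the standard GRPO group-relative advantage (each reward minus the group mean, divided by the group standard deviation) and applies it separately to each of the four stages named above. The `stagewise_advantages` helper, the per-stage reward layout, and the reward values are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev

STAGES = ("plan", "research", "review", "answer")

def group_relative(rewards, eps=1e-8):
    """GRPO-style normalization within one group of rollouts."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

def stagewise_advantages(rollouts):
    """rollouts: list of dicts mapping stage name -> rubric reward (assumed layout).
    Returns one advantage per (rollout, stage), normalized stage by stage."""
    per_stage = {s: group_relative([r[s] for r in rollouts]) for s in STAGES}
    return [{s: per_stage[s][i] for s in STAGES} for i in range(len(rollouts))]

# Hypothetical rubric scores for three rollouts of the same prompt.
group = [
    {"plan": 0.9, "research": 0.4, "review": 0.5, "answer": 0.6},
    {"plan": 0.3, "research": 0.8, "review": 0.5, "answer": 0.7},
    {"plan": 0.6, "research": 0.6, "review": 0.5, "answer": 0.2},
]
adv = stagewise_advantages(group)
for a in adv:
    print({s: round(v, 2) for s, v in a.items()})
```

With a single final-answer reward, all stages of a rollout would share one advantage; here a rollout with a strong plan but weak research gets a positive signal on the first and a negative one on the second, which is the sense in which the credit is denser.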

✨ Why it's interesting
RubricEM-8B outperforms comparable open models on 4 long-form research benchmarks and approaches proprietary deep-research systems after 1400 RL steps.

🔗 paper

via @Papers.Data.Code

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond...

Training deep research agents, namely systems that plan, search, evaluate evidence, and synthesize long-form reports, pushes reinforcement learning beyond the regime of verifiable rewards. Their...

5 views · 10:00

Papers.Data.Code

📊 Dataset #Dataset #NLP #StemReasoning #VisualQuestionAnswering

Open-MM-RL
👤 TuringEnterprises

🎯 Task
Multimodal STEM question answering

💡 Idea
40 MIT-licensed STEM QA examples across physics, math, biology, and chemistry, spanning single-image, multi-panel, and multi-image formats with deterministic final answers.

✨ Why it's interesting
Deterministic, programmatically checkable answers make advanced multimodal STEM reasoning benchmarkable for RL and outcome-supervised training.

Size: 40 examples, 15.5 MB

📊 Dataset
📥 2.6k downloads
❤️ 94 likes

🔗 dataset

via @Papers.Data.Code

TuringEnterprises/Open-MM-RL · Datasets at Hugging Face

5 views · 13:00

Papers.Data.Code

💻 Repo #Repo #CV #ImageToVideo #2kGeneration

SwiftI2V
👤 HKUST-LongGroup

🎯 Task
High-resolution image-to-video generation

💡 Idea
Generate native 2K videos from a single image in two stages: Stage I produces a low-res motion reference, and Stage II refines it to high resolution while conditioning on both the input image and the Stage I video.

✨ Why it's interesting
Matches strong 2K end-to-end I2V baselines on key VBench-I2V metrics with 202× less GPU-time; 81-frame 2K output runs in ~111s on one H800 and fits on a 24 GB RTX 4090.

💻 Repo
⭐ HKUST-LongGroup/SwiftI2V — 71 stars (+47 3d)
HTML

🔗 paper

via @Papers.Data.Code

Project page for paper "SwiftI2V: Efficient High-Resolution Image-to-Video Generation via Conditional Segment-wise Generation" - HKUST-LongGroup/SwiftI2V

4 views · 16:00

Papers.Data.Code

📋 Weekly Digest · May 09 – May 16
#WeeklyDigest

📄 Papers

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
#VideoGeneration #DiffusionModels #Distillation
Any-step video diffusion ⟶ 84.05 VBench at 4 NFEs
→ Learn more...

Flow-OPD: On-Policy Distillation for Flow Matching Models
#TextToImage #KnowledgeDistillation #ReinforcementLearning
On-policy flow distillation ⟶ boosts GenEval and OCR
→ Learn more...

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
#VisionLanguageModels #ImageGeneration #MixtureOfExperts
NEO-unify multimodal model ⟶ unifies understanding and generation
→ Learn more...

δ-mem: Efficient Online Memory for Large Language Models
#MemoryMechanisms #Attention #ParameterEfficientTuning
Online associative memory ⟶ steers attention for long-horizon tasks
→ Learn more...

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling
#TestTimeScaling #Reasoning #AgenticSearch
Offline replay controller ⟶ improves accuracy-cost tradeoffs
→ Learn more...

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
#ReinforcementLearning #AgentTraining #LongContextReasoning
Rubric-guided meta-RL ⟶ stagewise credit for research agents
→ Learn more...

💻 Repos

antirez/ds4 ⭐
#Metal #KvCache #OpenaiCompatible
Metal local inference ⟶ 1M context with disk KV cache
→ Learn more...

facebookresearch/ProgramBench ⭐
#Benchmark #SoftwareEngineering #ReverseEngineering
Program reconstruction benchmark ⟶ tests LM reverse engineering
→ Learn more...

sparolab/KISS-IMU ⭐
#InertialOdometry #SelfSupervised #LidarPseudoLabels
Self-supervised IMU odometry ⟶ denoises raw IMU with LiDAR labels
→ Learn more...

📊 Datasets

AI Index Data: Growth, Talent (Cambridge/Harvard)
#GlobalAI #CountryIndicators #LongitudinalData
Global AI panel dataset ⟶ cross-country trend forecasting
→ Learn more...

giant-permissive-image-corpus
#ImageGeneration #PermissiveLicense #ImageDataset
Permissive image corpus ⟶ trains visual generation
→ Learn more...

➡️ Tomorrow — Efficient ML Monthly

via @Papers.Data.Code

👍1

6 views · 09:00

Papers.Data.Code

📈 Monthly · Efficient ML · Apr 17 – May 17
#MonthlyDigest #EfficientML

📄 Papers

Trees to Flows and Back: Unifying Decision Trees and Diffusion Models
#DiffusionModels #DecisionTrees #KnowledgeDistillation
Trees and flows ⟶ faster tabular generation
→ Learn more...

📊 Datasets

MSR-ACC/TAE25
#QuantumChemistry #AtomizationEnergy #CoupledCluster
Quantum chemistry dataset ⟶ trains atomization energy models
→ Learn more...

AI Index Data: Growth, Talent (Cambridge/Harvard)
#GlobalAI #CountryIndicators #LongitudinalData
Global AI panel dataset ⟶ cross-country trend forecasting
→ Learn more...

WHO Global Health Indicators for Prediction
#GlobalHealth #CountryLevel #WorldBank
Global health panel data ⟶ cross-country trend analysis
→ Learn more...

⚡ Trends

▸ Longitudinal country-level datasets increasingly target forecasting and cross-country trend analysis.
▸ Wide, linked, multi-table dataset formats are becoming standard for benchmarking.
▸ Efficiency gains come from unifying model families and distilling complex systems.

🧭 TL;DR

📄 Trees to Flows and Back: Unifying Decision Trees and Diffusion Models
Unifies trees and diffusion, delivering faster tabular generation and effective distillation.

💡 Efficiency advances increasingly come from unifying classical structures with generative modeling.

via @Papers.Data.Code

5 views · 15:00

Papers.Data.Code

📄 Paper #Paper #LLM #Reasoning #ReinforcementLearning

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling
👤 Yafu Li, Runzhe Zhan, Haoran Zhang et al.

🎯 Task
Olympiad-level mathematical and scientific reasoning

💡 Idea
Instead of domain-specific systems, it uses one scaling recipe: reverse-perplexity long-CoT SFT to instill proof search and self-checking, then coarse verifiable-reward RL, proof-level RL with self-refinement/replay, and test-time verification loops.

✨ Why it's interesting
SU-01 gets 57.6% on IMO-ProofBench, 70.2% with test-time scaling, and reaches the IMO 2025 gold line with 35 points.

💻 Repo
⭐ Simplified-Reasoning/SU-01 — 68 stars

🔗 paper

via @Papers.Data.Code

GitHub - Simplified-Reasoning/SU-01: SU-01: Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

4 views · 08:00

Papers.Data.Code

📊 Dataset #Dataset #LLM #SoftwareEngineering #ToolUse

Orchard
👤 microsoft

🎯 Task
Agentic software engineering and web GUI interaction

💡 Idea
~110K agent trajectories in 2 parallel subsets: 107,185 SWE chat+tool rollouts over 2,788 GitHub repos with hidden-test pass/fail labels, plus 3,070 GUI decision-point rows with screenshots, chat context, and judge-verified rewards across 409 web tasks.

✨ Why it's interesting
Verified patch outcomes and judge-scored GUI steps make agent training and evaluation measurable across real coding and browser tasks.

Ⓢ 110,255 samples, ~10.97 GB
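
A quick arithmetic check of the subset sizes quoted in this post, written as a small Python snippet. The variable names are illustrative; the numbers are the ones stated above, and the per-repo and per-task averages are simple ratios rather than figures from the dataset card.

```python
swe_rollouts = 107_185   # SWE chat+tool rollouts
gui_rows = 3_070         # GUI decision-point rows
total = swe_rollouts + gui_rows

# Average rollouts per GitHub repo and GUI rows per web task.
rollouts_per_repo = swe_rollouts / 2_788
rows_per_task = gui_rows / 409

print(total)                        # 110255, matching the stated sample count
print(round(rollouts_per_repo, 1))  # 38.4
print(round(rows_per_task, 1))      # 7.5
```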

🔗 dataset

via @Papers.Data.Code

microsoft/Orchard · Datasets at Hugging Face

5 views · 10:00

Papers.Data.Code

💻 Repo #Repo #CV #DepthEstimation #CameraPose

VGGT Omega
👤 facebookresearch

🎯 Task
Multi-view camera and depth reconstruction

💡 Idea
Infer camera parameters and per-image depth from a set of images in one forward pass, and optionally produce text-aligned embeddings for the same visual inputs.

✨ Why it's interesting
Runs end-to-end on a single A100 with 6.02 GB for 1 frame and 43.15 GB for 500 frames, with released 1B checkpoints and demo code.
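
To get a rough feel for memory at intermediate frame counts, the two figures above can be joined by a straight line. This is only a sketch: the post gives two measurements, and linear growth between them is an assumption, not something the repo states. The function name `estimate_memory_gb` is made up for illustration.

```python
# Two measurements quoted in the post (single A100).
FRAMES_A, MEM_A_GB = 1, 6.02
FRAMES_B, MEM_B_GB = 500, 43.15

def estimate_memory_gb(frames: int) -> float:
    """Linear interpolation between the two reported points (assumed, not measured)."""
    slope = (MEM_B_GB - MEM_A_GB) / (FRAMES_B - FRAMES_A)
    return MEM_A_GB + slope * (frames - FRAMES_A)

# Implied cost of each additional frame, using 1 GB = 1000 MB.
per_frame_mb = (MEM_B_GB - MEM_A_GB) / (FRAMES_B - FRAMES_A) * 1000
print(round(per_frame_mb, 1))             # 74.4
print(round(estimate_memory_gb(250), 2))  # rough estimate for 250 frames
```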

💻 Repo
⭐ facebookresearch/vggt-omega — 413 stars (+413 3d)
Python

via @Papers.Data.Code

GitHub - facebookresearch/vggt-omega: [CVPR 2026 Oral] VGGT Omega

5 views · 13:00

Papers.Data.Code

📄 Paper #Paper #CV #VideoGeneration #DiffusionModels

Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
👤 Min Zhao, Hongzhou Zhu, Kaiwen Zheng et al.

🎯 Task
Real-time autoregressive video generation

💡 Idea
Instead of costly AR-teacher ODE trajectory distillation, it initializes few-step AR students with causal consistency distillation: same AR flow-map target, but learned from single online adjacent-step teacher updates, making frame-wise 1-2 step rollout practical.

✨ Why it's interesting
At frame-wise 2-step, beats 4-step chunk-wise Causal Forcing by +0.1 VBench Total, +0.3 Quality, +0.335 VisionReward; 50% lower first-frame latency, ~4x cheaper Stage 2.

💻 Repo
⭐ thu-ml/Causal-Forcing — 665 stars

🔗 paper

via @Papers.Data.Code

[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forci...

4 views · 16:00

Papers.Data.Code

📄 Paper #Paper #Multimodal #LongContextModeling #VisionLanguageModels

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context
👤 Zhaowei Wang, Lishu Luo, Haodong Duan et al.

🎯 Task
Long-context vision-language modeling

💡 Idea
Instead of OCR-style long-data training, use instruction-formatted long-document VQA. A balanced length mix and retrieval-heavy task mixture beat 128K-focused or transcription-based training for extending LVLM context.

✨ Why it's interesting
With 5B tokens, MMProLong improves long-doc VQA by 7.1%, stays strong at 256K/512K beyond 128K training, and exceeds baselines by 20%+ there.

🔗 paper

via @Papers.Data.Code

Training Long-Context Vision-Language Models Effectively with...

Long-context modeling is becoming a core capability of modern large vision-language models (LVLMs), enabling sustained context management across long-document understanding, video analysis, and...

5 views · 08:00

Papers.Data.Code

📊 Dataset #Dataset #LLM #HallucinationDetection #Multilingual

LLM Hallucination Benchmark Dataset
👤 alitaqishah

🎯 Task
LLM hallucination detection and analysis

💡 Idea
200 annotated LLM responses spanning 5 models, 8 domains, 7 languages, 7 hallucination types, 4 annotator types, and 4 mitigation strategies, with prompt, response, hallucination label, span, severity, and verified correction.

✨ Why it's interesting
Makes cross-model, multilingual hallucination detection and mitigation evaluation directly measurable with typed error labels and corrected references.

Ⓢ 200 annotated responses

🔗 dataset

via @Papers.Data.Code

Multi-model hallucination labels across domains, tasks & languages (2024–25)

4 views · 10:00

Papers.Data.Code

📄 Paper #Paper #LLM #ReinforcementLearning #KnowledgeDistillation

Self-Distilled Agentic Reinforcement Learning
👤 Zhengxi Lu, Zhiyuan Yao, Zhuowen Han et al.

🎯 Task
Agentic RL for multi-turn LLMs

💡 Idea
Instead of naively mixing OPSD with RL, SDAR keeps RL as the backbone and uses detached token-level gates to apply distillation selectively—amplifying positive teacher-student gaps and softening negative teacher rejections.

✨ Why it's interesting
Beats GRPO by +9.4% on ALFWorld, +7.0% on Search-QA, and +10.2% WebShop-Acc, while avoiding naive GRPO+OPSD instability.

💻 Repo
⭐ ZJU-REAL/SDAR — 96 stars

🔗 paper

via @Papers.Data.Code

GitHub - ZJU-REAL/SDAR: Official code for "Self-Distilled Agentic Reinforcement Learning"

4 views · 13:00

Papers.Data.Code

💻 Repo #Repo #CV #VideoGeneration #CameraControl

Warp-as-History
👤 yyfz

🎯 Task
Camera-controlled video generation

💡 Idea
Generate videos that follow user-specified camera trajectories from one input frame, using a single training video and optional interactive/autoregressive control in a drop-in Helios pipeline.

✨ Why it's interesting
Enables interactive viewpoint control from only one camera-annotated training example, with released training, inference, and browser demo code.

💻 Repo
⭐ yyfz/Warp-as-History — 117 stars (+58 3d)
Python

🔗 paper

via @Papers.Data.Code

GitHub - yyfz/Warp-as-History: Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video

4 views · 16:00

Papers.Data.Code

📄 Paper #Paper #NLP #TheoremProving #RetrievalAugmentedGeneration

OProver: A Unified Framework for Agentic Formal Theorem Proving
👤 David Ma, Kaijing Ma, Shawn Guo et al.

🎯 Task
Formal theorem proving

💡 Idea
Instead of bolting retrieval and self-repair onto a fixed prover at test time, OProver trains that agentic loop itself: multi-round proof revision conditioned on retrieved verified proofs and raw Lean feedback, with new verified proofs and repair traces fed back into training.

✨ Why it's interesting
OProver-32B posts the best Pass@32 on MiniF2F (93.3%), ProverBench (58.2%), and PutnamBench (11.3%), and ranks second on MathOlympiad (22.8%) and ProofNet (33.2%).
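
For readers unfamiliar with the metric, Pass@k is commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n proofs are sampled per problem and c of them verify. The sketch below shows that common estimator only. Whether OProver's evaluation uses this form or simply checks if any of 32 attempts verifies is not stated in the post, and the sample counts in the example are made up.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples, drawn without
    replacement from n attempts with c correct ones, is correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb returns 0 when k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical problem: 64 sampled proofs, exactly 1 verified by Lean.
print(pass_at_k(64, 1, 32))  # 0.5
```

When n equals k, the estimator reduces to a plain "did any attempt succeed" check per problem, and the benchmark score is the mean over problems.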

💻 Repo
⭐ multimodal-art-projection/OProver — 7 stars

🔗 paper

via @Papers.Data.Code

4 views · 08:00

Papers.Data.Code

📊 Dataset #Dataset #Tabular #Education #MentalHealth

Impact of Ai on Students
👤 laveshjadon

🎯 Task
Student outcome and burnout prediction

💡 Idea
50,000 student records with 16 features spanning academic profile, GenAI usage, study habits, institutional policy, anxiety, skill retention, and burnout, with targets for GPA regression, skill retention, and burnout classification.

✨ Why it's interesting
Makes it possible to model academic and well-being outcomes against AI usage and policy in one complete, balanced student dataset.

Ⓢ 50,000 samples, 16 columns, CSV

🔗 dataset

via @Papers.Data.Code

Is AI a Tutor or a Cheat Code? 50,000 Student Records on GenAI Usage and Burnout

3 views · 10:00

Papers.Data.Code

💻 Repo #Repo #LLM #FederatedLearning #Lora

SmartFed
👤 benmagnifico

🎯 Task
Federated LLM fine-tuning with LoRA reuse

💡 Idea
Compose a frozen pool of existing task LoRAs into a federated adapter by splitting them into rank-wise experts and learning a small input-conditioned router that selects and combines them on each client.
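
The rank-wise split can be sketched in plain Python. A LoRA update B·A with rank r is the sum of r rank-1 terms, so each (column of B, row of A) pair can be treated as one frozen expert. The router below is a single linear layer with softmax and top-k selection, which is an assumption made for illustration; the function names and toy matrices are hypothetical and not taken from the SmartFed code.

```python
import math

def rank_one_experts(A, B):
    """Split a LoRA pair (B: d_out x r, A: r x d_in) into r rank-1 experts (b_j, a_j)."""
    return [([row[j] for row in B], A[j]) for j in range(len(A))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def routed_delta(x, experts, router_W, top_k=2):
    """Input-conditioned mix: keep the top_k gates, renormalize, sum gated rank-1 updates."""
    gates = softmax([dot(w, x) for w in router_W])
    keep = sorted(range(len(experts)), key=lambda j: gates[j], reverse=True)[:top_k]
    norm = sum(gates[j] for j in keep)
    out = [0.0] * len(experts[0][0])
    for j in keep:
        b, a = experts[j]
        scale = gates[j] / norm * dot(a, x)
        out = [o + scale * bi for o, bi in zip(out, b)]
    return out

# Toy frozen pool: one rank-2 LoRA and one rank-1 LoRA (d_in = d_out = 2).
A1, B1 = [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]
A2, B2 = [[1.0, 1.0]], [[2.0], [0.0]]
experts = rank_one_experts(A1, B1) + rank_one_experts(A2, B2)

# Only the router weights would be trained on each client in this sketch.
router_W = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0]]
print(routed_delta([1.0, 2.0], experts, router_W, top_k=1))  # [6.0, 0.0]
```

Keeping the experts frozen means clients exchange only the small router, which is one plausible reading of where the communication savings come from.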

✨ Why it's interesting
Cuts training, communication, and energy cost versus federated train-from-scratch baselines, and beats both knowledge-free and knowledge-reuse baselines on three skill-composition tasks.

💻 Repo
⭐ benmagnifico/SmartFed — 15 stars (+15 3d)

via @Papers.Data.Code

[ICML 2026 Spotlight] SmartFed is a resource-efficient framework that circumvents expensive training from scratch by intelligently reusing knowledge embedded in existing LoRA modules. - benmagnific...

3 views · 13:00

Papers.Data.Code

📄 Paper #Paper #Multimodal #ImageGeneration #VideoGeneration

Lance: Unified Multimodal Modeling by Multi-Task Synergy
👤 Fengyi Fu, Mengqi Huang, Shaojin Wu et al.

🎯 Task
Unified multimodal understanding and generation

💡 Idea
Instead of one shared visual path or bolted-on modules, Lance uses a shared interleaved multimodal context with dual MoE streams: one expert for text+semantic understanding, one for VAE-latent generation, plus modality-aware RoPE and staged multi-task training.

✨ Why it's interesting
With only 3B activated params and a 128-GPU budget, it substantially outperforms prior open-source unified models on image and video generation while keeping strong understanding.

💻 Repo
⭐ bytedance/Lance — 314 stars

🔗 paper

via @Papers.Data.Code

A 3B-active-parameter native unified multimodal model for image and video understanding, generation, and editing. - bytedance/Lance

3 views · 16:00

Papers.Data.Code

📄 Paper #Paper #Multimodal #ComputerUseAgents #Benchmarking

OpenComputer: Verifiable Software Worlds for Computer-Use Agents
👤 Jinbiao Wei, Qianran Ma, Yilun Zhao et al.

🎯 Task
Computer-use agent evaluation and benchmark generation

💡 Idea
Instead of screenshot or LLM-judge evaluation, uses app-specific state verifiers over real software, then self-refines them from execution disagreements to synthesize and score realistic desktop tasks automatically.

✨ Why it's interesting
Covers 33 apps and 1,000 tasks. Verifiers align better with humans than LLM judges. Best agent hits 68.3% success; open models drop sharply vs OSWorld.

💻 Repo
⭐ echo0715/OpenComputer

🔗 paper

via @Papers.Data.Code

3 views · 08:00

Papers.Data.Code

📄 Paper #Paper #LLM #ReinforcementLearning #LongContext

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment
👤 Minxuan Lv, Tiehua Mei, Tanlong Du et al.

🎯 Task
Long-context reinforcement learning for LLMs

💡 Idea
Instead of retrieval-path-heavy QA data and uniform rewards, it trains on 9 long-context capability tasks with task-native metrics, then replaces vanilla GRPO's prompt-level scaling with task-mean normalization plus difficulty-adaptive reweighting.

✨ Why it's interesting
On Qwen3-30B-A3B, average long-context score rises from 60.1 to 69.8; TMN-Reweight reaches 63.0 on 4B vs 62.2 with vanilla GRPO.

💻 Repo
⭐ xiaoxuanNLP/GoLongRL

🔗 paper

via @Papers.Data.Code

GitHub - xiaoxuanNLP/GoLongRL: GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

3 views · 10:00

Papers.Data.Code

🔥 Repo #Repo #LLM #Pretraining #HierarchicalReasoningModel

HRM-Text
👤 sapientinc

🎯 Task
Efficient foundation model pretraining

💡 Idea
Pretrain HRM text generation models from scratch on 8-16 H100s with built-in data packing, distributed training, benchmark evaluation, and checkpoint export to Transformers format.

✨ Why it's interesting
Claims 130-600x less compute and 150-900x less data; reference runs train 0.6B-1B models in 46-50 hours on 8-16 H100s.

💻 Repo
⭐ sapientinc/HRM-Text — 580 stars (+580 3d)
Python

via @Papers.Data.Code

HRM-Text is a 1B text generation model based on the HRM architecture, strengthened by task completion and latent space reasoning. - sapientinc/HRM-Text

3 views · 13:00