Papers.Data.Code | Digests

Top ML Papers · May 11 – May 17
#Paper #WeeklyDigest #W20_2026
From weekly rankings · through May 17

This week's top picks
• Any-step video diffusion
• On-policy flow distillation
• NEO-unify multimodal model

🥇 AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
Yuchao Gu, Guian Fang, Yuxin Jiang et al.
Unlike fixed-step distillation, it learns video flow maps between arbitrary times, and its 14B model tops VBench at 4 NFE (function evaluations).

🥈 Flow-OPD: On-Policy Distillation for Flow Matching Models
Zhen Fang, Wenxuan Huang, Yu Zeng et al.
It brings on-policy multi-teacher distillation to flow matching, boosting SD3.5 GenEval 63→92 and OCR 59→94.

🥉 SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
Haiwen Diao, Penghao Wu, Hanming Deng et al.
Unlike encoder-plus-VAE hybrids, it handles both understanding and pixel-space generation in a single backbone at 32× compression.

➡️ Daily ML signals → Papers.Data.Code
via @Papers.Data.Code | Digests

Top ML Repos · May 11 – May 17
#Repo #WeeklyDigest #W20_2026
From weekly rankings · through May 17

This week's top picks
• Metal local inference
• Program reconstruction benchmark
• Self-supervised IMU odometry

🥇 antirez/ds4
Unlike recent local DeepSeek ports, it adds disk-persistent KV cache for 1M-token contexts.

🥈 facebookresearch/ProgramBench
Targets black-box reverse engineering: rebuilding full codebases from binaries, docs, and tests.

🥉 sparolab/KISS-IMU
Uses LiDAR-odometry pseudo-labels with motion-balanced sampling to learn IMU odometry in a self-supervised way.
Top ML Datasets · May 11 – May 17
#Dataset #WeeklyDigest #W20_2026
From weekly rankings · through May 17

This week's top picks
• Global AI panel dataset
• Permissive image corpus
• Global hantavirus dataset

🥇 AI Index Data: Growth, Talent (Cambridge/Harvard)
It uniquely harmonizes 259,546 verified AI indicators across 227 countries from 1998–2025.

🥈 stanford-vision-lab/giant-permissive-image-corpus
Unlike most recent datasets, it offers 100M high-quality images under fully permissive licensing.

🥉 🦠 Hantavirus (Andes Virus) — Global Epidemiology
Links epidemiology, clinical outcomes, environmental risks, and strain data across 25 countries from 1993–2025.
📋 ML Weekly Recap · May 11 – May 17
#Recap #WeeklyDigest #W20_2026

This week's top picks
• Any-step video diffusion · 🔗 Paper
• Metal local inference · 🔗 Repo
• Global AI panel dataset · 🔗 Dataset

⚡ Trends

▸ On-policy distillation accelerates flow and diffusion generation with stronger few-step quality
▸ Test-time reasoning shifts toward agentic control, multi-agent search, and reusable memory
▸ Unified or efficient multimodal generation targets native pixels, high resolution, and bounded compute

🧭 TL;DR

📄 SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
Haiwen Diao, Penghao Wu, Hanming Deng et al.
Unified multimodal understanding and pixel generation in one end-to-end architecture

⭐ SwiftI2V
Practical 2K image-to-video generation with 202× less GPU time

💡 Efficiency and unification are driving multimodal generation and reasoning forward.
