Papers.Data.Code
18 subscribers
101 links
Only meaningful ML signals: papers, repos & datasets. Selected, not collected. 3–4 posts/day. 📄💻📊
papers.data.code@gmail.com
📄 Paper #Paper #Robotics #HumanoidControl #WorldModels

UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling
👤 Boyu Chen, Yi Chen, Lu Qiu et al.

🎯 Task
Human-to-humanoid policy learning and world modeling

💡 Idea
Tri-branch visual-action-fusion tokenizer with cross-reconstruction maps heterogeneous actions into shared discrete tokens; actions predict vision and vision reconstructs actions to capture embodiment-agnostic physical intent.

✨ Why it's interesting
Achieves SOTA data efficiency, robust OOD generalization, and zero-shot task transfer on sim and real humanoids.

💻 Repo
⭐ xpeng-robotics/UniT - 37 stars

🔗 paper

via @Papers.Data.Code
📊 Dataset #Dataset #3DEditing #InstructionFollowing

H³D: High-quality Holistic 3D Editing Dataset
👤 ART-3D

🎯 Task
Instruction-following 3D editing

💡 Idea
~102.7K records across 10 shards: each sample has before/after 3D SLAT latents, one aligned 518×518 RGB view per side, and an edit prompt for deletion, addition, modification, scale, material, color, or global style.

✨ Why it's interesting
Paired latent+image edits across 7 types enable training and evaluation of part-level 3D editors.

Size: 102,704 records across 10 shards, ~54.5 GB total

Downloads: 441 | Likes: 10

🔗 dataset

via @Papers.Data.Code
🔥 Repo #Repo #MixtureOfExperts #Quantization

TileKernels
👤 deepseek-ai

🎯 Task
LLM GPU kernel optimization

💡 Idea
Optimized TileLang kernels for LLM ops, including top-k MoE gating/routing, FP8/FP4/E5M6 quantization, batched transpose, Engram, and Manifold HyperConnection, plus trainable torch.autograd.Function wrappers for higher-level layers.

✨ Why it's interesting
Authors say most kernels approach hardware limits for compute intensity and memory bandwidth.

💻 Repo
⭐ deepseek-ai/TileKernels - 1.2k stars (+1.1k 3d)
Python


via @Papers.Data.Code
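
For readers unfamiliar with the "top-k MoE gating/routing" mentioned above, here is a minimal plain-Python sketch of what such a step computes for one token. It is not the TileKernels API: the function name `topk_gate` is hypothetical, and the choice to apply the softmax only over the selected experts is an assumption (real routers may normalize before selection or use other scoring functions).

```python
import math


def topk_gate(logits, k):
    """Return (expert_indices, weights) for a single token.

    Hypothetical reference routine: picks the k largest router logits and
    renormalizes them with a softmax restricted to the selected experts.
    """
    if not 0 < k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k highest-scoring experts, best first (ties keep order).
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Numerically stable softmax over the selected logits only.
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]


if __name__ == "__main__":
    experts, weights = topk_gate([0.1, 2.0, -1.0, 2.0], k=2)
    print(experts, weights)
```

A GPU kernel would do the same selection for every token in a batch at once; the sketch only fixes the semantics.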
📄 Paper #Paper #CV #NovelViewSynthesis #VideoDiffusion

Vista4D: Video Reshooting with 4D Point Clouds
👤 Kuan Heng Lin, Zhizheng Liu, Pablo Salamanca et al.

🎯 Task
Video reshooting

💡 Idea
4D-grounded point clouds with temporally persistent static pixels guide a video diffusion model, plus training on noisy reconstructed multiview data to preserve seen content and improve camera control under real-world artifacts.

✨ Why it's interesting
Best camera/3D consistency; user study wins 67.06% preservation, 68.17% camera, 77.38% fidelity.

💻 Repo
⭐ Eyeline-Labs/Vista4D - 88 stars

🔗 paper

via @Papers.Data.Code
📄 Paper #Paper #LLM #TimeSeries #VisionLanguageModels

LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics
👤 Yueyang Ding, HaoPeng Zhang, Rui Dai et al.

🎯 Task
Time series reasoning

💡 Idea
Dual-view VLM input uses a time-series plot plus an index-value table for precise numerical grounding, then curriculum fine-tunes across L1-L3 reasoning levels on the 83k-sample HiTSR dataset.

✨ Why it's interesting
Best OOD results: 86.8% L1, 75.6% local L2, 97.5% global L2, 67.0% L3 accuracy.

💻 Repo
⭐ RainingNovember/LLaTiSA - 76 stars

🔗 paper

via @Papers.Data.Code
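
To make the "index-value table" half of the dual-view input concrete, the sketch below renders a series as a small text table that could sit next to a plot in a prompt. The layout, the helper name `index_value_table`, and the default precision are assumptions for illustration; the post does not show the format LLaTiSA actually uses.

```python
def index_value_table(values, precision=2):
    """Render a series as a plain-text 'index | value' table.

    Hypothetical format: one header row, then one row per time step,
    with values rounded to a fixed number of decimals.
    """
    lines = ["index | value"]
    for i, v in enumerate(values):
        lines.append(f"{i} | {v:.{precision}f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(index_value_table([1.0, 2.5, 2.25]))
```

The point of the table is that exact numbers stay readable as text, while the plot carries the overall shape.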
📊 Dataset #Dataset #Classification #Regression

Sleep Health & Daily Performance Dataset
👤 mohankrishnathalla

🎯 Task
Sleep health prediction

💡 Idea
100K records, 32 columns, and 3 targets spanning regression, multiclass, and binary tasks. Structured daily snapshots cover sleep metrics, behaviors, mental state, cognitive outcomes, 12 occupations, and 15 countries with no missing values.

✨ Why it's interesting
100K rows + 3 targets enable benchmarkable sleep, risk, and cognition models from beginner to expert level.

Size: 100K records, 32 columns, 14.3 MB

Downloads: 2.9k | Likes: 49

🔗 dataset

via @Papers.Data.Code
📄 Paper #Paper #LLM #AgenticRL #ToolUse

DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data
👤 Venus Team, Sunhao Dai, Yong Deng et al.

🎯 Task
Edge-scale deep research agents

💡 Idea
Cleaned and resampled long-horizon trajectories plus IGPO-based turn-level RL with information-gain rewards and format penalties train a 4B agent from ~10K open data, targeting dense credit assignment for long research runs.

✨ Why it's interesting
Beats prior agentic models under 9B on multiple deep research benchmarks.

💻 Repo
⭐ inclusionAI/DR-Venus - 50 stars
⭐ verl-project/verl - 50 stars

🔗 paper 🔗 dataset 🔗 dataset

via @Papers.Data.Code
💻 Repo #Repo #CV #FaceVerification #WebAssembly

Face X
👤 facex-engine

🎯 Task
Face verification

💡 Idea
Local face embedding engine for browser, C, Go, Python, and CLI. It computes 512-d embeddings and cosine similarity, with no dependencies, optional encrypted weights, and SIMD-optimized CPU inference.

✨ Why it's interesting
Claims 3.0 ms native latency, 99.73% LFW accuracy, and 1.30x faster inference than ONNX Runtime.

💻 Repo
⭐ facex-engine/facex - 82 stars (+82 3d)
C

🔗 paper

via @Papers.Data.Code
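
The verification step described above, comparing two embeddings by cosine similarity, can be sketched in a few lines of plain Python. This is not the facex API: `cosine_similarity` and `same_person` are hypothetical helpers, and the 0.5 threshold is a placeholder assumption, since the post does not give the engine's decision threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_person(a, b, threshold=0.5):
    """Hypothetical decision rule; the threshold is an assumed placeholder."""
    return cosine_similarity(a, b) >= threshold


if __name__ == "__main__":
    probe = [1.0] * 512      # stand-ins for real 512-d embeddings
    reference = [1.0] * 512
    print(same_person(probe, reference))
```

In practice the threshold is tuned on a labeled validation set to trade off false accepts against false rejects.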
📄 Paper #Paper #CV #TextToVideo #ReinforcementLearning

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
👤 Weijie Wang, Xiaoxuan He, Youping Gu et al.

🎯 Task
3D-consistent text-to-video generation

💡 Idea
Flow-GRPO fine-tunes a video model with rewards from 3D reconstruction, meta-view VLM scoring, trajectory alignment, and aesthetics; camera motion is injected by warping latent noise instead of adding control modules.

✨ Why it's interesting
Improves 3D consistency by 10.23 dB and 7.91 dB PSNR while preserving general video quality.

💻 Repo
⭐ microsoft/World-R1 - 197 stars

🔗 paper

via @Papers.Data.Code
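
As a reading aid for the dB figures above, here is the standard PSNR definition in plain Python. The post does not say which image pairs World-R1 compares or how they are rendered, so this is only the generic formula; the helper name `psnr` and the flat-list pixel representation are assumptions.

```python
import math


def psnr(reference, candidate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = [0.0, 0.0, 0.0, 0.0]
    print(psnr(ref, [0.1] * 4))   # per-pixel error of 0.1
    print(psnr(ref, [0.01] * 4))  # per-pixel error of 0.01
```

Because PSNR is logarithmic in the mean squared error, every 10 dB gained corresponds to a tenfold reduction in MSE.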
📊 Dataset #Dataset #NLP #CompetitionMath #Multimodal

MathNet v0: Olympiad Math Reasoning & Retrieval
👤 ShadenA

🎯 Task
Olympiad math reasoning and retrieval

💡 Idea
27,817 problems in v0 from 58 country/regional configs, with problem markdown, official solutions, topic paths, language, provenance, and 7,541 inline images; sourced from official booklets across 47 countries and 17 languages.

✨ Why it's interesting
~28K multilingual expert problems enable hard reasoning, retrieval, and RAG evaluation beyond small English-only math sets.

Size: 27,817 problems, 7,541 images, 58 configs

Downloads: 9.3k | Likes: 26

🔗 dataset 🔗 paper 🔗 repo

via @Papers.Data.Code
💻 Repo #Repo #Cpp #GGUF

llama.cpp DeepSeek V4 Flash
👤 antirez

🎯 Task
Local LLM inference

💡 Idea
DeepSeek V4 Flash support in llama.cpp with generated GGUFs using 2-bit quantization of routed experts, targeting MacBooks with 128 GB RAM; works with CPU and Metal backends.

✨ Why it's interesting
Targets 128 GB MacBooks for local DeepSeek V4 inference; Metal backend is faster than CPU.

💻 Repo
⭐ antirez/llama.cpp-deepseek-v4-flash - 124 stars (+124 3d)
C++

🔗 paper 🔗 paper 🔗 paper

via @Papers.Data.Code
📄 Paper #Paper #CV #MultimodalLearning #ImageGeneration

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation
👤 Zhiheng Liu, Weiming Ren, Xiaoke Huang et al.

🎯 Task
Unified multimodal understanding and generation

💡 Idea
Direct patch embeddings replace VAE and representation encoders, so one transformer handles text, images, and pixel-space generation end to end. A masking-based visual feature learning scheme stabilizes training and improves pixel-space representations.

✨ Why it's interesting
At 7B, it reaches SOTA among native UMMs on understanding and stays competitive on generation.

💻 Repo
⭐ facebookresearch/tuna-2 - 139 stars

🔗 paper

via @Papers.Data.Code
📄 Paper #Paper #MultiAgentSystems #Reasoning

Recursive Multi-Agent Systems
👤 Xiyuan Yang, Jiaru Zou, Rui Pan et al.

🎯 Task
Multi-agent LLM reasoning

💡 Idea
Latent-state recursion across agents via lightweight RecursiveLink modules: agents pass and refine hidden states in a loop, with inner-outer training for whole-system credit assignment instead of text-based coordination.

✨ Why it's interesting
Avg +8.3% accuracy, 1.2-2.4x faster inference, and 34.6-75.6% fewer tokens vs baselines.

💻 Repo
⭐ RecursiveMAS/RecursiveMAS - 30 stars

🔗 paper

via @Papers.Data.Code
📄 Paper #Paper #AudioReasoning #RLHF

Step-Audio-R1.5 Technical Report
👤 Yuxin Zhang, Xiangyu Tony Zhang, Daijiao Liu et al.

🎯 Task
Audio reasoning for multi-turn spoken dialogue

💡 Idea
RLHF with a rubric-guided generated reward model compares responses in multi-turn audio chats, optimizing naturalness, coherence, and instruction retention beyond label-only RLVR.

✨ Why it's interesting
77.97 avg across 8 benchmarks, +5.47 over Step-Audio-R1; 41.15 on Audio MC.

💻 Repo
⭐ stepfun-ai/Step-Audio-R1 - 647 stars

🔗 paper 🔗 dataset

via @Papers.Data.Code
💻 Repo #Repo #TestDrivenDevelopment #CodingAgents

Evan Flow
👤 evanklem

🎯 Task
AI-assisted software development workflow

💡 Idea
Single-entry workflow for Claude Code that orchestrates brainstorm → plan → execute → iterate, with vertical-slice TDD inside coding tasks, optional parallel coder/overseer subagents, and a hook blocking dangerous git commands.

✨ Why it's interesting
Keeps users in control with approval checkpoints, no auto-commits, and blocked destructive git ops.

💻 Repo
⭐ evanklem/evanflow - 356 stars (+356 3d)
Shell

🔗 paper

via @Papers.Data.Code
📄 Paper #Paper #InstructionTuning #KnowledgeGraphs

Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora
👤 Chenkai Pan, Xinglong Xu, Yuhang Xu et al.

🎯 Task
Domain-specific LLM fine-tuning

💡 Idea
Shared L1 concepts, L2 relations, and L3 reasoning chains drive both SFT data and benchmarks; failures are traced to concept gaps or reasoning deficits and repaired with targeted data patches.

✨ Why it's interesting
Across 16 disciplines, one debug round let a 32B model beat GPT-5.4, Gemini-3-flash, and DeepSeek-v3.2.

💻 Repo
⭐ OpenRaiser/ProDa - 43 stars

🔗 paper

via @Papers.Data.Code
📄 Paper #Paper #MultimodalAgents #ToolUse

GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
👤 V Team, Wenyi Hong, Xiaotao Gu et al.

🎯 Task
Multimodal agent foundation model

💡 Idea
Native multimodal agent model with CogViT and multimodal multi-token prediction using <|image|> placeholders, plus joint RL over 30+ perception, reasoning, coding, and GUI tasks for end-to-end tool use.

✨ Why it's interesting
Scores 94.8 on Design2Code and 75.7 on AndroidWorld; RL adds +4.9 on OSWorld.

💻 Repo
⭐ zai-org/GLM-V - 2.3k stars
⭐ zai-org/ImageMining - 2.3k stars
⭐ zai-org/GLM-skills - 2.3k stars

🔗 paper

via @Papers.Data.Code
📊 Dataset #Dataset #RubricBasedEvaluation #PhysicianWritten

HealthBench Professional
👤 openai

🎯 Task
Clinical response evaluation

💡 Idea
Structured medical eval examples with conversations, physician responses, and scored rubric items, labeled by use case, red-teaming vs good-faith, difficulty, and specialty.

✨ Why it's interesting
Physician answers plus rubrics enable consistent scoring of model performance across clinical use cases.

Downloads: 5.7k | Likes: 43

🔗 dataset 🔗 repo

via @Papers.Data.Code
💻 Repo #Repo #GBNF #LlamaCpp

Structured CoT
👤 andthattoo

🎯 Task
Reasoning token compression for code generation

💡 Idea
Constrain a model's thinking into short structured fields like GOAL/APPROACH/EDGE or GOAL/STATE/ALGO/EDGE/VERIFY at inference time, then let it generate code normally; this cuts verbose CoT and allows comparing free vs constrained runs.

✨ Why it's interesting
No training; 22.4× fewer think tokens on HumanEval+ and +14 pp pass@1 on LiveCodeBench.

💻 Repo
⭐ andthattoo/structured-cot - 196 stars (+154 3d)
Python

🔗 paper

via @Papers.Data.Code
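
One way to picture the constrained-thinking idea above: build a small GBNF grammar that forces one line per field, then parse the model's output back into those fields. The sketch below is an illustration only. `build_grammar` and `parse_plan` are hypothetical helpers, and the exact `NAME: text` line format is an assumption; the post does not show the repo's actual grammar.

```python
import re

FIELDS = ("GOAL", "APPROACH", "EDGE")  # field names taken from the post


def build_grammar(fields=FIELDS):
    """Return a GBNF grammar string forcing one 'NAME: text' line per field.

    Assumed layout; a llama.cpp-style sampler could consume a grammar like
    this to constrain the thinking section.
    """
    rule = " ".join(f'"{name}: " line' for name in fields)
    return f'root ::= {rule}\nline ::= [^\\n]+ "\\n"\n'


def parse_plan(text, fields=FIELDS):
    """Parse constrained output into a dict, or raise if it breaks the format."""
    pattern = "".join(
        rf"{re.escape(name)}: (?P<{name}>[^\n]+)\n" for name in fields
    )
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError("output does not follow the expected field layout")
    return match.groupdict()


if __name__ == "__main__":
    print(build_grammar())
    print(parse_plan("GOAL: sort a list\nAPPROACH: use sorted()\nEDGE: empty input\n"))
```

Swapping in the longer GOAL/STATE/ALGO/EDGE/VERIFY schema only requires passing a different `fields` tuple.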
📄 Paper #Paper #LLM #MultiAgentSystems #AgentOrchestration

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company
👤 Zhengxu Yu, Yu Fu, Zhiyuan He et al.

🎯 Task
Multi-agent organization and coordination

💡 Idea
Talent-Container architecture separates agent identity from runtime, while a Talent Market recruits verified agents on demand and E2R tree search plans, executes, and reviews tasks with formal guarantees.

✨ Why it's interesting
84.67% success on PRDBench, beating prior SOTA by 15.48 points.

💻 Repo
⭐ 1mancompany/OneManCompany - 119 stars

🔗 paper

via @Papers.Data.Code