ML Research Hub

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

📝 Summary:
DataFlow is an LLM-driven framework for unified data preparation. It generates data pipelines from natural-language descriptions, and the authors report improved LLM performance on math, code, and text tasks. The framework is also designed to make data preparation reproducible and scalable.

🔹 Publication Date: Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16676
• PDF: https://arxiv.org/pdf/2512.16676
• Project Page: https://github.com/OpenDCAI/DataFlow
• GitHub: https://github.com/OpenDCAI/DataFlow

Datasets citing this paper:
https://huggingface.co/datasets/OpenDCAI/dataflow-demo-code
https://huggingface.co/datasets/OpenDCAI/dataflow-demo-Text2SQL
https://huggingface.co/datasets/OpenDCAI/dataflow-instruct-10k

#LLM #DataPreparation #DataCentricAI #WorkflowAutomation #AIResearch