Github Top Repositories
Spotted on GitHub Trending: NVIDIA/cosmos. Let's break it down.
https://github.com/NVIDIA/cosmos
NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.
──────────────────────────────
NVIDIA Cosmos is an open platform for building Physical AI, providing a suite of world models, datasets, and tools. Cosmos 3 is the latest model family, designed to jointly process and generate language, images, video, audio, and action sequences. It has two runtime surfaces: Reasoner for world understanding and Generator for world generation.
Key features include:
- World understanding: analyze videos and images for captions, temporal events, and physical plausibility
- World generation: produce images, videos, sound, and action-conditioned rollouts from text, image, video, or action inputs
- Action modeling: predict policy actions for robotics and autonomous-driving settings
Cosmos 3 has a unified Mixture-of-Transformers architecture, combining an autoregressive transformer for reasoning with a diffusion transformer for multimodal generation. The model family includes Cosmos3-Nano, Cosmos3-Super, and specialized models for text-to-image and image-to-video generation.
To get started, create a Hugging Face access token, authenticate locally, and set up a virtual environment with the required dependencies. You can use Hugging Face Diffusers for research, training, and model development.
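The setup steps above can be checked with a small pre-flight script. This is a minimal sketch, not the repository's own tooling: it assumes the token is exposed through the HF_TOKEN environment variable (which the Hugging Face Hub client reads), and the helper names resolve_hf_token and venv_command are hypothetical. It does not install the project's dependencies, which are not listed in this post.

```python
import os
import sys
from pathlib import Path


def resolve_hf_token(env=None):
    """Return the Hugging Face token from HF_TOKEN, or raise a clear error."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set HF_TOKEN to a Hugging Face access token first.")
    return token


def venv_command(env_dir):
    """Build the command that creates a virtual environment at env_dir."""
    return [sys.executable, "-m", "venv", str(Path(env_dir))]
```

Running the list returned by venv_command(".venv") through subprocess.run creates the environment; resolve_hf_token() fails early with a readable message if the token is missing.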
One-liner takeaway: NVIDIA Cosmos gives developers one open platform for world understanding and generation, aimed at building more sophisticated robots, autonomous vehicles, and smart infrastructure.
──────────────────────────────
Channel: https://t.me/GithubRe
666ghj/MiroFish caught my eye on GitHub Trending today.
https://github.com/666ghj/MiroFish
A Simple and Universal Swarm Intelligence Engine, Predicting Anything.
──────────────────────────────
MiroFish is an AI prediction engine that uses multi-agent technology to forecast outcomes. From real-world data, it builds a parallel digital world in which thousands of intelligent agents interact and evolve. Users can rehearse the future in this digital sandbox and make informed decisions after simulating various scenarios.
Key features include:
- Graph Building: extracting seed information and constructing a high-fidelity digital world
- Simulation: running parallel simulations to predict future trajectories
- Report Generation: generating detailed prediction reports
The workflow involves:
1. Graph building and environment setup
2. Simulation and report generation
3. Deep interaction with the simulated world
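The first two workflow steps can be illustrated with a toy model. This is a sketch under heavy assumptions, not MiroFish's implementation: agents are plain names, the "world" is a random undirected graph, and each agent follows a simple majority rule. All function names (build_graph, simulate, report) are hypothetical.

```python
import random
from collections import Counter


def build_graph(names, links_per_agent, rng):
    """Step 1: connect each agent to a few random peers (undirected graph)."""
    graph = {name: set() for name in names}
    for name in names:
        peers = [other for other in names if other != name]
        for peer in rng.sample(peers, k=min(links_per_agent, len(peers))):
            graph[name].add(peer)
            graph[peer].add(name)
    return graph


def simulate(graph, opinions, rounds, rng):
    """Step 2: each round, every agent moves to the majority view around it."""
    state = dict(opinions)
    for _ in range(rounds):
        nxt = {}
        for agent, neighbours in graph.items():
            votes = Counter(state[n] for n in neighbours)
            votes[state[agent]] += 1  # the agent's own view counts too
            top = max(votes.values())
            leaders = sorted(o for o, c in votes.items() if c == top)
            # Keep the current view on a tie that includes it; otherwise switch.
            nxt[agent] = state[agent] if state[agent] in leaders else rng.choice(leaders)
        state = nxt  # synchronous update: all agents change at once
    return state


def report(state):
    """Step 2 (continued): summarise the final share of each opinion."""
    counts = Counter(state.values())
    total = len(state)
    return {opinion: count / total for opinion, count in counts.items()}
```

Seeding random.Random with a fixed value makes a run repeatable, which helps when comparing scenarios side by side.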
Technical highlights include:
- Use of OASIS (Open Agent Social Interaction Simulations) for the simulation engine
- Support for LLM API and Zep Cloud configurations
The target audience includes decision-makers, researchers, and individuals interested in exploring what-if scenarios.
To get started, users can deploy MiroFish via source code or Docker, and join the conversation on social media platforms.
In a nutshell, MiroFish is all about predicting anything, from serious forecasts to playful simulations, so you can rehearse the future and make decisions backed by countless simulations.
──────────────────────────────
Channel: https://t.me/GithubRe
⥠mvanhorn/last30days-skill is making waves. Here's the full picture.
https://github.com/mvanhorn/last30days-skill
AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web, then synthesizes a grounded summary
──────────────────────────────
The mvanhorn/last30days-skill GitHub repository is home to an AI agent-led search engine. This engine scores results based on upvotes, likes, and real money, rather than editor opinions. Key features include zero-config setup, immediate functionality with Reddit, HN, Polymarket, and GitHub, and the ability to unlock more platforms like X, YouTube, and TikTok in 30 seconds.
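Scoring by upvotes, likes, and real money can be pictured as an engagement-weighted ranking. The sketch below is an assumption-laden illustration, not the skill's actual formula: the Result fields, the WEIGHTS values, and the log scaling are all hypothetical choices.

```python
import math
from dataclasses import dataclass


@dataclass
class Result:
    source: str
    title: str
    upvotes: int = 0
    likes: int = 0
    money_usd: float = 0.0  # hypothetical field, e.g. money behind a market


# Hypothetical weights; the post does not give the real formula.
WEIGHTS = {"upvotes": 1.0, "likes": 0.5, "money_usd": 2.0}


def engagement_score(result):
    """Log-scale each signal so one viral item does not drown out the rest."""
    return (
        WEIGHTS["upvotes"] * math.log1p(result.upvotes)
        + WEIGHTS["likes"] * math.log1p(result.likes)
        + WEIGHTS["money_usd"] * math.log1p(result.money_usd)
    )


def rank(results):
    """Return results ordered from most to least engagement."""
    return sorted(results, key=engagement_score, reverse=True)
```

Weighting money above upvotes and likes is a design choice made here for illustration; it reflects the idea that a financial stake is a costlier signal than a click.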
Technical highlights of this repository include the use of a pre-research brain built in Python, which resolves topics and figures out where to search before the search begins. The engine also features intelligent search, cross-source cluster merging, and single-pass comparisons.
This repository suits anyone who wants to stay up to date on developments in their field, including developers, researchers, and industry professionals. It searches multiple platforms at once and returns a brief summary of the most relevant information.
To get started, users can install the skill using /plugin marketplace add mvanhorn/last30days-skill or npx skills add mvanhorn/last30days-skill -g. The repository is constantly being updated with new features and improvements.
One-liner takeaway: mvanhorn/last30days-skill is an AI-driven search skill that scours multiple platforms and returns the most relevant, up-to-date information on any topic.
──────────────────────────────
Channel: https://t.me/GithubRe
PaddlePaddle/PaddleOCR caught my eye on GitHub Trending today.
https://github.com/PaddlePaddle/PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
──────────────────────────────
PaddleOCR is a leading OCR toolkit and document AI engine that converts PDF documents and images into structured, LLM-ready data with industry-leading accuracy. It features intelligent document parsing, universal text recognition, and a developer-centric ecosystem. With 70k+ Stars and trusted by top-tier projects, PaddleOCR is the bedrock for building intelligent RAG and Agentic applications.
Key features include SOTA Document VLM with 96.3% accuracy on OmniDocBench v1.6, structure-aware conversion to Markdown or JSON, and universal text recognition supporting 100+ languages. It's designed for production-ready efficiency, achieving commercial-grade accuracy with an ultra-small footprint, and is seamlessly integrated with the Hugging Face ecosystem.
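To make "structure-aware conversion to Markdown or JSON" concrete, here is a minimal sketch of the idea. It does not call PaddleOCR: the block layout (a list of dicts with "type" and "text" keys) and the helpers to_markdown and to_json are hypothetical stand-ins for whatever a document parser returns.

```python
import json

# Hypothetical mapping from a block's type to its Markdown prefix.
PREFIXES = {"title": "# ", "heading": "## ", "list_item": "- "}


def to_markdown(blocks):
    """Render typed text blocks as Markdown, one block per paragraph."""
    lines = []
    for block in blocks:
        prefix = PREFIXES.get(block["type"], "")  # plain text if type is unknown
        lines.append(f"{prefix}{block['text']}")
    return "\n\n".join(lines)


def to_json(blocks):
    """Serialise the same blocks as JSON, keeping non-ASCII text readable."""
    return json.dumps({"blocks": blocks}, ensure_ascii=False, indent=2)
```

Keeping the block type next to the text is what lets one parse feed both outputs: Markdown for an LLM prompt, JSON for a downstream pipeline.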
Whether you're a developer or researcher, PaddleOCR provides a complete pipeline to build high-quality datasets and supports various hardware backends.
One-liner takeaway: PaddleOCR simplifies document parsing and text recognition, empowering you to build intelligent applications with ease and accuracy!
──────────────────────────────
Channel: https://t.me/GithubRe