#python #atari #deepmind_lab #gcp #google_research_football #impala #r2d2 #rl #tf2
https://github.com/google-research/seed_rl
GitHub - google-research/seed_rl: SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA…
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture. - google-research/seed_rl
#python #alphago #alphazero #deep_learning #deep_reinforcement_learning #gym #machine_learning #mcts #model_based_rl #monte_carlo_tree_search #muzero #muzero_general #neural_network #python3 #pytorch #reinforcement_learning #residual_network #rl #self_learning #tensorboard
https://github.com/werner-duvaud/muzero-general
GitHub - werner-duvaud/muzero-general: MuZero
MuZero. Contribute to werner-duvaud/muzero-general development by creating an account on GitHub.
#python #deep_reinforcement_learning #gym #hyperparameter_optimization #hyperparameter_search #hyperparameter_tuning #lab #openai #optimization #pybullet #pybullet_environments #pytorch #reinforcement_learning #rl #robotics #sde #stable_baselines #tuning_hyperparameters
https://github.com/DLR-RM/rl-baselines3-zoo
GitHub - DLR-RM/rl-baselines3-zoo: A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter…
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included. - DLR-RM/rl-baselines3-zoo
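For context, the zoo wraps the kind of training loop you would otherwise write by hand with Stable-Baselines3, adding tuned per-environment hyperparameters, experiment management, and Optuna-based hyperparameter search. A minimal hand-rolled SB3 run is shown below for comparison (the zoo's own CLI and flags are documented in its README):

```python
# Minimal Stable-Baselines3 training run, for comparison with what the zoo
# automates (tuned hyperparameters, logging, evaluation, hyperparameter search).
from stable_baselines3 import PPO

model = PPO("MlpPolicy", "CartPole-v1", verbose=1)  # env can be given by id
model.learn(total_timesteps=10_000)
model.save("ppo_cartpole")  # the zoo also ships pre-trained agents like this
```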
#python #ai #control #decision_making #distributed_computing #machine_learning #marl #model_based_reinforcement_learning #multi_agent_reinforcement_learning #pytorch #reinforcement_learning #rl #robotics #torch
https://github.com/pytorch/rl
GitHub - pytorch/rl: A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning. - pytorch/rl
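To give a flavor of the primitive-first design: environments produce TensorDicts, and a short rollout is a single call. A minimal sketch, assuming gymnasium is installed alongside torchrl (not a training loop):

```python
# Minimal TorchRL sketch: wrap a Gym environment and collect a short rollout
# as a TensorDict with nested keys such as "observation" and ("next", "reward").
from torchrl.envs import GymEnv

env = GymEnv("CartPole-v1")          # requires the gymnasium backend
reset_td = env.reset()               # TensorDict holding the initial observation
rollout = env.rollout(max_steps=10)  # uses random actions when no policy is given
print(rollout)                       # prints the rollout's batch size and keys
```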
#python #agent #agentic_ai #grpo #kimi_ai #llms #lora #qwen #qwen3 #reinforcement_learning #rl
ART is a tool that helps you train agents for real-world tasks with reinforcement learning, specifically the GRPO method. The standout feature is RULER, which lets you skip the hard work of designing reward functions: you describe the task in plain language, and a large language model scores how well your agent is doing. RULER works for any task and often performs as well as or better than hand-crafted rewards, which makes building and improving agents much faster. You can install ART with a single command and start training right away, on your own machine or with cloud resources. A generic sketch of the LLM-as-judge idea follows this entry.
https://github.com/OpenPipe/ART
GitHub - OpenPipe/ART: Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on…
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more! - OpenPipe/ART
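A minimal sketch of the LLM-as-judge idea behind RULER, written against the plain OpenAI client; the task string, prompt, judge model, and function are illustrative assumptions, not ART's actual API (ART packages this together with a GRPO trainer):

```python
# Illustrative sketch only (not ART's API): an LLM judge scores a group of
# candidate agent trajectories against a plain-language task description,
# replacing a hand-written reward function.
import json
from openai import OpenAI

client = OpenAI()  # any OpenAI-compatible endpoint works here

TASK = "Find and summarize the three most recent emails about invoices."

def judge_trajectories(trajectories: list[str]) -> list[float]:
    """Return one 0-1 score per trajectory, as judged by the LLM."""
    prompt = (
        f"Task: {TASK}\n\n"
        "Score each agent trajectory from 0 (failed) to 1 (solved perfectly). "
        "Reply with a JSON list of numbers, one per trajectory.\n\n"
        + "\n\n".join(f"Trajectory {i}:\n{t}" for i, t in enumerate(trajectories))
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # judge model is an assumption
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(resp.choices[0].message.content)

# GRPO compares trajectories sampled for the same prompt against each other,
# so these relative scores can stand in for a reward signal.
```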
#python #gym #gym_environment #reinforcement_learning #reinforcement_learning_agent #reinforcement_learning_environments #rl_environment #rl_training
NeMo Gym helps you build and run reinforcement-learning training environments for large language models. You develop, test, and collect verified rollouts separately from the training loop, then integrate with your preferred RL framework and model endpoints (OpenAI, vLLM, etc.). It ships ready-made resource servers, datasets, and patterns for multi-step, multi-turn, and tool-using scenarios, runs on a typical dev machine (no GPU required), and is early-stage with evolving APIs and docs. The benefit: you can generate high-quality, verifiable training data faster and plug it into existing training pipelines to improve model behavior. A generic sketch of rollout collection follows this entry.
https://github.com/NVIDIA-NeMo/Gym
GitHub - NVIDIA-NeMo/Gym: Build RL environments for LLM training
Build RL environments for LLM training. Contribute to NVIDIA-NeMo/Gym development by creating an account on GitHub.
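As a generic illustration of the rollout-collection side (not NeMo Gym's actual API), the sketch below queries an OpenAI-compatible endpoint such as a local vLLM server, applies a stand-in verifier, and writes verified rollouts to JSONL for a downstream training pipeline; the URL, model name, and verifier are assumptions:

```python
# Illustrative sketch only (not NeMo Gym's API): collect rollouts from an
# OpenAI-compatible endpoint, keep only the ones a verifier accepts, and
# store them for later RL training.
import json
from openai import OpenAI

# A local vLLM server typically exposes an OpenAI-compatible API at this URL.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def collect_rollout(prompt: str, model: str = "Qwen/Qwen2.5-7B-Instruct") -> dict:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return {"prompt": prompt, "completion": resp.choices[0].message.content}

def is_verified(rollout: dict) -> bool:
    # Stand-in check; a real resource server would verify tool results,
    # unit tests, exact answers, etc.
    return "42" in rollout["completion"]

rollouts = [collect_rollout("What is 6 * 7? Answer with just the number.")]
with open("verified_rollouts.jsonl", "w") as f:
    for r in filter(is_verified, rollouts):
        f.write(json.dumps(r) + "\n")
```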