hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Language:
Total stars: 77
Stars trend:
15 Sep 2024
5am ██▏ +17
6am █▊ +14
7am █▎ +10
8am ▍ +3
9am ▌ +4
10am ▎ +2
11am █▍ +11
12pm ▋ +5
1pm █▏ +9
#chainofthought, #coding, #llm, #mathematics, #mcts, #openaio1, #reinforcementlearning, #strawberry
kmario23/deep-learning-drizzle
Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
Language:HTML
Total stars: 12193
Stars trend:
8 Oct 2024
3pm +1
4pm +0
5pm +0
6pm +0
7pm +0
8pm +0
9pm +0
10pm +0
11pm +0
9 Oct 2024
12am +0
1am +0
2am ██████████████████▊ +162
#html
#artificialintelligencealgorithms, #artificialneuralnetworks, #bayesianstatistics, #computervision, #deeplearning, #deepneuralnetworks, #deepreinforcementlearning, #explainableai, #geometricdeeplearning, #graphneuralnetworks, #machinelearning, #medicalimaging, #naturallanguageprocessing, #optimization, #patternrecognition, #probabilisticgraphicalmodels, #probability, #reinforcementlearning, #speechrecognition, #visualrecognition
eloialonso/diamond
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model.
Language:Python
Total stars: 367
Stars trend:
12 Oct 2024
9am ▊ +6
10am ▋ +5
11am ▏ +1
12pm █▏ +9
1pm █▋ +13
2pm █ +8
3pm ▉ +7
4pm █▍ +11
5pm █▋ +13
6pm █▎ +10
7pm █▏ +9
8pm █▍ +11
#python
#artificalintelligense, #atari, #deeplearning, #diffusionmodels, #machinelearning, #reinforcementlearning, #research, #worldmodels
AgibotTech/agibot_x1_train
The reinforcement learning training code for AgiBot X1.
Language:Python
Total stars: 109
Stars trend:
24 Oct 2024
2am ▏ +1
3am ▎ +2
4am ▉ +7
5am ██▎ +18
6am █▎ +10
7am ███▉ +31
8am ██▋ +21
#python
#opensource, #reinforcementlearning, #robotics
MaximeVandegar/Papers-in-100-Lines-of-Code
Implementation of papers in 100 lines of code.
Language:Python
Total stars: 875
Stars trend:
6 Dec 2024
12am ▏ +1
1am ▌ +4
2am ▋ +5
3am ▍ +3
4am ▌ +4
5am ▌ +4
6am ▌ +4
7am █▎ +10
8am █▎ +10
9am █▊ +14
10am ▊ +6
11am █▎ +10
#python
#3d, #aes, #artificialintelligence, #deeplearning, #diffusionmodels, #educational, #gans, #generativemodel, #implementationofresearchpaper, #inverserendering, #machinelearning, #metalearning, #nerf, #neuralradiancefields, #papers, #python, #pytorch, #reinforcementlearning, #research, #rl
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Language:Python
Total stars: 58051
Stars trend:
19 Jan 2025
9pm ▏ +1
10pm █▏ +9
11pm ▌ +4
20 Jan 2025
12am ▍ +3
1am ▍ +3
2am ▋ +5
3am █▏ +9
4am █▎ +10
5am █▏ +9
6am █▏ +9
7am █ +8
8am ▋ +5
#python
#attention, #deeplearning, #deeplearningtutorial, #gan, #literateprogramming, #lora, #machinelearning, #neuralnetworks, #optimizers, #pytorch, #reinforcementlearning, #transformer, #transformers
turningpoint-ai/VisualThinker-R1-Zero
Explore the Multimodal “Aha Moment” on 2B Model
Language:Python
Total stars: 208
Stars trend:
5 Mar 2025
1am ▍ +3
2am +0
3am ▏ +1
4am █ +8
5am ██▉ +23
6am ██▉ +23
7am ██▌ +20
8am ▉ +7
9am █▏ +9
10am ▉ +7
11am ▉ +7
12pm █▌ +12
#python
#deepseek, #deepseekr1, #deepseekr1zero, #grpo, #multimodal, #multimodaljourney, #multimodalr1, #posttraining, #r1, #r1zero, #reasoning, #reinforcementlearning
FareedKhan-dev/all-rl-algorithms
Implementation of all RL algorithms in a simpler way
Language:Jupyter Notebook
Total stars: 87
Stars trend:
30 Mar 2025
3pm ██▋ +21
4pm ███▏ +25
5pm █▉ +15
6pm █▍ +11
7pm █▊ +14
#jupyternotebook
#agent, #llm, #openai, #python, #reinforcementlearning, #rl
inclusionAI/AReaL
Distributed RL System for LLM Reasoning
Language:Python
Total stars: 374
Stars trend:
31 Mar 2025
12am ▏ +1
1am ▍ +3
2am ███████▉ +63
3am ███████▌ +60
4am ███▍ +27
#python
#llm, #llmreasoning, #machinelearningsystems, #mlsys, #reinforcementlearning, #rl
girafe-ai/ml-course
Open Machine Learning course
Language:Jupyter Notebook
Total stars: 2606
Stars trend:
9 Apr 2025
11am ▌ +4
12pm ▉ +7
1pm █▏ +9
2pm █ +8
3pm ▊ +6
4pm ▏ +1
5pm ▎ +2
6pm ▌ +4
7pm █▎ +10
8pm ▉ +7
9pm █▏ +9
10pm █▏ +9
#jupyternotebook
#computervision, #course, #deeplearning, #machinelearning, #materials, #naturallanguageprocessing, #python, #pytorch, #reinforcementlearning, #seminars
ivanbelenky/RL
R.L. methods and techniques.
Language:Python
Total stars: 108
Stars trend:
6 May 2025
10pm ▏ +1
11pm ██ +16
7 May 2025
12am █▎ +10
1am █▉ +15
2am ▉ +7
3am ▊ +6
4am ▌ +4
5am ▍ +3
6am █ +8
7am ▉ +7
8am ▍ +3
#python
#gridworld, #markov, #markovdecisionprocesses, #qlearning, #reinforcementlearning, #sarsa, #tabularmethods
NVlabs/Long-RL
Long-RL: Scaling RL to Long Sequences
Language:Python
Total stars: 241
Stars trend:
11 Jul 2025
2am ███▌ +28
3am ███▌ +28
4am █▉ +15
5am █▎ +10
6am ██▌ +20
7am ▉ +7
8am ▊ +6
9am ▉ +7
10am ▎ +2
11am ▏ +1
12pm ▍ +3
1pm ▌ +4
#python
#efficientai, #largelanguagemodels, #longsequence, #multimodality, #reinforcementlearning, #sequenceparallelism
PufferAI/PufferLib
Simplifying reinforcement learning for complex game environments
Language:C
Total stars: 2477
Stars trend:
11 Jul 2025
2pm ▌ +4
3pm ▍ +3
4pm █ +8
5pm █▍ +11
6pm █ +8
7pm █▌ +12
8pm █ +8
9pm █▏ +9
10pm █▎ +10
11pm ▎ +2
#c
#reinforcementlearning
OpenPipe/ART
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, Kimi, and more!
Language:Python
Total stars: 1030
Stars trend:
11 Jul 2025
5pm ▉ +7
6pm █▌ +12
7pm ██ +16
8pm █▌ +12
9pm █▎ +10
10pm █▎ +10
11pm ▋ +5
12 Jul 2025
12am ▉ +7
#python
#agent, #agenticai, #grpo, #kimiai, #llms, #lora, #qwen, #qwen3, #reinforcementlearning, #rl
RLinf/RLinf
RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.
Language:Python
Total stars: 197
Stars trend:
1 Sep 2025
2am ▊ +6
3am ██▌ +20
4am █▏ +9
5am █▊ +14
6am █▌ +12
7am ▊ +6
8am █▏ +9
9am █ +8
#python
#agenticai, #aiinfra, #embodiedai, #largelanguagemodels, #reinforcementlearning, #rlinfra, #rlinf, #vlarl
mujocolab/mjlab
Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research.
Language:Python
Total stars: 266
Stars trend:
29 Sep 2025
8pm ▌ +4
9pm ▉ +7
10pm ▍ +3
11pm ▊ +6
30 Sep 2025
12am ▉ +7
1am ▋ +5
2am █ +8
3am █▎ +10
4am ▉ +7
5am █▊ +14
6am █▋ +13
7am ██▏ +17
#python
#isaaclab, #mujoco, #mujocowarp, #reinforcementlearning, #roboticssimulation
enactic/openarm
A fully open-source humanoid arm for physical AI research and deployment in contact-rich environments.
Language:MDX
Total stars: 1215
Stars trend:
15 Oct 2025
2pm ▌ +4
3pm ▌ +4
4pm ▏ +1
5pm ▌ +4
6pm ▎ +2
7pm ▍ +3
8pm ▎ +2
9pm ▎ +2
10pm ▎ +2
11pm ▏ +1
16 Oct 2025
12am ▍ +3
1am ▉ +7
#mdx
#bilateralteleoperation, #forcefeedback, #genesis, #gravitycompensation, #humanoidrobot, #imitationlearning, #machinelearning, #moveit2, #mujoco, #opensource, #openarm, #python, #reinforcementlearning, #robot, #robotarm, #robotics, #ros2, #teleoperation
modelscope/AgentEvolver
AgentEvolver: Towards Efficient Self-Evolving Agent System
Language:Python
Total stars: 335
Stars trend:
18 Nov 2025
9pm ▋ +5
10pm ▏ +1
11pm ▎ +2
19 Nov 2025
12am ▍ +3
1am ▏ +1
2am ▎ +2
3am +0
4am ▍ +3
5am ▎ +2
6am ▊ +6
7am ▍ +3
#python
#agent, #agentsystem, #llm, #reinforcementlearning, #selfevolving
Renforce-Dynamics/beyondAMP
One-Step Integration of AMP into IsaacLab
Language:Python
Total stars: 93
Stars trend:
19 Nov 2025
11pm ▏ +1
20 Nov 2025
12am ▏ +1
1am ▍ +3
2am ▉ +7
3am ▏ +1
4am ▌ +4
5am ▌ +4
6am ▋ +5
7am ▎ +2
#python
#amp, #animation, #isaaclab, #motion, #reinforcementlearning, #robotics