Robi makes stuff
TGYB25_HQ.png
@robi_makes_stuff, thanks for all the effort you put into making this, that was cool asf 🔥
Henok
My honest takeaway from my recent social media detox (just under a month) is this: Before this, I hadn't gone without social media for more than twelve hours since I started using it. So, stepping away for twenty-five days and focusing on other things felt…
What I was doing the past 30-35 days:
> I started playing chess. I learned the moves, basic checkmates, and played around 100 games. My progress isn't quite what I expected, but it's not too bad. Starting chess at 19 has its challenges!
I spent most of my time searching for resources and following recommendations.
So, I'd say we have another hobby in the mix!
> I learned Pygame, a Python library for making 2D games. The learning curve was relatively gentle given my intermediate Python proficiency.
> I took an RL course using videos and books, and I'm currently attempting to implement Q-Learning and DQN on a Pong game I created using Pygame. (I'll talk about reinforcement learning in more detail someday.) Once I complete the project, it will become my first GitHub repository! 🥳
> I started reading the book "A History of God." It's a bit dense and challenging to digest, but I'm almost halfway through. I'm trying to challenge the indifference I previously held towards religion. I'm honestly developing a deeper understanding. I'll be documenting my journey here.
> I completed two books: "Rich Dad Poor Dad" and Plato's "The Republic."
I did all of this because I had ample free time. I took a three-week leave from AASTU and went to my hometown.
Henok
What I was doing the past 30-35 days: > I started playing chess. I learned the moves, basic checkmates, and played around 100 games. My progress isn't quite what I expected, but it's not too bad. Starting chess at 19 has its challenges! I spent most of my…
This is the most productive streak I have ever had.
Not too bad, right?
Henok
Once I complete the project
Looks like I have completed the project, though I don't think it is fully optimized (will do it step by step). Demo video incoming...
This is my agent's performance after 25 episodes. The agent is on the right side.
And this after 315 episodes.
This implementation uses Deep Q-Learning (DQN), where an AI agent learns to play Pong through trial and error. The system works through an ongoing cycle of observation, decision making, and learning. At each moment, the AI observes the game state - represented by 7 numbers capturing the ball's position, velocity, predicted future position, and paddle location. The neural network processes this information to estimate the quality (Q-value) of each possible action: moving up, staying still, or moving down.
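For illustration, here is roughly how such a 7-number state could be assembled each frame. The exact features, the scaling, and the naive intercept prediction are my assumptions (ball and paddle are hypothetical sprite-like objects with x, y, vx, vy attributes), not necessarily what the actual code does:

```python
import numpy as np

def build_state(ball, paddle, screen_w, screen_h):
    """Hypothetical 7-feature observation for the Pong agent:
    ball position, ball velocity, a predicted intercept point,
    and the paddle's location, all roughly normalized."""
    # Naive guess of where the ball will cross the paddle's x-line,
    # ignoring wall bounces (a fuller version would reflect off the walls).
    if ball.vx != 0:
        frames_to_paddle = max((paddle.x - ball.x) / ball.vx, 0.0)
    else:
        frames_to_paddle = 0.0
    predicted_y = ball.y + ball.vy * frames_to_paddle

    return np.array([
        ball.x / screen_w,                           # ball x position
        ball.y / screen_h,                           # ball y position
        ball.vx / 10.0,                              # ball x velocity (scaled)
        ball.vy / 10.0,                              # ball y velocity (scaled)
        np.clip(predicted_y / screen_h, 0.0, 1.0),   # predicted intercept y
        paddle.y / screen_h,                         # paddle y position
        (ball.y - paddle.y) / screen_h,              # ball-paddle offset
    ], dtype=np.float32)
```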
A critical element is the epsilon-greedy strategy (ε), which balances exploration and exploitation. Initially set to 1.0 (100% random actions), epsilon gradually decays to 0.05 (5% random actions) as training progresses. This allows the AI to first explore various strategies before committing to its learned knowledge. The neural network, structured with three hidden layers (512-256-128 neurons) using ReLU activation and layer normalization, continuously improves through experience.
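A minimal sketch of that network and the ε-greedy action choice, assuming PyTorch; the framework and everything beyond the 512-256-128 layer sizes are assumptions on my part:

```python
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Q-network sketch: 7 inputs -> 512 -> 256 -> 128 -> 3 Q-values,
    with ReLU activations and layer normalization, as described above."""
    def __init__(self, state_dim=7, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 512), nn.LayerNorm(512), nn.ReLU(),
            nn.Linear(512, 256), nn.LayerNorm(256), nn.ReLU(),
            nn.Linear(256, 128), nn.LayerNorm(128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def select_action(q_net, state, epsilon, n_actions=3):
    """Epsilon-greedy: random action with probability epsilon,
    otherwise the action with the highest predicted Q-value."""
    if random.random() < epsilon:
        return random.randrange(n_actions)               # explore
    with torch.no_grad():
        q_values = q_net(torch.as_tensor(state).unsqueeze(0))
    return int(q_values.argmax(dim=1).item())            # exploit
```

The decay itself can be as simple as epsilon = max(0.05, epsilon * 0.995) once per episode, which matches the 1.0 → 0.05 schedule above; the 0.995 rate is just a placeholder.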
Learning happens through the replay memory system, which stores past experiences (state, action, reward, next state). By randomly sampling from these memories, the AI learns from a diverse set of situations rather than just recent ones. The target network, a periodically updated copy of the main network, provides stable learning targets to prevent destructive feedback loops. The reward system carefully balances multiple factors: points for intercepting the ball, bonuses for strategic angles, and small penalties for unnecessary movement to discourage jittery behavior.
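Here is a rough sketch of how the replay memory and target network typically come together in a DQN update step; the buffer size, batch size, discount factor, sync interval, and Huber loss below are placeholders rather than the project's actual values:

```python
import random
from collections import deque
import numpy as np
import torch
import torch.nn.functional as F

memory = deque(maxlen=100_000)            # replay memory of (s, a, r, s', done)
GAMMA, BATCH_SIZE, TARGET_SYNC = 0.99, 64, 1_000

def remember(state, action, reward, next_state, done):
    memory.append((state, action, reward, next_state, done))

def train_step(q_net, target_net, optimizer, step):
    if len(memory) < BATCH_SIZE:
        return None
    # Sample a diverse batch of past experiences, not just the latest ones.
    batch = random.sample(memory, BATCH_SIZE)
    states, actions, rewards, next_states, dones = zip(*batch)
    states = torch.as_tensor(np.stack(states), dtype=torch.float32)
    actions = torch.as_tensor(actions, dtype=torch.int64).unsqueeze(1)
    rewards = torch.as_tensor(rewards, dtype=torch.float32)
    next_states = torch.as_tensor(np.stack(next_states), dtype=torch.float32)
    dones = torch.as_tensor(dones, dtype=torch.float32)

    # Q(s, a) for the actions that were actually taken.
    q_sa = q_net(states).gather(1, actions).squeeze(1)
    # Stable targets come from the periodically synced target network.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + GAMMA * next_q * (1.0 - dones)

    loss = F.smooth_l1_loss(q_sa, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Periodically copy the online network into the target network.
    if step % TARGET_SYNC == 0:
        target_net.load_state_dict(q_net.state_dict())
    return loss.item()
```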
As training progresses, you'll see the rally lengths increase from just a few hits to hundreds. The metrics logger tracks four key indicators: rally length (showing gameplay improvement), episode rewards (measuring overall performance), training loss (indicating learning stability), and epsilon (tracking the exploration rate). These metrics automatically generate visual reports every 100 episodes, giving you clear insight into the AI's learning progress without cluttering the game interface. The entire system demonstrates how reinforcement learning can create competent game AI through autonomous trial-and-error learning, without any pre-programmed strategies.
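A bare-bones version of that logger might look like this; the file naming and plotting layout are assumptions, and only the four tracked metrics and the 100-episode reporting interval come from the description above:

```python
import matplotlib
matplotlib.use("Agg")          # write plots to files; no game-window clutter
import matplotlib.pyplot as plt

class MetricsLogger:
    """Tracks rally length, episode reward, training loss, and epsilon,
    and writes a 4-panel plot every `report_every` episodes."""
    def __init__(self, report_every=100):
        self.report_every = report_every
        self.history = {"rally": [], "reward": [], "loss": [], "epsilon": []}

    def log(self, rally, reward, loss, epsilon):
        self.history["rally"].append(rally)
        self.history["reward"].append(reward)
        self.history["loss"].append(loss if loss is not None else 0.0)
        self.history["epsilon"].append(epsilon)
        episode = len(self.history["rally"])
        if episode % self.report_every == 0:
            self._plot(episode)

    def _plot(self, episode):
        fig, axes = plt.subplots(2, 2, figsize=(10, 6))
        for ax, (name, values) in zip(axes.flat, self.history.items()):
            ax.plot(values)
            ax.set_title(name)
            ax.set_xlabel("episode")
        fig.tight_layout()
        fig.savefig(f"metrics_{episode}.png")
        plt.close(fig)
```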
#RL #PYGAME #MYPROJECTS #PONG
You might think Pong is a super simple game, and it is for humans! But teaching a computer to play it well using reinforcement learning is surprisingly tough. It's like trying to teach a baby to ride a bike.
First, you have to tell the computer what "good" looks like. That means defining the rewards. Should it get a reward just for hitting the ball? For not missing? Only for scoring? If you don't get the rewards right, the computer might learn to do weird things instead of actually trying to win. To fix this, we need to be very careful about the reward function. We might give it a small reward for hitting the ball, a bigger reward for scoring, and a penalty for missing the ball. The key is to guide it toward the right goal.
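A toy reward function along those lines could look like the following; the specific numbers are purely illustrative:

```python
def compute_reward(hit_ball, scored, missed, paddle_moved_needlessly):
    """Illustrative reward shaping for a Pong agent: a small reward for a hit,
    a larger reward for scoring, a penalty for missing, and a tiny cost
    for pointless movement."""
    reward = 0.0
    if hit_ball:
        reward += 0.5          # keep the rally going
    if scored:
        reward += 1.0          # the real objective
    if missed:
        reward -= 1.0          # losing the point hurts
    if paddle_moved_needlessly:
        reward -= 0.01         # discourage jittery wiggling
    return reward
```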
Second, the computer has to try everything. It has to experiment with moving up, down, and staying still in every possible situation. That's a lot of options! It's like the baby randomly wobbling all over the place on the bike. It takes a long time for the computer to figure out what works and what doesn't. To tackle this, we can use techniques like Deep Q-Networks (DQN). DQN helps the computer learn which actions are likely to be good in different situations, so it doesn't have to try everything randomly. It's like the baby slowly learning to balance and steer.
So, even though Pong seems simple, making an AI learn to play it well using RL requires some clever tricks: a well-designed reward system, a powerful learning algorithm, and a lot of patience!
#RL #Pong
It took me several days to fine-tune the hyperparameters so that the training works optimally. If anything is off, the AI will just stop learning. The above screenshot is a good example: I kept it training overnight, and after 12,000 episodes the AI was still not showing good numbers. I had to add more sophistication to my dqn_agent code, with a larger batch size and more layers, and I adjusted the reward and penalty parameters.
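For context, these are the kinds of knobs I mean; the values below are placeholders to show the shape of the config, not the ones I actually settled on.

```python
# Illustrative DQN hyperparameters of the kind that needed tuning
# (placeholder values, not the ones actually used in the project).
HYPERPARAMS = {
    "batch_size": 128,                 # larger batches helped stabilize the loss
    "hidden_layers": (512, 256, 128),  # network width/depth
    "learning_rate": 1e-4,
    "gamma": 0.99,                     # discount factor
    "epsilon_start": 1.0,
    "epsilon_min": 0.05,
    "epsilon_decay": 0.995,
    "target_sync_every": 1_000,        # steps between target-network copies
    "replay_capacity": 100_000,
    "reward_hit": 0.5,                 # reward/penalty shaping knobs
    "reward_score": 1.0,
    "penalty_miss": -1.0,
    "penalty_jitter": -0.01,
}
```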
Henok
And this after 315 episodes.
With this progress, if I keep it training for several hours, the agent will be practically flawless, given that the Pygame simulation works perfectly.
Here is how the training is progressing. Ironically, the auto player is losing to the AI agent; it has become flawless. Another reason the rally is stunted at 84 is that Pygame is glitching a bit (the ball sometimes goes through the paddle instead of rebounding). The exploitation rate has also stabilized.
Forwarded from et/acc
This milestone marks a distinct moment for the acceleration of Ethiopian tech startups, akin to Roger Bannister's historic sub-four-minute mile on May 6, 1954, a feat once thought impossible that opened the door for many to follow.
Better-Auth's traction so far signals an acceleration in Ethiopia's tech ecosystem, showcasing "Made in Ethiopia, built for the world" innovation. With its rapid global adoption and recognition, Better-Auth sets a new standard and serves as an inspiration for young Ethiopian builders aiming for global impact.
What once seemed out of reach is now real. From Addis to the Valley. Shipping Execution.
From this moment forward, the path is open. Expect more Ethiopian founders to chase global ambitions, and succeed.
Signals a new dawn for Ethiopian Acceleration. 🇪🇹/ACC 🔥
et/acc
From Addis to the Valley. Shipping Execution.
This is so cool, congrats to the creators once again 🔥
Forwarded from Dagmawi Babi
Interestingly sad graph.
Software dev job postings have decreased SUBSTANTIALLY.
Why does this feel so good 🥺