Henok – Telegram

Henok

1.58K subscribers

832 photos

122 videos

165 files

170 links

Henok here. Just a messy collection of interesting things to improve your life or make it worse!
Reach me at @StoicallyAwake.

And this after 315 episodes.

🔥4👀2⚡1

353 views Stellar🌌, 08:34

This implementation uses Deep Q-Learning (DQN), where an AI agent learns to play Pong through trial and error. The system works through an ongoing cycle of observation, decision-making, and learning. At each moment, the AI observes the game state, represented by 7 numbers capturing the ball's position, velocity, predicted future position, and paddle location. The neural network processes this information to estimate the quality (Q-value) of each possible action: moving up, staying still, or moving down.
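
For illustration, here is a minimal Python sketch of two pieces from the description above: predicting where the ball will cross the paddle's column, and picking the action with the highest Q-value. The function names, the coordinate layout, and the wall-reflection math are assumptions made for this example, not code taken from the project.

```python
ACTIONS = ("up", "stay", "down")  # the three actions named in the post


def predict_intercept_y(ball_x, ball_y, vel_x, vel_y, paddle_x, field_height):
    """Estimate the ball's y when it reaches paddle_x, bouncing off top/bottom walls.

    Hypothetical helper: assumes straight-line motion and perfect reflections.
    """
    if vel_x == 0:
        return ball_y  # no horizontal motion, nothing to predict
    time_to_paddle = (paddle_x - ball_x) / vel_x
    if time_to_paddle < 0:
        return ball_y  # ball is moving away from the paddle
    raw_y = ball_y + vel_y * time_to_paddle
    # Fold the straight-line path back into [0, field_height] to mimic bounces.
    period = 2 * field_height
    folded = raw_y % period
    return period - folded if folded > field_height else folded


def greedy_action(q_values):
    """Return the action whose estimated Q-value is highest."""
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return ACTIONS[best_index]
```

With a field 100 units tall, a ball starting at y = 50 and moving diagonally toward a paddle 100 units away bounces once and comes back to y = 50, which is what the fold step computes.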

A critical element is the epsilon-greedy strategy (ε), which balances exploration and exploitation. Initially set to 1.0 (100% random actions), epsilon gradually decays to 0.05 (5% random actions) as training progresses. This allows the AI to first explore various strategies before committing to its learned knowledge. The neural network, structured with three hidden layers (512-256-128 neurons) using ReLU activation and layer normalization, continuously improves through experience.
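
The epsilon-greedy idea can be sketched in a few lines of plain Python. The start and end values (1.0 and 0.05) come from the post; the multiplicative per-episode decay factor and the function names are assumptions, since the post does not say how epsilon is decayed.

```python
import random

EPS_START = 1.0   # from the post: 100% random actions at the start
EPS_END = 0.05    # from the post: 5% random actions at the end
EPS_DECAY = 0.995  # assumption: multiplicative decay applied once per episode


def decayed_epsilon(episode):
    """Exploration rate after a given number of episodes, floored at EPS_END."""
    return max(EPS_END, EPS_START * EPS_DECAY ** episode)


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit
```

With epsilon at 0 the function always returns the index of the largest Q-value; with epsilon at 1 it ignores the Q-values entirely.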

Learning happens through the replay memory system, which stores past experiences (state, action, reward, next state). By randomly sampling from these memories, the AI learns from a diverse set of situations rather than just recent ones. The target network, a periodically updated copy of the main network, provides stable learning targets to prevent destructive feedback loops. The reward system carefully balances multiple factors: points for intercepting the ball, bonuses for strategic angles, and small penalties for unnecessary movement to discourage jittery behavior.
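
A rough sketch of the replay memory and of the learning target that the target network supplies, written in plain Python rather than PyTorch to keep it short. The class and function names, the extra `done` flag, and the discount factor `gamma` are assumptions for illustration; the post only lists (state, action, reward, next state).

```python
import random
from collections import deque, namedtuple

# "done" is an added assumption so the target can stop at the end of an episode.
Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayMemory:
    """Fixed-size store of past experiences; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling mixes old and recent experiences.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_from_target_net, done, gamma=0.99):
    """Standard DQN target: reward plus the discounted best next Q-value.

    next_q_from_target_net stands for the target network's Q-values
    for the next state (one number per action).
    """
    if done:
        return reward
    return reward + gamma * max(next_q_from_target_net)
```

The main network is trained to move its Q-value for the taken action toward `td_target`, while the target network's weights are only refreshed from the main network every so often.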

As training progresses, you'll see the rally lengths increase from just a few hits to hundreds. The metrics logger tracks four key indicators: rally length (showing gameplay improvement), episode rewards (measuring overall performance), training loss (indicating learning stability), and epsilon (tracking the exploration rate). These metrics automatically generate visual reports every 100 episodes, giving you clear insight into the AI's learning progress without cluttering the game interface. The entire system demonstrates how reinforcement learning can create competent game AI through autonomous trial-and-error learning, without any pre-programmed strategies.
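
A minimal sketch of a metrics logger along these lines. The class name and method names are hypothetical, and instead of drawing charts it returns the averages of the last window of episodes, which a plotting step could then visualize.

```python
from statistics import mean


class MetricsLogger:
    """Tracks the four indicators from the post and summarizes them periodically."""

    def __init__(self, report_every=100):
        self.report_every = report_every  # the post reports every 100 episodes
        self.history = {
            "rally_length": [],
            "episode_reward": [],
            "loss": [],
            "epsilon": [],
        }

    def log(self, rally_length, episode_reward, loss, epsilon):
        """Record one episode; return a summary when a report is due, else None."""
        self.history["rally_length"].append(rally_length)
        self.history["episode_reward"].append(episode_reward)
        self.history["loss"].append(loss)
        self.history["epsilon"].append(epsilon)
        episodes_logged = len(self.history["epsilon"])
        if episodes_logged % self.report_every == 0:
            return self.summary(self.report_every)
        return None

    def summary(self, window):
        """Average of each metric over the most recent `window` episodes."""
        return {name: mean(values[-window:]) for name, values in self.history.items()}
```

Calling `log(...)` once per episode keeps reporting out of the game loop, so nothing extra needs to be drawn on the game screen.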

#RL #PYGAME #MYPROJECTS #PONG

🔥5👍1💯1🏆1

339 views Stellar🌌, 08:40

You might think Pong is a super simple game, and it is for humans! But teaching a computer to play it well using reinforcement learning is surprisingly tough. It's like trying to teach a baby to ride a bike.

First, you have to tell the computer what "good" looks like. That means defining the rewards. Should it get a reward just for hitting the ball? For not missing? Only for scoring? If you don't get the rewards right, the computer might learn to do weird things instead of actually trying to win. To fix this, we need to be very careful about the reward function. We might give it a small reward for hitting the ball, a bigger reward for scoring, and a penalty for missing the ball. The key is to guide it toward the right goal.
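
To make the reward idea concrete, here is a small Python sketch. The three event flags and the specific numbers are assumptions chosen for illustration; only the ordering (small reward for a hit, bigger reward for scoring, penalty for a miss) comes from the text.

```python
HIT_REWARD = 0.1     # assumption: small reward for hitting the ball
SCORE_REWARD = 1.0   # assumption: bigger reward for scoring a point
MISS_PENALTY = -1.0  # assumption: penalty for missing the ball


def step_reward(hit_ball, scored, missed):
    """Combine the events of one game step into a single reward number."""
    reward = 0.0
    if hit_ball:
        reward += HIT_REWARD
    if scored:
        reward += SCORE_REWARD
    if missed:
        reward += MISS_PENALTY
    return reward
```

If the hit reward were larger than the score reward, the agent could learn to prolong rallies instead of trying to win, which is the kind of weird behavior the paragraph warns about.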

Second, the computer has to try everything. It has to experiment with moving up, down, and staying still in every possible situation. That's a lot of options! It's like the baby randomly wobbling all over the place on the bike. It takes a long time for the computer to figure out what works and what doesn't. To tackle this, we can use techniques like Deep Q-Networks (DQN). DQN helps the computer learn which actions are likely to be good in different situations, so it doesn't have to try everything randomly. It's like the baby slowly learning to balance and steer.

So, even though Pong seems simple, making an AI learn to play it well using RL requires some clever tricks, a well-designed reward system, a powerful learning algorithm, and a lot of patience!

#RL #Pong

🔥3💯1🏆1

360 views Stellar🌌, edited 08:43

It took me several days to fine-tune the hyperparameters so that training works well. If something is off, the AI just stops learning. The screenshot above is a good example: I left it training overnight, and after 12,000 episodes the AI still wasn't showing good numbers. I had to add more sophistication to my dqn_agent code, such as a larger batch size and more layers, and adjust the reward and penalty parameters.

🔥1🏆1

426 views Stellar🌌, 08:48

And this after 315 episodes.

With this progress, if I keep it training for several hours, the agent will be practically flawless, provided the pygame simulation works perfectly.

🔥3

403 views Stellar🌌, 08:52

Here is how the training is progressing. Ironically, the auto player is losing to the AI agent, which has become flawless. Another reason the rally is stunted at 84 is that the pygame simulation is glitching a bit (the ball sometimes goes through the paddle instead of rebounding). The exploitation rate has also stabilized.

1👀3🫡2👍1

416 views Stellar🌌, edited 09:57

If you have any thoughts, let me know

🔥7

388 views Stellar🌌, 09:59

Forwarded from et/acc

This milestone marks a distinct moment in the acceleration of Ethiopian tech startups, akin to Roger Bannister's historic sub-four-minute mile run on May 6, 1954, a feat once thought impossible that opened the door for many to follow.

Better-Auth’s traction so far signals an acceleration in Ethiopia’s tech ecosystem, showcasing "Made in Ethiopia, built for the world" innovation. With its rapid global adoption and recognition, Better-Auth sets a new standard and inspiration for young future Ethiopian builders aiming for global impact.

What once seemed out of reach is now real. From Addis to the Valley. Shipping Execution.

From this moment forward, the path is open. Expect more Ethiopian founders to chase global ambitions, and succeed.

Signals a new dawn for Ethiopian Acceleration. 🇪🇹/ACC 🔥

🔥5❤1

420 views Stellar🌌, 10:00

From Addis to the Valley. Shipping Execution.

This is so cool, congrats to the creators once again🔥

🔥5

495 views Stellar🌌, 10:01

AMOR FATI

❤‍🔥7

434 views Stellar🌌, 19:36

Forwarded from Dagmawi Babi

Interestingly sad graph.

Software dev job postings have decreased SUBSTANTIALLY.

😭6

390 views Stellar🌌, 13:45

Buzayehu_Demissie_ብዙአየሁ_ደምሴ_salayesh_lyrics__ሳላይሽ____Ethiolyrics(256k)

❤4

423 views Stellar🌌, 14:56

Why does this feel so good🥺

❤8❤‍🔥3👍1

476 views Stellar🌌, 06:58

Peak COD imo, what about for you?

🔥4

531 views Stellar🌌, 07:12

Okay, one fact about me:

I HAVE SOCIAL ANXIETY🙂

👀12

535 views Stellar🌌, 07:21

Forwarded from V Put-in

When childhood ends, responsibility without power begins

🔥6⚡2❤2💯1

568 views Stellar🌌, 11:21

Who said programming is simpler than math?

💯6👍1

387 views Stellar🌌, 10:22

"One of the most frequent questions I get asked in my DMs is: "How do I become good at anything?" It's a question we all grapple with at some point because feeling inadequate simply isn't enjoyable. So, what does it truly take to excel in a skill or area of knowledge?

First and foremost: cultivate curiosity. Without it, you're essentially inert. Think of it as the engine that drives your learning journey. It's that intrinsic desire to ask "why," "what if," and "how does this work?" Curiosity is a gift we're all born with; it's just a matter of nurturing it.

Next comes genuine interest. Curiosity sparks the initial questions, but interest sustains you on the path to finding answers. It's the force that compels you to dig deeper, to explore the nuances, and to truly connect with the subject matter. It transforms a fleeting question into a lasting pursuit.

But curiosity and interest alone aren't enough. You also need intentional courage. This isn't about grand, heroic acts; it's about the daily, deliberate choices that prioritize growth: the courage to allocate time for focused practice, the courage to set boundaries and eliminate unnecessary distractions, the courage to prioritize long-term goals over immediate gratification. It's about pushing through the moments when you feel like giving up, when the path ahead seems dauntingly steep.

And recognize this: learning is effortful. Growth isn't free. It requires an investment of time, energy, and focus. There will be frustration, setbacks, and moments of self-doubt. But it's in those challenging moments that the real learning happens. Don't expect instant gratification; focus on consistent progress.

Remember, becoming good at anything is a journey, not a destination. It's about embracing the process, celebrating small wins, and continuously striving to learn and improve.

As the saying goes, "The master has failed more times than the beginner has even tried." So, embrace the failures, learn from your mistakes, and keep moving forward."

🔥11

577 views Stellar🌌, 10:37

Learning is not free, you have to pay attention.

-Richard Feynman

💯6❤‍🔥5

402 views Stellar🌌, 10:38

it will become my first GitHub repository! 🥳

Here we go. My first GitHub commit. Stars are appreciated. There is more to come; I'm currently working on 2 projects.

https://github.com/HenokNet/pong-rainbow

A Rainbow DQN agent that learns to master Pong against a physics based opponent using PyTorch and reinforcement learning. - HenokNet/pong-rainbow

1🔥4

456 views Stellar🌌, 06:05

Does anybody know about iCog Labs and have details about their new internship call? Let me know in the comments section or DM me directly🙏

426 views Stellar🌌, 07:35