Oh My Task!
Here I share my knowledge and interests with the geeks who at least once yelled "OH MY F… TASK!"

You’ll read about OS, concurrency, multi-processing and stuff in this channel.

Reach me at @ShahriarShariati
Channel created
Alright geeks. Guess I should start posting here after almost 2 years of occupying the @ohmytask handle.

You know, focusing is difficult these days. You have so many things to think about. And the growth of technology… I mean, how the hell did we get such advanced AI tools in less than 2 years?!

So yeah, we can’t focus like our ancestors did. Neither can our computers. Or, better said, they should not focus on one thing. We don’t need them to focus; we need them to overthink many things at once. We want them to get stuck in loops. Loops like the ones in the web servers that let you see this message right after I click the send button.

Let’s talk about these tiny hardware beings that can’t focus, right here in this channel.
👍8
I always have respect for the veteran geeks who have grown up with the technology and lived its history instead of reading about it. One of them is David Beazley. He was one of the developers who used the first versions of Python (<=1.0) in big projects, and it’s cool that these guys have seen the whole change history of these tools and programming languages.

Follow him on YouTube and watch his lectures about multi-threading and asyncio.

Here he’s teaching you how to build your own async:
https://youtu.be/Y4Gt3Xjd7G8?si=A4GrpDoqQPZ8Q8G5

@OhMyTask
7👍5🖕1
Will Python get rid of the GIL?

Of course I’m excited about using the power of a Python without the annoying GIL.

But to be honest, I read their official announcement about that and it includes a lot of "if"s :)
They are really concerned about breaking changes and backward compatibility, as they should be.
They even mentioned a couple of times that "we’ll bring back the GIL if it causes problems" [1].

So let’s not get geeky-emotional about it. And remember, developers have asked the man who invented Python many times: will we have a Python without the GIL? And the answer has been, roughly, "I’m OK with it. Do it if you can do it without any breaking changes" [2].
From my point of view, he’s got a point. They really don’t want to repeat the Python 2 -> 3 challenges. They want to avoid increasing complexity. And more importantly, they don’t seem to be fully convinced by the reasons for removing the GIL.

So let’s use our tools right. Python is powerful? Right. Our code base is on it and this GIL is really annoying? Correct. But these changes won’t happen overnight, and even when they do happen, they might not match our expectations.

I prefer to use other programming languages when I’m really concerned about using the full power of my CPU. At least for the next 5 years :)
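
By the way, if you want to feel the GIL yourself, here’s the classic experiment as a minimal sketch (CPython; the workload size is arbitrary). Splitting CPU-bound counting across two threads usually takes about as long as doing it sequentially, because only one thread holds the GIL at a time:

import time
from threading import Thread

def count(n: int) -> None:
    # Pure CPU-bound work: no I/O, so threads can't overlap under the GIL.
    while n > 0:
        n -= 1

N = 50_000_000  # arbitrary workload

start = time.perf_counter()
count(N)
print(f"sequential:  {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
threads = [Thread(target=count, args=(N // 2,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"two threads: {time.perf_counter() - start:.2f}s")  # usually no faster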

@OhMyTask
👍8
Be careful. Your load balancer might make the workers jealous.

@OhMyTask
😁6
Are workers jealous of each other?

Assume the goal in life of each worker is to get the highest number of assigned tasks. We’re running a distributed processing system. The main process is going to distribute the incoming tasks to its forked processes (workers). How do we divide them fairly? You know, tasks are indivisible objects. We cannot trim a function and say: hey worker 1, do up to line 12; hey worker 2, do the rest.

What if the workers envy each other? For example, worker 1 is free right now and really needs a task to do, but you give a task to worker 2 instead, which might be full of tasks already. Worker 1 may envy worker 2.

Is there always an envy-free allocation? Unfortunately not; the world doesn’t work like that. Or maybe it just depends on your point of view. So how can we implement a task allocation strategy that keeps our workers from envying each other?

Let's do some simplification. We can't always be 100% envy-free, right? OK. Let's say a worker is jealous of some other worker, but the jealousy would disappear if we dropped just one task from the envied worker's pile (or, in our framing, handed one more task to the jealous one). Boom! An allocation where every envy is like that is called envy-free up to one good, or EF1 for short. It's a relaxation of envy-freeness in item allocation, one of the main topics in computational social choice.

The round-robin algorithm is one of the best EF1 item allocation algorithms. It assigns the incoming tasks to workers sequentially. For example, if we have 4 workers (W) and 5 tasks (T), it assigns:
T1 -> W1
T2 -> W2
T3 -> W3
T4 -> W4 (end of workers)
T5 -> W1 (start over from the first)
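
A minimal sketch of that assignment in code (worker and task names are just the ones from the example above):

from itertools import cycle

workers = ["W1", "W2", "W3", "W4"]
tasks = ["T1", "T2", "T3", "T4", "T5"]

# Walk the workers in a circle and hand each incoming task to the next one.
assignment = {w: [] for w in workers}
for worker, task in zip(cycle(workers), tasks):
    assignment[worker].append(task)

print(assignment)
# {'W1': ['T1', 'T5'], 'W2': ['T2'], 'W3': ['T3'], 'W4': ['T4']}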


It’s been proved that this kind of allocation is EF1 (no worker envies another by more than one task) if, at each round, we give one task to the worker with the most free computational resources (or the strongest desire for a new task).

So next time you see the round-robin method in broker exchanges, scheduling, load balancing, etc., remember the logic behind it. It tries to be a fair parent process to its forked children.

@OhMyTask
🔥3👍2🥴1🆒1
Let Ants do the job @OhMyTask
👍4
Decentralized Brain of Nature

What is your first reaction when you see an ant or a group of them? Scream? Get mad? Grab something and try to kill them?

OK. Chill out for one minute. Let's observe. Imagine you see a group of ants (that's the more common case) living their lives, stealing tiny pieces of your food and carrying them back to their colony.
Have you ever wondered how they're moving in a specific pattern without chaos? Are they talking to each other and yelling: Hey ant-186, look where ant-240 is going and follow him? Are they sending wireless messages to each other?

Nature knows its way. A group of ants has a decentralized brain, or, more scientifically, swarm intelligence. As they move, they emit chemicals called pheromones. These pheromones help the ants find the path to a food source and return to the nest efficiently.
A pheromone has a lifetime: as time passes, its smell gets weaker, so the ants can tell whether a trail is fresh or old. And the more pheromone on a route, the stronger the smell, and probably the more efficient and widely accepted the path.
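
In code, those two rules (evaporation and reinforcement) often look something like this. This is a minimal, illustrative sketch of ACO-style pheromone bookkeeping; all names and constants are made up, and real ACO also weighs heuristics like distance when choosing a path:

import random

pheromone = {"path_a": 1.0, "path_b": 1.0}  # pheromone level per path
EVAPORATION = 0.1  # fraction of the smell that fades each step (its "lifetime")

def choose_path() -> str:
    # Ants pick a path with probability proportional to its pheromone level.
    paths = list(pheromone)
    weights = [pheromone[p] for p in paths]
    return random.choices(paths, weights=weights)[0]

def reinforce(path: str, path_length: float) -> None:
    # Every trail evaporates a bit...
    for p in pheromone:
        pheromone[p] *= 1 - EVAPORATION
    # ...then the used path gets fresh pheromone. Shorter paths get more,
    # so efficient routes build up the strongest smell over time.
    pheromone[path] += 1.0 / path_length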

What's the point here? OK. Let's see what challenges we have in our systems. Imagine we are in a huge and complex internet network with lots of nodes, lots of connections, lots of routers, etc.
We have groups of packets going from node A to node B, and there are several ways to reach node B in this chaotic, strongly connected network. How do you efficiently find the best path to node B?

Or imagine there has been a fire accident and you're sending a couple of rescue drones to find injured people. The fire is getting worse. The drones are inside the building. How do you manage these drones so they don't get damaged and, most importantly, find the people inside the building as fast as possible?

The two challenges above, and many more like them, can't really be solved by a centralized management system. Well, they could, but not efficiently or reliably. Here the algorithms of nature can help us. Swarm intelligence algorithms such as ACO (Ant Colony Optimization), built on exactly the ant behavior we just observed, could save lots of lives.

The deeper you look at a phenomenon, the less you want to erase or kill it. You just learn from its behavior.

@OhMyTask
👍4🔥21
Suppose you're looking for a "complete enough" course on async-await programming in Python. Who better to teach you than an active Python contributor?

Take a look at this amazing ~6-hour course by Łukasz Langa (Python core developer):

https://youtube.com/playlist?list=PLhNSoGM2ik6SIkVGXWBwerucXjgP1rHmB&si=eyhr590l6I9DlIZi

@OhMyTask
👍13
What I mean by the handler of this channel @OhMyTask
😁10
The Creation of App
@OhMyTask
ASGI Lifespan

First of all, let’s talk a bit about WSGI, the protocol at the core of most modern Pythonic web frameworks.
Despite its importance, it’s really simple. All you need is a callable (a function or class) that follows a specific signature. When you run the application using a WSGI-friendly web server like Gunicorn, for each HTTP request the web server calls your callable with the data of that incoming request, and your callable should respond in a specific format.
For example:
def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain')]
    start_response(status, response_headers)
    return [b'Hello world!\n']  # the body must be an iterable of bytes

The simple_app function receives environ, which contains the data of the incoming HTTP request. start_response is a callable used to tell the web server the status and headers of the response we’re about to send (and to report errors when needed).

ASGI (Asynchronous Server Gateway Interface) is the async version of that, but it can do more, thanks to Python’s async-await. It can keep a connection open for a long time for protocols like WebSocket, and it can receive and send data multiple times over it.
Here’s an ASGI app that serves WebSocket:
async def application(scope, receive, send):
    event = await receive()
    ...
    await send({"type": "websocket.send", ...})


We can invoke different functionalities of the web server using the type key. In this example we used websocket.send to send data through the WebSocket protocol.

Now that we understand the structure of ASGI applications, assume we're going to develop a simple web app with an AI model. We need to load the model into memory before serving any request. So, how exactly do we find out when our application starts serving requests? How do we run operations before that? When does the lifetime of our ASGI application begin, and when does it finish?

Here the lifespan comes into the game. When an ASGI-friendly web server like Uvicorn loads the application and wants to start serving, it calls our application with a scope of type lifespan. Inside the application we notice that this is the beginning of our life, so we can prepare anything we need for the lifetime of the application, such as loading AI models, making DB connections, etc.

async def app(scope, receive, send):
    if scope['type'] == 'lifespan':
        while True:
            message = await receive()
            if message['type'] == 'lifespan.startup':
                ...  # Do some startup here!
                await send({'type': 'lifespan.startup.complete'})
            elif message['type'] == 'lifespan.shutdown':
                ...  # Do some shutdown here!
                await send({'type': 'lifespan.shutdown.complete'})
                return
    else:
        pass  # Handle other types

The above example shows one way of managing the application lifetime using the lifespan scope.
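
If you want to try it yourself, any ASGI server can run this app. A minimal usage sketch, assuming the code above is saved in a (hypothetical) file named myapp.py:

uvicorn myapp:app

Uvicorn sends lifespan.startup once at boot and lifespan.shutdown when you stop the server, so each branch above runs once per server lifetime.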

This feature has been part of the ASGI spec since 2018, but for various reasons it wasn't supported in ASGI-based web frameworks for a long time. Now, though, they're replacing their old-fashioned lifetime management methods with this fancy lifespan.
For example, you can now pass an async context manager to a FastAPI app as its lifespan, and it automatically handles the communication with the web server:
from contextlib import asynccontextmanager
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    create_db_connection()   # set up resources before serving starts
    yield                    # the application serves requests here
    close_db_connection()    # clean up on shutdown

app = FastAPI(lifespan=lifespan)

In the above example we easily managed the operations before and after the application's lifetime. You can learn more about lifespan in the FastAPI documentation.

This is a handy feature. It unifies all ASGI web servers to act the same way in managing the application lifetime, so you don't have to write a lifetime manager tailored to the web server you use. That was one of the challenges with WSGI applications.

@OhMyTask
🔥21
See how distributed you are

When it comes to writing apps that use concurrency or parallelism, one question always comes up: is my code really running concurrently/in parallel?

There are profiling tools that can produce a good report of your app's execution. In Python, I personally use viztracer, which is very handy. You just need to run your program like:
viztracer myapp.py

It will profile the execution of your program, tracing every process, thread, and coroutine, and store the result in a JSON file.
Then you can inspect the result in a nice web interface using this command:
vizviewer result.json
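
And if you prefer to trace just a specific part of your program instead of the whole run, viztracer also has an inline API. A minimal sketch, assuming a hypothetical function run_my_concurrent_code (check the viztracer docs for the exact options):

from viztracer import VizTracer

# Trace only this block; the report is saved when the block exits.
with VizTracer(output_file="result.json"):
    run_my_concurrent_code()  # hypothetical: the code you want to inspect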


It has happened many times that I assumed my tiny piece of code was doing concurrency or parallelism perfectly, but then I checked the running processes and found out I was wrong!

Don’t be a code-delivery programmer. Double-check your insights.

@OhMyTask
👍7🔥31👌1
An optimal system is made of an aware manager and good workers.
🤩5
Gunicorn with Uvicorn inside

I've been reading the Gunicorn and Uvicorn source code and playing with their parameters to see how they actually manage processes/threads/coroutines. I'll write a detailed blog post about it soon, but for now I want to explain why it's recommended to use the Gunicorn web server with the Uvicorn worker class.

Gunicorn provides powerful worker management with lots of customization options. Gunicorn manages workers, and workers manage Python web applications. There are two types of workers: sync and async. The term "async" here is a bit different from Python's async functions; it's about the way a worker serves requests.

Gunicorn ships with a couple of default worker classes such as sync, eventlet, gevent, tornado, and gthread. They differ in how they handle requests and web applications, and I'll explore those differences in the upcoming blog post. However, none of them can run Python's async functions, so you would lose this handy process manager just for writing in the async code style.
This is where Uvicorn comes in. Uvicorn is a standalone web server that runs ASGI applications in Python, but it doesn't have Gunicorn's worker-management abilities. The great thing the Uvicorn developers did is build a Uvicorn worker class on top of Gunicorn's base worker class. So you can run Gunicorn to manage Uvicorn workers, and those workers serve incoming requests with your async web application.

To do that you can simply run this command:
gunicorn example:app -w 4 -k uvicorn.workers.UvicornWorker

This command spawns 4 Uvicorn workers and loads the ASGI application into them, so you get the power of Gunicorn and Uvicorn together.
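
For reference, the example:app part points at an ASGI callable. A minimal sketch of what example.py could contain (the response text is made up):

# example.py -- a bare ASGI app for the command above
async def app(scope, receive, send):
    # This sketch only handles plain HTTP requests.
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello from a Uvicorn worker!"})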

You might ask: should we avoid running our apps with Uvicorn alone? The answer differs from situation to situation. Sometimes you want to keep the running process of a web application flat and simple, so you can put it as a single processing unit into Kubernetes pods or Docker containers for better management and debugging. So look at your deployment structure first.

@OhMyTask
👍5🔥2
An article worth reading: it compares Python, Java, and Go performance in multi-threaded computations.
The article shows benchmarks of running matrix multiplication, QuickSort, and Conway's Game of Life. These are algorithms with more than O(n) time complexity, so multithreading can do a lot to speed them up.

- Read Article

@OhMyTask
👍41🔥1🆒1