Python etc
Regular tips about Python and programming in general

Owner — @pushtaev

© CC BY-SA 4.0 — mention if repost
If you want a context manager to perform asynchronous operations upon entering or exiting a context, you should use an asynchronous context manager. Instead of calling m.__enter__() and m.__exit__(), Python does await m.__aenter__() and await m.__aexit__() respectively.

Asynchronous context managers are used with async with syntax:

import asyncio

class Slow:
    def __init__(self, delay):
        self._delay = delay

    async def __aenter__(self):
        await asyncio.sleep(self._delay / 2)

    async def __aexit__(self, *exception):
        await asyncio.sleep(self._delay / 2)

async def main():
    async with Slow(1):
        print('slow')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
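
Since Python 3.7 you don't have to write a class at all: contextlib.asynccontextmanager builds an asynchronous context manager from a coroutine. A rough sketch equivalent to the Slow class above (asyncio.run is also 3.7+):

import asyncio
from contextlib import asynccontextmanager

@asynccontextmanager
async def slow(delay):
    await asyncio.sleep(delay / 2)      # runs on __aenter__
    try:
        yield
    finally:
        await asyncio.sleep(delay / 2)  # runs on __aexit__

async def main():
    async with slow(1):
        print('slow')

asyncio.run(main())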
In Python 3, the keys, values and items methods of dicts return view objects. Back in Python 2 they returned lists. The main difference is that views don't store all items in memory but yield them as they are requested. That works just fine as long as you only iterate over keys (which you usually do), but you can't access elements by index anymore.

TypeError: 'dict_keys' object does not support indexing


You could argue that you don't really need to index keys since their order is arbitrary, but that's not completely true. First, d.keys()[0] can be a proper way to get any key in Python 2 (use next(iter(d)) in Python 3). Second, since Python 3.7 dicts preserve insertion order.
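
A minimal illustration (the exact error message varies between Python versions):

>>> d = dict(a=1, b=2)
>>> d.keys()[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict_keys' object does not support indexing
>>> next(iter(d))
'a'
>>> list(d)[0]
'a'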
Every method can be treated as a plain function and called with a custom self:

In : class A:
...:     def foo(self):
...:         return self
...:

In : A().foo
Out: <bound method A.foo of <...>>

In : A.foo
Out: <function __main__.A.foo>

In : A.foo(A())
Out: <__main__.A at 0x7f55ddd32898>


You can even convert a function back into a bound method. Any function is a descriptor, so it can be abused by calling __get__ explicitly:

In [8]: b = A()

In [9]: A.foo.__get__(b, A)
Out[9]: <bound method A.foo of <...>>
Different asyncio tasks obviously have different stacks. You can view all of them at any moment: asyncio.all_tasks() returns all currently running tasks, and task.get_stack() returns the stack of a given task.

import linecache
import asyncio
import random


async def producer(queue):
    while True:
        await queue.put(random.random())
        await asyncio.sleep(0.01)

async def avg_printer(queue):
    total = 0
    cnt = 0
    while True:
        while queue.qsize():
            x = await queue.get()
            total += x
            cnt += 1
            queue.task_done()
        print(total / cnt)
        await asyncio.sleep(1)

async def monitor():
    while True:
        await asyncio.sleep(1.9)
        for task in asyncio.all_tasks():
            if task is not asyncio.current_task():
                f = task.get_stack()[-1]
                last_line = linecache.getline(
                    f.f_code.co_filename,
                    f.f_lineno,
                    f.f_globals,
                )
                print(task)
                print('\t', last_line.strip())

        print()

async def main():
    loop = asyncio.get_event_loop()

    queue = asyncio.Queue()

    loop.create_task(producer(queue))
    loop.create_task(producer(queue))
    loop.create_task(producer(queue))

    loop.create_task(avg_printer(queue))
    loop.create_task(monitor())


loop = asyncio.get_event_loop()
loop.create_task(main())
loop.run_forever()


To avoid dealing with frame objects and the linecache module directly, you can call task.print_stack() instead.
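
A sketch of the same monitor coroutine rewritten with print_stack (the output format is then up to asyncio):

async def monitor():
    while True:
        await asyncio.sleep(1.9)
        for task in asyncio.all_tasks():
            if task is not asyncio.current_task():
                # limit=1 prints only the most recent frame
                task.print_stack(limit=1)
        print()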
Ads time!
A familiar situation: you open a social network and see a block with accounts of people you may know. Building such a feature is a typical Data Scientist task.
Want to learn the profession and do really cool things?

SkillFactory has a course for you – the Data Science specialization, where you will develop the skills to take on tasks like training a speech recognition service, detecting fraudulent transactions, forecasting demand for goods, and even generating music or poems.

Here you will go through the data scientist's must-haves: Python, machine learning, neural networks and deep learning, Big Data and data engineering, plus mathematics, statistics and a management module.

📍 Stop putting it off – in 12 months you can be working on cool projects in this popular field: https://clc.to/hdwc7A
python -m webbrowser -t 'http://www.python.org'
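
The -t flag asks for a new browser tab. The same can be done from Python code with the webbrowser module, roughly like this:

import webbrowser

# open the URL in a new tab of the default browser
webbrowser.open_new_tab('http://www.python.org')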
You can translate or delete characters of a string (like the tr utility does) with the translate method of str:

>>> 'Hello, world!'.translate({
...     ord(','): ';',
...     ord('o'): '0',
... })
'Hell0; w0rld!'


The only argument of translate is a dictionary mapping character codes to characters (or codes). It's usually more convenient to create such a dictionary with the str.maketrans static method:

>>> 'Hello, world!'.translate(str.maketrans({
...     ',': ';',
...     'o': '0',
... }))
'Hell0; w0rld!'


Or even:

>>> 'Hello, world!'.translate(str.maketrans(
...     ',o', ';0'
... ))
'Hell0; w0rld!'


The third argument is for deleting characters:

>>> tr = str.maketrans(',o', ';0', '!')
>>> tr
{44: 59, 111: 48, 33: None}
>>> 'Hello, world!'.translate(tr)
'Hell0; w0rld'
The attributes of classes are stored in dictionaries, and that could be a problem since they don't preserve order in Python 3.5 and older:

$ cat test.py
class M:
    def __new__(meta, cls, bases, ns):
        print(ns)

class A(metaclass=M):
    a = 1
    b = 2
$ python3.4 test.py
{'__module__': '__main__', 'b': 2, '__qualname__': 'A', 'a': 1}
$ python3.4 test.py
{'a': 1, 'b': 2, '__module__': '__main__', '__qualname__': 'A'}
$ python3.4 test.py
{'__module__': '__main__', 'b': 2, '__qualname__': 'A', 'a': 1}
$ python3.4 test.py
{'__qualname__': 'A', 'a': 1, '__module__': '__main__', 'b': 2}
$ python3.4 test.py
{'b': 2, 'a': 1, '__module__': '__main__', '__qualname__': 'A'}
$ python3.4 test.py
{'b': 2, '__qualname__': 'A', '__module__': '__main__', 'a': 1}


However, you can replace the attribute container by using the __prepare__ metaclass method:

$ cat test.py
from collections import OrderedDict


class M:
    def __new__(meta, cls, bases, ns):
        print(ns)

    @classmethod
    def __prepare__(metacls, cls, bases):
        return OrderedDict()

class A(metaclass=M):
    a = 1
    b = 2
$ python3.4 test.py
OrderedDict([('__module__', '__main__'), ('__qualname__', 'A'), ('a', 1), ('b', 2)])
$ python3.4 test.py
OrderedDict([('__module__', '__main__'), ('__qualname__', 'A'), ('a', 1), ('b', 2)])
$ python3.4 test.py
OrderedDict([('__module__', '__main__'), ('__qualname__', 'A'), ('a', 1), ('b', 2)])


There is no need to do such a thing in Python 3.6+, since dictionaries now preserve order:

$ cat test.py
class M:
    def __new__(meta, cls, bases, ns):
        print(ns)

class A(metaclass=M):
    a = 1
    b = 2
$ python3.6 test.py
{'__module__': '__main__', '__qualname__': 'A', 'a': 1, 'b': 2}
$ python3.6 test.py
{'__module__': '__main__', '__qualname__': 'A', 'a': 1, 'b': 2}
$ python3.6 test.py
{'__module__': '__main__', '__qualname__': 'A', 'a': 1, 'b': 2}
$ python3.6 test.py
{'__module__': '__main__', '__qualname__': 'A', 'a': 1, 'b': 2}
You can define dictionaries in two ways, using literals or the dict function:

>>> dict(a=1, b=2)
{'a': 1, 'b': 2}
>>> {'a': 1, 'b': 2}
{'a': 1, 'b': 2}


Literals work faster than dict, but the function has some advantages.

First, you don't need to add additional quotes. However, it only works as long as all keys are valid Python identifiers.

>>> dict(a=1)
{'a': 1}
>>> dict(1='a')
File "<stdin>", line 1
SyntaxError: keyword can't be an expression


Second, you can't accidentally provide the same key twice:

>>> {'a': 1, 'a': 1}
{'a': 1}
>>> dict(a=1, a=1)
File "<stdin>", line 1
SyntaxError: keyword argument repeated


Third, you can easily create a new dictionary based on an existing one.

>>> d = dict(b=2)
>>> dict(a=1, **d)
{'a': 1, 'b': 2}


Note, however, that keys can't be overridden with this syntax:

>>> dict(b=3, **d)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type object got multiple values for keyword argument 'b'
Since Python 3.5, it's actually possible to use unpacking with dictionary and list literals.

In : {**{'a': 1}, 'b': 2, **{'c': 3}}
Out: {'a': 1, 'b': 2, 'c': 3}

In : [1, 2, *[3, 4]]
Out: [1, 2, 3, 4]


For dictionaries, this form is even more powerful than the dict function, since it allows values to be overridden:

In : {**{'a': 1, 'b': 1}, 'a': 2, **{'b': 3}}
Out: {'a': 2, 'b': 3}
mypy doesn’t yet support recursive type definitions:

from typing import Optional, Dict
from pathlib import Path

TreeDict = Dict[str, Optional['TreeDict']]


def tree(path: Path) -> TreeDict:
    return {
        f.name: tree(f) if f.is_dir() else None
        for f in path.iterdir()
    }

The error is Cannot resolve name "TreeDict" (possible cyclic definition).

Stay tuned here: https://github.com/python/mypy/issues/731
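
Until that is supported, a common workaround is to break the cycle manually, for example by falling back to Any after one level of nesting (a sketch, not part of the original example):

from typing import Any, Dict, Optional
from pathlib import Path

# nested levels are typed as Any to avoid the cyclic alias
TreeDict = Dict[str, Optional[Dict[str, Any]]]


def tree(path: Path) -> TreeDict:
    return {
        f.name: tree(f) if f.is_dir() else None
        for f in path.iterdir()
    }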
In asyncio, the common practice to schedule execution of some code at a later time is to spawn a task that does await asyncio.sleep(x):

import asyncio

async def do(n=0):
    print(n)
    await asyncio.sleep(1)
    loop.create_task(do(n + 1))
    loop.create_task(do(n + 1))

loop = asyncio.get_event_loop()
loop.create_task(do())
loop.run_forever()


However, creating a new task may be expensive and is not necessary if you aren't planning to do any asynchronous operations (like the do function in the example). Another way is to use the loop.call_later and loop.call_at functions, which schedule a callback to be called later:

import asyncio

def do(n=0):
    print(n)
    loop = asyncio.get_event_loop()
    loop.call_later(1, do, n+1)
    loop.call_later(1, do, n+1)

loop = asyncio.get_event_loop()
do()
loop.run_forever()
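
loop.call_at works the same way but takes an absolute time on the loop's clock instead of a delay; a minimal sketch:

import asyncio

def do(n=0):
    print(n)
    loop = asyncio.get_event_loop()
    # schedule the next call one second from now on the loop's monotonic clock
    loop.call_at(loop.time() + 1, do, n + 1)

loop = asyncio.get_event_loop()
do()
loop.run_forever()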
An ordinary function just needs to call itself to become recursive. It's not that simple for generators: you usually have to use yield from to make a generator recursive:


from operator import itemgetter


tree = {
    'imgs': {
        '1.png': None,
        '2.png': None,
        'photos': {
            'me.jpg': None
        },
    },
    'MANIFEST': None,
}


def flatten_tree(tree):
    for name, children in sorted(
        tree.items(),
        key=itemgetter(0)
    ):
        yield name
        if children:
            yield from flatten_tree(children)


print(list(flatten_tree(tree)))
Using sorted with the key argument is usually more efficient than providing a custom comparison function, since key is calculated only once for every value:

>>> sorted([-4, -2, 3, 1], key=lambda x: (print(x), abs(x)))
-4
-2
3
1
[1, -2, 3, -4]
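
For comparison, a custom comparison function has to be wrapped with functools.cmp_to_key, and it gets called once per comparison the sort makes (roughly n log n times) rather than once per element; a sketch:

from functools import cmp_to_key

def compare(a, b):
    # called for pairs of elements, once per comparison
    return abs(a) - abs(b)

print(sorted([-4, -2, 3, 1], key=cmp_to_key(compare)))
# [1, -2, 3, -4]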
You can use for not only with plain variables but with any assignment target. The target is evaluated anew on every iteration:

>>> log2 = {}
>>> key = 1
>>> for log2[key] in range(100):
...     key *= 2
...
>>> log2[16]
4
>>> log2[1024]
10
The difference between a function definition and a generator definition is the presence of the yield keyword in the function body:


In : def f():
...:     pass
...:

In : def g():
...:     yield
...:

In : type(f())
Out: NoneType

In : type(g())
Out: generator


That means that in order to create an empty generator you have to do something like this:


In : def g():
...:     if False:
...:         yield
...:

In : list(g())
Out: []


However, since yield from supports plain iterables, a better-looking version would be this:

def g():
    yield from []
Holiday season in Russia! See you later :).
The order of except blocks matters: if an exception can be caught by more than one block, the highest one applies. The following code doesn’t work as intended:

import logging


def get(storage, key, default):
    try:
        return storage[key]
    except LookupError:
        return default
    except IndexError:
        return get(storage, 0, default)
    except TypeError:
        logging.exception('unsupported key')
        return default


print(get([1], 0, 42))  # 1
print(get([1], 10, 42))  # 42
print(get([1], 'x', 42))  # error msg, 42


The except IndexError block never runs since IndexError is a subclass of LookupError. More specific exceptions should always come first:

import logging


def get(storage, key, default):
    try:
        return storage[key]
    except IndexError:
        return get(storage, 0, default)
    except LookupError:
        return default
    except TypeError:
        logging.exception('unsupported key')
        return default


print(get([1], 0, 42))  # 1
print(get([1], 10, 42))  # 1
print(get([1], 'x', 42))  # error msg, 42
If you access a class attribute that is a descriptor object, you get what its __get__ returns, not the descriptor object itself:

class Descriptor:
    def __get__(self, *args):
        return 42

class A:
    x = Descriptor()

a = A()
print(a.x)  # 42
print(getattr(a, 'x'))  # 42


If you want to get the descriptor object itself, you have to manually check the __dict__ of the object and of all its classes (in MRO order):

class Descriptor:
    def __get__(self, *args):
        return 42

class A:
    x = Descriptor()

class B(A):
    pass

def getattr_raw(obj, attr):
    for x in [obj] + type(obj).mro():
        if attr in x.__dict__:
            return x.__dict__[attr]

    raise AttributeError()

b = B()
print(getattr_raw(b, 'x'))
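
The standard library provides a similar tool: inspect.getattr_static retrieves attributes without triggering the descriptor protocol (its lookup logic is more thorough than the sketch above). Reusing b from the example:

from inspect import getattr_static

print(getattr_static(b, 'x'))
# <__main__.Descriptor object at 0x...>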
Python supports parallel assignment, meaning that all targets are assigned only after all right-hand-side expressions are evaluated. Moreover, you can use any assignment target, not only plain variables:

def shift_inplace(lst, k):
    size = len(lst)
    lst[k:], lst[0:k] = lst[0:-k], lst[-k:]

lst = list(range(10))

shift_inplace(lst, -3)
print(lst)
# [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]

shift_inplace(lst, 5)
print(lst)
# [8, 9, 0, 1, 2, 3, 4, 5, 6, 7]
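
The same right-hand-side-first evaluation is what makes the classic swap idiom work:

a, b = 1, 2
a, b = b, a   # the tuple (b, a) is built first, then unpacked into the targets
print(a, b)   # 2 1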