python, go, code quality, security, magic

🎥 The Mess We're In is a talk by Joe Armstrong, a co-creator of Erlang, at the Strange Loop conference. It's about the complexity of modern systems, about the chaos of duplicated and unreliable components that barely work.

The talk is 9 years old. Joe Armstrong died in 2019. The Strange Loop conference will be held for the last time this year. And yet the problems he covers are even more pronounced today and will stay true until the fall of humanity.
🐍 Pandas 2.0 is released! The biggest highlight is support for pyarrow instead of numpy as a backend, which gives better performance and better support for missing values. A few months ago, there was a great post about it in the pandas blog: pandas 2.0 and the Arrow revolution.

The release notes are a bit too verbose, so I recommend reading this TL;DR instead: Everything you need to know about pandas 2.0.0.

ruff is making a lot of buzz in the Python community recently. It's a Python linter written in Rust. It re-implements all built-in rules from flake8, most of the rules from pylint, and a lot of flake8 plugins. On top of having it all out-of-the-box, it's very fast and provides autofix logic for many stylistic checks.

It's still in "beta", and a few months ago, when I tried it for the first time, it exploded on most of the big projects I tried it on, but now it's stable enough. I recommend giving it a try on your pet projects.

For me, the number one feature of the tool is a vscode plugin that lints the code as you type it. It's something that I had in Atom years ago and what I've been missing the most in vscode. It even has autofixes as refactoring actions!

As for the disadvantages, I'd mention that it inherits the clumsy configuration approach of flake8 and still doesn't have a way to write plugins for it. But I think fixing that is just a matter of time.

BTW, did you know Atom had been archived in March? It was objectively less stable and consistent than vscode, but it had some nice features, and it's always great for the ecosystem to have alternatives. I still use Atom hotkeys and theme in vscode. Probably, since Microsoft acquired GitHub, ditching Atom in favor of vscode was inevitable.

msgspec is a serialization and validation library for Python with schemas defined declaratively with type annotations (like dataclasses). Think of it as a pydantic alternative. And it has quite a few advantages:

1. Performance. In their benchmarks they claim to tear apart everything else, especially pydantic. And while Pydantic 2.0 with the core implemented in Rust is on the horizon, it's not clear when it will be finished, and I still expect msgspec to stay ahead.

2. No implicit type conversion. If the field type is specified as int and you pass in a string, pydantic (by default) will try its best to convert the passed value to an int. While that is useful in some cases, like parsing URL or GET query parameters, most often you'll use it for interacting with other APIs and providing your own API, and there it's better for types to be strict.

3. Clean and consistent API. Pydantic has a relatively long history, and it has accumulated a fair share of bad decisions. Again, it's something that the author wants to fix in pydantic 2.0, and it isn't clear when that will come out. Msgspec has a clean and functional API that doesn't mess with your classes.

4. Out-of-the-box support for many formats: JSON, YAML, TOML, and MessagePack. And if you need more, there is a function that converts your data into JSON-compatible primitive types that you can then serialize as you want.

A nice bonus is the support for dataclass and TypedDict, so you can give it a try without rewriting the models you have.
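
To make the second point concrete, here is a minimal stdlib-only sketch of the difference between lax and strict loading. It does not use the msgspec or pydantic API: the `load` helper, its `strict` flag, and the `User` model are all hypothetical names made up for illustration, and the helper only handles plain (non-generic) field types.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    age: int


def load(cls, data: dict, *, strict: bool):
    """Build a dataclass from a dict, checking plain (non-generic) field types.

    Hypothetical helper for illustration only, not a real library API.
    """
    kwargs = {}
    for field in fields(cls):
        value = data[field.name]
        if not isinstance(value, field.type):
            if strict:
                # Strict mode: a wrong type is an error, no guessing.
                raise TypeError(
                    f"{field.name}: expected {field.type.__name__}, "
                    f"got {type(value).__name__}"
                )
            # Lax mode: try to coerce the value, e.g. "42" becomes 42.
            value = field.type(value)
        kwargs[field.name] = value
    return cls(**kwargs)


payload = {"name": "Joe", "age": "42"}  # age arrives as a string

lax_user = load(User, payload, strict=False)  # age is silently converted

try:
    load(User, payload, strict=True)
except TypeError as exc:
    strict_error = str(exc)
```

In lax mode the bad payload goes through unnoticed, which is handy for query parameters but hides bugs in API contracts. In strict mode the caller finds out right away.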

📝 How Software Companies Die. How was it written 28 years ago and yet describes all the places I've worked at? In 1995, I was busy non-existing. I don't think it's how companies die, though. More like how software products turn into an unmaintainable pile of garbage. And garbage products can sell just fine. Take Jira for example.
📝 An aperiodic monotile exists. A big math discovery whose beauty is in its simplicity. The article explains it very well. TL;DR: Mathematicians have found a shape that you can use to tile an infinitely big floor without gaps and without repetitions of the pattern. I think I know what I'll laser-cut next.
An Animated Introduction to Elixir. This is a guide to Elixir with quite an interesting format. Each chapter is built around a code example (similar to Go by example) and when you press the "play" button, you see how the code changes a bit, and on the left there is an explanation for the change, with links and images. Beautiful! It only works on desktop, though.

📝 Type embedding: Golang's fake inheritance is a great beginner-friendly blog post about inheritance in Go. It shows how it's different from OOP, what's good and bad about it, how it plays with interfaces, and how (not) to shoot yourself in the foot.

🦀 The built-in Rust testing framework is alright for small tests, but it quickly falls apart when the project grows. And even for small projects, the tests look messy and the failure output isn't helpful. I haven't found one all-in-one framework like pytest but there are some small projects that together make something much better:

+ nextest is a fast test runner with a nice colorful output, a powerful DSL for filtering tests to be run, and a built-in mechanism for detecting slow and flaky tests. And the great thing is that it runs regular Rust tests, so you don't need to change anything in the project itself to use it.

+ proptest is a hypothesis-inspired framework for writing property-based tests. You tell it what kind of input you want for your test, and it generates random values, trying to cover as many corner cases as possible. When a test fails, it finds a nice and small example for reproducing the failure.

+ rstest is a library that provides pytest-like fixtures and test parametrization (for table-driven testing). I'm not sure if fixtures are a good idea (I wrote pytypest for Python just to fix the mess that pytest fixtures bring in big projects) but parametrization is a must. You can also try the test-case library, it looks similar but also lets you specify a name for each case.

+ k9 provides a macro for snapshot testing. On the first run, it dumps on disk whatever data you pass into it, and on subsequent runs it compares the new value with the dumped one.

+ A better assert statement is still an open question. The built-in assert! and assert_eq! do not provide a helpful enough error message, a far cry from what you can see in pytest or ex_unit. k9 and pretty-assertions provide some additional assertions but that's still not enough. In particular, I immediately miss something like assert_is_close for comparing floats or an assertion that will show the arguments with which the target function was called. There is the approx library just for that but I think adding a library for each assertion type doesn't scale well.

I wonder if the built-in Rust macro system allows making a nice looking assert macro with pytest-like source rewrites. Like a macro that will convert magic_assert!(f(a, b) > x) into assert!(f(a, b) > x, "f({}, {}) > {}", a, b, x).
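
For reference, this is roughly what "pytest-like" means here, shown as a small Python sketch rather than a Rust macro. It is not how pytest is implemented: pytest rewrites the test module's AST at import time, while this toy version takes the expression as a string. The names `explain`, `magic_assert`, and `SYMBOLS` are made up for the example, and only comparisons are supported.

```python
import ast

# Printable symbols for the comparison operators this sketch understands.
SYMBOLS = {
    ast.Eq: "==", ast.NotEq: "!=",
    ast.Lt: "<", ast.LtE: "<=",
    ast.Gt: ">", ast.GtE: ">=",
}


def explain(expr: str, scope: dict) -> str:
    """Render a comparison with each operand replaced by its runtime value."""
    node = ast.parse(expr, mode="eval").body
    if not isinstance(node, ast.Compare):
        raise ValueError("only comparisons are supported in this sketch")
    operands = [node.left, *node.comparators]
    # Evaluate every operand separately to learn its value.
    values = [
        eval(compile(ast.Expression(body=operand), "<assert>", "eval"), scope)
        for operand in operands
    ]
    parts = [repr(values[0])]
    for op, value in zip(node.ops, values[1:]):
        parts.append(SYMBOLS.get(type(op), type(op).__name__))
        parts.append(repr(value))
    return " ".join(parts)


def magic_assert(expr: str, scope: dict) -> None:
    """Fail with a message that shows the evaluated operands.

    Note: on failure the operands are evaluated a second time,
    so this toy version is only safe for side-effect-free expressions.
    """
    if not eval(expr, scope):
        raise AssertionError(f"{expr}  (evaluated: {explain(expr, scope)})")


scope = {"f": lambda a, b: a + b, "a": 5, "b": 6, "x": 13}
try:
    magic_assert("f(a, b) > x", scope)
except AssertionError as exc:
    message = str(exc)  # "f(a, b) > x  (evaluated: 11 > 13)"
```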

🦀 Found it! assert2 solves the problem I described above. You simply write assert!(a + b > 13) and if it fails, instead of "something wrong, good luck" it will say "5 + 6 > 13", with nice colors and all. Now my basic set of human-friendly testing for Rust is complete.

📝 I investigated the Underground Economy of Glassdoor Reviews is an unexpectedly detailed post about how companies buy online reviews on Glassdoor and similar platforms. The author contacted freelancers providing such services, aggregated their quotes and services, asked some "hows", and built interesting charts for a few random companies. The research perhaps lacks scale, but still, quite an interesting rabbit hole.
📝 My 20 Year Career is Technical Debt or Deprecated.

I certainly see and feel what the author means. Technologies move at a crazy pace because we always want our tools to be better, more "modern" if you wish. If your Python service isn't a Python 3.10+ microservice on FastAPI with async/await and type annotations, that's it, it's legacy. Just 4 years ago having a Django monolith was fine but not today. And things are even worse if you have something on Ruby or PHP.

We always strive for new shiny technologies, and a lot of people and projects are left behind. If you don't migrate a project to new stuff, you have trouble hiring smart and agile engineers. If you, as an engineer, don't learn the new technologies, then you stay behind with these projects, without new features (or often even basic bug fixes) and without much happening in the project.

These big old technologies don't die. Ruby on Rails, Laravel, and jQuery still have active development and regular releases, and people still use them in a lot of projects. And engineers with good experience in COBOL or Fortran are still needed and paid well enough (PHP and Delphi engineers are very underpaid, though). And yet, usually these aren't projects and people making something big. More like maintaining.

Is it good or bad? I'm glad we're making rapid progress and always want our tools to be better. And yet, some sacrifices were made along the way.
🎥 I made a series of videos about hacking DVWS using Burp Suite:

Many years ago, if you wanted to practice the basics of penetration testing, the best way to do that was to play around with DVWA. DVWA is "Damn Vulnerable Web Application", a web app specifically designed to have as many vulnerabilities as possible for you to learn how to exploit them. This project is cool and all but very dated. You're unlikely to encounter something like this (no-API no-framework multipage PHP service) in the modern world.

DVWS is "Damn Vulnerable Web Services", a modern alternative to DVWA. It's powered by Angular, nodejs, Express, MySQL, and MongoDB, uses ORMs, and provides a REST API as well as GraphQL endpoints. A lot of stuff to exploit. I've covered 15 vulnerabilities but that's far from all, I might do more videos later.

Lastly, Burp Suite is a tool for penetration testing, mostly the manual kind. At first I thought I'd make videos about using it, but it turns out there is not that much to tell, so the focus shifted from a tool to tool-agnostic techniques.
Go 1.21rc1 is released! It's not a final release yet, but we can already see what new features will be there, and they're huge:

1. Built-in functions min, max, and clear. You know, like in all other normal programming languages.
2. Experimental fix for the loop variable capture, one of the most common Go mistakes.
3. Profile-guided optimization is now stable. The Go compiler itself now uses it to make its code 10% faster.
4. Structured logging, with both human-readable and JSON output. For me personally it 100% replaces any third-party logging libraries.
5. Generic functions for slices and maps.
6. WASI support. WASI is a standard for defining system calls in WebAssembly.

The release notes draft:

That's a short list but every item on it is amazing. I hope they'll pick enums (or union types, about the same thing, as Rust shows) next, the most requested Go feature at the moment.

A long time ago, I used to use Twitter to find out about new projects and blog posts. Later, I moved to Reddit. Now, both are the very manifestation of enshittification, so I removed both accounts. Hacker News never seemed useful to me (perhaps, I just don't know how to use it). I recently registered on Mastodon and started to follow a few hashtags, but they include a lot of unimportant stuff, like discussions, questions, and personal project updates.

So, looks like it's time to go back to RSS. I've run a poll on Mastodon and it seems that 92% of Mastodon users (740+ respondents) read RSS feeds. I'm quite surprised by the results because RSS seemed to me like something that left the mainstream a long time ago, like IRC, forums, and the like. And even if that's a bias of Mastodon users, that's still impressive.

fluent-reader is the best RSS reader I've found so far. Cross-platform, nice UI, nice previews, built-in browser. I tried to find something that isn't Electron but the truth is that many websites cannot be properly rendered without a full-featured browser. Out of all the feeds I tried, Mastodon caused the most trouble. You can get an RSS feed for any Mastodon account (for example, here is mine) but there are no titles and if you follow the link, it won't render anything without JS. So, if you want to read Mastodon RSS feeds, it's important for the reader to properly render descriptions or support JS for opened links. And fluent-reader does both.
To make all my content accessible, I've provided source, website, and RSS for every project I have:

+ 🐿 ITGram: 🌐 website, 📶 rss, 📢 telegram, 📄 source.
+ 🐍 Python etc: 🌐 website, 📶 rss, 📢 telegram, 📄 source.
+ ✏️ Blog: 🌐 website, 📶 rss, 📄 source.

I had to make quite a few changes for that:

The blog is now powered by the Hugo static site generator with the PaperMod theme, which provides RSS out of the box. Before, it was powered by the home-grown chameleon engine but now I'm moving everything possible to static sites because of how easy it is to deploy things on netlify.

For ITGram, I exported all posts with a Python script I found online and then cleaned them up with a bunch of ad-hoc scripts. It's also served by Hugo and PaperMod, and for publishing posts back into Telegram I made a short script on top of telethon.

Finally, Python etc was always Markdown-first. It has lots of custom logic, so PaperMod won't fit. I generate the website with Jinja2 and a bunch of custom scripts. For RSS, I considered python-feedgen and rfeed. The former is more popular but I picked the latter because the example in the readme is easier to understand.
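
If you're curious what such a feed boils down to, here is a minimal stdlib-only sketch of generating an RSS 2.0 document. It does not use rfeed or python-feedgen and isn't the actual script behind the website. The `build_rss` function, the post dict layout, and the example.com URLs are all placeholders invented for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title: str, link: str, description: str, posts: list) -> str:
    """Return an RSS 2.0 document as a string.

    Each post is a dict with "title", "link", and "published" (an aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three elements on every channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["link"]
        # pubDate uses the RFC 822 date format, the same one as email headers.
        ET.SubElement(item, "pubDate").text = format_datetime(post["published"])
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    title="Example channel",
    link="https://example.com/",
    description="Placeholder feed for the sketch",
    posts=[
        {
            "title": "Hello",
            "link": "https://example.com/posts/hello/",
            "published": datetime(2023, 4, 1, tzinfo=timezone.utc),
        },
    ],
)
```

A library is still the better choice for real feeds: it handles things like GUIDs, enclosures, and HTML content that this sketch ignores.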
refurb is yet another linter for Python. What makes it special is that it internally runs as a mypy plugin. This gives the linter static information about types for everything. As a result, refurb can give more precise and reliable suggestions. But of course it also means refurb is slower than something like pycodestyle, which works on regular expressions, or ruff, which is written in Rust and works only with AST.

Mutation testing is a technique for evaluating how good your tests are. The idea is that a special tool (or a very bored human) breaks the code by slightly changing ("mutating") something in it and then runs all the tests, and the tests must fail. If they don't fail, you either have dead code or the tests aren't good enough.
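
Here is a tiny stdlib-only sketch of the idea, assuming a single mutation (replacing `>=` with `>`) and a toy function. It is not how mutmut works internally. The names `FlipComparison`, `load`, `survives`, and both test functions are made up for the example.

```python
import ast

SOURCE = """
def is_adult(age):
    return age >= 18
"""


class FlipComparison(ast.NodeTransformer):
    """A single mutation: replace every `>=` with `>`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return its namespace."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace


def weak_tests(ns):
    # Only checks values far away from the boundary.
    assert ns["is_adult"](30)
    assert not ns["is_adult"](5)


def strong_tests(ns):
    weak_tests(ns)
    assert ns["is_adult"](18)  # the boundary case


def survives(tests, ns):
    """Return True if the tests pass against the given code."""
    try:
        tests(ns)
    except AssertionError:
        return False
    return True


original = load(ast.parse(SOURCE))
mutated_tree = ast.fix_missing_locations(FlipComparison().visit(ast.parse(SOURCE)))
mutant = load(mutated_tree)

# Both suites pass on the original code, but only the strong one
# notices ("kills") the mutant. A surviving mutant means a missing test.
weak_result = survives(weak_tests, mutant)      # True: the mutant survived
strong_result = survives(strong_tests, mutant)  # False: the mutant was killed
```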

There are some prerequisites for when it makes sense to do that:

1. The test coverage is 100%. If it's not, you already know what you should write tests for.
2. The tests are fast.
3. The tests are reliable.
4. The codebase is small.

mutmut is a tool for mutation testing of Python code. You simply point it to the directory with the code, directory with the tests, and what command to use to run the tests, and it will do the rest. When it's done, you can generate an HTML report with the list of diffs of mutations for each file that weren't detected by the tests.

I think it's an "advanced" testing technique. I don't use it that often, and only on small projects that need to be reliable. There are quite a few "false positives" (things that you can't and shouldn't test, like database connection options), but from my experience it's also a very reliable way to detect what tests you're missing.

taplo is a CLI tool (and a Rust library) for working with TOML. It can check TOML files for syntax errors, validate them against JSON schemas, and format them.

The best thing about it is the Even Better TOML VSCode plugin wrapping taplo that provides all the same things as the CLI plus smart syntax highlighting, navigation, refactoring, and even autocomplete (for pyproject.toml, fly.toml, Cargo.toml and a few other formats).