Dev Miscellaneous
A channel where you can find developer tips, tools, APIs, resources, memes and other interesting content.

Join our comments chat for more.

Comments chat (friendly :D)
https://t.me/+r_fUfa1bx1g0MGRk
Mechanical sympathy for QR codes: making NSW check-in better

QR codes are now critical infrastructure here in NSW, Australia. Let's learn how to make them better.

https://huonw.github.io/blog/2021/10/nsw-covid-qr/

@DevMisc
#qrcode #optimization #misc
Fast character case conversion

...or how to compress sparse arrays.

https://github.com/apankrat/notes/tree/master/fast-case-conversion

@DevMisc
#algorithm #optimization #misc
Saving a third of our memory by re-ordering Go struct fields

https://wagslane.dev/posts/go-struct-ordering

@DevMisc
#golang #memory #optimization
Speeding up my AoC solution by a factor of 2700 with Dijkstra’s

https://blog.siraben.dev/2021/12/28/aoc-speedup.html

@DevMisc
#optimization #misc
Counting Bytes Faster Than You'd Think Possible

- The author optimized a byte-counting program to run ~550x faster than a naive implementation.
- The key optimization was using an interleaved memory access pattern, reading from different 4KB pages in a round-robin fashion, instead of sequential access.
- This interleaved access pattern takes advantage of the "Streamer" hardware prefetcher in modern CPUs, which can maintain separate forward and backward access streams for each 4KB page.
- Interleaving 8 different 4KB pages was found to be the optimal approach, providing up to a 30% performance boost over sequential access.
- The author also unrolled the inner loop to process 2 cache lines (64 bytes) at a time, and added a prefetch instruction to fetch the next set of data.
- The final solution uses AVX2 SIMD instructions to perform the byte counting in a highly efficient manner.
- With this solution the author reached #13 on the HighLoad leaderboard.
- The author considers the interleaved memory access pattern under-discussed and does not recall seeing it used in other code.
- The author asks readers to share any other memory-based optimizations they know of.
- The post includes the full source code for the optimized byte-counting program, so readers can study the techniques and apply them in their own work.
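
To make the access pattern from the summary above concrete, here is a minimal, portable C++ sketch. It is not the author's AVX2 kernel (see the linked post for that): it uses plain scalar comparisons, and the names `count_sequential`, `count_interleaved`, `kPage`, `kStreams` and `kLine` are hypothetical. The 4 KB page size and the 8 interleaved pages come from the summary; the 64-byte cache line is an assumption that holds on typical x86 CPUs. Whether interleaving actually helps depends on the CPU's prefetchers, so treat this as an illustration of the loop order rather than a guaranteed speedup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed constants: 4 KB pages and 8 interleaved streams (from the summary),
// 64-byte cache lines (typical on x86, an assumption here).
constexpr std::size_t kPage = 4096;
constexpr std::size_t kStreams = 8;
constexpr std::size_t kLine = 64;

// Baseline: walk the buffer front to back.
std::uint64_t count_sequential(const std::uint8_t* data, std::size_t len,
                               std::uint8_t target) {
    assert(data != nullptr || len == 0);
    std::uint64_t n = 0;
    for (std::size_t i = 0; i < len; ++i) n += (data[i] == target);
    return n;
}

// Interleaved: treat the buffer as groups of kStreams consecutive 4 KB blocks
// and read one cache line from each block in round-robin order.
// The blocks coincide with real pages only if `data` is page-aligned.
std::uint64_t count_interleaved(const std::uint8_t* data, std::size_t len,
                                std::uint8_t target) {
    assert(data != nullptr || len == 0);
    constexpr std::size_t kGroup = kPage * kStreams;
    std::uint64_t n = 0;
    std::size_t base = 0;
    for (; base + kGroup <= len; base += kGroup) {
        for (std::size_t off = 0; off < kPage; off += kLine) {
            for (std::size_t s = 0; s < kStreams; ++s) {
                const std::uint8_t* p = data + base + s * kPage + off;
                for (std::size_t b = 0; b < kLine; ++b) n += (p[b] == target);
            }
        }
    }
    // Tail that does not fill a whole group: fall back to sequential order.
    for (std::size_t i = base; i < len; ++i) n += (data[i] == target);
    return n;
}
```

Both functions return the same count for any input, so the sequential version can serve as a correctness check while timing the interleaved one on a large, page-aligned buffer.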


https://blog.mattstuchlik.com/2024/07/21/fastest-memory-read.html

@DevMisc
#asm #cpp #optimization