Dev Miscellaneous

Mechanical sympathy for QR codes: making NSW check-in better

QR codes are now critical infrastructure here in NSW, Australia. Let's learn how to make them better.

https://huonw.github.io/blog/2021/10/nsw-covid-qr/

@DevMisc
#qrcode #optimization #misc

96 views · davide. φ, 15:00

Fast character case conversion

...or how to compress sparse arrays.

https://github.com/apankrat/notes/tree/master/fast-case-conversion

@DevMisc
#algorithm #optimization #misc

101 views · davide. φ, 09:00

Understanding why our build got 15x slower with Webpack 5

https://engineering.tines.com/blog/understanding-why-our-build-got-15x-slower-with-webpack

@DevMisc
#javascript #v8 #optimization

98 views · davide. φ, 10:00

Saving a third of our memory by re-ordering Go struct fields

https://wagslane.dev/posts/go-struct-ordering

@DevMisc
#golang #memory #optimization

205 views · davide. φ, 10:01

Speeding up my AoC solution by a factor of 2700 with Dijkstra’s

https://blog.siraben.dev/2021/12/28/aoc-speedup.html

@DevMisc
#optimization #misc

242 views · davide. φ, 13:00

Counting Bytes Faster Than You'd Think Possible

- The author optimized a byte-counting program to a ~550x speedup over a naive implementation.
- The key optimization is an interleaved memory access pattern: reading from several 4KB pages in round-robin order instead of sequentially.
- This pattern exploits the "Streamer" hardware prefetcher in modern CPUs, which can maintain separate forward and backward access streams for each 4KB page.
- Interleaving 8 different 4KB pages proved optimal, giving up to a 30% performance boost over sequential access.
- The author also unrolled the inner loop to process 2 cache lines (64 bytes each) at a time and added a prefetch instruction to fetch the next set of data.
- The final solution does the counting with AVX2 SIMD instructions.
- The solution ranked #13 on the HighLoad leaderboard.
- Interleaved memory access seems to be an under-discussed technique; the author does not recall seeing it in other code.
- The author invites readers to share other memory-based optimizations they know of.
- The post includes the full source code of the optimized program.
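
The access pattern in the bullets above can be sketched in plain Go. This is only an illustration of the visiting order, not the author's code: it uses no AVX2 and no explicit prefetch, the function names and the `target` parameter are hypothetical, a 64-byte cache line is assumed, and a Go slice is not guaranteed to start on a page boundary. It should not be expected to reproduce the reported speedup.

```go
package main

import "fmt"

const (
	pageSize = 4096 // 4KB page, as in the summary above
	numPages = 8    // pages interleaved per group (the count reported as optimal)
	lineSize = 64   // assumed cache-line size in bytes
)

// countSequential is the baseline: one forward pass over the data.
func countSequential(data []byte, target byte) int {
	n := 0
	for _, b := range data {
		if b == target {
			n++
		}
	}
	return n
}

// countInterleaved visits the same bytes, but round-robins across
// numPages consecutive pages, one cache line from each page per step.
func countInterleaved(data []byte, target byte) int {
	n := 0
	group := pageSize * numPages
	full := len(data) / group * group // bytes covered by whole groups
	for base := 0; base < full; base += group {
		for off := 0; off < pageSize; off += lineSize {
			for p := 0; p < numPages; p++ {
				start := base + p*pageSize + off
				for _, b := range data[start : start+lineSize] {
					if b == target {
						n++
					}
				}
			}
		}
	}
	// The tail that does not fill a whole group is counted sequentially.
	return n + countSequential(data[full:], target)
}

func main() {
	data := make([]byte, 3*pageSize*numPages+123)
	for i := range data {
		data[i] = byte(i * 31 % 251)
	}
	// Both orders visit every byte exactly once, so the counts agree.
	fmt.Println(countSequential(data, 127) == countInterleaved(data, 127))
}
```

Whether the interleaved order helps depends on the CPU's prefetchers and on the inner loop being fast enough to be memory-bound, which is why the original pairs it with SIMD.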

https://blog.mattstuchlik.com/2024/07/21/fastest-memory-read.html

@DevMisc
#asm #cpp #optimization

292 views · edited 10:51

@DevMisc
#meme #llvm #optimization #lowlevel

453 views · edited 12:13
