Understanding Multi-Platform Docker Builds with QEMU
https://www.reddit.com/r/programming/comments/1olwc2d/understanding_multiplatform_docker_builds_with/
submitted by /u/Helpful_Geologist430 (https://www.reddit.com/user/Helpful_Geologist430)
[link] (https://cefboud.com/posts/qemu-virtualzation-docker-multi-build/) [comments] (https://www.reddit.com/r/programming/comments/1olwc2d/understanding_multiplatform_docker_builds_with/)
Part 3: Building LLMs from Scratch – Model Architecture & GPU Training [Follow-up to Part 1 and 2]
https://www.reddit.com/r/programming/comments/1olwg7b/part_3_building_llms_from_scratch_model/
I’m excited to share Part 3 of my series on building an LLM from scratch. This installment dives into the guts of model architecture, multi-GPU training, memory-precision tricks, checkpointing & inference. What you’ll find inside:
- Two model sizes (117M & 354M parameters) and how we designed the architecture.
- Multi-GPU training setup: how to handle memory constraints, fp16/bf16 precision, distributed training.
- Experiment tracking (thanks Weights & Biases), checkpointing strategies, resume logic for long runs.
- Converting PyTorch checkpoints into a deployable format for inference / sharing.
- Real-world mistakes and learnings: out-of-memory errors, data-shape mismatches, GPU tuning headaches.
Why it matters: Even if your data pipeline and tokenizer (see Part 2) are solid, your model architecture and infrastructure matter just as much — otherwise you’ll spend more time debugging than training. This post shows how to build a robust training pipeline that actually scales. If you’ve followed along from Part 1 and Part 2, thanks for sticking with it — and if you’re just now jumping in, you can catch up on those earlier posts (links below).
Resources:
🔗 Blog post (https://blog.desigeek.com/post/2025/11/building-llm-from-scratch-part3-model-architecture-gpu-training/)
🔗 GitHub codebase (https://github.com/bahree/helloLondon)
🔗 Part 2: Data Collection & Custom Tokenizers (https://www.reddit.com/r/programming/comments/1o56elg/building_llms_from_scratch_part_2_data_collection/)
🔗 Part 1: Quick Start & Overview (https://www.reddit.com/r/programming/comments/1nq0166/a_step_by_step_guide_on_how_to_build_a_llm_from/)
🔗 LinkedIn Post (https://www.linkedin.com/posts/amitbahree_ai-llm-generativeai-activity-7390442713931767808-xSfS) - if that is your thing.
submitted by /u/amitbahree (https://www.reddit.com/user/amitbahree)
[link] (https://blog.desigeek.com/post/2025/11/building-llm-from-scratch-part3-model-architecture-gpu-training/) [comments] (https://www.reddit.com/r/programming/comments/1olwg7b/part_3_building_llms_from_scratch_model/)
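Two of the themes above, fp16/bf16 mixed precision and checkpoint/resume logic for long runs, can be illustrated with a minimal PyTorch sketch. This is not the helloLondon codebase: the tiny model, synthetic batch, loss, checkpoint path, and step counts are placeholders chosen purely for illustration.

```python
# Minimal sketch: fp16 mixed-precision training step plus checkpoint/resume.
# Everything here (model, batch, loss, "checkpoint.pt") is a placeholder.
import os
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))  # needed for fp16; bf16 skips it
ckpt_path = "checkpoint.pt"  # placeholder path

start_step = 0
if os.path.exists(ckpt_path):  # resume logic for long runs
    ckpt = torch.load(ckpt_path, map_location=device)
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    start_step = ckpt["step"]

for step in range(start_step, start_step + 100):
    x = torch.randn(8, 512, device=device)  # stand-in for a real token batch
    with torch.autocast(device_type=device, dtype=torch.float16, enabled=(device == "cuda")):
        loss = model(x).pow(2).mean()        # stand-in for the LM loss
    optimizer.zero_grad(set_to_none=True)
    scaler.scale(loss).backward()            # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
    if step % 50 == 0:                       # periodic checkpoint so a crash loses little work
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "step": step + 1}, ckpt_path)
```

For multi-GPU runs the same loop is typically wrapped in DistributedDataParallel and the checkpoint is saved only from rank 0, but the save/resume shape stays the same.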
When Logs Become Chains: The Hidden Danger of Synchronous Logging
https://www.reddit.com/r/programming/comments/1omb1wa/when_logs_become_chains_the_hidden_danger_of/
Most applications log synchronously without thinking twice. When your code calls logger.info("User logged in"), it doesn’t just fire-and-forget. It waits. The thread blocks until that log entry hits disk or gets acknowledged by your logging service. In normal times, this takes microseconds. But when your logging infrastructure slows down—perhaps your log aggregator is under load, or your disk is experiencing high I/O wait—those microseconds become milliseconds, then seconds. Your application thread pool drains like water through a sieve. Here’s the brutal math: if you have 200 worker threads and each log write takes 2 seconds instead of 2 milliseconds, you can only handle 100 requests per second instead of 100,000. Your application didn’t break. Your logs did.
https://systemdr.substack.com/p/when-logs-become-chains-the-hidden
https://www.youtube.com/watch?v=pgiHV3Ns0ac&list=PLL6PVwiVv1oR27XfPfJU4_GOtW8Pbwog4
submitted by /u/Extra_Ear_10 (https://www.reddit.com/user/Extra_Ear_10)
[link] (https://systemdr.substack.com/p/when-logs-become-chains-the-hidden) [comments] (https://www.reddit.com/r/programming/comments/1omb1wa/when_logs_become_chains_the_hidden_danger_of/)
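The usual way out of this trap is to make the hot path do nothing but enqueue the record and let one background thread talk to the slow sink. Below is a minimal sketch using Python's stdlib QueueHandler/QueueListener; it is not taken from the linked article, and the file handler and timing print are illustrative only.

```python
# Minimal sketch: decouple request threads from the slow log sink with a queue.
import logging
import logging.handlers
import queue
import time

log_queue = queue.Queue(-1)  # unbounded; a bounded queue would add backpressure instead

# The request thread only does a cheap in-memory enqueue...
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.handlers.QueueHandler(log_queue))

# ...while a single background thread drains the queue into the real (potentially slow) sink.
slow_sink = logging.FileHandler("app.log")  # could just as well be a handler for a log aggregator
listener = logging.handlers.QueueListener(log_queue, slow_sink)
listener.start()

start = time.perf_counter()
logging.info("User logged in")  # returns almost immediately even if the sink is stalled
print(f"enqueue took {(time.perf_counter() - start) * 1e6:.1f} µs")

listener.stop()  # flushes remaining records on shutdown
```

The trade-off: records still sitting in the queue are lost on a hard crash, and an unbounded queue grows without limit if the sink never catches up, which is why many teams bound the queue and drop low-severity records first.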
AI Broke Interviews
https://www.reddit.com/r/programming/comments/1omf635/ai_broke_interviews/
submitted by /u/yusufaytas (https://www.reddit.com/user/yusufaytas)
[link] (https://yusufaytas.com/ai-broke-interviews/) [comments] (https://www.reddit.com/r/programming/comments/1omf635/ai_broke_interviews/)
Replication: from bug reproduction to replicating everything (a mental model)
https://www.reddit.com/r/programming/comments/1omgxqe/replication_from_bug_reproduction_to_replicating/
submitted by /u/dmp0x7c5 (https://www.reddit.com/user/dmp0x7c5)
[link] (https://l.perspectiveship.com/re-rep) [comments] (https://www.reddit.com/r/programming/comments/1omgxqe/replication_from_bug_reproduction_to_replicating/)
Silent Disagreements are the worst in Software Engineering
https://www.reddit.com/r/programming/comments/1omjvxr/silent_disagreements_are_worst_in_software/
submitted by /u/thehustlingengineer (https://www.reddit.com/user/thehustlingengineer)
[link] (https://open.substack.com/pub/thehustlingengineer/p/the-silent-career-killer-most-engineers?r=yznlc&utm_medium=ios) [comments] (https://www.reddit.com/r/programming/comments/1omjvxr/silent_disagreements_are_worst_in_software/)
How Docker Containers Work Under the Hood (Hands-On Demo)
https://www.reddit.com/r/programming/comments/1oml5wf/how_docker_containers_work_under_the_hood_handson/
submitted by /u/Helpful_Geologist430 (https://www.reddit.com/user/Helpful_Geologist430)
[link] (https://youtu.be/cXhr5e58fio?si=miocvZ6F_GDIhfgR) [comments] (https://www.reddit.com/r/programming/comments/1oml5wf/how_docker_containers_work_under_the_hood_handson/)
The Annotated Diffusion Transformer
https://www.reddit.com/r/programming/comments/1omlgkd/the_annotated_diffusion_transformer/
submitted by /u/DataBaeBee (https://www.reddit.com/user/DataBaeBee)
[link] (https://leetarxiv.substack.com/p/the-annotated-diffusion-transformer) [comments] (https://www.reddit.com/r/programming/comments/1omlgkd/the_annotated_diffusion_transformer/)
My Mistakes and Advice Leading Engineering Teams
https://www.reddit.com/r/programming/comments/1omobrc/my_mistakes_and_advice_leading_engineering_teams/
submitted by /u/gregorojstersek (https://www.reddit.com/user/gregorojstersek)
[link] (https://newsletter.eng-leadership.com/p/my-mistakes-and-advice-leading-engineering) [comments] (https://www.reddit.com/r/programming/comments/1omobrc/my_mistakes_and_advice_leading_engineering_teams/)
Choosing a dependency
https://www.reddit.com/r/programming/comments/1ompqpe/choosing_a_dependency/
submitted by /u/nfrankel (https://www.reddit.com/user/nfrankel)
[link] (https://blog.frankel.ch/choosing-dependency/) [comments] (https://www.reddit.com/r/programming/comments/1ompqpe/choosing_a_dependency/)
How Google Wants to Bring AI to Every High School Student - pythonjournals.com
https://www.reddit.com/r/programming/comments/1on3mxk/how_google_wants_to_bring_ai_to_every_high_school/
submitted by /u/Funny-Ad-5060 (https://www.reddit.com/user/Funny-Ad-5060)
[link] (https://pythonjournals.com/how-google-wants-to-bring-ai-to-every-high-school-student/) [comments] (https://www.reddit.com/r/programming/comments/1on3mxk/how_google_wants_to_bring_ai_to_every_high_school/)
AI Is Making It Harder for Junior Developers to Get Hired
https://www.reddit.com/r/programming/comments/1on4k0o/ai_is_making_it_harder_for_junior_developers_to/
submitted by /u/ImpressiveContest283 (https://www.reddit.com/user/ImpressiveContest283)
[link] (https://www.finalroundai.com/blog/ai-is-making-it-harder-for-junior-developers-to-get-hired) [comments] (https://www.reddit.com/r/programming/comments/1on4k0o/ai_is_making_it_harder_for_junior_developers_to/)
Interview Questions I Faced for a Python Developer
https://www.reddit.com/r/programming/comments/1on4lbp/interview_questions_i_faced_for_a_python_developer/
submitted by /u/Funny-Ad-5060 (https://www.reddit.com/user/Funny-Ad-5060)
[link] (https://pythonjournals.com/interview-questions-i-faced-for-a-python-developer/) [comments] (https://www.reddit.com/r/programming/comments/1on4lbp/interview_questions_i_faced_for_a_python_developer/)
The APM paradox: Too much data, too few answers
https://www.reddit.com/r/programming/comments/1on5pxg/the_apm_paradox_too_much_data_too_few_answers/
submitted by /u/joshuap (https://www.reddit.com/user/joshuap)
[link] (https://www.honeybadger.io/blog/apm-paradox/) [comments] (https://www.reddit.com/r/programming/comments/1on5pxg/the_apm_paradox_too_much_data_too_few_answers/)
Notes by djb on using Fil-C (2025)
https://www.reddit.com/r/programming/comments/1on6gtd/notes_by_djb_on_using_filc_2025/
submitted by /u/fiskfisk (https://www.reddit.com/user/fiskfisk)
[link] (https://cr.yp.to/2025/fil-c.html) [comments] (https://www.reddit.com/r/programming/comments/1on6gtd/notes_by_djb_on_using_filc_2025/)
Down with template (or not)!
https://www.reddit.com/r/programming/comments/1on774o/down_with_template_or_not/
submitted by /u/Yaruxi (https://www.reddit.com/user/Yaruxi)
[link] (https://cedardb.com/blog/down_with_template/) [comments] (https://www.reddit.com/r/programming/comments/1on774o/down_with_template_or_not/)
A Beginner’s Field Guide to Large Language Models
https://www.reddit.com/r/programming/comments/1on8bzl/a_beginners_field_guide_to_large_language_models/
submitted by /u/sdxyz42 (https://www.reddit.com/user/sdxyz42)
[link] (https://newsletter.systemdesign.one/p/llm-concepts) [comments] (https://www.reddit.com/r/programming/comments/1on8bzl/a_beginners_field_guide_to_large_language_models/)
Your URL Is Your State
https://www.reddit.com/r/programming/comments/1on9pu5/your_url_is_your_state/
submitted by /u/BrewedDoritos (https://www.reddit.com/user/BrewedDoritos)
[link] (https://alfy.blog/2025/10/31/your-url-is-your-state.html) [comments] (https://www.reddit.com/r/programming/comments/1on9pu5/your_url_is_your_state/)
A Soiree into Symbols in Ruby
https://www.reddit.com/r/programming/comments/1on9rkq/a_soiree_into_symbols_in_ruby/
submitted by /u/iamstonecharioteer (https://www.reddit.com/user/iamstonecharioteer)
[link] (https://tech.stonecharioteer.com/posts/2025/ruby-symbols/) [comments] (https://www.reddit.com/r/programming/comments/1on9rkq/a_soiree_into_symbols_in_ruby/)
How to choose between SQL and NoSQL
https://www.reddit.com/r/programming/comments/1ona7wq/how_to_choose_between_sql_and_nosql/
submitted by /u/stmoreau (https://www.reddit.com/user/stmoreau)
[link] (https://www.systemdesignbutsimple.com/p/how-to-choose-between-sql-and-nosql) [comments] (https://www.reddit.com/r/programming/comments/1ona7wq/how_to_choose_between_sql_and_nosql/)