Reddit Programming
I will send you the newest posts from the subreddit /r/programming
I compiled my research on modern bot detection into a deep-dive on multi-layer fingerprinting (TLS/JA3, Canvas, Biometrics)
https://www.reddit.com/r/programming/comments/1okyk2z/i_compiled_my_research_on_modern_bot_detection/

As part of the research for my asyncio Python automation library (pydoll), I fell down the rabbit hole of modern bot detection and ended up writing what is essentially a technical manual on the subject. I wanted to share the findings with the community.

I found that User-Agent spoofing is almost entirely irrelevant now. The real detection happens by correlating data across a "stack" of fingerprints to check for consistency. The full guide is here: https://pydoll.tech/docs/deep-dive/fingerprinting/

The research covers the full detection architecture. It starts at the network layer, analyzing how your client's TLS "Client Hello" packet creates a unique signature (JA3) that can identify Python's requests library before a single HTTP request is even sent. Then it moves to the hardware layer, detailing how browsers are fingerprinted based on the unique way your specific GPU/driver combination renders an image (Canvas/WebGL). Finally, it covers the biometric layer, explaining how systems analyze the physics of your mouse movements (based on Fitts's Law) and the cadence of your typing (digraph analysis) to distinguish you from a machine.

submitted by /u/thalissonvs (https://www.reddit.com/user/thalissonvs)
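To make the JA3 part concrete: per the original Salesforce definition, JA3 takes five fields from the TLS ClientHello as decimal values, dash-joins the values within each field, comma-joins the fields, and MD5-hashes the result. A minimal sketch, not from the guide; the ClientHello values below are made up for illustration:

```python
import hashlib

def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string "version,ciphers,extensions,curves,formats"
    and return its MD5 hex digest, per the original JA3 spec."""
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Hypothetical ClientHello field values, for illustration only.
print(ja3_hash(771, [4865, 4866, 49195], [0, 11, 10], [29, 23, 24], [0]))
```

Because these values come from the TLS stack itself rather than from headers, two clients sending byte-identical HTTP requests can still hash differently, which is why a spoofed User-Agent alone doesn't help.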
[link] (https://pydoll.tech/docs/deep-dive/fingerprinting/) [comments] (https://www.reddit.com/r/programming/comments/1okyk2z/i_compiled_my_research_on_modern_bot_detection/)
C3 0.7.7 Vector ABI changes, RISC-V improvements and more
https://www.reddit.com/r/programming/comments/1okzgsu/c3_077_vector_abi_changes_riscv_improvements_and/

For those who don't know about C3: it is a general-purpose language that strives to be an evolution of C. The 0.7.7 release, among other things, changes the vector ABI to pass SIMD vectors as arrays by default, which opens up ABI compatibility with C libraries that use structs for things like vectors. Other than this, it improves RISC-V support, introduces struct initializer splatting (similar to Dart's copyWith), and adds implicit deref subscripting using foo.[i], which is primarily useful when working with generic macros that may take both arrays and pointers to arrays.

Some more to dig into if you're interested in C3.

Here are some interviews on C3:
- https://www.youtube.com/watch?v=UC8VDRJqXfc
- https://www.youtube.com/watch?v=9rS8MVZH-vA

Here is a series doing various tasks in C3:
- https://ebn.codeberg.page/programming/c3/c3-file-io/

Repository with links to various C3 resources and projects:
- https://github.com/c3lang/c3-showcase

Some projects:
- Gameboy emulator: https://github.com/OdnetninI/Gameboy-Emulator/
- RISC-V bare metal Hello World: https://www.youtube.com/watch?v=0iAJxx6Ok4E
- "Depths of Daemonheim" roguelike: https://github.com/TechnicalFowl/7DRL-2025

submitted by /u/Nuoji (https://www.reddit.com/user/Nuoji)
[link] (https://c3-lang.org/blog/c3-language-at-0-7-7-vector-abi,-riscv-improvements-and-more/) [comments] (https://www.reddit.com/r/programming/comments/1okzgsu/c3_077_vector_abi_changes_riscv_improvements_and/)
Looking for advice
https://www.reddit.com/r/programming/comments/1olgivd/looking_for_advice/

Hello! Well, how can I start: I'm young, I'm finishing school, and next year I will enter university to study CS and AI engineering. I just want to be someone "important", not in a popular way but in an academic way. I have been doing a lot of projects, and my final project for school was my programming language, cattleya. I don't truly know if I'm doing everything the correct way, and I feel kind of lost, essentially because making a language in school is extraordinary, but in university it's more like a simple task, so I don't know what to do to keep that "high impact" profile. If some of you can recommend some courses, videos, or projects, or just give your POV, please tell me. :)

submitted by /u/InflationNo7838 (https://www.reddit.com/user/InflationNo7838)
[link] (https://github.com/justlebadura/cattleyaLang) [comments] (https://www.reddit.com/r/programming/comments/1olgivd/looking_for_advice/)
I built a service that automatically reviews GitHub PRs using code analysis + inline comments
https://www.reddit.com/r/programming/comments/1olinc4/i_built_a_service_that_automatically_reviews/

I recently built a small system that automatically reviews GitHub Pull Requests.

How it works:
1. A GitHub webhook triggers on a new PR
2. The backend fetches the changed files from GitHub's API
3. The code is passed to an external review service
4. That service analyzes it for issues (lint, bug patterns, performance risk, etc.)
5. It posts inline review comments back on the PR

Stack:
- Node.js / Express / MongoDB
- Python / FastAPI with LangChain for the analysis workflow

The most challenging parts were:
- Verifying webhook signatures correctly (see the sketch below)
- Mapping diff hunks to correct line numbers
- Handling API pagination + rate limits
- Messaging between the Node and Python services

It's still evolving, but it now handles basic reviews end-to-end.

Repo: https://github.com/Mohammed-bm/pr-review

Posting here in case anyone is curious about the approach.

submitted by /u/mohammedbm13 (https://www.reddit.com/user/mohammedbm13)
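The signature step is worth spelling out, since it's where most integrations go subtly wrong. This is not the repo's code, just a sketch of the standard GitHub check: HMAC-SHA256 over the raw request body with the webhook secret, compared in constant time against the X-Hub-Signature-256 header:

```python
import hashlib
import hmac

def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Return True if the X-Hub-Signature-256 header matches the payload.

    GitHub sends the header as "sha256=<hex digest>"; the digest must be
    computed over the raw (unparsed) request body.
    """
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position via timing.
    return hmac.compare_digest(expected, signature_header)
```

One classic pitfall: if the framework parses the JSON before you can capture the raw bytes, re-serializing it produces different bytes and the digest never matches, so the raw body has to be read first.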
[link] (https://pr-review-ten.vercel.app/login) [comments] (https://www.reddit.com/r/programming/comments/1olinc4/i_built_a_service_that_automatically_reviews/)
Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code (WASM 45% slower)
https://www.reddit.com/r/programming/comments/1oljj3v/not_so_fast_analyzing_the_performance_of/

Note: The study uses a modified Browsix (a Unix-like kernel for browsers) to achieve fair comparisons of complex WASM programs versus native programs.

Background: I am looking into WASM and wanted to understand its actual performance characteristics. The study suggests that earlier small synthetic benchmarks can get fairly close to native speed (roughly a 10% loss), but the benchmarks in this study are at least 45% worse than native speed. That being said, running a Unix-like kernel in a browser at that penalty is probably better than PowerPoint-level FPS performance.

Another, less academic benchmark from 2023 (https://00f.net/2023/01/04/webassembly-benchmark-2023/) shows that in some cases WASM runtimes can be worse than node/v8, bun quite regularly, with some runtimes only winning by a margin, but they overall tend to be faster than node, with a few clear winners. (Not sure whether node gets all the potential performance benefits and whether it's representative of browser performance.)

Current verdict: You can't simply switch to WASM and go vrrrm. The runtime and the code matter, a lot.

submitted by /u/Zomgnerfenigma (https://www.reddit.com/user/Zomgnerfenigma)
[link] (https://ar5iv.labs.arxiv.org/html/1901.09056) [comments] (https://www.reddit.com/r/programming/comments/1oljj3v/not_so_fast_analyzing_the_performance_of/)
DigitalOcean is chasing me for $0.01: What it taught me about automation
https://www.reddit.com/r/programming/comments/1ols6mk/digitalocean_is_chasing_me_for_001_what_it_taught/

TL;DR: A quick reminder that automation is powerful but needs thoughtful thresholds and edge-case handling to avoid unintended resource waste.

submitted by /u/modelop (https://www.reddit.com/user/modelop)
[link] (https://linuxblog.io/digitalocean-1-cent-automation/) [comments] (https://www.reddit.com/r/programming/comments/1ols6mk/digitalocean_is_chasing_me_for_001_what_it_taught/)
Part 3: Building LLMs from Scratch – Model Architecture & GPU Training [Follow-up to Part 1 and 2]
https://www.reddit.com/r/programming/comments/1olwg7b/part_3_building_llms_from_scratch_model/

I'm excited to share Part 3 of my series on building an LLM from scratch. This installment dives into the guts of model architecture, multi-GPU training, memory-precision tricks, checkpointing & inference.

What you'll find inside:
- Two model sizes (117M & 354M parameters) and how we designed the architecture.
- Multi-GPU training setup: how to handle memory constraints, fp16/bf16 precision, distributed training.
- Experiment tracking (thanks, Weights & Biases), checkpointing strategies, and resume logic for long runs (sketched below).
- Converting PyTorch checkpoints into a deployable format for inference/sharing.
- Real-world mistakes and learnings: out-of-memory errors, data-shape mismatches, GPU tuning headaches.

Why it matters: even if your data pipeline and tokenizer (see Part 2) are solid, your model architecture and infrastructure matter just as much; otherwise you'll spend more time debugging than training. This post shows how to build a robust training pipeline that actually scales. If you've followed along from Part 1 and Part 2, thanks for sticking with it; and if you're just now jumping in, you can catch up on those earlier posts (links below).

Resources:
🔗 Blog post: https://blog.desigeek.com/post/2025/11/building-llm-from-scratch-part3-model-architecture-gpu-training/
🔗 GitHub codebase: https://github.com/bahree/helloLondon
🔗 Part 2: Data Collection & Custom Tokenizers: https://www.reddit.com/r/programming/comments/1o56elg/building_llms_from_scratch_part_2_data_collection/
🔗 Part 1: Quick Start & Overview: https://www.reddit.com/r/programming/comments/1nq0166/a_step_by_step_guide_on_how_to_build_a_llm_from/
🔗 LinkedIn post: https://www.linkedin.com/posts/amitbahree_ai-llm-generativeai-activity-7390442713931767808-xSfS (if that is your thing)

submitted by /u/amitbahree (https://www.reddit.com/user/amitbahree)
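On the checkpoint/resume point above: a minimal PyTorch-style sketch, not the series' actual code, with hypothetical function names and paths. The one real trick is writing to a temp file and renaming, so a crash mid-save can't corrupt the checkpoint you'd resume from:

```python
import os
import torch

def save_checkpoint(path, model, optimizer, step):
    tmp = path + ".tmp"
    torch.save(
        {"model": model.state_dict(),
         "optimizer": optimizer.state_dict(),
         "step": step},
        tmp,
    )
    os.replace(tmp, path)  # atomic rename: the old checkpoint survives a crash

def load_checkpoint(path, model, optimizer):
    if not os.path.exists(path):
        return 0  # fresh run
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["step"] + 1  # resume on the step after the saved one
```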
[link] (https://blog.desigeek.com/post/2025/11/building-llm-from-scratch-part3-model-architecture-gpu-training/) [comments] (https://www.reddit.com/r/programming/comments/1olwg7b/part_3_building_llms_from_scratch_model/)
When Logs Become Chains: The Hidden Danger of Synchronous Logging
https://www.reddit.com/r/programming/comments/1omb1wa/when_logs_become_chains_the_hidden_danger_of/

Most applications log synchronously without thinking twice. When your code calls logger.info("User logged in"), it doesn't just fire-and-forget. It waits. The thread blocks until that log entry hits disk or gets acknowledged by your logging service. In normal times, this takes microseconds. But when your logging infrastructure slows down, perhaps because your log aggregator is under load or your disk is experiencing high I/O wait, those microseconds become milliseconds, then seconds. Your application thread pool drains like water through a sieve.

Here's the brutal math: if you have 200 worker threads and each log write takes 2 seconds instead of 2 milliseconds, you can only handle 100 requests per second instead of 100,000. Your application didn't break. Your logs did.

https://systemdr.substack.com/p/when-logs-become-chains-the-hidden
https://www.youtube.com/watch?v=pgiHV3Ns0ac&list=PLL6PVwiVv1oR27XfPfJU4_GOtW8Pbwog4

submitted by /u/Extra_Ear_10 (https://www.reddit.com/user/Extra_Ear_10)
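One standard mitigation, not from the post: hand each record to an in-process queue so the caller returns immediately, and let a background thread do the blocking I/O. A minimal sketch using Python's stdlib logging.handlers.QueueHandler and QueueListener:

```python
import logging
import logging.handlers
import queue

# QueueHandler.emit() only enqueues, so logger.info(...) returns
# without waiting on disk or the network.
log_queue = queue.Queue(-1)  # unbounded

# The listener runs in its own thread and forwards records to the
# real, blocking handler (a file on disk here).
listener = logging.handlers.QueueListener(
    log_queue, logging.FileHandler("app.log")
)

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.QueueHandler(log_queue))

listener.start()
logger.info("User logged in")  # fire-and-forget from the caller's view
listener.stop()  # drains the queue before the process exits
```

The trade-off is the usual one: an unbounded queue hides backpressure and can lose buffered records on a hard crash, so production setups typically bound the queue and choose an explicit overflow policy.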
[link] (https://systemdr.substack.com/p/when-logs-become-chains-the-hidden) [comments] (https://www.reddit.com/r/programming/comments/1omb1wa/when_logs_become_chains_the_hidden_danger_of/)