Reddit Programming
I will send you the newest posts from the subreddit /r/programming.
dashboard to display component details instantly.

Step 3: Parse Lock Files (The Hard Part)

This was the gnarly part. Four different formats, each with its own quirks.

Yarn Lock (v1 Classic)

Looks like TOML with nested dependency lists:

```
"@pkgjs/parseargs@^0.11.0":
  version "0.11.0"
  resolved "https://registry.npmjs.org/..."
  dependencies:
    package-json "^6.0.0"
```

I wrote a line-by-line parser (a rough sketch follows at the end of this step). The trick: track indentation to know whether you're inside a package block or its dependency list.

npm package-lock.json

Flat JSON structure (v2/v3):

```json
{
  "packages": {
    "node_modules/lodash": {
      "version": "4.17.21",
      "dependencies": { ... }
    }
  }
}
```

Easier to parse with JsonDocument, but the key names carry node_modules/ prefixes that need stripping.

pnpm-lock.yaml

YAML with name@version keys:

```yaml
packages:
  /lodash/4.17.21:
    version: 4.17.21
    dependencies:
      react: 18.2.0
```

I treated this as mostly line-based text parsing since I didn't want to add a full YAML dependency. It works for the common cases.

Bun Lock

JSONC format with array-based entries. It's the least common, so I parse it but mark binary bun.lockb files as unparseable.
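The post describes the Yarn parser but doesn't include it. As a rough sketch of that line-by-line, indentation-tracking approach (the YarnLockEntry record and every name below are invented for illustration, not taken from the analyzer), a minimal version could look like this:

```csharp
using System;
using System.Collections.Generic;

// All names here are invented for illustration; this is not the analyzer's real model.
record YarnLockEntry(string Key, string Version, Dictionary<string, string> Dependencies);

static class YarnLockV1Sketch
{
    // Minimal line-by-line parse of a yarn.lock v1 body, driven purely by indentation.
    public static List<YarnLockEntry> Parse(string content)
    {
        var entries = new List<YarnLockEntry>();
        string? key = null, version = null;
        var deps = new Dictionary<string, string>();
        var inDeps = false;

        void Flush()
        {
            if (key != null)
                entries.Add(new YarnLockEntry(key, version ?? "", new Dictionary<string, string>(deps)));
            key = null; version = null; deps.Clear(); inDeps = false;
        }

        foreach (var raw in content.Split('\n'))
        {
            var line = raw.TrimEnd();
            if (line.Length == 0 || line.StartsWith("#")) continue;   // skip blanks and comments

            var indent = line.Length - line.TrimStart().Length;
            var text = line.Trim();

            if (indent == 0)                                   // new package block: "lodash@^4.17.0":
            {
                Flush();
                key = text.TrimEnd(':').Trim('"');
            }
            else if (indent == 2 && text == "dependencies:")   // nested dependency list follows
            {
                inDeps = true;
            }
            else if (indent == 2 && text.StartsWith("version "))
            {
                inDeps = false;
                version = text["version ".Length..].Trim('"');
            }
            else if (indent >= 4 && inDeps)                    // e.g.    package-json "^6.0.0"
            {
                var sep = text.IndexOf(' ');
                if (sep > 0) deps[text[..sep].Trim('"')] = text[(sep + 1)..].Trim('"');
            }
        }

        Flush();
        return entries;
    }
}
```

Real yarn.lock files also allow multiple comma-separated specifiers per block key and extra fields such as integrity, which a production parser has to handle; the sketch only shows the indentation bookkeeping.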
Step 4: Resolve Dependencies

Once I had a parsed lock file, I needed to extract:

• Local dependencies (internal workspace packages like @company/shared)
• Direct dependencies (what's explicitly in package.json)
• Transitive dependencies (what your dependencies need)

```csharp
// Read package.json dependencies
var directRanges = ReadDirectDependencyRanges(packageJsonContent);

// For each direct dep, look it up in the lock file
foreach (var (name, range) in directRanges)
{
    var pkg = Resolve(name, range, parsedLock);
    if (pkg != null)
    {
        // It resolved to an exact version X.Y.Z
        direct.Add(new ResolvedDependency(pkg.Name, pkg.Version, range));

        // Queue it so we can traverse its own dependencies
        queue.Enqueue(pkg);
    }
}

// Breadth-first traversal to collect transitives
while (queue.TryDequeue(out var pkg))
{
    foreach (var (depName, depRange) in pkg.DependencyRanges)
    {
        var dep = Resolve(depName, depRange, parsedLock);
        if (dep != null && !visited.Contains($"{dep.Name}@{dep.Version}"))
        {
            visited.Add($"{dep.Name}@{dep.Version}");   // don't revisit (or loop on) shared deps
            transitive.Add(...);
            queue.Enqueue(dep);
        }
    }
}
```

Result: three lists of ResolvedDependency objects with exact versions and requested ranges. Silverfish uses this to build the full dependency graph in its UI.

Step 5: Handle Monorepos

Monorepos have multiple package.json files. The key insight: walk up the directory tree to find the root lock file.

```csharp
static IEnumerable<string> AncestorDirs(string dir)
{
    var current = dir;
    while (!string.IsNullOrEmpty(current))
    {
        yield return current;
        current = Path.GetDirectoryName(current);
    }
}
```

So packages/web/package.json in an entria-style monorepo correctly finds the root yarn.lock instead of failing. Each workspace member gets its own component record in Silverfish.
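As a usage illustration of that ancestor walk (a sketch, not code from the article: FindNearestLockFile and the direct File.Exists checks are assumptions, and the real analyzer reads everything through an injected file reader rather than the filesystem), locating the lock file that governs a workspace member could look like this:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LockFileLocator
{
    // Lock files the analyzer understands, per the formats covered in Step 3.
    static readonly string[] LockFileNames =
        { "yarn.lock", "package-lock.json", "pnpm-lock.yaml", "bun.lock", "bun.lockb" };

    // Walk from the package.json's directory toward the filesystem root and return the
    // first lock file found, so packages/web/package.json resolves to the repo root's yarn.lock.
    public static string? FindNearestLockFile(string packageJsonPath)
    {
        var startDir = Path.GetDirectoryName(Path.GetFullPath(packageJsonPath));
        if (string.IsNullOrEmpty(startDir)) return null;

        foreach (var dir in AncestorDirs(startDir))
        {
            var hit = LockFileNames
                .Select(name => Path.Combine(dir, name))
                .FirstOrDefault(File.Exists);
            if (hit != null) return hit;
        }
        return null;   // no lock file anywhere up the tree
    }

    // Same ancestor walk as shown above.
    static IEnumerable<string> AncestorDirs(string dir)
    {
        var current = dir;
        while (!string.IsNullOrEmpty(current))
        {
            yield return current;
            current = Path.GetDirectoryName(current);
        }
    }
}
```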
How the Silverfish IDP Uses This

Once the analyzer extracts all this metadata, it:

• Maps dependencies visually — showing which components depend on what
• Flags version mismatches — when different packages pin different versions of the same library
• Detects tech stacks — knowing which services are frontend, which are backend, and which databases they use
• Tracks upgrades — identifying outdated packages and planning coordinated updates
• Enables governance — enforcing policies like "no direct jquery dependencies" or "all frontends must use React 18+"

Lessons Learned

• Abstraction beats assumptions: I wrote the whole thing to accept a readFileContentAsync delegate instead of reading files directly. This made it testable and backend-agnostic (GitHub API, filesystem, cache, whatever); see the sketch at the end of this post.
• Format-specific parsing is worth it: I could have given up on Yarn/pnpm/Bun and only parsed npm lock files. But each format's parser is ~100-150 lines and handles real repos that exist in the wild.
• Conflicts are data, not errors: Instead of failing when I find multiple lock files, I report them. That's valuable information ("why do you have both yarn.lock and package-lock.json?").
• Monorepos are normal: Walking ancestor directories for lock files plus detecting internal workspace packages turned out to be essential, not an edge case.
• Version constraints matter: Storing both the requested range (^1.2.3) and the resolved version (1.2.5) proved useful—you can detect upgradeable deps without breaking changes.

What's Next

The JS/TS analyzer is one piece of Silverfish's language support. It already has support for the .NET languages and Ruby. I'll be building similar analyzers for Python, Go, Java, and other ecosystems. The pattern is the same: detect the package manager, identify components, resolve dependencies, extract versions. If you're trying to understand complex multi-language codebases at scale, this approach should help. The code is C# 14 with only standard library dependencies—no bloat.
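To make the "abstraction beats assumptions" lesson concrete, here is a minimal sketch of the injected-reader idea; the Func<string, Task<string>> signature and the GetPackageNameAsync helper are assumptions for illustration, not the analyzer's actual API:

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

static class ReaderAbstractionSketch
{
    // The analyzer only ever asks "give me the content at this path"; where the bytes
    // come from (GitHub API, filesystem, cache, canned test data) is the caller's concern.
    public static async Task<string?> GetPackageNameAsync(
        string packageJsonPath,
        Func<string, Task<string>> readFileContentAsync)
    {
        var json = await readFileContentAsync(packageJsonPath);
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.TryGetProperty("name", out var name) ? name.GetString() : null;
    }
}

// Production:  await ReaderAbstractionSketch.GetPackageNameAsync(path, p => File.ReadAllTextAsync(p));
// Unit test:   await ReaderAbstractionSketch.GetPackageNameAsync(path, _ => Task.FromResult("{\"name\":\"web\"}"));
```

In tests you pass a lambda that returns canned package.json content; in production you pass whichever backend you have, which is what makes the analyzer backend-agnostic.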
submitted by /u/DavidArno (https://www.reddit.com/user/DavidArno)
[link] (https://dashboard.silverfishsoftware.com/documentation) [comments] (https://www.reddit.com/r/programming/comments/1suhkea/how_i_built_an_automated_jsts_repository_analyzer/)
Modern LZ Compression Part 2: FSE and Arithmetic Coding
https://www.reddit.com/r/programming/comments/1suioko/modern_lz_compression_part_2_fse_and_arithmetic/

This is the second article in a series discussing modern compression techniques. The first one covered Huffman + LZ. This one covers optimal entropy coders (FSE and arithmetic coding), and some additional tricks to get closer to the state of the art. The full compressor and decompressor are just over 1500 lines of pretty compact C++: https://github.com/glinscott/linzip2/blob/master/main.cc. It's been seven years since the first article! Hopefully it won't be as long before the third (and probably final) one. Part 1 discussion thread: https://www.reddit.com/r/programming/comments/amfzqg/modern_lz_compression/ submitted by /u/glinscott (https://www.reddit.com/user/glinscott)
[link] (https://glinscott.github.io/lz/part2.html) [comments] (https://www.reddit.com/r/programming/comments/1suioko/modern_lz_compression_part_2_fse_and_arithmetic/)
Caching Beyond Redis: Real-World Strategies That Don’t Break Your System
https://www.reddit.com/r/programming/comments/1sv664k/caching_beyond_redis_realworld_strategies_that/

In the article, I break down:
• why caching is really a trade-off between speed and correctness
• when to use in-memory cache, Redis-style distributed cache, and CDN caching
• cache-aside, write-through, write-back, and read-through with real examples (a minimal cache-aside sketch follows this list)
• cache invalidation, stale data, and cache stampedes
• when caching is the wrong solution entirely
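As a small illustration of the first pattern in that list (none of this is from the article): in cache-aside, the application checks the cache, loads from the source of truth on a miss, and writes the result back itself. A minimal in-memory sketch, with the CacheAside class and the load callback invented here:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Cache-aside: the caller owns the cache; the data store is only hit on a miss.
class CacheAside<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, (TValue Value, DateTimeOffset ExpiresAt)> _cache = new();
    private readonly TimeSpan _ttl;

    public CacheAside(TimeSpan ttl) => _ttl = ttl;

    public async Task<TValue> GetOrLoadAsync(TKey key, Func<TKey, Task<TValue>> load)
    {
        // 1. Try the cache first.
        if (_cache.TryGetValue(key, out var entry) && entry.ExpiresAt > DateTimeOffset.UtcNow)
            return entry.Value;

        // 2. Miss (or stale): go to the source of truth...
        var value = await load(key);

        // 3. ...and write the result back so the next caller hits the cache.
        _cache[key] = (value, DateTimeOffset.UtcNow.Add(_ttl));
        return value;
    }
}

// Usage (LoadUserFromDbAsync is a stand-in for a database or HTTP call):
//   var users = new CacheAside<int, string>(TimeSpan.FromMinutes(5));
//   var name  = await users.GetOrLoadAsync(42, id => LoadUserFromDbAsync(id));
```

Note that this naive version does nothing about stampedes or invalidation, which is exactly the kind of trade-off the article goes on to discuss.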
submitted by /u/anant94 (https://www.reddit.com/user/anant94)
[link] (https://commitlog.cc/posts/caching-beyond-redis) [comments] (https://www.reddit.com/r/programming/comments/1sv664k/caching_beyond_redis_realworld_strategies_that/)
Quickly restoring 1M+ files from backup
https://www.reddit.com/r/programming/comments/1sxy05v/quickly_restoring_1m_files_from_backup/

Back in 2016, we faced a technical challenge: restoring a large number of files (a million or more) from a backup. We had to restore them both quickly and durably, meaning the restored files had to survive a power loss. Neither of the standard approaches worked, so the solution had to rely on a couple of undocumented NT internals. submitted by /u/axkotti (https://www.reddit.com/user/axkotti)
[link] (https://blog.axiorema.com/engineering/quickly-restoring-1m-files-from-backup/) [comments] (https://www.reddit.com/r/programming/comments/1sxy05v/quickly_restoring_1m_files_from_backup/)