Indexing Solana Programs in Rust: Notes From a Python Backend Engineer
via DEV Community: rust (author: Brian Ting)
TL;DR I built a small Solana program activity indexer in Rust to pressure-test the patterns I rely on every day in Python — cursor-based syncs, idempotent ingestion, mockable I/O — against an unfamiliar language and an unfamiliar chain. The repo is here:…
Securing the Stack: How to Build Zero-Trust Applications with WebAssembly
via DEV Community: rust (author: Alain Airom (Ayrom))
My first hands-on experience learning WASM!
Introduction
For a while now, I’ve been diving into articles and books on WebAssembly, exploring its mechanics and potential. But as any developer knows, there is a massive gap between reading a specification and…
Clipboard Monitor + Gemini in a Tauri App — Building a Smarter Dev Tool
All tests run on an 8-year-old MacBook Air.
All results from shipping 7 Mac apps as a solo developer. No sponsored opinion.
HiyokoHelper monitors the clipboard and optionally sends content to Gemini for analysis. It sounds simple. The implementation has specific gotchas.
Here's what I learned.
Clipboard monitoring in Tauri
There's no built-in file-watcher equivalent for the clipboard. You poll.
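```rust
use std::time::Duration;
use tauri::{AppHandle, Emitter}; // assumes Tauri v2, where `emit` comes from the Emitter trait
use tauri_plugin_clipboard_manager::ClipboardExt;
use tokio::time::interval;

async fn watch_clipboard(handle: AppHandle) {
    let mut last = String::new();
    let mut ticker = interval(Duration::from_millis(500));
    loop {
        ticker.tick().await;
        // Read errors (for example, non-text clipboard content) are skipped.
        if let Ok(current) = handle.clipboard().read_text() {
            // Emit only when the text changed and is not empty.
            if current != last && !current.is_empty() {
                last = current.clone();
                handle.emit("clipboard-changed", current).ok();
            }
        }
    }
}
```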
500ms polling is fast enough to feel responsive. Lower than 200ms and you're burning CPU for no user benefit.
What to do with clipboard content
For HiyokoHelper, clipboard content goes to Gemini for command explanation and danger detection. The flow:
1. Clipboard changes
2. Frontend receives clipboard-changed event
3. User clicks "Analyze" (or auto-analyze is on)
4. Content sent to Gemini with a structured prompt
5. Response shown in the UI
Auto-analyze on every clipboard change is too aggressive. Users copy passwords, personal data, sensitive content. Analyze on demand, not automatically.
The dangerous content problem
Terminal error messages and shell commands are the primary use case. But users also copy:
● Passwords
● API keys
● Personal information
● Code with secrets in it
Before sending to Gemini: strip obvious secrets. API keys (long alphanumeric strings), password manager output, anything that looks like a credential.
```rust
fn looks_like_secret(text: &str) -> bool {
    // Long random strings
    let has_long_random = text.split_whitespace()
        .any(|w| w.len() > 30 && w.chars().all(|c| c.is_alphanumeric()));
    // Common secret patterns
    let has_secret_pattern = text.contains("sk-")
        || text.contains("AIza")
        || text.contains("ghp_");
    has_long_random || has_secret_pattern
}
```
Not perfect. Good enough to avoid the most obvious cases.
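The check above only detects a secret; it does not strip one. A minimal sketch of the stripping step, using the same heuristics per token, could look like the following. The function names and the `[REDACTED]` placeholder are assumptions for illustration, not HiyokoHelper's actual code.

```rust
/// Prefixes taken from the check above; illustrative, not exhaustive.
const SECRET_PREFIXES: [&str; 3] = ["sk-", "AIza", "ghp_"];

/// Returns true when a single token looks like a credential.
/// Substring matching gives false positives (e.g. "task-list" contains "sk-").
fn token_looks_secret(token: &str) -> bool {
    let long_random = token.len() > 30 && token.chars().all(|c| c.is_alphanumeric());
    let known_prefix = SECRET_PREFIXES.iter().any(|p| token.contains(p));
    long_random || known_prefix
}

/// Replaces suspicious space-separated tokens with a placeholder,
/// keeping line breaks so terminal output stays readable.
fn redact_secrets(text: &str) -> String {
    text.lines()
        .map(|line| {
            line.split(' ')
                .map(|tok| if token_looks_secret(tok) { "[REDACTED]" } else { tok })
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Redacting per token keeps the surrounding error message intact, so the model still sees the command or stack trace the user wanted explained.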
The history cache
Store clipboard history in SQLite with timestamps. Let users browse, search, and pin items.
Don't store everything forever. Cap at 100 items. Auto-delete items older than 7 days. Clipboard history that grows unbounded is a privacy and storage problem.
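The retention rules can be sketched independently of SQLite. The version below works on an in-memory list as a stand-in for the history table; the `Entry` type, the `prune` function, and the choice to exempt pinned items from both limits are assumptions made for this sketch.

```rust
use std::time::{Duration, SystemTime};

const MAX_ITEMS: usize = 100;
const MAX_AGE: Duration = Duration::from_secs(7 * 24 * 60 * 60);

#[derive(Debug, Clone)]
struct Entry {
    text: String,
    copied_at: SystemTime,
    pinned: bool,
}

/// Applies the retention policy in place: drop unpinned entries older than
/// MAX_AGE, then keep only the MAX_ITEMS newest unpinned entries.
/// Assumption: pinned entries are never deleted automatically.
fn prune(entries: &mut Vec<Entry>, now: SystemTime) {
    entries.retain(|e| {
        e.pinned
            || now
                .duration_since(e.copied_at)
                .map(|age| age <= MAX_AGE)
                .unwrap_or(true) // timestamps in the future are kept
    });
    // Newest first, so the cap removes the oldest unpinned entries.
    entries.sort_by(|a, b| b.copied_at.cmp(&a.copied_at));
    let mut unpinned_seen = 0;
    entries.retain(|e| {
        if e.pinned {
            return true;
        }
        unpinned_seen += 1;
        unpinned_seen <= MAX_ITEMS
    });
}
```

Running a step like this on every insert keeps the history bounded without a background job.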
The verdict
Clipboard monitoring is polling. Gemini integration needs explicit user action, not auto-send. History needs a retention policy.
The useful part — AI analysis of terminal errors and commands — is genuinely useful once you get the UX right.
If this was useful, a ❤️ helps more than you'd think — thanks!
Hiyoko PDF Vault → https://hiyokoko.gumroad.com/l/HiyokoPDFVault
X → @hiyoyok
via DEV Community: rust (author: hiyoyo)
Web Developer Travis McCracken on My Favorite VSCode Extensions for Backend Dev
via DEV Community: rust (author: Travis McCracken Web Developer)
Building Robust Backends with Rust and Go: A Perspective from Web Developer Travis McCracken As a passionate web developer specializing in backend technologies, I’ve spent countless hours exploring the capabilities of different programming languages to build…
Five problems every agent loop has. No framework needed.
Most agent failure modes are not interesting. They are boring. They are the same five problems in different costumes. After eighteen months running agent loops in production, I keep meeting these five and only these five.
I do not build agent frameworks. I build small libraries that fix one failure mode each. You install the one you need. The composition emerges from your code, not a framework's architecture diagram.
Here they are in roughly the order you will hit them.
1. The JSON is not JSON
Your model returns "Sure, here you go:\n```json\n{...,}\n```". You parse it. You crash.
Fix: repair before validate. Strip the fence. Extract the largest balanced JSON object from surrounding prose. Remove trailing commas. Then validate against your schema. If validation fails, send the model back a precise hint, not a generic "invalid JSON, please try again."
The hint is the trick. Smaller models self-correct beautifully on a structural complaint. They do not self-correct on a vague reprimand.
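The repair steps can be sketched with the standard library alone. This is an illustration under assumptions, not the author's library: the function names are hypothetical, and extracting the balanced object also discards the fence and surrounding prose, so there is no separate fence-stripping step.

```rust
/// Returns the index just past the `}` that closes the `{` at `start`,
/// skipping braces that appear inside JSON strings.
fn balanced_end(bytes: &[u8], start: usize) -> Option<usize> {
    let (mut depth, mut in_string, mut escaped) = (0usize, false, false);
    for (i, &b) in bytes.iter().enumerate().skip(start) {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }
    None
}

/// Finds the longest balanced `{...}` span in the raw model output.
fn extract_json_object(raw: &str) -> Option<&str> {
    let bytes = raw.as_bytes();
    let mut best: Option<(usize, usize)> = None;
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'{' {
            if let Some(end) = balanced_end(bytes, i) {
                if best.map_or(true, |(s, e)| end - i > e - s) {
                    best = Some((i, end));
                }
                i = end;
                continue;
            }
        }
        i += 1;
    }
    best.map(|(s, e)| &raw[s..e])
}

/// Removes commas that directly precede `}` or `]` outside of strings.
fn remove_trailing_commas(json: &str) -> String {
    let mut out = String::with_capacity(json.len());
    let (mut in_string, mut escaped) = (false, false);
    for c in json.chars() {
        if in_string {
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            out.push(c);
            continue;
        }
        match c {
            '"' => in_string = true,
            '}' | ']' => {
                let trimmed_len = out.trim_end().len();
                if out[..trimmed_len].ends_with(',') {
                    out.remove(trimmed_len - 1);
                }
            }
            _ => {}
        }
        out.push(c);
    }
    out
}

/// Repair step to run before schema validation.
fn repair(raw: &str) -> Option<String> {
    extract_json_object(raw).map(remove_trailing_commas)
}
```

When `repair` returns `None`, or the schema check afterwards fails, the hint sent back to the model can name the exact structural problem rather than a generic retry request.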
2. The tool args are wrong
The model picks the right tool. It calls it with units: "kelvin" against an enum of ["c", "f"]. You run the tool. Bad things happen.
Fix: validate every tool call against its schema before running it. Validation issues become the tool's response. Feed them back. The model fixes the call on the next turn.
Always return all validation issues at once, not the first. The model fixes them in one retry, not five.
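A minimal sketch of "collect every issue, then report" follows. The `ArgSpec` schema type is a deliberately tiny, hypothetical stand-in for a real JSON Schema validator: every listed argument is required, and an empty `allowed` list means any value is accepted.

```rust
use std::collections::HashMap;

/// Hypothetical mini-schema for one tool argument.
struct ArgSpec {
    name: &'static str,
    allowed: &'static [&'static str],
}

/// Returns every validation issue at once, in schema order, followed by
/// unknown arguments in sorted order. An empty vector means the call is valid.
fn validate(args: &HashMap<String, String>, schema: &[ArgSpec]) -> Vec<String> {
    let mut issues = Vec::new();
    for spec in schema {
        match args.get(spec.name) {
            None => issues.push(format!("missing required argument `{}`", spec.name)),
            Some(v)
                if !spec.allowed.is_empty()
                    && !spec.allowed.iter().any(|a| *a == v.as_str()) =>
            {
                issues.push(format!(
                    "`{}` must be one of {:?}, got {:?}",
                    spec.name, spec.allowed, v
                ));
            }
            Some(_) => {}
        }
    }
    let mut unknown: Vec<&String> = args
        .keys()
        .filter(|k| !schema.iter().any(|s| s.name == k.as_str()))
        .collect();
    unknown.sort();
    for k in unknown {
        issues.push(format!("unknown argument `{}`", k));
    }
    issues
}
```

Joining the returned issues into one message and sending it back as the tool's response gives the model everything it needs to fix the call in a single retry.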
3. The agent wanders to the wrong network
The minute your agent can pick URLs, you handed it URL-picking power. A confused-deputy bug or a prompt injection sends a fetch to a domain you did not authorize. By the time you notice, an API key is in an attacker's log.
Fix: declarative domain allowlist. List the four hosts your agent legitimately needs. Block everything else with an error message at the HTTP layer.
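As a rough illustration, an allowlist check can be written to fail closed: anything it cannot parse with certainty is rejected. The hand-rolled host extraction below is an assumption made to keep the sketch dependency-free; production code would more likely parse with the same URL parser the HTTP client uses and re-check every redirect. Function names and example hosts are hypothetical.

```rust
/// Extracts the lowercase host from an absolute http(s) URL.
/// Returns None for anything unusual (userinfo, IPv6 literals, odd characters).
fn host_of(url: &str) -> Option<String> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    let (host, port) = match authority.split_once(':') {
        Some((h, p)) => (h, Some(p)),
        None => (authority, None),
    };
    // A port, if present, must be digits only. This also rejects
    // "good.host:80@evil.host", where the real host follows the '@'.
    let port_ok = port.map_or(true, |p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()));
    // Hosts are limited to letters, digits, '-' and '.', which rules out
    // '@', backslashes, and whitespace tricks.
    let host_ok = !host.is_empty()
        && host.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '.');
    if port_ok && host_ok {
        Some(host.to_ascii_lowercase())
    } else {
        None
    }
}

/// Ok(()) when the URL's host is on the allowlist, otherwise an error message
/// that can be returned to the agent instead of performing the request.
fn check_url(url: &str, allowlist: &[&str]) -> Result<(), String> {
    match host_of(url) {
        Some(host) if allowlist.iter().any(|a| a.eq_ignore_ascii_case(&host)) => Ok(()),
        Some(host) => Err(format!("host `{}` is not on the allowlist", host)),
        None => Err(format!("refusing to fetch unparseable URL `{}`", url)),
    }
}
```

Matching is exact per host, so a subdomain or look-alike suffix of an allowed host is still blocked.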
4. The context budget runs out
You stack five turns of chat history. You drop the system message accidentally during truncation. The agent forgets what it is doing. Or you drop the trailing user turn and the model answers a question you never asked.
Fix: anchored truncation. Preserve the leading system message and the trailing user turn. Drop from the middle.
Drop-oldest is the right default for chat. Drop-middle is better when you want both early grounding and recent context. Both keep the load-bearing pieces of the prompt.
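Anchored truncation with the drop-oldest strategy can be sketched as below. The `Message` type and the word-count cost function are assumptions for illustration; a real implementation would count tokens with the model's tokenizer. This sketch anchors the final message whatever its role, which covers the trailing user turn in a normal chat.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// Rough cost estimate: one unit per whitespace-separated word (an assumption).
fn cost(m: &Message) -> usize {
    m.content.split_whitespace().count()
}

/// Keeps the leading system message and the final message, then fills the
/// remaining budget with the most recent turns. The anchors are kept even if
/// they alone exceed the budget.
fn fit_drop_oldest(messages: &[Message], budget: usize) -> Vec<Message> {
    let n = messages.len();
    if n == 0 {
        return Vec::new();
    }
    let head_len = if n > 1 && messages[0].role == "system" { 1 } else { 0 };
    let tail = &messages[n - 1];
    let mut used: usize = messages[..head_len].iter().map(cost).sum::<usize>() + cost(tail);

    // Walk the middle from newest to oldest, keeping turns while they fit.
    let mut kept_middle: Vec<Message> = Vec::new();
    for m in messages[head_len..n - 1].iter().rev() {
        let c = cost(m);
        if used + c > budget {
            break;
        }
        used += c;
        kept_middle.push(m.clone());
    }
    kept_middle.reverse();

    let mut out = messages[..head_len].to_vec();
    out.extend(kept_middle);
    out.push(tail.clone());
    out
}
```

Because the anchors are reserved before anything else is counted, truncation can never drop the system message or the question being answered.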
5. Regressions sneak in
You tweak a system prompt. The agent now picks tools in a slightly different order. Sometimes that is fine. Sometimes it is a regression that breaks the deployed app and you only notice next Friday.
Fix: snapshot tests for agent traces. Record one run end-to-end. First test run writes the snapshot. Later runs diff and fail with a unified diff if anything changed. Refresh with an env var when the change is intentional.
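The record-then-compare flow fits in a few lines of standard-library Rust. The sketch below assumes the trace has already been serialized to text; the `UPDATE_SNAPSHOTS` variable name and both function names are hypothetical, and the line-by-line report is a simplified stand-in for a true unified diff.

```rust
use std::{env, fs, path::Path};

/// Hypothetical env var that refreshes snapshots when set.
const UPDATE_VAR: &str = "UPDATE_SNAPSHOTS";

/// Writes the snapshot on the first run (or when refreshing); otherwise
/// compares and returns a line-level report of what changed.
fn check_snapshot(path: &Path, actual: &str) -> Result<(), String> {
    let refresh = env::var_os(UPDATE_VAR).is_some();
    if refresh || !path.exists() {
        fs::write(path, actual).map_err(|e| format!("cannot write snapshot: {}", e))?;
        return Ok(());
    }
    let expected =
        fs::read_to_string(path).map_err(|e| format!("cannot read snapshot: {}", e))?;
    if expected == actual {
        return Ok(());
    }
    Err(line_diff(&expected, actual))
}

/// Positional line comparison: "-N" is the snapshot line, "+N" the new line.
fn line_diff(expected: &str, actual: &str) -> String {
    let exp: Vec<&str> = expected.lines().collect();
    let act: Vec<&str> = actual.lines().collect();
    let mut out = String::from("--- snapshot\n+++ actual\n");
    for i in 0..exp.len().max(act.len()) {
        match (exp.get(i), act.get(i)) {
            (Some(e), Some(a)) if e == a => {}
            (e, a) => {
                if let Some(e) = e {
                    out.push_str(&format!("-{}: {}\n", i + 1, e));
                }
                if let Some(a) = a {
                    out.push_str(&format!("+{}: {}\n", i + 1, a));
                }
            }
        }
    }
    out
}
```

In a test, the returned error string can be passed straight to `panic!` so the changed tool order shows up in the failure output.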
The framework I did not write
A naive agent loop hitting all five of these:
```rust
let fitted = Fitter::new(8_000).fit(messages, Strategy::DropOldest);
let raw = call_model(&fitted).await?;
let action = caster.parse(&raw)?;
if action.kind == "tool" {
    validator(&action.tool)?.validate(&action.args)
        .map_err(|e| anyhow!(e.for_llm()))?;
    run_tool(&action).await
} else {
    Ok(action.text)
}
```
Ten lines. Five concerns. Each concern is a separate 200-line library that does one thing and ships independently.
The framework version of this is 2000 lines, locks you to one HTTP client, is opinionated about which provider you use, and bundles all five concerns into a single API you cannot pry apart when one of them is wrong.
I have shipped both kinds. The small libraries win every time.
If you are building an agent and have not hit problems 1 or 2 yet, you will. Skip the framework. Pick the small library when the problem actually shows up. Compose.
That is the whole stack.
via DEV Community: rust (author: Mukunda Rao Katta)
Building a Low-Cost Image Converter on AWS With Rust Lambda
Project demo link: https://image-ignite.vercel.app/
This started as a hobby project. I was thinking about a simple image conversion service for resizing, converting, and compressing uploaded images.
Since it was a personal project, cost mattered a lot.
Keeping EC2 instances running 24/7 felt unnecessary for a workload that only existed for a few seconds per request.
So I started experimenting with AWS Lambda.
First Attempt: Node.js + Sharp on Lambda
My initial stack was:
The architecture was very simple:
Upload → Lambda → Process Image → Store Result
At first, Lambda felt like the perfect solution:
● no idle server costs
● automatic scaling
● no server management
● pay only when used
Exactly what I wanted for a hobby project.
The Problem I Started Noticing
After some usage, cold starts became noticeable.
Especially for image-heavy requests.
The actual image processing was fast enough, but startup time sometimes became a large part of the total request.
Approximate numbers from my experience:
For a workload that only runs a few seconds, that startup overhead feels significant.
Why I Switched to Rust
I rebuilt the processor using Rust Lambda.
Not because of hype.
The workload simply matched Rust better.
The difference became noticeable almost immediately.
For this type of workload, lower startup time mattered a lot more than I initially expected.
Final Thoughts
This project started as a small hobby image converter where I wanted to avoid the cost of always-on infrastructure. I initially used Node.js with Sharp on AWS Lambda, but cold starts became noticeable for short-lived image processing tasks. Moving to Rust improved startup time and reduced memory usage significantly. More importantly, it changed how I think about infrastructure — some workloads work better as temporary, event-driven compute instead of persistent servers.
via DEV Community: rust (author: fayismahmood)
Building an open-source Windows file transfer app with Rust, Tauri and QUIC
via DEV Community: rust (author: Kerim Sabic)
I recently launched Lightning P2P, an open-source Windows file transfer app. The idea is simple: Select a file → generate a link or QR code → receive the file. No account. No cloud…
The 2 extra bits — building software for systems that can't fail
Why "38bits"?
In 1952, IBM introduced the IBM 701, its first commercial scientific computer. Its word was 36 bits wide.
But the engineers gave the accumulator 2 more.
Not because the spec demanded it. Not because customers asked. They added them because critical operations cannot tolerate overflow on the last digit. Those 2 bits never needed to exist — but they're what separated sufficient from exceptional.
That's the name. That's the principle.
What we do
We build software for systems that can't fail:
● Smart contracts on Solana with Anchor + Rust
● Zero-trust security from the first commit
● APIs built for real load — observability, rate limiting, CI/CD with production-grade tests
● Independent code review with seniority — not a trainee reviewing the senior
Brazilian fintechs. DeFi protocols. Backends where downtime has a real dollar cost.
Principles we don't negotiate
1. Senior in every critical decision. Not a junior executing while a senior reviews once. Real seniority shapes architecture, not just approves PRs.
2. Zero-trust from commit #1. Auth, rate limiting, observability, vulnerability scanning — not bolted on after the first incident.
3. Tests in production-real conditions. Concurrency, load, edge cases. Local with happy paths is not enough.
4. Living documentation. Decisions get written. Tradeoffs get documented. If it's not written, it didn't happen.
5. Defined SLA + SLO. Not "we'll do our best" — measurable commitments with consequences.
The customer we want
CTOs, engineering heads, technical founders running:
● Fintech (PIX, payment processors, exchanges)
● DeFi protocols pre-audit
● Critical infrastructure where downtime ≠ negotiable
● Pre-launch systems that can't afford a bad first impression
If you're hunting for the cheapest provider, we're not it.
If you've been burned by a code review that didn't catch what it should have — we want to talk.
What's next here
We're going to share:
● War stories from production (anonymized)
● Stack decisions and tradeoffs
● Solana/Rust patterns we actually use
● Security findings worth sharing
● How we think about seniority in code review
38bits — software where the 2 extra bits matter.
Site: drexbrasil.com · Telegram: @Fl38bits_bot
via DEV Community: rust (author: 38bits)
I’ve Given Up on Bun. I’m Removing It from SuperRails
via DEV Community: rust (author: Hulk in Public)
Bun’s implementation language has been migrated from Zig to Rust. I have no intention of criticizing either Zig or Rust. I think both are excellent languages. What I want to criticize is Bun’s development process.
Rewrite Bun in Rust #30412
DataZen: a 10 MB open-source database client built with Tauri and Rust
DataZen is a free, MIT-licensed desktop app for PostgreSQL, MySQL, SQLite, and Redis.
Why another client?
● TablePlus is great but paid for many teams
● DBeaver is powerful but heavy on RAM and startup time
DataZen targets daily dev work: connect, browse, run SQL, export — in a <10 MB installer.
Stack
● Tauri v2 + Rust backend (sqlx, redis, russh for SSH)
● React + CodeMirror 6 frontend
● Credentials encrypted locally (AES-256-GCM)
Features
● Multi-window workflow
● Built-in SSH tunnels (no local ssh binary)
● SQL editor with table/column autocomplete
● Virtual scrolling for large tables
● Backup to SQL, CSV/JSON import/export
● PG ↔ MySQL schema + data sync
● Redis key browser
● Dark theme, English + Chinese UI
Status
Early v0.0.3, but I use it as a daily driver for SQL + Redis.
● Download: https://github.com/flyxl/datazen/releases
● Site: https://flyxl.github.io/datazen/
● Repo: https://github.com/flyxl/datazen
macOS: if Gatekeeper blocks the app, run xattr -cr /Applications/DataZen.app after install.
Feedback: wuxiaolongklws@gmail.com. Stars and issues welcome!
via DEV Community: rust (author: flyxl)
I Built "harumi" — A Pure Rust PDF Editing Library with CJK Support
Overview
harumi is a Pure Rust library that lets you dynamically add CJK text (Japanese, Chinese, Korean) to existing PDFs. Unlike bindings-based solutions, it has zero C dependencies and handles font subsetting automatically.
● crates.io: https://crates.io/crates/harumi
● GitHub: https://github.com/kent-tokyo/harumi
Why Another Rust PDF Crate?
The existing Rust PDF ecosystem leaves a gap:
harumi fills that gap: append-only editing of existing PDFs, Pure Rust, with automatic CJK font subsetting and ToUnicode CMap generation built in.
The Three Hard Problems of CJK in PDF
Getting Japanese (and CJK in general) right inside a PDF isn't just about "embedding a font." There are three distinct challenges:
1. Font Subsetting
A full Japanese font file can easily exceed 10 MB. For practical file sizes you must extract only the glyphs actually used and rebuild the font binary — this is subsetting. harumi does this automatically at save time.
2. ToUnicode CMap Generation
PDFs separate rendering (Glyph IDs) from semantics (Unicode code points). Without a ToUnicode CMap, copy-paste and text search produce garbled output. harumi generates this mapping for every font it embeds.
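To make the mapping concrete, here is a sketch that emits a ToUnicode CMap body from a glyph-ID-to-character table. It is not harumi's implementation: the function name is hypothetical, and it assumes 2-byte codes where the code in the content stream equals the glyph ID (as with an Identity-H encoding). Characters outside the Basic Multilingual Plane are written as UTF-16 surrogate pairs, and `bfchar` blocks are split at 100 entries.

```rust
use std::collections::BTreeMap;

/// Builds the text of a ToUnicode CMap for 2-byte glyph IDs.
fn to_unicode_cmap(glyph_to_char: &BTreeMap<u16, char>) -> String {
    let mut out = String::new();
    out.push_str("/CIDInit /ProcSet findresource begin\n12 dict begin\nbegincmap\n");
    out.push_str("/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def\n");
    out.push_str("/CMapName /Adobe-Identity-UCS def\n/CMapType 2 def\n");
    out.push_str("1 begincodespacerange\n<0000> <FFFF>\nendcodespacerange\n");

    let entries: Vec<(u16, char)> = glyph_to_char.iter().map(|(g, c)| (*g, *c)).collect();
    // Each bfchar block may hold at most 100 mappings.
    for chunk in entries.chunks(100) {
        out.push_str(&format!("{} beginbfchar\n", chunk.len()));
        for &(gid, ch) in chunk {
            // Destination is the character as big-endian UTF-16 hex.
            let mut buf = [0u16; 2];
            let hex: String = ch
                .encode_utf16(&mut buf)
                .iter()
                .map(|unit| format!("{:04X}", unit))
                .collect();
            out.push_str(&format!("<{:04X}> <{}>\n", gid, hex));
        }
        out.push_str("endbfchar\n");
    }

    out.push_str("endcmap\nCMapName currentdict /CMap defineresource pop\nend\nend\n");
    out
}
```

A viewer that copies text looks up each glyph code in this table, which is why a missing or stale table produces garbled clipboard output even when the page renders correctly.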
3. Glyph Advance Width Recalculation
After subsetting, Glyph IDs are reassigned. The advance widths stored in the PDF must be recalculated to match — otherwise text spacing breaks. harumi handles this as part of the save pipeline.
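The bookkeeping can be illustrated without any font parsing. In this sketch, which uses hypothetical function names and is not harumi's code, the used glyphs get compact new IDs and the width table is rebuilt so that its index is the new ID. Glyph 0 (.notdef) is always kept at index 0, and widths are in whatever unit the caller already uses.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps each used glyph ID in the original font to a compact ID in the subset.
fn reassign_glyph_ids(used: &BTreeSet<u16>) -> BTreeMap<u16, u16> {
    let mut map = BTreeMap::new();
    map.insert(0u16, 0u16); // .notdef stays first
    for &old in used {
        if old != 0 {
            let new = map.len() as u16;
            map.insert(old, new);
        }
    }
    map
}

/// Rebuilds the advance-width table so that index == new glyph ID.
/// Glyphs missing from the old table get width 0.
fn remap_widths(old_widths: &[u16], mapping: &BTreeMap<u16, u16>) -> Vec<u16> {
    let mut new_widths = vec![0u16; mapping.len()];
    for (&old, &new) in mapping {
        new_widths[new as usize] = old_widths.get(old as usize).copied().unwrap_or(0);
    }
    new_widths
}
```

If the widths were left indexed by the old IDs, each subset glyph would pick up the advance of an unrelated glyph, which is the broken spacing described above.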
Lazy Subsetting Pipeline
harumi uses a lazy subsetting design to handle all three problems in one pass:
1. embed_font(): store raw font bytes; no processing yet
2. Collect all text draw calls across all pages
3. Walk every page at save() time, gathering the complete set of used characters
4. Subset the font to only those glyphs
5. Reassign Glyph IDs
6. Build the ToUnicode CMap
7. Recalculate advance widths and write the final CIDFont object
This single-pass approach avoids redundant font processing and keeps the implementation straightforward.
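The deferral in steps 1 to 3 can be shown with a toy type. `LazyDoc` below is hypothetical and unrelated to harumi's real `Document`; it only demonstrates that nothing font-related is computed until the full set of characters is known.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a document that defers all font work until save time.
struct LazyDoc {
    font_bytes: Vec<u8>,         // stored untouched at embed time
    draws: Vec<(usize, String)>, // (page index, text) for every draw call
}

impl LazyDoc {
    /// Step 1: keep the raw font bytes; no parsing or subsetting yet.
    fn embed_font(font_bytes: Vec<u8>) -> Self {
        LazyDoc { font_bytes, draws: Vec::new() }
    }

    /// Step 2: record the draw call instead of processing glyphs now.
    fn add_text(&mut self, page: usize, text: &str) {
        self.draws.push((page, text.to_string()));
    }

    /// Step 3: at save time, walk every recorded draw call and gather the
    /// complete set of characters the subset must contain.
    fn used_chars(&self) -> BTreeSet<char> {
        self.draws.iter().flat_map(|(_, text)| text.chars()).collect()
    }
}
```

The set returned by `used_chars` is the input that subsetting, glyph ID reassignment, CMap generation, and width recalculation would all share, which is what lets the font be processed once.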
Feature Overview
```rust
use harumi::Document;

let mut doc = Document::open("input.pdf")?;

// Append text (including invisible text for search layers)
doc.page(0).add_text("Hello, 世界!", font, 12.0, x, y)?;

// Draw shapes and embed images
doc.page(0).draw_rect(x, y, width, height, color)?;
doc.page(0).embed_image(image_bytes, x, y, width, height)?;

// Page operations
doc.rotate_page(1, 90)?;
doc.delete_page(2)?;
doc.reorder_pages(&[2, 0, 1])?;

// Merge and split
let other = Document::open("other.pdf")?;
doc.merge(other)?;
let parts = doc.split_at(&[3])?;

// Extract text
let text = doc.extract_text(0)?;

// Metadata
doc.set_title("My Document")?;
doc.save("output.pdf")?;
```
Current Status & Roadmap
harumi is published on crates.io and the source is available on GitHub.
Planned improvements:
● Broader CJK font format support
● Form field editing
● Performance optimizations for large documents
Feedback, issues, and contributions are very welcome!
via DEV Community: rust (author: kent-tokyo)
Multi-tenant auth that proves, not promises: how GarraIA closes 110 RBAC scenarios + 81 RLS scenarios before the Phase 3 beta
via DEV Community: rust (author: Michel)
In GarraIA (an AI agent framework in Rust, 100% local, MIT-licensed), Phase 3 (Group Workspace) is the module where multiple users share files, tasks, chats, and AI memory inside a common space. It is the module that determines whether the project can be used by…
Building a Browser-Based RPG Map Editor with Rust, WebAssembly, WebGL2, and React
via DEV Community: rust (author: TheXper)
I've been building RPGMapEditor.com — a browser-based fantasy map editor for dungeon masters, worldbuilders, and tabletop RPG players. The stack is: Rust + WebAssembly for the editor core, WebGL2 for rendering, React + TypeScript for UI, Rocket for the backend…