Launch HN: Infra.new (YC W23) – DevOps Copilot with Guardrails Built In
2 by TankeJosh | 0 comments on Hacker News.
Hey HN, we’re Caleb, Michael, and Josh, the founders of infra.new ( https://infra.new/ ), a DevOps Copilot that can configure and deploy apps on AWS, GCP, and Azure using Terraform and GitHub Actions.

You start by describing your infrastructure needs in detail and optionally attach any source code. The agent will clarify your requirements and either execute the task immediately or generate a plan with step-by-step instructions for you to approve. Once you’re happy with the changes, export everything to GitHub or let the agent provision it in your cloud account. Here’s a quick demo of deploying a new app to GCP / AWS: https://ift.tt/JyqkpIT

Why build a new coding agent when there are good ones already out there? We believe there’s room for an agent built specifically for DevOps tasks, since the risks are much higher: it's easy to roll back AI-related errors in a web app, but fixing a misconfigured database is not nearly as easy. By focusing specifically on cloud infra, we can provide all the visibility and checks you need to feel confident in your configuration changes.

At our previous jobs, we built an internal data / ML platform at Google Life Sciences that involved migrating off of internal Google infrastructure to the public cloud (GCP). We quickly learned how complicated it can be to configure cloud infrastructure well, even for seemingly simple tasks. Configuring an app with CI/CD requires knowledge of multiple infra tools, cloud services, and best practices. Mistakes can be costly, and diagnosing issues can send you down a rabbit hole of cloud docs. Our goal is to help engineers feel confident when making changes in their cloud.

We designed the workflow to start with a prompt, a template, or a GitHub repository. After clarifying your requirements, the agent will start generating IaC, CI/CD, and other configurations using the latest docs, public Terraform registries, and a set of best practices we dynamically load into the context window. All changes are run through static analysis to detect hallucinations, estimate cost changes, and visualize your infrastructure components as you go.

Once you’re happy with the changes, you can export everything to GitHub for review. You also have the option to deploy directly to your cloud from the workspace and let the agent diagnose any deployment issues. The deployment flow is "pseudo-deterministic" in that it follows a checklist of human-guided instructions that help it stay in bounds, but we still recommend only using this feature for dev environments and using GitOps for any changes to production.

The current plan is to continue adding support for more tools (Kubernetes and GitLab are next), and we may add a CLI that lets you bring the agent into your local workspace. We’d love to hear your feedback and ideas!
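As a concrete illustration of the kind of static "hallucination" check the post describes, here is a minimal Python sketch that flags resource types in a Terraform plan that no installed provider actually defines. The file name `plan.out` and the overall flow are assumptions for illustration, not infra.new's actual implementation:

```python
# Hypothetical sketch: compare resource types in a saved Terraform plan
# against the installed provider schemas; anything the providers don't
# define is a candidate hallucination.
import json
import subprocess

def plan_json(plan_file: str) -> dict:
    # `terraform show -json <plan>` renders a saved plan as machine-readable JSON.
    out = subprocess.run(
        ["terraform", "show", "-json", plan_file],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)

def known_resource_types() -> set[str]:
    # `terraform providers schema -json` lists every resource type
    # each installed provider defines.
    out = subprocess.run(
        ["terraform", "providers", "schema", "-json"],
        capture_output=True, text=True, check=True,
    )
    schemas = json.loads(out.stdout)
    types: set[str] = set()
    for provider in schemas.get("provider_schemas", {}).values():
        types.update(provider.get("resource_schemas", {}).keys())
    return types

def find_unknown_resources(plan_file: str) -> list[str]:
    known = known_resource_types()
    changes = plan_json(plan_file).get("resource_changes", [])
    return [rc["address"] for rc in changes if rc["type"] not in known]

if __name__ == "__main__":
    for address in find_unknown_resources("plan.out"):
        print(f"possibly hallucinated resource: {address}")
```

Attribute-level validation against the provider schema would catch subtler errors than unknown types; the post does not specify exactly which checks infra.new runs.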
Surprises in Logic
2 by jxmorris12 | 0 comments on Hacker News.
Algebraic Semantics for Machine Knitting
28 by PaulHoule | 0 comments on Hacker News.
Show HN: Morphik – Open-source RAG that understands PDF images, runs locally
3 by Adityav369 | 0 comments on Hacker News.
Hey HN, we’re Adi and Arnav. A few months ago, we hit a wall trying to get LLMs to answer questions over research papers and instruction manuals. Everything worked fine until the answer lived inside an image or diagram embedded in the PDF. Even GPT-4o flubbed it (we recently tried o3 on the same task, and surprisingly it flubbed it too). Naive RAG pipelines just pulled in some text chunks and ignored the rest.

We took an invention disclosure PDF ( https://ift.tt/FTBSvx5... ) containing an IRR-vs-frequency graph and asked GPT “From the graph, at what frequency is the IRR maximized?”. We originally tried this on GPT-4o, but while writing this post used the new natively multimodal o4-mini-high. After a 30-second thinking pause, it asked for clarifications, then churned out buggy code, pulled data from the wrong page, and still couldn’t answer the question. We wrote up the full story with screenshots here: https://ift.tt/hf1OCyI

We got frustrated enough to try fixing it ourselves. We built Morphik to do multimodal retrieval over documents like PDFs, where images and diagrams matter as much as the text. To do this, we use ColPali-style embeddings, which treat each document page as an image and generate multi-vector representations. These embeddings capture layout, typography, and visual context, allowing retrieval to return a whole table or schematic, not just nearby tokens. Combined with vector search, this retrieves the exact pages containing the relevant diagrams and passes them as images to the LLM. It’s able to answer the question with an 8B Llama 3.1 vision model running locally!

Early pharma testers hit our system with queries like "Which EGFR inhibitors at 50 mg showed ≥ 30% tumor reduction?" We returned the right tables and plots, but still hit a bottleneck: we weren’t able to join the dots across multiple reports. So we built a knowledge graph: we tag entities in both text and images, normalize synonyms (Erlotinib → EGFR inhibitor), infer relations (e.g. administered_at, yields_reduction), and stitch everything into a graph. Now a single query can traverse that graph across documents and surface a coherent, cross-document answer along with the correct pages as images.

To illustrate that, and just for fun, we built a graph of 100 of Paul Graham’s essays here: https://ift.tt/CcNmyO9 You can search for various nodes (e.g. startup, Sam Altman, Paul Graham) and see the corresponding connections. In our system, we create graphs and store the relevant text chunks along with the entities, so at query time we can extract the relevant entity, search the graph, and pull in the text chunks of all connected nodes, improving cross-document queries.

For longer or multi-turn queries, we added persistent KV caching, which stores intermediate key-value states from transformer attention layers. Instead of recomputing attention from scratch every time, we reuse the cached states, speeding up repeated queries and letting us handle much longer context windows.

We’re open source under the MIT Expat license: https://ift.tt/3LK4cDm

Would love to hear your RAG horror stories (what worked, what didn’t) and any feedback on Morphik. We’re here for it.
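To make the ColPali-style retrieval concrete, here is a minimal sketch of late-interaction (MaxSim) scoring over page-patch embeddings. The function names, array shapes, and the assumption that vectors are L2-normalized are ours for illustration, not Morphik's code:

```python
# Late-interaction (MaxSim) scoring: each page is a bag of patch embeddings,
# each query a bag of token embeddings. A page's score is the sum, over query
# tokens, of each token's best-matching patch similarity.
import numpy as np

def maxsim_score(query_vecs: np.ndarray, page_vecs: np.ndarray) -> float:
    """query_vecs: (num_query_tokens, dim); page_vecs: (num_patches, dim).
    Both assumed L2-normalized, so dot products are cosine similarities."""
    sims = query_vecs @ page_vecs.T        # (tokens, patches) similarity matrix
    return float(sims.max(axis=1).sum())   # best patch per token, summed

def rank_pages(query_vecs, pages):
    """pages: list of (page_id, page_vecs). Returns ids scored best-first."""
    scored = [(pid, maxsim_score(query_vecs, pv)) for pid, pv in pages]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

Because every query token independently picks its best patch, a table or schematic can win on visual layout even when the surrounding text says little.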
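And a rough sketch of the cross-document graph step, using networkx: resolve an entity from the query, walk its neighborhood, and collect the text chunks stored on connected nodes. Storing chunks as node attributes and the toy graph below are illustrative assumptions:

```python
# Hypothetical graph-traversal step: pull chunks from all nodes within
# `hops` of the query entity, so evidence from other documents comes along.
import networkx as nx

def chunks_for_entity(graph: nx.Graph, entity: str, hops: int = 1) -> list[str]:
    if entity not in graph:
        return []
    # All nodes reachable within `hops` edges, including the entity itself.
    nearby = nx.single_source_shortest_path_length(graph, entity, cutoff=hops)
    chunks: list[str] = []
    for node in nearby:
        chunks.extend(graph.nodes[node].get("chunks", []))
    return chunks

# Tiny example with a normalized synonym edge and a typed relation.
g = nx.Graph()
g.add_node("Erlotinib", chunks=["Erlotinib at 50 mg yielded 34% tumor reduction..."])
g.add_node("EGFR inhibitor", chunks=["EGFR inhibitors block receptor signaling..."])
g.add_edge("Erlotinib", "EGFR inhibitor", relation="is_a")
print(chunks_for_entity(g, "EGFR inhibitor"))
```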
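Finally, a hedged sketch of the KV-cache idea using the Hugging Face transformers API: encode a shared prefix once, keep its past_key_values, and run only the new tokens on follow-up turns. The model choice (gpt2 as a small stand-in) and the in-memory reuse are assumptions; the post's persistence layer is not shown:

```python
# Reusing transformer key/value states across turns so the shared document
# prefix is not re-encoded on every query.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prefix = tok("Shared document context ...", return_tensors="pt")
with torch.no_grad():
    out = model(**prefix, use_cache=True)             # builds the KV cache
cache = out.past_key_values                           # reuse (or persist) this

follow_up = tok(" At what frequency does IRR peak?", return_tensors="pt")
total_len = prefix.input_ids.shape[1] + follow_up.input_ids.shape[1]
attn = torch.ones(1, total_len, dtype=torch.long)     # mask covers prefix + new
with torch.no_grad():
    # Only the new tokens are processed; attention reads the cached prefix KVs.
    out2 = model(input_ids=follow_up.input_ids,
                 past_key_values=cache,
                 attention_mask=attn,
                 use_cache=True)
```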