Stuff

Show HN: ART – a new open-source RL framework for training agents
12 by kcorbitt | 0 comments on Hacker News.
Hey HN, I wanted to share a new project we've been working on for the last couple of months called ART ( https://ift.tt/5myhoDk ). ART is a new open-source framework for training agents using reinforcement learning (RL). RL allows you to train an agent to perform better at any task whose outcome can be measured and quantified.

There are many excellent projects focused on training LLMs with RL, such as GRPOTrainer ( https://ift.tt/h4Hz7LV ) and verl ( https://ift.tt/LI47yXz ). We've used these frameworks extensively for customer-facing projects at OpenPipe, but grew frustrated with some key limitations:

- Multi-turn workflows, where the agent calls a tool, gets a response, and calls another, are not well supported. This makes these frameworks a non-starter for any task that requires an agent to perform a sequence of actions.
- Other frameworks typically have low GPU efficiency. They may require multiple H100 GPUs just to train a small 7B parameter model, and aren't able to keep the GPUs busy consistently during both the "rollout" and "training" phases of the training loop.
- Existing frameworks are typically not a convenient shape for integrating with existing agentic codebases. Existing trainers expect you to call raw text completion endpoints, and don't automatically provide industry-standard chat completion APIs.

ART is designed to address these limitations and make it easy to train high-quality agents. We've also shared many details and practical lessons learned in this post, which walks through a demo of training an email research agent that outperforms o3 ( https://ift.tt/lOhzij9 ). You can also find out more about ART's architecture in our announcement post ( https://ift.tt/AvJhFmr... ). Happy to answer any questions you have!

Show HN: Create your own finetuned AI model using Google Sheets
7 by QueensGambit | 0 comments on Hacker News.
Hello HN, We built Promptrepo to make finetuning accessible to product teams, not just ML engineers.

Last week, OpenAI’s CPO shared how they use fine-tuning for everything from customer support to deep research, and called it the future for serious AI teams. Yet most teams I know still rely on prompting, because fine-tuning is too technical, while the people who have the training data (product managers and domain experts) are often non-technical.

With Promptrepo, they can now:

- Add training examples in Google Sheets
- Click a button to train
- Deploy and test instantly
- Use OpenAI, Claude, Gemini or Llama models

We’ve used this internally for years to power AI workflows in our products (Formfacade, Formesign, Neartail), and we're now opening it up to others. Would love your feedback and happy to answer any questions!

---

Try it free - https://ift.tt/OQtnRGP
Demo video - https://www.youtube.com/watch?v=e1CTin1bD0w
Why we built it - https://ift.tt/S9IHFVr...

Young people aren't as happy as they used to be [Global Flourishing Study]
24 by marojejian | 21 comments on Hacker News.

Archil (YC F24) Is Hiring a Distributed Systems Engineer (In-Person, SF)
1 by huntaub | 0 comments on Hacker News.

Show HN: 1.2 users a day to keep the 9–5 away
4 by dmasiii | 1 comment on Hacker News.
In my long career as an “almost digital entrepreneur” (a fancy way to say I’ve tried a thousand things online without making a single cent), I never really felt that “this is it, I’m so close, I’ll finally quit everything and update my passport: job title? SaaS founder.” (Small detail: I don’t even have a passport. But I like to imagine that if I did, I’d want something cooler than “unemployed creative” written on it.)

For years, I collected side projects, hobbies, half-dead MVPs, and random nonsense, all with the same ending: super hyped at the beginning, burned out in the middle, completely abandoned by the end. But a couple years ago, I decided to take things more seriously (well… I try). I started building SaaS products. Simple, fast stuff, nothing too fancy.

And finally, after a long toxic relationship with perfectionism, I realized something super basic but actually powerful:

I don’t need thousands of users.
I just need 1.2 paying users a day.

Literally. Not to get rich, no Lamborghinis parked outside (also, I live in an apartment with no garage), but enough to live well, keep building, and maybe say “this is my job” without looking down in shame.

It’s part math, part mindset. Like they told us in the first year of computer science: big problems get solved by breaking them into smaller ones.

100 users a day? Anxiety.
1.2 users a day? I can breathe.

So yeah, this is my new mantra: “1.2 a day to keep the office job away.” Let’s see where this road takes me.

NotebookLM Audio Overviews are now available in over 50 languages
55 by saikatsg | 21 comments on Hacker News.

Reversible computing with mechanical links and pivots
12 by tennysont | 1 comment on Hacker News.

Columbia student Mohsen Mahdawi is free after judge orders his release
30 by zzzeek | 1 comment on Hacker News.

The Mira Pro Color is Boox's first color E Ink monitor
17 by tortilla | 7 comments on Hacker News.