Show HN: I’m 16 years old and working on my first startup, a study app
26 by WilliamCranna | 9 comments on Hacker News.
Absolute Zero: Reinforced Self-Play Reasoning with Zero Data
7 by leodriesch | 2 comments on Hacker News.
Ask HN: What will tech employment look like in 10 years?
16 by ipnon | 17 comments on Hacker News.
What jobs will become prevalent? Which will become scarce? I do not predict the elimination of the humble coder, but the COVID hiring wave has come and gone, and Big Tech has for the most part successfully minimized the workforces hired during that wave: frontend, backend, and fullstack engineers. The patterns of code required for these positions have, I think, been successfully recognized by the LLMs, and in many cases a single experienced staff engineer with a trusty LLM is about as productive as a team of 2-4 junior engineers led by a senior engineer was only five short years ago. I do not expect much expansion in this "traditional" web development (these positions have only existed in their modern form for about 20 years, roughly since Rails was first released).

Many, such as Amjad Masad and Beff Jezos, are of the opinion that those who would have taken these positions before now have two options. The first is to drill down the stack towards the bare metal, by reason of the relative difficulty of embedded engineering: one struggles to imagine high-stakes software, such as in a SpaceX rocket, Boeing airplane, or Anduril drone, relying primarily on vibe-coded slop hastily LGTM'd into production. So the kind of software that requires large amounts of formal, simulated, or physical verification still seems necessary, but it is much more difficult to write than a webpage. Expansion in the labor market for those writing C, C++, and Rust in the context of operating systems, embedded systems, microcontrollers, drivers, and so forth seems likely. The other option is to leave the stack entirely and leverage small teams to create niche, targeted applications for small segments of users. There has been some success in this area as well, but it requires a much broader skillset than simply being an expert programmer who understands some computer science.

The options, then, seem to be to start reading either Bjarne Stroustrup or Peter Thiel. But the skill ceiling for either path is fairly high, and in the short term I predict a sustained contraction in the software engineering labor market while people adapt their educations and long-term career goals. I don't see headcounts at FAANG recovering soon, if ever. This has broader implications for the traditional startup route, where one earned their stripes at FAANG before launching their own venture, but I digress ...
Show HN: GlassFlow – OSS streaming dedup and joins from Kafka to ClickHouse
13 by super_ar | 2 comments on Hacker News.
Hi HN! We are Ashish and Armend, founders of GlassFlow. We just launched our open-source streaming ETL that deduplicates and joins Kafka streams before ingesting them to ClickHouse: https://ift.tt/XqT8Ayz

Why we built this: Dedup with batch data is straightforward. You load the data into a temporary table, keep only the latest version of each record (identified through hashes or keys), and then move the clean data into your main table. But have you tried this with streaming data? Users of our previous product were running real-time analytics pipelines from Kafka to ClickHouse and noticed that their analyses were wrong due to duplicates. The source systems produced duplicates as they ingested similar user data from CRMs, shop systems, and click streams. We wanted to solve this issue for them with the existing ClickHouse options, but ClickHouse's ReplacingMergeTree has an uncontrollable background merging process: the new data is in the system, but you never know when the merge will finish, and until then your queries return incorrect results. We looked into using FINAL but weren't happy with the speed for real-time workloads. We tried Flink, but there is too much overhead in managing Java Flink jobs, and a self-built solution would have meant setting up and maintaining a state store, possibly a very large one (proportional to the number of unique keys), to track whether we had already encountered a record. And if your dedupe service fails, you need to rehydrate that state before processing new records. That would have been too much maintenance for us.

We decided to solve it by building a new product and are excited to share it with you. The key difference is that the streams are deduplicated before ingestion into ClickHouse, so ClickHouse always has clean data and less load, eliminating the risk of wrong results. We want more people to benefit from it and decided to open-source it (Apache-2.0). Main components:

- Streaming deduplication: You define the deduplication key and a time window (up to 7 days), and it handles the checks in real time to avoid duplicates before they hit ClickHouse. The state store is built in. (A rough sketch of this windowed-dedup idea follows below.)

- Temporal stream joins: You can join two Kafka streams on the fly with a few config inputs. You set the join key, choose a time window (up to 7 days), and you're good.

- Built-in Kafka source connector: There is no need to build custom consumers or manage polling logic. Just point it at your Kafka cluster, and it auto-subscribes to the topics you define. Payloads are parsed as JSON by default, so you get structured data immediately. As the underlying tech, we chose NATS to keep it lightweight and low-latency.

- ClickHouse sink: Data gets pushed into ClickHouse through a native connector optimized for performance. You can tweak batch sizes and flush intervals to match your throughput needs. It handles retries automatically, so you don't lose data on transient failures.

We'd love to hear your feedback and to know whether you've solved this nicely with existing tools. Thanks for reading!
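To make the windowed-dedup step above concrete, here is a minimal Python sketch of the general technique the post describes (a dedup key plus a time-bounded state store applied before the sink). This is not GlassFlow's actual API; the class name, field names, and naive in-memory state store are illustrative assumptions only.

    # Hypothetical sketch of time-windowed stream deduplication (not GlassFlow's API).
    import time
    from typing import Iterable, Iterator

    class WindowedDeduplicator:
        """Drops records whose dedup key was already seen within the time window."""

        def __init__(self, key_field: str, window_seconds: float):
            self.key_field = key_field
            self.window_seconds = window_seconds
            self._seen: dict[str, float] = {}  # dedup key -> last-seen timestamp

        def _evict_expired(self, now: float) -> None:
            # Forget keys older than the window so the state store stays bounded.
            self._seen = {k: t for k, t in self._seen.items()
                          if now - t <= self.window_seconds}

        def process(self, records: Iterable[dict]) -> Iterator[dict]:
            for record in records:
                now = time.time()
                self._evict_expired(now)
                key = record[self.key_field]
                if key in self._seen:
                    continue  # duplicate within the window: drop it before the sink
                self._seen[key] = now
                yield record

    # Usage: only the first event per user_id within 7 days reaches the sink.
    dedup = WindowedDeduplicator(key_field="user_id", window_seconds=7 * 24 * 3600)
    clean = list(dedup.process([{"user_id": "42", "event": "click"},
                                {"user_id": "42", "event": "click"}]))
    print(clean)  # -> one record

A production version would replace the per-record dict rebuild with an expiring key-value store and persist it so a restart does not require replaying the stream, which is exactly the operational burden the post says GlassFlow takes on for you.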
In 2025, venture capital can't pretend everything is fine any more
14 by namanyayg | 1 comment on Hacker News.
Title of work deciphered in sealed Herculaneum scroll via digital unwrapping
48 by namanyayg | 11 comments on Hacker News.