TLDR Dev 2026-04-30
The last software engineer 🫗, Zed 1.0 ⚡️, Claude’s HERMES.md mishap 👜
😘 Kiss bugs goodbye with fully automated end-to-end test coverage (Sponsor)
Bugs sneak out when less than 80% of user flows are tested before shipping. However, getting that kind of coverage (and staying there) is hard and pricey for any team.
QA Wolf's AI-native service provides high-volume, high-speed test coverage for web and mobile apps, reducing your organization's QA cycle from days to minutes.
They can get you:
- 80% automated E2E test coverage in weeks
- Unlimited parallel test runs
- 24-hour maintenance and on-demand test creation
- Zero flakes, guaranteed
Engineering teams move faster, releases stay on track, and testing happens automatically—so developers can focus on building, not debugging.
The result? Drata achieved 4x more test cases and 86% faster QA cycles.
⭐ Rated 4.8/5 on G2.
Schedule a demo to learn more
Evaluating Netflix Show Synopses with LLM-as-a-Judge (9 minute read)
Netflix developed an LLM-as-a-Judge system to score show synopses on four quality dimensions, anchoring it to a golden set of roughly 600 expert-labeled examples built through a rigorous calibration process. The system combines techniques like tiered reasoning and consensus scoring across judges to reach 83–92% accuracy, letting the team flag and fix weak synopses weeks before a show ships.
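To make the idea of consensus scoring against a golden set concrete, here is a minimal Python sketch. It is not Netflix's implementation: the majority-vote rule, the function names, and the pass/fail labels are all assumptions made for illustration, and the sample data is invented.

```python
from collections import Counter


def consensus_label(votes):
    """Return the majority label among judge votes.

    Ties resolve to whichever tied label appears first in `votes`.
    This tie-break rule is an assumption for the sketch.
    """
    if not votes:
        raise ValueError("at least one judge vote is required")
    counts = Counter(votes)
    top = max(counts.values())
    for vote in votes:
        if counts[vote] == top:
            return vote


def accuracy(predicted, golden):
    """Fraction of predictions that match the expert (golden) labels."""
    if len(predicted) != len(golden):
        raise ValueError("predicted and golden must have the same length")
    if not golden:
        raise ValueError("golden set must not be empty")
    matches = sum(p == g for p, g in zip(predicted, golden))
    return matches / len(golden)


# Hypothetical data: three judge votes per synopsis, plus expert labels.
judge_votes = [
    ["pass", "pass", "fail"],
    ["fail", "fail", "pass"],
    ["pass", "fail", "pass"],
    ["fail", "fail", "fail"],
]
golden = ["pass", "fail", "fail", "fail"]

predicted = [consensus_label(votes) for votes in judge_votes]
print(predicted, accuracy(predicted, golden))
```

In a real setup each vote would come from a separate LLM judge call, and accuracy would be tracked per quality dimension rather than as a single number.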
The React Compiler at Eighteen Months: The Arc, the Debates, and What's Next (11 minute read)
Eighteen months after React 19 shipped the stable compiler, its true impact is seen in eliminating bugs like forgotten dependencies and missing memos rather than delivering benchmark wins, though adoption in brownfield projects is hindered by legacy libraries that break the Rules of React. Open debates continue regarding whether those Rules should be a hard contract and the risk of 'use no memo' becoming technical debt, while future plans include finer per-component control, compiler-aware Server Components, a stable useEvent, React Native parity, and improved DevTools.
The end of "Just ask Sarah" (6 minute read)
AI agents shift the role of docs from organizational memory for humans to execution context for machines. Unlike human engineers who can rely on unwritten institutional knowledge, agents only have their limited context window, forgetting information between sessions. This absence of durable intent leads agents to perpetuate patterns without understanding their underlying reasoning, resulting in "intent debt" and potentially incorrect decisions.
How to be Successful Interviewing for Big Tech (10 minute read)
Traditional Big Tech interviews, often built around LeetCode problems, are a poor measure of an engineer's ability to ship working software. Interview processes, such as the one at Postman, should instead mirror the actual job by prioritizing real-world relevance, the use of modern tools including AI, and collaborative problem-solving.
The Last Software Engineer (7 minute read)
As AI agents automate implementation, software engineering value is shifting from writing code to exercising judgment about what to build and why. Product engineering now requires humans to own implementation consequences and user impact, making strategic decision-making the most durable skill for engineers.
Deployment ≠ Promotion (Sponsor)
Argo CD solved deployment. But promoting artifacts from dev → staging → prod still runs on brittle CI scripts that make rollbacks painful.
This guide shows how Kargo — built by Akuity, the creators of Argo CD — closes the second mile of GitOps.
Two foundation models built for agentic coding. (Website)
Poolside develops foundation models specifically for agentic coding, training models from scratch using its own data, infrastructure, and reinforcement learning. It has two main models: Laguna XS.2, an open-weight, lighter, and faster Mixture of Experts (MoE) model, and Laguna M.1, its most capable and largest MoE model.
OpenSpec (GitHub Repo)
OpenSpec is a spec-driven development framework that brings predictability and organization to AI coding assistants. It solves the issue of unpredictable AI output by creating a lightweight specification layer, making sure humans and AI align on requirements before code is written.
Zed 1.0 (5 minute read)
Zed 1.0, a code editor, was reimagined with a "video game" architecture and a custom Rust-based GPUI framework to achieve high performance with multi-language support and deep AI-native capabilities.
Copy Fail (6 minute read)
"Copy Fail" (CVE-2026-31431) is a critical Linux kernel flaw that has existed since 2017. It allows unprivileged local users to gain root access via a simple Python script. This vulnerability poses a severe risk to multi-tenant environments like container clusters, requiring kernel updates or disabling the `algif_aead` module for mitigation.
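As a rough first check related to the mitigation above, the following Python sketch reports whether `algif_aead` currently appears in `/proc/modules`. The helper names are hypothetical, and the check is only a hint: a module compiled into the kernel does not show up in `/proc/modules`, and an unloaded module may still be loaded on demand, so this does not establish whether a host is vulnerable or patched.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Parse text in /proc/modules format into a set of module names.

    Each non-empty line starts with the module name, followed by
    whitespace-separated fields (size, use count, and so on).
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_module_loaded(name, proc_modules_text):
    """Return True if `name` is listed as a loaded module."""
    return name in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the module list, or return an empty string if it is unavailable."""
    p = Path(path)
    return p.read_text() if p.exists() else ""


if __name__ == "__main__":
    if is_module_loaded("algif_aead", read_proc_modules()):
        print("algif_aead is currently loaded")
    else:
        print("algif_aead is not listed in /proc/modules")
```

Keeping the parser separate from the file read makes it easy to run the same check against module lists collected from many hosts.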
I benchmarked caveman against two words (6 minute read)
Benchmarks comparing the "Caveman" Claude code compression plugin against the simple prompt "be brief" found that the two-word prompt matched Caveman in both token reduction and output quality across most tested categories. While "be brief" is recommended for simple token reduction, Caveman is ideal for those needing predictable and structured output formats.