TLDR DevOps 2026-05-01
OpenAI and AWS, Terraform Audit Guide, Better Software Testing
Gene Kim + Google Cloud + Okta on DevEx (Sponsor)
May 13 | Virtual | Free
If you are rethinking developer productivity in the age of AI, this is the conversation.
At the Developer Experience Summit, hosted by Harness, hear from leaders at Google Cloud, Okta, and Morningstar on what is actually working in developer experience today.
Featuring Gene Kim, bestselling author of Vibe Coding and DevOps Handbook. Live virtual event on May 13. Recording available.
OpenAI models, Codex, and Managed Agents come to AWS (3 minute read)
OpenAI and AWS expanded their partnership to bring GPT-5.5 and other OpenAI models to Amazon Bedrock, allowing enterprises to build AI applications within their existing AWS infrastructure and security protocols. The collaboration also introduces Codex (OpenAI's coding tool used by 4 million weekly users) on Bedrock and launches Amazon Bedrock Managed Agents powered by OpenAI for deploying multi-step workflow agents in production environments.
Agents can now create Cloudflare accounts, buy domains, and deploy (7 minute read)
Cloudflare launched a new protocol co-designed with Stripe that lets AI coding agents automatically provision Cloudflare accounts, purchase domains, set up paid subscriptions, and deploy production applications on behalf of users without requiring manual setup steps like entering credit cards or copying API tokens. The protocol, part of Stripe Projects' launch, includes built-in safeguards like a $100 monthly spending limit per provider and uses Stripe as an identity provider to attest user identity while keeping payment details hidden from agents.
Kubernetes v1.36: Staleness Mitigation and Observability for Controllers (6 minute read)
Kubernetes v1.36 introduced new features to combat "staleness" in controllers, where outdated local caches cause controllers to take incorrect actions or miss updates. It adds atomic FIFO processing to client-go and implements staleness checks in four high-contention controllers (ReplicaSet, DaemonSet, Job, and StatefulSet), which now verify cache resource versions before acting. The update also includes new metrics like `stale_sync_skips_total` to monitor when controllers skip syncs due to stale data, with all features enabled by default and controllable via feature gates.
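To make the idea of a staleness check concrete, here is a minimal Python sketch. It is not the client-go implementation: `ToyController`, `record_write`, and `sync` are hypothetical names, and resource versions are modeled as plain integers purely for illustration. Only the counter name is borrowed from the metric mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyController:
    """Hypothetical controller that refuses to act on a cache older than its own writes."""

    last_written_version: int = 0    # version observed from our most recent write (assumed integer)
    stale_sync_skips_total: int = 0  # counter named after the metric in the summary
    actions: list = field(default_factory=list)

    def record_write(self, version: int) -> None:
        # Remember the newest version this controller itself produced.
        self.last_written_version = max(self.last_written_version, version)

    def sync(self, key: str, cache_version: int) -> bool:
        """Return True if the sync ran, False if it was skipped as stale."""
        if cache_version < self.last_written_version:
            # The local cache has not caught up with our own write yet: skip and count it.
            self.stale_sync_skips_total += 1
            return False
        self.actions.append(key)
        return True


controller = ToyController()
controller.record_write(version=42)
ran_stale = controller.sync("default/web", cache_version=41)  # cache lags behind the write
ran_fresh = controller.sync("default/web", cache_version=42)  # cache has caught up
```

In this toy model the first sync is skipped and counted, and the second proceeds once the cache reflects the controller's own write.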
Terraform Audit Guide: Monitoring, Logging & Compliance (12 minute read)
A Terraform audit evaluates infrastructure code, state, runs, and backend to ensure security and compliance, using tools like Checkov, Trivy, and OPA with best practices such as continuous auditing, state protection, version control, and policy enforcement.
What does using AI for post-mortems actually mean? (4 minute read)
AI improves post-mortems by automating preparation like timelines and drafts, but risks producing convincing yet unowned conclusions if it replaces human analysis. Real value comes from human synthesis and accountability, not AI-generated summaries alone.
Post-quantum encryption for Cloudflare IPsec is generally available (6 minute read)
Cloudflare made post-quantum encryption in its IPsec service generally available, successfully testing interoperability with branch connectors from Fortinet and Cisco using the new IETF hybrid ML-KEM (FIPS 203) draft standard. The rollout comes as Cloudflare moved its full post-quantum security target to 2029 amid recent quantum computing advances, though IPsec adoption lagged four years behind TLS due to the community's focus on Quantum Key Distribution, which requires specialized hardware and doesn't work at internet scale.
Techniques for better software testing (7 minute read)
Better software testing means going beyond hand-written examples by using randomness, fuzzing, swarm testing, concurrency, fault injection, and test-specific configurations to expose edge cases that normal unit or integration tests miss. Tests should validate continuously, exercise rare failure paths, cover the full system surface, and intentionally test recovery from "good" crashes so bugs surface earlier and are easier to debug.
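As one small illustration of the randomness technique, the Python sketch below checks a function against a trusted reference on randomly generated inputs, using a recorded seed so any failure can be replayed. The function under test (`merge_sorted`) and the helper `run_random_trials` are hypothetical examples, not taken from the linked article.

```python
import random


def merge_sorted(a, b):
    """Function under test: merge two already-sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def run_random_trials(seed, trials=200):
    """Compare merge_sorted with a simple reference on random inputs.

    Returns None if every trial passes, otherwise the data needed to replay the failure.
    """
    rng = random.Random(seed)  # seeded generator, so a failing run is reproducible
    for trial in range(trials):
        a = sorted(rng.randint(-5, 5) for _ in range(rng.randint(0, 8)))
        b = sorted(rng.randint(-5, 5) for _ in range(rng.randint(0, 8)))
        expected = sorted(a + b)  # slow but obviously correct reference
        if merge_sorted(a, b) != expected:
            return (seed, trial, a, b)
    return None


failure = run_random_trials(seed=2026)
```

Small value ranges and short lists are a deliberate choice here: they make duplicates and empty inputs common, which is where merge logic tends to break.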
Test network paths with TCP, UDP, and ICMP in Datadog (7 minute read)
Designing network tests with protocols like TCP, UDP, or ICMP improves root cause analysis by matching application traffic, revealing latency, packet loss, and reliability issues.
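For a sense of what a TCP-level path test measures, here is a minimal Python sketch that times a TCP handshake using only the standard library. It is not Datadog's implementation; `tcp_connect_latency` is a hypothetical helper, and the demo probes a throwaway loopback listener so it needs no external host.

```python
import socket
import time


def tcp_connect_latency(host, port, timeout=2.0):
    """Return the seconds taken to complete a TCP handshake, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Refused, timed out, or unreachable: report the path as failed.
        return None


# Demo target: a local listener on loopback (an assumption made for this example).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]

reachable = tcp_connect_latency("127.0.0.1", open_port)
listener.close()
```

Pointing the same helper at the port an application actually uses tests the path that real traffic takes, which an ICMP ping alone may not.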
Bridging the trust gap: Unified public CA orchestration with IBM Vault (4 minute read)
Vault Enterprise now integrates public certificate authorities via ACME, unifying private and public PKI workflows to eliminate manual processes, reduce outage risk, and centralize governance while enabling automated issuance, renewal, and revocation through a single platform.
Kubernetes v1.36: Tiered Memory Protection with Memory QoS (3 minute read)
Kubernetes v1.36 introduced significant updates to its alpha Memory QoS feature, adding opt-in memory reservation with tiered protection that separates Guaranteed Pods (hard protection via memory.min), Burstable Pods (soft protection via memory.low), and BestEffort Pods (no protection).
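The tiering described above can be written down as a small lookup. This Python sketch only restates the mapping from the summary; `PROTECTION_BY_QOS` and `protection_knob` are hypothetical names, and it does not model how the kubelet computes or writes the values.

```python
from typing import Optional

# QoS class -> cgroup v2 memory protection file, as described in the summary.
PROTECTION_BY_QOS = {
    "Guaranteed": "memory.min",  # hard protection
    "Burstable": "memory.low",   # soft protection
    "BestEffort": None,          # no protection
}


def protection_knob(qos_class: str) -> Optional[str]:
    """Return the cgroup file used to protect a Pod of the given QoS class."""
    if qos_class not in PROTECTION_BY_QOS:
        raise ValueError(f"unknown QoS class: {qos_class!r}")
    return PROTECTION_BY_QOS[qos_class]


knobs = {qos: protection_knob(qos) for qos in PROTECTION_BY_QOS}
```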