TLDR Data 2026-05-07

Netflix’s ML Metadata Graph 🧬, Inside DuckDB’s Speed 🦆, Searchable S3 Storage 🔎

📱

Deep Dives

Democratizing Machine Learning at Netflix: Building the Model Lifecycle Graph (14 minute read)

DuckDB Internals: Why is DuckDB Fast? (17 minute read)

Building Self-Healing Data Pipelines at Halodoc (9 minute read)

From SSH to REST: A Security-Driven Modernization of Slack's EMR Data Pipelines (15 minute read)

🚀

Opinions & Advice

Can Agents Replace the Search Stack? (6 minute read)

Beyond the hype: The enterprise AI architecture we actually need (7 minute read)

We're Missing Data: The Other Half of AI Transformation (6 minute read)

💻

Launches & Tools

How We Accelerated Transpilation by Compiling SQLGlot with mypyc (8 minute read)

Integrating AI Into Apache Kafka Architectures: Patterns and Best Practices (11 minute read)

🎁

Miscellaneous

Redis Array Type: Short Story of a Long Development (3 minute read)

⚡️

Quick Links

Curated deep dives, tools and trends in big data, data science and data engineering 📊