TLDR Data 2026-06-04

dbt Core v2 Alpha 🦀, Cart Prediction with LLMs 🛒, Ray vs Daft 🧪

It's official: Fivetran + dbt Labs merge to build the data foundation for trustworthy AI agents (Sponsor)

📱

Deep Dives

Your Cart Has a Story. Here's How We Learned to Read It (7 minute read)

Vector Search in Manticore Search: A Deep Dive (28 minute read)

A field journal on Ray Data and Daft for multimodal data lake (14 minute read)

🚀

Opinions & Advice

Debunking 8 data layout myths: why Liquid Clustering outperforms partitioning (11 minute read)

The Rise of Multi-Query Engines (7 minute read)

💻

Launches & Tools

Diving deep into Redis's new array data type (25 minute read)

Routing Multiple Query Engines with Iceberg (18 minute read)

dbt Core v2 is here: still open source, now rebuilt for what's next (9 minute read)

ingestr (GitHub Repo)

🎁

Miscellaneous

OpenTelemetry Launches “Blueprints” Initiative to Simplify Enterprise Observability Adoption (3 minute read)

MongoDB and Stored Procedures (10 minute read)

⚡️

Quick Links

Authorization for AI agents: What to build before the EU AI Act deadline (6 minute read)

Pluto 1.0 Release (12 minute read)

dltHub AI Workbench data quality toolkit: schema-aware checks that route their own fixes (4 minute read)

Curated deep dives, tools and trends in big data, data science and data engineering 📊

