TLDR AI 2024-06-21

Anthropic Claude 3.5 Sonnet, Apple Open Source Models, Factory AI Series A

🚀
Headlines & Launches

Anthropic launches Claude 3.5 Sonnet (4 minute read)

Claude 3.5 Sonnet outperforms Claude 3 Opus at one-fifth the cost. It is also the best vision model currently available. The release highlights how quickly frontier models are improving.

Apple researchers add 20 more open-source models to improve text and image AI (2 minute read)

Apple has contributed 20 Core ML models to the Hugging Face open-source AI repository, enhancing its existing range of public models with advances in image classification and depth segmentation. These contributions follow Apple's release of the Ferret large language model and the four OpenELMs to Hugging Face earlier in the year. The move demonstrates Apple's increasing engagement with the AI research community and its commitment to advancing AI capabilities.

Factory's Series A Announcement (2 minute read)

Factory has secured $15M in Series A funding, led by Sequoia Capital, to expand its team and enhance Droids, its suite of AI-powered software development tools. Its products have achieved a new state of the art on the SWE-bench AI coding benchmark and are driving rapid growth in its customer base. Factory aims to further automate software engineering, reducing tedious tasks and shortening development cycles.
🧠
Research & Innovation

Pruning with LayerMerge (16 minute read)

LayerMerge is a new method that improves neural network efficiency by jointly pruning convolution layers and activation functions.

Optimizing inference at Character AI (4 minute read)

Character AI serves 20,000 queries per second, roughly 20% of Google's search volume. It uses a number of inference optimizations to handle this load efficiently.

Remote Sensing Change Detection (17 minute read)

ChangeViT is a framework that utilizes vision transformers (ViTs) for detecting large-scale environmental changes in remote sensing images.
👨‍💻
Engineering & Resources

Evaluating Web Agents in Real-Time (GitHub Repo)

WebCanvas is a new framework for evaluating autonomous web agents in dynamic, live web environments.

Can We Trust Vision-Language Models? (6 minute read)

Vision-enabled language models (VLMs) like GPT-4o and Gemini power autonomous agents capable of tasks such as making purchases or editing code. This work highlights the vulnerability of these agents to malicious attacks.

94% accuracy on CIFAR-10 in 3.29 seconds (GitHub Repo)

CIFAR-10 is an image classification benchmark. This code provides a training configuration that reaches 94% accuracy in just 3.29 seconds of training.
🎁
Miscellaneous

Time Series Forecasting with TimeSieve (12 minute read)

TimeSieve is a new model designed to tackle common challenges in time series forecasting.

Cost Of Self-Hosting Llama-3 8B-Instruct (7 minute read)

Self-hosting an LLM like Llama-3 8B-Instruct can be significantly more expensive than using ChatGPT, costing around $17 per million tokens compared to ChatGPT's $1 per million tokens. Running the model on hardware you own can bring the cost below $0.01 per million tokens, though it would take approximately 5.5 years to break even on the initial investment.
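For readers who want to plug in their own numbers, here is a minimal Python sketch of the comparison. The three per-million-token prices are the figures quoted above. The hardware cost and yearly token volume in the usage example are hypothetical placeholders, not values from the article, so the printed break-even time will not necessarily match the 5.5 years cited. The function names are illustrative as well.

```python
# Per-million-token prices quoted in the blurb above (USD).
CLOUD_SELF_HOSTED_PRICE = 17.0   # self-hosting Llama-3 8B-Instruct
CHATGPT_PRICE = 1.0              # ChatGPT
OWN_HARDWARE_PRICE = 0.01        # upper bound quoted for owned hardware


def cost_usd(tokens: float, price_per_million: float) -> float:
    """Cost in USD of processing `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def break_even_years(
    hardware_cost_usd: float,
    tokens_per_year: float,
    api_price_per_million: float,
    own_price_per_million: float,
) -> float:
    """Years until the hardware purchase is repaid by per-token savings.

    Returns infinity when owning the hardware saves nothing per token.
    """
    price_gap = api_price_per_million - own_price_per_million
    savings_per_year = cost_usd(tokens_per_year, price_gap)
    if savings_per_year <= 0:
        return float("inf")
    return hardware_cost_usd / savings_per_year


if __name__ == "__main__":
    # Hypothetical inputs: replace with your own hardware quote and traffic.
    hardware_cost = 5_000.0          # assumed up-front hardware spend (USD)
    yearly_tokens = 1_000_000_000    # assumed one billion tokens per year

    print("Cloud self-hosted:", cost_usd(yearly_tokens, CLOUD_SELF_HOSTED_PRICE))
    print("ChatGPT:", cost_usd(yearly_tokens, CHATGPT_PRICE))
    print("Owned hardware:", cost_usd(yearly_tokens, OWN_HARDWARE_PRICE))
    print(
        "Break-even (years):",
        break_even_years(hardware_cost, yearly_tokens, CHATGPT_PRICE, OWN_HARDWARE_PRICE),
    )
```

The break-even time scales inversely with token volume, so low-traffic workloads take much longer to repay the hardware.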
⚡️
Quick Links

Khroma (Product)

Khroma uses AI to learn which colors you like and creates limitless palettes for you to discover, search, and save.

New Benchmark for Depth Normal Estimation (2 minute read)

A new benchmark evaluates state-of-the-art depth and surface normal estimation models.

Nvidia releases Mamba 2 (Hugging Face Hub)

The simple hybrid Mamba model featured in a recent Nvidia paper has been released.