TLDR AI 2024-06-21

Anthropic Claude 3.5 Sonnet 3️⃣, Apple Open Source Models 🌐, Factory AI Series A 💰

🚀

Headlines & Launches

Anthropic launches Claude 3.5 Sonnet (4 minute read)

Claude 3.5 Sonnet outperforms Claude 3 Opus at one-fifth the cost. It is also billed as the best vision model currently available, a sign of how quickly frontier models are improving.

Apple researchers add 20 more open-source models to improve text and image AI (2 minute read)

Apple has contributed 20 Core ML models to the Hugging Face open-source AI repository, expanding its existing range of public models with advances in image classification and depth segmentation. These contributions follow Apple's release of the Ferret large language model and the four OpenELMs to Hugging Face earlier in the year. The move demonstrates Apple's increasing engagement with the AI research community and its commitment to advancing AI capabilities.

Factory's Series A Announcement (2 minute read)

Factory has secured $15M in Series A funding, led by Sequoia Capital, to expand its team and enhance its suite of AI-powered software development tools known as Droids. Its products have achieved a new state-of-the-art result on the SWE-bench AI coding benchmark and are driving rapid growth in its customer base. Factory aims to further automate software engineering, reducing tedious tasks and shortening development cycles.

🧠

Research & Innovation

Pruning with LayerMerge (16 minute read)

LayerMerge is a new method that improves neural network efficiency by jointly pruning convolution layers and activation functions.

Optimizing inference at Character AI (4 minute read)

Character AI serves 20,000 queries per second, roughly 20% of Google's search query volume. This post describes several of the techniques it uses to serve that traffic efficiently.

Remote Sensing Change Detection (17 minute read)

ChangeViT is a framework that utilizes vision transformers (ViTs) for detecting large-scale environmental changes in remote sensing images.

👨‍💻

Engineering & Resources

Evaluating Web Agents in Real-Time (GitHub Repo)

WebCanvas is a new framework for evaluating autonomous web agents in dynamic, live web environments.

Can We Trust Vision-Language Models? (6 minute read)

Vision-enabled language models (VLMs) like GPT-4o and Gemini power autonomous agents capable of tasks such as making purchases or editing code. This work highlights the vulnerability of these agents to malicious attacks.

94% accuracy on CIFAR-10 in 3.29 seconds (GitHub Repo)

CIFAR-10 is an image classification benchmark. This repository provides a training configuration that reaches 94% accuracy in just 3.29 seconds.

🎁

Miscellaneous

Time Series Forecasting with TimeSieve (12 minute read)

TimeSieve is a new model designed to tackle common challenges in time series forecasting.

Cost Of Self-Hosting Llama-3 8B-Instruct (7 minute read)

Self-hosting an LLM like Llama-3 8B-Instruct can be significantly more expensive than using ChatGPT: around $17 per million tokens versus ChatGPT's $1 per million tokens. With self-hosted hardware, the cost can drop below $0.01 per million tokens, though it would take approximately 5.5 years to break even on the initial investment.
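
For readers who want to run the numbers themselves, here is a minimal sketch of the break-even arithmetic behind a comparison like this one. The per-token prices come from the blurb above; the hardware cost and monthly token volume are hypothetical placeholders, not figures from the article, so the result will not necessarily match the 5.5 years quoted.

```python
def cost_ratio(price_a_per_million: float, price_b_per_million: float) -> float:
    """How many times more expensive option A is than option B."""
    return price_a_per_million / price_b_per_million


def break_even_years(
    hardware_cost_usd: float,
    tokens_per_month_millions: float,
    api_price_per_million: float,
    self_hosted_price_per_million: float,
) -> float:
    """Years until the upfront hardware cost is repaid by per-token savings.

    Returns infinity when self-hosting is not cheaper per token.
    """
    monthly_savings = tokens_per_month_millions * (
        api_price_per_million - self_hosted_price_per_million
    )
    if monthly_savings <= 0:
        return float("inf")
    return hardware_cost_usd / monthly_savings / 12


if __name__ == "__main__":
    # Prices quoted in the blurb (USD per million tokens).
    CLOUD_SELF_HOSTED = 17.0
    CHATGPT = 1.0
    OWN_HARDWARE = 0.01  # the blurb says "less than $0.01"; used as an upper bound

    # Hypothetical inputs, NOT from the article: adjust to your own setup.
    hardware_cost_usd = 10_000.0
    tokens_per_month_millions = 200.0

    print(f"Cloud self-hosting vs ChatGPT: {cost_ratio(CLOUD_SELF_HOSTED, CHATGPT):.0f}x")
    years = break_even_years(
        hardware_cost_usd, tokens_per_month_millions, CHATGPT, OWN_HARDWARE
    )
    print(f"Break-even on own hardware: {years:.1f} years")
```

The takeaway is that break-even time scales inversely with token volume: halve the monthly usage and the payback period doubles, which is why low-volume users rarely come out ahead on their own hardware.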
