Claude 3.5 sets new AI benchmarks, beating GPT-4o in coding and reasoning

Anthropic has launched Claude 3.5 Sonnet, the latest addition to its AI model lineup, claiming it outperforms both its own earlier models and competitors such as OpenAI’s GPT-4o. The model is free to use on Claude.ai and the Claude iOS app, and is also available through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Claude 3.5 Sonnet is priced at $3 per million input tokens and $15 per million output tokens, with a 200,000-token context window.
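As a rough illustration of what those prices mean in practice, the sketch below estimates the cost of a single request from its token counts. It uses only the figures quoted above; the function name `estimate_cost` and the constant names are hypothetical, and the calculation ignores anything else that might affect a real bill.

```python
# Figures quoted in the article (USD per million tokens, context size in tokens).
INPUT_USD_PER_MILLION = 3
OUTPUT_USD_PER_MILLION = 15
CONTEXT_WINDOW_TOKENS = 200_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Multiply in whole numbers first, then divide once by one million.
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


if __name__ == "__main__":
    # A prompt that fills the whole context window plus a 1,000-token reply.
    cost = estimate_cost(CONTEXT_WINDOW_TOKENS, 1_000)
    print(f"Estimated cost: ${cost:.3f}")
```

Under these assumptions, a prompt that fills the entire 200,000-token window costs about $0.60 in input tokens, and each 1,000 tokens of output adds $0.015.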

Claude 3.5 Sonnet benchmarks (Anthropic)

Anthropic says Claude 3.5 Sonnet sets new benchmarks in graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval). It shows marked improvement in grasping nuance, humor, and complex instructions, and writes high-quality content in a natural tone. The model runs at twice the speed of Claude 3 Opus, which makes it suitable for complex tasks such as context-sensitive customer support and multi-step workflows.

“In an internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, outperforming Claude 3 Opus, which solved 38%.”

When given the relevant tools, the model can independently write, edit, and execute code, making it effective for updating legacy applications and migrating codebases. It also excels at visual reasoning tasks, such as interpreting charts and graphs, and can accurately transcribe text from imperfect images, which benefits sectors such as retail, logistics, and financial services.

Anthropic has also introduced Artifacts, a new feature on Claude.ai that allows users to generate and edit content like code snippets, text documents, or website designs in real time. This feature marks Claude’s evolution from a conversational AI to a collaborative work environment, with plans to support team collaboration and centralized knowledge management in the future.

Anthropic emphasizes its commitment to safety and privacy, stating that Claude 3.5 Sonnet has undergone rigorous testing to reduce misuse. The model has been evaluated by external experts, including the UK’s Artificial Intelligence Safety Institute (UK AISI), and Anthropic has used feedback from child safety experts to update its classifiers and fine-tune its models. The company says it does not train its generative models on user-submitted data without explicit permission.

Looking ahead, Anthropic plans to release Claude 3.5 Haiku and Claude 3.5 Opus later this year, along with new features like Memory, which will enable Claude to remember user preferences and interaction history.

Mentioned in this article

Anthropic

OpenAI

Posted In: AI, Technology

Author

Liam 'Akiba' Wright

Senior Editor at CryptoSlate

Also known as "Akiba," Liam is a reporter, editor and podcast producer at CryptoSlate. He believes that decentralized technology has the potential to make widespread positive change.

Editor

News Desk

Editor at CryptoSlate

CryptoSlate is a comprehensive and contextualized source for crypto news, insights, and data. Focusing on Bitcoin, macro, DeFi and AI.

Disclaimer: Our writers' opinions are solely their own and do not reflect the opinion of CryptoSlate. None of the information you read on CryptoSlate should be taken as investment advice, nor does CryptoSlate endorse any project that may be mentioned or linked to in this article. Buying and trading cryptocurrencies should be considered a high-risk activity. Please do your own due diligence before taking any action related to content within this article. Finally, CryptoSlate takes no responsibility should you lose money trading cryptocurrencies.
