Claude 3.5 sets new AI benchmarks, beating GPT-4o in coding and reasoning

Anthropic has launched Claude 3.5 Sonnet, the latest addition to its AI model lineup, claiming it surpasses its own previous models and competitors like OpenAI’s GPT-4o. The model is available for free on Claude.ai and the Claude iOS app, and is also accessible via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Claude 3.5 Sonnet is priced at $3 per million input tokens and $15 per million output tokens, with a 200,000-token context window.
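To make the pricing concrete, here is a minimal sketch of the arithmetic using only the rates quoted above. The function name and the example token counts are illustrative assumptions, not part of any Anthropic tooling, and actual billing may differ from this simple estimate.

```python
# Rates quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted per-token rates.

    Hypothetical helper for illustration only; it ignores any discounts,
    taxes, or platform-specific pricing.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Assumed example: a long 150,000-token prompt with a 2,000-token reply.
cost = estimate_cost_usd(150_000, 2_000)
print(f"Estimated cost: ${cost:.2f}")
```

At these rates, output tokens cost five times as much as input tokens, so long responses raise the bill faster than long prompts do.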

Claude 3.5 Sonnet benchmarks (Anthropic)

According to Anthropic, Claude 3.5 Sonnet sets new benchmarks in graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval). It shows significant improvements in understanding nuance, humor, and complex instructions, and excels at generating high-quality content with a natural tone. The model operates at twice the speed of Claude 3 Opus, making it suitable for complex tasks like context-sensitive customer support and multi-step workflows.

“In an internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, outperforming Claude 3 Opus, which solved 38%.”

The model can independently write, edit, and execute code, making it effective for updating legacy applications and migrating codebases. It also excels in visual reasoning tasks, such as interpreting charts and graphs, and can accurately transcribe text from imperfect images, benefiting sectors like retail, logistics, and financial services.

Anthropic has also introduced Artifacts, a new feature on Claude.ai that allows users to generate and edit content like code snippets, text documents, or website designs in real time. This feature marks Claude’s evolution from a conversational AI to a collaborative work environment, with plans to support team collaboration and centralized knowledge management in the future.

Anthropic emphasizes its commitment to safety and privacy, stating that Claude 3.5 Sonnet has undergone rigorous testing to reduce misuse. The model has been evaluated by external experts, including the UK’s Artificial Intelligence Safety Institute (UK AISI), and Anthropic has integrated feedback from child safety experts to update its classifiers and fine-tune its models. The company says it does not train its generative models on user-submitted data without explicit permission.

Looking ahead, Anthropic plans to release Claude 3.5 Haiku and Claude 3.5 Opus later this year, along with new features like Memory, which will enable Claude to remember user preferences and interaction history.

Posted In: AI, Technology

Author

Liam 'Akiba' Wright

Editor-in-Chief at CryptoSlate

Also known as "Akiba," Liam Wright is a reporter, podcast producer, and Editor-in-Chief at CryptoSlate. He believes that decentralized technology has the potential to make widespread positive change.

Editor

News Desk

Editor at CryptoSlate

CryptoSlate is a comprehensive and contextualized source for crypto news, insights, and data. Focusing on Bitcoin, macro, DeFi and AI.

