Claude score on Humanity’s Last Exam by June 30?

Ended Jun 30, 2026, 00:00 UTC

30%+

Yes

35%+

Yes

45%+

50%+

55%+

Market resolution

The "Claude score on Humanity’s Last Exam by June 30" prediction market is settled. The percentages above are the final outcome probabilities reported by Polymarket.

Final volume: $407.18K
Reported open interest: $35.32K
Final sync: Jul 1, 2026 9:12 am

Final probabilities, volume, and open interest are sourced from Polymarket and were last synced at Jul 1, 2026 9:12 am.

Claude’s HLE market is pricing a narrow path: Anthropic has enough time to place a stronger model on Scale’s Humanity’s Last Exam leaderboard by June 30, yet the odds compress once the target moves from a competitive score into a clear benchmark break. That shape matters because the contract settles on a public listing, making product timing and evaluation exposure as important as model capability.

The ladder prices a climb to 50%, then a harder jump

The three outcomes form a visible curve: 45%+ at 23%, 50%+ at 20.5%, and 55%+ at 8.4%. As an inference from those prices, the market treats 45% and 50% as linked states of the same upgrade cycle, while 55% sits in a different regime. The narrow gap between the first two thresholds suggests that if Anthropic produces a Claude model capable of moving meaningfully on HLE, the market sees the extra five points as a smaller obstacle than the next five.

That matters because HLE-style benchmarks can produce step changes in market perception. A score just above 45% would validate the release-timing thesis but leave the top rung unresolved. A score near 50% would pressure both lower outcomes together. A visible result in the low 50s could reshape expectations for 55% even before it crosses, because the remaining distance would shrink to a few leaderboard points.

Outcome	Yes price	Inferred burden
45%+	23%	A meaningful Claude gain reaches the public leaderboard
50%+	20.5%	The same gain extends into a stronger benchmark band
55%+	8.4%	A larger leap arrives with clean public verification
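
The ladder arithmetic behind that reading can be sketched in a few lines. This is an illustration under two assumptions that are not stated in the market itself: each Yes price is read directly as a probability (ignoring fees and spreads), and the rungs are nested, so a score of 50%+ also counts as 45%+. The function names are hypothetical.

```python
# Yes prices quoted in the text, as probabilities (threshold in points -> price).
# Assumption: prices are read as probabilities and the rungs are nested events.
ladder = {45: 0.23, 50: 0.205, 55: 0.084}


def band_probabilities(ladder):
    """Split cumulative 'score >= threshold' prices into disjoint score bands."""
    thresholds = sorted(ladder)
    bands = {}
    for lo, hi in zip(thresholds, thresholds[1:]):
        bands[f"{lo}% to <{hi}%"] = ladder[lo] - ladder[hi]
    bands[f"{thresholds[-1]}%+"] = ladder[thresholds[-1]]
    return bands


def conditional_steps(ladder):
    """Implied chance of clearing each rung, given the rung below is cleared."""
    thresholds = sorted(ladder)
    return {
        f"{hi}%+ given {lo}%+": ladder[hi] / ladder[lo]
        for lo, hi in zip(thresholds, thresholds[1:])
    }


for name, p in band_probabilities(ladder).items():
    print(f"{name}: {p:.1%}")
for name, p in conditional_steps(ladder).items():
    print(f"{name}: {p:.1%}")
```

Under those assumptions, the implied chance of reaching 50%+ once 45%+ is cleared is roughly 89%, while the implied chance of reaching 55%+ once 50%+ is cleared is roughly 41%. That gap is the numerical form of the "climb to 50%, then a harder jump" reading.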

The deadline turns Anthropic’s release cadence into a settlement constraint

The closing window is doing real pricing work. The rules require any Anthropic Claude model to appear on Scale’s leaderboard with the specified score by June 30, 2026, 11:59 PM ET. That means the market is implicitly evaluating at least three linked events: a model improvement exists, the model is evaluated for HLE, and the score becomes visible on the settlement source before the deadline.

The late-June timing also reduces the value of broad claims about future AI progress. General expectations for Claude can support the lower rungs, but the contract pays attention to a dated, source-specific artifact. A powerful internal model, a delayed public launch, or a score published after the cutoff would fail to satisfy the market rules. That is why the market can give meaningful probability to 45% while still drawing a steep line at 55%: the hardest outcome needs both a larger gain and a cleaner path to public verification.
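
The settlement conditions described above can be expressed as a small check. This is a minimal sketch, not Polymarket's or Scale's actual logic: the `LeaderboardEntry` record, its field names, and the example entries are hypothetical, Eastern Time is assumed to be UTC-4 on the deadline date, and a listing at exactly 11:59 PM ET is assumed to count.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: Eastern Time is daylight time (UTC-4) on June 30, 2026.
EASTERN_DAYLIGHT = timezone(timedelta(hours=-4))
DEADLINE = datetime(2026, 6, 30, 23, 59, tzinfo=EASTERN_DAYLIGHT)


@dataclass
class LeaderboardEntry:
    """Hypothetical record; not the real leaderboard's data format."""
    developer: str
    model: str
    score_pct: float
    listed_at: datetime  # must be timezone-aware


def resolves_yes(entries, threshold_pct, deadline=DEADLINE):
    """True if any Anthropic Claude entry meets the threshold by the deadline."""
    return any(
        e.developer == "Anthropic"
        and "claude" in e.model.lower()
        and e.score_pct >= threshold_pct
        and e.listed_at <= deadline  # assumption: the deadline minute counts
        for e in entries
    )


# Made-up entries, for illustration only.
on_time = LeaderboardEntry(
    "Anthropic", "Claude Example", 47.0,
    datetime(2026, 6, 15, tzinfo=timezone.utc),
)
late = LeaderboardEntry(
    "Anthropic", "Claude Example", 56.0,
    datetime(2026, 7, 1, 4, 30, tzinfo=timezone.utc),  # 12:30 AM ET on July 1
)

print(resolves_yes([on_time], 45))  # True: score and timing both qualify
print(resolves_yes([on_time], 50))  # False: score below the rung
print(resolves_yes([late], 55))     # False: listed after the cutoff
```

The late entry shows why timing is priced separately from capability: a score high enough for every rung still fails the check if the listing appears after the cutoff.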

Scale’s leaderboard is the evidence layer that matters most

The settlement source concentrates power in a single external record. Polymarket’s resolution criteria refer to the Scale Humanity’s Last Exam leaderboard, and each threshold resolves through the listed score. This creates a blind spot for narrative-driven repricing: impressive demos, model-card claims, or third-party commentary have weaker settlement value unless they point toward a Scale-listed Claude score.

That rule also explains why the market can price the lower thresholds meaningfully without treating them as simple bets on Anthropic’s overall research trajectory. The required artifact is specific: an Anthropic Claude model, a Scale HLE score, and a number at or above the relevant threshold. Every extra dependency reduces the market’s willingness to carry the same confidence from 45% through 55%.

Catalysts need to change the leaderboard path

Specific repricing catalysts are easy to define because the rules are narrow. A hypothetical new Claude entry on Scale with a score above 45%, 50%, or 55% would directly settle or nearly settle the debate for the relevant rung. A hypothetical entry close to a threshold would matter because the market would have to decide whether another update can arrive before the clock runs out.

Other catalysts would work through inference. A hypothetical Anthropic release announcement that names HLE or comparable exam-style evaluations could shift expectations if it increases confidence that a Scale score is coming. A hypothetical Scale leaderboard update policy, format change, or delay would also matter because the contract is tied to what the leaderboard lists; broad consensus about Claude’s ability has weaker settlement force.

Market structure can amplify those updates. At the time of this analysis, the market had recorded $366.71K of volume and $51.77K of open interest, showing sustained engagement, against posted liquidity of $11.64K. With liquidity that thin relative to open interest, prices can move sharply on fresh objective evidence, especially if a leaderboard change affects multiple thresholds at once.

The main counter-signal is benchmark-specific friction

The strongest counterargument to the climb-through-50 story is that model progress may arrive in forms outside HLE’s scoring frame or through channels outside the contract. A Claude release could improve user-facing performance, coding, tool use, or safety behavior without producing the required Scale leaderboard score. Since the market resolves on an entry in one leaderboard, broad perceptions of AI progress can diverge from the contract outcome.

This failure mode helps explain the collapse from 50%+ to 55%+. The higher target requires a larger capability move, a public evaluation, and clean timing. It also leaves less room for measurement variance or partial gains. If future evidence shows Claude improving on adjacent benchmarks without a Scale HLE listing, the lower thresholds could lose narrative support because the settlement risk would become more visible.

The market’s central tension is publication versus capability. The prices imply belief that Anthropic can generate an HLE-relevant improvement before the deadline, yet the steep top rung says the public leaderboard path is the choke point. Any update that collapses those two issues into one visible Scale score would be the event most likely to force the cleanest repricing.

Sources

Polymarket market, Scale

Market details

Resolution criteria: This market will resolve to "Yes" if the Humanity’s Last Exam leaderboard lists any Anthropic Claude model with a score of at least the specified score by June 30, 2026, 11:59 PM ET. Otherwise, this market will resolve to "No".
Platform: Polymarket
Category: Tech › Claude
Close date: June 30, 2026, 12:00 AM UTC
Settlement source: scale.com
Market rules summary: Multi-timeframe Polymarket event. Each listed timeframe is represented by its Yes price on the underlying binary market.

Frequently asked questions

What was the final result of the Claude score on Humanity’s Last Exam by June 30 prediction market?

Polymarket reports the Claude score on Humanity’s Last Exam by June 30 prediction market as closed. The final snapshot shows 30%+ at 100%, 35%+ at 100%, 45%+ at 0%, and 50%+ at 0%. The final market snapshot includes $407.18K volume and $35.32K open interest. CryptoSlate last synced the final market data at Jul 1, 2026, 08:12 UTC.

How does the Claude score on Humanity’s Last Exam by June 30 prediction market resolve?

This market will resolve to "Yes" if the Humanity’s Last Exam leaderboard lists any Anthropic Claude model with a score of at least the specified score by June 30, 2026, 11:59 PM ET. Otherwise, this market will resolve to "No". Multi-timeframe Polymarket event. Each listed timeframe is represented by its Yes price on the underlying binary market. The settlement source listed for this market is Scale.