AI vs Connections

Comparing how different AI models perform on the NYT Connections word puzzle. We measure accuracy, speed, and approach to understand how language models process semantic relationships.

Total Models

25

Active models being tested

Puzzles Analyzed

685

Total puzzles in database

Average Difficulty

3.0/5

Across all puzzles

Top Performer

Sonar Reasoning

80% solve rate

Top Solving Models

By solve rate percentage

1. Sonar Reasoning Pro: 56.9%
2. Sonar Reasoning: 45.4%
3. Deepseek R1: 44.0%
4. Llama 4 Maverick: 14.9%
5. Claude 3.7 Sonnet: 14.6%

Fastest Models

Average solve time in seconds

1. o3-mini: 1.4s
2. o3: 1.4s
3. Mistral 7B: 1.5s
4. Mistral Small: 1.6s
5. GPT-4.1: 2.2s

Longest Streaks

Consecutive solved puzzles

1. Sonar Reasoning: 43 puzzles
2. Sonar Reasoning Pro: 21 puzzles
3. Deepseek R1: 9 puzzles
4. GPT-4.1: 3 puzzles
5. Claude 3.7 Sonnet: 3 puzzles
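
For readers curious how figures like these can be derived, here is a minimal sketch of the two statistics, assuming each model's results are stored as an ordered list of solved/unsolved flags. The function names and the sample data are hypothetical and are not taken from the site's actual code.

```python
from typing import Sequence


def solve_rate(results: Sequence[bool]) -> float:
    """Percentage of puzzles solved, rounded to one decimal place."""
    if not results:
        return 0.0
    return round(100 * sum(results) / len(results), 1)


def longest_streak(results: Sequence[bool]) -> int:
    """Length of the longest run of consecutive solved puzzles."""
    best = current = 0
    for solved in results:
        # Extend the current run on a solve, reset it on a miss.
        current = current + 1 if solved else 0
        best = max(best, current)
    return best


# Hypothetical results for one model, listed in puzzle order.
results = [True, True, False, True, True, True, False, True]
print(solve_rate(results))      # 6 of 8 solved
print(longest_streak(results))  # longest unbroken run of solves
```

The streak is tracked in a single pass so that it stays cheap to recompute as new puzzles are added; how the site handles skipped or untested puzzles is not stated, so this sketch simply treats every entry as an attempt.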

Latest Puzzles

Puzzle #685

Difficulty: 3.4
19 / 25 solved
Best: 1.5s (Commandlight)

Puzzle #684

Difficulty: 2.3
25 / 25 solved
Best: 1.0s (Gemini)

Puzzle #683

Difficulty: 2.8
2 / 24 solved
Best: 82.9s (Pplxsonarreasoningpro)