ai

Table of Contents

AI
AI Model Performance Comparison
Agentic Coding (SWE-bench Verified)
Agentic Terminal Coding (terminal-bench)
Graduate-level Reasoning (GPQA Diamond)
Agentic Tool Use (TAU-bench)
- Retail
- Airline
Local deployment
Interfacing with the AI mode
- - MCP - Model Context Protocol

AI

Learning AI:

What is the best way to learn Artificial Intelligence for a beginner

AI Model Performance Comparison

This page presents a comparison of various AI models across different task categories, based on the provided data.

Agentic Coding (SWE-bench Verified)

Model	Score 1	Score 2
Claude Opus 4	72.5%	79.4%
Claude Sonnet 4	72.7%	80.2%
Claude Sonnet 3.7	62.3%	70.3%
OpenAI o3	69.1%
OpenAI GPT-4.1	54.6%
Gemini 2.5 Pro (Preview 05-06)	63.2%

Agentic Terminal Coding (terminal-bench)

Model	Score 1	Score 2
Claude Opus 4	43.2%	50.0%
Claude Sonnet 4	35.5%	41.3%
Claude Sonnet 3.7	35.2%
OpenAI o3	30.2%
OpenAI GPT-4.1	30.3%
Gemini 2.5 Pro (Preview 05-06)	25.3%

Graduate-level Reasoning (GPQA Diamond)

Model	Score 1	Score 2
Claude Opus 4	79.6%	83.3%
Claude Sonnet 4	75.4%	83.8%
Claude Sonnet 3.7	78.2%
OpenAI o3	83.3%
OpenAI GPT-4.1	66.3%
Gemini 2.5 Pro (Preview 05-06)	83.0%

Agentic Tool Use (TAU-bench)

Retail

Model	Score
Claude Opus 4	81.4%
Claude Sonnet 4	80.5%
Claude Sonnet 3.7	81.2%
OpenAI o3	70.4%
OpenAI GPT-4.1	68.0%
Gemini 2.5 Pro (Preview 05-06)	N/A

Airline

Model	Score
Claude Opus 4	59.6%
Claude Sonnet 4	60.0%
Claude Sonnet 3.7	58.4%
OpenAI o3	52.0%

Local deployment

Running AI Models Locally with Docker and Spring AI Play https://www.danvega.dev/blog/docker-model-runner

AI Gemma 3

https://habr.com/ru/articles/896290/

https://spring.io/blog/2025/04/10/spring-ai-docker-model-runner

Gemii-cli

https://www.youtube.com/watch?v=xqvprnPocHs

https://github.com/google-gemini/gemini-cli

winget install -e --id OpenJS.NodeJS

npm install -g @google/gemini-cli

npm upgrade -g @google/gemini-cli

# start
gemini

Docker Model Runner

https://habr.com/ru/articles/898778/

Spring AI

https://spring.io/projects/spring-ai

Interfacing with the AI mode

MCP - Model Context Protocol

https://habr.com/ru/articles/893482/

ai.txt · Last modified: 2025/07/20 05:05 by skipidar