Differences

This shows you the differences between two versions of the page.

ai [2023/12/05 15:30] – created skipidar
ai [2025/07/20 05:05] (current) skipidar
Line 1:
 ===== AI =====
 +
 +Learning AI:
 +
 +What is the best way to learn Artificial Intelligence for a beginner?
 +
 +
 +{{youtube>FzLABAppJfM?}}
  
 {{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/Jsh6B5GPnt.png}}
 +
 +
 +
 +
 +====== AI Model Performance Comparison ======
 +
 +This page compares several AI models across different task categories, using the scores listed in the tables below.
 +
 +===== Agentic Coding (SWE-bench Verified) =====
 +^ Model           ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4   | 72.5%   | 79.4%   |
 +| Claude Sonnet 4 | 72.7%   | 80.2%   |
 +| Claude Sonnet 3.7 | 62.3%   | 70.3%   |
 +| OpenAI o3       | 69.1%   |         |
 +| OpenAI GPT-4.1  | 54.6%   |         |
 +| Gemini 2.5 Pro (Preview 05-06) | 63.2% |         |
 +
 +===== Agentic Terminal Coding (terminal-bench) =====
 +^ Model           ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4   | 43.2%   | 50.0%   |
 +| Claude Sonnet 4 | 35.5%   | 41.3%   |
 +| Claude Sonnet 3.7 | 35.2%   |         |
 +| OpenAI o3       | 30.2%   |         |
 +| OpenAI GPT-4.1  | 30.3%   |         |
 +| Gemini 2.5 Pro (Preview 05-06) | 25.3% |         |
 +
 +===== Graduate-level Reasoning (GPQA Diamond) =====
 +^ Model           ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4   | 79.6%   | 83.3%   |
 +| Claude Sonnet 4 | 75.4%   | 83.8%   |
 +| Claude Sonnet 3.7 | 78.2%   |         |
 +| OpenAI o3       | 83.3%   |         |
 +| OpenAI GPT-4.1  | 66.3%   |         |
 +| Gemini 2.5 Pro (Preview 05-06) | 83.0% |         |
 +
 +===== Agentic Tool Use (TAU-bench) =====
 +==== Retail ====
 +^ Model           ^ Score   ^
 +| Claude Opus 4   | 81.4%   |
 +| Claude Sonnet 4 | 80.5%   |
 +| Claude Sonnet 3.7 | 81.2%   |
 +| OpenAI o3       | 70.4%   |
 +| OpenAI GPT-4.1  | 68.0%   |
 +| Gemini 2.5 Pro (Preview 05-06) | N/A     |
 +==== Airline ====
 +^ Model           ^ Score   ^
 +| Claude Opus 4   | 59.6%   |
 +| Claude Sonnet 4 | 60.0%   |
 +| Claude Sonnet 3.7 | 58.4%   |
 +| OpenAI o3       | 52.0%   |
 +| OpenAI GPT-4.1
 +
 +
 +====== Local deployment ======
 +
 +Running AI Models Locally with Docker and Spring AI
 +https://www.danvega.dev/blog/docker-model-runner
 +
 +
 +
 +=== AI Gemma 3 ===
 +
 +https://habr.com/ru/articles/896290/
 +
 +https://spring.io/blog/2025/04/10/spring-ai-docker-model-runner
 +
 +
 +=== Gemini-cli ===
 +
 +https://www.youtube.com/watch?v=xqvprnPocHs
 +
 +https://github.com/google-gemini/gemini-cli
 +
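 +<sxh shell>
 +# Install Node.js (required by gemini-cli) via winget on Windows
 +winget install -e --id OpenJS.NodeJS
 +
 +# Install the Gemini CLI globally
 +npm install -g @google/gemini-cli
 +
 +# Update an existing global installation
 +npm update -g @google/gemini-cli
 +
 +# Start an interactive session
 +gemini
 +</sxh>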
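
Before starting the CLI, it can help to confirm that the installed tools are actually on the PATH. The following is a minimal sketch, not part of gemini-cli: the helper name ''check_tool'' is hypothetical, and it relies only on the POSIX ''command -v'' builtin.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not shipped with gemini-cli):
# prints "ok <name>" when the command is found on PATH,
# and "missing <name>" otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok $1"
  else
    echo "missing $1"
  fi
}

# node and npm come from the Node.js install; gemini from the npm package.
for tool in node npm gemini; do
  check_tool "$tool"
done
```

If ''gemini'' is reported as missing right after the install, opening a new terminal usually refreshes the PATH. The project README also describes a non-interactive ''-p'' / ''--prompt'' flag; check ''gemini --help'' for the installed version.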
 +
 +
 +=== Docker Model Runner ===
 +
 +https://habr.com/ru/articles/898778/
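
A rough sketch of a typical Docker Model Runner session from the command line. It assumes Docker Desktop with Model Runner enabled; the model name ''ai/gemma3'' is only an example and the ''run_or_print'' wrapper is a hypothetical helper. The script defaults to a dry run that prints the commands, so nothing is downloaded unless ''DRY_RUN=0'' is set.

```shell
#!/bin/sh
# Assumptions: Docker Desktop with Model Runner enabled;
# MODEL is an example name, replace it with the model you want.
MODEL="${MODEL:-ai/gemma3}"
DRY_RUN="${DRY_RUN:-1}"

# run_or_print (hypothetical helper) prints the command in dry-run mode
# and executes it otherwise.
run_or_print() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_or_print docker model pull "$MODEL"   # download the model
run_or_print docker model list            # show locally available models
run_or_print docker model run "$MODEL" "Say hello in one sentence."
```

To execute the commands for real, run the script with ''DRY_RUN=0''.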
 +
 +
 +=== Spring AI ===
 +
 +https://spring.io/projects/spring-ai
 +
 +
 +
 +===== Interfacing with the AI model =====
 +
 +=== MCP - Model Context Protocol ===
 +
 +
 +https://habr.com/ru/articles/893482/
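
MCP messages are JSON-RPC 2.0 objects. As a small illustration, the sketch below prints the request a client would send to ask a server which tools it exposes. The helper name ''mcp_request'' is hypothetical, and the script only builds the message; it does not open a connection to any server.

```shell
#!/bin/sh
# mcp_request is a hypothetical helper: prints a JSON-RPC 2.0 request
# with the given numeric id and method name (no params).
mcp_request() {
  printf '{"jsonrpc":"2.0","id":%s,"method":"%s"}\n' "$1" "$2"
}

# tools/list asks an MCP server for the tools it exposes.
mcp_request 1 "tools/list"
```

In a real session the client first performs the ''initialize'' handshake before sending requests such as ''tools/list''.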
 +
 +
 +{{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/QgOdPGCjWV.png}}
 +
 +
 +
 +
 +
 +
 +{{youtube>KC8HT0eWSGk?}}
 +
 +