--- ai [2024/01/10 07:47] – skipidar
+++ ai [2025/07/20 05:05] (current) – skipidar
@@ Line 5: / Line 5: @@
 What is the best way to learn Artificial Intelligence for a beginner
-https://qr.ae/pKDW7R
+{{youtube>FzLABAppJfM?}}
 {{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/Jsh6B5GPnt.png}}
+====== AI Model Performance Comparison ======
+This section compares AI models across several benchmark categories.
+===== Agentic Coding (SWE-bench Verified) =====
+^ Model           ^ Score 1 ^ Score 2 ^
+| Claude Opus 4   | 72.5%   | 79.4%   |
+| Claude Sonnet 4 | 72.7%   | 80.2%   |
+| Claude Sonnet 3.7 | 62.3%   | 70.3%   |
+| OpenAI o3       | 69.1%   |         |
+| OpenAI GPT-4.1  | 54.6%   |         |
+| Gemini 2.5 Pro (Preview 05-06) | 63.2% |         |
+===== Agentic Terminal Coding (terminal-bench) =====
+^ Model           ^ Score 1 ^ Score 2 ^
+| Claude Opus 4   | 43.2%   | 50.0%   |
+| Claude Sonnet 4 | 35.5%   | 41.3%   |
+| Claude Sonnet 3.7 | 35.2%   |         |
+| OpenAI o3       | 30.2%   |         |
+| OpenAI GPT-4.1  | 30.3%   |         |
+| Gemini 2.5 Pro (Preview 05-06) | 25.3% |         |
+===== Graduate-level Reasoning (GPQA Diamond) =====
+^ Model           ^ Score 1 ^ Score 2 ^
+| Claude Opus 4   | 79.6%   | 83.3%   |
+| Claude Sonnet 4 | 75.4%   | 83.8%   |
+| Claude Sonnet 3.7 | 78.2%   |         |
+| OpenAI o3       | 83.3%   |         |
+| OpenAI GPT-4.1  | 66.3%   |         |
+| Gemini 2.5 Pro (Preview 05-06) | 83.0% |         |
+===== Agentic Tool Use (TAU-bench) =====
+==== Retail ====
+^ Model           ^ Score   ^
+| Claude Opus 4   | 81.4%   |
+| Claude Sonnet 4 | 80.5%   |
+| Claude Sonnet 3.7 | 81.2%   |
+| OpenAI o3       | 70.4%   |
+| OpenAI GPT-4.1  | 68.0%   |
+| Gemini 2.5 Pro (Preview 05-06) | N/A     |
+==== Airline ====
+^ Model           ^ Score   ^
+| Claude Opus 4   | 59.6%   |
+| Claude Sonnet 4 | 60.0%   |
+| Claude Sonnet 3.7 | 58.4%   |
+| OpenAI o3       | 52.0%   |
+| OpenAI GPT-4.1
+====== Local deployment ======
+Running AI Models Locally with Docker and Spring AI
+https://www.danvega.dev/blog/docker-model-runner
+=== AI Gemma 3 ===
+https://habr.com/ru/articles/896290/
+https://spring.io/blog/2025/04/10/spring-ai-docker-model-runner
+=== Gemini CLI ===
+https://www.youtube.com/watch?v=xqvprnPocHs
+https://github.com/google-gemini/gemini-cli
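+<sxh shell>
+# install Node.js (ships with npm) via winget
+winget install -e --id OpenJS.NodeJS
+# install the Gemini CLI globally
+npm install -g @google/gemini-cli
+# later: bring an existing global install up to date ("upgrade" is an alias of "npm update")
+npm upgrade -g @google/gemini-cli
+# start
+gemini
+</sxh>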
+=== Docker Model Runner ===
+https://habr.com/ru/articles/898778/
+=== Spring AI ===
+https://spring.io/projects/spring-ai
+===== Interfacing with the AI model =====
+=== MCP - Model Context Protocol ===
+https://habr.com/ru/articles/893482/
+{{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/QgOdPGCjWV.png}}
+{{youtube>KC8HT0eWSGk?}}