Differences

This shows you the differences between two versions of the page.

--- ai [2025/06/27 15:20] – skipidar
+++ ai [2025/07/20 05:05] (current) – skipidar
@@ Line 5: / Line 5: @@
 What is the best way to learn Artificial Intelligence for a beginner
-https://qr.ae/pKDW7R
+{{youtube>FzLABAppJfM?}}
 {{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/Jsh6B5GPnt.png}}
@@ Line 12: / Line 13: @@
-Here's the data from the image in a list format, categorized by the type of task and then by the AI model:
+====== AI Model Performance Comparison ======
-. Agentic Coding (SWE-bench Verified)
-* Claude Opus 4: 72.5% / 79.4%
+This page compares several AI models across task categories, using the data from the image above.
-* Claude Sonnet 4: 72.7% / 80.2%
-* Claude Sonnet 3.7: 62.3% / 70.3%
+===== Agentic Coding (SWE-bench Verified) =====
-* OpenAI o3: 69.1%
+^ Model           ^ Score 1 ^ Score 2 ^
-* OpenAI GPT-4.1: 54.6%
+| Claude Opus 4   | 72.5%   | 79.4%   |
-* Gemini 2.5 Pro (Preview 05-06): 63.2%
+| Claude Sonnet 4 | 72.7%   | 80.2%   |
-. Agentic Terminal Coding (terminal-bench)
+| Claude Sonnet 3.7 | 62.3%   | 70.3%   |
-* Claude Opus 4: 43.2% / 50.0%
+| OpenAI o3       | 69.1%   |         |
-* Claude Sonnet 4: 35.5% / 41.3%
+| OpenAI GPT-4.1  | 54.6%   |         |
-* Claude Sonnet 3.7: 35.2%
+| Gemini 2.5 Pro (Preview 05-06) | 63.2% |         |
-* OpenAI o3: 30.2%
-* OpenAI GPT-4.1: 30.3%
+===== Agentic Terminal Coding (terminal-bench) =====
-* Gemini 2.5 Pro (Preview 05-06): 25.3%
+^ Model           ^ Score 1 ^ Score 2 ^
-. Graduate-level Reasoning (GPQA Diamond)
+| Claude Opus 4   | 43.2%   | 50.0%   |
-* Claude Opus 4: 79.6% / 83.3%
+| Claude Sonnet 4 | 35.5%   | 41.3%   |
-* Claude Sonnet 4: 75.4% / 83.8%
+| Claude Sonnet 3.7 | 35.2%   |         |
-* Claude Sonnet 3.7: 78.2%
+| OpenAI o3       | 30.2%   |         |
-* OpenAI o3: 83.3%
+| OpenAI GPT-4.1  | 30.3%   |         |
-* OpenAI GPT-4.1: 66.3%
+| Gemini 2.5 Pro (Preview 05-06) | 25.3% |         |
-* Gemini 2.5 Pro (Preview 05-06): 83.0%
-. Agentic Tool Use (TAU-bench)
+===== Graduate-level Reasoning (GPQA Diamond) =====
-* Retail:
+^ Model           ^ Score 1 ^ Score 2 ^
-* Claude Opus 4: 81.4%
+| Claude Opus 4   | 79.6%   | 83.3%   |
-* Claude Sonnet 4: 80.5%
+| Claude Sonnet 4 | 75.4%   | 83.8%   |
-* Claude Sonnet 3.7: 81.2%
+| Claude Sonnet 3.7 | 78.2%   |         |
-* OpenAI o3: 70.4%
+| OpenAI o3       | 83.3%   |         |
-* OpenAI GPT-4.1: 68.0%
+| OpenAI GPT-4.1  | 66.3%   |         |
-* Airline:
+| Gemini 2.5 Pro (Preview 05-06) | 83.0% |         |
-* Claude Opus 4: 59.6%
-* Claude Sonnet 4: 60.0%
+===== Agentic Tool Use (TAU-bench) =====
-* Claude Sonnet 3.7: 58.4%
+==== Retail ====
-* OpenAI o3: 52.0%
+^ Model           ^ Score   ^
-* OpenAI GPT-4.1: 49.4%
+| Claude Opus 4   | 81.4%   |
-* Gemini 2.5 Pro (Preview 05-06): (No data provided)
+| Claude Sonnet 4 | 80.5%   |
-. Multilingual Q&A (MMMLU)
+| Claude Sonnet 3.7 | 81.2%   |
-* Claude Opus 4: 88.8%
+| OpenAI o3       | 70.4%   |
-* Claude Sonnet 4: 86.5%
+| OpenAI GPT-4.1  | 68.0%   |
-* Claude Sonnet 3.7: 85.9%
+| Gemini 2.5 Pro (Preview 05-06) | N/A     |
-* OpenAI o3: 88.8%
+==== Airline ====
-* OpenAI GPT-4.1: 83.7%
+^ Model           ^ Score   ^
-* Gemini 2.5 Pro (Preview 05-06): (No data provided)
+| Claude Opus 4   | 59.6%   |
-. Visual Reasoning (MMMU (validation))
+| Claude Sonnet 4 | 60.0%   |
-* Claude Opus 4: 76.5%
+| Claude Sonnet 3.7 | 58.4%   |
-* Claude Sonnet 4: 74.4%
+| OpenAI o3       | 52.0%   |
-* Claude Sonnet 3.7: 75.0%
+| OpenAI GPT-4.1  | 49.4%   |
-* OpenAI o3: 82.9%
-* OpenAI GPT-4.1: 74.8%
-* Gemini 2.5 Pro (Preview 05-06): 79.6%
+====== Local deployment ======
-. High School Math Competition (AIME 2024)
-* Claude Opus 4: 75.5% / 90.0%
+Running AI Models Locally with Docker and Spring AI
-* Claude Sonnet 4: 70.5% / 85.0%
-* Claude Sonnet 3.7: 54.8%
+https://www.danvega.dev/blog/docker-model-runner
-* OpenAI o3: 88.9%
-* OpenAI GPT-4.1: (No data provided)
-* Gemini 2.5 Pro (Preview 05-06): 83.0%
+=== AI Gemma 3 ===
+https://habr.com/ru/articles/896290/
+https://spring.io/blog/2025/04/10/spring-ai-docker-model-runner
+=== Gemini-cli ===
+https://www.youtube.com/watch?v=xqvprnPocHs
+https://github.com/google-gemini/gemini-cli
+<sxh shell>
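+# install Node.js on Windows via winget (provides npm)
+winget install -e --id OpenJS.NodeJS
+# install the Gemini CLI globally
+npm install -g @google/gemini-cli
+# later: bring an existing global install up to date
+npm upgrade -g @google/gemini-cli
+# start an interactive session
+gemini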
+</sxh>
+=== Docker Model Runner ===
+https://habr.com/ru/articles/898778/
+=== Spring AI ===
+https://spring.io/projects/spring-ai
+===== Interfacing with the AI model =====
+=== MCP - Model Context Protocol ===
+https://habr.com/ru/articles/893482/
+{{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/QgOdPGCjWV.png}}
+{{youtube>KC8HT0eWSGk?}}