User Tools

Site Tools


ai

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
ai [2024/01/10 07:47] skipidarai [2025/07/06 19:34] (current) skipidar
Line 5: Line 5:
 What is the best way to learn Artificial Intelligence for a beginner What is the best way to learn Artificial Intelligence for a beginner
  
-https://qr.ae/pKDW7R+ 
 +{{youtube>FzLABAppJfM?}}
  
 {{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/Jsh6B5GPnt.png}} {{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/Jsh6B5GPnt.png}}
Line 11: Line 12:
  
  
 +
 +====== AI Model Performance Comparison ======
 +
 +This page presents a comparison of various AI models across different task categories, based on the provided data.
 +
 +===== Agentic Coding (SWE-bench Verified) =====
 +^ Model           ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4   | 72.5%   | 79.4%   |
 +| Claude Sonnet 4 | 72.7%   | 80.2%   |
 +| Claude Sonnet 3.7 | 62.3%   | 70.3%   |
 +| OpenAI o3       | 69.1%           |
 +| OpenAI GPT-4.1  | 54.6%           |
 +| Gemini 2.5 Pro (Preview 05-06) | 63.2% |         |
 +
 +===== Agentic Terminal Coding (terminal-bench) =====
 +^ Model           ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4   | 43.2%   | 50.0%   |
 +| Claude Sonnet 4 | 35.5%   | 41.3%   |
 +| Claude Sonnet 3.7 | 35.2%           |
 +| OpenAI o3       | 30.2%           |
 +| OpenAI GPT-4.1  | 30.3%           |
 +| Gemini 2.5 Pro (Preview 05-06) | 25.3% |         |
 +
 +===== Graduate-level Reasoning (GPQA Diamond) =====
 +^ Model           ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4   | 79.6%   | 83.3%   |
 +| Claude Sonnet 4 | 75.4%   | 83.8%   |
 +| Claude Sonnet 3.7 | 78.2%           |
 +| OpenAI o3       | 83.3%           |
 +| OpenAI GPT-4.1  | 66.3%           |
 +| Gemini 2.5 Pro (Preview 05-06) | 83.0% |         |
 +
 +===== Agentic Tool Use (TAU-bench) =====
 +==== Retail ====
 +^ Model           ^ Score   ^
 +| Claude Opus 4   | 81.4%   |
 +| Claude Sonnet 4 | 80.5%   |
 +| Claude Sonnet 3.7 | 81.2%   |
 +| OpenAI o3       | 70.4%   |
 +| OpenAI GPT-4.1  | 68.0%   |
 +| Gemini 2.5 Pro (Preview 05-06) | N/A     |
 +==== Airline ====
 +^ Model           ^ Score   ^
 +| Claude Opus 4   | 59.6%   |
 +| Claude Sonnet 4 | 60.0%   |
 +| Claude Sonnet 3.7 | 58.4%   |
 +| OpenAI o3       | 52.0%   |
 +| OpenAI GPT-4.1
 +
 +
 +====== Local deployment ======
 +
 +Running AI Models Locally with Docker and Spring AI
 +Play
 +https://www.danvega.dev/blog/docker-model-runner
 +
 +
 +
 +=== AI Gemma 3 ===
 +
 +https://habr.com/ru/articles/896290/
 +
 +https://spring.io/blog/2025/04/10/spring-ai-docker-model-runner
 +
 +
 +=== Gemii-cli ===
 +
 +https://www.youtube.com/watch?v=xqvprnPocHs
 +
 +https://github.com/google-gemini/gemini-cli
 +
 +<sxh shell>
 +winget install -e --id OpenJS.NodeJS
 +
 +npm install -g @google/gemini-cli
 +
 +npm upgrade -g @google/gemini-cli
 +
 +# start
 +gemini
 +</sxh>
 +
 +
 +=== Docker Model Runner ===
 +
 +https://habr.com/ru/articles/898778/
 +
 +
 +=== Spring AI ===
 +
 +https://spring.io/projects/spring-ai
 +
 +
 +
 +===== Interfacing with the AI mode =====
 +
 +=== MCP - Model Context Protocol ===
 +
 +
 +https://habr.com/ru/articles/893482/
 +
 +
 +{{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/QgOdPGCjWV.png}}
ai.1704872827.txt.gz · Last modified: by skipidar