ai

Differences

This shows you the differences between two versions of the page.

ai [2025/06/27 15:20] skipidar
ai [2025/07/06 19:34] (current) skipidar
Line 5: Line 5:
 What is the best way to learn Artificial Intelligence for a beginner
  
-https://qr.ae/pKDW7R
 +{{youtube>FzLABAppJfM?}}
  
 {{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/Jsh6B5GPnt.png}}
Line 12: Line 13:
  
  
-Here's the data from the image in list format, categorized by the type of task and then by the AI model:
-1. Agentic Coding (SWE-bench Verified)
-* Claude Opus 4: 72.5% / 79.4%
-* Claude Sonnet 4: 72.7% / 80.2%
-* Claude Sonnet 3.7: 62.3% / 70.3%
-* OpenAI o3: 69.1%
-* OpenAI GPT-4.1: 54.6%
-* Gemini 2.5 Pro (Preview 05-06): 63.2%
-2. Agentic Terminal Coding (terminal-bench)
-* Claude Opus 4: 43.2% / 50.0%
-* Claude Sonnet 4: 35.5% / 41.3%
-* Claude Sonnet 3.7: 35.2%
-* OpenAI o3: 30.2%
-* OpenAI GPT-4.1: 30.3%
-* Gemini 2.5 Pro (Preview 05-06): 25.3%
-3. Graduate-level Reasoning (GPQA Diamond)
-* Claude Opus 4: 79.6% / 83.3%
-* Claude Sonnet 4: 75.4% / 83.8%
-* Claude Sonnet 3.7: 78.2%
-* OpenAI o3: 83.3%
-* OpenAI GPT-4.1: 66.3%
-* Gemini 2.5 Pro (Preview 05-06): 83.0%
-4. Agentic Tool Use (TAU-bench)
-Retail:
-* Claude Opus 4: 81.4%
-* Claude Sonnet 4: 80.5%
-* Claude Sonnet 3.7: 81.2%
-* OpenAI o3: 70.4%
-* OpenAI GPT-4.1: 68.0%
-Airline:
-* Claude Opus 4: 59.6%
-* Claude Sonnet 4: 60.0%
-* Claude Sonnet 3.7: 58.4%
-* OpenAI o3: 52.0%
-* OpenAI GPT-4.1: 49.4%
-* Gemini 2.5 Pro (Preview 05-06): (No data provided)
-5. Multilingual Q&A (MMMUA)
-* Claude Opus 4: 88.8%
-* Claude Sonnet 4: 86.5%
-* Claude Sonnet 3.7: 85.9%
-* OpenAI o3: 88.8%
-* OpenAI GPT-4.1: 83.7%
-* Gemini 2.5 Pro (Preview 05-06): (No data provided)
-6. Visual Reasoning (MMMU (validation))
-* Claude Opus 4: 76.5%
-* Claude Sonnet 4: 74.4%
-* Claude Sonnet 3.7: 75.0%
-* OpenAI o3: 82.9%
-* OpenAI GPT-4.1: 74.8%
-* Gemini 2.5 Pro (Preview 05-06): 79.6%
-7. High School Math Competition (AIME 2024)
-* Claude Opus 4: 75.5% / 90.0%
-* Claude Sonnet 4: 70.5% / 85.0%
-* Claude Sonnet 3.7: 54.8%
-* OpenAI o3: 88.9%
-* OpenAI GPT-4.1: (No data provided)
-* Gemini 2.5 Pro (Preview 05-06): 83.0%
 +====== AI Model Performance Comparison ======
 +
 +This page presents a comparison of various AI models across different task categories, based on the data in the image above.
 +
 +===== Agentic Coding (SWE-bench Verified) =====
 +^ Model                          ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4                  | 72.5%   | 79.4%   |
 +| Claude Sonnet 4                | 72.7%   | 80.2%   |
 +| Claude Sonnet 3.7              | 62.3%   | 70.3%   |
 +| OpenAI o3                      | 69.1%   |         |
 +| OpenAI GPT-4.1                 | 54.6%   |         |
 +| Gemini 2.5 Pro (Preview 05-06) | 63.2%   |         |
 +
 +===== Agentic Terminal Coding (terminal-bench) =====
 +^ Model                          ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4                  | 43.2%   | 50.0%   |
 +| Claude Sonnet 4                | 35.5%   | 41.3%   |
 +| Claude Sonnet 3.7              | 35.2%   |         |
 +| OpenAI o3                      | 30.2%   |         |
 +| OpenAI GPT-4.1                 | 30.3%   |         |
 +| Gemini 2.5 Pro (Preview 05-06) | 25.3%   |         |
 +
 +===== Graduate-level Reasoning (GPQA Diamond) =====
 +^ Model                          ^ Score 1 ^ Score 2 ^
 +| Claude Opus 4                  | 79.6%   | 83.3%   |
 +| Claude Sonnet 4                | 75.4%   | 83.8%   |
 +| Claude Sonnet 3.7              | 78.2%   |         |
 +| OpenAI o3                      | 83.3%   |         |
 +| OpenAI GPT-4.1                 | 66.3%   |         |
 +| Gemini 2.5 Pro (Preview 05-06) | 83.0%   |         |
 +
 +===== Agentic Tool Use (TAU-bench) =====
 +==== Retail ====
 +^ Model                          ^ Score   ^
 +| Claude Opus 4                  | 81.4%   |
 +| Claude Sonnet 4                | 80.5%   |
 +| Claude Sonnet 3.7              | 81.2%   |
 +| OpenAI o3                      | 70.4%   |
 +| OpenAI GPT-4.1                 | 68.0%   |
 +| Gemini 2.5 Pro (Preview 05-06) | N/A     |
 +==== Airline ====
 +^ Model                          ^ Score   ^
 +| Claude Opus 4                  | 59.6%   |
 +| Claude Sonnet 4                | 60.0%   |
 +| Claude Sonnet 3.7              | 58.4%   |
 +| OpenAI o3                      | 52.0%   |
 +| OpenAI GPT-4.1                 | 49.4%   |
 +
 +
 +====== Local deployment ======
 +
 +Running AI Models Locally with Docker and Spring AI
 +https://www.danvega.dev/blog/docker-model-runner
 +
 +=== AI Gemma === 
 + 
 +https://habr.com/ru/articles/896290/ 
 + 
 +https://spring.io/blog/2025/04/10/spring-ai-docker-model-runner 
 + 
 + 
 +=== Gemini-cli === 
 + 
 +https://www.youtube.com/watch?v=xqvprnPocHs 
 + 
 +https://github.com/google-gemini/gemini-cli 
 + 
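 +<sxh shell> 
 +# install Node.js on Windows via winget (gemini-cli is an npm package) 
 +winget install -e --id OpenJS.NodeJS 
 + 
 +# install the Gemini CLI globally 
 +npm install -g @google/gemini-cli 
 + 
 +# upgrade an existing global install 
 +npm upgrade -g @google/gemini-cli 
 + 
 +# start 
 +gemini 
 +</sxh> 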
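
The install steps above depend on a working Node.js. A minimal sketch of a pre-flight check follows, assuming a POSIX-style shell (for example Git Bash or WSL on Windows). The helper name `node_major_ok` is hypothetical, and the minimum major version of 20 is an assumption; take the actual requirement from the gemini-cli README.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a Node.js version string (e.g. "v20.11.1")
# has a major version greater than or equal to the given minimum.
node_major_ok() {
  local version="${1#v}"        # strip a leading "v" if present
  local major="${version%%.*}"  # keep only the text before the first dot
  [ "$major" -ge "$2" ]
}

# Assumed minimum; check the gemini-cli README for the real requirement.
MIN_NODE_MAJOR=20

if command -v node >/dev/null 2>&1; then
  if node_major_ok "$(node --version)" "$MIN_NODE_MAJOR"; then
    echo "Node.js meets the assumed minimum version"
  else
    echo "Node.js is older than the assumed minimum version"
  fi
else
  echo "node not found on PATH"
fi
```

If the check reports an older Node.js, rerunning the `winget install` line before `npm install -g @google/gemini-cli` is the likely fix.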
 + 
 + 
 +=== Docker Model Runner === 
 + 
 +https://habr.com/ru/articles/898778/ 
 + 
 + 
 +=== Spring AI === 
 + 
 +https://spring.io/projects/spring-ai 
 + 
 + 
 + 
 +===== Interfacing with the AI model ===== 
 + 
 +=== MCP (Model Context Protocol) === 
 + 
 + 
 +https://habr.com/ru/articles/893482/ 
  
 +{{https://s3.eu-central-1.amazonaws.com/alf-digital-wiki-pics/sharex/QgOdPGCjWV.png}}
ai.1751037628.txt.gz · Last modified: by skipidar