ai
Differences
This shows you the differences between two versions of the page.
| ai [2025/06/27 15:20] – skipidar | ai [2025/07/20 05:05] (current) – skipidar | ||
|---|---|---|---|
| Line 5: | Line 5: | ||
| What is the best way to learn Artificial Intelligence for a beginner | What is the best way to learn Artificial Intelligence for a beginner | ||
| - | https:// | + | |
| + | {{youtube> | ||
| {{https:// | {{https:// | ||
| Line 12: | Line 13: | ||
| - | Here's the data from the image in a list format, categorized by the type of task and then by the AI model: | + | ====== AI Model Performance Comparison ====== |
| - | 1. Agentic Coding (SWE-bench Verified) | + | |
| - | * Claude Opus 4: 72.5% / 79.4% | + | This page presents |
| - | * Claude Sonnet 4: 72.7% / 80.2% | + | |
| - | * Claude Sonnet 3.7: 62.3% / 70.3% | + | ===== Agentic Coding (SWE-bench Verified) ===== |
| - | * OpenAI o3: 69.1% | + | ^ Model ^ Score 1 ^ Score 2 ^ |
| - | * OpenAI GPT-4.1: 54.6% | + | | Claude Opus 4 | 72.5% | 79.4% | |
| - | * Gemini 2.5 Pro (Preview 05-06): 63.2% | + | | Claude Sonnet 4 | 72.7% | 80.2% | |
| - | 2. Agentic Terminal Coding (terminal-bench) | + | | Claude Sonnet 3.7 | 62.3% | 70.3% | |
| - | * Claude Opus 4: 43.2% / 50.0% | + | | OpenAI o3 | 69.1% | | |
| - | * Claude Sonnet 4: 35.5% / 41.3% | + | | OpenAI GPT-4.1 | 54.6% | | |
| - | * Claude Sonnet 3.7: 35.2% | + | | Gemini 2.5 Pro (Preview 05-06) | 63.2% | | |
| - | * OpenAI o3: 30.2% | + | |
| - | * OpenAI GPT-4.1: 30.3% | + | ===== Agentic Terminal Coding (terminal-bench) ===== |
| - | * Gemini 2.5 Pro (Preview 05-06): 25.3% | + | ^ Model ^ Score 1 ^ Score 2 ^ |
| - | 3. Graduate-level Reasoning (GPQA Diamond) | + | | Claude Opus 4 | 43.2% | 50.0% | |
| - | * Claude Opus 4: 79.6% / 83.3% | + | | Claude Sonnet 4 | 35.5% | 41.3% | |
| - | * Claude Sonnet 4: 75.4% / 83.8% | + | | Claude Sonnet 3.7 | 35.2% | | |
| - | * Claude Sonnet 3.7: 78.2% | + | | OpenAI o3 | 30.2% | | |
| - | * OpenAI o3: 83.3% | + | | OpenAI GPT-4.1 | 30.3% | | |
| - | * OpenAI GPT-4.1: 66.3% | + | | Gemini 2.5 Pro (Preview 05-06) | 25.3% | | |
| - | * Gemini 2.5 Pro (Preview 05-06): 83.0% | + | |
| - | 4. Agentic Tool Use (TAU-bench) | + | ===== Graduate-level Reasoning (GPQA Diamond) ===== |
| - | * Retail: | + | ^ Model ^ Score 1 ^ Score 2 ^ |
| - | * Claude Opus 4: 81.4% | + | | Claude Opus 4 | 79.6% | 83.3% | |
| - | * Claude Sonnet 4: 80.5% | + | | Claude Sonnet 4 | 75.4% | 83.8% | |
| - | * Claude Sonnet 3.7: 81.2% | + | | Claude Sonnet 3.7 | 78.2% | | |
| - | * OpenAI o3: 70.4% | + | | OpenAI o3 | 83.3% | | |
| - | * OpenAI GPT-4.1: 68.0% | + | | OpenAI GPT-4.1 | 66.3% | | |
| - | * Airline: | + | | Gemini 2.5 Pro (Preview 05-06) | 83.0% | | |
| - | * Claude Opus 4: 59.6% | + | |
| - | * Claude Sonnet 4: 60.0% | + | ===== Agentic Tool Use (TAU-bench) ===== |
| - | * Claude Sonnet 3.7: 58.4% | + | ==== Retail ==== |
| - | * OpenAI o3: 52.0% | + | ^ Model ^ Score ^ |
| - | * OpenAI GPT-4.1: | + | | Claude Opus 4 | 81.4% | |
| - | * Gemini 2.5 Pro (Preview 05-06): (No data provided) | + | | Claude Sonnet 4 | 80.5% | |
| - | 5. Multilingual Q&A (MMMLU) | + | | Claude Sonnet 3.7 | 81.2% | |
| - | * Claude Opus 4: 88.8% | + | | OpenAI o3 | 70.4% | |
| - | * Claude Sonnet 4: 86.5% | + | | OpenAI GPT-4.1 | 68.0% | |
| - | * Claude Sonnet | + | | Gemini 2.5 Pro (Preview 05-06) | N/A | |
| - | * OpenAI o3: 88.8% | + | ==== Airline ==== |
| - | * OpenAI GPT-4.1: 83.7% | + | ^ Model ^ Score ^ |
| - | * Gemini 2.5 Pro (Preview 05-06): (No data provided) | + | | Claude Opus 4 | 59.6% | |
| - | 6. Visual Reasoning (MMMU (validation)) | + | | Claude Sonnet 4 | 60.0% | |
| - | * Claude Opus 4: 76.5% | + | | Claude Sonnet 3.7 | 58.4% | |
| - | * Claude Sonnet 4: 74.4% | + | | OpenAI o3 | 52.0% | |
| - | * Claude Sonnet 3.7: 75.0% | + | | OpenAI GPT-4.1 |
| - | * OpenAI o3: 82.9% | + | |
| - | * OpenAI GPT-4.1: 74.8% | + | |
| - | * Gemini 2.5 Pro (Preview 05-06): 79.6% | + | ====== Local deployment ====== |
| - | 7. High School Math Competition (AIME 2024) | + | |
| - | * Claude Opus 4: 75.5% / 90.0% | + | Running AI Models Locally with Docker and Spring AI |
| - | * Claude Sonnet 4: 70.5% / 85.0% | + | |
| - | * Claude Sonnet 3.7: 54.8% | + | https://www.danvega.dev/ |
| - | * OpenAI o3: 88.9% | + | |
| - | * OpenAI GPT-4.1: (No data provided) | + | |
| - | * Gemini 2.5 Pro (Preview 05-06): 83.0% | + | |
| + | === AI Gemma 3 === | ||
| + | |||
| + | https://habr.com/ | ||
| + | |||
| + | https://spring.io/ | ||
| + | |||
| + | |||
| + | === Gemini-cli === | ||
| + | |||
| + | https://www.youtube.com/ | ||
| + | |||
| + | https://github.com/ | ||
| + | |||
| + | <sxh shell> | ||
| + | winget install | ||
| + | |||
| + | npm install | ||
| + | |||
| + | npm upgrade -g @google/ | ||
| + | |||
| + | # start | ||
| + | gemini | ||
| + | </sxh> | ||
| + | |||
| + | |||
| + | === Docker Model Runner === | ||
| + | |||
| + | https://habr.com/ | ||
| + | |||
| + | |||
| + | === Spring AI === | ||
| + | |||
| + | https://spring.io/ | ||
| + | |||
| + | |||
| + | |||
| + | ===== Interfacing with the AI model ===== | ||
| + | |||
| + | === MCP - Model Context Protocol === | ||
| + | |||
| + | |||
| + | https://habr.com/ | ||
| + | |||
| + | |||
| + | {{https://s3.eu-central-1.amazonaws.com/ | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | {{youtube> | ||
ai.1751037628.txt.gz · Last modified: by skipidar
