ai
Differences
This shows you the differences between two versions of the page.
ai [2025/06/27 15:20] – skipidar | ai [2025/07/06 19:34] (current) – skipidar | ||
---|---|---|---|
Line 5: | Line 5: | ||
What is the best way to learn Artificial Intelligence for a beginner | What is the best way to learn Artificial Intelligence for a beginner | ||
- | https:// | + | |
+ | {{youtube> | ||
{{https:// | {{https:// | ||
Line 12: | Line 13: | ||
- | Here's the data from the image in a list format, categorized by the type of task and then by the AI model: | + | ====== AI Model Performance Comparison ====== |
- | 1. Agentic Coding (SWE-bench Verified) | + | |
- | * Claude Opus 4: 72.5% / 79.4% | + | This page presents |
- | * Claude Sonnet 4: 72.7% / 80.2% | + | |
- | * Claude Sonnet 3.7: 62.3% / 70.3% | + | ===== Agentic Coding (SWE-bench Verified) ===== |
- | * OpenAI o3: 69.1% | + | ^ Model ^ Score 1 ^ Score 2 ^ |
- | * OpenAI GPT-4.1: 54.6% | + | | Claude Opus 4 | 72.5% | 79.4% | |
- | * Gemini 2.5 Pro (Preview 05-06): 63.2% | + | | Claude Sonnet 4 | 72.7% | 80.2% | |
- | 2. Agentic Terminal Coding (terminal-bench) | + | | Claude Sonnet 3.7 | 62.3% | 70.3% | |
- | * Claude Opus 4: 43.2% / 50.0% | + | | OpenAI o3 | 69.1% | | |
- | * Claude Sonnet 4: 35.5% / 41.3% | + | | OpenAI GPT-4.1 | 54.6% | | |
- | * Claude Sonnet 3.7: 35.2% | + | | Gemini 2.5 Pro (Preview 05-06) | 63.2% | | |
- | * OpenAI o3: 30.2% | + | |
- | * OpenAI GPT-4.1: 30.3% | + | ===== Agentic Terminal Coding (terminal-bench) ===== |
- | * Gemini 2.5 Pro (Preview 05-06): 25.3% | + | ^ Model ^ Score 1 ^ Score 2 ^ |
- | 3. Graduate-level Reasoning (GPQA Diamond) | + | | Claude Opus 4 | 43.2% | 50.0% | |
- | * Claude Opus 4: 79.6% / 83.3% | + | | Claude Sonnet 4 | 35.5% | 41.3% | |
- | * Claude Sonnet 4: 75.4% / 83.8% | + | | Claude Sonnet 3.7 | 35.2% | | |
- | * Claude Sonnet 3.7: 78.2% | + | | OpenAI o3 | 30.2% | | |
- | * OpenAI o3: 83.3% | + | | OpenAI GPT-4.1 | 30.3% | | |
- | * OpenAI GPT-4.1: 66.3% | + | | Gemini 2.5 Pro (Preview 05-06) | 25.3% | | |
- | * Gemini 2.5 Pro (Preview 05-06): 83.0% | + | |
- | 4. Agentic Tool Use (TAU-bench) | + | ===== Graduate-level Reasoning (GPQA Diamond) ===== |
- | * Retail: | + | ^ Model ^ Score 1 ^ Score 2 ^ |
- | * Claude Opus 4: 81.4% | + | | Claude Opus 4 | 79.6% | 83.3% | |
- | * Claude Sonnet 4: 80.5% | + | | Claude Sonnet 4 | 75.4% | 83.8% | |
- | * Claude Sonnet 3.7: 81.2% | + | | Claude Sonnet 3.7 | 78.2% | | |
- | * OpenAI o3: 70.4% | + | | OpenAI o3 | 83.3% | | |
- | * OpenAI GPT-4.1: 68.0% | + | | OpenAI GPT-4.1 | 66.3% | | |
- | * Airline: | + | | Gemini 2.5 Pro (Preview 05-06) | 83.0% | | |
- | * Claude Opus 4: 59.6% | + | |
- | * Claude Sonnet 4: 60.0% | + | ===== Agentic Tool Use (TAU-bench) ===== |
- | * Claude Sonnet 3.7: 58.4% | + | ==== Retail ==== |
- | * OpenAI o3: 52.0% | + | ^ Model ^ Score ^ |
- | * OpenAI GPT-4.1: | + | | Claude Opus 4 | 81.4% | |
- | * Gemini 2.5 Pro (Preview 05-06): (No data provided) | + | | Claude Sonnet 4 | 80.5% | |
- | 5. Multilingual Q&A (MMMLU) | + | | Claude Sonnet 3.7 | 81.2% | |
- | * Claude Opus 4: 88.8% | + | | OpenAI o3 | 70.4% | |
- | * Claude Sonnet 4: 86.5% | + | | OpenAI GPT-4.1 | 68.0% | |
- | * Claude Sonnet | + | | Gemini 2.5 Pro (Preview 05-06) | N/A | |
- | * OpenAI o3: 88.8% | + | ==== Airline ==== |
- | * OpenAI GPT-4.1: 83.7% | + | ^ Model ^ Score ^ |
- | * Gemini 2.5 Pro (Preview 05-06): (No data provided) | + | | Claude Opus 4 | 59.6% | |
- | 6. Visual Reasoning (MMMU (validation)) | + | | Claude Sonnet 4 | 60.0% | |
- | * Claude Opus 4: 76.5% | + | | Claude Sonnet 3.7 | 58.4% | |
- | * Claude Sonnet 4: 74.4% | + | | OpenAI o3 | 52.0% | |
- | * Claude Sonnet 3.7: 75.0% | + | | OpenAI GPT-4.1 |
- | * OpenAI o3: 82.9% | + | |
- | * OpenAI GPT-4.1: 74.8% | + | |
- | * Gemini 2.5 Pro (Preview 05-06): 79.6% | + | ====== Local deployment ====== |
- | 7. High School Math Competition (AIME 2024) | + | |
- | * Claude Opus 4: 75.5% / 90.0% | + | Running AI Models Locally with Docker and Spring AI |
- | * Claude Sonnet 4: 70.5% / 85.0% | + | |
- | * Claude Sonnet 3.7: 54.8% | + | https://www.danvega.dev/ |
- | * OpenAI o3: 88.9% | + | |
- | * OpenAI GPT-4.1: (No data provided) | + | |
- | * Gemini 2.5 Pro (Preview 05-06): 83.0% | + | |
+ | === AI Gemma 3 === | ||
+ | |||
+ | https://habr.com/ | ||
+ | |||
+ | https://spring.io/ | ||
+ | |||
+ | |||
+ | === Gemini-cli === | ||
+ | |||
+ | https://www.youtube.com/ | ||
+ | |||
+ | https://github.com/ | ||
+ | |||
+ | <sxh shell> | ||
+ | winget install -e --id OpenJS.NodeJS | ||
+ | |||
+ | npm install | ||
+ | |||
+ | npm upgrade -g @google/gemini-cli | ||
+ | |||
+ | # start | ||
+ | gemini | ||
+ | </sxh> | ||
+ | |||
+ | |||
+ | === Docker Model Runner === | ||
+ | |||
+ | https://habr.com/ | ||
+ | |||
+ | |||
+ | === Spring AI === | ||
+ | |||
+ | https://spring.io/ | ||
+ | |||
+ | |||
+ | |||
+ | ===== Interfacing with the AI model ===== | ||
+ | |||
+ | === MCP - Model Context Protocol === | ||
+ | |||
+ | |||
+ | https://habr.com/ | ||
+ | {{https:// |
ai.1751037628.txt.gz · Last modified: by skipidar