ai
Differences
This shows you the differences between two versions of the page.
ai [2024/01/10 07:47] – skipidar | ai [2025/07/06 19:34] (current) – skipidar | ||
---|---|---|---|
Line 5: | Line 5: | ||
What is the best way to learn Artificial Intelligence for a beginner | What is the best way to learn Artificial Intelligence for a beginner | ||
- | https:// | + | |
+ | {{youtube> | ||
{{https:// | {{https:// | ||
Line 11: | Line 12: | ||
+ | |||
+ | ====== AI Model Performance Comparison ====== | ||
+ | |||
+ | This section compares several AI models across benchmark task categories. | ||
+ | |||
+ | ===== Agentic Coding (SWE-bench Verified) ===== | ||
+ | ^ Model ^ Score 1 ^ Score 2 ^ | ||
+ | | Claude Opus 4 | 72.5% | 79.4% | | ||
+ | | Claude Sonnet 4 | 72.7% | 80.2% | | ||
+ | | Claude Sonnet 3.7 | 62.3% | 70.3% | | ||
+ | | OpenAI o3 | 69.1% | ||
+ | | OpenAI GPT-4.1 | ||
+ | | Gemini 2.5 Pro (Preview 05-06) | 63.2% | | | ||
+ | |||
+ | ===== Agentic Terminal Coding (terminal-bench) ===== | ||
+ | ^ Model ^ Score 1 ^ Score 2 ^ | ||
+ | | Claude Opus 4 | 43.2% | 50.0% | | ||
+ | | Claude Sonnet 4 | 35.5% | 41.3% | | ||
+ | | Claude Sonnet 3.7 | 35.2% | ||
+ | | OpenAI o3 | 30.2% | ||
+ | | OpenAI GPT-4.1 | ||
+ | | Gemini 2.5 Pro (Preview 05-06) | 25.3% | | | ||
+ | |||
+ | ===== Graduate-level Reasoning (GPQA Diamond) ===== | ||
+ | ^ Model ^ Score 1 ^ Score 2 ^ | ||
+ | | Claude Opus 4 | 79.6% | 83.3% | | ||
+ | | Claude Sonnet 4 | 75.4% | 83.8% | | ||
+ | | Claude Sonnet 3.7 | 78.2% | ||
+ | | OpenAI o3 | 83.3% | ||
+ | | OpenAI GPT-4.1 | ||
+ | | Gemini 2.5 Pro (Preview 05-06) | 83.0% | | | ||
+ | |||
+ | ===== Agentic Tool Use (TAU-bench) ===== | ||
+ | ==== Retail ==== | ||
+ | ^ Model ^ Score ^ | ||
+ | | Claude Opus 4 | 81.4% | | ||
+ | | Claude Sonnet 4 | 80.5% | | ||
+ | | Claude Sonnet 3.7 | 81.2% | | ||
+ | | OpenAI o3 | 70.4% | | ||
+ | | OpenAI GPT-4.1 | ||
+ | | Gemini 2.5 Pro (Preview 05-06) | N/A | | ||
+ | ==== Airline ==== | ||
+ | ^ Model ^ Score ^ | ||
+ | | Claude Opus 4 | 59.6% | | ||
+ | | Claude Sonnet 4 | 60.0% | | ||
+ | | Claude Sonnet 3.7 | 58.4% | | ||
+ | | OpenAI o3 | 52.0% | | ||
+ | | OpenAI GPT-4.1 | ||
+ | |||
+ | |||
+ | ====== Local deployment ====== | ||
+ | |||
+ | Running AI Models Locally with Docker and Spring AI | ||
+ | https:// | ||
+ | |||
+ | |||
+ | |||
+ | === AI Gemma 3 === | ||
+ | |||
+ | https:// | ||
+ | |||
+ | https:// | ||
+ | |||
+ | |||
+ | === Gemini CLI === | ||
+ | |||
+ | https:// | ||
+ | |||
+ | https:// | ||
+ | |||
+ | <sxh shell> | ||
+ | winget install -e --id OpenJS.NodeJS | ||
+ | |||
+ | npm install -g @google/ | ||
+ | |||
+ | npm upgrade -g @google/ | ||
+ | |||
+ | # start the Gemini CLI | ||
+ | gemini | ||
+ | </ | ||
+ | |||
+ | |||
+ | === Docker Model Runner === | ||
+ | |||
+ | https:// | ||
+ | |||
+ | |||
+ | === Spring AI === | ||
+ | |||
+ | https:// | ||
+ | |||
+ | |||
+ | |||
+ | ===== Interfacing with the AI model ===== | ||
+ | |||
+ | === MCP - Model Context Protocol === | ||
+ | |||
+ | |||
+ | https:// | ||
+ | |||
+ | |||
+ | {{https:// |
ai.1704872827.txt.gz · Last modified: by skipidar