Building Your First AI Agent in 2026 — A Practical Getting Started Guide
Skip the theory. Here's how to build a working AI agent in 2026 using the Anthropic Agent SDK, MCP for tool integration, and the architectural patterns that teams are actually shipping in production.
Published by GitIntel Research
TLDR
- An AI agent is a model + tools + a loop: the model decides what to do, tools let it act on the world, and the loop runs until the task is done.
- The Anthropic Agent SDK is the current production standard for Claude-based agents. It manages context, tool calls, and multi-turn reasoning.
- MCP handles tool integration: instead of writing API adapters, you connect pre-built MCP servers for GitHub, databases, Slack, and hundreds more.
- Three architectural patterns cover 90% of production agent use cases: single-agent with tools, multi-agent with handoffs, and background agents with checkpointing.
What "Agent" Actually Means
The word is overloaded. An AI agent, practically speaking, is a model that can take actions — not just generate text. It reads inputs, reasons about what to do, calls tools to act on the world (query a database, write a file, make an API call), observes the results, and continues until the task is complete or it determines it needs more information.
The minimal components:
- A model that can reason and decide (Claude, GPT-4, Gemini)
- Tools the model can call (functions with typed inputs/outputs)
- A loop that runs the model, executes tool calls, feeds results back, repeats
That's it. The complexity comes from what you do with these three things: how you design the tools, how you handle errors and retries, how you maintain context across multi-step tasks, how you coordinate multiple agents working in parallel.
In 2026, the tooling for all of this has matured considerably. Here's how to build something that works.
Part 1: Your First Agent in 30 Lines
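We'll build a research agent that can look up information and write a summary. The Anthropic Agent SDK can run the loop for you; the example below uses the Messages API from the anthropic Python package and writes the loop by hand, so every step is visible. You define the tools and the goal.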
import anthropic

client = anthropic.Anthropic()

# Define tools the agent can use
tools = [
    {
        "name": "search_web",
        "description": "Search the web for current information on a topic",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "write_file",
        "description": "Write content to a file",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["filename", "content"]
        }
    }
]
def run_agent(goal: str) -> str:
    messages = [{"role": "user", "content": goal}]
    while True:
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )
        # If the model is done, return all of its text (a reply can hold several blocks)
        if response.stop_reason == "end_turn":
            return "".join(b.text for b in response.content if b.type == "text")
        # Any other stop reason (e.g. "max_tokens") would loop forever, so fail loudly
        if response.stop_reason != "tool_use":
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
        # Process tool calls
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })
        # Feed results back to the model
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
def execute_tool(name: str, inputs: dict) -> str:
    if name == "search_web":
        # search_implementation is a placeholder: plug in your own search backend
        return search_implementation(inputs["query"])
    elif name == "write_file":
        with open(inputs["filename"], "w") as f:
            f.write(inputs["content"])
        return f"Wrote {len(inputs['content'])} chars to {inputs['filename']}"
    # Name the tool so the model can see which call was rejected
    return f"Unknown tool: {name}"
# Run it
result = run_agent(
    "Research the current state of MCP server adoption and write a 300-word summary to summary.md"
)
This is the minimal working agent. The model calls search_web, reads the results, may call it again with different queries, then calls write_file to produce the output. The loop exits when stop_reason == "end_turn".
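Because the loop only exits when the model stops asking for tools, it is worth capping the number of turns. A minimal sketch of such a cap, assuming a hypothetical step callable that stands in for one model call plus tool execution and returns a (done, value) pair; neither name comes from the SDK.

```python
from typing import Callable, Tuple


def run_with_turn_cap(step: Callable[[int], Tuple[bool, str]], max_turns: int = 10) -> str:
    """Call step(turn) until it reports done, or give up after max_turns.

    `step` is a hypothetical stand-in for "one model call plus tool execution".
    """
    for turn in range(max_turns):
        done, value = step(turn)
        if done:
            return value
    raise RuntimeError(f"Agent did not finish within {max_turns} turns")


# Stub step: pretends the model needs three tool turns before answering
def fake_step(turn: int) -> Tuple[bool, str]:
    if turn < 3:
        return False, ""
    return True, "summary written"


print(run_with_turn_cap(fake_step))
```

In a real agent the cap would wrap the body of the while loop, and the error message is a natural place to include the last tool call for debugging.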
Part 2: Adding Real Tools with MCP
Writing custom tool implementations for every API is tedious. MCP solves this by letting you connect pre-built servers that expose tools the agent can use directly.
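import os

import anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_agent_with_mcp(goal: str):
    # Connect to MCP servers
    async with stdio_client(
        StdioServerParameters(
            command="npx",
            args=["-y", "@modelcontextprotocol/server-github"],
            # The child process gets a restricted environment, so pass the token explicitly
            env={"GITHUB_PERSONAL_ACCESS_TOKEN": os.environ["GITHUB_PERSONAL_ACCESS_TOKEN"]}
        )
    ) as (read, write):
        async with ClientSession(read, write) as session:
            # The MCP handshake must complete before any other request
            await session.initialize()
            # Get available tools from the MCP server
            tools_response = await session.list_tools()
            mcp_tools = [
                {
                    "name": tool.name,
                    "description": tool.description,
                    "input_schema": tool.inputSchema
                }
                for tool in tools_response.tools
            ]
            # Async client, so model calls do not block the event loop
            client = anthropic.AsyncAnthropic()
            messages = [{"role": "user", "content": goal}]
            while True:
                response = await client.messages.create(
                    model="claude-opus-4-5",
                    max_tokens=4096,
                    tools=mcp_tools,
                    messages=messages
                )
                if response.stop_reason == "end_turn":
                    return "".join(b.text for b in response.content if b.type == "text")
                # Any other stop reason (e.g. "max_tokens") would loop forever
                if response.stop_reason != "tool_use":
                    raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        # Delegate to MCP server
                        result = await session.call_tool(
                            block.name,
                            arguments=block.input
                        )
                        # result.content is a list of content blocks; keep the text ones
                        text = "\n".join(c.text for c in result.content if hasattr(c, "text"))
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": text,
                            "is_error": bool(result.isError)
                        })
                messages.append({"role": "assistant", "content": response.content})
                messages.append({"role": "user", "content": tool_results})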
Now your agent can call GitHub APIs — create issues, read PRs, fetch file contents — without you writing a single API adapter. Connect different MCP servers for different tool sets: Postgres for database access, Slack for notifications, Playwright for browser automation.
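With several MCP servers connected at once, the loop has to know which session owns each tool name. One way to handle that is a routing table built when the tool lists are fetched. The sketch below uses plain lists of names in place of live sessions; the build_tool_router helper, the server labels, and the tool names are hypothetical, not part of the MCP SDK.

```python
from typing import Dict, List


def build_tool_router(tools_by_server: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each tool name to the server that exposes it.

    Raises on duplicate names so two servers cannot silently shadow each other.
    """
    router: Dict[str, str] = {}
    for server, tool_names in tools_by_server.items():
        for name in tool_names:
            if name in router:
                raise ValueError(
                    f"Tool {name!r} is exposed by both {router[name]!r} and {server!r}"
                )
            router[name] = server
    return router


# Hypothetical server labels and tool names, for illustration only
router = build_tool_router({
    "github": ["create_issue", "get_file_contents"],
    "postgres": ["query"],
})
print(router["query"])
```

In the agent loop, the value looked up for block.name would be the ClientSession to call rather than a label.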
Part 3: The Three Production Patterns
Pattern 1: Single Agent with Tools (80% of use cases)
The pattern above. One agent, a set of tools, a loop. Use this for:
- Research and summarization tasks
- Code generation with file writing
- Data analysis pipelines
- Content creation workflows
The key design decision: how many tools to give the agent. More tools = more flexibility, but the model's tool selection quality degrades as the list grows. Keep the tool set focused on the task. A research agent doesn't need file system access to production directories.
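One simple way to keep a tool set focused is an explicit allowlist per agent. A minimal sketch, assuming tool definitions are dictionaries with a name key as in Part 1; select_tools and the sample list are hypothetical.

```python
from typing import List


def select_tools(all_tools: List[dict], allowed_names: List[str]) -> List[dict]:
    """Return only the tool definitions whose name is on the allowlist."""
    allowed = set(allowed_names)
    missing = allowed - {tool["name"] for tool in all_tools}
    if missing:
        # Fail early on typos instead of shipping an agent with a missing tool
        raise KeyError(f"Unknown tools requested: {sorted(missing)}")
    return [tool for tool in all_tools if tool["name"] in allowed]


all_tools = [
    {"name": "search_web", "description": "Search the web"},
    {"name": "write_file", "description": "Write content to a file"},
    {"name": "delete_file", "description": "Delete a file"},
]

# The research agent never sees delete_file
research_tools = select_tools(all_tools, ["search_web", "write_file"])
print([tool["name"] for tool in research_tools])
```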
Pattern 2: Multi-Agent with Handoffs
One orchestrator agent breaks a task into subtasks and hands them to specialist agents. Each specialist has a narrower tool set and a more focused prompt.
def orchestrate(task: str) -> str:
    # Orchestrator breaks task into steps
    plan = planning_agent(task)
    results = []
    for step in plan.steps:
        if step.type == "research":
            result = research_agent(step.description)
        elif step.type == "code":
            result = coding_agent(step.description)
        elif step.type == "review":
            result = review_agent(step.description, results)
        else:
            # Without this branch an unknown step would reuse the previous result
            raise ValueError(f"Unknown step type: {step.type}")
        results.append(result)
    # Synthesizer combines outputs
    return synthesis_agent(results, original_task=task)
Use this for:
- Tasks with clearly separable subtasks (research → write → review)
- Workflows where different steps need different expertise
- Long-running tasks that would exhaust a single agent's context
The cost: each handoff adds latency and token overhead. Orchestrator-specialist patterns typically cost 3-5× more than a single agent for equivalent tasks. Use them when task complexity justifies it, not by default.
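The handoff logic itself can be tested without any model calls. Here is a runnable sketch of the dispatch step with stub functions in place of real specialists; Step, SPECIALISTS, run_plan, and the stub agents are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    type: str
    description: str


# Stub specialists; real ones would each run their own model loop
def research_agent(description: str, prior: List[str]) -> str:
    return f"notes on {description}"


def review_agent(description: str, prior: List[str]) -> str:
    return f"reviewed {len(prior)} earlier result(s)"


SPECIALISTS: Dict[str, Callable[[str, List[str]], str]] = {
    "research": research_agent,
    "review": review_agent,
}


def run_plan(steps: List[Step]) -> List[str]:
    """Run each step with its specialist, passing earlier results along."""
    results: List[str] = []
    for step in steps:
        handler = SPECIALISTS.get(step.type)
        if handler is None:
            raise ValueError(f"No specialist for step type {step.type!r}")
        results.append(handler(step.description, results))
    return results


print(run_plan([Step("research", "MCP adoption"), Step("review", "check notes")]))
```

A dictionary keyed by step type keeps the orchestrator unchanged when a new specialist is added.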
Pattern 3: Background Agents with Checkpointing
Agents that run for minutes or hours, saving state at checkpoints and resuming after interruptions. This is where Anthropic's managed agents infrastructure shines.
import anthropic

client = anthropic.Anthropic()

# Create a long-running agent session
# Illustrative call shape: check the current SDK reference for the exact session API.
# pr_review_tools, pending_prs, and post_review_comment are placeholders you supply.
session = client.beta.sessions.create(
    model="claude-opus-4-5",
    system="You are a code review agent. Review PRs and post detailed comments.",
    tools=pr_review_tools
)
# The session persists across network interruptions and model calls
# Agent state is managed server-side
for pr in pending_prs:
    result = session.run(
        f"Review PR #{pr.number}: {pr.title}\n\nDiff:\n{pr.diff}"
    )
    post_review_comment(pr.number, result)
Use this for:
- CI/CD pipeline agents that run for 5-30 minutes
- Batch processing workflows (reviewing 100 PRs, analyzing 1000 files)
- Any agent task where you can't guarantee the calling process stays alive
The Anthropic Agent SDK's session management handles context window limits automatically (sliding window or summarization), retries transient failures, and provides resumable sessions if the calling process dies mid-task.
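Where a managed session is not an option, the same resume-after-interruption idea can be approximated locally for batch work. A minimal sketch, assuming work items have string IDs and progress lives in a JSON file; process_batch, the handle callback, and the checkpoint layout are hypothetical.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List


def process_batch(items: List[str], handle: Callable[[str], str], checkpoint_path: str) -> Dict[str, str]:
    """Process items in order, persisting results after each one.

    On restart, items already present in the checkpoint file are skipped.
    """
    done: Dict[str, str] = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for item in items:
        if item in done:
            continue
        done[item] = handle(item)
        # Write to a temp file, then rename, so a crash never leaves a half-written checkpoint
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(done, f)
        os.replace(tmp_path, checkpoint_path)
    return done


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
calls: List[str] = []


def review(pr_id: str) -> str:
    calls.append(pr_id)
    return f"reviewed {pr_id}"


process_batch(["pr-1", "pr-2"], review, path)
process_batch(["pr-1", "pr-2", "pr-3"], review, path)  # second run only handles pr-3
print(calls)
```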
Part 4: Error Handling That Actually Works
The agents that fail in production almost always fail on error handling. Three patterns that matter:
Retry with exponential backoff for transient failures:
import time

import anthropic

def call_with_retry(fn, max_attempts=3, base_delay=1.0):
    for attempt in range(max_attempts):
        try:
            return fn()
        except anthropic.RateLimitError:
            # Out of attempts: let the caller see the original error
            if attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt: 1s, 2s, 4s, ...
            delay = base_delay * (2 ** attempt)
            print(f"Rate limited. Waiting {delay}s before retry {attempt + 1}/{max_attempts}")
            time.sleep(delay)
Validate tool inputs before execution:
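import os

ALLOWED_DIR = os.path.realpath("/allowed/directory")

def execute_tool_safely(name: str, inputs: dict) -> str:
    try:
        # Validate inputs before dangerous operations
        if name == "delete_file":
            # Resolve ".." and symlinks first: a plain prefix check would
            # accept "/allowed/directory/../../etc/passwd"
            resolved = os.path.realpath(inputs["path"])
            if os.path.commonpath([resolved, ALLOWED_DIR]) != ALLOWED_DIR:
                return "Error: path outside allowed directory"
        return execute_tool(name, inputs)
    except Exception as e:
        # Return errors to the model so it can handle them
        return f"Error executing {name}: {str(e)}"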
Returning errors to the model (rather than crashing) lets the agent reason about what went wrong and try a different approach. An agent that receives "Error: file not found at /src/main.py" can look for the correct path. An agent whose tool call raises an unhandled exception just stops.
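The same approach works for malformed tool inputs: check them against the tool's declared schema and hand the model a readable error. A small sketch that only checks the required list from the input_schema shown in Part 1; missing_required is a hypothetical helper, and a full validator would also check types.

```python
from typing import List


def missing_required(input_schema: dict, inputs: dict) -> List[str]:
    """Return the required field names that are absent from inputs."""
    return [name for name in input_schema.get("required", []) if name not in inputs]


write_file_schema = {
    "type": "object",
    "properties": {
        "filename": {"type": "string"},
        "content": {"type": "string"},
    },
    "required": ["filename", "content"],
}

# The model forgot to pass "content"
missing = missing_required(write_file_schema, {"filename": "summary.md"})
error = f"Error: missing required input(s): {', '.join(missing)}" if missing else ""
print(error)
```

Returning that string as the tool result gives the model enough detail to retry with the missing field filled in.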
Timeout long-running tasks:
import asyncio

async def run_with_timeout(agent_task, timeout_seconds=300):
    try:
        return await asyncio.wait_for(agent_task, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the task on timeout; nothing is saved here, so persist
        # any partial state yourself before reporting it
        return f"Task exceeded time limit of {timeout_seconds}s."
Part 5: What to Know Before You Ship
Context window management. Claude Opus has a 200K token context window. A long agent session with many tool call results will fill it. Plan for this: summarize older history, use structured checkpoints, or use the managed sessions API that handles this automatically.
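One cheap way to slow context growth is to shorten old tool results while leaving recent ones intact. A minimal sketch, assuming messages are plain dictionaries in the shape used in Part 1; shrink_old_tool_results and its defaults are hypothetical, and real summarization would keep more signal than truncation does.

```python
import copy
from typing import List


def shrink_old_tool_results(messages: List[dict], keep_last: int = 4, max_chars: int = 200) -> List[dict]:
    """Return a copy of messages with older tool results truncated.

    The last keep_last messages are left untouched. Every tool_result block
    stays in place, so tool_use/tool_result pairing is preserved.
    """
    trimmed = copy.deepcopy(messages)
    cutoff = max(len(trimmed) - keep_last, 0)
    for message in trimmed[:cutoff]:
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_result":
                text = block.get("content")
                if isinstance(text, str) and len(text) > max_chars:
                    block["content"] = text[:max_chars] + " [truncated]"
    return trimmed


history = [
    {"role": "user", "content": "Research MCP adoption"},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "t1", "name": "search_web", "input": {"query": "MCP"}}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "t1", "content": "x" * 1000}]},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "t2", "name": "search_web", "input": {"query": "MCP servers"}}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "t2", "content": "y" * 1000}]},
]
smaller = shrink_old_tool_results(history, keep_last=2, max_chars=50)
print(len(smaller[2]["content"][0]["content"]), len(smaller[4]["content"][0]["content"]))
```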
Cost grows with every turn. Each model call costs tokens, and each call resends the conversation so far. An agent that takes 20 tool call turns to complete a task therefore costs at least 20× as much as a single non-agentic call, and usually more as the history grows. Profile your agents before scaling. For batch workloads, caching intermediate results often reduces cost dramatically.
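A rough estimate of how input tokens accumulate over a run. The numbers are made up (1,000 tokens of initial prompt, 1,500 tokens of tool output and reasoning added per turn), and the sketch assumes no prompt caching and that every call resends the full history.

```python
def cumulative_input_tokens(turns: int, base_tokens: int, tokens_added_per_turn: int) -> int:
    """Total input tokens across all turns when every call resends the full history."""
    return sum(base_tokens + turn * tokens_added_per_turn for turn in range(turns))


single_call = cumulative_input_tokens(1, base_tokens=1_000, tokens_added_per_turn=1_500)
agent_run = cumulative_input_tokens(20, base_tokens=1_000, tokens_added_per_turn=1_500)
print(single_call, agent_run, agent_run / single_call)
```

Plugging in token counts measured from your own logs gives a more useful figure than these placeholders.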
AI-generated code in your agent's output needs the same review discipline as AI-generated code elsewhere. An agent that writes and commits code uses the same models that power AI coding assistants. The code will look right and sometimes be wrong in subtle ways. Don't skip review just because an agent generated it.
Log everything. Agent behavior is non-deterministic. An agent that worked correctly in testing may take different tool call paths in production due to slight variations in input. Log every tool call, every result, every model decision. This is what lets you debug production failures without reproducing them.
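A thin wrapper around tool execution is usually enough to get this log. The sketch below appends one JSON line per call to an in-memory list; logged_tool_call and the field names are hypothetical, and in production the sink would be a file or a log pipeline.

```python
import json
import time
from typing import Callable, List


def logged_tool_call(execute: Callable[[str, dict], str], name: str, inputs: dict, sink: List[str]) -> str:
    """Run a tool and append one JSON line describing the call to sink."""
    started = time.time()
    try:
        result = execute(name, inputs)
        status = "ok"
    except Exception as exc:
        # Record the failure and still return a string the model can read
        result = f"Error executing {name}: {exc}"
        status = "error"
    sink.append(json.dumps({
        "ts": started,
        "tool": name,
        "inputs": inputs,
        "status": status,
        "duration_s": round(time.time() - started, 3),
        "result_preview": result[:200],
    }))
    return result


log_lines: List[str] = []
logged_tool_call(lambda n, i: "3 results", "search_web", {"query": "MCP"}, log_lines)
print(json.loads(log_lines[0])["status"])
```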
The Repository Pattern
The repos worth studying if you want to build agents in 2026:
- OpenHands (42K stars): Full-featured AI software developer. Read the agent loop implementation.
- Claude Skills collection (127K stars): 2,300+ production agent patterns in SKILL.md files.
- LangGraph: State machine approach to agent orchestration. Strong for complex, branching workflows.
- AutoGPT: Older, but the code is well-documented. Good for understanding the fundamental agent loop.
The Anthropic Agent SDK documentation at docs.anthropic.com covers the production patterns. The computer_use example is particularly good for seeing how tool design affects agent behavior.
Track What Your Agents Are Committing
Agents that write code leave attribution markers in git. GitIntel surfaces them.
# Install
curl -fsSL https://gitintel.com/install.sh | sh
# See AI attribution in your repo
cd your-repo
gitintel scan
Open source (MIT) · Local-first · No data leaves your machine
Sources: Anthropic Agent SDK documentation; MCP SDK documentation (modelcontextprotocol.io); OpenHands GitHub (42K stars, Apache 2.0); Anthropic Managed Agents announcement Q1 2026; Claude API pricing as of April 2026.