How to Build an AI Agent with Python: Complete Step-by-Step Guide (2026)
Stop following outdated ChatGPT wrapper tutorials. This is how you build an AI agent that can actually think, use tools, remember context, and take action, using nothing but Python and packages you can pip install.
What You'll Build
By the end of this guide, you'll have a Python AI agent that can:
- Reason about tasks and break them into steps
- Use tools — web search, file operations, API calls, database queries
- Remember past interactions and learned context
- Self-correct when something fails
- Run autonomously on a schedule or trigger
Not a chatbot. Not a wrapper. An actual agent that does work while you sleep.
The Agent Architecture (Keep It Simple)
Most tutorials overcomplicate this. An AI agent has exactly four components:
- Brain — The LLM that reasons and decides (Claude, GPT-4, etc.)
- Tools — Functions the agent can call (search, calculate, read files)
- Memory — Context from past runs and conversations
- Loop — The think → act → observe cycle that drives everything
That's the whole thing. Every framework — LangChain, CrewAI, AutoGen — is just a fancy wrapper around these four pieces. We're going to build it from scratch so you actually understand what's happening.
# The entire agent loop in 15 lines
while not task_complete:
    # 1. Think — send context + tools to LLM
    response = llm.chat(messages, tools=available_tools)

    # 2. Act — if the LLM wants to use a tool, run it
    if response.tool_calls:
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            messages.append(tool_result(result))

    # 3. Observe — check if we're done
    if response.stop_reason == "end_turn":
        task_complete = True
        final_answer = response.content
Everything else is just making this loop smarter, safer, and more capable.
1 Project Setup
Create your project structure:
mkdir my-agent && cd my-agent
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install anthropic python-dotenv requests beautifulsoup4 schedule
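The Dockerfile in Step 8 installs dependencies from a requirements.txt, so it's worth creating one now that mirrors this install (schedule is only used by the scheduler in Step 7):

# requirements.txt
anthropic
python-dotenv
requests
beautifulsoup4
schedule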
Project structure:
my-agent/
├── agent.py # Core agent loop
├── tools.py # Tool definitions
├── memory.py # Memory system
├── config.py # Settings
├── .env # API keys
└── memory/ # Persistent memory storage
    └── agent_memory.json
Your .env file:
ANTHROPIC_API_KEY=sk-ant-...
# or
OPENAI_API_KEY=sk-...
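The project structure lists a config.py. The code in this guide reads its settings straight from the environment, so the file is optional; if you want one place for knobs, a minimal sketch could look like this (the variable names below are just suggestions, not something the rest of the guide depends on):

# config.py
# Optional central settings (sketch); everything below just wraps environment variables
import os
from dotenv import load_dotenv

load_dotenv()  # read .env once, at import time

ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY", "")
MODEL = os.getenv("AGENT_MODEL", "claude-sonnet-4-20250514")
MAX_ITERATIONS = int(os.getenv("AGENT_MAX_ITERATIONS", "10"))
MEMORY_DIR = os.getenv("AGENT_MEMORY_DIR", "memory")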
2 Build the Tool System
Tools are what separate an agent from a chatbot. Here's how to build a clean, extensible tool system:
# tools.py
import json
import requests
from bs4 import BeautifulSoup
from datetime import datetime

# Tool registry — add new tools here
TOOLS = {}

def tool(name: str, description: str, parameters: dict):
    """Decorator to register a function as an agent tool."""
    def decorator(func):
        TOOLS[name] = {
            "function": func,
            "schema": {
                "name": name,
                "description": description,
                "input_schema": {
                    "type": "object",
                    "properties": parameters,
                    "required": list(parameters.keys())
                }
            }
        }
        return func
    return decorator

@tool(
    name="web_search",
    description="Search the web and return top results with snippets.",
    parameters={
        "query": {"type": "string", "description": "Search query"}
    }
)
def web_search(query: str) -> str:
    """Search using DuckDuckGo (no API key needed)."""
    url = "https://html.duckduckgo.com/html/"
    resp = requests.post(url, data={"q": query}, headers={
        "User-Agent": "Mozilla/5.0"
    })
    soup = BeautifulSoup(resp.text, "html.parser")
    results = []
    for r in soup.select(".result")[:5]:
        title = r.select_one(".result__title")
        snippet = r.select_one(".result__snippet")
        if title and snippet:
            results.append(f"**{title.get_text(strip=True)}**\n{snippet.get_text(strip=True)}")
    return "\n\n".join(results) if results else "No results found."

@tool(
    name="read_file",
    description="Read the contents of a local file.",
    parameters={
        "path": {"type": "string", "description": "File path to read"}
    }
)
def read_file(path: str) -> str:
    try:
        with open(path, "r") as f:
            content = f.read()
            return content[:10000]  # Truncate for safety
    except Exception as e:
        return f"Error reading file: {e}"

@tool(
    name="write_file",
    description="Write content to a file. Creates the file if it doesn't exist.",
    parameters={
        "path": {"type": "string", "description": "File path"},
        "content": {"type": "string", "description": "Content to write"}
    }
)
def write_file(path: str, content: str) -> str:
    try:
        with open(path, "w") as f:
            f.write(content)
        return f"Successfully wrote {len(content)} chars to {path}"
    except Exception as e:
        return f"Error writing file: {e}"

@tool(
    name="run_python",
    description="Execute Python code and return the output. Use for calculations, data processing, etc.",
    parameters={
        "code": {"type": "string", "description": "Python code to execute"}
    }
)
def run_python(code: str) -> str:
    # Note: exec() runs whatever the model sends, with full access to this process.
    # Fine for local experiments; sandbox or remove this tool before exposing the agent.
    import io, contextlib
    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            exec(code, {"__builtins__": __builtins__})
        result = output.getvalue()
        return result if result else "Code executed successfully (no output)."
    except Exception as e:
        return f"Error: {e}"

@tool(
    name="get_current_time",
    description="Get the current date and time.",
    parameters={}
)
def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def execute_tool(name: str, arguments: dict) -> str:
    """Execute a registered tool by name."""
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    try:
        func = TOOLS[name]["function"]
        return func(**arguments)
    except Exception as e:
        return f"Tool error ({name}): {e}"

def get_tool_schemas() -> list:
    """Get all tool schemas for the API call."""
    return [t["schema"] for t in TOOLS.values()]
To add a new capability, write a plain function and decorate it with @tool(...). Your agent immediately gets access to it, no rewiring needed.
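For example, a hypothetical fetch_url tool could be added to tools.py like this (the name and behavior are purely illustrative; requests and the @tool decorator are already in scope there):

@tool(
    name="fetch_url",
    description="Fetch a URL and return the raw response text (truncated).",
    parameters={
        "url": {"type": "string", "description": "Full URL to fetch"}
    }
)
def fetch_url(url: str) -> str:
    try:
        resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=15)
        return resp.text[:10000]  # Truncate, same as read_file
    except Exception as e:
        return f"Error fetching {url}: {e}"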
3 Build the Memory System
An agent without memory is just a very expensive function call. Here's a simple but effective memory system:
# memory.py
import json
import os
from datetime import datetime

MEMORY_DIR = "memory"
MEMORY_FILE = os.path.join(MEMORY_DIR, "agent_memory.json")

def _ensure_dir():
    os.makedirs(MEMORY_DIR, exist_ok=True)

def load_memory() -> dict:
    """Load the agent's persistent memory."""
    _ensure_dir()
    if os.path.exists(MEMORY_FILE):
        with open(MEMORY_FILE, "r") as f:
            return json.load(f)
    return {"facts": [], "conversations": [], "tasks": []}

def save_memory(memory: dict):
    """Save the agent's memory to disk."""
    _ensure_dir()
    with open(MEMORY_FILE, "w") as f:
        json.dump(memory, f, indent=2, default=str)

def add_fact(memory: dict, fact: str):
    """Store a learned fact."""
    memory["facts"].append({
        "fact": fact,
        "learned_at": datetime.now().isoformat()
    })
    # Keep last 100 facts
    memory["facts"] = memory["facts"][-100:]
    save_memory(memory)

def add_conversation(memory: dict, task: str, result: str):
    """Store a conversation summary."""
    memory["conversations"].append({
        "task": task,
        "result": result[:500],
        "timestamp": datetime.now().isoformat()
    })
    memory["conversations"] = memory["conversations"][-50:]
    save_memory(memory)

def get_context(memory: dict, max_items: int = 10) -> str:
    """Build a context string from memory for the agent."""
    parts = []
    if memory["facts"]:
        recent_facts = memory["facts"][-max_items:]
        parts.append("Known facts:\n" + "\n".join(
            f"- {f['fact']}" for f in recent_facts
        ))
    if memory["conversations"]:
        recent = memory["conversations"][-5:]
        parts.append("Recent tasks:\n" + "\n".join(
            f"- {c['task']} → {c['result'][:100]}" for c in recent
        ))
    return "\n\n".join(parts) if parts else "No prior context."
This is intentionally simple. You can upgrade to vector search (ChromaDB, Pinecone) later, but for most agents, JSON + recency is all you need.
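When you do outgrow recency, the swap is small. A rough sketch of what ChromaDB-backed fact recall could look like (assumes pip install chromadb; the file name, collection name, and function names are illustrative, and ChromaDB's default embedding function handles the embeddings):

# memory_vector.py
# Optional upgrade sketch; nothing else in this guide requires it
import chromadb

client = chromadb.PersistentClient(path="memory/chroma")
facts = client.get_or_create_collection("agent_facts")

def add_fact_vector(fact: str, fact_id: str):
    # ChromaDB embeds the document for you
    facts.add(documents=[fact], ids=[fact_id])

def recall(query: str, n: int = 5) -> list[str]:
    # Return the facts most semantically similar to the current task
    res = facts.query(query_texts=[query], n_results=n)
    return res["documents"][0] if res["documents"] else []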
4 Build the Agent Core
This is the brain. The main loop that ties everything together:
# agent.py
import anthropic
import json
from dotenv import load_dotenv
from tools import execute_tool, get_tool_schemas
from memory import load_memory, add_conversation, get_context

load_dotenv()
client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a capable AI agent. You can think step-by-step,
use tools to gather information and take actions, and remember context
from previous interactions.

Rules:
1. Think before acting. Break complex tasks into steps.
2. Use tools when you need real data — don't make things up.
3. If a tool fails, try a different approach.
4. Be concise in your responses.
5. If you learn something important, mention it so it can be saved.

{memory_context}"""

def run_agent(task: str, max_iterations: int = 10) -> str:
    """Run the agent on a task until completion."""
    memory = load_memory()
    context = get_context(memory)

    messages = [{"role": "user", "content": task}]
    system = SYSTEM_PROMPT.format(memory_context=context)
    tools = get_tool_schemas()

    for i in range(max_iterations):
        # Call the LLM
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages
        )

        # Collect text and tool use blocks
        assistant_text = ""
        tool_calls = []
        for block in response.content:
            if block.type == "text":
                assistant_text += block.text
            elif block.type == "tool_use":
                tool_calls.append(block)

        # Add assistant message to history
        messages.append({"role": "assistant", "content": response.content})

        # If no tool calls, we're done
        if not tool_calls:
            add_conversation(memory, task, assistant_text)
            return assistant_text

        # Execute each tool call
        tool_results = []
        for tc in tool_calls:
            print(f"  🔧 Using tool: {tc.name}({json.dumps(tc.input)[:100]})")
            result = execute_tool(tc.name, tc.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": str(result)
            })
        messages.append({"role": "user", "content": tool_results})

    return "Agent reached max iterations without completing the task."

if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        task = " ".join(sys.argv[1:])
    else:
        task = input("What should I do? → ")

    print(f"\n🤖 Working on: {task}\n")
    result = run_agent(task)
    print(f"\n✅ Result:\n{result}")
That's it. That's a production-capable AI agent in ~80 lines. Run it:
python agent.py "Research the top 3 Python web frameworks and write a comparison to frameworks.md"
The agent will search the web, read results, reason about them, and write a file — autonomously.
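One loose end: memory.py defines add_fact(), but nothing calls it yet. A simple way to close the loop, sketched under the assumption that you add a line like "Prefix any fact worth remembering with REMEMBER:" to the system prompt, is to parse those markers out of the final answer:

# Sketch: persist facts the agent flags in its final answer
from memory import add_fact, load_memory

def save_flagged_facts(final_answer: str):
    memory = load_memory()
    for line in final_answer.splitlines():
        if line.strip().startswith("REMEMBER:"):
            add_fact(memory, line.strip().removeprefix("REMEMBER:").strip())

# Call it right after run_agent() returns:
#   result = run_agent(task)
#   save_flagged_facts(result)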
🛠️ Want the Complete Agent Blueprint?
The AI Employee Playbook (€29) includes production-ready agent templates, advanced tool patterns, deployment configs, monitoring setup, and the exact system prompts we use in production.
Get the Playbook — €29
5 Add Error Handling & Retries
Production agents need to handle failures gracefully. Here's the pattern:
# Add to agent.py
import time

def run_agent_safe(task: str, max_retries: int = 3) -> str:
    """Run agent with automatic retry on failure."""
    for attempt in range(max_retries):
        try:
            return run_agent(task)
        except anthropic.RateLimitError:
            wait = 2 ** attempt * 10
            print(f"  ⏳ Rate limited, waiting {wait}s...")
            time.sleep(wait)
        except anthropic.APIError as e:
            print(f"  ❌ API error (attempt {attempt+1}): {e}")
            if attempt == max_retries - 1:
                return f"Failed after {max_retries} attempts: {e}"
            time.sleep(5)
        except Exception as e:
            return f"Unexpected error: {e}"
    return "Max retries exceeded."
6 Advanced: Multi-Tool Chains
The real power comes when your agent chains tools together. Here's an example that researches a topic, writes a report, and saves it:
# Example: autonomous research agent
result = run_agent("""
Research the current state of electric truck adoption in Europe.
Steps:
1. Search for the latest data and news
2. Find specific numbers: market share, growth rates, key manufacturers
3. Write a concise 500-word report with sources
4. Save it to research/ev-trucks-europe.md
""")
# The agent will:
# → web_search("electric truck adoption Europe 2026")
# → web_search("electric truck market share Europe statistics")
# → web_search("electric truck manufacturers Europe sales data")
# → write_file("research/ev-trucks-europe.md", "...")
7 Advanced: Scheduled Agent Runs
An agent that only runs when you ask isn't truly autonomous. Here's how to schedule it:
# scheduler.py
# Requires the `schedule` package (pip install schedule)
import schedule
import time
from agent import run_agent_safe

def morning_briefing():
    """Run every morning at 8 AM."""
    result = run_agent_safe("""
    Create my morning briefing:
    1. Search for today's top AI news
    2. Check if there are any new Python security advisories
    3. Write a brief summary to briefings/today.md
    """)
    print(f"Morning briefing: {result[:200]}")

def weekly_report():
    """Run every Friday at 5 PM."""
    result = run_agent_safe("""
    Create a weekly summary:
    1. Read all files in briefings/ from this week
    2. Synthesize into key themes and trends
    3. Write to reports/weekly-summary.md
    """)
    print(f"Weekly report: {result[:200]}")

schedule.every().day.at("08:00").do(morning_briefing)
schedule.every().friday.at("17:00").do(weekly_report)

print("🤖 Agent scheduler running...")
while True:
    schedule.run_pending()
    time.sleep(60)
8 Deploy to Production
Three options, from simplest to most robust:
Option A: Simple Server (VPS/Cloud VM)
# Use systemd to keep it running
# /etc/systemd/system/my-agent.service
[Unit]
Description=AI Agent
After=network.target
[Service]
Type=simple
User=agent
WorkingDirectory=/home/agent/my-agent
ExecStart=/home/agent/my-agent/venv/bin/python scheduler.py
Restart=always
RestartSec=10
Environment=ANTHROPIC_API_KEY=sk-ant-...
[Install]
WantedBy=multi-user.target
sudo systemctl enable my-agent
sudo systemctl start my-agent
Option B: Docker Container
# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "scheduler.py"]
docker build -t my-agent .
docker run -d --restart always \
-e ANTHROPIC_API_KEY=sk-ant-... \
--name agent my-agent
Option C: Serverless (AWS Lambda / Cloud Functions)
# For event-driven agents (webhook triggers, scheduled tasks)
# Deploy with AWS SAM, Serverless Framework, or Google Cloud Functions
# Best for: agents that respond to events, not continuous runners
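The handler itself stays thin. A minimal sketch for an event-driven run (the task event field is illustrative, the agent code and its dependencies must be bundled into the deployment package, and Lambda's filesystem is read-only outside /tmp, so point file tools and memory there):

# lambda_handler.py
from agent import run_agent_safe

def handler(event, context):
    # Triggered by an EventBridge schedule or an API Gateway webhook
    task = event.get("task", "Create my morning briefing and save it to /tmp/briefings/today.md")
    result = run_agent_safe(task)
    return {"statusCode": 200, "body": result[:1000]}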
Common Patterns That Actually Work
Pattern 1: The Guardrail Pattern
Prevent your agent from going off the rails:
# Add to your system prompt:
GUARDRAILS = """
NEVER do these things:
- Delete files outside the working directory
- Make network requests to internal IPs
- Execute system commands (rm, kill, etc.)
- Spend more than $0.50 on API calls per task
If unsure about an action, stop and explain what you want to do.
"""
Pattern 2: The Reflection Pattern
Make your agent check its own work:
# After the agent completes a task, add:
review = run_agent(f"""
Review this output for accuracy and completeness:
Task: {original_task}
Output: {result}
Check for: factual errors, missing information,
formatting issues. Suggest improvements.
""")
Pattern 3: The Fallback Chain
Use cheaper models for simple tasks, expensive models for hard ones:
MODELS = {
    "fast": "claude-3-5-haiku-20241022",
    "balanced": "claude-sonnet-4-20250514",
    "powerful": "claude-opus-4-20250514"
}

def smart_route(task: str) -> str:
    """Route to the right model based on complexity."""
    # Assumes run_agent() from Step 4 is extended to accept a `model` argument
    # and pass it through to client.messages.create()
    # Try fast model first
    result = run_agent(task, model=MODELS["fast"])
    if "I'm not sure" in result or "I cannot" in result:
        # Escalate to balanced
        result = run_agent(task, model=MODELS["balanced"])
    return result
Tool Comparison: Framework vs. Bare Python
| Aspect | Bare Python (This Guide) | LangChain | CrewAI |
|---|---|---|---|
| Setup Time | 10 minutes | 30 minutes | 20 minutes |
| Lines of Code | ~200 | ~150 | ~100 |
| Understanding | You know every line | Framework magic | Framework magic |
| Debugging | Easy — it's your code | Hard — deep abstractions | Medium |
| Flexibility | Total control | High (complex API) | Medium (opinionated) |
| Multi-agent | DIY (simple) | LangGraph add-on | Built-in |
| Best For | Learning, custom agents | Complex pipelines | Team of agents |
"Start bare. Adopt a framework only when your custom code starts duplicating what the framework provides. For most people, that day never comes."
7 Mistakes That Kill Python Agents
- No max iterations — Your agent runs forever, burns $50 in API calls. Always set a cap.
- Trusting tool output blindly — Web search returns garbage sometimes. Add validation.
- Huge context windows — Cramming everything into the prompt. Be selective with memory.
- No error handling — One API timeout crashes everything. Use retries.
- Too many tools — More than 10-12 tools confuses the LLM. Keep it focused.
- Skipping logging — When your agent does something weird at 3 AM, you need logs (see the sketch after this list).
- Building everything at once — Start with 2 tools and one task. Expand when it works.
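Mistake 6 is the cheapest one to fix. A minimal logging setup for the agent loop, as a sketch (the file name and the example log lines are illustrative):

# Add near the top of agent.py
import logging

logging.basicConfig(
    filename="agent.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("agent")

# Then, inside run_agent():
#   log.info("iteration %d: %d tool call(s)", i, len(tool_calls))
#   log.info("tool %s args=%s", tc.name, json.dumps(tc.input)[:200])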
60-Minute Quickstart: Research Agent
Copy-paste this and have a working agent in under an hour:
- 0-5 min: Create project, install deps, add API key to .env
- 5-20 min: Copy tools.py from Step 2 above
- 20-35 min: Copy memory.py from Step 3
- 35-50 min: Copy agent.py from Step 4
- 50-60 min: Test with 3 different tasks, add one custom tool
That's it. You now have a functioning AI agent that can search the web, read/write files, run Python code, and remember past interactions.
What's Next
Once your basic agent works, here's the upgrade path:
- Week 2: Add domain-specific tools (database queries, API integrations, email)
- Week 3: Implement structured output with Pydantic models (see the sketch after this list)
- Week 4: Add vector memory with ChromaDB for semantic search
- Month 2: Build a second agent and have them collaborate
- Month 3: Deploy with monitoring, alerts, and cost tracking
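For the Week 3 item, the idea is to make the agent's final answer machine-checkable. A sketch assuming Pydantic v2 and an agent prompted to reply with bare JSON (the Report fields are illustrative):

# Sketch: validate agent output with Pydantic
from pydantic import BaseModel
from agent import run_agent

class Report(BaseModel):
    title: str
    summary: str
    sources: list[str]

raw = run_agent("Research X. Respond with only a JSON object with keys: title, summary, sources.")
report = Report.model_validate_json(raw)  # raises ValidationError if the JSON is missing or malformed
print(report.title)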
⚡ Skip the Learning Curve
The AI Employee Playbook (€29) gives you production-ready Python agent templates, 15+ tool implementations, deployment scripts, monitoring dashboards, and the exact patterns we use to run agents 24/7 in real businesses.
Get the Playbook — €29