How to Build an AI Agent with Python: Complete Step-by-Step Guide (2026)

Stop following outdated ChatGPT wrapper tutorials. This is how you build an AI agent that can actually think, use tools, remember context, and take action — using Python and nothing else you can't pip install.

~200 lines of code · $0.02 per task · ~60 minutes to production

What You'll Build

By the end of this guide, you'll have a Python AI agent that can:

  1. Search the web and read the results
  2. Read and write local files
  3. Run Python code for calculations and data processing
  4. Remember facts and past tasks between runs

Not a chatbot. Not a wrapper. An actual agent that does work while you sleep.

⚡ Prerequisites: Python 3.10+, basic understanding of APIs, an API key from Anthropic or OpenAI. That's it. No ML degree required.

The Agent Architecture (Keep It Simple)

Most tutorials overcomplicate this. An AI agent has exactly four components:

  1. Brain — The LLM that reasons and decides (Claude, GPT-4, etc.)
  2. Tools — Functions the agent can call (search, calculate, read files)
  3. Memory — Context from past runs and conversations
  4. Loop — The think → act → observe cycle that drives everything

That's the whole thing. Every framework — LangChain, CrewAI, AutoGen — is just a fancy wrapper around these four pieces. We're going to build it from scratch so you actually understand what's happening.

# The entire agent loop in 15 lines
while not task_complete:
    # 1. Think — send context + tools to LLM
    response = llm.chat(messages, tools=available_tools)

    # 2. Act — if the LLM wants to use a tool, run it
    if response.tool_calls:
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            messages.append(tool_result(result))

    # 3. Observe — check if we're done
    if response.stop_reason == "end_turn":
        task_complete = True
        final_answer = response.content

Everything else is just making this loop smarter, safer, and more capable.

1 Project Setup

Create your project structure:

mkdir my-agent && cd my-agent
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

pip install anthropic python-dotenv requests beautifulsoup4 schedule

Project structure:

my-agent/
├── agent.py          # Core agent loop
├── tools.py          # Tool definitions
├── memory.py         # Memory system
├── config.py         # Settings
├── .env              # API keys
└── memory/           # Persistent memory storage
    └── conversations.json

Your .env file:

ANTHROPIC_API_KEY=sk-ant-...
# or
OPENAI_API_KEY=sk-...

2 Build the Tool System

Tools are what separate an agent from a chatbot. Here's how to build a clean, extensible tool system:

# tools.py
import json
import requests
from bs4 import BeautifulSoup
from datetime import datetime

# Tool registry — add new tools here
TOOLS = {}

def tool(name: str, description: str, parameters: dict):
    """Decorator to register a function as an agent tool."""
    def decorator(func):
        TOOLS[name] = {
            "function": func,
            "schema": {
                "name": name,
                "description": description,
                "input_schema": {
                    "type": "object",
                    "properties": parameters,
                    "required": list(parameters.keys())
                }
            }
        }
        return func
    return decorator

@tool(
    name="web_search",
    description="Search the web and return top results with snippets.",
    parameters={
        "query": {"type": "string", "description": "Search query"}
    }
)
def web_search(query: str) -> str:
    """Search using DuckDuckGo (no API key needed)."""
    url = "https://html.duckduckgo.com/html/"
    resp = requests.post(
        url,
        data={"q": query},
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=15,  # don't let a slow network hang the agent loop
    )
    soup = BeautifulSoup(resp.text, "html.parser")
    results = []
    for r in soup.select(".result")[:5]:
        title = r.select_one(".result__title")
        snippet = r.select_one(".result__snippet")
        if title and snippet:
            results.append(f"**{title.get_text(strip=True)}**\n{snippet.get_text(strip=True)}")
    return "\n\n".join(results) if results else "No results found."

@tool(
    name="read_file",
    description="Read the contents of a local file.",
    parameters={
        "path": {"type": "string", "description": "File path to read"}
    }
)
def read_file(path: str) -> str:
    try:
        with open(path, "r") as f:
            content = f.read()
        return content[:10000]  # Truncate for safety
    except Exception as e:
        return f"Error reading file: {e}"

@tool(
    name="write_file",
    description="Write content to a file. Creates the file if it doesn't exist.",
    parameters={
        "path": {"type": "string", "description": "File path"},
        "content": {"type": "string", "description": "Content to write"}
    }
)
def write_file(path: str, content: str) -> str:
    try:
        with open(path, "w") as f:
            f.write(content)
        return f"Successfully wrote {len(content)} chars to {path}"
    except Exception as e:
        return f"Error writing file: {e}"

@tool(
    name="run_python",
    description="Execute Python code and return the output. Use for calculations, data processing, etc.",
    parameters={
        "code": {"type": "string", "description": "Python code to execute"}
    }
)
def run_python(code: str) -> str:
    # Warning: exec() is not a sandbox. Only run model-generated code you
    # trust, or move execution into an isolated runtime (container,
    # subprocess with limits) for anything internet-facing.
    import io, contextlib
    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            exec(code, {"__builtins__": __builtins__})
        result = output.getvalue()
        return result if result else "Code executed successfully (no output)."
    except Exception as e:
        return f"Error: {e}"

@tool(
    name="get_current_time",
    description="Get the current date and time.",
    parameters={}
)
def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def execute_tool(name: str, arguments: dict) -> str:
    """Execute a registered tool by name."""
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    try:
        func = TOOLS[name]["function"]
        return func(**arguments)
    except Exception as e:
        return f"Tool error ({name}): {e}"

def get_tool_schemas() -> list:
    """Get all tool schemas for the API call."""
    return [t["schema"] for t in TOOLS.values()]

🎯 Pro tip: The decorator pattern means adding a new tool is just writing a function with @tool(...). Your agent immediately gets access to it — no rewiring needed.
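For example, here's a hypothetical `word_count` tool, shown with a trimmed-down copy of the registry so the snippet stands alone (in your project, you'd just add the decorated function to tools.py):

```python
# Trimmed-down copy of the @tool registry from tools.py, so this snippet
# runs standalone. In your project, import the real one instead.
TOOLS = {}

def tool(name: str, description: str, parameters: dict):
    def decorator(func):
        TOOLS[name] = {
            "function": func,
            "schema": {
                "name": name,
                "description": description,
                "input_schema": {
                    "type": "object",
                    "properties": parameters,
                    "required": list(parameters.keys()),
                },
            },
        }
        return func
    return decorator

# A hypothetical new tool: one decorated function, zero rewiring.
@tool(
    name="word_count",
    description="Count the words in a piece of text.",
    parameters={"text": {"type": "string", "description": "Text to count"}},
)
def word_count(text: str) -> str:
    return f"{len(text.split())} words"
```

Because `get_tool_schemas()` reads from the same registry, the very next API call already advertises `word_count` to the model.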

3 Build the Memory System

An agent without memory is just a very expensive function call. Here's a simple but effective memory system:

# memory.py
import json
import os
from datetime import datetime

MEMORY_DIR = "memory"
MEMORY_FILE = os.path.join(MEMORY_DIR, "agent_memory.json")

def _ensure_dir():
    os.makedirs(MEMORY_DIR, exist_ok=True)

def load_memory() -> dict:
    """Load the agent's persistent memory."""
    _ensure_dir()
    if os.path.exists(MEMORY_FILE):
        with open(MEMORY_FILE, "r") as f:
            return json.load(f)
    return {"facts": [], "conversations": [], "tasks": []}

def save_memory(memory: dict):
    """Save the agent's memory to disk."""
    _ensure_dir()
    with open(MEMORY_FILE, "w") as f:
        json.dump(memory, f, indent=2, default=str)

def add_fact(memory: dict, fact: str):
    """Store a learned fact."""
    memory["facts"].append({
        "fact": fact,
        "learned_at": datetime.now().isoformat()
    })
    # Keep last 100 facts
    memory["facts"] = memory["facts"][-100:]
    save_memory(memory)

def add_conversation(memory: dict, task: str, result: str):
    """Store a conversation summary."""
    memory["conversations"].append({
        "task": task,
        "result": result[:500],
        "timestamp": datetime.now().isoformat()
    })
    memory["conversations"] = memory["conversations"][-50:]
    save_memory(memory)

def get_context(memory: dict, max_items: int = 10) -> str:
    """Build a context string from memory for the agent."""
    parts = []
    if memory["facts"]:
        recent_facts = memory["facts"][-max_items:]
        parts.append("Known facts:\n" + "\n".join(
            f"- {f['fact']}" for f in recent_facts
        ))
    if memory["conversations"]:
        recent = memory["conversations"][-5:]
        parts.append("Recent tasks:\n" + "\n".join(
            f"- {c['task']} → {c['result'][:100]}" for c in recent
        ))
    return "\n\n".join(parts) if parts else "No prior context."

This is intentionally simple. You can upgrade to vector search (ChromaDB, Pinecone) later, but for most agents, JSON + recency is all you need.
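One cheap middle step before vector search: rank stored facts by keyword overlap with the current task instead of pure recency. A sketch (`score_facts` is not part of memory.py above, just an illustration):

```python
# Rank stored facts by naive keyword overlap with the task — a middle
# ground between plain recency and a full vector database.
def score_facts(facts: list, task: str, top_k: int = 5) -> list:
    task_words = set(task.lower().split())

    def overlap(entry: dict) -> int:
        return len(task_words & set(entry["fact"].lower().split()))

    # Highest-overlap facts first; Python's stable sort keeps ties in
    # their original (most-recent-last) order.
    return sorted(facts, key=overlap, reverse=True)[:top_k]
```

Drop it into `get_context()` in place of the `[-max_items:]` slice and the agent sees facts relevant to the task at hand rather than just the newest ones.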

4 Build the Agent Core

This is the brain. The main loop that ties everything together:

# agent.py
import anthropic
import json
from dotenv import load_dotenv
from tools import execute_tool, get_tool_schemas
from memory import load_memory, add_conversation, get_context

load_dotenv()
client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a capable AI agent. You can think step-by-step,
use tools to gather information and take actions, and remember context
from previous interactions.

Rules:
1. Think before acting. Break complex tasks into steps.
2. Use tools when you need real data — don't make things up.
3. If a tool fails, try a different approach.
4. Be concise in your responses.
5. If you learn something important, mention it so it can be saved.

{memory_context}"""

def run_agent(task: str, max_iterations: int = 10) -> str:
    """Run the agent on a task until completion."""
    memory = load_memory()
    context = get_context(memory)

    messages = [{"role": "user", "content": task}]
    system = SYSTEM_PROMPT.format(memory_context=context)
    tools = get_tool_schemas()

    for i in range(max_iterations):
        # Call the LLM
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages
        )

        # Collect text and tool use blocks
        assistant_text = ""
        tool_calls = []

        for block in response.content:
            if block.type == "text":
                assistant_text += block.text
            elif block.type == "tool_use":
                tool_calls.append(block)

        # Add assistant message to history
        messages.append({"role": "assistant", "content": response.content})

        # If no tool calls, we're done
        if not tool_calls:
            add_conversation(memory, task, assistant_text)
            return assistant_text

        # Execute each tool call
        tool_results = []
        for tc in tool_calls:
            print(f"  🔧 Using tool: {tc.name}({json.dumps(tc.input)[:100]})")
            result = execute_tool(tc.name, tc.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": str(result)
            })

        messages.append({"role": "user", "content": tool_results})

    return "Agent reached max iterations without completing the task."

if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        task = " ".join(sys.argv[1:])
    else:
        task = input("What should I do? → ")
    print(f"\n🤖 Working on: {task}\n")
    result = run_agent(task)
    print(f"\n✅ Result:\n{result}")

That's it. That's a production-capable AI agent in ~80 lines. Run it:

python agent.py "Research the top 3 Python web frameworks and write a comparison to frameworks.md"

The agent will search the web, read results, reason about them, and write a file — autonomously.

🛠️ Want the Complete Agent Blueprint?

The AI Employee Playbook (€29) includes production-ready agent templates, advanced tool patterns, deployment configs, monitoring setup, and the exact system prompts we use in production.

Get the Playbook — €29

5 Add Error Handling & Retries

Production agents need to handle failures gracefully. Here's the pattern:

# Add to agent.py
import time

def run_agent_safe(task: str, max_retries: int = 3) -> str:
    """Run agent with automatic retry on failure."""
    for attempt in range(max_retries):
        try:
            return run_agent(task)
        except anthropic.RateLimitError:
            wait = 2 ** attempt * 10
            print(f"  ⏳ Rate limited, waiting {wait}s...")
            time.sleep(wait)
        except anthropic.APIError as e:
            print(f"  ❌ API error (attempt {attempt+1}): {e}")
            if attempt == max_retries - 1:
                return f"Failed after {max_retries} attempts: {e}"
            time.sleep(5)
        except Exception as e:
            return f"Unexpected error: {e}"
    return "Max retries exceeded."

6 Advanced: Multi-Tool Chains

The real power comes when your agent chains tools together. Here's an example that researches a topic, writes a report, and saves it:

# Example: autonomous research agent
result = run_agent("""
Research the current state of electric truck adoption in Europe.
Steps:
1. Search for the latest data and news
2. Find specific numbers: market share, growth rates, key manufacturers
3. Write a concise 500-word report with sources
4. Save it to research/ev-trucks-europe.md
""")

# The agent will:
# → web_search("electric truck adoption Europe 2026")
# → web_search("electric truck market share Europe statistics")
# → web_search("electric truck manufacturers Europe sales data")
# → write_file("research/ev-trucks-europe.md", "...")

7 Advanced: Scheduled Agent Runs

An agent that only runs when you ask isn't truly autonomous. Here's how to schedule it:

# scheduler.py  (requires: pip install schedule)
import schedule
import time
from agent import run_agent_safe

def morning_briefing():
    """Run every morning at 8 AM."""
    result = run_agent_safe("""
    Create my morning briefing:
    1. Search for today's top AI news
    2. Check if there are any new Python security advisories
    3. Write a brief summary to briefings/today.md
    """)
    print(f"Morning briefing: {result[:200]}")

def weekly_report():
    """Run every Friday at 5 PM."""
    result = run_agent_safe("""
    Create a weekly summary:
    1. Read all files in briefings/ from this week
    2. Synthesize into key themes and trends
    3. Write to reports/weekly-summary.md
    """)
    print(f"Weekly report: {result[:200]}")

schedule.every().day.at("08:00").do(morning_briefing)
schedule.every().friday.at("17:00").do(weekly_report)

print("🤖 Agent scheduler running...")
while True:
    schedule.run_pending()
    time.sleep(60)

8 Deploy to Production

Three options, from simplest to most robust:

Option A: Simple Server (VPS/Cloud VM)

# Use systemd to keep it running
# /etc/systemd/system/my-agent.service
[Unit]
Description=AI Agent
After=network.target

[Service]
Type=simple
User=agent
WorkingDirectory=/home/agent/my-agent
ExecStart=/home/agent/my-agent/venv/bin/python scheduler.py
Restart=always
RestartSec=10
# Better: load secrets from a file instead of inlining them:
# EnvironmentFile=/home/agent/my-agent/.env
Environment=ANTHROPIC_API_KEY=sk-ant-...

[Install]
WantedBy=multi-user.target

sudo systemctl enable my-agent
sudo systemctl start my-agent

Option B: Docker Container

# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "scheduler.py"]

docker build -t my-agent .
docker run -d --restart always \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  --name agent my-agent

Option C: Serverless (AWS Lambda / Cloud Functions)

# For event-driven agents (webhook triggers, scheduled tasks)
# Deploy with AWS SAM, Serverless Framework, or Google Cloud Functions
# Best for: agents that respond to events, not continuous runners
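As a sketch of the serverless shape (the handler name and event fields below are placeholders; the real signature depends on your platform and framework):

```python
# handler.py — a minimal Lambda-style entry point around the agent.
try:
    from agent import run_agent_safe  # from Step 5
except ImportError:  # stub so the handler can be exercised standalone
    def run_agent_safe(task: str) -> str:
        return f"(stub) {task}"

def lambda_handler(event: dict, context=None) -> dict:
    """Run one agent task per invocation, triggered by a webhook or timer."""
    task = event.get("task", "")
    if not task:
        return {"statusCode": 400, "body": "Missing 'task' in event"}
    return {"statusCode": 200, "body": run_agent_safe(task)}
```

One caveat: the JSON memory from Step 3 assumes a local disk, so a serverless agent needs its memory moved to S3, a database, or similar persistent storage.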

💰 Cost tip: A Claude-powered agent running 100 tasks/day costs roughly $2-5/day in API calls. That's $60-150/month — less than any SaaS tool it replaces.
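To sanity-check that estimate against your own workload, track token usage from each response and multiply by your provider's rates. The per-token prices below are illustrative placeholders, not current pricing:

```python
# Back-of-envelope API cost estimator. The per-million-token prices are
# placeholders — substitute your provider's current rates.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}  # USD per 1M tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the approximate USD cost of one API call."""
    return (input_tokens / 1_000_000) * PRICE_PER_MTOK["input"] \
         + (output_tokens / 1_000_000) * PRICE_PER_MTOK["output"]

# In the agent loop, accumulate usage from each response object:
#   total += estimate_cost(response.usage.input_tokens,
#                          response.usage.output_tokens)
```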

Common Patterns That Actually Work

Pattern 1: The Guardrail Pattern

Prevent your agent from going off the rails:

# Add to your system prompt:
GUARDRAILS = """
NEVER do these things:
- Delete files outside the working directory
- Make network requests to internal IPs
- Execute system commands (rm, kill, etc.)
- Spend more than $0.50 on API calls per task

If unsure about an action, stop and explain what you want to do.
"""

Pattern 2: The Reflection Pattern

Make your agent check its own work:

# After the agent completes a task, add:
review = run_agent(f"""
Review this output for accuracy and completeness:

Task: {original_task}
Output: {result}

Check for: factual errors, missing information,
formatting issues. Suggest improvements.
""")

Pattern 3: The Fallback Chain

Use cheaper models for simple tasks, expensive models for hard ones:

MODELS = {
    "fast": "claude-3-5-haiku-20241022",
    "balanced": "claude-sonnet-4-20250514",
    "powerful": "claude-opus-4-20250514"
}

def smart_route(task: str) -> str:
    """Route to the right model based on complexity."""
    # Note: this assumes run_agent() accepts a `model` keyword. Add it to
    # the signature in Step 4 and pass it through to client.messages.create().
    # Try the fast model first
    result = run_agent(task, model=MODELS["fast"])
    if "I'm not sure" in result or "I cannot" in result:
        # Escalate to balanced
        result = run_agent(task, model=MODELS["balanced"])
    return result

Tool Comparison: Framework vs. Bare Python

| Aspect | Bare Python (This Guide) | LangChain | CrewAI |
| --- | --- | --- | --- |
| Setup time | 10 minutes | 30 minutes | 20 minutes |
| Lines of code | ~200 | ~150 | ~100 |
| Understanding | You know every line | Framework magic | Framework magic |
| Debugging | Easy — it's your code | Hard — deep abstractions | Medium |
| Flexibility | Total control | High (complex API) | Medium (opinionated) |
| Multi-agent | DIY (simple) | LangGraph add-on | Built-in |
| Best for | Learning, custom agents | Complex pipelines | Team of agents |

"Start bare. Adopt a framework only when your custom code starts duplicating what the framework provides. For most people, that day never comes."

7 Mistakes That Kill Python Agents

  1. No max iterations — Your agent runs forever, burns $50 in API calls. Always set a cap.
  2. Trusting tool output blindly — Web search returns garbage sometimes. Add validation.
  3. Huge context windows — Cramming everything into the prompt. Be selective with memory.
  4. No error handling — One API timeout crashes everything. Use retries.
  5. Too many tools — More than 10-12 tools confuses the LLM. Keep it focused.
  6. Skipping logging — When your agent does something weird at 3 AM, you need logs.
  7. Building everything at once — Start with 2 tools and one task. Expand when it works.

60-Minute Quickstart: Research Agent

Copy-paste this and have a working agent in under an hour:

  1. 0-5 min: Create project, install deps, add API key to .env
  2. 5-20 min: Copy tools.py from Step 2 above
  3. 20-35 min: Copy memory.py from Step 3
  4. 35-50 min: Copy agent.py from Step 4
  5. 50-60 min: Test with 3 different tasks, add one custom tool

That's it. You now have a functioning AI agent that can search the web, read/write files, run Python code, and remember past interactions.

What's Next

Once your basic agent works, here's the upgrade path: add more tools, swap the JSON memory for vector search (ChromaDB or Pinecone), layer in the guardrail and reflection patterns above, and put it on a schedule so it runs without you.

⚡ Skip the Learning Curve

The AI Employee Playbook (€29) gives you production-ready Python agent templates, 15+ tool implementations, deployment scripts, monitoring dashboards, and the exact patterns we use to run agents 24/7 in real businesses.

Get the Playbook — €29

📡 The Operator Signal

Weekly field notes on building AI agents that actually work. No hype, no spam.
