AI Engineering Learning: Multi-Agent Systems & Orchestration
Table of Contents
- Introduction to Multi-Agent Systems
- Core Concepts & Theory
- Week 6: Orchestration Patterns
- Week 7: Advanced Patterns
- Week 8: CrewAI & Production
- Real-World Projects
Introduction to Multi-Agent Systems
What Are Multi-Agent Systems?
A Multi-Agent System (MAS) is a group of specialized AI agents working together to solve complex problems that no single agent could handle alone.
Simple Analogy: Think of a hospital:
- A receptionist handles patient intake
- A doctor diagnoses
- A nurse administers treatment
- A pharmacist manages medications
- Each specialist has a focused role, but they work together
Without coordination, they’d overlap, miss steps, or be inefficient. With orchestration, they work seamlessly.
Why Multi-Agent Systems?
Problem Without Multi-Agents:
1
2
3
4
5
6
7
8
9
One Super-Agent
↓
Tries to handle: customer service + billing + technical support + data retrieval
↓
Becomes confused about roles
↓
Slower response times
↓
Higher error rates
Solution With Multi-Agents:
1
2
3
4
5
6
7
8
9
Customer Service Agent ← Fast, focused
↓
Billing Agent ← Specialized knowledge
↓
Technical Support Agent ← Expert
↓
Orchestrator ← Routes to right agent
↓
Perfect response in less time
Real-World Impact
Companies using multi-agent systems report:
- 50% reduction in customer resolution time
- 60% higher customer satisfaction
- 40% fewer escalations
- Better specialized expertise
- More maintainable code
Core Concepts & Theory
1. The Three Orchestration Architectures
Pattern 1: Centralized Orchestration (Supervisor Pattern)
How It Works:
1
2
3
4
5
6
7
8
User Request
↓
[Supervisor Agent]
├─ Analyzes request
├─ Routes to appropriate agent
└─ Aggregates results
↓
Response
Structure:
- ONE central supervisor
- MANY specialized agents
- Supervisor sees all, decides all
Advantages: ✅ Easy to understand ✅ Complete control ✅ Good for predictable workflows ✅ Simple error handling
Disadvantages: ❌ Supervisor becomes bottleneck ❌ Scales poorly with many agents ❌ Less flexible for dynamic tasks
Best For:
- Well-defined workflows
- Linear processes
- Compliance-heavy systems
- Team sizes < 5 agents
Pattern 2: Hierarchical Orchestration
How It Works:
1
2
3
4
5
Top-Level Supervisor
/ \
Manager1 Manager2
/ \ / \
Agent1 Agent2 Agent3 Agent4
Structure:
- Multiple levels
- Each level coordinates below
- Upper levels guide strategy
- Lower levels handle tasks
Advantages: ✅ Scalable to many agents ✅ Balances control and autonomy ✅ Good for large teams ✅ Can handle parallel execution
Disadvantages: ❌ More complex to build ❌ Harder to debug ❌ More latency (multiple hops)
Best For:
- Large systems (10+ agents)
- Enterprise workflows
- Mixed sequential/parallel tasks
Pattern 3: Decentralized/Peer-to-Peer
How It Works:
1
2
3
4
5
Agent1 ←→ Agent2
↑ ↑
↓ ↓
Agent4 ←→ Agent3
(No central controller)
Structure:
- Agents talk directly
- No central authority
- Self-organizing
- Emergent behavior
Advantages: ✅ Highly resilient ✅ No single point of failure ✅ Very flexible ✅ Scales horizontally
Disadvantages: ❌ Complex to debug ❌ Hard to ensure consistency ❌ Difficult to control ❌ Emerging behavior unpredictable
Best For:
- Swarm systems
- Dynamic environments
- Resilience critical
- Ad-hoc collaboration
2. Tool Calling Pattern (Supervisor with Tools)
The Concept: The supervisor treats other agents as “tools” to be called.
Flow:
1
2
3
4
5
6
7
8
9
10
11
12
13
Supervisor receives request
↓
Supervisor decides which agent-tool to call
↓
Agent-tool executes independently
↓
Agent-tool returns result
↓
Supervisor receives result
↓
Supervisor decides next step
↓
Response to user
Key Differences from Native Agents:
| Aspect | Tool Calling | Native Agent |
|---|---|---|
| Who decides routing | Supervisor only | Both supervisor and agent |
| Control | Centralized | Distributed |
| Flexibility | Lower | Higher |
| Complexity | Simpler | More complex |
| Used for | Structured workflows | Complex reasoning |
3. Handoff Pattern (Agent-to-Agent Transfer)
The Concept: Agents can directly transfer control to other agents without going through a supervisor.
Flow:
1
2
3
4
5
6
7
8
9
10
11
Agent A handles request
↓
Agent A realizes it's not the right specialist
↓
Agent A calls handoff_tool("Agent B")
↓
Control transfers to Agent B
↓
Agent B continues with full context
↓
Agent B returns result
Key Advantage: The USER interacts with different agents naturally, like talking to a receptionist who transfers you to a specialist.
Example:
1
2
3
4
5
6
7
8
9
User: "I have a billing question and a technical problem"
↓
Billing Agent: "I'll handle billing. For technical issues,
let me transfer you to our tech specialist."
↓
[Agent hands off to Technical Agent]
↓
Technical Agent: "I see you've already resolved billing.
Now let me help with your technical issue..."
4. Adaptive Orchestration
The Concept: Instead of following a fixed plan, the system adapts its agent selection based on:
- Real-time progress
- Intermediate results
- Context changes
- Performance metrics
Flow:
1
2
3
4
5
6
7
8
9
10
11
Step 1: Plan initial agents
↓
Agent 1 executes
↓
[Evaluate progress]
↓
Adjust: Select different agents for step 2?
↓
Decide dynamically
↓
Continue or replan
Example:
1
2
3
4
5
6
7
8
9
10
11
12
Query: "Help me plan a trip"
Initial Plan:
- Weather Agent → get weather forecast
After Weather Result:
IF weather is good:
→ Suggest outdoor activities
ELSE:
→ Suggest indoor activities
Dynamically select next agent based on weather data
Week 6: Orchestration Patterns
1. Installation & Setup
1
2
3
4
5
pip install langchain
pip install langchain-openai
pip install langgraph
pip install crewai
pip install python-dotenv
2. Supervisor Pattern (Tool Calling)
Create supervisor_pattern.py:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
import os
from dotenv import load_dotenv
from typing import Annotated
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END
from langchain_core.messages import BaseMessage
from typing import TypedDict, Literal
load_dotenv()
# ============ SPECIALIST AGENTS ============
@tool
def handle_billing(query: str) -> str:
"""Handle billing-related questions."""
return f"Billing Agent: Processing billing query: {query}"
@tool
def handle_technical(query: str) -> str:
"""Handle technical support questions."""
return f"Technical Agent: Troubleshooting: {query}"
@tool
def handle_account(query: str) -> str:
"""Handle account management."""
return f"Account Agent: Managing account: {query}"
# ============ SUPERVISOR ============
class SupervisorState(TypedDict):
"""State for supervisor workflow."""
user_query: str
routing_decision: str
response: str
def supervisor_router(state: SupervisorState) -> dict:
"""Supervisor that routes to appropriate specialist."""
query = state["user_query"]
# Classify the query
if any(word in query.lower() for word in ["bill", "invoice", "payment", "charge"]):
decision = "billing"
result = handle_billing.invoke({"query": query})
elif any(word in query.lower() for word in ["technical", "error", "bug", "crash"]):
decision = "technical"
result = handle_technical.invoke({"query": query})
elif any(word in query.lower() for word in ["account", "profile", "password", "email"]):
decision = "account"
result = handle_account.invoke({"query": query})
else:
decision = "general"
result = "General response: I'm not sure which team to route this to."
return {
"routing_decision": decision,
"response": result
}
# ============ BUILD THE WORKFLOW ============
workflow = StateGraph(SupervisorState)
# Add the supervisor node
workflow.add_node("supervisor", supervisor_router)
# Add edges
workflow.add_edge(START, "supervisor")
workflow.add_edge("supervisor", END)
# Compile and run
app = workflow.compile()
# Test the supervisor
print("=== Supervisor Pattern Demo ===\n")
test_queries = [
"Why was I charged twice for my subscription?",
"My app keeps crashing when I try to upload files",
"I want to change my account password",
"Tell me about your pricing plans"
]
for query in test_queries:
print(f"Customer: {query}")
result = app.invoke({"user_query": query, "routing_decision": "", "response": ""})
print(f"Routed to: {result['routing_decision']}")
print(f"Response: {result['response']}\n")
3. Supervisor with Tool Calling Agent
Create supervisor_tool_calling.py:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage
load_dotenv()
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
# ============ SUB-AGENTS ============
def create_billing_agent():
"""Create specialized billing agent."""
billing_tools = [
tool(lambda: "Last invoice: $99.99 on 2025-01-15")(name="get_invoice"),
tool(lambda amount: f"Refund of ${amount} processed")(name="process_refund")
]
prompt = ChatPromptTemplate.from_template(
"You are a billing specialist. Help with: {query}"
)
agent = create_tool_calling_agent(llm, billing_tools, prompt)
return AgentExecutor(agent=agent, tools=billing_tools, verbose=False)
def create_technical_agent():
"""Create specialized technical support agent."""
tech_tools = [
tool(lambda: "Checking server status: All systems operational")(name="check_status"),
tool(lambda error: f"Troubleshooting: {error}")(name="diagnose_error")
]
prompt = ChatPromptTemplate.from_template(
"You are a technical support specialist. Help with: {query}"
)
agent = create_tool_calling_agent(llm, tech_tools, prompt)
return AgentExecutor(agent=agent, tools=tech_tools, verbose=False)
# ============ SUPERVISOR ============
billing_agent = create_billing_agent()
technical_agent = create_technical_agent()
# Wrap agents as tools for supervisor
@tool
def escalate_to_billing(query: str) -> str:
"""Escalate billing questions to billing specialist."""
result = billing_agent.run(query)
return result
@tool
def escalate_to_technical(query: str) -> str:
"""Escalate technical questions to tech specialist."""
result = technical_agent.run(query)
return result
supervisor_tools = [escalate_to_billing, escalate_to_technical]
supervisor_prompt = ChatPromptTemplate.from_template("""
You are a customer service supervisor. Route requests to appropriate specialists:
- Billing issues: charges, invoices, refunds, payments
- Technical issues: crashes, errors, performance
Tools available:
- escalate_to_billing: For financial/billing questions
- escalate_to_technical: For technical/performance issues
Route the following query appropriately:
{query}
""")
supervisor_agent = create_tool_calling_agent(llm, supervisor_tools, supervisor_prompt)
supervisor_executor = AgentExecutor(
agent=supervisor_agent,
tools=supervisor_tools,
verbose=True
)
# ============ TEST ============
print("=== Supervisor with Tool Calling ===\n")
queries = [
"I was double-charged last month, can you refund one charge?",
"My app keeps crashing on iOS, can you help?"
]
for query in queries:
print(f"Customer: {query}")
result = supervisor_executor.run(query)
print(f"Result: {result}\n")
Week 7: Advanced Patterns
1. Handoff Pattern
Create handoff_pattern.py:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
import os
from dotenv import load_dotenv
from typing import Literal
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command
from typing import TypedDict
from langchain.prompts import ChatPromptTemplate
load_dotenv()
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
class ConversationState(TypedDict):
"""State for handoff workflow."""
messages: list
current_agent: str
# ============ AGENTS WITH HANDOFF CAPABILITY ============
def billing_agent(state: ConversationState) -> Command[Literal["billing_agent", "technical_agent", "general_agent", END]]:
"""Billing agent that can handoff to other agents."""
# Create prompt for this agent
prompt = ChatPromptTemplate.from_template("""
You are a billing specialist. You can:
1. Handle billing questions (invoices, charges, refunds)
2. Handoff to technical agent if technical issue
3. Handoff to general agent if escalation needed
Current conversation: {messages}
Respond to the query. If handoff needed, say: "Handoff to [agent_name]"
""")
chain = prompt | llm
response = chain.invoke({"messages": state["messages"]})
response_text = response.content
# Check if handoff needed
if "handoff to technical" in response_text.lower():
return Command(
goto="technical_agent",
update={"messages": state["messages"] + [response_text]}
)
elif "handoff to general" in response_text.lower():
return Command(
goto="general_agent",
update={"messages": state["messages"] + [response_text]}
)
else:
return Command(
goto=END,
update={"messages": state["messages"] + [response_text]}
)
def technical_agent(state: ConversationState) -> Command[Literal["billing_agent", "technical_agent", "general_agent", END]]:
"""Technical agent with handoff capability."""
prompt = ChatPromptTemplate.from_template("""
You are a technical support specialist. You can:
1. Handle technical issues (crashes, errors)
2. Handoff to billing if payment issue
3. Handoff to general for escalation
Current conversation: {messages}
If handoff needed, say: "Handoff to [agent_name]"
""")
chain = prompt | llm
response = chain.invoke({"messages": state["messages"]})
response_text = response.content
if "handoff to billing" in response_text.lower():
return Command(
goto="billing_agent",
update={"messages": state["messages"] + [response_text]}
)
else:
return Command(
goto=END,
update={"messages": state["messages"] + [response_text]}
)
def general_agent(state: ConversationState) -> Command[Literal["billing_agent", "technical_agent", "general_agent", END]]:
"""General support with escalation handling."""
response = "General support: Your request has been escalated to a human agent."
return Command(
goto=END,
update={"messages": state["messages"] + [response]}
)
# ============ BUILD HANDOFF WORKFLOW ============
workflow = StateGraph(ConversationState)
workflow.add_node("billing_agent", billing_agent)
workflow.add_node("technical_agent", technical_agent)
workflow.add_node("general_agent", general_agent)
workflow.add_edge(START, "billing_agent")
# Compile
app = workflow.compile()
# Test
print("=== Handoff Pattern Demo ===\n")
# Start with billing query that needs technical handoff
initial_state = {
"messages": ["Customer: I was charged but haven't received my product AND my app is crashing"],
"current_agent": "billing_agent"
}
result = app.invoke(initial_state)
print("Conversation:")
for msg in result["messages"]:
print(f"- {msg}\n")
2. Adaptive Orchestration
Create adaptive_orchestration.py:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
load_dotenv()
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
class TaskState(TypedDict):
"""State for adaptive workflow."""
task: str
research_results: str
analysis_results: str
final_output: str
needs_expert_review: bool
def research_node(state: TaskState) -> dict:
"""Research phase."""
prompt = ChatPromptTemplate.from_template(
"Research the following topic and provide findings: {task}"
)
chain = prompt | llm
response = chain.invoke({"task": state["task"]})
return {"research_results": response.content}
def analyze_findings(state: TaskState) -> dict:
"""Analyze research findings."""
prompt = ChatPromptTemplate.from_template(
"""Analyze these research findings for complexity and accuracy:
{findings}
Rate the complexity (1-10) and confidence (1-10).
If complexity > 7 or confidence < 5, flag for expert review."""
)
chain = prompt | llm
response = chain.invoke({"findings": state["research_results"]})
needs_review = ("expert review" in response.content.lower() or
"flag" in response.content.lower())
return {
"analysis_results": response.content,
"needs_expert_review": needs_review
}
def decide_next_step(state: TaskState) -> str:
"""Dynamically decide next step based on analysis."""
if state["needs_expert_review"]:
return "expert_review"
else:
return "finalize"
def expert_review_node(state: TaskState) -> dict:
"""Expert review for complex topics."""
prompt = ChatPromptTemplate.from_template(
"""You are an expert. Review this analysis and provide expert insights:
Analysis: {analysis}
Original Findings: {findings}"""
)
chain = prompt | llm
response = chain.invoke({
"analysis": state["analysis_results"],
"findings": state["research_results"]
})
return {"analysis_results": response.content}
def finalize_node(state: TaskState) -> dict:
"""Finalize the output."""
return {"final_output": state["analysis_results"]}
# Build workflow
workflow = StateGraph(TaskState)
workflow.add_node("research", research_node)
workflow.add_node("analyze", analyze_findings)
workflow.add_node("expert_review", expert_review_node)
workflow.add_node("finalize", finalize_node)
workflow.add_edge(START, "research")
workflow.add_edge("research", "analyze")
workflow.add_conditional_edges(
"analyze",
decide_next_step,
{
"expert_review": "expert_review",
"finalize": "finalize"
}
)
workflow.add_edge("expert_review", "finalize")
workflow.add_edge("finalize", END)
# Compile and test
app = workflow.compile()
print("=== Adaptive Orchestration Demo ===\n")
result = app.invoke({
"task": "Explain quantum computing and its business applications",
"research_results": "",
"analysis_results": "",
"final_output": "",
"needs_expert_review": False
})
print(f"Final Output:\n{result['final_output']}")
print(f"Expert Review Required: {result['needs_expert_review']}")
Week 8: CrewAI & Production
1. CrewAI Installation
1
2
pip install crewai
pip install crewai-tools
2. Building a Crew (Multi-Agent Team)
Create crewai_example.py:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
import os
from dotenv import load_dotenv
from crewai import Agent, Task, Crew
from crewai_tools import FileReadTool, ScrapeWebsiteTool
load_dotenv()
# ============ DEFINE AGENTS ============
# Agent 1: Researcher
researcher = Agent(
role="Research Analyst",
goal="Find and analyze information about AI trends",
backstory="""You are an expert researcher with deep knowledge of AI.
You excel at finding reliable sources and synthesizing information.""",
tools=[ScrapeWebsiteTool()],
verbose=True
)
# Agent 2: Writer
writer = Agent(
role="Content Writer",
goal="Write engaging articles based on research",
backstory="""You are a talented writer who transforms research into
compelling narratives that engage audiences.""",
tools=[FileReadTool()],
verbose=True
)
# Agent 3: Editor
editor = Agent(
role="Editor",
goal="Polish and refine content for publication",
backstory="""You are a meticulous editor ensuring quality and clarity.
You improve flow, grammar, and impact.""",
verbose=True
)
# ============ DEFINE TASKS ============
# Task 1: Research
research_task = Task(
description="Research AI trends in 2025",
agent=researcher,
expected_output="Comprehensive summary of AI trends"
)
# Task 2: Write
write_task = Task(
description="Write an article based on the research",
agent=writer,
expected_output="Well-written article draft",
context=[research_task] # Depends on research task output
)
# Task 3: Edit
edit_task = Task(
description="Edit and finalize the article",
agent=editor,
expected_output="Final polished article",
context=[write_task] # Depends on write task output
)
# ============ CREATE CREW ============
crew = Crew(
agents=[researcher, writer, editor],
tasks=[research_task, write_task, edit_task],
verbose=True # Show all steps
)
# ============ EXECUTE ============
result = crew.kickoff(inputs={"topic": "AI in 2025"})
print("\n=== Final Result ===")
print(result)
3. Complex Customer Support System
Create crewai_customer_support.py:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
from crewai import Agent, Task, Crew
from crewai_tools import DatabaseTool, EmailTool
# ============ AGENTS FOR CUSTOMER SUPPORT ============
# Data Specialist
data_agent = Agent(
role="Data Retrieval Specialist",
goal="Quickly find customer information and order details",
backstory="Expert at database queries and customer lookup",
tools=[DatabaseTool()],
verbose=True
)
# Support Specialist
support_agent = Agent(
role="Support Specialist",
goal="Resolve customer issues based on data",
backstory="Expert customer service professional with years of experience",
tools=[EmailTool()],
verbose=True
)
# Quality Assurance
qa_agent = Agent(
role="QA Reviewer",
goal="Ensure solutions meet quality standards",
backstory="Quality expert ensuring customer satisfaction",
verbose=True
)
# ============ TASKS ============
# Get customer data
get_data_task = Task(
description="""Retrieve customer information for customer_id: {customer_id}
Include: order history, account status, previous issues""",
agent=data_agent,
expected_output="Complete customer profile"
)
# Resolve issue
resolve_task = Task(
description="Resolve the customer issue: {issue}",
agent=support_agent,
expected_output="Solution and action plan",
context=[get_data_task]
)
# Review solution
review_task = Task(
description="Review solution for quality and completeness",
agent=qa_agent,
expected_output="Quality assessment and approval",
context=[resolve_task]
)
# ============ CREATE CREW ============
crew = Crew(
agents=[data_agent, support_agent, qa_agent],
tasks=[get_data_task, resolve_task, review_task],
process="sequential" # Tasks happen one after another
)
# ============ EXECUTE ============
result = crew.kickoff(inputs={
"customer_id": "C12345",
"issue": "Payment failed but was charged twice"
})
print(result)
Real-World Projects
Project 1: Document Processing Pipeline
1
2
3
4
5
# Multiple agents handle document workflow:
# 1. Classification Agent: Determines document type
# 2. Extraction Agent: Pulls key information
# 3. Validation Agent: Checks for completeness
# 4. Storage Agent: Archives document
Project 2: Multi-Channel Support
1
2
3
4
5
# Agents for different channels:
# - Email Support Agent
# - Chat Support Agent
# - Phone Support Agent
# - Escalation Manager Agent
Project 3: Content Creation Workflow
1
2
3
4
5
6
# Content team agents:
# - Topic Researcher
# - Content Writer
# - SEO Optimizer
# - Content Editor
# - Publisher
Production Best Practices
1. Error Handling
1
2
3
4
5
6
7
def robust_agent(state):
"""Agent with error handling."""
try:
result = execute_task(state)
return {"result": result, "error": None}
except Exception as e:
return {"result": None, "error": str(e)}
2. Monitoring
1
2
3
4
5
6
7
8
9
10
def log_agent_action(agent_name, action, result):
"""Log all agent actions for auditing."""
log = {
"timestamp": datetime.now(),
"agent": agent_name,
"action": action,
"result": result,
"status": "success" if result else "failed"
}
save_to_database(log)
3. Agent Metrics
1
2
3
4
5
6
# Track per-agent:
- Response time
- Success rate
- Error rate
- User satisfaction
- Escalation rate
Summary: Week 6-8 Learning
Week 6 Accomplishments
✅ Understood three orchestration architectures
✅ Built supervisor patterns
✅ Implemented tool calling
✅ Mastered basic routing
Week 7 Accomplishments
✅ Implemented handoff patterns
✅ Built adaptive orchestration
✅ Created dynamic agent selection
✅ Multi-agent conversations
Week 8 Accomplishments
✅ Learned CrewAI framework
✅ Built production crews
✅ Complex multi-agent systems
✅ Enterprise patterns
Key Takeaways
- Choose Pattern Wisely: Supervisor vs Hierarchical vs Decentralized
- Use Tools Strategically: Wrap agents as tools for supervisor pattern
- Handoffs Enable Flexibility: Direct agent-to-agent transfer feels natural
- Adapt Dynamically: Adjust agent selection based on progress
- CrewAI Simplifies: Framework handles most orchestration
- Monitor Everything: Track agent performance and errors
- Test Thoroughly: Multi-agent behavior is complex
- Scale Gradually: Start simple, add complexity
Next Steps
- Advanced CrewAI: Hierarchical crews
- Parallel Execution: Run agents simultaneously
- Memory Integration: Shared state across agents
- Custom Tools: Agent-specific capabilities
- Production Deployment: Scaling to thousands of users
Resources
- LangChain Multi-Agent Docs
- CrewAI Official Docs
- LangGraph Multi-Agent Tutorial
- DeepLearning.AI Multi-Agent Course
Main Page: AI Engineering Learning
Congratulations! You now understand how to build sophisticated multi-agent systems where specialized agents work together orchestrated by supervisors or collaborating peer-to-peer. You’re ready to tackle enterprise-scale AI applications.
