Function Calling Is the Foundation of Reliable AI
One of the biggest reliability problems with LLMs is getting structured, predictable output. Free-form text generation works in demos but fails in production when you need to parse the response, pass it to another system, or guarantee a specific format. OpenAI's function calling (now called "tools") addresses this: the model emits a structured JSON function call that your application executes, so your code no longer has to extract instructions from free text.
Defining Tools
Define tools using JSON Schema. The model reads the description and parameter schemas to decide when and how to call each tool. Good descriptions are essential — vague descriptions lead to missed or incorrect tool calls.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Get the current status and tracking information for a customer order. Use this when the customer asks about their order status, shipping, or delivery.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The order ID, typically starts with 'ORD-'"
                    },
                    "include_tracking": {
                        "type": "boolean",
                        "description": "Whether to include detailed shipment tracking events"
                    }
                },
                "required": ["order_id"],
                "additionalProperties": False
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "initiate_refund",
            "description": "Initiate a refund for an order. Only use this after confirming the customer wants a refund and the order is eligible.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string"},
                    "reason": {
                        "type": "string",
                        "enum": ["defective", "wrong_item", "not_delivered", "changed_mind"]
                    },
                    "amount": {
                        "type": "number",
                        "description": "Refund amount in USD. Omit for full refund."
                    }
                },
                "required": ["order_id", "reason"]
            }
        }
    }
]
The Tool Use Loop
import json

from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "What's the status of order ORD-12345?"}]

while True:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    message = response.choices[0].message
    messages.append(message)

    # If no tool calls, we have the final answer
    if not message.tool_calls:
        print(message.content)
        break

    # Execute each tool call. execute_tool is your application's own dispatcher
    # that maps a tool name to the function implementing it.
    for tool_call in message.tool_calls:
        result = execute_tool(
            tool_call.function.name,
            json.loads(tool_call.function.arguments),
        )
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })
    # Loop back to get the model's response to the tool results
Parallel Tool Calls
The model may call multiple tools in a single turn when the calls are independent. Handle all tool calls before sending back results.
import asyncio
import json


async def handle_tool_calls(tool_calls):
    # execute_tool_async is your application's own async dispatcher.
    tasks = [
        execute_tool_async(tc.function.name, json.loads(tc.function.arguments))
        for tc in tool_calls
    ]
    # gather returns results in the same order as the tasks it was given
    results = await asyncio.gather(*tasks)
    return [
        {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(result),
        }
        for tc, result in zip(tool_calls, results)
    ]
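If your tool functions are synchronous, one possible way to provide execute_tool_async is to push each call onto a worker thread with asyncio.to_thread. The sketch below is self-contained so it can run on its own: it repeats handle_tool_calls, and it uses a hypothetical get_order_status handler plus SimpleNamespace objects that stand in for the SDK's tool call objects.

```python
import asyncio
import json
from types import SimpleNamespace


# Hypothetical blocking tool implementation.
def get_order_status(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}


SYNC_TOOLS = {"get_order_status": get_order_status}


async def execute_tool_async(name: str, arguments: dict) -> dict:
    """Run a blocking tool in a worker thread so several calls can overlap."""
    handler = SYNC_TOOLS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    return await asyncio.to_thread(handler, **arguments)


async def handle_tool_calls(tool_calls):
    tasks = [
        execute_tool_async(tc.function.name, json.loads(tc.function.arguments))
        for tc in tool_calls
    ]
    results = await asyncio.gather(*tasks)
    return [
        {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)}
        for tc, result in zip(tool_calls, results)
    ]


def fake_call(call_id: str, name: str, arguments: dict) -> SimpleNamespace:
    """Stand-in with the same attributes the handler reads from a tool call."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=json.dumps(arguments)),
    )


calls = [
    fake_call("call_1", "get_order_status", {"order_id": "ORD-1"}),
    fake_call("call_2", "get_order_status", {"order_id": "ORD-2"}),
]
tool_messages = asyncio.run(handle_tool_calls(calls))
```

In the synchronous loop you would extend messages with the returned list, so that every tool_call_id from the assistant turn has a matching tool message before the next request.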
Structured Outputs (Strict Mode)
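For guaranteed JSON schema compliance, use structured outputs with strict: true. In strict mode the model is constrained to produce output that matches your schema exactly: no extra fields, no missing required fields, no type mismatches. The Python SDK's parse helper accepts a Pydantic model as response_format and returns a parsed instance of it.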
from typing import Literal

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()


class CustomerIntent(BaseModel):
    category: Literal["refund", "order_status", "product_question", "other"]
    urgency: Literal["low", "medium", "high"]
    order_id: str | None
    summary: str


response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=messages,
    response_format=CustomerIntent,
)
intent = response.choices[0].message.parsed
# intent is a fully typed CustomerIntent object, so no manual JSON parsing is needed.
# It is None if the model refused to answer, so check before using it.
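Strict mode can also be applied to tool definitions by setting "strict": True inside the function object. As far as OpenAI's structured outputs documentation describes it at the time of writing, a strict schema must set additionalProperties to False and list every property in required, with optional values expressed as a union with null. The sketch below rewrites get_order_status that way; strict_schema_problems is a hypothetical helper, not part of the SDK, and it only inspects the top level of the schema.

```python
strict_order_status_tool = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Get the current status and tracking information for a customer order.",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order ID, typically starts with 'ORD-'",
                },
                # Optional in spirit: the model sends null when it has no preference.
                "include_tracking": {
                    "type": ["boolean", "null"],
                    "description": "Whether to include detailed shipment tracking events",
                },
            },
            "required": ["order_id", "include_tracking"],
            "additionalProperties": False,
        },
    },
}


def strict_schema_problems(tool: dict) -> list[str]:
    """Hypothetical pre-flight check for the top-level strict-mode requirements."""
    function = tool["function"]
    params = function["parameters"]
    problems = []
    if function.get("strict") is not True:
        problems.append("function.strict is not True")
    if params.get("additionalProperties") is not False:
        problems.append("additionalProperties must be False")
    missing = set(params.get("properties", {})) - set(params.get("required", []))
    if missing:
        problems.append(f"properties not listed in required: {sorted(missing)}")
    return problems
```

Running a check like this in your test suite catches schema drift before the API rejects the request.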
Tool Execution Safety
- Never trust tool arguments blindly: Validate inputs even though the model generated them. A malformed order ID from the model should return an error, not crash your database query.
- Implement tool-level rate limiting: The model may call a tool multiple times in a loop. Limit tool calls per conversation to prevent runaway agent behaviour.
- Log all tool calls: Store the function name, arguments, and result for every tool call. This is your primary debugging tool when agents behave unexpectedly.
- Require confirmation for destructive actions: For tools like initiate_refund or delete_account, have the model confirm with the user before executing. Never auto-execute irreversible actions.
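The four rules above can be combined in a thin wrapper around whatever function actually runs your tools. This is a rough sketch under stated assumptions: the ToolGuard class, the ORD-digits pattern, the budget of 10 calls, and the user_confirmed flag are all illustrative choices, not part of the OpenAI SDK.

```python
import json
import logging
import re

logger = logging.getLogger("tool_calls")

ORDER_ID_PATTERN = re.compile(r"ORD-\d+")  # assumed order ID format
MAX_TOOL_CALLS_PER_CONVERSATION = 10       # assumed budget
DESTRUCTIVE_TOOLS = {"initiate_refund", "delete_account"}


class ToolGuard:
    """Wraps a tool executor with validation, a call budget, logging, and confirmation."""

    def __init__(self, execute, max_calls=MAX_TOOL_CALLS_PER_CONVERSATION):
        self._execute = execute      # callable(name, arguments) -> dict
        self._max_calls = max_calls
        self._calls_made = 0         # one guard instance per conversation

    def run(self, name, arguments, user_confirmed=False):
        result = self._check_and_execute(name, arguments, user_confirmed)
        # Log every attempt, including rejected ones, for later debugging.
        logger.info(
            "tool_call %s",
            json.dumps({"name": name, "arguments": arguments, "result": result}, default=str),
        )
        return result

    def _check_and_execute(self, name, arguments, user_confirmed):
        # Rejected attempts also count against the budget.
        self._calls_made += 1
        if self._calls_made > self._max_calls:
            return {"error": "tool call limit reached for this conversation"}

        order_id = arguments.get("order_id")
        if order_id is not None and not ORDER_ID_PATTERN.fullmatch(str(order_id)):
            return {"error": f"invalid order_id: {order_id!r}"}

        if name in DESTRUCTIVE_TOOLS and not user_confirmed:
            return {
                "error": "confirmation_required",
                "detail": f"Ask the user to confirm before calling {name}.",
            }

        return self._execute(name, arguments)
```

In the tool use loop you would create one guard per conversation and call guard.run(...) in place of the raw executor, setting user_confirmed only after your application has recorded an explicit yes from the user.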