OpenAI Function Calling: Building Structured AI Workflows

Function Calling Is the Foundation of Reliable AI

The biggest reliability problem with LLMs is getting structured, predictable output. Free-form text generation works in demos but fails in production when you need to parse the response, pass it to another system, or guarantee a specific format. OpenAI's function calling (now exposed through the "tools" parameter) addresses this: instead of describing an action in prose, the model emits a function name plus JSON-encoded arguments, and your application executes the call. You no longer have to extract instructions from free text.

Defining Tools

Define tools using JSON Schema. The model reads the description and parameter schemas to decide when and how to call each tool. Good descriptions are essential — vague descriptions lead to missed or incorrect tool calls.

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Get the current status and tracking information for a customer order. Use this when the customer asks about their order status, shipping, or delivery.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The order ID, typically starts with 'ORD-'"
                    },
                    "include_tracking": {
                        "type": "boolean",
                        "description": "Whether to include detailed shipment tracking events"
                    }
                },
                "required": ["order_id"],
                "additionalProperties": False
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "initiate_refund",
            "description": "Initiate a refund for an order. Only use this after confirming the customer wants a refund and the order is eligible.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string"},
                    "reason": {
                        "type": "string",
                        "enum": ["defective", "wrong_item", "not_delivered", "changed_mind"]
                    },
                    "amount": {
                        "type": "number",
                        "description": "Refund amount in USD. Omit for full refund."
                    }
                },
                "required": ["order_id", "reason"]
            }
        }
    }
]

The Tool Use Loop

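import json

from openai import OpenAI

# execute_tool(name, arguments) is your own dispatcher (not part of the SDK):
# it runs the named tool and returns a JSON-serializable result.
client = OpenAI()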
messages = [{"role": "user", "content": "What's the status of order ORD-12345?"}]

while True:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )

    message = response.choices[0].message
    messages.append(message)

    # If no tool calls, we have the final answer
    if not message.tool_calls:
        print(message.content)
        break

    # Execute each tool call
    for tool_call in message.tool_calls:
        result = execute_tool(tool_call.function.name,
                              json.loads(tool_call.function.arguments))
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result)
        })
    # Loop back to get the model's response to the tool results
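
The loop above calls execute_tool without defining it. One way to write it is a small registry that maps tool names to Python functions. This is a minimal sketch, not the article's own implementation: the two handler functions and their return values are hypothetical stand-ins for real backend calls, and TOOL_REGISTRY is a name introduced here for illustration.

```python
# Hypothetical handlers; replace the bodies with real service or database calls.
def get_order_status(order_id, include_tracking=False):
    status = {"order_id": order_id, "status": "shipped"}
    if include_tracking:
        status["tracking_events"] = []
    return status


def initiate_refund(order_id, reason, amount=None):
    return {"order_id": order_id, "reason": reason, "amount": amount, "refund": "pending"}


# Maps the tool names declared in `tools` to the functions that implement them.
TOOL_REGISTRY = {
    "get_order_status": get_order_status,
    "initiate_refund": initiate_refund,
}


def execute_tool(name, arguments):
    """Run the named tool; return an error dict instead of raising."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return handler(**arguments)
    except TypeError as exc:
        # Missing or unexpected keyword arguments land here
        # (as does any TypeError raised inside the handler itself).
        return {"error": f"bad arguments for {name}: {exc}"}
```

Returning an error dict rather than raising lets the model see what went wrong in the tool message and correct its next call.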

Parallel Tool Calls

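The model may call multiple tools in a single turn when the calls are independent. Execute every call and append one tool message per tool_call_id before making the next request; the API rejects a conversation in which an assistant tool call has no matching tool result.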

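import asyncio
import json

async def handle_tool_calls(tool_calls):
    # execute_tool_async is assumed to be a coroutine version of your tool dispatcher
    tasks = [
        execute_tool_async(tc.function.name, json.loads(tc.function.arguments))
        for tc in tool_calls
    ]
    # gather runs the calls concurrently and returns results in task order
    results = await asyncio.gather(*tasks)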
    return [
        {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(result)
        }
        for tc, result in zip(tool_calls, results)
    ]
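
If your tools are ordinary blocking functions, one possible way to get an execute_tool_async is to push each call onto a worker thread with asyncio.to_thread (Python 3.9+). The sketch below assumes a blocking execute_tool dispatcher (stubbed here) and uses SimpleNamespace objects as stand-ins for the SDK's tool call objects, so it runs without an API key.

```python
import asyncio
import json
from types import SimpleNamespace


def execute_tool(name, arguments):
    # Stub for a blocking dispatcher (hypothetical); a real one would do I/O.
    return {"tool": name, "arguments": arguments}


async def execute_tool_async(name, arguments):
    # Run the blocking dispatcher in a worker thread so calls can overlap.
    return await asyncio.to_thread(execute_tool, name, arguments)


async def handle_tool_calls(tool_calls):
    # Same shape as the handler above: run all calls, then pair results with ids.
    results = await asyncio.gather(
        *(execute_tool_async(tc.function.name, json.loads(tc.function.arguments))
          for tc in tool_calls)
    )
    return [
        {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)}
        for tc, result in zip(tool_calls, results)
    ]


# Fake tool calls exposing only the attributes the handler reads.
fake_calls = [
    SimpleNamespace(
        id=f"call_{i}",
        function=SimpleNamespace(
            name="get_order_status",
            arguments=json.dumps({"order_id": f"ORD-{i}"}),
        ),
    )
    for i in range(3)
]

tool_messages = asyncio.run(handle_tool_calls(fake_calls))
```

In the tool use loop, the for loop over message.tool_calls could then be replaced by messages.extend(asyncio.run(handle_tool_calls(message.tool_calls))).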

Structured Outputs (Strict Mode)

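For guaranteed JSON Schema compliance, use Structured Outputs (strict: true). The model is constrained to produce output that exactly matches your schema: no extra fields, no missing required fields, no type mismatches. The SDK's parse helper used here converts a Pydantic model into a strict schema for you and returns the validated object.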

from typing import Literal

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class CustomerIntent(BaseModel):
    category: Literal["refund", "order_status", "product_question", "other"]
    urgency: Literal["low", "medium", "high"]
    order_id: str | None
    summary: str

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=messages,
    response_format=CustomerIntent,
)

intent = response.choices[0].message.parsed
# intent is a fully typed CustomerIntent object — no JSON parsing needed
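
Strict mode can also be applied to tool arguments by adding "strict": True to a function definition. As far as the OpenAI documentation describes it, a strict schema must list every property in required and set additionalProperties to false, with optional values expressed as nullable types. The sketch below rewrites initiate_refund under those assumptions; strict_schema_problems is a hypothetical helper introduced here, and it only inspects the top-level object, not nested ones.

```python
strict_refund_tool = {
    "type": "function",
    "function": {
        "name": "initiate_refund",
        "description": "Initiate a refund for an order.",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "reason": {
                    "type": "string",
                    "enum": ["defective", "wrong_item", "not_delivered", "changed_mind"],
                },
                # Optional in the original schema; nullable here so it can be required.
                "amount": {
                    "type": ["number", "null"],
                    "description": "Refund amount in USD. Null for full refund.",
                },
            },
            "required": ["order_id", "reason", "amount"],
            "additionalProperties": False,
        },
    },
}


def strict_schema_problems(parameters):
    """List reasons a top-level object schema would not satisfy the strict rules above."""
    problems = []
    if parameters.get("additionalProperties") is not False:
        problems.append("additionalProperties must be False")
    missing = set(parameters.get("properties", {})) - set(parameters.get("required", []))
    for name in sorted(missing):
        problems.append(f"property '{name}' is not listed in required")
    return problems
```

Running the helper over each tool at startup is a cheap way to catch a schema that has drifted out of strict compliance before the API rejects it.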

Tool Execution Safety

Never trust tool arguments blindly: Validate inputs even though the model generated them. A malformed order ID from the model should return an error, not crash your database query.
Implement tool-level rate limiting: The model may call a tool multiple times in a loop. Limit tool calls per conversation to prevent runaway agent behaviour.
Log all tool calls: Store the function name, arguments, and result for every tool call. This is your primary debugging tool when agents behave unexpectedly.
Require confirmation for destructive actions: For tools like initiate_refund or delete_account, have the model confirm with the user before executing — never auto-execute irreversible actions.
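
The four practices above can be combined in a thin wrapper around the dispatcher. This is a sketch under stated assumptions, not a definitive implementation: ToolGuard is a hypothetical class, the ORD-digits pattern is guessed from the "ORD-" prefix mentioned earlier, and the default limit of 10 calls is arbitrary. The budget counts every attempt, including rejected ones.

```python
import json
import logging
import re

logger = logging.getLogger("tool_calls")

ORDER_ID_PATTERN = re.compile(r"^ORD-\d+$")  # assumed order ID format
DESTRUCTIVE_TOOLS = {"initiate_refund", "delete_account"}


class ToolGuard:
    def __init__(self, execute, max_calls=10):
        self.execute = execute      # underlying dispatcher, e.g. execute_tool
        self.max_calls = max_calls  # per-conversation budget (assumed value)
        self.calls_made = 0

    def run(self, name, raw_arguments, user_confirmed=False):
        # 1. Rate limit: stop runaway loops.
        if self.calls_made >= self.max_calls:
            return {"error": "tool call limit reached for this conversation"}
        self.calls_made += 1

        # 2. Validate: the model's arguments are untrusted input.
        try:
            arguments = json.loads(raw_arguments)
        except json.JSONDecodeError:
            return {"error": "arguments were not valid JSON"}
        if not isinstance(arguments, dict):
            return {"error": "arguments must be a JSON object"}
        order_id = arguments.get("order_id")
        if "order_id" in arguments and not (
            isinstance(order_id, str) and ORDER_ID_PATTERN.match(order_id)
        ):
            return {"error": "order_id must look like ORD-12345"}

        # 3. Confirmation gate for irreversible actions.
        if name in DESTRUCTIVE_TOOLS and not user_confirmed:
            return {"error": "confirmation_required", "tool": name}

        # 4. Execute, then log name, arguments, and result.
        result = self.execute(name, arguments)
        logger.info("tool=%s arguments=%s result=%s", name, arguments, result)
        return result
```

In the tool use loop you would create one ToolGuard per conversation and call guard.run(tool_call.function.name, tool_call.function.arguments), passing user_confirmed=True only after the user has explicitly agreed to the action.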
