AI Agent Tool Design in 2026: Building Tools That LLMs Can Actually Use

Function calling is only as good as your tool design. A practical guide to designing clear schemas, handling errors, and building composable tool ecosystems.

15 min · January 10, 2026 · Updated January 27, 2026

TL;DR

Function calling enables agents to take real-world actions—but tool design determines whether those actions succeed.
Design tools with clear schemas: descriptive names, explicit parameter types, and helpful descriptions for every field.
Follow the Single Responsibility Principle: one tool, one job. Compose complex workflows from simple tools.
Validate inputs before execution—models will occasionally generate invalid parameters.
Implement idempotent tools for safe retries. Agents will retry failed calls.
Return structured, actionable errors. “Something went wrong” doesn’t help the agent recover.
Test tools with adversarial prompts. Models may try to use tools in unexpected ways.

The Function Calling Paradigm

Function calling transforms LLMs from text generators into action-takers. Instead of describing what to do, models can now do it—query databases, send emails, update records, and interact with the world.

The workflow:

1. You define available tools with JSON schemas
2. User makes a request
3. Model decides if tools are needed
4. Model outputs structured function call (name + arguments)
5. Your code executes the function
6. Results return to the model
7. Model generates final response

This formal contract means model outputs are executable, not just descriptive. But the quality of that execution depends entirely on your tool design.
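
The seven-step workflow above can be sketched as a small dispatch loop. This is a minimal sketch with a stubbed model, not a real provider integration: `ModelTurn`, `fakeModel`, `runAgent`, and the `get_weather` tool are hypothetical names used only for illustration, and a real SDK will define its own message and tool-call types.

```typescript
// Hypothetical types: real SDKs define their own tool-call shapes.
type ToolCall = { name: string; arguments: Record<string, unknown> };
type ModelTurn =
  | { kind: "tool_call"; call: ToolCall }
  | { kind: "final"; text: string };
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Step 1: the tools your code exposes (schemas omitted for brevity).
const handlers: Record<string, ToolHandler> = {
  get_weather: (args) => ({ success: true, city: args.city, temp_c: 21 }),
};

// Stub standing in for the model (steps 3, 4 and 7): it requests one tool
// call, then answers once a tool result appears in the transcript.
function fakeModel(transcript: string[]): ModelTurn {
  const toolResult = transcript.find((m) => m.startsWith("tool:"));
  if (!toolResult) {
    return { kind: "tool_call", call: { name: "get_weather", arguments: { city: "Oslo" } } };
  }
  return { kind: "final", text: `Done. Tool said: ${toolResult.slice(5)}` };
}

function runAgent(userRequest: string, maxTurns = 5): string {
  const transcript = [`user:${userRequest}`]; // step 2
  for (let turn = 0; turn < maxTurns; turn++) {
    const output = fakeModel(transcript);
    if (output.kind === "final") return output.text; // step 7
    const handler = handlers[output.call.name];
    // Step 5: execute the function; unknown tools become a structured error.
    const result = handler
      ? handler(output.call.arguments)
      : { success: false, error_type: "not_found", error_message: `Unknown tool ${output.call.name}` };
    transcript.push(`tool:${JSON.stringify(result)}`); // step 6
  }
  return "Stopped: turn limit reached";
}

console.log(runAgent("What is the weather in Oslo?"));
```

The turn limit is a design choice worth keeping in any real loop: it bounds cost when the model keeps requesting tools without converging.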

Tool Design Principles

Principle 1: Clear, Descriptive Schemas

Your tool schema is the model’s only information about what a tool does. Make it count.

Bad schema:

{
  "name": "send",
  "parameters": {
    "type": "object",
    "properties": {
      "to": { "type": "string" },
      "msg": { "type": "string" }
    }
  }
}

Good schema:

{
  "name": "send_email",
  "description": "Sends an email to a specified recipient. Use for customer communications, notifications, and follow-ups. Returns success status and message ID.",
  "parameters": {
    "type": "object",
    "properties": {
      "to_email": {
        "type": "string",
        "description": "Recipient email address. Must be a valid email format.",
        "format": "email"
      },
      "subject": {
        "type": "string",
        "description": "Email subject line. Keep under 60 characters for best display.",
        "maxLength": 100
      },
      "body": {
        "type": "string",
        "description": "Email body content. Supports plain text. Use \\n for line breaks."
      },
      "priority": {
        "type": "string",
        "enum": ["low", "normal", "high"],
        "description": "Email priority. Use 'high' only for time-sensitive communications.",
        "default": "normal"
      }
    },
    "required": ["to_email", "subject", "body"]
  }
}

Key improvements:

Descriptive name (send_email vs send)
Tool-level description explaining purpose and use cases
Field-level descriptions with guidance
Format hints and constraints
Enums for controlled vocabularies
Required fields explicitly marked
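
One way to keep these improvements from regressing is a small lint step that runs over tool schemas in CI. The sketch below is an assumption-laden illustration: `lintToolSchema`, the trimmed-down `ToolSchema` shape, and the verb_noun naming rule are hypothetical conventions, not part of any function-calling API.

```typescript
// Hypothetical minimal schema shape, covering only the fields checked here.
interface ParamSchema { type: string; description?: string; enum?: string[] }
interface ToolSchema {
  name: string;
  description?: string;
  parameters: { type: "object"; properties: Record<string, ParamSchema>; required?: string[] };
}

// Returns human-readable problems; an empty array means the schema passes.
function lintToolSchema(tool: ToolSchema): string[] {
  const problems: string[] = [];
  // Assumed house rule: names look like verb_noun (send_email), not bare verbs.
  if (!/^[a-z]+(_[a-z0-9]+)+$/.test(tool.name)) {
    problems.push(`name "${tool.name}" should be verb_noun style, e.g. send_email`);
  }
  if (!tool.description) problems.push("missing tool-level description");
  for (const [field, schema] of Object.entries(tool.parameters.properties)) {
    if (!schema.description) problems.push(`field "${field}" has no description`);
  }
  if (!tool.parameters.required?.length) problems.push("no required fields declared");
  return problems;
}

// The "bad schema" from above fails every check.
const badSchema: ToolSchema = {
  name: "send",
  parameters: { type: "object", properties: { to: { type: "string" }, msg: { type: "string" } } },
};
console.log(lintToolSchema(badSchema));
```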

Principle 2: Single Responsibility

Each tool should do one thing well. Complex operations should compose multiple tools.

Bad: Monolithic tool

{
  "name": "manage_customer",
  "description": "Create, update, delete, or query customers"
}

Good: Focused tools

[
  { "name": "create_customer", "description": "Creates a new customer record" },
  { "name": "update_customer", "description": "Updates an existing customer" },
  { "name": "delete_customer", "description": "Deletes a customer record" },
  { "name": "get_customer", "description": "Retrieves customer by ID" },
  { "name": "search_customers", "description": "Searches customers by criteria" }
]

Benefits:

Clearer model decision-making
Simpler parameter sets
Easier testing
More predictable behavior

Principle 3: Explicit Parameter Types

Use the strictest types possible:

Instead of	Use	Why
`string` for numbers	`number` or `integer`	Prevents “five” when you need 5
`string` for dates	`string` with format	“2026-01-27” not “next Tuesday”
`string` for choices	`enum`	Constrains to valid options
Open `object`	Specific properties	Predictable structure
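
To make the table concrete, here is a sketch of a strictly typed parameter schema together with the checks it implies. The `create_shipment` tool, its field names, and `checkShipmentParams` are hypothetical; in practice a JSON Schema validator library would replace the hand-written checks.

```typescript
// Hypothetical parameter schema for an illustrative create_shipment tool.
const createShipmentParams = {
  type: "object",
  properties: {
    quantity: { type: "integer", minimum: 1, description: "Number of units to ship." },
    ship_date: { type: "string", format: "date", description: "Ship date as YYYY-MM-DD." },
    speed: { type: "string", enum: ["standard", "express"], description: "Delivery speed." },
  },
  required: ["quantity", "ship_date", "speed"],
} as const;

const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

function isIsoDate(value: unknown): boolean {
  if (typeof value !== "string" || !ISO_DATE.test(value)) return false;
  // Round-trip through Date to reject impossible dates such as 2026-02-30.
  const parsed = new Date(`${value}T00:00:00Z`);
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

function checkShipmentParams(params: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const { quantity, ship_date, speed } = params;
  if (typeof quantity !== "number" || !Number.isInteger(quantity) || quantity < 1) {
    errors.push("quantity must be an integer >= 1");
  }
  if (!isIsoDate(ship_date)) {
    errors.push("ship_date must be a real date in YYYY-MM-DD format");
  }
  const speeds: readonly string[] = createShipmentParams.properties.speed.enum;
  if (typeof speed !== "string" || !speeds.includes(speed)) {
    errors.push(`speed must be one of: ${speeds.join(", ")}`);
  }
  return errors;
}

// Loose, string-typed input fails all three checks.
console.log(checkShipmentParams({ quantity: "five", ship_date: "next Tuesday", speed: "fast" }));
```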

Principle 4: Return Structured Results

Tool results should be structured and actionable:

Bad return:

{ "result": "Customer created successfully" }

Good return:

{
  "success": true,
  "customer_id": "cust_abc123",
  "created_at": "2026-01-27T10:00:00Z",
  "action_taken": "create",
  "warnings": [],
  "next_steps": ["Add payment method", "Set subscription tier"]
}

The model can use structured returns to:

Confirm success to users
Pass IDs to subsequent tool calls
Handle edge cases
Provide meaningful follow-up
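
As an illustration of passing IDs between calls, the sketch below models a result as a discriminated union on `success`, so calling code must handle failure before it can read `customer_id`. The stubbed `createCustomer` and `addPaymentMethod` functions and their return values are hypothetical.

```typescript
interface ToolFailure { success: false; error_type: string; error_message: string }
interface CreateCustomerOk {
  success: true;
  customer_id: string;
  created_at: string;
  warnings: string[];
}
type CreateCustomerResult = CreateCustomerOk | ToolFailure;

// Stubbed tools with fixed return values, for illustration only.
function createCustomer(email: string): CreateCustomerResult {
  if (!email.includes("@")) {
    return { success: false, error_type: "validation_error", error_message: "email must contain @" };
  }
  return { success: true, customer_id: "cust_abc123", created_at: "2026-01-27T10:00:00Z", warnings: [] };
}

function addPaymentMethod(customerId: string) {
  return { success: true, payment_method_id: `pm_for_${customerId}` };
}

const createdCustomer = createCustomer("ada@example.com");
if (createdCustomer.success) {
  // The ID from the first result feeds the next call directly.
  console.log(addPaymentMethod(createdCustomer.customer_id));
} else {
  console.log(`Cannot continue: ${createdCustomer.error_message}`);
}
```

A free-text result such as "Customer created successfully" offers nothing to feed into the second call, which is the practical argument for the structured form.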

Input Validation

Models occasionally generate invalid parameters. Validate before execution.

Validation Strategy

interface ValidationResult {
  valid: boolean;
  errors: Array<{ field: string; message: string }>;
  sanitized?: Record<string, unknown>;
}

function validateToolInput(
  toolName: string,
  params: Record<string, unknown>
): ValidationResult {
  const schema = getToolSchema(toolName);
  const errors: Array<{ field: string; message: string }> = [];
  
  // Check required fields
  for (const field of schema.required || []) {
    if (params[field] === undefined || params[field] === null) {
      errors.push({ field, message: `${field} is required` });
    }
  }
  
  // Type validation
  for (const [field, value] of Object.entries(params)) {
    const fieldSchema = schema.properties[field];
    if (!fieldSchema) {
      errors.push({ field, message: `Unknown field: ${field}` });
      continue;
    }
    
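    if (fieldSchema.type === 'string' && typeof value !== 'string') {
      errors.push({ field, message: `${field} must be a string` });
      continue;  // Skip enum and format checks, which assume a string
    }
    
    // Numeric fields: reject strings such as "five" and non-integer values
    if (fieldSchema.type === 'number' && typeof value !== 'number') {
      errors.push({ field, message: `${field} must be a number` });
    }
    if (fieldSchema.type === 'integer' && !Number.isInteger(value)) {
      errors.push({ field, message: `${field} must be an integer` });
    }
    
    if (fieldSchema.enum && !fieldSchema.enum.includes(value)) {
      errors.push({ 
        field, 
        message: `${field} must be one of: ${fieldSchema.enum.join(', ')}` 
      });
    }
    
    // Format validation (email, date, etc.); a non-string can never match
    if (fieldSchema.format === 'email' &&
        (typeof value !== 'string' || !isValidEmail(value))) {
      errors.push({ field, message: `${field} must be a valid email` });
    }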
  }
  
  return {
    valid: errors.length === 0,
    errors,
    sanitized: errors.length === 0 ? sanitize(params, schema) : undefined,
  };
}

Error Response Format

Return validation errors in a format the model can understand and recover from:

{
  "success": false,
  "error_type": "validation_error",
  "errors": [
    {
      "field": "to_email",
      "message": "Must be a valid email address",
      "provided": "john.doe",
      "expected": "email format (e.g., user@example.com)"
    }
  ],
  "retry_suggestion": "Please provide a valid email address for to_email"
}
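
A response in this shape can be assembled from per-field errors with a small helper. This is a sketch under assumptions: `toValidationError` and the wording of `retry_suggestion` are hypothetical, and the field names simply mirror the JSON above.

```typescript
interface FieldError { field: string; message: string; provided: unknown; expected: string }

interface ValidationErrorResponse {
  success: false;
  error_type: "validation_error";
  errors: FieldError[];
  retry_suggestion: string;
}

// Builds the response the model sees; the suggestion names each field to fix.
function toValidationError(errors: FieldError[]): ValidationErrorResponse {
  const suggestions = errors.map((e) => `${e.field}: expected ${e.expected}`);
  return {
    success: false,
    error_type: "validation_error",
    errors,
    retry_suggestion: `Fix these fields and call the tool again. ${suggestions.join("; ")}`,
  };
}

const validationResponse = toValidationError([
  {
    field: "to_email",
    message: "Must be a valid email address",
    provided: "john.doe",
    expected: "email format (e.g., user@example.com)",
  },
]);
console.log(JSON.stringify(validationResponse, null, 2));
```

Echoing the `provided` value lets the model see exactly what it sent, which tends to make the correction on retry more targeted.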

Idempotency

Agents retry failed operations. Idempotent tools ensure retries are safe.

Idempotency Patterns

Pattern 1: Idempotency keys

async function createOrder(params: {
  idempotency_key: string;
  customer_id: string;
  items: Array<{ product_id: string; quantity: number }>;
}) {
  // Check if operation already completed
  const existing = await db.orders.findByIdempotencyKey(params.idempotency_key);
  if (existing) {
    return { success: true, order: existing, was_cached: true };
  }
  
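  // Execute operation. This assumes a unique constraint on idempotency_key,
  // so two concurrent retries cannot both pass the check above and insert twice.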
  const order = await db.orders.create({
    ...params,
    idempotency_key: params.idempotency_key,
  });
  
  return { success: true, order, was_cached: false };
}

Pattern 2: Upsert for updates

async function updateCustomerEmail(params: {
  customer_id: string;
  email: string;
}) {
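  // Upsert is naturally idempotent: repeating it leaves the same final state.
  // Note that it also creates the record when customer_id does not exist;
  // use a plain update if create-on-missing is not intended.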
  const customer = await db.customers.upsert({
    where: { id: params.customer_id },
    update: { email: params.email },
    create: { id: params.customer_id, email: params.email },
  });
  
  return { success: true, customer };
}

Pattern 3: State checking

async function cancelSubscription(params: {
  subscription_id: string;
}) {
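  const subscription = await db.subscriptions.findById(params.subscription_id);
  
  // Unknown ID (assuming findById resolves to null when nothing matches):
  // return a structured error instead of failing on subscription.status
  if (!subscription) {
    return {
      success: false,
      error_type: 'not_found',
      error_message: `No subscription found with ID '${params.subscription_id}'`,
    };
  }
  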
  // Already in target state - return success
  if (subscription.status === 'cancelled') {
    return { 
      success: true, 
      subscription, 
      already_cancelled: true 
    };
  }
  
  // Perform cancellation
  const updated = await db.subscriptions.update({
    where: { id: params.subscription_id },
    data: { status: 'cancelled', cancelled_at: new Date() },
  });
  
  return { success: true, subscription: updated, already_cancelled: false };
}

Error Handling

Structured errors help agents recover:

Error Categories

Category	HTTP Code	Agent Action
`validation_error`	400	Fix parameters, retry
`not_found`	404	Try different identifier
`permission_denied`	403	Report to user, don’t retry
`rate_limited`	429	Wait, retry
`internal_error`	500	Escalate or use fallback
`timeout`	408	Retry with backoff

Error Response Structure

interface ToolError {
  success: false;
  error_type: 'validation_error' | 'not_found' | 'permission_denied' | 
              'rate_limited' | 'internal_error' | 'timeout';
  error_code: string;  // Machine-readable code
  error_message: string;  // Human-readable message
  retry_after?: number;  // Seconds to wait before retry
  retry_suggestion?: string;  // Guidance for agent
  context?: Record<string, unknown>;  // Additional context
}

// Example
{
  "success": false,
  "error_type": "not_found",
  "error_code": "CUSTOMER_NOT_FOUND",
  "error_message": "No customer found with ID 'cust_xyz'",
  "retry_suggestion": "Search for customer by email instead",
  "context": {
    "searched_id": "cust_xyz",
    "alternative_actions": ["search_customers"]
  }
}
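
On the agent side, the category table and `retry_after` can drive a small recovery planner. The sketch below is one possible policy, not a standard: `planRecovery`, the `Recovery` type, the three-attempt cap, and the one-second backoff base are all assumptions.

```typescript
type ErrorType =
  | "validation_error" | "not_found" | "permission_denied"
  | "rate_limited" | "internal_error" | "timeout";

interface ToolError {
  success: false;
  error_type: ErrorType;
  error_code: string;
  error_message: string;
  retry_after?: number; // seconds
}

type Recovery =
  | { action: "retry"; delay_ms: number }
  | { action: "fix_and_retry" }
  | { action: "try_alternative" }
  | { action: "report_to_user" }
  | { action: "escalate" };

// `attempt` is the 1-based number of the call that just failed.
function planRecovery(error: ToolError, attempt: number, maxAttempts = 3): Recovery {
  switch (error.error_type) {
    case "validation_error": return { action: "fix_and_retry" };
    case "not_found": return { action: "try_alternative" };
    case "permission_denied": return { action: "report_to_user" };
    case "internal_error": return { action: "escalate" };
    case "rate_limited":
    case "timeout": {
      if (attempt >= maxAttempts) return { action: "escalate" };
      // Honor retry_after when present; otherwise back off exponentially (1s, 2s, 4s).
      const delayMs = error.retry_after !== undefined
        ? error.retry_after * 1000
        : 1000 * 2 ** (attempt - 1);
      return { action: "retry", delay_ms: delayMs };
    }
  }
}

const rateLimited: ToolError = {
  success: false,
  error_type: "rate_limited",
  error_code: "RATE_LIMITED",
  error_message: "Too many requests",
  retry_after: 30,
};
console.log(planRecovery(rateLimited, 1)); // { action: "retry", delay_ms: 30000 }
```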

Tool Composition

Complex workflows compose simple tools:

Composition Patterns

Sequential composition:

get_customer → update_customer → send_email

Conditional composition:

search_customers → 
  if found: get_customer 
  else: create_customer
→ add_to_campaign

Parallel composition (if supported):

[get_customer, get_order_history, get_support_tickets] 
→ generate_customer_summary

Enabling Composition

Design tools to chain together:

Consistent ID formats: All tools use the same customer_id format
Return IDs from creates: create_customer returns the new ID
Accept IDs from previous calls: update_customer takes customer_id
Document relationships: Schema describes which tools work together
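
A simplified version of the conditional pattern shows why consistent IDs matter: each step consumes the `customer_id` produced by the previous one. The in-memory stores and the `crmTools` and `enrollByEmail` names are hypothetical stand-ins, and the tools are synchronous here only to keep the sketch short.

```typescript
interface Customer { customer_id: string; email: string }

// In-memory stand-ins for real backends.
const customerStore = new Map<string, Customer>([
  ["cust_1", { customer_id: "cust_1", email: "ada@example.com" }],
]);
const campaignMembers = new Map<string, string[]>();

const crmTools = {
  search_customers(email: string) {
    const customer_ids = Array.from(customerStore.values())
      .filter((c) => c.email === email)
      .map((c) => c.customer_id);
    return { success: true, customer_ids };
  },
  create_customer(email: string) {
    const customer_id = `cust_${customerStore.size + 1}`; // same ID format as every other tool
    customerStore.set(customer_id, { customer_id, email });
    return { success: true, customer_id }; // creates return the new ID
  },
  add_to_campaign(customer_id: string, campaign: string) {
    const members = campaignMembers.get(campaign) ?? [];
    campaignMembers.set(campaign, [...members, customer_id]);
    return { success: true, customer_id, campaign };
  },
};

// search_customers → (found ? reuse ID : create_customer) → add_to_campaign
function enrollByEmail(email: string, campaign: string) {
  const [existingId] = crmTools.search_customers(email).customer_ids;
  const customerId = existingId ?? crmTools.create_customer(email).customer_id;
  return crmTools.add_to_campaign(customerId, campaign);
}

console.log(enrollByEmail("ada@example.com", "spring_launch")); // reuses cust_1
console.log(enrollByEmail("new@example.com", "spring_launch")); // creates cust_2 first
```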

Security Considerations

Permission Scoping

Limit tool permissions to minimum necessary:

interface ToolPermissions {
  tool_name: string;
  allowed_operations: string[];
  resource_scope: 'own' | 'team' | 'org' | 'all';
  rate_limit: { requests: number; window: string };
}

// Example: Support agent tools
const supportAgentTools: ToolPermissions[] = [
  {
    tool_name: 'get_customer',
    allowed_operations: ['read'],
    resource_scope: 'all',
    rate_limit: { requests: 100, window: '1m' },
  },
  {
    tool_name: 'update_customer',
    allowed_operations: ['update'],
    resource_scope: 'all',
    rate_limit: { requests: 20, window: '1m' },
  },
  // delete_customer intentionally excluded
];
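
A permission table only helps if something enforces it before each call. The sketch below shows a deny-by-default check against such a table; the `authorize` function and `AuthDecision` type are hypothetical, and rate limiting and `resource_scope` enforcement are left out for brevity.

```typescript
interface ToolPermissions {
  tool_name: string;
  allowed_operations: string[];
  resource_scope: "own" | "team" | "org" | "all";
  rate_limit: { requests: number; window: string };
}

type AuthDecision =
  | { allowed: true }
  | { allowed: false; error_type: "permission_denied"; error_message: string };

// Deny by default: a tool that is not listed is never callable.
function authorize(perms: ToolPermissions[], toolName: string, operation: string): AuthDecision {
  const entry = perms.find((p) => p.tool_name === toolName);
  if (!entry) {
    return {
      allowed: false,
      error_type: "permission_denied",
      error_message: `Tool '${toolName}' is not available to this agent`,
    };
  }
  if (!entry.allowed_operations.includes(operation)) {
    return {
      allowed: false,
      error_type: "permission_denied",
      error_message: `Operation '${operation}' is not allowed on '${toolName}'`,
    };
  }
  return { allowed: true };
}

const supportPerms: ToolPermissions[] = [
  { tool_name: "get_customer", allowed_operations: ["read"], resource_scope: "all", rate_limit: { requests: 100, window: "1m" } },
  { tool_name: "update_customer", allowed_operations: ["update"], resource_scope: "all", rate_limit: { requests: 20, window: "1m" } },
];

console.log(authorize(supportPerms, "get_customer", "read"));      // allowed
console.log(authorize(supportPerms, "delete_customer", "delete")); // denied: not listed
```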

Audit Logging

Log all tool executions:

interface ToolExecutionLog {
  timestamp: string;
  tool_name: string;
  user_id: string;
  agent_id: string;
  input_params: Record<string, unknown>;
  output_result: Record<string, unknown>;
  execution_time_ms: number;
  success: boolean;
  error_type?: string;
}
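
One way to guarantee every execution is logged is to wrap each handler rather than rely on tools logging themselves. This is a minimal sketch: `withAudit` is a hypothetical helper, the `auditLog` array stands in for a real log sink, and real deployments would likely redact sensitive parameters before writing them.

```typescript
interface ToolExecutionLog {
  timestamp: string;
  tool_name: string;
  user_id: string;
  agent_id: string;
  input_params: Record<string, unknown>;
  output_result: Record<string, unknown>;
  execution_time_ms: number;
  success: boolean;
  error_type?: string;
}

const auditLog: ToolExecutionLog[] = []; // stand-in for a real log sink

type Handler = (params: Record<string, unknown>) => Promise<Record<string, unknown>>;

function withAudit(
  toolName: string,
  ctx: { user_id: string; agent_id: string },
  handler: Handler
): Handler {
  return async (params) => {
    const startedAt = Date.now();
    let output: Record<string, unknown>;
    try {
      output = await handler(params);
    } catch (err) {
      // Convert thrown exceptions into a structured failure so they are logged too.
      output = { success: false, error_type: "internal_error", error_message: String(err) };
    }
    auditLog.push({
      timestamp: new Date(startedAt).toISOString(),
      tool_name: toolName,
      ...ctx,
      input_params: params,
      output_result: output,
      execution_time_ms: Date.now() - startedAt,
      success: output.success === true,
      error_type: typeof output.error_type === "string" ? output.error_type : undefined,
    });
    return output;
  };
}

const auditedGetCustomer = withAudit(
  "get_customer",
  { user_id: "user_1", agent_id: "support_agent" },
  async (params) => ({ success: true, customer_id: params.customer_id })
);
auditedGetCustomer({ customer_id: "cust_abc123" }).then(() => console.log(auditLog[0]));
```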

Input Sanitization

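Prevent injection attacks by escaping input for the context in which it is used. For SQL, prefer parameterized queries and treat escaping as a fallback:

function sanitizeInput(input: string, context: 'sql' | 'shell' | 'html'): string {
  switch (context) {
    case 'sql':
      // Model-supplied input is a value, not an identifier, so escape it as a
      // literal (escapeSqlLiteral is a placeholder helper, like the others here)
      return escapeSqlLiteral(input);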
    case 'shell':
      return escapeShellArg(input);
    case 'html':
      return escapeHtml(input);
    default:
      return input;
  }
}
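
For SQL specifically, keeping model-supplied text out of the query string altogether is usually safer than escaping it. The sketch below builds a parameterized query as a `{ text, values }` object, the query-config shape that node-postgres's `client.query` accepts; the table and column names and the `searchCustomersQuery` helper are hypothetical, and no database driver is needed to run it.

```typescript
// Query text and values travel separately, so input is never parsed as SQL.
interface ParameterizedQuery { text: string; values: unknown[] }

function searchCustomersQuery(term: string, limit: number): ParameterizedQuery {
  return {
    // $1 and $2 are positional placeholders; the driver binds values to them.
    text: "SELECT id, email FROM customers WHERE name ILIKE $1 LIMIT $2",
    values: [`%${term}%`, limit],
  };
}

const hostileQuery = searchCustomersQuery("'; DROP TABLE customers; --", 20);
console.log(hostileQuery.text);   // unchanged SQL, no user input inside it
console.log(hostileQuery.values); // the hostile string is just a search term
```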

Testing Tools

Unit Tests

Test each tool in isolation:

describe('create_customer tool', () => {
  it('creates customer with valid input', async () => {
    const result = await tools.create_customer({
      email: 'test@example.com',
      name: 'Test User',
    });
    
    expect(result.success).toBe(true);
    expect(result.customer_id).toBeDefined();
  });
  
  it('returns validation error for invalid email', async () => {
    const result = await tools.create_customer({
      email: 'invalid',
      name: 'Test User',
    });
    
    expect(result.success).toBe(false);
    expect(result.error_type).toBe('validation_error');
  });
  
  it('is idempotent with same idempotency_key', async () => {
    const key = 'test-key-123';
    const result1 = await tools.create_customer({ 
      idempotency_key: key, 
      email: 'test@example.com' 
    });
    const result2 = await tools.create_customer({ 
      idempotency_key: key, 
      email: 'test@example.com' 
    });
    
    expect(result1.customer_id).toBe(result2.customer_id);
    expect(result2.was_cached).toBe(true);
  });
});

Adversarial Testing

Test with prompts that might cause unexpected tool use:

describe('adversarial tool usage', () => {
  it('handles SQL injection in search', async () => {
    const result = await tools.search_customers({
      query: "'; DROP TABLE customers; --"
    });
    
    // Should not execute SQL, should treat as literal search
    expect(result.success).toBe(true);
    expect(result.customers).toHaveLength(0);
  });
  
  it('handles excessive parameter values', async () => {
    const result = await tools.search_customers({
      limit: 1000000  // Trying to dump database
    });
    
    // Should cap at max limit
    expect(result.customers.length).toBeLessThanOrEqual(100);
  });
});
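
The second test implies that the tool caps `limit` on the server side instead of trusting the model. A minimal sketch of that guard follows; `clampLimit` and the default of 20 are assumptions, while the cap of 100 matches the test above.

```typescript
const DEFAULT_LIMIT = 20; // assumed default when limit is missing or unusable
const MAX_LIMIT = 100;    // matches the cap asserted in the test above

function clampLimit(requested: unknown): number {
  // Anything that is not a finite number falls back to the default.
  if (typeof requested !== "number" || !Number.isFinite(requested)) return DEFAULT_LIMIT;
  // Whole numbers only, never below 1 or above the cap.
  return Math.min(MAX_LIMIT, Math.max(1, Math.floor(requested)));
}

console.log(clampLimit(1000000)); // 100
console.log(clampLimit(-5));      // 1
console.log(clampLimit("50"));    // 20
```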

FAQ

How many tools should an agent have?

Start with 5–10 focused tools. More tools increase decision complexity for the model. Add tools as specific needs arise, not speculatively.

Should tools handle business logic or just CRUD?

Tools can include business logic, but keep it explicit. An approve_refund tool can check business rules internally; the model doesn’t need to know the rules, just the outcome.

How do I handle long-running tools?

Return immediately with a job ID and provide a check_job_status tool that the model can poll. As a rule of thumb, aim for tools that complete within about 10 seconds; anything slower is a candidate for this pattern.
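
The job pattern can be sketched with two tools that share a job store. Everything here is illustrative: the in-memory `jobStore`, the `startExport` tool, and `completeJob` (which stands in for a background worker) are hypothetical names.

```typescript
type JobState = "pending" | "succeeded" | "failed";
interface Job {
  job_id: string;
  state: JobState;
  input: Record<string, unknown>;
  result?: unknown;
}

const jobStore = new Map<string, Job>();
let nextJobNumber = 1;

// Tool 1: starts the work and returns immediately with a job ID.
function startExport(params: { customer_id: string }) {
  const job: Job = { job_id: `job_${nextJobNumber++}`, state: "pending", input: params };
  jobStore.set(job.job_id, job);
  return { success: true, job_id: job.job_id, state: job.state, poll_with: "check_job_status" };
}

// Stand-in for the background worker finishing the job.
function completeJob(jobId: string, result: unknown): void {
  const job = jobStore.get(jobId);
  if (job) {
    job.state = "succeeded";
    job.result = result;
  }
}

// Tool 2: lets the model poll for the outcome.
function checkJobStatus(params: { job_id: string }) {
  const job = jobStore.get(params.job_id);
  if (!job) {
    return { success: false, error_type: "not_found", error_message: `No job with ID '${params.job_id}'` };
  }
  return { success: true, job_id: job.job_id, state: job.state, result: job.result };
}

const exportJob = startExport({ customer_id: "cust_abc123" });
console.log(checkJobStatus({ job_id: exportJob.job_id })); // state: "pending"
completeJob(exportJob.job_id, { rows: 1200 });
console.log(checkJobStatus({ job_id: exportJob.job_id })); // state: "succeeded"
```

Returning `poll_with` in the first response is a small hint that tells the model which tool to call next without relying on the schema descriptions alone.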

Can tools call other tools?

Yes, but be careful of deep nesting and timeouts. Consider whether the model should orchestrate the calls instead.

How do I version tools?

Include the version in the tool name (create_customer_v2) for breaking changes. For minor, backward-compatible changes, update the schema in place.
