What is Tool Use?
The ability of an LLM to invoke external functions, APIs, or tools as part of its response generation. Tool use is implemented through structured function-call schemas that the model produces and a runtime executes, and it is foundational to all modern agent architectures.
Definition
Tool use is the capability of a language model to invoke external functions, APIs, or tools during response generation rather than relying solely on its internal knowledge. The pattern is implemented through structured function-call schemas: the developer registers tools (with names, descriptions, and parameter schemas), the model decides when to invoke a tool and produces a structured call, the runtime executes that call against the actual tool, and the result is fed back to the model for continued reasoning. This loop — model decides, runtime executes, result returns — is the foundation of all modern agent architectures.
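The loop above can be sketched in a few lines. This is a minimal illustration, not any particular vendor's API: the tool registry, the `get_weather` tool, and the `fake_model` stand-in for the LLM are all hypothetical, but the shape of the loop (model emits a structured call, runtime executes it, result returns as a message) matches real function-calling systems.

```python
import json

# Hypothetical tool registry: name -> description, parameter schema, callable.
# The schema shape mirrors common function-calling formats; names are illustrative.
TOOLS = {
    "get_weather": {
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "fn": lambda city: {"city": city, "temp_c": 21, "conditions": "clear"},
    }
}

def fake_model(messages):
    """Stand-in for an LLM: decides to call a tool, then answers from its result."""
    last = messages[-1]
    if last["role"] == "user":
        # The model emits a structured call instead of prose.
        return {"tool_call": {"name": "get_weather",
                              "arguments": json.dumps({"city": "Oslo"})}}
    if last["role"] == "tool":
        result = json.loads(last["content"])
        return {"content": f"It is {result['temp_c']}°C and "
                           f"{result['conditions']} in {result['city']}."}

def run_agent(user_message):
    """The loop: model decides, runtime executes, result returns to the model."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = fake_model(messages)
        if "tool_call" in reply:
            call = reply["tool_call"]
            args = json.loads(call["arguments"])        # parse the structured call
            result = TOOLS[call["name"]]["fn"](**args)  # runtime executes the tool
            messages.append({"role": "tool", "content": json.dumps(result)})
            continue                                    # feed result back to model
        return reply["content"]

print(run_agent("What's the weather in Oslo?"))
```

In a real deployment, `fake_model` is replaced by an API call to an actual model, and the loop typically adds guardrails: argument validation, timeouts, and a cap on iterations.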
Tool-use fidelity (the model's ability to produce well-formed tool calls reliably under pressure) is now a primary axis of model capability separate from raw reasoning quality. Models from labs that invest heavily in tool-use training (OpenAI, Anthropic, increasingly Alibaba and Moonshot) typically have higher fidelity than community fine-tunes that don't include explicit tool-use training data. Open-weight bases like GPT-OSS, Qwen 3+, Kimi K2.6, and Hermes 4 have particularly strong tool-use behavior; older or general-purpose bases often need fine-tuning to achieve production reliability.
Why It Matters
Tool use is the line between LLMs as text generators and LLMs as agents. Without tool use, a model can only produce text; with tool use, a model can take actions in the world — querying databases, calling APIs, controlling browsers, executing code. Every agent framework (LangChain, LangGraph, CrewAI, AutoGen, Mastra, smolagents, Hermes Agent) builds on tool use as its primitive. For production deployments, tool-use fidelity is often more important than peak reasoning capability — a model that hallucinates tool calls 5% of the time produces unreliable agents regardless of how clever its reasoning is otherwise.
Key Takeaways
- Tool use enables LLMs to invoke external functions and APIs during response generation
- Implemented through structured function-call schemas (name, description, parameters)
- Foundational to all modern agent frameworks and architectures
- Tool-use fidelity is a separate capability axis from raw reasoning quality
- Strong open-weight tool-use bases: GPT-OSS, Qwen 3+, Kimi K2.6, Hermes 4
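To make the schema point concrete, here is roughly what a registered tool looks like, along with a runtime-side check that a model's call is well-formed. The `search_orders` tool and field names are hypothetical; the schema follows the common JSON-Schema-style shape, though exact field names vary by API.

```python
import json

# A typical function-call schema (JSON-Schema-style shape; illustrative names).
tool_schema = {
    "name": "search_orders",  # hypothetical tool
    "description": "Search customer orders by status.",
    "parameters": {
        "type": "object",
        "properties": {
            "status": {"type": "string", "enum": ["open", "shipped", "cancelled"]},
            "limit": {"type": "integer"},
        },
        "required": ["status"],
    },
}

def validate_call(schema, arguments_json):
    """Minimal runtime-side check that a model's call matches the schema."""
    args = json.loads(arguments_json)
    props = schema["parameters"]["properties"]
    required = schema["parameters"].get("required", [])
    missing = [k for k in required if k not in args]   # required args omitted
    unknown = [k for k in args if k not in props]      # hallucinated parameters
    return {"ok": not missing and not unknown, "missing": missing, "unknown": unknown}

print(validate_call(tool_schema, '{"status": "open", "limit": 5}'))
```

Checks like this are one place tool-use fidelity shows up in practice: a model with weak tool-use training produces calls that fail validation (missing required arguments, invented parameters) at a measurably higher rate.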
How Ertas Helps
When fine-tuning models for agentic deployments in Ertas Studio, including explicit tool-use traces in the training data substantially improves the fine-tuned model's tool-use fidelity in production. Ertas Studio supports training data formats with structured function calls, observed tool outputs, and multi-step reasoning traces — letting you produce a fine-tune that handles your specific tool surface reliably rather than degrading toward generic tool-use behavior.