Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI
The AWS ML Blog published a post detailing how to combine Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to improve the tool-calling accuracy of small language models (SLMs) on Amazon SageMaker AI. The guide walks through setting up training jobs with SageMaker's managed training capabilities, so developers can focus on training code rather than infrastructure management. It covers the entire pipeline from data preparation to model evaluation and compares a base model against several fine-tuned variants on tool-calling accuracy metrics, so the choice among variants rests on measured results. This approach lets teams deploy more capable agents with smaller models, reducing cost and latency while maintaining high reliability in tool use.
Developers can now fine-tune small models for reliable tool calling without managing infrastructure.
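This summary does not give the post's exact metric definitions, so the following is only a minimal sketch of how a tool-calling accuracy comparison might be scored. It assumes each model output is a single JSON object with `name` and `arguments` fields; that format, and the helper names `parse_tool_call` and `tool_call_accuracy`, are illustrative assumptions rather than the post's actual evaluation code.

```python
import json
from typing import Any, Optional


def parse_tool_call(text: str) -> Optional[dict[str, Any]]:
    """Parse raw model output as a JSON tool call; return None if malformed.

    Assumed (hypothetical) format: {"name": "...", "arguments": {...}}
    """
    try:
        call = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict) or "name" not in call:
        return None
    return {"name": call["name"], "arguments": call.get("arguments", {})}


def tool_call_accuracy(
    predictions: list[str], references: list[dict[str, Any]]
) -> dict[str, float]:
    """Score predictions against reference tool calls.

    Returns three rates over all examples:
      valid_json  - output parsed as a tool call at all
      name_match  - parsed and the tool name is correct
      exact_match - tool name and arguments both match the reference
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    total = len(references)
    if total == 0:
        return {"valid_json": 0.0, "name_match": 0.0, "exact_match": 0.0}

    valid = name_ok = exact = 0
    for pred_text, ref in zip(predictions, references):
        call = parse_tool_call(pred_text)
        if call is None:
            continue
        valid += 1
        if call["name"] == ref["name"]:
            name_ok += 1
            if call["arguments"] == ref.get("arguments", {}):
                exact += 1

    return {
        "valid_json": valid / total,
        "name_match": name_ok / total,
        "exact_match": exact / total,
    }


# Toy data for illustration only; not results from the post.
references = [
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"name": "search", "arguments": {"query": "sagemaker"}},
    {"name": "get_time", "arguments": {}},
]
predictions = [
    '{"name": "get_weather", "arguments": {"city": "Paris"}}',  # exact match
    '{"name": "search", "arguments": {"query": "aws"}}',        # wrong arguments
    "I think you should call get_time",                         # not valid JSON
]
scores = tool_call_accuracy(predictions, references)
print(scores)
```

Running the same scorer over the base model's outputs and each fine-tuned variant's outputs on a held-out set would give the kind of side-by-side comparison the post describes. Reporting the three rates separately helps show whether a variant fails on output format, tool selection, or argument filling.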