TL;DR

At about 8 billion tokens a day, token usage was driving up the cost of AT&T's internal Ask AT&T assistant. AT&T rebuilt it on small language models and a multi-agent stack, cutting costs by 90% and improving efficiency.

What happened

AT&T rebuilt Ask AT&T as a multi-agent architecture in which smaller, more efficient language models do the work. The rebuild cut costs by 90% while improving performance and user adoption.

Why it matters for ops

Using multiple small agents instead of one large monolithic system improves scalability and cost-efficiency, and lets each agent be tuned to a specific task.

Action items

  • Evaluate current AI infrastructure for a potential multi-agent re-architecture
  • Use smaller language models for narrow, well-defined tasks
  • Add a flexible orchestration layer to improve system responsiveness
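
To make the orchestration idea concrete, here is a minimal Python sketch of a routing layer that sends each request to a task-specific small-model agent. It is an illustration under stated assumptions, not AT&T's implementation: the names Agent, Orchestrator, and build_orchestrator, the task labels, the keyword classifier, and the fallback to a larger general model are all hypothetical, and the lambdas stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Every name below is a hypothetical illustration; nothing here is
# taken from AT&T's actual system.


@dataclass
class Agent:
    name: str
    handle: Callable[[str], str]  # stand-in for a call to a model endpoint


class Orchestrator:
    """Routes each request to a task-specific agent, with a fallback."""

    def __init__(self, fallback: Agent) -> None:
        self.fallback = fallback
        self.routes: Dict[str, Agent] = {}

    def register(self, task: str, agent: Agent) -> None:
        self.routes[task] = agent

    def classify(self, request: str) -> str:
        # Toy keyword classifier (assumption); a production router would
        # more likely use a small model or rules tuned on real traffic.
        text = request.lower()
        if "invoice" in text or "bill" in text:
            return "billing"
        if "outage" in text or "network" in text:
            return "network"
        return "general"

    def dispatch(self, request: str) -> str:
        # Unregistered task labels fall through to the fallback agent.
        agent = self.routes.get(self.classify(request), self.fallback)
        return f"{agent.name}: {agent.handle(request)}"


def build_orchestrator() -> Orchestrator:
    orch = Orchestrator(
        fallback=Agent("general-llm", lambda r: "handled by larger model")
    )
    orch.register(
        "billing", Agent("billing-slm", lambda r: "handled by small billing model")
    )
    orch.register(
        "network", Agent("network-slm", lambda r: "handled by small network model")
    )
    return orch


if __name__ == "__main__":
    orch = build_orchestrator()
    print(orch.dispatch("Why is my bill higher this month?"))
    print(orch.dispatch("Summarize this policy document"))
```

Keeping routing separate from the agents means a small model can be swapped or added per task without touching the others, which is one way to read the "flexible orchestration layer" action item.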

Source link

https://venturebeat.com/orchestration/8-billion-tokens-a-day-forced-at-and-t-to-rethink-ai-orchestration-and-cut