Voice AI

VocalisAI V3: When Every Call Gets Its Own AI Team

March 5, 2026 · 14 min read

In 2024, VocalisAI V1 was a single voice agent named Sofia. She could qualify leads, answer questions, and close sales via Stripe Push-to-Link. That was impressive, but it was also limiting: one agent trying to be everything across healthcare, logistics, legal, and e-commerce meant constant compromise on domain expertise.

V3 changes that entirely. Instead of one generalist, every call now gets a team.

The Core Problem with Single-Agent Voice AI

When you train a single voice agent to handle every possible call scenario, you're making a fundamental architectural mistake. A billing dispute requires different cognitive patterns than an emergency escalation. A sales pitch in Mexico requires different cultural calibration than one in the United States.

The solution isn't a better single agent. The solution is the right agent for the right call.

"Specialization is not a luxury in voice AI — it's the difference between an agent that handles a call and one that resolves it."

Enter Akiva: The Meta-Agent Supervisor

Akiva is not a voice agent. Akiva never speaks to a customer. Akiva's job is to listen to the first few seconds of a call, classify the intent, select the optimal specialist from the pool of 6 agents, and then monitor the entire interaction.

Classification happens across several dimensions:

  • Language & Region: ES-MX routes to Alex, EN-US routes to Nova
  • Urgency Level: Any distress signals trigger Diana (Emergency) immediately
  • Topic Domain: Payment queries go to Marco, retention signals activate Sara
  • Call Type: Inbound vs outbound determines the starting agent pool
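As a rough illustration, the routing rules above could be sketched as a small lookup. This is not Akiva's actual implementation: the `Intent` fields, the topic labels, the rule ordering (distress first, then topic, then language), and the fallback to Nova are all assumptions made for the example. Only the agent names and their triggers come from the list above. The call-type pools are left out because the post does not say what they contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """Hypothetical classification result; field names are assumptions."""
    language: str   # e.g. "ES-MX" or "EN-US"
    distress: bool  # True when an urgency signal was detected
    topic: str      # e.g. "payment", "retention", "general"


# Language defaults and topic specialists named in the post.
LANGUAGE_AGENTS = {"ES-MX": "Alex", "EN-US": "Nova"}
TOPIC_AGENTS = {"payment": "Marco", "retention": "Sara"}


def select_specialist(intent: Intent) -> str:
    # Distress overrides everything else (the post says "immediately").
    if intent.distress:
        return "Diana"
    # Assumed ordering: a topic specialist beats the language default.
    if intent.topic in TOPIC_AGENTS:
        return TOPIC_AGENTS[intent.topic]
    # Assumed fallback: locales without a named agent go to Nova.
    return LANGUAGE_AGENTS.get(intent.language, "Nova")
```

For example, `select_specialist(Intent("ES-MX", False, "general"))` returns "Alex", while the same caller with `distress=True` is routed to "Diana" regardless of language or topic.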

Google Gemini Live API: The Voice Layer

VocalisAI V3 was built for the Google Gemini Live Hackathon. The Gemini Live API enables true real-time conversational AI — not the traditional request/response cycle that adds latency to every turn. The model listens continuously, understands context across the full conversation, and responds with sub-second latency.

Combined with ElevenLabs for voice synthesis (emotional, natural, regionally calibrated), the result is a conversation that feels genuinely human. Not because we're trying to deceive — the agents identify as AI when asked — but because natural conversation flow reduces friction and helps customers get what they need faster.

# Simplified Akiva routing logic (pseudocode)
incoming_call = await twilio.receive()
intent = await akiva.classify(incoming_call.audio_stream)  # language, urgency, topic, call type
agent = akiva.select_specialist(intent)
session = await agent.connect(incoming_call, gemini_live_api)
akiva.monitor(session)  # async supervision
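The last line of the snippet above is marked as async supervision. One way that could work, sketched with only Python's standard `asyncio` module, is to run the monitor as a background task that reads the same stream of turns the agent produces. Everything here is a stand-in: `run_session`, `monitor`, `handle_call`, and the "cancel" keyword check are hypothetical and do not reflect the real Akiva, Twilio, or Gemini Live interfaces.

```python
import asyncio


async def run_session(agent: str, turns: list[str], events: asyncio.Queue) -> None:
    # Stand-in for the live voice session: publish each turn as it happens.
    for turn in turns:
        await events.put((agent, turn))
        await asyncio.sleep(0)  # yield control, as real network I/O would
    await events.put(None)  # sentinel: the call has ended


async def monitor(events: asyncio.Queue) -> list[str]:
    # Stand-in for supervision: observe every turn and flag some of them.
    flagged = []
    while (event := await events.get()) is not None:
        agent, turn = event
        if "cancel" in turn.lower():  # hypothetical retention signal
            flagged.append(f"{agent}: {turn}")
    return flagged


async def handle_call(agent: str, turns: list[str]) -> list[str]:
    events: asyncio.Queue = asyncio.Queue()
    # The supervisor starts in the background and never blocks the session.
    supervisor = asyncio.create_task(monitor(events))
    await run_session(agent, turns, events)
    return await supervisor
```

Running `asyncio.run(handle_call("Nova", ["Hi", "I want to cancel my plan"]))` returns a list containing the single flagged turn. The point of the design is that the session never waits on the supervisor, so monitoring adds no latency to the conversation.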

The TOF Ethical Layer: Why It's Non-Negotiable

Every interaction in VocalisAI V3 passes through the Tikun Olam Framework (TOF) before any agent responds. This is not optional. It's not a compliance checkbox. It's the infrastructure.

Voice AI that's optimized purely for conversion becomes a manipulation engine. Marco (Billing) could theoretically push customers toward payment plans that maximize revenue but hurt the customer. Sara (Follow-up) could exploit emotional vulnerability to prevent cancellations. Diana (Emergency) could under-escalate to keep call times short.

The 5 Sefirotic dimensions evaluated on every interaction prevent this:

  • Chesed: Is this genuinely helpful?
  • Gevurah: Are limits respected?
  • Tiferet: Is it balanced?
  • Netzach: Long-term wellbeing?
  • Hod: Honest and clear?
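The post does not describe how these five questions are evaluated, so the following is only a minimal sketch of what a pre-response gate might look like. It assumes each dimension has already been scored between 0 and 1 by some upstream evaluator. The scores, the 0.5 threshold, and the names `tof_gate` and `Verdict` are hypothetical.

```python
from dataclasses import dataclass

# The five dimensions named in the post.
DIMENSIONS = ("chesed", "gevurah", "tiferet", "netzach", "hod")


@dataclass(frozen=True)
class Verdict:
    approved: bool
    failed: tuple[str, ...]  # dimensions that fell below the threshold


def tof_gate(scores: dict[str, float], threshold: float = 0.5) -> Verdict:
    """Approve a draft reply only if every dimension meets the threshold."""
    # A missing score counts as a failure rather than a silent pass.
    failed = tuple(d for d in DIMENSIONS if scores.get(d, 0.0) < threshold)
    return Verdict(approved=not failed, failed=failed)
```

Under these assumptions, a draft reply that scores well everywhere except `netzach` (say, a payment plan that hurts the customer long term) would come back with `approved=False` and `failed=("netzach",)`, and the agent would have to produce a different response.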

Results & What's Next

After 12,000+ calls processed from V1 through V3, the patterns are clear: specialized agents resolve calls faster, with higher customer satisfaction and fewer escalations. The TOF layer has caught and prevented several edge cases where an agent would have acted against the customer's best interest.

V4 is on the roadmap: inter-agent collaboration (Alex and Marco working the same call), full memory across sessions (Sara remembering a customer from 3 months ago), and deeper TOF integration at the Akiva classification level.

Experience It Live

Call now and experience VocalisAI V3 — ask for Alex (ES-MX) or Nova (EN-US).
