Popular on s4story


Similar on s4story

Stream Releases Open-Source AI Agent That Reads Your Face and Adapts How It Speaks

S For Story/10692670
BOULDER, Colo. - s4story -- Built on Vision Agents with Anam and Inworld to demonstrate emotionally aware, video-first AI

Stream released an open-source AI agent that responds to a user's facial expressions, gaze, and engagement in real time. The agent, called Crashout Buddy, is live at visionagents.ai.

The era of the floating orb is over. Most voice agents today are blind. They convert speech to text, run it through an LLM, and read the response back in a flat tone regardless of whether the user is laughing, frustrated, or close to tears. Built on Stream's Vision Agents framework in collaboration with Anam and Inworld, Crashout Buddy watches the user's face and shapes both what the agent says and how it says it. When the user goes quiet, it notices. When they look like they're about to lose it, it softens.

How It Works

The agent runs a multimodal perception stack on Stream's global edge network. MediaPipe tracks 52 facial blendshapes at 8 fps to classify emotion, gaze, and engagement. That signal is injected into the LLM (Gemini) on every turn, which steers Inworld's TTS-2 voice model using natural-language direction such as [say warmly with light, easy energy]. Anam renders a photorealistic, lip-synced avatar. Deepgram handles speech-to-text.

More on S For Story
The same pattern (facial state, rich agent context, expressive voice, lip-synced avatar) suits apps in dating, coaching, recruitment, tutoring, and customer support.

Key capabilities include:
  • Emotion, gaze, and engagement classification with hysteresis to prevent flicker
  • Natural-language voice steering in 100+ languages via Inworld TTS-2
  • Photorealistic lip-synced avatar via Anam's CARA model
  • Proactive re-engagement when the user drifts off-camera or goes quiet
  • Composable processors running at independent frame rates

Availability

The full project is open source. Try the demo at visionagents.ai, read the guide on the Stream blog, or explore the code at: https://github.com/GetStream/Vision-Agents

Contact
Emily Nekvasil
***@getstream.io


Source: Getstream.io

Show All News | Disclaimer | Report Violation

0 Comments

Latest on S For Story