Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio and video latency while processing frames and generating responses. Developers work with an agent abstraction that connects video edge providers, LLMs, and processors into pipelines, making it easier to orchestrate tasks like object detection, pose estimation, and conversational guidance. The project includes SDKs for React, Android, iOS, Flutter, React Native, and Unity, enabling integration into a wide variety of client environments such as mobile apps, web apps, and games.

Features

  • Framework for multimodal agents that process live video, audio, and text
  • Integrations with YOLO, Roboflow, and real time LLMs like OpenAI and Gemini
  • Ultra low latency streaming via Stream’s global edge network
  • Agent abstraction with processors for detection, pose, and custom logic
  • SDKs for React, Android, iOS, Flutter, React Native, and Unity
  • Ready made examples for sports coaching, safety monitoring, and interactive apps

Project Samples

Project Activity

See All Activity >

Categories

Text to Speech

License

Apache License V2.0

Follow Open Vision Agents by Stream

Open Vision Agents by Stream Web Site

Other Useful Business Software
Gemini 3 and 200+ AI Models on One Platform Icon
Gemini 3 and 200+ AI Models on One Platform

Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build generative AI apps with Vertex AI. Switch between models without switching platforms.
Start Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of Open Vision Agents by Stream!

Additional Project Details

Programming Language

Python

Related Categories

Python Text to Speech Software

Registered

2025-11-28