A complete recap of our biggest platform and product updates from January 2026.
Welcome to the January edition of the VideoSDK Monthly Updates! We’re starting 2026 with a major leap forward in AI infrastructure, voice capabilities, and developer control across the platform.
This month introduces a powerful new milestone: VideoSDK-managed inference for AI Agents, dramatically expanding what you can build without managing complex AI pipelines. Alongside this, we’ve broadened our AI ecosystem, launched evaluation tooling for agent performance, delivered advanced video optimization across SDKs, and pushed deeper into IoT voice experiences.
Let’s dive in.
VideoSDK-Managed Inference In Agents SDK
The biggest highlight this month is the launch of VideoSDK-managed inference support for Agents, a major step toward making real-time AI truly turnkey.
You can now run complete voice agent pipelines through the VideoSDK Gateway, eliminating the need to orchestrate multiple AI providers yourself.
Cascading Pipeline Supported Models (STT → LLM → TTS)
Fully managed routing across components:
- Speech-to-Text: Google, SarvamAI, Deepgram
- Large Language Models: Google-supported models
- Text-to-Speech: Google, SarvamAI, Cartesia
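Conceptually, a cascading pipeline chains three gateway-routed stages per conversational turn. The sketch below is illustrative only: the class and function names are hypothetical stand-ins, not the Agents SDK's actual API, and the toy components simply echo data through each stage.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins for gateway-routed components; in the managed
# setup these would map to providers such as Deepgram (STT), a
# Google-supported LLM, and Cartesia (TTS).
@dataclass
class CascadingPipeline:
    stt: Callable[[bytes], str]   # audio in  -> transcript
    llm: Callable[[str], str]     # transcript -> reply text
    tts: Callable[[str], bytes]   # reply text -> audio out

    def run_turn(self, audio_in: bytes) -> bytes:
        """One conversational turn: STT -> LLM -> TTS."""
        transcript = self.stt(audio_in)
        reply = self.llm(transcript)
        return self.tts(reply)

# Toy components that pass data straight through each stage.
pipeline = CascadingPipeline(
    stt=lambda audio: audio.decode(),
    llm=lambda text: f"You said: {text}",
    tts=lambda text: text.encode(),
)

print(pipeline.run_turn(b"hello"))  # b'You said: hello'
```

The value of the managed offering is that this routing, provider auth, and failover all happen inside the VideoSDK Gateway rather than in your own orchestration code.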
Realtime Pipeline Supported Models
- Powered by Gemini Realtime
- Enables ultra-low-latency conversational experiences
This update transforms agents from DIY integrations into a managed AI platform experience, dramatically reducing infrastructure overhead and time to production.
Expanded Speech & Audio Capabilities
- ElevenLabs: Enhanced TTS with language control
- Cartesia: Advanced generation configuration (emotion, speed, volume)
Together, these integrations significantly expand the range of voices, languages, latency profiles, and quality options available for building conversational AI.
Introducing Agent Evaluation & Benchmarking
Building agents is only half the challenge; measuring their performance is equally critical.
This month introduces videosdk-eval, a dedicated framework for testing and validating agent quality before production deployment.
Key capabilities include:
- Simulation of multi-turn conversations
- Component-level evaluation (STT, LLM, TTS)
- Latency tracking per component and end-to-end
- Performance and quality benchmarking
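To make component-level latency tracking concrete, here is a minimal sketch of the idea under the hood. The `LatencyTracker` name and its methods are assumptions for illustration, not the `videosdk-eval` API; a real evaluation run would instrument actual STT, LLM, and TTS calls.

```python
import time
from contextlib import contextmanager

# Illustrative per-component latency tracker (hypothetical names,
# not the videosdk-eval API).
class LatencyTracker:
    def __init__(self):
        self.samples = {}  # component name -> list of durations (seconds)

    @contextmanager
    def measure(self, component: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.samples.setdefault(component, []).append(
                time.perf_counter() - start)

    def end_to_end(self) -> float:
        # Sum of mean per-component latencies approximates a full turn.
        return sum(sum(v) / len(v) for v in self.samples.values())

tracker = LatencyTracker()
for component in ("stt", "llm", "tts"):
    with tracker.measure(component):
        time.sleep(0.01)  # stand-in for real component work
```

Recording each stage separately is what lets a benchmark pinpoint whether a slow turn is caused by transcription, generation, or synthesis.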
Real-Time Analytics & Observability Improvements
Production AI systems require deep visibility into performance. January brings major upgrades to analytics across the agent platform.
Enhancements include:
- Improved latency tracking and tracing
- Token usage collection for major AI providers
- Real-time analytics streaming via PubSub
- Centralized playground analytics mode
These capabilities provide actionable insights for optimizing cost, performance, and user experience.
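As a rough sketch of how token-usage collection and real-time streaming fit together, the snippet below aggregates usage per provider while pushing each event to a publish callback (standing in for a PubSub topic). The class name, schema, and provider key are assumptions, not the platform's actual analytics format.

```python
from collections import defaultdict

# Illustrative token-usage collector; the event schema is an assumption,
# not the platform's actual analytics format.
class UsageCollector:
    def __init__(self, publish):
        self.totals = defaultdict(lambda: {"input": 0, "output": 0})
        self.publish = publish  # e.g. a PubSub topic publisher

    def record(self, provider: str, input_tokens: int, output_tokens: int):
        totals = self.totals[provider]
        totals["input"] += input_tokens
        totals["output"] += output_tokens
        # Stream each event as it happens, so dashboards stay live.
        self.publish({"provider": provider,
                      "input": input_tokens,
                      "output": output_tokens})

events = []
usage = UsageCollector(events.append)
usage.record("google", 120, 45)
usage.record("google", 80, 30)
print(usage.totals["google"])  # {'input': 200, 'output': 75}
```

Keeping running totals alongside streamed events supports both cost accounting after the fact and real-time observability during a session.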
Advanced Video Optimization Across SDKs
Our core RTC SDKs received significant upgrades to video quality control and monitoring.
iOS SDK - Reliability & Lifecycle Improvements
New capabilities enhance stability for production apps:
- Explicit FAILED meeting state
- Bulk removal of event listeners
- Deterministic media track lifecycle management
- Automatic restoration of microphone and camera states after reconnection
Android SDK - Improved Observability
Android updates focused on developer experience and runtime stability:
- Enhanced trace messaging
- Safer listener management
- Thread-safety improvements
Expanding Into IoT Voice Experiences
January also marks continued progress toward embedded real-time communication.
IoT SDK Enhancements
- Acoustic Echo Cancellation (AEC) support for ESP32-S3-Korvo-V2
This enables clearer voice interactions on hardware devices such as smart assistants, kiosks, and industrial systems.
✨ What This Means for Developers
January’s updates move the platform decisively toward a future where developers can build sophisticated real-time AI systems without managing infrastructure complexity.
From fully managed inference pipelines to expanded AI providers, evaluation tooling, and advanced media controls, these improvements significantly reduce the gap between prototype and production.
What's Next?
As we continue through 2026, our focus remains on making real-time AI and communication infrastructure more powerful, flexible, and accessible to developers worldwide.
Expect deeper integrations, smarter agents, and continued improvements across performance, reliability, and developer experience.
New Content and Resources
Explore our latest tutorials and blogs to help you build more advanced AI agents and voice workflows.
- Voicemail detection tutorial - Learn how to automatically detect voicemails and trigger intelligent callbacks or alternate flows.
- Multi-agent switching video tutorial - See how to automatically hand off conversations between specialized agents while preserving context.
- Testing and eval tutorial - A step-by-step guide to validating agent performance using evaluation tools and simulated conversations.
📝 Blogs
- Testing and eval blog - Best practices for measuring quality, latency, and reliability before deploying to production.
- Voicemail detection blog - How to build smarter telephony agents that understand call outcomes.
- Multi-agent switching blog - Designing complex workflows with specialized agents collaborating in real time.
Ready to Build with the Latest?
Upgrade your SDKs to the latest versions to take advantage of all these new features and improvements.
Join our Discord Community