Real-Time Cascading Speech-to-Speech Chatbot: Whisper, Agno (Llama 3.1), Kokoro, and Silero VAD 🚀

martian7r · April 28, 2025, 11:04pm

Hey everyone! I just launched Vocal-Agent — an open-source, real-time cascading speech-to-speech chatbot!

It combines:

Whisper for speech recognition
Silero VAD for voice activity detection
Agno (running Llama 3.1) for an extensible agent framework
Kokoro ONNX for natural voice synthesis
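
For anyone new to the term "cascading": the stages above run one after another (VAD, then speech recognition, then the agent, then speech synthesis). Here is a minimal sketch of that flow, not the actual Vocal-Agent code. Every name in it (`CascadePipeline`, `energy_vad`, `fake_transcribe`, `fake_agent`, `fake_tts`) is a hypothetical stand-in so the example runs without any models installed; a real setup would plug Silero VAD, Whisper, an Agno agent, and Kokoro into the same four slots.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List

Frame = List[float]  # one chunk of mono audio samples


@dataclass
class CascadePipeline:
    """Chains four stages: VAD -> speech-to-text -> agent -> text-to-speech."""
    is_speech: Callable[[Frame], bool]        # VAD slot
    transcribe: Callable[[List[Frame]], str]  # speech recognition slot
    respond: Callable[[str], str]             # agent / LLM slot
    synthesize: Callable[[str], Frame]        # speech synthesis slot
    max_silence_frames: int = 2               # silence that ends an utterance

    def run(self, frames: Iterable[Frame]) -> Iterator[Frame]:
        utterance: List[Frame] = []
        silence = 0
        for frame in frames:
            if self.is_speech(frame):
                utterance.append(frame)
                silence = 0
            elif utterance:
                silence += 1
                if silence >= self.max_silence_frames:
                    # The speaker paused long enough: answer this utterance.
                    yield self._reply(utterance)
                    utterance, silence = [], 0
        if utterance:
            # Flush whatever speech is left when the stream ends.
            yield self._reply(utterance)

    def _reply(self, utterance: List[Frame]) -> Frame:
        text = self.transcribe(utterance)
        return self.synthesize(self.respond(text))


# Placeholder stages (assumptions for illustration, not real model calls).
def energy_vad(frame: Frame) -> bool:
    return any(abs(sample) > 0.1 for sample in frame)

def fake_transcribe(utterance: List[Frame]) -> str:
    return f"{len(utterance)} speech frames"

def fake_agent(text: str) -> str:
    return f"You said: {text}"

def fake_tts(text: str) -> Frame:
    return [0.0] * len(text)  # one silent sample per character


quiet, loud = [0.0] * 4, [0.5] * 4
frames = [quiet, loud, loud, quiet, quiet, loud, quiet]

pipeline = CascadePipeline(energy_vad, fake_transcribe, fake_agent, fake_tts)
replies = list(pipeline.run(frames))
print(f"{len(replies)} replies synthesized")
```

Keeping each stage behind a plain callable is just one way to structure this; it makes it easy to swap a model or test the turn-taking logic on its own.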

Key features include:

Ultra low-latency audio processing
Seamless web integration (Google Search, Wikipedia, Arxiv)

It’s built for smooth, real-time conversations and is easy to extend for custom applications.

Check it out on GitHub: tarun7r/Vocal-Agent, "A cutting-edge Cascading voice assistant combining real-time speech recognition, AI reasoning, and neural text-to-speech capabilities."

Would love to hear your thoughts, feedback, and ideas!

Monali · April 29, 2025, 5:58am

@martian7r This is awesome — huge congrats on the launch!
I’ll definitely share this with the team.

If you’ve posted about it on social media, feel free to share the link — we’d be happy to boost it from our side too!

Also, you should totally submit Vocal-Agent to our Global Agent Hackathon — there’s a $25k prize pool up for grabs! We’ll be giving feedback along the way to help you level it up even more. Would love to see this in the mix!

And if you haven’t already, join the Agno Discord to connect with fellow builders working on similar projects.

martian7r · April 29, 2025, 6:05am

Thanks so much! I’ll send over the link once it’s live.
The hackathon sounds awesome — I’ll definitely check it out! Just joined the Discord too. Appreciate all the support!
