Jinny is a proof of concept for a voice- and speech-enabled robot interface, powered by ChatGPT and ElevenLabs, that achieves human-like conversations.
Traditionally, interactions between humans and robots have been rigid and structured. Humans had to master specific commands that robots could understand, while the robots themselves could only respond in a “robotic” manner. Leveraging the language learning capabilities of ChatGPT and the generative audio technology from ElevenLabs, we are addressing the challenge of facilitating more natural human-robot interactions.
As demonstrated in the video, we have developed Jinny, an advanced robotic interface capable of not only listening to and comprehending spoken language but also formulating natural responses. What sets Jinny apart is its ability to differentiate between conversational intent and task-oriented commands, thanks to the integration of a Function Calling API. Moreover, Jinny can respond using a voice that closely mimics human speech patterns.
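The split between conversation and commands can be illustrated with a minimal sketch. It assumes the model's reply arrives as a dictionary with a `content` field and an optional `function_call` field (the shape used by OpenAI's original function-calling responses). The `move` and `stop` commands, their parameters, and the `handle_reply` helper are hypothetical names chosen for illustration, not Jinny's actual code.

```python
import json

# Hypothetical robot commands; names and parameters are illustrative only.
def move(direction: str, distance_cm: int) -> str:
    return f"moving {direction} {distance_cm} cm"

def stop() -> str:
    return "stopping"

# Registry mapping function names the model may request to local handlers.
COMMANDS = {"move": move, "stop": stop}

def handle_reply(message: dict) -> tuple[str, str]:
    """Route a model reply to speech or to a robot command.

    Returns ("speak", text) for a conversational reply and
    ("act", result) when the model requested a function call.
    """
    call = message.get("function_call")
    if call is None:
        # No function requested: treat the reply as conversation.
        return ("speak", message.get("content") or "")
    handler = COMMANDS.get(call["name"])
    if handler is None:
        # The model asked for a command this robot does not have.
        return ("speak", "Sorry, I can't do that.")
    # Arguments arrive as a JSON-encoded string.
    args = json.loads(call.get("arguments") or "{}")
    return ("act", handler(**args))
```

Under these assumptions, a reply such as `{"content": "Hi there!"}` would be routed to speech, while a reply carrying `{"name": "move", "arguments": "{\"direction\": \"forward\", \"distance_cm\": 10}"}` in `function_call` would be routed to the robot.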
We conducted several remote meetings to brainstorm the design of both the robotic interface and the robot itself. To maximize efficiency, we opted for the mBot2 Neo as the robot’s main body and selected the Google Pixel 6a as a cost-effective yet powerful interface device. For the software stack, we decided to incorporate various API services, including Speechly for voice recognition, ChatGPT for natural language understanding, and ElevenLabs for generative audio.
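The software stack described above forms a three-stage chain: speech in, text reply, speech out. The sketch below shows one way such a chain could be wired so that each service sits behind a plain callable. The `VoicePipeline` class and its field names are assumptions made for illustration, and the stand-in lambdas replace the real Speechly, ChatGPT, and ElevenLabs calls, which are not shown here.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class VoicePipeline:
    transcribe: Callable[[bytes], str]   # speech recognition stage
    respond: Callable[[str], str]        # language model stage
    synthesize: Callable[[str], bytes]   # generative audio stage

    def run(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)
        if not text.strip():
            return b""  # nothing was recognized, so stay silent
        reply = self.respond(text)
        return self.synthesize(reply)

# Stand-in stages so the sketch runs without any external service.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"You said: {text}",
    synthesize=lambda text: text.encode("utf-8"),
)
```

Keeping each stage behind a callable lets team members prototype and test their component separately before integration, since any stage can be swapped for a stub.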
Once we reached a consensus on building a simple prototype capable of engaging in natural conversations with humans as well as performing basic tasks like moving around on a table, we began prototyping individually. Throughout this phase, we continuously communicated our progress with the ultimate goal of integrating all the components cohesively.
We convened in person for several days to focus intensely on the integration of all the disparate components. Our efforts culminated in the creation of an autonomous robot, Jinny, capable of conversing naturally with humans and performing tasks independently.
After successfully integrating all the components, we unveiled Jinny to our colleagues during an internal lunchtime showcase. We then presented our findings at the Northwest Robotics Alliance event, themed “Generative AI for Robotics,” where we received positive feedback from attendees.