Jinny AI

Jinny is a proof of concept for a voice- and speech-enabled robot interface, powered by ChatGPT and ElevenLabs, that holds human-like conversations.

Team

Elisha Terada – Web App & Interface
Johnny Rodriguez – Interface Design
Steve Yin & Michael Weller – Robot Hardware & Software

Timeframe

2023 – 2024

Background

Traditionally, interactions between humans and robots have been rigid and structured. Humans had to master specific commands that robots could understand, while the robots themselves could only respond in a “robotic” manner. By combining the language understanding capabilities of ChatGPT with the generative audio technology from ElevenLabs, we set out to make human-robot interactions more natural.

Results

As demonstrated in the video, we have developed Jinny, an advanced robotic interface capable of not only listening to and comprehending spoken language but also formulating natural responses. What sets Jinny apart is its ability to differentiate between conversational intent and task-oriented commands, thanks to the integration of a Function Calling API. Moreover, Jinny can respond using a voice that closely mimics human speech patterns.
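The split between conversation and commands can be illustrated with a minimal sketch. It assumes the model's reply arrives as a dictionary shaped like the assistant message of a function-calling chat API, with either plain `content` or a `tool_calls` list. The `move` tool, its parameters, and the `route_reply` helper are hypothetical names chosen for illustration, not Jinny's actual code.

```python
import json

# Hypothetical tool description, in the JSON Schema style that
# function-calling chat APIs accept. Jinny's real tools are not documented here.
MOVE_TOOL = {
    "type": "function",
    "function": {
        "name": "move",
        "description": "Drive the robot in a direction for a number of seconds.",
        "parameters": {
            "type": "object",
            "properties": {
                "direction": {
                    "type": "string",
                    "enum": ["forward", "backward", "left", "right"],
                },
                "seconds": {"type": "number"},
            },
            "required": ["direction", "seconds"],
        },
    },
}


def route_reply(message):
    """Classify an assistant message as a robot task or as speech.

    Returns ("task", name, args) when the model asked for a tool call,
    otherwise ("speech", text) for ordinary conversation.
    The message field names are an assumption about the API's reply shape.
    """
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        call = tool_calls[0]["function"]
        # Arguments arrive as a JSON-encoded string, so decode them first.
        return ("task", call["name"], json.loads(call["arguments"]))
    return ("speech", message.get("content") or "")
```

In a setup like this, a "task" result would be forwarded to the robot, while a "speech" result would be sent to the text-to-speech service.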

3 – Generative AI tech integrated
2 – Monthly Recurring Subscribers
3 weeks – Time to release MVP

Process

Brainstorming

We conducted several remote meetings to brainstorm the design of both the robotic interface and the robot itself. To maximize efficiency, we opted for the mBot2 Neo as the robot’s main body and selected the Google Pixel 6a as a cost-effective yet powerful interface device. For the software stack, we decided to incorporate various API services, including Speechly for voice recognition, ChatGPT for natural language understanding, and ElevenLabs for generative audio.
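The software stack described above amounts to a three-stage pipeline: speech recognition, language understanding, then audio generation. The sketch below shows that flow under stated assumptions. `VoicePipeline` and its field names are hypothetical, and the stages are stand-in functions rather than real Speechly, ChatGPT, or ElevenLabs calls, so the example runs without any network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains the three stages; each stage is a swappable callable."""

    transcribe: Callable[[bytes], str]   # speech recognition (Speechly's role)
    reply: Callable[[str], str]          # language understanding (ChatGPT's role)
    synthesize: Callable[[str], bytes]   # generative audio (ElevenLabs' role)

    def handle(self, audio: bytes) -> bytes:
        """Turn captured audio into spoken-reply audio."""
        text = self.transcribe(audio)
        answer = self.reply(text)
        return self.synthesize(answer)


# Stand-in stages for illustration only: they treat audio as UTF-8 text.
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    reply=lambda text: f"You said: {text}",
    synthesize=lambda text: text.encode("utf-8"),
)
```

Keeping each stage behind a plain callable means any one service could be replaced or mocked without touching the other two.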

Wireframing & Prototyping

Once we reached a consensus on building a simple prototype capable of engaging in natural conversations with humans as well as performing basic tasks like moving around on a table, we began prototyping individually. Throughout this phase, we continuously communicated our progress with the ultimate goal of integrating all the components cohesively.

Development

We convened in person for several days to focus intensely on the integration of all the disparate components. Our efforts culminated in the creation of an autonomous robot, Jinny, capable of conversing naturally with humans and performing tasks independently.

The Launch

After successfully integrating all the components, we unveiled Jinny to our colleagues during an internal lunchtime showcase. Subsequently, we presented our findings at the Northwest Robotics Alliance event, themed “Generative AI for Robotics,” where we received positive feedback from attendees.