CS Majors Win 'Best Hack for Health' with App for the Visually Impaired
A team of first-year Georgia Tech computer science (CS) majors has created an AI-powered app that empowers visually impaired people to lead more independent lives.
JARVIS (yes, the team members are big Iron Man fans) uses open-source AI tools, machine learning frameworks, and computer vision techniques to provide users with a richer, more thorough understanding of their immediate surroundings.
CS majors Arnav Chintawar, Dhruv Roongta, and Sahibpreet Singh developed JARVIS as their entry for Cal Hacks 10.0, a hackathon organized by a student group at the University of California, Berkeley. The team won multiple awards, including Best Hack for Health, Best Use of Zilliz, and Best Use of GitHub, during the in-person event in late October, which attracted more than 2,000 participants.
The multifunctional app acts much like a personal assistant for users. It can be integrated with a smartwatch and can:
- Recognize and interpret a person’s environment, offering detailed scene descriptions.
- Read text and make recommendations.
- Recognize friends and family.
Along with these capabilities, JARVIS can perceive, interpret, and describe non-verbal cues of an individual near the user.
A visually impaired family member inspired the team to develop JARVIS. Along with volunteering with local organizations, the team consulted advocates from the Center for the Visually Impaired to better understand the difficulties faced by the community and how the app could help.
“We set out to bridge the accessibility gap for blind and visually impaired individuals by giving them unprecedented situational awareness of their surroundings. We hope that JARVIS can improve the quality of life for this community,” said Roongta.
The team started by making JARVIS easy to use. Responding to spoken queries, JARVIS could help a user meet a friend for dinner. The app would describe the general setting and layout of the restaurant, and provide an approximate number of people present and the activities observed.
“We used a speech-to-text and text-to-speech model similar to Siri or Alexa to ensure JARVIS would be easy to access and seem familiar to users,” said Chintawar.
The user would then ask JARVIS to scan the room for friends, family, or others based on a database of images uploaded by the user. Once it recognizes someone, the app says their name and where they are in the room. The team estimates that the identity classification model behind this feature is about 95% accurate.
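As a rough illustration of how matching against a database of uploaded images can work, the sketch below compares a face embedding from the camera with embeddings enrolled by the user and announces the closest match if it is similar enough. This is not the team's code: the function names, the 0.8 similarity threshold, the three-number embeddings, and the left/ahead/right position rule are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(query_embedding, enrolled, threshold=0.8):
    """Return the enrolled name most similar to the query, or None if below threshold.

    `enrolled` maps a person's name to an embedding computed from an uploaded photo.
    The threshold value is a hypothetical choice, not a figure from the article.
    """
    best_name, best_score = None, threshold
    for name, embedding in enrolled.items():
        score = cosine_similarity(query_embedding, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def describe_position(face_center_x, frame_width):
    """Turn the horizontal position of a detected face into a spoken phrase."""
    third = frame_width / 3
    if face_center_x < third:
        return "on your left"
    if face_center_x < 2 * third:
        return "straight ahead"
    return "on your right"


# Toy data: real embeddings would come from a face recognition model.
enrolled_faces = {"Asha": [1.0, 0.0, 0.0], "Ben": [0.0, 1.0, 0.0]}
name = identify([0.9, 0.1, 0.0], enrolled_faces)
if name is not None:
    print(f"{name} is {describe_position(100, 640)}")
```

In a real system the embeddings would have hundreds of dimensions and the nearest-neighbor lookup would typically be delegated to a vector database rather than a Python loop.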
JARVIS would then analyze the friend’s facial expressions as the pair chats before ordering. It conveys the detected emotions via audio descriptions or haptic pulses through the user’s smartwatch. The pulses vary in intensity based on the level of emotion it observes.
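One simple way to make pulses vary with the level of emotion is to map a model's emotion score onto a small number of vibration levels. The sketch below assumes a score between 0 and 1 and five intensity levels; both are illustrative assumptions, and the function name is hypothetical rather than part of JARVIS.

```python
import math


def pulse_intensity(emotion_score, levels=5):
    """Map an emotion score in [0, 1] to an integer pulse level in 0..levels.

    Scores outside the range are clamped. A score of 0 produces no pulse,
    and any positive score produces at least level 1.
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    clamped = min(max(emotion_score, 0.0), 1.0)
    return math.ceil(clamped * levels)


for score in (0.0, 0.1, 0.5, 1.0):
    print(score, "->", pulse_intensity(score))
```

Rounding up means that even a faint signal is felt as a light pulse, while only a score of exactly zero stays silent.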
When they are ready to review the menu, the user could ask JARVIS to list the appetizers or the vegetarian options. This capability integrates optical character recognition technology with the team’s text-to-speech model, which allows the app to make relevant recommendations.
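To make the menu step concrete, the sketch below takes lines of text as an optical character recognition engine might return them and picks out the items under a requested section, ready to be passed to a text-to-speech step. It is a minimal sketch under stated assumptions: the OCR output is represented as a plain list of strings, section headers are assumed to appear on their own lines, and `items_in_section` is a hypothetical helper, not a function from JARVIS.

```python
def items_in_section(ocr_lines, section, known_sections):
    """Return the menu items listed under `section`.

    ocr_lines: text lines in reading order, as produced by an OCR step.
    section: the section the user asked for, e.g. "appetizers".
    known_sections: header names used to detect where sections start and end.
    """
    wanted = section.strip().lower()
    headers = {name.lower() for name in known_sections}
    items = []
    in_section = False
    for raw_line in ocr_lines:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines left by the OCR layout
        if line.lower() in headers:
            # A header line switches the current section on or off.
            in_section = line.lower() == wanted
            continue
        if in_section:
            items.append(line)
    return items


menu_lines = ["APPETIZERS", "Spring rolls", "Garlic bread", "", "MAINS", "Grilled salmon"]
print(items_in_section(menu_lines, "appetizers", ["Appetizers", "Mains"]))
```

Filtering for something like vegetarian options would need extra information, such as dietary markers printed on the menu, which this example does not attempt to model.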
“This project has broadened our technical knowledge and instilled in us a profound sense of empathy and a commitment to enhancing the lives of visually impaired individuals,” said Singh.
Building on its success, the team is pushing JARVIS forward with an eye toward future entrepreneurial competitions. Planned upgrades include extending compatibility to a broader range of wearable computing devices and adding more robust description capabilities.
“We look forward to participating in the 2024 Georgia Tech InVenture Prize competition with an improved version of JARVIS. This will likely include customizing the vision model and fine-tuning it on custom data,” said Roongta.
Additional details about the technologies behind JARVIS and the team’s development approach are available on its Cal Hacks 10.0 hackathon development site.