Caption Speak: Empowering Sight with Pi

Description

  • June 2024

We built an affordable, real-time image captioning system that generates spoken descriptions of captured images, helping visually impaired users navigate more safely and manage everyday tasks independently. The system runs entirely on a Raspberry Pi 4 Model B: a camera captures each image, an on-device CNN–LSTM pipeline generates a caption, and the caption is played back as audio. Each image is processed in under 2 seconds, which is fast enough for near real-time guidance. We tested the prototype with at least three users and focused on delivering quick, practical feedback in real-world scenarios.
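
The capture, caption, and speak flow described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code. Every name here (greedy_caption, describe_frame, toy_scores, the <start>/<end> tokens, the 20-word cap) is a hypothetical stand-in. Greedy decoding is assumed because it is a common choice for CNN–LSTM captioners, and the timer is assumed to cover capture through caption generation but not audio playback. The camera, CNN encoder, LSTM decoder, and text-to-speech engine are passed in as plain callables so the sketch runs without any hardware or ML libraries.

```python
import time

START, END = "<start>", "<end>"  # hypothetical special tokens


def greedy_caption(features, next_word_scores, max_len=20):
    """Greedy decoding: repeatedly pick the highest-scoring next word.

    next_word_scores(features, words) stands in for one LSTM decoder step
    and must return a dict mapping candidate words to scores.
    """
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def describe_frame(capture, encode, next_word_scores, speak, budget_s=2.0):
    """Run one capture -> encode -> decode -> speak cycle.

    Returns (caption, elapsed_seconds, within_budget). The timer stops
    before speak() is called (an assumption about what is measured).
    """
    t0 = time.perf_counter()
    frame = capture()            # e.g. a camera read on the Pi
    features = encode(frame)     # e.g. CNN image features
    caption = greedy_caption(features, next_word_scores)
    elapsed = time.perf_counter() - t0
    speak(caption)               # e.g. a text-to-speech call
    return caption, elapsed, elapsed <= budget_s


# Toy decoder used only to exercise the loop: it emits a fixed sentence.
_CANNED = ["a", "person", "crossing", "the", "street", END]


def toy_scores(features, words):
    nxt = _CANNED[len(words) - 1]
    return {w: (1.0 if w == nxt else 0.0) for w in _CANNED}


if __name__ == "__main__":
    caption, elapsed, ok = describe_frame(
        capture=lambda: b"fake-image-bytes",
        encode=lambda frame: [0.0],
        next_word_scores=toy_scores,
        speak=print,
    )
    print(f"{elapsed:.4f}s, within budget: {ok}")
```

On real hardware the four callables would be replaced with a camera read, the trained encoder and decoder, and a speech engine. Keeping them injectable makes it easy to measure each stage against the 2-second target separately.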