What I did
Led development of a wearable AR assistive device that transcribes live speech into captions displayed on smart glasses. Integrated streaming ASR, speaker diarization, and gaze-adaptive caption placement to make speech more accessible for people with hearing loss.
Why it mattered
- Enabled inclusive communication for users with hearing loss through real-time, personalized captioning.
- Improved accuracy by 12% and cut transcription latency by 75% through optimized audio and model integration.
- Supported 40+ user studies that shaped future iterations of assistive hearing technology.
How it worked
- ASR pipeline: Implemented AWS Transcribe streaming with noise-reduction preprocessing and speaker diarization.
- Accuracy optimization: Fine-tuned confidence scoring and word-timestamp smoothing for clearer real-time output.
- AR overlay: Used ARKit gaze tracking to position captions dynamically; built a Unity interface designed for visual comfort.
- System architecture: Modularized services for capture, inference, display, and telemetry; FastAPI backend with PostgreSQL.
- Research tools: Web-based dashboard for device calibration and logging, reducing setup friction for researchers.
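As a rough illustration of the confidence scoring and word-timestamp smoothing mentioned above, here is a minimal sketch. The `Word` type, the `smooth_words` helper, the 0.5 confidence threshold, and the 50 ms gap tolerance are all hypothetical choices made for this example, not the project's actual code or tuned values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float       # seconds from stream start
    end: float
    confidence: float  # 0.0 to 1.0, as reported by the ASR engine


def smooth_words(words: List[Word], min_conf: float = 0.5,
                 max_gap: float = 0.05) -> List[Word]:
    """Flag low-confidence words and close small gaps or overlaps.

    Assumes `words` is sorted by start time (hypothetical helper).
    """
    out: List[Word] = []
    for w in words:
        # Keep uncertain words visible but marked, rather than dropping them.
        text = w.text if w.confidence >= min_conf else f"[{w.text}?]"
        start = w.start
        if out and start - out[-1].end <= max_gap:
            # Neighbouring words overlap or nearly touch: meet at the midpoint.
            prev = out[-1]
            mid = (prev.end + start) / 2
            out[-1] = Word(prev.text, prev.start, mid, prev.confidence)
            start = mid
        out.append(Word(text, start, max(w.end, start), w.confidence))
    return out
```

Marking uncertain words instead of hiding them is one possible design choice: the reader still sees the likely word but knows to treat it with caution.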
Tech
- Audio & ML: PyAudio, librosa, NumPy, AWS Transcribe
- Backend: Python, FastAPI, PostgreSQL, AWS Lambda
- AR & Frontend: ARKit, Unity, WebSocket-based dashboard
- Data & Telemetry: Pandas, Plotly, custom analytics tracking
My role
- Architected the ASR and audio router system, enabling real-time streaming between microphone array, beamformer, ASR engine, and AR display with sub-500 ms latency.
- Implemented fixed and adaptive beamforming algorithms (delay-and-sum and MVDR, respectively) in NumPy and PyTorch, improving speech clarity and noise suppression in dynamic environments.
- Integrated AWS Transcribe with a modular router service, supporting live and offline modes through Dockerized microservices and WebSocket messaging.
- Developed telemetry and analytics tools to measure signal quality, latency, and model confidence, guiding major iterations of the device architecture.
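To make the two beamformers named above concrete, the following is a minimal narrowband sketch in NumPy. It assumes a far-field source, a single frequency bin, and a known noise covariance matrix. The function names, the array geometry in the test values, and the diagonal-loading constant are illustrative assumptions, not the device's actual implementation.

```python
import numpy as np


def steering_vector(mic_positions, direction, freq, c=343.0):
    """Far-field steering vector for one frequency bin.

    mic_positions: (M, 3) array in metres.
    direction: vector pointing from the array towards the source.
    """
    mic_positions = np.asarray(mic_positions, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    # Mics further along `direction` are closer to the source and hear it earlier.
    delays = -(mic_positions @ direction) / c  # seconds
    return np.exp(-2j * np.pi * freq * delays)


def delay_and_sum_weights(d):
    """Fixed beamformer: phase-align the channels and average them."""
    return d / len(d)


def mvdr_weights(R, d, loading=1e-6):
    """MVDR weights w = R^-1 d / (d^H R^-1 d), with diagonal loading.

    R: (M, M) Hermitian noise (or noise-plus-interference) covariance.
    """
    M = len(d)
    # Diagonal loading keeps the solve stable when R is poorly conditioned.
    R = R + loading * np.trace(R).real / M * np.eye(M)
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d.conj() @ Rinv_d)


def apply_weights(w, x):
    """Beamformer output y = w^H x for one frequency-domain snapshot x."""
    return np.vdot(w, x)
```

Both weight vectors pass a signal from the look direction with unit gain (`w^H d = 1`). MVDR additionally adapts to the estimated covariance to suppress noise arriving from other directions, and it reduces to delay-and-sum when the noise is spatially white.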