Text to Speech Using Your Own Voice: A Generative AI Approach

Generative AI is a growing approach to making machines more capable: a generative application can produce new content rather than only evaluating existing data. This project explored existing text-to-speech models, leveraging their real-time training capabilities. The technology is useful in many contexts, such as when an individual’s speech is impaired by disability or illness. Beyond the usual benefits of text-to-speech, its ability to replicate a personalized voice can reasonably be expected to improve social integration. In non-accessibility contexts, it could be useful for video calling and online lecturing, improving users' day-to-day lives. The project requires the model to train on user voice inputs, provided as uploaded audio files, and to generate a custom voice live. That voice can then be used for text-to-speech.

Team Members:

Suha Akhund
Tristan Becnel
Sing-Rong Chiu
Purva Kantawala
Sachin Nair
Haochen Xu

Semester

2024 Spring
