One of the biggest problems in speech recognition and natural language processing is the shortage of labeled data. Many recent techniques, such as generative adversarial networks, aim to work with little labeled data, but even then their potential is limited by how little is available. One kind of labeled data in especially short supply is audio matched to its text, i.e. subtitles.
Fluency Rodeo is a language learning tool that lets users create subtitles for audio and video media and turn them into flash cards, which can then be studied using spaced repetition. The best way to learn a language is through immersion, and media is an excellent way to get there. Letting users volunteer the subtitles they create, to improve speech recognition and natural language processing techniques, could also be a great boon for researchers in these fields.
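
To make the idea concrete, here is a minimal sketch of how timed subtitles might become flash cards, and why the same records could serve as labeled audio-text pairs. It is not Fluency Rodeo's actual code: the SRT input format, the names `Card`, `cards_from_srt`, and `review`, and the doubling review schedule are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Card:
    """Hypothetical flash card: a span of media plus the text spoken in it."""
    start: float            # seconds into the media
    end: float              # seconds into the media
    text: str               # subtitle text; also the transcript label
    interval_days: int = 1  # days until the next review


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def cards_from_srt(srt: str) -> List[Card]:
    """Turn SRT-style subtitle text into one card per cue."""
    cards = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:]).strip()
                if text:
                    cards.append(Card(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cards


def review(card: Card, remembered: bool) -> Card:
    """Simplified spaced repetition (assumed): double on success, reset on failure."""
    card.interval_days = card.interval_days * 2 if remembered else 1
    return card
```

Each `Card` pairs a time span in the media with the text spoken during it, which is exactly the kind of aligned audio-text sample a speech recognition researcher would want. A real scheduler would likely use something more refined than the doubling rule shown here.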