Speakers Data Summary

This dataset was created to showcase a wide range of speech accents from people with varying language backgrounds. In this repository, native and non-native English speakers each read aloud the same sentence. The data also records each speaker's demographic and linguistic background: age, birthplace, native language, sex, and country. These attributes let users explore which ones are most useful for predicting a speaker's accent. The dataset consists of 2140 speech samples from speakers in 177 countries, representing 214 native languages.
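
As a rough illustration of how the speaker metadata might be summarized, the sketch below counts samples, countries, and native languages from a few rows. The rows are invented for the example, and the column names (`age`, `birthplace`, `native_language`, `sex`, `country`) are assumptions based on the fields listed above; the actual file may use different headers.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows, not taken from the real dataset. The column
# names mirror the fields described above but are assumptions.
SAMPLE_CSV = """age,birthplace,native_language,sex,country
24,"Lyon, France",french,female,france
31,"Osaka, Japan",japanese,male,japan
45,"Austin, Texas, USA",english,female,usa
19,"Marseille, France",french,male,france
"""


def summarize(rows):
    """Count samples, distinct countries, and distinct native languages."""
    rows = list(rows)
    languages = Counter(row["native_language"] for row in rows)
    countries = Counter(row["country"] for row in rows)
    return {
        "samples": len(rows),
        "countries": len(countries),
        "native_languages": len(languages),
        # Most frequent native language among the rows.
        "top_language": languages.most_common(1)[0][0],
    }


summary = summarize(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(summary)
```

To run the same summary on the real metadata, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for an open file handle, with the column names adjusted to match the file's header.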

See the full data set here