Dataset Description

Uncommon Voice Dataset

Summary:

UncommonVoice is a freely available dataset of crowd-sourced voice disorder speech from 57 speakers. Spasmodic Dysphonia (SD) is the primary voice disorder represented in this dataset; however, the collection was not limited to SD voices.

Contributors: Meredith Moore, Piyush Papreja, Michael Saxon, Visar Berisha, Sethuraman Panchanathan, Kim Kuman and the National Spasmodic Dysphonia Association.

This dataset was collected in affiliation with Arizona State University's Center for Cognitive Ubiquitous Computing.

Motivation:

Towards the goal of improving the representation of individuals with voice disorders in the vast corpora of speech, we present UncommonVoice, a crowdsourced, publicly available dataset of speech from individuals with voice disorders. While datasets like TORGO and UASPEECH focus on freely providing speech data from individuals with dysarthria, UncommonVoice focuses on providing data from individuals with dysphonia.

We believe that UncommonVoice makes a significant contribution to the field and will enable advances in the accessibility of voice-based technologies as well as the development of voice-assistive technologies.

Subjects:

Subjects were recruited via email and social media with the help of the National Spasmodic Dysphonia Association. Individuals with and without voice disorders were asked to participate in the creation of a dataset of voice disorder speech. In the first step of the data collection process, subjects answered a series of survey questions (see paper for full list). They were asked to self-report whether they had a voice disorder and to provide more details about the disorder, including self-reported ratings of voice quality.

Prompts:

Subjects were asked to complete four sections of data collection:

1. Non-words: Sustained corner vowels and diadochokinetic (DDK) rate.

2. Read Speech: Randomly selected TIMIT sentences and the sentences required to complete the CAPE-V intelligibility assessment.

3. Image Descriptions: Spontaneous speech to describe images from the MSCOCO dataset.

4. Non-words (round 2 to measure any change in voice over data collection process).

Recording Instructions:

Subjects used the web-based recording system (which is no longer available) to record their speech samples using the microphone of their choice.

Acknowledgements:

We thank the National Spasmodic Dysphonia Association for their support, enthusiasm, and participation in this research.

Meredith Moore was funded by the National Science Foundation Graduate Research Fellowship at the time of this research.

License:

UncommonVoice is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

UncommonVoice Download Procedure

Paper Download

Video

Interspeech 2020: UncommonVoice