NINNY data set (English phoneme confusion data)

Motivation

Speech perception data (as part of 104 Linguistics) can make for large datasets with wide relevance, for example for automatic speech recognition systems, norms for stimulus selection, and virtual experiments. Consider that the early reports on the identification of American English vowels (Peterson & Barney 1952) and consonants (Miller & Nicely 1955) had accumulated 3941 and 2304 citations, respectively, by November 2019.

An obvious question is how other researchers find such datasets. Traditionally, datasets become known through conventional scientific publication, either in peer-reviewed journals or at international conferences. Additional ways of publicizing datasets that continue to reach audiences for many years are desirable. Text+ is one such initiative: it can also help to maintain datasets, and it provides users with the data in their original form, which is not always the case for datasets that have been handed down locally from user to user.

Objectives

For the NINNY data set, native speakers of American English and Dutch non-native speakers of English listened to English syllables and identified either the consonant or the vowel (Cutler et al. 2004; Cutler et al. in press). The final data set contains accuracy and reaction times for more than 100,000 responses. By November 2019, the 2004 NINNY study had received 280 citations (Google Scholar). The data have been used, among other things, to compare phoneme confusion patterns with the predictions of speech perception models, to guide the selection of sound contrasts for word recognition studies, and for comparison with clinical data. The NINNY data set is of interest to researchers who work on native listening or on non-native listening in comparison to native listening.

In 2004, it was uncommon for journals to insist that accepted publications provide open-access data sets. The 2004 study listed the Max Planck Institute for Psycholinguistics website as the location for accessing the data. All stimuli in WAV format and all individual identification responses were made available there (https://www.mpi.nl/people/cutler-anne/research) and are still available. Researchers’ knowledge of best practices in collecting and sharing large behavioral datasets has developed considerably in the meantime, and we also recently deposited the dataset in the Tübingen Clarin-D repository. Archiving the data in the repository makes them sustainable and easy for other researchers to find.
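
As a rough illustration of what analyzing phoneme confusion patterns can involve, the sketch below tallies identification responses into one confusion matrix per listener group and derives the proportion of correct identifications per phoneme. The rows, the field order, and the function names are hypothetical and chosen only for this example; they do not reflect the actual layout or contents of the NINNY files.

```python
from collections import Counter, defaultdict

# Hypothetical identification responses:
# (listener group, presented phoneme, identified phoneme).
# These rows are invented for illustration and are NOT taken from the NINNY data.
responses = [
    ("native", "p", "p"),
    ("native", "p", "b"),
    ("native", "b", "b"),
    ("non-native", "p", "p"),
    ("non-native", "b", "p"),
    ("non-native", "b", "b"),
]


def confusion_matrix(rows):
    """Count how often each presented phoneme received each response."""
    matrix = defaultdict(Counter)
    for _group, presented, identified in rows:
        matrix[presented][identified] += 1
    return matrix


def accuracy(matrix):
    """Proportion of correct identifications per presented phoneme."""
    return {
        presented: counts[presented] / sum(counts.values())
        for presented, counts in matrix.items()
    }


# Split the responses by listener group, then build one matrix per group.
by_group = defaultdict(list)
for row in responses:
    by_group[row[0]].append(row)

matrices = {group: confusion_matrix(rows) for group, rows in by_group.items()}
accuracies = {group: accuracy(matrix) for group, matrix in matrices.items()}
```

Keeping one matrix per listener group makes it straightforward to compare native and non-native confusion patterns cell by cell; reaction times could be aggregated in the same way if they were added as a further field.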

References

Cutler, A., A. Weber, R. Smits, and N. Cooper. 2004. Patterns of English phoneme confusions by native and non-native listeners. Journal of the Acoustical Society of America 116 (6): 3668–3678.

Cutler, A., M. Ernestus, N. Warner, and A. Weber. In press. Managing big data in speech perception. In The Open Handbook of Linguistic Data Management, edited by A. Berez-Kroeker, B. McDonnell, E. Koller, and L. Collister. Cambridge, MA: MIT Press.

Miller, G. A., and P. E. Nicely. 1955. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical Society of America 27: 338–352.

Peterson, G. E., and H. L. Barney. 1952. Control methods used in a study of the vowels. Journal of the Acoustical Society of America 24 (2): 175–184.