Speech decoding resources

This page provides an overview of publicly available speech decoding datasets, which generally contain neural data of subjects listening to natural speech, either with multiple competing talkers (to decode selective auditory attention) or with a single talker.

If your publicly available dataset is not mentioned here, please contact me.

Currently, approximately the following amounts of data are publicly available:

  • 179 hours of EEG with competing speech
  • 8 hours of MEG with competing speech
  • 10.5 hours of EEG with polyphonic music
  • 281 hours of EEG with single-talker speech
  • 71 hours of MEG with single-talker speech
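
Most of the multi-talker datasets listed below are used for the same core analysis: a linear backward model reconstructs the attended speech envelope from time-lagged neural data, and attention is decoded by correlating the reconstruction with each talker's envelope. The following is a minimal sketch of that approach in plain numpy; the array shapes, lag count, and ridge penalty are illustrative assumptions, not values taken from any paper on this page.

```python
import numpy as np

def lag_matrix(eeg, n_lags):
    """Features at time t are the EEG samples (samples x channels) at
    t .. t+n_lags-1, since the neural response follows the stimulus."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * n_lags))
    for lag in range(n_lags):
        X[:n_samples - lag, lag * n_channels:(lag + 1) * n_channels] = eeg[lag:]
    return X

def train_decoder(eeg, attended_env, n_lags=32, ridge=1e3):
    """Ridge regression from lagged EEG to the attended speech envelope."""
    X = lag_matrix(eeg, n_lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ attended_env)

def decode_attention(eeg, env_1, env_2, decoder, n_lags=32):
    """Reconstruct the envelope and pick the talker it correlates with most."""
    reconstruction = lag_matrix(eeg, n_lags) @ decoder
    r_1 = np.corrcoef(reconstruction, env_1)[0, 1]
    r_2 = np.corrcoef(reconstruction, env_2)[0, 1]
    return 1 if r_1 > r_2 else 2
```

In practice, the decoder is trained on trials with known attention labels and evaluated on held-out trials; with neural data downsampled to, say, 64 Hz, 32 lags cover roughly 500 ms of post-stimulus response.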

Multi-talker (selective attention)

Each entry lists the original reference, the participants, the amount of data per participant, the neurorecording system, the stimuli, the sex and location of the competing talkers (or instruments), the acoustic room conditions, additional comments, and links to the dataset and paper.

W. Biesmans, N. Das, T. Francart, and A. Bertrand, “Auditory-Inspired Speech Envelope Extraction Methods for Improved EEG-Based Auditory Attention Detection in a Cocktail Party Scenario,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 5, pp. 402–412, 2017
  • Participants: 16 (young, normal hearing)
  • Data per participant: 72 min (8 trials x 6 min + 12 trials x 2 min)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Dutch short stories
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: dichotic and HRTF-filtered in anechoic room
  • Links: Dataset | Paper

S. A. Fuglsang, T. Dau, and J. Hjortkjær, “Noise-robust cortical tracking of attended speech in real-world acoustic scenes,” NeuroImage, vol. 156, pp. 435–444, 2017
  • Participants: 18 (young, normal hearing)
  • Data per participant: 50 min (60 trials x 50 s)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Danish fictional stories
  • Competing talkers: male-female at 60/-60 degrees
  • Acoustic conditions: HRTF-filtered in anechoic, mildly reverberant, and highly reverberant rooms
  • Comments: EOG available
  • Links: Dataset | Paper

S. A. Fuglsang, J. Märcher-Rørsted, T. Dau, and J. Hjortkjær, “Effects of Sensorineural Hearing Loss on Cortical Synchronization to Competing Speech during Selective Attention,” Journal of Neuroscience, vol. 40, no. 12, pp. 2562–2572, 2020
  • Participants: 44 (22 hearing impaired + 22 normal hearing)
  • Data per participant: 26.7 min (32 trials x 50 s)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Danish audiobooks
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: HRTF-filtered
  • Comments: single-talker, ERP, EFR, and resting-state data also available; in-ear EEG for 19 of the 44 participants; EOG available
  • Links: Dataset | Paper

A. J. Power, J. J. Foxe, E.-J. Forde, R. B. Reilly, and E. C. Lalor, “At what time is the cocktail party? A late locus of selective attention to natural speech,” European Journal of Neuroscience, vol. 35, pp. 1497–1503, 2012
  • Participants: 33 (young, normal hearing)
  • Data per participant: 30 min (30 trials x 1 min)
  • Recording system: 128-channel BioSemi EEG
  • Stimuli: English fictional stories
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Comments: used in the seminal O’Sullivan et al. paper
  • Links: Dataset | Paper

A. Mundanad Narayanan, R. Zink, and A. Bertrand, “EEG miniaturization limits for stimulus decoding with EEG sensor networks,” Journal of Neural Engineering, vol. 18, no. 5, 056042, 2021
  • Participants: 30 (young, normal hearing)
  • Data per participant: 24 min (4 trials x 6 min)
  • Recording system: 255-channel SynAmps RT EEG
  • Stimuli: Dutch fictional stories
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: HRTF-filtered
  • Links: Dataset | Paper

L. Straetmans, B. Holtze, S. Debener, M. Jaeger, and B. Mirkovic, “Neural tracking to go: auditory attention decoding and saliency detection with mobile EEG,” Journal of Neural Engineering, vol. 18, no. 6, 066054, 2021
  • Participants: 20 (young, normal hearing)
  • Data per participant: 30 min (6 trials x 5 min)
  • Recording system: 24-channel EasyCap/SMARTING EEG
  • Stimuli: German audiobooks plus natural salient events
  • Competing talkers: male-male at 45/-45 degrees
  • Acoustic conditions: HRTF-filtered, recorded in a public cafeteria without other people present
  • Comments: 3 trials while walking, 3 trials while sitting; salient environmental sounds added
  • Links: Dataset | Paper

S. Akram, A. Presacco, J. Z. Simon, S. A. Shamma, and B. Babadi, “Robust decoding of selective auditory attention from MEG in a competing-speaker environment via state-space modeling,” NeuroImage, vol. 124, pp. 906–917, 2016
  • Participants: 7 (young, normal hearing)
  • Data per participant: 6 min (2 conditions x 3 repetitions x 1 min)
  • Recording system: 157-channel MEG
  • Stimuli: English fictional stories
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Comments: one instructed attention switch per trial
  • Links: Dataset | Paper

A. Presacco, S. Miran, B. Babadi, and J. Z. Simon, “Real-Time Tracking of Magnetoencephalographic Neuromarkers during a Dynamic Attention-Switching Task,” in Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 4148–4151, 2019
  • Participants: 5 (young, normal hearing)
  • Data per participant: 4.5 min (3 trials x 90 s)
  • Recording system: 157-channel MEG
  • Stimuli: English fictional stories
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Comments: at-will attention switches, 1-3 times per trial
  • Links: Dataset | Paper

C. Brodbeck, L. E. Hong, and J. Z. Simon, “Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech,” Current Biology, vol. 28, no. 24, pp. 3976–3983.e5, 2018
  • Participants: 26 (normal hearing)
  • Data per participant: 16 min (4 trials x 4 repetitions x 1 min)
  • Recording system: 157-channel MEG
  • Stimuli: English audiobooks
  • Competing talkers: male-female (location NA)
  • Acoustic conditions: NA
  • Links: Dataset | Paper

G. Cantisani, G. Trégoat, S. Essid, and G. Richard, “MAD-EEG: an EEG dataset for decoding auditory attention to a target instrument in polyphonic music,” in Speech, Music and Mind (SMM), Satellite Workshop of Interspeech 2019, Vienna, Austria, 2019
  • Participants: 8 (young, normal hearing, non-professional musicians)
  • Data per participant: 30-32 min (78 stimuli x 4 repetitions x 6 s)
  • Recording system: 20-channel B-Alert X24 EEG headset
  • Stimuli: polyphonic music mixtures (14 solos, 40 duets, 24 trios)
  • Competing sources: various instruments
  • Acoustic conditions: loudspeakers at 45/-45 degrees, convex weighting of the instruments in the mixture
  • Comments: EOG, EMG, ECG, and head motion acceleration available; single-instrument recordings available
  • Links: Dataset | Paper

O. Etard, R. B. Messaoud, G. Gaugain, and T. Reichenbach, “No Evidence of Attentional Modulation of the Neural Response to the Temporal Fine Structure of Continuous Musical Pieces,” Journal of Cognitive Neuroscience, vol. 34, no. 3, pp. 411–424, 2022
  • Participants: 17 (young, normal hearing)
  • Data per participant: 22.4 min (7 stimuli, 11.2 min in total, x 2 repetitions)
  • Recording system: 4-channel Ag/AgCl EEG (Multitrode, BrainProducts)
  • Stimuli: music (Bach’s Two-Part Inventions)
  • Competing sources: piano-guitar
  • Acoustic conditions: dichotic
  • Comments: single-instrument recordings available
  • Links: Dataset | Paper

Y. Zhang, H. Ruan, Z. Yuan, H. Du, X. Gao, and J. Lu, “A Learnable Spatial Mapping for Decoding the Directional Focus of Auditory Attention Using EEG,” in 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, pp. 1–5, 2023
  • Participants: 21 (normal hearing)
  • Data per participant: 64 min (32 trials x 2 min)
  • Recording system: 32-channel EMOTIV EPOC Flex Saline EEG
  • Stimuli: Chinese news programs
  • Competing talkers: male-female, at direction pairs drawn from ±135, ±120, ±90, ±60, ±45, ±30, and ±15 degrees
  • Acoustic conditions: loudspeaker array
  • Comments: per trial, a random pair of competing talker directions is used
  • Links: Dataset | Paper

O. Etard, M. Kegler, C. Braiman, A. E. Forte, and T. Reichenbach, “Decoding of selective attention to continuous speech from the human auditory brainstem response,” NeuroImage, vol. 200, pp. 1–11, 2019
  • Participants: 18 (young, normal hearing)
  • Data per participant: 20 min (2 trials x 4 parts x 2.5 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: English audiobooks
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Links: Dataset | Paper

I. Rotaru, S. Geirnaert, N. Heintz, I. Van de Ryck, A. Bertrand, and T. Francart, “What are we really decoding? Unveiling biases in EEG-based decoding of the spatial focus of auditory attention,” Journal of Neural Engineering, vol. 21, no. 1, 016017, 2024
  • Participants: 13 (young, normal hearing)
  • Data per participant: 80 min (2 blocks x 4 conditions x 10 min)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Dutch science-outreach podcasts
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: HRTF-filtered in anechoic room
  • Comments: each condition uses a different audio-visual setup (moving video, moving target noise, no visuals, static video); one attention switch after 5 min within each 10-min trial; EOG available
  • Links: Dataset | Paper

Z. Lin, T. He, S. Cai, and H. Li, “ASA: An Auditory Spatial Attention Dataset with Multiple Speaking Locations,” in Interspeech 2024, Kos, Greece, 2024
  • Participants: 20 (normal hearing)
  • Data per participant: 24 min (20 trials x 1-1.5 min)
  • Recording system: 64-channel EasyCap EEG
  • Stimuli: Mandarin stories
  • Competing talkers: male-female at 90/-90, 60/-60, 45/-45, 30/-30, or 5/-5 degrees
  • Acoustic conditions: HRTF-filtered, presented through headphones
  • Links: Dataset | Paper

Y. Yan, X. Xu, H. Zhu, P. Tian, Z. Ge, X. Wu, and J. Chen, “Auditory Attention Decoding in Four-Talker Environment with EEG,” in Interspeech 2024, Kos, Greece, pp. 432–436, 2024, and H. Zhu et al., “Using Ear-EEG to Decode Auditory Attention in Multiple-speaker Environment,” in 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, pp. 1–5, 2025
  • Participants: 16 (young, normal hearing)
  • Data per participant: 40 min (40 trials x 1 min)
  • Recording system: 64-channel NeuSen wireless EEG + 20-channel cEEGrid
  • Stimuli: Mandarin stories
  • Competing talkers: two male and two female at -90, -30, +30, and +90 degrees
  • Acoustic conditions: loudspeaker array
  • Comments: four-competing-talker paradigm; only the envelopes are included, the original stimuli are available upon request (2301111611@stu.pku.edu.cn)
  • Links: Dataset | Paper

M. Thornton, D. Mandic, and T. Reichenbach, “Decoding of selective attention to speech from Ear-EEG recordings,” arXiv preprint arXiv:2401.05187, 2024
  • Participants: 18 (young, normal hearing)
  • Data per participant: 40 min (16 trials x 150 s)
  • Recording system: 2-channel in-ear EEG (referenced to FT7)
  • Stimuli: English audiobooks
  • Competing talkers: male-female (location NA)
  • Acoustic conditions: diotic, presented through headphones
  • Links: Dataset | Paper

Q. Wang, Q. Zhou, Z. Ma, N. Wang, T. Zhang, Y. Fu, and J. Li, “Le Petit Prince (LPP) multi-talker: Naturalistic 7 T fMRI and EEG dataset,” Scientific Data, vol. 12, no. 829, 2025
  • Participants: 25 (young)
  • Data per participant: 20 min (2 trials x 10 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: Mandarin audiobook (“The Little Prince”)
  • Competing talkers: male-female (synthesized voices; location NA)
  • Acoustic conditions: insert earphones
  • Comments: participants fixated on a crosshair; fMRI and single-talker data also available
  • Links: Dataset | Paper

Single talker

Each entry lists the original reference, the participants, the amount of data per participant, the neurorecording system, the stimuli, the sex of the talker, additional comments, and links to the dataset and paper.

G. M. Di Liberto, J. A. O’Sullivan, and E. C. Lalor, “Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing,” Current Biology, vol. 25, no. 19, pp. 2457–2465, 2015, and M. P. Broderick, A. J. Anderson, G. M. Di Liberto, M. J. Crosse, and E. C. Lalor, “Electrophysiological Correlates of Semantic Dissimilarity Reflect the Comprehension of Natural, Narrative Speech,” Current Biology, vol. 28, no. 5, pp. 803–809.e3, 2018
  • Participants: 19 (young, normal hearing)
  • Data per participant: 60 min (20 trials x 180 s)
  • Recording system: 128-channel BioSemi EEG
  • Stimuli: English fictional stories
  • Talker: male
  • Links: Dataset | Paper

G. M. Di Liberto, J. A. O’Sullivan, and E. C. Lalor, “Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing,” Current Biology, vol. 25, no. 19, pp. 2457–2465, 2015
  • Participants: 10 (young, normal hearing)
  • Data per participant: 72.3 min (28 trials x 155 s)
  • Recording system: 128-channel BioSemi EEG
  • Stimuli: English fictional stories, reversed
  • Talker: male
  • Comments: same stimuli as the dataset above, but played in reverse
  • Links: Dataset | Paper

H. Weissbart, K. D. Kandylaki, and T. Reichenbach, “Cortical Tracking of Surprisal during Continuous Speech Comprehension,” Journal of Cognitive Neuroscience, vol. 32, no. 1, pp. 155–166, 2020
  • Participants: 13 (young, normal hearing)
  • Data per participant: 40 min (15 trials x approx. 2.6 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: English short stories
  • Talker: male
  • Links: Dataset | Paper

F. J. Vanheusden, M. Kegler, K. Ireland, C. Georga, D. M. Simpson, T. Reichenbach, and S. L. Bell, “Hearing Aids Do Not Alter Cortical Entrainment to Speech at Audible Levels in Mild-to-Moderately Hearing-Impaired Subjects,” Frontiers in Human Neuroscience, vol. 14, no. 109, 2020
  • Participants: 17 (older, hearing impaired, hearing aid users)
  • Data per participant: 25 min (8 trials x approx. 3 min)
  • Recording system: 32-channel BioSemi EEG
  • Stimuli: English audiobook
  • Talker: female
  • Comments: trials both with (aided) and without (unaided) a hearing aid
  • Links: Dataset | Paper

L. Bollens, B. Accou, H. Van hamme, and T. Francart, “A Large Auditory EEG Decoding Dataset,” KU Leuven RDR, 2023
  • Participants: 85 (young, normal hearing)
  • Data per participant: 130-150 min (8-10 trials x 15 min)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Flemish audiobooks and podcasts
  • Talkers: male and female
  • Links: Dataset

J. R. Brennan and J. T. Hale, “Hierarchical structure guides rapid linguistic predictions during naturalistic listening,” PLoS ONE, vol. 14, no. 1, e0207741, 2019
  • Participants: 49 (young)
  • Data per participant: 12.4 min
  • Recording system: 61-channel actiCAP EEG
  • Stimuli: English audiobook
  • Talker: female
  • Comments: no mention of the participants’ medical conditions
  • Links: Dataset | Paper

S. A. Fuglsang, J. Märcher-Rørsted, T. Dau, and J. Hjortkjær, “Effects of Sensorineural Hearing Loss on Cortical Synchronization to Competing Speech during Selective Attention,” Journal of Neuroscience, vol. 40, no. 12, pp. 2562–2572, 2020
  • Participants: 44 (22 hearing impaired + 22 normal hearing)
  • Data per participant: 13.3 min (16 trials x 50 s)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Danish audiobooks
  • Talkers: male and female
  • Comments: dual-talker, ERP, EFR, and resting-state data also available; in-ear EEG for 19 of the 44 participants
  • Links: Dataset | Paper

L. Gwilliams, J.-R. King, A. Marantz, and D. Poeppel, “Neural dynamics of phoneme sequences reveal position-invariant code for content and order,” Nature Communications, vol. 13, 6606, 2022
  • Participants: 27 (young, normal hearing)
  • Data per participant: 120 min (2 sessions x 1 hour)
  • Recording system: 208-channel MEG
  • Stimuli: English fictional stories
  • Links: Dataset | Paper

N. H. L. Lam, A. Hultén, P. Hagoort, and J.-M. Schoffelen, “Robust neuronal oscillatory entrainment to speech displays individual variation in lateralisation,” Language, Cognition and Neuroscience, vol. 33, no. 8, pp. 943–954, 2018, and various other papers
  • Participants: 102 (young, healthy)
  • Data per participant: around 8.4 min (120 sentences x 2.8-6 s)
  • Recording system: 275-channel MEG
  • Stimuli: Dutch sentences
  • Talker: female
  • Comments: fMRI, resting-state, and reading data also available
  • Links: Dataset | Paper

C. Brodbeck, L. E. Hong, and J. Z. Simon, “Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech,” Current Biology, vol. 28, no. 24, pp. 3976–3983.e5, 2018
  • Participants: 26 (normal hearing)
  • Data per participant: 8 min (8 trials x 1 min)
  • Recording system: 157-channel MEG
  • Stimuli: English audiobooks
  • Talkers: male and female
  • Links: Dataset | Paper

O. Etard and T. Reichenbach, “Neural speech tracking in the theta and in the delta frequency band differentially encode clarity and comprehension of speech in noise,” Journal of Neuroscience, vol. 39, no. 29, pp. 5750–5759, 2019
  • Participants: 12 (young, normal hearing)
  • Data per participant: 40 min (4 noise levels x 4 parts x 2.5 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: English audiobooks
  • Talkers: male and female
  • Comments: four levels of babble noise; EEG of participants listening to Dutch (0% speech comprehension) also available
  • Links: Dataset | Paper

Q. Wang, Q. Zhou, Z. Ma, N. Wang, T. Zhang, Y. Fu, and J. Li, “Le Petit Prince (LPP) multi-talker: Naturalistic 7 T fMRI and EEG dataset,” Scientific Data, vol. 12, no. 829, 2025
  • Participants: 25 (young)
  • Data per participant: 20 min (2 trials x 10 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: Mandarin audiobook (“The Little Prince”)
  • Talkers: male and female (synthesized voices)
  • Comments: dual-talker data also available
  • Links: Dataset | Paper
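
For the single-talker datasets above, the analogous analysis is a forward model (a temporal response function, TRF): predict the neural data from the lagged speech envelope and quantify neural tracking as the per-channel correlation between predicted and recorded responses. Below is a minimal numpy sketch; the lag count, ridge penalty, and array shapes are again illustrative assumptions rather than values from any specific paper.

```python
import numpy as np

def envelope_lags(env, n_lags):
    """Regressors at time t are the envelope samples at t-n_lags+1 .. t."""
    n_samples = env.shape[0]
    X = np.zeros((n_samples, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = env[:n_samples - lag]
    return X

def fit_trf(env, eeg, n_lags=32, ridge=1e2):
    """Ridge regression from the lagged envelope to the neural data
    (samples x channels); the weights form a per-channel TRF."""
    X = envelope_lags(env, n_lags)
    XtX = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ eeg)  # shape: (n_lags, n_channels)

def tracking_score(env, eeg, trf, n_lags=32):
    """Pearson correlation between predicted and recorded data, per channel."""
    prediction = envelope_lags(env, n_lags) @ trf
    return np.array([np.corrcoef(prediction[:, ch], eeg[:, ch])[0, 1]
                     for ch in range(eeg.shape[1])])
```

Full implementations of both the backward and the forward approach, including proper cross-validation, are available in dedicated packages such as the mTRF Toolbox.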