Speech decoding resources

This page provides an overview of publicly available speech decoding datasets, which generally contain neural data of subjects listening to natural speech, either with multiple competing talkers (to decode selective auditory attention) or with a single talker.

If your publicly available dataset is not mentioned here, please contact me.

Currently, approximately the following amounts of data are publicly available:

  • 179 hours of EEG with competing speech
  • 8 hours of MEG with competing speech
  • 10.5 hours of EEG with polyphonic music
  • 281 hours of EEG with single-talker speech
  • 71 hours of MEG with single-talker speech
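
Most of the multi-talker datasets listed below are used for the same core analysis: a linear backward model reconstructs the attended speech envelope from time-lagged neural data, and attention is decoded by correlating the reconstruction with each talker's envelope. The following is a minimal sketch of that approach in plain numpy; the array shapes, lag count, and ridge penalty are illustrative assumptions, not values taken from any paper on this page.

```python
import numpy as np

def lag_matrix(eeg, n_lags):
    """Features at time t are the EEG samples (samples x channels) at
    t .. t+n_lags-1, since the neural response follows the stimulus."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * n_lags))
    for lag in range(n_lags):
        X[:n_samples - lag, lag * n_channels:(lag + 1) * n_channels] = eeg[lag:]
    return X

def train_decoder(eeg, attended_env, n_lags=32, ridge=1e3):
    """Ridge regression from lagged EEG to the attended speech envelope."""
    X = lag_matrix(eeg, n_lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ attended_env)

def decode_attention(eeg, env_1, env_2, decoder, n_lags=32):
    """Reconstruct the envelope and pick the talker it correlates with most."""
    reconstruction = lag_matrix(eeg, n_lags) @ decoder
    r_1 = np.corrcoef(reconstruction, env_1)[0, 1]
    r_2 = np.corrcoef(reconstruction, env_2)[0, 1]
    return 1 if r_1 > r_2 else 2
```

In practice, the decoder is trained on trials with known attention labels and evaluated on held-out trials; with neural data downsampled to, say, 64 Hz, 32 lags cover roughly 500 ms of post-stimulus response.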

Multi-talker (selective attention)

Each entry lists the original reference, the participants, the amount of data per participant, the neurorecording system, the stimuli, the sex and location of the competing talkers (or instruments), the acoustic room conditions, additional comments, and links to the dataset and paper.

W. Biesmans, N. Das, T. Francart, and A. Bertrand, “Auditory-Inspired Speech Envelope Extraction Methods for Improved EEG-Based Auditory Attention Detection in a Cocktail Party Scenario,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 5, pp. 402–412, 2017
  • Participants: 16 (young, normal hearing)
  • Data per participant: 72 min (8 trials x 6 min + 12 trials x 2 min)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Dutch short stories
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: dichotic and HRTF-filtered in anechoic room
  • Links: Dataset | Paper

S. A. Fuglsang, T. Dau, and J. Hjortkjær, “Noise-robust cortical tracking of attended speech in real-world acoustic scenes,” NeuroImage, vol. 156, pp. 435–444, 2017
  • Participants: 18 (young, normal hearing)
  • Data per participant: 50 min (60 trials x 50 s)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Danish fictional stories
  • Competing talkers: male-female at 60/-60 degrees
  • Acoustic conditions: HRTF-filtered in anechoic, mildly reverberant, and highly reverberant rooms
  • Comments: EOG available
  • Links: Dataset | Paper

S. A. Fuglsang, J. Märcher-Rørsted, T. Dau, and J. Hjortkjær, “Effects of Sensorineural Hearing Loss on Cortical Synchronization to Competing Speech during Selective Attention,” Journal of Neuroscience, vol. 40, no. 12, pp. 2562–2572, 2020
  • Participants: 44 (22 hearing impaired + 22 normal hearing)
  • Data per participant: 26.7 min (32 trials x 50 s)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Danish audiobooks
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: HRTF-filtered
  • Comments: single-talker, ERP, EFR, and resting-state data also available; in-ear EEG for 19 of the 44 participants; EOG available
  • Links: Dataset | Paper

A. J. Power, J. J. Foxe, E.-J. Forde, R. B. Reilly, and E. C. Lalor, “At what time is the cocktail party? A late locus of selective attention to natural speech,” European Journal of Neuroscience, vol. 35, pp. 1497–1503, 2012
  • Participants: 33 (young, normal hearing)
  • Data per participant: 30 min (30 trials x 1 min)
  • Recording system: 128-channel BioSemi EEG
  • Stimuli: English fictional stories
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Comments: used in the seminal O’Sullivan et al. paper
  • Links: Dataset | Paper

A. Mundanad Narayanan, R. Zink, and A. Bertrand, “EEG miniaturization limits for stimulus decoding with EEG sensor networks,” Journal of Neural Engineering, vol. 18, no. 5, 056042, 2021
  • Participants: 30 (young, normal hearing)
  • Data per participant: 24 min (4 trials x 6 min)
  • Recording system: 255-channel SynAmps RT EEG
  • Stimuli: Dutch fictional stories
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: HRTF-filtered
  • Links: Dataset | Paper

L. Straetmans, B. Holtze, S. Debener, M. Jaeger, and B. Mirkovic, “Neural tracking to go: auditory attention decoding and saliency detection with mobile EEG,” Journal of Neural Engineering, vol. 18, no. 6, 066054, 2021
  • Participants: 20 (young, normal hearing)
  • Data per participant: 30 min (6 trials x 5 min)
  • Recording system: 24-channel EasyCap/SMARTING EEG
  • Stimuli: German audiobooks plus natural salient events
  • Competing talkers: male-male at 45/-45 degrees
  • Acoustic conditions: HRTF-filtered, recorded in a public cafeteria without other people present
  • Comments: 3 trials while walking, 3 trials while sitting; salient environmental sounds added
  • Links: Dataset | Paper

S. Akram, A. Presacco, J. Z. Simon, S. A. Shamma, and B. Babadi, “Robust decoding of selective auditory attention from MEG in a competing-speaker environment via state-space modeling,” NeuroImage, vol. 124, pp. 906–917, 2016
  • Participants: 7 (young, normal hearing)
  • Data per participant: 6 min (2 conditions x 3 repetitions x 1 min)
  • Recording system: 157-channel MEG
  • Stimuli: English fictional stories
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Comments: one instructed attention switch per trial
  • Links: Dataset | Paper

A. Presacco, S. Miran, B. Babadi, and J. Z. Simon, “Real-Time Tracking of Magnetoencephalographic Neuromarkers during a Dynamic Attention-Switching Task,” in Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 4148–4151, 2019
  • Participants: 5 (young, normal hearing)
  • Data per participant: 4.5 min (3 trials x 90 s)
  • Recording system: 157-channel MEG
  • Stimuli: English fictional stories
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Comments: at-will attention switches, 1-3 times per trial
  • Links: Dataset | Paper

C. Brodbeck, L. E. Hong, and J. Z. Simon, “Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech,” Current Biology, vol. 28, no. 24, pp. 3976–3983.e5, 2018
  • Participants: 26 (normal hearing)
  • Data per participant: 16 min (4 trials x 4 repetitions x 1 min)
  • Recording system: 157-channel MEG
  • Stimuli: English audiobooks
  • Competing talkers: male-female (location NA)
  • Acoustic conditions: NA
  • Links: Dataset | Paper

G. Cantisani, G. Trégoat, S. Essid, and G. Richard, “MAD-EEG: an EEG dataset for decoding auditory attention to a target instrument in polyphonic music,” in Speech, Music and Mind (SMM), Satellite Workshop of Interspeech 2019, Vienna, Austria, 2019
  • Participants: 8 (young, normal hearing, non-professional musicians)
  • Data per participant: 30-32 min (78 stimuli x 4 repetitions x 6 s)
  • Recording system: 20-channel B-Alert X24 EEG headset
  • Stimuli: polyphonic music mixtures (14 solos, 40 duets, 24 trios)
  • Competing sources: various instruments
  • Acoustic conditions: loudspeakers at 45/-45 degrees, convex weighting of the instruments in the mixture
  • Comments: EOG, EMG, ECG, and head motion acceleration available; single-instrument recordings available
  • Links: Dataset | Paper

O. Etard, R. B. Messaoud, G. Gaugain, and T. Reichenbach, “No Evidence of Attentional Modulation of the Neural Response to the Temporal Fine Structure of Continuous Musical Pieces,” Journal of Cognitive Neuroscience, vol. 34, no. 3, pp. 411–424, 2022
  • Participants: 17 (young, normal hearing)
  • Data per participant: 22.4 min (7 stimuli, 11.2 min in total, x 2 repetitions)
  • Recording system: 4-channel Ag/AgCl EEG (Multitrode, BrainProducts)
  • Stimuli: music (Bach’s Two-Part Inventions)
  • Competing sources: piano-guitar
  • Acoustic conditions: dichotic
  • Comments: single-instrument recordings available
  • Links: Dataset | Paper

Y. Zhang, H. Ruan, Z. Yuan, H. Du, X. Gao, and J. Lu, “A Learnable Spatial Mapping for Decoding the Directional Focus of Auditory Attention Using EEG,” in 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, pp. 1–5, 2023
  • Participants: 21 (normal hearing)
  • Data per participant: 64 min (32 trials x 2 min)
  • Recording system: 32-channel EMOTIV EPOC Flex Saline EEG
  • Stimuli: Chinese news programs
  • Competing talkers: male-female, at direction pairs drawn from ±135, ±120, ±90, ±60, ±45, ±30, and ±15 degrees
  • Acoustic conditions: loudspeaker array
  • Comments: per trial, a random pair of competing talker directions is used
  • Links: Dataset | Paper

O. Etard, M. Kegler, C. Braiman, A. E. Forte, and T. Reichenbach, “Decoding of selective attention to continuous speech from the human auditory brainstem response,” NeuroImage, vol. 200, pp. 1–11, 2019
  • Participants: 18 (young, normal hearing)
  • Data per participant: 20 min (2 trials x 4 parts x 2.5 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: English audiobooks
  • Competing talkers: male-female at 90/-90 degrees
  • Acoustic conditions: dichotic
  • Links: Dataset | Paper

I. Rotaru, S. Geirnaert, N. Heintz, I. Van de Ryck, A. Bertrand, and T. Francart, “What are we really decoding? Unveiling biases in EEG-based decoding of the spatial focus of auditory attention,” Journal of Neural Engineering, vol. 21, no. 1, 016017, 2024
  • Participants: 13 (young, normal hearing)
  • Data per participant: 80 min (2 blocks x 4 conditions x 10 min)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Dutch science-outreach podcasts
  • Competing talkers: male-male at 90/-90 degrees
  • Acoustic conditions: HRTF-filtered in anechoic room
  • Comments: each condition uses a different audio-visual setup (moving video, moving target noise, no visuals, static video); one attention switch after 5 min within each 10-min trial; EOG available
  • Links: Dataset | Paper

Z. Lin, T. He, S. Cai, and H. Li, “ASA: An Auditory Spatial Attention Dataset with Multiple Speaking Locations,” in Interspeech 2024, Kos, Greece, 2024
  • Participants: 20 (normal hearing)
  • Data per participant: 24 min (20 trials x 1-1.5 min)
  • Recording system: 64-channel EasyCap EEG
  • Stimuli: Mandarin stories
  • Competing talkers: male-female at 90/-90, 60/-60, 45/-45, 30/-30, or 5/-5 degrees
  • Acoustic conditions: HRTF-filtered, presented through headphones
  • Links: Dataset | Paper

Y. Yan, X. Xu, H. Zhu, P. Tian, Z. Ge, X. Wu, and J. Chen, “Auditory Attention Decoding in Four-Talker Environment with EEG,” in Interspeech 2024, Kos, Greece, pp. 432–436, 2024, and H. Zhu et al., “Using Ear-EEG to Decode Auditory Attention in Multiple-speaker Environment,” in 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, pp. 1–5, 2025
  • Participants: 16 (young, normal hearing)
  • Data per participant: 40 min (40 trials x 1 min)
  • Recording system: 64-channel NeuSen wireless EEG + 20-channel cEEGrid
  • Stimuli: Mandarin stories
  • Competing talkers: two male and two female at -90, -30, +30, and +90 degrees
  • Acoustic conditions: loudspeaker array
  • Comments: four-competing-talker paradigm; only the envelopes are included, the original stimuli are available upon request (2301111611@stu.pku.edu.cn)
  • Links: Dataset | Paper

M. Thornton, D. Mandic, and T. Reichenbach, “Decoding of selective attention to speech from Ear-EEG recordings,” arXiv preprint arXiv:2401.05187, 2024
  • Participants: 18 (young, normal hearing)
  • Data per participant: 40 min (16 trials x 150 s)
  • Recording system: 2-channel in-ear EEG (referenced to FT7)
  • Stimuli: English audiobooks
  • Competing talkers: male-female (location NA)
  • Acoustic conditions: diotic, presented through headphones
  • Links: Dataset | Paper

Q. Wang, Q. Zhou, Z. Ma, N. Wang, T. Zhang, Y. Fu, and J. Li, “Le Petit Prince (LPP) multi-talker: Naturalistic 7 T fMRI and EEG dataset,” Scientific Data, vol. 12, no. 829, 2025
  • Participants: 25 (young)
  • Data per participant: 20 min (2 trials x 10 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: Mandarin audiobook (“The Little Prince”)
  • Competing talkers: male-female (synthesized voices; location NA)
  • Acoustic conditions: insert earphones
  • Comments: participants fixated on a crosshair; fMRI and single-talker data also available
  • Links: Dataset | Paper

Single talker

Each entry lists the original reference, the participants, the amount of data per participant, the neurorecording system, the stimuli, the sex of the talker, additional comments, and links to the dataset and paper.

G. M. Di Liberto, J. A. O’Sullivan, and E. C. Lalor, “Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing,” Current Biology, vol. 25, no. 19, pp. 2457–2465, 2015, and M. P. Broderick, A. J. Anderson, G. M. Di Liberto, M. J. Crosse, and E. C. Lalor, “Electrophysiological Correlates of Semantic Dissimilarity Reflect the Comprehension of Natural, Narrative Speech,” Current Biology, vol. 28, no. 5, pp. 803–809.e3, 2018
  • Participants: 19 (young, normal hearing)
  • Data per participant: 60 min (20 trials x 180 s)
  • Recording system: 128-channel BioSemi EEG
  • Stimuli: English fictional stories
  • Talker: male
  • Links: Dataset | Paper

G. M. Di Liberto, J. A. O’Sullivan, and E. C. Lalor, “Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing,” Current Biology, vol. 25, no. 19, pp. 2457–2465, 2015
  • Participants: 10 (young, normal hearing)
  • Data per participant: 72.3 min (28 trials x 155 s)
  • Recording system: 128-channel BioSemi EEG
  • Stimuli: English fictional stories, reversed
  • Talker: male
  • Comments: same stimuli as the dataset above, but played in reverse
  • Links: Dataset | Paper

H. Weissbart, K. D. Kandylaki, and T. Reichenbach, “Cortical Tracking of Surprisal during Continuous Speech Comprehension,” Journal of Cognitive Neuroscience, vol. 32, no. 1, pp. 155–166, 2020
  • Participants: 13 (young, normal hearing)
  • Data per participant: 40 min (15 trials x approx. 2.6 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: English short stories
  • Talker: male
  • Links: Dataset | Paper

F. J. Vanheusden, M. Kegler, K. Ireland, C. Georga, D. M. Simpson, T. Reichenbach, and S. L. Bell, “Hearing Aids Do Not Alter Cortical Entrainment to Speech at Audible Levels in Mild-to-Moderately Hearing-Impaired Subjects,” Frontiers in Human Neuroscience, vol. 14, no. 109, 2020
  • Participants: 17 (older, hearing impaired, hearing aid users)
  • Data per participant: 25 min (8 trials x approx. 3 min)
  • Recording system: 32-channel BioSemi EEG
  • Stimuli: English audiobook
  • Talker: female
  • Comments: trials both with (aided) and without (unaided) a hearing aid
  • Links: Dataset | Paper

L. Bollens, B. Accou, H. Van hamme, and T. Francart, “A Large Auditory EEG Decoding Dataset,” KU Leuven RDR, 2023
  • Participants: 85 (young, normal hearing)
  • Data per participant: 130-150 min (8-10 trials x 15 min)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Flemish audiobooks and podcasts
  • Talkers: male and female
  • Links: Dataset

J. R. Brennan and J. T. Hale, “Hierarchical structure guides rapid linguistic predictions during naturalistic listening,” PLoS ONE, vol. 14, no. 1, e0207741, 2019
  • Participants: 49 (young)
  • Data per participant: 12.4 min
  • Recording system: 61-channel actiCAP EEG
  • Stimuli: English audiobook
  • Talker: female
  • Comments: no mention of the participants’ medical conditions
  • Links: Dataset | Paper

S. A. Fuglsang, J. Märcher-Rørsted, T. Dau, and J. Hjortkjær, “Effects of Sensorineural Hearing Loss on Cortical Synchronization to Competing Speech during Selective Attention,” Journal of Neuroscience, vol. 40, no. 12, pp. 2562–2572, 2020
  • Participants: 44 (22 hearing impaired + 22 normal hearing)
  • Data per participant: 13.3 min (16 trials x 50 s)
  • Recording system: 64-channel BioSemi EEG
  • Stimuli: Danish audiobooks
  • Talkers: male and female
  • Comments: dual-talker, ERP, EFR, and resting-state data also available; in-ear EEG for 19 of the 44 participants
  • Links: Dataset | Paper

L. Gwilliams, J.-R. King, A. Marantz, and D. Poeppel, “Neural dynamics of phoneme sequences reveal position-invariant code for content and order,” Nature Communications, vol. 13, 6606, 2022
  • Participants: 27 (young, normal hearing)
  • Data per participant: 120 min (2 sessions x 1 hour)
  • Recording system: 208-channel MEG
  • Stimuli: English fictional stories
  • Links: Dataset | Paper

N. H. L. Lam, A. Hultén, P. Hagoort, and J.-M. Schoffelen, “Robust neuronal oscillatory entrainment to speech displays individual variation in lateralisation,” Language, Cognition and Neuroscience, vol. 33, no. 8, pp. 943–954, 2018, and various other papers
  • Participants: 102 (young, healthy)
  • Data per participant: around 8.4 min (120 sentences x 2.8-6 s)
  • Recording system: 275-channel MEG
  • Stimuli: Dutch sentences
  • Talker: female
  • Comments: fMRI, resting-state, and reading data also available
  • Links: Dataset | Paper

C. Brodbeck, L. E. Hong, and J. Z. Simon, “Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech,” Current Biology, vol. 28, no. 24, pp. 3976–3983.e5, 2018
  • Participants: 26 (normal hearing)
  • Data per participant: 8 min (8 trials x 1 min)
  • Recording system: 157-channel MEG
  • Stimuli: English audiobooks
  • Talkers: male and female
  • Links: Dataset | Paper

O. Etard and T. Reichenbach, “Neural speech tracking in the theta and in the delta frequency band differentially encode clarity and comprehension of speech in noise,” Journal of Neuroscience, vol. 39, no. 29, pp. 5750–5759, 2019
  • Participants: 12 (young, normal hearing)
  • Data per participant: 40 min (4 noise levels x 4 parts x 2.5 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: English audiobooks
  • Talkers: male and female
  • Comments: four levels of babble noise; EEG of participants listening to Dutch (0% speech comprehension) also available
  • Links: Dataset | Paper

Q. Wang, Q. Zhou, Z. Ma, N. Wang, T. Zhang, Y. Fu, and J. Li, “Le Petit Prince (LPP) multi-talker: Naturalistic 7 T fMRI and EEG dataset,” Scientific Data, vol. 12, no. 829, 2025
  • Participants: 25 (young)
  • Data per participant: 20 min (2 trials x 10 min)
  • Recording system: 64-channel actiCAP EEG
  • Stimuli: Mandarin audiobook (“The Little Prince”)
  • Talkers: male and female (synthesized voices)
  • Comments: dual-talker data also available
  • Links: Dataset | Paper
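
For the single-talker datasets above, the analogous analysis is a forward model (a temporal response function, TRF): predict the neural data from the lagged speech envelope and quantify neural tracking as the per-channel correlation between predicted and recorded responses. Below is a minimal numpy sketch; the lag count, ridge penalty, and array shapes are again illustrative assumptions rather than values from any specific paper.

```python
import numpy as np

def envelope_lags(env, n_lags):
    """Regressors at time t are the envelope samples at t-n_lags+1 .. t."""
    n_samples = env.shape[0]
    X = np.zeros((n_samples, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = env[:n_samples - lag]
    return X

def fit_trf(env, eeg, n_lags=32, ridge=1e2):
    """Ridge regression from the lagged envelope to the neural data
    (samples x channels); the weights form a per-channel TRF."""
    X = envelope_lags(env, n_lags)
    XtX = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ eeg)  # shape: (n_lags, n_channels)

def tracking_score(env, eeg, trf, n_lags=32):
    """Pearson correlation between predicted and recorded data, per channel."""
    prediction = envelope_lags(env, n_lags) @ trf
    return np.array([np.corrcoef(prediction[:, ch], eeg[:, ch])[0, 1]
                     for ch in range(eeg.shape[1])])
```

Full implementations of both the backward and the forward approach, including proper cross-validation, are available in dedicated packages such as the mTRF Toolbox.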