One in five people experiences hearing loss. The World Health Organization estimates that this number will increase to one in four by 2050. Luckily, effective hearing devices such as hearing aids and cochlear implants exist, with advanced speaker enhancement algorithms that can significantly improve the quality of life of people suffering from hearing loss. State-of-the-art hearing devices, however, underperform in a so-called 'cocktail party' scenario, when multiple people are talking simultaneously (such as at a family dinner or reception). In such a situation, the hearing device does not know which speaker the user intends to attend to, and thus which speaker to enhance and which others to suppress. A new problem therefore arises in such cocktail party scenarios: determining which speaker the user is attending to, referred to as the auditory attention decoding (AAD) problem.
In the first part of the thesis, we compare different AAD algorithms, which allows us to identify the gaps in the current AAD literature that are partly addressed in this thesis. To enable this comparative study, we develop a new performance metric to evaluate AAD algorithms in the context of adaptive gain control for neuro-steered hearing devices.
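The abstract does not spell this metric out, but the core issue it addresses is that AAD accuracy alone ignores how long each decision takes. As a minimal, hypothetical illustration (not the metric developed in the thesis), the Python sketch below simulates a simple adaptive gain controller driven by AAD decisions and estimates how long it takes to switch the gain to a newly attended speaker; this depends on both the per-decision accuracy and the decision window length:

```python
import numpy as np

def expected_switch_time(p_correct, window_len_s, n_levels=5, n_sims=10_000, seed=0):
    """Monte-Carlo estimate of the time an adaptive gain controller needs to move
    the gain from the previously attended speaker to the newly attended one.

    p_correct    -- probability that a single AAD decision is correct (assumed > 0.5)
    window_len_s -- decision window length in seconds (time needed per decision)
    n_levels     -- number of discrete gain steps between the two speakers
    """
    rng = np.random.default_rng(seed)
    times = np.empty(n_sims)
    for i in range(n_sims):
        level, steps = 0, 0  # right after an attention switch, the gain favors the wrong speaker
        while level < n_levels:
            # each correct decision moves the gain one step toward the attended speaker
            level = level + 1 if rng.random() < p_correct else max(level - 1, 0)
            steps += 1
        times[i] = steps * window_len_s
    return times.mean()

# A slow-but-accurate decoder vs. a fast-but-less-accurate one:
# expected_switch_time(0.90, window_len_s=30)  -> long switches despite high accuracy
# expected_switch_time(0.75, window_len_s=5)   -> potentially faster switching
```

The point of such a combined criterion is that neither accuracy nor decision rate alone determines how responsive the resulting gain control feels to the user.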
In the second part, we address one of the main signal processing challenges in AAD: unsupervised and time-adaptive algorithms. We first develop an unsupervised version of the stimulus decoder that can be trained on a large batch of EEG and audio data without knowledge of the ground-truth attention labels. This unsupervised but subject-specific stimulus decoder, starting from a random initial decoder, outperforms a supervised subject-independent decoder, and, when using subject-independent information, even approximates the performance of a supervised subject-specific decoder. We also extend this unsupervised algorithm to an efficient time-adaptive algorithm that updates the decoder as EEG and audio continuously stream in, and show that it has the potential to outperform a fixed supervised decoder in a practical AAD use case.
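To give an idea of what such an unsupervised, self-labeling training procedure can look like in the classical two-speaker setting, here is a minimal Python sketch. It alternates between (1) labeling each segment with the speaker whose envelope best correlates with the envelope reconstructed by the current decoder and (2) retraining a ridge-regression decoder on those predicted labels. The function names, the random initialization, and the ridge regularization are assumptions of this illustration, not the exact implementation in the thesis:

```python
import numpy as np

def train_decoder(eeg, envelope, lam=1e-3):
    """Ridge-regression stimulus decoder: maps (time-lagged) EEG to the speech envelope."""
    R = eeg.T @ eeg + lam * np.eye(eeg.shape[1])  # regularized EEG autocorrelation
    r = eeg.T @ envelope                          # EEG-envelope cross-correlation
    return np.linalg.solve(R, r)

def corr(a, b):
    """Pearson correlation between two 1-D signals."""
    return np.corrcoef(a, b)[0, 1]

def unsupervised_decoder(eeg_segs, env1_segs, env2_segs, n_iter=10, seed=0):
    """Iteratively self-labeled decoder training without ground-truth attention labels.

    eeg_segs  -- list of (samples, lagged-channels) EEG segments
    env*_segs -- corresponding speech envelopes of speaker 1 and speaker 2 per segment
    """
    rng = np.random.default_rng(seed)
    d = rng.standard_normal(eeg_segs[0].shape[1])  # random initial decoder
    for _ in range(n_iter):
        # Step 1: predict the attended speaker per segment with the current decoder
        predicted = []
        for X, e1, e2 in zip(eeg_segs, env1_segs, env2_segs):
            rec = X @ d
            predicted.append(e1 if corr(rec, e1) >= corr(rec, e2) else e2)
        # Step 2: retrain the decoder on the envelopes it just labeled as attended
        d = train_decoder(np.vstack(eeg_segs), np.concatenate(predicted))
    return d
```

In a time-adaptive variant, the same predict-and-update idea can be applied to incoming data instead of to a fixed batch.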
In the third part, we develop novel AAD algorithms that decode the spatial focus of auditory attention to provide faster and more accurate decoding. The developed methods achieve a much higher accuracy than the stimulus reconstruction (SR) algorithm at very fast decision rates (i.e., short decision windows). Furthermore, we show that these methods are also applicable to different directions of auditory attention, when using only EEG channels close to the ears, and when generalizing to data from an unseen subject.
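As a hedged illustration of what decoding the spatial focus of attention (e.g., left versus right) from short EEG segments can look like, the sketch below uses common spatial pattern (CSP) filters with log-variance features and a linear discriminant classifier, a standard recipe for this kind of spatial EEG classification and not necessarily the exact pipeline developed in the thesis:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def norm_cov(X):
    """Trace-normalized spatial covariance of one EEG trial (channels x samples)."""
    C = X @ X.T
    return C / np.trace(C)

def csp_filters(trials_left, trials_right, n_filt=6):
    """Common spatial patterns: spatial filters whose output variance discriminates
    left- from right-attended EEG trials."""
    C_left = np.mean([norm_cov(X) for X in trials_left], axis=0)
    C_right = np.mean([norm_cov(X) for X in trials_right], axis=0)
    # generalized eigenvectors of (C_left, C_left + C_right)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(C_left + C_right, C_left))
    order = np.argsort(eigvals.real)
    keep = np.r_[order[: n_filt // 2], order[-(n_filt // 2):]]  # most discriminative filters
    return eigvecs.real[:, keep]  # (channels, n_filt)

def log_var_features(trial, W):
    """Log-variance of the CSP-filtered trial: one feature per spatial filter."""
    return np.log(np.var(W.T @ trial, axis=1))

def train_direction_classifier(trials_left, trials_right):
    """Fit CSP filters plus an LDA classifier on labeled training trials."""
    W = csp_filters(trials_left, trials_right)
    X = np.array([log_var_features(t, W) for t in trials_left + trials_right])
    y = np.array([0] * len(trials_left) + [1] * len(trials_right))  # 0 = left, 1 = right
    return W, LinearDiscriminantAnalysis().fit(X, y)

# Decoding a new (short) EEG segment:
# W, lda = train_direction_classifier(trials_left, trials_right)
# direction = lda.predict(log_var_features(new_segment, W)[None, :])
```

Because such a decoder classifies the spatial EEG pattern rather than reconstructing the speech envelope, it can operate on much shorter segments than envelope-correlation-based approaches.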
To summarize, in this thesis we have developed crucial building blocks for a plug-and-play, time-adaptive, unsupervised, fast, and accurate AAD algorithm that could be integrated with a low-latency speaker separation and enhancement algorithm and a wearable, miniaturized EEG system, eventually leading to a neuro-steered hearing device.
I defended my PhD thesis on May 20, 2022 in Leuven. The jury consisted of:
I obtained my PhD degree in Electrical Engineering summa cum laude with congratulations from the board of examiners. A 2-minute video summary is available on YouTube!