Martijn Wobbes

A technically driven developer passionate about creating performant, scalable, and maintainable systems.

AI Research Project

This project was completed as my Bachelor's thesis and focused on improving the reliability and interpretability of EEG-based Error-Related Potential (ErrP) detection. The goal was to train a deep learning model that not only classifies whether an ErrP is present, but also estimates how confident it is in each prediction. This additional uncertainty signal is particularly useful for real-world Brain-Computer Interface (BCI) systems, where noisy data can lead to unreliable decisions.

To enhance interpretability, the project also explored explainable AI (XAI) techniques. Shapley values (SHAP) were used to identify which parts of the EEG signal influenced the model's decisions and contributed to higher uncertainty. This aimed to help determine whether elevated uncertainty corresponded to noisy or corrupted EEG segments — a valuable property for safety-critical neural interfaces.

How it works

EEG recordings were sourced from a standard ErrP dataset (Chavarriaga & Millán, 2010). The data was pre-processed through filtering, epoch segmentation around feedback events, and normalization before being fed into a convolutional neural network (CNN). Synthetic noise of varying type, duration, and intensity was injected into controlled regions of the data to test the model's ability to identify uncertainty caused by corrupted signals.
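To make the pipeline concrete, here is a minimal preprocessing sketch. The sampling rate, filter band, and epoch window below are illustrative assumptions, not the exact settings used in the thesis.

```python
# Minimal preprocessing sketch: band-pass filter, epoch around feedback
# events, and z-score normalize. All constants here are placeholders.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 512              # sampling rate in Hz (assumed)
BAND = (1.0, 10.0)    # ErrP-relevant frequency band in Hz (assumed)
WINDOW = (0.0, 0.8)   # epoch window around feedback onset in s (assumed)

def bandpass(raw, fs=FS, band=BAND, order=4):
    """raw: (channels, samples) continuous EEG."""
    b, a = butter(order, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    return filtfilt(b, a, raw, axis=-1)

def epoch(raw, event_samples, fs=FS, window=WINDOW):
    """Cut a fixed-length segment around each feedback event."""
    start, stop = int(window[0] * fs), int(window[1] * fs)
    return np.stack([raw[:, s + start : s + stop] for s in event_samples])

def normalize(epochs):
    """Per-epoch, per-channel z-scoring."""
    mu = epochs.mean(axis=-1, keepdims=True)
    sd = epochs.std(axis=-1, keepdims=True) + 1e-8
    return (epochs - mu) / sd
```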

The model architecture extends a well-established CNN for ErrP classification by introducing two output heads (sketched below):

  • ErrP Classification Head: Predicts whether the EEG segment contains an ErrP.

  • Uncertainty Head: Estimates the model's confidence in its prediction on a per-trial basis.
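
For illustration, a minimal PyTorch sketch of this dual-head idea follows. The backbone layers and input dimensions are placeholders, not the exact architecture the thesis extends.

```python
# Illustrative dual-head CNN: a shared convolutional backbone feeding a
# classification head and a per-trial uncertainty head. Layer sizes are
# placeholders, not the validated ErrP architecture from the thesis.
import torch
import torch.nn as nn

class DualHeadErrPNet(nn.Module):
    def __init__(self, n_channels=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(1, 32), padding=(0, 16)),   # temporal filter
            nn.BatchNorm2d(16),
            nn.ELU(),
            nn.Conv2d(16, 32, kernel_size=(n_channels, 1)),           # spatial filter
            nn.BatchNorm2d(32),
            nn.ELU(),
            nn.AdaptiveAvgPool2d((1, 8)),
            nn.Flatten(),
        )
        feat = 32 * 8
        self.cls_head = nn.Linear(feat, 2)   # ErrP vs. no-ErrP logits
        self.unc_head = nn.Linear(feat, 1)   # per-trial uncertainty score

    def forward(self, x):                    # x: (batch, 1, channels, samples)
        z = self.backbone(x)
        return self.cls_head(z), self.unc_head(z)
```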

Both heads are trained jointly using a dual-objective loss function that balances classification accuracy against the quality of the uncertainty estimates. After training, SHAP was applied to generate feature-attribution maps for both outputs. The key focus was the uncertainty head: analyzing which EEG regions contribute to low-confidence predictions and whether those regions align with the injected noise.
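
One possible formulation of such a dual objective, shown purely as a sketch, supervises the uncertainty head with a "was the prediction wrong?" target; the exact loss used in the thesis may differ.

```python
# A sketch of a dual-objective loss: cross-entropy for classification plus
# a binary term teaching the uncertainty head to flag likely errors.
# The weighting and formulation are illustrative assumptions.
import torch
import torch.nn.functional as F

def dual_loss(cls_logits, unc_logit, labels, lam=0.5):
    # Standard classification objective.
    ce = F.cross_entropy(cls_logits, labels)
    # Target for the uncertainty head: 1 if the classifier is wrong.
    # Computed without gradients so it acts as a fixed label.
    with torch.no_grad():
        wrong = (cls_logits.argmax(dim=1) != labels).float()
    unc = F.binary_cross_entropy_with_logits(unc_logit.squeeze(1), wrong)
    return ce + lam * unc
```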

Architecture

The project is developed in Python, using PyTorch for model training and evaluation. Jupyter notebooks drive the training, testing, and visualization experiments.

The processing pipeline begins with preparing raw EEG data for training through filtering, epoch extraction, and normalization. Noise injection can be applied at this stage to specific channels or time windows to validate whether the model learns to associate corrupted signals with higher uncertainty.
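A minimal sketch of what such noise injection can look like, assuming additive Gaussian noise on selected channels and time windows (the actual noise types, durations, and intensities varied in the experiments):

```python
# Sketch of synthetic noise injection into chosen channels/time windows.
# Additive Gaussian noise and the amplitude default are illustrative.
import numpy as np

def inject_noise(epochs, channels, t_start, t_stop, amplitude=3.0, rng=None):
    """epochs: (trials, channels, samples); returns a corrupted copy."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = epochs.copy()
    segment = noisy[:, channels, t_start:t_stop]
    noise = amplitude * rng.standard_normal(segment.shape)
    noisy[:, channels, t_start:t_stop] = segment + noise
    return noisy
```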

The CNN is based on a previously validated architecture for ErrP detection but modified to include a second uncertainty-prediction head. Training both heads simultaneously enables the model to detect ErrPs while also learning when its predictions may not be trustworthy.

Finally, SHAP explainability techniques are used to visualize the most influential features behind both the classification and uncertainty predictions. These visualizations help compare high-certainty versus low-certainty cases to better understand model behaviour and identify potential sources of confusion.
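As a sketch of how SHAP can be pointed at the uncertainty output specifically, one can wrap the dual-head model (the DualHeadErrPNet from the earlier sketch) so it exposes a single output. GradientExplainer is one SHAP explainer compatible with PyTorch models; the background and sample tensors below are placeholders.

```python
# Attribute the uncertainty output only, by wrapping the model so SHAP
# sees a single-output network. Tensors here are random placeholders.
import shap
import torch
import torch.nn as nn

class UncertaintyOnly(nn.Module):
    """Expose just the uncertainty head for attribution."""
    def __init__(self, model):
        super().__init__()
        self.model = model
    def forward(self, x):
        _, unc = self.model(x)
        return unc

model = DualHeadErrPNet()                    # trained weights would be loaded here
model.eval()
background = torch.randn(50, 1, 64, 410)     # reference trials (placeholder)
trials = torch.randn(8, 1, 64, 410)          # trials to explain (placeholder)

explainer = shap.GradientExplainer(UncertaintyOnly(model), background)
shap_values = explainer.shap_values(trials)  # per-sample feature attributions
```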

Results & Takeaways

The project showed that adding an uncertainty output provides meaningful insight into model reliability, especially when EEG input quality degrades. Key findings include:

  • Uncertainty as a diagnostic signal: Trials with artificially added noise consistently resulted in higher model uncertainty, making the model more transparent and safer for BCI use.

  • Improved interpretability: SHAP visualizations confirmed that the model focused on expected signal regions when detecting ErrPs. However, SHAP was less effective in precisely locating the origin of injected noise.

  • Robustness potential: Combining uncertainty estimates with XAI provides a path toward more reliable neural interfaces that can detect when predictions should be ignored, down-weighted, or re-evaluated.

Overall, this project strengthened my experience in deep learning for biosignals, working with uncertainty in neural networks, and applying explainable AI to improve model transparency.

See More

The code can be found in this GitHub Repository.

The paper can be found here.
