Objective: Although many studies have attempted to improve the visual-based P300-speller system, its performance remains unsatisfactory. The current system is limited for patients with neurodegenerative diseases, whose muscular control of the eyes may be impaired or may deteriorate over time. Some studies have shown that audiovisual stimuli with spatial and semantic congruence elicit larger event-related potential (ERP) amplitudes than unimodal visual stimuli. Therefore, this study proposed a novel multisensory P300-speller based on audiovisual spatial and semantic congruence.

Methods: We designed a novel audiovisual P300-speller paradigm (AV spelling paradigm) in which the pronunciation and the visual presentation of each character were matched in spatial position and semantics. We analyzed the ERP waveforms elicited in the AV spelling paradigm and in a visual-only spelling paradigm (V spelling paradigm), and compared classification accuracies between the two paradigms.

Results: ERP analysis revealed significant differences in ERP amplitudes between the two paradigms (AV > V) in the following areas: the frontal area at 60–140 ms, the frontal–central–parietal area at 360–460 ms, the frontal area at 700–800 ms, the right temporal area at 380–480 and 700–780 ms, and the left temporal area at 500–780 ms. Offline classification showed that accuracies were significantly higher in the AV spelling paradigm than in the V spelling paradigm at 1, 2, 5, 6, 9, and 10 superpositions (P < 0.05), with trends toward improvement at 3, 4, 7, and 8 superpositions (P = 0.06). Similar differences between the V and AV spelling paradigms were found for the information transfer rate at 1, 2, 5, 6, and 10 superpositions (P < 0.05).
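The superposition counts above refer to averaging repeated stimulus epochs before classification, a standard ERP practice: averaging K noisy trials shrinks the noise while preserving the time-locked response, so accuracy tends to rise with more superpositions. A minimal sketch of this effect on simulated data (not the authors' pipeline; the waveform and noise level here are illustrative assumptions):

```python
import random

def average_trials(trials):
    """Average equal-length trials sample-by-sample (superposition)."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]

random.seed(0)

# Idealized time-locked component standing in for a P300-like deflection.
signal = [0.0] * 50 + [5.0] * 20 + [0.0] * 30

def noisy_trial():
    # Each simulated epoch is the signal plus independent Gaussian noise.
    return [s + random.gauss(0.0, 3.0) for s in signal]

def rms_error(estimate):
    # Root-mean-square distance between the averaged waveform and the signal.
    return (sum((e - s) ** 2 for e, s in zip(estimate, signal)) / len(signal)) ** 0.5

err_1 = rms_error(average_trials([noisy_trial()]))
err_10 = rms_error(average_trials([noisy_trial() for _ in range(10)]))
```

Averaging K independent trials reduces the noise standard deviation by a factor of roughly sqrt(K), so `err_10` comes out well below `err_1`, which mirrors why classification improves at higher superposition counts.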

Significance: The proposed audiovisual P300-speller paradigm significantly improved classification accuracy compared with the visual-based P300-speller paradigm. Our novel paradigm combines spatial and semantic features across two sensory modalities, and the present findings provide valuable insights for the development of multimodal ERP-based brain–computer interface (BCI) paradigms.


Originally published in Frontiers in Neuroscience.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.



Rights Statement


In Copyright.
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).