Individualized pattern recognition for detecting mind wandering from EEG during live lectures
aff001; Anita Acai
aff001; Natalie Wagner
aff001; Dan Bosynak
aff002; Stephen Kelly
aff001; Mohit Bhandari
aff001; Brad Petrisor
aff001; Ranil R. Sonnadara
Authors’ place of work:
Department of Surgery, McMaster University, Hamilton, Ontario, Canada
aff001; Research and High-Performance Computing Support, McMaster University, Hamilton, Ontario, Canada
aff002; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
aff003; Department of Psychology, Neuroscience, & Behaviour, McMaster University, Hamilton, Ontario, Canada
aff004; LIVELab, McMaster University, Hamilton, Ontario, Canada
aff005; Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
aff006; Department of Surgery, University of Toronto, Toronto, Ontario, Canada
Published in the journal:
PLoS ONE 14(9)
Neural correlates of mind wandering
The ability to detect mind wandering as it occurs is an important step towards improving our understanding of this phenomenon and studying its effects on learning and performance. Current detection methods typically rely on observable behaviour in laboratory settings, which do not capture the underlying neural processes and may not translate well into real-world settings. We address both of these issues by recording electroencephalography (EEG) simultaneously from 15 participants during live lectures on research in orthopedic surgery. We performed traditional group-level analysis and found neural correlates of mind wandering during live lectures that are similar to those found in some laboratory studies, including a decrease in occipitoparietal alpha power and frontal, temporal, and occipital beta power. However, individual-level analysis of these same data revealed that patterns of brain activity associated with mind wandering were more broadly distributed and highly individualized than revealed in the group-level analysis.
Mind wandering detection
To apply these findings to mind wandering detection, we used a data-driven method known as common spatial patterns to discover scalp topologies for each individual that reflect their differences in brain activity when mind wandering versus attending to lectures. This approach avoids reliance on known neural correlates primarily established through group-level statistics. Using this method for individual-level machine learning of mind wandering from EEG, we were able to achieve an average detection accuracy of 80–83%.
Modelling mind wandering at the individual level may reveal important details about its neural correlates that are not reflected when using traditional observational and statistical methods. Using machine learning techniques for this purpose can provide new insight into the varieties of neural activity involved in mind wandering, while also enabling real-time detection of mind wandering in naturalistic settings.
Whether at rest or under cognitive load, the human mind is prone to turning inwards [1–4]. This spontaneous internal thinking, which can occupy up to 50% of our daily lives by some accounts [5], is often referred to as mind wandering (MW) [6, 7]. While there is evidence that MW can play a useful role in a variety of settings (e.g., for content integration, creative thinking, and future planning [8, 9]), its occurrence during attention-demanding tasks can have detrimental effects on task-related learning and performance [8, 10–13].
Studies of MW during live or recorded lectures have suggested that more time spent mind wandering is associated with poorer direct educational outcomes, such as reduced retention and comprehension [14–19]. Indirect effects of MW can also occur, such as through poorer note-taking [20]. However, MW remains challenging to fully understand because we lack reliable and objective means of identifying exactly when someone is mind wandering and when they are not [7, 21]. The ability to detect MW as it occurs is, therefore, a crucial step towards improving our understanding of MW and its impacts on learning and performance. Furthermore, MW detection is an important first step towards developing approaches to counteract its negative effects. For example, real-time MW detection based on objective measures can facilitate the study of MW in a more fine-grained way and in more naturalistic settings, such as during lectures and other real-world tasks.
A major challenge in MW research is the various ways in which it has been defined throughout the literature. Many studies have either implicitly or explicitly used a content-based definition of MW, wherein any task-unrelated thought is categorized as MW [22, 23]. However, since this definition includes external stimulus-driven distractions (e.g., background noise), some researchers have opted to use a narrower definition: stimulus-independent thought [2, 9, 24]. More recently it has been argued that MW should be defined more specifically as spontaneous and unconstrained thought, which may better reflect the dynamic and uninhibited flow of thought that we internally experience when our minds wander . Despite these nuances, the definition of MW varies considerably across empirical studies, and what is considered MW may partially depend on the context in which it is studied. It is therefore important to study MW in naturalistic settings not only to better understand it and its effects more fully but also to ensure that MW detection methods apply to the settings in which they are intended to be used.
Thought probes in mind wandering research
A particular challenge in MW detection is that there is currently no known clear, objective, and externally observable indicator of MW. The most common approach to determining which time periods are likely to correspond to MW, and which are not, is to use thought probes (also known as experience sampling) . Thought probes involve interrupting a given task and prompting study participants to self-report their state of attention [26, 27]. Thought probes, however, may have limited utility as a tool for MW research and detection. One reason is that thought probes artificially disrupt the task and MW alike by presenting a stimulus to cue participants to respond to the probe. In a naturalistic setting like a lecture, this method can be problematic because it requires instructor buy-in and time for student responses, which is counter to the goal of the instructor because it disrupts student learning. There is also the potential for accidental cueing of participants’ attention if lecturers are aware of when thought probes are to occur and adopt strategies to minimize disruption, such as timing lecture material in a particular way. Since thought probes are most effective when unanticipated, it is generally preferable if the lecturers are also unaware of when they will occur. In addition, the effectiveness of thought probes depends, in part, on the probe rate as well; if they are used too often, participants do not have enough time between probes to begin mind wandering again . These factors make it challenging to obtain the data required to develop methods that can enable reliable continuous detection of MW, as the unreliability of self-reported MW and the likelihood that the thought probes affect the data are both factors that must be considered.
It is also important to note that some thought probes, depending on how they are presented to and interpreted by participants, may assume a content-based definition of MW . However, as the neural correlates of MW are not yet fully understood, thought probes remain a useful, simple, and low-cost tool for collecting data about MW until alternative and demonstrably superior detection methods are available. Moreover, despite their limitations, thought probes may be one of the most efficient and reliable tools available with which to develop and validate potentially better methods of MW detection.
Neural correlates of mind wandering
In recent years, neurophysiological measures, especially those acquired through functional magnetic resonance imaging (fMRI), have been increasingly used to understand the brain regions and processes that underlie MW. The default mode network (DMN), a brain network involving the posterior cingulate cortex, medial prefrontal cortex, temporoparietal cortex, and various lateral parietal regions, is often discussed in relation to MW. As its name suggests, the DMN has repeatedly been shown to be active during spontaneous thought, or when the mind is at “rest” [4, 7, 26]. More specifically, the DMN has also been shown to be active during self-reported instances of MW  and lapses in attention , and has even been shown to predict events of human error . However, the DMN is also known to be activated during purposeful internal thought, including future planning and episodic memory retrieval [29–31], and is interestingly not strictly anticorrelated with the dorsal attention network . This finding is important because it suggests that MW cannot simply be reduced to activation of the DMN, and activation of the DMN is not a sufficient indicator of MW.
Paradoxically, in addition to the DMN, MW also appears to be associated with activation of the executive control system of the brain, a finding that may help explain the relationship between MW and reduced task performance and learning [7, 26]. The role of the executive control network in MW remains unclear, but different hypotheses have been presented. A straightforward explanation is the control failure hypothesis, which proposes that brain regions that are a part of the executive control system attempt to reorient the brain to the task at hand during MW [33, 34]. However, there is some evidence to suggest that this may not be the case . For example, direct stimulation of the nodes in the executive control network can increase task-unrelated mental activity, while the control failure hypothesis would predict a decrease of task-unrelated mental activity . The decoupling hypothesis, on the other hand, proposes that the executive control network supports MW by suppressing task-related perceptual processing and orienting the brain towards personal goal-oriented thought [24, 36, 37]. However, it has been noted that this could instead reflect internal task-related thought (e.g., task-related creative problem solving), which would only be included in the broadest definitions of MW .
Although studies exploring functional connectivity (i.e., temporal correlations in the activity of multiple brain regions) using fMRI have contributed a wealth of information regarding the neuroanatomy and network activity related to MW, the phenomenon is thought to be a highly dynamic process that involves rapid fluctuation and spontaneity in thought. Understanding and detecting MW may, therefore, require the study of dynamic neural processes that give rise to such thinking on shorter time scales (see  for a review of dynamic functional connectivity with fMRI to better understand MW). EEG is a technology for acquiring neuroelectric activity with substantially greater temporal resolution (timescales on the order of a few hundred milliseconds to a few seconds, versus a few seconds to a few minutes with fMRI). EEG is therefore preferred for the more precise study of temporal dynamics of brain activity, albeit at the expense of the spatial resolution available with fMRI. Although EEG is largely limited to the cortex and large brain structures, it has been used to estimate the propagation of activity through less deep nodes of the neural networks established in fMRI research .
A primary paradigm in EEG research is the use of event-related potentials. Event-related potentials are obtained by averaging periods of recorded brain activity that are time-locked to an event or stimulus and are thus thought to reflect neural responses to that event or stimulus . Analysis of the P300, an event-related potential often used as an index of cognitive processing and attention, shows a decrease in average amplitude during MW, supporting the idea that MW may reduce the amount of cognitive resources available for task-related processing [36, 40].
Researchers also use EEG to investigate the oscillatory patterns of brain activity under a variety of conditions. Measuring such oscillations requires temporally-precise sampling of the local field potentials generated by populations of neurons firing synchronously in the brain, and is thus one of the most common applications of EEG [41, 42]. Moreover, the activity in various frequency bands has been linked to a variety of cognitive functions [43–45] and states [46, 47].
To investigate MW specifically, one study measured neural oscillations with EEG in participants who were instructed to focus on their breath and press a button whenever they noticed that their attention had lapsed . As expected, the authors found changes in oscillatory patterns that are associated with decreased alertness and vigilance: specifically, an increase in delta power (2–3.5 Hz; predominately in frontocentral regions) and theta power (~4–7 Hz; widespread, but most pronounced in occipital and parietocentral regions), along with decreased alpha (~9–11 Hz; focused on occipital regions) and beta power (~13–30 Hz; in frontolateral regions) during MW. Concordantly, a study investigating the relationships between oscillatory processes using a nearly identical study design found that the ratio between theta and beta activity in the frontal cortex, a measure that has been found to be negatively correlated with cognitive load  and attentional control , was higher during MW (although they did not find a relation between this measure and attentional control in their study) .
In a seemingly contradictory finding, another study employing thought probes while participants listened to stories discovered that not only did alpha power increase broadly over the scalp during MW (a finding that is consistent with previous work showing a similar change in alpha power associated with attentional shifts away from auditory language processing ), but that this change was also predictive of comprehension . Together, these findings suggest that the study design and the attentive task used as a control condition may themselves elicit different subtypes (or perhaps definitions) of MW, and/or influence the neural correlates of MW that are subsequently discovered. Given these findings, there is a significant need for further translational research that aims to study MW in naturalistic settings, and for the development of objective MW detection methods that can generalize across experimental paradigms.
Given that the broader research community has not yet reached a consensus on whether MW is simply spontaneous thought, stimulus-independent thought, or more broadly, task-unrelated thought , the neural correlates of MW have not been fully delineated from other, potentially related, mental processes. Consequently, methods of detecting MW that are based upon already known or suspected neural correlates of MW may not fully capture the dynamics and the diversity of brain activity involved in MW, which may be especially underrepresented by current methods that discover neural correlates after aggregating responses to thought probes in group-level analyses. Moreover, which neural correlates contribute to detection may depend on the definition of MW that is most applicable for the setting. For these reasons, there is considerable interest in being able to detect MW on an online basis using objective measures as a means of furthering the study of MW in a way that more fully characterizes its dynamic and variable nature. In addition, online MW detection is relevant to education and performance science research, where the ability to monitor MW and taper the delivery of educational and training material during MW may lead to more optimal learning .
Online detection of mind wandering
In the effort to better understand MW, there has been recent interest in developing methods for online detection. Previous approaches have made use of various physiological measures to detect MW, including eye tracking [53–55], heart rate variability [56, 57], and skin conductance and temperature [56, 58]. A few of these studies employed machine learning techniques to train classifiers that are, in theory, capable of automatic and continuous online detection of MW [21, 54, 55, 57, 58]. In comparison to other physiological measures, the use of neurophysiological signals for MW detection is both new and challenging. However, as MW reflects purely internal mental processes, reliable and accurate online detection of MW may ultimately necessitate the analysis of brain activity .
Online detection of MW with machine learning has mainly been explored with known neural correlates as provided by the cognitive neuroscience literature discussed earlier. One of the early studies demonstrating this approach used fMRI combined with measures of pupil diameter to achieve an average classification accuracy of nearly 80% . While further development of this approach may enable more precise neuroimaging studies, detection of MW using fMRI is unlikely to translate well to real-world applications in naturalistic settings. Using a more portable modality for measuring the hemodynamic response in the brain to address this limitation, another group used functional near-infrared spectroscopy to detect MW . However, this approach yielded a significantly reduced detection accuracy of 56% on average.
More recent work has focused on EEG for purely neurophysiological detection of MW. Combining both the event-related potential and spectral neural correlates of MW that have been discovered by previous EEG work, standard (i.e., group-level) machine learning techniques were able to achieve detection accuracies between 50% and 85%, depending on the participant . Notably, the study referenced here used two different laboratory attention tasks to train a model of MW that was not overly specific to only one task.
Individual variability in mind wandering
The overwhelming majority of previous work has focused on establishing behavioural and neural correlates of MW based on group-level analyses. This paradigm is powerful for establishing measures that are common across instances of MW and across individuals. However, individual-level analyses, especially those employing data-driven machine learning methodologies that do not rely on previously determined neural correlates of the mental process or state under investigation, have revealed surprising degrees of individual variability in the neural correlates of emotional states [61–63], the effects of concussion , and other areas [65, 66]. It remains an open question as to whether individual-level analysis of MW will also reveal a rich set of neural correlates that may only appear for some individuals under certain conditions.
While some have explored whether individual personality and cognitive factors contribute to MW rates [67–69], including as they relate to differences in brain activity using group-level correlations [70, 71], few studies have explored the individual variability in brain activation during MW itself. One study associated the type of self-generated thought, assessed on an individual level, to different neural correlates at the group level, demonstrating that even some degree of individual-level analysis can reveal a greater diversity of brain activation than previously discovered . Here we present work in which, in addition to group-level analyses, we also use novel analytic techniques to explore brain activity on a fully individual level and perform individual-level detection of MW.
In this study, we demonstrate, for what we believe to be the first time, machine learning-based detection of MW from EEG recorded simultaneously across the entire study sample in a naturalistic educational setting: during live lectures. Given the educational setting and the goal of identifying when participants were focused on the lecture itself, we used a general content-based definition of MW and considered a participant to be mind wandering if they were not paying attention to the lecture, as self-reported during thought probes. In addition to using a novel naturalistic setting, we employed a feature learning approach adapted from brain-computer interfacing in which patterns of brain activity associated with MW were learned on an individual basis from the data directly without constraining the models based on known neural correlates.
Materials and methods
Study setting and population
The study took place in the Large Interactive Virtual Environment (LIVE) Lab at McMaster University in Hamilton, Ontario, Canada . This 106-seat research center and performance space allows for the measurement of brain waves using 16-channel EEG in up to 16 audience members simultaneously, serving as a unique environment for research related to the neurophysiology of music, hearing, vision, movement, and learning. Simultaneous data collection, which is a key feature of the LIVELab, ensured that all participants were exposed to the same stimuli in the same manner and accounted for the potential for students’ attention to be influenced by their peers’ behaviours, which often occurs during live lectures .
Upon approval from the Hamilton Integrated Research Ethics Board (HiREB-0629), all orthopedic resident trainees (N = 25) and medical students completing an elective in orthopedic surgery (N = 9) at McMaster University were invited to attend two lectures in the LIVELab. Orthopedic surgery was selected because of the program’s interest in exploring innovative teaching methods in medical education. The first lecture was given by a female, doctoral-level researcher on the topic of intimate partner violence while the second was given by a male, doctoral-level researcher on the topic of meta-analytic methods in orthopedic surgery research. Each was approximately 30 minutes in length with a 15-minute break in between.
We informed invitees about the study via email before the teaching session and again on the morning of the event. The study contained a behavioural component, comprised of thought probes and quizzes, and an EEG component, comprised of EEG recording during both lectures. Interested individuals could consent to participate in both the behavioural and EEG components of the study, or the behavioural component only. Assignment to these groups was determined by participants’ preference at the time of study enrollment, as some participants wished to participate in the study but preferred not to be connected to the EEG equipment. Participants provided both their verbal and written informed consent to participate in the study.
EEG participants were fitted with caps and seated in the LIVELab. Sixteen-channel EEGs were collected simultaneously from each participant using the configuration shown in Fig 1. We interrupted each lecture approximately every four minutes with a bell ring and on-screen prompt asking participants to report their state of attention just before seeing the probe using the following question: “Just prior to seeing this probe, which of the following best describes your cognitive state?” Response options were: A) Paying attention, B) Not paying attention (i.e., mind wandering), or C) Unsure or unaware. The purpose of the probes was to identify points in the EEG data when participants self-reported that they were mind wandering versus not. Non-EEG participants also responded to the probes as a comparison group to ensure that being connected to the EEG equipment did not influence their attention.
In addition to the thought probes, we administered two quizzes after each lecture (shown in S1 and S2 Appendices). The first was administered immediately after each lecture to measure recall while the second was administered two weeks later at a teaching session to measure retention. Both quizzes contained five short answer (either fill-in-the-blank or one-word answer) questions worth one mark each that were supplied by the presenters and matched for difficulty. The questions spanned the entire lecture and were designed to test participants’ knowledge of the material covered, for example: “What is the best-reported IPV study design?” (Quiz 1, Immediate Recall) or “What term is used to describe a network with few trials and/or patients included within it?” (Quiz 2, Retention). In addition to testing content, the immediate recall quizzes also contained questions to gauge participants’ overall engagement, interest in the content of each lecture, and perceptions of presenter engagement.
We summarized behavioural data using descriptive statistics and conducted Shapiro-Wilk tests to verify assumptions of normality. Responses to the MW probes and questions about overall engagement, interest in the content, and presenter engagement were not normally distributed; thus, we opted to use non-parametric statistical tests to analyze these data. Quiz scores, on the other hand, met assumptions of normality and were therefore analyzed using parametric statistics. An exception was the retention scores for Lecture 2, which were skewed due to a floor effect caused by poor retention among participants. Since these data represented only a small portion of our data set and analyses of variance (ANOVAs) are considered relatively robust to deviations from normality , we opted to still use parametric analyses to analyze the quiz scores from both lectures.
We performed Mann-Whitney U tests to test for statistically significant differences in thought probe responses between EEG and non-EEG participants. We used mixed-effects ANOVAs using time of retention/recall quizzes (immediately following the lectures versus two weeks later) as the within-subjects factor and group (EEG versus non-EEG) as the between-subjects factor to test for statistically significant differences in quiz scores. We also computed Spearman’s rank-order correlation coefficients to test for associations between MW and quiz performance and used Cochran’s Q tests to determine if patterns of MW were stable over time.
The above statistical tests were conducted using IBM SPSS v. 25. The Holm-Bonferroni method of correcting for multiple comparisons was used when testing for differences between our EEG participants and behavioural-only participants (five measures).
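As an illustration of the Holm-Bonferroni procedure named above, the following is a minimal Python sketch. The study itself used IBM SPSS, so this is not the authors' code; the function name `holm_bonferroni` and the example p-values are hypothetical and are not study data.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, reject) lists in the original order of p_values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Step-down multiplier: m for the smallest p, m - 1 for the next, ...
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    reject = [p <= alpha for p in adjusted]
    return adjusted, reject


# Hypothetical p-values for five measures (illustrative only).
example_p = [0.010, 0.040, 0.030, 0.005, 0.200]
adjusted_p, rejected = holm_bonferroni(example_p)
```

With five measures, the smallest p-value is multiplied by five, the next by four, and so on, which is less conservative than applying a factor of five to every comparison.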
Preprocessing and data cleaning
We analyzed EEG data using the MNE toolbox in Python . We used extensive denoising procedures because we expected greater contamination from eye movement and motor artifacts in a live lecture setting compared to traditional laboratory EEG recordings. These were performed separately for each participant. However, since some studies show that eye movements and possibly other artifacts can be indicative of MW [53–55, 76], we performed two machine learning analyses: one with artifact rejection, denoted Artifacts Suppressed, and one without artifact removal, denoted Artifacts Present. The Artifacts Suppressed approach was used to determine to what extent we could classify MW from just neuroelectric patterns and thereby promote the discovery of neural correlates of MW in naturalistic settings. In contrast, we elected to perform the Artifacts Present analysis as a comparison because in many real-world applications of MW detection, including education and performance science, the main concern is optimal detection of MW rather than reliance on purely neurological processes. In addition, since many previous MW detection studies used eye tracking rather than neural signals [53–55], the Artifacts Present approach allowed us to determine whether there was a benefit in terms of detection accuracy to skipping the use of ocular artifact suppression algorithms for real-world applications.
We removed an average of 0.53 (range: 0–2) channels per participant before analysis based on extreme variance. Remaining EEG signals were re-referenced to the average of all remaining channels and standardized using the exponential running mean and variance with a smoothness factor of 0.001, a technique for time-series standardization that reduces the influence of local fluctuations, such as high-voltage artifacts . While existing research tends to associate MW with changes in EEG alpha band activity [40, 51, 52], we chose to retain a broader range of frequencies to test whether machine learning analysis would benefit from other frequency bands. Thus, we bandpass filtered the standardized EEG signals to 1–30 Hz using a Type II finite impulse response filter.
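The exponential running standardization described above can be sketched for a single channel as follows. This is a minimal illustration, not the authors' implementation: the update rule shown is one common formulation, and the function name, the initialization at the first sample, and the `eps` floor on the standard deviation are assumptions. Only the smoothness factor of 0.001 comes from the text.

```python
import math


def exponential_running_standardize(signal, factor=0.001, eps=1e-4):
    """Standardize a 1-D signal with exponentially weighted running statistics."""
    if not signal:
        return []
    mean = signal[0]  # assumption: start the running mean at the first sample
    var = 0.0
    out = []
    for x in signal:
        # Exponentially weighted running mean and variance.
        mean = factor * x + (1.0 - factor) * mean
        var = factor * (x - mean) ** 2 + (1.0 - factor) * var
        # eps (assumed) avoids division by a near-zero standard deviation.
        out.append((x - mean) / max(math.sqrt(var), eps))
    return out
```

Because each sample contributes only a small fraction (here 0.1%) to the running statistics, a brief high-voltage artifact shifts the estimated mean and variance far less than it would shift statistics computed over a short fixed window.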
Following this simple signal preprocessing procedure, we performed two-stage artifact rejection for each participant’s EEG data. First, we epoched the entire signal into a series of non-overlapping 1 s epochs. We automatically rejected epochs containing major artifacts using the autoreject Python toolbox . This was done for both machine learning analyses (with and without independent component analysis artifact rejection), because most of the artifacts removed by autoreject were either of too high amplitude for the EEG amplifiers to properly characterize due to momentary loss of connection with the wireless receiver during data streaming, or due to major motion artifacts (e.g., major changes in posture).
Since autoreject does not remove most eye artifacts or smaller motor artifacts, we fit an independent component analysis model using the remaining epochs to separate and suppress those artifacts . The autoreject method was used before independent component analysis because the presence of large artifacts can interfere with the technique’s ability to separate eye movements and smaller motor artifacts from the remaining EEG. Models with 16 components were fit using the extended infomax algorithm, which is well-suited to EEG data containing a variety of noise sources . Since we did not directly record electrooculograms or electromyograms, we used statistical thresholding (skewness: ± 2.50, kurtosis: ± 3.00) to reject artifact-containing components and visually confirmed the selections . We removed an average of 5.0 independent components (range: 3–8). We then re-mixed the signals into channel space with the artifactual components removed for further analysis.
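The statistical thresholding step can be illustrated with a short Python sketch. The text does not specify whether the thresholds were applied to raw or standardized statistics, so this sketch assumes raw sample skewness and excess kurtosis of each component's time course; the function names and that interpretation are assumptions, and in the study the selections were also confirmed visually.

```python
def _central_moment(x, order):
    """Population central moment of the given order."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)


def skewness(x):
    m2 = _central_moment(x, 2)
    return 0.0 if m2 == 0 else _central_moment(x, 3) / m2 ** 1.5


def excess_kurtosis(x):
    m2 = _central_moment(x, 2)
    return 0.0 if m2 == 0 else _central_moment(x, 4) / m2 ** 2 - 3.0


def flag_components(sources, skew_thresh=2.5, kurt_thresh=3.0):
    """Return indices of component time courses exceeding either threshold."""
    flagged = []
    for i, s in enumerate(sources):
        if abs(skewness(s)) > skew_thresh or abs(excess_kurtosis(s)) > kurt_thresh:
            flagged.append(i)
    return flagged
```

A component dominated by sparse, large deflections (as blinks often are) has a heavy-tailed, asymmetric amplitude distribution and is flagged, whereas a component with a roughly symmetric oscillatory time course is not.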
We re-epoched the denoised EEG signals for machine learning analysis by extracting the 10 s periods before each MW probe onset. Each of these 10 s periods was then further epoched into five 2 s time windows and labelled according to the participant’s response to the MW probe. Epochs where the participant reported being unsure whether they were MW (approximately 4% of responses) were ignored. From an initial total of 65 epochs per participant, we retained an average of 49 (range: 35–60) epochs per participant, with an average of 17 (range: 5–25) MW epochs and 32 (range: 15–45) non-MW epochs. The exact distribution of epochs for each participant is given in Table 1 to make it easier to compare the class distribution for each participant to the subject-specific classification performance measures presented in the results.
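The re-epoching scheme can be sketched as below. The sampling rate is passed in as a parameter because it is not stated in this section. The response codes ("mw", "on_task", "unsure") and the function name are hypothetical. Each window keeps the index of its probe so that windows from the same probe can later be kept together during cross-validation.

```python
def pre_probe_windows(probe_onsets_s, responses, sfreq, pre_s=10.0, win_s=2.0):
    """Split the pre_s seconds before each probe into consecutive win_s windows.

    Returns a list of (start_sample, stop_sample, label, probe_index) tuples,
    with label 1 for mind wandering and 0 otherwise. Probes answered "unsure"
    are skipped entirely.
    """
    n_windows = int(round(pre_s / win_s))
    win_len = int(round(win_s * sfreq))
    windows = []
    for probe_idx, (onset, resp) in enumerate(zip(probe_onsets_s, responses)):
        if resp == "unsure":
            continue
        label = 1 if resp == "mw" else 0
        first = int(round((onset - pre_s) * sfreq))
        for k in range(n_windows):
            start = first + k * win_len
            windows.append((start, start + win_len, label, probe_idx))
    return windows
```

With 13 usable probes and five windows per probe, this yields the 65 candidate epochs per participant mentioned above, before any rejection.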
Since this is a novel study design in which MW was probed in a naturalistic setting and EEG was collected for the entire study group simultaneously, we performed a more traditional statistical analysis to determine whether the same neural correlates of MW reported in previous studies were also found in our study. We computed the band powers per epoch within canonical frequency bands typically used in EEG analysis (theta: 4–7 Hz, alpha: 8–12 Hz, beta1: 13–18 Hz, beta2: 19–30 Hz). To determine whether specific combinations of channels and frequency bands showed differences in band power during MW versus not MW, we fit a two-factor repeated-measures ANOVA over band powers for each channel aggregated across participants (16 total models). This included eight response variables since there are four frequency bands (theta, alpha, beta1, and beta2) over two conditions (MW versus not MW probe responses). We used probe response and frequency band as within-subject factors.
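As an illustration of the band power computation, the following sketch sums squared discrete Fourier transform magnitudes within each of the four bands for one epoch of one channel. The estimator used in the study is not specified in this section, so the plain DFT, the normalization, and the function name are assumptions. Only the band edges are taken from the text.

```python
import cmath
import math

# Band edges in Hz, as defined in the text.
BANDS = {"theta": (4, 7), "alpha": (8, 12), "beta1": (13, 18), "beta2": (19, 30)}


def band_powers(epoch, sfreq, bands=BANDS):
    """Return the summed squared DFT magnitude per band for a single-channel epoch."""
    n = len(epoch)
    mean = sum(epoch) / n
    x = [v - mean for v in epoch]  # remove the DC offset
    powers = {name: 0.0 for name in bands}
    for k in range(1, n // 2 + 1):
        freq = k * sfreq / n
        hits = [name for name, (lo, hi) in bands.items() if lo <= freq <= hi]
        if not hits:
            continue  # skip bins outside every band
        coef = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for name in hits:
            powers[name] += abs(coef) ** 2 / n ** 2
    return powers
```

The per-epoch values for each channel and band would then serve as the response variables in the repeated-measures models.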
We then performed a similar analysis within each individual to determine whether MW was associated with changes in certain frequency bands. For this analysis, we fit a separate two-factor repeated-measures ANOVA over band powers for each participant aggregated across channels using probe response and frequency band as within-subject factors (15 models). We used Holm-Bonferroni corrections to correct the p-values across the multiple models (16 models for the channel-wise ANOVAs, and 15 models for the individual-specific ANOVAs). We report effect sizes in terms of ηp2, where:
$$\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}$$
Here $SS_{\text{effect}}$ is the sum of squares attributed to the effect and $SS_{\text{error}}$ is the error sum of squares associated with that effect.
Machine learning analysis
We performed intra-subject and inter-subject classification of MW using common spatial patterns to learn discriminative spatial filters [82, 83], and a non-linear support vector machine to fit a classification model. We computed six common spatial patterns (three for MW and three for not MW) and computed the log of the normalized power for each to use for classification. We used six common spatial pattern components to keep the dimensionality of the feature space low, reducing the potential for overfitting. We developed this approach as an adaptation of previous work on brain-computer interfacing [85, 86]. This approach has been developed for similar small-sample machine learning problems with EEG, where overfitting is avoided by learning a small number of features with a linear algorithm (although a greater number of samples may still lead to improved performance).
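The feature-learning step can be sketched with NumPy as follows. This is a textbook formulation of common spatial patterns (whitening the summed class covariances, then eigendecomposing one whitened class covariance), not necessarily the implementation used in the study. The function names and the trace normalization of each epoch covariance are assumptions. The support vector machine is omitted; the returned features would be passed to a classifier.

```python
import numpy as np


def fit_csp(epochs_a, epochs_b, n_pairs=3):
    """Learn 2 * n_pairs spatial filters from two classes of epochs.

    epochs_a, epochs_b: arrays of shape (n_epochs, n_channels, n_samples).
    Returns an array of shape (2 * n_pairs, n_channels). The first n_pairs rows
    favour variance in class B, the last n_pairs rows favour class A.
    """
    def mean_cov(epochs):
        covs = [e @ e.T / np.trace(e @ e.T) for e in epochs]
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_cov(epochs_a), mean_cov(epochs_b)
    # Whiten the composite covariance so that cov_a + cov_b maps to the identity.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whiten = np.diag(evals ** -0.5) @ evecs.T
    # Eigenvectors of the whitened class-A covariance (eigenvalues in ascending order).
    d, v = np.linalg.eigh(whiten @ cov_a @ whiten.T)
    filters = v.T @ whiten
    picks = list(range(n_pairs)) + list(range(len(d) - n_pairs, len(d)))
    return filters[picks]


def log_power_features(epochs, filters):
    """Log of the normalized power of each spatially filtered signal, per epoch."""
    feats = []
    for e in epochs:
        var = np.var(filters @ e, axis=1)
        feats.append(np.log(var / var.sum()))
    return np.array(feats)
```

With `n_pairs=3` this gives six features per epoch, matching the six components described above.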
Features were learned with common spatial patterns for each of the frequency bands described earlier, and machine learning analysis was performed on each band independently. This was done so that we could determine whether the association between MW and band power changes found in prior research would remain important for single-subject detection of MW, given that the associations were discovered through the statistical analysis of EEG data averaged across participants.
Cross-validation was used to obtain statistically sound measures of classification accuracy. For each cross-validation iteration, common spatial pattern and support vector machine models were trained on a portion of the data and classification accuracy was obtained from the withheld portion of data. For intra-subject classification, we used five-fold cross-validation to partition epochs into training and test sets five different times while ensuring that epochs corresponding to the same MW probe always remained in the same set. For inter-subject classification, we used leave-one-subject-out cross-validation, meaning that each participant served as a test set once for a model trained on data aggregated from all other participants.
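The two partitioning schemes can be sketched as index generators. The round-robin assignment of probes to folds is an assumption made for illustration, since the text only states that epochs from the same probe stay together. The function names are hypothetical.

```python
def grouped_k_fold(probe_ids, n_folds=5):
    """Yield (train_idx, test_idx) pairs in which all epochs of a probe share one fold.

    probe_ids: one probe identifier per epoch.
    Probes are assigned to folds round-robin (an assumed, simple scheme).
    """
    unique = sorted(set(probe_ids))
    fold_of = {probe: i % n_folds for i, probe in enumerate(unique)}
    for fold in range(n_folds):
        test = [i for i, p in enumerate(probe_ids) if fold_of[p] == fold]
        train = [i for i, p in enumerate(probe_ids) if fold_of[p] != fold]
        yield train, test


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), holding out each subject once."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test
```

Keeping the five 2 s windows of a probe in the same fold matters because they share a label and are adjacent in time, so splitting them across training and test sets could inflate accuracy.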
Because MW occurred less frequently than not MW, we report the precision, recall, and F1 score of the MW class alongside classification accuracy. The F1 score is a performance metric for binary classifiers that can be more informative for imbalanced class distributions because it considers both precision and recall as follows:
$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
Moreover, we obtained the chance-level F1 score using random permutations of the ground truth labels. We obtained a distribution of scores by randomly shuffling the ground truth labels and training our machine learning pipeline to classify the shuffled labels. This was done for 500 iterations per participant, each with a random shuffling of the ground truth labels. We then used those scores to estimate the 95% confidence interval for a chance-level F1 score. Finally, we performed independent-samples t-tests comparing these randomized permutation results with the cross-validation results that were obtained with the correct ground truth labels. Chance-level F1 was computed after epoch rejection to compare classification accuracy to chance as accurately as possible.
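The F1 score and the permutation-based chance level can be sketched as follows. The `evaluate` callable stands in for the full train-and-test pipeline and is hypothetical. Estimating the interval from the 2.5th and 97.5th percentiles of the shuffled-label scores is an assumption, since the text does not say how the interval was derived.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """F1 score of the positive class: the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def chance_f1_interval(y_true, evaluate, n_iter=500, seed=0):
    """Estimate a 95% interval for chance-level F1 from shuffled labels.

    evaluate(labels) must return the F1 obtained when the pipeline is trained
    and tested on the given (shuffled) labels; it is a hypothetical callable.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_iter):
        shuffled = list(y_true)
        rng.shuffle(shuffled)
        scores.append(evaluate(shuffled))
    scores.sort()
    lower = scores[int(0.025 * (n_iter - 1))]
    upper = scores[int(0.975 * (n_iter - 1))]
    return lower, upper
```

Shuffling the labels after epoch rejection keeps the class proportions identical to those of the real data, so the resulting chance level reflects each participant's own class imbalance.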
A total of 23 individuals participated in the study. Fifteen participated in both the EEG and behavioural components and eight participated in the behavioural-only component. Further demographic information about the study participants is provided in Table 2.
We administered 15 thought probes across the two lectures (eight during the first and seven during the second), to which all 23 participants responded. A technical error occurred while administering the first and second probes during Lecture 1; thus, data from these probes were excluded from both the behavioural and machine learning analyses.
On average, participants reported MW during 32% of the probes in the first lecture and 38% of the probes in the second lecture, resulting in an average of 35% across both lectures. Participants were unsure about whether or not they had been MW approximately 4% of the time across both lectures. We did not find any significant differences in self-reported MW between EEG and non-EEG participants during either lecture (see Table 3 for detailed statistical results). We also did not find any significant differences in the proportion of MW at each time point for Lecture 1, χ2(5) = 6.08, p = 0.30; however, during Lecture 2, participants reported significantly more MW during the fourth (53% of participants) and sixth probes (56% of participants) than during the other probes (on average 16% of participants), χ2(6) = 38.67, p < 0.001.
Perceptions of interest and engagement
Using a five-point Likert scale where 1 = Very uninteresting and 5 = Very interesting, participants rated the content of both lectures as interesting (Lecture 1: M = 3.65, SD = 0.88; Lecture 2: M = 3.74, SD = 1.21). Using a similar scale, they also rated both presenters as engaging (Lecture 1: M = 3.78, SD = 0.85; Lecture 2: M = 3.70, SD = 0.88). We did not find any significant differences between EEG and non-EEG participants on either measure (see Table 3).
Immediate recall and retention
Completion rates of the immediate recall quiz and the retention quiz administered two weeks later were 100% and 87%, respectively. One question was removed from the immediate recall quiz corresponding to the first lecture due to poor wording.
Table 3 shows descriptive and test statistics for performance on the immediate recall and retention quizzes. For both lectures, mixed-effects ANOVAs showed significantly higher scores on the immediate recall quiz than the retention quiz two weeks later, Lecture 1: F(1,18) = 13.24, p < 0.01; Lecture 2: F(1,18) = 47.35, p < 0.001. There was also an effect of lecture on retention, such that participants’ retention scores were significantly higher for Lecture 1 than for Lecture 2, F(1,38) = 12.49, p = 0.001.
We did not find any significant differences in quiz scores between EEG and non-EEG participants for either lecture (see Table 3). Moreover, the number of times MW was reported did not significantly correlate with immediate recall, Lecture 1: rS(23) = 0.12, p = 0.58; Lecture 2: rS(23) = -0.02, p = 0.94, or with retention quiz scores, Lecture 1: rS(20) = -0.22, p = 0.35; Lecture 2: rS(20) = -0.18, p = 0.45 (see Fig 2). However, we caution readers that with sample sizes of 20 and 23, these correlational analyses are considerably underpowered, with post-hoc power calculations using G*Power v. 3.1 suggesting values of 0.05–0.15.
Power spectrum statistics
The results of the channel-wise repeated-measures ANOVA models are given in Table 4. We found small to moderate effects of MW in most channels, which we would not expect to see if the neural correlates of MW were constrained to specific networks in the brain and generalizable across participants. The largest effect sizes found were in F7 (ηp2 = 0.19, uncorrected p = 0.11), F8 (ηp2 = 0.22, uncorrected p = 0.08), P8 (ηp2 = 0.26, uncorrected p < 0.05), and O1 (ηp2 = 0.20, uncorrected p = 0.08). When examining the interaction between MW and frequency band, we found only moderate effects in F8 (ηp2 = 0.17, uncorrected p = 0.06), T8 (ηp2 = 0.29, uncorrected p < 0.01), O1 (ηp2 = 0.19, uncorrected p = 0.03), and Oz (ηp2 = 0.16, uncorrected p = 0.06), possibly suggesting that the frequency bands associated with MW were not consistent across individuals. However, none of the group-level statistics were significant at a 0.05 level after correcting for multiple comparisons. Post-hoc analyses revealed a similar pattern of findings (see Fig 3), but more clearly showed that the effect of MW remaining after aggregating data from the entire group of participants may depend on the frequency band as well.
The results of the individual-specific repeated-measures ANOVA models are given in Table 5. Here we found strong effects of MW and the interaction of MW and frequency band for all participants except for P7. For P14 and P15, we only found an effect of the interaction between MW and frequency band, suggesting that the effect of MW may be more isolated to a specific frequency band in those participants. Post-hoc analyses (see Fig 4) showed that the effect of MW appeared to be spread across frequency bands for most individuals, except for those individuals previously identified. For P7, we found no effect in any frequency band, and for P14 and P15, the strongest effect appeared in the beta frequencies. For some participants, MW was associated with a reduction in band power, while for others, the opposite was true.
Common spatial pattern and classification performance
Classification performance in terms of the F1 score is shown in Fig 5 for MW detection for the Artifacts Suppressed approach and in Fig 6 for the Artifacts Present approach. The average intra-subject classification performance was well above chance for both approaches (Artifacts Suppressed: M = 0.83, SD = 0.12; Artifacts Present: M = 0.85, SD = 0.07) when considering the best frequency band per individual. The classification performance for the best frequency band per individual is shown in Table 6 for the Artifacts Suppressed approach and Table 7 for the Artifacts Present approach. Inter-subject classification did not yield classification performance above chance levels (Artifacts Suppressed best frequency band: M = 0.56, SD = 0.12; Artifacts Present: M = 0.56, SD = 0.11). In other words, we were able to predict MW in individuals, but the neural patterns of MW differed across participants, as shown in Fig 7.
It is noteworthy that for some participants the Artifacts Present approach yielded the best classification performance, suggesting a potential role of ocular and some motor artifacts in MW detection. On average, the Artifacts Suppressed approach produced a slightly lower classification accuracy than the Artifacts Present approach, but this difference was not significant using a two-sided, paired-samples t-test, t(14) = -0.30, p = 0.77.
We computed Spearman’s ρ (rho) between the number of epochs available for training and the F1 score for each participant to check whether the variation in classification accuracy could partially be explained by the amount of available training data. For the Artifacts Suppressed classification accuracies, we found that ρ = 0.28, p = 0.32, indicating no significant relationship. For the Artifacts Present approach, we found that ρ = 0.51, p = 0.05, suggesting that there may be a relationship, and that additional training data may improve accuracy.
In addition to measuring the classification accuracy of our models, we compared their predicted MW rates per participant with each participant’s actual MW rate (see Fig 8). For both the Artifacts Suppressed and Artifacts Present approaches, observed and predicted rates were highly correlated (Artifacts Suppressed: r = 0.78, p = 0.0007; Artifacts Present: r = 0.81, p = 0.0002). Additionally, the predicted MW rates obtained with both approaches were very highly correlated with one another (r = 0.91, p < 0.0001).
In this study, we used machine learning methods designed for data-driven feature learning with EEG to detect MW during live lectures at both the individual and group levels. While our data revealed some similarities in the neural correlates of MW that have been found in prior studies, the neural correlates found in our study were much more highly individualized than previously reported. Our work suggests that understanding MW and developing applications based on MW detection may considerably benefit from methods that are capable of single-subject analysis of brain activity. This work suggests that further research on the individual variability of MW using signal processing methods that are more suitable for interpreting which networks are activated during MW is needed.
Consistent with the majority of the literature in the field, we found that MW occurred on average 35% of the time during the lectures based on participant responses to thought probes. However, individual rates of MW varied considerably, as can be seen in Fig 7, likely reflecting a combination of individual propensity to MW, interest in the lecture material, and circumstantial factors (e.g., how well each participant slept the night before the study). The considerable variability in MW rates further illustrates the need for individual-level analysis of MW as a means of understanding the phenomenon in more detail. The individual predicted rates of MW obtained by our machine learning approach varied considerably as well, but were not significantly different from the observed rates. Fig 8 shows that the predicted rates of MW were highly correlated with the observed rates, providing further evidence that the models were able to capture something useful about MW on an individual basis.
Similar MW rates were found using thought probes in a classroom-based study, but in that case, MW was also found to be predictive of lower academic performance. Contrary to what might be expected, we did not find a significant correlation between the prevalence of MW and recall or retention. A likely explanation is that the quiz questions used in our study were sent to us by the lecturers in advance, and were therefore not directly linked to the content that was presented immediately before the thought probes. We did not connect the quiz items to the thought probes because, to lower the probability of accidental cueing, we did not want the lecturers to be any more aware than the students of when the probes would appear. This means that we had no way of truly knowing if participants were mind wandering when testable content was covered. Trainee motivation to learn and retain the content may also have been a factor, as anecdotal feedback suggested that not all participants found the lecture content relevant to their day-to-day clinical experiences.
Our findings also indicate some differences in participants’ behaviours during the two lectures. While the level of MW was roughly consistent across the various time points sampled in Lecture 1, there were two time points during Lecture 2 at which participants were significantly more likely to report MW. Lecture 2 also had overall poorer immediate recall and retention than Lecture 1. While no specific changes in the lecturer’s behaviour were documented during the two time points previously mentioned, the observed effects may have at least in part been due to differences in participants’ interest in the lecture content (there was more variability in participants’ reactions to Lecture 2). They may also have been related to the order in which the lectures were presented. Since Lecture 2 took place after participants had been in the LIVELab for nearly two hours, participants may have started to become fatigued, and thus may have been less motivated to pay attention to or retain content.
As stated earlier, the retention score data for Lecture 2 did not meet the assumptions of normality. However, since the significant differences reported were relatively large, they are unlikely to be false positives. Additionally, the correlational analyses reported in our study were underpowered due to a small sample size and we were unable to link the quiz questions in our study directly to the probes. However, since participants tended to MW at different time points throughout the lectures, linking the quiz questions and probes may ultimately have been of limited benefit as there would likely not have been enough data points to determine if those who were paying attention at a particular time point performed better on the corresponding quiz questions than those who were not. Linking the two may also have inadvertently cued participants to better remember the content covered close to a probe.
Mind wandering detection
We were able to classify MW with significantly above-chance accuracy for most participants using only neuroelectric signals (the Artifacts Suppressed approach; see Fig 5). Motivated by numerous studies showing that eye tracking was useful for MW detection [53–55], we were also interested in classification approaches that prioritized optimal MW detection over other considerations. For this reason, we also experimented with the Artifacts Present approach. Using this approach, we obtained significantly better than chance classification accuracy for all participants (see Fig 6). In particular, participants for whom MW was difficult to classify in the Artifacts Suppressed approach yielded results more in line with the other participants in the Artifacts Present approach. However, since classification accuracy was higher in the Artifacts Suppressed approach for some participants, the average classification accuracies were not significantly different between the two approaches.
Since our machine learning approach was based upon data-driven feature learning with common spatial patterns, a method adapted from brain-computer interfacing, we note a potential limitation in its application to MW detection. While the algorithm may learn spatial patterns that reflect different modes of MW (or perhaps mind wandering about different things) that may generalize well for an individual during a single session, the fact that it is a supervised algorithm means that its generalizability is limited to patterns already represented in the training data. Due to both the non-stationarity of brain activity and the day-to-day changes in mood, concerns, and preoccupations of an individual, it is possible that common spatial patterns learned on one day or in one context may not generalize to MW on other days or in other contexts. Indeed, this limitation of common spatial patterns is also present and actively being addressed in the brain-computer interfacing literature [89–91], and can be considered a limitation of supervised machine learning for non-stationary data more broadly.
Contrary to other studies identifying EEG correlates of MW using more traditional statistical analyses [40, 50], we did not find a consistent pattern of frequency band activation or spatial topographies that could be identified as a signature of MW across participants. While a recent study using machine learning to detect MW from EEG also showed that a variety of features of the EEG needed to be used together to reliably detect MW, our findings differed in that patterns of MW were highly individualized. In fact, our inter-subject model did not achieve above chance-level accuracy. The differences between our findings and those of previous studies can be explained by several factors, which we discuss next.
As discussed in the introduction, some of the discrepancies found in the neural correlates across EEG studies of MW may emerge from differences in experimental settings and the attentive task used as a control condition. As can be seen in Fig 3, our data were more in line with the EEG-derived neural correlates of MW discovered in more controlled laboratory-based experimental designs in which MW was contrasted against breath focus. While we did not find an effect in the lower frequency range (i.e., theta), we did find a decrease in alpha power (which we defined as the more commonly used 8–12 Hz band, rather than 9–11 Hz) over occipital and parietocentral regions, although we note that these findings were only statistically significant before correcting for multiple comparisons. In addition, we found a similar decrease in beta power over frontolateral regions, with the addition of a decrease over the left occipital cortex. It is important to note that, unlike previous studies, we split our beta band into a lower and a higher beta band and only found these significant effects in the higher frequency beta band. The main difference in our findings was a lack of effect in the theta band, which may be because our study took place in a more naturalistic setting with both auditory and visual stimuli. This may have resulted in a greater degree of ocular artifact contamination, which would have disproportionately affected lower frequency bands. In a study where the active task was listening to stories, a similar decrease in alpha power in occipital and parietal areas was found when participants were aware that they were thinking about something other than the task, and were still partially attending to the task. In contrast, that study showed a widespread increase in alpha activity when participants were not aware that they were no longer paying attention to the task. If we were to interpret our results in the same way, it is possible that, on average, the participants in our study were often aware that they were thinking about something other than the lectures, and thus were intentionally mind wandering.
Individual-level neural correlates of mind wandering
A seemingly unique finding in our study was the highly individualized nature of the neural correlates of MW that were found via data-driven feature learning (common spatial patterns), in combination with the inability of our machine learning approach to identify patterns that could generalize across participants. This can be seen in Fig 4, where MW was associated with changes in band powers in different directions for different participants, and in Fig 7, where the spatial patterns most predictive of MW showed almost no similarity across individuals (although for participants P3, P8, and P12, for whom the alpha band is shown, there were similarities in the common spatial patterns associated with not MW).
The variety of neural correlates found across individuals is too broad to identify any consistent patterns that can be associated with the neural networks identified in previous studies using EEG. Moreover, with 16 EEG channels, we lack the spatial resolution needed for accurate source localization [94, 95]. Furthermore, common spatial patterns only reflect scalp topologies that maximize the ratio of amplitude variance between two cognitive states, and may, therefore, omit brain regions that are activated in both states. As such, we would not necessarily expect a one-to-one correspondence between the common spatial patterns learned from the MW data and the network of brain regions that are associated with MW. With this limitation in mind, we discuss a possible interpretation of the common spatial patterns seen in Fig 7 to motivate further research exploring what machine learning methods may reveal about the neural correlates of MW, and other cognitive processes, based on individual-level analysis.
A study exploring the EEG scalp topologies (as opposed to common spatial patterns) in different frequency bands that appear during the activation of different resting-state networks in fMRI may allow us to infer something about what the common spatial patterns reveal about the neural correlates of MW across individuals. Visually, the MW patterns for P3 are similar to the alpha EEG activity associated with visual networks, whereas the patterns associated with MW in P8, P12 and P15 are more reminiscent of the EEG activity associated with co-activation of the DMN, the frontoparietal control network, and the frontal attention network. All of these networks have been associated with MW through prior neuroimaging research. For P3, P8, and P12, the most predictive common spatial patterns and the most similar network-related scalp topologies were found in the alpha band, activity in which has been specifically linked to resting state functional brain activity [96–100] and a decrease in bottom-up sensory processing [97, 101]. For P15, while the common spatial patterns appear distinct because they show changes in the beta2 frequency band instead of alpha, the most similar beta band scalp topology shown in that study is associated with the same networks as P8 and P12. This suggests that for P15, a reduction in beta activity, which is associated with a decrease in active thinking and concentration, was more predictive than an increase in alpha activity, even though they may reflect a change in the same resting-state networks and suggest a very similar change in cognitive state.
Comparing the common spatial patterns discovered through individual-level feature learning to EEG scalp topologies associated with specific brain networks may help explain why group-level analyses of functional connectivity in fMRI data consistently point to the same networks, but different EEG studies at times appear to show contradictory neural correlates (see our review of the neural correlates of MW in the introduction). This interpretation would suggest that while MW may generally be associated with changing activity patterns in the same set of resting-state networks, the oscillatory dynamics of those networks may change in different ways for different individuals, and different subnetworks may be activated at different times. This leads to a very interesting hypothesis about the dynamics and varieties of MW that warrants further exploration. However, we caution readers against over-interpreting these findings by assuming that common spatial patterns necessarily reveal the same brain networks discovered in previous studies. While similar inferences about the network activations for each participant can be drawn by comparing the learned common spatial patterns to scalp topologies identified in simultaneous EEG-fMRI studies, doing so in a rigorous way requires an entirely different kind of experimental approach that is outside of the scope of this paper. Ideally, the association between individual-level common spatial patterns and network activations would be tested by computing common spatial patterns from EEG data that were simultaneously acquired with fMRI.
We note two reasons for finding patterns of brain activity associated with MW that are more individualized than previously reported. The first is that the common spatial patterns algorithm may be particularly well-suited to machine learning analysis on an individual basis, as its roots are found in within-subject brain-computer interfacing research [85, 86]. Common spatial patterns may, therefore, identify patterns that are highly tuned to individuals and can be especially powerful in identifying patterns that would likely be lost upon averaging across individuals, or in other group-level analyses. The tradeoff is that this method may not be well-suited to generalization across individuals, and as noted earlier, recent research in brain-computer interfacing has focused on developing new methods specifically designed to overcome this limitation [103, 104].
Second, our broad content-based definition of MW may have led to a large degree of heterogeneity in the neural correlates of MW across participants, particularly if there was variation in how the thought probes were interpreted. This may have further contributed to poor inter-subject generalizability in our machine learning models. We explained earlier that we chose to use a broad definition of MW so that we could discover neural correlates of MW that were more likely to translate well into real-world applications related to our naturalistic setting (i.e., MW detection and/or attentional monitoring of trainees during live lectures). High variation has been found in previous work, and is thought to reflect differences in the content of thought during MW and while paying attention [93, 105], which can, in turn, lead to the activation of different neural networks in the brain. We add support to this hypothesis using traditional statistical analyses. As can be seen in Figs 3 and 4, almost every participant showed differences between MW and non-MW EEG epochs in multiple frequency bands. However, after combining the epochs across participants and comparing MW to non-MW EEG at each EEG channel, the differences almost entirely disappeared, suggesting that these differences did not generalize across participants. By showing that machine learning and data-driven feature learning can be used to detect MW on an individual basis, we can contribute further evidence that MW could be considered a highly variable phenomenon that can be expressed broadly across the neocortex and across a wide range of frequencies.
The application of machine learning for predictive modelling in other areas of neuroscience has revealed the possibility of neural correlates of various mental processes that were not previously identified in group-level statistical analyses [62, 64–69, 85, 86]. It is possible that individual-level analyses performed on fMRI data may also reveal that MW involves a greater variety of neural correlates and brain networks than could be identified through group-level analyses. Such individual analyses of MW may contribute substantially towards resolving the question of whether these different patterns of neural activation correspond to different types of MW (i.e., intentional versus unintentional [93, 107]) or to different definitions of MW (i.e., content-based versus stimulus-independent versus spontaneous thought), and furthermore, whether it is possible to differentially detect types of neural activity unrelated to the task at hand. Future work could use simultaneous EEG-fMRI with individual-level modelling to gain a more precise understanding of the individual variability in network activation involved in MW, including how those networks interact. In addition, such data could be used to clarify how the machine-learning-derived common spatial patterns that enable MW detection relate to specific brain networks, by establishing the correlation between the appearance of those patterns and the activation of those networks. This may help resolve whether the executive control failure hypothesis or the decoupling hypothesis is closer to reality, or if neither is a sufficient description of why the executive control network can be co-active with the DMN during MW.
We were able to accurately detect MW from EEG at the individual level using data-driven feature learning and machine learning. These methods allowed us to show that the neural correlates of MW might be more variable than suggested by traditional statistical methods. With further study, these findings may lead to the development of new methods for online MW detection that facilitate the deeper study of the phenomenon, particularly at the individual level, while also enabling real-time MW detection in real-world settings. This work points to the possibility that MW might be associated with multiple patterns of activity in previously identified resting state brain networks, which are best revealed by analysis of brain activity at the individual level.
1. Smallwood J, Mrazek MD, Schooler JW. Medicine for the wandering mind: mind wandering in medical practice. Med Educ. 2011;45(11):1072–80. doi: 10.1111/j.1365-2923.2011.04074.x 21988623
2. Stawarczyk D, Majerus S, Maj M, Van der Linden M, D’Argembeau A. Mind-wandering: phenomenology and function as assessed with a novel experience sampling method. Acta Psychol (Amst). 2011;136(3):370–81. doi: 10.1016/j.actpsy.2011.01.002 21349473
4. Mason MF, Norton MI, Van Horn JD, Wegner DM, Grafton ST, Macrae CN. Wandering minds: the default network and stimulus-independent thought. Science. 2007;315(5810):393–5. doi: 10.1126/science.1131295 17234951
5. Killingsworth MA, Gilbert DT. A wandering mind is an unhappy mind. Science. 2010;330(6006):932. doi: 10.1126/science.1192439 21071660
6. Kucyi A. Just a thought: how mind-wandering is represented in dynamic brain connectivity. NeuroImage. 2018;180(Part B):505–14. doi: 10.1016/j.neuroimage.2017.07.001 28684334
7. Christoff K, Irving ZC, Fox KCR, Spreng N, Andrews-Hanna JR. Mind-wandering as spontaneous thought: a dynamic framework. Nat Rev Neurosci. 2016;17(11):718–31. doi: 10.1038/nrn.2016.113 27654862
8. Mooneyham BW, Schooler JW. The costs and benefits of mind-wandering: a review. Can J Exp Psychol. 2013;67(1):11–8. doi: 10.1037/a0031569 23458547
9. Smallwood J, Schooler JW. The science of mind wandering: empirically navigating the stream of consciousness. Annu Rev Psychol. 2015;66:487–518. doi: 10.1146/annurev-psych-010814-015331 25293689
10. Smallwood J, O’Connor RC, Sudberry MV, Haskell C, Ballantyne C. The consequences of encoding information on the maintenance of internally generated images and thoughts: the role of meaning complexes. Conscious Cogn. 2004;13(4):789–820. doi: 10.1016/j.concog.2004.07.004 15522632
11. Smallwood J, McSpadden M, Schooler JW. The lights are on but no one’s home: meta-awareness and the decoupling of attention when the mind wanders. Psychon Bull Rev. 2007;14(3):527–33. doi: 10.3758/BF03194102 17874601
12. Smallwood J, McSpadden M, Luus B, Schooler JW. Segmenting the stream of consciousness: the psychological correlates of temporal structures in the time series data of a continuous performance task. Brain Cogn. 2008;66(1):50–6. doi: 10.1016/j.bandc.2007.05.004 17614178
13. Smallwood J, McSpadden M, Schooler JW. When attention matters: the curious incident of the wandering mind. Mem Cognit. 2008;36(6):1144–50. doi: 10.3758/MC.36.6.1144 18927032
14. Risko EF, Anderson N, Sarwal A, Engelhardt M, Kingstone A. Everyday attention: variation in mind wandering and memory in a lecture. Appl Cogn Psychol. 2012;26(2):234–42. doi: 10.1002/acp.1814
15. Szpunar KK, Jing HG, Schacter DL. Overcoming overconfidence in learning from video-recorded lectures: implications of interpolated testing for online education. J Appl Res Mem Cogn. 2014;3(3):161–4. doi: 10.1016/j.jarmac.2014.02.001
16. Wammes JD, Seli P, Cheyne JA, Boucher P, Smilek D. Mind wandering during lectures II: relation to academic performance. Scholars Teach Learn Psychol. 2016;2(1):33–48. doi: 10.1037/stl0000055
17. Pachai AA, Acai A, LoGuidice AB, Kim JA. The mind that wanders: challenges and potential benefits of mind wandering in education. Scholars Teach Learn Psychol. 2016;2(2):134–46. doi: 10.1037/stl0000060
18. Wilson K, Korn JH. Attention during lectures: beyond ten minutes. Teach Psychol. 2007;34(2):85–9. doi: 10.1080/00986280701291291
19. Szpunar KK, Moulton ST, Schacter DL. Mind wandering and education: from the classroom to online learning. Front Psychol. 2013;4:495. doi: 10.3389/fpsyg.2013.00495 23914183
20. Lindquist SI, McLean JP. Daydreaming and its correlates in an educational environment. Learn Individ Differ. 2011;21(2):158–67. doi: 10.1016/j.lindif.2010.12.006
21. Jin CY, Borst JP, van Vugt MK. Predicting task-general mind-wandering with EEG. Cogn Affect Behav Neurosci. 2019. doi: 10.3758/s13415-019-00707-1 30850931
22. Barron E, Riby LM, Greer J, Smallwood J. Absorbed in thought: the effect of mind wandering on the processing of relevant and irrelevant events. Psychol Sci. 2011;22(5):596–601. doi: 10.1177/0956797611404083 21460338
23. McVay JC, Kane MJ. Conducting the train of thought: working memory capacity, goal neglect, and mind wandering in an executive-control task. J Exp Psychol Learn Mem Cogn. 2009;35(1):196–204. doi: 10.1037/a0014104 19210090
25. Weinstein Y. Mind-wandering, how do I measure thee with probes? Let me count the ways. Behav Res Methods. 2018;50(2):642–61. doi: 10.3758/s13428-017-0891-9 28643155
26. Christoff K, Gordon AM, Smallwood J, Smith R, Schooler JW. Experience sampling during fMRI reveals default network and executive system contributions to mind wandering. Proc Natl Acad Sci U S A. 2009;106(21):8719–24. doi: 10.1073/pnas.0900234106 19433790
27. Antrobus JS. Information theory and stimulus-independent thought. Br J Psychol. 1968;59(4):423–39. doi: 10.1111/j.2044-8295.1968.tb01157.x
28. Seli P, Carriere JS, Smilek D. How few and far between? Examining the effects of probe rate on self-reported mind wandering. Front Psychol. 2013;4:430. doi: 10.3389/fpsyg.2013.00430 23882239
29. Spreng RN, Mar RA, Kim ASN. The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: a quantitative meta-analysis. J Cogn Neurosci. 2009;21(3):489–510. doi: 10.1162/jocn.2008.21029 18510452
30. Buckner RL, Andrews-Hanna JR, Schacter DL. The brain’s default network: anatomy, function, and relevance to disease. Ann N Y Acad Sci. 2008;1124(1):1–38. doi: 10.1196/annals.1440.011 18400922
31. Andrews-Hanna JR. The brain’s default network and its adaptive role in internal mentation. Neuroscientist. 2013;18(3):251–70. doi: 10.1177/1073858411403316 21677128
32. Dixon ML, Andrews-Hanna JR, Spreng RN, Irving ZC, Mills C, Girn M, et al. Interactions between the default network and dorsal attention network vary across default subsystems, time, and cognitive states. NeuroImage. 2017;147:632–49. doi: 10.1016/j.neuroimage.2016.12.073 28040543
33. McVay JC, Kane MJ. Does mind wandering reflect executive function or executive failure? Comment on Smallwood and Schooler (2006) and Watkins (2008). Psychol Bull. 2010;136(2):188–207. doi: 10.1037/a0018298 20192557
34. Kane MJ, McVay JC. What mind wandering reveals about executive-control abilities and failures. Curr Dir Psychol Sci. 2012;21(5):348–54. doi: 10.1177/0963721412454875
35. Axelrod V, Rees G, Lavidor M, Bar M. Increasing propensity to mind-wander with transcranial direct current stimulation. Proc Natl Acad Sci U S A. 2015;112(11):3314–9. doi: 10.1073/pnas.1421435112 25691738
36. Smallwood J, Beach E, Schooler JW, Handy TC. Going AWOL in the brain: mind wandering reduces cortical analysis of external events. J Cogn Neurosci. 2008;20(3):458–69. doi: 10.1162/jocn.2008.20037 18004943
37. Smallwood J. Distinguishing how from why the mind wanders: a process-occurrence framework for self-generated mental activity. Psychol Bull. 2013;139(3):519–35. doi: 10.1037/a0030010 23607430
38. Girn M, Mills C, Laycock E, Ellamil M, Ward L, Christoff K. Neural dynamics of spontaneous thought: an electroencephalographic study. Proceedings of the 11th International Conference on Augmented Cognition, Lecture Notes in Computer Science, vol 10284; 2017; Vancouver, British Columbia, Canada. Cham, Switzerland: Springer. p. 28–44.
39. Murray MM, Brunet D, Michel CM. Topographic ERP analyses: a step-by-step tutorial review. Brain Topogr. 2008;20(4):249–64. doi: 10.1007/s10548-008-0054-5 18347966
40. Braboszcz C, Delorme A. Lost in thoughts: neural markers of low alertness during mind wandering. NeuroImage. 2011;54(4):3040–7. doi: 10.1016/j.neuroimage.2010.10.008 20946963
41. Klimesch W. EEG alpha and theta oscillations reflect cognitive and memory performance: a review and analysis. Brain Res Rev. 1999;29(2–3):169–95. doi: 10.1016/S0165-0173(98)00056-3 10209231
42. Herrmann CS, Strüber D, Helfrich RF, Engel AK. EEG oscillations: from correlation to causality. Int J Psychophysiol. 2016;103:12–21. doi: 10.1016/j.ijpsycho.2015.02.003 25659527
43. Fries P. A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends Cogn Sci. 2005;9(10):474–80. doi: 10.1016/j.tics.2005.08.011 16150631
44. Fell J, Axmacher N. The role of phase synchronization in memory processes. Nat Rev Neurosci. 2011;12(2):105–18. doi: 10.1038/nrn2979 21248789
45. Schnitzler A, Gross J. Normal and pathological oscillatory communication in the brain. Nat Rev Neurosci. 2005;6(4):285–96. doi: 10.1038/nrn1650 15803160
46. Engel AK, Singer W. Temporal binding and the neural correlates of sensory awareness. Trends Cogn Sci. 2001;5(1):16–25. doi: 10.1016/S1364-6613(00)01568-0 11164732
47. Varela F, Lachaux J-P, Rodriguez E, Martinerie J. The brainweb: phase synchronization and large-scale integration. Nat Rev Neurosci. 2001;2(4):229–39. doi: 10.1038/35067550 11283746
48. Barry RJ, Clarke AR, Johnstone SJ. A review of electrophysiology in attention-deficit/hyperactivity disorder: I. Qualitative and quantitative electroencephalography. Clin Neurophysiol. 2003;114(2):171–83. doi: 10.1016/S1388-2457(02)00362-0 12559224
49. Keune PM, Hansen S, Weber E, Zapf F, Habich J, Muenssinger J, et al. Exploring resting-state EEG brain oscillatory activity in relation to cognitive functioning in multiple sclerosis. Clin Neurophysiol. 2017;128(9):1746–54. doi: 10.1016/j.clinph.2017.06.253 28772244
50. van Son D, De Blasio FM, Fogarty JS, Angelidis A, Barry RJ, Putman P. Frontal EEG theta/beta ratio during mind wandering episodes. Biol Psychol. 2019;140:19–27. doi: 10.1016/j.biopsycho.2018.11.003 30458199
51. Boudewyn MA, Long DL, Traxler MJ, Lesh TA, Dave S, Mangun GR, et al. Sensitivity to referential ambiguity in discourse: the role of attention, working memory, and verbal ability. J Cogn Neurosci. 2015;27(12):2309–23. doi: 10.1162/jocn_a_00837 26401815
52. Boudewyn MA, Carter CS. I must have missed that: alpha-band oscillations track attention to spoken language. Neuropsychologia. 2018;117:148–55. doi: 10.1016/j.neuropsychologia.2018.05.024 29842859
53. Smilek D, Carriere JSA, Cheyne JA. Out of mind, out of sight: eye blinking as indicator and embodiment of mind wandering. Psychol Sci. 2010;21(6):786–9. doi: 10.1177/0956797610368063 20554601
54. Hutt S, Mills C, Bosch N, Krasich K, Brockmole J, D’Mello S. Out of the fr-“eye”-ing pan: towards gaze-based models of attention during learning with technology in the classroom. Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization; 2017 Jul; Bratislava, Slovakia. New York, NY: ACM. p. 94–103.
55. Faber M, Bixler R, D’Mello SK. An automated behavioral measure of mind wandering during computerized reading. Behav Res Methods. 2018;50(1):134–50. doi: 10.3758/s13428-017-0857-y 28181186
56. Smallwood J, Davies JB, Heim D, Finnigan F, Sudberry M, O’Connor R, et al. Subjective experience and the attentional lapse: task engagement and disengagement during sustained attention. Conscious Cogn. 2004;13(4):657–90. doi: 10.1016/j.concog.2004.06.003 15522626
57. Pham P, Wang J. AttentiveLearner: improving mobile MOOC learning via implicit heart rate tracking. In: Conati C, Heffernan N, Mitrovic A, Verdejo MF, editors. Proceedings of the 17th International Conference on Artificial Intelligence in Education, Lecture Notes in Computer Science, vol 9112; 2015 Jun; Madrid, Spain. Cham, Switzerland: Springer. p. 367–76.
58. Blanchard N, Bixler R, Joyce T, D’Mello S. Automated physiological-based detection of mind wandering during learning. In: Trausan-Matu S, Boyer KE, Crosby M, Panourgia K, editors. Proceedings of the 12th International Conference on Intelligent Tutoring Systems, Lecture Notes in Computer Science, vol 8474; 2014 Jun; Honolulu, Hawaii, United States. Cham, Switzerland: Springer. p. 55–60.
59. Mittner M, Boekel W, Tucker AM, Turner BM, Heathcote A, Forstmann BU. When the brain takes a break: a model-based analysis of mind wandering. J Neurosci. 2014;34(49):16286–95. doi: 10.1523/JNEUROSCI.2062-14.2014 25471568
60. Durantin G, Dehais F, Delorme A. Characterization of mind wandering using fNIRS. Front Syst Neurosci. 2015;9(45). doi: 10.3389/fnsys.2015.00045 25859190
61. Zheng W-L, Lu B-L. Personalizing EEG-based affective models with transfer learning. Proceedings of the 25th International Joint Conference on Artificial Intelligence; 2016; New York, New York, USA: AAAI Press. p. 2732–8.
62. Dhindsa K, Becker S. Emotional reaction recognition from EEG. Proceedings of the 7th International Workshop on Pattern Recognition in Neuroimaging; 2017; Toronto, Ontario, Canada: IEEE. p. 1–4.
63. Wang J-W, Nie D, Lu B-L. Emotional state classification from EEG data using machine learning approach. Neurocomputing. 2014;129:94–106. doi: 10.1016/j.neucom.2013.06.046
64. Boshra R, Dhindsa K, Boursalie O, Ruiter KI, Sonnadara RR, Samavi R, et al. From group-level statistics to single-subject prediction: machine learning detection of concussion in retired athletes. IEEE Trans Neural Syst Rehabil Eng. 2019;27(7):1492–501. doi: 10.1109/TNSRE.2019.2922553 31199262
65. Jayaram V, Alamgir M, Altun Y, Schölkopf B, Grosse-Wentrup M. Transfer learning in brain-computer interfaces. IEEE Computational Intelligence Magazine. 2016;11(1):20–31. doi: 10.1109/MCI.2015.2501545
66. Meltzer JA, Negishi M, Mayes LC, Constable RT. Individual differences in EEG theta and alpha dynamics during working memory correlate with fMRI responses across subjects. Clin Neurophysiol. 2007;118(11):2419–36. doi: 10.1016/j.clinph.2007.07.023 17900976
67. Kane MJ, Smeekens BA, von Bastian CC, Lurquin JH, Carruth NP, Miyake A. A combined experimental and individual-differences investigation into mind wandering during a video lecture. J Exp Psychol Gen. 2017;146(11):1649–74. doi: 10.1037/xge0000362 29094964
68. Robison MK, Gath KI, Unsworth N. The neurotic wandering mind: an individual differences investigation of neuroticism, mind-wandering, and executive control. Q J Exp Psychol. 2017;70(4):649–63. doi: 10.1080/17470218.2016.1145706 26821933
69. Smeekens BA, Kane MJ. Working memory capacity, mind wandering, and creative cognition: an individual-differences investigation into the benefits of controlled versus spontaneous thought. Psychol Aesthet Creat Arts. 2016;10(4):389–415. doi: 10.1037/aca0000046 28458764
70. Kucyi A, Davis KD. Dynamic functional connectivity of the default mode network tracks daydreaming. NeuroImage. 2014;100(15):471–80. doi: 10.1016/j.neuroimage.2014.06.044 24973603
71. Chou Y-h, Sundman M, Whitson HE, Gaur P, Chu M-L, Weingarten CP, et al. Maintenance and representation of mind wandering during resting-state fMRI. Sci Rep. 2017;7:40722. doi: 10.1038/srep40722 28079189
72. McMaster LIVELab. n.d. Available from: http://livelab.mcmaster.ca/.
73. Sana F, Weston T, Cepeda NJ. Laptop multitasking hinders classroom learning for both users and nearby peers. Comput Educ. 2013;62:24–31. doi: 10.1016/j.compedu.2012.10.003
74. Maxwell SE, Delaney HD. Designing experiments and analyzing data. 2nd ed. New York: Taylor & Francis; 2004.
75. Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, Brodbeck C, et al. MEG and EEG data analysis with MNE-Python. Front Neurosci. 2013;7:267. doi: 10.3389/fnins.2013.00267 24431986
76. Carriere JS, Seli P, Smilek D. Wandering in both mind and body: individual differences in mind wandering and inattention predict fidgeting. Can J Exp Psychol. 2013;67(1):19–31. doi: 10.1037/a0031438 23458548
77. Chou Y-L. Statistical analysis with business and economic applications. 2nd ed. New York: Holt, Rinehart, and Winston; 1975.
78. Jas M, Engemann DA, Bekhti Y, Raimondo F, Gramfort A. Autoreject: automated artifact rejection for MEG and EEG data. NeuroImage. 2017;159:417–29. doi: 10.1016/j.neuroimage.2017.06.030 28645840
79. Urigüen JA, Garcia-Zapirain B. EEG artifact removal—state-of-the-art and guidelines. J Neural Eng. 2015;12(3). doi: 10.1088/1741-2560/12/3/031001 25834104
80. Lee T-W, Girolami M, Sejnowski TJ. Independent component analysis using an extended infomax algorithm for mixed subgaussian and supergaussian sources. Neural Comput. 1999;11(2):417–41. doi: 10.1162/089976699300016719 9950738
81. Delorme A, Sejnowski T, Makeig S. Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis. NeuroImage. 2007;34(4):1443–9. doi: 10.1016/j.neuroimage.2006.11.004 17188898
82. Ramoser H, Muller-Gerking J, Pfurtscheller G. Optimal spatial filtering of single trial EEG during imagined hand movement. IEEE Trans Rehabil Eng. 2000;8(4):441–6. 11204034
83. Blankertz B, Tomioka R, Lemm S. Optimizing spatial filters for robust EEG single-trial analysis. IEEE Signal Process Mag. 2008;25(1):41–56.
85. Dhindsa K, Becker S. Toward an open-ended BCI: a user-centered coadaptive design. Neural Comput. 2017;29(10):2742–68. doi: 10.1162/neco_a_01001 28777722
86. Dhindsa K, Carcone D, Becker S. A brain-computer interface based on abstract visual and auditory imagery: evidence for an effect of artistic training. 11th International Conference on Augmented Cognition, AC 2017, Held as Part of HCI International 2017; Vancouver, BC, Canada: Springer.
88. Wammes JD, Boucher P, Seli P, Smilek D. Mind wandering during lectures I: changes in rates across an entire semester. Scholars Teach Learn Psychol. 2016;2(1):13–32. doi: 10.1037/stl0000053
89. Arvaneh M, Guan C, Ang KK, Quek C. Robust EEG channel selection across sessions in brain-computer interface involving stroke patients. Proceedings of the 2012 International Joint Conference on Neural Networks; 2012 Jun; Brisbane, Queensland, Australia: IEEE. p. 1–6.
90. Cho H, Ahn M, Kim K, Jun SC. Increasing session-to-session transfer in a brain–computer interface with on-site background noise acquisition. J Neural Eng. 2015;12(6). doi: 10.1088/1741-2560/12/6/066009 26447843
91. Goel P, Joshi RB, Sur M, Murthy HA. A common spatial pattern approach for classification of mental counting and motor execution EEG. In: Tiwary US, editor. Proceedings of the 10th International Conference on Intelligent Human Computer Interaction, Lecture Notes in Computer Science, vol 11278; 2018 Dec; Allahabad, India. Cham, Switzerland: Springer. p. 26–35.
92. Sayed-Mouchaweh M, Lughofer E, editors. Learning in non-stationary environments: methods and applications. New York, NY: Springer; 2012.
93. Seli P, Ralph BCW, Konishi M, Smilek D, Schacter DL. What did you have in mind? Examining the content of intentional and unintentional types of mind wandering. Conscious Cogn. 2017;51:149–56. doi: 10.1016/j.concog.2017.03.007 28371688
94. Song J, Davey C, Poulsen C, Luu P, Turovets S, Anderson E, et al. EEG source localization: sensor density and head surface coverage. J Neurosci Methods. 2015;256:9–21. doi: 10.1016/j.jneumeth.2015.08.015 26300183
95. Sohrabpour A, Lu Y, Kankirawatana P, Blount J, Kim H, He B. Effect of EEG electrode number on epileptic source localization in pediatric patients. Clin Neurophysiol. 2015;126(3):472–80. doi: 10.1016/j.clinph.2014.05.038 25088733
96. Jann K, Kottlow M, Dierks T, Boesch C, Koenig T. Topographic electrophysiological signatures of fMRI resting state networks. PLOS ONE. 2010;5(9):e12945. doi: 10.1371/journal.pone.0012945 20877577
97. Knyazev GG. EEG correlates of self-referential processing. Front Hum Neurosci. 2013;6:264. doi: 10.3389/fnhum.2013.00264 23761757
98. Laufs H, Krakow K, Sterzer P, Eger E, Beyerle A, Salek-Haddadi A, et al. Electroencephalographic signatures of attentional and cognitive default modes in spontaneous brain activity fluctuations at rest. Proc Natl Acad Sci U S A. 2003;100(19):11053–8. doi: 10.1073/pnas.1831638100 12958209
99. Gonçalves SI, de Munck JC, Pouwels PJW, Schoonhoven R, Kuijer JPA, Maurits NM, et al. Correlating the alpha rhythm to BOLD using simultaneous EEG/fMRI: inter-subject variability. NeuroImage. 2006;30(1):203–13. doi: 10.1016/j.neuroimage.2005.09.062 16290018
100. Jann K, Dierks T, Boesch C, Kottlow M, Strik W, Koenig T. BOLD correlates of EEG alpha phase-locking and the fMRI default mode network. NeuroImage. 2009;45(3):903–16. doi: 10.1016/j.neuroimage.2009.01.001 19280706
101. Goldman R, Stern J, Engel J, Cohen M. Simultaneous EEG and fMRI of the alpha rhythm. Neuroreport. 2002;13(18):2487–92. doi: 10.1097/01.wnr.0000047685.08940.d0 12499854
102. Baumeister J, Barthel T, Geiss KR, Weiss M. Influence of phosphatidylserine on cognitive performance and cortical activity after induced stress. Nutr Neurosci. 2008;11(3):103–10. doi: 10.1179/147683008X301478 18616866
103. Lotte F, Bougrain L, Cichocki A, Clerc M, Congedo M, Rakotomamonjy A, et al. A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update. J Neural Eng. 2018;15(3):031005. doi: 10.1088/1741-2552/aab2f2 29488902
104. Saha S, Ahmed KI, Mostafa R, Khandoker AH, Hadjileontiadis L. Enhanced inter-subject brain computer interface with associative sensorimotor oscillations. Healthc Technol Lett. 2017;4(1):39–43. doi: 10.1049/htl.2016.0073 28529762
105. Zedelius CM, Gross ME, Schooler JW. Mind wandering: more than a bad habit. In: Verplanken B, editor. The psychology of habit. Basel: Springer; 2018. p. 363–78.
106. Fox KCR, Spreng RN, Ellamil M, Andrews-Hanna JR, Christoff K. The wandering brain: meta-analysis of functional neuroimaging studies of mind-wandering and related spontaneous thought processes. NeuroImage. 2015;111:611–21. doi: 10.1016/j.neuroimage.2015.02.039 25725466
107. Seli P, Carriere JS, Smilek D. Not all mind wandering is created equal: dissociating deliberate from spontaneous mind wandering. Psychol Res. 2015;79(5):750–8. doi: 10.1007/s00426-014-0617-x 25284016