Deep Belief Networks Based Brain Activity Classification Using EEG from Slow Cortical Potentials in Stroke

An electroencephalogram (EEG) is a recording of electrical activity taken from the scalp over the sensorimotor cortex while subjects are awake or asleep. It can be used to detect potential problems associated with brain disorders. The aim of this study is to assess the clinical usefulness of EEG recorded during slow cortical potential (SCP) training in stroke patients using a Deep Belief Network (DBN). The DBN is trained greedily, layer by layer: Restricted Boltzmann Machines (RBM) provide unsupervised estimates of the weights and biases, which are then refined by supervised neural-network training. EEGs were recorded during eight SCP neurofeedback sessions from two stroke patients at a sampling rate of 256 Hz. All EEGs were filtered with a low-pass filter. The Hilbert-Huang Transform was applied to the trials, yielding varying numbers of Intrinsic Mode Functions (IMFs). Higher-order statistics and standard statistics were extracted from the IMFs to create the dataset. The proposed DBN-based classifier discriminated positivity and negativity tasks in stroke patients with a sensitivity of 90.30%, a specificity of 96.58%, and an accuracy of 91.15%.


Introduction
An electroencephalogram (EEG) is a biomedical signal that records the electrical activity of the brain [1]. Neurons in the brain connect to each other through dendrites and axons, and they communicate by electrical impulses carried over these connections. This electrical activity can be recorded and monitored by placing electrodes on specific areas of the head [2]. The placement of the electrodes and the channel names are shown in Figure 1. The EEG is used in several areas: enabling interaction between human and machine [3,4], determining the response of the brain to visual and auditory signals [5,6], supporting the diagnosis of psycho-physiological and neurological disorders [7][8][9][10][11], and helping doctors make a quick assessment of normal and abnormal patterns from the peaks and valleys of the EEG in all age groups. Brain disorders that can be diagnosed using an EEG include seizure disorders (such as epilepsy), head injury, encephalitis, brain tumour, encephalopathy, memory problems, sleep disorders, stroke, and dementia. The normal electrical activity of the brain, including its potentials, is disrupted in many neurological disorders [2]. The kind of disorder can be determined by evaluating the shape and the interval of the disruptions on the EEG. The EEG carries characteristic features that are helpful in the early diagnosis and early treatment of psycho-physiological and neurological disorders [11].
A slow cortical potential (SCP) is a gradual change in the cortical layer that reflects the regulation of cortical excitability in cortical neuronal networks [12]. An SCP lasts between 300 ms and 10 s [13]. The SCP is a low-frequency EEG component, and recording it is non-invasive. The electrodes are placed at the top centre of the head, and a conditioned-response method is used to give feedback. Due to these characteristics, the SCP may include the contingent negative variation, the readiness potential, movement-related potentials, and the P300 and N400 potentials [1]. A negative SCP is assumed to reflect depolarization of cortical neuronal cells, and a positive SCP indicates neuronal complication [13]. The SCP has been correlated with a large number of cognitive processes in a systematic and topographically specific way and has been used in psychophysiological experiments to dissociate cognitive functions and motor performance of the brain [14,15]. Ergenoglu et al. examined the relationship between the SCP and P300 amplitude [12], and Pham et al. developed auditory brain-computer interface stimuli for paralysed patients using the SCP [5].
Stroke is a brain disorder that occurs when the blood supply to brain cells is cut off or reduced and the cells begin to die. The abilities controlled by the dead cells, such as memory and muscle control, are lost. The disorder affects the EEG and can be detected from the SCP and other potentials.
Deep Learning (DL) is an effective machine learning approach of growing popularity that attempts to model high-level abstractions [18]. In recent years, DL has emerged as a new direction in machine learning research and has been applied frequently to image processing, character recognition, and speech recognition. Convolutional Neural Networks, Stacked Auto-encoders, Deep Boltzmann Machines, and Deep Belief Networks (DBN) are among the most effective DL algorithms [19]. The biggest advantage of DL is that it replaces handcrafted features with efficient algorithms for unsupervised or supervised feature learning and hierarchical feature extraction [20]. In this study, the DBN algorithm is used as the classifier. The DBN is a robust and simple DL algorithm that comprises both supervised and unsupervised learning stages. The results obtained with the DBN are compared with those of an artificial neural network (ANN) with the same structure and the same number of hidden units.

Materials and Methods
This section explains the SCP training database, the pre-processing of the database, the Hilbert-Huang Transform, and the statistical feature extraction methods.

Database
The EEG is a recording of electrical activity taken from the scalp over the sensorimotor cortex while subjects are awake or asleep. It can be used to detect potential problems associated with brain disorders. EEG data were recorded during eight SCP neurofeedback sessions from two chronic stroke patients [21]. The sessions were conducted at intervals of approximately one week. The EEG was recorded at a sampling rate of 256 Hz from channel Cz using a Nexus-10 MKII DC amplifier (Mindmedia, Herten, The Netherlands). Each neurofeedback session included trials in which cortical positivity had to be increased and trials in which cortical negativity had to be increased. Each trial lasts 8 s; its timeline is shown in Figure 2. The feedback is a circle whose size and colour depend on how successfully the subject shifts activity relative to baseline. Each trial is labelled as positivity or negativity according to its task, and success is indicated to the participant when the task is performed correctly. All EEGs were filtered with a low-pass filter (10 Hz). Negativity trials were more frequent than positivity trials in the SCP training sessions. A total of 8000 trials (500 trials per session and patient), each with 2400 data points, were segmented from the EEGs of the two patients. The distribution of positivity and negativity trials in the database is given in Table 1.
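The pre-processing described here (low-pass filtering at 10 Hz, then cutting the recording into fixed-length trials) can be sketched as follows. The text does not state the filter type or order, so the Hamming-windowed sinc FIR design, the 101-tap length, and all function names are assumptions made for illustration only.

```python
import math

FS = 256          # sampling rate in Hz (from the text)
CUTOFF = 10.0     # low-pass cutoff in Hz (from the text)


def lowpass_kernel(cutoff_hz, fs, num_taps=101):
    """Hamming-windowed sinc low-pass kernel (the filter design is an assumption)."""
    fc = cutoff_hz / fs                      # normalised cutoff in cycles/sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]         # normalise to unit gain at 0 Hz


def apply_fir(signal, taps):
    """Zero-padded 'same' convolution that compensates the filter delay."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + half - j
            if 0 <= idx < len(signal):
                acc += t * signal[idx]
        out.append(acc)
    return out


def segment_trials(signal, trial_len):
    """Cut a recording into consecutive, non-overlapping trials."""
    n_trials = len(signal) // trial_len
    return [signal[i * trial_len:(i + 1) * trial_len] for i in range(n_trials)]
```

With the values given in the text, `segment_trials(filtered, 2400)` would produce trials of 2400 data points. How the original study aligned trial boundaries with the feedback timeline is not stated, so the simple consecutive segmentation is also an assumption.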

Hilbert-Huang Transform
The Hilbert-Huang Transform (HHT) is frequently used for feature extraction, signal filtering, and similar processing of nonlinear and non-stationary signals. The growing interest in analysing non-stationary and nonlinear signals has broadened the technical requirements placed on analysis methods [22]. The HHT is an adaptive and comprehensive method that is relatively recent. It has been applied in many fields, such as biomedical signal processing and geophysics [23].
Because the stopping criteria of the HHT algorithm are flexible, the algorithm has no precise and clear mathematical definition [24]. The HHT is a two-step analysis: the first step is Empirical Mode Decomposition (EMD), and the second is the Hilbert Transform (HT). The EMD pre-processes the original data and extracts n Intrinsic Mode Functions (IMFs) and a residual signal. Each IMF is a frequency-modulated component of the original data. The HT is then used to obtain the instantaneous frequency and amplitude of each IMF in the time-frequency domain [11,24]. For non-stationary and nonlinear signals, the HHT can give a more precise, distinctive, and clear time-frequency-energy representation than other methods [25].

Empirical Mode Decomposition
The EMD is a flexible analysis method used to extract characteristic information from nonlinear and non-stationary processes. What distinguishes the EMD from other transformation algorithms is that it derives the oscillation modes from the data itself, assuming that the signal consists of intrinsic oscillation modes in different frequency bands [22,26]. In each sifting step, the local mean is subtracted from the signal; if the resulting signal does not satisfy the IMF extraction conditions, the local mean is recalculated using the local maxima and local minima of the new signal. This calculation is repeated until an IMF is extracted. The residual signal is obtained by subtracting each extracted IMF from the previous form of the residual signal. When the residual signal is a monotonic function or has only one local extremum, no further IMF can be extracted and the EMD process ends [11,23,24]. The original signal is then the sum of the IMFs and the last residual:
$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$$
where $r_n(t)$ is the last residual signal, $x(t)$ is the original signal, $c_i(t)$ is the $i$-th IMF, and $n$ is the number of IMFs.
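The checks that drive the EMD loop can be sketched in a few lines. The sketch below covers only the first IMF condition (the numbers of local extrema and zero-crossings differ by at most one) and the monotonic-residual stopping test. It does not build the spline envelopes of a full sifting step, and the function names are hypothetical.

```python
def count_local_extrema(x):
    """Count strict local maxima and minima of a sampled signal."""
    count = 0
    for i in range(1, len(x) - 1):
        is_max = x[i] > x[i - 1] and x[i] > x[i + 1]
        is_min = x[i] < x[i - 1] and x[i] < x[i + 1]
        if is_max or is_min:
            count += 1
    return count


def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def meets_extrema_condition(x):
    """First IMF condition: extrema and zero-crossing counts differ by at most one."""
    return abs(count_local_extrema(x) - count_zero_crossings(x)) <= 1


def is_monotonic(x):
    """Stopping test: a monotonic residual cannot yield another IMF."""
    diffs = [b - a for a, b in zip(x, x[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)
```

A zero-mean oscillation passes `meets_extrema_condition`, whereas the same oscillation riding on a large offset fails it, because the offset removes the zero-crossings while the extrema remain.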

Hilbert Transform
The foremost characteristic of nonlinear signals is intra-wave frequency modulation, in which the instantaneous frequency oscillates within a single period. The signal characteristics are indicated directly by the instantaneous frequency distribution. The HT determines the amplitude-frequency-time distribution of the signal [23,26]. The HT of a signal $x(t)$ is defined as:
$$y(t) = \frac{1}{\pi}\,\mathrm{P}\!\int_{-\infty}^{\infty} \frac{x(\tau)}{t-\tau}\,d\tau$$
where $\mathrm{P}$ denotes the Cauchy principal value. The corresponding analytic signal is
$$z(t) = x(t) + i\,y(t) = a(t)\,e^{i\theta(t)}, \qquad a(t) = \sqrt{x^2(t) + y^2(t)}, \qquad \theta(t) = \arctan\frac{y(t)}{x(t)}, \qquad \omega(t) = \frac{d\theta(t)}{dt}$$
where $a(t)$ is the amplitude function of the signal and $\omega(t)$ is the instantaneous frequency function. The frequency-time distribution of the amplitude is called the Hilbert spectrum and is denoted $H(\omega, t)$.
The above equations show that each IMF is an amplitude- and frequency-modulated signal. The EMD proves its value by analysing non-stationary signals on variable amplitude and frequency scales.
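A discrete version of the Hilbert step can be sketched by building the analytic signal in the frequency domain and reading off its magnitude and phase increments. This is a minimal, dependency-free illustration: the naive O(N^2) DFT and the function names are assumptions, and a practical pipeline would use an FFT-based routine instead.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return z = x + i*H[x] by suppressing the negative-frequency bins."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue                    # keep DC and Nyquist bins unchanged
        elif k < (n + 1) // 2:
            spectrum[k] *= 2            # double the positive frequencies
        else:
            spectrum[k] = 0             # remove the negative frequencies
    return dft(spectrum, inverse=True)


def instantaneous_amplitude_frequency(x, fs):
    """Amplitude a(t) and instantaneous frequency in Hz of a sampled signal."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    # Phase increment between consecutive samples, wrapped to (-pi, pi].
    freq = [cmath.phase(z[i + 1] * z[i].conjugate()) * fs / (2 * math.pi)
            for i in range(len(z) - 1)]
    return amplitude, freq
```

Applied to one IMF at a time, the two returned sequences correspond to the amplitude function and the instantaneous frequency function of that IMF.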

Deep Belief Networks
The DBN is a generative DL algorithm that consists of an unsupervised training phase and a supervised training phase. Its most important feature is that the weights, biases, and other parameters are pre-trained with an unsupervised algorithm such as a Sparse Autoencoder or Restricted Boltzmann Machines (RBM) [18,27]. In this study, the RBM is used as the pre-training algorithm in the unsupervised training phase.
The RBM is a stochastic ANN that learns the weights of its units according to a probability distribution over a set of inputs. A DBN can be built by stacking RBMs trained with gradient descent or contrastive divergence algorithms [28]. In the RBM, visible units that represent the input data are connected by undirected weighted connections to hidden units that learn to represent features [19,28]. Consider an RBM with input layer activations $v$ (visible units), hidden layer activations $h$ (hidden units), visible unit biases $a$, hidden unit biases $b$, and weights $w$:
$$E(v,h) = -\sum_{i} a_i v_i - \sum_{j} b_j h_j - \sum_{i,j} v_i w_{ij} h_j$$
$$P(v,h) = \frac{e^{-E(v,h)}}{Z}, \qquad Z = \sum_{v,h} e^{-E(v,h)}$$
where $P(v,h)$ is the joint distribution of the RBM and $E(v,h)$ is the energy function of the distribution. With at least two hidden layers in the network, the DBN uses the latent variables in its deepest layer to capture the deepest features of the data. Each pair of adjacent layers is connected so that greedy layer-wise pre-training can be carried out [18]. The parameters obtained in the unsupervised learning phase, such as the weights and biases, are unfolded into a neural network structure. The pre-trained parameters are then updated and fine-tuned in the supervised training phase, in which the whole network can be optimized by a gradient descent algorithm. The detailed formulation of the DBN is presented in [18,29].
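The RBM quantities discussed in this section can be illustrated with a small sketch for binary units. It uses the standard RBM energy with visible biases `a`, hidden biases `b`, and weights `W`, plus a single contrastive-divergence (CD-1) update on one example. The variable names, the per-example update, and the default learning rate are assumptions for illustration and not the configuration used in the study.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def energy(v, h, W, a, b):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i W_ij h_j."""
    visible = sum(ai * vi for ai, vi in zip(a, v))
    hidden = sum(bj * hj for bj, hj in zip(b, h))
    pair = sum(v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h)))
    return -visible - hidden - pair


def p_hidden(v, W, b):
    """P(h_j = 1 | v) for every hidden unit."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def p_visible(h, W, a):
    """P(v_i = 1 | h) for every visible unit."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]


def cd1_update(v0, W, a, b, lr=0.1, rng=random):
    """One CD-1 step on a single example; updates W, a, b in place."""
    ph0 = p_hidden(v0, W, b)
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]   # sample hidden states
    v1 = p_visible(h0, W, a)                               # reconstruction (probabilities)
    ph1 = p_hidden(v1, W, b)
    for i in range(len(a)):
        for j in range(len(b)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return W, a, b
```

In greedy layer-wise pre-training, the hidden probabilities of one trained RBM would serve as the visible input of the next RBM in the stack.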

Experimental Results
Brain-computer interfaces establish a connection between machines and the brain so that devices can be controlled with the SCP and other potentials. SCP neurofeedback training sessions are used to make a quick assessment of normal and abnormal patterns in the diagnosis of psycho-physiological and neurological disorders, using the peaks and valleys of the EEG. Pham et al. developed auditory brain-computer interface stimuli for paralysed patients using the SCP and noted that auditory stimulus characteristics may have to be adapted to optimize brain-computer interface performance [5]. Studies in the literature on cognitive functions, motor performance of the brain, and many neurological disorders indicate the efficiency and importance of SCP training and of classifying brain activity from SCP training, such as negativity or positivity trials in stroke patients.
The HHT is an effective method for nonlinear and non-stationary EEG signals. Huang et al. [25] proposed a method for communication between human and computer. They used the HHT and the wavelet transform to extract features from the steady-state visual evoked potential and reported that the HHT expresses time and frequency characteristics more accurately than the wavelet transform. Li et al. [30] proposed a sleep stage classification method using the EEG. They achieved a mean classification accuracy of 81.7% using HHT, Fourier transform, and wavelet transform features and argued that the HHT is more successful and responds faster in extracting EEG features and tracking rapid changes. Ozdemir et al. [11] and Oweis et al. [22] applied the HHT to EEG and used intra-wave frequency modulation in different frequency bands for epileptic seizure prediction and classification, achieving accuracy rates of 89.66% and 94%, respectively.
The confusion matrix of the proposed DBN classifier is shown in Table 2. Classification performance is measured by three statistical evaluation functions, specificity, sensitivity, and accuracy, which are obtained from the confusion matrix of the classification. The formulation of these performance measures is described in detail by Allahverdi et al. [29]. The classification performances achieved by the DBN classifier and by the ANN with the same number of hidden units in 2 or 3 hidden layers are presented in Table 3. The DBN and ANN classifiers used the same optimized learning rate, number of epochs, activation and output functions, and hidden layer structures. The DBN with 3 hidden layers gave the highest accuracy, 91.15%, using features from all neurofeedback sessions. It is hard to compare the performance of the proposed system with other work, because our detailed literature search found no reported study on SCP training in stroke. In this study, brain activity in stroke patients was classified successfully using SCP training. The results indicate a strong relation between brain activity and the SCP trials in stroke. The DL algorithm with the HHT-based statistical features achieved high classification performance: 90.30% sensitivity, 96.58% specificity, and 91.15% accuracy.
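The three performance measures follow directly from the four cells of a binary confusion matrix. The sketch below uses the usual definitions; which of the two classes (positivity or negativity) is treated as the "positive" class is an assumption here, and the function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: positive-class trials classified correctly/incorrectly.
    tn/fp: negative-class trials classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct
    return sensitivity, specificity, accuracy
```

For example, with purely hypothetical counts `tp=90, fn=10, tn=80, fp=20`, the function returns a sensitivity of 0.9, a specificity of 0.8, and an accuracy of 0.85.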

Conclusion
Communication between the brain and computerized methods makes it possible to identify electrical neural patterns as a thought before the pattern has fully manifested into a conscious feeling, allows paralysed people to control prosthetic limbs with their mind, and helps clinicians diagnose psycho-physiological and neurological disorders. The proposed technique is a DBN classification method that detects brain activity using trials from SCP training in stroke. The results indicate the efficiency of the HHT-based statistical features and the DBN when used together on EEG. The proposed system classifies the positivity and negativity trials with high accuracy, sensitivity, and specificity. The present results suggest that, for the HHT-based statistical and higher-order statistical features, the DBN classifier is more successful than the ANN in classifying brain activity from SCP training in stroke.
Khader et al. analysed the relations between the SCP and blood-oxygen-level-dependent (BOLD) signal changes [16]. Devrim et al. investigated the detection of visual stimuli at sensory threshold using the SCP [6]. Kotchoubey et al. used SCP training in research on epilepsy, with an analysis of influencing factors [7,8]. Strehl et al. used functional magnetic resonance imaging and the BOLD signal in the SCP to reduce epileptic seizure frequency [17]. Siniatchkin et al. evaluated the analysis of migraine [9]. Schneider et al. determined the efficiency of SCP training in psychiatric patients with alcohol dependency [10]. Cosch et al. associated the SCP with event-related potentials for object, spatial, and verbal information [4]. Hinterberger et al. proposed a robust and steady communication method between computer and brain for amyotrophic lateral sclerosis patients using the SCP and worked on developing a thought translation device [3].

Figure 1. EEG electrode placements on the head
The following sections describe in detail the SCP training database, the Hilbert-Huang Transform (HHT), the HHT-based statistical feature extraction, and the DBN classifier. The proposed SCP training classification system is explained, and the experimental results obtained with the DBN classifier are presented.

Figure 2. Timeline of a trial from a neurofeedback session: baseline (0-2 s) and active phase (2-8 s)
Each oscillation is symmetric with respect to the local mean of the local extrema. Each distinct oscillation in the signal is represented by an IMF. The IMFs are extracted from the signal under two basic conditions [24,25]: (i) the number of local extrema and the number of zero-crossings must be equal or differ by at most one, and (ii) the mean of the upper and lower envelopes, which are formed from the local maxima and local minima, must be zero at any time t. These conditions prevent negative frequencies and keep the instantaneous frequency of narrow-band signals within the frequency band when the instantaneous frequency is calculated [11]. The local mean is the average of the lower envelope, defined by the local minima, and the upper envelope, defined by the local maxima. The local mean is subtracted from the original signal, and the new form of the signal is checked to verify whether it is an IMF. If the new form of the signal does not satisfy the conditions, the procedure is repeated on it.
Considering these achievements of the HHT on EEG, we decided to use the HHT for feature extraction in classifying brain activity from SCP training in stroke. The proposed method consists of EEG signal preparation, EEG analysis using the HHT with HHT-based statistical feature extraction, and DBN-based brain activity classification. To cover different patient conditions and recording circumstances in SCP training, a dataset was selected that comprises eight SCP neurofeedback sessions, conducted at approximately one-week intervals, from two chronic stroke patients. A low-pass filter is applied to remove potential noise and peaks in the EEG. In total, 8000 SCP trials with 2400 data points each were extracted from the stroke patients' EEGs. The extracted EEG trials are labelled as cortical negativity or positivity in the dataset. The EMD is applied to the extracted trials of 2400 data points. The number of IMFs obtained varies from 5 to 8.
Figure 3 depicts a random trial and the IMFs decomposed from it. The HT is applied to each IMF, and the amplitude-frequency-time distribution of each IMF is determined. Statistical features (standard deviation, correlation coefficient, skewness, kurtosis, minimum, maximum, covariance, mode, and mean), energy-based features, and higher-order statistical features such as moments and cumulants were calculated for each HT-applied IMF to create the feature set. Half of the trials from all neurofeedback sessions (1750 positivity trials and 2250 negativity trials) are used to train the DBN classifier model, and the remaining trials are reserved for testing the proposed DBN system.
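Several of the per-IMF features listed above can be computed from central moments. The sketch below is a minimal illustration covering the mean, standard deviation, minimum, maximum, energy, skewness, kurtosis, and the third and fourth cumulants. It assumes population (biased) estimators and a non-constant input, because the text does not give the exact feature definitions, and the function names are hypothetical.

```python
import math


def central_moment(x, order):
    """Population central moment of the given order."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)


def statistical_features(x):
    """Standard and higher-order statistics of one sequence (e.g. one IMF).

    Assumes x is not constant, since skewness and kurtosis divide by the variance.
    """
    var = central_moment(x, 2)
    std = math.sqrt(var)
    m3 = central_moment(x, 3)
    m4 = central_moment(x, 4)
    return {
        "mean": sum(x) / len(x),
        "std": std,
        "min": min(x),
        "max": max(x),
        "energy": sum(v * v for v in x),
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / var ** 2,
        "cumulant3": m3,                  # third cumulant equals the third central moment
        "cumulant4": m4 - 3 * var ** 2,   # fourth cumulant
    }
```

Concatenating the dictionaries of all IMFs of a trial would give one feature vector per trial. The number of IMFs varies from trial to trial, so a fixed-length layout would need some padding or selection rule that is not described in the text.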

Figure 4. The structure of the proposed DBN classifier (v denotes the visible layer, h denotes a hidden layer)
The proposed DBN has two outputs, the two brain activities in SCP training (positivity and negativity). Figure 4 shows the structure of the proposed DBN classifier with a visible layer and 3 hidden layers. RBM-based greedy layer-wise pre-training with 100 epochs is used in the unsupervised learning stage of the DBN. The optimization parameters, namely the learning rate, the activation function of the supervised learning phase, the output function of the DBN, and the number of hidden units of the RBMs, were determined iteratively. The models were tested with a limited number of parameter settings, and the highest classification performances are reported. The learning rate of the model is 2, and the softmax output function is used in the DBN. The sigmoid activation function is selected in all validation and training sessions. The DBN has 3 hidden layers with 350, 100, and 260 hidden units, respectively. The confusion matrix of the proposed DBN classifier is given in Table 2.

Figure 3. A random EEG trial and the extracted IMFs

Table 1. Stimuli trial distribution of each stroke patient
The aim of this study is to classify brain activity from SCP training in stroke patients and to identify the negativity and positivity trials in the neurofeedback. SCP training trials have been related in the literature to a large number of cognitive functions, to motor performance of the brain, and to many neurological disorders. Ergenoglu et al. worked with positivity and negativity SCP trials to relate SCP training to P300 amplitude and found that the P300 amplitudes of trials with negative SCPs are significantly higher than those of trials with positive SCPs at the Cz, Pz, Fz, P3, and P4 channels under normal and abnormal conditions of the brain [12]. Khader et al. investigated the relations between the SCP and blood-oxygen-level-dependent (BOLD) signal changes and reported a similar topographical specificity of the SCP trials and the BOLD signal in cognitive experiments [16]. Devrim et al. investigated the detection of visual stimuli at sensory threshold using the SCP and found that negative SCP trials separate better than positive SCP trials at the Oz, Pz, Cz, and Fz channels [6]. Kotchoubey et al. analysed the influencing factors in epilepsy using twenty sessions of SCP training and found that the positive trials are more important than the negative trials [7,8]. Strehl et al. used functional magnetic resonance imaging and the BOLD signal in the SCP and reduced epileptic seizure frequency [17]. Siniatchkin et al. analysed individual differences in migraine and found that the negative trials from the SCP differed significantly between individuals [9]. Schneider et al. determined the efficiency of SCP training with a self-regulation task, using biofeedback and instrumental conditioning, in psychiatric patients with alcohol dependency [10]. Hinterberger et al. developed a thought translation device for ALS patients using SCP trials [3], and Pham et al. developed an auditory brain-computer interface for paralysed patients using the SCP [5].

Table 2. Confusion matrix of the DBN classifier

Table 3. Classification performances according to classifier structures