
The reviewed studies applied a range of data mining algorithms to the analysis of epilepsy data. Each method was either novel or already existing: five (5) of the studies proposed a novel method, while ten (10) used existing methods.

The most frequently used data mining algorithm was the support vector machine (SVM), which appeared in nine (9) of the studies, either as the main data mining method or as part of a comparison.

Wang and others (2017), Martinez-del-Rinco and others (2017), Mohammed and others (2012) and Ying-Fang & Hsiu-sen (2016) all used the support vector machine as the data mining method, together with the publicly available dataset from Andrzejak and others (2001). Although three of the studies used different feature extraction methods, they obtained similar accuracies in the range of 82% to 90%: Mohammed and others (2012) yielded the highest accuracy of 89%, Martinez-del-Rinco and others (2017) yielded 85.9%, and Wang and others (2017) yielded the lowest at 82%. The differences in classification performance despite the shared database could be a result of the different feature extraction methods used.
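To illustrate why the choice of feature extraction can change the result on the same recordings, the following minimal sketch turns one EEG segment into a small time-domain feature vector. The function name `time_domain_features` and the four features chosen here are assumptions for illustration only; they are not the features used in any of the reviewed studies.

```python
import statistics


def time_domain_features(segment):
    """Compute a few simple time-domain features for one signal segment.

    The feature set is an illustrative assumption, not the one used by
    any study in this review.
    """
    mean = statistics.fmean(segment)
    variance = statistics.pvariance(segment)
    # Line length: sum of absolute differences between consecutive samples.
    line_length = sum(abs(b - a) for a, b in zip(segment, segment[1:]))
    # Count sign changes of the mean-centred signal.
    zero_crossings = sum(
        1 for a, b in zip(segment, segment[1:]) if (a - mean) * (b - mean) < 0
    )
    return {
        "mean": mean,
        "variance": variance,
        "line_length": line_length,
        "zero_crossings": zero_crossings,
    }


# Hypothetical six-sample segment, not real EEG data.
features = time_domain_features([1.0, -1.0, 2.0, -2.0, 1.0, -1.0])
print(features)
```

A classifier such as an SVM only sees the resulting feature vector, so two studies that extract different features from the same dataset are effectively training on different inputs.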

Additionally, Zhang and others (2015) and Zhang and others (2016) both used the CHB-MIT database with the SVM as the data mining method. The former yielded a perfect sensitivity of 100%, while the latter yielded a sensitivity of 95.1% and a specificity of 96.2%; in both cases the feature extraction method was not mentioned.
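Because the studies report a mix of accuracy, sensitivity and specificity, it may help to recall how the three measures relate to the same confusion-matrix counts. The sketch below uses the standard definitions; the function name `seizure_metrics` and the counts are hypothetical and are not taken from any reviewed study.

```python
def seizure_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn count seizure segments detected/missed; tn/fp count non-seizure
    segments correctly rejected/falsely flagged. Assumes both classes are
    present, so that no denominator is zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sens, spec, acc = seizure_metrics(tp=95, fp=4, tn=96, fn=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

A detector that flags every segment as a seizure reaches 100% sensitivity with 0% specificity, so a sensitivity figure reported on its own is difficult to compare with studies that report both measures.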


Finally, the studies by Mporas and others (2014), Turner and others (2014) and Conradsen and others (2010) used EEG data from the St. Thomas epilepsy clinic, a raw time series signal, and the Danish epilepsy center respectively, along with different feature extraction methods, as seen in the attached appendix. Conradsen and others (2010) reported a sensitivity in the range of 91-100% and a specificity of 100%, while Turner and others (2014) and Mporas and others (2014) reported accuracies of 85% and 90% respectively.

Of the six (6) remaining studies, Abualsaud and others (2014) and Zainuddin and others (2013) used the same publicly available dataset from Andrzejak and others (2001) but different data mining and feature extraction methods. In the former study the instance-based k-nearest neighbor (IBK) method yielded 99% accuracy, while the latter recorded an accuracy of 98.87%, a sensitivity of 94.96% and a specificity of 99.43%. The four other studies, Haydari and others (2011), Ahmed and others (2015), Siddiqui (2016) and Zhao and others (2016), used different data mining methods and feature extraction techniques, as seen in the appendix. Siddiqui (2016) and Haydari and others (2011) recorded detection accuracies of 100% and 87% respectively, while Ahmed and others (2015) and Zhao and others (2016) yielded a sensitivity of 91% and a selectivity greater than 90% respectively.
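As a rough illustration of how a k-nearest neighbor classifier of the kind reported for Abualsaud and others (2014) assigns a label, the sketch below classifies a feature vector by majority vote among its closest training examples. The function name `knn_predict`, the two-dimensional features and the data points are assumptions for illustration; they do not reproduce the configuration used in that study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors, not real EEG features.
train_x = [(0.2, 0.1), (0.3, 0.2), (0.25, 0.15),
           (1.8, 2.0), (2.1, 1.9), (1.9, 2.2)]
train_y = ["non-seizure"] * 3 + ["seizure"] * 3

print(knn_predict(train_x, train_y, (2.0, 2.0)))
print(knn_predict(train_x, train_y, (0.1, 0.2)))
```

The method stores the training instances instead of fitting a model, so prediction cost grows with the size of the training set.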

Table 2: Data mining methods with SVM used as classifier

Each entry lists the author, the data mining method, the strength of the method and its limitation.

Martinez-del-Rinco et al. 2017
Data mining method: Support vector machine & Bag of words
Strength of the method:
- Lower training computational cost.
- The method is robust and can deal with real environmental conditions.
- Its performance makes it usable in real hospital environments to help in epilepsy diagnosis.
Limitation:
- There is a risk of overfitting, which means the results reported may be artificially high and not necessarily a representation of the real performance.

Zhang et al. 2015
Data mining method: Support vector machine (SVM)
Limitation:
- Feature errors lead to substandard performance utilizing the baseline model.

Zhang et al. 2016 Support vector machines

-Some of the issues associated with

Conradsen et al. 2010 Support vector machines for research use only.

-There is some discomfort associated with wearing the suit containing sensors by detection method in the sub-band A4.

- Bayes net can cope with large data within a short time

The complexity of the disease may give rise to false positives and seizure detection based on EEG may have problems of signal processing and analysis.

Turner et al. 2014
Data mining method: Deep belief networks with 3 different classifiers (KNN, SVM & LR)
Strength of the method:
- In studies where training, validation and testing sets were from the same patient, seizure detection was successful on all patients.
Limitation:
- It may not always be possible to get long hours of training data from a patient to use as a model.

Wang et al. 2017
Data mining method: Decision tree, Random forest, SVM + C4.5 and SVM + RF
Strength of the method:
- No need for parameter tuning in RF, which makes it simple to apply and scale up the computation of a complex analysis.
Limitation:
- RF performs poorly in two-group classification, which may be because of the small size of the data set.

Mohammed et al. 2012
Data mining method: Fusion of Bayesian, K-nearest neighbor, neural network, linear discriminant analysis & SVM
Strength of the method:
- Classification accuracy of 89.5% was achieved using just time domain features, and it is expected the accuracy will increase when more features are added.
Limitation:
- All five classifiers use different mechanism features in the input feature vector.

Other data mining methods used

Each entry lists the author, the data mining method, the strength of the method and its limitation.

Haydari et al. 2011
Data mining method: Algorithm based on Genetic algorithm &
Strength of the method:
- Location of epilepsy is gotten in real time.
- Fast and can be used for large-scale EEG signals for patient monitoring.
Limitation:
- When Genetic algorithm is used to obtain the optimum basis function, it takes a bit of time, although it is done only once.

Zhao et al. 2016
Data mining method: Restricted Boltzmann machine (RBM)
Strength of the method:
- This model detects abnormal regions in resected areas in 58% of patients with 99% accuracy.
- This model can locate lesion during pre-surgical evaluation, thereby increasing the number of patients referred to surgery.
Limitation:
- There is a limited number of patient’s data that can be used to build an automated

Zainuddin et al. 2013
Data mining method: Improved Wavelet neural network with type-2 fuzzy C-means algorithm
Strength of the method:
- The proposed method showed good predictive ability with an accuracy of 98.7%.
- ANN is suitable in EEG studies because it can be used to find an association between rapid variations in EEG recordings and has the attribute of fault tolerance.
- EEG feature can be identified from the data by using the absolute values from the wavelet coefficients.
Limitation:
- A different data mining method (multilayer perceptron) based classifier from a different study achieved a better accuracy than this proposed method.

Abualsaud et al. 2014 BayesNet, Decision table, Insurance based EEG data have been used for epileptic seizure detection applications - High accuracy of about 99% gotten by the IBK.

- Each of the classifiers methods utilized


Ahmed et al. 2015
Data mining method: Spark streaming
Strength of the method:
- The results show that the method can be used in clinics for epilepsy patients.
- The method was able to detect seizures in real time and with low latency.
Limitation:
- Spark streaming needs to be improved before it is suitable for use in scientific applications and is currently applicable to unstructured data.

Mporas et al. 2014
Data mining method: SVM, C4.5 decision tree, K-nearest neighbor & multilayer perceptron neural network (MLP)
Strength of the method:
- A little improvement could still be achieved on the overall performance using a post-processing smoothing filter on the recognized labels.
Limitation:
- Relative comparison with other studies is not feasible because of differences in each dataset. The accuracy gotten is competitive to the performance reported in the literature.

Siddiqui 2016
Data mining method: Decision tree; systematic forest
Strength of the method:
- The only study that tried to detect seizures using ECoG dataset.
- The Sysfor is ideal because it produces realistic logic rules with a higher accuracy rate when compared to a single decision tree.
Limitation:
- It requires implanting electrodes into the brain in an invasive manner.

Table 2 highlights the strength and limitation of each study, providing an overview of each data mining method together with its expected challenges and advantages.