METHOD FOR SELECTING FEATURES FROM ELECTROENCEPHALOGRAM SIGNALS
20230248295 · 2023-08-10
Inventors
Cpc classification
A61B5/4088
HUMAN NECESSITIES
A61B5/374
HUMAN NECESSITIES
G16H50/20
PHYSICS
International classification
Abstract
A method for selecting features for the detection of a given brain condition, the method using a feature selector able to discriminate between at least two brain conditions, and using a plurality of electroencephalogram signals, relative to several brain electrodes and filtered on at least one frequency, the method comprising the following steps: a) for each signal, computing at least one value quantifying brain activity for each brain electrode; b) for each signal, performing a thresholding of said computed quantifying values according to at least one threshold percentage, thus forming at least one group of thresholded values corresponding to each signal; c) using said feature selector to rank said thresholded values; and d) based on this ranking, selecting at least one feature vector comprising at least one electrode, and one frequency and/or one threshold percentage, and corresponding to the best ranked thresholded values for the detection of said given brain condition.
Claims
1. A method for selecting features for the detection of a given brain condition, the method using a feature selector able to discriminate between at least two brain conditions, and using a plurality of electroencephalogram signals, relative to several brain electrodes and filtered on at least one frequency, the method comprising the following steps: a) for each signal, computing at least one value quantifying brain activity for each brain electrode; b) for each signal, performing a thresholding of said computed quantifying values according to at least one threshold percentage, thus forming at least one group of thresholded values corresponding to each signal; c) using said feature selector to rank said thresholded values; and d) based on this ranking, selecting at least one feature vector comprising at least one electrode, and one frequency and/or one threshold percentage, and corresponding to the best ranked thresholded values for the detection of said given brain condition.
2. The method of claim 1, wherein, in step a), said at least one computed value quantifies the functional connectivity between two brain electrodes arranged in a pair.
3. The method of claim 1, wherein, in step a), said at least one quantifying value is an epoch-based entropy computed between each pair of brain electrodes.
4. The method of claim 1, wherein the at least one quantifying value is a combination of several intermediate values, in particular computed by a mean, a variance, a variance of the discrete cumulative function or a standard deviation.
5. The method of claim 1, wherein said feature selector uses at least pre-acquired values quantifying brain activity relative to several brain conditions in order to discriminate between at least two brain conditions.
6. The method of claim 1, wherein said brain conditions are Alzheimer’s disease (AD), mild cognitive impairment (MCI), subjective cognitive impairment (SCI) or healthy brain.
7. The method of claim 1, wherein the selection of step d) comprises: a first selection of at least one electrode corresponding to the best ranked thresholded values according to the ranking of step c); using a feature selector to rank the thresholded values corresponding to the selected electrode and to at least one frequency and/or one threshold percentage, for the selection of at least a feature vector of one electrode, and one frequency and/or one threshold percentage corresponding to the best ranked thresholded values for the discrimination of said given brain condition.
8. The method of claim 7, wherein the feature selector is the feature selector used in step c) or a second feature selector.
9. The method of claim 1, wherein, at step b), the thresholding is a proportional thresholding, within a predefined range of percentage values and with a predefined step, the predefined range preferably being a range of percentage values between 0% and 100%, the predefined step preferably being less than or equal to 10%.
10. The method of claim 1, wherein the electroencephalogram signals are filtered on at least one frequency band, better on two frequency bands, better still on at least four frequency bands, even better on at least five frequency bands; said frequency band being chosen amongst the bands delta, theta, alpha, beta, and gamma.
11. The method of claim 1, wherein step d) provides at least one feature vector comprising one electrode, one frequency band, and one threshold percentage.
12. The method of claim 1, wherein the feature selector uses a Gram Schmidt algorithm, preferably using an Orthogonal Forward Regression (OFR), preferably in combination with a leave-one-out method.
13. The method of claim 1, wherein the feature selector uses a correlation-based feature selection method, a logistic regression, a ranking by Fisher ratio score, or a reverse sequential feature selection method.
14. The method of claim 1, wherein said at least one computed value is a complexity marker, or a slowing marker.
15. The method of claim 1, wherein, in step a), said at least one quantifying value is a coherence measure, a phase synchrony measure, a Granger causality measure, a Tsallis entropy, a sample entropy, or a mutual information, or a graph parameter, for example, a clustering coefficient, a small world, a shortest path, an efficiency, a centrality, or a modularity.
16. The method of claim 1, wherein, in step d), a combination of several feature vectors is selected.
17. The method of claim 1, wherein the electroencephalogram signals are relative to a range of 2 to 256 brain electrodes, especially to 30 brain electrodes.
18. A method for discriminating among several types of brain conditions a subject suspected to have a brain condition, at least based on electroencephalogram signals of the subject relative to several brain electrodes and filtered on at least one frequency, the method using at least a classifier, especially a linear SVM classifier, trained beforehand to learn to discriminate said brain condition based at least on a plurality of feature vectors previously-selected by using the feature selection method of claim 1, the method comprising the following steps: a) for each signal, computing at least one value quantifying brain activity for each brain electrode; b) performing a thresholding of said computed quantifying value(s) according to at least a threshold percentage selected by the feature selection method of claim 1; c) forming a subject feature vector comprising at least one electrode, and one frequency and/or one threshold percentage based on the feature selection method of claim 1, and d) operating the trained classifier on said subject feature vector to discriminate said subject among several types of brain conditions.
19. Computer program product for selecting features for the detection of a given brain condition, by using a feature selector able to discriminate between at least two brain conditions, and a plurality of electroencephalogram signals, relative to several brain electrodes and filtered on at least one frequency, the computer program product comprising a support and stored on this support instructions that can be read by a processor, these instructions being configured, when executed, for: a) for each signal, computing at least one value quantifying brain activity for each brain electrode; b) for each signal, performing a thresholding of said computed quantifying values according to at least one threshold percentage, thus forming at least one group of thresholded values corresponding to each signal; c) using said feature selector to rank said thresholded values; and d) based on this ranking, selecting at least one feature vector comprising at least one electrode, and one frequency and/or one threshold percentage, and corresponding to the best ranked thresholded values for the detection of said given brain condition.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0084] Other features and advantages of the invention will become more apparent on reading the following detailed description and on studying the attached drawing.
DETAILED DESCRIPTION
Acquisition of Electroencephalogram Signals
[0093] The method is based on an analysis of several electroencephalogram signals, each signal being acquired from a brain electrode positioned on the head of a subject. Usually, several EEG signals are acquired simultaneously from N brain electrodes positioned on the head of a subject.
[0095] Such acquisition may be performed for several subjects M, preferably some of said subjects suffering from a given brain condition and others who do not suffer from said given brain condition.
[0096] Each EEG signal is then filtered on at least one frequency or frequency band, preferably on at least four frequency bands, for example on the well-known frequency bands delta, theta, alpha and beta.
[0097] The lower limit and higher limit of these frequency bands can vary but they generally are respectively around [0.1 Hz-4 Hz], [4 Hz-8 Hz], [8 Hz-12 Hz] and [12 Hz-30 Hz]. Each EEG signal can also be filtered on the gamma frequency band, which generally corresponds to frequencies greater than 30 Hz.
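As a rough illustration of the band split described in [0096] and [0097], the sketch below isolates each band with an idealized FFT mask. The function name `split_into_bands`, the sampling rate and the synthetic test signal are assumptions made for this example only. The text does not state which filter is used, and a practical pipeline would more likely rely on a designed band-pass filter.

```python
import numpy as np

# Nominal band limits in Hz, taken from the text; the exact limits may vary.
BANDS = {"delta": (0.1, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 12.0), "beta": (12.0, 30.0)}

def split_into_bands(signal, fs, bands=BANDS):
    """Return {band name: filtered signal} using an ideal FFT mask (illustrative only)."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    filtered = {}
    for name, (low, high) in bands.items():
        mask = (freqs >= low) & (freqs < high)  # keep only the bins inside the band
        filtered[name] = np.fft.irfft(spectrum * mask, n=signal.size)
    return filtered

# Synthetic single-electrode signal: one delta component (2 Hz), one alpha component (10 Hz).
fs = 128.0                                   # assumed sampling rate
t = np.arange(0, 4.0, 1.0 / fs)
slow = np.sin(2 * np.pi * 2.0 * t)
fast = np.sin(2 * np.pi * 10.0 * t)
filtered = split_into_bands(slow + fast, fs)
```

Here `filtered["alpha"]` recovers the 10 Hz component and `filtered["delta"]` the 2 Hz component, because both frequencies fall exactly on FFT bins of the 4 s window.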
Quantifying Brain Activity
[0098] At step a), for each subject and for each frequency band, the functional connectivity may be computed between each pair of brain electrodes, preferably via the computation of epoch-based entropies.
[0099] As illustrated in
[0100] Based on the epoch-based entropies, graph theory methods may be applied. In particular, graph theory provides a spatial visualisation that can highlight the relationship between the structure and the process of the brain. In
[0101] Preferably, before computing the representation 10, the quantifying values, in this example the epoch-based entropies, are normalized, for example following a min-max method considering at least part of the quantifying values of all the subjects and all frequencies or frequency bands, better considering all of the quantifying values of all subjects and all frequencies or frequency bands. The quantifying values may be comprised between 0 and 1.
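The min-max normalization mentioned in [0101] can be sketched as follows. The array layout (subjects × frequency bands × electrodes × electrodes), the function name `min_max_normalize` and the random data are assumptions of this example; the point is only that a single minimum and maximum are taken over all subjects and all bands.

```python
import numpy as np

def min_max_normalize(values):
    """Scale all quantifying values to [0, 1] using one global min and max."""
    values = np.asarray(values, dtype=float)
    lowest, highest = values.min(), values.max()
    if highest == lowest:                 # degenerate case: all values identical
        return np.zeros_like(values)
    return (values - lowest) / (highest - lowest)

# Hypothetical quantifying values: 3 subjects, 4 bands, 5 x 5 electrode pairs.
rng = np.random.default_rng(0)
raw = rng.normal(loc=5.0, scale=2.0, size=(3, 4, 5, 5))
normalized = min_max_normalize(raw)
```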
[0102] The N epoch-based entropies relative to one electrode may then be combined, for example by measuring a variance of the discrete cumulative function, in order to have one quantifying brain activity value per electrode, per frequency band, and per subject. For each frequency band and for each subject, the quantifying values may thus be illustrated by a vector 1*N.
[0103] At the end of step a), a set of quantifying values is obtained comprising at least one quantifying value for each electrode, preferably for each frequency band and for each subject.
Epoch-Based Entropy Measure
[0104] An advantage of the epoch-based entropy is that it allows estimating the complexity of EEG signals locally over time but also spatially by estimating the inter-channel complexity.
[0105] To compute the epoch-based entropy, EEG signals are modelled by a continuous left-to-right Hidden Markov Model (HMM). Such modelling of two EEG signals is illustrated in
[0106] Then each observation z in a given epoch S_i is considered as a realization Z_i of a random variable Z that follows a given observation probability distribution P_i(z) modelled by the mixture of Gaussians 64. Consequently, each stationary epoch of the signal is associated with a random variable, and the entropy H*(Z_i) of the epoch S_i is that of an ensemble of realizations of Z_i:
[0107] By averaging the entropy over the K epochs of the EEG signal 61 of a subject, an entropy-based complexity value EpEn(Z) of the signal 61, called “epoch-based entropy”, is obtained as:
$$\mathrm{EpEn}(Z) = \frac{1}{K} \sum_{i=1}^{K} H^{*}(Z_i)$$
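The averaging step can be illustrated with a short sketch. The per-epoch entropy expression is not reproduced in this text, so the code assumes a Shannon-form entropy with a base-2 logarithm, computed from the values P_i(z) of the observations in each epoch. That form, the function names and the toy probabilities are assumptions, not the definition given by the source.

```python
import math

def epoch_entropy(probabilities):
    """Assumed Shannon-form entropy of one epoch, from the P_i(z) values of its observations."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def epoch_based_entropy(epochs):
    """EpEn: mean of the per-epoch entropies over the K epochs of a signal."""
    entropies = [epoch_entropy(epoch) for epoch in epochs]
    return sum(entropies) / len(entropies)

# Toy example with K = 2 epochs (hypothetical probability values).
epochs = [[0.5, 0.5], [0.25, 0.25, 0.25, 0.25]]
epen = epoch_based_entropy(epochs)
```

With these toy values the two epochs have entropies of 1 and 2, so their mean is 1.5.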
[0108] To model the inter-relations between EEG time series, filtered on one or more frequencies or frequency bands and recorded from N brain electrodes, an HMM is trained for each subject on a set of N EEG signals recorded from N brain electrodes. At a time t, a hidden state emits an N-dimensional observation vector. By applying the Viterbi algorithm, each EEG signal is segmented into K epochs, and the entropy H*(Z) of each epoch S_i is computed considering the probability density estimated by the HMM on the observations of the K epochs. Although all K epochs are matched between EEG channels, the model does not constrain these epochs to be of equal length for all channels. Finally, by averaging the entropy over all the K epochs, an epoch-based entropy value associated with the multi-channel EEG of the subject is computed.
[0109] The quantifying value is not limited to an epoch-based entropy. For example, it may be a graph parameter such as a clustering coefficient used as a segregation measure and defined as follows:
being the number of triangles around a vertex i and K_i being the degree of a vertex i and the sum of afferent and efferent connections.
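Since the defining expression is missing here, the sketch below uses the standard clustering coefficient of an undirected, unweighted graph, in which the local value of vertex i is the number of triangles around i divided by k_i(k_i - 1)/2. This is an assumption: the reference to afferent and efferent connections suggests that the source may use a directed variant, which is not reconstructed here. The function name and the toy graphs are illustrative.

```python
import numpy as np

def clustering_coefficient(adjacency):
    """Mean local clustering coefficient of an undirected, unweighted graph."""
    a = (np.asarray(adjacency) != 0).astype(float)
    np.fill_diagonal(a, 0.0)                   # ignore self-connections
    degree = a.sum(axis=1)
    triangles = np.diag(a @ a @ a) / 2.0       # number of triangles around each vertex
    possible = degree * (degree - 1) / 2.0     # number of neighbour pairs
    local = np.zeros_like(degree)
    valid = possible > 0                       # vertices with degree < 2 contribute 0
    local[valid] = triangles[valid] / possible[valid]
    return local.mean()

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]   # fully connected: every vertex closes a triangle
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]       # open chain: no triangle
c_triangle = clustering_coefficient(triangle)
c_path = clustering_coefficient(path)
```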
Thresholding
[0110] At step b), starting from all quantifying values for each electrode of a subject, a thresholding step is performed, for each frequency band separately.
[0111] Preferably, such thresholding is a percentage thresholding (PT), as represented in
[0112] Other examples of percentage thresholdings are illustrated in
[0113] For each subject, for each frequency band, 5 groups of thresholded values are thus obtained in the example of
[0114] In the example corresponding to the 100% thresholding, 100% of the quantifying values are kept, forming a group of thresholded values corresponding to a PT equal to 100%. In the example corresponding to the 70% thresholding, the 70% highest quantifying values are kept, forming another group of thresholded values corresponding to a PT equal to 70%. In another embodiment, the 70% lowest quantifying values can be kept. In the example corresponding to a PT equal to 50%, the 50% highest quantifying values are kept. In another embodiment, the 50% lowest quantifying values can be kept. In the example corresponding to a PT equal to 30%, the 30% highest quantifying values are kept. In another embodiment, the 30% lowest quantifying values can be kept. In the last example corresponding to a PT equal to 10%, the 10% highest quantifying values are kept. In another embodiment, the 10% lowest quantifying values can be kept.
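The percentage thresholding of [0114] can be sketched as below. The function name `percentage_threshold`, its parameters and the sample values are hypothetical. Replacing discarded values by a neutral value (zero here) follows the option mentioned later in the example section, and the `keep` argument covers the alternative embodiment in which the lowest values are kept.

```python
def percentage_threshold(values, pct, keep="highest", neutral=0.0):
    """Keep the pct% highest (or lowest) values; replace the others by a neutral value."""
    n_keep = int(round(len(values) * pct / 100.0))
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=(keep == "highest"))
    kept = set(order[:n_keep])                 # indices of the retained values
    return [v if i in kept else neutral for i, v in enumerate(values)]

# Hypothetical normalized quantifying values for one subject and one frequency band.
values = [0.05, 0.40, 0.90, 0.15, 0.70, 0.30, 0.55, 0.80, 0.20, 0.65]

# One group of thresholded values per PT, as in the 100/70/50/30/10 example.
groups = {pct: percentage_threshold(values, pct) for pct in (100, 70, 50, 30, 10)}
```

For instance, `groups[30]` keeps only 0.90, 0.80 and 0.70, and `percentage_threshold(values, 30, keep="lowest")` keeps 0.05, 0.15 and 0.20 instead.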
Feature Selector
[0115] At step c), in the example of
[0116] In particular, the feature selector comprises the following steps:
[0117] C1) selecting the feature vector f_i that best discriminates the brain conditions;
[0118] C2) projecting the output vector onto the null space of the selected feature vector and orthogonalizing the remaining feature vectors using the Gram Schmidt algorithm;
[0119] C3) removing the selected feature vector f_i from the list of feature vectors;
[0120] C4) returning to C1 until a condition is met.
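Steps C1 to C4 can be sketched as a small Gram Schmidt ranking loop. This is a minimal illustration under stated assumptions, not the implementation used by the source: the relevance score (squared cosine between a candidate and the output), the centering of the data, the `n_select` stop condition and all names are choices made for this example, and the leave-one-out procedure is omitted.

```python
import numpy as np

def gram_schmidt_rank(features, output, n_select=None, eps=1e-12):
    """Rank feature columns by orthogonal forward regression (illustrative sketch)."""
    X = np.asarray(features, dtype=float).copy()
    y = np.asarray(output, dtype=float).copy()
    X -= X.mean(axis=0)                      # centering is an assumption of this sketch
    y -= y.mean()
    remaining = list(range(X.shape[1]))
    if n_select is None:
        n_select = len(remaining)
    ranking = []
    while remaining and len(ranking) < n_select:          # C4: stop condition
        # C1: score each candidate by the squared cosine of its angle with the output.
        scores = {}
        for j in remaining:
            nx, ny = X[:, j] @ X[:, j], y @ y
            scores[j] = 0.0 if nx < eps or ny < eps else (X[:, j] @ y) ** 2 / (nx * ny)
        best = max(remaining, key=lambda j: scores[j])
        norm = np.linalg.norm(X[:, best])
        ranking.append(best)
        remaining.remove(best)                            # C3
        if norm < eps:
            continue
        # C2: project the output and the remaining features onto the
        # null space of the selected feature.
        q = X[:, best] / norm
        y -= (q @ y) * q
        for j in remaining:
            X[:, j] -= (q @ X[:, j]) * q
    return ranking

# Hypothetical data: 8 subjects, 3 candidate features, binary condition labels.
y_demo = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
X_demo = np.column_stack([
    [0, 1, 0, 1, 0, 1, 0, 1],                       # unrelated to the labels
    [0.9, 1.1, 1.0, 0.8, 0.1, 0.0, 0.2, -0.1],      # closely follows the labels
    [1, 0, 1, 0, 1, 0, 0, 0],                       # weakly related
])
ranking = gram_schmidt_rank(X_demo, y_demo)
```

In this toy case the second column (index 1) is ranked first, since it is almost collinear with the centered labels.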
[0121] Said feature selector allows highlighting which thresholded values can discriminate the given brain condition from other brain conditions by ranking said thresholded values.
[0122] Preferably a leave-one-out procedure is used in case of sparse data, that is to say when EEG signals of few subjects are acquired to apply the method as described.
[0123] In particular, the feature selector uses at least pre-acquired values quantifying brain activity relative to several brain conditions in order to discriminate between at least two brain conditions. For example, pre-acquired values may correspond to the knowledge of which subjects have the given brain condition and which subjects do not have it. Pre-acquired values may also be clinical, medical or personal data relative to the subject, such as age, gender, educational level, profession, family situation, aetiology, subject history.
[0124] At step d), from the ranking of said thresholded values, a feature vector or better a combination of feature vectors is selected. For example, the triplet (alpha, 10, FC4) can be selected, which corresponds to the thresholded values obtained for an EEG signal acquired with an electrode positioned on the FC4 location, according to the International 10-20 system, filtered on the frequency band alpha and thresholded with a PT of 10%. In another embodiment, a combination of vectors may be selected such as the following example [(alpha, 10, FC4); (delta, 20, T6); (beta, 20, FC3); (beta, 30, FP2); (theta, 100, T3)].
[0125] Preferably, the number of feature vectors selected is defined by using a random probe method, preferably again with a predefined risk level of maximum 10%, better of maximum 5%. For this purpose, random realizations of thresholded values are generated, for example at least 100 random realizations, or better at least 1000 random realizations, and concatenated with the real thresholded values. The random thresholded values and the real thresholded values are then ranked as in step c). Feature vectors are then selected according to the predefined risk level. The estimation of the risk may be the condition to be met at step C4. A feature vector may be kept even though it might be less relevant than the probe, within the predefined risk level.
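One way to read the random probe method is sketched below: once real features and random probes have been ranked together, a real feature is kept as long as the fraction of probes ranked ahead of it does not exceed the risk level. This interpretation of the risk estimate, the function name `select_with_random_probes` and the made-up ranking are assumptions of this example.

```python
def select_with_random_probes(ranked, risk=0.05):
    """Keep real features until the share of probes ranked ahead exceeds the risk.

    `ranked` is a list of (label, is_probe) pairs ordered from best to worst rank.
    """
    n_probes = sum(1 for _, is_probe in ranked if is_probe)
    if n_probes == 0:
        raise ValueError("at least one random probe is required")
    selected, probes_seen = [], 0
    for label, is_probe in ranked:
        if is_probe:
            probes_seen += 1
            continue
        if probes_seen / n_probes > risk:    # estimated risk too high: stop selecting
            break
        selected.append(label)
    return selected

# Hypothetical ranking of 4 real features mixed with 100 random probes.
probe = ("probe", True)
ranked = ([("feature_A", False), ("feature_B", False), probe, ("feature_C", False)]
          + [probe] * 9 + [("feature_D", False)] + [probe] * 90)
kept = select_with_random_probes(ranked, risk=0.05)
```

Here `feature_C` is kept because only 1 probe out of 100 is ranked ahead of it, whereas `feature_D` is rejected because 10 probes out of 100 are ranked ahead of it.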
[0130] In one embodiment, the feature selector is the feature selector used in step c) or a second feature selector. In a preferred embodiment, both feature selectors are Orthogonal Forward Regression algorithms using the Gram Schmidt algorithm and used with a leave-one-out procedure. Preferably again, the number of feature vectors to select is defined using a random probe method.
[0131] In one embodiment, steps b) and c) may be repeated to compare different ranges of PT, as the feature selection is sensitive to the considered ranges of PT, for example considering the ranges 0% to 100%, 0% to 90%, 0% to 80% and so on down to 0% to 20%, with a step of 10% over the whole range 0% to 100%. One can also consider only part of the range, for example 10% to 100% with a step of 20%: 10% to 100%, 10% to 80%, 10% to 60%, 10% to 40%, and 10% to 20%.
[0132] The number of thresholded values depends on the range of the PT and on the predefined step. For example, when considering a 10%-60% range with a predefined step of 10%, for the four frequency bands alpha, beta, theta and delta, 24 cases are considered for each electrode for each subject.
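The count in [0132] follows from enumerating every (frequency band, PT) pair. A small sketch, with a hypothetical helper name `threshold_grid`:

```python
from itertools import product

def threshold_grid(pt_min, pt_max, step, bands):
    """List every (frequency band, PT) case for an inclusive PT range and a step."""
    pts = range(pt_min, pt_max + 1, step)
    return list(product(bands, pts))

bands = ["alpha", "beta", "theta", "delta"]
cases = threshold_grid(10, 60, 10, bands)
print(len(cases))  # 24: six PT values (10% to 60%) times four frequency bands
```

The same helper gives 40 cases for the 10% to 100% range with a step of 10%, which matches the 4 × 10 combinations used in the example further on.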
[0133] Different feature vectors or combination of feature vectors may be selected depending on the range and the step of the PT.
EXAMPLE
[0134] A preferred embodiment of the method for selecting features for the discrimination of a given brain condition is illustrated in
[0135] In this example, EEG signals of 50 subjects are acquired, said EEG signals being relative to 30 brain electrodes. Among those 50 subjects, 28 subjects are AD subjects and 22 subjects are SCI subjects.
[0136] Each EEG signal is filtered on the following frequency bands: delta (0.1-4 Hz), theta (4-8 Hz), alpha (8-12 Hz) and beta (12-30 Hz).
[0137] For each subject and for each frequency band, epoch-based entropy is computed between each pair of brain electrodes, forming a 30×30 matrix comprising the computed quantifying values.
[0138] Each value of each matrix is then thresholded, by applying a percentage thresholding, within a range 10%-100% with a step of 10%. For each subject and for each frequency band, 10 groups of thresholded values are obtained, each comprising 30×30 thresholded values. The quantifying values that have not been retained after the thresholding step are then replaced by a neutral value, e.g. a mean value or a zero value for the set of values considered.
[0139] Each matrix 21 comprising the thresholded values is then converted to a vector 22, of size 30×1, wherein, in this example, each value relative to a brain electrode is the variance of the discrete cumulative function (variance of DCF) of the thresholded values relative to said electrode. The vectors 22 relative to each of the 50 subjects are then grouped into a matrix 23 of size 50×30.
[0140] For each frequency band and for each PT, the thresholded values are ranked using a feature selector 31, a Gram Schmidt algorithm following a leave-one-out procedure in this example. This ranking allows selecting at least one brain electrode, that is to say a vector 24 of size 50×1, said at least one brain electrode being relevant for discriminating the two brain conditions, namely AD and SCI, for the given frequency band and the given PT. In this example, this step is therefore repeated 40 times, for each of the 4 frequency bands and each of the 10 groups of thresholded values, forming a matrix 25 of size 50×40 that regroups the 40 selected vectors 24.
[0141] Another ranking is performed based on the selected thresholded values of matrix 25 comprising the thresholded values relative to the selected brain electrodes, using a feature selector 32 being a Gram Schmidt algorithm following a leave-one-out procedure. This ranking allows selecting at least one frequency band and at least one PT, thus forming at least one feature vector composed of a brain electrode, or a location, a frequency band and a percentage threshold step, for example the brain electrode positioned on the location FC4, according to the international 10-20 system, the frequency band alpha and the thresholding 10%.
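The two-stage selection of [0140] and [0141] can be followed through its matrix shapes with the sketch below. Random numbers stand in for the matrices 23, and a simple squared-correlation score stands in for the Gram Schmidt ranking with leave-one-out; both are assumptions made to keep the example short, as are all variable names. Only one electrode and one (band, PT) pair are selected here.

```python
import numpy as np

rng = np.random.default_rng(0)
bands = ["delta", "theta", "alpha", "beta"]
pts = list(range(10, 101, 10))               # PT of 10% to 100% with a step of 10%
n_subjects, n_electrodes = 50, 30
labels = np.array([1.0] * 28 + [0.0] * 22)   # 28 AD subjects, 22 SCI subjects

def best_column(X, y):
    """Stand-in ranker: index of the column most correlated (squared) with y."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    scores = (Xc.T @ yc) ** 2 / ((Xc ** 2).sum(axis=0) * (yc @ yc))
    return int(np.argmax(scores))

# First stage: one selected electrode (vector 24, 50x1) per band and per PT.
selected_electrode, columns = {}, []
for band in bands:
    for pt in pts:
        matrix_23 = rng.random((n_subjects, n_electrodes))   # synthetic 50x30 matrix
        electrode = best_column(matrix_23, labels)
        selected_electrode[(band, pt)] = electrode
        columns.append(matrix_23[:, electrode])
matrix_25 = np.column_stack(columns)                          # 50x40 matrix

# Second stage: rank the 40 columns to pick a frequency band and a PT.
band, pt = list(selected_electrode)[best_column(matrix_25, labels)]
feature_vector = (band, pt, selected_electrode[(band, pt)])   # (band, PT, electrode index)
```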
[0142] In a preferred embodiment, to select the at least one frequency band and the at least one PT, a random probe method is applied, defining the number of feature vectors to keep. In other words, the random probe method allows determining which triplet “brain electrode, frequency band, percentage threshold” discriminates the two brain conditions better than a random variable, considering a predefined acceptable risk level. To do that, several random variables are generated and ranked together with the thresholded values of matrix 25.
[0144] The selection of the at least one feature vector is preferably performed successively and comprises the following steps:
[0145] pre-selection of a feature vector;
[0146] determining whether this pre-selection respects the predefined acceptable risk level;
[0147] if the pre-selection respects the predefined acceptable risk level:
[0148] selection of the feature vector,
[0149] removal of the feature vector from matrix 25,
[0150] return to the pre-selection step;
[0151] if the pre-selection does not respect the predefined acceptable risk level, stopping the feature vector selection.
[0152] The number of selected feature vectors may vary but all selected feature vectors are advantageously significant.
Performances of the Feature Selection Method
[0153] The table of
[0154] The table shows that an accuracy of 80% is obtained considering all the 10 PT ranges by selecting the following combination of feature vectors [(alpha, 10, FC4); (delta, 20, T6); (beta, 20, FC3); (beta, 30, FP2); (theta, 100, T3)] or considering only 2 PT ranges (10% to 20%, with a step of 10%) by selecting the following combination of feature vectors [(alpha, 10, FC4); (delta, 20, T6); (beta, 20, FC3); (theta, 10, TP7)].
[0155] The method allows reaching a good classification performance, comparable to the known methods, by combining different PT, that is to say different network density scales, different frequency bands and different electrodes, that is to say different locations on the brain, and furthermore by using only one EEG marker, in this example the functional connectivity via the computation of the epoch-based entropy.
Discriminating Method
[0157] Such a method uses a classifier, for example a linear Support Vector Machine (SVM), which may determine a score estimating whether said subject has said brain condition or not.
[0158] For this purpose, the classifier is trained beforehand to learn to discriminate said brain condition based at least on a plurality of feature vectors previously selected by using the method for selecting features for the discrimination of a given brain condition according to the invention.
[0159] Once the classifier is trained, it is possible to estimate, for subjects suspected to have said brain condition, whether or not they have said brain condition, from feature vectors issued from the analysis of acquired EEG signals.
[0160] As such, for a subject, EEG signals of said subject are acquired and filtered on at least one frequency or frequency band, as previously explained. The acquisition and the filtering may depend on which feature vector(s) are previously selected for the training of the classifier.
[0161] At step a), for each EEG signal, at least one value quantifying brain activity for each brain electrode is computed, for example an epoch-based entropy is computed between each pair of brain electrodes. The computation of the quantifying brain activity may depend on which feature vector(s) are previously selected for the training of the classifier.
[0162] At step b), a percentage thresholding is then performed on the computed quantifying values depending on which feature vector(s) are previously selected for the training of the classifier.
[0163] Step c) comprises the formation, from the thresholded values, of at least one feature vector which is given, at step d), as an entry to the classifier, said feature vector comprising at least one brain electrode, and one frequency or frequency band and/or one threshold percentage.
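Steps c) and d) can be sketched with a linear SVM from scikit-learn. The training data below are synthetic and the class separation is arbitrary; they only show how a subject feature vector, built from the previously selected (frequency band, PT, electrode) triplets, is passed to a trained classifier. The number of features, the values and the variable names are assumptions of this example.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_features = 5   # e.g. one value per selected (band, PT, electrode) triplet

# Synthetic training set: two well-separated groups of feature vectors.
X_train = np.vstack([rng.normal(0.2, 0.05, size=(20, n_features)),
                     rng.normal(0.8, 0.05, size=(20, n_features))])
y_train = np.array(["SCI"] * 20 + ["AD"] * 20)

classifier = SVC(kernel="linear")            # linear SVM, trained beforehand
classifier.fit(X_train, y_train)

# Step c): subject feature vector; step d): operate the trained classifier on it.
subject = np.full((1, n_features), 0.75)
predicted_class = classifier.predict(subject)[0]
score = classifier.decision_function(subject)[0]   # signed distance to the separating hyperplane
```

The sign and magnitude of `score` give a continuous output alongside the predicted class, which corresponds to the kind of score mentioned in the text.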
[0164] Preferably, the output of the classifier is a score estimating whether said subject has said brain condition, or several scores, each corresponding to one brain condition. The score may be, for example, a probability or a percentage, which may be transmitted to the subject and/or to a medical expert. The output may also be a class, such as “AD”, “SCI”, or “MCI”, or a combination thereof, for example a probability that the subject belongs to the “AD” class, and/or a probability that the subject belongs to the “SCI” class, and/or a probability that the subject belongs to the “MCI” class.
[0165] The transmission of such a result may be made electronically, for example sent by email or displayed on a screen, for example on a computer, a mobile phone or a tablet. Alternatively, or additionally, it can be printed.
[0166] The invention is not limited to the examples described above.
[0167] In particular, the method is not limited to a discrimination between AD subjects and SCI subjects.
[0168] The acquisition of the EEG signals is not limited to an acquisition realized with 30 brain electrodes.
[0169] The feature selector is not limited to a Gram Schmidt algorithm in combination with a leave-one-out procedure. The Gram Schmidt algorithm may be used in combination with a k-folds procedure. Alternatively, the feature selector may be a correlation-based feature selection method, a logistic regression, a ranking by Fisher ratio score, a decision tree, a principal component analysis, a linear discriminant analysis, a t-SNE (t-distributed stochastic neighbor embedding), a recursive feature elimination or a reverse sequential feature selection method, in combination with a leave-one-out procedure or a k-folds procedure.
[0170] In particular, the lower limit and higher limit of the frequency bands are not limited to those described. Indeed, the lower limit and the higher limit of these well-known frequency bands can advantageously be defined as follows:
[0171] the lower limit of the frequency band delta can be comprised between 0.05 Hz and 1 Hz and the higher limit of the frequency band delta can be comprised between 2 Hz and 5 Hz;
[0172] the lower limit of the frequency band theta can be comprised between 3 Hz and 5 Hz and the higher limit of the frequency band theta can be comprised between 6 Hz and 9 Hz;
[0173] the lower limit of the frequency band alpha can be comprised between 6 Hz and 9 Hz and the higher limit of the frequency band alpha can be comprised between 10 Hz and 17 Hz;
[0174] the lower limit of the frequency band beta can be comprised between 10 Hz and 17 Hz and the higher limit of the frequency band beta can be comprised between 25 Hz and 35 Hz;
[0175] the lower limit of the frequency band gamma can be comprised between 25 Hz and 35 Hz and the higher limit of the frequency band gamma can reach 100 Hz.