FEEDBACK LOOP FOR EMOTION RECOGNITION SYSTEM
20220409113 · 2022-12-29
Inventors
CPC classification
A61B5/374
HUMAN NECESSITIES
A61B5/165
HUMAN NECESSITIES
A61B5/7264
HUMAN NECESSITIES
A61B5/0205
HUMAN NECESSITIES
International classification
A61B5/16
HUMAN NECESSITIES
A61B5/00
HUMAN NECESSITIES
Abstract
The present invention relates to a system and method of emotion recognition. An emotion recognition system may utilize a Valence-Arousal factor along with training data. The training data may exist as emotions assigned to actual measurements of user inputs. The actual measurements of user inputs may be assigned to a plurality of points on the Valence-Arousal model. A user input acquisition device may be used to collect actual measurements of user inputs. A processor may utilize an algorithm to assign user emotions based on the training data. A user may provide feedback on the assigned user emotions, and the training data may be updated based on the user feedback, depending on whether the user feedback is considered an outlier to the training data.
Claims
1. An emotion recognition system comprising: a. a Valence-Arousal model comprising: i. a Valence factor comprising two endpoints; ii. an Arousal factor comprising two endpoints; iii. a plurality of points, one of said plurality of points being an origin; b. an algorithm; c. a user input acquisition device; d. a database comprising training data, said training data comprising: i. actual measurements of user inputs assigned to the endpoints of each of the Valence factor and the Arousal factor; ii. actual measurements of user inputs assigned to some of the plurality of points, said some of the plurality of points being in addition to the endpoints of the Valence factor and the Arousal factor; iii. emotions assigned to each actual measurement of user inputs; e. a processor; and f. a user device, wherein the user input acquisition device collects actual measurements of user inputs, and wherein the actual measurements of user inputs are transmitted to the database in the form of non-transitory computer-readable media, and wherein the processor retrieves the actual measurements of user inputs from the database, and wherein the processor uses the algorithm to assign the actual measurements of user inputs to one or more of the plurality of points of the Valence-Arousal model, and wherein the processor uses the algorithm to recognize the closest corresponding emotions to the one or more of the plurality of points, and wherein the processor uses the algorithm to assign user emotions based on the closest corresponding emotions to said one or more of the plurality of points, and wherein the user emotions are transmitted to a user device in the form of non-transitory computer-readable media, and wherein the user emotions are displayed on the user device in the form of human-readable information, and wherein a user uses the user device to provide the user's emotion feedback.
2. The emotion recognition system of claim 1, further comprising a credibility algorithm, wherein the credibility algorithm determines that the user's emotion feedback is an outlier, and wherein the user's emotion feedback is discarded.
3. The emotion recognition system of claim 1, further comprising a credibility algorithm, wherein the credibility algorithm determines that the user's emotion feedback is not an outlier, and wherein the processor re-assigns the emotions to re-assigned points on the Valence-Arousal model based on the user's emotion feedback, and wherein the processor updates the algorithm based on the re-assigned points.
4. The emotion recognition system of claim 1, wherein the user input acquisition device is an EEG device.
5. The emotion recognition system of claim 1, wherein the user input acquisition device is an ECG device.
6. The emotion recognition system of claim 1, wherein the actual measurements of user inputs are transformed into Hjorth parameters for further processing.
7. The emotion recognition system of claim 1, wherein the Fourier transform is applied to the actual measurements of user inputs to obtain other measurements of user inputs.
8. The emotion recognition system of claim 1, wherein the processor uses the algorithm to assign the actual measurements of user inputs to one or more of the plurality of points of the Valence-Arousal model by converting the actual measurements of user inputs into one or more scalograms or other continuous wavelet transformation coefficient(s) as the input of machine learning/deep learning.
9. The emotion recognition system of claim 8, wherein one or more pre-trained algorithms, such as VGG16, are used within the convolutional neural network.
10. A method of recognizing emotions comprising: a. providing a Valence-Arousal model, the Valence-Arousal model comprising: i. a Valence factor comprising two endpoints; ii. an Arousal factor comprising two endpoints; iii. a plurality of points, one of said plurality of points being an origin; b. assigning training data to the Valence-Arousal model, comprising: i. assigning actual measurements of user inputs to the endpoints of each of the Valence factor and the Arousal factor; ii. assigning actual measurements of user inputs to some of the plurality of points, said some of the plurality of points being in addition to the endpoints of the Valence factor and the Arousal factor; iii. assigning an emotion to each actual measurement of user inputs; c. providing an algorithm; d. providing a user input acquisition device; e. collecting actual measurements of user inputs with the user input acquisition device; f. providing a processor; g. using the processor to denoise the actual measurements of user inputs; h. transmitting the actual measurements of user inputs to a database in the form of non-transitory computer-readable media; i. using the processor to retrieve the actual measurements of user inputs from the database; j. using the processor to recognize one or more user emotions by: i. using the algorithm to assign the actual measurements of user inputs to one or more of the plurality of points of the Valence-Arousal model; ii. using the algorithm to recognize the closest corresponding emotions to the one or more of the plurality of points; iii. assigning user emotions based on the closest corresponding emotions; k. transmitting the user emotions to a user device in the form of non-transitory computer-readable media; l. displaying the user emotions on the user device in the form of human-readable text; and m. using the user device to provide the user's emotion feedback.
11. The method of claim 10, further comprising providing a credibility algorithm, wherein after using the user device to provide the user's emotion feedback, the credibility algorithm is used to determine that the user's emotion feedback is an outlier, and wherein the user's emotion feedback is discarded.
12. The method of claim 10, further comprising providing a credibility algorithm, wherein after using the user device to provide the user's emotion feedback, the credibility algorithm is used to determine that the user's emotion feedback is not an outlier, and wherein the emotions are re-assigned to re-assigned points on the Valence-Arousal model, and wherein the algorithm is updated based on the re-assigned points.
13. The method of claim 12, wherein said method is continuously repeated.
14. The method of claim 10, wherein the user input acquisition device is an EEG device.
15. The method of claim 10, wherein the user input acquisition device is an ECG device.
16. The method of claim 10, wherein the actual measurements of user inputs are transformed into Hjorth parameters for further processing.
17. The method of claim 10, wherein a Fourier transform is applied to the actual measurements of user inputs to obtain other measurements of user inputs.
18. The method of claim 17, further comprising denoising the other measurements of user inputs.
19. The emotion recognition system of claim 1, wherein the processor uses the algorithm to assign the actual measurements of user inputs to one or more of the plurality of points of the Valence-Arousal model by converting the actual measurements of user inputs into one or more scalograms.
20. The method of claim 10, wherein one or more pre-trained algorithms, such as VGG16, are used within the convolutional neural network.
Description
BRIEF DESCRIPTION OF THE FIGURES
DETAILED DESCRIPTION
[0052] The description provided herein describes example embodiments of the present invention and is not to be interpreted as limiting the invention to any particular embodiment, feature, step, or property. The figures provided and described herein also illustrate example embodiments of the invention, and are not to be interpreted as limiting the invention to any particular embodiment, feature, step, or property.
[0053] As shown in
[0054] The Pleasant end of the Valence factor may be considered positive Valence, and the Unpleasant end of the Valence factor may be considered negative Valence. The Activated end of the Arousal factor may be considered positive Arousal, and the Deactivated end of the Arousal factor may be considered negative Arousal. The intersection of the Valence and Arousal axes may be referred to as the origin of the Valence-Arousal model, wherein the magnitude of both Valence and Arousal are zero. The halves of a Valence-Arousal model defined by each axis may be referred to as “hemispheres”. For example, in
[0055] Because the axes of a Valence-Arousal model have finite endpoints, Valence-Arousal models may be depicted as circles, as shown in
[0056] As shown in
[0057] The Valence-Arousal models illustrated in
[0058] As shown in
[0059] Emotion recognition 303 is carried out by the processor utilizing the actual measurements of user inputs, a Valence-Arousal model, training data, and an algorithm. The output of emotion recognition 303 is user emotions that are provided to a user via emotion analysis output display to user 304. Emotion analysis output display to user 304 includes transmitting the user emotions to a user device in the form of non-transitory computer-readable media. The user may provide emotion labels 305 by using the user device to provide the user's emotion feedback, which may be based on the user's own understanding of their own emotions. The method may further comprise utilizing a credibility algorithm for user emotion feedback evaluation considering user credibility 306, in which the credibility algorithm determines whether the user's emotion feedback is considered an outlier to the training data. Incoming emotion entry into database 307 includes entering either the user emotions or the user's emotion feedback into a database.
[0060] As shown in
[0061] During Valence-Arousal analysis 403, a processor uses an algorithm to assign the actual measurements of user inputs to one or more of a plurality of points of a Valence-Arousal model. The processor then uses the algorithm to recognize the closest corresponding emotions, said emotions having been assigned to some of the plurality of points as part of the training data. The processor then assigns user emotions based on the closest corresponding emotions during overall rating 404.
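The assignment of actual measurements to the closest corresponding emotion, as described in this paragraph, may be illustrated by the following sketch. The emotion labels and Valence-Arousal coordinates below are illustrative assumptions, not values taken from this description; the axes are assumed normalized to [-1, 1].

```python
import math

# Hypothetical training data: emotions assigned to points on the
# Valence-Arousal plane (coordinates are illustrative assumptions).
TRAINING_POINTS = {
    (0.8, 0.6): "happy",     # positive Valence, positive Arousal
    (-0.7, 0.7): "angry",    # negative Valence, positive Arousal
    (-0.6, -0.6): "sad",     # negative Valence, negative Arousal
    (0.7, -0.5): "relaxed",  # positive Valence, negative Arousal
}

def closest_emotion(valence, arousal):
    """Return the training emotion nearest to the measured (V, A) point."""
    def dist(point):
        v, a = point
        return math.hypot(valence - v, arousal - a)
    nearest = min(TRAINING_POINTS, key=dist)
    return TRAINING_POINTS[nearest]
```

For example, a measurement assigned to the point (0.9, 0.5) would be recognized as the illustrative emotion "happy", since that is the nearest labeled point.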
[0062] As shown in
[0063] If the user's emotion feedback is not an exact match to the user emotions recognized 500, it is determined whether the user's emotion feedback and the user emotions provided belong to the same hemisphere of Valence 501. If not, the user's emotion feedback is subject to manual review by experts 507, said experts being humans trained in the art of emotion recognition, such as psychiatrists. If the manual review by experts 507 determines that the user's emotion feedback is valid, then the user's emotion feedback is stored in the database 508 and the user's credibility is increased 510. If the manual review by experts 507 determines that the user's emotion feedback is invalid, then the user's emotion feedback is not entered into the database and the user's credibility is decreased 509.
[0064] If the user's emotion feedback and the user emotions provided belong to the same hemisphere of Valence 501, it is then determined if the user's emotion feedback and user emotions provided belong to the same hemisphere of Arousal 502. If so, it is then determined if the user's credibility “C” is higher than Cv 503. If C is higher than Cv, the emotions of the training data are re-assigned to re-assigned points on the Valence-Arousal model and the algorithm is updated based on the re-assigned points 504. The user's emotion feedback is entered into the database 508 and C is increased 510.
[0065] If, when determining if C is higher than Cv 503, it is determined that C is lower than Cv, the user's emotion feedback is subject to manual review by experts 507. If the manual review by experts 507 determines that the user's emotion feedback is valid, then the user's emotion feedback is stored in the database 508 and C is increased 510. If the manual review by experts 507 determines that the user's emotion feedback is invalid, then the user's emotion feedback is not entered into the database and the user's credibility is decreased 509.
[0066] If the user's emotion feedback and the user emotions provided belong to the same hemisphere of Valence 501 but not to the same hemisphere of Arousal 502, the difference in Arousal level between the user's emotion feedback and the user emotions provided is calculated and compared to Ax 505. If said difference is greater than Ax, the user's emotion feedback is subject to manual review by experts 507. If the manual review by experts 507 determines that the user's emotion feedback is valid, then the user's emotion feedback is stored in the database 508 and C is increased 510. If the manual review by experts 507 determines that the user's emotion feedback is invalid, then the user's emotion feedback is not entered into the database and the user's credibility is decreased 509.
[0067] If the difference in Arousal level between the user's emotion feedback and the user emotions provided 505 is less than Ax, C is compared to Ca 506. If C is greater than Ca, the emotions of the training data are re-assigned to re-assigned points on the Valence-Arousal model and the algorithm is updated based on the re-assigned points 504. The user's emotion feedback is entered into the database 508 and C is increased 510. If C is less than Ca, the user's emotion feedback is subject to manual review by experts 507. If the manual review by experts 507 determines that the user's emotion feedback is valid, then the user's emotion feedback is stored in the database 508 and C is increased 510. If the manual review by experts 507 determines that the user's emotion feedback is invalid, then the user's emotion feedback is not entered into the database and the user's credibility is decreased 509.
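The feedback-evaluation flow of paragraphs [0063]-[0067] may be sketched as follows. The hemisphere tests (sign of the Valence or Arousal value) and the threshold comparisons follow the description above; the function name, the return values, and the representation of feedback as (Valence, Arousal) pairs are illustrative assumptions.

```python
def evaluate_feedback(feedback, recognized, credibility, Cv, Ca, Ax):
    """Sketch of the feedback-evaluation flow.

    feedback and recognized are (valence, arousal) pairs; Cv and Ca are
    credibility thresholds, Ax is an Arousal-difference threshold, and
    credibility is the user's credibility score C.  Returns "match"
    (exact agreement), "accept" (store feedback and update the model),
    or "review" (route to manual expert review).
    """
    fv, fa = feedback
    rv, ra = recognized
    if (fv, fa) == (rv, ra):
        return "match"               # exact match: nothing to update
    if (fv >= 0) != (rv >= 0):
        return "review"              # different Valence hemisphere (501)
    if (fa >= 0) == (ra >= 0):
        # same Valence and same Arousal hemisphere: compare C to Cv (503)
        return "accept" if credibility > Cv else "review"
    # same Valence hemisphere, different Arousal hemisphere (505)
    if abs(fa - ra) > Ax:
        return "review"
    return "accept" if credibility > Ca else "review"    # (506)
```

In the "accept" case the feedback would be entered into the database and the credibility increased; in the "review" case storage and the credibility adjustment would depend on the outcome of the expert review.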
[0068] As used in this description, C denotes the user's credibility score, Cv and Ca denote credibility thresholds, and Ax denotes an Arousal-difference threshold. These terms shall be interpreted consistently with their use in the comparisons described above.
[0069] The emotion recognition system of the present invention may comprise a test period (also referred to as a “reference period”) and a running period. As shown in
[0070] As shown in
[0071] As shown in
[0072] During environmental noise removal 801, 50 Hz or 60 Hz power line noise removal 802 may occur, during which portions of the raw user input that have frequencies of exactly 50 Hz or 60 Hz are removed from the user input. These specific frequency values are chosen since noise created by EEG devices is generally at the frequency of 50 Hz in European EEG devices and 60 Hz in American EEG devices. High pass and low pass band filtering 803 may also occur as part of environmental noise removal 801. During high pass and low pass band filtering 803, portions of the raw user input in the range of 0-4 Hz inclusive (low pass band) and portions of the raw user input in the range of 45-128 Hz inclusive (high pass band) are removed from the user input. These frequency ranges are chosen since they correlate with common environmental noise generated by events such as but not limited to the user moving or the user touching the user input acquisition device. Removal of other environmental noise 804 may further occur as part of environmental noise removal 801.
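One possible implementation of environmental noise removal 801 is sketched below, assuming the SciPy signal-processing library: a notch filter at the power-line frequency (50 Hz for European devices, 60 Hz for American devices) followed by a 4-45 Hz band-pass filter, which discards both the 0-4 Hz and the 45-128 Hz ranges described above. The sample rate, filter order, and notch quality factor are illustrative assumptions.

```python
import numpy as np
from scipy import signal

def remove_environmental_noise(raw, fs=256.0, line_freq=50.0):
    """Notch out power-line noise, then band-pass to 4-45 Hz.

    raw is a 1-D array of samples; fs is the sample rate in Hz
    (assumed here to be 256 Hz); line_freq is 50.0 or 60.0.
    """
    # Remove the power-line component with a narrow notch filter.
    b, a = signal.iirnotch(w0=line_freq, Q=30.0, fs=fs)
    x = signal.filtfilt(b, a, raw)
    # Band-pass 4-45 Hz: rejects both the 0-4 Hz and the >45 Hz ranges.
    sos = signal.butter(4, [4.0, 45.0], btype="bandpass", fs=fs, output="sos")
    return signal.sosfiltfilt(sos, x)
```

Zero-phase filtering (`filtfilt` / `sosfiltfilt`) is used so that the denoising does not shift the signal in time relative to the user annotations.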
[0073] During biosignals noise removal 805, unwanted user input may be removed from the raw user input by CNN for EOG, EMG, and ECG denoising 806. In emotion recognition systems that utilize EEG as the preferred user input, other user inputs such as eye blink artifacts (EOG), muscle artifacts (EMG), and heartbeat artifacts (ECG) may be considered noise, and therefore should be removed from the user input. This may be achieved using a convolutional neural network (CNN) which is trained to recognize EEG signals when mixed with other biosignals such as EOG, EMG, and ECG. A convolutional neural network is a class of AI algorithms configured to recognize 2-dimensional (2D) images. The "algorithm" referred to herein may be a convolutional neural network. The "credibility algorithm" referred to herein may also be a convolutional neural network. Examples of CNNs that exist in the art are GoogLeNet, VGG16, and VGG19. During CNN for EOG, EMG, and ECG denoising 806, the various biosignals such as EEG, EOG, EMG, and ECG may be converted into 2D images that may be recognized using the CNN. The CNN may then separate the EEG signal from the rest of the biosignals, thereby removing the rest of the biosignals from the user input.
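The conversion of a 1-D biosignal into a 2-D image suitable for a CNN, which claim 8 describes as producing scalograms or other continuous-wavelet-transform coefficients, may be sketched as follows. The complex Morlet wavelet, the parameter w = 6, and the frequency grid are illustrative assumptions; a production system would more likely use an established wavelet library.

```python
import numpy as np

def scalogram(sig, fs, freqs, w=6.0):
    """Minimal continuous-wavelet-transform sketch (complex Morlet).

    Turns a 1-D signal sampled at fs Hz into a 2-D time-frequency
    magnitude image, one row per analysis frequency, of the kind that
    could be fed to a CNN such as VGG16.
    """
    n = len(sig)
    out = np.empty((len(freqs), n))
    for i, f in enumerate(freqs):
        # Scale chosen so the wavelet's carrier oscillates at f Hz.
        scale = w * fs / (2 * np.pi * f)
        m = np.arange(-(n // 2), n // 2) / scale
        wavelet = np.exp(1j * w * m) * np.exp(-(m ** 2) / 2) / np.sqrt(scale)
        # Magnitude of the cross-correlation with the wavelet.
        out[i] = np.abs(np.convolve(sig, np.conj(wavelet), mode="same"))
    return out  # shape: (len(freqs), n) -- a 2-D "image"
```

Rows corresponding to frequencies present in the signal carry large magnitudes, so the resulting image encodes how the spectral content of the biosignal evolves over time.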
[0074] Removal of other biosignals noise 807 may also occur as part of biosignals noise removal 805.
[0075] As shown in
[0076] To determine the correlation between the user input and initial user annotations 902, the algorithm may calculate a correlation coefficient between the clean user input 703 and the initial user annotations, as well as correlation coefficients between the initial user annotations and other user input entries that exist in the user database. Of the correlation coefficients between the initial user annotations and the other user input entries, an average correlation coefficient is calculated. A correlation criterion is then used to determine whether the correlation coefficient between the initial user annotations and the new, clean user input 703 is consistent with the average correlation coefficient between the initial user annotations and the existing user input entries.
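A minimal sketch of this correlation check is given below. The Pearson correlation coefficient is used here, and the tolerance on the difference between the new entry's correlation and the database average is an illustrative assumption, since the description does not fix a particular correlation criterion.

```python
import numpy as np

def passes_correlation_criterion(new_input, annotations, database_inputs,
                                 tolerance=0.2):
    """Accept the new entry when its correlation with the annotations is
    within `tolerance` of the average correlation of existing entries.

    new_input and each element of database_inputs are 1-D sequences of
    the same length as annotations; tolerance is an assumed threshold.
    """
    r_new = np.corrcoef(new_input, annotations)[0, 1]
    r_existing = [np.corrcoef(entry, annotations)[0, 1]
                  for entry in database_inputs]
    return abs(r_new - np.mean(r_existing)) <= tolerance
```

An entry whose correlation deviates strongly from the database average would fail the criterion and could then be routed to further review rather than being stored.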
[0077] As shown in
[0078] Determination of best AI model 603 is particularly useful for subject dependent emotion recognition systems, since different AI models may work better for different users. Therefore, the same emotion recognition system may be used by multiple different users even though different AI models are used for each user.
[0079] The “AI model” described herein may comprise the algorithm and credibility algorithm described herein. Either, or both, of the algorithm and credibility algorithm may be a convolutional neural network.
[0080] As shown in
d = √(α(V2 − V1)² + β(A2 − A1)²)
wherein d is the distance between the first and second emotions, V1 is the Valence value of the first emotion, V2 is the Valence value of the second emotion, A1 is the Arousal value of the first emotion, A2 is the Arousal value of the second emotion, α is a Valence constant, and β is an Arousal constant. The Valence and Arousal constants may be used to weigh either Valence or Arousal. For example, if Valence is determined to be twice as important for emotion recognition, the Valence constant may be set to twice the value of the Arousal constant.
[0081] The first and second emotions may be two emotions of the same user in the database. For example, a User A may provide two readings of a happiness emotion to the database, which may be the first and second emotions. The first and second emotions may alternatively be two emotions from different users in the database. For example, a User A may provide a reading of happiness and a User B may also provide a reading of happiness, which may be the first and second emotions. The first and second emotions may alternatively be an initial user annotation and a user emotion. For example, a User A may provide a reading of happiness (the first emotion), and may also provide an annotation of User A's Valence and Arousal values (the second emotion).
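The distance formula above translates directly into code; the function below is a straightforward implementation, with the default constants α = β = 1 as an illustrative assumption.

```python
import math

def va_distance(v1, a1, v2, a2, alpha=1.0, beta=1.0):
    """Weighted Euclidean distance between two emotions on the
    Valence-Arousal plane, per the formula above:

        d = sqrt(alpha * (V2 - V1)**2 + beta * (A2 - A1)**2)

    alpha weights the Valence axis and beta weights the Arousal axis.
    """
    return math.sqrt(alpha * (v2 - v1) ** 2 + beta * (a2 - a1) ** 2)
```

For example, if Valence is weighted twice as heavily as Arousal, alpha would be set to twice beta, and differences along the Valence axis would contribute correspondingly more to the distance.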
[0082] As shown in
[0083] As shown in
[0084] The number N may be any positive integer. The number N may be chosen to represent a number of consecutive non-credible user annotations that would determine a user to be incapable of providing accurate user annotations, therefore requiring human intervention 1305. Human intervention 1305 may be performed by medical experts in the art such as psychologists or psychiatrists.
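The consecutive-failure rule described in this paragraph may be sketched as a simple counter; the class name and interface are illustrative assumptions.

```python
class AnnotationMonitor:
    """Track consecutive non-credible annotations; after N consecutive
    failures the user is flagged for human intervention 1305.
    N may be any positive integer chosen by the system designer."""

    def __init__(self, n):
        self.n = n
        self.streak = 0

    def record(self, credible):
        """Record one annotation; return True if intervention is needed."""
        # A credible annotation resets the streak; otherwise it grows.
        self.streak = 0 if credible else self.streak + 1
        return self.streak >= self.n
```

Because a single credible annotation resets the counter, only an unbroken run of N non-credible annotations triggers the intervention.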
[0085] Also as shown in
[0086] As shown in
[0087] As shown in