Patent classifications
G10L25/63
Method and system providing service based on user voice
A method for providing a service based on a user's voice includes steps of extracting a voice of a first user, generating text information or voice waveform information based on the voice of the first user, analyzing a disposition of the first user based on the text information and the voice waveform information, and then selecting a second user corresponding to the disposition of the first user based on the analysis result, providing the first user with a conversation connection service with the second user and acquiring information on a change in an emotional state of the first user based on conversation information between the first user and the second user, and re-selecting the second user corresponding to the disposition of the first user based on the acquired information on the change in the emotional state of the first user.
Editing device and editing method
An editing device acquires a first image in which an occupant of a vehicle has been imaged in association with a time point in a time series and a second image in which scenery around the vehicle has been imaged in association with a time point in a time series, acquires first index information indicating feelings of the occupant when the first image has been captured on the basis of the first image, and extracts the first image and the second image from first images of the time series and second images of the time series on the basis of the first index information and the time point associated with the first image based on the first index information to generate a library including the extracted images.
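The extraction step described in this abstract can be sketched as a filter over time-aligned frames. This is only an illustration under stated assumptions: the tuple layout, the `threshold` value, and the function name `build_library` are hypothetical and do not come from the patent, which does not say how the index information is compared.

```python
def build_library(occupant_frames, scenery_frames, threshold=0.7):
    """Pair occupant and scenery images captured at the same time point.

    occupant_frames: list of (time_point, image_id, feeling_index) tuples.
    scenery_frames:  dict mapping time_point -> image_id.
    threshold:       hypothetical cut-off on the feeling index.
    """
    library = []
    for time_point, occupant_image, feeling_index in occupant_frames:
        # Keep a frame only when the occupant's index is high enough and a
        # scenery image exists for the same time point.
        if feeling_index >= threshold and time_point in scenery_frames:
            library.append((time_point, occupant_image, scenery_frames[time_point]))
    return library
```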
SYSTEMS AND METHODS FOR GENERATING TRAILERS FOR AUDIO CONTENT
An electronic device receives an audio file and divides the audio file into a plurality of segments. The electronic device, automatically, without user input, determines, for each segment, a descriptor from a plurality of descriptors and a value of the descriptor for the segment. The electronic device selects one or more segments of the plurality of segments, based on a comparison of the respective values of respective descriptors for respective segments and genre-specific criteria selected based on a genre of the audio file. The electronic device generates a trailer for the audio file using the selected one or more segments.
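One way to picture the selection step is a comparison of each segment's descriptor value against per-genre thresholds. The sketch below is a minimal illustration, not the claimed method: the descriptor labels, the `GENRE_CRITERIA` table, and the `max_segments` limit are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float      # seconds from the beginning of the audio file
    end: float
    descriptor: str   # hypothetical label, e.g. "speech" or "laughter"
    value: float      # descriptor strength, assumed to lie in [0, 1]


# Hypothetical genre-specific criteria: descriptor -> minimum value.
GENRE_CRITERIA = {
    "comedy": {"laughter": 0.6},
    "news": {"speech": 0.8},
}


def select_segments(segments, genre, max_segments=3):
    """Return the strongest matching segments, in playback order."""
    criteria = GENRE_CRITERIA.get(genre, {})
    matches = [
        s for s in segments
        if s.descriptor in criteria and s.value >= criteria[s.descriptor]
    ]
    # Keep the highest-valued segments, then restore chronological order.
    best = sorted(matches, key=lambda s: s.value, reverse=True)[:max_segments]
    return sorted(best, key=lambda s: s.start)
```

A trailer could then be assembled by concatenating the audio between each selected segment's `start` and `end`.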
SYSTEMS AND METHODS FOR INSERTING EMOTICONS WITHIN A MEDIA ASSET
Systems and methods are described herein for inserting emoticons within a media asset based on an audio portion of the media asset. Each audio portion of a media asset is associated with a respective part of speech, and an emotion corresponding to the audio portion of the media asset is determined. A corresponding emoticon is identified based on the determined emotion in the audio portion and is caused to be presented at the location within the media asset.
Automatic speech-based longitudinal emotion and mood recognition for mental health treatment
A method of predicting a mood state of a user may include recording an audio sample via a microphone of a mobile computing device of the user based on the occurrence of an event, extracting a set of acoustic features from the audio sample, generating one or more emotion values by analyzing the set of acoustic features using a trained machine learning model, and determining the mood state of the user, based on the one or more emotion values. In some embodiments, the audio sample may be ambient audio recorded periodically, and/or call data of the user recorded during clinical calls or personal calls.
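The last step, deriving a mood state from one or more emotion values, can be illustrated with a simple aggregation. This is a stand-in under stated assumptions: the abstract does not say how emotion values are combined, so the averaging rule, the `low` and `high` thresholds, and the three mood labels are hypothetical, and the emotion values are assumed to be valence scores in [-1, 1] already produced by a trained model.

```python
from statistics import mean


def mood_state(emotion_values, low=-0.2, high=0.2):
    """Map a series of per-sample emotion values to a coarse mood label."""
    if not emotion_values:
        raise ValueError("at least one emotion value is required")
    average = mean(emotion_values)
    # Thresholds are illustrative assumptions, not values from the patent.
    if average < low:
        return "low"
    if average > high:
        return "elevated"
    return "neutral"
```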
Electronic device and method of obtaining emotion information
Emotion information is obtained by an electronic device in order to improve communication between a person and the electronic device. Multimedia data is obtained regarding a person, predicted values for the person are obtained by applying the multimedia data to a plurality of neural network models, and emotion information of the person is obtained by applying the predicted values to a weight model. Then, feedback information is obtained from the person with respect to the emotion information of the person. Finally, the weight model is updated by using the feedback information. Subsequently, when multimedia data is again obtained regarding the person, new predicted values for the person are obtained by applying the later multimedia data to the plurality of neural network models, and emotion information of the person is again obtained, this time using the weight model updated with the feedback information.
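The feedback loop described here resembles a weighted ensemble whose weights are adjusted after each correction. The sketch below assumes each neural network model outputs a score per emotion and that the weight model is one scalar weight per model; the multiplicative update rule, the `lr` parameter, and the function names are assumptions made for illustration and are not taken from the patent.

```python
def combine(predictions, weights):
    """Return the emotion with the highest weighted score.

    predictions: list of dicts (emotion -> score), one dict per model.
    weights:     list of per-model weights, same length as predictions.
    """
    totals = {}
    for prediction, weight in zip(predictions, weights):
        for emotion, score in prediction.items():
            totals[emotion] = totals.get(emotion, 0.0) + weight * score
    return max(totals, key=totals.get)


def update_weights(predictions, weights, feedback, lr=1.0):
    """Boost models that scored the user-confirmed emotion highly."""
    boosted = [
        weight * (1.0 + lr * prediction.get(feedback, 0.0))
        for prediction, weight in zip(predictions, weights)
    ]
    total = sum(boosted)
    # Renormalize so the weights keep summing to 1.
    return [weight / total for weight in boosted]
```

With two models that disagree, a single round of feedback naming the second model's emotion shifts weight toward that model, so the same predicted values can yield a different combined result on the next pass.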