G06V40/176

Learning apparatus and method for creating emotion expression video and apparatus and method for emotion expression video creation

A learning apparatus for creating an emotion expression video according to a disclosed embodiment includes a first generative adversarial network (GAN) that receives text for creating an emotion expression video, extracts vector information by performing embedding on the input text, and creates an image based on the extracted vector information, and a second generative adversarial network that receives an emotion expression image and a frame of a comparison video and creates a frame of the emotion expression video from the emotion expression image and the comparison video frame.
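
A minimal sketch of the two-generator pipeline this abstract describes, assuming PyTorch; the discriminators, the embedding scheme, and all layer sizes are not specified in the abstract and are assumptions here.

```python
# Hypothetical sketch of the two-GAN pipeline: text -> emotion image,
# then emotion image + comparison frame -> emotion video frame.
# Only the generators are shown; discriminators are omitted.
import torch
import torch.nn as nn

class TextToImageGenerator(nn.Module):
    """First GAN's generator: embedded text vector -> emotion expression image."""
    def __init__(self, vocab_size=10000, embed_dim=128, img_pixels=64 * 64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)  # text -> vector information
        self.gen = nn.Sequential(
            nn.Linear(embed_dim, 512), nn.ReLU(),
            nn.Linear(512, img_pixels), nn.Tanh(),
        )

    def forward(self, token_ids):
        vec = self.embed(token_ids)              # extract vector information
        return self.gen(vec).view(-1, 1, 64, 64)

class ImageToFrameGenerator(nn.Module):
    """Second GAN's generator: emotion image + comparison frame -> video frame."""
    def __init__(self):
        super().__init__()
        self.gen = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Tanh(),
        )

    def forward(self, emotion_img, comparison_frame):
        x = torch.cat([emotion_img, comparison_frame], dim=1)  # channel-wise fusion
        return self.gen(x)

# Usage: token IDs standing in for input text, plus one comparison-video frame.
tokens = torch.randint(0, 10000, (1, 4))
emotion_img = TextToImageGenerator()(tokens)
frame = ImageToFrameGenerator()(emotion_img, torch.zeros(1, 1, 64, 64))
```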

ADVERTISEMENT MANAGEMENT APPARATUS FOR VEHICLE, ADVERTISEMENT MANAGEMENT METHOD FOR VEHICLE, AND STORAGE MEDIUM

An advertisement management apparatus for a vehicle includes a processor. The processor acquires image information on an image captured around a vehicle whose outer surface can display advertisement information, detects a predetermined motion of a person around the vehicle from the acquired image information, and estimates the person's reaction to the advertisement information.
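
A sketch of the motion-to-reaction mapping step; the motion labels, threshold, and reaction categories are illustrative assumptions, since the abstract only states that a predetermined motion is detected and a reaction estimated from it.

```python
# Hypothetical reaction estimation from a detected motion around the vehicle.
from dataclasses import dataclass

@dataclass
class DetectedMotion:
    person_id: int
    motion: str        # e.g. "head_turn_toward_ad", "stop_and_look", "walk_past"
    duration_s: float  # how long the motion was observed

POSITIVE_MOTIONS = {"head_turn_toward_ad", "stop_and_look", "point_at_ad"}

def estimate_reaction(m: DetectedMotion) -> str:
    """Map a predetermined detected motion to a coarse reaction to the ad."""
    if m.motion in POSITIVE_MOTIONS and m.duration_s >= 1.0:
        return "interested"
    if m.motion in POSITIVE_MOTIONS:
        return "glanced"
    return "no_reaction"

print(estimate_reaction(DetectedMotion(1, "stop_and_look", 2.5)))  # interested
```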

DETECTING EMOTIONAL STATE OF A USER BASED ON FACIAL APPEARANCE AND VISUAL PERCEPTION INFORMATION
20230237844 · 2023-07-27

A method for detecting an emotional state of a user includes obtaining a first data stream indicative of facial appearance and gaze direction of the user as the user is viewing a scene, determining, based on the first data stream, facial expression feature information indicative of emotional facial expression of the user, obtaining a second data stream indicative of visual content in a field of view of the user, determining, based on the second data stream, visual feature information indicative of visual content in the scene, determining emotional state information based on analyzing the facial expression feature information determined based on the first data stream and the visual feature information determined based on the second data stream, and performing an operation with respect to the emotional state information, wherein the emotional state information is indicative of the emotional state of the user.
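
To make the two-stream analysis concrete, here is a minimal late-fusion sketch in PyTorch; the fusion-by-concatenation design, feature dimensions, and emotion set are assumptions not stated in the abstract.

```python
# Hypothetical fusion of stream-1 facial expression features with
# stream-2 visual-content features to determine an emotional state.
import torch
import torch.nn as nn

class EmotionFusionNet(nn.Module):
    def __init__(self, face_dim=64, scene_dim=64, num_emotions=7):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(face_dim + scene_dim, 128), nn.ReLU(),
            nn.Linear(128, num_emotions),
        )

    def forward(self, face_feats, scene_feats):
        # Analyze both streams' feature information together (late fusion).
        fused = torch.cat([face_feats, scene_feats], dim=-1)
        return self.classifier(fused).softmax(dim=-1)

net = EmotionFusionNet()
probs = net(torch.randn(1, 64), torch.randn(1, 64))
emotional_state = probs.argmax(dim=-1)  # basis for the downstream operation
```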

COMPUTER PROGRAM, SERVER, TERMINAL, AND METHOD
20230023653 · 2023-01-26

A non-transitory computer readable medium storing computer executable instructions which, when executed by one or more processors, cause the one or more processors to perform a process including generating avatar information relating to an avatar expression or pose based on streamer data indicating a facial expression or pose of a streamer; acquiring gift information concerned with a gift of an object that is provided from a viewer to the streamer; determining whether the gift information satisfies a predetermined condition; in a case that the gift information is determined to satisfy the predetermined condition, rendering an avatar facial expression or pose using a predetermined facial expression or pose corresponding to the predetermined condition; and in a case that the gift information is determined to not satisfy the predetermined condition, rendering the avatar facial expression or pose using the avatar information generated based on the streamer data.
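
The conditional branch at the heart of this claim can be sketched in a few lines; the gift-value condition, the field names, and the predetermined expression/pose are hypothetical stand-ins.

```python
# Hypothetical gift-conditioned avatar rendering branch.
def render_avatar(streamer_data, gift_info, gift_threshold=100):
    # Avatar information generated from the streamer's tracked expression/pose.
    avatar_info = {"expression": streamer_data["expression"],
                   "pose": streamer_data["pose"]}

    # Predetermined condition: here, the gift's value meets a threshold.
    if gift_info is not None and gift_info["value"] >= gift_threshold:
        # Condition satisfied: use the predetermined expression/pose instead.
        return {"expression": "delighted", "pose": "bow"}
    # Condition not satisfied: render using the streamer-derived avatar info.
    return avatar_info

print(render_avatar({"expression": "neutral", "pose": "sit"},
                    {"value": 500}))  # {'expression': 'delighted', 'pose': 'bow'}
```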

METHOD AND DATA PROCESSING APPARATUS

A method of generating an emotion descriptor icon includes receiving input content comprising video information, and performing analysis on the input content to produce information representing the video information with respect to a plurality of characteristics. The method also includes determining, based on a comparison of the information representing the video information at a temporal position in the video information and a set of information items respectively representing an emotion state, a relative likelihood of association between the input content and at least some of a plurality of emotion states, selecting an emotion state based on the outcome of the determination, and outputting an emotion descriptor icon selected from an emotion descriptor icon set comprising a plurality of emotion descriptor icons. The outputted emotion descriptor icon is associated with the selected emotion state.
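
A toy sketch of the comparison-and-selection step: the video's characteristic vector at a temporal position is compared against per-emotion reference items, and the best match selects the icon. The cosine-similarity metric, the emotion set, and the icons are assumptions.

```python
# Hypothetical emotion-state selection by relative likelihood of association.
import math

EMOTION_PROTOTYPES = {            # information items representing emotion states
    "joy":     [0.9, 0.1, 0.2],
    "sadness": [0.1, 0.8, 0.1],
    "anger":   [0.2, 0.2, 0.9],
}
ICONS = {"joy": "😀", "sadness": "😢", "anger": "😠"}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def select_icon(video_features_at_t):
    # Relative likelihood of association with each emotion state.
    likelihoods = {e: cosine(video_features_at_t, p)
                   for e, p in EMOTION_PROTOTYPES.items()}
    selected = max(likelihoods, key=likelihoods.get)
    return ICONS[selected]  # icon associated with the selected emotion state

print(select_icon([0.85, 0.15, 0.25]))  # 😀
```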

LEARNING ASSISTANCE DEVICE AND LEARNING ASSISTANCE SYSTEM
20230230417 · 2023-07-20

A learning assistance device for a user performing a learning task includes: a first concentration level estimator that estimates a first concentration level of the user by analyzing information from an image capturing section that captures an image of the user; a second concentration level estimator that estimates a second concentration level of the user by analyzing information the user has actively input while performing the learning task; and a presentation switching section that switches the learning task content and the presentation scheme based on at least one of the first concentration level or the second concentration level.
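
A sketch of the switching logic; the thresholds, the averaging of the two estimates, and the stubbed estimators are assumptions, since the claim only requires switching based on at least one of the two levels.

```python
# Hypothetical presentation switching driven by two concentration estimates.
def estimate_first_level(camera_frame) -> float:
    """Passive estimate from the captured image (gaze, posture, ...)."""
    return 0.4  # placeholder for an image-analysis model

def estimate_second_level(input_events) -> float:
    """Active estimate from the user's input (typing rate, answer latency, ...)."""
    return 0.8  # placeholder for an input-analysis model

def choose_presentation(camera_frame, input_events):
    c1 = estimate_first_level(camera_frame)
    c2 = estimate_second_level(input_events)
    if min(c1, c2) < 0.3:
        return ("easier_content", "video")      # switch content and scheme
    if (c1 + c2) / 2 < 0.6:
        return ("same_content", "interactive")  # switch presentation scheme only
    return ("same_content", "text")

print(choose_presentation(None, None))  # ('same_content', 'text')
```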

ONLINE STREAMER AVATAR GENERATION METHOD AND APPARATUS
20230230305 · 2023-07-20

This application provides techniques for generating a virtual character for an online streamer. The techniques comprise: obtaining a human body image of a target online streamer captured by an image collection device, wherein the human body image comprises at least the face and upper body of the target online streamer; separately performing face recognition and upper-body limb recognition on the human body image to obtain face features and limb features; determining parameters of a virtual character corresponding to the target online streamer based on the face features and the limb features; and generating the virtual character based on the parameters, wherein the generated virtual character has a motion and an expression corresponding to those of the target online streamer.
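
A minimal sketch of the feature-to-parameter mapping; the recognizers are stubbed, and the parameter names (blendshapes, joint angles) are assumptions rather than terms from the application.

```python
# Hypothetical avatar parameter determination from face and limb features.
from dataclasses import dataclass

@dataclass
class AvatarParameters:
    blendshapes: dict   # drives the virtual character's expression
    joint_angles: dict  # drives the virtual character's motion

def recognize_face(image):   # stub for face recognition
    return {"smile": 0.7, "brow_raise": 0.2}

def recognize_limbs(image):  # stub for upper-body limb recognition
    return {"left_elbow": 45.0, "right_elbow": 30.0}

def build_avatar(human_body_image) -> AvatarParameters:
    face_features = recognize_face(human_body_image)
    limb_features = recognize_limbs(human_body_image)
    # Determine the virtual character's parameters from both feature sets.
    return AvatarParameters(blendshapes=face_features,
                            joint_angles=limb_features)

print(build_avatar(object()))
```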

SYSTEM AND METHOD FOR BODY LANGUAGE INTERPRETATION
20230230415 · 2023-07-20

A system and method for reading and interpreting a wide range of nonverbal communicative cues, including facial expression, pose, gesture, posture, and voice intonation. The system outputs a scale between zero and one indicating the interpretation of the nonverbal communication, together with accompanying text describing the interpretation. The system determines how a person intends to react and whether the person's pronouncements are true or false.
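
One way to produce the zero-to-one scale is a weighted fusion of per-cue scores; the cue weights and the text bands below are entirely assumed, as the abstract does not describe the scoring method.

```python
# Hypothetical fusion of nonverbal cue scores into one [0, 1] scale plus text.
CUE_WEIGHTS = {"facial_expression": 0.3, "pose": 0.15, "gesture": 0.2,
               "posture": 0.15, "voice_intonation": 0.2}

def interpret(cue_scores: dict) -> tuple[float, str]:
    """Fuse per-cue scores (each in [0, 1]) into a scale and a description."""
    scale = sum(CUE_WEIGHTS[c] * s for c, s in cue_scores.items())
    if scale >= 0.7:
        text = "cues consistent with a truthful, positive reaction"
    elif scale >= 0.4:
        text = "mixed cues; intent ambiguous"
    else:
        text = "cues suggest the pronouncement may be false"
    return scale, text

print(interpret({"facial_expression": 0.9, "pose": 0.8, "gesture": 0.7,
                 "posture": 0.6, "voice_intonation": 0.8}))  # (0.78, ...)
```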

AUGMENTING AUDIENCE MEMBER EMOTES IN LARGE-SCALE ELECTRONIC PRESENTATION
20230231730 · 2023-07-20

A presentation service generates an audience interface for an electronic presentation. The audience interface may simulate an in-person presentation, including features such as a central presenter and seat locations for audience members. The audience members may select emotes, which may be displayed in the audience interface. The emotes may indicate the audience members' opinion of the content being presented. The presentation service may enable chats between multiple audience members, grouping of audience members into private rooms, and other virtual simulations of functions corresponding to in-person presentations.

Visual dubbing using synthetic models
11562597 · 2023-01-24

A computer-implemented method of processing target footage of a target human face includes training an encoder-decoder network comprising an encoder network, a first decoder network, and a second decoder network. The training includes training a first path through the encoder-decoder network, including the encoder network and the first decoder network, to reconstruct the target footage of the target human face, and training a second path through the encoder-decoder network, including the encoder network and the second decoder network, to process renders of a synthetic face model exhibiting a range of poses and expressions and determine parameter values for the synthetic face model corresponding to those poses and expressions. The method includes processing, using a trained network path comprising, or trained using, the encoder network and comprising the first decoder network, source data representing the synthetic face model exhibiting a source sequence of expressions, to generate output video data.
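
A toy PyTorch sketch of the shared-encoder, two-decoder training described above; the architectures, losses, and data shapes are assumptions, and only the two training paths follow the claim.

```python
# Hypothetical shared encoder with two decoders: one reconstructs target
# footage, the other regresses synthetic face model parameters.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU())
dec_footage = nn.Linear(256, 64 * 64)  # path 1: reconstruct target footage
dec_params = nn.Linear(256, 50)        # path 2: regress pose/expression params

opt = torch.optim.Adam([*enc.parameters(), *dec_footage.parameters(),
                        *dec_params.parameters()], lr=1e-4)
mse = nn.MSELoss()

target_frame = torch.rand(8, 1, 64, 64)  # target footage frames
synth_render = torch.rand(8, 1, 64, 64)  # renders of the synthetic face model
synth_params = torch.rand(8, 50)         # their known pose/expression parameters

for _ in range(3):  # toy loop; real training would iterate over a dataset
    loss = (mse(dec_footage(enc(target_frame)), target_frame.flatten(1)) +
            mse(dec_params(enc(synth_render)), synth_params))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Inference: drive the target-face decoder with source expressions encoded
# from renders of the synthetic model to generate output video frames.
output_frame = dec_footage(enc(synth_render[:1]))
```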