G06V40/176

STORAGE MEDIUM, DETERMINATION DEVICE, AND DETERMINATION METHOD

A non-transitory computer-readable storage medium storing a determination program that causes at least one computer to execute a process, the process including: acquiring a group of captured images that includes images of a face to which markers are attached; selecting, from a plurality of patterns that indicate a transition of positions of the markers, a first pattern that corresponds to a time-series change in the positions of the markers across consecutive images in the group of captured images; and determining an occurrence intensity of an action based on a determination criterion for the action, the criterion being determined based on the first pattern, and on the positions of the markers in a captured image that follows the consecutive images in the group of captured images.
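The two stages of the abstract, pattern selection over consecutive frames and intensity scoring for a later frame, can be sketched as follows. This is a minimal illustration, not the patented method: the pattern names, the use of cosine similarity for pattern matching, and the `scale` parameter are all assumptions.

```python
import numpy as np

# Hypothetical marker-transition patterns: unit direction vectors describing
# how a tracked marker tends to move for a given facial action.
PATTERNS = {
    "brow_raise": np.array([0.0, 1.0]),   # marker moves upward
    "brow_lower": np.array([0.0, -1.0]),  # marker moves downward
}

def select_pattern(positions):
    """Pick the pattern best matching the net marker displacement
    over consecutive frames (cosine similarity against each pattern)."""
    disp = positions[-1] - positions[0]
    norm = np.linalg.norm(disp)
    if norm == 0:
        return None
    disp = disp / norm
    return max(PATTERNS, key=lambda k: float(PATTERNS[k] @ disp))

def occurrence_intensity(pattern, baseline, later_position, scale=10.0):
    """Score intensity as the marker's displacement along the selected
    pattern direction, clamped to [0, 1] relative to an assumed scale."""
    proj = float(PATTERNS[pattern] @ (later_position - baseline))
    return max(0.0, min(1.0, proj / scale))

frames = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]])  # marker rising
p = select_pattern(frames)                                # "brow_raise"
i = occurrence_intensity(p, frames[0], np.array([0.0, 5.0]))
```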

ASYMMETRIC FACIAL EXPRESSION RECOGNITION

The present disclosure describes techniques for facial expression recognition. A first loss function may be determined based on a first set of feature vectors associated with a first set of images depicting facial expressions and a first set of labels indicative of the facial expressions. A second loss function may be determined based on a second set of feature vectors associated with a second set of images depicting asymmetric facial expressions and a second set of labels indicative of the asymmetric facial expressions. The first loss function and the second loss function may be used to determine a maximum loss function. The maximum loss function may be applied during training of a model. The trained model may be configured to predict at least one asymmetric facial expression in a subsequently received image.
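One plausible reading of the "maximum loss function" is a worst-case combination: take whichever of the two set-wise losses is currently larger, so training always pushes on the worse-performing set. The sketch below illustrates that reading with plain NumPy cross-entropy; the exact formulation in the disclosure may differ.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-logp[np.arange(len(labels)), labels].mean())

def maximum_loss(logits_a, labels_a, logits_b, labels_b):
    """Return the larger of the two per-set losses (assumed reading of
    the abstract's 'maximum loss function')."""
    return max(cross_entropy(logits_a, labels_a),
               cross_entropy(logits_b, labels_b))

# Set A (symmetric expressions) is classified well, set B (asymmetric
# expressions) poorly: the maximum picks set B's loss.
logits_a = np.array([[5.0, 0.0], [0.0, 5.0]]); labels_a = np.array([0, 1])
logits_b = np.array([[0.0, 5.0], [5.0, 0.0]]); labels_b = np.array([0, 1])
loss = maximum_loss(logits_a, labels_a, logits_b, labels_b)
```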

ACTION SYNCHRONIZATION FOR TARGET OBJECT

A method for synchronizing an action of a target object with source audio is provided. Facial parameter conversion is performed on an audio parameter of the source audio at different time periods to obtain source parameter information of the source audio at the respective time periods. Parameter extraction is performed on a target video that includes the target object to obtain target parameter information of the target video. Image reconstruction is performed on the target object in the target video based on the source parameter information of the source audio and the target parameter information of the target video, to obtain a reconstructed image. Further, a synthetic video is generated based on the reconstructed image, the synthetic video including the target object, and the action of the target object being synchronized with the source audio.
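The per-time-period flow of the abstract, audio to facial parameters, parameter extraction from the target video, then reconstruction that merges the two, can be sketched with stand-in functions. Every stage here is a toy placeholder (in practice the conversion and extraction would be learned models), and all parameter names are illustrative.

```python
import numpy as np

def audio_to_facial_params(audio_chunk):
    """Hypothetical conversion: map one time period of audio to
    mouth-shape parameters (stand-in for a learned model)."""
    return {"mouth_open": float(np.abs(audio_chunk).mean())}

def extract_target_params(frame):
    """Stand-in parameter extraction: keep identity and pose from
    the target video."""
    return {"identity": frame["identity"], "pose": frame["pose"]}

def reconstruct(source_params, target_params):
    """Merge the driving expression with the target's identity/pose."""
    return {**target_params, **source_params}

audio = np.linspace(-1, 1, 100)
periods = np.array_split(audio, 4)                 # per-time-period chunks
video = [{"identity": "target", "pose": t} for t in range(4)]
synthetic = [reconstruct(audio_to_facial_params(a), extract_target_params(f))
             for a, f in zip(periods, video)]      # one frame per period
```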

Employment recruitment method based on face recognition and terminal device using same

An employment recruitment method based on face recognition includes acquiring a candidate's data from a third-party website, analyzing the data by a semantic analysis method to identify the candidate's human resources information, and analyzing messages and postings in that information to determine the candidate's personality. A terminal device acquires a second face image of the candidate by a second camera, analyzes the image by a computer vision algorithm to determine a micro-expression of the candidate, and provides the candidate's human resources information, personality, and micro-expression to a recruiter to evaluate the candidate. A terminal device applying the method is also disclosed.

Image synthesis for balanced datasets

A method may include obtaining a dataset including a target Action Unit (AU) combination and labeled images of the target AU combination with at least a first category of intensity for each AU of the target AU combination and a second category of intensity for each AU of the target AU combination. The method may also include determining that the first category of intensity for a first AU has a higher number of labeled images than the second category of intensity for the first AU, and based on the determination, identifying a number of new images to be synthesized in the second category of intensity for the first AU. The method may additionally include synthesizing the number of new images with the second category of intensity for the first AU, and adding the new images to the dataset.
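The counting step, finding which intensity category of an AU has fewer labeled images and how many new images to synthesize, reduces to simple arithmetic. A minimal sketch, assuming the target is to match the largest category (the abstract does not specify the target count):

```python
from collections import Counter

def images_to_synthesize(labels):
    """Given per-image intensity-category labels for one AU, return how
    many new images each under-represented category needs in order to
    match the best-populated category."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cat: target - n for cat, n in counts.items() if n < target}

# Illustrative intensity categories; category "B" is under-represented.
labels = ["A"] * 50 + ["B"] * 20
deficit = images_to_synthesize(labels)
dataset = list(labels)
for cat, n in deficit.items():
    dataset += [cat] * n          # stand-in for actual image synthesis
```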

AUDIO MATCHING METHOD AND RELATED DEVICE

Embodiments of the present application disclose an audio matching method and a related device. The audio matching method includes: obtaining audio data and video data; extracting to-be-recognized audio information from the audio data; extracting lip movement information of N users from the video data, where N is an integer greater than 1; inputting the to-be-recognized audio information and the lip movement information of the N users into a target feature matching model, to obtain a matching degree between each of the lip movement information of the N users and the to-be-recognized audio information; and determining a user corresponding to the lip movement information of the user with the highest matching degree as the target user to which the to-be-recognized audio information belongs.
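The final matching-and-argmax step can be sketched by substituting cosine similarity for the trained target feature matching model; embeddings and dimensions below are made up for illustration.

```python
import numpy as np

def match_speaker(audio_emb, lip_embs):
    """Return the index of the user whose lip-movement embedding best
    matches the audio embedding, plus all matching degrees (cosine
    similarity stands in for the target feature matching model)."""
    a = audio_emb / np.linalg.norm(audio_emb)
    L = lip_embs / np.linalg.norm(lip_embs, axis=1, keepdims=True)
    scores = L @ a
    return int(np.argmax(scores)), scores

audio_emb = np.array([1.0, 0.0, 0.0])
lip_embs = np.array([[0.0, 1.0, 0.0],    # user 0: lips do not track audio
                     [0.9, 0.1, 0.0],    # user 1: lips track the audio
                     [0.5, 0.5, 0.0]])   # user 2: partial match
speaker, scores = match_speaker(audio_emb, lip_embs)
```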

METHOD OF FACE EXPRESSION RECOGNITION

The present invention provides a method of facial expression recognition comprising three steps. Step 1: collecting facial expression data, which helps solve the problems of scarce, disparate, and biased data that cause overfitting when training a deep learning model. Step 2: designing a new deep learning network able to focus on salient regions of the face, extracting and learning the important features of facial expressions by integrating ensemble attention modules into a basic deep network architecture such as ResNet. Step 3: training the ensemble attention deep learning model of Step 2 on the dataset collected in Step 1, using a combination of two loss functions, ArcFace and Softmax, to reduce overfitting.
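The two-loss combination of Step 3 can be sketched in NumPy: ArcFace adds an angular margin to the target-class logit before softmax, and the result is blended with a plain softmax cross-entropy. The scale `s`, margin `m`, and blend weight `alpha` below are illustrative values, not the patent's settings.

```python
import numpy as np

def softmax_ce(logits, label):
    """Softmax cross-entropy for a single example."""
    z = logits - logits.max()
    return float(-(z[label] - np.log(np.exp(z).sum())))

def arcface_logits(feature, weights, label, s=8.0, m=0.5):
    """ArcFace: L2-normalize, add angular margin m to the target-class
    angle, then rescale the cosine logits by s."""
    f = feature / np.linalg.norm(feature)
    W = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = W @ f
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    logits = cos.copy()
    logits[label] = np.cos(theta[label] + m)   # margin on target class only
    return s * logits

def combined_loss(feature, weights, label, alpha=0.5):
    """Weighted sum of the ArcFace loss and a plain softmax loss
    (the blend weight alpha is an assumption)."""
    plain = softmax_ce(weights @ feature, label)
    arc = softmax_ce(arcface_logits(feature, weights, label), label)
    return alpha * arc + (1 - alpha) * plain

rng = np.random.default_rng(0)
W = rng.normal(size=(7, 16))     # 7 basic expression classes, 16-d features
x = rng.normal(size=16)
loss = combined_loss(x, W, label=3)
```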

Continuous video generation from voice data

One example method includes capturing audio data at a client engine while outputting an output video, the output video being based upon an original video stored at the client engine; delivering the captured audio data to a prediction engine once the audio has been captured for a pre-determined time; receiving from the prediction engine substitute frame data used by the client engine to stitch one or more frames into the original video; and, after stitching the one or more frames into the output video to generate an altered output video, outputting the captured audio data and the altered output video from the client engine.
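The client-side loop, buffer audio for a fixed period, send it to the prediction engine, and splice the returned substitute frames into the stored video, can be sketched with a mocked engine. Frames and audio samples are simplified to strings and ints; the chunk length and the (index, frame) response format are assumptions.

```python
CHUNK_SECONDS = 2  # pre-determined capture duration (assumed value)

def prediction_engine(audio_chunk):
    """Stand-in for the remote engine: returns (frame_index, new_frame)
    pairs telling the client which frames to replace."""
    return [(i, f"frame-for-{sample}") for i, sample in enumerate(audio_chunk)]

def stitch(original_video, substitute_frames):
    """Splice substitute frames into a copy of the stored original video,
    leaving the original untouched."""
    video = list(original_video)
    for i, frame in substitute_frames:
        video[i] = frame
    return video

original = ["orig-0", "orig-1", "orig-2", "orig-3"]
captured_audio = [10, 20]        # one CHUNK_SECONDS worth of samples
altered = stitch(original, prediction_engine(captured_audio))
```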

Omnichannel intelligent negotiation assistant

An omnichannel intelligent negotiation assistant for generating timely, contextual negotiation assistance to a negotiator. The invention includes a semantic term extractor for converting a contract document into a negotiable term sheet. An omnichannel listener captures all negotiation inputs associated with a negotiation event, sequences each negotiation input by time, and analyzes the sentiment of the negotiation inputs in the context of a term sheet. The resulting annotated negotiation input stream is processed by an intervention generator that includes models of the parties and the negotiation itself as well as a referent negotiation model. The intervention generator includes a game theoretic model that, in concert with a trade-off matrix, allows the intervention generator to produce timely contextual interventions to the negotiator that assist in achieving a superior resulting negotiated agreement.

AVATAR BASED IDEOGRAM GENERATION
20180005420 · 2018-01-04

Systems, devices, media, and methods are presented for generating ideograms from a set of images received in an image stream. The systems and methods detect at least a portion of a face within the image and identify a set of facial landmarks within the portion of the face. The systems and methods determine one or more characteristics representing the portion of the face, in response to detecting the portion of the face. Based on the one or more characteristics and the set of facial landmarks, the systems and methods generate a representation of a face. The systems and methods position one or more graphical elements proximate to the graphical model of the face and generate an ideogram from the graphical model and the one or more graphical elements.
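The pipeline stages named in the abstract, face detection, landmark identification, characteristic determination, face-model generation, and element placement, can be outlined end to end. Every stage below is a toy stand-in (real systems would use a face detector, a landmark model, and a renderer), and the attribute and element names are made up.

```python
def detect_face(image):
    """Stand-in detector: return the face portion of the image, if any."""
    return image.get("face")

def facial_landmarks(face):
    """Stand-in landmark model: e.g. eye and mouth points."""
    return face["landmarks"]

def characteristics(face):
    """Illustrative characteristic derived from the face portion."""
    return {"smiling": face["mouth_curve"] > 0}

def generate_ideogram(image, elements):
    """Detect the face, build a graphical model from its landmarks and
    characteristics, then position graphical elements next to it."""
    face = detect_face(image)
    if face is None:
        return None
    model = {"landmarks": facial_landmarks(face), **characteristics(face)}
    return {"face": model, "elements": list(elements)}

img = {"face": {"landmarks": [(30, 40), (50, 40)], "mouth_curve": 0.7}}
ideogram = generate_ideogram(img, ["hat", "sparkles"])
```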