Patent classifications
G06V40/176
ENHANCING VIDEO CHATTING
A method for a computing device to enhance video chatting includes receiving a live video stream, processing a frame in the live video stream in real-time, and transmitting the frame to another computing device. Processing the frame in real-time includes detecting a face, an upper torso, or a gesture in the frame and applying a visual effect to the frame. The method includes processing a next frame in the live video stream in real-time by repeating the detecting and the applying.
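As an illustration of the detect-then-apply loop described above, the following Python sketch detects a face in each captured frame and applies a simple background-blur effect before the frame would be transmitted. OpenCV and the particular effect are assumptions for illustration; the patent names neither:

    import cv2

    # Sketch only: per-frame face detection plus a visual effect.
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def process_frame(frame):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
            face = frame[y:y + h, x:x + w].copy()
            frame[:] = cv2.GaussianBlur(frame, (31, 31), 0)  # effect: blur all
            frame[y:y + h, x:x + w] = face                   # keep face sharp
        return frame

    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame = process_frame(frame)
        # At this point the frame would be encoded and sent to the other device.
        cv2.imshow("outgoing", frame)
        if cv2.waitKey(1) == 27:  # Esc quits
            break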
Systems, methods, devices and apparatuses for detecting facial expression
A system, method, and apparatus for detecting facial expressions based on electromyography (EMG) signals.
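The abstract gives no algorithmic detail, but a minimal sketch of EMG-based expression classification, assuming windowed multi-channel EMG recordings and a generic scikit-learn classifier (both assumptions, not the patented design), could look like:

    import numpy as np
    from sklearn.svm import SVC

    # Sketch only: classify expressions from per-channel RMS of EMG windows.
    def rms_features(window):              # window: (n_samples, n_channels)
        return np.sqrt(np.mean(window ** 2, axis=0))

    rng = np.random.default_rng(0)         # synthetic stand-in for real EMG
    windows = rng.normal(size=(100, 256, 4))   # 100 windows, 4 channels
    labels = rng.integers(0, 3, size=100)      # e.g. neutral / smile / frown
    X = np.array([rms_features(w) for w in windows])
    clf = SVC().fit(X, labels)
    print(clf.predict(X[:1]))              # predicted expression class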
Holodouble: systems and methods for low-bandwidth and high-quality remote visual communication
A system receives input from a user to initiate a process of generating a holodouble of the user. The system obtains image data of the user and deconstructs the image data to obtain a set of sparse data that identifies one or more attributes associated with the image data of the user. The system uses a holodouble training model to generate and train the holodouble of the user based on the set of sparse data and the obtained image data. The system renders a representation of the holodouble to the user concurrently while capturing new image data of the user, receives input from the user comprising approval of the holodouble, and completes training of the holodouble by saving the holodouble for subsequent use. The subsequent use includes one or more remote visual communication sessions.
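A rough, hypothetical sketch of the "deconstruct to sparse data" step, using OpenCV keypoints as a stand-in for the patent's unspecified attributes, shows why sparse data enables low-bandwidth communication:

    import numpy as np
    import cv2

    # Sketch only: a synthetic frame stands in for captured image data.
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 255, (480, 640), dtype=np.uint8)
    corners = cv2.goodFeaturesToTrack(frame, maxCorners=64,
                                      qualityLevel=0.01, minDistance=8)
    if corners is not None:
        sparse = corners.reshape(-1, 2).astype(np.float16)
        print(f"frame: {frame.nbytes} bytes, sparse data: {sparse.nbytes} bytes")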
Determining a mood for a group
A system and method for determining a mood for a crowd is disclosed. In example embodiments, a method includes identifying an event that includes two or more attendees, receiving at least one indicator representing emotions of attendees, determining a numerical value for each of the indicators, and aggregating the numerical values to determine an aggregate mood of the attendees of the event.
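The aggregation step is simple arithmetic; a minimal Python sketch, with a made-up indicator-to-value mapping rather than the patent's actual scoring, could be:

    # Sketch only: the indicator-to-value mapping below is invented.
    INDICATOR_VALUES = {"smile": 1.0, "laugh": 2.0, "neutral": 0.0, "frown": -1.0}

    def aggregate_mood(indicators):
        values = [INDICATOR_VALUES[i] for i in indicators]
        return sum(values) / len(values)   # mean as the aggregate mood

    print(aggregate_mood(["smile", "laugh", "neutral", "frown"]))  # 0.5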
MULTIPURPOSE CONTROLLERS AND METHODS
A method and apparatus are disclosed for a user to communicate with an electronic device. A processor receives user intention actions comprising facial expression (FE) information indicative of facial expressions and body information indicative of motion or position of one or more body parts of the user. When the FE or body information crosses a first level, the processor starts generating first signals based on the FE or body information to communicate with the electronic device. When the FE or body information crosses a second level, the processor can end generation of the first signals or modify the first signals. An image processing or eye gaze tracking system can provide some of the FE or body information. The signals can modify attributes of an object of interest. Use of thresholds that are independent of sensor position or orientation with respect to the user's body is also disclosed.
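The two-level scheme amounts to hysteresis: signal generation starts when one threshold is crossed and ends at another. A minimal sketch with invented threshold values:

    # Sketch only: start/end thresholds and the level stream are invented.
    START_LEVEL, END_LEVEL = 0.7, 0.3

    def controller(levels):
        active = False
        for level in levels:
            if not active and level >= START_LEVEL:
                active = True          # first level crossed: start signals
            elif active and level <= END_LEVEL:
                active = False         # second level crossed: end signals
            if active:
                yield "signal"         # e.g. modify an object of interest

    print(list(controller([0.1, 0.8, 0.9, 0.5, 0.2, 0.1])))  # 3 signals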
SYSTEMS AND METHODS FOR GENERATING EMOTIONALLY-ENHANCED TRANSCRIPTION AND DATA VISUALIZATION OF TEXT
Generating an emotionally enhanced transcription of non-textual data and an enriched visualization of the transcribed data by capturing non-textual data of a speaker using bio-feedback technology, transcribing it into a textual format, combining the transcribed textual data with the emotional state of the speaker to generate the emotionally enhanced transcribed textual data, and presenting the emotionally enhanced transcribed textual data through an enriched visualization, including color-coding the transcribed textual data to identify mistakes in the transcribed data.
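A toy sketch of the combination and color-coding steps, with hypothetical transcript confidences and emotion tags standing in for the bio-feedback data:

    # Sketch only: confidences and emotion tags below are invented.
    transcript = [("hello", 0.98), ("wrold", 0.41), ("everyone", 0.95)]
    emotions = ["calm", "calm", "excited"]

    for (word, confidence), emotion in zip(transcript, emotions):
        color = "red" if confidence < 0.5 else "black"  # flag likely mistakes
        print(f"<span style='color:{color}' data-emotion='{emotion}'>{word}</span>")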
MACHINE LEARNING PROGRAM, MACHINE LEARNING METHOD, AND ESTIMATION APPARATUS
A computer-readable recording medium stores a program that causes a computer to execute a process including: generating a trained model by performing machine learning of a 1st_model based on a 1st_output value obtained when a 1st_image is input to the 1st_model in response to input of training data containing a pair of the 1st_image and a 2nd_image together with a 1st_label indicating which of the 1st_image and the 2nd_image has captured greater movement of the muscles of facial expression of a photographic subject, a 2nd_output value obtained when the 2nd_image is input to a 2nd_model that has common parameters with the 1st_model, and the 1st_label; and generating a 3rd_model by performing machine learning based on a 3rd_output value obtained when a 3rd_image is input to the trained model and a 2nd_label indicating the movement of the muscles of facial expression of a photographic subject captured in the 3rd_image.
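The described setup is a Siamese ranking model: one scoring network with shared parameters evaluates both images, and a label says which image shows greater muscle movement. A minimal PyTorch sketch with an invented architecture (the abstract specifies none):

    import torch
    import torch.nn as nn

    # Sketch only: one scoring network shared between both images (the "1st"
    # and "2nd" models with common parameters), trained with a ranking loss.
    backbone = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128),
                             nn.ReLU(), nn.Linear(128, 1))
    loss_fn = nn.MarginRankingLoss(margin=0.5)
    opt = torch.optim.Adam(backbone.parameters(), lr=1e-3)

    img1, img2 = torch.rand(8, 1, 64, 64), torch.rand(8, 1, 64, 64)
    label = (torch.rand(8) > 0.5).float() * 2 - 1  # +1: img1 moves more, -1: img2

    out1 = backbone(img1).squeeze(1)   # 1st output value
    out2 = backbone(img2).squeeze(1)   # 2nd output value, same parameters
    loss = loss_fn(out1, out2, label)
    loss.backward()
    opt.step()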
INFORMATION PROCESSING DEVICE
An information processing device includes a camera interface and a processor, the camera interface acquiring a moving image from a first camera that is installed at a production site and that images a worker and the surroundings of the worker, and from a second camera that is installed at the production site and that images the face of the worker. The processor detects an operation section of work performed by the worker from a predetermined number of consecutive frames included in the moving image acquired from the first camera, using an inference model. The processor detects the emotion and the line-of-sight direction of the worker from each frame of the moving image acquired from the second camera. Further, the processor provides a detection result.
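A structural sketch of the two-camera pipeline, with hypothetical stand-ins (op_model, face_model) for the device's inference model and emotion/gaze detector:

    from collections import deque

    # Sketch only: op_model and face_model are hypothetical stand-ins.
    N_FRAMES = 16                      # "predetermined number" of frames
    buffer = deque(maxlen=N_FRAMES)

    def process(site_frame, face_frame, op_model, face_model):
        buffer.append(site_frame)
        result = {}
        if len(buffer) == N_FRAMES:    # windowed operation-section inference
            result["operation"] = op_model.infer(list(buffer))
        result["emotion"], result["gaze"] = face_model.analyze(face_frame)
        return result                  # the provided detection result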
METHOD AND SYSTEM FOR REPRESENTING AVATAR FOLLOWING MOTION OF USER IN VIRTUAL SPACE
A non-transitory computer-readable recording medium storing instructions that, when executed by a processor, cause the processor to set a communication session in which a plurality of users participate through a server, generate data for a virtual space, share motion data related to motions of the plurality of users through the communication session, generate a video in which avatars following the motions of the plurality of users are represented in the virtual space, based on the motion data, and share the generated video with the plurality of users through the communication session.
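A minimal in-process sketch of the motion-sharing step; the Session class, pose format, and rendering below are illustrative stand-ins, not the patented protocol:

    # Sketch only: the session, pose format, and rendering are illustrative.
    class Session:
        def __init__(self):
            self.poses = {}                     # user id -> latest motion data

        def share_motion(self, user_id, pose):
            self.poses[user_id] = pose          # relayed to all participants

        def render_frame(self):
            return {u: f"avatar at {p}" for u, p in self.poses.items()}

    session = Session()
    session.share_motion("alice", (0.0, 1.7, 0.0))
    session.share_motion("bob", (1.0, 1.6, -2.0))
    print(session.render_frame())               # video shared with all users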
Assistive control of network-connected devices
Devices, computer-readable media, and methods for changing the state of a network-connected device in response to at least one facial gesture of a user are disclosed. For example, a processing system including at least one processor captures images of a face of a user, detects at least one facial gesture of the user from the images, determines an intention to change a state of a network-connected device from the at least one facial gesture, generates a command for the network-connected device in accordance with the intention, and outputs the command to cause the state of the network-connected device to change.
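A small sketch of the gesture-to-command mapping, with a hypothetical gesture vocabulary and print standing in for the network transmission:

    # Sketch only: the gesture vocabulary and commands below are invented.
    GESTURE_TO_COMMAND = {"double_blink": {"power": "toggle"},
                          "raise_eyebrows": {"brightness": "+10"}}

    def handle_gesture(gesture, send):
        command = GESTURE_TO_COMMAND.get(gesture)   # intention -> command
        if command is not None:
            send(command)                           # change the device state

    handle_gesture("double_blink", print)           # stand-in for network I/O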