Patent classifications
H04N7/157
ELECTRONIC DEVICE WITH NON-PARTICIPANT IMAGE BLOCKING DURING VIDEO COMMUNICATION
An electronic device, computer program product, and method avoid presenting certain objects during a video communication session. During a video communication session with second electronic device(s), a controller of an electronic device identifies baseline image(s) from an image stream provided by an image capturing device of the electronic device. The baseline image, captured during an initial portion of the video communication session, includes a primary image portion of participant(s) and a scene of objects within the foreground or background of the participant(s). The controller monitors the image stream for a subsequent detection of the primary image portion and of non-participant(s) or object(s) as a secondary image portion that is not included within the baseline image(s). The controller responds to detecting the secondary image portion subsequently appearing within the image stream by communicating, to the one or more second electronic devices, a substitute image stream that does not present the secondary image portion.
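The baseline-and-substitute logic in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Frame` type, the label sets (standing in for the output of an assumed object detector), and the choice to reuse the last clean frame as the substitute are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame: an opaque payload plus labels from an assumed detector."""
    payload: str
    labels: frozenset


def build_baseline(initial_frames):
    """Collect every label seen during the initial portion of the session."""
    baseline = set()
    for frame in initial_frames:
        baseline |= frame.labels
    return frozenset(baseline)


def select_outgoing(frame, baseline, last_clean):
    """Return (frame_to_send, updated_last_clean).

    Any label absent from the baseline is treated as a secondary image
    portion. In that case the last clean frame is sent instead (an assumed
    substitution strategy; blurring or inpainting would be alternatives).
    """
    secondary = frame.labels - baseline
    if secondary:
        return last_clean, last_clean
    return frame, frame


first = Frame("f0", frozenset({"participant", "bookshelf"}))
baseline = build_baseline([first])
clean = Frame("f1", frozenset({"participant"}))
intruded = Frame("f2", frozenset({"participant", "child"}))

sent1, last = select_outgoing(clean, baseline, first)
sent2, last = select_outgoing(intruded, baseline, last)
```

Here `sent1` is the live frame `f1`, while `sent2` falls back to `f1` because `f2` contains a label that was not part of the baseline.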
METHOD, APPARATUS, ELECTRONIC DEVICE, COMPUTER-READABLE STORAGE MEDIUM, AND COMPUTER PROGRAM PRODUCT FOR VIDEO COMMUNICATION
A method for video communication includes: selecting a first virtual object on a first video communication interface and obtaining associated first virtual object information; displaying a first virtual reality video image and a second virtual reality video image on the first video communication interface, where the first virtual reality video image corresponds to the first virtual object information and a first user feature, and the second virtual reality video image corresponds to second virtual object information and a second user feature; and playing a target virtual audio, the target virtual audio including one or both of a first virtual audio or a second virtual audio, where the first virtual audio corresponds to first voice data and the first virtual object information, and the second virtual audio corresponds to second voice data and the second virtual object information.
AVATAR ANIMATION IN VIRTUAL CONFERENCING
According to a general aspect, a method can include receiving a photo of a virtual conference participant, and a depth map based on the photo, and generating a plurality of synthesized images based on the photo. The plurality of synthesized images can have respective simulated gaze directions of the virtual conference participant. The method can also include receiving, during a virtual conference, an indication of a current gaze direction of the virtual conference participant. The method can further include animating, in a display of the virtual conference, an avatar corresponding with the virtual conference participant. The avatar can be based on the photo. Animating the avatar can be based on the photo, the depth map and at least one synthesized image of the plurality of synthesized images, the at least one synthesized image corresponding with the current gaze direction.
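One way to picture the step of picking a synthesized image for the current gaze direction is a nearest-neighbor lookup. The sketch below assumes gaze directions are stored as (yaw, pitch) pairs in degrees and that images are referred to by string identifiers; neither detail comes from the abstract, and the function name is hypothetical.

```python
import math


def nearest_synthesized(current_gaze, synthesized):
    """Return the synthesized image whose simulated gaze is closest.

    current_gaze: (yaw, pitch) in degrees (assumed representation).
    synthesized: dict mapping (yaw, pitch) -> image identifier.
    """
    best = min(
        synthesized,
        key=lambda g: math.hypot(g[0] - current_gaze[0], g[1] - current_gaze[1]),
    )
    return synthesized[best]


images = {(-20, 0): "left", (0, 0): "center", (20, 0): "right"}
choice = nearest_synthesized((15, 2), images)
```

A real animation pipeline would presumably blend between neighboring images and use the depth map as well; this only shows the selection step.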
SYSTEMS AND METHODS FOR MULTI-AGENT CONVERSATIONS
A first input is received from a user input device. Based on the first input, a list of candidate intents is generated, and a plurality of agents is initialized. Each agent of the plurality of agents corresponds to a respective candidate intent. Each agent then provides a different response to the first input in accordance with its respective corresponding intent. A second input is then received that responds to one or more of the agents. Based on the agents to which the second input is responsive, the list of candidate intents is refined and, based on the refined list, one or more agents are deactivated.
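The initialize, respond, refine, and deactivate cycle described above can be sketched in a few lines. The `Agent` class, the intent names, and the idea that the second input is already resolved to a set of addressed intents are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent bound to one candidate intent."""
    intent: str
    active: bool = True

    def respond(self, user_input):
        # Each agent answers according to its own intent.
        return f"({self.intent}) reply to: {user_input}"


def initialize_agents(candidate_intents):
    """Create one agent per candidate intent."""
    return [Agent(intent) for intent in candidate_intents]


def refine(agents, addressed_intents):
    """Deactivate agents the second input did not respond to.

    Returns the refined list of candidate intents.
    """
    for agent in agents:
        if agent.intent not in addressed_intents:
            agent.active = False
    return [agent.intent for agent in agents if agent.active]


agents = initialize_agents(["book_flight", "check_weather", "play_music"])
responses = [a.respond("I want to go to Paris") for a in agents]
refined = refine(agents, {"book_flight"})
```

After the second input addresses only the `book_flight` agent, the other two agents are marked inactive and the candidate list shrinks to a single intent.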
System and method for an interactive digitally rendered avatar of a subject person
A system and method for an interactive digitally rendered avatar of a subject person to participate in a web meeting is described. In one embodiment, the method includes receiving an invite to a web meeting on a video conferencing platform, wherein the invite identifies a subject person and the video conferencing platform. The method also includes generating an interactive avatar of the subject person based on a data collection associated with the subject person stored in a database. The method further includes instantiating a platform integrator associated with the video conferencing platform identified in the invite and joining, by the interactive avatar of the subject person, the web meeting on the video conferencing platform. The platform integrator transforms outputs and inputs between the video conferencing platform and an interactive digitally rendered avatar system so that the interactive avatar of the subject person participates in the web meeting.
Systems and methods for reconstruction and rendering of viewpoint-adaptive three-dimensional (3D) personas
An exemplary method includes maintaining a receiver-side mesh-vertices list, receiving duplicative-vertex information from a sender, and responsively reducing the receiver-side mesh-vertices list in accordance with the received duplicative-vertex information, and rendering, using the reduced receiver-side mesh-vertices list, viewpoint-adaptive three-dimensional (3D) personas of a subject at least in part by weighting video pixel colors from different video-camera vantage points of video cameras that capture video streams of the subject, the weighting being performed according to a respective geometric relationship of each video-camera vantage point to a user-selected viewpoint.
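The weighting of pixel colors by camera geometry can be illustrated with a simple cosine-based blend. The abstract does not specify the weighting function, so the clamped cosine similarity used here is an assumption, as are the function names and the representation of vantage points as direction vectors.

```python
import math


def _unit(v):
    """Normalize a 3D vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def blend_color(colors, camera_dirs, view_dir):
    """Blend per-camera RGB colors for one surface point.

    Each camera's weight is the clamped cosine between its direction and
    the user-selected viewpoint direction (an assumed weighting scheme).
    """
    view = _unit(view_dir)
    weights = [
        max(0.0, sum(a * b for a, b in zip(_unit(d), view)))
        for d in camera_dirs
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("no camera faces the selected viewpoint")
    return tuple(
        sum(w * c[i] for w, c in zip(weights, colors)) / total
        for i in range(3)
    )


colors = [(255, 0, 0), (0, 0, 255)]
cameras = [(1, 0, 0), (0, 1, 0)]
aligned = blend_color(colors, cameras, (1, 0, 0))
halfway = blend_color(colors, cameras, (1, 1, 0))
```

When the viewpoint coincides with the first camera, only that camera's color contributes; a viewpoint midway between the two cameras yields an even mix.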
Systems and methods of handling speech audio stream interruptions
A device for communication includes one or more processors configured to receive, during an online meeting, a speech audio stream representing speech of a first user. The one or more processors are also configured to receive a text stream representing the speech of the first user. The one or more processors are further configured to selectively generate an output based on the text stream in response to an interruption in the speech audio stream.
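The selective fallback from audio to text can be sketched as below. The packet layout (an audio chunk that is `None` during an interruption, paired with a text chunk) and the output tags are assumptions; an actual device might synthesize speech from the text rather than pass it through.

```python
def render_stream(packets):
    """Choose an output for each (audio, text) packet.

    audio is None when the speech audio stream is interrupted (assumed
    convention). Text is used only in that case.
    """
    output = []
    for audio, text in packets:
        if audio is not None:
            output.append(("audio", audio))
        elif text:
            output.append(("from_text", text))
    return output


packets = [
    (b"\x01\x02", "hello"),
    (None, "can you hear me"),
    (b"\x03", "yes"),
]
rendered = render_stream(packets)
```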
Generative adversarial neural network assisted video reconstruction
A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth than transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
Virtual 3D communications with actual to virtual cameras optical axes compensation
A method for conducting a three-dimensional (3D) video conference between multiple participants may include: determining, for each participant, updated 3D participant representation information, within the virtual 3D video conference environment, that represents the participant, wherein the determining comprises compensating for a difference between an actual optical axis of a camera that acquires images of the participant and a desired optical axis of a virtual camera; and generating, for at least one participant, an updated representation of the virtual 3D video conference environment that represents the updated 3D participant representation information for at least some of the multiple participants.
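As a rough geometric illustration of compensating for the difference between two optical axes, the sketch below computes the signed angle between them in a horizontal plane and applies it as a rotation. Restricting the problem to two dimensions, and the function names, are simplifying assumptions; the abstract does not describe how the compensation is computed.

```python
import math


def axis_correction_deg(actual_axis, desired_axis):
    """Signed angle (degrees) that rotates actual_axis onto desired_axis.

    Both axes are 2D vectors in an assumed horizontal plane.
    """
    ax, ay = actual_axis
    dx, dy = desired_axis
    cross = ax * dy - ay * dx
    dot = ax * dx + ay * dy
    return math.degrees(math.atan2(cross, dot))


def rotate(v, degrees):
    """Rotate a 2D vector counterclockwise by the given angle."""
    r = math.radians(degrees)
    c, s = math.cos(r), math.sin(r)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


actual = (0.0, 1.0)    # physical camera axis (example values)
desired = (1.0, 0.0)   # virtual camera axis (example values)
angle = axis_correction_deg(actual, desired)
compensated = rotate(actual, angle)
```

Applying the computed angle to the actual axis reproduces the desired axis, which is the sense in which the difference has been compensated.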
ACTION SYNCHRONIZATION FOR TARGET OBJECT
A method for synchronizing an action of a target object with source audio is provided. Facial parameter conversion is performed on an audio parameter of the source audio at different time periods to obtain source parameter information of the source audio at the respective time periods. Parameter extraction is performed on a target video that includes the target object to obtain target parameter information of the target video. Image reconstruction is performed on the target object in the target video based on the source parameter information of the source audio and the target parameter information of the target video, to obtain a reconstructed image. Further, a synthetic video is generated based on the reconstructed image, the synthetic video including the target object, and the action of the target object being synchronized with the source audio.