H04N21/4728

Automated video cropping

The disclosed computer-implemented method may include receiving, as an input, segmented video scenes, where each video scene includes a specified length of video content. The method may further include scanning the video scenes to identify objects within each video scene and also determining a relative importance value for the identified objects. The relative importance value may include an indication of which objects are to be included in a cropped version of the video scene. The method may also include generating a video crop that is to be applied to the video scene such that the resulting cropped version of the video scene includes those identified objects that are to be included based on the relative importance value. The method may also include applying the generated video crop to the video scene to produce the cropped version of the video scene. Various other methods, systems, and computer-readable media are also disclosed.
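The abstract describes selecting objects by a relative importance value and generating a crop that covers the selected objects. A minimal Python sketch of that selection-and-crop step follows; the data layout, field names, and the importance threshold are all assumptions for illustration, not part of the patent text.

```python
def generate_crop(objects, threshold=0.5):
    """Return an (x, y, w, h) crop covering every identified object
    whose relative importance value meets the threshold."""
    keep = [o for o in objects if o["importance"] >= threshold]
    if not keep:
        return None
    # Union of the bounding boxes of the objects to be included.
    x0 = min(o["box"][0] for o in keep)
    y0 = min(o["box"][1] for o in keep)
    x1 = max(o["box"][0] + o["box"][2] for o in keep)
    y1 = max(o["box"][1] + o["box"][3] for o in keep)
    return (x0, y0, x1 - x0, y1 - y0)

# Hypothetical detections for one scene: (x, y, w, h) boxes plus importance.
objects = [
    {"box": (100, 50, 80, 120), "importance": 0.9},   # main subject
    {"box": (300, 60, 40, 40),  "importance": 0.2},   # background object
    {"box": (140, 200, 60, 60), "importance": 0.7},
]
print(generate_crop(objects))  # → (100, 50, 100, 210)
```

The low-importance background object falls outside the generated crop, while both high-importance objects are retained.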

Multimodal inputs for computer-generated reality
11698674 · 2023-07-11

Implementations of the subject technology provide determining an operating mode of an electronic device based at least in part on whether the electronic device is communicatively coupled to an associated base device. Based on the determined operating mode, the subject technology identifies a set of input modalities for initiating a recording of content within a field of view of the electronic device. The subject technology monitors sensor information generated by at least one sensor included in, or communicatively coupled to, the electronic device. Further, the subject technology initiates the recording of content within the field of view of the electronic device when the monitored sensor information indicates that at least one of the identified set of input modalities has been triggered.
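The described flow (determine operating mode from base-device coupling, identify a modality set for that mode, then start recording when a modality triggers) can be sketched as below; the specific modality names and the two-mode split are assumptions for illustration only.

```python
def input_modalities(coupled_to_base):
    """Identify the input modalities that may initiate a recording,
    based on the device's operating mode."""
    if coupled_to_base:
        # Tethered mode: richer sensing is assumed available.
        return {"voice", "gaze", "hand_gesture", "button"}
    # Standalone mode: a smaller, conservative set.
    return {"voice", "button"}

def maybe_start_recording(coupled_to_base, triggered):
    """Initiate recording when monitored sensor information indicates
    that at least one identified modality has been triggered."""
    return bool(input_modalities(coupled_to_base) & triggered)
```

For example, a gaze trigger would start recording only in the tethered mode, because gaze is not in the standalone modality set.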

OBJECT OR REGION OF INTEREST VIDEO PROCESSING SYSTEM AND METHOD

Systems, methods and apparatus for processing video can include a processor. The processor can be configured to perform object detection to detect visual indications of potential objects of interest in a video scene, to receive a selection of an object of interest from the potential objects of interest, and to provide enhanced video content within the video scene for the object of interest indicated by the selection.
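The abstract names three stages: detect potential objects of interest, receive a selection, and enhance the selected object's video content. A toy Python sketch of that pipeline follows; the detector is a stand-in (a real system would use a trained model), and the "enhancement" returned here is a hypothetical highlight/zoom instruction.

```python
def detect_objects(frame):
    """Stand-in detector: returns visual indications (ids and boxes)
    of potential objects of interest in the scene."""
    return [{"id": 1, "box": (10, 10, 50, 50)},
            {"id": 2, "box": (200, 80, 40, 40)}]

def enhance(frame, selection):
    """Provide enhanced video content for the object of interest
    indicated by the selection."""
    for obj in detect_objects(frame):
        if obj["id"] == selection:
            return {"highlight": obj["box"], "zoom": 2.0}
    return None  # selection did not match any detected object
```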

REPRESENTING VOLUMETRIC VIDEO IN SALIENCY VIDEO STREAMS

Saliency regions are identified in a global scene depicted by volumetric video. Saliency video streams that track the saliency regions are generated. Each saliency video stream tracks a respective saliency region. A saliency stream based representation of the volumetric video is generated to include the saliency video streams. The saliency stream based representation of the volumetric video is transmitted to a video streaming client.
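One way to picture "one saliency video stream per saliency region" is cropping each volumetric frame against each region's bounds. The sketch below illustrates that idea with points as coordinate tuples and regions as axis-aligned x/y bounds; both representations are assumptions, since the abstract does not specify a data format.

```python
def build_saliency_streams(frames, regions):
    """Produce one stream per saliency region: each stream keeps only
    the points of each frame that fall inside its region's bounds.
    The dict of streams is the saliency stream based representation."""
    streams = {name: [] for name in regions}
    for frame in frames:
        for name, (x0, y0, x1, y1) in regions.items():
            streams[name].append(
                [p for p in frame if x0 <= p[0] < x1 and y0 <= p[1] < y1])
    return streams

# One frame of two points; one saliency region covering the origin corner.
streams = build_saliency_streams([[(1, 1, 0), (5, 5, 0)]],
                                 {"a": (0, 0, 2, 2)})
```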

Efficient Delivery of Multi-Camera Interactive Content
20230216908 · 2023-07-06

Techniques are disclosed relating to encoding recorded content for distribution to other computing devices. In various embodiments, a first computing device records content of a physical environment in which the first computing device is located, the content being deliverable to a second computing device configured to present a corresponding environment based on the recorded content and content recorded by one or more additional computing devices. The first computing device determines a pose of the first computing device within the physical environment and encodes the pose in a manifest usable to stream the content recorded by the first computing device to the second computing device. The encoded pose is usable by the second computing device to determine whether to stream the content recorded by the first computing device.
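The core mechanism is encoding the recorder's pose into the streaming manifest so the receiving device can decide whether to stream at all. A minimal sketch, assuming a JSON manifest and a simple distance test as the client's decision rule (the abstract leaves the rule unspecified):

```python
import json
import math

def encode_manifest(stream_url, position, orientation):
    """Sender side: encode the recording device's pose into the
    manifest entry for its stream."""
    return json.dumps({"url": stream_url,
                       "pose": {"position": position,
                                "orientation": orientation}})

def should_stream(manifest_json, viewer_position, max_distance=5.0):
    """Receiver side: use the encoded pose to determine whether the
    recorded content is worth streaming (here: a proximity check)."""
    pose = json.loads(manifest_json)["pose"]
    return math.dist(pose["position"], viewer_position) <= max_distance
```

A client far from the recorder's pose can skip that stream entirely, which is the bandwidth saving the title's "efficient delivery" points at.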

Rate control for fixed rate foveated display compression

Provided is a method of coding blocks of video data representing an image using an encoder. The method includes identifying, by the encoder, a first region of the image and a second region of the image, where the sum of a first number of pixels in the first region and a second number of pixels in the second region equals the total number of pixels of the image. The method further includes allocating, by the encoder, a first number of bits including base bits for encoding the first region, and a second number of bits including base bits and enhancement bits for encoding the second region, where the sum of the first number of bits and the second number of bits equals the total number of bits for encoding all of the pixels. The second region is encoded with a greater number of bits per pixel than the first region.
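The allocation constraints above (both regions get base bits per pixel, the foveal region additionally gets all enhancement bits, and the two allocations sum to the fixed budget) can be sketched directly; the uniform base bits-per-pixel parameter is an assumption for illustration.

```python
def allocate_bits(n1, n2, total_bits, base_bpp):
    """Split a fixed bit budget between a peripheral region (n1 pixels)
    and a foveal region (n2 pixels). Both regions receive base bits;
    the foveal region also receives all enhancement bits."""
    base1 = n1 * base_bpp
    base2 = n2 * base_bpp
    enhancement = total_bits - base1 - base2
    assert enhancement >= 0, "budget must cover base bits for all pixels"
    bits1 = base1                 # first region: base bits only
    bits2 = base2 + enhancement   # second region: base + enhancement bits
    assert bits1 + bits2 == total_bits  # fixed-rate constraint
    return bits1, bits2

# 900 peripheral pixels, 100 foveal pixels, 4000-bit budget, 2 base bpp:
b1, b2 = allocate_bits(900, 100, 4000, 2)
print(b1, b2)  # → 1800 2200  (2 bpp peripheral vs 22 bpp foveal)
```

Because every leftover bit goes to the second region, its bits per pixel always exceed the first region's, matching the final clause of the claim.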

Automatic annotation for vehicle damage

Aspects described herein may allow an automated generation of an interactive multimedia content with annotations showing vehicle damage. In one method, a server may receive vehicle-specific identifying information of a vehicle. Image sensors may capture multimedia content showing aspects associated with exterior regions of the vehicle, and may send the multimedia content to the server. For each of the exterior regions of the vehicle, the server may determine, using a trained classification model, instances of damage. Furthermore, the server may generate an interactive multimedia content that shows images with annotations indicating instances of damage. The interactive multimedia content may be displayed via a user interface.
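The per-region loop the abstract describes (run the trained classifier on each exterior region's imagery, collect damage instances as annotations) can be sketched as below. The classifier here is a hypothetical stand-in; a real system would plug in the trained classification model.

```python
def annotate_damage(regions, classifier):
    """For each exterior region of the vehicle, run the classifier and
    collect annotations for every detected instance of damage."""
    annotations = []
    for region_name, image in regions.items():
        for instance in classifier(image):
            annotations.append({"region": region_name,
                                "type": instance["type"],
                                "box": instance["box"]})
    return annotations

def fake_classifier(image):
    """Stand-in for the trained classification model."""
    if image == "scratched_panel":
        return [{"type": "scratch", "box": (5, 5, 20, 20)}]
    return []

regions = {"front_bumper": "scratched_panel", "hood": "clean_panel"}
annotations = annotate_damage(regions, fake_classifier)
```

The resulting annotation list is what the interactive multimedia content would overlay on the region images in the user interface.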

Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
11544877 · 2023-01-03

Disclosed herein is a point cloud data transmission method. The transmission method may include encoding the point cloud data, encapsulating the point cloud data, and transmitting point cloud data. Disclosed herein is a point cloud data reception device. The reception device may include a receiver configured to receive the point cloud data, a decapsulator configured to decapsulate the point cloud data, and a decoder configured to decode the point cloud data.
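The sender's encode → encapsulate → transmit chain and the receiver's mirrored receive → decapsulate → decode chain can be sketched with stand-ins: JSON serialization for the codec and zlib framing for encapsulation, neither of which is the actual point cloud codec the patent concerns.

```python
import json
import zlib

def transmit(points):
    """Sender: encode the point cloud data, encapsulate it, and return
    the payload that would be transmitted."""
    encoded = json.dumps(points).encode()  # stand-in encoder
    return zlib.compress(encoded)          # stand-in encapsulation

def receive(payload):
    """Receiver: mirror the sender — decapsulate, then decode."""
    encoded = zlib.decompress(payload)               # decapsulator
    return [tuple(p) for p in json.loads(encoded)]   # decoder

points = [(0, 0, 0), (1, 2, 3)]
assert receive(transmit(points)) == points  # round trip is lossless here
```

The symmetry is the point: each stage on the reception device undoes the corresponding stage on the transmission device.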