Patent classifications
G06T3/0087
METHODS AND APPARATUS FOR MAXIMIZING CODEC BANDWIDTH IN VIDEO APPLICATIONS
Methods and apparatus for processing of video content to optimize codec bandwidth. In one embodiment, the method includes capturing panoramic imaging content (e.g., a 360° panorama), mapping the panoramic imaging content into an equi-angular cubemap (EAC) format, and splitting the EAC format into segments for transmission to maximize codec bandwidth. In one exemplary embodiment, the EAC segments are transmitted at a different frame rate than the subsequent display rate of the panoramic imaging content. For example, the mapping and frame rate may be chosen to enable the rendering of 8K, 360° content at 24 fps, using commodity encoder hardware and software that nominally supports 4K content at 60 fps.
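The frame-rate trade mentioned in the abstract is easy to sanity-check as pixel-rate arithmetic. The sketch below is illustrative only, not the patented method: it assumes a codec budget of 3840×2160 at 60 fps per stream and an 8K-class frame laid out as 7680×3840 (neither figure comes from the abstract), and ignores real codec limits on maximum frame dimensions and frame rate.

```python
# Hedged sketch: pixel-rate budgeting for segmented transmission.
# Assumed figure (not from the abstract): a commodity codec that
# sustains 4K (3840x2160) at 60 fps, i.e. ~498 Mpx/s per stream.
CODEC_PIXEL_RATE = 3840 * 2160 * 60

def fits_codec(frame_w, frame_h, fps, n_segments):
    """True if splitting the frame into n equal segments keeps each
    segment's pixel throughput within the codec's per-stream budget."""
    segment_pixels = (frame_w * frame_h) / n_segments
    return segment_pixels * fps <= CODEC_PIXEL_RATE

# An 8K-class frame (assumed 7680x3840 layout) at 24 fps:
whole_frame_ok = fits_codec(7680, 3840, 24, n_segments=1)   # exceeds one stream
two_segments_ok = fits_codec(7680, 3840, 24, n_segments=2)  # fits once split
```

With these assumed numbers, one stream would need ~708 Mpx/s, above the ~498 Mpx/s budget, while two segment streams each stay within it.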
SYSTEMS AND METHODS FOR COMBINED PIPELINE PROCESSING OF PANORAMIC IMAGES
Systems and methods for capturing and/or processing panoramic imaging content using spatial redundancy-based mapping. Panoramic imaging content may be processed using a processing pipeline that may operate on a portion of the image. Images may be transformed prior to processing. Image transformation may introduce distortion and/or data redundancy. Image partitioning for the pipeline processing may be configured based on the spatial redundancy associated with the transformation. The windowing operation may include partitioning an image using non-rectangular and/or non-equal windows.
SYSTEM FOR PROVIDING PROBE TRACE FIDUCIAL-FREE TRACKING
A method for referencing a tracking system's coordinate frame to a rigid body's coordinate frame is disclosed. The method involves obtaining a 3D model depicting some of the surfaces of the rigid body. A probe is provided with an affixed tracking reference component. A second tracking reference component is attached to the rigid body. The method involves tracking locations of the probe as it moves along surfaces of the rigid body and then determining a transform that relates the probe locations to the 3D model of the rigid body. In one embodiment, the rigid body is the mandible or maxilla of a dental patient and the 3D model is a surface extracted from a computed tomography image of the patient's jaw and teeth.
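The core computation here is estimating a transform relating one set of 3D points to another. A standard building block for that step is a least-squares rigid fit (the Kabsch algorithm), sketched below under the simplifying assumption that point correspondences are already known, which the patent's surface-trace approach does not require.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (rotation R, translation t) mapping
    src points onto dst, via the Kabsch algorithm. Assumes known
    one-to-one correspondences (a simplification for illustration)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Synthetic check: rotate 90 degrees about z, translate, then recover.
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([1.0, 2.0, 3.0])
src = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
dst = src @ R_true.T + t_true
R, t = rigid_transform(src, dst)
```

Registering a free-form probe trace against a CT-derived surface, as in the patent, would instead use a correspondence-free method (e.g. iterative closest point), which typically calls a fit like this in its inner loop.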
DEEP DATA ASSOCIATION FOR ONLINE MULTI-CLASS MULTI-OBJECT TRACKING
A system for applying video data to a neural network (NN) for online multi-class multi-object tracking includes a computer programmed to perform an image classification method including the operations of receiving a video sequence; detecting candidate objects in each of a previous and a current video frame; transforming the previous and current video frames into a temporal difference input image; applying the temporal difference input image to a pre-trained neural network (NN) (or deep convolutional network) comprising an ordered sequence of layers; and based on a classification value received from the neural network, associating a pair of detected candidate objects in the previous and current frames as belonging to one of matching objects and different objects.
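A temporal difference input image can be formed by subtracting consecutive frames and remapping the signed result into an ordinary image range. The encoding below is one illustrative choice; the abstract does not pin down the exact transform.

```python
import numpy as np

def temporal_difference_image(prev_frame, curr_frame):
    """Signed difference of consecutive grayscale frames, remapped to
    [0, 255] so it can be fed to a network like an ordinary image:
    mid-gray (127) means no change, bright means brightening, dark
    means darkening. Illustrative encoding, not from the patent."""
    diff = curr_frame.astype(np.int16) - prev_frame.astype(np.int16)
    return ((diff + 255) // 2).astype(np.uint8)

prev = np.zeros((4, 4), dtype=np.uint8)
curr = np.full((4, 4), 255, dtype=np.uint8)
moving = temporal_difference_image(prev, curr)   # maximal positive change
static = temporal_difference_image(curr, curr)   # no change -> mid-gray
```

Crops of such difference images around pairs of detections give the network a motion cue in addition to appearance, which is what makes the same/different classification in the abstract tractable.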
Image processing apparatus, image processing method, and storage medium for generating a panoramic image
An image processing apparatus comprises an image obtaining unit that obtains image data based on image capturing by a plurality of image capture apparatuses configured to capture images in an imaging area from different positions, and a panoramic image generation unit that generates, based on the image data obtained by the image obtaining unit, a panoramic image including images in directions within a predetermined range based on a specific position in the imaging area, the panoramic image having an enlargement rate, for an image in a specific direction within the predetermined range, that is larger than the enlargement rates of images in other directions.
FOVEATED VIDEO RENDERING
Techniques are described for generating and rendering video content based on area of interest (also referred to as foveated rendering) to allow 360 video or virtual reality to be rendered with relatively high pixel resolution even on hardware not specifically designed to render at such high pixel resolution. Processing circuitry may be configured to keep the pixel resolution within a first portion of an image of one view at the relatively high pixel resolution, but reduce the pixel resolution through the remaining portions of the image of the view based on an eccentricity map and/or user eye placement. A device may receive the images of these views and process the images to generate viewable content (e.g., perform stereoscopic rendering or interpolation between views). Processing circuitry may also make use of future frames within a video stream and base predictions on those future frames.
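The resolution falloff described above can be sketched very simply: keep full resolution inside a foveal region and block-average everything outside it. A real eccentricity map would vary the downsampling factor smoothly with distance from the gaze point; the single-radius, single-block-size version below is a crude illustrative stand-in.

```python
import numpy as np

def foveate(image, fovea_center, fovea_radius, block=4):
    """Keep full resolution inside the foveal circle; elsewhere replace
    pixels with block x block averages (a crude stand-in for an
    eccentricity-map-driven resolution falloff). Assumes image height
    and width are divisible by `block`."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = fovea_center
    periphery = (ys - cy) ** 2 + (xs - cx) ** 2 > fovea_radius ** 2
    # Block-average the whole image, then paste it back only in the periphery.
    blurred = image.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    blurred = np.repeat(np.repeat(blurred, block, axis=0), block, axis=1)
    out = image.astype(np.float32).copy()
    out[periphery] = blurred[periphery]
    return out.astype(image.dtype)

image = np.arange(64, dtype=np.uint8).reshape(8, 8)
out = foveate(image, fovea_center=(4, 4), fovea_radius=2, block=4)
```

The payoff is that the peripheral blocks compress far better, so an encoder not built for the full resolution can still carry the frame, which is the point of the patent.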
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, PROGRAM, MOBILE-OBJECT CONTROL APPARATUS, AND MOBILE OBJECT
The present technology relates to an information processing apparatus, an information processing method, a program, a mobile-object control apparatus, and a mobile object that make it possible to improve the accuracy in recognizing a target object.
An information processing apparatus includes a geometric transformation section that transforms at least one of a captured image or a sensor image to match coordinate systems of the captured image and the sensor image, the captured image being obtained by an image sensor, the sensor image indicating a sensing result of a sensor of which a sensing range at least partially overlaps a sensing range of the image sensor; and an object recognition section that performs processing of recognizing a target object on the basis of the captured image and sensor image of which the coordinate systems have been matched to each other. The present technology is applicable to, for example, a system used to recognize a target object around a vehicle.
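Matching the two coordinate systems typically means applying a rigid transform from the sensor frame into the camera frame and then projecting into pixel coordinates. The sketch below shows that geometry with a pinhole camera model; the intrinsic matrix and the identity extrinsic are made-up illustrative values, not parameters from the patent.

```python
import numpy as np

def project_sensor_points(points_sensor, T_cam_from_sensor, K):
    """Map Nx3 points from a range sensor's frame into camera pixel
    coordinates: a 4x4 rigid transform into the camera frame followed
    by a pinhole projection with intrinsics K. Illustrative geometry."""
    pts_h = np.hstack([points_sensor, np.ones((len(points_sensor), 1))])
    pts_cam = (T_cam_from_sensor @ pts_h.T)[:3]   # 3xN in the camera frame
    uvw = K @ pts_cam                             # homogeneous pixel coords
    return (uvw[:2] / uvw[2]).T                   # Nx2 pixel coordinates

# Assumed intrinsics (hypothetical): 500 px focal length, 640x480 sensor.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)  # frames assumed already aligned, for this check
uv = project_sensor_points(np.array([[0.0, 0.0, 2.0]]), T, K)
```

Once sensor returns land in the image plane like this, a recognizer can fuse the camera pixels and the sensor measurements per location, as the abstract describes.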
System and Method for Simulating an Immersive Three-Dimensional Virtual Reality Experience
The present invention brings concerts directly to the people by streaming videos, preferably 360° videos, played back on a virtual reality headset, thus creating an immersive experience that allows users to enjoy a performance of their favorite band at home while sitting in the living room. In some cases, 360° video material may not be available for a specific concert and the system has to fall back on traditional two-dimensional (2D) video material. For such cases, the present invention takes the limited space of a conventional video screen and expands it to a much wider canvas by extending the color patterns of the video into the surrounding space. The invention may further provide seamless blending of the 2D medium into a 3D space and may additionally enhance the space with computer-generated effects and virtual objects that respond directly to the user's biometric data and/or to visual and acoustic stimuli extracted from the played video.
System and method for video processing and presentation
Systems and methods for video processing and display are provided. In one aspect, a workstation is configured to determine a position of a remotely controlled vehicle when an image of the surroundings of the vehicle was captured, and to transform the image into a transformed image based on an estimated real-time position of the vehicle. In another aspect, a workstation is configured to identify an image captured prior to the time a remotely controlled vehicle changed its configuration. The workstation estimates a real-time position of the vehicle, and transforms the identified image into a transformed image based on the estimated real-time position of the vehicle, such that the transformed image represents at least a portion of a view of the surroundings of the vehicle at the estimated real-time position after the vehicle changed its configuration.
IMAGE PROCESSING USING SELF-ATTENTION
An image processing device for identifying one or more characteristics of an input image, the device including a processor configured to: receive the input image, the input image extending along a first axis and a second axis; form a series of attribute maps based on the received input image; perform a first correlation operation by identifying regions in respect of which the patterns of multiple ones of the series of attribute maps are correlated, and forming a first output in dependence on that operation; perform a second correlation operation for identifying combinations of (i) attributes and (ii) portions of the image having common location in terms of the first axis, and forming a second output in dependence on that operation; and form a representation of the one or more characteristics of the input image in dependence on at least the first output and the second output.
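The two correlation operations in the claim can be sketched as attention maps built from dot products: one correlating whole attribute maps with each other, and one correlating groups of attributes and pixels that share the same first-axis position. The NumPy version below is an illustrative reading of the claim language, not the patented architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feats):
    """First correlation: score how strongly each attribute map's spatial
    pattern correlates with every other map's, then re-weight the maps.
    feats: (C, H, W) stack of attribute maps."""
    c, h, w = feats.shape
    flat = feats.reshape(c, h * w)
    attn = softmax(flat @ flat.T)            # C x C map-to-map correlation
    return (attn @ flat).reshape(c, h, w)

def axis_attention(feats):
    """Second correlation: combine attributes and image positions that
    share the same coordinate along the first axis (here, the same
    column), by correlating per-column feature vectors."""
    c, h, w = feats.shape
    cols = feats.transpose(2, 0, 1).reshape(w, c * h)   # one vector per column
    attn = softmax(cols @ cols.T)                       # W x W correlation
    return (attn @ cols).reshape(w, c, h).transpose(1, 2, 0)

feats = np.ones((2, 3, 4))
out_channel = channel_attention(feats)
out_axis = axis_attention(feats)
```

The two outputs can then be fused (e.g. summed or concatenated) to form the final representation of the image's characteristics, mirroring the last step of the claim.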