Patent classifications
G06T15/005
Apparatus and method for ray tracing instruction processing and execution
An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
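The per-element arithmetic described in this abstract (convert to floating point, multiply by a first value, add a second value) can be sketched in software as follows. This is only an illustrative model, not the claimed hardware: the function name `dequantize_packed` and the parameter names `scale` and `offset` are assumptions standing in for the abstract's "first value" and "second value", and a Python list stands in for a packed register.

```python
from typing import List, Sequence


def dequantize_packed(quantized: Sequence[int], scale: float, offset: float) -> List[float]:
    """Model of the per-lane dequantize step: float(q) * scale + offset.

    `scale` and `offset` are hypothetical names for the abstract's
    "first value" and "second value".
    """
    result = []
    for q in quantized:
        as_float = float(q)              # convert the quantized value to floating point
        product = as_float * scale       # multiply by the first value
        result.append(product + offset)  # add the second value
    return result


# Hypothetical usage: four 8-bit lanes mapped onto roughly [-1.0, 1.0].
lanes = [0, 64, 128, 255]
dequantized = dequantize_packed(lanes, scale=2.0 / 255.0, offset=-1.0)
```

Each input lane maps to the output lane at the same position, mirroring the packed source and destination registers in the abstract.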
Method for performing shader occupancy for small primitives
A GPU includes shader cores and a shader warp packer unit. The shader warp packer unit may receive a first primitive associated with a first partially covered quad, and a second primitive associated with a second partially covered quad. The shader warp packer unit may determine that the first partially covered quad and the second partially covered quad have non-overlapping coverage. The shader warp packer unit may pack the first partially covered quad and the second partially covered quad into a packed quad. The shader warp packer unit may send the packed quad to the shader cores. The first partially covered quad and the second partially covered quad may be spatially disjoint from each other. The shader cores may receive and process the packed quad with no loss of information relative to the shader cores individually processing the first partially covered quad and the second partially covered quad.
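The non-overlap check and the packing step can be illustrated with coverage bitmasks. The sketch below assumes a 2x2 quad with one coverage bit per pixel lane; the names `Quad`, `PackedQuad`, `lane_owner`, and `try_pack` are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, Optional

FULL_MASK = 0b1111  # assumption: a quad is 2x2 pixels, one bit per lane


@dataclass
class Quad:
    primitive_id: int
    coverage: int  # 4-bit mask of covered lanes


@dataclass
class PackedQuad:
    coverage: int               # union of the two source masks
    lane_owner: Dict[int, int]  # lane index -> primitive_id owning that lane


def try_pack(a: Quad, b: Quad) -> Optional[PackedQuad]:
    """Pack two partially covered quads when no lane is covered by both."""
    if a.coverage in (0, FULL_MASK) or b.coverage in (0, FULL_MASK):
        return None  # only partially covered quads are packing candidates
    if a.coverage & b.coverage:
        return None  # overlapping coverage: a shared lane would lose information
    owners: Dict[int, int] = {}
    for lane in range(4):
        bit = 1 << lane
        if a.coverage & bit:
            owners[lane] = a.primitive_id
        elif b.coverage & bit:
            owners[lane] = b.primitive_id
    return PackedQuad(coverage=a.coverage | b.coverage, lane_owner=owners)
```

Recording which primitive owns each lane is one way to keep the packed quad lossless, since a shader core can still look up per-primitive data for every covered lane.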
XR preferred movement along planes
Presenting a virtual object includes obtaining, by a first device, a first geometric representation and a second geometric representation corresponding to a physical surface in a real environment, determining an initialization location on the physical surface for a virtual object, obtaining a first normal for the first representation and a second normal for the second representation at the initialization location, and rendering the virtual object at the initialization location based on the first normal and the second normal.
Systems and methods for distributed scalable ray processing
Ray tracing systems have computation units (“RACs”) adapted to perform ray tracing operations (e.g. intersection testing). There are multiple RACs. A centralized packet unit controls the allocation and testing of rays by the RACs. This allows RACs to be implemented without Content Addressable Memories (CAMs), which are expensive to implement, while the functionality of CAMs can still be achieved by implementing them in the centralized controller.
Pixelation optimized delta color compression
A technique for compressing an original image is disclosed. According to the technique, an original image is obtained and a delta-encoded image is generated based on the original image. Next, a segregated image is generated based on the delta-encoded image and then the segregated image is compressed to produce a compressed image. The segregated image is generated because the segregated image may be compressed more efficiently than the original image and the delta image.
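As a rough illustration of the delta-encoding stage only, the sketch below stores each value as its difference from the previous value in a row. This is one common form of delta encoding and is an assumption; the abstract does not say which delta scheme is used, and it does not define how the segregated image is built, so that step is not modeled here. The function names are hypothetical.

```python
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store each value as a difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [values[i] - values[i - 1] for i in range(1, len(values))]


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode by accumulating a running sum."""
    out = []
    running = 0
    for d in deltas:
        running += d
        out.append(running)
    return out


# Hypothetical row of smoothly varying channel values.
row = [100, 101, 103, 103]
encoded = delta_encode(row)
```

On smoothly varying data the deltas cluster near zero, which is the general reason delta-encoded data tends to compress better than the raw values.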
Intersection testing in a ray tracing system using ray coordinate system basis vectors
A method and an intersection testing module for performing intersection testing of a ray with a box in a ray tracing system. The ray and the box are defined in a 3D space using a space-coordinate system, and the ray is defined with a ray origin and a ray direction. A ray-coordinate system is used to perform intersection testing, wherein the ray-coordinate system has an origin at the ray origin, and the ray-coordinate system has three basis vectors. A first of the basis vectors is aligned with the ray direction. A second and a third of the basis vectors: (i) are both orthogonal to the first basis vector, (ii) are not parallel with each other, and (iii) have a zero as one component when expressed in the space-coordinate system. A result of performing the intersection testing is outputted for use by the ray tracing system.
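One way to build basis vectors that satisfy the three stated conditions is shown below. It is a sketch under assumptions, not necessarily the construction used in the patent: it picks the axis where the ray direction has the largest magnitude and forms the second and third vectors by swapping and negating components against that axis. The names `ray_basis` and `dot` are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def ray_basis(direction: Vec3) -> Tuple[Vec3, Vec3, Vec3]:
    """Return (s, p, q): s is the ray direction; p and q are orthogonal to s,
    not parallel to each other, and each has one zero component."""
    # k is the axis with the largest-magnitude direction component (assumed choice).
    k = max(range(3), key=lambda axis: abs(direction[axis]))
    if direction[k] == 0.0:
        raise ValueError("ray direction must be non-zero")
    i, j = [axis for axis in range(3) if axis != k]
    p = [0.0, 0.0, 0.0]
    q = [0.0, 0.0, 0.0]
    p[i] = direction[k]   # p is zero along axis j
    p[k] = -direction[i]
    q[j] = direction[k]   # q is zero along axis i
    q[k] = -direction[j]
    return direction, (p[0], p[1], p[2]), (q[0], q[1], q[2])


s, p, q = ray_basis((1.0, 2.0, 3.0))
```

Because `p` is non-zero only on axes `i` and `k` while `q` is non-zero on axis `j`, the two vectors cannot be parallel whenever the direction's largest component is non-zero.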
Intersection testing in a ray tracing system
A ray tracing unit and method for processing a ray in a ray tracing system performs intersection testing for the ray by performing one or more intersection testing iterations. Each intersection testing iteration includes: (i) traversing an acceleration structure to identify the nearest intersection of the ray with a primitive that has not been identified as the nearest intersection in any previous intersection testing iterations for the ray; and (ii) if, based on a characteristic of the primitive, a traverse shader is to be executed in respect of the identified intersection: executing the traverse shader in respect of the identified intersection; and if the execution of the traverse shader determines that the ray does not intersect the primitive at the identified intersection, causing another intersection testing iteration to be performed. When the intersection testing for the ray is complete, an output shader is executed to process a result of the intersection testing for the ray.
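The iteration scheme above can be sketched as a loop. In this simplified model, a flat list of candidate hits stands in for the acceleration structure, and the names `Hit`, `needs_traverse_shader`, `intersect`, and `trace_ray` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Hit:
    distance: float
    primitive_id: int
    needs_traverse_shader: bool  # stands in for the "characteristic of the primitive"


def intersect(hits: List[Hit], traverse_shader: Callable[[Hit], bool]) -> Optional[Hit]:
    """Run intersection testing iterations until a hit is accepted or none remain."""
    already_identified: Set[int] = set()
    while True:
        # (i) find the nearest intersection not identified in an earlier iteration
        candidates = [h for h in hits if h.primitive_id not in already_identified]
        if not candidates:
            return None  # no remaining intersections: the ray misses
        nearest = min(candidates, key=lambda h: h.distance)
        already_identified.add(nearest.primitive_id)
        # (ii) run the traverse shader only when the primitive calls for it
        if nearest.needs_traverse_shader and not traverse_shader(nearest):
            continue  # shader rejected the hit: perform another iteration
        return nearest


def trace_ray(hits: List[Hit], traverse_shader: Callable[[Hit], bool],
              output_shader: Callable[[Optional[Hit]], str]) -> str:
    """Once testing is complete, hand the result to the output shader."""
    return output_shader(intersect(hits, traverse_shader))
```

A traverse shader that always returns `False` models a primitive that is never actually hit (for example, a fully transparent texel), which forces the loop on to the next-nearest candidate.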
Efficient convolution operations
A method of operation of a texturing/shading unit in a GPU pipeline is used for efficient convolution operations. The method uses texture hardware to collectively fetch all the texels required to calculate properties for a group of output pixels without any duplication. The method then bypasses bilinear filter hardware in the texture hardware and passes the fetched and unfiltered texel data from the texture hardware unit to shader hardware in the texturing/shading unit. The shader hardware uses the fetched texel data to perform a plurality of convolution operations to calculate the properties of each of the output pixels.
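The fetch-once idea can be illustrated in software: for a group of output pixels and a k-by-k kernel, the union of all required texels is a single rectangular footprint, which is fetched one time and then reused for every output pixel. The sketch below is a plain software model under assumptions (no border handling, kernel applied without flipping); the names `fetch_footprint` and `convolve_group` are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def fetch_footprint(texture: Grid, x0: int, y0: int,
                    group_w: int, group_h: int, k: int) -> Grid:
    """Fetch every unfiltered texel the group needs exactly once.

    The footprint is (group_w + k - 1) by (group_h + k - 1) texels.
    """
    return [row[x0:x0 + group_w + k - 1]
            for row in texture[y0:y0 + group_h + k - 1]]


def convolve_group(footprint: Grid, kernel: Grid, group_w: int, group_h: int) -> Grid:
    """Compute one convolution result per output pixel from the shared footprint."""
    k = len(kernel)
    out = []
    for oy in range(group_h):
        out_row = []
        for ox in range(group_w):
            acc = 0.0
            for ky in range(k):
                for kx in range(k):
                    # kernel is applied unflipped (cross-correlation form)
                    acc += footprint[oy + ky][ox + kx] * kernel[ky][kx]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical 4x4 texture, 3x3 box kernel, 2x2 group of output pixels.
texture = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]
footprint = fetch_footprint(texture, 0, 0, group_w=2, group_h=2, k=3)
result = convolve_group(footprint, kernel, group_w=2, group_h=2)
```

In this example the shared footprint holds 16 texels, whereas fetching a separate 3x3 neighborhood for each of the 4 output pixels would request 36.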
Position-based rendering apparatus and method for multi-die/GPU graphics processing
Position-based rendering apparatus and method for multi-die/GPU graphics processing. For example, one embodiment of a method comprises: distributing a plurality of graphics draws to a plurality of graphics processors; performing position-only shading using vertex data associated with tiles of a first draw on a first graphics processor, the first graphics processor responsively generating visibility data for each of the tiles; distributing subsets of the visibility data associated with different subsets of the tiles to different graphics processors; limiting geometry work to be performed on each tile by each graphics processor using the visibility data, each graphics processor to responsively generate rendered tiles; and wherein the rendered tiles are combined to generate a complete image frame.
Thread group scheduling for graphics processing
Embodiments are generally directed to thread group scheduling for graphics processing. An embodiment of an apparatus includes a plurality of processors including a plurality of graphics processors to process data; a memory; and one or more caches for storage of data for the plurality of graphics processors, wherein the plurality of processors are to schedule a plurality of groups of threads for processing by the plurality of graphics processors, the scheduling including the plurality of processors applying a bias for scheduling the plurality of groups of threads according to a cache locality for the one or more caches.
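A cache-locality bias of the kind described could look like the sketch below. The abstract does not state how the bias is computed, so the scoring rule here (cached-item overlap weighted by `bias`, minus current load) is purely an assumption, as are the names `pick_processor`, `cache_contents`, and `load`.

```python
from typing import List, Set


def pick_processor(group_data: Set[str], cache_contents: List[Set[str]],
                   load: List[int], bias: float = 1.0) -> int:
    """Choose a graphics processor for a thread group.

    Hypothetical scoring: reward processors whose cache already holds data
    the group will touch, and penalize processors that are already busy.
    """
    def score(p: int) -> float:
        locality = len(group_data & cache_contents[p])  # items already cached
        return bias * locality - load[p]

    # Ties resolve to the lowest processor index.
    return max(range(len(load)), key=score)


# Hypothetical state: processor 0 has the group's data cached but is busier.
caches = [{"a", "b"}, {"c"}]
queue_depth = [1, 0]
chosen = pick_processor({"a", "b"}, caches, queue_depth, bias=1.0)
```

Setting `bias` to zero reduces this to plain load balancing, which makes the effect of the locality term easy to compare.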