G06F15/7839

Systems and methods for controlling access to secure debugging and profiling features of a computer system
11580264 · 2023-02-14

The present disclosure describes systems and methods for controlling access to secure debugging and profiling features of a computer system. Some illustrative embodiments include a system that includes a processor, and a memory coupled to the processor (the memory used to store information and an attribute associated with the stored information). At least one bit of the attribute determines a security level, selected from a plurality of security levels, of the stored information associated with the attribute. Asserting at least one other bit of the attribute enables exportation of the stored information from the computer system if the security level of the stored information is higher than at least one other security level of the plurality of security levels.
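
The attribute check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the bit layout (`LEVEL_MASK`, `EXPORT_BIT`), the function names, and the rule that information at or below the threshold level is exportable without the flag are all assumptions, since the abstract specifies none of them.

```python
# Hypothetical attribute layout; the abstract does not define one.
LEVEL_MASK = 0b011  # bits 0-1: security level (0 = lowest)
EXPORT_BIT = 0b100  # bit 2: export-enable flag


def security_level(attr: int) -> int:
    """Extract the security level encoded in the attribute."""
    return attr & LEVEL_MASK


def may_export(attr: int, threshold: int = 0) -> bool:
    """Decide whether the stored information may leave the system.

    Information above `threshold` needs the export bit asserted.
    Assumption: information at or below `threshold` is exportable as-is.
    """
    if security_level(attr) <= threshold:
        return True
    return bool(attr & EXPORT_BIT)
```

For example, `may_export(0b010)` is false because level 2 exceeds the threshold and the export bit is clear, while `may_export(0b110)` is true.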

DATA TRANSMISSION METHOD AND APPARATUS
20230038051 · 2023-02-09

A data transmission method and apparatus are provided. The data transmission method is applied to a computer system including at least two coprocessors, for example, a first coprocessor and a second coprocessor. A shared memory is deployed between the first coprocessor and the second coprocessor, and is configured to store data generated when subtasks are separately executed. The shared memory also stores a storage address of data generated when a subtask is executed, and a mapping relationship between each subtask and a coprocessor that executes the subtask. Therefore, a storage address of data to be read by a coprocessor may be found based on the mapping relationship, and the data may be read directly from the shared memory without being copied over a system bus. This improves efficiency of data transmission between the coprocessors.
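
A rough model of the lookup path is sketched below. The class name `SharedMemory`, its fields, and the bump-pointer address allocation are hypothetical; the abstract states only that the shared memory holds the data, its storage address, and the subtask-to-coprocessor mapping.

```python
class SharedMemory:
    """Toy model of a memory shared by two or more coprocessors."""

    def __init__(self):
        self.data = {}         # storage address -> payload
        self.address_of = {}   # subtask id -> storage address of its output
        self.executor_of = {}  # subtask id -> coprocessor that executed it
        self._next_addr = 0    # assumption: simple bump allocation

    def write(self, subtask: str, coprocessor: str, payload: bytes) -> int:
        """Store a subtask's output and record where it lives and who made it."""
        addr = self._next_addr
        self._next_addr += len(payload)
        self.data[addr] = payload
        self.address_of[subtask] = addr
        self.executor_of[subtask] = coprocessor
        return addr

    def read(self, subtask: str) -> bytes:
        """Resolve the address through the mapping and read in place."""
        return self.data[self.address_of[subtask]]
```

A second coprocessor that needs the output of subtask `"t1"` calls `read("t1")`; the payload is found through the recorded address, with no copy modeled in between.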

Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format

Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.
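
The multiply-accumulate data flow can be illustrated in software. The sketch below is an assumption-laden model, not the hardware design: `to_bf16` and `dot_accumulate` are hypothetical names, the rounding mode (round to nearest even) is assumed, accumulation is done at Python float precision, and only finite inputs within float32 range are handled.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite value to bfloat16 precision, returned as a float.

    BF16 keeps the top 16 bits of an IEEE 754 float32 encoding.
    Assumption: round to nearest, ties to even; NaN is not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # rounding bias with tie-to-even
    bits &= 0xFFFF0000                   # drop the low 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot_accumulate(src0: float, src1: list, src2: list) -> float:
    """Return src0 + sum(src1[i] * src2[i]) with BF16-rounded multiplicands."""
    total = src0
    for a, b in zip(src1, src2):
        total += to_bf16(a) * to_bf16(b)  # multiplier output feeds the accumulator
    return total
```

Here `src0` plays the role of the first source operand that the accumulator adds to the multiplier output, and `src1` and `src2` stand in for the second and third source operands.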

Monolithic vector processor configured to operate on variable length vectors using a vector length register

A computer processor comprising a vector unit is disclosed. The vector unit may comprise a vector register file comprising at least one register to hold a varying number of elements. The vector unit may further comprise a vector length register file comprising at least one register to specify the number of operations of a vector instruction to be performed on the varying number of elements in the at least one register of the vector register file. The computer processor may be implemented as a monolithic integrated circuit.
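
The role of the vector length register can be shown with a small model. The class `VectorUnit`, its field names, and the single `vl` register are assumptions for illustration; the abstract describes a vector length register file with at least one register.

```python
class VectorUnit:
    """Toy vector unit whose instructions are bounded by a length register."""

    def __init__(self, max_len: int = 8, num_regs: int = 4):
        self.vregs = [[0] * max_len for _ in range(num_regs)]
        self.vl = 0  # vector length register: element operations per instruction

    def vadd(self, dst: int, a: int, b: int) -> None:
        """Add registers a and b element-wise over the first `vl` elements only."""
        for i in range(self.vl):
            self.vregs[dst][i] = self.vregs[a][i] + self.vregs[b][i]
```

Setting `vl = 3` before calling `vadd` updates only the first three elements of the destination register; the remaining elements keep their previous values.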

Defining a transition zone between a shell and lattice cell array in a three-dimensional printing system
11484946 · 2022-11-01

An apparatus for manufacturing a three-dimensional article by additive manufacturing includes a processor and an information storage device storing software instructions. In response to execution by the processor, the software instructions cause the apparatus to: receive initial data defining the three-dimensional article having an outer surface, define a shell having the outer surface of the three-dimensional article and an opposing inner surface that defines an inner cavity, define a transition zone between the inner surface of the shell and a boundary that is inside the inner cavity and generally follows the inner surface of the shell, define a lattice of arrayed unit cells that fill the inside of the boundary, the lattice being defined by connected lattice segments, and define transition segments that couple the lattice to the inner surface of the shell.

Techniques for configuring parallel processors for different application domains

In various embodiments, a parallel processor includes a parallel processor module implemented within a first die and a memory system module implemented within a second die. The memory system module is coupled to the parallel processor module via an on-package link. The parallel processor module includes multiple processor cores and multiple cache memories. The memory system module includes a memory controller for accessing a DRAM. Advantageously, the performance of the parallel processor module can be effectively tailored for memory bandwidth demands that typify one or more application domains via the memory system module.

Vector population count determination via comparison iterations in memory
11663005 · 2023-05-30

Examples of the present disclosure provide apparatuses and methods for determining a vector population count in a memory. An example method comprises determining, using sensing circuitry, a vector population count of a number of fixed length elements of a vector stored in a memory array.
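
As a software analogy only, a population count over fixed-length elements can be computed one bit position at a time. The function name `vector_popcount`, the default 8-bit width, and the choice to return one count per element are assumptions; the abstract does not describe how the sensing circuitry performs its iterations.

```python
def vector_popcount(elements: list, width: int = 8) -> list:
    """Count the set bits in each fixed-width element of a vector.

    One outer iteration per bit position; the inner loop stands in for
    work that in-memory hardware could do on all elements at once.
    """
    counts = [0] * len(elements)
    for bit in range(width):
        for i, value in enumerate(elements):
            if (value >> bit) & 1:  # compare the current bit against 1
                counts[i] += 1
    return counts
```

Summing the returned list gives the number of set bits across the whole vector, if that is the quantity of interest.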

SYSTEMS AND METHODS TO CONFIGURE FRONT PANEL HEADER

In one aspect, a device may include at least one processor programmed with instructions to power on the device responsive to an electrical connection of two pins on a front panel header of a system board and, based on powering on the device responsive to the electrical connection of two pins on the front panel header of the system board, present a basic input/output system (BIOS) setup screen on a display. The BIOS setup screen may provide one or more options for a person to configure pinouts of the front panel header. The processor may also be programmed with instructions to save the person's configuration of the pinouts of the front panel header based on user input using the BIOS setup screen and, responsive to a subsequent startup of the device, apply the configuration of the pinouts of the front panel header for operation of the device.

Processing data in memory using an FPGA

Processing data in memory using a field programmable gate array by reading a first portion of a data set to a burst block having a first data format, transforming a sub-portion of the first portion to an element block having a second data format, processing the sub-portion to yield a first results set, transforming the first results set to the first data format of the burst block, and writing the first results set to the burst block.
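
The read, transform, process, transform back, and write sequence can be mimicked in software. In this sketch the burst block is a raw byte buffer and the element block is a tuple of little-endian 32-bit integers; both formats, and the name `process_burst`, are assumptions chosen for illustration rather than details from the abstract.

```python
import struct


def process_burst(burst: bytearray, offset: int, count: int, fn) -> None:
    """Apply `fn` to `count` int32 elements of `burst`, starting at `offset`."""
    fmt = f"<{count}i"  # assumed element format: little-endian int32
    # Transform a sub-portion of the burst block into the element format.
    elements = struct.unpack_from(fmt, burst, offset)
    # Process the elements to produce the results set.
    results = [fn(e) for e in elements]
    # Transform the results back to the burst format and write them in place.
    struct.pack_into(fmt, burst, offset, *results)
```

Only the selected sub-portion is rewritten; bytes outside `offset` to `offset + 4 * count` are left untouched.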

Systems and methods for improving cache efficiency and utilization

Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.