Patent classifications
G06F9/30
I2C communication
The present disclosure relates to a communication method by I2C bus between a emitting device and a receiving device, in which: a rising edge of a clock signal of the I2C bus, directly following a start condition of an I2C communication, is recorded; and when an interruption is generated within the receiving device, the receiving device verifies whether the rising edge was recorded.
Reducing save restore latency for power control based on write signals
A method of save-restore operations includes monitoring, by a power controller of a parallel processor (such as a graphics processing unit), of a register bus for one or more register write signals. The power controller determines that a register write signal is addressed to a state register that is designated to be saved prior to changing a power state of the parallel processor from a first state to a second state having a lower level of energy usage. The power controller instructs a copy of data corresponding to the state register to be written to a local memory module of the parallel processor. Subsequently, the parallel processor receives a power state change signal and writes state register data saved at the local memory module to an off-chip memory prior to changing the power state of the parallel processor.
Neural network training mechanism
- Gokcen Cilingir ,
- Elmoustapha Ould-Ahmed-Vall ,
- Rajkishore Barik ,
- Kevin Nealis ,
- Xiaoming Chen ,
- Justin E. Gottschlich ,
- Prasoonkumar Surti ,
- Chandrasekaran Sakthivel ,
- Abhishek Appu ,
- John C. Weast ,
- Sara S. Baghsorkhi ,
- Barnan Das ,
- Narayan Biswal ,
- Stanley J. Baran ,
- Nilesh V. Shah ,
- Archie Sharma ,
- Mayuresh M. Varerkar
An apparatus to facilitate neural network (NN) training is disclosed. The apparatus includes training logic to receive one or more network constraints and train the NN by automatically determining a best network layout and parameters based on the network constraints.
Systems and methods for performing horizontal tile operations
Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of a M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.
Memory device test mode access
A system includes a memory device and a processing device coupled to the memory device. The processing device is configured to switch an operating mode of the memory device between a test mode and a non-test mode. The system further includes a test mode access component that is configured to access the memory device while the memory device is in the test mode to perform a test mode operation.
Fine-grained stack protection using cryptographic computing
A processor includes a register to store an encoded pointer to a variable in stack memory. The encoded pointer includes an encrypted portion and a fixed plaintext portion of a memory address corresponding to the variable. The processor further includes circuitry to, in response to a memory access request for associated with the variable, decrypt the encrypted portion of the encoded pointer to obtain first upper address bits of the memory address and a memory allocation size for a variable, decode the encoded pointer to obtain the memory address, verify the memory address is valid based, at least in part on the memory allocation size, and in response to determining that the memory address is valid, allow the memory access request.
Register sharing mechanism to equally allocate disabled thread registers to active threads
An apparatus is disclosed. The apparatus includes one or more processors comprising register sharing circuitry to receive meta-information indicating a number of threads that are to be disabled and provide an indication that an associated thread is disabled, a plurality of General Purpose Register Files (GRFs), wherein one or more of the plurality of GRFs is associated with one of the plurality of threads and a plurality of multiplexers coupled to the one or more GRFs to receive the indication from the register sharing circuitry and disable thread access to an associated GRF based on an indication that a thread is to be disabled.
Hardware for split data translation lookaside buffers
Systems, methods, and apparatuses relating to hardware for split data translation lookaside buffers. In one embodiment, a processor includes a decode circuit to decode instructions into decoded instructions, an execution circuit to execute the decoded instructions, and a memory circuit comprising a load data translation lookaside buffer circuit and a store data translation lookaside buffer circuit separate and distinct from the load data translation lookaside buffer circuit, wherein the memory circuit sends a memory access request of the instructions to the load data translation lookaside buffer circuit when the memory access request is a load data request and to the store data translation lookaside buffer circuit when the memory access request is a store data request to determine a physical address for a virtual address of the memory access request.
Computing chip, hashrate board and data processing apparatus
This disclosure relates to a computing chip, a hashrate board, and a data processing apparatus. The computing chip includes a plurality of operation stages arranged in a pipeline configuration. Each operation stage includes: a first combinational logic circuit occupying a plurality of first cell points adjacent to each other, at least a portion of the first cell points being located in a first incomplete column; one or more second combinational logic circuits each occupying one or more second cell points, at least a portion of the second cell points being located in a second incomplete column; and a plurality of registers each occupying a plurality of third cell points, at least a portion of the third cell points being located in the first incomplete column or the second incomplete column. The first cell points, the second cell points, and third cell points occupy equal areas on the computing chip.
Handling load-exclusive instructions in apparatus having support for transactional memory
An apparatus is described with support for transactional memory and load/store-exclusive instructions using an exclusive monitor indication to track exclusive access to a given address. In response to a predetermined type of load instruction specifying a load target address, which is executed within a given transaction, any exclusive monitor indication previously set for the load target address is cleared. In response to a load-exclusive instruction, an abort is triggered for a transaction for which the given address is specified as one of its working set of addresses. This helps to maintain mutual exclusion between transactional and non-transactional threads even if there is load speculation in the non-transactional thread.