Patent classifications
G06F9/30101
ULTRA-LOW-POWER AND LOW-AREA SOLUTION OF BINARY MULTIPLY-ACCUMULATE SYSTEM AND METHOD
A data structure and microcontroller architecture perform binary multiply-accumulate operations using multiple partial copies of weights. A destination-register location, a source-register location, and a weight-register location are received. Using the weight-register location, a sub-set of the weight bits is copied a select number of times based on a received filter index value. Each copy of the sub-set of weights is executed in parallel. Using the source-register location, a sub-set of the input bits is selected based on the size of the sub-set of weights, wherein the sub-set of input bits is shifted one bit from a previous sub-set of input bits. An XOR operation is performed on each bit in the copy of the sub-set of weights with the corresponding bit in the selected sub-set of input bits. In a corresponding destination sub-location, the outputs of the XOR operations are aggregated with each other and with the current value of that destination sub-location.
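The sliding-window XOR accumulation described in this abstract can be sketched in software. This is only an illustrative model, not the patented circuit: the function name `binary_mac`, the use of Python lists for register contents, the choice of one weight copy per destination sub-location, and the reading of "aggregated" as a sum of XOR outputs are all assumptions.

```python
def binary_mac(input_bits, weight_bits, dest):
    """Accumulate XOR results of one weight sub-set against sliding input windows.

    input_bits  -- list of 0/1 values standing in for the source register
    weight_bits -- list of 0/1 values: the weight sub-set that gets copied
    dest        -- list of ints, one running total per destination sub-location
    """
    k = len(weight_bits)
    copies = len(dest)  # assumption: one weight copy per destination sub-location
    if len(input_bits) < k + copies - 1:
        raise ValueError("not enough input bits for the requested number of copies")

    result = list(dest)
    for i in range(copies):  # hardware would evaluate these copies in parallel
        window = input_bits[i:i + k]  # each window is shifted one bit from the last
        xor_bits = [w ^ x for w, x in zip(weight_bits, window)]
        result[i] += sum(xor_bits)  # assumption: "aggregated" means summed
    return result


print(binary_mac([1, 0, 1, 1, 0], [1, 1, 0], [0, 0, 0]))  # prints [2, 2, 0]
```

Passing a non-zero `dest` shows the accumulate step: the new XOR counts are added to whatever each sub-location already holds.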
Memory device to suspend ROM operation and a method of operating the memory device
A memory device in accordance with a described method of operation includes a read only memory (ROM) address controller and a suspend signal generator. The ROM address controller is configured to sequentially output a plurality of operation ROM addresses at which ROM codes to be executed in response to an operation command are stored, and to suspend output of the plurality of operation ROM addresses in response to a suspend signal. The suspend signal generator is configured to generate the suspend signal that is activated during a preset period depending on whether a suspend ROM address is identical to an operation ROM address, among the plurality of operation ROM addresses, currently being output. The suspend ROM address is an address at which a ROM code, execution of which is to be suspended, among the ROM codes, is stored.
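A rough behavioral model of the address sequencing may help here. The sketch below is an assumption-laden illustration: `run_rom_sequence`, the cycle-based trace, and the choice to hold the matching address for a fixed number of cycles before moving on are all hypothetical, since the abstract does not say how the preset period is measured or what happens to the held address afterwards.

```python
def run_rom_sequence(operation_addresses, suspend_address, suspend_cycles):
    """Return a per-cycle trace of (address_on_output, suspend_signal_active).

    operation_addresses -- ROM addresses output in order for one operation
    suspend_address     -- address whose ROM code is to be suspended
    suspend_cycles      -- assumed length of the preset period, in cycles
    """
    trace = []
    for addr in operation_addresses:
        if addr == suspend_address and suspend_cycles > 0:
            # Comparator match: the suspend signal is active and the address
            # output is held for the whole preset period.
            trace.extend([(addr, True)] * suspend_cycles)
        else:
            trace.append((addr, False))
    return trace


for address, suspended in run_rom_sequence([0x10, 0x11, 0x12], 0x11, 2):
    print(hex(address), suspended)
```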
Interface Bus Combining
Circuits and methods enabling common control of an agent device by two or more buses, particularly MIPI RFFE serial buses. In essence, the invention provides flagging signals designating completed register write operations to denote which of two registers is active, such that synchronization is accomplished in a clock-free manner. One embodiment includes at least two decoders, each including a common register and a bus (S/P) decoder coupled to a respective bus and to the common register. The S/P decoder asserts a write-complete signal when a write operation to a corresponding common register is completed. A multiplexer has at least two selectable input bus ports coupled to the common registers within the at least two decoders. A selection circuit selects an input bus port of the multiplexer in response to the assertion of a last write-complete signal from the S/P decoders.
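The selection rule amounts to "the most recently completed write wins". A minimal software analogy follows, assuming a hypothetical `BusCombiner` class in which each bus has its own register and a completed write immediately steers the multiplexer; real hardware would do this with write-complete flags rather than method calls.

```python
class BusCombiner:
    """Toy model: one register per bus, multiplexer follows the last write."""

    def __init__(self, num_buses):
        self.registers = [0] * num_buses  # one common register per decoder
        self.selected = 0                 # multiplexer port currently selected

    def write(self, bus, value):
        # The decoder for this bus finishes its register write...
        self.registers[bus] = value
        # ...and its write-complete signal makes this port the active one.
        self.selected = bus

    def output(self):
        # The agent device only ever sees the selected register.
        return self.registers[self.selected]


combiner = BusCombiner(2)
combiner.write(0, 0xA)
combiner.write(1, 0xB)
print(hex(combiner.output()))  # prints 0xb
```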
PROCESSING-IN-MEMORY (PIM) SYSTEM AND OPERATING METHODS OF THE PIM SYSTEM
A processing-in-memory (PIM) system includes a host configured to generate a first request for a memory access operation and a second request for an arithmetic operation, a PIM controller configured to generate a first command based on the first request or the second request, a high speed interface configured to generate a second command based on the second request, and a PIM device configured to perform the memory access operation in response to the first command from the PIM controller and to perform the arithmetic operation in response to the second command from the high speed interface.
Apparatus and method for scalable qubit addressing
An apparatus and method for scalable qubit addressing. For example, one embodiment of a processor comprises: a decoder comprising quantum instruction decode circuitry to decode quantum instructions to generate quantum microoperations (uops) and non-quantum decode circuitry to decode non-quantum instructions to generate non-quantum uops; execution circuitry comprising: an address generation unit (AGU) to generate a system memory address responsive to execution of one or more of the non-quantum uops; and quantum index generation circuitry to generate quantum index values responsive to execution of one or more of the quantum uops, each quantum index value uniquely identifying a quantum bit (qubit) in a quantum processor; wherein to generate a first quantum index value for a first quantum uop, the quantum index generation circuitry is to read the first quantum index value from a first architectural register identified by the first quantum uop.
Fault isolation and recovery of CPU cores for failed secondary asymmetric multiprocessing instance
According to certain embodiments, a system includes one or more processors and one or more computer-readable non-transitory storage media comprising instructions that, when executed by the one or more processors, cause one or more components to perform operations including executing a software process of a secondary instance, the secondary instance running in parallel with a primary instance and associated with a plurality of cores including a bootstrap core, registering a non-maskable interrupt for the bootstrap core in the secondary instance, determining whether the secondary instance is in a fault state, wherein, if the secondary instance is in the fault state, halting the plurality of cores associated with the secondary instance, without impact to the primary instance, and recovering the bootstrap core by switching a context of the bootstrap core from the secondary instance to the primary instance via the non-maskable interrupt.
PROCESSORS EMPLOYING MEMORY DATA BYPASSING IN MEMORY DATA DEPENDENT INSTRUCTIONS AS A STORE DATA FORWARDING MECHANISM, AND RELATED METHODS
Processors employing memory bypassing in memory data dependent instructions as a store data forwarding mechanism, and related methods. To reduce stalls of memory data dependent, load-based instructions, a memory data dependency detection circuit is configured to detect a memory hazard between a store-based instruction and a load-based instruction based on their opcodes and destination/source operands. Some store-based and load-based instructions have opcodes identifying these instructions as having respective store and load address operand types that can be compared without resolution of their respective store and load addresses. For these detected types of instructions, the memory data dependency detection circuit is configured to determine if a source operand of a load-based instruction matches a target operand of a store-based instruction to detect a memory hazard earlier in the instruction pipeline. Identifying memory hazards earlier in an instruction pipeline can allow memory dependent instructions to be processed with avoided or reduced stalls.
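The early check can be illustrated as a comparison of operands rather than resolved addresses. In the sketch below, the opcode names `store_sp` and `load_sp` (stack-pointer-relative accesses), the `Instr` type, and the `(base register, offset)` operand encoding are hypothetical. The model also ignores any intervening write to the base register, which a real detection circuit would have to account for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    opcode: str
    operand: tuple  # assumed encoding: (base register name, constant offset)


# Hypothetical opcodes whose address operands can be compared before the
# effective addresses are computed.
COMPARABLE_STORES = {"store_sp"}
COMPARABLE_LOADS = {"load_sp"}


def early_hazard(store, load):
    """Return True if the load can be seen to depend on the store from
    opcodes and operands alone, without resolving either address."""
    if store.opcode not in COMPARABLE_STORES or load.opcode not in COMPARABLE_LOADS:
        return False  # fall back to the normal, later address comparison
    return store.operand == load.operand


st = Instr("store_sp", ("sp", 16))
print(early_hazard(st, Instr("load_sp", ("sp", 16))))  # prints True
print(early_hazard(st, Instr("load_sp", ("sp", 24))))  # prints False
```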
APPARATUS, SYSTEM, AND METHOD FOR CONFIGURING A CONFIGURABLE COMBINED PRIVATE AND SHARED CACHE
Aspects disclosed in the detailed description include configuring a configurable combined private and shared cache in a processor. Related processor-based systems and methods are also disclosed. A combined private and shared cache structure is configurable to select a private cache portion and a shared cache portion.
Method and device for intrusion detection in a computer network
A device and method for intrusion detection in a computer network. A data packet is received at an input of a hardware switch unit. An output of the hardware switch unit is selected for sending the data packet or a copy as a function of security layer information from the data packet and of a hardware address, and context information for the data packet is determined. A hardware filter compares an actual value from a field with a setpoint value for values from this field, the field including security layer data or mediation layer data. An interrupt for a computing device is triggered as a function of a result of the comparison. Triggered by the interrupt, an analysis for detecting an intrusion pattern in network traffic in the computer network is carried out as a function of the context information for the data packet.
Heterogeneous-computing based emulator
In an approach, a processor receives an input indicative of a set of registers, the set of registers being configured for obtaining output data from a design-under-test (DUT) in a field-programmable gate array (FPGA) module. A processor executes a set of instructions for monitoring the output data in the set of registers. A processor generates data indicative of at least one portion of changes of the output data in the set of registers during the execution of the set of instructions. A processor causes a separate machine to analyze the data by using an interface to send the data to the separate machine.
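The change-tracking step can be pictured as diffing successive register snapshots. The following is a simplified sketch under stated assumptions: `record_changes`, the dictionary-per-step snapshot format, and the `(step, register, old, new)` record layout are illustrative choices, and sending the records to a separate machine is left out.

```python
def record_changes(snapshots):
    """Return (step, register, old_value, new_value) for every observed change.

    snapshots -- list of dicts mapping register name to the value read at
                 that monitoring step (a stand-in for polling DUT outputs)
    """
    changes = []
    previous = {}
    for step, snapshot in enumerate(snapshots):
        for name, value in snapshot.items():
            if name in previous and previous[name] != value:
                changes.append((step, name, previous[name], value))
            previous[name] = value  # remember the latest value seen
    return changes


polled = [
    {"r0": 0, "r1": 5},
    {"r0": 1, "r1": 5},
    {"r0": 1, "r1": 7},
]
print(record_changes(polled))  # prints [(1, 'r0', 0, 1), (2, 'r1', 5, 7)]
```

Only the change records, not the full snapshots, would then need to cross the interface to the analyzing machine.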