Patent classifications
G06F12/0884
Increased parallelization efficiency in tiering environments
A computer-implemented method, according to one embodiment, includes: identifying block addresses which are associated with a given object, and combining the block addresses into a first set in response to determining that at least one token is currently issued on one or more of the identified block addresses. A first portion of the block addresses is transitioned to a second set, where the first portion includes those block addresses determined to have a token currently issued thereon. Moreover, a second portion of the block addresses is divided into equal chunks, where the second portion includes the block addresses remaining in the first set. The chunks in the first set are allocated across two or more parallelization units. Furthermore, the block addresses in the second set are divided into equal chunks, and the chunks in the second set are allocated to at least one dedicated parallelization unit.
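The partitioning described above can be sketched as follows. This is a minimal illustration, assuming a simple set of token-holding addresses and a fixed chunk size; the function and parameter names (`partition_for_parallel_tiering`, `tokened`, `shared_units`, `dedicated_units`) are hypothetical and not drawn from the patent.

```python
def partition_for_parallel_tiering(block_addresses, tokened, chunk_size,
                                   shared_units=4, dedicated_units=1):
    # Addresses with a currently issued token transition to the second set;
    # the rest remain in the first set.
    first_set = [a for a in block_addresses if a not in tokened]
    second_set = [a for a in block_addresses if a in tokened]

    def chunks(addrs):
        return [addrs[i:i + chunk_size] for i in range(0, len(addrs), chunk_size)]

    # Chunks from the first set are spread round-robin across two or more
    # shared parallelization units.
    shared = {u: [] for u in range(shared_units)}
    for i, c in enumerate(chunks(first_set)):
        shared[i % shared_units].append(c)

    # Chunks from the second set go to the dedicated parallelization unit(s),
    # keeping token-contended addresses out of the shared pipeline.
    dedicated = {u: [] for u in range(dedicated_units)}
    for i, c in enumerate(chunks(second_set)):
        dedicated[i % dedicated_units].append(c)
    return shared, dedicated
```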
Victim cache that supports draining write-miss entries
A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes a set of cache lines, line type bits configured to store an indication that a corresponding cache line of the set of cache lines is configured to store write-miss data, and an eviction controller configured to flush stored write-miss data based on the line type bits.
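A rough sketch of the second sub-cache's bookkeeping, assuming one line type bit per cache line and a drain operation that flushes only lines marked as holding write-miss data; the class and method names are illustrative assumptions, not taken from the patent.

```python
class SubCache:
    """Toy model of a sub-cache whose lines carry a write-miss line type bit."""

    def __init__(self, num_lines):
        self.lines = [None] * num_lines            # data stored per cache line
        self.write_miss_bit = [False] * num_lines  # line type bits

    def store_write_miss(self, idx, data):
        # Mark the line as configured to store write-miss data.
        self.lines[idx] = data
        self.write_miss_bit[idx] = True

    def drain_write_misses(self):
        # Eviction controller: flush stored write-miss data based on the
        # line type bits, leaving other lines untouched.
        flushed = []
        for i, is_wm in enumerate(self.write_miss_bit):
            if is_wm:
                flushed.append((i, self.lines[i]))
                self.lines[i] = None
                self.write_miss_bit[i] = False
        return flushed
```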
Memory system including parallel operation elements and control method to reduce read latency and omit status check
According to one embodiment, a memory system includes a non-volatile memory and a memory controller. The non-volatile memory includes a plurality of parallel operation elements each including a memory cell. The memory controller is configured to control the plurality of parallel operation elements. In reading data from the non-volatile memory, the memory controller is configured to sequentially instruct the plurality of parallel operation elements to perform a sense operation of sensing data stored in the memory cell included in each of the plurality of parallel operation elements. In a case where the operation period of the sense operation of any one of the plurality of parallel operation elements has expired, the memory controller instructs the one of the plurality of parallel operation elements to perform a transfer operation for the data without checking a status of the one of the plurality of parallel operation elements.
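The timing scheme above can be sketched as follows, assuming each element has a known sense period so the controller can issue the transfer as soon as that period expires, with no status poll in between. The class name, `sense_period` attribute, and `issue_delay` parameter are all assumptions for illustration.

```python
class ParallelElement:
    """Toy model of one parallel operation element with a known sense period."""

    def __init__(self, data, sense_period):
        self.data = data
        self.sense_period = sense_period
        self.status_checks = 0  # counts status polls (never incremented here)

    def sense(self, t):
        self.sense_done_at = t + self.sense_period

    def transfer(self, at):
        return self.data


def read_sequential(elements, issue_delay=0.1, now=0.0):
    # Sequentially instruct each element to perform its sense operation.
    t = now
    schedule = []
    for e in elements:
        e.sense(t)
        schedule.append((e.sense_done_at, e))
        t += issue_delay
    # Once an element's sense period has expired, instruct the transfer
    # operation immediately; no status check is performed first.
    out = []
    for done_at, e in sorted(schedule, key=lambda p: p[0]):
        out.append(e.transfer(at=done_at))
    return out
```

Omitting the status poll trades a round-trip on the bus for reliance on a worst-case sense-time bound, which is what reduces read latency here.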
Multi-core interconnection bus, inter-core communication method, and multi-core processor
The present invention discloses a multi-core interconnection bus, including a request transceiver module adapted to receive a data request from a processor core, and forward the data request to a snoop and caching module through a request execution module, where the data request includes a request address; the snoop and caching module adapted to look up cache data validity information of the request address, acquire data from a shared cache, and sequentially return the cache data validity information and the data acquired from the shared cache to the request execution module; and the request execution module adapted to determine, based on the cache data validity information, a target processor core whose local cache stores valid data, forward the data request to the target processor core, and receive returned data; and determine response data from the data returned by the target processor core and that returned by the snoop and caching module, and return, through the request transceiver module, the response data to the processor core that initiates the data request. The present invention also discloses a corresponding inter-core communication method and a multi-core processor.
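The response-selection step performed by the request execution module can be sketched as below. This is a heavily simplified sketch, assuming validity information is a per-core boolean map and that local and shared caches are plain lookups; the function name and data shapes are assumptions, not the patent's structures.

```python
def resolve_response(request_addr, validity_info, local_caches, shared_cache):
    # validity_info maps core id -> True if that core's local cache holds
    # valid data for the request address (the snoop result).
    for core_id, valid in validity_info.items():
        if valid and request_addr in local_caches.get(core_id, {}):
            # Prefer data returned by the target processor core.
            return local_caches[core_id][request_addr]
    # Otherwise fall back to the copy acquired from the shared cache.
    return shared_cache.get(request_addr)
```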
Main processor prefetching operands for coprocessor operations
Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.
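A minimal sketch of the master/consumer hand-off, assuming an instruction is a simple (opcode, address) pair and local read storage is a plain mapping on the consumer side; the function names, the `NEG` opcode, and the data shapes are all illustrative assumptions.

```python
def master_dispatch(instruction, memory, local_read_storage):
    # Partially decode: determine whether operand data is needed.
    opcode, addr = instruction
    if addr is not None:
        # Request the data; memory delivers it to the consumer-side
        # local read storage rather than back to the master.
        local_read_storage[addr] = memory[addr]
    return instruction  # the machine instruction is forwarded to the consumer


def consumer_execute(instruction, local_read_storage):
    opcode, addr = instruction
    # Fast access from local read storage; no trip to main memory.
    operand = local_read_storage.get(addr)
    if opcode == "NEG":
        return -operand
    return operand
```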
Write merging on stores with different tags
Techniques for caching data are provided that include receiving, by a caching system, a first write memory command for a memory address, the first write memory command associated with a first color tag, determining, by a first sub-cache of the caching system, that the memory address is not cached in the first sub-cache, determining, by a second sub-cache of the caching system, that the memory address is not cached in the second sub-cache, storing first data associated with the first write memory command in a cache line of the second sub-cache, storing the first color tag in the second sub-cache, receiving a second write memory command for the cache line, the second write memory command associated with a second color tag, merging the second color tag with the first color tag, storing the merged color tag, and evicting the cache line based on the merged color tag.
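A sketch of the tag-merge path, under the assumption that color tags are bitmasks merged with OR and that eviction selects lines whose merged tag intersects a drain mask; the bitmask representation and the class name are assumptions, as the abstract does not specify how tags are encoded or how the eviction decision is made.

```python
class WriteMergeCache:
    """Toy second sub-cache that merges color tags on repeated writes."""

    def __init__(self):
        self.lines = {}  # addr -> (data, color_tag)

    def write(self, addr, data, color_tag):
        if addr in self.lines:
            _, old_tag = self.lines[addr]
            # Second write to the same cache line: merge the new color tag
            # with the stored one and keep the merged tag.
            self.lines[addr] = (data, old_tag | color_tag)
        else:
            # Write miss in both sub-caches: allocate the line here.
            self.lines[addr] = (data, color_tag)

    def evict_by_color(self, drain_mask):
        # Evict cache lines based on the merged color tag.
        evicted = {a: d for a, (d, t) in self.lines.items() if t & drain_mask}
        for a in evicted:
            del self.lines[a]
        return evicted
```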