Patent classifications
G06F9/544
Memory system and operating method thereof
A memory system may include a nonvolatile memory device, a write buffer, and a controller. When first write data, which are grouped into a transaction, and second write data, which are not grouped into a transaction, are mixed and stored in the write buffer according to the sequence of the write data, the controller checks whether the first write data have been committed at the point of time that a flush operation is performed on the write buffer. According to the check result, the controller separates the flush operation into first and second flush operations, which do not overlap each other but are consecutive to each other, and performs the first and second flush operations. During the first flush operation, the controller may select, among the write data stored in the write buffer, the first write data which are committed and may store them in a first storage region of the nonvolatile memory device.
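As a rough illustration of the separation described in this abstract, the following Python sketch partitions a mixed write buffer at flush time. The abstract only states that committed transaction data go to a first storage region during the first flush; sending non-transaction data to a second region during the second flush, and keeping uncommitted transaction data in the buffer, are assumptions made here. The names `Entry` and `split_flush` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical buffer entry: (transaction id or None if ungrouped, payload).
Entry = Tuple[Optional[int], bytes]


def split_flush(
    write_buffer: List[Entry], committed: Set[int]
) -> Tuple[List[Entry], List[Entry], List[Entry]]:
    """Partition a mixed write buffer into two consecutive flushes.

    Returns (first_flush, second_flush, remaining), each in buffer order.
    """
    # First flush: transaction data whose transaction is already committed.
    first_flush = [e for e in write_buffer if e[0] is not None and e[0] in committed]
    # Second flush (assumption): data that are not grouped into any transaction.
    second_flush = [e for e in write_buffer if e[0] is None]
    # Assumption: uncommitted transaction data stay in the buffer for now.
    remaining = [e for e in write_buffer if e[0] is not None and e[0] not in committed]
    return first_flush, second_flush, remaining
```

The three lists are disjoint and together cover the buffer, which mirrors the requirement that the two flush operations do not overlap.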
Memory device for swapping data and operating method thereof
An operating method of a memory device, which includes a first memory region and a second memory region, includes reading first data from the first memory region and storing the read first data in a data buffer block, performing a first XOR operation on the first data provided from the data buffer block and second data read from the second memory region to generate first result data, writing the first data stored in the data buffer block in the second memory region, performing a second XOR operation on the first data and the first result data to generate the second data, storing the generated second data in the data buffer block, and writing the second data stored in the data buffer block in the first memory region.
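The sequence of steps in this abstract can be traced in a short Python sketch. It models the two memory regions as equally sized `bytearray` objects and the data buffer block as a local variable; the function name `swap_regions` and that modelling are assumptions for illustration, not the claimed hardware.

```python
def swap_regions(first_region: bytearray, second_region: bytearray) -> None:
    """Swap the contents of two equally sized regions using one buffer and XOR."""
    if len(first_region) != len(second_region):
        raise ValueError("regions must have the same size")

    # Read the first data from the first region into the data buffer block.
    data_buffer = bytes(first_region)
    # First XOR: first data (from the buffer) with second data (from region 2).
    first_result = bytes(a ^ b for a, b in zip(data_buffer, second_region))
    # Write the buffered first data into the second region.
    second_region[:] = data_buffer
    # Second XOR: first data with the first result recovers the second data.
    second_data = bytes(a ^ r for a, r in zip(data_buffer, first_result))
    # Store the recovered second data in the buffer, then write it to region 1.
    data_buffer = second_data
    first_region[:] = data_buffer
```

The second XOR works because `a ^ (a ^ b) == b`, so the second data can be regenerated after the second region has already been overwritten.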
Auto termination of applications based on application and user activity
A system and method that automatically terminates an application. A method includes monitoring activity data points for an application launched by a client device within a workspace environment. The activity data points may include user interactions with a physical interface component and background interactions occurring with the application. State data for each file associated with the application is monitored and, if a determination is made that the application is inactive based on the activity data points, the method determines if a file associated with the application includes unsaved content based on state data. If it is determined that no files for the application include unsaved content, the method forecasts whether the application will be inactive for a future period based on the activity data. The application is terminated if it is determined that no files for the application include unsaved content and the application is forecast to be inactive.
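The decision order in this abstract (inactivity check, then unsaved-content check, then forecast) might look like the following Python sketch. The abstract does not say how inactivity is measured or forecast; representing activity as a list of idle gaps in seconds, using a fixed threshold, and forecasting from the mean of the last three gaps are all assumptions, and `should_terminate` is a hypothetical name.

```python
from typing import Iterable, Sequence


def should_terminate(
    idle_gaps: Sequence[float],
    unsaved_flags: Iterable[bool],
    threshold: float = 300.0,
) -> bool:
    """Return True if the application may be terminated.

    idle_gaps: seconds of inactivity observed at successive checks (assumed form
    of the activity data points); unsaved_flags: one flag per associated file.
    """
    if not idle_gaps:
        return False
    # Step 1: is the application inactive right now?
    if idle_gaps[-1] < threshold:
        return False
    # Step 2: never terminate while any file has unsaved content.
    if any(unsaved_flags):
        return False
    # Step 3 (assumed forecast): the recent average idle gap predicts the future.
    recent = idle_gaps[-3:]
    return sum(recent) / len(recent) >= threshold
```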
Techniques to enable stateful decompression on hardware decompression acceleration engines
A hardware decompression acceleration engine including: an input buffer for receiving to-be-decompressed data from a software layer of a host computer; a decompression processing unit coupled to the input buffer for decompressing the to-be-decompressed data, the decompression processing unit further receiving first and second flags from the software layer of the host computer, wherein the first flag is indicative of a location of the to-be-decompressed data in a to-be-decompressed data block and the second flag is indicative of a presence of an intermediate state; and an output buffer for storing decompressed data from the decompression processing unit.
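A software analogue of the two flags can be sketched with Python's standard `zlib` module. Here the first flag is modelled as a position marker (`FIRST`, `MIDDLE`, `LAST`) and the second as a boolean saying whether intermediate state from an earlier chunk is supplied; these names, the function `decompress_chunk`, and the use of a `zlib` decompression object as the carried state are assumptions, not the engine's actual interface.

```python
import zlib
from typing import Any, Optional, Tuple

# Hypothetical values for the first flag: where the chunk sits in its data block.
FIRST, MIDDLE, LAST = "first", "middle", "last"


def decompress_chunk(
    chunk: bytes, position: str, has_state: bool, state: Optional[Any] = None
) -> Tuple[bytes, Optional[Any]]:
    """Decompress one chunk and return (output, state to pass to the next call)."""
    if has_state:
        if state is None:
            raise ValueError("has_state is set but no intermediate state was given")
        engine = state  # resume from the intermediate state
    else:
        engine = zlib.decompressobj()  # start of a block: fresh state
    output = engine.decompress(chunk)
    if position == LAST:
        # End of the block: drain remaining output and drop the state.
        return output + engine.flush(), None
    return output, engine
```

A caller would pass `has_state=False` for the first chunk of a block and hand the returned state back in with `has_state=True` for every later chunk.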
REAL-TIME DATA REPLICATION IN A MULTIPLE AVAILABILITY ZONE CLOUD PLATFORM
The present disclosure relates to computer-implemented methods, software, and systems for managing data replication. A request associated with storing content of a file is received at a storage service provided in a multiple availability zone cloud platform. A lock request is sent to an in-memory data grid at a first instance of the storage service to lock the file for accessing. An input stream of the file is received at the persistence interface to be read iteratively in portions. A read portion of the file is iteratively stored in a first file system storage associated with instances of the storage service at a first availability zone. The portions of the file are provided iteratively to a replication executor at the first instance of the storage service to request replication of the content of the file into a second file storage of a second availability zone of the cloud platform.
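The portion-by-portion flow in this abstract can be approximated in a few lines of Python. In this sketch a `threading.Lock` stands in for the lock held in the in-memory data grid, and two in-memory streams stand in for the first file system storage and the second-zone storage; all of these stand-ins, the function name `store_and_replicate`, and the default portion size are assumptions.

```python
import io
import threading
from typing import BinaryIO


def store_and_replicate(
    stream: BinaryIO,
    primary: BinaryIO,
    replica: BinaryIO,
    file_lock: threading.Lock,
    portion_size: int = 4096,
) -> int:
    """Read the input stream in portions, storing and replicating each one.

    Returns the total number of bytes handled.
    """
    total = 0
    with file_lock:  # stand-in for locking the file in the data grid
        while True:
            portion = stream.read(portion_size)
            if not portion:
                break
            primary.write(portion)  # store in the first availability zone
            replica.write(portion)  # hand the same portion to replication
            total += len(portion)
    return total
```

Because each portion is forwarded as soon as it is read, the whole file never has to be held in memory at once.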
HARDWARE COHERENCE FOR MEMORY CONTROLLER
A system includes a non-coherent component; a coherent, non-caching component; a coherent, caching component; and a level two (L2) cache subsystem coupled to the non-coherent component, the coherent, non-caching component, and the coherent, caching component. The L2 cache subsystem includes a L2 cache; a shadow level one (L1) main cache; a shadow L1 victim cache; and a L2 controller. The L2 controller is configured to receive and process a first transaction from the non-coherent component; receive and process a second transaction from the coherent, non-caching component; and receive and process a third transaction from the coherent, caching component.
REDUCING POWER CONSUMPTION BY HARDWARE ACCELERATOR DURING GENERATION AND TRANSMISSION OF MACHINE LEARNING INFERENCES
A hardware accelerator can receive, from a host processor, a slice of input data at a time-step. The hardware accelerator can process the input data using a machine learning model deployed on the hardware accelerator to compute a respective probability among multiple probabilities for each of multiple classes. The respective probability for each class is a likelihood that content in the slice belongs to the class. The hardware accelerator can determine, from the multiple probabilities, a preset number of highest probabilities for the slice of input data. The hardware accelerator can transmit the preset number of highest probabilities for the slice to the host processor. Related apparatus, systems, techniques and articles are also described.
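The selection step, keeping only a preset number of the highest probabilities so that less data is sent back to the host, can be illustrated in Python. The abstract does not say how the model produces probabilities; turning raw scores into probabilities with a softmax is an assumption here, and `top_k_probabilities` is a hypothetical name.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def top_k_probabilities(scores: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Return the k most probable (class index, probability) pairs, best first."""
    # Assumed model output stage: a numerically stable softmax over class scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probabilities = [e / total for e in exps]
    # Keep only the preset number of highest probabilities for transmission.
    return heapq.nlargest(k, enumerate(probabilities), key=lambda pair: pair[1])
```

With many classes and a small `k`, only `k` index/probability pairs per slice would cross the link to the host instead of the full probability vector.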
METHOD AND TENSOR TRAVERSAL ENGINE FOR STRIDED MEMORY ACCESS DURING EXECUTION OF NEURAL NETWORKS
A tensor traversal engine in a processor system comprising a source memory component and a destination memory component, the tensor traversal engine comprising: a control signal register storing a control signal for a strided data transfer operation from the source memory component to the destination memory component, the control signal comprising an initial source address, an initial destination address, a first source stride length in a first dimension, and a first source stride count in the first dimension; a source address register communicatively coupled to the control signal register; a destination address register communicatively coupled to the control signal register; a first source stride counter communicatively coupled to the control signal register; and control logic communicatively coupled to the control signal register, the source address register, and the first source stride counter.
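The effect of the four control-signal fields named in this abstract can be shown with a one-dimensional Python sketch. Memories are modelled as Python lists and addresses as list indices; writing the gathered elements contiguously at the destination is an assumption, since the abstract lists only source stride parameters, and `strided_copy` is a hypothetical name.

```python
from typing import List


def strided_copy(
    source: List[int],
    destination: List[int],
    initial_source_address: int,
    initial_destination_address: int,
    source_stride_length: int,
    source_stride_count: int,
) -> None:
    """Copy source_stride_count elements, stepping the source by the stride."""
    source_address = initial_source_address            # source address register
    destination_address = initial_destination_address  # destination address register
    for _ in range(source_stride_count):               # first source stride counter
        destination[destination_address] = source[source_address]
        source_address += source_stride_length
        destination_address += 1  # assumption: contiguous destination
```

For example, with a source holding the values 0 through 11, an initial source address of 1, a stride length of 3, and a stride count of 4, the destination receives the elements at source addresses 1, 4, 7, and 10.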
LOCK-FREE WORK-STEALING THREAD SCHEDULER
Systems and methods are provided for lock-free thread scheduling. Threads may be placed in a ring buffer shared by all computer processing units (CPUs), e.g., in a node. A thread assigned to a CPU may be placed in the CPU's local run queue. However, when a CPU's local run queue is cleared, that CPU checks the shared ring buffer to determine if any threads are waiting to run on that CPU, and if so, the CPU pulls a batch of threads related to that ready-to-run thread to execute. If not, an idle CPU randomly selects another CPU to steal threads from, and the idle CPU attempts to dequeue a thread batch associated with the CPU from the shared ring buffer. Polling may be handled through the use of a shared poller array to dynamically distribute polling across multiple CPUs.
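The scheduling order in this abstract (local run queue, then the CPU's own batches in the shared buffer, then a randomly chosen victim's batches) can be simulated in Python. This sketch is single-threaded and uses ordinary `deque` objects, so it shows only the decision logic and is not lock-free; the class name `Scheduler`, its methods, and the batch representation are assumptions.

```python
import random
from collections import deque
from typing import Deque, List, Optional, Tuple


class Scheduler:
    """Single-threaded model of the work-stealing decision logic."""

    def __init__(self, cpu_count: int, seed: int = 0) -> None:
        self.local: List[Deque[str]] = [deque() for _ in range(cpu_count)]
        # Shared buffer of (owning CPU, batch of thread names); a plain deque
        # stands in for the shared ring buffer.
        self.shared: Deque[Tuple[int, List[str]]] = deque()
        self.rng = random.Random(seed)

    def submit(self, cpu: int, batch: List[str]) -> None:
        self.shared.append((cpu, list(batch)))

    def _take_batch_for(self, cpu: int) -> Optional[List[str]]:
        for index, (owner, batch) in enumerate(self.shared):
            if owner == cpu:
                del self.shared[index]
                return batch
        return None

    def next_thread(self, cpu: int) -> Optional[str]:
        if not self.local[cpu]:
            # Local queue is empty: look for this CPU's own waiting batch first.
            batch = self._take_batch_for(cpu)
            if batch is None and len(self.local) > 1:
                # Nothing waiting: pick a random victim and try to steal a batch.
                others = [c for c in range(len(self.local)) if c != cpu]
                batch = self._take_batch_for(self.rng.choice(others))
            if batch:
                self.local[cpu].extend(batch)
        return self.local[cpu].popleft() if self.local[cpu] else None
```

A real lock-free implementation would replace the deque operations with atomic compare-and-swap updates on ring buffer indices, which this model does not attempt.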
SYSTEM COHERENCY PROTOCOL
Embodiments herein describe a coherency protocol for a distributed computing topology that permits large stalls on various interfaces. In one embodiment, the computing topology includes multiple boards which each contain multiple processors. When a particular core on a processor wants access to data that is not currently stored in its cache, the core can first initiate a request to search for the cache line in the caches for other cores on the same processor. If the cache line is not found, the cache coherency protocol permits the processor to then broadcast a request to the other processors on the same board. If a processor on the same board does not have the data, the processor can then broadcast the request to the other boards in the system. The processors in those boards can then search their caches to identify the data.
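The widening search in this abstract (same processor, then same board, then other boards) can be modelled in Python. The topology is represented as nested lists indexed by board, processor, and core, with each core's cache as a set of line addresses; that representation and the name `find_cache_line` are assumptions, and the sketch ignores timing, stalls, and coherence states.

```python
from typing import List, Optional, Set, Tuple

# topology[board][processor][core] is the set of cache line addresses held there.
Topology = List[List[List[Set[int]]]]
Location = Tuple[int, int, int]


def find_cache_line(
    topology: Topology, board: int, processor: int, core: int, address: int
) -> Optional[Tuple[str, Location]]:
    """Search outward from the requesting core; return (scope, location) or None."""
    # Scope 1: other cores on the same processor.
    for c, cache in enumerate(topology[board][processor]):
        if c != core and address in cache:
            return "processor", (board, processor, c)
    # Scope 2: broadcast to the other processors on the same board.
    for p, cores in enumerate(topology[board]):
        if p == processor:
            continue
        for c, cache in enumerate(cores):
            if address in cache:
                return "board", (board, p, c)
    # Scope 3: broadcast to the other boards in the system.
    for b, processors in enumerate(topology):
        if b == board:
            continue
        for p, cores in enumerate(processors):
            for c, cache in enumerate(cores):
                if address in cache:
                    return "system", (b, p, c)
    return None
```

Each scope is consulted only after the narrower one misses, which keeps most requests off the board-to-board interfaces.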