G06F12/0692

Compiler Global Memory Access Optimization In Code Regions Using Most Appropriate Base Pointer Registers
20170337142 · 2017-11-23

A processing device includes a target processor instruction memory to store a plurality of target processor instructions that include a plurality of global memory access instructions. The processing device further includes a compiler to communicate with the target processor instruction memory, the compiler including: a global variable candidate detection module to identify a global memory access instruction within a set of code regions that use a set of global variable candidates to access a global memory, and a memory access optimization module to modify the global memory access instruction, wherein the modified global memory access instruction utilizes an unused base pointer register of a set of unused base pointer register candidates within the set of code regions, a global variable from the set of global variable candidates to be used as a base address, and an offset relative to the base address to access the global memory.
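The transformation the abstract describes can be sketched in miniature: one global's address is loaded once into an otherwise-unused base pointer register, and the remaining globals in the region are reached as small offsets from it. The symbol table, instruction names, and register name below are assumptions for illustration, not the patent's actual encoding.

```python
# Hypothetical sketch of the base-pointer optimization described above:
# rather than materializing a full absolute address for every global access,
# one global variable candidate is chosen as the base and the other globals
# are addressed via an offset relative to that base.

GLOBALS = {"g_a": 0x1000, "g_b": 0x1004, "g_c": 0x1010}  # symbol -> address

def naive_code(accesses):
    """One 'load from absolute address' pseudo-instruction per access."""
    return [("LOAD_ABS", GLOBALS[sym]) for sym in accesses]

def optimized_code(accesses, base_sym="g_a", base_reg="r9"):
    """Load the base address once into an unused register, then base+offset."""
    base = GLOBALS[base_sym]
    code = [("SETUP_BASE", base_reg, base)]
    for sym in accesses:
        code.append(("LOAD_OFFSET", base_reg, GLOBALS[sym] - base))
    return code
```

In the optimized form each access needs only a short offset field instead of a full-width address, which is the space and instruction-count saving the abstract targets.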

Modifying subsets of memory bank operating parameters

Methods, systems, and devices for modifying subsets of memory bank operating parameters are described. First global trimming information may be configured to adjust a first subset of operating parameters for a set of memory banks within a memory system. Second global trimming information may be configured to adjust a second subset of operating parameters for the set of memory banks. Local trimming information may be used to adjust one of the subsets of the operating parameters for a subset of the memory banks. To adjust one of the subsets of the operating parameters, the local trimming information may be combined with one of the first or second global trimming information to yield additional local trimming information that is used to adjust a corresponding subset of the operating parameters at the subset of the memory banks.
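A minimal sketch of the combining step, assuming an additive combination rule and invented parameter names (the abstract does not specify either): local trimming information is merged with one set of global trimming information to yield the effective adjustment for a bank subset.

```python
# Assumed rule: additional local trimming = global trimming + local delta,
# applied per operating parameter for the affected subset of memory banks.

def effective_trim(global_trim, local_trim):
    """Combine one set of global trimming info with local trimming info."""
    return {param: global_trim[param] + local_trim.get(param, 0)
            for param in global_trim}

first_global = {"read_voltage": 3, "sense_time": 1}  # applies to all banks
local = {"read_voltage": -1}                         # bank-subset correction
```

Here the bank subset would run with `read_voltage` adjusted by 2 while untouched parameters keep their global value.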

Data processing apparatus and data processing method
09804958 · 2017-10-31

A data processing apparatus for accessing a plurality of memories is provided. The data processing apparatus includes a function control circuitry and an address generation circuitry. The function control circuitry is utilized to record a first memory address where a first function is implemented after the first function is implemented and to determine which one of the plurality of memories is a target memory according to the first memory address. The address generation circuitry is utilized to output the first memory address to the target memory. In addition, the function control circuitry is configured to determine the target memory in the same processing cycle in which the address generation circuitry is configured to output the first memory address.
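Functionally, the selection step amounts to a range lookup on the recorded address; the sketch below stubs this in software (the memory map and names are assumptions), with target selection and address output produced together, mirroring the same-cycle behavior claimed above.

```python
# Assumed address map: each memory owns a contiguous address range.
MEMORY_MAP = [(0x0000, 0x3FFF, "mem0"), (0x4000, 0x7FFF, "mem1")]

def dispatch(first_memory_address):
    """Pick the target memory for the recorded address and forward it."""
    for lo, hi, name in MEMORY_MAP:
        if lo <= first_memory_address <= hi:
            return name, first_memory_address   # selection + output together
    raise ValueError("address not mapped")
```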

METHOD OF ACCESSING STORAGE DEVICE INCLUDING NONVOLATILE MEMORY DEVICE AND CONTROLLER
20170308464 · 2017-10-26 ·

Aspects of the inventive concept relate to a method of accessing a storage device including a nonvolatile memory device and a controller. The method includes writing user data, a first logical address, and a second logical address associated with the user data in a storage space corresponding to the first logical address of the nonvolatile memory device. The user data is update data that updates previous data written in the nonvolatile memory device. The second logical address is the logical address of the storage space of the nonvolatile memory device in which the previous data is written.
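The write layout can be sketched as follows (the dictionary shapes and names are invented for illustration): each update written at the first logical address also records the second logical address, pointing at where the now-stale previous copy lives.

```python
# Sketch of the described write: user data is stored together with both
# logical addresses in the space addressed by the first logical address.

def write_update(storage, first_la, second_la, user_data):
    """Write update data plus both logical addresses at first_la."""
    storage[first_la] = {"data": user_data,
                         "first_la": first_la,
                         "second_la": second_la}

storage = {3: {"data": b"old", "first_la": 3, "second_la": None}}
write_update(storage, first_la=7, second_la=3, user_data=b"new")
```

After the write, the entry at logical address 7 carries enough information for the controller to locate the superseded data at logical address 3.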

OPPORTUNISTIC MEMORY TUNING FOR DYNAMIC WORKLOADS

Technology relating to tuning for operating memory devices is disclosed. The technology includes a computing device that selectively configures operating parameters for at least one operating memory device based at least in part on performance characteristics for an application or other workload that the computing device has been requested to execute. This technology may be implemented, at least in part, in firmware via a Basic Input/Output System (BIOS) or Unified Extensible Firmware Interface (UEFI) of the computing device. Further, this technology may be employed by a computing device that is executing workloads on behalf of a distributed computing system, e.g., in a data center. Such data centers may include, for example, thousands of computing devices and even more operating memory devices.

Technologies for dividing work across accelerator devices

Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute device is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks to obtain an output of the job.
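One plausible division rule, sketched here with invented configuration fields (the abstract does not fix one), is to slice the job's work items across accelerators in proportion to a per-device capacity taken from each configuration.

```python
# Assumed division policy: contiguous slices of work, sized by each
# accelerator's relative capacity from its configuration.

def divide_job(work_items, accel_configs):
    """Split work_items into per-accelerator tasks by relative capacity."""
    total = sum(cfg["capacity"] for cfg in accel_configs.values())
    tasks, start = {}, 0
    devices = list(accel_configs)
    for i, dev in enumerate(devices):
        share = len(work_items) * accel_configs[dev]["capacity"] // total
        # The last device absorbs any rounding remainder.
        end = len(work_items) if i == len(devices) - 1 else start + share
        tasks[dev] = work_items[start:end]
        start = end
    return tasks
```

A scheduler would then dispatch each slice to its device and gather the outputs into the job's result.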

Scalable synchronization mechanism for distributed memory

A method comprising receiving control information at a first processing element from a second processing element, synchronizing objects within a shared global memory space of the first processing element with a shared global memory space of a second processing element in response to receiving the control information and generating a completion event indicating the first processing element has been synchronized with the second processing element.
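The claimed flow has three steps — receive control information, synchronize the shared objects, emit a completion event — which can be sketched with invented data shapes as follows.

```python
# Minimal sketch: when control information arrives at the first processing
# element, its shared global memory space is synchronized from the second
# element's copy, and a completion event is generated.

def on_control_info(first_shared, second_shared, events):
    """Synchronize shared-memory objects and record a completion event."""
    first_shared.update(second_shared)   # synchronize objects
    events.append("sync_complete")       # completion event for observers

first = {"x": 0}
second = {"x": 42, "y": 1}
events = []
on_control_info(first, second, events)
```

The completion event is what lets other elements (or a runtime) know the pair is consistent without polling the memory itself.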

System and method for augmenting an existing artificial neural network
11238331 · 2022-02-01

A novel and useful augmented artificial neural network (ANN) incorporating an existing artificial neural network (ANN) coupled to a supplemental ANN and a first-in first-out (FIFO) stack for storing historical output values of the network. The augmented ANN exploits the redundant nature of information present in an input data stream. The addition of the supplemental ANN along with a FIFO enables the augmented network to look back into the past in making a decision for the current frame. It provides context-aware object presence detection and lowers the rate of false detections and misdetections. The output of the existing ANN is stored in a FIFO to create a lookahead system in which both past output values of the supplemental ANN and 'future' values of the output of the existing ANN are used in making a decision for the current frame. In addition, the mechanism does not require retraining the entire neural network, nor does it require data set labeling.
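The look-back/look-ahead arrangement can be sketched with stand-in "networks" (simple functions, since the abstract fixes no architecture): a FIFO buffers the existing ANN's outputs, and the supplemental stage combines its own past decisions with the buffered 'future' values.

```python
from collections import deque

def existing_ann(frame):
    """Placeholder for the existing ANN; real output would be detections."""
    return frame * 2.0

def supplemental_ann(history, future):
    """Decide for the current frame from past supplemental outputs ("history")
    and buffered existing-ANN outputs ("future"). Averaging is an assumption."""
    return (sum(history) + sum(future)) / (len(history) + len(future))

def run(frames, depth=3):
    fifo = deque(maxlen=depth)          # FIFO of existing-ANN outputs
    history = deque([0.0], maxlen=depth)  # past supplemental decisions
    decisions = []
    for f in frames:
        fifo.append(existing_ann(f))
        d = supplemental_ann(history, fifo)
        history.append(d)
        decisions.append(d)
    return decisions
```

Because only the supplemental stage is new, the existing ANN's weights stay untouched, matching the no-retraining claim.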

MULTI-VALUE MAPPING FOR OBJECT STORE

A method for mapping an object store may include storing a data entry within a mapping page for an object in the object store, wherein the data entry may include a key and a value, and the value may include an address for the object in the object store. The method may further include storing multiple data entries within the mapping page for multiple corresponding objects in the object store, wherein each data entry may include a key and one or more values for a corresponding object in the object store, and each value may include an address for the corresponding object in the object store. The data entries may be part of a mapping data structure which may include nodes, and each node may be stored within a mapping page.
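An illustrative sketch of one mapping page (the class and field names are invented): each data entry pairs a key with one or more values, and every value is an address for the corresponding object in the store.

```python
# Sketch of a mapping page holding multi-value data entries.
class MappingPage:
    def __init__(self):
        self.entries = {}                 # key -> list of object addresses

    def store(self, key, *addresses):
        """Add one or more address values to the entry for key."""
        self.entries.setdefault(key, []).extend(addresses)

    def lookup(self, key):
        return self.entries.get(key, [])

page = MappingPage()
page.store("obj-1", 0x100)
page.store("obj-2", 0x200, 0x240)         # multi-value entry
```

In the full structure described above, such pages would serve as the nodes of the mapping data structure, each node persisted within one page.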

Deep learning approach to mitigate the cold-start problem in textual items recommendations
11210358 · 2021-12-28

A method for mitigating cold starts in recommendations includes receiving a request that identifies a requested page and identifying a content vector of the requested page. The content vector is generated based on providing text of the requested page to a neural network text encoder. The method further includes selecting, based on a rank engine and the content vector, a link to a cold start page that does not satisfy a threshold level of interaction data. The rank engine ranks the selected link above a second link to a warm page that does satisfy the threshold level of the interaction data. The method further includes presenting the requested page with the selected link.
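A hedged sketch of the ranking step: candidate links are scored by similarity between the requested page's content vector and each candidate's vector, with a boost for cold pages so they can outrank warm ones. The encoder is stubbed and the scoring rule, threshold, and names are assumptions for illustration.

```python
def dot(a, b):
    """Similarity between two content vectors (dot product, as a stand-in)."""
    return sum(x * y for x, y in zip(a, b))

def rank_links(request_vec, candidates, interaction_threshold=100, boost=2.0):
    """candidates: list of (link, content_vector, interaction_count).
    Returns links best-first; cold pages (below threshold) get a boost."""
    def score(cand):
        link, vec, interactions = cand
        s = dot(request_vec, vec)
        if interactions < interaction_threshold:   # cold-start page
            s *= boost
        return s
    return [link for link, _, _ in sorted(candidates, key=score, reverse=True)]

candidates = [
    ("warm-page", [1.0, 0.0], 500),   # satisfies the interaction threshold
    ("cold-page", [0.9, 0.1], 5),     # cold: little interaction data
]
```

With the request vector `[1.0, 0.0]`, the cold page's boosted score (1.8) outranks the warm page's (1.0), illustrating how the rank engine can surface cold-start content.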