G06F5/06

Semiconductor device with first-in-first-out circuit
11805638 · 2023-10-31 ·

Apparatuses including a first-in first-out circuit are described. An example apparatus includes: a first-in first-out circuit including a first latch, a second latch and a logic circuit coupled in series. The first latch receives first data and latches the first data responsive to a first input pointer signal. The second latch receives the latched first data from the first latch and latches the received first data responsive to a second input pointer signal that has a different phase from the first input pointer signal and thus provides a second data. The logic circuit receives the second data and an output pointer signal and further provides an output data responsive to the output pointer signal.

Semiconductor device with first-in-first-out circuit
11805638 · 2023-10-31 ·

Apparatuses including a first-in first-out circuit are described. An example apparatus includes: a first-in first-out circuit including a first latch, a second latch and a logic circuit coupled in series. The first latch receives first data and latches the first data responsive to a first input pointer signal. The second latch receives the latched first data from the first latch and latches the received first data responsive to a second input pointer signal that has a different phase from the first input pointer signal and thus provides a second data. The logic circuit receives the second data and an output pointer signal and further provides an output data responsive to the output pointer signal.

Microthreading for accelerated deep learning

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of compute elements and routers performs flow-based computations on wavelets of data. Some instructions are performed in iterations, such as one iteration per element of a fabric vector or FIFO. When sources for an iteration of an instruction are unavailable, and/or there is insufficient space to store results of the iteration, indicators associated with operands of the instruction are checked to determine whether other work can be performed. In some scenarios, other work cannot be performed and processing stalls. Alternatively, information about the instruction is saved, the other work is performed, and sometime after the sources become available and/or sufficient space to store the results becomes available, the iteration is performed using the saved information.

Microthreading for accelerated deep learning

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of compute elements and routers performs flow-based computations on wavelets of data. Some instructions are performed in iterations, such as one iteration per element of a fabric vector or FIFO. When sources for an iteration of an instruction are unavailable, and/or there is insufficient space to store results of the iteration, indicators associated with operands of the instruction are checked to determine whether other work can be performed. In some scenarios, other work cannot be performed and processing stalls. Alternatively, information about the instruction is saved, the other work is performed, and sometime after the sources become available and/or sufficient space to store the results becomes available, the iteration is performed using the saved information.

Systems, apparatus, and methods for transmitting image data
11470282 · 2022-10-11 · ·

Examples described relate to systems, methods, and apparatus for transmitting image data. The apparatus may comprise a memory buffer configured to store data elements of an image frame and generate a signal indicating that each of the data elements of the image frame have been written to the memory buffer, a processor configured to initiate a flush operation for reading out at least one data element of the image frame from the memory buffer and to output the at least one data element at a first rate, a rate adjustment unit configured to receive the at least one data element from the processor at the first rate and to output the at least one data element at a second rate, and a multiplexer configured to receive the at least one data element from the processor at the first rate and configured to receive the at least one data element from the rate adjustment unit at the second rate. The multiplexer may select the at least one data element at the second rate for transmitting on a bus in response to receiving the signal from the memory buffer.

TASK ACTIVATING FOR ACCELERATED DEEP LEARNING

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a compute element and a routing element. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Routing is controlled by virtual channel specifiers in each wavelet and routing configuration information in each router. Execution of an activate instruction or completion of a fabric vector operation activates one of the virtual channels. A virtual channel is selected from a pool comprising previously activated virtual channels and virtual channels associated with previously received wavelets. A task corresponding to the selected virtual channel is activated by executing instructions corresponding to the selected virtual channel.

TASK ACTIVATING FOR ACCELERATED DEEP LEARNING

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a compute element and a routing element. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Routing is controlled by virtual channel specifiers in each wavelet and routing configuration information in each router. Execution of an activate instruction or completion of a fabric vector operation activates one of the virtual channels. A virtual channel is selected from a pool comprising previously activated virtual channels and virtual channels associated with previously received wavelets. A task corresponding to the selected virtual channel is activated by executing instructions corresponding to the selected virtual channel.

Converting a Stream of Data Using a Lookaside Buffer
20220283809 · 2022-09-08 ·

A stream of data is accessed from a memory system by an autonomous memory access engine, converted on the fly by the memory access engine, and then presented to a processor for data processing. A portion of a lookup table (LUT) containing converted data elements is preloaded into a lookaside buffer associated with the memory access engine. As the stream of data elements is fetched from the memory system each data element in the stream of data elements is replaced with a respective converted data element obtained from the LUT in the lookaside buffer according to a content of each data element to thereby form a stream of converted data elements. The stream of converted data elements is then propagated from the memory access engine to a data processor.

Individually addressing memory devices disconnected from a data bus

Memory devices and methods for operating the same are provided. A memory device can include at least one command contact and at least one data contact. The memory device can be configured to detect a condition in which the at least one command contact is connected to a controller and the at least one data contact is disconnected from the controller, and to enter, based at least in part on detecting the condition, a first operating mode with a lower nominal power rating than a second operating mode. Memory modules including one or more such memory devices can be provided, and memory systems including controllers and such memory modules can also be provided.

Individually addressing memory devices disconnected from a data bus

Memory devices and methods for operating the same are provided. A memory device can include at least one command contact and at least one data contact. The memory device can be configured to detect a condition in which the at least one command contact is connected to a controller and the at least one data contact is disconnected from the controller, and to enter, based at least in part on detecting the condition, a first operating mode with a lower nominal power rating than a second operating mode. Memory modules including one or more such memory devices can be provided, and memory systems including controllers and such memory modules can also be provided.