H04L49/506

MICROTHREADING FOR ACCELERATED DEEP LEARNING

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of compute elements and routers performs flow-based computations on wavelets of data. Some instructions are performed in iterations, such as one iteration per element of a fabric vector or FIFO. When sources for an iteration of an instruction are unavailable, and/or there is insufficient space to store results of the iteration, indicators associated with operands of the instruction are checked to determine whether other work can be performed. In some scenarios, other work cannot be performed and processing stalls. Alternatively, information about the instruction is saved, the other work is performed, and sometime after the sources become available and/or sufficient space to store the results becomes available, the iteration is performed using the saved information.
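The save-and-resume behavior described above can be pictured with a minimal sketch; the class and method names (`Microthreader`, `step`, `resume_all`) are illustrative assumptions, not taken from the source:

```python
from collections import deque

class Microthreader:
    """Sketch: if an iteration's sources are unavailable (or there is no
    room for its results), save the iteration and try other work."""

    def __init__(self):
        self.suspended = deque()   # saved iteration state
        self.other_work = deque()  # independently runnable work

    def step(self, iteration, sources_ready, space_for_results):
        if sources_ready and space_for_results:
            return iteration()            # perform the iteration now
        self.suspended.append(iteration)  # save information about it
        if self.other_work:
            return self.other_work.popleft()()  # do other work instead
        return "stall"                    # nothing else runnable

    def resume_all(self):
        # Called after sources/space become available again.
        results = []
        while self.suspended:
            results.append(self.suspended.popleft()())
        return results
```

When the sources for an iteration are missing, the iteration is parked rather than busy-waited on, so the compute element stays utilized whenever other work exists.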

Methods and apparatus for flow control associated with a switch fabric

In some embodiments, an apparatus includes a flow control module configured to receive a first data packet from an output queue of a stage of a multi-stage switch at a first rate when an available capacity of the output queue crosses a first threshold. The flow control module is configured to receive a second data packet from the output queue at a second rate when the available capacity of the output queue crosses a second threshold. The flow control module is configured to send a flow control signal to an edge device of the multi-stage switch from which the first data packet or the second data packet entered the multi-stage switch.
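The two-threshold behavior can be sketched as a rate-selection function. The function name and the existence of a default rate are assumptions for illustration; available capacity shrinks as the queue fills, so crossing a lower threshold selects a different rate:

```python
def drain_rate(available_capacity, first_threshold, second_threshold,
               rate_0, rate_1, rate_2):
    """Select the rate at which the flow control module receives
    packets from the output queue, based on which threshold the
    queue's available capacity has crossed."""
    assert second_threshold < first_threshold
    if available_capacity <= second_threshold:
        return rate_2          # second (more severe) threshold crossed
    if available_capacity <= first_threshold:
        return rate_1          # first threshold crossed
    return rate_0              # neither threshold crossed
```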

Secure In-line Network Packet Transmittal
20200099669 · 2020-03-26

A network processor provides for in-line encryption and decryption of received and transmitted packets. For packet transmittal, a processor core generates packet data for encryption and forwards an encryption instruction to a cryptographic unit. The cryptographic unit generates an encrypted packet, and enqueues a send descriptor to a network interface controller, which, in turn, constructs and transmits an outgoing packet. For received encrypted packets, the network interface controller communicates with the cryptographic unit to decrypt the packet prior to enqueuing work to the processor core, thereby providing the processor core with a decrypted packet.
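The transmit and receive paths (core → cryptographic unit → NIC, and the reverse) might be sketched as below. The XOR cipher is only a stand-in for the real cryptographic unit, and all class and field names are hypothetical:

```python
def toy_cipher(data: bytes, key: bytes) -> bytes:
    # XOR stand-in for the cryptographic unit (NOT real encryption).
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class Nic:
    def __init__(self):
        self.wire = []

    def enqueue_send(self, descriptor):
        # Construct and "transmit" the outgoing packet.
        self.wire.append(b"HDR" + descriptor["data"])

    def receive(self, packet: bytes, key: bytes) -> bytes:
        # Decrypt in-line before handing work to the processor core,
        # so the core only ever sees a decrypted packet.
        return toy_cipher(packet[3:], key)

class CryptoUnit:
    def __init__(self, key: bytes, nic: Nic):
        self.key, self.nic = key, nic

    def encrypt_instruction(self, packet_data: bytes):
        encrypted = toy_cipher(packet_data, self.key)
        # Enqueue a send descriptor to the network interface controller.
        self.nic.enqueue_send({"len": len(encrypted), "data": encrypted})
```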

SYSTEM, APPARATUS AND METHOD FOR COMMUNICATING TELEMETRY INFORMATION VIA VIRTUAL BUS ENCODINGS
20200050570 · 2020-02-13

In one embodiment, an apparatus comprises: an endpoint circuit to perform an endpoint operation on behalf of a host processor; and an input/output circuit coupled to the endpoint circuit to receive telemetry information from the endpoint circuit, encode the telemetry information into a virtual bus encoding, place the virtual bus encoding into a payload field of a control message, and communicate the control message having the payload field including the virtual bus encoding to an upstream device. Other embodiments are described and claimed.
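One way to picture the encoding step is below. The 8-bit bus id / 24-bit value layout and the message fields are illustrative assumptions only; the abstract does not specify the encoding:

```python
import struct

def encode_virtual_bus(bus_id: int, telemetry_value: int) -> bytes:
    # Hypothetical layout: 8-bit virtual bus id, 24-bit telemetry value.
    assert 0 <= bus_id < 256 and 0 <= telemetry_value < 1 << 24
    return struct.pack(">I", (bus_id << 24) | telemetry_value)

def make_control_message(encoding: bytes) -> dict:
    # Place the virtual bus encoding into the payload field of a
    # control message destined for the upstream device.
    return {"direction": "upstream", "payload": encoding}

def decode_virtual_bus(payload: bytes):
    word, = struct.unpack(">I", payload)
    return word >> 24, word & 0xFFFFFF
```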

Methods and apparatus for flow control associated with a switch fabric

In some embodiments, an apparatus includes a switch fabric having at least a first switch stage and a second switch stage, an edge device operatively coupled to the switch fabric, and a management module. The edge device is configured to send a first portion of a data stream to the switch fabric such that the first portion of the data stream is received at a queue of the second switch stage via the first switch stage. In response to the first portion of the data stream being received at the queue, the management module is configured to send a flow control signal that triggers the edge device to suspend transmission of a second portion of the data stream when a congestion level of the queue of the second switch stage satisfies a condition.
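The management-module behavior might be sketched as follows; treating the congestion condition as a simple occupancy threshold, and all names, are assumptions:

```python
class EdgeDevice:
    def __init__(self):
        self.transmitting = True

    def on_flow_control(self):
        # Suspend transmission of the second portion of the stream.
        self.transmitting = False

class ManagementModule:
    """Sends a flow control signal to the edge device when the
    second-stage queue's congestion satisfies a condition -- modeled
    here as a fractional occupancy threshold (assumption)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold

    def on_portion_received(self, queue_depth, queue_capacity, edge):
        if queue_depth / queue_capacity >= self.threshold:
            edge.on_flow_control()
```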

Network packet microburst detection via network switching device hardware supporting quantized congestion notification

Hardware of a network switching device supports quantized congestion notification (QCN) to notify senders of network packets received at the network switching device that the network switching device is experiencing congestion. The hardware is instead programmed to notify a processor of the network switching device of the congestion at an egress queue of the network switching device. The processor receives a congestion notification message (CNM) from the hardware that the hardware has detected the congestion at the egress queue. Responsive to receiving the CNM from the hardware, the processor detects a microburst of the network packets at the egress queue of the network switching device.
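A sketch of how the processor might turn the repurposed CNMs into a microburst signal; the clustering window and event count are invented heuristics, not taken from the source:

```python
from collections import deque

class MicroburstDetector:
    """Flags a microburst at an egress queue when enough CNMs arrive
    within a short window (window/count values are illustrative)."""

    def __init__(self, window_s=0.001, min_events=3):
        self.window_s = window_s
        self.min_events = min_events
        self.events = deque()

    def on_cnm(self, timestamp_s: float) -> bool:
        # Called when the hardware delivers a CNM to the processor.
        self.events.append(timestamp_s)
        while self.events and timestamp_s - self.events[0] > self.window_s:
            self.events.popleft()       # drop events outside the window
        return len(self.events) >= self.min_events
```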

NETWORK DEVICE LEVEL OPTIMIZATIONS FOR LATENCY SENSITIVE RDMA TRAFFIC

Discussed herein is a framework that provides customized processing for different classes of traffic. A network device in a communication path between a source host machine and a destination host machine extracts a tag from a packet received by the network device. The packet originates at a source executing on the source host machine and is destined for the destination host machine. The tag is set by the source and indicates a first traffic class to be associated with the packet, the first traffic class being selected by the source from a plurality of traffic classes. The network device determines, based on the tag, that the first traffic class corresponds to latency-sensitive traffic and processes the packet using one or more settings configured at the network device for processing packets associated with the first traffic class.
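The tag-to-settings lookup could be sketched like this; the tag field, class names, and per-class settings are all hypothetical:

```python
LATENCY_SENSITIVE = "latency_sensitive"
BULK = "bulk"

# Per-class settings configured at the network device (illustrative).
CLASS_SETTINGS = {
    LATENCY_SENSITIVE: {"queue": "high_priority", "ecn": True},
    BULK:              {"queue": "best_effort",   "ecn": False},
}

def process_packet(packet: dict) -> dict:
    # Extract the tag set by the source and map it to a traffic class;
    # untagged/unknown traffic falls back to the bulk class.
    traffic_class = packet.get("tag", BULK)
    settings = CLASS_SETTINGS.get(traffic_class, CLASS_SETTINGS[BULK])
    return {"packet": packet, "settings": settings}
```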

End to end flow control

A network device implementing the subject system for end to end flow control may include at least one processor circuit that may be configured to detect that congestion is being experienced by at least one queue of a port and to identify another network device that is transmitting the downstream traffic queued at the at least one queue of the port, the traffic at least partially causing the congestion. The at least one processor circuit may be further configured to generate an end to end flow control message that comprises an identifier of the port, the message indicating that the downstream traffic should be flow controlled at the other network device. The at least one processor circuit may be further configured to transmit, out-of-band and through at least one intermediary network device, the end to end flow control message to the other network device.
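A sketch of building and consuming such a message; the field names and the modeling of the out-of-band transport as a direct call are assumptions:

```python
def build_e2e_flow_control(port_id: int, suspend: bool) -> dict:
    # Message identifying the congested port; carried out-of-band,
    # through intermediary devices, to the upstream network device.
    return {"type": "e2e_fc", "port": port_id,
            "action": "suspend" if suspend else "resume"}

class UpstreamDevice:
    """The device whose downstream traffic is to be flow controlled."""

    def __init__(self):
        self.paused_ports = set()

    def on_e2e_flow_control(self, msg: dict):
        if msg["action"] == "suspend":
            self.paused_ports.add(msg["port"])
        else:
            self.paused_ports.discard(msg["port"])
```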

SIGNALING TO SUPPORT SCHEDULING IN AN INTEGRATED ACCESS AND BACKHAUL SYSTEM
20190357117 · 2019-11-21

In accordance with an example embodiment of the present invention, an apparatus comprising: at least one processor; and at least one memory including computer program code, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus to perform at least the following: allocate a physical uplink channel between at least one integrated access and backhaul node user equipment function and a parent distributed unit; and send at least one message via the physical uplink channel, wherein the at least one message includes at least: a destination queue depth scheduled on a downlink by at least one integrated access and backhaul node distributed unit.
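One plausible use of the reported per-destination queue depths is proportional grant allocation at the parent distributed unit. This scheduling policy is an illustrative assumption, not something the abstract claims:

```python
def allocate_uplink(total_resources: int, queue_depths: dict) -> dict:
    """Hypothetical policy: the parent DU splits uplink resources among
    destinations in proportion to the downlink queue depths reported
    by the IAB node over the physical uplink channel."""
    total = sum(queue_depths.values())
    if total == 0:
        return {dest: 0 for dest in queue_depths}
    return {dest: total_resources * depth // total
            for dest, depth in queue_depths.items()}
```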

Backpressure for Accelerated Deep Learning

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element comprises a respective compute element and a respective routing element. Each compute element comprises virtual input queues. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Routing is controlled by respective virtual channel specifiers in each wavelet and routing configuration information in each router. Each router comprises data queues. The virtual input queues of the compute element and the data queues of the router are managed in accordance with the virtual channels. Backpressure information, per each of the virtual channels, is generated, communicated, and used to prevent overrun of the virtual input queues and the data queues.
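Per-virtual-channel backpressure could be sketched as below; the queue capacity, the boolean backpressure signal, and all names are invented for illustration:

```python
from collections import deque

class VirtualChannel:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.q = deque()

    @property
    def backpressured(self) -> bool:
        # Communicated upstream, per virtual channel, to prevent
        # overrun of this data queue.
        return len(self.q) >= self.capacity

class Router:
    def __init__(self, num_vcs: int, capacity: int = 4):
        self.vcs = [VirtualChannel(capacity) for _ in range(num_vcs)]

    def accept(self, wavelet: dict) -> bool:
        # Each wavelet carries a virtual channel specifier that
        # selects which data queue it joins.
        vc = self.vcs[wavelet["vc"]]
        if vc.backpressured:
            return False          # sender must hold the wavelet
        vc.q.append(wavelet)
        return True
```

Because backpressure is tracked per virtual channel, a full queue on one channel does not block wavelets on another.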