Patent classifications
H04L49/3045
Microthreading for accelerated deep learning
Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of compute elements and routers performs flow-based computations on wavelets of data. Some instructions are performed in iterations, such as one iteration per element of a fabric vector or FIFO. When sources for an iteration of an instruction are unavailable, and/or there is insufficient space to store results of the iteration, indicators associated with operands of the instruction are checked to determine whether other work can be performed. In some scenarios, other work cannot be performed and processing stalls. Alternatively, information about the instruction is saved, the other work is performed, and some time after the sources become available and/or sufficient space to store the results becomes available, the iteration is performed using the saved information.
Task activating for accelerated deep learning
Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a compute element and a routing element. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Routing is controlled by virtual channel specifiers in each wavelet and routing configuration information in each router. Execution of an activate instruction or completion of a fabric vector operation activates one of the virtual channels. A virtual channel is selected from a pool comprising previously activated virtual channels and virtual channels associated with previously received wavelets. A task corresponding to the selected virtual channel is activated by executing instructions corresponding to the selected virtual channel.
Circuit within switch and method for managing memory within switch
The present invention provides a circuit within a switch, wherein the circuit includes a memory and a control circuit. The memory includes at least a first area and a second area: the first area provides a minimum guaranteed storage space for each of a plurality of egress queues, and the second area provides a space shared by the plurality of egress queues. The control circuit is coupled to the memory, and when an input port of the switch receives an input packet and stores the input packet into the memory, the control circuit dynamically determines a size of the second area according to the number of egress queues to which the input packet is forwarded.
Centralized scheduling apparatus and method considering non-uniform traffic
The present disclosure relates to a centralized scheduling method and apparatus that considers non-uniform traffic and, more particularly, to a centralized scheduling method and apparatus for performing effective scheduling based on a characteristic of non-uniform traffic in consideration of a traffic distribution in a data center network.
Method and system for virtual channel remapping
A virtual channel (VC) allocation system is provided. During operation, the system can maintain, at an ingress port of a switch, a set of counters. A respective counter can indicate a number of data units queued at a corresponding egress port for an ingress VC. A data unit can indicate a minimum number of bits needed to form a packet. The system can maintain, at an egress port, an ingress VC indicator indicating that a packet in an egress buffer for an egress VC corresponds to the ingress VC. Upon sending the packet, the system can update a counter based on the ingress VC indicator. The counter can be associated with the egress buffer and the ingress VC. The system can then issue, to a sender device, credits associated with the ingress VC based on a minimum number of available data units indicated by the set of counters.
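The counter and credit bookkeeping described in this abstract can be illustrated with a minimal sketch. It assumes a fixed per-buffer budget of data units (`buffer_capacity`) and reads "minimum number of available data units" as the smallest free space across all egress ports for a given ingress VC. The class and method names are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


class IngressPort:
    """Hypothetical model of the per-ingress-port counters in the abstract."""

    def __init__(self, egress_ports, buffer_capacity):
        self.egress_ports = list(egress_ports)
        # Assumption: every (egress port, ingress VC) pair has the same budget.
        self.buffer_capacity = buffer_capacity
        # queued[(egress_port, ingress_vc)] -> data units currently queued
        self.queued = defaultdict(int)

    def on_enqueue(self, egress_port, ingress_vc, units):
        # A packet from ingress_vc was placed in the egress buffer.
        self.queued[(egress_port, ingress_vc)] += units

    def on_packet_sent(self, egress_port, ingress_vc_indicator, units):
        # The egress port reports which ingress VC the departed packet
        # belonged to, so the matching counter can be decremented.
        self.queued[(egress_port, ingress_vc_indicator)] -= units

    def credits_for(self, ingress_vc):
        # Credits are bounded by the fullest egress buffer for this VC.
        return min(
            self.buffer_capacity - self.queued[(port, ingress_vc)]
            for port in self.egress_ports
        )


port = IngressPort(["e0", "e1"], buffer_capacity=8)
port.on_enqueue("e0", ingress_vc=1, units=5)
port.on_enqueue("e1", ingress_vc=1, units=2)
credits_before = port.credits_for(1)   # limited by e0: 8 - 5
port.on_packet_sent("e0", ingress_vc_indicator=1, units=5)
credits_after = port.credits_for(1)    # now limited by e1: 8 - 2
```

Taking the minimum keeps the sender from overrunning whichever egress buffer is most congested, which is one plausible reading of the abstract rather than a confirmed detail of the claimed system.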
Emulating output queued behavior in a virtual output queue switch
A system and method for routing network packets. A switch fabric connects a plurality of forwarding units, including an egress forwarding unit and two or more ingress forwarding units, each ingress forwarding unit forwarding network packets to the egress forwarding unit via the switch fabric. The egress forwarding unit includes a scheduler and an output queue. Each ingress forwarding unit includes a Virtual Output Queue (VOQ) connected to the output queue and a VOQ manager. The scheduler receives time of arrival information for packet groups stored in the VOQs, determines, based on the time of arrival information for each packet group, a device resident time for each packet group, and discards the packet groups when the determined device resident time for the packet group is greater than a maximum resident time.
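The discard rule in this abstract (drop a packet group once its device resident time exceeds a maximum) can be sketched as follows. The `PacketGroup` type and the `drop_stale_groups` function are hypothetical names, and the sketch assumes resident time is simply the current time minus the reported time of arrival.

```python
from dataclasses import dataclass


@dataclass
class PacketGroup:
    group_id: int
    arrival_time: float  # time of arrival reported by the VOQ manager


def drop_stale_groups(voq, now, max_resident_time):
    """Split a VOQ into groups to keep and groups to discard."""
    kept, dropped = [], []
    for group in voq:
        resident_time = now - group.arrival_time
        if resident_time > max_resident_time:
            dropped.append(group)
        else:
            kept.append(group)
    return kept, dropped


voq = [PacketGroup(1, 0.0), PacketGroup(2, 4.0), PacketGroup(3, 9.0)]
kept, dropped = drop_stale_groups(voq, now=10.0, max_resident_time=5.0)
```

With these inputs the first two groups have been resident for 10 and 6 time units and are discarded, while the third (1 time unit) is kept.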
Context-aware NVMe processing in virtualized environments
A node includes a shared memory for a distributed memory system on a network. A Non-Volatile Memory Express (NVMe) request is received from a user space application executed by a Virtual Machine (VM) to send an NVMe command to a different node in the network. If a data size for the NVMe request exceeds a maximum segment size of an NVMe over Fabrics (NVMe-oF) connection, packets are created to be sent for the NVMe request and an order is determined for sending the packets, with one or more packets including data for the NVMe command being sent before a last packet that includes the NVMe command. In another aspect, Virtual Switching (VS) queues are created in a kernel space, with each VS queue corresponding to a different respective user space application initiating requests and at least one user space application being executed by one or more other nodes.
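The packet ordering described here (data-bearing packets first, the packet carrying the command last) can be illustrated with a small sketch. The dictionary layout and the `build_packets` name are hypothetical. The sketch also assumes the final packet carries only the command, a detail the abstract leaves open.

```python
def build_packets(command: bytes, data: bytes, max_segment_size: int):
    """Return packets in send order for one request."""
    if max_segment_size <= 0:
        raise ValueError("max_segment_size must be positive")

    # Small requests fit in a single packet alongside the command.
    if len(data) <= max_segment_size:
        return [{"kind": "command", "command": command, "data": data}]

    # Otherwise split the data into segments that are sent first...
    packets = [
        {"kind": "data", "data": data[i:i + max_segment_size]}
        for i in range(0, len(data), max_segment_size)
    ]
    # ...and send the packet that includes the command last.
    packets.append({"kind": "command", "command": command, "data": b""})
    return packets


packets = build_packets(b"WRITE", b"abcdefghij", max_segment_size=4)
```

Sending the command last means the receiver already holds all of the data by the time it sees the command, which is presumably the motivation for the ordering.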
Combined input and output queue for packet forwarding in network devices
An apparatus for switching network traffic includes an ingress packet forwarding engine and an egress packet forwarding engine. The ingress packet forwarding engine is configured to determine, in response to receiving a network packet, an egress packet forwarding engine for outputting the network packet and to enqueue the network packet in a virtual output queue. The egress packet forwarding engine is configured to output to the ingress packet forwarding engine, in response to a first scheduling event, information that identifies the network packet in the virtual output queue and indicates that the network packet is to be enqueued at an output queue for an output port of the egress packet forwarding engine. The ingress packet forwarding engine is further configured to dequeue, in response to receiving the information, the network packet from the virtual output queue and enqueue the network packet to the output queue.
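The handoff between the two engines can be sketched as a pair of cooperating objects. All class and method names are hypothetical, and the sketch assumes the egress engine always nominates the packet at the head of the virtual output queue, which the abstract does not specify.

```python
from collections import deque


class IngressEngine:
    def __init__(self):
        self.voq = deque()  # virtual output queue held at the ingress side

    def receive(self, packet):
        # Step 1: on arrival, enqueue the packet in the VOQ.
        self.voq.append(packet)

    def on_grant(self, packet, egress):
        # Step 3: on receiving the egress engine's information, move the
        # named packet from the VOQ to the egress output queue.
        self.voq.remove(packet)
        egress.output_queue.append(packet)


class EgressEngine:
    def __init__(self):
        self.output_queue = deque()

    def on_scheduling_event(self, ingress):
        # Step 2: tell the ingress engine which VOQ packet to hand over.
        # Assumption: the head of the VOQ is always chosen.
        if ingress.voq:
            ingress.on_grant(ingress.voq[0], self)


ingress, egress = IngressEngine(), EgressEngine()
ingress.receive("p1")
ingress.receive("p2")
egress.on_scheduling_event(ingress)
```

After the single scheduling event, `p1` sits in the output queue and `p2` remains in the virtual output queue.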
Server, server system, and method of increasing network bandwidth of server
[Problem] To increase the available network bandwidth without limiting the processing of applications.
[Solution] A server 20A includes a normal NIC 11, which is a NIC having an expansion function, and a virtual patch panel 21, which is implemented by software and transfers packets between the normal NIC 11 and an accelerator utilization type NIC 15. The server 20A is configured such that, when a packet is transferred between the normal NIC 11 and the accelerator utilization type NIC 15 via the virtual patch panel 21, the target function 16 transfers the packet to and from the APLs 12a to 12c.
Reprogramming multicast replication using real-time buffer feedback
Methods and systems are described for programming a substitution of ingress replication buffering for egress replication buffering after identifying egress buffer errors (such as overflow) for multicast traffic. A network element is configured to identify which ports drop packets by monitoring egress buffers and/or multicast traffic in real time. A hardware forwarding engine provides feedback to a control plane processor of the network element to adapt and selectively reprogram multicast ingress replication, temporarily, for certain egress ports that may have, e.g., egress buffer errors or risk of issues due to high network traffic. Using virtual output queues in ingress buffers may reduce risk of egress port congestion, as egress buffers have more limited resources than ingress buffers; however, relying solely on ingress replication for multicast traffic may hinder unicast traffic. Ingress buffer replication of multicast traffic may be used selectively and temporarily.