G06F9/3877

SYSTEM AND METHOD FOR PERFORMING A Z PRE-PASS PHASE ON GEOMETRY AT A GPU FOR USE BY THE GPU WHEN RENDERING THE GEOMETRY
20230100853 · 2023-03-30

A method for graphics processing including rendering graphics for an application using a plurality of graphics processing units (GPUs). The method including dividing responsibility for rendering geometry of the graphics between the GPUs based on screen regions, each GPU having a corresponding division of the responsibility which is known to the GPUs. The method including determining a Z-value for a piece of geometry during a pre-pass phase of rendering at a first GPU for an image, wherein the piece of geometry overlaps a first screen region for which the first GPU has a division of responsibility. The method including comparing the Z-value against a Z-buffer value for the piece of geometry. The method including generating information including a result of the comparing the Z-value against the Z-buffer value for use by the GPU when rendering the piece of geometry during a full render phase of rendering.
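The pre-pass comparison described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the claimed implementation: each piece of geometry is reduced to a single Z-value per screen region, smaller Z means closer to the camera, and the names `z_prepass`, `gpu_regions`, and `pieces` are hypothetical.

```python
def z_prepass(gpu_regions, pieces, far=float("inf")):
    """Run a depth-only pre-pass for one GPU.

    gpu_regions: set of screen-region ids this GPU is responsible for.
    pieces: list of (piece_id, region_id, z) tuples; smaller z is closer.
    Returns (z_buffer, info), where info[piece_id] is True when the piece
    passed the depth test at the time it was processed.
    """
    z_buffer = {region: far for region in gpu_regions}
    info = {}
    for piece_id, region, z in pieces:
        if region not in gpu_regions:
            continue  # another GPU is responsible for this region
        passed = z < z_buffer[region]  # compare Z-value against Z-buffer value
        if passed:
            z_buffer[region] = z
        info[piece_id] = passed
    return z_buffer, info
```

In this sketch, a full render phase could consult `info` to skip pieces that failed the comparison in the pre-pass.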

Coprocessor Register Renaming

A coprocessor with register renaming is disclosed. An apparatus includes a plurality of processors and a coprocessor respectively configured to execute processor instructions and coprocessor instructions. The coprocessor receives coprocessor instructions from ones of the processors. The coprocessor includes an array of processing elements and a result register set comprising storage elements respectively distributed within the array of processing elements. For a given member of the array of processing elements, a corresponding storage element is configured to store coprocessor instruction results generated by the given member. The result register set implements a plurality of contexts to store respective coprocessor states corresponding to coprocessor instructions received from different processors. Based on a determination that one of the contexts is inactive, the coprocessor is configured to store coprocessor instruction results corresponding to an active context within storage elements of the result register set corresponding to the inactive context.

CIRCUITRY AND METHODS FOR ACCELERATING STREAMING DATA-TRANSFORMATION OPERATIONS
20230100586 · 2023-03-30

Systems, methods, and apparatuses for accelerating streaming data-transformation operations are described. In one example, a system on a chip (SoC) includes a hardware processor core comprising a decoder circuit to decode an instruction comprising an opcode into a decoded instruction, the opcode to indicate an execution circuit is to generate a single descriptor and cause the single descriptor to be sent to an accelerator circuit coupled to the hardware processor core, and the execution circuit to execute the decoded instruction according to the opcode; and the accelerator circuit comprising a work dispatcher circuit and one or more work execution circuits to, in response to the single descriptor sent from the hardware processor core: when a field of the single descriptor is a first value, cause a single job to be sent by the work dispatcher circuit to a single work execution circuit of the one or more work execution circuits to perform an operation indicated in the single descriptor to generate an output, and when the field of the single descriptor is a second different value, cause a plurality of jobs to be sent by the work dispatcher circuit to the one or more work execution circuits to perform the operation indicated in the single descriptor to generate the output as a single stream.
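The single-job versus multi-job behavior selected by the descriptor field can be modeled in software roughly as follows. This is a sketch under stated assumptions: the descriptor is a plain dictionary with hypothetical keys `operation`, `data`, and `mode`, the operation can be applied to chunks independently, and the jobs run serially here rather than on separate work execution circuits.

```python
def run_descriptor(descriptor, num_workers):
    """Dispatch one descriptor as one job or as several jobs, per its mode field.

    Returns (output, number_of_jobs). The output is a single byte stream
    in both modes.
    """
    op, data = descriptor["operation"], descriptor["data"]
    if descriptor["mode"] == 0:
        # First value: a single job for a single work execution unit.
        jobs = [data]
    else:
        # Second value: split the input into up to num_workers jobs.
        size = max(1, -(-len(data) // num_workers))  # ceiling division
        jobs = [data[i:i + size] for i in range(0, len(data), size)]
    # Apply the operation per job and join the results in order.
    outputs = [op(job) for job in jobs]
    return b"".join(outputs), len(jobs)
```

For example, a descriptor with `operation` set to `bytes.upper` produces the same output stream in either mode; only the number of jobs differs.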

CLOUD SERVICE SYSTEM AND OPERATION METHOD THEREOF
20230032842 · 2023-02-02

A cloud service system and an operation method thereof are provided. The cloud service system includes a first computing resource pool, a second computing resource pool, and a task dispatch server. Each computing platform in the first computing resource pool does not have a co-processor. Each computing platform in the second computing resource pool has at least one co-processor. The task dispatch server is configured to receive a plurality of tasks. The task dispatch server checks a task attribute of a task to be dispatched currently among the tasks. The task dispatch server chooses to dispatch the task to be dispatched currently to the first computing resource pool or to the second computing resource pool for execution according to the task attribute.
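The dispatch rule can be sketched as below. This is a minimal illustration rather than the system's actual logic: the `Task` type, the boolean `needs_coprocessor` attribute, and the pool names are hypothetical stand-ins for the task attribute and the two resource pools.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    needs_coprocessor: bool  # hypothetical task attribute


def dispatch(tasks):
    """Route each task to a resource pool based on its attribute."""
    pools = {"first_pool": [], "second_pool": []}
    for task in tasks:
        # Platforms in the second pool have at least one co-processor.
        target = "second_pool" if task.needs_coprocessor else "first_pool"
        pools[target].append(task.name)
    return pools
```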

EFFICIENT COMPRESSED VERBATIM COPY

Compressed verbatim copy can enable more efficient copying of compressed data. In one example, a compressed verbatim copy method involves receiving a command to copy compressed data from a source address of the memory device to a destination address. In response to the receipt of the command, the method involves copying the compressed data in a compressed format from the source address to the destination address without first decompressing the data. A second source address and a second destination address of metadata for the compressed data are determined, and the metadata is copied from the second source address to the second destination address.
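A software model of the two-part copy might look like the following. It is a sketch under assumptions that the abstract does not state: memory is a `bytearray`, metadata is a list holding one entry per fixed-size block, data addresses are block-aligned, and the metadata addresses are derived by dividing the data address by a hypothetical `block_size`.

```python
def compressed_verbatim_copy(memory, metadata, src, dst, length, block_size=4):
    """Copy compressed bytes and their metadata without decompressing."""
    # Copy the compressed payload as-is.
    memory[dst:dst + length] = memory[src:src + length]
    # Determine the metadata source and destination from the data addresses.
    meta_src = src // block_size
    meta_dst = dst // block_size
    n_meta = -(-length // block_size)  # ceiling division
    # Copy the metadata entries that describe the copied blocks.
    metadata[meta_dst:meta_dst + n_meta] = metadata[meta_src:meta_src + n_meta]
```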

Pin sharing for photonic processors

Aspects relate to a photonic processing system, an integrated circuit, and a method of operating an integrated circuit to control components to modulate optical signals. A photonic processing system, comprising: a photonic integrated circuit comprising: a first electrically-controllable photonic component electrically coupling an input pin to a first output pin; and a second electrically-controllable photonic component electrically coupling the input pin to a second output pin.

System and method for decoupling operations to accelerate processing of loop structures

An apparatus for hardware acceleration for use in operating a computational network is configured for determining that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by simultaneously processing the overhead operations using the second processor separately from processing the compute operations based on the configuration to operate the computational network.

Checkins for services from a messenger chatbot
11615450 · 2023-03-28

A chatbot check-in platform includes a salon application associated with a messenger chatbot, wherein a user requests via the chatbot an appointment for a salon service. The platform includes a location service in communication with the salon application for finding a salon near a user; a salon services API in communication with the salon application for finding a requested salon service and time for appointment for the salon; and a database interface responding to a request for making an appointment with the salon. A method of using a chatbot check-in platform includes instructions, when executed by a processor, that cause the processor to execute actions including: receiving, by a first processor, a request received via an associated messenger chatbot for booking an appointment for a service provided by a salon, wherein the first processor is a processor of a device, the device includes machine readable memory accessible by the first processor; finding a salon and service, by the first processor in response to the request; prompting, by the first processor, a booking for the salon and service via an associated chatbot; and upon receiving, by the first processor, a confirmation from the user via the chatbot, booking the salon and service.

Systems and methods for simultaneous control of safety-critical and non-safety-critical processes in automation systems using master-minion functionality
11487265 · 2022-11-01

A control system is for controlling safety-critical processes, non-safety-critical processes, and/or installation components. The control system includes: at least one control unit configured to control non-safety-critical processes and/or non-safety-critical installation components, at least one safety control unit for controlling safety-critical processes and/or safety-critical installation components, and at least one input/output unit connected to the first control unit via an internal input/output bus. The control system is configured to act as communication master or as communication minion or as both in a pool having other devices that is connected via field bus, and to that end, the control system includes a master communication coupler and a minion communication coupler. The control system is modularly configurable. At least the safety control unit includes respective subunits with master functionality and subunits with minion functionalities.

Hardware accelerator with analog-content addressable memory (a-CAM) for decision tree computation

Examples described herein relate to a decision tree computation system in which a hardware accelerator for a decision tree is implemented in the form of an analog Content Addressable Memory (a-CAM) array. The hardware accelerator accesses a decision tree. The decision tree comprises multiple paths and each path of the multiple paths includes a set of nodes. Each node of the decision tree is associated with a feature variable of multiple feature variables of the decision tree. The hardware accelerator combines multiple nodes among the set of nodes with a same feature variable into a combined single node. Wildcard values are substituted for feature variables not being evaluated in each path. Each combined single node associated with each feature variable is mapped to a corresponding column in the a-CAM array and the multiple paths of the decision tree to rows of the a-CAM array.
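The mapping from tree paths to a-CAM rows can be modeled in software as follows. This is a sketch under assumptions, not the accelerator's actual encoding: each node is a hypothetical `(feature, op, threshold)` tuple with `op` either `"<"` or `">="`, each a-CAM cell is modeled as a half-open `(low, high)` range, and a wildcard is the range covering all values.

```python
import math

WILDCARD = (-math.inf, math.inf)  # matches any value


def paths_to_rows(paths, features):
    """Map decision-tree paths to rows of (low, high) ranges.

    Each row corresponds to one path and has one column per feature.
    Nodes on a path that test the same feature are combined into one range.
    """
    rows = []
    for path in paths:
        ranges = {}
        for feature, op, threshold in path:
            low, high = ranges.get(feature, WILDCARD)
            if op == "<":
                high = min(high, threshold)
            else:  # ">="
                low = max(low, threshold)
            ranges[feature] = (low, high)
        # Features not evaluated on this path get the wildcard range.
        rows.append([ranges.get(f, WILDCARD) for f in features])
    return rows


def match(rows, sample):
    """Return indices of rows whose every range contains the sample value."""
    return [i for i, row in enumerate(rows)
            if all(low <= x < high for (low, high), x in zip(row, sample))]
```

In this model, `match` plays the role of the parallel search across a-CAM rows: a sample matches the row of the path it would follow through the tree.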