Patent classifications
G06F9/30065
Resource-aware automatic machine learning system
The present disclosure relates to a system, a method, and a product for optimizing hyper-parameters for generation and execution of a machine-learning model under constraints. The system includes a memory storing instructions and a processor in communication with the memory. When executed by the processor, the instructions cause the processor to obtain input data and an initial hyper-parameter set; for an iteration, to build a machine learning model based on the hyper-parameter set, evaluate the machine learning model based on the target data to obtain a performance metrics set, and determine whether the performance metrics set satisfies the stopping criteria set. If yes, the instructions cause the processor to perform an exploitation process to obtain an optimal hyper-parameter set and exit the iteration; if no, to perform an exploration process to obtain a next hyper-parameter set and perform a next iteration using the next hyper-parameter set as the hyper-parameter set.
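The iteration described in this abstract can be sketched as a small search loop. This is a minimal illustration only, not the patented method: the toy metric, the names `build_model`, `evaluate`, `satisfies`, `explore`, `exploit`, and `search`, the hyper-parameters `lr` and `depth`, and the random-perturbation exploration step are all assumptions made for the example.

```python
import random

def build_model(params):
    # Hypothetical stand-in for training: the "model" only records its hyper-parameters.
    return dict(params)

def evaluate(model):
    # Toy performance metrics set (an assumption): score peaks at lr = 0.1,
    # and cost grows with depth to mimic a resource constraint.
    return {"score": 1.0 - abs(model["lr"] - 0.1), "cost": model["depth"]}

def satisfies(metrics, stopping):
    # Stopping criteria set: a minimum score and a maximum resource cost.
    return metrics["score"] >= stopping["min_score"] and metrics["cost"] <= stopping["max_cost"]

def explore(params, rng):
    # Exploration: randomly perturb the current hyper-parameter set.
    lr = min(1.0, max(0.001, params["lr"] * rng.choice([0.5, 0.8, 1.25, 2.0])))
    depth = max(1, params["depth"] + rng.choice([-1, 0, 1]))
    return {"lr": lr, "depth": depth}

def exploit(history, stopping):
    # Exploitation: pick the best-scoring evaluated set that meets the criteria.
    candidates = [(p, m) for p, m in history if satisfies(m, stopping)]
    if not candidates:
        return None
    return max(candidates, key=lambda pm: pm[1]["score"])[0]

def search(initial, stopping, max_iters=200, seed=0):
    rng = random.Random(seed)
    params = dict(initial)
    history = []
    for _ in range(max_iters):
        metrics = evaluate(build_model(params))
        history.append((params, metrics))
        if satisfies(metrics, stopping):
            return exploit(history, stopping), True   # exit the iteration
        params = explore(params, rng)                 # next hyper-parameter set
    return None, False                                # iteration budget exhausted

best, found = search({"lr": 0.1, "depth": 3}, {"min_score": 0.9, "max_cost": 5})
```

The `max_iters` budget is an addition of this sketch so that the loop always terminates; the abstract itself does not mention one.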
Systems, media, and methods for identifying loops of or implementing loops for a unit of computation
Systems, media, and methods may identify loops of a unit of computation for performing operations associated with the loops. The systems, media, and methods may receive textual program code that includes a unit of computation that comprises a loop (e.g., explicit/implicit loop). The unit of computation may be identified by an identifier (e.g., variable name within the textual program code, text string embedded in the unit of computation, and/or syntactical pattern that is unique within the unit of computation). A code portion and/or a section thereof may include an identifier referring to the unit of computation, where the code portion and the unit of computation may be at locations independent of each other. The systems, media, and methods may semantically identify a loop that corresponds to the identifier and perform operations on the textual program code using the code portion and/or section.
STREAMING ADDRESS GENERATION
A digital signal processor having at least one streaming address generator, each with dedicated hardware, for generating addresses for writing multi-dimensional streaming data that comprises a plurality of elements. Each streaming address generator is configured to generate a plurality of offsets to address the streaming data, and each of the plurality of offsets corresponds to a respective one of the plurality of elements. The address of each of the plurality of elements is the respective one of the plurality of offsets combined with a base address.
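The address computation in this abstract (one offset per element, combined with a base address) can be modelled in software as follows. This is a sketch under assumptions: the function name `stream_addresses`, the outermost-first ordering of dimensions, byte strides, and addition as the way the offset is "combined" with the base are illustrative choices, not details taken from the patent.

```python
from itertools import product

def stream_addresses(base, counts, strides):
    """Yield one address per element of a multi-dimensional stream.

    counts  -- number of elements in each dimension, outermost first (assumed)
    strides -- byte step between consecutive elements in each dimension (assumed)
    """
    for index in product(*(range(c) for c in counts)):
        # Offset of this element: sum of index * stride over all dimensions.
        offset = sum(i * s for i, s in zip(index, strides))
        # Address = offset combined with the base address (modelled as addition).
        yield base + offset

# A 2 x 3 block of 4-byte elements whose rows are 16 bytes apart.
addresses = list(stream_addresses(0x1000, (2, 3), (16, 4)))
# addresses == [0x1000, 0x1004, 0x1008, 0x1010, 0x1014, 0x1018]
```

In hardware, the point of a dedicated generator is that this arithmetic runs alongside the datapath instead of consuming instructions; the Python generator only mirrors the resulting sequence of addresses.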
Responding to branch misprediction for predicated-loop-terminating branch instruction
A predicated-loop-terminating branch instruction controls, based on whether a loop termination condition is satisfied, whether the processing circuitry should process a further iteration of a predicated loop body or process a following instruction. If at least one unnecessary iteration of the predicated loop body is processed following a mispredicted-non-termination branch misprediction when the loop termination condition is mispredicted as unsatisfied for a given iteration when it should have been satisfied, processing of the at least one unnecessary iteration of the predicated loop body is predicated to suppress an effect of the at least one unnecessary iteration. When the mispredicted-non-termination branch misprediction is detected for the given iteration of the predicated-loop-terminating branch instruction, in response to determining that a flush suppressing condition is satisfied, flushing of the at least one unnecessary iteration of the predicated loop body is suppressed as a response to the mispredicted-non-termination branch misprediction.
Processor for executing a loop acceleration instruction to start and end a loop
A processor achieving a zero-overhead loop includes instruction stream control circuitry and loop control circuitry. The loop control circuitry includes loop address detecting circuitry and loop end determining circuitry. By combining instructions and hardware, the loop control circuitry eliminates the additional control instructions required by each loop iteration and can achieve loop acceleration with zero overhead, thereby improving the loop execution efficiency.
Apparatus for Array Processor and Associated Methods
An apparatus includes an array processor to process array data. The array data are arranged in a memory. The array data are specified with programmable per-dimension size and stride values.
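As an illustration of array data specified by per-dimension size and stride values, the sketch below reads the same flat memory through two different descriptors. The helper `read_array` and the use of element (rather than byte) strides are assumptions made for the example.

```python
def read_array(memory, start, sizes, strides):
    """Read nested lists from flat `memory` using per-dimension sizes and strides."""
    if not sizes:
        return memory[start]
    return [
        read_array(memory, start + i * strides[0], sizes[1:], strides[1:])
        for i in range(sizes[0])
    ]

memory = list(range(12))                               # 3 x 4 array stored row by row
rows = read_array(memory, 0, (3, 4), (4, 1))           # natural row-major view
transposed = read_array(memory, 0, (4, 3), (1, 4))     # same data, swapped size/stride
```

Swapping the size and stride values yields the transpose without moving any data, which is one reason to make these values programmable per dimension.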
Apparatus for Array Processor with Program Packets and Associated Methods
An apparatus includes an array processor to process array data in response to information contained in a packet, wherein the packet comprises a set of fields specifying configuration information for processing the array.
Apparatus for Processor with Macro-Instruction and Associated Methods
An apparatus includes an array processor to process array data in response to a set of macro-instructions. A macro-instruction in the set of macro-instructions performs loop operations, array iteration operations, and/or arithmetic logic unit (ALU) operations.
HIGH CLOCK-EFFICIENCY RANDOM NUMBER GENERATION SYSTEM AND METHOD
A system and method of quickly and efficiently generating a series of random numbers from a source of random numbers in a computing system. Steps include: loading a data loop (a looped array of stored values with an index) with random data from a source of random data; then repeating the following: reading a multi-bit value from the data loop in relation to the index; operating on the multi-bit value, thereby outputting a derived random number; and moving the index in relation to the looped array. The data loop may be a simple feedback loop, which may be a shift register loaded by direct memory access (DMA). The operation may be performed by one or more arithmetic logic units (ALUs), which may be fed by one or more data feeds and may perform XOR, Mask Generator, Data MUX, and/or MOD.
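The load, read, operate, and move-index steps can be sketched in software as below. The class name `DataLoopRng`, the second read position (`tap`), and the specific combination of XOR, mask, and optional MOD are assumptions for illustration; the abstract lists these operations only as possibilities, and this sketch makes no claim about the statistical quality of its output.

```python
import secrets

class DataLoopRng:
    def __init__(self, size=64, width_bits=32, tap=7, source=None):
        if source is None:
            # Default source of random data: the operating system's CSPRNG.
            source = lambda: secrets.randbits(width_bits)
        # Load the data loop (a looped array with an index) from the source.
        self.loop = [source() for _ in range(size)]
        self.index = 0
        self.tap = tap                       # hypothetical second read position
        self.mask = (1 << width_bits) - 1

    def next(self, modulus=None):
        n = len(self.loop)
        # Read a value from the loop in relation to the index.
        value = self.loop[self.index]
        # Operate on it: XOR with a second loop value, then mask to the word width.
        derived = (value ^ self.loop[(self.index + self.tap) % n]) & self.mask
        # Move the index around the looped array.
        self.index = (self.index + 1) % n
        return derived % modulus if modulus else derived

# Deterministic source so the behaviour can be checked by hand: loop = [1, 2, ..., 8].
demo = DataLoopRng(size=8, width_bits=8, tap=3, source=iter(range(1, 9)).__next__)
first = demo.next()    # 1 ^ 4 == 5
second = demo.next()   # 2 ^ 5 == 7
```

Each output costs only a couple of reads and an XOR, which reflects the clock-efficiency goal stated in the title.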
COMPILER, COMPILATION METHOD, AND COMPILER DEVICE
The present disclosure relates to a compiler for causing a computer to execute a process. The process includes generating a first program, wherein the first program includes a first code that determines whether a first area of a memory that a process inside a loop included in a second program refers to in a first execution time of the loop overlaps a second area of the memory that the process refers to in a second execution time of the loop, a second code that executes the process in the order of the first and second execution times when it is determined that the first and second areas overlap, and a third code that executes the process for the first execution time and the process for the second execution time in parallel when it is determined that the first and second areas do not overlap.
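The three pieces of generated code described here (a runtime check on the two memory areas, an in-order path, and a parallel path) can be sketched as follows. This is an illustrative model, not compiler output: the names `areas_overlap`, `process`, and `run_pair`, the `(start, length)` description of an area, and the loop body itself are assumptions, and the memory areas are treated as overlapping index ranges.

```python
from concurrent.futures import ThreadPoolExecutor

def areas_overlap(first, second):
    """First code: do two (start, length) memory areas share any element?"""
    return first[0] < second[0] + second[1] and second[0] < first[0] + first[1]

def process(buf, start, length, tag):
    # Hypothetical loop body: an order-dependent in-place update of one area.
    for i in range(start, start + length):
        buf[i] = buf[i] * 2 + tag

def run_pair(buf, first, second):
    if areas_overlap(first, second):
        # Second code: keep the original order of the two execution times.
        process(buf, *first, 1)
        process(buf, *second, 2)
        return "sequential"
    # Third code: the areas are disjoint, so both executions may run in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(process, buf, *first, 1),
                   pool.submit(process, buf, *second, 2)]
        for future in futures:
            future.result()   # re-raise any exception from a worker
    return "parallel"

disjoint = [0] * 8
mode_a = run_pair(disjoint, (0, 4), (4, 4))   # "parallel"
shared = [0] * 8
mode_b = run_pair(shared, (0, 4), (2, 4))     # "sequential"
```

In CPython the two threads do not speed up this pure-Python loop because of the global interpreter lock; the thread pool only stands in for whatever parallel mechanism (threads, SIMD, or similar) a real compiler would target.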