Patent classifications
G06F9/30079
ADAPTIVE THREAD MANAGEMENT FOR HETEROGENEOUS COMPUTING ARCHITECTURES
An apparatus and method for dynamically and efficiently scheduling tasks to multiple cores that support a heterogeneous computing architecture. A computing system includes multiple cores, at least two of which are capable of executing instructions of the same instruction set architecture (ISA) and are therefore architecturally compatible. In an implementation, each of the at least two cores is a general-purpose central processing unit (CPU) core capable of executing instructions of the same ISA. However, the throughput and the power consumption differ greatly between the at least two cores because of their hardware designs. An operating system scheduler assigns a thread to a first core, and the first core measures the dynamic behavior of the thread over a time interval. Based on that dynamic behavior, the scheduler reassigns the thread to a second core different from the first core.
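The reassignment step can be illustrated with a minimal sketch. The abstract does not say which behavior is measured or how the decision is made, so the stall-ratio metric, the `ThreadStats` type, the `pick_core` function, the core labels "big" and "little", and the 0.5 threshold are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    """Hypothetical counters a core reports for one time interval."""
    cycles: int
    stall_cycles: int


def pick_core(stats: ThreadStats, current: str, stall_threshold: float = 0.5) -> str:
    """Return "big" or "little" for the next interval.

    Assumed policy (not from the abstract): a thread that stalls for most
    of the interval gains little from a high-throughput core, so it is
    moved to the low-power core; otherwise it runs on the fast core.
    """
    if stats.cycles == 0:
        return current  # no measurement yet, keep the current assignment
    stall_ratio = stats.stall_cycles / stats.cycles
    return "little" if stall_ratio >= stall_threshold else "big"
```

A scheduler loop would call `pick_core` once per interval with the counters gathered on the thread's current core and migrate the thread only when the returned label differs from `current`.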
Data Science Platform
Disclosed herein is a data science platform that is built with a specific focus on monitoring and analyzing the operation of industrial assets, such as trucking assets, rail assets, construction assets, mining assets, wind assets, thermal assets, oil-and-gas assets, and manufacturing assets, among other possibilities. The disclosed data science platform is configured to carry out operations including (i) ingesting asset-related data from various different data sources and storing it for downstream use, (ii) transforming the ingested asset-related data into a desired formatting structure and then storing it for downstream use, (iii) evaluating the asset-related data to derive insights about an asset's operation that may be of interest to a platform user, which may involve data science models that have been specifically designed to analyze asset-related data in order to gain a deeper understanding of an asset's operation, and (iv) presenting derived insights and other asset-related data to platform users.
HYBRID VIRTUAL GPU CO-SCHEDULING
An embodiment of a semiconductor package apparatus may include technology to manage one or more virtual graphic processor units, and co-schedule the one or more virtual graphic processor units based on both general processor instructions and graphics processor instructions. Other embodiments are disclosed and claimed.
PROCESSOR CIRCUIT AND DATA PROCESSING METHOD
A processor circuit is provided. The processor circuit includes an instruction decode unit, an instruction detector, an address generator and a data buffer. The instruction decode unit is configured to decode a load instruction to generate a decoding result. The instruction detector, coupled to the instruction decode unit, is configured to detect if the load instruction is in a load-use scenario. The address generator, coupled to the instruction decode unit, is configured to generate a first address requested by the load instruction according to the decoding result. The data buffer is coupled to the instruction detector and the address generator. When the instruction detector detects that the load instruction is in the load-use scenario, the data buffer is configured to store the first address generated from the address generator, and store data requested by the load instruction according to the first address.
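A small software model may help make the data path concrete. It is only a sketch: the abstract does not define the load-use scenario, so this example assumes the common meaning (the instruction after the load reads the load's destination register), and the names `is_load_use`, `DataBuffer`, and `execute_load` are hypothetical.

```python
def is_load_use(load_dest, next_sources):
    """Assumed detector: the next instruction reads the load's destination."""
    return load_dest in next_sources


class DataBuffer:
    """Holds the address and data of loads detected as load-use."""

    def __init__(self):
        self.entries = {}  # address -> data

    def store(self, address, data):
        self.entries[address] = data

    def lookup(self, address):
        return self.entries.get(address)


def execute_load(base, offset, dest, next_sources, memory, buffer):
    """Generate the address, read the data, and buffer it when load-use."""
    address = base + offset          # role of the address generator
    data = memory[address]           # data requested by the load
    if is_load_use(dest, next_sources):
        buffer.store(address, data)  # keep address and data for later use
    return data
```

In this model a load that is not in a load-use scenario leaves the buffer untouched, which mirrors the conditional wording of the abstract.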
Efficient scheduling for hyper-threaded CPUs using memory monitoring
A system and method for scheduling of hyper-threaded CPUs using memory monitoring includes a memory with an operating system memory and a physical processor in communication with the memory. The physical processor includes a first hyper-thread and a second hyper-thread. A monitor instruction to monitor for updates to a designated memory location is executed in the first hyper-thread. The system further includes an operating system to execute on the physical processor and a system call configured to record in the operating system memory that the first hyper-thread is in a memory wait state. The system call is further configured to execute a memory wait instruction in the first hyper-thread. A task is executed in the second hyper-thread while the first hyper-thread is in the memory wait state.
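The bookkeeping described above can be sketched as a toy model. It does not execute real monitor or memory wait instructions; the `Kernel` class, its method names, and the wake-on-write behavior are assumptions made for illustration.

```python
class Kernel:
    """Toy model of the operating system side (all names are hypothetical)."""

    def __init__(self):
        # Stands in for the operating system memory that records wait states.
        self.in_memory_wait = {}
        # Designated memory location each hyper-thread monitors.
        self.monitored = {}

    def monitor(self, thread, address):
        """Model of the monitor instruction: watch one memory location."""
        self.monitored[thread] = address

    def sys_memory_wait(self, thread):
        """Model of the system call: record the wait state, then wait."""
        self.in_memory_wait[thread] = True

    def write(self, address):
        """Assumed wake-up: a write to a monitored location ends the wait."""
        for thread, watched in self.monitored.items():
            if watched == address:
                self.in_memory_wait[thread] = False

    def pick_thread(self, threads):
        """Return the first hyper-thread not recorded as waiting, if any."""
        for thread in threads:
            if not self.in_memory_wait.get(thread, False):
                return thread
        return None
```

With the first hyper-thread recorded as waiting, `pick_thread` returns the second one, which corresponds to the task running there while the first waits.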
Recovering register mapping state of a flushed instruction employing a snapshot of another register mapping state and traversing reorder buffer (ROB) entries in a processor
A register mapping circuit for recovering a register mapping state associated with a flushed instruction by traversing ROB entries from a snapshot of another register mapping state. The register mapping circuit includes a ROB control circuit, a snapshot circuit, and a register rename recovery circuit (RRRC). The ROB control circuit allocates ROB entries to instructions entering a processor pipeline, including a target ROB entry allocated to a target instruction and other ROB entries allocated to other instructions. The snapshot circuit captures snapshots of logical register-to-physical register mapping states in the rename map table in association with a subset of instructions that could be flushed. If the target instruction is flushed, the RRRC restores the rename map table register mapping state corresponding to the target instruction based on a snapshot in a ROB entry allocated to another instruction, and traverses register mapping updates in the intervening ROB entries.
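The recovery idea can be sketched in software under several assumptions that the abstract leaves open: snapshots are taken before the owning entry's rename, the nearest snapshot is found in an older entry, the walk replays updates forward, and the recovered state is the one in effect just before the target instruction's own rename. `RobEntry` and `recover_map` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    logical: str                     # logical register renamed by this entry
    physical: int                    # physical register it was mapped to
    snapshot: Optional[dict] = None  # map state before this entry's rename


def recover_map(rob, target):
    """Rebuild the rename map in effect just before rob[target] renamed."""
    # Walk back to the nearest entry at or before the target with a snapshot.
    start = target
    while start >= 0 and rob[start].snapshot is None:
        start -= 1
    if start < 0:
        raise ValueError("no snapshot at or before the target entry")
    mapping = dict(rob[start].snapshot)
    # Replay the register mapping updates of the intervening entries.
    for entry in rob[start:target]:
        mapping[entry.logical] = entry.physical
    return mapping
```

Only a subset of entries needs to carry a snapshot; the cost of recovery is then the length of the walk from the chosen snapshot to the target.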
Controlling timing of event data transmissions in an event architecture
Techniques for a service provider network to communicatively couple services and/or applications in a serverless computing environment. A pipe component can configure a pipe to integrate two services by transmitting data between services and/or applications using the pipe. The pipe may also be configured to transform how a service processes an event, control timing of event transmissions using the pipe, define an event structure for an event, and/or batch events. Pipes enable an application or service to exchange data with a variety of services provided by the service provider network while controlling what type of data is generated, stored, or transmitted.
Facilitating page table entry (PTE) maintenance in processor-based devices
Facilitating page table entry (PTE) maintenance in processor-based devices is disclosed. In this regard, a processor-based device includes processing elements (PEs) configured to support two new coherence states: walker-readable (W) and modified walker accessible (M_W). The W coherence state indicates that read access to a corresponding coherence granule by hardware table walkers (HTWs) is permitted, but all write operations and all read operations by non-HTW agents are disallowed. The M_W coherence state indicates that cached copies of the coherence granule visible only to HTWs may exist in other caches. In some embodiments, each PE is also configured to support a special page table entry (SP-PTE) field store instruction for modifying SP-PTE fields of a PTE, indicating to the PE's local cache that the corresponding coherence granule should transition to the M_W state, and indicating to remote local caches that copies of the coherence granule should update their coherence state.
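The access rule stated for the W coherence state can be written as a one-line predicate. This sketch covers only the W state, since the abstract does not spell out the access rules for the other state; the function name `w_state_permits` and its parameters are hypothetical.

```python
def w_state_permits(agent_is_htw: bool, is_write: bool) -> bool:
    """Access check for a coherence granule in the walker-readable state.

    Per the abstract: reads by hardware table walkers (HTWs) are permitted;
    all writes, and all reads by non-HTW agents, are disallowed.
    """
    return agent_is_htw and not is_write
```

Of the four combinations of agent type and access type, only an HTW read is allowed.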
SHADOW CACHES FOR LEVEL 2 CACHE CONTROLLER
An apparatus includes a CPU core and an L1 cache subsystem coupled to the CPU core. The L1 cache subsystem includes an L1 main cache, an L1 victim cache, and an L1 controller. The apparatus includes an L2 cache subsystem coupled to the L1 cache subsystem. The L2 cache subsystem includes an L2 main cache, a shadow L1 main cache, a shadow L1 victim cache, and an L2 controller. The L2 controller receives an indication from the L1 controller that a cache line A is being relocated from the L1 main cache to the L1 victim cache. In response to the indication, the L2 controller updates the shadow L1 main cache to reflect that the cache line A is no longer located in the L1 main cache, and updates the shadow L1 victim cache to reflect that the cache line A is located in the L1 victim cache.
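The shadow-cache update can be modeled with two sets of line addresses. This is a sketch only: real shadow caches would track tags and state per set and way, and the `L2Controller` class, its method names, and the `on_l1_allocate` hook are assumptions added for illustration.

```python
class L2Controller:
    """Toy L2 controller that mirrors where each line sits in L1."""

    def __init__(self):
        self.shadow_l1_main = set()    # lines believed to be in L1 main cache
        self.shadow_l1_victim = set()  # lines believed to be in L1 victim cache

    def on_l1_allocate(self, line):
        """Assumed hook: L1 reports that it allocated a line in its main cache."""
        self.shadow_l1_main.add(line)

    def on_l1_relocate_to_victim(self, line):
        """Handle the indication that a line moved from L1 main to L1 victim."""
        self.shadow_l1_main.discard(line)  # no longer in the L1 main cache
        self.shadow_l1_victim.add(line)    # now located in the L1 victim cache
```

Keeping both shadows in step with a single indication means the L2 controller can tell which L1 structure holds a line without querying L1.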
MERGING DATA FOR WRITE ALLOCATE
A method includes receiving, by a level two (L2) controller, a write request for an address that is not allocated as a cache line in an L2 cache. The write request specifies write data. The method also includes generating, by the L2 controller, a read request for the address; reserving, by the L2 controller, an entry in a register file for read data returned in response to the read request; updating, by the L2 controller, a data field of the entry with the write data; updating, by the L2 controller, an enable field of the entry associated with the write data; and receiving, by the L2 controller, the read data and merging the read data into the data field of the entry.
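The merge step can be sketched as follows. The abstract does not give the granularity of the enable field or the merge rule, so this example assumes one enable flag per byte and assumes that returned read data fills only the bytes the write did not touch. `MergeEntry` and its methods are hypothetical names.

```python
class MergeEntry:
    """Toy register-file entry reserved for a write-allocate miss."""

    def __init__(self, line_size):
        self.data = bytearray(line_size)    # data field of the entry
        self.enable = [False] * line_size   # enable field, one flag per byte

    def apply_write(self, offset, write_data):
        """Place the write data in the entry and mark those bytes enabled."""
        for i, byte in enumerate(write_data):
            self.data[offset + i] = byte
            self.enable[offset + i] = True

    def merge_read(self, read_data):
        """Merge returned read data without overwriting the written bytes."""
        for i, byte in enumerate(read_data):
            if not self.enable[i]:
                self.data[i] = byte
```

For example, writing two bytes at offset 1 of a four-byte line and then merging the line returned by the read keeps the two written bytes and takes the other two from the read data.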