G06F2212/263

Distributed columnar data set subset retrieval

An apparatus includes a processor to: within each reading thread, retrieve a data set part and corresponding part metadata from storage device(s), analyze row group metadata for each row group within the data set part to identify candidate row group(s) meeting specified criteria, and store the candidate row group(s) and corresponding row group metadata within a data buffer of a queue; operate the queue as a FIFO buffer; within each provision thread, retrieve one of multiple row groups and corresponding metadata from within the data buffer, use information in the metadata to identify rows meeting the criteria, and provide those rows to the requesting device or an application; and in response to each instance of storage of a data set part within a data buffer of the queue, analyze the availability of storage space and/or of processing resources to determine whether to dynamically adjust the quantity of reading threads.
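
The reading-thread / FIFO queue / provision-thread flow described above can be illustrated with a minimal sketch. It assumes an in-memory stand-in for stored data set parts (the hypothetical `PARTS` list), row group metadata consisting only of a min/max range for a single column, and a fixed number of threads; the dynamic adjustment of the reading-thread count is not modeled.

```python
import queue
import threading

# Hypothetical stand-in for stored data set parts: each row group carries
# min/max metadata for one column plus the rows themselves.
PARTS = [
    [{"min": 0, "max": 9, "rows": list(range(0, 10))},
     {"min": 10, "max": 19, "rows": list(range(10, 20))}],
    [{"min": 20, "max": 29, "rows": list(range(20, 30))}],
]

def reader(part, lo, hi, fifo):
    # Reading thread: keep only row groups whose [min, max] range can overlap [lo, hi].
    for group in part:
        if group["max"] >= lo and group["min"] <= hi:
            fifo.put(group)

def provider(lo, hi, fifo, out, lock):
    # Provision thread: filter the individual rows of each candidate row group.
    while True:
        group = fifo.get()
        if group is None:  # sentinel: no more row groups will arrive
            break
        rows = [r for r in group["rows"] if lo <= r <= hi]
        with lock:
            out.extend(rows)

def query(lo, hi, n_providers=2):
    fifo = queue.Queue()  # queue.Queue is first-in, first-out
    out, lock = [], threading.Lock()
    providers = [threading.Thread(target=provider, args=(lo, hi, fifo, out, lock))
                 for _ in range(n_providers)]
    readers = [threading.Thread(target=reader, args=(part, lo, hi, fifo))
               for part in PARTS]
    for t in providers + readers:
        t.start()
    for t in readers:
        t.join()
    for _ in providers:
        fifo.put(None)  # one sentinel per provision thread
    for t in providers:
        t.join()
    return sorted(out)
```

Calling `query(8, 21)` in this sketch returns the rows 8 through 21, drawn from all three row groups; a range that overlaps no row group metadata never enqueues anything.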

SYSTEM AND METHOD FOR IMPROVED PERFORMANCE IN A MULTIDIMENSIONAL DATABASE ENVIRONMENT
20220350819 · 2022-11-03

In accordance with an embodiment, described herein is a system and method for improving performance within a multidimensional database computing environment. A multidimensional database, utilizing a block storage option, performs numerous input/output (I/O) operations when executing calculations. To separate I/O operations from calculations, a background task queue is created to identify data blocks requiring I/O. The background task queue is utilized by background writer threads to execute the I/O operations in parallel with calculations.
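
As a rough illustration of separating I/O from calculation, the sketch below hands finished blocks to a background task queue that writer threads drain. The `storage` dictionary is a hypothetical stand-in for block storage, and summing values stands in for the calculation; neither reflects the actual database internals.

```python
import queue
import threading

def run_calculation(blocks, n_writers=2):
    """Compute on blocks while background writer threads persist the results."""
    tasks = queue.Queue()   # background task queue of blocks requiring I/O
    storage = {}            # hypothetical stand-in for on-disk block storage
    lock = threading.Lock()

    def writer():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: calculation has finished
                break
            block_id, data = item
            with lock:
                storage[block_id] = data  # simulated write

    writers = [threading.Thread(target=writer) for _ in range(n_writers)]
    for w in writers:
        w.start()
    for block_id, values in blocks.items():
        total = sum(values)           # the "calculation" step
        tasks.put((block_id, total))  # hand the I/O off and keep calculating
    for _ in writers:
        tasks.put(None)
    for w in writers:
        w.join()
    return storage
```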

Dynamically Sizing a Hierarchical Tree Based on Activity

A method, a computing device, and a non-transitory machine-readable medium for allocating memory to data structures that map a first address space to a second address space are provided. In some embodiments, the method includes identifying, by a storage system, a pool of memory resources to allocate among a plurality of address maps. Each of the plurality of address maps includes at least one entry that maps an address in the first address space to an address in the second address space. An activity metric is determined for each of the plurality of address maps, and a portion of the pool of memory is allocated to each of the plurality of address maps based on the respective activity metric. The allocation of a portion of the memory pool to a first map may be performed in response to a merge operation being performed on the first map.
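
One simple way to allocate "based on the respective activity metric" is a proportional split. The sketch below assumes that policy, along with an even split when no activity has been recorded; both are illustrative assumptions, and the map names are hypothetical.

```python
def allocate(pool_size, activity):
    """Split pool_size units among address maps in proportion to activity."""
    if not activity:
        return {}
    total = sum(activity.values())
    if total == 0:
        # Assumption: with no recorded activity, fall back to an even split.
        share = pool_size // len(activity)
        return {name: share for name in activity}
    # Integer division may leave a small remainder of the pool unallocated.
    return {name: pool_size * metric // total for name, metric in activity.items()}
```

For example, `allocate(1000, {"map_a": 30, "map_b": 10})` gives `map_a` 750 units and `map_b` 250 units.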

Changing Storage Volume Ownership Using Cache Memory
20170315725 · 2017-11-02

A method, a computing device, and a non-transitory machine-readable medium for changing ownership of a storage volume from a first controller to a second controller without flushing data are provided. In the system, the first controller is associated with a first DRAM cache comprising a primary partition that stores data associated with the first controller and a mirror partition that stores data associated with the second controller. The second controller is associated with a second DRAM cache comprising a primary partition that stores data associated with the second controller and a mirror partition that stores data associated with the first controller. Further, the mirror partition in the second DRAM cache stores a copy of the data in the primary partition of the first DRAM cache, and the mirror partition in the first DRAM cache stores a copy of the data in the primary partition of the second DRAM cache.

Method and apparatus for transferring information between different streaming protocols at wire speed

The present invention provides a mechanism for fast routing of data in a Storage Area Network. A protocol interface module (PIM) interfaces with outside networks and the storage devices, such as over Fibre Channel (FC). The PIM encapsulates received data into a streaming protocol, enabling storage processors to direct data to and from the appropriate physical disk in a manner similar to the directing of network messages over the Internet or another network.

Performing data reduction during host data ingest

A technique performs data reduction on host data of a write request during ingest under certain circumstances. Raw host data of a write request is placed from the host into a data cache. A data-reducing ingest operation is then performed that reduces the raw host data from the data cache into reduced host data (e.g., via deduplication, compression, combinations thereof, etc.). After completion of the data-reducing ingest operation, a late-binding operation is performed that updates a mapper with the ability to access the reduced host data from secondary storage. Such ingest-time data reduction may be enabled or disabled per input/output (I/O) operation (e.g., used only for relatively large asynchronous I/O operations) and/or activated in situations in which the ingest bandwidth is becoming a bottleneck.
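
The per-I/O decision and the late-binding mapper update can be sketched as follows. This is an illustrative model only: the `Ingest` class, the 64 KiB threshold, and the use of SHA-256 digests with zlib compression are assumptions chosen for the example, not details from the abstract.

```python
import hashlib
import zlib

LARGE_IO_BYTES = 64 * 1024  # hypothetical threshold for a "relatively large" I/O

class Ingest:
    def __init__(self):
        self.store = {}   # digest -> compressed bytes (stand-in for secondary storage)
        self.mapper = {}  # logical address -> ("reduced", digest) or ("raw", bytes)

    def write(self, addr, data, asynchronous):
        # Reduce at ingest only for large asynchronous writes.
        if asynchronous and len(data) >= LARGE_IO_BYTES:
            digest = hashlib.sha256(data).hexdigest()
            if digest not in self.store:  # deduplication: store each payload once
                self.store[digest] = zlib.compress(data)
            # Late binding: the mapper is updated only after reduction completes.
            self.mapper[addr] = ("reduced", digest)
        else:
            self.mapper[addr] = ("raw", data)

    def read(self, addr):
        kind, value = self.mapper[addr]
        if kind == "reduced":
            return zlib.decompress(self.store[value])
        return value
```

Writing the same large payload to two addresses stores one compressed copy, while a small or synchronous write bypasses reduction entirely.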

COMPUTER SYSTEM
20170308472 · 2017-10-26

A computer system, comprising first computers, wherein an application operates on each of the first computers; each of the first computers is coupled to a second computer for providing a storage area; each of the first computers includes a processor, a memory, a cache device having a cache area, and an interface; the memory includes a program for realizing an operating system; the operating system includes a cache driver; and a cooperation control module configured to issue a control I/O request for instructing arrangement control; and the cooperation control module, in a case where issuance of an I/O request from the cache driver is detected, generates the control I/O request from the detected I/O request based on an analysis result of the detected I/O request, and transfers the control I/O request to an apparatus different from the transfer destination apparatus of the detected I/O request.

Cloud-native global file system with reshapable caching
20220058133 · 2022-02-24

A cloud-native global file system in which a local filer creates objects and forwards them to a cloud-based object store is augmented to include a reshapable caching scheme for the local filer. Like striped caches, the approach uses a stripe, but the striping is implemented via a true RAID 0 (disk striping) rather than as a striped LV (logical volume) device. This approach allows for a “reshape” operation to convert from an n-way stripe set to an n+-way stripe set. Preferably, a reshape involves redistributing each block on disk to its new calculated home. For example, going from a single disk to a two-disk set would move every other block from disk 1 to disk 2, and rearrange the blocks on disk 1 to fill in the “holes”. Performance after the reshape matches that of a striped cache. In one embodiment, the cache is structured as a “degraded” RAID 4.
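
The block redistribution in a reshape can be sketched with simple modular arithmetic. The sketch assumes a round-robin RAID 0 layout in which logical block i lives on disk i mod n at offset i div n, and it builds new lists rather than moving blocks in place, so it only illustrates where each block ends up.

```python
def reshape(disks, new_width):
    """Redistribute blocks from a len(disks)-way stripe to a new_width-way stripe."""
    old_width = len(disks)
    total = sum(len(d) for d in disks)
    # Recover logical order: block i is on disk i % n at offset i // n.
    logical = [disks[i % old_width][i // old_width] for i in range(total)]
    new_disks = [[] for _ in range(new_width)]
    for i, block in enumerate(logical):
        new_disks[i % new_width].append(block)  # the block's new calculated home
    return new_disks
```

Going from one disk holding `["b0", "b1", "b2", "b3"]` to two disks leaves `["b0", "b2"]` on the first disk and `["b1", "b3"]` on the second, matching the every-other-block example in the abstract.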

SCALABLE DATA ACCESS SYSTEM AND METHODS OF ELIMINATING CONTROLLER BOTTLENECKS
20170293428 · 2017-10-12

A data access system has host computers having front-end controllers (nFE_SAN) connected via a bus or network interconnect to back-end storage controllers (nBE_SAN), and physical disk drives connected via a network interconnect to the nBE_SANs, to provide a distributed, high-performance, policy-based or dynamically reconfigurable, centrally managed data storage acceleration system. The hardware and software architectural solutions eliminate BE_SAN controller bottlenecks and improve performance and scalability. In an embodiment, the nBE_SAN (BE_SAN) firmware recognizes controller overload conditions, informs the Distributed Resource Manager (DRM), and, based on the optimal topology information provided by the DRM, delegates part of its workload to additional controllers. The nFE_SAN firmware and additional hardware use functionally independent and redundant CPUs and memory to mitigate single points of failure and accelerate write performance. The nFE_SAN and FE_SAN controllers facilitate a Converged I/O Interface by simultaneously supporting storage I/O and network traffic.

USING STORAGE MANAGERS IN DATA STORAGE MANAGEMENT SYSTEMS FOR QUOTA DISTRIBUTION, COMPLIANCE, AND UPDATES

Storage managers are used in data storage management systems for license distribution, compliance, and updates. A licensed quota is managed at an aggregate level applicable to a collective plurality of storage operation cells and not by licensing each individual storage operation cell. A multi-cell environment belonging to a given customer is licensed by using an enhanced storage manager in each cell. One storage manager is a “license server” to the other storage managers or “child licensees.” A licensor issues a global license to the customer's designated license server, which distributes child licenses and manages other licensing aspects. Rather than licensing usage for individual storage operation cells, licensed usage is managed at an aggregate level using the license server and child licensees in a “self-service” model.
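
Aggregate-level compliance, as opposed to per-cell licensing, can be illustrated with a small model. The `LicenseServer` class, the cell names, and the unitless usage numbers are hypothetical; the sketch only shows that the check is made against the sum across cells.

```python
class LicenseServer:
    """Tracks usage reported by child licensees against one global quota."""

    def __init__(self, global_quota):
        self.global_quota = global_quota
        self.usage = {}  # storage operation cell name -> latest reported usage

    def report(self, cell, used):
        # A child licensee reports its current usage; no per-cell limit is applied.
        self.usage[cell] = used

    def compliant(self):
        # Compliance is judged on the aggregate across all cells.
        return sum(self.usage.values()) <= self.global_quota
```

With a global quota of 100, cells reporting 70 and 20 are compliant together, while 70 and 40 are not, even though neither cell has an individual limit.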