Integrating host-side storage device management with host-side non-volatile memory
11262942 · 2022-03-01
Assignee
Inventors
- Qi Wu (San Jose, CA, US)
- Wentao Wu (Milpitas, CA, US)
- Thad Omura (Los Altos, CA, US)
- Yang Liu (Milpitas, CA, US)
- Tong Zhang (Albany, NY, US)
CPC classification
- G06F3/0659 (PHYSICS)
- G06F3/0604 (PHYSICS)
- G06F12/0868 (PHYSICS)
- G06F2212/7201 (PHYSICS)
- G06F12/0873 (PHYSICS)
- G06F2212/7203 (PHYSICS)
International classification
Abstract
The present disclosure relates to the field of solid-state data storage, and particularly to improving the speed performance and reducing the cost of solid-state data storage devices. A host-managed data storage system according to embodiments includes a set of storage devices, each storage device including a write buffer and memory; and a host coupled to the set of storage devices, the host including: a storage device management module for managing data storage functions for each storage device; memory including: a front-end write buffer; a first mapping table for data stored in the front-end write buffer; and a second mapping table for data stored in the memory of each storage device.
Claims
1. A method for managing a data storage system including a set of flash-memory storage devices, each storage device including a write buffer and memory and supporting multi-stream data write, and a host coupled to the set of storage devices, wherein the write buffer of each storage device comprises volatile memory and is not protected against power loss, the method comprising: allocating space in a memory of the host for a front-end write buffer; storing, in the memory of the host, a first mapping table for data stored in the front-end write buffer and a second mapping table for data stored in the memory of each storage device; managing, by the host, data storage functions for each storage device, the data storage functions including: address mapping, managing the write buffer of each storage device, scheduling reads and writes for the memory of each storage device, and controlling movement of data from the write buffer of each storage device to the memory of the storage device; and, for each storage device, storing data from a block in the write buffer of the storage device to a parallel unit in the memory of the storage device if a write priority factor of the parallel unit is greater than a read priority factor of the parallel unit, wherein the read priority factor of the parallel unit is based on an amount of data read from the parallel unit over a fixed period of time, and wherein the write priority factor of the parallel unit is based on the amount of data in the front-end buffer that is associated with the parallel unit.
2. The method according to claim 1, further comprising, in response to a read request from the host: looking up a read address for the read request in the first mapping table; if the read address is not found in the first mapping table, looking up the read address in the second mapping table to determine the storage device associated with the read address, fetching data from the memory of the storage device associated with the read address, and sending the data to the host; and if the read address is found in the first mapping table, fetching data at the read address in the front-end write buffer and sending the data to the host.
3. The method according to claim 1, further comprising, in response to a write request from the host: looking up a write address for the write request in the first mapping table; if the write address is not found in the first mapping table, writing at least a portion of data in the write request to the front-end write buffer; if the write address is found in the first mapping table, writing the data to the write address in the front-end write buffer, determining if the data has already been copied from the front-end buffer to one of the storage devices, and if the data has already been copied to one of the storage devices, writing the data to the write buffer of that storage device.
4. A host-managed data storage system, comprising: a set of storage devices, each storage device including a write buffer and memory, wherein the write buffer of each storage device comprises volatile memory and is not protected against power loss; and a host coupled to the set of storage devices, the host including: a storage device management module for managing data storage functions for each storage device, wherein the data storage functions for each storage device managed by the storage device management module include: address mapping, management of the write buffer of each storage device, and read and write scheduling for the memory of each storage device; memory including: a front-end write buffer; a first mapping table for data stored in the front-end write buffer; and a second mapping table for data stored in the memory of each storage device, wherein each storage device supports multi-stream data write, and wherein the write buffer of each storage device has a size of k·n_c, where k is a constant and n_c denotes the amount of data being written to the memory of the storage device in each stream at the same time, wherein each storage device includes a number of parallel units and each parallel unit has a number of write streams, and wherein a total capacity of the write buffer of each storage device is independent from the number of parallel units in the storage device and the number of write streams in each parallel unit, wherein, for each storage device, the storage device management module controls movement of data from the write buffer of the storage device to the memory of the storage device, wherein, for each storage device, the storage device management module stores data from a block in the write buffer of the storage device to a parallel unit in the memory of the storage device if a write priority factor of the parallel unit is greater than a read priority factor of the parallel unit, wherein the read priority factor of the parallel unit is based on an amount of data read from the parallel unit over a fixed period of time, and wherein the write priority factor of the parallel unit is based on the amount of data in the front-end buffer that is associated with the parallel unit.
5. The storage system according to claim 4, wherein each storage device comprises a NAND flash memory storage device.
6. The storage system according to claim 4, wherein the set of storage devices support a total of d·p·m streams, where d denotes the number of storage devices in the set of storage devices, p denotes the number of parallel units supported by each storage device, and m denotes the number of streams supported in each parallel unit.
7. The storage system according to claim 6, wherein the front-end buffer in the memory on the host has a capacity of d·p·m·n_c.
8. The storage system according to claim 4, wherein, in response to a read request from the host, the storage device management module is configured to: look up a read address for the read request in the first mapping table; and if the read address is not found in the first mapping table, look up the read address in the second mapping table to determine the storage device associated with the read address, fetch data from the memory of the storage device associated with the read address, and send the data to the host.
9. The storage system according to claim 4, wherein, in response to a read request from the host, the storage device management module is configured to: look up a read address for the read request in the first mapping table; and if the read address is found in the first mapping table, fetch data at the read address in the front-end write buffer and send the data to the host.
10. The storage system according to claim 4, wherein, in response to a write request from the host, the storage device management module is configured to: look up a write address for the write request in the first mapping table; and if the write address is not found in the first mapping table, write at least a portion of data in the write request to the front-end write buffer.
11. The storage system according to claim 4, wherein, in response to a write request from the host, the storage device management module is configured to: look up a write address for the write request in the first mapping table; and if the write address is found in the first mapping table: write the data to the write address in the front-end write buffer; determine if the data has already been copied from the front-end buffer to one of the storage devices; and if the data has already been copied to one of the storage devices, write the data to the write buffer of that storage device.
12. A host-managed data storage system, comprising: a set of storage devices, each storage device including a write buffer and memory, wherein the write buffer of each storage device comprises volatile memory and is not protected against power loss; and a host coupled to the set of storage devices, the host including: a storage device management module for managing data storage functions for each storage device, wherein the data storage functions for each storage device managed by the storage device management module include: address mapping, management of the write buffer of each storage device, and read and write scheduling for the memory of each storage device; memory including: a front-end write buffer; a first mapping table for data stored in the front-end write buffer; and a second mapping table for data stored in the memory of each storage device, wherein each storage device supports multi-stream data write, and wherein the write buffer of each storage device has a size of k·n_c, where k is a constant and n_c denotes the amount of data being written to the memory of the storage device in each stream at the same time, wherein the set of storage devices support a total of d·p·m streams, where d denotes the number of storage devices in the set of storage devices, p denotes the number of parallel units supported by each storage device, and m denotes the number of streams supported in each parallel unit, wherein the front-end buffer in the memory on the host has a capacity of d·p·m·n_c, wherein, for each storage device, the storage device management module controls movement of data from the write buffer of the storage device to the memory of the storage device, wherein, for each storage device, the storage device management module stores data from a block in the write buffer of the storage device to a parallel unit in the memory of the storage device if a write priority factor of the parallel unit is greater than a read priority factor of the parallel unit, wherein the read priority factor of the parallel unit is based on an amount of data read from the parallel unit over a fixed period of time, and wherein the write priority factor of the parallel unit is based on the amount of data in the front-end buffer that is associated with the parallel unit.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The numerous advantages of the present disclosure may be better understood by those skilled in the art by reference to the accompanying figures.
DETAILED DESCRIPTION
(9) Reference will now be made in detail to embodiments of the disclosure, examples of which are illustrated in the accompanying drawings.
(10) In conventional design practice, as illustrated in
(11) In order to improve the performance of solid-state data storage devices, it is highly desirable for solid-state data storage devices to have the following three features: (1) support multi-stream data write, (2) provide true I/O isolation by using separate parallel units, and (3) flexibly schedule NAND flash memory read and write operations, giving read operations a higher priority. However, to effectively implement these three features, solid-state data storage devices must integrate a large-capacity non-volatile write buffer, which unfortunately can noticeably increase the fabrication cost of solid-state data storage devices. In addition, since such a buffer demands stronger power loss protection, it may degrade storage device stability.
(12) According to embodiments, the present disclosure presents a design strategy that can implement these three features to improve the performance of solid-state data storage devices without incurring the cost penalty suffered by conventional solid-state data storage devices. In particular, this may be achieved by providing host-side storage device management and host-side non-volatile memory.
(13) Host-Side Storage Device Management
(14) As described above with regard to
(15) Host-Side Non-Volatile Memory
(16) According to embodiments, the host 22 is equipped with non-volatile memory 30 (e.g., NVDIMM), from which the management module 26 can allocate and own a certain amount of memory space. Let n_c denote the amount of data being written to NAND flash memory 32 on the storage device 24 in each stream at the same time. The storage device 24 internally contains a write buffer 34 with a total size of k·n_c, where the constant factor k is relatively small (e.g., 4 or 6). Hence, the total capacity of the write buffer 34 in the storage device 24 is largely independent of the number of parallel units and the number of write streams in each parallel unit. To minimize the cost of the storage device 24, the write buffer 34 in the storage device 24 does not have to be non-volatile (i.e., it does not need power loss protection and can be implemented with volatile memory). As further illustrated in
(17) The management module 26 can leverage the host-side non-volatile memory 30 to improve the performance of storage devices 24 using the techniques described below. Let d denote the number of storage devices 24 coupled to a host 22. Let p denote the number of parallel units being supported by one storage device 24. Let m denote the number of streams being supported in each parallel unit. Therefore, the d storage devices 24 support a total of d·p·m streams. As illustrated in
(18) The management module 26 also allocates additional space from the host-side non-volatile memory 30 to provide a back-end unified write buffer 44. Let h denote the capacity of the back-end unified write buffer 44. Therefore, the management module 26 allocates a total of h + d·p·m·n_c of write buffer space in the host-side non-volatile memory 30. In addition, the management module 26 maintains two address mapping tables in the host memory (either volatile or non-volatile memory space). The first mapping table is a non-volatile memory (NVM) mapping table 48, which covers the data being stored in the write buffers 42, 44 (with the capacity of h + d·p·m·n_c) in the host-side non-volatile memory 30. The second mapping table is a flash mapping table 50, which covers the data being stored in the NAND flash memory 32 in the storage devices 24.
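The buffer and mapping-table layout described above can be sketched as follows. This is a minimal illustration, not part of the disclosed embodiments; the class name and dictionary-based tables are hypothetical simplifications, with capacities expressed in the same units as n_c:

```python
class HostBufferLayout:
    """Illustrative sketch of the host-side non-volatile memory layout:
    a front-end write buffer 42 (one block of size n_c per stream), a
    back-end unified write buffer 44 of capacity h, and the two address
    mapping tables 48 and 50 maintained by the management module 26."""

    def __init__(self, d, p, m, n_c, h):
        self.d, self.p, self.m, self.n_c = d, p, m, n_c
        # Front-end write buffer: d * p * m stream blocks of size n_c each.
        self.front_end_capacity = d * p * m * n_c
        # Back-end unified write buffer of capacity h.
        self.back_end_capacity = h
        # Total host-side write buffer space: h + d * p * m * n_c.
        self.total_capacity = h + d * p * m * n_c
        # NVM mapping table 48: logical address -> location in buffers 42/44.
        self.nvm_mapping_table = {}
        # Flash mapping table 50: logical address -> (device, flash address).
        self.flash_mapping_table = {}
```

For example, with d = 4 devices, p = 8 parallel units, m = 2 streams per unit, and n_c = 64, the front-end buffer holds 4096 units of data, and the total allocation is h + 4096.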
(20) At process A3, if the data that are replaced have already been copied to the write buffer 34 in a storage device 24, then, at process A4, the new data are immediately copied to the write buffer 34 to replace the old data. Otherwise (N at process A3) flow passes to process A5.
(21) Suppose the residual to-be-written data in the write request has a capacity of w and targets the write stream s_i,j,k. Let r_i,j,k denote the space in the block b_i,j,k of the front-end write buffer 42 that has not yet been occupied with valid data. If the data can fit into the corresponding write buffer segment (i.e., w ≤ r_i,j,k) (Y at process A5), then, at process A6, the data is directly written into the corresponding write buffer block b_i,j,k in the front-end write buffer 42 of the host-side non-volatile memory 30. Otherwise (i.e., w > r_i,j,k) (N at process A5), the to-be-written data is partitioned into two subsets at process A7, where the size of one subset (denoted as w_1) is r_i,j,k, and the size of the other subset (denoted as w_2) is w − r_i,j,k. At process A8, the subset w_1 is written into the write buffer block b_i,j,k in the front-end write buffer 42 of the host-side non-volatile memory 30 and, at process A9, the subset w_2 is written into the unified write buffer 44 in the host-side non-volatile memory 30.
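The split logic of processes A5 through A9 can be sketched as follows. This is an illustrative simplification, not the disclosed implementation; the function name is hypothetical, buffers are modeled as plain lists of sizes, and w and r_i,j,k are treated as abstract capacities:

```python
def handle_residual_write(w, r, front_end_block, unified_buffer):
    """Sketch of processes A5-A9: place residual write data of size w
    into the front-end buffer block b_{i,j,k} with free space r, spilling
    any overflow into the back-end unified write buffer 44.
    Returns (amount written to front-end, amount written to unified)."""
    if w <= r:
        # Y at process A5 -> process A6: the data fits entirely.
        front_end_block.append(w)
        return w, 0
    # N at process A5 -> process A7: partition into w1 = r and w2 = w - r.
    w1, w2 = r, w - r
    front_end_block.append(w1)   # process A8: w1 into block b_{i,j,k}
    unified_buffer.append(w2)    # process A9: w2 into unified buffer 44
    return w1, w2
```

For instance, a residual write of w = 20 against a block with r = 16 free places 16 into the front-end block and spills 4 into the unified buffer.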
(24) Inside each storage device 24, data must be moved from the small write buffer 34 to NAND flash memory 32. The management module 26 cohesively schedules the intra-device data movement and normal read requests in order to improve the read operation service quality without overflowing the front-end write buffer 42 of the host-side non-volatile memory 30. To facilitate such cohesive scheduling, the management module 26 maintains the following runtime characteristics: (1) Let p_i,j denote the j-th parallel unit inside the i-th storage device 24. For each parallel unit inside each storage device 24, the management module 26 keeps a record of the amount of data that has been read from this parallel unit over a fixed most recent duration (e.g., 10 seconds), based upon which it maintains a read intensity factor for this parallel unit. The more data that has been read from this parallel unit over the fixed period, the higher the read intensity factor. Based upon the read intensity factor, a read priority factor t_i,j is set for each parallel unit. The higher the read intensity factor, the higher the read priority factor t_i,j. (2) Inside the front-end write buffer 42 of the host-side non-volatile memory 30, let d_i,j denote the amount of data that is associated with the parallel unit p_i,j. The management module 26 keeps a record of each d_i,j. Accordingly, a write priority factor g_i,j is set for each parallel unit. The larger the value of d_i,j, the higher the write priority factor g_i,j.
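The scheduling rule above, combined with the flush condition recited in the claims (data moves to a parallel unit only when its write priority factor exceeds its read priority factor), can be sketched as follows. The linear weights are illustrative assumptions; the disclosure does not specify how t_i,j and g_i,j are derived from the read intensity and d_i,j beyond their monotonic relationship:

```python
def flush_decision(read_bytes_recent, buffered_bytes,
                   read_weight=1.0, write_weight=1.0):
    """Illustrative sketch: t_ij (read priority) grows with the amount of
    data read from parallel unit p_ij over the recent fixed window, and
    g_ij (write priority) grows with d_ij, the amount of front-end-buffered
    data associated with p_ij. A block is flushed to flash only when the
    write priority exceeds the read priority (g_ij > t_ij)."""
    t_ij = read_weight * read_bytes_recent   # higher recent reads -> higher t
    g_ij = write_weight * buffered_bytes     # more buffered data -> higher g
    return g_ij > t_ij
```

A read-hot parallel unit (large t_i,j) thus defers intra-device writes, improving read latency, while a unit with a large backlog in the front-end buffer (large g_i,j) is flushed to avoid buffer overflow.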
(26) At process D6, if any block in the back-end unified write buffer 44 is associated with the parallel unit p_i,j (Y at process D6), then, at process D7, one block associated with the parallel unit p_i,j is moved from the back-end unified write buffer 44 to the front-end write buffer 42 in the host-side non-volatile memory 30. Otherwise (N at process D6), flow passes back to process D3.
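The refill step of processes D6 and D7 can be sketched as follows. This is an illustrative simplification with hypothetical names; buffers are modeled as lists of (parallel unit, block) pairs rather than regions of non-volatile memory:

```python
def refill_front_end(parallel_unit, unified_buffer, front_end_buffer):
    """Sketch of processes D6-D7: after a front-end block for parallel
    unit p_ij has been flushed, pull one block associated with p_ij (if
    any) from the back-end unified write buffer 44 into the front-end
    write buffer 42. Returns True if a block was moved (Y at D6),
    False otherwise (N at D6, flow returns to D3)."""
    for idx, (unit, block) in enumerate(unified_buffer):
        if unit == parallel_unit:
            unified_buffer.pop(idx)                  # process D7: remove from 44...
            front_end_buffer.append((unit, block))   # ...and append to 42
            return True
    return False
```

This keeps the fixed-size front-end buffer fed from the overflow space h, so that per-stream blocks are refilled as soon as capacity for the corresponding parallel unit frees up.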
(27) It is understood that aspects of the present disclosure may be implemented in any manner, e.g., as a software program, or an integrated circuit board or a controller card that includes a processing core, I/O and processing logic. Aspects may be implemented in hardware or software, or a combination thereof. For example, aspects of the processing logic may be implemented using field programmable gate arrays (FPGAs), ASIC devices, or other hardware-oriented systems.
(28) Aspects may be implemented with a computer program product stored on a computer readable storage medium. The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, etc. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
(29) Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Java, Python, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
(30) The computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. The computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
(31) Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by hardware and/or computer readable program instructions.
(32) The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
(33) The foregoing description of various aspects of the present disclosure has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the concepts disclosed herein to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to an individual skilled in the art are included within the scope of the present disclosure as defined by the accompanying claims.