MEMORY MODULE WITH DATA BUFFERING
20210382834 · 2021-12-09
Inventors
Cpc classification
G06F12/00
PHYSICS
G06F13/00
PHYSICS
Y02D10/00
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
G11C15/00
PHYSICS
G11C7/1072
PHYSICS
International classification
G06F12/00
PHYSICS
G06F13/00
PHYSICS
G11C15/00
PHYSICS
G11C5/04
PHYSICS
Abstract
A memory module operable to communicate data with a memory controller via a memory bus. The memory module comprises memory devices and logic configurable to receive and register a set of input address and control signals associated with a read or write memory command and to output data transfer control signals. The memory module further comprises circuitry coupled between the memory bus and the memory devices. The circuitry is configurable to be in any of a plurality of states including a first state and a second state, and to transition from the first state to the second state in response to the data transfer control signals. The circuitry in the first state is configured to disable signal communication through the circuitry. The circuitry in the second state is configured to transfer the data signals associated with the read or write command in accordance with a transfer time budget of the memory module.
Claims
1. A memory module operable in a computer system, the computer system including a memory controller coupled to a memory bus, the memory bus including address and control signal lines and data signal lines, the memory module comprising:
a printed circuit board having a plurality of edge connections configured to be electrically coupled to a corresponding plurality of contacts of a module slot of the computer system;
logic coupled to the printed circuit board and configurable to receive input address and control signals associated with a read or write memory command via the address and control signal lines and to output registered address and control signals corresponding to the input address and control signals, the input address and control signals including a plurality of input chip select signals and other input address and control signals, the plurality of input chip select signals at least including a first input chip select signal and a second input chip select signal, the first input chip select signal having an active value and the second input chip select signal having an inactive value, the registered address and control signals including a plurality of registered chip select signals and other registered address and control signals, the plurality of registered chip select signals at least including a first registered chip select signal corresponding to the first input chip select signal and a second registered chip select signal corresponding to the second input chip select signal, the first registered chip select signal having an active value and the second registered chip select signal having an inactive value, wherein the logic is further configurable to output data transfer control signals associated with the read or write memory command;
memory devices mounted on the printed circuit board, the memory devices at least including first memory devices and second memory devices, wherein the first memory devices are configured to receive the first registered chip select signal and the other registered address and control signals, and to receive or output data signals associated with the read or write command, and wherein the second memory devices are configured to receive the second registered chip select signal and the other registered address and control signals; and
circuitry coupled between the memory devices and the data signal lines in the memory bus, and configurable to be in any of a plurality of states including a first state and a second state, wherein:
the circuitry is configurable to transition from the first state to the second state in response to the data transfer control signals;
the circuitry in the first state is configured to disable signal communication through the circuitry;
the circuitry in the second state is configured to transfer the data signals associated with the read or write command via registered data transfers in accordance with a transfer time budget of the memory module;
the transfer time budget of the memory module includes a predetermined amount of time delay associated with the registered data transfers through the circuitry; and
an overall CAS latency of the memory module is greater than an actual operational CAS latency of each of the memory devices by at least the predetermined amount of time delay.
2. The memory module of claim 1, wherein each of the memory devices has a corresponding load, and the circuitry is configured to isolate the loads of the memory devices from the memory bus.
3. The memory module of claim 1, wherein the data signals include respective sets of consecutively transmitted data bits corresponding to respective data signal lines in the memory bus, and wherein each set of consecutively transmitted data bits are successively transferred through the circuitry in response to the data transfer control signals and in accordance with the transfer time budget of the memory module.
4. The memory module of claim 1, wherein each of the memory devices is 4-bits or 8-bits wide, and wherein the first memory devices form a rank that is 32-bits, 64-bits or 72-bits wide.
5. The memory module of claim 1, wherein each of the memory devices is 4-bits wide, and wherein the memory devices are configured in 8-bit-wide pairs.
6. The memory module of claim 1, wherein the circuitry includes logic pipelines configurable to enable data transfers between the memory devices and the memory bus through the circuitry.
7. The memory module of claim 1, wherein the memory module has a specified data rate, and wherein the data signals are transferred between the first memory devices and the memory controller at the specified data rate.
8. The memory module of claim 1, further comprising a phase locked loop clock driver configured to output a clock signal in response to one or more signals received from the memory controller, wherein the predetermined amount of time delay is at least one clock cycle time delay.
9. The memory module of claim 8, wherein the memory devices are dynamic random access memory devices configured to operate synchronously with the clock signal, and wherein each memory device in the first memory devices is configured to receive or output a respective set of data strobes and to receive or output data bits on both edges of each of the respective set of data strobes.
10. The memory module of claim 1, wherein the circuitry includes data paths that are disabled when the circuitry is in the first state and enabled when the circuitry is in the second state.
11. The memory module of claim 10, wherein the data paths are disabled when no data signals associated with any memory command are being transferred through the circuitry.
12. The memory module of claim 10, wherein the memory module has a specified data rate, and wherein the data signals are transferred through the data paths at the specified data rate.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
Load Isolation
[0042]
[0043] As used herein, the term “load” is a broad term which includes, without limitation, electrical load, such as capacitive load, inductive load, or impedance load. As used herein, the term “isolation” is a broad term which includes, without limitation, electrical separation of one or more components from another component or from one another. As used herein, the term “circuit” is a broad term which includes, without limitation, an electrical component or device, or a configuration of electrical components or devices which are electrically or electromagnetically coupled together (e.g., integrated circuits), to perform specific functions.
[0044] Various types of memory modules 10 are compatible with embodiments described herein. For example, memory modules 10 having memory capacities of 512-MB, 1-GB, 2-GB, 4-GB, 8-GB, as well as other capacities, are compatible with embodiments described herein. Certain embodiments described herein are applicable to various frequencies including, but not limited to, 100 MHz, 200 MHz, 400 MHz, 800 MHz, and above. In addition, memory modules 10 having widths of 4 bytes, 8 bytes, 16 bytes, 32 bytes, or 32 bits, 64 bits, 128 bits, 256 bits, as well as other widths (in bytes or in bits), are compatible with embodiments described herein. In certain embodiments, the memory module 10 comprises a printed circuit board on which the memory devices 30 are mounted, a plurality of edge connectors configured to be electrically coupled to a corresponding plurality of contacts of a module slot of the computer system, and a plurality of electrical conduits which electrically couple the memory devices 30 to the circuit 40 and which electrically couple the circuit 40 to the edge connectors. Furthermore, memory modules 10 compatible with embodiments described herein include, but are not limited to, single in-line memory modules (SIMMs), dual in-line memory modules (DIMMs), small-outline DIMMs (SO-DIMMs), unbuffered DIMMs (UDIMMs), registered DIMMs (RDIMMs), fully-buffered DIMMs (FBDIMMs), rank-buffered DIMMs (RBDIMMs), mini-DIMMs, and micro-DIMMs.
[0045] Memory devices 30 compatible with embodiments described herein include, but are not limited to, random-access memory (RAM), dynamic random-access memory (DRAM), synchronous DRAM (SDRAM), and double-data-rate DRAM (e.g., SDR, DDR-1, DDR-2, DDR-3). In addition, memory devices 30 having bit widths of 4, 8, 16, 32, as well as other bit widths, are compatible with embodiments described herein. Memory devices 30 compatible with embodiments described herein have packaging which includes, but is not limited to, thin small-outline package (TSOP), ball-grid-array (BGA), fine-pitch BGA (FBGA), micro-BGA (μBGA), mini-BGA (mBGA), and chip-scale packaging (CSP). Memory devices 30 compatible with embodiments described herein are available from a number of sources, including but not limited to, Samsung Semiconductor, Inc. of San Jose, Calif., Infineon Technologies AG of San Jose, Calif., and Micron Technology, Inc. of Boise, Id. Persons skilled in the art can select appropriate memory devices 30 in accordance with certain embodiments described herein.
[0046] In certain embodiments, the plurality of memory devices 30 comprises a first number of memory devices 30. In certain such embodiments, the circuit 40 selectively isolates a second number of the memory devices 30 from the computer system, with the second number less than the first number.
[0047] In certain embodiments, the plurality of memory devices 30 are arranged in a first number of ranks. For example, in certain embodiments, the memory devices 30 are arranged in two ranks, as schematically illustrated by
[0048] In certain embodiments, the circuit 40 comprises a logic element selected from a group consisting of: a programmable-logic device (PLD), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a custom-designed semiconductor device, and a complex programmable-logic device (CPLD). In certain embodiments, the logic element of the circuit 40 is a custom device. Sources of logic elements compatible with embodiments described herein include, but are not limited to, Lattice Semiconductor Corporation of Hillsboro, Oreg., Altera Corporation of San Jose, Calif., and Xilinx Incorporated of San Jose, Calif. In certain embodiments, the logic element comprises various discrete electrical elements, while in certain other embodiments, the logic element comprises one or more integrated circuits.
[0049] In certain embodiments, the circuit 40 further comprises one or more switches which are operatively coupled to the logic element to receive control signals from the logic element. Examples of switches compatible with certain embodiments described herein include, but are not limited to, field-effect transistor (FET) switches, such as the SN74AUC1G66 single bilateral analog switch available from Texas Instruments, Inc. of Dallas, Tex.
[0050]
[0051] In certain embodiments, the circuit 40 selectively isolates the loads of at least some of the memory devices 30 from the computer system. The circuit 40 of certain embodiments is configured to present a significantly reduced load to the computer system. In certain embodiments in which the memory devices 30 are arranged in a plurality of ranks, the circuit 40 selectively isolates the loads of some (e.g., one or more) of the ranks of the memory module 10 from the computer system. In certain other embodiments, the circuit 40 selectively isolates the loads of all of the ranks of the memory module 10 from the computer system. For example, when a memory module 10 is not being accessed by the computer system, the capacitive load on the memory controller 20 of the computer system by the memory module 10 can be substantially reduced to the capacitive load of the circuit 40 of the memory module 10.
[0052] As schematically illustrated by
[0053] For example, in certain embodiments, the circuit 40 comprises a pair of switches 120a, 120b on the DQ data signal lines 102a, 102b as schematically illustrated by
[0054] In certain embodiments, the circuit 40 selectively isolates the loads of ranks of memory devices 30 from the computer system. As schematically illustrated in
[0055] The circuit 40 of
[0056] In the example embodiments schematically illustrated by
[0057] In certain embodiments, the load isolation provided by the circuit 40 advantageously allows the memory module 10 to present a reduced load (e.g., electrical load, such as capacitive load, inductive load, or impedance load) to the computer system by selectively switching between the two ranks of memory devices 30 to which it is coupled. This feature is used in certain embodiments in which the load of the memory module 10 may otherwise limit the number of ranks or the number of memory devices per memory module. In certain embodiments, the memory module 10 operates as having a data path rank buffer which advantageously isolates the ranks of memory devices 30 of the memory module 10 from one another, from the ranks on other memory modules, and from the computer system. This data path rank buffer of certain embodiments advantageously provides DQ-DQS paths for each rank or sets of ranks of memory devices which are separate from one another, or which are separate from the memory controller of the computer system. In certain embodiments, the load isolation advantageously diminishes the effects of capacitive loading, jitter and other sources of noise. In certain embodiments, the load isolation advantageously simplifies various other aspects of operation of the memory module 10, including but not limited to, setup-and-hold time, clock skew, package skew, and process, temperature, voltage, and transmission line variations.
[0058] For certain memory module applications that utilize multiple ranks of memory, increased load on the memory bus can degrade speed performance. In certain embodiments described herein, selectively isolating the loads of the ranks of memory devices 30 advantageously decreases the load on the computer system, thereby allowing the computer system (e.g., server) to run faster with improved signal integrity. In certain embodiments, load isolation advantageously provides system memory with reduced electrical loading, thereby improving the electrical topology to the memory controller 20. In certain such embodiments, the speed and the memory density of the computer system are advantageously increased without sacrificing one for the other.
[0059] In certain embodiments, load isolation advantageously increases the size of the memory array supported by the memory controller 20 of the computer system. The larger memory array has an increased number of memory devices 30 and ranks of memory devices 30 of the memory module 10, with a corresponding increased number of chip selects. Certain embodiments described herein advantageously provide more system memory using fewer chip selects, thereby avoiding the chip select limitation of the memory controller.
[0060] An exemplary section of Verilog code corresponding to logic compatible with a circuit 40 which provides load isolation is listed below in Example 1. The exemplary code of Example 1 corresponds to a circuit 40 comprising six FET switches for providing load isolation to DQ and DQS lines.
Example 1
[0061]
//======================== declarations
reg rasN_R, casN_R, weN_R;
wire actv_cmd_R, pch_cmd_R, pch_all_cmd_R, ap_xfr_cmd_R_R;
wire xfr_cmd_R, mrs_cmd, rd_cmd_R;
//----------------------- DDR 2 FET
reg brs0N_R; // registered chip sel
reg brs1N_R; // registered chip sel
reg brs2N_R; // registered chip sel
reg brs3N_R; // registered chip sel
wire sel;
wire sel_01;
wire sel_23;
wire rd_R1;
wire wr_cmd_R, wr_R1;
reg rd_R2, rd_R3, rd_R4, rd_R5;
reg wr_R2, wr_R3, wr_R4, wr_R5;
reg enfet1, enfet2, enfet3, enfet4, enfet5, enfet6;
wire wr_01_R1, wr_23_R1;
reg wr_01_R2, wr_01_R3, wr_01_R4;
reg wr_23_R2, wr_23_R3, wr_23_R4;
wire rodt0_a, rodt0_b;
//========================== logic
// register the chip selects and command signals on each clock edge
always @(posedge clk_in)
begin
    brs0N_R <= brs0_in_N; // cs0
    brs1N_R <= brs1_in_N; // cs1
    brs2N_R <= brs2_in_N; // cs2
    brs3N_R <= brs3_in_N; // cs3
    rasN_R <= brras_in_N;
    casN_R <= brcas_in_N;
    weN_R <= bwe_in_N;
end
// chip selects are active low, so a rank is selected when its signal is 0
assign sel = ~brs0N_R | ~brs1N_R | ~brs2N_R | ~brs3N_R;
assign sel_01 = ~brs0N_R | ~brs1N_R;
assign sel_23 = ~brs2N_R | ~brs3N_R;
assign actv_cmd_R = !rasN_R & casN_R & weN_R;   // activate cmd
assign pch_cmd_R = !rasN_R & casN_R & !weN_R;   // pchg cmd
assign xfr_cmd_R = rasN_R & !casN_R;            // xfr cmd
assign mrs_cmd = !rasN_R & !casN_R & !weN_R;    // md reg set cmd
assign rd_cmd_R = rasN_R & !casN_R & weN_R;     // read cmd
assign wr_cmd_R = rasN_R & !casN_R & !weN_R;    // write cmd
//-------------------------------------
assign rd_R1 = sel & rd_cmd_R; // rd cmd cyc 1
assign wr_R1 = sel & wr_cmd_R; // wr cmd cyc 1
//-------------------------------------
// delay lines that track a read or write command over the following cycles
always @(posedge clk_in)
begin
    rd_R2 <= rd_R1;
    rd_R3 <= rd_R2;
    rd_R4 <= rd_R3;
    rd_R5 <= rd_R4;
    // rd0_o_R6 <= rd0_o_R5;
    wr_R2 <= wr_R1;
    wr_R3 <= wr_R2;
    wr_R4 <= wr_R3;
    wr_R5 <= wr_R4;
end
//-------------------------------------
assign wr_01_R1 = sel_01 & wr_cmd_R; // wr cmd cyc 1 for cs0 & cs1
assign wr_23_R1 = sel_23 & wr_cmd_R; // wr cmd cyc 1 for cs2 & cs3
always @(posedge clk_in)
begin
    wr_01_R2 <= wr_01_R1;
    wr_01_R3 <= wr_01_R2;
    wr_01_R4 <= wr_01_R3;
    wr_23_R2 <= wr_23_R1;
    wr_23_R3 <= wr_23_R2;
    wr_23_R4 <= wr_23_R3;
end
assign rodt0_ab = (rodt0)   // odt cmd from sys
    | (wr_23_R1)            // wr 1st cyc to other rnks (assume single dimm per channel)
    | (wr_23_R2)            // wr 2nd cyc to other rnks (assume single dimm per channel)
    | (wr_23_R3)            // wr 3rd cyc to other rnks (assume single dimm per channel)
    ;
assign rodt1_ab = (rodt1)   // odt cmd from sys
    | (wr_01_R1)            // wr 1st cyc to other rnks (assume single dimm per channel)
    | (wr_01_R2)            // wr 2nd cyc to other rnks (assume single dimm per channel)
    | (wr_01_R3)            // wr 3rd cyc to other rnks (assume single dimm per channel)
    ;
//-------------------------------------
// enable the FET switches only while read or write data is passing through
always @(posedge clk_in)
begin
    if (
          (rd_R2)   // pre-am rd
        | (rd_R3)   // 1st cyc of rd brst (cl3)
        | (rd_R4)   // 2nd cyc of rd brst (cl3)
        | (wr_R1)   // pre-am wr
        | (wr_R2)   // wr brst 1st cyc
        | (wr_R3)   // wr brst 2nd cyc
    ) begin
        enfet1 <= 1'b1; // enable fet
        enfet2 <= 1'b1; // enable fet
        enfet3 <= 1'b1; // enable fet
        enfet4 <= 1'b1; // enable fet
        enfet5 <= 1'b1; // enable fet
        enfet6 <= 1'b1; // enable fet
    end
    else
    begin
        enfet1 <= 1'b0; // disable fet
        enfet2 <= 1'b0; // disable fet
        enfet3 <= 1'b0; // disable fet
        enfet4 <= 1'b0; // disable fet
        enfet5 <= 1'b0; // disable fet
        enfet6 <= 1'b0; // disable fet
    end
end
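The enable window produced by the logic of Example 1 can be illustrated with a small cycle-by-cycle model. The following Python sketch is not part of the original listing; the function name fet_enable_trace and its list-based inputs are hypothetical and are introduced only for illustration. It assumes that the inputs give the values of rd_R1 and wr_R1 in each clock cycle and that all registers start at zero.

```python
def fet_enable_trace(rd_cmds, wr_cmds):
    """Model the enfet registers of Example 1, one entry per clock cycle.

    rd_cmds, wr_cmds: lists of 0/1 giving rd_R1 and wr_R1 in each cycle
    (hypothetical inputs; all registers are assumed to reset to 0).
    Returns the enfet value visible during each cycle.
    """
    rd = [0, 0, 0]   # rd_R2, rd_R3, rd_R4
    wr = [0, 0]      # wr_R2, wr_R3
    enfet = 0
    trace = []
    for rd_r1, wr_r1 in zip(rd_cmds, wr_cmds):
        trace.append(enfet)
        # Values sampled at the clock edge that ends this cycle.
        next_enfet = int(any(rd) or wr_r1 or any(wr))
        rd = [rd_r1, rd[0], rd[1]]
        wr = [wr_r1, wr[0]]
        enfet = next_enfet
    return trace


if __name__ == "__main__":
    # A single write command in cycle 0, then a single read command in cycle 0.
    print(fet_enable_trace([0] * 6, [1, 0, 0, 0, 0, 0]))
    print(fet_enable_trace([1, 0, 0, 0, 0, 0, 0], [0] * 7))
```

Under these assumptions, a write opens the switches for three cycles starting one cycle after the command, and a read opens them for three cycles starting two cycles after the command; the switches are disabled at all other times.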
Back-to-Back Adjacent Read Commands
[0062] Due to their source synchronous nature, DDR SDRAM (e.g., DDR1, DDR2, DDR3) memory devices operate with a data transfer protocol which surrounds each burst of data strobes with a pre-amble time interval and a post-amble time interval. The pre-amble time interval provides a timing window for the receiving memory device to enable its data capture circuitry when a known valid level is present on the strobe signal to avoid false triggers of the memory device's capture circuit. The post-amble time interval provides extra time after the last strobe for this data capture to facilitate good signal integrity. In certain embodiments, when the computer system accesses two consecutive bursts of data from the same memory device, termed herein as a “back-to-back adjacent read,” the post-amble time interval of the first read command and the pre-amble time interval of the second read command are skipped by design protocol to increase read efficiency.
[0063] In certain embodiments, when the second read command accesses data from a different memory device than does the first read command, there is at least one time interval (e.g., clock cycle) inserted between the data strobes of the two memory devices. This inserted time interval allows both read data bursts to occur without the post-amble time interval of the first read data burst colliding or otherwise interfering with the pre-amble time interval of the second read data burst. In certain embodiments, the memory controller of the computer system inserts an extra clock cycle between successive read commands issued to different memory devices, as shown in the exemplary timing diagram of
[0064] In typical computer systems, the memory controller is informed of the memory boundaries between the ranks of memory of the memory module prior to issuing read commands to the memory module. Such memory controllers can insert wait time intervals or clock cycles to avoid collisions or interference between back-to-back adjacent read commands which cross memory device boundaries, which are referred to herein as “BBARX.”
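The wait-cycle insertion described above can be sketched in a few lines of Python. This is an illustrative model only: the function name schedule_reads and the default burst length of two clock cycles and gap of one clock cycle are assumptions, not values taken from this description.

```python
def schedule_reads(ranks, burst_cycles=2, gap_cycles=1):
    """Return the starting clock cycle of each read data burst.

    ranks: the rank (memory device boundary) addressed by each successive
    read command, as known to the memory controller.
    burst_cycles, gap_cycles: assumed values for illustration only.
    """
    starts = []
    cycle = 0
    previous = None
    for rank in ranks:
        # Crossing a device boundary: leave room for the post-amble of the
        # previous burst and the pre-amble of the next one.
        if previous is not None and rank != previous:
            cycle += gap_cycles
        starts.append(cycle)
        cycle += burst_cycles
        previous = rank
    return starts


if __name__ == "__main__":
    print(schedule_reads([0, 0, 1, 1, 0]))
```

Reads that stay within one rank are scheduled back to back, while each change of rank is delayed by the gap. A controller that is unaware of a boundary, as discussed in the following paragraph, would in effect treat all entries of ranks as equal and insert no gap.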
[0065] In certain embodiments described herein in which the number of ranks 32 of the memory module 10 is doubled or quadrupled, the circuit 40 generates a set of output address and command signals so that the selection decoding is transparent to the computer system. However, in certain such embodiments, there are memory device boundaries of which the computer system is unaware, so there are occasions in which BBARX occurs without the cognizance of the memory controller 20 of the computer system. As shown in
[0066]
[0067]
[0068] In certain embodiments, as schematically illustrated by
[0069] As shown in
[0070] In certain embodiments, as schematically illustrated by
[0071] In certain embodiments, the circuit 40 also provides the load isolation described above in reference to
[0072] The circuit 40 of certain embodiments controls the isolation of the DQS data strobe signal lines 104a, 104b by monitoring commands received by the memory module 10 from the computer system and producing “windows” of operation whereby the appropriate switches 130 are activated or deactivated to enable and disable the DQS data strobe signal lines 104a, 104b to mitigate BBARX collisions. In certain other embodiments, the circuit 40 monitors the commands received by the memory module 10 from the computer system and selectively activates or deactivates the switches 120 to enable and disable the DQ data signal lines 102a, 102b to reduce the load of the memory module 10 on the computer system. In still other embodiments, the circuit 40 performs both of these functions together.
Command Signal Translation
[0073] Most high-density memory modules are currently built with 512-Megabit (“512-Mb”) memory devices wherein each memory device has a 64 M×8-bit configuration. For example, a 1-Gigabyte (“1-GB”) memory module with error checking capabilities can be fabricated using eighteen such 512-Mb memory devices. Alternatively, it can be economically advantageous to fabricate a 1-GB memory module using lower-density memory devices and doubling the number of memory devices used to produce the desired word width. For example, by fabricating a 1-GB memory module using thirty-six 256-Mb memory devices with 64 M×4-bit configuration, the cost of the resulting 1-GB memory module can be reduced since the unit cost of each 256-Mb memory device is typically lower than one-half the unit cost of each 512-Mb memory device. The cost savings can be significant, even though twice as many 256-Mb memory devices are used in place of the 512-Mb memory devices. For example, by using pairs of 512-Mb memory devices rather than single 1-Gb memory devices, certain embodiments described herein reduce the cost of the memory module by a factor of up to approximately five.
[0074] Market pricing factors for DRAM devices are such that higher-density DRAM devices (e.g., 1-Gb DRAM devices) are much more than twice the price of lower-density DRAM devices (e.g., 512-Mb DRAM devices). In other words, the price per bit ratio of the higher-density DRAM devices is greater than that of the lower-density DRAM devices. This pricing difference often lasts for months or even years after the introduction of the higher-density DRAM devices, until volume production factors reduce the costs of the newer higher-density DRAM devices. Thus, when the cost of a higher-density DRAM device is more than the cost of two lower-density DRAM devices, there is an economic incentive for utilizing pairs of the lower-density DRAM devices to replace individual higher-density DRAM devices.
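The device counts and the pricing condition discussed above can be checked with simple arithmetic. The Python sketch below assumes a 72-bit-wide module (64 data bits plus 8 error-checking bits); the function names and the example prices are hypothetical.

```python
def devices_needed(data_bytes, device_mbit, device_width, bus_width=72):
    """Return (device_count, devices_per_rank, ranks) for an ECC module.

    data_bytes: usable data capacity of the module in bytes.
    device_mbit: density of each memory device in megabits.
    device_width: data width of each memory device in bits.
    bus_width: assumed module width, 64 data bits plus 8 check bits.
    """
    total_mbit = data_bytes * 8 // 2**20 * bus_width // 64
    device_count = total_mbit // device_mbit
    devices_per_rank = bus_width // device_width
    return device_count, devices_per_rank, device_count // devices_per_rank


def pair_is_cheaper(low_density_price, high_density_price):
    """True when two lower-density devices cost less than one higher-density device."""
    return 2 * low_density_price < high_density_price


if __name__ == "__main__":
    one_gb = 2**30
    print(devices_needed(one_gb, 512, 8))   # 512-Mb devices, 8 bits wide
    print(devices_needed(one_gb, 256, 4))   # 256-Mb devices, 4 bits wide
    print(pair_is_cheaper(10.0, 25.0))      # hypothetical prices
```

With these assumptions, a 1-GB module with error checking needs eighteen 512-Mb devices or thirty-six 256-Mb devices, matching the counts given above.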
[0075]
[0076] In certain embodiments, as schematically illustrated in
[0077] In certain embodiments, the memory module 10 further comprises electrical components which are electrically coupled to one another and are surface-mounted or embedded on the printed circuit board 210. These electrical components can include, but are not limited to, electrical conduits, resistors, capacitors, inductors, and transistors. In certain embodiments, at least some of these electrical components are discrete, while in other certain embodiments, at least some of these electrical components are constituents of one or more integrated circuits.
[0078] In certain embodiments, the printed circuit board 210 is mountable in a module slot of the computer system. The printed circuit board 210 of certain such embodiments has a plurality of edge connections electrically coupled to corresponding contacts of the module slot and to the various components of the memory module 10, thereby providing electrical connections between the computer system and the components of the memory module 10.
[0079] In certain embodiments, the plurality of memory devices 30 are arranged in a first number of ranks 32. For example, in certain embodiments, the memory devices 30 are arranged in four ranks 32a, 32b, 32c, 32d, as schematically illustrated by
[0080] As schematically illustrated by
[0081] In certain embodiments, the set of output address and command signals corresponds to a first number of ranks in which the plurality of memory devices 30 of the memory module 10 are arranged, and the set of input address and command signals corresponds to a second number of ranks per memory module for which the computer system is configured. The second number of ranks in certain embodiments is smaller than the first number of ranks. For example, in the exemplary embodiment as schematically illustrated by
[0082] In certain embodiments, the computer system is configured for a number of ranks per memory module which is smaller than the number of ranks in which the memory devices 30 of the memory module 10 are arranged. In certain such embodiments, the computer system is configured for two ranks of memory per memory module (providing two chip-select signals CS.sub.0, CS.sub.1) and the plurality of memory devices 30 of the memory module 10 are arranged in four ranks, as schematically illustrated by
[0083] In the exemplary embodiment schematically illustrated by
Logic Tables
[0084] Table 1 provides a logic table compatible with certain embodiments described herein for the selection among ranks of memory devices 30 using chip-select signals.
TABLE 1
State  CS.sub.0  CS.sub.1  A.sub.n+1  Command  CS.sub.0A  CS.sub.0B  CS.sub.1A  CS.sub.1B
1      0         1         0          Active   0          1          1          1
2      0         1         1          Active   1          0          1          1
3      0         1         x          Active   0          0          1          1
4      1         0         0          Active   1          1          0          1
5      1         0         1          Active   1          1          1          0
6      1         0         x          Active   1          1          0          0
7      1         1         x          x        1          1          1          1
Note:
1. CS.sub.0, CS.sub.1, CS.sub.0A, CS.sub.0B, CS.sub.1A, and CS.sub.1B are active low signals.
2. A.sub.n+1 is an active high signal.
3. 'x' is a Don't Care condition.
4. Command involves a number of command signals that define operations such as refresh, precharge, and other operations.
[0085] In Logic State 1: CS.sub.0 is active low, A.sub.n+1 is non-active, and Command is active. CS.sub.0A is pulled low, thereby selecting Rank 0.
[0086] In Logic State 2: CS.sub.0 is active low, A.sub.n+1 is active, and Command is active. CS.sub.0B is pulled low, thereby selecting Rank 1.
[0087] In Logic State 3: CS.sub.0 is active low, A.sub.n+1 is Don't Care, and Command is active. CS.sub.0A and CS.sub.0B are pulled low, thereby selecting Ranks 0 and 1.
[0088] In Logic State 4: CS.sub.1 is active low, A.sub.n+1 is non-active, and Command is active. CS.sub.1A is pulled low, thereby selecting Rank 2.
[0089] In Logic State 5: CS.sub.1 is active low, A.sub.n+1 is active, and Command is active. CS.sub.1B is pulled low, thereby selecting Rank 3.
[0090] In Logic State 6: CS.sub.1 is active low, A.sub.n+1 is Don't Care, and Command is active. CS.sub.1A and CS.sub.1B are pulled low, thereby selecting Ranks 2 and 3.
[0091] In Logic State 7: CS.sub.0 and CS.sub.1 are pulled non-active high, which deselects all ranks, i.e., CS.sub.0A, CS.sub.0B, CS.sub.1A, and CS.sub.1B are pulled high.
[0092] The “Command” column of Table 1 represents the various commands that a memory device (e.g., a DRAM device) can execute, examples of which include, but are not limited to, activation, read, write, precharge, and refresh. In certain embodiments, the command signal is passed through to the selected rank only (e.g., state 4 of Table 1). In such embodiments, the command signal (e.g., read) is sent to only one memory device or the other memory device so that data is supplied from one memory device at a time. In other embodiments, the command signal is passed through to both associated ranks (e.g., state 6 of Table 1). In such embodiments, the command signal (e.g., refresh) is sent to both memory devices to ensure that the memory content of the memory devices remains valid over time. Certain embodiments utilize a logic table such as that of Table 1 to simulate a single memory device from two memory devices by selecting two ranks concurrently.
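The selection logic of Table 1 can be expressed as a short function. The Python sketch below is illustrative only; the function name decode_chip_selects is hypothetical, and the use of None to represent the Don't Care value of A.sub.n+1 (a command, such as refresh, that is passed to both associated ranks) is an assumption made for this example.

```python
def decode_chip_selects(cs0, cs1, a_n1):
    """Map the input chip selects and A.sub.n+1 to the four output chip selects.

    cs0, cs1: active-low input chip selects (0 means selected).
    a_n1: 0 or 1 to pick one rank of the pair, or None when the command is
    passed to both associated ranks (the 'x' rows of Table 1).
    Returns a dict of active-low outputs CS0A, CS0B, CS1A, CS1B.
    """
    out = {"CS0A": 1, "CS0B": 1, "CS1A": 1, "CS1B": 1}
    pairs = ((cs0, "CS0A", "CS0B"), (cs1, "CS1A", "CS1B"))
    for cs, first, second in pairs:
        if cs != 0:
            continue  # this input chip select is inactive
        if a_n1 is None:
            out[first] = out[second] = 0
        elif a_n1 == 0:
            out[first] = 0
        else:
            out[second] = 0
    return out


if __name__ == "__main__":
    print(decode_chip_selects(0, 1, 0))     # State 1: Rank 0
    print(decode_chip_selects(1, 0, None))  # State 6: Ranks 2 and 3
    print(decode_chip_selects(1, 1, None))  # State 7: no rank selected
```

Table 1 does not list a state with both CS.sub.0 and CS.sub.1 active, so the behavior of this sketch for that input is not drawn from the table.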
[0093] Table 2 provides a logic table compatible with certain embodiments described herein for the selection among ranks of memory devices 30 using gated CAS signals.
TABLE 2
CS*  RAS*  CAS*  WE*  Density Bit  A.sub.10  Command       CAS0*  CAS1*
1    x     x     x    x            x         NOP           x      x
0    1     1     1    x            x         NOP           1      1
0    0     1     1    0            x         ACTIVATE      1      1
0    0     1     1    1            x         ACTIVATE      1      1
0    1     0     1    0            x         READ          0      1
0    1     0     1    1            x         READ          1      0
0    1     0     0    0            x         WRITE         0      1
0    1     0     0    1            x         WRITE         1      0
0    0     1     0    0            0         PRECHARGE     1      1
0    0     1     0    1            0         PRECHARGE     1      1
0    0     1     0    x            1         PRECHARGE     1      1
0    0     0     0    x            x         MODE REG SET  0      0
0    0     0     1    x            x         REFRESH       0      0
[0094] In certain embodiments in which the density bit is a row address bit, for read/write commands, the density bit is the value latched during the activate command for the selected bank.
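The gating of Table 2 reduces to a few cases, which the following Python sketch makes explicit. The function name gate_cas is hypothetical, and None is used here to stand for the 'x' (Don't Care) outputs of the first row. The WE* input is accepted for completeness but does not affect the result, and A.sub.10 is omitted because no output in Table 2 depends on it.

```python
def gate_cas(cs_n, ras_n, cas_n, we_n, density_bit):
    """Return (CAS0*, CAS1*) for the active-low inputs of Table 2.

    density_bit: for read/write commands, the value latched during the
    activate command for the selected bank.
    we_n is unused: reads and writes are gated identically.
    """
    if cs_n == 1:
        return (None, None)   # deselected: outputs are Don't Care
    if cas_n == 1:
        return (1, 1)         # NOP, ACTIVATE, PRECHARGE: CAS stays inactive
    if ras_n == 1:
        # READ or WRITE: CAS goes only to the rank chosen by the density bit
        return (0, 1) if density_bit == 0 else (1, 0)
    return (0, 0)             # MODE REG SET or REFRESH: CAS goes to both ranks


if __name__ == "__main__":
    print(gate_cas(0, 1, 0, 1, 0))  # READ with density bit 0
    print(gate_cas(0, 1, 0, 0, 1))  # WRITE with density bit 1
    print(gate_cas(0, 0, 0, 1, 0))  # REFRESH
```

In this reading of Table 2, the input CAS* is passed to exactly one rank for data transfers and to both ranks for commands that must reach every memory device.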
Serial-Presence-Detect Device
[0095] Memory modules typically include a serial-presence detect (SPD) device 240 (e.g., an electrically-erasable-programmable read-only memory or EEPROM device) comprising data which characterize various attributes of the memory module, including but not limited to, the number of row addresses, the number of column addresses, the data width of the memory devices, the number of ranks, the memory density per rank, the number of memory devices, and the memory density per memory device. The SPD device 240 communicates this data to the basic input/output system (BIOS) of the computer system so that the computer system is informed of the memory capacity and the memory configuration available for use and can configure the memory controller properly for maximum reliability and performance.
[0096] For example, for a commercially-available 512-MB (64 M×8-byte) memory module utilizing eight 512-Mb memory devices each with a 64 M×8-bit configuration, the SPD device contains the following SPD data (in appropriate bit fields of these bytes):
[0097] Byte 3: Defines the number of row address bits in the DRAM device in the memory module [13 for the 512-Mb memory device].
[0098] Byte 4: Defines the number of column address bits in the DRAM device in the memory module [11 for the 512-Mb memory device].
[0099] Byte 13: Defines the bit width of the primary DRAM device used in the memory module [8 bits for the 512-Mb (64 M×8-bit) memory device].
[0100] Byte 14: Defines the bit width of the error checking DRAM device used in the memory module [8 bits for the 512-Mb (64 M×8-bit) memory device].
[0101] Byte 17: Defines the number of banks internal to the DRAM device used in the memory module [4 for the 512-Mb memory device].
[0102] In a further example, for a commercially-available 1-GB (128 M×8-byte) memory module utilizing eight 1-Gb memory devices each with a 128 M×8-bit configuration, as described above, the SPD device contains the following SPD data (in appropriate bit fields of these bytes):
[0103] Byte 3: Defines the number of row address bits in the DRAM device in the memory module [14 for the 1-Gb memory device].
[0104] Byte 4: Defines the number of column address bits in the DRAM device in the memory module [11 for the 1-Gb memory device].
[0105] Byte 13: Defines the bit width of the primary DRAM device used in the memory module [8 bits for the 1-Gb (128 M×8-bit) memory device].
[0106] Byte 14: Defines the bit width of the error checking DRAM device used in the memory module [8 bits for the 1-Gb (128 M×8-bit) memory device].
[0107] Byte 17: Defines the number of banks internal to the DRAM device used in the memory module [4 for the 1-Gb memory device].
[0108] In certain embodiments, the SPD device 240 comprises data which characterize the memory module 10 as having fewer ranks of memory devices than the memory module 10 actually has, with each of these ranks having more memory density. For example, for a memory module 10 compatible with certain embodiments described herein having two ranks of memory devices 30, the SPD device 240 comprises data which characterize the memory module 10 as having one rank of memory devices with twice the memory density per rank. Similarly, for a memory module 10 compatible with certain embodiments described herein having four ranks of memory devices 30, the SPD device 240 comprises data which characterize the memory module 10 as having two ranks of memory devices with twice the memory density per rank. In addition, in certain embodiments, the SPD device 240 comprises data which characterize the memory module 10 as having fewer memory devices than the memory module 10 actually has, with each of these memory devices having more memory density per memory device. For example, for a memory module 10 compatible with certain embodiments described herein, the SPD device 240 comprises data which characterize the memory module 10 as having one-half the number of memory devices that the memory module 10 actually has, with each of these memory devices having twice the memory density per memory device. Thus, in certain embodiments, the SPD device 240 informs the computer system of the larger memory array by reporting a memory device density that is a multiple of the density of the memory devices 30 resident on the memory module 10. Certain embodiments described herein advantageously do not require system level changes to hardware (e.g., the motherboard of the computer system) or to software (e.g., the BIOS of the computer system).
[0109]
[0110] In certain such embodiments, the SPD device 240 of the memory module 10 is programmed to describe the combined pair of lower-density memory devices 31, 33 as one virtual or pseudo-higher-density memory device. In an exemplary embodiment, two 512-Mb memory devices, each with a 128 M×4-bit configuration, are used to simulate one 1-Gb memory device having a 128 M×8-bit configuration. The SPD device 240 of the memory module 10 is programmed to describe the pair of 512-Mb memory devices as one virtual or pseudo-1-Gb memory device.
[0111] For example, to fabricate a 1-GB (128 M×8-byte) memory module, sixteen 512-Mb (128 M×4-bit) memory devices can be used. The sixteen 512-Mb (128 M×4-bit) memory devices are combined in eight pairs, with each pair serving as a virtual or pseudo-1-Gb (128 M×8-bit) memory device. In certain such embodiments, the SPD device 240 contains the following SPD data (in appropriate bit fields of these bytes):
[0112] Byte 3: 13 row address bits.
[0113] Byte 4: 12 column address bits.
[0114] Byte 13: 8 bits wide for the primary virtual 1-Gb (128 M×8-bit) memory device.
[0115] Byte 14: 8 bits wide for the error checking virtual 1-Gb (128 M×8-bit) memory device.
[0116] Byte 17: 4 banks.
[0117] In this exemplary embodiment, bytes 3, 4, and 17 are programmed to have the same values as they would have for a 512-MB (128 M×4-byte) memory module utilizing 512-Mb (128 M×4-bit) memory devices. However, bytes 13 and 14 of the SPD data are programmed to be equal to 8, corresponding to the bit width of the virtual or pseudo-higher-density 1-Gb (128 M×8-bit) memory device, for a total capacity of 1-GB. Thus, the SPD data does not describe the actual-lower-density memory devices, but instead describes the virtual or pseudo-higher-density memory devices. The BIOS accesses the SPD data and recognizes the memory module as having 4 banks of memory locations arranged in 2.sup.13 rows and 2.sup.12 columns, with each memory location having a width of 8 bits rather than 4 bits.
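The arithmetic behind these SPD values can be checked with a short sketch. The Python below is illustrative only; the helper name `device_bits` is an assumption, and the density is taken simply as banks × rows × columns × bit width, using the byte 3, 4, 13, and 17 values quoted in the examples above.

```python
MBIT = 1 << 20  # one megabit, in bits

def device_bits(row_bits, col_bits, banks, width_bits):
    """Density in bits implied by SPD bytes 3, 4, 17, and 13 (hypothetical helper)."""
    return banks * (1 << row_bits) * (1 << col_bits) * width_bits

# Physical devices described in the examples above.
physical_512mb_x8 = device_bits(13, 11, 4, 8)  # 64 M x 8-bit
physical_1gb_x8 = device_bits(14, 11, 4, 8)    # 128 M x 8-bit
physical_512mb_x4 = device_bits(13, 12, 4, 4)  # 128 M x 4-bit

# Virtual device reported by the SPD: x4 geometry, but a width of 8 bits.
virtual_1gb_x8 = device_bits(13, 12, 4, 8)

# Eight virtual devices form the module; divide by 8 to convert bits to bytes.
module_bytes = 8 * virtual_1gb_x8 // 8
```

Under these assumptions the virtual device has the same density as a physical 1-Gb device and twice the density of each physical 512-Mb (128 M×4-bit) device, even though its row and column counts differ from those of the physical 1-Gb device.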
[0118] In certain embodiments, when such a memory module 10 is inserted in a computer system, the computer system's memory controller then provides to the memory module 10 a set of input address and command signals which correspond to the number of ranks or the number of memory devices reported by the SPD device 240. For example, when a two-rank memory module 10 compatible with certain embodiments described herein is placed in a computer system compatible with one-rank memory modules, the SPD device 240 reports to the computer system that the memory module 10 has only one rank. The circuit 40 then receives a set of input address and command signals corresponding to a single rank from the computer system's memory controller, and generates and transmits a set of output address and command signals corresponding to two ranks to the appropriate memory devices 30 of the memory module 10.
[0119] Similarly, when a two-rank memory module 10 compatible with certain embodiments described herein is placed in a computer system compatible with either one- or two-rank memory modules, the SPD device 240 reports to the computer system that the memory module 10 only has one rank. The circuit 40 then receives a set of input address and command signals corresponding to a single rank from the computer system's memory controller, and generates and transmits a set of output address and command signals corresponding to two ranks to the appropriate memory devices 30 of the memory module 10.
[0120] Furthermore, a four-rank memory module 10 compatible with certain embodiments described herein simulates a two-rank memory module whether the memory module 10 is inserted in a computer system compatible with two-rank memory modules or with two- or four-rank memory modules. Thus, by placing a four-rank memory module 10 compatible with certain embodiments described herein in a module slot that is four-rank-ready, the computer system provides four chip-select signals, but the memory module 10 only uses two of the chip-select signals.
[0121] In certain embodiments, the circuit 40 comprises the SPD device 240 which reports the CAS latency (CL) to the memory controller of the computer system. The SPD device 240 of certain embodiments reports a CL which has one more cycle than does the actual operational CL of the memory array. In certain embodiments, data transfers between the memory controller and the memory module are registered for one additional clock cycle by the circuit 40. The additional clock cycle of certain embodiments is added to the transfer time budget with an incremental overall CAS latency. This extra cycle of time in certain embodiments advantageously provides sufficient time budget to add a buffer which electrically isolates the ranks of memory devices 30 from the memory controller 20. The buffer of certain embodiments comprises combinatorial logic, registers, and logic pipelines. In certain embodiments, the buffer adds a one-clock cycle time delay, which is equivalent to a registered DIMM, to accomplish the address decoding. The one-cycle time delay of certain such embodiments provides sufficient time for read and write data transfers to provide the functions of the data path multiplexer/demultiplexer. Thus, for example, a DDR2 400-MHz memory system in accordance with embodiments described herein has an overall CAS latency of four, and uses memory devices with a CAS latency of three. In still other embodiments, the SPD device 240 does not utilize this extra cycle of time.
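The latency bookkeeping described above amounts to a one-line calculation. The Python sketch below is illustrative only; the function name and the `buffer_cycles` parameter are assumptions, with `buffer_cycles=0` standing in for embodiments that do not use the extra cycle.

```python
def reported_cas_latency(device_cl, buffer_cycles=1):
    """CAS latency the SPD device would report (hypothetical helper).

    device_cl: operational CAS latency of the memory devices, in clock cycles.
    buffer_cycles: clock cycles added by the buffering circuit (assumed 1 by default).
    """
    if device_cl < 1 or buffer_cycles < 0:
        raise ValueError("latencies must be positive cycle counts")
    return device_cl + buffer_cycles
```

For memory devices with a CAS latency of three and one buffered cycle, this gives the overall CAS latency of four used in the DDR2 example above.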
Memory Density Multiplication
[0122] In certain embodiments, two memory devices having a memory density are used to simulate a single memory device having twice the memory density, and an additional address signal bit is used to access the additional memory. Similarly, in certain embodiments, two ranks of memory devices having a memory density are used to simulate a single rank of memory devices having twice the memory density, and an additional address signal bit is used to access the additional memory. As used herein, such simulations of memory devices or ranks of memory devices are termed “memory density multiplication,” and the term “density transition bit” is used to refer to the additional address signal bit which is used to access the additional memory by selecting which rank of memory devices is enabled for a read or write transfer operation.
[0123] For example, for computer systems which are normally limited to using memory modules which have a single rank of 128 M×4-bit memory devices, certain embodiments described herein enable the computer system to utilize memory modules which have double the memory (e.g., two ranks of 128 M×4-bit memory devices). The circuit 40 of certain such embodiments provides the logic (e.g., command and address decoding logic) to double the number of chip selects, and the SPD device 240 reports a memory device density of 256 M×4-bit to the computer system.
[0124] In certain embodiments utilizing memory density multiplication, the memory module 10 can have various types of memory devices 30 (e.g., DDR1, DDR2, DDR3, and beyond). The circuit 40 of certain such embodiments utilizes implied translation logic equations having variations depending on whether the density transition bit is a row, column, or internal bank address bit. In addition, the translation logic equations of certain embodiments vary depending on the type of memory module 10 (e.g., UDIMM, RDIMM, FBDIMM, etc.). Furthermore, in certain embodiments, the translation logic equations vary depending on whether the implementation multiplies memory devices per rank or multiplies the number of ranks per memory module.
TABLE-US-00004 TABLE 3A
                                                        128-Mb  256-Mb  512-Mb  1-Gb
Number of banks                                         4       4       4       4
Number of row address bits                              12      13      13      14
Number of column address bits for “×4” configuration    11      11      12      12
Number of column address bits for “×8” configuration    10      10      11      11
Number of column address bits for “×16” configuration   9       9       10      10
[0125] Table 3A provides the numbers of rows and columns for DDR-1 memory devices, as specified by JEDEC standard JESD79D, “Double Data Rate (DDR) SDRAM Specification,” published February 2004, and incorporated in its entirety by reference herein.
[0126] As described by Table 3A, 512-Mb (128 M×4-bit) DRAM devices have 2.sup.13 rows and 2.sup.12 columns of memory locations, while 1-Gb (128 M×8-bit) DRAM devices have 2.sup.14 rows and 2.sup.11 columns of memory locations. Because of the differences in the number of rows and the number of columns for the two types of memory devices, complex address translation procedures and structures would typically be needed to fabricate a 1-GB (128 M×8-byte) memory module using sixteen 512-Mb (128 M×4-bit) DRAM devices.
[0127] Table 3B shows the device configurations as a function of memory density for DDR2 memory devices.
TABLE-US-00005 TABLE 3B
Density  Number of Rows  Number of Columns  Number of Internal Banks  Page Size (×4s or ×8s)
256 Mb   13              11                 4                         1 KB
512 Mb   14              11                 4                         1 KB
1 Gb     14              11                 8                         1 KB
2 Gb     15              11                 8                         1 KB
4 Gb     16              11                 8                         1 KB
Table 4 lists the corresponding density transition bit for the density transitions between the DDR2 memory densities of Table 3B.
TABLE-US-00006 TABLE 4
Density Transition   Density Transition Bit
256 Mb to 512 Mb     A.sub.13
512 Mb to 1 Gb       BA.sub.2
1 Gb to 2 Gb         A.sub.14
2 Gb to 4 Gb         A.sub.15
Certain other embodiments described herein utilize a transition bit to provide a transition from pairs of physical 4-Gb memory devices to simulated 8-Gb memory devices.
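The entries of Table 4 follow from the geometries of Table 3B: the density transition bit is whichever address bit the higher-density device has and the lower-density device lacks. The Python sketch below is illustrative only; the dictionary and function names are assumptions, and the row and bank counts of Table 3B are read as address-bit counts and bank counts, respectively.

```python
# Density: (row address bits, column address bits, internal banks), per Table 3B.
DDR2_GEOMETRY = {
    "256 Mb": (13, 11, 4),
    "512 Mb": (14, 11, 4),
    "1 Gb": (14, 11, 8),
    "2 Gb": (15, 11, 8),
    "4 Gb": (16, 11, 8),
}

def density_transition_bit(lower, higher):
    """Name the extra address bit used when stepping up one density (hypothetical helper)."""
    rows_lo, cols_lo, banks_lo = DDR2_GEOMETRY[lower]
    rows_hi, cols_hi, banks_hi = DDR2_GEOMETRY[higher]
    if rows_hi == rows_lo + 1 and (cols_hi, banks_hi) == (cols_lo, banks_lo):
        # Row bits A0..A(rows_lo - 1) already exist; the new one is A(rows_lo).
        return f"A{rows_lo}"
    if banks_hi == 2 * banks_lo and (rows_hi, cols_hi) == (rows_lo, cols_lo):
        # Four banks use BA0 and BA1; doubling the banks adds BA2.
        return f"BA{banks_lo.bit_length() - 1}"
    raise ValueError("not a single-step density transition")
```

The bank-count doubling between 512 Mb and 1 Gb is what makes that transition use a bank address bit rather than a row address bit.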
[0128] In an example embodiment, the memory module comprises one or more pairs of 256-Mb memory devices, with each pair simulating a single 512-Mb memory device. The simulated 512-Mb memory device has four internal banks while each of the two 256-Mb memory devices has four internal banks, for a total of eight internal banks for the pair of 256-Mb memory devices. In certain embodiments, the additional row address bit is translated by the circuit 40 to the rank selection between each of the two 256-Mb memory devices of the pair. Although there are eight total internal banks in the rank-converted memory array, the computer system is only aware of four internal banks. When the memory controller activates a row for a selected bank, the circuit 40 activates the same row for the same bank, but it does so for the selected rank according to the logic state of the additional row address bit A.sub.13.
[0129] In another example embodiment, the memory module comprises one or more pairs of 512-Mb memory devices, with each pair simulating a single 1-Gb memory device. The simulated 1-Gb memory device has eight internal banks while each of the two 512-Mb memory devices has four internal banks, for a total of eight internal banks for the pair of 512-Mb memory devices. In certain embodiments, the mapped BA.sub.2 (bank 2) bit is used to select between the two ranks of 512-Mb memory devices to preserve the internal bank geometry expected by the memory controller of the computer system. The state of the BA.sub.2 bit selects the upper or lower set of four banks, and thereby the upper or lower 512-Mb rank.
[0130] In another example embodiment, the memory module comprises one or more pairs of 1-Gb memory devices, with each pair simulating a single 2-Gb memory device. Each of the two 1-Gb memory devices has eight internal banks for a total of sixteen internal banks, while the simulated 2-Gb memory device has eight internal banks. In certain embodiments, the additional row address bit translates to the rank selection between the two 1-Gb memory devices. Although there are sixteen total internal banks per pair of 1-Gb memory devices in the rank-converted memory array, the memory controller of the computer system is only aware of eight internal banks. When the memory controller activates a row of a selected bank, the circuit 40 activates the same row for the same bank, but it does so for the selected rank according to the logic state of the additional row address bit A.sub.14.
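The rank selection described in these example embodiments can be sketched behaviorally. The Python class below is illustrative only and is not the logic of the circuit 40; the class name, the tuple encoding of the density transition bit, and the masking of that bit out of the address forwarded to the memory devices are all assumptions of this sketch.

```python
class RankSelector:
    """Hypothetical model of rank selection by a density transition bit."""

    def __init__(self, transition_bit):
        # transition_bit is ("row", n) for A<n> or ("bank", n) for BA<n>.
        self.kind, self.index = transition_bit
        self.latched = {}  # bank -> rank chosen by the most recent ACTIVATE

    def activate(self, bank, row):
        """Return (rank, device_bank, device_row) for an ACTIVATE command."""
        if self.kind == "row":
            rank = (row >> self.index) & 1
            device_bank = bank
            device_row = row & ((1 << self.index) - 1)  # drop the extra row bit
        else:
            rank = (bank >> self.index) & 1
            device_bank = bank & ((1 << self.index) - 1)  # drop the extra bank bit
            device_row = row
        self.latched[bank] = rank
        return rank, device_bank, device_row

    def column_access(self, bank):
        """Return (rank, device_bank) for a READ or WRITE to an open bank."""
        if self.kind == "bank":
            # The bank address accompanies every command, so no latch is needed.
            return (bank >> self.index) & 1, bank & ((1 << self.index) - 1)
        # A row bit is only present at ACTIVATE, so reuse the latched value.
        # A KeyError here means the bank was never activated in this model.
        return self.latched[bank], bank
```

With `("row", 13)` the model corresponds to the 256-Mb to 512-Mb example, and with `("bank", 2)` to the 512-Mb to 1-Gb example; in both cases the memory controller addresses a single simulated device while the selector picks one of two ranks.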
[0131] The circuit 40 of certain embodiments provides substantially all of the translation logic used for the decoding (e.g., command and address decoding). In certain such embodiments, there is a fully transparent operational conversion from the “system memory” density domain of the computer system to the “physical memory” density domain of the memory module 10. In certain embodiments, the logic translation equations are programmed in the circuit 40 by hardware, while in certain other embodiments, the logic translation equations are programmed in the circuit 40 by software. Examples 1 and 2 provide exemplary sections of Verilog code compatible with certain embodiments described herein. As described more fully below, the code of Examples 1 and 2 includes logic to reduce potential problems due to “back-to-back adjacent read commands which cross memory device boundaries” or “BBARX.” Persons skilled in the art are able to provide additional logic translation equations compatible with embodiments described herein.
[0132] An exemplary section of Verilog code compatible with memory density multiplication from 512 Mb to 1 Gb using DDR2 memory devices with the BA.sub.2 density transition bit is listed below in Example 2. The exemplary code of Example 2 corresponds to a circuit 40 which receives one chip-select signal from the computer system and which generates two chip-select signals.
Example 2
[0133]
TABLE-US-00007
always @(posedge clk_in)
begin
  rs0N_R <= rs0_in_N; // cs0
  rasN_R <= ras_in_N;
  casN_R <= cas_in_N;
  weN_R <= we_in_N;
end
// Gated Chip Selects
assign = (~rs0_in_N & ~cas_in_N) // ref reg set
  | (~rs0_in_N & ras_in_N & cas_in_N) // ref exit, pw
  | (~rs0_in_N & ~ras_in_N & cas_in_N & ~we_in_N & a10_in) //
  | (~rs0_in_N & ~ras_in_N & cas_in_N & ~we_in_N & ~a10_in & ba2_in) // single
  | (~rs0_in_N & ~ras_in_N & cas_in_N & we_in_N & ba2_in) // activate
  | (~rs0_in_N & ras_in_N & ~cas_in_N & ba2_in) // xfr
  ;
assign = (~rs0_in_N & ~cas_in_N) // ref reg set
  | (~rs0_in_N & ras_in_N & ~cas_in_N) // ref exit
  | (~rs0_in_N & ~ras_in_N & ~cas_in_N & ~we_in_N & a10_in) // pchg all
  | (~rs0_in_N & ~ras_in_N & ~cas_in_N & ~we_in_N & ~a10_in & ba2_in) // pchg single
  | (~rs0_in_N & ~ras_in_N & ~cas_in_N & we_in_N & ba2_in) // activate
  | (~rs0_in_N & ras_in_N & ~cas_in_N & ba2_in) // xfr
  ;
//-------------------------
always @(posedge clk_in)
begin
a4
<= a4_in ; a5
<= a5_in ; a6
<= a6_in ; a
<=
;
<= ba0_in ;
<= ba1_in ;
<= ba2_in ;
end
// determine the cas latency
assign q
&
; //
always @(posedge clk_in)
N) //
c13 <=
; else if (
// load mode reg cmd begin c13 <= (~a6
& a5
a4
) ; end always @(posedge clk_in) if (~reset_N) // reset c14 <=
else if (
// load mode reg cmd begin c14 <= (~a6_r & a5_r & a4_r ) ; end always @(posedge clk_in) if(~reset_N) c15 <=
else if (q_
_cmd_
) // load mode reg cmd begin c15 <= (a6_r & ~a5_r & a4_r ) ; end assign pre
//wr brst c13 preamble ; assign pre
// rd brst c13 preamble | (wr _cmd_
& c13) // wr brst c13 1st pair | (wr_cmd_
& c14) // wr brst c14 preamble ; assign pre
= (wr_
& c13) // wr brst c13 2nd pair | (wr_cmd_
& c14) // wr brst c14 1st pair | (rd_cmd_
& c13) // rd brst c13 1st pair | (rd_cmd_
& c14) // rd brst c14 preamble ; assign pre
= (rd_
& c13) // rd brst c13 2nd pair | (wr_cmd_
& c14) // wr brst c14 1st pair | (rd_cmd_
& c14) // rd brst c14 1st pair ; //
assign pre_
=
| pre_
| pre_
4_enfet | pre_ cyc5_enfet ; assign pre_
=
| enfet_cyc3 | enfet_cyc4 | enfet_cyc5 ; //
assign pre_
(pre_cyc2 enfet & ~ba2
) | pre_
& ~ba2
) | pre_
& ~ba2
) | pre_
& ~ba2_cyc4) ; assign
| (pre_
&
2) | (pre_
& ba2_cyc3) | (pre_
& ba2_cyc4) ; assign pre_
& ~ba2_cyc2) |
& ~ba2_
|
4 & ~ba2_cyc4) |
5 & ~ba2_cyc5) ; assign pre_
= (
2 & ba2_cyc2)
3 & ba2_ cyc3)
4 & ba2_ cyc4)
5 & ba2_cyc5) ; always @(posedge clk_in) begin
2<= a
1 ; //
active ba2_
ba2_
cmd_
&
wr_cmd
&
wr_cmd
wr_cmd_
wr_cmd
<= wr_cmd_
wr_cmd
<= wr_cmd_cyc4 ; end always
begin
;
; dqs_
_b <= dqs_
; end // DQ FET enables assign
; assign
; assign
; assign enq_ fet4 =
; assign enq_fet 5
; //DQS FET enables assign
; assign
; assign
; assign
; assign
; assign
;
indicates data missing or illegible when filed
[0134] Another exemplary section of Verilog code compatible with memory density multiplication from 256 Mb to 512 Mb using DDR2 memory devices and gated CAS signals with the row A.sub.13 density transition bit is listed below in Example 3. The exemplary code of Example 3 corresponds to a circuit 40 which receives one gated CAS signal from the computer system and which generates two gated CAS signals.
Example 3
[0135]
TABLE-US-00008
// latched a13 flags cs0, banks 0-3
always @(posedge clk_in)
  if (actv_cmd_R & ~rs0N_R & ~bnk1_R & ~bnk0_R) // activate
  begin
    l_a13_00 <= a13_r;
  end
always @(posedge clk_in)
  if (actv_cmd_R & ~rs0N_R & ~bnk1_R & bnk0_R) // activate
  begin
    l_a13_01 <= a13_r;
  end
always @(posedge clk_in)
  if (actv_cmd_R & ~rs0N_R & bnk1_R & ~bnk0_R) // activate
  begin
    l_a13_10 <= a13_r;
  end
always @(posedge clk_in)
  if (actv_cmd_R & ~rs0N_R & bnk1_R & bnk0_R) // activate
  begin
    l_a13_11 <= a13_r;
  end
// gated cas
assign cas_i = ~(casN_R);
assign cas0_o = ( ~rasN_R & cas_i)
  | ( rasN_R & ~l_a13_00 & ~bnk1_R & ~bnk0_R & cas_i)
  | ( rasN_R & ~l_a13_01 & ~bnk1_R & bnk0_R & cas_i)
  | ( rasN_R & ~l_a13_10 & bnk1_R & ~bnk0_R & cas_i)
  | ( rasN_R & ~l_a13_11 & bnk1_R & bnk0_R & cas_i)
  ;
assign cas1_o = ( ~rasN_R & cas_i)
  | ( rasN_R & l_a13_00 & ~bnk1_R & ~bnk0_R & cas_i)
  | ( rasN_R & l_a13_01 & ~bnk1_R & bnk0_R & cas_i)
  | ( rasN_R & l_a13_10 & bnk1_R & ~bnk0_R & cas_i)
  | ( rasN_R & l_a13_11 & bnk1_R & bnk0_R & cas_i)
  ;
assign pcas_0_N = ~cas0_o;
assign pcas_1_N = ~cas1_o;
assign rd0_o_R1 = rasN_R & cas0_o & weN_R & ~rs0N_R; // rnk0 rd cmd cyc
assign rd1_o_R1 = rasN_R & cas1_o & weN_R & ~rs0N_R; // rnk1 rd cmd cyc
assign wr0_o_R1 = rasN_R & cas0_o & ~weN_R & ~rs0N_R; // rnk0 wr cmd cyc
assign wr1_o_R1 = rasN_R & cas1_o & ~weN_R & ~rs0N_R; // rnk1 wr cmd cyc
always @(posedge clk_in)
begin
  rd0_o_R2 <= rd0_o_R1;
  rd0_o_R3 <= rd0_o_R2;
  rd0_o_R4 <= rd0_o_R3;
  rd0_o_R5 <= rd0_o_R4;
  rd1_o_R2 <= rd1_o_R1;
  rd1_o_R3 <= rd1_o_R2;
  rd1_o_R4 <= rd1_o_R3;
  rd1_o_R5 <= rd1_o_R4;
  wr0_o_R2 <= wr0_o_R1;
  wr0_o_R3 <= wr0_o_R2;
  wr0_o_R4 <= wr0_o_R3;
  wr1_o_R2 <= wr1_o_R1;
  wr1_o_R3 <= wr1_o_R2;
  wr1_o_R4 <= wr1_o_R3;
end
always @(posedge clk_in)
begin
  if (
      (rd0_o_R2 & ~rd1_o_R4) // pre-am rd if no ped on rnk 1
    | rd0_o_R3 // 1st cyc of rd brst
    | rd0_o_R4 // 2nd cyc of rd brst
    | (rd0_o_R5 & ~rd1_o_R2 & ~rd1_o_R3) // post-rd cyc if no ped on rnk 1
    | (wr0_o_R1) // pre-am wr
    | wr0_o_R2 | wr0_o_R3 // wr brst 1st & 2nd cyc
    | (wr0_o_R4) // post-wr cyc (chgef9)
    | wr1_o_R1 | wr1_o_R2 | wr1_o_R3 | wr1_o_R4 // rank 1 (chgef9)
     )
    en_fet_a <= 1'b1; // enable fet
  else
    en_fet_a <= 1'b0; // disable fet
end
always @(posedge clk_in)
begin
  if (
      (rd1_o_R2 & ~rd0_o_R4)
    | rd1_o_R3
    | rd1_o_R4
    | (rd1_o_R5 & ~rd0_o_R2 & ~rd0_o_R3)
    | (wr1_o_R1) // (chgef8)
    | wr1_o_R2 | wr1_o_R3
    | (wr1_o_R4) // post-wr cyc (chgef9)
    | wr0_o_R1 | wr0_o_R2 | wr0_o_R3 | wr0_o_R4 // rank 0 (chgef9)
     )
    en_fet_b <= 1'b1;
  else
    en_fet_b <= 1'b0;
end
[0136] In certain embodiments, the chipset memory controller of the computer system uses the inherent behavioral characteristics of the memory devices (e.g., DDR2 memory devices) to optimize throughput of the memory system. For example, for each internal bank in the memory array, a row (e.g., 1 KB page) is advantageously held activated for an extended period of time. The memory controller, by anticipating a high number of memory accesses or hits to a particular region of memory, can exercise this feature to advantageously eliminate time-consuming pre-charge cycles. In certain such embodiments in which two half-density memory devices are transparently substituted for a single full-density memory device (as reported by the SPD device 240 to the memory controller), the memory devices advantageously support the “open row” feature.
[0137]
[0138] The first rank 32a of
[0139] In the embodiment schematically illustrated by
[0140] For two “×4” memory devices 30 to work in tandem to mimic a “×8” memory device, the relative DQS pins of the two memory devices 30 in certain embodiments are advantageously tied together, as described more fully below. In addition, to access the memory density of a high-density memory module 10 comprising pairs of “×4” memory devices 30, an additional address line is used. While a high-density memory module comprising individual “×8” memory devices with the next-higher density would also utilize an additional address line, the additional address lines are different in the two memory module configurations.
[0141] For example, a 1-Gb 128 M×8-bit DDR-1 DRAM memory device uses row addresses A.sub.13-A.sub.0 and column addresses A.sub.11 and A.sub.9-A.sub.0. A pair of 512-Mb 128 M×4-bit DDR-1 DRAM memory devices uses row addresses A.sub.12-A.sub.0 and column addresses A.sub.12, A.sub.11, and A.sub.9-A.sub.0. In certain embodiments, a memory controller of a computer system utilizing a 1-GB 128 M×8 memory module 10 comprising pairs of the 512-Mb 128 M×4 memory devices 30 supplies the address and command signals including the extra row address (A.sub.13) to the memory module 10. The circuit 40 receives the address and command signals from the memory controller and converts the extra row address (A.sub.13) into an extra column address (A.sub.12).
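The row-to-column conversion in this example can be sketched as follows. The Python class below is illustrative only and is not the logic of the circuit 40; the class name is hypothetical, and the per-bank latching of A.sub.13 at the activate command is an assumption carried over from the description of the latched density bit earlier in this document.

```python
class RowToColumnTranslator:
    """Hypothetical model: extra row bit A13 is replayed as column bit A12."""

    ROW_MASK = (1 << 13) - 1  # physical devices use row addresses A12-A0

    def __init__(self):
        self.latched_a13 = {}  # bank -> A13 value seen at the last ACTIVATE

    def activate(self, bank, row):
        """Latch A13 for the bank and return the 13-bit row sent to the devices."""
        self.latched_a13[bank] = (row >> 13) & 1
        return row & self.ROW_MASK

    def column(self, bank, col):
        """Return the column address sent to the devices for a READ or WRITE.

        col carries the controller's column bits (A11 and A9-A0); bit 10 is
        passed through untouched, and bit 12 is replaced by the latched A13.
        """
        return (col & ~(1 << 12)) | (self.latched_a13[bank] << 12)
```

In this sketch the memory controller never drives column address A.sub.12 itself; the value is supplied entirely from the row address captured when the bank was opened.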
[0142]
[0143] In the exemplary circuit 40 of
[0144] Thus, by allowing two lower-density memory devices to be used rather than one higher-density memory device, certain embodiments described herein provide the advantage of using lower-cost, lower-density memory devices to build “next-generation” higher-density memory modules. Certain embodiments advantageously allow the use of lower-cost readily-available 512-Mb DDR-2 SDRAM devices to replace more expensive 1-Gb DDR-2 SDRAM devices. Certain embodiments advantageously reduce the total cost of the resultant memory module.
[0145]
[0146] Each rank 32a, 32b, 32c, 32d of
[0147] In the embodiment schematically illustrated by
[0148] To access the additional memory density of the high-density memory module 10, the two chip-select signals (CS.sub.0, CS.sub.1) are used with other address and command signals to gate a set of four gated CAS signals. For example, to access the additional ranks of a four-rank 1-GB 128 M×8-byte DDR-1 DRAM memory module, the CS.sub.0 and CS.sub.1 signals along with the other address and command signals are used to gate the CAS signal appropriately, as schematically illustrated by
[0149] In certain embodiments, the PLD 42 comprises an ASIC, an FPGA, a custom-designed semiconductor device, or a CPLD. In certain embodiments, the PLD 42 and the four “OR” logic elements 52, 54, 56, 58 are discrete elements, while in other certain embodiments, they are integrated within a single integrated circuit. Persons skilled in the art can select an appropriate PLD 42 and appropriate “OR” logic elements 52, 54, 56, 58 in accordance with embodiments described herein.
[0150] In the embodiment schematically illustrated by
[0151] In certain embodiments, the PLD 42 uses sequential and combinatorial logic procedures to produce the gated CAS signals which are each transmitted to a corresponding one of the four ranks 32a, 32b, 32c, 32d. In certain other embodiments, the PLD 42 instead uses sequential and combinatorial logic procedures to produce four gated chip-select signals (e.g., CS.sub.0a, CS.sub.0b, CS.sub.1a, and CS.sub.1b) which are each transmitted to a corresponding one of the four ranks 32a, 32b, 32c, 32d.
Tied Data Strobe Signal Pins
[0152] For proper operation, the computer system advantageously recognizes a 1-GB memory module comprising 256-Mb memory devices with 64 M×4-bit configuration as a 1-GB memory module having 512-Mb memory devices with 64 M×8-bit configuration (e.g., as a 1-GB memory module with 128 M×8-byte configuration). This advantageous result is desirably achieved in certain embodiments by electrically connecting together two output signal pins (e.g., DQS or data strobe pins) of the two 256-Mb memory devices such that both output signal pins are concurrently active when the two memory devices are concurrently enabled. The DQS or data strobe is a bi-directional signal that is used during both read cycles and write cycles to validate or latch data. As used herein, the terms “tying together” or “tied together” refer to a configuration in which corresponding pins (e.g., DQS pins) of two memory devices are electrically connected together and are concurrently active when the two memory devices are concurrently enabled (e.g., by a common chip-select or CS signal). Such a configuration is different from standard memory module configurations in which the output signal pins (e.g., DQS pins) of two memory devices are electrically coupled to the same source, but these pins are not concurrently active since the memory devices are not concurrently enabled. However, a general guideline of memory module design warns against tying together two output signal pins in this way.
[0153]
[0154]
[0155] A second problem may also arise from tying together two output signal pins.
[0156] Each of the memory devices 310, 320 of
[0157] Examples of memory devices 310, 320 which include such ODT circuits 332, 334 include, but are not limited to, DDR2 memory devices. Such memory devices are configured to selectively enable or disable the termination of the memory device in this way in response to signals applied to the ODT signal pin of the memory device. For example, when the ODT signal pin 362 of the first memory device 310 is pulled high, the termination resistors 352, 356 of the first memory device 310 are enabled. When the ODT signal pin 362 of the first memory device 310 is pulled low (e.g., grounded), the termination resistors 352, 356 of the first memory device 310 are disabled. By selectively disabling the termination resistors of an active memory device, while leaving the termination resistors of inactive memory devices enabled, such configurations advantageously preserve signal strength on the active memory device while continuing to eliminate signal reflections at the bus-die interface of the inactive memory devices.
[0158] In certain configurations, as schematically illustrated by
[0159] When connecting the first memory device 310 and the second memory device 320 together to form a double word width, both the first memory device 310 and the second memory device 320 are enabled at the same time (e.g., by a common CS signal). Connecting the first memory device 310 and the second memory device 320 by tying the DQS pins 312, 322 together, as shown in
[0160]
[0161] In certain embodiments, the memory module 400 is a 1-GB unbuffered Double Data Rate (DDR) Synchronous Dynamic RAM (SDRAM) high-density dual in-line memory module (DIMM).
[0162] In certain embodiments, the memory module 400 comprises a plurality of memory devices configured in pairs, each pair having a first memory device 410 and a second memory device 420. For example, in certain embodiments, a 128 M×72-bit DDR SDRAM high-density memory module 400 comprises thirty-six 64 M×4-bit DDR-1 SDRAM integrated circuits in FBGA packages configured in eighteen pairs. The first memory device 410 of each pair has the first DQS pin 412 electrically coupled to the second DQS pin 422 of the second memory device 420 of the pair. In addition, the first DQS pin 412 and the second DQS pin 422 are concurrently active when the first memory device 410 and the second memory device 420 are concurrently enabled.
[0163] In certain embodiments, the first resistor 430 and the second resistor 440 each has a resistance advantageously selected to reduce the current flow between the first DQS pin 412 and the second DQS pin 422 while allowing signals to propagate between the memory controller and the DQS pins 412, 422. In certain embodiments, each of the first resistor 430 and the second resistor 440 has a resistance in a range between approximately 5 ohms and approximately 50 ohms. For example, in certain embodiments, each of the first resistor 430 and the second resistor 440 has a resistance of approximately 22 ohms. Other resistance values for the first resistor 430 and the second resistor 440 are also compatible with embodiments described herein. In certain embodiments, the first resistor 430 comprises a single resistor, while in other embodiments, the first resistor 430 comprises a plurality of resistors electrically coupled together in series and/or in parallel. Similarly, in certain embodiments, the second resistor 440 comprises a single resistor, while in other embodiments, the second resistor 440 comprises a plurality of resistors electrically coupled together in series and/or in parallel.
[0164]
[0165]
[0166] In certain embodiments, as schematically illustrated by
[0167] The voltage at the second DQS pin 422 in
[0168] In certain embodiments in which there is overshoot or undershoot of the voltages, the amount of current flow can be higher than that expected for nominal voltage values. Therefore, in certain embodiments, the resistances of the first resistor 430 and the second resistor 440 are advantageously selected to account for such overshoot/undershoot of voltages.
[0169] For certain such embodiments in which the voltage at the second DQS pin 422 is V_DQS2 = 0.59 volts and the duration of the overdrive condition is approximately 0.8 nanoseconds at maximum, the total surge is approximately 0.59 V × 0.8 ns ≈ 0.47 V-ns. For comparison, the JEDEC standard for overshoot/undershoot is 2.4 V-ns, so certain embodiments described herein advantageously keep the total surge within predetermined standards (e.g., JEDEC standards).
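The comparison with the 2.4 V-ns figure can be reproduced with a simple rectangular estimate (voltage multiplied by duration). The sketch below is an illustration under that assumption, not a statement of how the surge is defined by the standard; the function name is hypothetical, and the 1.2 ns case is included only as a more conservative duration.

```python
JEDEC_LIMIT_V_NS = 2.4  # overshoot/undershoot figure cited in the text


def surge_v_ns(voltage_v, duration_ns):
    """Rectangular estimate of the surge area in volt-nanoseconds."""
    return voltage_v * duration_ns


v_dqs2 = 0.59  # volts, as stated in the text
for duration_ns in (0.8, 1.2):
    surge = surge_v_ns(v_dqs2, duration_ns)
    within = surge <= JEDEC_LIMIT_V_NS
    print(f"{duration_ns} ns -> {surge:.3f} V-ns, within limit: {within}")
```

Either duration leaves the estimated surge well below the cited limit.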
[0170]
[0171]
[0172] The memory module 600 further comprises a second memory device 620 having a second DQS pin 622 electrically coupled to the first DQS pin 612, a second ODT signal pin 624, a second ODT circuit 626, and at least one DQ pin 628. The first DQS pin 612 and the second DQS pin 622 are concurrently active when the first memory device 610 and the second memory device 620 are concurrently enabled. The second ODT signal pin 624 is electrically coupled to a voltage (e.g., ground), wherein the second ODT circuit 626 is responsive to the voltage by not terminating the second DQS pin 622 or the second DQ pin 628. This behavior of the second ODT circuit 626 is schematically illustrated in
[0173] The memory module 600 further comprises at least one termination assembly 630 having a third ODT signal pin 634 electrically coupled to the ODT bus 605, a third ODT circuit 636, and at least one termination pin 638 electrically coupled to the DQ pin 628 of the second memory device 620. The third ODT circuit 636 selectively electrically terminates the DQ pin 628 of the second memory device 620 through the termination pin 638 in response to an ODT signal received by the third ODT signal pin 634 from the ODT bus 605. This behavior of the third ODT circuit 636 is schematically illustrated in
[0174] In certain embodiments, the termination assembly 630 comprises discrete electrical components which are surface-mounted or embedded on the printed-circuit board of the memory module 600. In certain other embodiments, the termination assembly 630 comprises an integrated circuit mounted on the printed-circuit board of the memory module 600. Persons skilled in the art can provide a termination assembly 630 in accordance with embodiments described herein.
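The ODT arrangement of the memory module 600 described in this part of the specification can be summarized as a small truth table. The following is a behavioral sketch only; it assumes an active-high ODT signal and uses hypothetical names, and it does not model electrical values.

```python
def odt_state(odt_bus_active):
    """Model which pins are terminated for a given ODT bus level.

    Assumes an active-high ODT signal on the ODT bus 605.
    """
    first_odt = odt_bus_active     # ODT signal pin 614 follows the bus
    second_odt = False             # ODT signal pin 624 is tied to ground
    assembly_odt = odt_bus_active  # ODT signal pin 634 follows the bus
    return {
        "first_dqs_612_internal": first_odt,
        "first_dq_618_internal": first_odt,
        "second_dqs_622_internal": second_odt,
        "second_dq_628_internal": second_odt,
        "second_dq_628_external": assembly_odt,
    }


for level in (False, True):
    print("ODT bus active:", level)
    for pin, terminated in odt_state(level).items():
        print(f"  {pin}: {'terminated' if terminated else 'open'}")
```

In this model the second memory device never terminates its own pins, and the DQ pin 628 is terminated only through the external termination assembly 630, in step with the first memory device.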
[0175] Certain embodiments of the memory module 600 schematically illustrated by
[0176] The first ODT signal pin 614 of the first memory device 610 receives an ODT signal from the ODT bus 605. In response to this ODT signal, the first ODT circuit 616 selectively enables or disables the termination resistance for both the first DQS pin 612 and the DQ pin 618 of the first memory device 610. The second ODT signal pin 624 of the second memory device 620 is tied (e.g., directly hard-wired) to the voltage (e.g., ground), thereby disabling the internal termination resistors 654, 658 on the second DQS pin 622 and the second DQ pin 628, respectively, of the second memory device 620 (schematically shown by open switches 674, 678 in
[0177] The termination resistor 656 of the DQ pin 618 of the first memory device 610 is enabled or disabled by the ODT signal received by the first ODT signal pin 614 of the first memory device 610 from the ODT bus 605. The termination resistance of the DQ pin 628 of the second memory device 620 is enabled or disabled by the ODT signal received by the third ODT signal pin 634 of the termination assembly 630 which is external to the second memory device 620. Thus, in certain embodiments, the first ODT signal pin 614 and the third ODT signal pin 634 receive the same ODT signal from the ODT bus 605, and the termination resistances for both the first memory device 610 and the second memory device 620 are selectively enabled or disabled in response thereto when these memory devices are concurrently enabled. In this way, certain embodiments of the memory module 600 schematically illustrated by
[0178] Certain embodiments of the memory module 600 schematically illustrated by
[0179] Certain embodiments described herein advantageously increase the memory capacity or memory density per memory slot or socket on the system board of the computer system. Certain embodiments advantageously allow for higher memory capacity in systems with limited memory slots. Certain embodiments advantageously allow for flexibility in system board design by allowing the memory module 10 to be used with computer systems designed for different numbers of ranks (e.g., either with computer systems designed for two-rank memory modules or with computer systems designed for four-rank memory modules). Certain embodiments advantageously provide lower costs of board designs.
[0180] In certain embodiments, the memory density of a memory module is advantageously doubled by providing twice as many memory devices as would otherwise be provided. For example, pairs of lower-density memory devices can be substituted for individual higher-density memory devices to reduce costs or to increase performance. As another example, twice the number of memory devices can be used to produce a higher-density memory configuration of the memory module. Each of these examples can be limited by the number of chip select signals which are available from the memory controller or by the size of the memory devices. Certain embodiments described herein advantageously provide a logic mechanism to overcome such limitations.
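One way a logic mechanism could work around a limited number of chip select signals is to combine each input chip select with an additional address bit. The sketch below is a hypothetical illustration and not necessarily the mechanism of the embodiments; the active-low convention, the use of a single extra address bit, and the function name are all assumptions.

```python
def decode_chip_selects(cs0_n, cs1_n, extra_addr_bit):
    """Expand two active-low chip selects into four rank selects.

    Hypothetical decode: an extra address bit (0 or 1) chooses one of
    the two ranks behind each input chip select. Outputs are active-low.
    """
    rank_cs_n = [1, 1, 1, 1]                # all ranks deselected
    if cs0_n == 0:
        rank_cs_n[extra_addr_bit] = 0       # rank 0 or rank 1
    if cs1_n == 0:
        rank_cs_n[2 + extra_addr_bit] = 0   # rank 2 or rank 3
    return tuple(rank_cs_n)


for cs0_n, cs1_n in ((0, 1), (1, 0), (1, 1)):
    for bit in (0, 1):
        print(cs0_n, cs1_n, bit, "->", decode_chip_selects(cs0_n, cs1_n, bit))
```

With such a decode, a memory controller that provides only two chip selects could address four groups of memory devices, at the cost of consuming one address bit.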
[0181] Various embodiments of the present invention have been described above. Although this invention has been described with reference to these specific embodiments, the descriptions are intended to be illustrative of the invention and are not intended to be limiting. Various modifications and applications may occur to those skilled in the art without departing from the true spirit and scope of the invention.