DYNAMIC FAULT CLUSTERING METHOD AND APPARATUS

20230317198 · 2023-10-05

    Abstract

    A dynamic fault clustering method and apparatus for efficiently managing redundancy in semiconductor memories performs a collection operation of searching for and detecting a fault and an operation of appropriately clustering the fault at the same time, which reduces an amount of time spent performing Built-In Redundancy Analysis (BIRA).

    Claims

    1. A dynamic fault clustering method comprising: starting a self-test of a memory cell array divided into layers; performing a search for a new fault in the memory cell array; in response to the search finding the new fault, performing a check of whether a row address or a column address of the new fault matches a row address or a column address of a previously detected fault stored in an address storage device; setting a layer number to which the new fault belongs when the check determines that the row address or the column address of the new fault matches the row address or the column address of the previously detected fault; determining whether to perform a row-must or column-must repair when the check determines that the row address or the column address of the new fault does not match the row address or the column address of the previously detected fault; storing information on the row-must or column-must repair in a redundancy storage device when it is determined to perform the row-must or column-must repair; determining whether to cluster the new fault from a layer to which the new fault belongs to another layer when it is determined not to perform the row-must or column-must repair; storing corresponding layer information in the address storage device when it is determined that the fault is to be clustered; and storing corresponding layer information in the redundancy storage device when it is determined that the fault is not to be clustered.

    2. The dynamic fault clustering method of claim 1, wherein storing the corresponding layer information in the redundancy storage device includes storing information indicating row-wise clustering or column-wise clustering in the redundancy storage device.

    3. The dynamic fault clustering method of claim 1, wherein the redundancy storage device is a content addressable memory.

    4. The dynamic fault clustering method of claim 1, wherein the address storage device is a content addressable memory.

    5. The dynamic fault clustering method of claim 1, wherein the redundancy storage device includes the row address, the column address, and a layer address of the new fault.

    6. The dynamic fault clustering method of claim 1, wherein the redundancy storage device includes a space for storing a flag signal indicating whether to apply row repair or column repair to the new fault.

    7. The dynamic fault clustering method of claim 1, wherein the redundancy storage device includes a mapped address for clustering.

    8. The dynamic fault clustering method of claim 1, wherein the address storage device includes a space for storing a flag signal indicating whether the clustered fault corresponds to a row-wise exchange or a column-wise exchange.

    9. The dynamic fault clustering method of claim 1, wherein performing the search for the new fault and the determining whether to cluster the new fault are performed at the same time.

    10. A dynamic fault clustering apparatus comprising: a semiconductor memory cell array including a plurality of layers and configured to store binary information; a global redundancy including extra cells provided to replace a fault occurring in the layers; a redundancy storage device configured to store a layer number, a row address, and a column address for the fault; an address storage device configured to store whether to perform row repair or column repair for the fault; a multiplexer configured to select binary information from one of the layer and the global redundancy; and a redundancy analyzer configured to search for the fault and to determine whether to cluster the fault from a layer to which the fault belongs to another layer.

    11. The dynamic fault clustering apparatus of claim 10, wherein when the redundancy analyzer determines to cluster the fault, the redundancy analyzer determines whether to cluster the fault using row-wise clustering or column-wise clustering.

    12. The dynamic fault clustering apparatus of claim 10, wherein the redundancy storage device is a content addressable memory.

    13. The dynamic fault clustering apparatus of claim 10, wherein the address storage device is a content addressable memory.

    14. The dynamic fault clustering apparatus of claim 10, wherein the redundancy storage device includes a row address, a column address, and a layer address of a new fault.

    15. The dynamic fault clustering apparatus of claim 10, wherein the redundancy storage device includes a mapped address to which the new fault is moved for clustering.

    16. The dynamic fault clustering apparatus of claim 10, wherein the redundancy analyzer performs the search for the fault and the determination of whether to cluster the fault at the same time.

    17. The dynamic fault clustering apparatus of claim 10, wherein the redundancy storage device includes a space for storing flag signals indicating whether to apply row repair or column repair to the new fault.

    18. The dynamic fault clustering apparatus of claim 17, wherein values of the flag signals are set when the clustering is performed.

    19. The dynamic fault clustering apparatus of claim 18, wherein the setting of the flag signals is performed by an analysis operation of the redundancy analyzer.

    20. The dynamic fault clustering apparatus of claim 10, wherein the redundancy storage device stores some or all of a layer number to which the new fault belongs, a row address, a column address, a mapped layer number for mapping, a row-must flag indicating whether row-wise repair is required, and a column-must flag indicating whether column-wise repair is required.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0016] FIG. 1 is a diagram for schematically explaining the present disclosure.

    [0017] FIG. 2 is a diagram for explaining a clustering process according to an embodiment of the present disclosure.

    [0018] FIG. 3 illustrates, as an example, the content of a redundancy storage device associated with a clustering process according to an embodiment of the present disclosure.

    [0019] FIG. 4 to FIG. 7 illustrate a process according to an embodiment of the present disclosure on a step-by-step basis.

    [0020] FIG. 8 illustrates an impact on analysis time of an embodiment of the present disclosure.

    [0021] FIG. 9 is a flowchart according to an embodiment of the present disclosure.

    [0022] FIG. 10 illustrates an apparatus according to an embodiment of the present disclosure.

    [0023] FIG. 11 is a graph illustrating a simulation result of the analysis time of an embodiment of the present disclosure.

    [0024] FIG. 12 is a graph illustrating a simulation result of the repair rate of an embodiment of the present disclosure.

    [0025] FIG. 13 to FIG. 15 are simulation results of the repair rate under different conditions of an embodiment of the present disclosure.

    [0026] FIG. 16 is a graph illustrating another simulation result of the analysis time of an embodiment of the present disclosure.

    DETAILED DESCRIPTION

    [0027] Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that the present disclosure can be easily carried out by those skilled in the art to which the present disclosure pertains. The same reference numerals among the reference numerals in each drawing indicate the same elements.

    [0028] In the description of the present disclosure, when it is determined that detailed descriptions of related publicly-known technologies may obscure the subject matter of the present disclosure, the detailed descriptions thereof will be omitted.

    [0029] The terms such as first and second may be used to describe various components, but the components are not limited by the terms, and the terms are used only to distinguish one component from another component.

    [0030] Hereinafter, in the specification of the present disclosure, faults may be indicated by symbol X in a memory cell, may be indicated by a number such as #1 or #2 in order to emphasize a search order, or may be sometimes indicated as ‘1’ in order to indicate the occurrence of a fault or the presence of a fault, and it is noted that this does not indicate binary information ‘1’.

    [0031] Furthermore, in the specification of the present disclosure, clustering means an operation of moving responsibility for repairing faults to redundancies associated with an appropriate layer, which may differ from the layer in which the fault occurred, and collecting the faults for effective repair; the term is sometimes used interchangeably with a mapping operation. In the clustering, a layer to which repair of a fault is to be moved is called a mapped layer, and an original layer where the fault has occurred is called a mapping layer. The process of moving responsibility for handling a fault from the mapping layer to the mapped layer may be referred to herein as simply “moving the fault to the mapped layer,” though of course the actual fault memory cell does not move.

    [0032] Furthermore, repair means a repair operation of replacing operations intended to use a fault memory cell with analogous operations that instead use an extra memory cell, and the extra memory cells are called redundant cells, spare cells, or redundancy.

    [0033] FIG. 1 is a diagram for explaining the schematic characteristics of the present disclosure, and for convenience, illustrates only a 6×6 memory array having row addresses RA0 to RA5 and column addresses CA0 to CA5. FIG. 1 illustrates that when faults detected in two layers are moved to another layer and clustered, the direction of the move is not fixed. For example, a second fault #2 detected in layer 1 may be moved to layer 2 and clustered, and a fifth fault #5 detected in layer 2 may be moved to layer 1 and clustered. In other words, the mapped layer and the mapping layer are not predetermined, and the direction of mapping is not fixed. Mapping is performed by exchanging fault addresses. For reference, layers may belong to different memory chips, or, even if they belong to one chip, they may belong to different arrays, pages, sectors, or groups.

    [0034] The characteristics of the present disclosure will be described in more detail with reference to FIG. 2 and FIG. 3. FIG. 2 illustrates an example wherein a total of 9 faults are distributed in four layers, and FIG. 3 illustrates a redundancy storage device (RCAM) that stores fault addresses of the faults. As such a storage device, a content addressable memory (CAM) or a storage device similar to a CAM may be used. In FIG. 3, the symbols LA, RA, CA, ML, RMF, and CMF indicate a layer number or a layer address, a row address, a column address, a number of a mapped layer to which a fault is to be moved, a row-must flag indicating row-wise repair, and a column-must flag indicating column-wise repair, respectively. A value of ML equal to LA indicates that no mapping is performed and a fault stays in (that is, is handled by redundancy associated with) an original layer. Since RMF and CMF have meaning only as flag signals, it is assumed in the present disclosure that a value of ‘1’ indicates that repair is required.
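
    As an illustration only, the six fields of one redundancy storage device (RCAM) entry described above can be modeled as a small record. The class name RcamEntry, the method is_mapped, and the flag values in the example are hypothetical and are not taken from FIG. 3; only the field meanings and the rule that ML equal to LA means no mapping come from the description.

```python
from dataclasses import dataclass


@dataclass
class RcamEntry:
    """Hypothetical model of one RCAM row with the fields named for FIG. 3."""
    la: int   # LA: layer number (layer address) where the fault was detected
    ra: int   # RA: row address of the fault
    ca: int   # CA: column address of the fault
    ml: int   # ML: mapped layer number; ml == la means no mapping
    rmf: int  # RMF: row-must flag, 1 = row-wise repair required
    cmf: int  # CMF: column-must flag, 1 = column-wise repair required

    def is_mapped(self) -> bool:
        """True when repair responsibility has moved to another layer."""
        return self.ml != self.la


# A fault detected in layer 2 whose repair is handled by layer 4.
# The flag values below are placeholders chosen for the example.
moved = RcamEntry(la=2, ra=4, ca=1, ml=4, rmf=0, cmf=0)
stayed = RcamEntry(la=1, ra=0, ca=1, ml=1, rmf=1, cmf=0)
```

    In this sketch, moved.is_mapped() is true and stayed.is_mapped() is false, mirroring the convention that a value of ML equal to LA leaves the fault in its original layer.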

    [0035] As indicated by arrows in FIG. 2, faults occurring in each layer may be clustered in any layer according to a characteristic operation of the present disclosure. That is, a mapped layer and a mapping layer may not be fixed and may be changed for each fault for efficient repair. For example, among three faults #3, #4, and #5 detected in layer 2, since the faults #3 and #4 having row and column addresses of (RA, CA)=(0, 3) and (RA, CA)=(3, 0) share column addresses with faults #1 and #2 of layer 1, respectively, they may be mapped to layer 1. On the other hand, since the fault #5 of the layer 2 has an address of (RA, CA)=(4,1), it shares a column address with fault #9 of layer 4 and may be treated as column redundancy only when mapped to the layer 4. Therefore, the values of ML for the three faults #3, #4, and #5 are 1, 1, and 4, respectively.

    [0036] In order to facilitate the understanding of the characteristics of the present disclosure, a redundancy method of the present disclosure will be described with an example in which faults exist in each of two layers. FIG. 4 to FIG. 7 are diagrams for explaining the redundancy method of the present disclosure on a step-by-step basis. For reference, each fault is numbered according to the order of detection. First, as illustrated in FIG. 4, the first step is a case in which fault #2 is detected after fault #1 is detected in layer 1. Analysis of these faults shows that they are located on the same row. Therefore, they only need to be repaired using a row redundancy of layer 1 and mapping is not required. Accordingly, in the storage device (RCAM) for storing redundant addresses, the values of LA, RA, CA, ML, RMF, and CMF, which are information on these faults, are stored as (1, 0, 1, 1, 1, 0) and (1, 0, 3, 1, 1, 0). In FIG. 4, when the RMF values of the two faults are 1, it means that these faults need to be repaired using a row redundancy, and when the ML value is 1, it means that these faults stay in the layer 1, which is an original layer, without clustering. In such a case, since the remaining faults have not yet been detected, an address storage device (address content addressable memory (ACAM)) is also in an empty state. For reference, the R/CEF of the address storage device (ACAM) is a row/column exchange flag signal indicating whether an address exchange direction used by clustering is a row direction or a column direction.

    [0037] In the next step, when third fault #3 is detected at the position of (RA,CA)=(3,2) of layer 2 as illustrated in FIG. 5, the third fault #3 is compared with the faults of the layer 1, that is, the faults #1 and #2, for analysis. The analysis result indicates that a row address of the third fault #3 does not match a row address of any previously-detected fault and a column address of the third fault #3 does not match a column address of any previously-detected fault, which indicates that no clustering is required; such analysis is performed by an analyzer that analyzes redundancy. Since the analysis result indicates that no clustering is required, the values of the flag signals RMF and CMF indicating the presence or absence of row redundancy or column redundancy are also written as (0, 0) in the third row of RCAM as illustrated in FIG. 5.

    [0038] In the next step, when fourth fault #4 is detected at the position of (RA,CA)=(2, 2) of the layer 1 as illustrated in FIG. 6, the analysis result indicates that the fourth fault #4 shares a column address with the third fault #3. In such a case, it is preferable to perform clustering in which the fault #4 is mapped from the layer 1 and moved to the layer 2 and to repair the fault #4 using a column redundancy. That is, the fault #4 can be repaired using column redundancy only when the entire row to which the fault #4 belongs is clustered to the layer 2. In such a case, the fault #4 needs to be mapped from the layer 1 and moved to the layer 2 for clustering, and the layer 2 becomes a mapped layer. Since exchange of a fault address between layers is required for mapping, this fact is written in the address storage device (ACAM) and the value of R/CEF is also written as 0, indicating that a row is being mapped. Therefore, a value representing the fault #4 in the address storage device (ACAM) is (LA, RA, CA, ML, R/CEF)=(1, 2, 2, 2, 0). Meanwhile, since the clustered faults #3 and #4 need to be repaired using a column redundancy, the value of the column-must flag signal CMF of entry #3 stored in the redundancy storage device RCAM is changed to 1.
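
    The bookkeeping of this step can be sketched as follows. This is a minimal illustration, not the disclosed circuit: the names RcamEntry, AcamEntry, and cluster_row_wise are hypothetical, the ML value of 2 for fault #3 is assumed (no mapping, so ML equals LA), and the rule "a fault in another layer shares the column address" is a simplified stand-in for the analysis the redundancy analyzer performs.

```python
from dataclasses import dataclass


@dataclass
class RcamEntry:
    la: int   # layer where the fault was detected
    ra: int   # row address
    ca: int   # column address
    ml: int   # mapped layer (equal to la when not mapped)
    rmf: int  # row-must flag
    cmf: int  # column-must flag


@dataclass
class AcamEntry:
    la: int    # original (mapping) layer
    ra: int    # row address
    ca: int    # column address
    ml: int    # mapped layer
    rcef: int  # R/CEF: 0 = row exchanged, 1 = column exchanged


# RCAM contents after FIG. 5. ML = 2 for fault #3 is an assumption;
# the text only states that its flags are written as (0, 0).
rcam = [
    RcamEntry(1, 0, 1, 1, 1, 0),  # fault #1
    RcamEntry(1, 0, 3, 1, 1, 0),  # fault #2
    RcamEntry(2, 3, 2, 2, 0, 0),  # fault #3
]
acam = []


def cluster_row_wise(la, ra, ca):
    """Map a new fault to the layer of a stored fault sharing its column."""
    for entry in rcam:
        if entry.la != la and entry.ca == ca:
            # Record the row exchange (R/CEF = 0) in the ACAM.
            acam.append(AcamEntry(la, ra, ca, entry.ml, 0))
            # The shared column is now repaired with a column redundancy.
            entry.cmf = 1
            return True
    return False


clustered = cluster_row_wise(1, 2, 2)  # fault #4 of FIG. 6
```

    After the call, the ACAM holds the single entry (1, 2, 2, 2, 0) and the CMF of the entry for fault #3 is 1, which matches the values given in the paragraph above.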

    [0039] In the last step, when fifth fault #5 is detected at the position of (RA,CA)=(2,3) of the layer 2 as illustrated in FIG. 7, corresponding fault information is stored in the redundancy storage device (RCAM). In such a case, since the fault #5 exists in a row already exchanged because of the fault #4, that is, the position of RA=2 of the layer 2, the value of ML is stored as ‘1’ in the redundancy storage device (RCAM) in consideration of this fact. Put differently, the faults #4 and #5 have been mapped by exchanging the layers, and since the fault #4 has already been determined to be repaired by column redundancy, the fault #5 is repaired by row redundancy after mapping. Accordingly, the redundancy operation of the present disclosure that performs dynamic clustering at the same time as each fault is detected is completed while consuming a minimum of time. That is, since the clustering is completed at the same time as the last fault is detected, determination of redundancy for repairing each fault is also completed. A series of dynamic analysis processes for performing clustering in the present disclosure are performed by a redundancy analyzer 310 to be described below.
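
    The lookup that yields ML=1 for the fault #5 can be sketched as below. The function name mapped_layer and the tuple type AcamEntry are hypothetical, and treating an ACAM entry as a symmetric exchange of one row (or column) between two layers is an assumption drawn from the description of FIG. 6 and FIG. 7, not a stated implementation.

```python
from typing import NamedTuple


class AcamEntry(NamedTuple):
    la: int    # original (mapping) layer
    ra: int    # row address
    ca: int    # column address
    ml: int    # mapped layer
    rcef: int  # R/CEF: 0 = row exchanged, 1 = column exchanged


def mapped_layer(la, ra, ca, acam):
    """Return ML for a fault at (la, ra, ca), given earlier exchanges."""
    for e in acam:
        # A row exchange matches on the row address, a column exchange
        # on the column address.
        on_exchanged_line = (e.ra == ra) if e.rcef == 0 else (e.ca == ca)
        if on_exchanged_line and la == e.la:
            return e.ml
        if on_exchanged_line and la == e.ml:
            return e.la
    return la  # no exchange applies: the fault stays in its own layer


acam = [AcamEntry(1, 2, 2, 2, 0)]         # written for fault #4 in FIG. 6
ml_fault5 = mapped_layer(2, 2, 3, acam)   # fault #5 at (RA, CA) = (2, 3), layer 2
```

    With the ACAM entry written for the fault #4, the sketch returns 1 for the fault #5, and it returns the original layer for a fault such as (RA, CA)=(3, 2) of the layer 2 that does not lie on the exchanged row.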

    [0040] When the clustering method of the present disclosure is compared with the static clustering method in the related art, the advantages of the present disclosure become more apparent. In the case of the static clustering in the related art, proper clustering is started only after all faults are detected. Therefore, in a first step, the faults #1 and #2 are determined to be repaired using a row redundancy only after five faults #1 to #5 are stored in the redundancy storage device (RCAM), and in a second step, the fault #3 is clustered to layer 1. Then, in a third step, the faults #3 and #4 are determined to be repaired using a column redundancy, and in a fourth step, it is determined whether to repair the fault #5 using a row redundancy or a column redundancy. Therefore, unlike the present disclosure, in the related art, since a series of processes from the first step to the fourth step are additionally required after all the faults are stored, additional time for the processes is also required.

    [0041] FIG. 8 is a graph showing an intuitive comparison of analysis time between an embodiment of the present disclosure and the related art. As described above, in the embodiment of the present disclosure, since clustering is performed at the same time as a fault is detected or collected, the time required for clustering is reduced. In the related art, however, clustering is performed only after all of the faults are detected. Therefore, it can be seen that the related art has a longer execution time for redundancy management, because additional time is required in addition to the self-test (BIST).

    [0042] FIG. 9 is a flowchart illustrating a clustering process in an embodiment of the present disclosure.

    [0043] While the built-in self-test (BIST) for a semiconductor memory has started and has not yet ended (step S110, branch “No”), a search for a new fault is performed (step S120); when the self-test has ended (step S110, branch “Yes”), fault clustering is terminated and an operation of replacing a fault cell with a redundant cell is started. Whether a row address or a column address of a newly detected fault matches that of a previously detected fault is checked in the address storage device ACAM (S130). As a result of the check, when the row address or the column address of the newly detected fault matches that of the previously detected fault, a layer number ML of the new fault is set (S140), and when the row address or the column address of the newly detected fault does not match that of the previously detected fault, whether to perform a row-must or column-must repair is determined (S150). When it is determined to perform the row-must or column-must repair, corresponding information is set in the redundancy storage device (RCAM), that is, in the entry of the RCAM corresponding to the fault, the value of RMF is set to ‘1’ or the value of CMF is set to ‘1’ (S170). When it is determined not to perform the row-must or column-must repair, it is checked whether the fault is to be clustered from a layer to which the fault belongs to another layer (S160). As a result of the check, when the clustering operation is possible, corresponding layer information is stored in the address storage device ACAM (S180). In step S180, the changed layer number is stored in the mapped layer number ML, and R/CEF is written as 0 in the case of row-wise clustering and R/CEF is written as 1 in the case of column-wise clustering. When the clustering operation is not possible in step S160, corresponding information is stored in the redundancy storage device (RCAM) (S190).
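
    The branch order of FIG. 9 can be summarized in a short control-flow sketch. It is a skeleton only: the function analyze and its three callable parameters (acam_lookup, must_repair, find_cluster) are hypothetical placeholders for decisions whose criteria are not spelled out here, and storing the entry in the RCAM after step S140 is an assumption based on the description of FIG. 7.

```python
def analyze(faults, acam_lookup, must_repair, find_cluster):
    """Process faults in detection order and return (rcam, acam).

    faults       : iterable of (LA, RA, CA); exhausted when the BIST ends (S110)
    acam_lookup  : (fault, acam) -> mapped layer, or None        (S130)
    must_repair  : (fault, rcam) -> "row", "column", or None     (S150)
    find_cluster : (fault, rcam) -> (mapped layer, R/CEF), or None (S160)
    """
    rcam, acam = [], []
    for la, ra, ca in faults:                      # S120: next new fault
        fault = (la, ra, ca)
        entry = {"LA": la, "RA": ra, "CA": ca, "ML": la, "RMF": 0, "CMF": 0}

        ml = acam_lookup(fault, acam)              # S130: address match?
        if ml is not None:
            entry["ML"] = ml                       # S140: set layer number
            rcam.append(entry)
            continue

        must = must_repair(fault, rcam)            # S150: must-repair?
        if must is not None:
            entry["RMF" if must == "row" else "CMF"] = 1   # S170
            rcam.append(entry)
            continue

        target = find_cluster(fault, rcam)         # S160: cluster?
        if target is not None:
            mapped, rcef = target                  # S180: record in the ACAM
            acam.append({"LA": la, "RA": ra, "CA": ca,
                         "ML": mapped, "R/CEF": rcef})
        else:
            rcam.append(entry)                     # S190: plain RCAM entry
    return rcam, acam
```

    Passing the decision logic as parameters keeps the sketch neutral about how the must-repair and clustering decisions are actually made; any concrete rule supplied for them is the caller's assumption.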

    [0044] FIG. 10 illustrates a dynamic fault clustering apparatus according to an illustrative embodiment of the present disclosure. FIG. 10 shows one example of apparatuses capable of implementing the processes of the present disclosure, but embodiments are not limited thereto. The apparatus illustrated in FIG. 10 includes a memory cell array 350 that may store binary information and may be divided into multiple layers as needed, a global redundancy 340 provided to replace a fault cell, a redundancy CAM 330 provided for clustering, an address CAM 320, the redundancy analyzer 310, and a multiplexer 360.

    [0045] The array 350 includes a plurality of layers, and each layer may correspond to a portion obtained by dividing the cells of a memory chip by an appropriate capacity, or each layer may correspond to a set of memory cells belonging to each chip in a device, such as a high bandwidth memory (HBM) in which several memory chips are stacked and connected by through-silicon vias (TSVs).

    [0046] The global redundancy 340 may be a set of extra memory cells provided to replace a fault memory cell, and may replace a fault memory cell without distinguishing between row-wise repair or column-wise repair.

    [0047] The redundancy CAM 330 may store information on a fault memory cell, that is, faults, and may store some or all of a layer number to which each fault belongs, a row address, a column address, a mapped layer number for mapping, a row-must flag indicating whether row-wise repair is required, and a column-must flag indicating whether column-wise repair is required. Preferably, as the redundancy CAM 330, a content addressable memory (CAM) or a storage device similar to the CAM may be used.

    [0048] The address CAM 320 stores information on how a detected fault is layer-mapped through clustering, and is configured to store, for each fault, a layer number to which the fault belongs, a row address, a column address, a mapped layer number for mapping, and whether clustering is row-wise clustering or column-wise clustering. Preferably, as the address CAM 320, a content addressable memory (CAM) or a storage device similar to the CAM may be used.

    [0049] The redundancy analyzer 310 is configured to perform a series of analysis processes according to the present disclosure, and may be implemented as a combination of logic circuits. In embodiments, the redundancy analyzer 310 may include a processor or microcontroller that contributes to the performance of one or more of the analysis processes by executing instructions stored in a non-transitory computer-readable medium.

    [0050] The multiplexer 360 is configured to selectively operate so that binary information may be inputted/outputted to/from the array 350 in the case of a normal memory cell and binary information may be inputted/outputted to/from the global redundancy 340 that is replacing a fault memory cell.
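
    The selection performed by the multiplexer 360 can be pictured with a simplified read path. This is a behavioral sketch under stated assumptions: the function read_bit, the per-cell remap table, and the list-based storage are hypothetical, and a real device may remap whole rows or columns rather than single cells.

```python
def read_bit(layer, row, col, array, spares, remap):
    """Return the stored bit, taking it from the global redundancy
    when the addressed cell has been replaced."""
    key = (layer, row, col)
    if key in remap:
        # Fault cell: the multiplexer selects the global redundancy.
        return spares[remap[key]]
    # Normal cell: the multiplexer selects the memory cell array.
    return array[layer][row][col]


# One layer of 2x2 cells; the cell at (0, 1, 1) is treated as faulty
# and replaced by spare cell 0, which holds the intended value 1.
array = [[[0, 1], [1, 0]]]
spares = [1]
remap = {(0, 1, 1): 0}

good = read_bit(0, 0, 1, array, spares, remap)
repaired = read_bit(0, 1, 1, array, spares, remap)
```

    Here the read of the normal cell comes from the array and the read of the replaced cell comes from the spare, so both calls return 1 in this example.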

    [0051] FIG. 11 is a graph comparing redundancy analysis time in simulations that compared embodiments of the present disclosure with the related art in order to verify the effects of the present disclosure, wherein a horizontal axis denotes the number of faults per layer and a vertical axis denotes the analysis time. It is assumed that the size of the memory cell array is 1,024×1,024, faults have a Poisson probability distribution with an expected value of λ=20, and the types of faults are single faults of 60%, row-wise faults of 20%, and column-wise faults of 20%.
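
    For readers who wish to reproduce comparable inputs, the stated assumptions can be turned into a small fault generator. This is a sketch only and not the simulator used for FIG. 11: the function names poisson and sample_layer_faults are hypothetical, the Poisson sampler uses Knuth's multiplication method, and each fault is reduced to a type label plus one (row, column) position because the extent of row-wise and column-wise faults is not specified here.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson-distributed integer (Knuth's method)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_layer_faults(rng, lam=20, rows=1024, cols=1024,
                        mix=(0.6, 0.2, 0.2)):
    """Return a list of (kind, row, col) faults for one layer.

    The number of faults follows Poisson(lam); kinds are drawn with
    the single / row-wise / column-wise ratio given in `mix`.
    """
    count = poisson(lam, rng)
    kinds = rng.choices(("single", "row", "column"), weights=mix, k=count)
    return [(kind, rng.randrange(rows), rng.randrange(cols))
            for kind in kinds]


rng = random.Random(0)  # fixed seed so a run is repeatable
layer_faults = sample_layer_faults(rng)
```

    Changing mix to (0.6, 0.35, 0.05) or (0.6, 0.05, 0.35) and the array dimensions to 2,048×512 gives inputs for other fault-type ratios and array sizes.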

    [0052] Referring to FIG. 11, it can be seen that the analysis time of the dynamic redundancy method of the present disclosure is shorter than that of the static redundancy method in the related art. Particularly, the greater the number of faults detected in each layer, the more prominent the effect of the present disclosure is. For example, when the number of faults per layer is 28, the absolute time saved is greater than when the number of faults per layer is 16. The test time tends to increase rapidly as the degree of integration of a semiconductor memory increases and the cost associated with the test also tends to increase. Thus, the saving of redundancy analysis time is important in mass production of semiconductor memories. FIG. 12 is a diagram comparing repair rates between an embodiment of the present disclosure and the related art, wherein a horizontal axis denotes the number of extra cells per layer and a vertical axis denotes a repair rate. It can be seen that there is no significant difference in the repair rate between the embodiment of the present disclosure and the related art.

    [0053] In order to further verify the advantages that accrue when the fault clustering technology of the present disclosure is applied, namely that analysis time is saved with little reduction in the repair rate, simulations were repeated while the ratio of fault types was varied. For example, even when the size of the memory cell array was fixed to 2,048×512 and the ratio of single faults, row-wise faults, and column-wise faults was changed from (0.6, 0.2, 0.2) to (0.6, 0.35, 0.05) and (0.6, 0.05, 0.35), no significant reduction in the repair rate occurred, as illustrated in FIG. 13 to FIG. 15.

    [0054] In another simulation result of redundancy analysis time performed under different conditions, as illustrated in FIG. 16, time is reduced by 70% or more in an embodiment of the present disclosure compared to the related art. In FIG. 16, a horizontal axis indicates the assumed number of faults per layer (i.e., 10, 12, and 14) and a vertical axis indicates the analysis time expressed in cycles. For reference, in the simulation of FIG. 16, it was assumed that the size of the memory cell array was 1,024×1,024, redundancy capable of treating 16 spares was provided in each of two layers, the ratio of single faults, row-wise faults, and column-wise faults was (0.6, 0.2, 0.2), and the faults followed a Poisson probability distribution.

    [0055] Although the present disclosure has been described with reference to the embodiments illustrated in the drawings, the embodiments of the disclosure are for illustrative purposes only, and those skilled in the art will appreciate that various modifications and equivalent other embodiments are possible from the embodiments. Thus, the true technical scope of the present disclosure should be defined by the following claims.