System and method for testing non-volatile memory express storage devices
11836059 · 2023-12-05
Cpc classification
G06F11/221
PHYSICS
International classification
G06F11/22
PHYSICS
Abstract
PCIe devices may be connected to a test system for development, quality assurance, manufacturing, design validation, qualification, certification, and other testing. PCIe bus or other unexpected errors can avoid direct capture by the test system. Inserting a PCIe analyzer can capture a trace of PCIe bus data around any specific trigger. Due to the high volume and speed of data crossing the data bus when testing multiple devices, finding a correct trigger for an analyzer trace capture is akin to finding a needle in a haystack. By configuring a specific trigger pattern that the test system can send across the PCIe bus without impacting any of the devices under test, the test system can trigger the analyzer at the precise time needed to capture a PCIe bus data trace around the error.
Claims
1. A method for capturing a bus data trace around an error event during testing of a peripheral component interconnect express (PCIe) device, comprising: connecting a test system to one or more devices under test across a PCIe bus, wherein the one or more devices under test comprise up to sixteen non-volatile memory express solid state drives; connecting a PCIe analyzer to the PCIe bus; capturing, by the PCIe analyzer, a trace of PCIe data into a buffer; transmitting data from the test system across the PCIe bus to the devices under test to test the devices under test; upon detecting, by the test system, an unexpected error, sending a trigger pattern across the PCIe bus within a write to a read-only register such that the trigger pattern is recognizable by the PCIe analyzer without impacting any of the devices under test; and detecting, by the PCIe analyzer, the trigger pattern and halting PCIe trace data capture to preserve PCIe data around the unexpected error.
2. A method for capturing a bus data trace around an error event during testing of a peripheral component interconnect express (PCIe) device, comprising: connecting a test system to one or more devices under test across a PCIe bus; connecting a PCIe analyzer to the PCIe bus; capturing, by the PCIe analyzer, a trace of PCIe data into a buffer; transmitting data from the test system across the PCIe bus to the devices under test to test the devices under test; upon detecting, by the test system, an unexpected error, sending a trigger pattern across the PCIe bus such that the trigger pattern is recognizable by the PCIe analyzer without impacting any of the devices under test; and detecting, by the PCIe analyzer, the trigger pattern and halting PCIe trace data capture to preserve PCIe data around the unexpected error.
3. The method of claim 2, further comprising simultaneously testing, as the one or more devices under test, up to sixteen non-volatile memory express solid state drives.
4. The method of claim 2, further comprising sending the trigger pattern within a write to a read-only register.
5. The method of claim 4, further comprising including four bytes of data within the write that includes the trigger pattern, wherein the four bytes of data specify a type and subtype associated with the unexpected error.
6. A system for capturing a bus data trace around an error event during testing of a peripheral component interconnect express (PCIe) device, comprising: a host test server; one or more devices under test connected to the host test server across a PCIe bus; a PCIe analyzer connected to the PCIe bus; wherein the PCIe analyzer includes a processor operating software instructions to: capture a trace of PCIe data into a buffer; and detect a trigger pattern sent on the PCIe bus and halting PCIe trace data capture; and wherein the host test server includes a processor operating software instructions to: transmit data across the PCIe bus to the devices under test to test the devices under test; and upon detecting an unexpected error, send the trigger pattern across the PCIe bus such that the trigger pattern is recognizable by the PCIe analyzer without impacting any of the devices under test.
7. The system of claim 6, wherein the one or more devices under test further comprise up to sixteen non-volatile memory express solid state drives.
8. The system of claim 6, wherein the host test server software instructions further comprise instructions to send the trigger pattern within a write to a read-only register.
9. The system of claim 8, wherein the host test server software instructions further comprise instructions to include four bytes of data within the write that includes the trigger pattern, wherein the four bytes of data specify a type and subtype associated with the unexpected error.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) In the drawings, closely related figures and items have the same number but different alphabetic suffixes. Processes, states, statuses, and databases are named for their respective functions.
(2)
(3)
(4)
(5)
DETAILED DESCRIPTION, INCLUDING THE PREFERRED EMBODIMENT
(6) In the following detailed description, reference is made to the accompanying drawings which form a part hereof, and in which are shown, by way of illustration, specific embodiments which may be practiced. It is to be understood that other embodiments may be used, and structural changes may be made without departing from the scope of the present disclosure.
(7) Operation
(8) Referring to
(9) During testing, unexpected errors on the PCIe bus may be encountered. Referring also to
(10) A solution is to send a pattern over the PCIe bus from the test system which the analyzer is configured to recognize and, based on that recognition, capture the trace of the PCIe bus data. The challenge in sending such a pattern is that it needs to be addressed to a device connected on the PCIe bus, and should not interfere with or otherwise alter the devices under test.
(11) One pattern which can be sent and detected without impacting devices under test is a write 330 into a controller's device or vendor registers, which are read-only, so the write will be ignored by the actual device. Any detectable trigger pattern can be used, such as “1ae3”. With a test system configured to send such a trigger pattern, the analyzer can be configured to capture a PCIe data trace based on detection of the trigger pattern.
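The composition of such a write can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the 32-bit layout (pattern in the upper 16 bits, an optional error code in the lower 16 bits), the little-endian byte order, and the names build_trigger_dword and send_trigger are all hypothetical, and a seekable file-like object stands in for the read-only register.

```python
import io
import struct

TRIGGER_PATTERN = 0x1AE3  # example pattern from the text ("1ae3")


def build_trigger_dword(error_code: int = 0) -> bytes:
    """Pack the pattern and an optional 16-bit error code into four bytes.

    Assumed layout (illustrative only): pattern in the upper 16 bits,
    error code in the lower 16 bits, little-endian on the wire.
    """
    if not 0 <= error_code <= 0xFFFF:
        raise ValueError("error_code must fit in 16 bits")
    return struct.pack("<I", (TRIGGER_PATTERN << 16) | error_code)


def send_trigger(register, offset: int, error_code: int = 0) -> bytes:
    """Write the trigger dword at `offset` of a seekable binary object.

    `register` is a stand-in for the read-only register; a real device
    would ignore the write, while an analyzer on the bus would still see it.
    """
    payload = build_trigger_dword(error_code)
    register.seek(offset)
    register.write(payload)
    return payload


if __name__ == "__main__":
    # An in-memory buffer replaces real hardware for this demonstration.
    fake_register = io.BytesIO(bytes(8))
    send_trigger(fake_register, 0, error_code=0x0200)
    print(fake_register.getvalue().hex())
```

Because the target register is read-only, the write changes nothing on the device; only the analyzer, which observes the bus traffic, reacts to the payload.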
(12) Referring also to
(13) The test system may be configured, by default, to send the detectable pattern and trigger a PCIe trace on specific errors. Additional data codes may identify the error, and be included in the pattern (and if so, the analyzer trigger payload should be equivalently extended). One example set of error triggers may include:
0100: error requiring Controller Reset
0200: command timeout
0300: Controller Fatal Status detected
0400: error during Controller Reset or Initialization
0500: error during NVM Subsystem Reset
0600: error during PCI Reset
0700: error waiting for Controller Ready (being enabled)
0800: error waiting for Controller Shutdown (being disabled)
0900: device is “Gone”
(14) These may be individually enabled or disabled through command line or user interface controls of the test system.
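One way to model the example error triggers and their individual enable or disable controls is sketched below. The enum member names, the TriggerPolicy class, and the reading of the listed codes as hexadecimal values are assumptions made for illustration; the disclosure does not specify how the test system stores these settings.

```python
from enum import IntEnum


class ErrorTrigger(IntEnum):
    """Example error codes from the list above (hex reading is assumed)."""
    CONTROLLER_RESET_REQUIRED = 0x0100
    COMMAND_TIMEOUT = 0x0200
    CONTROLLER_FATAL_STATUS = 0x0300
    RESET_OR_INIT_ERROR = 0x0400
    NVM_SUBSYSTEM_RESET_ERROR = 0x0500
    PCI_RESET_ERROR = 0x0600
    CONTROLLER_READY_WAIT_ERROR = 0x0700
    CONTROLLER_SHUTDOWN_WAIT_ERROR = 0x0800
    DEVICE_GONE = 0x0900


class TriggerPolicy:
    """Tracks which error triggers are currently enabled (hypothetical helper)."""

    def __init__(self, enabled=None):
        # With no argument, every trigger starts enabled.
        self._enabled = set(ErrorTrigger) if enabled is None else set(enabled)

    def enable(self, trigger) -> None:
        self._enabled.add(ErrorTrigger(trigger))

    def disable(self, trigger) -> None:
        self._enabled.discard(ErrorTrigger(trigger))

    def should_trigger(self, trigger) -> bool:
        return ErrorTrigger(trigger) in self._enabled
```

A test system built this way would consult should_trigger before sending the pattern, so that a disabled error type does not halt the analyzer capture.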
(15) Additionally, a command line interface on the test system may be used to manually send a trigger signal, or scripts may be stored and run on the test system to send a trigger signal on additional error conditions. Within an example test system, an “sb_echo trigger=X[,Y]>/proc/vlun/nvme” command may be used, where X identifies a specific controller by target number or PCI name and the optional Y identifies the additional error information.
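The argument format of that command, “trigger=X[,Y]”, can be parsed as sketched below. Only the format itself comes from the text; the parse_trigger_command function, the TriggerRequest type, and the choice to keep both fields as strings are hypothetical, and the sketch does not reproduce how the example test system actually handles the command.

```python
from typing import NamedTuple, Optional


class TriggerRequest(NamedTuple):
    controller: str            # X: target number or PCI name
    error_info: Optional[str]  # Y: optional additional error information


def parse_trigger_command(text: str) -> TriggerRequest:
    """Parse a string of the form 'trigger=X[,Y]' into its parts."""
    key, sep, value = text.strip().partition("=")
    if key != "trigger" or not sep or not value:
        raise ValueError("expected 'trigger=X[,Y]'")
    # Split on the first comma only; PCI names contain colons and dots,
    # not commas, so X is preserved intact.
    controller, _, error_info = value.partition(",")
    if not controller:
        raise ValueError("controller identifier X is required")
    return TriggerRequest(controller, error_info or None)
```

For instance, parse_trigger_command("trigger=3") yields a request for target 3 with no error information, while a second comma-separated field is carried through as error_info.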
(16) After the trigger pattern is sent from the test system to the PCIe analyzer, a trace of the PCIe data is retained 340 on the analyzer. The specific trigger and any additional error information may be viewed, along with PCIe data before and after the error. This preserves the specific trace around an otherwise untraceable error.
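The analyzer-side behavior, continuous capture into a buffer that is halted when the trigger pattern appears, can be modeled with a ring buffer. The sketch below is a simplified software illustration, not a description of any real analyzer: the TraceBuffer class, its capacity and post_trigger parameters, and the byte-string packet representation are all assumptions.

```python
from collections import deque


class TraceBuffer:
    """Ring buffer that stops capturing shortly after a trigger is seen."""

    def __init__(self, capacity: int, post_trigger: int = 0):
        self._buf = deque(maxlen=capacity)  # oldest packets fall off the front
        self._post = post_trigger           # packets to keep after the trigger
        self._remaining = None              # None until the trigger is seen
        self.halted = False

    def capture(self, packet: bytes, pattern: bytes) -> None:
        """Record one packet; halt once the post-trigger budget is used up."""
        if self.halted:
            return
        self._buf.append(packet)
        if self._remaining is None:
            if pattern in packet:
                self._remaining = self._post
        else:
            self._remaining -= 1
        if self._remaining == 0:
            self.halted = True

    def trace(self) -> list:
        """Return the retained packets, oldest first."""
        return list(self._buf)
```

Halting the capture rather than starting it is what preserves traffic from before the error: the buffer already holds the most recent packets when the trigger arrives, and the post_trigger allowance adds context from after it.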
(17) It is to be understood that the above description is intended to be illustrative, and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.