RAPID ITEM DEVELOPMENT USING INTELLIGENT TEMPLATES TO EXPEDITE ITEM BANK EXPANSION

20200293994 · 2020-09-17

Abstract

A system and method for rapidly developing items is presented. A root item is created or identified as a starting point for development of an item template. The item template is developed from the selected root item using a key calculation as well as calculations to generate distractors to ensure that an item discriminates well. The template identifies variables, provides the calculation and rationale for each answer option, and defines any variable constraints. Items are then cloned from the template. The cloned items may be identical in format to the root item. Finally, the statistical performance of the cloned items is verified by subjecting a few select cloned items to an initial statistical analysis to validate the performance of the template. Once the template is validated, multiple cloned items may be created and used in scored positions on future test administrations without needing to pretest.

Claims

1. A computerized method for expanding an item bank, comprising: identifying a root item for use as a starting point for developing an item template, wherein the item template identifies variables, provides a calculation and rationale for each answer option, and defines any variable constraints; creating cloned items from the item template, wherein the cloned items are identical in format to the root item; verifying statistical performance of the cloned items by subjecting three or more cloned items to statistical analysis to validate performance of the item template; and once validated, creating a plurality of cloned items for expanding an item bank.

2. The computerized method for expanding an item bank of claim 1, wherein items from the item bank are used in a certification examination.

3. The computerized method for expanding an item bank of claim 2, wherein the certification examination is presented to a user seeking certification.

4. The computerized method for expanding an item bank of claim 1, wherein the item template is developed from the root item using a key calculation as well as calculations to generate distractors to ensure that an item discriminates well.

5. The computerized method for expanding an item bank of claim 1, wherein the root item is selected because it met psychometric standards during the root item's most recent administration.

6. The computerized method for expanding an item bank of claim 1, wherein the variable constraints define a range of potential values for each variable.

7. The computerized method for expanding an item bank of claim 1, wherein the plurality of cloned items may be used without pretesting.

8. The computerized method for expanding an item bank of claim 1, wherein item response theory is used for verification.

9. The computerized method for expanding an item bank of claim 1, wherein the statistical performance of the three or more cloned items is determined by whether p-values of the root item and the three or more cloned items are within +/-0.10 of each other.

10. The computerized method for expanding an item bank of claim 1, wherein the statistical performance of the three or more cloned items is determined by whether discrimination statistics of the root item and the three or more cloned items are within +/-0.15 of each other.

11. The computerized method for expanding an item bank of claim 1, wherein the statistical performance of the three or more cloned items is determined by whether IRT b values of the root item and the three or more cloned items are within +/-0.60 logits of each other.

Description

DESCRIPTION OF THE DRAWINGS

[0017] For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:

[0018] FIG. 1 depicts an example root item on the Cost of Goods Sold (COGS) concept in one embodiment of the invention;

[0019] FIG. 2 depicts a template developed from the root item in FIG. 1; and

[0020] FIG. 3 depicts a cloned item created from the template in FIG. 2.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0021] The present invention is directed to improved methods and systems for, among other things, expediting item development. The configuration and use of the presently preferred embodiments are discussed in detail below. It should be appreciated, however, that the present invention provides many applicable inventive concepts that may be embodied in a wide variety of contexts other than test item generation. Accordingly, the specific embodiments discussed are merely illustrative of specific ways to make and use the invention, and do not limit the scope of the invention. In addition, the following terms shall have the associated meaning when used herein:

[0022] "item" means any item used in testing such as, for example, multiple choice, true-false, matching, completion, or essay questions; and

[0023] "test" means any test or examination that includes items, such as a certification test, standardized test or other examination.

[0024] For almost all organizations, an increase in item production directly corresponds to an increase in test development costs. Depending on the item writing process followed, the overhead cost of developing one statistically verified, scored item may range from several hundred dollars to several thousand dollars. To achieve scored status, new items require pretesting, which involves statistical validation and management of test publishing cycles and pretest tails/unscored item sets to obtain maximum throughput. The methods and systems of the present invention may help eliminate the need to repeat pretesting for variations on a specific topic or methodology, which increases opportunities to pretest different item types and levels of thinking.

[0025] Embodiments of the present invention include a strategic approach to quickly develop new items that do not have to undergo the pretest process before being used in scored positions on an exam. This reduces test development costs, keeps test content current and increases test security. In addition, the ramp-up of test items and a growing item bank will allow organizations to expand their testing windows, even as far as on-demand testing. This is a significant benefit to the organization as well as to candidates. With on-demand testing, candidates have the flexibility of testing on their own schedule, without missing the opportunity to test or having to re-study because they missed a set window. In addition, on-demand testing allows for more frequent statistical analysis, approval of statistically valid test items, and decreased risk of item exposure, among other benefits.

[0026] The goals of credentialing organizations may differ quite drastically, but the need for item growth is something that is shared industry-wide. Embodiments of the present invention present a viable, statistically-valid means for achieving this growth while meeting ever-changing business needs and achieving a number of strategic business goals including strengthening test security, decreasing long-term item development costs and maximizing volunteer efficiency. Through the use of templates, key concepts and content areas may be introduced more quickly and tested more efficiently to allow organizations to accurately assess candidate ability in an accelerated business environment.

[0027] Various embodiments of the present invention involve a four-step process that includes: 1) identifying or creating a root item; 2) developing a template from that root item; 3) using the template to clone additional items; and 4) conducting statistical validation of the cloned additional items. Once these processes have been completed, new items may be created from the template without the need for further pretesting. These processes are described in more detail below, walking through the process from the formation of a root item through the statistical vetting of cloned items.

Step 1: Identify or Create a Root Item

[0028] Embodiments of the present invention commence with the creation or identification of a root item, which provides the starting point for development of the item template. A non-exclusive list of criteria for identifying a root item is provided below:

[0029] 1) Ensure that the item is relevant to the test body of knowledge, aligns with the test blueprint and provides value to the content being tested;

[0030] 2) Maintain pre-established organization guidelines for item writing style; and

[0031] 3) Provide many different variables that may be manipulated to create a dynamic template from which many cloned items may be generated. For example, there may be many items that present numerical inputs to use for calculation purposes. These numerical inputs are used to create formulas for the key and each plausible distractor. FIG. 1 shows an example root item on the Cost of Goods Sold (COGS) concept; the root item shown illustrates the importance of variety in numerical inputs.

[0032] The selection of a pre-existing scored item from the item pool has several advantages, including style guide adherence, linkages to the test content outline, and previously-validated psychometric statistics. Those skilled in the art will appreciate that, when selecting an existing item, one may choose an item that met psychometric standards during its most recent administration.

Step 2: Develop a Template from the Root Item

[0033] To develop a template from the root item, the key calculation must be known as well as calculations to generate all the remaining plausible distractors to help ensure that an item discriminates well, and that test-savvy candidates are not able to guess the correct answer solely by process of elimination. For example, if there are only two ways to manipulate the variable in the item stem, the root item is too easy because more than one distractor will be implausible and may be quickly eliminated as incorrect.

[0034] The template should identify the variables, provide the calculation and rationale for each answer option (i.e., the key and the distractors), and define any variable constraints. Variable constraints help ensure that distractors are plausible and that the item stem provides realistic information. If computer software is used, constraints are required to define the range of potential values for each variable. If subject matter experts are asked to clone items from a template, variable constraints promote standardization and provide additional quality control. FIG. 2 shows a template developed from the root item in FIG. 1. In this example, the constraints that must be followed are listed with each variable to ensure that plausibility is maintained. The placeholder shows the variable combination used to calculate each answer option.
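By way of illustration only, a template of the kind described above might be represented in software as in the following minimal Python sketch. The variable names, numeric ranges, and distractor rationales are hypothetical and are not taken from FIG. 2; the key uses the standard relationship COGS = beginning inventory + purchases - ending inventory.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Variable:
    name: str
    low: int   # inclusive lower bound (variable constraint)
    high: int  # inclusive upper bound (variable constraint)


@dataclass(frozen=True)
class AnswerOption:
    role: str       # "key" or "distractor"
    rationale: str  # why a candidate might arrive at this option
    calc: Callable[[Dict[str, int]], int]


# Hypothetical COGS template: names and ranges are assumptions for illustration.
VARIABLES = (
    Variable("beginning_inventory", 10_000, 50_000),
    Variable("purchases", 60_000, 200_000),
    Variable("ending_inventory", 10_000, 50_000),
)

OPTIONS = (
    AnswerOption("key", "Beginning inventory + purchases - ending inventory",
                 lambda v: v["beginning_inventory"] + v["purchases"] - v["ending_inventory"]),
    AnswerOption("distractor", "Reverses the two inventory balances",
                 lambda v: v["ending_inventory"] + v["purchases"] - v["beginning_inventory"]),
    AnswerOption("distractor", "Omits ending inventory",
                 lambda v: v["beginning_inventory"] + v["purchases"]),
    AnswerOption("distractor", "Omits beginning inventory",
                 lambda v: v["purchases"] - v["ending_inventory"]),
)


def within_constraints(values: Dict[str, int]) -> bool:
    """True when every variable value lies inside its declared range."""
    return all(var.low <= values[var.name] <= var.high for var in VARIABLES)


def answer_options(values: Dict[str, int]) -> List[Tuple[str, int]]:
    """Evaluate the key and each distractor for one set of variable values."""
    if not within_constraints(values):
        raise ValueError("variable value outside template constraints")
    return [(opt.role, opt.calc(values)) for opt in OPTIONS]
```

Keeping the rationale next to each calculation mirrors the requirement that the template document why each answer option exists, which may assist subject matter experts who review the template.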

Step 3: Clone Items from Template

[0035] Using an established template, one may create multiple cloned items. Cloned items should be identical in format to the root item, with the only changes being to the item variables. In many embodiments, it is important to adhere to the pre-determined variable constraints to ensure that all aspects of the new item remain plausible. Even small changes to the template's language, format or presentation may result in variability in the statistical performance of the cloned item. Those skilled in the art will appreciate the importance of adhering to the item template. FIG. 3 shows a cloned item that was created from the template in FIG. 2.
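The cloning step may be sketched in software as follows. This is a minimal Python illustration under stated assumptions: the stem wording, the variable ranges, the 1,000-unit step, and the distractor formulas are all hypothetical and do not reproduce FIG. 2 or FIG. 3. The sketch holds the stem text fixed, changes only the variable values, and discards any draw in which two answer options would coincide.

```python
import random

# Hypothetical fixed stem; only the bracketed variables change between clones.
STEM = ("A company reports beginning inventory of ${bi:,}, purchases of "
        "${pu:,}, and ending inventory of ${ei:,}. What is the cost of goods sold?")

# Hypothetical variable constraints as inclusive (low, high) ranges.
CONSTRAINTS = {"bi": (10_000, 50_000), "pu": (60_000, 200_000), "ei": (10_000, 50_000)}
STEP = 1_000  # assumed granularity so the values look realistic


def option_values(v):
    """Key first, then three illustrative distractors."""
    return [
        v["bi"] + v["pu"] - v["ei"],  # key
        v["ei"] + v["pu"] - v["bi"],  # distractor: inventory balances reversed
        v["bi"] + v["pu"],            # distractor: ending inventory omitted
        v["pu"] - v["ei"],            # distractor: beginning inventory omitted
    ]


def clone_items(n, seed=0):
    """Draw n distinct clones whose four answer options are all different."""
    rng = random.Random(seed)
    clones, seen = [], set()
    while len(clones) < n:
        v = {name: rng.randrange(lo, hi + 1, STEP)
             for name, (lo, hi) in CONSTRAINTS.items()}
        opts = option_values(v)
        signature = tuple(v.values())
        if len(set(opts)) < len(opts) or signature in seen:
            continue  # reject duplicate options or a repeated clone
        seen.add(signature)
        clones.append({"stem": STEM.format(**v), "key": opts[0],
                       "options": sorted(opts)})
    return clones
```

Sorting the options is one simple way to avoid always placing the key first; an organization's style guide may call for a different ordering rule.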

Step 4: Verify Statistical Performance of Cloned Items

[0036] As with any newly-developed item, a select few cloned items should undergo an initial statistical analysis to validate the performance of a template. However, once the performance of a template is validated, multiple cloned items may be created and used in scored positions on future test administrations without needing to pretest. Approved item templates are those whose cloned items perform within a psychometrically acceptable range of the root item on multiple statistical measures. In some embodiments, the performance of at least three cloned items may be verified before approving the use of a template for mass item generation. These three cloned items may be referred to as the beta clones, with subsequent cloned items becoming immediately operational after successful performance of the beta clones has been verified.

[0037] To establish a consistent testing environment, organizations may administer the beta clones concurrently using multiple pretest tails/sets on the same or parallel base forms. This data collection design helps protect against sample changes and other sources of variance that may be introduced over time. If concurrent testing is not possible, it may be desirable to develop an individualized plan for administering at least three beta clones, from the same item template, during a reasonable timeframe.

[0038] Verifying the statistical performance of beta clones requires a sample size large enough to draw defensible conclusions when interpreting pretest results. In some embodiments, each of the beta clones must be administered to an adequate sample of candidates before making statistical comparisons of performance across the cloned items and root item. These exams typically have a relatively high candidate volume, which permits the use of item response theory (IRT) scoring. IRT is a powerful statistical model that allows for sample-independent comparisons of candidate and item performance. To maintain a stable IRT scoring scale, a minimum of 300 candidate responses to each pretest item is collected before running statistical analyses on examination data.

[0039] Approving an item template occurs by comparing the actual and predicted performance of the beta clones on, for example, three statistical indices. When three indices are used, the first index may be the IRT item difficulty, or b parameter. In general, item difficulty values for a test range from -4 to +4 logits, with higher values indicating more difficult items. In addition to judging cloned item performance by the IRT b parameter, the system also compares cloned item performance using classical test theory (CTT) statistics. Unlike IRT parameters, CTT statistics are sample-dependent and vary depending on the proficiency level of the candidates taking the exam. An item's CTT difficulty value (p-value) reflects the proportion of candidates who answered the item correctly on a single test form during a specific administration window. An item's p-value is 0 if no candidates answered the item correctly, and 1 if all candidates answered it correctly.
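The two difficulty measures described above may be illustrated with the following minimal Python sketch. The function names are hypothetical. The p-value function follows the definition given above directly. The second function assumes a one-parameter (Rasch) IRT model, which this description does not specify, and is included only to show why a higher b value corresponds to a more difficult item.

```python
import math


def p_value(responses):
    """CTT difficulty: proportion of candidates answering correctly.

    `responses` is a sequence of 0 (incorrect) and 1 (correct).
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


def rasch_probability(theta, b):
    """Probability of a correct answer under an assumed Rasch model.

    theta is candidate ability and b is item difficulty, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))
```

For a candidate of fixed ability, increasing b lowers the probability returned by `rasch_probability`, which is consistent with higher b values indicating more difficult items.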

[0040] The second CTT index may be an item's discrimination value, which represents the correlation between item and test performance. The system measures item discrimination using the point-biserial correlation coefficient. Discrimination values range from -1 to +1. A +1 occurs when all high performers (large total test score) answer an item correctly and all low performers (small total test score) respond incorrectly. The inverse results in a discrimination of -1. Larger discrimination values are desirable because they indicate a strong, positive relationship between answering an item correctly and performing well on the examination.
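The point-biserial coefficient is the Pearson correlation between a dichotomous item score and the total test score. The following minimal Python sketch computes it directly from that definition; the function name and the choice to use the uncorrected total score (that is, a total that still includes the item itself) are assumptions made for illustration.

```python
import math


def point_biserial(item_scores, total_scores):
    """Pearson correlation between 0/1 item scores and total test scores."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    if var_x == 0 or var_y == 0:
        # Everyone answered the same way, or all totals are equal.
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

For example, if the two highest-scoring candidates answer correctly and the two lowest-scoring candidates answer incorrectly, with identical totals within each pair, the function returns a value at the upper end of the range; reversing the item scores gives the lower end.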

[0041] What constitutes similar item performance across the three indices may be determined by the organization and may be based on existing psychometric/statistical guidelines and thresholds used in decision making, such as those used when assessing item quality. In one embodiment, the system of the present invention uses the following guidelines to help determine whether the beta clones and root item are performing similarly:

[0042] The p-values should be within +/-0.10 of each other;

[0043] The discrimination statistics should be within +/-0.15 of each other; and

[0044] The IRT b values should be within +/-0.60 logits of each other.
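The guidelines above may be sketched as a simple approval check. The following Python illustration is one possible reading only: the class and function names are hypothetical, each beta clone is compared against the root item (rather than every pair of items against each other), and the treatment of values that fall exactly on a threshold is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemStats:
    p_value: float         # CTT difficulty, 0 to 1
    discrimination: float  # point-biserial, -1 to +1
    irt_b: float           # IRT difficulty in logits


# Thresholds from the guidelines above.
P_TOL = 0.10
DISC_TOL = 0.15
B_TOL = 0.60


def performs_similarly(root: ItemStats, clone: ItemStats) -> bool:
    """True when the clone is within all three tolerances of the root item."""
    return (
        abs(root.p_value - clone.p_value) <= P_TOL              # boundary inclusive (assumed)
        and abs(root.discrimination - clone.discrimination) <= DISC_TOL
        and abs(root.irt_b - clone.irt_b) < B_TOL               # strict (assumed)
    )


def approve_template(root: ItemStats, beta_clones, min_clones: int = 3) -> bool:
    """Approve only if at least `min_clones` beta clones all perform similarly."""
    if len(beta_clones) < min_clones:
        return False
    return all(performs_similarly(root, clone) for clone in beta_clones)
```

An organization could substitute its own thresholds for the three constants without changing the structure of the check.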

[0045] Among all three statistics, one embodiment of the present invention relies most heavily on the IRT difficulty value to determine whether to approve an item template. For the IRT difficulty value, similar performance is achieved if the absolute value of the displacement statistic (the difference between the actual and predicted difficulty of the item) is less than 0.60 logits. This is the same threshold that may be used when evaluating item performance to decide when to unanchor a scored item's difficulty parameter when calibrating pretest items. It may not be desirable for an organization to adopt these thresholds without considering its own existing psychometric guidelines for assessing item quality. An organization's psychometric staff or qualified consultants should be involved when deciding the most defensible option for assessing cloned item performance and approving templates for mass item generation without pretesting.

[0046] In an exemplary embodiment, a computerized method for expanding an item bank is presented in which a root item is identified for use as a starting point for developing an item template, wherein the item template identifies variables, provides a calculation and rationale for each answer option, and defines any variable constraints. Cloned items are then created from the item template, wherein the cloned items are identical in format to the root item. Statistical performance of the cloned items is verified by subjecting three or more cloned items to statistical analysis to validate performance of the item template. Once validated, a plurality of cloned items are created for expanding an item bank.

[0047] In this embodiment, the root item is selected because it met psychometric standards during the root item's most recent administration. The item template may be developed from the root item using a key calculation as well as calculations to generate distractors to ensure that an item discriminates well. The variable constraints define a range of potential values for each variable. The plurality of cloned items may be used without pretesting. Item response theory is used for verification. The statistical performance of the three or more cloned items is determined by whether p-values of the root item and the three or more cloned items are within +/-0.10 of each other, or by whether discrimination statistics of the root item and the three or more cloned items are within +/-0.15 of each other, or by whether IRT b values of the root item and the three or more cloned items are within +/-0.60 logits of each other. Items from the item bank may be used in a certification examination, and the certification examination may be presented to a user seeking certification.

[0048] The foregoing has outlined rather broadly certain aspects of the present invention in order that the detailed description of the invention that follows may better be understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures or processes for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.

[0049] While the present system and method has been disclosed according to the preferred embodiment of the invention, those of ordinary skill in the art will understand that other embodiments have also been enabled. Even though the foregoing discussion has focused on particular embodiments, it is understood that other configurations are contemplated. In particular, even though the expressions "in one embodiment" or "in another embodiment" are used herein, these phrases are meant to generally reference embodiment possibilities and are not intended to limit the invention to those particular embodiment configurations. These terms may reference the same or different embodiments, and unless indicated otherwise, are combinable into aggregate embodiments. The terms "a", "an" and "the" mean "one or more" unless expressly specified otherwise. The term "connected" means "communicatively connected" unless otherwise defined.

[0050] When a single embodiment is described herein, it will be readily apparent that more than one embodiment may be used in place of a single embodiment. Similarly, where more than one embodiment is described herein, it will be readily apparent that a single embodiment may be substituted for that one device.

[0051] In light of the wide variety of methods for item development known in the art, the detailed embodiments are intended to be illustrative only and should not be taken as limiting the scope of the invention. Rather, what is claimed as the invention is all such modifications as may come within the spirit and scope of the claims and equivalents thereto.

[0052] None of the description in this specification should be read as implying that any particular element, step or function is an essential element which must be included in the claim scope. The scope of the patented subject matter is defined only by the allowed claims and their equivalents. Unless explicitly recited, other aspects of the present invention as described in this specification do not limit the scope of the claims.

[0053] To aid the Patent Office and any readers of any patent issued on this application in interpreting the claims appended hereto, the applicant wishes to note that it does not intend any of the appended claims or claim elements to invoke 35 U.S.C. 112(f) unless the words "means for" or "step for" are explicitly used in the particular claim.