Guide to MITRE-Educational Testing Service Inductive Reasoning Battery for a High Ability Population

By Amber Sprenger, Ph.D., Robert Hartman, Ph.D.

MITRE and the Educational Testing Service developed tests to measure the core inductive reasoning component of fluid reasoning for research applications focusing on high-ability adults. This guide provides an overview of the test development approach.

MITRE and the Educational Testing Service (ETS) developed a suite of tests to measure the core inductive reasoning component of fluid reasoning (Gf) for research applications focusing on high-ability adults. This guide provides a short overview of the test development approach, followed by detailed descriptions of each test. A set of appendices provides general test administration information for researchers (Appendix A), as well as test instructions for participants, test items, item parameters, and scaled-scoring conversion tables for each test (Appendices B through I).
 
The inductive reasoning tests provided in this guide include three classes of “series completion” tests (figure series [FS], number series [NS], and letter series [LS]); a matrix reasoning (MR) test; and multiple “composite” test form options, each of which includes a mix of FS, NS, and/or LS items. Each test includes two parallel forms that are:
  • highly reliable: marginal reliability ≥ 0.87; 
  • difficult: the maximum possible score is at least four observed sample standard deviations (SDs) above the observed sample mean, based on a highly educated sample;
  • essentially unidimensional.
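
The difficulty criterion in the list above is simple arithmetic, and a small sketch may help researchers check how much ceiling headroom a test has in their own sample. The function name, the example scores, and the maximum score of 30 below are hypothetical and are not taken from the guide. The sketch also assumes that the "observed sample SD" is the usual n - 1 estimate and that the scores are on the same scale as the maximum possible score.

```python
from statistics import mean, stdev


def ceiling_headroom_in_sds(scores, max_possible_score):
    """Return how many sample SDs the maximum possible score lies above the sample mean."""
    sample_mean = mean(scores)
    sample_sd = stdev(scores)  # assumption: sample SD with the n - 1 denominator
    if sample_sd == 0:
        raise ValueError("sample SD is zero, so headroom is undefined")
    return (max_possible_score - sample_mean) / sample_sd


# Hypothetical scores for illustration only; these are not data from the guide.
example_scores = [10, 12, 14, 16, 18]
headroom = ceiling_headroom_in_sds(example_scores, max_possible_score=30)

# The guide's criterion: the maximum score is at least four SDs above the mean.
meets_criterion = headroom >= 4.0
```

A headroom value well above 4 suggests that few participants in a comparable sample would be expected to reach the maximum score, which is the property the guide describes as a high test ceiling.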
All tests were developed using large samples (n ≥ 2,000 participants) with equal representation of the following non-overlapping groups, defined by highest educational attainment: a) third- or fourth-year college students; b) college graduates; c) master’s degree students; d) master’s degree holders who have not yet obtained a doctorate; e) doctorate holders. This sampling is in keeping with the tests’ intended use: research studies with high-ability participants, where a high test ceiling is of critical importance and the trade-off of potential “floor effects” is acceptable.

All tests include 30 to 35 items and are administered with a 30- or 32-minute time limit, except for the FS-NS-LS composite, which includes 40 items and is administered with a 45-minute time limit. Each test also imposes a per-item time limit of 1 to 2 minutes (limits vary by test), which helps participants gauge how much time to spend on each item and so avoid running out of time at the test level.