Investigating the Relationship Between Sample Size and Reliability of Aggregate Test Score Measures in Colorado Schools

Authors: Benjamin R. Shear, Erik Whitfield, Kaitlin Nath

Citation: Shear, B. R., Whitfield, E., & Nath, K. (2025). Investigating the relationship between sample size and reliability of aggregate test score measures in Colorado schools. Boulder, CO: The Center for Assessment, Design, Research and Evaluation (CADRE), University of Colorado Boulder.

This report examines how changing Colorado’s minimum student count thresholds ("minimum n-size") could affect the availability and reliability of aggregate school and district achievement and growth measures. Using statewide test score data from 2017-2019 and a hierarchical linear modeling framework, the report documents that lowering the minimum n-size modestly increases reporting of overall school results but can substantially increase reporting for disaggregated student groups, often with lower reliability. The analyses demonstrate that the reliability of these aggregate test score measures increases with sample size, is higher for achievement than for growth, and is higher for schools than for districts. The findings underscore tradeoffs between transparency and statistical stability and provide guidance for policymakers setting minimum n-size requirements.
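To illustrate why reliability rises with sample size, the sketch below uses the textbook reliability of a group mean in a two-level random-intercept model: between-group variance divided by the sum of between-group variance and within-group variance over n. This is an assumption for illustration only; the report's own model specification may differ, and the function name `mean_reliability` and the variance components are hypothetical values, not results from the report.

```python
def mean_reliability(n, tau2, sigma2):
    """Reliability of a group mean in a two-level random-intercept model.

    n      : number of students contributing to the aggregate
    tau2   : between-group (e.g. between-school) variance
    sigma2 : within-group (student-level) variance
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return tau2 / (tau2 + sigma2 / n)


# Hypothetical variance components, chosen only for illustration.
TAU2 = 0.2
SIGMA2 = 0.8

if __name__ == "__main__":
    # Reliability at several candidate minimum n-size thresholds.
    for n in (5, 10, 16, 20, 50, 100):
        r = mean_reliability(n, TAU2, SIGMA2)
        print(f"n={n:>3}  reliability={r:.2f}")
```

Under these assumed values, reliability climbs quickly at small n and flattens as n grows. That pattern is the tradeoff the report describes: a lower threshold reports more groups, but the added groups are the smallest ones and therefore the least reliable.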