Effect sizes (statistics that indicate the magnitude of a difference) play a crucial role in assessment, institutional research, and scholarly inquiry, where large sample sizes commonly yield statistically significant relationships that are in fact trivial in practical importance. This study provides guidelines for NSSE users, researchers, policymakers, and assessment professionals to judge the importance of an effect from student engagement results.
Using data from 984 U.S. institutions that participated in the 2013 and 2014 administrations of NSSE, we examined the distribution of effect sizes from institutional comparisons. Our article, "Contextualizing Effect Sizes in the National Survey of Student Engagement: An Empirical Analysis," recently published in Research & Practice in Assessment, defines effect size and reviews the limitations of hypothesis testing. After considering Cohen's (1988) rationale for interpreting the size of an effect, we simulated a distribution of NSSE effect sizes to provide a normative context for interpreting the natural, or relative, variation in the magnitudes of institution-to-peer-group comparisons.
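The full simulation procedure is detailed in the article; a minimal sketch of the general idea might look like the following. It uses entirely synthetic data and a simplified institution-versus-all-other-institutions design, so the generating parameters and the comparison scheme are our assumptions for illustration, not the study's method.

```python
import numpy as np

rng = np.random.default_rng(42)

# Entirely synthetic data for illustration: per-student scores on one
# Engagement Indicator for 984 institutions. These are NOT real NSSE data;
# the 40/3/12 generating parameters are assumptions.
n_institutions = 984
institutions = [
    rng.normal(loc=rng.normal(40, 3), scale=12, size=rng.integers(100, 500))
    for _ in range(n_institutions)
]

def cohens_d(group, comparison):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group), len(comparison)
    pooled_var = ((n1 - 1) * group.var(ddof=1) +
                  (n2 - 1) * comparison.var(ddof=1)) / (n1 + n2 - 2)
    return (group.mean() - comparison.mean()) / np.sqrt(pooled_var)

# Compare each institution with the pooled students at all other
# institutions, mimicking an institution-to-peer-group comparison.
effect_sizes = []
for i, inst in enumerate(institutions):
    peers = np.concatenate([s for j, s in enumerate(institutions) if j != i])
    effect_sizes.append(abs(cohens_d(inst, peers)))

# Percentiles of |d| across institutions provide a normative frame of
# reference for judging how typical or unusual a given effect size is.
print(np.percentile(effect_sizes, [50, 75, 90, 95]))
```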
As a general frame of reference, Cohen's d benchmarks of .2, .5, and .8 have commonly been used in educational research to represent small, medium, and large effects. However, results from our study suggest that these cut points are inadequate for institutional comparisons of NSSE Engagement Indicators (scales formed by averaging sets of individual survey items), and that new interpretations are necessary. The consistency of the simulated effect size values across the indicators points toward a new set of criteria:
- Small effects start at about .1
- Medium effects start at about .3
- Large effects start at about .5
These are helpful guidelines for assessment professionals, policymakers, and researchers seeking to identify areas of engagement where an institution is doing comparatively well and areas in need of improvement. Because the proposed values are grounded in actual NSSE data, they allow for richer interpretation of results. Institutions with meaningful differences will be more likely to find effect sizes of .3 or .5 and can interpret those effects with greater confidence as medium or large, rather than small or medium. Furthermore, effect sizes of .1, although relatively small, should not simply be disregarded as trivial. The sketch below illustrates how these cutoffs might be applied.
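As a worked illustration of applying the proposed cut points, the helper function and the numbers below are hypothetical and ours, not part of the study:

```python
def interpret_nsse_effect(d, cutoffs=(0.1, 0.3, 0.5)):
    """Classify an institution-to-peer-group effect size using the proposed
    NSSE criteria: small >= .1, medium >= .3, large >= .5."""
    small, medium, large = cutoffs
    magnitude = abs(d)
    if magnitude >= large:
        return "large"
    if magnitude >= medium:
        return "medium"
    if magnitude >= small:
        return "small"
    return "trivial"

# Hypothetical example: institution mean of 41.2 vs. peer-group mean of
# 38.9 on an Engagement Indicator, with a pooled standard deviation of 11.5.
d = (41.2 - 38.9) / 11.5
print(round(d, 2), interpret_nsse_effect(d))  # -> 0.2 small
```

Under the conventional .2/.5/.8 benchmarks this difference would sit at the bare threshold of "small"; under the NSSE-grounded criteria it registers as a solidly small effect worth noting.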