A Critical Examination of Spatial Skills Assessment: Validity, Bias, and Technology
At the highest level, this dissertation is a case study of how bias can become encoded into the tools used to measure a construct and into the very definition of the construct itself. In this case, the construct is spatial ability. This dissertation focuses on the validity and accuracy of spatial tests and illuminates the gender bias interwoven with the history of spatial testing.
First, I present a critical analysis of the graphical imagery used in spatial tests and explain why the imagery may be unclear and may cause the tests to be inaccurate. I analyze a collection of studies in which researchers modified the stimuli used in spatial tests and find that the tests became easier when the imagery was made clearer. I therefore conclude that the presentation of imagery affects test difficulty. This is a likely example of construct-irrelevant variance, which may reduce the validity of some spatial skills assessments and introduce bias in favor of individuals with prior experience in engineering graphics, who historically are more likely to be men.
Second, I critically review research on gender differences in spatial skills. I argue that the construct of “spatial ability” has been co-constructed with gender; that is, it has been defined in a manner influenced by gender beliefs. Because of a preexisting belief that men have better spatial skills than women, some test creators “selectively bred” spatial instruments to produce the expected gender differences. Such instruments, including the very popular Mental Rotation Test (MRT), cannot validly assess between-group differences. Biological or evolutionary explanations for sex differences in spatial ability lack empirical evidence. Instead, the differences are rooted in the way the construct of “spatial ability” has been shaped to create the expected gender patterns.
Finally, I describe an experiment designed to test the hypothesis that a spatial test with content from a feminized discipline will show different patterns of gender differences. Female engineering students outperformed male engineering students on the Digital Apparel Spatial Visualization Test (DASVT), while male engineering students scored higher than female engineering students on the Purdue Spatial Visualization Test (PSVT:R). Students with relevant background experience scored better than students without such experience on both assessments. The results demonstrate the shortcomings of using a single instrument to assess a concept as heterogeneous as spatial skills. I conclude this dissertation with a discussion of the implications of my work and recommendations for researchers and educators.
- Doctor of Philosophy
- West Lafayette