Purdue University Graduate School
2021.7.28 Karakis_Dissertation.pdf (1.42 MB)

PREDICTORS OF EARLY POSTSECONDARY STEM PERSISTENCE OF HIGH-ACHIEVING STUDENTS: AN EXPLANATORY STUDY USING MACHINE LEARNING TECHNIQUES

thesis
posted on 2021-07-28, 21:06 authored by Nesibe Karakis

This study investigated the persistence of high-achieving and non-high-achieving students in STEM fields using nationally representative data from the High School Longitudinal Study of 2009 for the years 2009, 2012, 2013, 2013-2014, and 2016. The results indicated that approximately 70% of high-achieving and non-high-achieving students continued in their initial STEM degrees within 3 years of college enrollment. The most important predictors of STEM persistence were math proficiency level, school belonging, school engagement, school motivation, school problems, science self-efficacy, credits earned in computer sciences, GPA in STEM courses, credits earned in STEM courses, and credits earned in Advanced Placement/International Baccalaureate (AP/IB) courses. Math proficiency was the most important variable for both high-achieving and non-high-achieving students. Although credits earned in AP/IB courses combined were among the most important variables, they were about twice as important for high-achieving students (6.86% vs. 3.37%). Among the demographic variables (socioeconomic status, gender, ethnicity, and urbanicity), socioeconomic status was the most important in models predicting STEM persistence, and it had higher importance for non-high-achieving students. Furthermore, the persistence rate of Hispanic students differed from that of other underrepresented populations: non-high-achieving Hispanic students had the highest persistence rate, similar to well-represented populations (i.e., White and Asian students). The machine learning methods used in the study, including random forest and artificial neural network, provided good accuracy for both achievement groups. Random forest accuracy was over 82% with the dataset resampled using the Synthetic Minority Over-Sampling Technique (SMOTE), while artificial neural network accuracy was over 92%.
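
For readers unfamiliar with SMOTE, the following is a minimal sketch of its core idea: new rows for the less frequent class are created by interpolating between an existing row and one of its nearest same-class neighbors. It is not the dissertation's implementation, and the tooling used in the study is not stated in this record. The function name `smote_sample`, its parameters, and the example rows are hypothetical and chosen only for illustration.

```python
import random

def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic rows by interpolating between a minority-class
    row and one of its k nearest minority-class neighbors (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic

# Hypothetical minority-class rows with two scaled features each (assumed data).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(minority)
```

Because each synthetic row lies on a line segment between two real rows, it stays within the range of the observed minority-class data. In practice, resampling of this kind is usually applied only to the training split so that evaluation data remain untouched.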

Degree Type

  • Doctor of Philosophy

Department

  • Educational Studies

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Nielsen Pereira

Additional Committee Member 2

Hua-Hua Chang

Additional Committee Member 3

Marcia Gentry

Additional Committee Member 4

Kristen Seward

Additional Committee Member 5

Anne Traynor