The Technical Qualities of the Elicited Imitation Subsection of The Assessment of College English, International (ACE-In)
The present study investigated the technical qualities of the elicited imitation (EI) items used in the Assessment of College English – International (ACE-In), a locally developed English language proficiency test used in the undergraduate English Academic Purpose Program at Purdue University. EI is a controversial language assessment tool that has been used and examined for decades. The simplicity of the test format and the ease of rating make EI well suited to wide implementation in language assessment. On the other hand, EI has drawn a series of critiques, primarily questioning its validity. To offer insights into the quality of the EI subsection of the ACE-In and to provide guidance for continued test development and revision, the present study examined the measurement qualities of the items by analyzing the pre- and post-test EI performance of 100 examinees. The analyses consisted of an item analysis that reports item difficulty, item discrimination, and total score reliability; an examination of pre-post changes in performance that reports a matched-pairs t-test and item instructional sensitivity; and an analysis of the correlation patterns between EI scores and TOEFL iBT total and subsection scores.
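For readers unfamiliar with these statistics, the following is a minimal sketch of how they are commonly computed. It is not the study's actual procedure. It assumes polytomously scored items on a hypothetical 0 to 4 scale, uses made-up toy data, takes the corrected item-total correlation as the discrimination index, Cronbach's alpha as the reliability estimate, and the pre-to-post difference in item difficulty as the instructional sensitivity index. All function names are illustrative.

```python
from math import sqrt
from statistics import mean, pvariance, stdev

# Toy data (hypothetical): rows are examinees, columns are EI items,
# each item scored on an assumed 0-4 scale.
MAX_SCORE = 4
pre = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 3, 4]]
post = [[2, 2, 2], [3, 3, 3], [4, 4, 3], [4, 4, 4]]


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_difficulty(scores, item, max_score=MAX_SCORE):
    """Mean item score as a proportion of the maximum (higher = easier)."""
    return mean(row[item] for row in scores) / max_score


def item_discrimination(scores, item):
    """Corrected item-total correlation: item vs. total of the other items."""
    item_scores = [row[item] for row in scores]
    rest_scores = [sum(row) - row[item] for row in scores]
    return pearson(item_scores, rest_scores)


def cronbach_alpha(scores):
    """Internal consistency of the total score across all items."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def paired_t(pre_scores, post_scores):
    """Matched-pairs t statistic on total scores (df = n - 1)."""
    diffs = [sum(b) - sum(a) for a, b in zip(pre_scores, post_scores)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def sensitivity_index(pre_scores, post_scores, item):
    """Pre-to-post difference in item difficulty (near 0 = insensitive)."""
    return item_difficulty(post_scores, item) - item_difficulty(pre_scores, item)


if __name__ == "__main__":
    for i in range(len(pre[0])):
        print(i, item_difficulty(pre, i), item_discrimination(pre, i),
              sensitivity_index(pre, post, i))
    print(cronbach_alpha(pre), paired_t(pre, post))
```

In this sketch, an item whose sensitivity index stays near zero would be flagged as relatively insensitive to instruction. Any cutoff for "near zero" is a design choice the sketch leaves open.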
The results of the item analysis indicated that the current EI task was slightly easy for the intended population, but the test items functioned satisfactorily in separating examinees of higher proficiency from those of lower proficiency. The EI task was also found to have high internal consistency across forms. As for the pre-post changes, a significant pairwise difference was found between pre- and post-test performance after a semester of instruction. However, the results also showed that over half of the items were relatively insensitive to instruction. The last stage of the analysis indicated that while EI scores had a significant positive correlation with TOEFL iBT total scores and speaking subsection scores, EI scores were negatively correlated with TOEFL iBT reading subsection scores.
The findings of the present study provided evidence in favor of the use of EI as a measure of L2 proficiency, especially as a viable alternative to free-response items. EI is also argued to provide additional information about examinees’ real-time language processing ability, which standardized language tests are not intended to measure. Although the EI task used by the ACE-In is generally suitable for the targeted population and testing purposes, it can be further improved if test developers increase the number of difficult items and control the content and structure of the sentence stimuli.
Examining the technical qualities of test items is fundamental but insufficient to build a validity argument for the test. The present EI test can benefit from validation studies that go beyond item analysis. Future research that focuses on improving item instructional sensitivity is also recommended.