The NAEP Reading Automated Scoring Challenge was a competition held in late 2021 by the National Center for Education Statistics (NCES) in the USA. Its goal was to evaluate how well natural language processing methods can score large-scale constructed-response assessment data. Participating models were also required to be interpretable and free of algorithmic bias against student demographic groups. The data set used to evaluate the systems consisted of responses from 4th and 8th grade students to 20 different reading comprehension tasks.
More than a dozen teams from universities and private assessment companies submitted contributions. Of these, 12 contributions, including ours, were judged eligible with respect to interpretability and algorithmic fairness and were then ranked by their quadratic weighted kappa (QWK) scores. Our team, consisting of Nico Andersen, Sebastian Gombert, Ulf Kröhne and Fabian Zehner, was ranked fourth out of twelve with a QWK score of 0.862 (the winning team achieved 0.888) and thereby won the first runners-up prize.
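For readers unfamiliar with the ranking metric, the following is a minimal sketch of how quadratic weighted kappa can be computed between human and machine scores. It is not the organizers' scoring script: the function name `quadratic_weighted_kappa`, the assumption that scores are integers from 0 to `num_categories - 1`, and the toy score lists are illustrative choices made here.

```python
def quadratic_weighted_kappa(human, machine, num_categories):
    """Agreement between two integer score lists (illustrative helper).

    Scores are assumed to be integers in the range 0..num_categories-1.
    Returns 1.0 for perfect agreement, about 0.0 for chance-level
    agreement, and negative values for systematic disagreement.
    """
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    if num_categories < 2:
        raise ValueError("at least two score categories are required")

    k = num_categories
    n = len(human)

    # Observed confusion matrix: rows are human scores, columns machine scores.
    observed = [[0] * k for _ in range(k)]
    for h, m in zip(human, machine):
        observed[h][m] += 1

    # Marginal score counts for each rater.
    human_hist = [sum(row) for row in observed]
    machine_hist = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: larger score differences count more.
            weight = (i - j) ** 2 / (k - 1) ** 2
            # Count expected if the two raters were independent.
            expected = human_hist[i] * machine_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters used one identical category for every response.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Toy data, invented for illustration only.
human_scores = [0, 1, 2, 1, 0]
machine_scores = [0, 1, 2, 1, 0]
print(quadratic_weighted_kappa(human_scores, machine_scores, 3))
```

Because the penalty grows with the squared distance between the two scores, a system that is off by two score points is penalized four times as heavily as one that is off by one point.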
More information about the competition and respective evaluations can be found here.