For a special issue on Natural Language Processing in Psychology, we proposed a hierarchical rater model-based approach to address the challenges of automated essay scoring. Essay writing tests are an integral part of educational systems, essential for assessing students’ critical thinking, articulation, and understanding. Since manual scoring requires significant time and resources, teachers are beginning to use Automated Essay Scoring (AES), which can alleviate much of this manual effort.

#AutomatedEssayScoring #NaturalLanguageProcessing #FormativeAssessment #EducationalTechnology #MachineLearning #AIinEducation #HierarchicalRaterModel #EdTech #ScoringAutomation #AI #AssessmentTools #MeasurementInvariance #TransformerModels #EducationResearch #UniversityTesting #AIModels #TechInEducation

An abundance of AES models is available, each with its own features and scoring method. Selecting the optimal model is therefore complex and challenging, especially when different aspects of content have to be assessed across a number of rating items.

Our hierarchical rater model-based approach addresses these challenges by integrating the predictions of multiple AES models while taking their distinct scoring behaviors into account. The goal is to leverage the strengths of the individual models to create a more robust and accurate scoring system.
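To give a flavor of the general idea, here is a minimal sketch of treating several AES models as "raters" with their own severity bias and noise, then combining their bias-corrected scores. This is an illustrative simplification, not the estimation procedure used in the paper (which fits a full hierarchical rater model); all function names, parameters, and toy data below are our own assumptions.

```python
# Illustrative sketch only: several AES models act as raters, each assumed
# to report the latent essay score plus its own bias and noise.
import numpy as np

def fit_rater_params(model_scores, human_scores):
    """Estimate each rater's bias and noise SD against human scores
    on a calibration set. model_scores has shape (n_essays, n_raters)."""
    residuals = model_scores - human_scores[:, None]
    bias = residuals.mean(axis=0)              # systematic severity/leniency
    noise_sd = residuals.std(axis=0, ddof=1)   # rater-specific noise
    return bias, noise_sd

def combine_scores(model_scores, bias, noise_sd):
    """Precision-weighted average of bias-corrected rater scores,
    a simple point estimate of the latent essay score."""
    corrected = model_scores - bias
    weights = 1.0 / noise_sd**2
    return (corrected * weights).sum(axis=1) / weights.sum()

# Toy data: three hypothetical AES models rating five essays.
rng = np.random.default_rng(0)
true_scores = np.array([3.0, 4.0, 2.0, 5.0, 3.5])
rater_bias = np.array([0.5, -0.3, 0.0])
rater_sd = np.array([0.2, 0.4, 0.3])
scores = true_scores[:, None] + rater_bias + rng.normal(0.0, rater_sd, (5, 3))

bias, noise_sd = fit_rater_params(scores, true_scores)
combined = combine_scores(scores, bias, noise_sd)
```

The key design point this sketch shares with the paper's approach is that each model's distinct scoring behavior (its bias and reliability) is modeled explicitly rather than simply averaging raw scores.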

We tested our approach on data from a university essay writing test and found that it achieved accuracy comparable to the best individual AES models. This is an exciting result: our method not only matches top-tier accuracy but also reduces discrepancies between human and automated scoring, yielding a higher degree of measurement invariance and thus fairer, more consistent results.

All in all, our hierarchical rater model-based approach represents a significant step forward toward more reliable and efficient automated essay scoring, benefiting educators and students alike by reducing the burden of manual grading and enhancing the fairness of assessments.

Stay tuned as we continue to refine and expand this approach, aiming to transform how essays are evaluated in educational settings.

The publication can be found here: https://econtent.hogrefe.com/doi/10.1027/2151-2604/a000567

Reference: Fink, A., Gombert, S., Liu, T., Drachsler, H., & Frey, A. (2024). A Hierarchical Rater Model Approach for Integrating Automated Essay Scoring Models. Zeitschrift für Psychologie, 232(3), 209–218.