For standardized exams to be fair and reliable, they must include a diverse range of question difficulties to accurately assess test takers' abilities. It is also crucial to balance the time allotted per question so that the test is neither unnecessarily rushed nor sluggish. The goal of this year's BEA shared task (competition) was to build systems that predict Item Difficulty and Item Response Time for items taken from the United States Medical Licensing Examination (USMLE).

EduTec member Sebastian Gombert designed systems that predict both variables simultaneously. They placed first out of 43 submissions for predicting Item Difficulty and fifth out of 34 for predicting Item Response Time. The systems use modified versions of established transformer language models in a multitask setup. A corresponding system description paper titled "Predicting Item Difficulty and Item Response Time with Scalar-mixed Transformer Encoder Models and Rational Network Regression Heads", authored by Sebastian Gombert, Lukas Menzel, Daniele Di Mitri and Hendrik Drachsler, was accepted at the BEA workshop 2024, co-located with NAACL 2024 in Mexico City, Mexico.
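To illustrate the general idea of such a multitask setup, the sketch below shows a transformer encoder whose layer outputs are combined by a learned scalar mix and fed into two regression heads, one per target variable. This is a minimal illustration rather than the authors' implementation: the base model name, pooling choice and head sizes are assumptions, and the plain linear heads stand in for the rational network regression heads described in the paper.

```python
# Minimal sketch (not the authors' code): a shared transformer encoder with a
# learned scalar mix over its hidden layers and two regression heads that
# jointly predict item difficulty and item response time.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class ScalarMix(nn.Module):
    """Softmax-weighted sum over the encoder's hidden layers (learned weights)."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, layer_states):
        # layer_states: tuple of (batch, seq_len, hidden) tensors, one per layer
        norm_weights = torch.softmax(self.weights, dim=0)
        mixed = sum(w * h for w, h in zip(norm_weights, layer_states))
        return self.gamma * mixed


class MultiTaskItemModel(nn.Module):
    def __init__(self, model_name: str = "bert-base-uncased"):  # assumed base model
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name, output_hidden_states=True)
        hidden = self.encoder.config.hidden_size
        num_layers = self.encoder.config.num_hidden_layers + 1  # + embedding layer
        self.scalar_mix = ScalarMix(num_layers)
        # Two regression heads sharing one encoder (the multitask part);
        # linear heads here, whereas the paper uses rational network heads.
        self.difficulty_head = nn.Linear(hidden, 1)
        self.response_time_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        mixed = self.scalar_mix(out.hidden_states)  # (batch, seq_len, hidden)
        pooled = mixed[:, 0]                        # [CLS] token representation
        return self.difficulty_head(pooled), self.response_time_head(pooled)


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = MultiTaskItemModel()
batch = tokenizer(["A 54-year-old patient presents with ..."], return_tensors="pt")
difficulty, response_time = model(batch["input_ids"], batch["attention_mask"])
```

Training such a model would typically minimize the sum of the two regression losses (e.g. mean squared error for each target), so that the shared encoder benefits from both supervision signals.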