Early studies on the use of ChatGPT in educational settings have reported substantial learning gains. But how valid are these studies? Is ChatGPT in education really as effective as it seems?

A newly published paper revisits key findings from past debates about media and teaching methods to reveal conceptual challenges that frequently arise in studies of ChatGPT's effectiveness. When researchers compare different media for learning, they sometimes conflate the effects of the instructional method with the features of the technology. When method and technology are confounded in this way, the actual effect of ChatGPT becomes difficult to interpret.

To pinpoint the conceptual difficulties of these efficacy studies, the paper reviews important findings from the media/methods debate, using a recent meta-analysis as an example. It identifies three non-negotiable conditions that must be met for effects to be interpretable: 1.) a precise description of what exactly was done in the experiment, 2.) details about the control group activities, and 3.) valid indicators measuring whether learning took place.

The paper finds that few studies actually meet all three conditions, which weakens the claims of ChatGPT’s effectiveness for learning. The authors therefore warn of “fast science” and advocate for more reflective, conscious research practices. They stress that research must mature before specific recommendations for educational tools can be given. In the meantime, educators should carefully choose the specific didactic characteristics of their planned ChatGPT implementations.

→ For more details, read the full article:

Weidlich, J., Gašević, D., Drachsler, H., & Kirschner, P. (2025). ChatGPT in Education: An Effect in Search of a Cause. Journal of Computer Assisted Learning, 41(5), e70105. https://doi.org/10.1111/jcal.70105