Generative AI Can Harm Learning
A recent paper titled "Generative AI Can Harm Learning" presents an in-depth study on the effects of generative AI, specifically OpenAI's GPT-4, on human learning. The research, conducted by a team from the University of Pennsylvania, examines the role of AI in educational settings, particularly focusing on its impact on students' ability to acquire and retain new skills.
The study was conducted in a high school in Turkey, involving nearly a thousand students across multiple grades. The researchers deployed two versions of a GPT-4-based math tutor: one that mimicked a standard ChatGPT interface, referred to as GPT Base, and another that incorporated safeguards designed to promote learning, known as GPT Tutor. These tools were integrated into the math curriculum, and the sessions that used them covered about 15% of the instructional content for the students involved.
The primary objective of the study was to understand how access to generative AI affects students' learning outcomes in both the short and the long term. Each session was structured in three parts: an initial review of the material by the teacher, an assisted practice period in which students worked on math problems with access to the AI tools, and a final exam that students took without any AI assistance. The findings point to both benefits and drawbacks of using AI in educational contexts.
In the short term, the study found that access to GPT-4 significantly enhanced students' performance during the practice sessions. Students using the GPT Base tutor showed a 48% improvement in solving practice problems compared to the control group, while those using the GPT Tutor exhibited an even more remarkable 127% improvement. This outcome aligns with previous research indicating that AI tools can substantially boost productivity and performance in various tasks.
However, the study's findings revealed a critical downside when AI access was removed during the final exam. Students who had used the GPT Base tutor performed 17% worse on the exam than students who had no access to AI at all. This decline suggests that while AI tools can improve immediate task performance, they may also lead to over-reliance, inhibiting the deeper understanding and skill development necessary for success in the absence of such tools. In contrast, students who used the GPT Tutor, which was designed to avoid giving away answers and instead provided incremental hints, did not experience the same decline in exam performance. Their exam results were statistically indistinguishable from those of the control group, suggesting that AI tools, when carefully designed with learning in mind, can improve assisted practice performance without harming what students learn or fostering dependency.
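The percentages reported for the practice sessions and the exam are relative changes against the control group. The short Python sketch below shows that arithmetic. The mean scores in it are hypothetical placeholders, chosen only so that the ratios reproduce the reported 48%, 127%, and 17% figures; they are not data from the paper, and the function name `relative_change` is an illustrative assumption.

```python
def relative_change(treatment_mean: float, control_mean: float) -> float:
    """Return the percent change of a treatment mean relative to a control mean."""
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return (treatment_mean - control_mean) / control_mean * 100.0


# Hypothetical mean scores (fraction of problems solved correctly).
# These are NOT figures from the paper; they only illustrate the calculation.
practice = {"control": 0.25, "gpt_base": 0.37, "gpt_tutor": 0.5675}
exam = {"control": 0.50, "gpt_base": 0.415}

for arm in ("gpt_base", "gpt_tutor"):
    change = relative_change(practice[arm], practice["control"])
    print(f"practice, {arm}: {change:+.0f}% vs control")

exam_change = relative_change(exam["gpt_base"], exam["control"])
print(f"exam, gpt_base: {exam_change:+.0f}% vs control")
```

Because the change is measured relative to the control group's mean, a 127% improvement means a score a little more than double the control score, not 127 percentage points higher.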
A particularly interesting aspect of the study was the discrepancy between students' perceptions and their actual learning outcomes. Despite the clear evidence that GPT Base users performed worse on the exams, these students did not perceive that they had learned less or performed poorly. On the contrary, they often believed they had done well, highlighting a significant gap between perceived and real learning. This finding underscores the importance of not only designing AI tools that support learning but also educating users about the limitations and potential pitfalls of relying too heavily on these technologies.
The study also explored the broader implications of these findings, particularly in relation to how generative AI could be integrated into educational practices. The researchers argue that while AI tools like ChatGPT have the potential to revolutionize education by making learning more accessible and personalized, there is a need for caution. Without appropriate safeguards, these tools might do more harm than good, particularly if they diminish students' ability to learn independently and retain critical skills.