Ready or not, AI is in our schools

Students worldwide are using generative AI tools to write papers and complete assignments. Teachers are using similar tools to grade tests. What exactly is going on here? Where is all of this heading? Can education return to a world before artificial intelligence? 

How many students are using generative AI in school?  

Many high school and college-age students embraced popular generative AI writing tools like OpenAI’s ChatGPT almost as soon as they started gaining international attention in 2022. The incentive was pretty clear. With just a few simple prompts, large language models (LLMs) at the time could draw on the vast troves of articles, books, and archives they were trained on and spit out relatively coherent short essays or question responses in seconds. The language wasn’t perfect and the models were prone to fabricating facts, but the results were good enough to skirt past some educators who, at the time, weren’t primed to spot the tell-tale signs of AI-generated text.

The trend caught on like wildfire. Around one in five high school-aged teens who’ve heard of ChatGPT say they have already used it on classwork, according to a recent Pew Research survey. A separate report from ACT, which creates one of the two most popular standardized exams for college admission, claims nearly half (46%) of high school students have used AI to complete assignments. Similar trends are playing out in higher education. More than a third of US college students (37%) surveyed by the online education magazine Intelligent.com say they’ve used ChatGPT to generate ideas, write papers, or both.

Those AI tools are finding their way onto graded papers. Turnitin, a prominent plagiarism detection company used by educators, recently told Wired it found evidence of AI-generated writing in 22 million college and high school papers submitted through its service last year. Of the 200 million papers submitted in 2023, Turnitin claims 11% (which works out to those 22 million) had more than 20% of their content allegedly composed of AI-generated material. And even as generative AI usage has cooled off among the general public, students aren’t showing signs of letting up.

Almost immediately after students started using AI writing tools, teachers turned to other AI models to try to stop them. As of this writing, dozens of tech firms and startups claim to have developed software capable of detecting signs of AI-generated text. Teachers and professors around the country already rely on these tools to varying degrees. But critics say AI detection tools, even years after ChatGPT became popular, remain far from perfect.

A recent analysis of 18 different AI detection tools in the International Journal for Educational Integrity highlights their limited accuracy. None of the tools studied reliably differentiated AI-generated material from human writing, and only five achieved an accuracy above 70%. Detection could get even more difficult as AI writing models improve over time.

Accuracy issues aren’t the only problem limiting AI detection tools’ effectiveness. An overreliance on these still-developing detection systems risks punishing students who use otherwise helpful AI software that, in other contexts, would be permitted. That exact scenario played out recently with a University of North Georgia student named Marley Stevens, who claims an AI detection tool interpreted her use of the popular spelling and writing aid Grammarly as cheating. Stevens says she received a zero on the essay in question, making her ineligible for a scholarship she was pursuing.

“I talked to the teacher, the department head, and the dean, and [they said] I was ‘unintentionally cheating,’” Stevens alleged in a TikTok post. The University of North Georgia did not immediately respond to PopSci’s request for comment. 

There’s evidence current AI detection tools also mistake genuine human writing for AI content. In addition to general false positives, Stanford researchers warn detection tools may disproportionately penalize writing from non-native English speakers. More than half (61.2%) of the essays written by non-native speakers included in the research were classified as AI generated, even as essays from US-born eighth graders were largely classified correctly. 97% of the non-native speakers’ essays were flagged as AI generated by at least one of the seven AI detection tools tested in the research. Widely rolled out detection tools could put even more pressure on non-native speakers who are already tasked with overcoming language barriers.
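For readers curious about the math behind those numbers, here is a minimal sketch in Python, using entirely made-up data rather than anything from the studies above, of how a detector’s overall accuracy differs from its false positive rate, the measure behind the Stanford findings.

```python
# Illustrative only: six hypothetical essays, not data from any cited study.
# Ground truth: True = AI-generated, False = human-written
truth = [True, True, False, False, False, False]
# A hypothetical detector's verdict for each of the same six essays
predicted = [True, False, True, False, False, True]

# Accuracy: share of ALL essays the detector labeled correctly
correct = sum(t == p for t, p in zip(truth, predicted))
accuracy = correct / len(truth)

# False positive rate: share of HUMAN-written essays wrongly flagged as AI
human_verdicts = [p for t, p in zip(truth, predicted) if not t]
false_positive_rate = sum(human_verdicts) / len(human_verdicts)

print(f"accuracy: {accuracy:.0%}")                        # 50%
print(f"false positive rate: {false_positive_rate:.0%}")  # 50%
```

The distinction matters: a tool can post a respectable overall accuracy while still flagging a large share of human writers, which is exactly the failure mode the Stanford researchers documented for non-native speakers.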

How are schools responding to the rise in AI?

Educators are scrambling to find a solution to the influx of AI writing. Some major school districts in New York and Los Angeles have opted to ban the use of ChatGPT and related tools entirely. Professors at universities around the country have begun begrudgingly using AI detection software despite recognizing its known accuracy shortcomings. One of those educators, a professor of composition at Michigan Technological University, described the detectors as a “tool that could be beneficial while recognizing it’s flawed and may penalize some students” in an interview with Inside Higher Ed.

Others, meanwhile, are taking the opposite approach and leaning into AI education tools with more open arms. In Texas, according to The Texas Tribune, the state’s Education Agency just this week moved to replace several thousand human standardized test graders with an “automated scoring system.” The agency claims its new system, which will score the open-ended written responses included in the state’s public exam, could save the state $15-20 million per year. It will also leave an estimated 2,000 temporary graders out of a job. Elsewhere in the state, an elementary school is reportedly experimenting with AI learning modules that teach children the basics of core curriculums, supplemented by human teachers.

AI in education: A new normal 

While it’s possible AI writing detection tools could evolve to increase accuracy and reduce false positives, it’s unlikely they alone will transport education back to a pre-ChatGPT era. Rather than fight the new normal, some scholars argue educators should instead embrace AI tools in classrooms and lecture halls and teach students how to use them effectively. In a blog post, researchers at MIT Sloan argue professors and teachers can still limit the use of certain tools, but note they should do so through clearly written rules that explain their reasoning. Students, they write, should feel comfortable approaching teachers to ask when AI tools are and aren’t appropriate.

Others, like former Elon University professor C.W. Howell, argue that explicitly and intentionally exposing students to AI-generated writing in a classroom setting may actually make them less likely to use it. Asking students to grade an AI-generated essay, Howell writes in Wired, can give them firsthand experience noticing the way AI often fabricates sources or hallucinates quotes from an imaginary ether. Looked at through that lens, AI-generated essays can actually improve education.

“Showing my students just how flawed ChatGPT is helped restore confidence in their own minds and abilities,” Howell writes. 

Then again, if AI does fundamentally alter the economic landscape as some doomsday enthusiasts believe, students could always spend their days learning how to engineer prompts to train AI and contribute to the architecture of their new AI-dominated future.
