THIS ARTICLE/PRESS RELEASE IS PAID FOR AND PRESENTED BY NTNU Norwegian University of Science and Technology

"When a new tool as powerful as ChatGPT comes along, the knee-jerk reaction is to have school exams instead of home exams," adviser Rasmus Grønbæk Jensen at NTNU's examination office says.

How can we make sure students don't use ChatGPT to cheat during exams?

Teachers now face the extra challenge of designing exams that will prevent students from cheating their way to good grades with ChatGPT.


Thousands of students are currently working on exams to show what they’ve learned during the semester. For some, the AI chatbot ChatGPT may be tempting to use.

The chatbot, which is based on artificial intelligence (AI), can answer questions and deliver ready-made text on a wide range of topics.

It has sparked debate in large parts of the world in recent months, including discussions about the conduct of exams and the risk of cheating.

Produces false and inconsistent responses

Benjamin Kille, who conducts research on AI at the Norwegian University of Science and Technology (NTNU), sees both challenges and opportunities with language robots.

“Google, OpenAI, and Microsoft have now produced such advanced language models that they can deliver texts that are difficult to distinguish from texts created by humans,” Kille says. “It’s unclear what text is used to feed OpenAI’s ChatGPT. We assume that it is text found online, which implies that it could include teaching materials. That enables ChatGPT to answer a number of exam questions.”

Benjamin Kille is a researcher at the Department of Computer Science at NTNU.

However, he also says that experts have tested exam questions on ChatGPT, and they have found that it produces false and inconsistent answers.

“So students can’t yet rely on ChatGPT to get good exam grades,” he says.

Furthermore, he notes that students can use ChatGPT as a tool, for example to get started with writing their responses.

Artificial intelligence mimics the brain’s own network

A machine that can solve problems it has encountered before uses narrow artificial intelligence (narrow AI). A machine that can solve problems it has not yet encountered uses general artificial intelligence (general AI).

“So far we’re using narrow AI; we haven’t yet developed general AI to any great extent,” Kille says. 

The AI models under development mainly rely on machine learning to solve tasks. Machine learning is a specialisation within AI where statistical methods are used to allow computers to find patterns in large amounts of data.

This means that the machine 'learns' instead of being programmed.

“Machine learning uses artificial neural networks similar to the ones we have in our brains, and these language models have trillions of network connections,” Kille says. 
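To get a feel for what 'learning' from data means, consider a deliberately tiny sketch in Python (illustrative only, and far simpler than the neural networks Kille describes). The rule y = 3x + 1 is never written into the model; gradient descent, one of the statistical methods machine learning builds on, recovers it from example data alone:

    # Toy machine learning: recover the pattern y = 3x + 1 from examples alone.
    data = [(x, 3 * x + 1) for x in range(10)]  # (input, target) pairs

    w, b = 0.0, 0.0        # the model's parameters start out knowing nothing
    learning_rate = 0.01

    for _ in range(2000):  # repeatedly nudge the parameters to shrink the error
        for x, y in data:
            error = (w * x + b) - y
            w -= learning_rate * error * x  # gradient of 0.5 * error**2 w.r.t. w
            b -= learning_rate * error      # gradient of 0.5 * error**2 w.r.t. b

    print(f"learned w={w:.2f}, b={b:.2f}")  # prints values close to 3 and 1

A language model performs the same kind of parameter adjustment, only over vastly more parameters and enormous amounts of text rather than two numbers and ten data points.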

Completely irresponsible

Inga Strümke is a researcher at NTNU who specialises in AI and recently published a book called Maskiner som tenker (Machines that think). She talked to the Norwegian business newspaper Dagens Næringsliv about AI tools like ChatGPT:

Inga Strümke is an associate professor at NTNU whose recent book on artificial intelligence has become a best-seller in Norway.

“The technology is really powerful. If used correctly, and within reason, it can be extremely useful. If used incorrectly, it can be really harmful. The most important thing about the launch of ChatGPT was that everyone’s eyes were opened to the fact that AI is now part of our lives,” she said in the article.

She went on to say that introducing ChatGPT “could have been done more elegantly. It was completely irresponsible to make ChatGPT available without giving the education sector – and many others – a chance to confront this revolutionary force.”

Tips on how exam tasks can be designed

To meet this challenge, a working group at NTNU has developed tips and advice on how to create exams that ChatGPT cannot help solve to any great extent.

Rasmus Grønbæk Jensen, an adviser at NTNU’s examination office, participated in this project with people from the Section for Teaching and Learning Support.

NTNU’s tips on how to create tasks that cannot be solved with AI models alone are listed at the end of this article.

Knee-jerk reaction

“When a new tool as powerful as ChatGPT comes along, the knee-jerk reaction is to opt for exams at the university instead of home exams,” Grønbæk Jensen says. 

Universities typically have invigilators present in rooms where exams are held. Additionally, Safe Exam Browser is used to prevent students from accessing the internet.

“But lengthy exams, which can only be offered as take-home exams, have their own advantages that a school-based exam can’t provide,” he says. “We need to build an understanding and a mindset in students that encourages them to use what they’ve learned. It’s important to create exams that both motivate and require students to demonstrate what they’ve learned.”

Rasmus Grønbæk Jensen says it is both easy and hard to create exams that are difficult to answer using ChatGPT.

Grønbæk Jensen says it can be both difficult and easy to make these kinds of exams.

“Google has been around for a long time, and students can get information online on take-home exams. They’re also able to collaborate with fellow students or get help from others,” he says. 

However, text that has been copied from websites found through Google constitutes plagiarism, which a plagiarism checker can detect.

“By contrast, a plagiarism checker can’t detect the use of artificial intelligence. ChatGPT can create unique text for each request it receives, and it’s really difficult to prove that AI has been used in exam answers. But if 15-20 people ask ChatGPT about the same thing, the answers will be somewhat similar,” Grønbæk Jensen says. “It’s important that we as a university assume that the students are here to learn and to develop; they’re not here to cheat. We can’t treat our students like suspects.”
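To see why answers generated from the same prompt might still stand out, here is a small Python sketch (an illustration only, not a tool NTNU uses) that measures how much vocabulary two texts share. Submissions produced from the same ChatGPT request would tend to score noticeably higher against each other than independently written answers:

    def word_overlap(text_a, text_b):
        """Jaccard similarity: shared distinct words / all distinct words."""
        words_a = set(text_a.lower().split())
        words_b = set(text_b.lower().split())
        return len(words_a & words_b) / len(words_a | words_b)

    answers = [
        "Photosynthesis converts light energy into chemical energy in plants.",
        "In plants, photosynthesis turns light energy into chemical energy.",
        "The French Revolution began in 1789 with the storming of the Bastille.",
    ]

    # Compare every pair of answers; unusually high overlap between
    # different students' submissions could prompt a closer manual look.
    for i in range(len(answers)):
        for j in range(i + 1, len(answers)):
            print(f"answers {i} and {j}: overlap {word_overlap(answers[i], answers[j]):.2f}")

Real plagiarism checkers rely on far more sophisticated text fingerprinting, but the underlying idea of flagging unusual similarity between submissions is the same.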

Has to feel meaningful for students

Martha Torgeirdatter Dahl is a university lecturer in pedagogy at NTNU. 

She says that educational institutions now have to decide whether the content used in assessments – the tasks we create and the ways in which students are given the opportunity to demonstrate their knowledge – is adapted to a world where ChatGPT is a reality.

She believes that these types of exams will be particularly vulnerable:

  • Tasks where students are asked to explain theory and regurgitate content.
  • Tasks with a heavy emphasis on structure and spelling.

Martha Torgeirdatter Dahl says that exams need to be written to challenge students to reflect on what they have learned, rather than just regurgitate facts.

“In light of the new text generator tools, it’s clear that we need content in our assessments that requires more from students and that text generator tools cannot process as well. Students need to be given the opportunity, orally or in writing, to demonstrate independent reflection and assessment skills, and to actively apply the syllabus in order to demonstrate what they have learned. The tasks should be anchored within a context by linking them to specific conditions and reflecting current situations and personal experiences,” Dahl says.

She adds that assessments also need to feel meaningful for students to spend time on.

Important to have a dialogue with the students

Dahl says it will be crucial to talk to students about the appropriate use of these tools and for teachers to decide how text generator tools might be used in a responsible way in their subject.

“Ongoing dialogue with the students about the overall purpose of the exam will also be important, to ensure that they see the value in their own ability to convey their academic reasoning and competence, both orally and in writing,” Dahl says. 

Writing exam questions in an AI world

  • Tasks that require good knowledge of the syllabus: Since chatbots are probably not familiar with all the syllabus literature, tasks that require in-depth knowledge of the syllabus will make it difficult to use AI in students’ answers. This applies especially to recent Scandinavian literature.
  • Vary the tasks: So far, chatbots are best at working with text, so tasks that require other formats, such as audio and video files, images, or graphs, make it difficult to use only chatbots to produce an answer.
  • Base tasks on students’ own experiences and personal reasoning: By asking students to work within a specific context and situation, they can demonstrate their skills, knowledge, and competence to a greater extent.
  • Use a case study: Make an unknown case the basis for answering the task.
  • Require complex, nuanced answers, for example by using specific or technical language and terminology.
  • Ask students to reflect on their own process and answers: The reflections can be related to data/source collection, structure of the answer/text, critical assessments of content, reasons for opting out of content, etc.


