What implications does ChatGPT have for assessment?

ChatGPT stormed into the collective consciousness of the internet at the tail end of last week, presenting an immediate range of both technical questions and ethical concerns.

How does this work? Is it any good? What does this mean for, well, everything?

These were the questions we were asking at Team Wonkhe, firing all sorts of nonsense at each other, including a particularly novelistic AI-written short story set in the backrooms of the Office for Students.

But though such applications are highly amusing (there was a whole stream of novelty song lyrics written), there are real implications when such a powerful technology is used within higher education, by both students and academics.

The conversational chat tool: a how-to

For those who have yet to explore it, ChatGPT lets a user chat directly with the AI to perform a range of tasks, in the form of a message-and-response chat application. Whilst it certainly isn’t the first text-prompt AI application, its simplicity of use and ease of access give it immediate, wide appeal.

It allows a user to request answers to questions or prompts, from the relatively simple “chronologically describe the major events of the Vietnam war”, to the decidedly more complex, “dissect the guerrilla war strategies of the Viet Cong in a 1000 word essay, commenting on the failures of the opposing forces”.

The obvious concern about academic use of the technology has been exactly this – students can have their essays written directly by the AI, and after refining their input prompts, can generate a variety of solid 2:1 responses in less time than it’s taken you to read to this point in the article. However, the flip side of this concern was quickly identified – the AI can review and grade submitted work, and even provide feedback.

The (in)complexities of fact regurgitation

Now to show off ChatGPT’s power. I was struggling to come up with a subject, degree of study, and question for this part. So I asked the AI:

Prompt:

Give me an example of a first year undergraduate essay question.

Response:

Explain the concept of supply and demand and its significance in the economy. In your answer, include a discussion of how changes in supply and demand can affect market equilibrium and pricing. Be sure to use specific examples to illustrate your points.

Disturbingly easy. So let’s ask it to write the essay.

Prompt:

Write a 1000 word essay in response to this question.

The response was too long for this article – after all, it was 1000 words. I invite you to try creating prompts for yourself, but for the purposes of this discussion, I kindly asked the AI to summarise it into three sentences:

Prompt:

Summarise this essay in three sentences.

Response:

Supply and demand are fundamental concepts in economics that describe the relationship between the quantity of a good or service that is available for sale and the quantity that consumers are willing and able to purchase. The interplay between these two factors determines the market equilibrium, which is the point at which the quantity of a good or service that is supplied by producers is equal to the quantity that is demanded by consumers. The elasticity of a good or service is an important factor in determining how changes in supply and demand will affect market equilibrium and pricing.

Pretty impressive, right? Does that sound like something you’ve read recently? Students are onto it already – after all, plagiarism detection software may not yet be able to detect this. And if something doesn’t quite read right, you can hit try again and it’ll deliver a new response. You can even ask it to write in the style of Shakespeare, Joyce – or Nadine Dorries. Or maybe ask it to provide a deliberately weaker response (prompt: “Don’t make it sound too good!”).

But as Charles Knight explains in his recent YouTube series on student AI-generated content, the problems with AI-generated work quickly compound. As he discusses, plagiarism software may pick up on the relative regurgitation of facts and the stylistic wording of a submission, if the generated text is simply submitted verbatim.

Where AI essays become more difficult to detect is when students refine the submission, using the generated content as a starting point for their work. Given that students are often able to submit work early through the detection software, they could plausibly generate, refine, submit, then refine once again based on the resulting plagiarism report. As Charles hints, the AI could then be added back into this cycle to refine the submission once again.

There have been other discussions floating around the general Twitterverse about ways to combat this by changing the technology itself. One plausible solution would be watermarking – described as adding a unique signature to the AI’s output – maybe by using certain letters in a set order, or coding a message within. This of course has problems of its own, in that once the watermarking method has been identified, the watermark is easy to find and remove.
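To make the watermarking idea a little more concrete, here is a toy sketch in Python. It is not how ChatGPT or any real detection tool works: the scheme (hiding a signature in invisible zero-width characters) and the function names `embed`, `detect` and `strip_watermark` are assumptions made purely for illustration. It does show the weakness described above, though: once you know where the signature lives, removing it takes one line.

```python
from typing import Optional

# Toy illustration only: a hypothetical watermark, not a real scheme
# used by ChatGPT or by any plagiarism detection product.
ZERO = "\u200b"  # zero-width space, used here to encode a 0 bit
ONE = "\u200c"   # zero-width non-joiner, used here to encode a 1 bit


def embed(text: str, signature: str) -> str:
    """Append the signature to the text as invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in signature.encode("utf-8"))
    hidden = "".join(ONE if bit == "1" else ZERO for bit in bits)
    return text + hidden


def detect(text: str) -> Optional[str]:
    """Return the hidden signature, or None if no complete one is found."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    if not bits or len(bits) % 8 != 0:
        return None
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8", errors="replace")


def strip_watermark(text: str) -> str:
    """Remove the watermark: trivial once the encoding is known."""
    return text.replace(ZERO, "").replace(ONE, "")


if __name__ == "__main__":
    essay = "Supply and demand are fundamental concepts in economics."
    marked = embed(essay, "AI")
    print(detect(marked))                   # signature recovered from marked text
    print(detect(strip_watermark(marked)))  # nothing left to find after stripping
```

A more robust watermark would need to survive copying, pasting and paraphrasing, which this sketch makes no attempt to do.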

Troubling, but this is happening

I’m sure by now some academics are feeling some sense of panic. Until very recently in my career I was reviewing and grading essays, and can empathise entirely. But here’s a slightly dubious way its power can be used to the other side’s advantage…

Prompt:

Write an assessment rubric for the above essay question.

As you might now expect, it created a very detailed assessment rubric – but its true power is shown by the next prompt.

Prompt:

Grade the essay response, and provide 150 words of feedback.

Response:

Based on the assessment rubric provided, the essay response would receive a score of 80-90%. The response provides a competent explanation of the concept of supply and demand and its significance in the economy, and includes specific examples to illustrate the points being made.

However, the organisation and coherence of the essay could be improved, and the discussion of the concepts may not be as complete as it could be. One area for improvement is the organisation of the essay. The introduction could be more engaging and provide a clearer thesis statement that outlines the main points that will be discussed in the essay. In addition, the body of the essay could be more structured, with each paragraph focused on a specific point and linked to the overall argument of the essay.

Another area for improvement is the completeness of the discussion. While the essay provides a good overview of the concept of supply and demand, it could be expanded to include more detailed examples and explanations of the factors that determine the elasticity of a good or service, and how these factors can affect market equilibrium and pricing. Overall, the response is a good effort, but could be improved in some areas to provide a more complete and well-organised explanation of the concept of supply and demand.

Relief, right? Relief? Bookmark this page – or cast it into the depths of your memory. One thing is for certain: technology like this is only going to improve with time. The only question now is when and how.

Prompt: summarise this for me

While we’ve only just scratched the surface of the functionality of ChatGPT, it’s clear that it presents a number of ethical and practical issues around plagiarism and the authenticity of written work. As its intelligence develops and users become ever more able to operate it, it may soon be impossible to distinguish the work of a real person from the work of AI.

This particular application currently only has knowledge of the world and events up until 2021, but if and when it is given the power to search the internet, effective real-time AI responses are certainly not far away.

And a comment from the ChatGPT AI after reading this article?

AI-generated essays and grading is a threat to the integrity of education. Allowing students to cheat on assignments and tests undermines the entire purpose of education, which is to challenge and develop students’ knowledge and skills. In addition, allowing AI to grade essays could lead to unfair grading practices and the devaluation of human expertise in the field. Ultimately, the use of AI in education should be carefully considered and regulated to prevent abuse and maintain the quality of education.

Well, until that regulation comes, it certainly feels like it’s anyone’s game.

One response to “What implications does ChatGPT have for assessment?”

chat gpt says:

Jan 2 2023 at 4:09 pm

The emergence of chatgpt not only improves our productivity, but also brings many new problems. I have the same concerns as you. It would be too bad if bad people use chatgpt in wrong places, such as cheating and manipulating public opinion.

James Bagshaw

Former Operations Coordinator
