Humans beat ChatGPT in accounting exams, score 29 percentage points higher than AI bot

New Delhi: Researchers have found that students performed better on accounting exams than OpenAI’s chatbot, ChatGPT. Despite this, the study’s lead author said ChatGPT’s performance was “impressive” and that it was a “game changer that will change the way everyone teaches and learns – for the better.”

Researchers from Brigham Young University (BYU) in the US and 186 other universities wanted to know how OpenAI’s technology would perform on accounting exams. They published their findings in the journal Issues in Accounting Education.

In the researchers’ testing, students scored an overall average of 76.7 per cent, compared to ChatGPT’s 47.4 per cent. While ChatGPT scored above the student average on 11.3 per cent of questions, performing particularly well on accounting information systems (AIS) and auditing, the AI bot did poorly on tax, financial and managerial assessments. The researchers believe this may be because ChatGPT struggled with the mathematical processes required for the latter topics.

The AI bot, which uses machine learning to generate natural-language text, did better on true/false questions (68.7 per cent correct) and multiple-choice questions (59.5 per cent), but struggled with short-answer questions (scoring between 28.7 and 39.1 per cent).

In general, the researchers said, higher-order questions were harder for ChatGPT to answer. In fact, ChatGPT was occasionally found to provide authoritative-sounding written explanations for incorrect answers, or to answer the same question in different ways.

They also found that ChatGPT often provided explanations for its answers even when they were wrong. At other times, it selected the wrong multiple-choice answer despite providing an accurate explanation. Importantly, the researchers noted that ChatGPT sometimes made up facts: when asked to provide a reference, for instance, it generated a genuine-looking citation that was completely fabricated. The work, and sometimes even the author, did not exist.

The bot was also observed to make nonsensical mathematical errors, such as adding two numbers in a subtraction problem or dividing numbers incorrectly. Wanting to contribute to the intense ongoing debate about how models like ChatGPT should factor into education, lead study author David Wood, a BYU professor of accounting, decided to recruit as many professors as possible to see how the AI would perform against real university accounting students.

Wood’s co-author recruitment pitch exploded on social media: 327 co-authors from 186 educational institutions in 14 countries participated in the research, contributing 25,181 classroom accounting exam questions. They also recruited undergraduate BYU students to feed ChatGPT another 2,268 textbook test-bank questions. The questions covered AIS, auditing, financial accounting, managerial accounting and tax, and varied in difficulty and type (true/false, multiple choice, short answer).