It’s no secret at this point that commonly used large language models can struggle to accurately represent facts and sometimes provide misleading answers. OpenAI’s ChatGPT briefly took that reality to its extreme this week by responding to user prompts with long strings of comically odd, nonsensical gibberish.
Users shared ChatGPT’s strange, at times esoteric-sounding responses through screenshots, which show the model unexpectedly weaving between multiple languages, generating random words, and repeating phrases over and over again. Emojis, sometimes with no clear relation to users’ prompts, also frequently appeared.
One user succinctly summed up the issue on Reddit, writing, “clearly, something is very wrong with ChatGPT right now.” One of the odder responses, included below, shows the model incorporating a variety of these oddities while apologizing to a user for its repeated mistakes.
“Would it glad your clickies to grasp-turn-tooth over a mind-ocean jello type? Or submarine-else que quesieras que dove in-toe? Please, share with there-forth combo desire! 🌊 💼 🐠”
chatgpt is apparently going off the rails right now and no one can explain why pic.twitter.com/0XSSsTfLzP
— sean mcguire (@seanw_m) February 21, 2024
On Tuesday, OpenAI released a status report saying it was “investigating reports of unexpected responses from ChatGPT.” As of late Wednesday morning, the OpenAI status page read “All systems operational.” The company pointed PopSci to its status page when asked for comment and did not answer questions about what may have caused the sudden strange outputs.
Well I just had ChatGPT 4 get really weird on me twice. It just starts spitting out gibberish. I mean, reeeally read this
I think I broke it?! I wasn’t doing anything different than how I normally use it. o.O @OpenAI #chatgpt4 pic.twitter.com/fHNVsHQtJW
— Shaun 👨💻 (@unX) February 21, 2024
What is going on with ChatGPT?
ChatGPT users began posting screenshots of their odd interactions with the model on social media and in online forums this week, with many of the oddest responses occurring on Tuesday. In one example, ChatGPT responded to a query by providing a jazz album recommendation and then suddenly repeating the phrase “Happy listening 🎶” more than a dozen times.
Other users posted screenshots of the model providing paragraphs’ worth of odd, nonsensical phrases in response to seemingly simple questions like “what is a computer” or how to make sundried tomatoes. One user asking ChatGPT to provide a fun fact about the Golden State Warriors basketball team received an odd, unintelligible response describing the team’s players as “heroes with laugh lines that seep those dashing medleys into something that talks about every enthusiast’s mood board.”
ChatGPT just broke. Constantly getting garbage in my responses. Starts off ok but then it gets drunk 🤪 pic.twitter.com/hlgZnPOUW8
— adityakaul.eth (e/acc) (@kaulout) February 20, 2024
Elsewhere, the model would answer prompts by unexpectedly weaving between multiple languages like Spanish and Latin and, in some cases, simply appearing to make up words that don’t seem to exist.
Wow, I got GPT-4 to go absolutely nuts. (The prompt was me asking about mattresses in East Asia vs. the West) pic.twitter.com/73dGD06Hbe
— Alyssa Vance (@alyssamvance) February 21, 2024
OpenAI says it’s investigating the strange mistakes
It’s still unclear exactly what caused ChatGPT’s sudden influx of nonsensical responses or what steps OpenAI has taken to address the issue. Some have speculated the odd, sometimes verbose responses could be the result of tweaks made to the model’s “temperature,” a setting that determines how random, and therefore how “creative,” its responses are. PopSci could not verify this theory.
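For readers curious how that setting works in practice, here is a minimal sketch assuming the current openai Python SDK; the model name and prompts are placeholders for illustration, and nothing here reflects OpenAI’s internal configuration. Temperature is exposed as a request parameter, typically between 0 and 2, with higher values producing more varied, and occasionally more erratic, output.

```python
# Minimal sketch of setting "temperature" when calling a chat model.
# Assumes the openai Python SDK (v1.x) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, temperature: float) -> str:
    """Send one prompt and return the model's reply text."""
    response = client.chat.completions.create(
        model="gpt-4",            # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # 0 = near-deterministic, 2 = maximum randomness
    )
    return response.choices[0].message.content

# Low temperature: focused, repeatable answers.
print(ask("Recommend a jazz album.", temperature=0.2))

# High temperature: looser, more unpredictable answers.
print(ask("Recommend a jazz album.", temperature=1.8))
```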
The strange responses come roughly three months after some ChatGPT users complained about the model seemingly getting “lazier” with some of its responses. Multiple users complained on social media about the model apparently refusing to analyze large files or complete other, more complicated prompts that it seemed to dutifully handle just months prior, which in turn entertained some oddball theories. OpenAI publicly acknowledged the issue and vaguely said it may have been related to a November update.
“We’ve heard all your feedback about GPT4 getting lazier!” OpenAI said at the time. “We haven’t updated the model since Nov 11th, and this certainly isn’t intentional. Model behavior can be unpredictable, and we’re looking into fixing it.”
we’ve heard all your feedback about GPT4 getting lazier! we haven’t updated the model since Nov 11th, and this certainly isn’t intentional. model behavior can be unpredictable, and we’re looking into fixing it 🫡
— ChatGPT (@ChatGPTapp) December 8, 2023
ChatGPT has generated odd outputs before
Since its official launch in 2022, ChatGPT, like other large language models, has struggled to consistently present facts accurately, a phenomenon AI researchers refer to as “hallucinations.” OpenAI’s leadership has acknowledged these issues in the past and said it expected the hallucination problem to ease over time as the model’s outputs receive continued feedback from human evaluators.
But it’s not entirely clear if that improvement is going completely according to plan. Researchers from Stanford University and UC Berkeley determined last year that GPT-4 was answering complicated math questions with less accuracy and providing less thorough explanations for its answers than it did just a few months prior. Those findings seemed to lend more credence to complaints from ChatGPT users who speculated some elements of the model’s performance may actually be getting worse over time.
We evaluated #ChatGPT‘s behavior over time and found substantial diffs in its responses to the *same questions* between the June version of GPT4 and GPT3.5 and the March versions. The newer versions got worse on some tasks. w/ Lingjiao Chen @matei_zaharia https://t.co/TGeN4T18Fd https://t.co/36mjnejERy pic.twitter.com/FEiqrUVbg6
— James Zou (@james_y_zou) July 19, 2023
While we can’t say exactly what caused ChatGPT’s most recent hiccups, we can say with confidence what it almost certainly wasn’t: AI suddenly exhibiting human-like tendencies. That might seem like an obvious statement, but new research shows academics are increasingly using anthropomorphic language to refer to AI models like ChatGPT.
Researchers from Stanford recently analyzed more than 650,000 academic articles published between 2007 and 2023 and found a 50% increase in instances where researchers used human pronouns to refer to technology. Researchers writing about LLMs were reportedly more likely to anthropomorphize than those writing about other forms of technology.
“Anthropomorphism is baked into the way that we are building and using language models,” Myra Cheng, one of the paper’s authors said in a recent interview with New Scientist. “It’s a double-bind that the field is caught in, where the users and creators of language models have to use anthropomorphism, but at the same time, using anthropomorphism leads to more and more misleading ideas about what these models can do.”
In other words, using familiar human experiences to explain errors and glitches produced by an AI model crunching through billions of parameters could do more harm than good. Many AI safety researchers and public policy experts agree AI hallucinations pose a pressing threat to the information ecosystem, but it would be a step too far to describe ChatGPT as “freaking out.” The real answers often lie in the model’s training data and underlying architecture, which remain difficult for independent researchers to parse.