All last week, OpenAI watchers reported seeing strange things.
References to GPT-5.1 kept showing up in OpenAI’s codebase, and a “cloaked” model codenamed Polaris Alpha and widely believed to have come from OpenAI randomly appeared in OpenRouter, a platform that AI nerds use to test new systems.
Today, we learned what was going on. OpenAI announced the release of its brand new 5.1 model, an updated and revamped version of the GPT-5 model the company debuted in August.
As a former OpenAI Beta tester–and someone who burns through millions of GPT-5 tokens every month–here’s what you need to know about GPT-5.1.
A smarter, friendlier robot
In their release notes for the new model, OpenAI emphasizes that GPT-5.1 is “smarter” and “more conversational” than previous versions.
The company says that GPT-5.1 is “warmer by default” and “often surprises people with its playfulness while remaining clear and useful.”
While some people like talking with a chatbot as if it’s their long-time friend, others find that cringey. OpenAI acknowledges this, saying that “Preferences on chat style vary—from person to person and even from conversation to conversation.”
For that reason, OpenAI says users can customize the new model’s tone, choosing between pre-set options like “Professional,” “Candid” and “Quirky.”
There’s also a “Nerdy” option, which in my testing seems to make the model more pedantic and cause it to overuse terms like “level up.”
At their core, the new changes feel like a pivot towards the consumer side of OpenAI’s customer base.
Enterprise users probably don’t want a model that occasionally drops Dungeons and Dragons references. As the uproar over OpenAI’s initially voiceless GPT-5 model shows, though, everyday users do.
Even fewer hallucinations
OpenAI’s GPT-5 model fell short in many ways, but it was very good at providing accurate, largely hallucination-free responses.
I often use OpenAI’s models to perform research. With earlier models like GPT-4o, I found that I had to carefully fact check everything the model produced to ensure it wasn’t imagining some new software tool that doesn’t actually exist, or lying to me about myriad other small, crucial things.
With GPT-5, I had to do that far less. The model wasn’t perfect. But OpenAI had largely solved the problem of wild hallucinations.
According to the company’s own data, GPT-5 hallucinates only 26% of the time when solving a complex benchmark problem, versus 75% of the time with older models. In normal usage, that translates to a far lower hallucination rate on simpler, everyday queries that aren’t designed to trip the model up.
From my early testing, GPT-5.1 seems even less prone to hallucinate. I asked it to make a list of the best restaurants in my hometown, and to include addresses, website links and open hours for each one.
When I asked GPT-4 to complete a similar task years ago, it made up plausible-sounding restaurants that don’t exist. GPT-5 does better on such things, but still often misses details, like the fact that one popular restaurant recently moved down the street.
GPT-5.1’s list, though, is spot-on. Its choices are solid, they’re all real places, and the hours and locations are correct across all ten selections.
There’s a cost, though. Models that hallucinate less tend to take fewer risks, and can thus seem less creative than unconstrained, hallucination-laden ones.
To that point, the restaurants in GPT-5.1’s list aren’t wrong, but they’re mostly safe choices—the kinds of places that have been in town forever, and that every local would have visited a million times.
A real human reviewer (or a bolder model) might have highlighted a promising newcomer, just to keep things fresh and interesting. GPT-5.1 stuck with decade-old, proven classics.
OpenAI will likely try to carefully walk the line between accuracy and creativity with GPT-5.1 as the rollout continues. The model clearly gets things right more often, but it’s not yet clear if that will impact GPT-5.1’s ability to come up with things that are truly creative and new.
Better, more creative writing
In a similar vein, when OpenAI released their GPT-5 model, users quickly noticed that it produced boring, lifeless written prose.
At the time, I predicted that OpenAI had essentially given the model an “emotional lobotomy,” killing its emotional intelligence in order to curb a worrying trend of the model sending users down psychotic spirals.
Turns out, I was right. In a post on X last month, Sam Altman admitted that “We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues.”
But Altman also said in the post, “now that we have been able to mitigate the serious mental health issues and have new tools, we are going to be able to safely relax the restrictions in most cases.”
That process began with the rollout of new, more emotionally intelligent personalities in the existing GPT-5 model. But it’s continuing and intensifying with GPT-5.1.
Again, the model is already voicier than its predecessor. But as the system card for the new model shows, GPT-5.1’s Instant model (the default in the popular free version of the ChatGPT app) is also markedly better at detecting harmful conversations and protecting vulnerable users.
Naughty bits
If you’re squeamish about NSFW stuff, maybe cover your ears for this part.
In the same X post, Altman subtly dropped a sentence that sent the Internet into a tizzy: “As we roll out age-gating more fully and as part of our ‘treat adult users like adults’ principle, we will allow even more, like erotica for verified adults.”
The idea of America’s leading AI company churning out reams of computer-generated erotica has already sparked feverish commentary from such varied sources as politicians, Christian leaders, tech reporters, and (judging from the number of upvotes) most of Reddit.
For their part, though, OpenAI seems quite committed to moving ahead with this promise. In a calculus that surely makes sense in the strange techno-Libertarian circles of the AI world, the issue is intimately tied to personal freedom and autonomy.
In a recent article about the future of artificial intelligence, OpenAI reiterated that “We believe that adults should be able to use AI on their own terms, within broad bounds defined by society,” placing full access to AI “on par with electricity, clean water, or food.”
All that’s to say that soon, the guardrails around ChatGPT’s naughty bits are almost certainly coming off.
That hasn’t yet happened at launch; the model still coyly demurs when asked about explicit things. But along with GPT-5.1’s bolder personalities, it’s almost certainly on the way.
Deeper thought
In addition to killing GPT-5’s emotional intelligence, OpenAI made another misstep when releasing GPT-5.
The company tried to unify all queries within a single model, letting ChatGPT itself choose whether to use a simpler, lower-effort version of GPT-5, or a slower, more thoughtful one.
The idea was noble: there’s little reason to use an incredibly powerful, slow, resource-intensive LLM to answer a query like “Is tahini still good after 1 month in the fridge” (Answer: no).
But in practice, the feature was a failure. ChatGPT was no good at determining how much effort was needed to field a given query, which meant that people asking complex questions were often routed to a cheap, crappy model that gave awful results.
OpenAI fixed the issue in ChatGPT with a user interface kludge. But with GPT-5.1, OpenAI is once again bifurcating their model into an Instant and Thinking version.
The former responds to simple queries far faster than GPT-5, while the latter takes longer, chews through more tokens, and yields better results on complex tasks.
OpenAI says that there’s more fine-grained nuance within GPT-5.1’s Thinking model, too. Unlike GPT-5, the new model can dial its level of thought up and down to accurately answer tough questions without taking forever to return a response, a common gripe with the previous version.
OpenAI has also hinted that its future models will be “capable of making very small discoveries” in fields like science and medicine next year, with “systems that can make more significant discoveries” coming as soon as 2028.
GPT-5.1’s increased smarts and dialed-up thinking ability are a first step down that path.
An attempt to course correct
Overall, GPT-5.1 seems like an attempt to correct many of the glaring problems with GPT-5, while also doubling down on OpenAI’s more freedom-oriented, accuracy-focused, voicy approach to conversational AI.
The new model can think, write, and communicate better than its predecessors, and will likely soon be able to (ahem) “flirt” better too.
Whether it will do those things better than a growing stable of competing models from Google, Anthropic, and myriad Chinese AI labs, though, is anyone’s guess.
This story has been updated.