OpenAI’s latest model is a true x.1 release, with better instruction following, longer memory, and lower cost.
Apr 14, 2025
OpenAI released GPT-4.1 and it’s a real x.1 release. Better instruction following, longer memory, lower cost, and the best performance we’ve seen from any OpenAI model.
But: There’s no system card, no safety update, and it’s not even available in ChatGPT yet. Only the API. Weird.
Comparison of GPT-4.1 to previous intelligence v. latencyGPT-4.1's performance on coding tasks compared to previous models
Comparison of GPT-4.1 to previous intelligence v. latency
GPT-4.1's performance on coding tasks compared to previous models
The name actually fits for once — this feels like a 4.1. It’s not a whole new model, but it’s sharper, faster, and much more capable in a few important ways.
Instruction following is much better: It handles long, complex video, Q&A, and follows weird edge case instructions — like “summarize a video without subtitles and find the emotional arc.” That’s hard. And it gets it right.
And it's built for developer use cases. And coding was a big focus.
This is the biggest leap. 4.1 supports up to 1M tokens of context.
I usually have to switch to Gemini for long PDFs or webpages, but this changes everything.
Needle-in-the-haystack tests show it works — you can drop a sentence into a massive doc and it’ll still find it.
Caveat: Performance does degrade over long context. OpenAI’s own chart shows ~84% accuracy at 8k tokens, but only ~50% at 1M. Still — usable at that scale is a big deal.
One last note: we’ll also begin deprecating GPT-4.5 Preview in the API today as GPT-4.1 offers improved or similar performance on many key capabilities at lower latency and cost. GPT-4.5 in the API will be turned off in three months, on July 14, to allow time to transition (and
OpenAI dropped GPT-4.1 and it’s an insane leap ahead of GTP-4o. On the Box AI Enterprise Eval, it offers a 27 pt jump over GPT4o on data extraction, and is much better at doc and image Q&A. Box is rolling it out now in beta in the Box AI Studio, and GA shortly.
Box ran evals using GPT-4.1 and saw a 27-point gain over 4o for document extraction.
That’s a big jump. Box is already rolling it out in production. This makes 4.1 the model of choice for structured data extraction and doc Q&A at the enterprise level. Expect more to follow.
don't miss that OAI also published a prompting guide WITH RECEIPTS for GPT 4.1 specifically for those building agents... with a new recommendation for:
- telling the model to be persistent (+20%)
- dont self-inject/parse toolcalls (+2%)
- prompted planning (+4%)
- JSON BAD - use
ben
@benhylak
o1 is mind-blowing when you know how to use it.
it's really not a chat model -- you have to think of it more like a "report generator"
(link to article below)
You can only access 4.1 via API. Not even in ChatGPT Team or Enterprise yet.
@emollick points out something interesting: OpenAI seems to be splitting ChatGPT (high-power) from the API (cheap, dev-facing).
No system card. No big safety briefing. Is the API side is shipping fast while the product team catches up? Are the teams shipping as fast as they can, independently? Or is something else unfolding?