GPT-4.1: A true x.1 Release
OpenAI’s latest model is a true x.1 release, with better instruction following, longer memory, and lower cost.
Apr 14, 2025
OpenAI released GPT-4.1 and it’s a real x.1 release. Better instruction following, longer memory, lower cost, and the best performance we’ve seen from any OpenAI model.
But: There’s no system card, no safety update, and it’s not even available in ChatGPT yet. Only the API. Weird.
A True x.1 Release


- The name actually fits for once — this feels like a 4.1. It’s not a whole new model, but it’s sharper, faster, and much more capable in a few important ways.
- Instruction following is much better: It handles long, complex video, Q&A, and follows weird edge case instructions — like “summarize a video without subtitles and find the emotional arc.” That’s hard. And it gets it right.
- And it's built for developer use cases. And coding was a big focus.
1 Million Token Context: Haystack Mode

- This is the biggest leap. 4.1 supports up to 1M tokens of context.
- I usually have to switch to Gemini for long PDFs or webpages, but this changes everything.
- Needle-in-the-haystack tests show it works — you can drop a sentence into a massive doc and it’ll still find it.
- Caveat: Performance does degrade over long context. OpenAI’s own chart shows ~84% accuracy at 8k tokens, but only ~50% at 1M. Still — usable at that scale is a big deal.
Mini and Nano Might Be the Real Stars


- GPT-4.1 Mini is now about as good as GPT-4o — but cheaper and faster.
- GPT-4.1 Nano is as good as 4o-mini was — perfect for micro-intelligence tasks (think categorization, info retrieval, content extraction, etc.).
- These are going to be the new “daily drivers” for a lot of apps.
A Quiet Goodbye to GPT-4.5 Preview
One last note: we’ll also begin deprecating GPT-4.5 Preview in the API today as GPT-4.1 offers improved or similar performance on many key capabilities at lower latency and cost. GPT-4.5 in the API will be turned off in three months, on July 14, to allow time to transition (and
- GPT-4.5 Preview is being deprecated in the API. July 14 is the deadline.
- I liked this model. But OpenAI says 4.1 beats or matches it in key areas, and with better latency and cost.
- You can still use it in ChatGPT for now.
The Stealth Launch of “Quasar” and “Optimus”

- This was fun: OpenAI ran stealth tests of 4.1 under the names “Quasar” and “Optimus” through OpenRouter.
- We all knew who was behind them… but it was fun speculating and testing.
- Not totally clear why they did this — maybe to test expectations or avoid hype while collecting data?
Truely made for developer use-cases
OpenAI dropped GPT-4.1 and it’s an insane leap ahead of GTP-4o. On the Box AI Enterprise Eval, it offers a 27 pt jump over GPT4o on data extraction, and is much better at doc and image Q&A. Box is rolling it out now in beta in the Box AI Studio, and GA shortly.
- Box ran evals using GPT-4.1 and saw a 27-point gain over 4o for document extraction.
- That’s a big jump. Box is already rolling it out in production. This makes 4.1 the model of choice for structured data extraction and doc Q&A at the enterprise level. Expect more to follow.
- Further, they've released a new prompting guide that shows proof of improvements to the model's performance. Typically I've ignored these guides as the tips are often presented as "best practices" but this one is actually useful.
don't miss that OAI also published a prompting guide WITH RECEIPTS for GPT 4.1 specifically for those building agents... with a new recommendation for: - telling the model to be persistent (+20%) - dont self-inject/parse toolcalls (+2%) - prompted planning (+4%) - JSON BAD - use
o1 is mind-blowing when you know how to use it. it's really not a chat model -- you have to think of it more like a "report generator" (link to article below)
ChatGPT vs API?
- You can only access 4.1 via API. Not even in ChatGPT Team or Enterprise yet.
- @emollick points out something interesting: OpenAI seems to be splitting ChatGPT (high-power) from the API (cheap, dev-facing).
- No system card. No big safety briefing. Is the API side is shipping fast while the product team catches up? Are the teams shipping as fast as they can, independently? Or is something else unfolding?
Parting Thoughts
GPT-4.1 is a nice win for developers. It's faster, cheaper, and performs better across the board. For most apps, this is the new default.
But it's a Monday and I think we'll see bigger things from openai this week.