Coding in 2025
An overview of the current state and future directions of AI coding technologies.
Jun 26, 2025
I've been working on various projects experimenting with the current state of applied AI. Naturally, that has given me a great opportunity to explore different code-gen tools without the risks that come with experimenting on production code.
As I recently shared in "You. Are. Not. Using. Enough. Compute.", my thinking on productivity has undergone a "flippening": I'm now noticing just how much our old ways are holding us back.
This post is a more practical look at my experience with AI coding tools in 2025.
IDE-Based Agents

- They live where you already work—your editor—so adoption is one extension away. Or swap your favourite editor for an AI-first fork of it (e.g. Cursor).
- They can suggest code, complete functions, or flip into agent mode and yolo a whole feature.
- Despite their ungodly ability, you're still in the driver's seat, but you can choose to play prompt engineer and have your copilot finish any given task.
- This might not make sense, but as time goes on I imagine these will start feeling more like Linear.
- It's hard to imagine programming without these tools now. But, tbh, with the tools below it's hard to imagine programming at all.
- IMO: Cursor is the best of these. But GitHub Copilot isn't as far behind as people claim.
CLI-Based AI Agents

- Tools like OpenAI’s Codex CLI, Google’s Gemini CLI, and Anthropic’s Claude Code (all open source) take over your terminal and run as an app with full control of a folder and the file tree beneath it.
- Being in the terminal means they can still do the basics like creating and editing files, but they can also go beyond what the editor would provide or allow.
- These are, surprisingly, much better than their in-IDE counterparts. The only reason I can muster for "why" is that this is basically the same harness the labs use to RL the core models on coding tasks. Effectively, it's a much more intuitive interaction pattern for the models.
- On macOS these are mostly sandboxed, so they're stuck in the folder you gave them -- but you can imagine they can run some very destructive commands if you're not careful.
- The downside is that these are far too intimidating for non-devs and often require API keys to run, so they’re not as accessible for casual users.
- IMO: Claude Code is the best of these. It's the original and the rest are largely imitating it. But you basically need the Claude Max plan ($200/month) to actually use it.
Cloud-Based AI Agents

- Services like Google’s Jules and ChatGPT’s in-browser Codex clone your repo into a secure VM, complete tasks, then open PRs with diffs and text summaries.
- It’s asynchronous: you submit a request, continue working, then review a fully formed plan.
- Non-devs can “deploy” by chat—no terminal needed—but you still keep PRs, code reviews, and CI/CD intact.
- You can run as many tasks as you want in parallel.
- IMO: OpenAI's Codex, and specifically the codex-1 model fine-tuned to code in large projects/repos, is the best of any option here by a margin (there's no such thing as a wide margin in AI anymore). It not only "just works", it does so in a way that respects your existing code.
AI-Powered App Builders

- Platforms like v0.dev, Lovable.dev, and Bolt.new let anyone describe an app in plain English and get UI, backend, and deployment—no manual boilerplate.
- They range from full no-code (v0) to hybrid chat-plus-code (Lovable’s React + Supabase) to instant in-browser IDEs (Bolt).
- Great for rapid prototyping, though customization limits and vendor lock-in differ widely.
- For example, v0.dev, by Vercel (makers of the popular OSS framework Next.js), runs a fine-tuned model that makes it disproportionately good at building Next.js apps, but it’s not as flexible for other stacks. The good news is, if you're building with Next.js anyway, the v0 model is a great resource.
- The killer promise of these platforms is the "prompt to production" flow: you describe your app, it builds it, and you can deploy it with a click. They're code agents, but they're also hosting providers. This is the most accessible option for non-devs.
- IMO: In the short term, these are great for prototyping and explorations (these might be the new "Figma"). In the long term, these are the new "frontend frameworks." I know that barely makes sense, so if it feels like a strange take to you... just think about it longer.
For Now, Where's the Magic?

- Prompt to Code: Speak to ChatGPT Codex. Tell it what you want and select "Best of 4". It writes, tests, and commits x4.
- Codex -> GitHub -> Instant Preview Build: Auto-deploy on Vercel gives live preview URLs for those 4x attempts. Review them in your browser, select the winner, and continue (see the sketch after this list).
- Gemini Code Assist Code Reviews: The PR bot reviews the code, flags issues, and suggests improvements. IMO it's important that this is a different model than the one that wrote the code, so it has different weights and biases. Gemini 2.5 Pro is a very good choice here. Slam all its suggestions back into Codex.
- GitHub Merge & Deploy: This combo leverages familiar tools, keeps you in control, and delivers feature cycles in minutes instead of days.
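To make the "review the 4x attempts" step concrete, here's a minimal sketch (assuming Node 18+, TypeScript, and a GITHUB_TOKEN in the environment) that lists the agent-opened PRs on a repo and scrapes the Vercel preview links that the Vercel bot leaves as PR comments. The repo name, the codex/ branch-naming convention, and the comment-scraping trick are my own assumptions for illustration, not something these tools guarantee.

```typescript
// Sketch: list open PRs from the cloud agent and surface their Vercel preview
// URLs so the parallel attempts can be reviewed side by side.
// Assumptions (mine, not the tools'): branches are prefixed "codex/",
// and Vercel's GitHub bot comments a *.vercel.app preview link on each PR.

const GITHUB_API = "https://api.github.com";
const OWNER = "your-org"; // placeholder
const REPO = "your-repo"; // placeholder

const headers = {
  Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
  Accept: "application/vnd.github+json",
};

interface PullRequest {
  number: number;
  title: string;
  html_url: string;
  head: { ref: string };
}

// Open PRs whose head branch looks like it came from the agent.
async function listAgentPRs(): Promise<PullRequest[]> {
  const res = await fetch(
    `${GITHUB_API}/repos/${OWNER}/${REPO}/pulls?state=open&per_page=50`,
    { headers },
  );
  const prs = (await res.json()) as PullRequest[];
  return prs.filter((pr) => pr.head.ref.startsWith("codex/"));
}

// Pull any *.vercel.app URLs out of the PR's comment thread.
async function previewUrls(prNumber: number): Promise<string[]> {
  const res = await fetch(
    `${GITHUB_API}/repos/${OWNER}/${REPO}/issues/${prNumber}/comments`,
    { headers },
  );
  const comments = (await res.json()) as { body: string }[];
  const urls = comments.flatMap(
    (c) => c.body?.match(/https:\/\/[\w.-]+\.vercel\.app[^\s)]*/g) ?? [],
  );
  return [...new Set(urls)];
}

async function main() {
  for (const pr of await listAgentPRs()) {
    console.log(`#${pr.number} ${pr.title}`);
    console.log(`  PR:      ${pr.html_url}`);
    for (const url of await previewUrls(pr.number)) {
      console.log(`  Preview: ${url}`);
    }
  }
}

main().catch(console.error);
```

Nothing here is specific to Codex; any agent that opens PRs against the repo shows up, and the merge-and-deploy step stays in the normal GitHub and Vercel flow.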
Parting Thoughts
With the above, the bottleneck isn't the AI's ability to generate code, but my own capacity to effectively prompt, test, and manage the complexity of multiple parallel feature developments. The "Prompt to Code" flow allows for generating thousands of attempts simultaneously, but my human limits on processing and giving feedback mean I can only run about four features at a time at maximum capacity. And, often, I revert to a more sequential workflow, even though the tools enable much faster iteration.
As I look towards the next 18 months, upcoming products and product improvements will help me overcome these bottlenecks -- likely by removing me from the loop entirely.
Human in the loop is cope.