
The LLM Tools I Actually Use as a Developer in Early 2026

As a developer in 2026, using LLM tools is pretty much a given at this point. The more interesting question is which ones are actually worth using and for what. Not every tool fits every context, and the one that makes sense for coding isn’t what I want for general chat or what I reach for when I need to keep side project costs low.

Here’s how my stack actually breaks down, by context.

What I Use for General AI Chat (and Why I Left ChatGPT)

I switched from ChatGPT to Gemini in December 2025 after about three years on ChatGPT. The context window on Gemini 3 Pro (1 million tokens vs GPT-5’s 256K in the ChatGPT interface) made a practical difference for how I use it, and I haven’t had a reason to go back.

For general use, whether that’s thinking things through or casual conversation, Gemini is where I land now.

Coding: Claude Code (and Why I Left Cursor)

For development work I use Claude Code exclusively at this point. I came over from Cursor.

The switch was partly a nudge from my senior developer at work, who strongly preferred it, and partly the general consensus that had been building around Anthropic being the strongest option for coding. Both things pointed in the same direction and I went with it. Claude Max also ended up being covered through my work account, which made the decision pretty easy since I’d been paying for Cursor out of my own pocket.

After using it for a while I genuinely agree with the consensus. The quality of what Claude Code produces is noticeably better than what I was getting from Cursor. The one tradeoff is speed. Claude Code is slower and I do miss how fast Cursor felt sometimes. But for most work that’s an acceptable tradeoff.

I don’t use GitHub Copilot. It never clicked for me as a standalone tool once I had access to something with more autonomy.

I run Claude Code in the terminal alongside Warp. The feature I get the most out of is Warp’s spotlight modal: you set a keybind and get a floating terminal window you can show and hide over whatever you’re working in. No constantly resizing windows or splitting the IDE in awkward ways just to see CLI output. Claude Code output stays visible without disrupting the editor layout. Small quality-of-life thing but it genuinely changes the day-to-day rhythm of working with it.

For personal projects I use OpenCode. I make a deliberate effort not to lean on AI too heavily in my own time, mostly to keep my own coding skills sharp. So personal project AI usage is pretty minimal by choice.

How I Handle Agentic Workflows

This is a newer part of the stack. For agentic productivity I use both OpenClaw and Claude Cowork, and they run on different models.

Claude Cowork runs on Claude models, which makes sense given what it is.

OpenClaw I run through OpenRouter with cheaper models: Qwen3 Coder as primary with Mimo v2 Flash as fallback. For the general workflow OpenClaw handles, a cheaper model does the job well enough and the cost difference adds up over time. I wrote about how that config works here.
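A primary-plus-fallback setup like this maps directly onto OpenRouter’s model routing: the chat completions endpoint accepts an OpenAI-style body with a `models` array, and OpenRouter tries each entry in order until one responds. A minimal sketch of what that request body looks like; the model slugs below are illustrative guesses, so check OpenRouter’s model list for the exact identifiers:

```python
import json

# OpenRouter's chat completions endpoint (OpenAI-compatible shape).
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build a request body with a cheap primary model and a fallback.

    OpenRouter walks the "models" list in order, falling through to the
    next entry if the current one is unavailable. Slugs are illustrative.
    """
    return {
        "models": [
            "qwen/qwen3-coder",       # primary: cheap and decent at code
            "xiaomi/mimo-v2-flash",   # fallback (illustrative slug)
        ],
        "messages": [{"role": "user", "content": prompt}],
    }

# The serialized body you'd POST with your single OpenRouter API key.
body = json.dumps(build_request("Summarise my unread notifications."))
```

The nice part is that swapping the primary model later is a one-line change to the list, with no per-provider key management.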

One thing agentic workflows have made obvious: the prompt matters more than the model for most tasks, and the wrong person owning the prompt creates its own problems. I wrote about it here; worth a read if you’ve hit that wall.

Using OpenRouter for Side Projects (and Why Not Direct APIs)

For individual projects that hit LLM APIs I use OpenRouter rather than going to providers directly. There’s a small markup on costs but the convenience of a single key across all providers is worth it, especially when you’re not committing serious spend before a project has any traction.

For model choice I stay cheap. On a job listing project I built, I used GPT-4o-mini with structured outputs to classify and format data, which was good enough for the task while keeping costs low. For a personal calorie and protein tracking app I’m currently building, I use Gemini 2.5 Flash with web search to estimate nutritional info from text and photos. I built it for myself first, but it might go somewhere.
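For the classification case, the mechanism is the OpenAI-style `response_format` with a strict JSON schema, which also works through OpenRouter since it mirrors the OpenAI API shape. A sketch of the request body; the schema fields (`title`, `seniority`, `remote`) are made up for illustration, not the fields from my actual project:

```python
# JSON Schema the model's output must conform to.
# Field names here are hypothetical -- shape them to your own data.
LISTING_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "seniority": {"type": "string", "enum": ["junior", "mid", "senior"]},
        "remote": {"type": "boolean"},
    },
    "required": ["title", "seniority", "remote"],
    "additionalProperties": False,
}

def build_classification_request(raw_listing: str) -> dict:
    """OpenAI-style body using strict structured outputs.

    With "strict": True the model is constrained to emit JSON matching
    LISTING_SCHEMA, so the response parses without defensive cleanup.
    """
    return {
        "model": "openai/gpt-4o-mini",
        "messages": [
            {"role": "system", "content": "Classify the job listing."},
            {"role": "user", "content": raw_listing},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "job_listing",
                "strict": True,
                "schema": LISTING_SCHEMA,
            },
        },
    }
```

Structured outputs are what make a small model viable here: the hard part is conforming to a schema, and the API enforces that for you.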

The general rule for projects: use the cheapest model that can reliably handle the specific task. Most structured output and classification work doesn’t need a frontier model.

Local LLMs: Not There Yet

I gave this a proper go. I got a Qwen model running locally on my 3060 Ti. The context window sat at around 16K tokens, which is workable for small things, but the GPU strain wasn’t worth it given I already had VSCode autocomplete covering the lightweight suggestions.

The actual use case I was trying to solve was a learning-focused agent I could plug into OpenCode — not something that writes code for me, but something that pulls relevant documentation and explains it clearly in plain language. The kind of thing where you ask “how do I write a GET route in Hono?” and it gives you the docs, a working example, and the reasoning behind it rather than just producing code. A tool for learning alongside, not instead of.

I haven’t gotten there yet. My PC is due an upgrade and I think that changes what’s practical locally. It’s something I want to revisit, and honestly it might be worth a post of its own when I do.

My Current LLM Stack at a Glance

Different contexts call for different tools and I’d be a bit skeptical of anyone who says one model handles everything.

For me right now: Gemini for general use, Claude for serious development work, cheap models via OpenRouter when cost matters on projects, and local models still on the to-do list. It’ll probably look different again in another year.