I was reading Andrej Karpathy’s recent post, and it made me think about something I’ve noticed while using these models. He laid out the evolution that’s happened over just a few years: back in 2020, it was all about pre-training on massive datasets and then fine-tuning. By 2022, we’d moved to reinforcement learning from human feedback, basically teaching models what good outputs look like through human preferences. But now in 2024, something fundamentally different is happening. Models are being trained in what are almost gamified environments: they’re given tasks, rewarded for solving them correctly, and they have to actually reason their way to a solution. It’s not just pattern matching anymore; they’re working through problems step by step. It’s not very efficient, and I’m not defending it, but it sure is a methodical way of solving problems and building structured arguments.
Here’s what I noticed when I use the thinking modes on Claude or ChatGPT: these models don’t take mental shortcuts the way humans do. When I’m solving a problem, I might just know the answer from intuition or experience; my brain does a quick lookup and I’ve got it. But when a model reasons through something, it lays out the full argument. It considers counterarguments. It is extremely methodical about working through each step. And honestly? My first reaction was that this seems inefficient. But then I realized that’s actually kind of the point. Models just aren’t there yet with intuition, which is the ability to acquire knowledge without recourse to conscious reasoning or needing an explanation (per Wikipedia).
Let me give you a real example. Say you ask me for a nice coffee shop that’s budget-friendly. I’ll immediately picture that cozy spot I found last month, the one with the good vibes and reasonable prices. My brain just retrieves that memory. But if you ask a model the same question, it’s going to break it down: “What makes a coffee shop ‘nice’? What does ‘budget-friendly’ mean? Let me find options that satisfy both criteria and evaluate them.” It’s more structured, more explicit. Different from human thinking, but thorough in its own way.
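To make that contrast concrete, here’s a toy sketch in Python. This is not how a model actually works internally; the shops, prices, and thresholds are all invented for illustration. It just puts “lookup” next to “decompose, filter, rank”.

```python
# Toy contrast between intuition-as-lookup and explicit reasoning.
# The data, criteria, and thresholds below are all made up.

coffee_shops = [
    {"name": "Cozy Corner", "price": 3.50, "rating": 4.6},
    {"name": "Bean Palace", "price": 6.00, "rating": 4.8},
    {"name": "Quick Cup",   "price": 2.75, "rating": 3.9},
]

def human_intuition():
    """One memory lookup: I already know my answer."""
    return "Cozy Corner"

def model_reasoning(shops, max_price=4.00, min_rating=4.5):
    """Make every criterion explicit, then filter and rank.

    'Budget-friendly' becomes a price threshold and 'nice' becomes a
    rating threshold; both thresholds are assumptions that have to be
    stated outright rather than felt.
    """
    candidates = [s for s in shops
                  if s["price"] <= max_price and s["rating"] >= min_rating]
    # Rank the survivors instead of trusting a gut feeling.
    candidates.sort(key=lambda s: (-s["rating"], s["price"]))
    return candidates[0]["name"] if candidates else None

print(human_intuition())              # instant retrieval
print(model_reasoning(coffee_shops))  # explicit, auditable steps
```

Both calls land on the same shop; the difference is that the second one can show its work.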
Now here’s the thing about compute costs. Right now, this kind of reasoning is expensive to run. All that step-by-step thinking takes computational power, and that makes it feel unsustainable at scale. But compute keeps getting cheaper. What seems prohibitively expensive today will be routine in a year or two. And when that happens, this methodical, argument-based reasoning, done at speeds humans can’t match, becomes incredibly practical. We won’t need models to mimic our mental shortcuts. We’ll just appreciate that they have their own way of working through problems that’s reliable, explicit, and getting more accessible every day.
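To put rough numbers on that cost intuition, here’s a back-of-envelope sketch. Every number in it is hypothetical; the token counts and prices are invented purely to show the shape of the argument: reasoning multiplies the tokens per answer by a fixed factor, while a falling price per token shrinks the absolute cost of every answer.

```python
# Back-of-envelope cost sketch. All numbers are hypothetical,
# invented for illustration only.

answer_tokens = 50        # a short, intuition-style answer
reasoning_tokens = 2_000  # an explicit step-by-step trace

def cost(tokens, price_per_million):
    """Dollar cost of generating `tokens` at a given $/million-token price."""
    return tokens * price_per_million / 1_000_000

# As the (hypothetical) price per token falls, the reasoning
# multiplier stays fixed at 41x, but the absolute cost of a
# fully reasoned answer drops toward negligible.
for price in [10.0, 1.0, 0.1]:
    short = cost(answer_tokens, price)
    reasoned = cost(answer_tokens + reasoning_tokens, price)
    print(f"${price:>5.2f}/M tokens: short=${short:.6f}, "
          f"reasoned=${reasoned:.6f} ({reasoned / short:.0f}x)")
```

The multiplier never changes; only the price per token does, and that’s the line that keeps falling. I also think there could be a connection here with Daniel Kahneman’s work, his System 1 and System 2 distinction between fast intuition and slow deliberation, but I need to think it through.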