 Claude IS a bro, but yeah you'd need to know enough terminology to corral it into a sophisticated project. 
 I haven't checked out Claude at all, but almost everything else I've used is terrible at "code". I suppose my biggest complaint, and why I think human software engineers are still necessary, is that the code is so far from maintainable it's laughable. With some nostr projects I look at, it's painfully obvious they used an LLM to generate most of the code. The repo structure isn't there, comments are non-existent or irrelevant, and most functionality is oversimplified and crammed into single files full of hard-coded magic numbers. It's just rough, and I can't quite put my finger on it, other than that it's hard to understand, maintain, or build on. The output is usually so tailored to the prompt that it can't be expanded.
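To make the "magic number" complaint concrete, here's a tiny made-up sketch (the fee numbers and function names are hypothetical, just for illustration) of the single-file LLM style versus the same logic written to be maintained:

```python
# Style often seen in generated code: values inlined with no explanation.
def fee_v1(amount_cents):
    return amount_cents * 0.0275 + 30  # what are 0.0275 and 30?

# Same logic, but readable and changeable: magic numbers become named,
# commented constants, and the function documents its units.
PROCESSOR_RATE = 0.0275   # hypothetical card-processor percentage fee
FLAT_FEE_CENTS = 30       # hypothetical per-transaction flat fee

def fee_v2(amount_cents):
    """Total fee in cents for a given charge amount."""
    return amount_cents * PROCESSOR_RATE + FLAT_FEE_CENTS

# Both compute the same number; only one survives a rate change gracefully.
assert fee_v1(1000) == fee_v2(1000)
```

The behavior is identical, which is exactly the problem: the generated version "works", so nothing forces anyone to refactor it until a requirement changes.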
 I think it's important to emphasize here, even if obvious in this context, that language models are tools. Sophisticated, but tools nonetheless. Like the stick used to knock an apple from a tree's highest branch: an extension of our capabilities, but an extension in a specific 'direction'. Continuous movement in one direction will always become problematic given enough time. The key point is that as tool users we apply intentionality to a tool in service of our goals, which are defined by us and can change with circumstance. As we use the tool, we must constantly adjust the direction it takes us, because it can't do that itself - even a sophisticated tool. This is why I used the word 'corral': language models need consistent guidance. Sophisticated models may need less, but ultimately we are the goal setters and need to help them change direction when appropriate. A ship's effectiveness is only as good as the captain's guidance - and in software terms, a sophisticated captain is one well versed in effective repo structure, commenting, testing, etc., at whatever level the language model lacks. There are also levels to the direction a language model receives that we need to be aware of, from the initial prompt to the path a conversation takes as it grows longer. A long conversation corrals the model into a place with less flexibility than it had at the start, but the start of a conversation also has the highest variance, because there is less context to work with.

Just to speak to Claude's sophistication, though... it really blew me away when I asked it to program a GUI for a graph I was creating; it knew exactly what I was trying to do and how to modify the code so that it actually fit my verbal description. Or helping me brainstorm how to solve a problem using concepts I didn't understand, yet still accomplishing what I needed. It still needed lots of guidance, but it's incredible how it can accomplish those small goals that inch toward my larger goal, making progress in ways I couldn't have imagined if it were just me reading the documentation.
 Yeah, I'm in agreement. I suppose I sound more against its existence; maybe to some degree I admit I'm bothered by the lack of "correctness" or attention to detail in most AI responses, but that's just a gate-keeping mindset. More of my concern is the empowerment of ignorance of the unknown. I have been interested in AI code reviews of my existing codebases, as an after-completion error-detection assessment and so on. But I have so many things on my plate that I haven't prioritized any LLM stuff. I also don't have the hardware resources to dedicate to local setups, and the sheer scrappiness of self-hosted models bothers me too. Again, though, I'm aware that's my lack of prioritization. I literally used ollama a couple weeks ago for the first time, failed to get it to work correctly, then gave up lol

Summary: Most AI output is unpolished and that bothers me. I don't believe that, in the hands of the general public, there will be incentives to produce and refine products, so the quality of applications/tools will decline. People don't want good, they want good enough. And that's why I sound like I'm whining :)