hckrnws
Reminds me of the best saying I ever got from my CS professor. She would make us first write out our code and answer the question, "What will the output be?" before we were allowed to run it.
"If you don't know what you want your code to do, the computer sure as heck won't know either." I keep this with me today. Before I run my code for the first time or turn on my hardware for the first time, I ask myself, "What _exactly_ am I expecting to see here?" and if I can't answer that it makes me take a closer and more adversarial look at my own output before running it.
Isn't this the whole idea of TDD? Write your assertions, then write the code to fulfill it.
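A minimal sketch of that assertion-first loop in Python (the `slugify` function here is a made-up example, not something from the thread):

```python
import re

# TDD in miniature: the assertion is the "what will the output be?" answer,
# written before the implementation exists.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  Trailing  spaces  ") == "trailing-spaces"

# Only then write just enough code to make the assertions pass.
def slugify(text: str) -> str:
    """Lowercase, then collapse runs of non-alphanumerics into single dashes."""
    text = re.sub(r"[^a-z0-9]+", "-", text.strip().lower())
    return text.strip("-")

test_slugify()
```

The point is the ordering: the expected output is committed to before the code runs, so a surprise is a signal rather than a shrug.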
I'm not 100% convinced. While iterating fast on an early prototype, what's wrong with legitimately not knowing what e.g. the data structure will end up looking like? Just let it run, check debugger/stdout/localhost page and adjust: "Oh, right, the entries are missing canonical IDs, but at the same time there are already all the comments in them, forgot they would be there – neat". What's wrong with that? Especially at uni, when working on low-stakes problems.
> what's wrong with legitimately not knowing what e.g. the data structure will end up looking like?
But that's not what the above comment said.
> Just let it run, check debugger/stdout/localhost page and adjust: "Oh, right, the entries are missing canonical IDs, but at the same time there are already all the comments in them, forgot they would be there
So you did have an expectation that the entries should have some canonical IDs, and anticipated/desired a certain specific behavior of the system.
Which is basically the meaning of "what will the output be?" when simplified for programming novices at university.
This is a restatement of the old wisdom that "to safely use a tool you must be 10% smarter than it is." Or stated differently, you must be "ahead" of the tool (capable of accurately modeling and predicting the outcome), not "behind" it (only reacting). TDD is kind of an outgrowth of it. I've lived by this wisdom, but admit that for me there is a lot of fun in the act of verifying hypotheses in the course of development, even in the "test case gap" when you're writing the lines of code that don't make a difference in terms of making a long-term test case go from red to green, or doing other exploratory work where the totality of behavior is not well charted. Those times are the best. "Moodily scowling at the computer screen again" has been a status update from chilluns on what I'm doing more times than I like to admit.
I find LLMs so much more exhausting than manual coding. It's interesting. I think with modern LLMs you bump into how much a single human can feasibly keep track of pretty fast.
I assume that until LLMs are 100% better than humans in all cases, as long as I have to be in the loop there will be a pretty hard upper bound on what I can do - and it seems like we've roughly hit that limit.
Funny enough, I get this feeling with a lot of modern technology. iPhones, all the modern messaging apps, etc make it much too easy to fragment your attention across a million different things. It’s draining. Much more draining than the old days
Same feeling as pair programming in my experience.
If your consciousness is driving, your brain is internally aligned. You type as you think. You can get flow state, or at least find a way to think around a problem.
If you're working with someone else and having to discuss everything as you go, then it's just a different activity. I've collaboratively written better code this way in the past. But it's slower and more exhausting.
Like pair programming, I hope people realise that there's a place for both, and doing exclusively one or the other full time isn't in everyone's best interests.
I've had a similar experience, where I pair-programmed with a coworker for a few days in a row (he understood the language better and I understood the problem better) and we couldn't be in the call for more than an hour at a time. Still, although it was more tiring, I found it quite engaging and enjoyable. I'd much rather bounce ideas back and forth with another person than with an LLM.
> I find LLMs so much more exhausting than manual coding
I do as well, so totally know what you're talking about. There's part of me that thinks it will become less exhausting with time and practice.
In high school and college I worked at this Italian place that did dine-in, to-go, and delivery orders. I got hired as a delivery driver and loved it. A couple years in, there was a spell where they had really high turnover, so the owners asked me to be a waiter for a little while. The first couple months I found the small talk and the need to always be "on" absolutely exhausting, but over time I found my routine and it became less exhausting. I definitely loved being a delivery driver far more, but eventually I did hit a point where I didn't feel completely drained after every shift of waiting tables.
I can't help but think coding with LLMs will follow a similar pattern. I don't think I'll ever like it more than writing the code myself, but I have to believe at some point I'll have done it enough that it doesn't feel completely draining.
I think it's because traditionally, software engineering was a field where you built your own primitives, then composited those, etc... so that the entire flow of data was something that you had a mental model for, and when there was a bug, you simply sat down and fixed the bug.
With the rise of open source, there started to be more black-box compositing, you grabbed some big libraries like Django or NumPy and honestly just hoped there weren't any bugs, but if there were, you could plausibly step through the debugger and figure out what was going wrong and file a bug report.
Now, the LLMs are generating so many orders of magnitude more code than any human could ever have the chance to debug, you're basically just firing this stuff out like a firehose on a house fire, giving it as much control as you can muster but really just trusting the raw power of the thing to get the job done. And, bafflingly, it works pretty well, except in those cases where it doesn't, so you can't stop using the tool but you can't really ever get comfortable with it either.
Very good catch. The mental model thing is real: I've caught myself approving LLM-generated code that works, but that I couldn't debug if it broke at 2am. With libraries you at least had docs and a community. With generated code, the only source of truth is... asking the same LLM again and hoping it's consistent.
> I think it's because traditionally, software engineering was a field where you built your own primitives, then composited those, etc... so that the entire flow of data was something that you had a mental model for
Not just that, but the fact that with programming languages you can have the utmost precision to describe _how_ the problem needs to be solved _and_ you can have some degree of certainty that your directions (code) will be followed accurately.
It’s maddening to go from that to using natural language which is interpreted by a non-deterministic entity. And then having to endlessly iterate on the results with some variation of “no, do it better” or, even worse, some clever “pattern” of directing multiple agents to check each other’s work, which you’ll have to check as well eventually.
> bafflingly, it works pretty well, except in those cases where it doesn't
So as a human, you would make the judgment that the cases where it works well more than make up for the mistakes. Comfort is a mental state, and discomfort can be easily defeated by separating your own identity and ego from the output you create.
I mean, you could make that judgment in some cases, but clearly not all. If you use AI to ship 20 additional features but accidentally delete your production database you definitely have not come out ahead.
https://www.reddit.com/r/OpenAI/comments/1m4lqvh/replit_ai_w...
I think what will eventually help is something I call AI-discipline. LLMs are a tool, not more, no less. Just like we now recognize unbridled use of mobile phones to be a mental health issue, causing some to strictly limit their use, I think we will eventually recognize that the best use of LLMs is found by being judicious and intentional.
When I first started dabbling in the use of LLMs for coding, I almost went overboard trying to build all kinds of tools to maximize their use: parallel autonomous worktree-based agents, secure sandboxing for agents to do as they like, etc.
I now find it much more effective to use LLMs in a targeted and minimalist manner. I still write architecturally important and tricky code by hand, using LLMs to do several review passes. When I do write code with LLMs, I almost never allow them to do it without me in the loop, approving every single edit. I limit the number of simultaneous sessions I manage to at most 3 or 4. Sometimes I take a break of a few days from using LLMs (and often from writing any code at all), and just think and update the specs of the project(s) I'm working on at a high level, to ensure I'm not doing busy-work in the wrong direction.
I don't think I'm missing anything by this approach. If anything, I think I am more productive.
Thanks for the story. I also spent time as a delivery driver at an Italian restaurant. It was a blast, in the sense that I look back at that slice of life with pride. Never got the chance to be a waiter, but they were definitely characters and worked hard for their money. Same for the cooking staff. What a hoot.
I think the upper limit is your ability to decide what to build among infinite possibilities. How should it work, what should it be like to use it, what makes the most sense, etc.
The code part is trivial and a waste of time in some ways compared to time spent making decisions about what to build. And sometimes it's even a procrastination to avoid thinking about what to build, like how people polish their game engine (easy) to avoid putting in the work to plan a fun game (hard).
The more clarity you have about what you’re building, then the larger blocks of work you can delegate / outsource.
So I think one overwhelming part of LLMs is that you don’t get the downtime of working on implementation since that’s now trivial; you are stuck doing the hard part of steering and planning. But that’s also a good thing.
I've found writing the code massively helps your understanding of the problem and what you actually need or want. Most times I go into a task with a certain idea of how it should work, and then reevaluate having started. An LLM, though, will just do what you ask without questioning, leaving you with none of the learnings you would have gained having done it yourself. The LLM certainly didn't learn or remember anything from it either.
In some cases, yes. But I've been doing this a while now, and there is a lot of code that has to be written that I will not learn anything from. And now I have the choice not to write it.
Ehh, I find that the most tedious code is also the most sensitive to errors, stuff that blurs the divide between code and data.
I doubt we're talking about the same sort of things at all. I'm talking about stuff like generic web CRUD: too custom to be generated deterministically, but recent models crush it and make fewer errors than I do. And that is not even all they can do. Yes, once you get into a large, complicated code base it's not always worth it, but even there one benefit is that it lets me develop more test cases - and more complicated ones - than I would realistically bother with.
I actually like writing the tedious code by hand.
The whole time I'm doing it, I'm trying to think of better ways. I'm thinking of libraries, utilities or even frameworks I could create to reduce the tedium.
This is actually one of the things I dislike the most about LLM coding: they have no problem with tedium and will happily generate tens of thousands of lines where a much better approach could exist.
I think it's an innovation killer. Would any of the ORMs or frameworks we have today exist if we'd had LLMs this whole time?
I doubt it.
It depends on how you use them. In my workflow, I work with the LLM to get the desired result, and I'm familiar with the system architecture without writing any of the code.
I've written it up here, including the transcript of an actual real session:
https://www.stavros.io/posts/how-i-write-software-with-llms/
Thanks for writing this up.
I just recently woke up myself and found out these tools have actually become really, really good. I use a similar prompt system, but with not as much focus on review - I've found the review bots to be really good already, but it is more efficient to work locally.
One question I have, since you mention using lots of different models: do you ever have to tweak prompts for a specific model, or are these things pretty universal?
I don't tweak prompts, no. I find that there's not much need to, the models understand my instructions well enough. I think we're way past the prompt engineering days, all models are very good at following instructions nowadays.
Right, when you're coding with an LLM it's not you asking the LLM questions, it's the LLM asking you questions: what to build, how exactly it should work, whether it should do this or that, and under what conditions. Because the LLM does the coding, it's you who has to do more thinking. :-)
And when you make the decisions, it is you who is responsible for them. When you just did the coding, the decisions about the code were largely yours, but nobody much saw them, only how they affected the outcome. Now the LLM is in that role, responsible only for what the code does, not how it does it.
Hehe, speak for yourself - as a 1x coder on a good day, having a nonjudgmental partner who can explain stuff to me is one of the best parts of writing with an LLM :)
I like that aspect of it too. The LLM never seems to get offended, even when I tell it it's wrong. I'm just trying to understand why some people say it can feel exhausting. Instead of focusing on narrowly defined coding tasks, the work has changed: you are responsible for a much larger area of work, and expectations are similarly higher. You're supposed to produce 10x the code now.
> Because the LLM does the coding, it's you who has to do more thinking. :-)
I keep seeing this sentiment, but it sure sounds wrong to me.
Coding requires thinking (in humans, at any rate). When you're doing coding, you're doing both coding-thinking and the design thinking.
Now you're only doing one half of it.
I’d love to see what you’ve built. Can you share?
Maintenance is the hard part, not writing new code or steering and planning.
You can outsource that to another llm
If you care about code quality, of course it is exhausting. It's supposed to be. Now there is more code for you to quality-assure in the same length of time.
If you care about code quality you should be steering your LLM towards generating high quality code rather than writing just 'more code' though. What's exhausting is believing you care about high quality code, then assuming the only way to get high quality code from an LLM is to get it to write lots of low quality code that you have to fix yourself.
LLMs will do pretty much exactly what you tell them, and if you don't tell them something they'll make up something based on what they've been trained to do. If you have rules for what good code looks like, and those are a higher bar than 'just what's in the training data' then you need to build a clear context and write an unambiguous prompt that gets you what you want. That's a lot of work once to build a good agent or skill, but then the output will be much better.
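A sketch of what "build a clear context and write an unambiguous prompt" might look like in practice - all names and rules here are illustrative, not from the thread:

```python
# Hypothetical prompt assembly for a coding agent: house rules plus the
# exact code under change, rather than relying on training-data defaults.

RULES = """\
- All public functions have type hints and docstrings.
- No function longer than 40 lines; extract helpers instead.
- Errors are raised explicitly, never silently swallowed.
"""

def build_prompt(task: str, relevant_code: str) -> str:
    """Combine house rules, the code being modified, and a precise task."""
    return (
        "You are modifying the code below. Follow these rules exactly:\n"
        f"{RULES}\n"
        f"Relevant code:\n{relevant_code}\n\n"
        f"Task: {task}\n"
        "Output only the changed functions, nothing else."
    )
```

The one-time cost is writing the rules down; after that, every prompt carries the same bar for "good code" instead of whatever the model defaults to.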
> write an unambiguous prompt
That's an oxymoron. Prompts are by definition ambiguous; otherwise you'd be writing code.
I suspect it's because you need to keep more things in your head yourself; after a while of coding by hand, it becomes more labor and doesn't cost as much brain power anymore. But when offloading the majority of that coding to an LLM, you're left with the higher level tasks of software engineering, you don't get the "breaks" while writing code anymore.
How often, in your life, did you write code without stopping, in the middle of writing, to go back and review assumptions that turned out to be wrong?
I'm not talking about "oh, this function is deprecated, have to use this other one," but more "this approach is wrong, maybe delete it all and try a different approach."
Because IME an AI never discards an approach; it just keeps adding band-aids and conditionals to make the wrong approach work.
The tactical process of writing the code is also when you discover the errors in your design.
Like, did we think waterfall suddenly works now just because typing can be automated? No.
Classic coding was the process of incrementally saying "Ah, I'm getting it!" - as you compile your code and it works better each time, you get a little dopamine hit from "solving" the puzzle. This creates states where time can pass with great alacrity as we enter these little dopamine-induced trances we call "flow", which we all experience.
AI is not that, it's a casino. Every time you put words into the prompt you're left with a cortisol spike as you hope the LLM lottery gives you a good answer. You get a little dopamine spike when it does, but it's not the same as when you do it yourself because it's punctuated by anxiety, which is addictive but draining. And I personally have never gotten into a state of LLM-induced "flow", but maybe others have and can explain that experience. But to me there's too much anxiety around the LLM from the randomness of what it produces.
Working with LLMs for coding tasks feels more like juggling, I think. You're fixating on the positions of all of the jobs you're handling simultaneously, and while muscle memory (in this metaphor, the LLMs) is keeping each individual item in the air, you're actively managing: considering your next trick/move, getting things back on track when one object drifts from what you'd anticipated, etc. It simultaneously feels markedly more productive and requires carefully divided (and mentally taxing) focus. It's an adjustment, though I do worry whether there's a real, tangible trade-off at play and I'm losing my edge for instances where I need to do something carefully, meticulously, and manually.
The Theory of Bounded Rationality applies. Tech tools scale systemic capability limits; the 3-inch chimp brain's limits don't change. The story writes itself.
It feels no different than inheriting someone's code base when you start at a company. I hate this feeling. AI removes the developer's attachment to, and first-hand understanding of, the code.
I go through phases with it: times where I am extraordinarily productive and times where I can't even bear to open a terminal window.
You used to be a Formula 1 driver. Now you are an instructor for a Formula 1 autopilot. You have to watch it at all times with full attention for it's a fast and reckless driver.
You're being generous to the humans; we're more like Ladas in comparison.
That may not be a bad comparison. An F1 car is a really fast, really specialized car that is also extremely fragile. A Lada may not be too fast, but it's incredibly versatile and robust even after decades of use. And it has more luggage space.
I imagine code reviewing is a very different sort of skill than coding. When you vibe code (assuming you're reading the code that is written for you) you become a code reviewer... I suspect you're learning a new skill.
It’s easier to write code than read it.
I'd argue the read-write procedures happen simultaneously as one goes along, writing code by hand.
It's important to enforce the rules that make the code easier to read.
The way I've tried to deal with it is by forcing the LLM to write code that is clear, well-factored and easy to review i.e. continually forcing it to do the opposite of what it wants to do. I've had good outcomes but they're hard-won.
The result is that I could say it was code I myself approved of. I can't imagine a time when I wouldn't read all of it; when you just let them go, the results are so awful. If you're letting them go and reviewing at the end, like a post-programming review phase, I don't even know if that's a skill that can be mastered while the LLMs are still this bad. Can you really master Where's Waldo? Everything's a mess, and you're just looking for the part of the mess that has the bug?
I'm not reviewing after I ask it to write some entire thing. I'm getting it to accomplish a minimal function, then layering features on top. If I don't understand where something is happening, or I see it's happening in too many places, I have to read the code in order to tell it how to refactor the code. I might have to write stubs in order to show it what I want to happen. The reading happens as the programming is happening.
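The "write stubs to show it what I want" step might look something like this - a hypothetical sketch, with all names invented for illustration:

```python
# Stubs handed to the LLM to pin down the shape of a change: the signatures
# and docstrings are the spec; the bodies are left for the model to fill in.

from dataclasses import dataclass

@dataclass
class Entry:
    canonical_id: str
    comments: list[str]

def fetch_entries(source: str) -> list[Entry]:
    """Load raw entries; must populate canonical_id for every entry."""
    raise NotImplementedError  # to be implemented by the LLM

def render_entry(entry: Entry) -> str:
    """One line per entry: '<canonical_id> (<n> comments)'."""
    raise NotImplementedError  # to be implemented by the LLM
```

Because the stubs compile and type-check, the reading and the programming really do happen together: you understand the interface before the generated bodies arrive.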
Current state of my workflow (your feedback is welcome; I'm an autodidact and have been self-teaching thus far). I also think my workflow addresses some of the pain points the OP mentioned.
If a problem is a continuation of the current or other chat, switch to it. If it is a new problem or sub-problem requiring something more extensive than a tiny refactor, a new chat is started.
From there,
Start in Ask mode. Ask about the existing code I'm trying to modify. If I am interfacing with someone else's code, that is put in reach of the project and I ask questions about how it produces certain results or what a function does. Ask the foundational "bottom-up" questions: how does this work? What routine produces x? Call out specific external sources from the web if they contain relevant information, like an API. Iterate until I feel I have a grasp of what I can build with. Not only does this help me comprehend the boundaries in terms of existing capability and/or shortcomings, it seeds the context.
Move to Plan mode. Provide a robust problem statement, incorporating the findings from the Ask phase and the desired output. Throw in some guard rails to narrow the search path used by the LLM as it seeks the solution. Disqualify certain approaches if necessary. If the LLM's plan isn't aligned with my goals, or I remember that thing I skipped, I amend the plan. The plan prompt I typed is saved to a blank file in the text editor.
Implement.
Validate. If it works, great. Read the code and approve each change; usually I speed-read this.
If it doesn't work, I tell the LLM the difference between the expected and actual result and instruct it to instrument the code to produce trace output. Then I feed the trace output back into it with explanations of where the output doesn't match my expectations (oftentimes revealing weaknesses in my problem statement). Sometimes, when a corner case is problematic, several iterations are required, and then I screen for regressions. If I reach the point where I know I screwed up the planning prompt, I trash the changes, revise the copypasta saved earlier, and start a new Planning session.
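The "instrument the code to produce trace output" step can be as simple as logging intermediate values, so the LLM sees exactly where expectation and reality diverge. A hypothetical sketch (the price-parsing function is invented for illustration):

```python
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("trace")

def normalize_prices(raw: list[str]) -> list[float]:
    """Parse price strings like '$1,234.50' into floats, tracing each step."""
    out = []
    for item in raw:
        log.debug("raw item: %r", item)
        cleaned = item.replace("$", "").replace(",", "")
        log.debug("cleaned: %r", cleaned)
        value = round(float(cleaned), 2)
        log.debug("parsed: %r", value)
        out.append(value)
    return out

# The trace output, annotated with where it diverges from expectations,
# is what gets fed back to the LLM in the next iteration.
```

Pasting the annotated trace back in is usually far more effective than a prose description of the bug.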
I have always enjoyed the feeling of aporia during coding. Learning to embrace the confusion and the eventual frustration is part of the job. So I don’t mind running in a loop alongside an agent.
But I absolutely loathe reviewing these generated PRs - more so when I know the submitter themselves has barely looked at the code. Now corporate has mandated AI usage and is asking people to do 10k LOC PRs every day. Reviewing this junk has become exhausting.
I don't want to read your code if you haven't bothered to read it yourself. My stance is: reviewing this junk is far more exhausting. Coding is actually the fun part.
> Now corporate has mandated AI usage and is asking people to do 10k LOC PRs every day.
That's a big red flag if I ever saw one. Corporate should be empowering the engineering team to use AI tooling to improve their own process organically. Is this true or an exaggeration? If it's true, I'd start looking for a more balanced position at a more disciplined org.
True at DoorDash, Amazon, and Salesforce - speaking from experience.
Mandates are becoming normal. Most devs don’t seem to want to but they want to keep their jobs.
Definitely a sign that workers aren't being exploited.
10k LoC per day? Wow, my condolences to you.
On a different note: something I just discovered is that if you google "my condolences", the AI summary will thank you for the kindness before defining its meaning, fun.
>Reviewing this junk has become exhausting.
Nitpick it to death. Ask the submitter questions about how everything works. Even if it looks good, flip a coin and reject it anyway. Drag that review time out. You don't want unlucky PRs going through, after all.
Corporate is not going to wake up and do the sensible thing on its own.
Ha ha I wish. Then both corporate and your coworkers hate you.
Also, there is no point in asking questions when you know that they just yoloed it and won't be able to answer anything.
We have collectively lost our common sense and reasonable people are doing unreasonable things because there's an immense amount of pressure from the top.
I always wonder where HNers worked or work. We do ERP and troubleshooting on legacy systems for medium to large corps; PRs by humans were always pretty random and barely looked at as well, even though a human wrote them (copy/pasted from SO and changed somewhat); if you ask what the code does, they cannot tell you. This is not an exception; this is the norm as far as I can see outside HN. People who talk a lot, don't understand anything, and write code that is almost alien. LLMs, for us, are a huge step up. There is a 40-deep nested if with a loop to prevent it from failing on a missing case in a critical Shell (the company) ERP system; LLMs would not do that. It is a nightmare, but keeping things like that running makes us a lot of money.
I currently work at one of the biggest tech companies. I’ve been doing this for over 20 years, and I’ve worked at scrappy startups, unicorns, and medium size companies.
I've certainly seen my share of what I call slot-driven development, where a developer just throws things at the wall until something mostly works. And plenty of cut-and-paste development.
But it’s far from the majority. It’s usually the same few developers at a company doing it, while the people who know what they’re doing furiously work to keep things from falling apart.
If the majority of devs were doing this nothing would work. My worry is that AI lets the bad devs produce this kind of work on a massive scale that overwhelms the good devs ability to fight back or to even comprehend the system.
I also work at a huge company, and this observation is true. The way AI is being rammed down our throats is burning out the best engineers. OTOH, the mediocre simian army “empowered” by AI is pushing slop like there’s no tomorrow. The expectation from leadership, who tried Claude for a single evening, is that you should be able to deliver everything yesterday.
The resilience of the system has taken a massive hit, and we were told that it doesn’t matter. Managers, designers, and product folks are being asked to make PRs. When things cause Sev0 or Sev1 incidents, engineers are being held responsible. It’s a huge clown show.
> The expectation from leadership, who tried Claude for a single evening, is that you should be able to deliver everything yesterday.
"Look, if the AI fairy worked like that our company would be me and the investors."
I should make t-shirts. They'll be worth a fortune in ironic street cred once the AI fairy works like that.
Tech companies, sure. How about massive non-software tech companies? I don't know where it is not the norm, and I have been in very many of them as a supplier for the past 30 years. Tech companies are a bit different, as they usually have leadership that prioritizes these things.
Non-tech companies too. You can't build large-scale software with everyone merging PRs like that. My guess is that if you're a supplier, you are getting a pretty severe sampling bias.
I would hope that most people who are technically competent enough to be on HN are technically competent enough to quit orgs with coding standards that bad. Or they're masochists who have taken on the challenge of working to fix them.
Half the posts here are talking about how they 100xd their output with the latest agentic loop harness, so I'm not sure why you would get that impression.
Neither of those. The pay is great and if all leadership cares about is making the whole company "AI Native" and pushing bullshit diffs, I'll play ball.
The one thing I don't quite get is how running a loop alongside an agent is any different from reviewing those PRs.
I do “TDD” LLM coding and only review the tests. That way if the tests pass I ship it. It hasn’t bitten me in the ass yet.
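In that workflow, the only human-audited artifact is the tests, which might look something like this (a hypothetical example; the implementation underneath stands in for whatever the LLM produced):

```python
# The only code the human reads in this workflow: tests pinning down
# observable behavior. If they're thorough and pass, the generated code ships.

def apply_discount(price: float, percent: float) -> float:
    """(Stands in for the LLM-generated body, unreviewed in this workflow.)"""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

def test_apply_discount():
    assert apply_discount(100.0, 25) == 75.0
    assert apply_discount(19.99, 0) == 19.99
    try:
        apply_discount(10.0, 150)
        assert False, "expected ValueError"
    except ValueError:
        pass

test_apply_discount()
```

The bet is that the tests fully capture the behavior you care about; the risk is anything the tests don't cover.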
Use AI to review.
10k, really? Are you supposed to understand all that code? This is crazy, and a one-way street to burnout.
Yep and now we are encouraged to use AI to review the code as well. But if shit hits the fan then you are held responsible.
A lot of these resonate with me, particularly the mental fatigue. It feels like normal coding forced me to slow my brain down, whereas now my mind is the limit.
For context, I started an experiment to rebuild a previous project entirely with LLMs back in June '25 ("fully vibecoded" - not even reading the source).
After iterating and finally settling on a design/plan/debug loop that works relatively well, I'm now experiencing an old problem like new: doing too much!
As a junior engineer, it's common to underestimate the scope of some task, and to pile on extra features/edge cases/etc. until you miss your deadline. A valuable lesson any new programmer/software engineer necessarily goes through.
With "agentic engineering," it's like I'm right back at square one. Code is so cheap/fast to write, I find myself doing it the "right way" from the get go, adding more features even though I know I shouldn't, and ballooning projects until they reach a state of never launching.
I feel like a kid again (:
I spent more time correcting LLMs or agentic systems than I would have spent just learning the domain and doing the coding myself. I mainly leave the LLM the boring work of tedious, repetitive code.
If I give it anything resembling anything that I'm not an expert on, it will make a mess of things.
Yeah the old adage "what you put in is what you get out" is highly relevant here.
Admittedly I'm knowledgeable in most of the domains I use LLMs for, but even so, my prompts are much longer now than they used to be.
LLMs are token-happy, especially Claude, so if you give one a short 1-2 sentence prompt, your results will be wildly variable.
I now spend a lot of mental energy on my prompting, and resist the urge to use less-than-professional language.
Instead of "build me an app to track fitness" it's more like:
> "We're building a companion app for novice barbell users, roughly inspired by the book 'Starting Strength.' The app should be entirely local, with no back-end. We're focusing on iOS, and want to use SwiftUI. Users should [..] Given this high-level description, let's draft a high-level design doc, including implementation decisions, open questions, etc. Before writing any code, we'll review and iterate on this spec."
I've found success in this method for building apps/tools in languages I'm not proficient in (Rust, Swift, etc.).
What do you mean by doing it the "right way" from the get-go, while also adding more features, ballooning projects, and never launching?
Is that why it's in quotes - because it's the opposite of the right way?
If there's one thing I learned in a decade+ of professional programming, it's that we can't predict the future. That's it, that simple. YAGNI. (Also: model the data, but I'm trying to make a point here.)
We got into coding because we like to code; we invent reasons and justifications to code more, ship more, all the world's problems can be solved if only developers shipped more code.
Nirvana is reached when they that love and care about the shipping of the code know also that it's not the shipping of the code that matters.
Yeah exactly, "right way" is in quotes because there is no right way.
The most important thing is shipping/getting feedback, everything else is theatre at best, or a project-killing distraction at worst.
As a concrete example, I wanted to update my personal website to show some of these fully-vibecoded projects off. That seemed too simple, so instead I created a Rotten Tomatoes-inspired web app where I could list the projects. Cool, should be an afternoon or two.
A few yak shaves later, and I'm adding automatic repo import[0] from GitHub...
Totally unnecessary, because I don't actually expect anyone to use the site other than me!
lol. for whatever reason what came to mind is "it's like alcoholics anonymous". it's so liberating to be self-aware that we have a problem.
I JUST WANT TO CODE!
It gets us all. And it makes us better I think, to care about the craft. LLM people seem split on that. But it's both to me: gotta care about the craft, also as a professional, it's not the code, it's business outcomes. All good. hold two truths.
The most honest, logical, and practical take I've seen on this. People consistently underestimate the skill and effort it takes to write precisely and think critically both about their problem, and their processes. The closer you are to knowing what to ask for in the way knowledgeable people ask for it with respect to the process you are using to complete work, the closer the output will be to what you want.
I find working more asynchronously with the agents helps. I've disabled the in-your-face agent-is-done/needs-input notifications [1]. I work across a few different tasks at my own pace. It works quite well, and when/if I find a rhythm to it, it's absolutely less intense than normal programming.
You might think that the "constant" task switching is draining, but I don't switch that frequently. Often I keep the main focus on one task and use the waiting time to draft some related ideas/thoughts/next prompt. Or browse through the code for light review/understanding. It also helps to have one big/complex task and a few simpler things going concurrently. And since the number of details you need to keep "loaded" in your head per task is smaller, switching has less cost, I think. You can also "reload" much quicker by simply chatting with the agent for a minute or two if some detail has faded.
I think a key thing is to NOT chase after keeping the agents running at max efficiency. It's ok to let them be idle while you finish up what you're doing. (Perhaps bad for KV cache efficiency though - I'm not sure how long they keep the cache.)
(And obviously you should run the agent in a sandbox to limit how many approvals you need to consider)
[1] I use the urgent-window hint to get a subtle hint of which workspace contains an agent ready for input.
EDIT: disclaimer - I'm relatively new to using them, and have so far not used them for super complex tasks.
Yes, I follow the same sort of pattern. It took a while to convince myself that it was ok to leave the agent waiting, but it helps with the human context switching. I also try to stagger the agents, so one may be planning and designing while another is coding; that way I can spend more time on the planning and designing ones and leave the coding one to get on with it.
That's actually one of the best parts. You can trust that some of the context you'd normally keep loaded is held by the LLM, making task switching feel less risky and often improving your ability to work on needed and/or related changes elsewhere.
Yes, I briefly felt like I needed to keep agents busy but got over it. The point of having multiple things going on is so you have another task to work on.
I've found LLMs to be liberating and energizing, not at all exhausting.
I can finally do my preferred workflow: Research, (design, critique), (plan, critique, design), implement.
Design and planning have a quick enough turnaround cycle not to get annoying. By the time the agent is writing code, I have no involvement anymore. Just set it and forget it, come back in half an hour or so to see if it's done yet. Meanwhile, I look at the bigger picture and plan out my next prompt cycles as it churns out code.
For example, this project was entirely written by LLM:
https://github.com/kstenerud/yoloai
I never wrote a single line of this code (I do review it, of course, but even then the heavy lifting for that can be offloaded to an LLM so that I can focus on wider issues, which most often are architectural).
In particular, take a look at the docs/dev subdir to see the planning and design. Once the agent has that, it's MUCH harder for it to screw things up.
Is it as tight as it could be? Nope, but it has a solid architecture, does its job well, and has good debugging infrastructure so fixes are fast. I wouldn't use this approach for embedded or projects requiring maximum performance, but for regular code it's great!
I ran go's deadcode against your repo, it says there are 44 unreachable functions. If you add guardrails like static analysis tools to a pre-commit you can make LLMs tighten things up.
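For anyone wanting to try this, a minimal git pre-commit hook along these lines might look like the following (a sketch, assuming the `deadcode` tool from `golang.org/x/tools` is installed; the echoed messages are illustrative):

```shell
#!/bin/sh
# .git/hooks/pre-commit - block commits while deadcode reports unreachable functions.
# Assumes: go install golang.org/x/tools/cmd/deadcode@latest
set -e

report=$(deadcode ./... || true)
if [ -n "$report" ]; then
    echo "deadcode found unreachable functions; tighten up before committing:" >&2
    echo "$report" >&2
    exit 1
fi
```

The same check can live in CI instead of a local hook if you'd rather not block commits outright.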
LLMs do not actually make anything better for anyone. You have to constantly correct them. It's like having a junior coder under your wing that never learns from its mistakes. I can't imagine anyone actually feeling productive using one to work.
I don't know what to think about comments like this. So many of them come from accounts that are days or at most weeks old. I don't know if this is astroturfing, or you really are just a new account and this is your experience.
As somebody who has been coding for just shy of 40 years and has gone through the actual pain of learning to run a high-level and productive dev team, your experience does not match mine. Even great devs will forget some of the basics and make mistakes, and I wish every junior (hell, even seniors) were as effective as the LLMs are turning out to be. Put the LLM in the hands of a seasoned engineer who also has the skills to manage projects and mentor junior devs and you have a powerful accelerator. I'm seeing the outcome of that every day on my team. The velocity is up AND the quality is up.
> The velocity is up AND the quality is up.
This is not my experience on a team of experienced SWEs working on a product worth 100m/year.
Agents are a great search engine for a codebase and really nice for debugging but anytime we have it write feature code it makes too many mistakes. We end up spending more time tuning the process than it takes to just write the code AND you are trading human context with agent context that gets wiped.
I can't speak to your experience. I can only speak to mine.
We've spent years reducing old debt and modernizing our application and processes. The places where we've made that investment are where we are currently seeing the additional acceleration. The places where we haven't are still stuck in the mud, but per your "search engine for a codebase" comment our engineers are starting to engage with systems they would not have previously touched.
There are areas for sure where LLMs would fall down. That's where we need the experts to guide them and restructure the project so that it is LLM friendly (which also just happens to be the same things that make the app better for human engineers).
And I'm serious about the quality comment. Maybe there's a difference in how your team is using the tools, but I have individuals on my team who are learning to leverage the tools to create better outputs, not just pump out features faster.
I'm not saying LLMs solve everything, FAR from it. But it's giving a master weapon to an experienced warrior.
I also agree. In fact, I was hitting a limit on my ability to ship a really difficult feature and after I became good at using Claude, I was able to finally get it done. The last mile was really hard but I had documented things very well so the LLM was able to fly through the bugs, write tests that I dare say are too difficult for humans to design since they require keeping in your head a large amount of context ( distributed computing is really hard) which is where I was hitting my limit. I now think that I can only do the easy stuff by hand, anything serious requires me to get a LLM to at least verify, but of course I just let it do things while I explain the high level vision and the sorts of tests I expect it to have.
Your experience matches mine too. Experienced devs are increasing their output while maintaining quality. I'm personally writing better-quality code than before because it's trivial to tell AI to refactor or rename something. I care about good code, but I'm also lazy, so I have my Claude skills set up to have AI do it for me. (Of course, I always keep the human in the loop and review the outputs.)
You said that you're restructuring the project to be LLM friendly, which also makes the app better for humans. I 100% agree with this. Code that is unreadable and unmaintainable for humans is much more difficult for AI to understand. I think companies that practiced or prioritized code hygiene will be ahead of the game when it comes to getting good results with agentic AI.
Whenever actual studies are made about LLM coding they always show that LLM coding is a net loss in quality and delivery speed.
(They are good as coder psychotherapy tho.)
Well, things are changing so fast those studies are going to be out of date. And I have no doubt some people are experiencing a net loss while others are not. We need to pry apart why some people are having success with it and others aren't, and build on top of what's working.
Sounds a lot like "software engineering".
Wasn't AI supposed to free us from all that and let us fire all those useless coders?
Who would I possibly be astroturfing for? The entire industry is all-in on LLMs.
I can't speak for you specifically, it's just a trend I'm seeing and unfortunately your 2 day old account falls into that bucket. There's a lot of people who have a lot to lose or who are very afraid of what LLMs will do. There's plenty of incentive to do this.
I would be curious to see if I'm just imagining this or if it really is a trend.
At the same time you have astro-turfing from LLM producers though, so...
Agreed, but I find that astro-turfing far more obvious.
Agreed.
It's clear to me as a more seasoned engineer that I can prompt the LLM to do what I want (more or less) and it will catch generally small errors in my approach before I spend time trying them. I don't often feel like I ended up in a different place than I would have on my own. I just ended up there faster, making fewer concessions along the way.
I do worry I'll become lazy and spoiled. And then lose access to the LLM and feel crippled. That's concerning. I also worry that others aren't reading the patches the AI generates like I am before opening PRs, which is also concerning.
This is a very reasonable comment. IMO it's a fallacy to take the age of an account into consideration, especially when what's being offered is subjective experience.
A junior engineer who might spend a few hours trying to understand why you added a mutex, reading blogs on common patterns, might come back with a question about why you locked it twice in one thread in some case you didn't consider. Just because someone lacks the experience and knowledge you have, doesn't mean they cannot learn and be helpful. Sometimes those with the most to learn are the most willing to put the hours in trying.
You're just bad at using them. It's a skill like anything else. I also suspect bad coders become even worse with LLM's, and the opposite is true.
You need to learn to use the tool better, clearly, if you have such an unhinged take as this.
No, to be fair, I do see what he's saying. I see a major difference between the more expensive models and the cheaper ones. The cheaper (usually default) ones make mistakes all the damn time. You can be as clear as day with them and they simply don't have the context window or specs to make accurate, well-reasoned decisions, and it is a bit like having a terrible junior work alongside you, fresh out of university.
Emphasis on the "terrible" part of the junior.
The cheaper models can't be taught or improved due to their inherent limitations, which makes it a huge pain to even try with even the simplest of tasks. Perpetually, no matter your instruction file(s).
I agree. The more expensive models I must admit have impressed me, but sometimes they take so long and are so expensive you might as well do it yourself. That being said if you're feeling particularly lazy there is now a "do it for me" button built into code editors, but until perhaps 2035 this technology is still somewhat pedestrian compared to what it could be in the future.
It's not unhinged at all, it's a lack of imagination on both of your parts.
The only people who use LLMs "as a tool" are those who are incapable of doing it without using it at all.
> The only people who use LLMs "as a tool" are those who are incapable of doing it without using it at all.
Do you mean that? It's clearly false, but I don't want to waste time gathering famous-person counterexamples if you already know it's a huge exaggeration at best.
No true scotsman, right?
I've been using Claude Code within VS Code for the most part... it's funny, but from time to time, I forget to click the Claude icon, and start interacting with the default GitHub copilot on the side. I tend to find myself quickly frustrated with the interactions only to realize I wasn't working with Claude/Opus. As soon as I switch, I'm almost always back on track within 10-30 minutes.
That said, it helps to be in tune with your own body and mind. You need breaks now and then and with AI interactions, you will be "ON" more than just working through problems on your own. The AI can work through the boilerplate that lets your mind rest at a relatively blazing pace, leaving you to evaluate and iterate relatively quickly. You will find yourself more "worn out" from the constant thinking faster.
IIRC most people burn out after 4-6 hours of heavy thought work... take a long meal break, then consider getting back into it or not. Identify when it's okay to stop for the day... you may be getting good progress, but if you aren't in the right mindset it's you that may well be introducing mistakes into things.
Beyond this, I tend to plan/track things in TODO.md files as I work/plan through things... keeping track of what needs to be done, combined with history, and even the "why" along the way... AI makes it easy to completely swap out a backend library pretty quickly, especially with a good testing surface in place. But it helps to track why you're doing certain things... why you're making the changes you are on a technical level.
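For what it's worth, the shape of such a file can be very simple; something like this (entirely illustrative, not a prescribed format):

```markdown
# TODO

## In progress
- [ ] Swap the HTTP client library for the built-in one
  - Why: one less dependency; the old client's retry behavior was surprising

## Done
- [x] Add a regression test for the date-parsing bug
  - Why: the AI kept "fixing" it by special-casing the failing input
```

The "why" lines are the part the agent can't reconstruct later from the code alone.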
It looks like Stockholm syndrome or a traditional abusive relationship 100 years ago where the woman tries to figure out how to best prompt her husband to do something.
You know you can leave abusive relationships. Ditch the clanker and free your mind.
This was not the article that I expected. The headline is correct in both cases but I assumed that it would be about fighting against the army of LLM scrapers, which is the source of my exhaustion in relation to them. Perhaps that is one for me to write instead.
I wonder if the same people using "agentic AI" are the same that spend days setting up the "perfect" work environment with four screens.
I find LLMs are great for building ideas, improving understanding and basic prototyping. This is more useful at the start of the project lifecycle, however when getting toward release it's much more about refactoring and dealing with large numbers of files and resources, making very specific changes e.g. from user feedback.
For those of us with decades of muscle memory who can fix a bug in 30 seconds with a few Vim commands, LLMs are very likely to be slower in most coding tasks, excepting prototyping and obscure bug spotting.
Maybe if you've spent decades on the same codebase. Now try doing that in a recent codebase that's been agentically engineered by five people shipping into the same repository. Good luck with your Vim muscle memory when the entire code base changes every five seconds around you, so that no human can actually track wtf is happening week on week.
I'm not sure a faster loop is helping. It may actually be the problem. I have taken to creating 'collaboration' and 'temp_code' folders that I am spending more and more time in. By the time I am actually ready to touch the real code, I have often written and re-written the problem statement/plan and expanded it into several files and some test code. I tell the other devs at my company that I spend 90% of the tokens on understanding and clarifying the problem and let the last 10% generate an answer.

If I don't do that, I get prototype code that won't survive a single feature change and likely has intentionally hidden bugs, or 'defensive' code as some like to call it (try, except, ignore is a common Claude pattern). My favorite is when Claude hits the unit tests and says 'that failure was there before we started so I can ignore it...'.

To get it to write actually good code, you have to have caged the problem to a space that the LLM can optimize without worry. But to do that, you still have to do the work of breaking the problem into pieces small enough that the right answer is the obvious one. At that point, letting it take the syntax is just fine by me.
Maybe the right answer is to sometimes slow down, explore and think a little more instead of just letting it try something until it (eventually, sort of) works.
I've noticed the same thing. I would have three, sometimes four sessions run at the same time. It would be great, but mentally exhausting. To help this, I've set a self-imposed limit of two active chat sessions at a time.
Another thing I found is that it is too easy to keep going. I would work for too long and get even more exhausted. It feels rude to just stop a conversation. LLMs don't really care about social norms like that, but it still felt awkward to me and I would worry about losing the context I had.
To help with that, I wrote my own little plugin that reminds me to start winding down at the end of the work day and starts prompting me (pardon my phrasing) to take the off-ramp; to relay any thoughts and todos I still have in mind and put them down to pick up the next day.
This is in no way production ready, but it might be an inspiration: https://github.com/pindab0ter/wind-down
It's exhausting - sometimes it feels like you are continuously redirecting a deviant child who just won't give up on his shenanigans.
I wonder if it's more or less tiring to work with LLMs in YOLO/--dangerously-skip-permissions mode.
I mostly use YOLO mode which means I'm not constantly watching them and approving things they want to do... but also means I'm much more likely to have 2-3 agent sessions running in parallel, resulting in constant switching which is very mentally taxing.
It's orthogonal IMO. YOLO or not is simply a sign of trust for the harness or not. Trust slightly affects cognition, but not much. My working hypothesis: exhaustion is the residue of use of cognition.
What impacts cognition for me, and IMO for a lot of folks, is how well we end up defining our outcomes. Agents are tremendous at working towards the outcome (hence by TDD red-green works wonderfully), but if you point them to a goal slightly off, then you'll have to do the work of getting them on track, demanding cognition.
So the better you are at your initial research/plan phase, where you document all of your direction and constraints, the less effort is needed in the review.
The other thing impacting cognition is how many parallel threads you're running. I have defaulted to major/minor system - at any time I have 1 major project (higher cognition) and 1 minor agent (lower cognition) going. It's where managing this is comfortable.
LLMs aren't exhausting; it's the hype and all the people around it.
Same thing happened with crypto - the underlying technology is cool, but the community is what makes it so hated.
How is "llms" pronounced?
L-L-Mms
I pronounce it language models or chat bots.
el el ems
"lemons"
Everytime I read articles here describing the LLM prompt engineering workflow, all I can think is, "This sounds like such a fucking awful job".
I imagine I will greatly reduce my job prospects as a hold out, but honestly, from what I've read I think I'd rather take a hefty pay hit and not go there. It sounds like a mental heath disaster and fast track to serious burnout.
YMMV, I realize I'm in the minority, this is unproductive ranting, yada yada yada
It seems to me like any other tech: how you use it is up to you. You don’t have to run 10 agents simultaneously, etc.
I use them when I find them helpful, and that’s the case in plenty of situations. Figuring out architecture and design, finding bugs, analyzing and explaining a codebase, writing little scripts and utilities (especially in areas where you lack familiarity), etc. are all pure wins, imo. They increase my productivity and quality of output without any real downside.
When it comes to writing the bulk of a codebase or doing ongoing maintenance on a nontrivial system, a lot of ymmv comes into play. There’s no real reason (yet!) to believe that if you’re not committing 10k lines of generated slop per day, you’re going to be left behind. People doing that are on a bleeding edge that may have already cut them deeper than they realize.
In short, there’s an enormous middle ground between Yegge’s Gas Town and “I refuse to use LLMs for development”. I’m enjoying working in that middle ground. It’s interesting and stimulating, it makes a lot of things easier and quicker, and I’m growing and learning. If that stops, I’ll just change what I’m doing.
Your human context also needs compacting at some point. After hours of working with an LLM, your prompts tend to become less detailed, you tend to trust the LLM more, and it's easier to go down a solution path that is not necessarily the best one. It becomes more of a brute-forced, LLM-assisted "solve this issue" flow. What's funny is that it sometimes feels like the LLM is exhausted along with the human, and then the context compacting makes it even worse.
It's like with regular non-llm assisted coding. Sometimes you gotta sleep on it and make a new /plan with a fresh direction.
I've just come off a 2 month bender using Claude Code for, well, far too much. I had 5 instances running at once for days on end. It felt amazing, it felt like flying. And then something gave way and I couldn't focus on anything for 3 days. Diagnosed with ADHD some time ago I fell into this kind of trap well before LLM's, but not to this degree.
So I'm writing code by hand today and using Claude to track down type and dependency errors. It feels good, I might do this for a while.
I know you're feeling. Similarly, diagnosed with ADHD early last year. I also went on a Claude Code bender basically since the turn of the new year; I like the term "bender", it really does feel that way.
I have also had to step back, for my own sanity, and approach how I am using these tools. They are very strong slot machines, especially Claude models which require more steering, and that's not a good match with my brain and work style. You're not alone! Keep on trying to work better :)
I am rewriting an agent framework from scratch because another agent framework, combined with my prompting, led to 2023-level regressions in alignment (completely faking tests, echoing "completed" then validating the test by grepping for the string "completed", when it was supposed to bootstrap a udp tunnel over ssh for that test...).
Many top labs [1] [2] already have heavily automated code review, and it's not slowing down. That doesn't mean I'm trusting everything blindly, but yes, over time it should handle more and more of the "lower level" tasks, and it's a good thing if it can.
[1] https://openai.com/index/harness-engineering/ [2] https://claude.com/blog/code-review
Further I want to vent about two things:
- Things can be improved.
- You are allowed to complain about anything, while not improving things yourself.
I think the mid 2010s really popularized self improvement in a way that you can't really argue with (if you disagree with "put in more effort and be more focused", you're obviously just lazy!). It's funny because the point of engineering is to find better solutions, but technically yes, an always valid solution is just "suck it up".
But moreover, if you do not allow these two premises, what ends up happening in practice for a lot of people is that you can interpret any slight pushback as "oh, they're just a whiner," and if they're not doing something to fix their problem this instant, that "obviously" validates your claim (and even if they are, it doesn't count; they should still not be a "debbie downer," etc.).
Sometimes a premise can sound extreme, but people forget that premises do not exist in a complete logical vacuum. You actually live out and believe said premises, and in taking on a certain position, it's often more about what follows downstream from the behavior than the actual words themselves.
LLM coding is addictive as hell though. you're like a kid at disneyland, everything builds so fast, just one more feature, one more fix... and then you're 4 hours in and your prompts are garbage but you don't want to stop because everything feels so close to done
You're so lucky you feel this way...every day is a worse hell for me since these demons arrived. I miss my job.
I wanna say that it is indeed a “skill issue” when it comes to debugging and getting the agent in your editor of choice to move forward. Sometimes it takes an instruction to step back and evaluate the current state and others it’s about establishing the test cases.
I think the exhausting part is probably more tied to the evaluation of the work the agent is doing; understanding its thought process and catching the hang-up can be tedious in the current state of AI reasoning.
I get exhausted because of the cognitive overhead of switching between 2 or 3 projects at once. I always want to be manually verifying or prompt writing, and keeping it all straight is taxing. But I’m getting so much more done.
Most people reading this have probably had the experience of wasting hours debugging when exhausted, only to find it was a silly issue you’ve seen multiple times, or maybe you solve it in a few minutes the next morning.
Working with an agent coding all day can be exhilarating but also exhausting - maybe it’s because consequential decisions are packed more tightly together. And yes cognition still matters for now.
I think that if you build a solid foundation for your project and can articulate somewhat well what you want it to do, then you can expect a pretty good result. I typically limit my prompt to a specific file, often specify the lines and outline some of the logic, and add references to other files where necessary. Then Claude gets just enough context to do what I want it to do.
Another trick I learnt is you can ask Claude to ask you comprehensive questions for clarification. Usually, it will then offer you a choice of 3 options per question that it might have and you can steer it towards the right implementation.
One thing I’ve noticed is sometimes it feels like I’m more of a QA person testing output than solving the problem.
If AI is doing the coding then it gets to solve the problems and I don’t get the satisfaction/dopamine/motivation you get when you solve a programming problem in a clever way.
I’ve found LLM development expands the scope of what I can do to an absurd level. This is what exhausts me.
My limits are now many of the same things that have always been core to software dev, but are now even more obvious:
- what is the thing we are building? What is the core product or bug fix or feature?
- what are we _not_ building? What do we not care about?
- do I understand the code enough to guide design and architecture?
- can I guide dev and make good choices when it’s far outside my expertise but I know enough to “smell” when things are going off the rails
It’s a weird time
I think the fatigue is specifically about opacity. When you review agent output, you're not just checking correctness—you're trying to reconstruct what state the agent was in when it made each call. That reconstruction is the expensive part. If you already know the agent's tool pattern and drift trajectory while it ran, review shifts from guessing to confirming. Still work, but a different kind.
In agent mode, IMO, the sweet spot is 2-3 concurrent tasks/sessions. You don't want to sit waiting for it, but you don't want to context-switch across more than a couple of contexts yourself.
That sounds exhausting: having to nonstop prompt and review without a second to stop and think.
There is nothing dictating how long you stop and think for.
Until that becomes the metric measured in performance reviews.
the exhaustion pattern i've noticed is specific: it's not writing the code that's hard, it's integration. the model produces clean isolated functions fast. the part that takes mental energy is knowing where those functions should live, what they'll break when they change, and why that architectural decision was made 3 months ago. that context isn't in the code -- it's in your head.
so the bottleneck shifts. before: generating code is slow, integration is easy (you built it). after: generating code is instant, integration requires the same mental load as before because the codebase complexity didn't decrease -- it just grew faster.
One way to help, I think, is to take advantage of prompt libraries. Claude makes this easy via Skills (which can be augmented via Plugins). Since skills themselves are just plain text with some front matter, they're easy to update and improve, and you can reuse them as much as you like.
There's probably a Codex equivalent, but I don't know what it is.
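To illustrate the "plain text with some front matter" point: a Claude skill is roughly a Markdown file with a small YAML header. Something like this (the name, description, and steps here are invented for illustration, not taken from any real skill):

```markdown
---
name: pre-merge-review
description: Review a diff for naming, dead code, and missing tests before merging.
---

When asked to review a change:

1. Run the project's linters and tests, and note any new failures.
2. List functions added by the diff that nothing reachable calls.
3. Suggest renames only where a name actively misleads.
```

Because it's just text, improving a skill is the same workflow as improving a prompt: edit, rerun, compare.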
I really appreciate the author for writing this.
I learned years ago that when I write code after 10 PM, I go backward instead of forward. It was easy to see, because the test just wouldn't pass, or I'd introduce several bugs that each took 30 minutes to fix.
I'm learning now that it's no different, working with agents.
Of course. Any scenario where you are expected to deliver results using non-deterministic tooling is going to be painful and exhausting. Imagine driving a car that might dive one way or the other of its own accord, with controls that frequently changed how they worked. At the end of any decently sized journey you would be an emotional wreck - perhaps even an actual wreck.
I mostly do 2-3 agents yoloing with self "fresh eyes" review
LLMs shift you from a software engineer to a management role, with all of the overhead that entails.
LLM coding has made programming feel like playing Factorio to me. It's simultaneously much more addictive and much more strenuous than it's ever been for me before. Each commit feels like moving to a new link in the supply chain, but each link is imperfect, so I have to drop back down to debug them. At the end of a long evening, "one more assembly line" and "one more prompt" feel exactly the same.
I really hate having to wait 20 seconds to a minute between every interaction with the LLM— I end up alternating between prompting and doom scrolling for several hours, a viciously unsatisfying cycle. (I know I could probably fix this by having multiple agents running at once, but context switching to that level also seems like a stressful doom-scroll-esque experience lol)
It seems to me that LLM is a tool after all. One needs to learn to use it effectively.
> If I reach the point where I am not getting joy out of writing a great prompt...
Man, I envy you. For me, the joy comes from writing good code that I can be proud of. I never got ANY joy from writing a prompt.
I mean, it is a means to an end (getting the LLMs to do the boring stuff) and so it is a necessary evil. Also, the LLMs are at times amazing and at times dumb as rocks even for very similar prompts. That drives me crazy because it feels I have no control over those things.
This is exactly what was needed. Seamlessly transitioning from manual inspection in the Elements/Network panels to agent-led investigation is going to save so much 'context-setting' time.
Does anyone else see this as dystopian? Someone is unironically writing about how exhausted they are, up at night thinking about how they can be a better good-boy at prompting the LLM, and reminding us that we shouldn't cope by blaming the AI or its supposed limitations (context size, etc.). This is not a dig at the author. It just seems crazy that this is an unironic post. It's like we are gleefully running to the "Laughterhouse," each reminding our smiling fellow passengers not to be annoyed at the driver if he isn't getting us there fast enough, without realizing it's the Slaughterhouse (yes, I am stealing the reference).
Another way you can read this is as a new cult member chiding himself whenever he has an intrusive thought that Dear Leader may not be perfect after all.
Oh, entirely. But the hype cycle is such that if you find a legitimate criticism or run into the hard limits of human cognition (there are real limits to multitasking), a lot of people blame themselves.
My pet theory is we haven't figured out what the best way to use these tools are, or even seen all the options yet. But that's a bigger topic for another day.
With the trend going towards devs coordinating multiple agents at once, I am very curious to see how cognitive load increases due to the multitasking. We know multitasking reduces productivity and increases the likelihood of mistakes. Cal Newport talked about how important it is to engage in "deep work." We're going in the opposite direction.
Yup, and we are wasting our weekends worried about keeping pace in an imagined red queen's race. Another similar post today.
Is it not LLM-generated? Or at this point are they just mimicking LLM writing if they ever do it manually?
Yeah I thought the same thing. Kinda eerie reading all that about prompts when not that long ago it would all pertain to actual coding.
Not at all.
I mean, how often do we feel the same thing about the compiler?
What the compiler will do is highly predictable. What an LLM will produce considerably less so. That is the problem.
Never? I can rely on the compiler to pretty much do the same thing every time. If I broke some rule, it points out where and what it is.
I don't feel this? When my code breaks, I'm more likely to get frustrated with myself.
The only time I've felt something akin to this with a compiler is when I was learning Rust. But that went away after a week or two.
There's nothing more annoying than the feeling of "oh FFS why you doing that?!".
It's amazing how right and wrong LLMs can be in the output they produce. Personally, the variance is too much for me... I can't stand it when they get the most basic stuff wrong. I much prefer doing things without output from an LLM.
I have just started reading books while the agents are working and only checking in every 20 minutes or so. I'm considering moving all the work onto my home desktop and just using Tailscale with a terminal emulator on the iPad and iPhone to get out of the house a bit more. I spend a lot of the morning working on specs; once they are all ready, I set the agents to work.
Exhausting in a GOOD way! I've been using Codex to review my existing Godot components framework at [0] and the project's modularity suits AI well: It can focus on one file/subsystem at a time. I don't use it to generate or edit any code but it has helped me catch a lot of bugs that would have taken me a long time on my own. I've been more productive than ever but boy it never seems to run out of flaws to point out in everything! I often have to ask it to overlook some issues/limitations as intentional so I can catch a break.
>They're dumbing down the model to save money. Context rot!
Coldtea's law: "Never attribute to context rot that which is adequately explained by cost-cutting".
[flagged]
LLM spam, ironically
We've banned the account.
All: it's good to use AI in good ways, but posting generated comments to HN is a bad way and not allowed here.
>Reviewing LLM output requires constant context-switching between "what does this code do" and "is this what I actually wanted."
Best way I've seen it framed
I've always preferred brownfield work. In the past I've said "it's easier to be an editor than an author" to describe why. I think you're on to something. For me the new structure's cognitively easier, but it's not faster. Might even be slightly slower.
It takes all kinds, I suppose.
Actually I find verification pretty lightweight, because I tend to decompose tasks intended for AI to a level where I already know the "shape" of the code in my head, as well as what the test cases should look like. So reviewing the generated code and tests for me is pretty quick because it's almost like reading a book I've already read before, and if something is wrong it jumps out quickly.
That said I have a different theory for why AI coding can be exhausting: the part where we translate concrete ideas into code, where the flow state usually occurs, is actually somewhat meditative and relaxing. But with that offloaded to AI, we're left mostly alternating between the cognitively intense idea-generation / problem-solving phases, and the quick dopamine hits of seeing things work: https://news.ycombinator.com/item?id=46938038
Great post.
So the people who are claiming huge jumps in productivity in the workplace, how are they dealing with this 'review fatigue'?
What we once called “vibe coding” is increasingly known as just coding. There’s no reasonable way to review thousands of lines of code a day and many organizations simply aren’t. No review fatigue there! Just a black box of probable spaghetti.
I notice myself not reviewing in depth, and I assume many many others are not either.
My intuition is that they aren't really doing it.
Somatic experiencing techniques.
[dead]
I feel like I've seen this exact bot comment multiple times
Is this AI? You've copied the best comment and put it into AI-speak.
You saw the em dash
And the 4 other comments the account added to different threads within the same minute.