Just to be clear, the article is NOT criticizing this. To the contrary, it's presenting it as expected, thanks to Solow's productivity paradox [1].
Which is that information technology similarly (and seemingly shockingly) didn't produce any net economic gains in the 1970's or 1980's despite all the computerization. It wasn't until the mid-to-late 1990's that information technology finally started to show clear benefit to the economy overall.
The reason is that investing in IT was very expensive, there were lots of wasted efforts, and it took a long time for the benefits to outweigh the costs across the entire economy.
And so we should expect AI to look the same -- it's helping lots of people, but it's also costing an extraordinary amount of money, and for now the gains for the people it helps are at least outweighed by the people wasting time with it and by its expense. But we should recognize that it's very early days, and that productivity will rise with time and costs will come down as we learn to integrate it with best practices.
The comparison seems flawed in terms of cost.
A Claude subscription is 20 bucks per worker if using personal accounts billed to the company, which is not very far from common office tools like Slack. Onboarding a worker to Claude or ChatGPT is ridiculously easy compared to teaching a 1970’s manual office worker to use an early computer.
Larger implementations like automating customer service might be more costly, but I think there are enough short term supposed benefits that something should be showing there.
What if LLMs are optimizing the average office worker's productivity but the work itself simply has no discernable economic value? This is argued at length in Grebber's Bullshit Jobs essay and book.
This is an underrated take. If you make someone 3x faster at producing a report nobody reads, you've improved nothing. The real gains from AI show up when it changes what work gets done, not just how fast existing work happens. Most companies are still in the "do the same stuff but with AI" phase.
And if you make someone 3x faster at producing a report that 100 people have to read, but it now takes 10% longer to read and understand, you’ve lost overall value.
You are forgetting that they are now going to use AI to summarize it back.
This is one of my major concerns about people trying to use these tools for 'efficiency'. The only plausible value in somebody writing a huge report and somebody else reading it is information transfer. LLM's are notoriously bad at this. The noise to signal ratio is unacceptably high, and you will be worse off reading the summary than if you skimmed the first and last pages. In fact, you will be worse off than if you did nothing at all.
Using AI to output noise and learn nothing at breakneck speeds is worse than simply looking out the window, because you now have a false sense of security about your understanding of the material.
Relatedly, I think people get the sense that 'getting better at prompting' is purely a one-way issue of training the robot to give better outputs. But you are also training yourself to only ask the sorts of questions that it can answer well. Those questions that it will no longer occur to you to ask (not just of the robot, but of yourself) might be the most pertinent ones!
Yep. The other way it can have net no impact is if it saves thousands of hours of report drafting and reading but misses the one salient fact buried in the observations that could actually save the company money. Whilst completely nailing the fluff.
> LLM's are notoriously bad at this. The noise to signal ratio is unacceptably high…
I keep seeing this statement in threads about AI, and maybe it’s just from you, but high SNR is a good thing.
See https://en.wikipedia.org/wiki/Signal-to-noise_ratio
I think the rest of your post is very valid. It’s the mental equivalent of this article https://news.ycombinator.com/item?id=47049088
Except they didn't say signal-to-noise, they said noise-to-signal. And if the NSR is unacceptably high, that means the SNR is unacceptably low.
Two inverses do make a right, it seems.
Hah, you got me there! I'll try to keep the ratio flipped correctly next time ;)
I read that article recently, so the similarities might not be entirely coincidental. 'JPEG of thought' is gonna stay in my vocabulary for a while.
Hehe, yeah there's some terms that just are linguistically unintuitive.
"Skill floor" is another one. People generally interpret that one as "must be at least this tall to ride", but it actually means "amount of effort that translates to result". Something that has a high skill floor (if you write "high floor of skill" it makes more sense) means that with very little input you can gain a lot of result. Whereas a low skill floor means something behaves more linearly, where very little input only gains very little result.
Even though it's just the antonym, "skill ceiling" is much more intuitive in that regard.
Are you sure about skill floor? I've only ever heard it used to describe the skill required to get into something, and skill ceiling describes the highest level of mastery. I've never heard your interpretation, and it doesn't make sense to me.
I've also never heard that use of "skill floor" before. The "floor/ceiling" descriptors imply min/max constraints.
> LLM's are notoriously bad at this. The noise to signal ratio is unacceptably high
I could go either way on the future of this, but if you take the argument that we're still early days, this may not hold. They're notoriously bad at this so far.
We could still be in the PC DOS 3.X era in this timeline. Wait until we hit the Windows 3.1, or 95 equivalent. Personally, I have seen shocking improvements in the past 3 months with the latest models.
Personally I strongly doubt it. Since the nature of LLMs gives them no real grasp of semantic content or context, I believe it is inherently a tool unsuited for this task. As far as I can tell, it's a limitation of the technology itself, not of the amount of power behind it.
Either way, being able to generate or compress loads of text very quickly with no understanding of the contents simply is not the bottleneck of information transfer between human beings.
Yeah, definitely more skeptical for communication pipelines.
But for coding, the latest models are able to read my codebase for context, understand my question, and implement a solution with nuance, using existing structures and paradigms. It hasn't missed since January.
One of them even said: "As an embedded engineer, you will appreciate that ...". I had never told it that was my title, it is nowhere in my soul.md or codebase. It just inferred that I, the user, was one. Based on the arm toolchain and code.
It was a bit creepy, tbh. They can definitely infer context to some degree.
> We could still be in the PC DOS 3.X era in this timeline. Wait until we hit the Windows 3.1, or 95 equivalent. Personally, I have seen shocking improvements in the past 3 months with the latest models.
While we're speculating, here's mine: we're in the Windows 7 phase of AI.
IOW, everything from this point on might be better tech, but is going to be worse in practice.
I would like to see the day when the context size is in gigabytes or tens of billions of tokens, not RAG or whatever, actual context.
Context size helps some things but generally speaking, it just slows everything down. Instead of huge contexts, what we need is actual reasoning.
I predict that in the next two to five years we're going to see a breakthrough in AI that doesn't involve LLMs but makes them 10x more effective at reasoning and completely eliminates the hallucination problem.
We currently have "high thinking" models that double and triple-check their own output and we call that "reasoning" but that's not really what it's doing. It's just passing its own output through itself a few times and hoping that it catches mistakes. It kind of works, but it's very slow and takes a lot more resources.
What we need instead is a reasoning model that can be called upon to perform logic-based tests on LLM output or even better, before the output is generated (if that's even possible—not sure if it is).
My guess is that it'll end up something like a "logic-trained" model instead of a "shitloads of raw data trained" model. Imagine a couple terabytes of truth statements like, "rabbits are mammals" and "mammals have mammary glands." Then, whenever the LLM wants to generate output suggesting someone put rocks on pizza, it fails the internal truth check, "rocks are not edible by humans" or even better, "rocks are not suitable as a pizza topping" which it had placed into the training data set as a result of regression testing.
Over time, such a "logic model" would grow and grow—just like a human mind—until it did a pretty good job at reasoning.
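To make the idea a bit more concrete, here's a toy sketch of the kind of check I'm imagining (everything here is invented for illustration: the facts, the triple format, the exact-match lookup; a real system would need actual inference rather than lookups):

    # Toy illustration only: a hand-written store of negated "truth
    # statements" and a naive check that rejects any claim the store
    # explicitly rules out. A real logic model would need inference,
    # not exact-match lookups.
    NEGATED_CLAIMS = {
        ("rocks", "are", "edible by humans"),
        ("rocks", "are", "suitable as a pizza topping"),
    }

    def passes_truth_check(claim):
        """Reject a claim only if the logic model explicitly negates it."""
        return claim not in NEGATED_CLAIMS

    # The LLM wants to suggest putting rocks on pizza, which implies this
    # claim; it fails the check, so the output would be regenerated.
    proposed = ("rocks", "are", "suitable as a pizza topping")
    if not passes_truth_check(proposed):
        print("rejected:", proposed)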
Wasn’t this idea the basic premise of Coq? Why didn’t it work?
> I would like to see the day when the context size is in gigabytes or tens of billions of tokens, not RAG or whatever, actual context.
Might not make a difference. I believe we are already at the point of negative returns - doubling context from 800k tokens to 1600k tokens loses a larger percentage of context than halving it from 800k tokens to 400k tokens.
First impressions are everything. It's going to be hard to claw back good will without a complete branding change. But... where do you go from 'AI'???
It reminds me of that Apple ad where a guy just rocks up to a meeting completely unprepared and spits out an AI summary to all his coworkers. Great job Apple, thanks for proving Graeber right all along.
One thing AI should eliminate is the "proof of work" reports. Sometimes the long report is not meant to be read, but used as proof somebody has thoroughly thought through various things (captured by, for instance, required sections).
When AI is doing that, it loses all value as a proof of work (just as it does for a school report).
"My AI writes for your AI to read" is low value. But there is probably still some value in "My AI takes these notes and makes them into a concise readable doc".
> Those questions that it will no longer occur to you to ask (not just of the robot, but of yourself) might be the most pertinent ones!
That is true, but the same happened with Google. You can see why some people want to go back to the "read the book" era, when you didn't have Google to query anything and had to come up with the real questions yourself.
> Using AI to output noise and learn nothing at breakneck speeds is worse than simply looking out the window, because you now have a false sense of security about your understanding of the material.
i may put this into my email signature with your permission, this is a whip-smart sentence.
and it is true. i used AI to "curate information" for me when i was heads-down deep in learning mode, about sound and music.
there was enough all-important info being omitted, i soon realized i was developing a textbook case of superficial, incomplete knowledge.
i stopped using AI and did it all over again through books and learning by doing. in retrospect, i'm glad to have had that experience because it taught me something about knowledge and learning.
mostly that something boils down to RTFM. a good manual or technical book written by an expert doesn't have a lot of fluff. what exactly are you expecting the AI to do? zip the rar file? it will do something, it might look great, lossless compression it will be not.
P.S. not a prompt skill issue. i was up to date on cutting edge prompting techniques and using multiple frontier models. i was developing an app using local models and audio analysis AI-powered libraries. in other words i was up to my neck immersed in AI.
after i grokked as much as i could, given my limited math knowledge, of the underlying tech from reading the theory, i realized the skill issue invectives don't hold water. if things break exactly in the way they're expected to break as per their design, it's a little too much on the nose. even appealing to your impostor syndrome won't work.
P.P.S. it's interesting how a lot of the slogans of the AI party are weaponizing trauma triggers or appealing to character weaknesses.
"hop on the train, commit fully, or you'll be left behind" > fear of abandonment trigger
"pah, skill issue. my prompts on the other hand...i'm afraid i can't share them as this IP is making me millions of passive income as we speak (i know you won't probe further cause asking a person about their finances is impolite)" > imposter syndrome inducer par excellence, also FOMO -- thinking to yourself "how long can the gold rush last? this person is raking it in!! what am i doing? the miserable sod i am"
1. outlandish claims (Claude writes ALL the code) no one can seem to reproduce, and indeed everyone non-affiliated is having a very different experience
2. some of the darkest patterns you've seen in marketing are the key tenets of the gospel
3. it's probably a duck.
i've been 100% clear on the grift since October '25. Steve Eisman of the "Big Short" was just hopping onto the hype train back then. i thought...oh. how much analysis does this guru of analysts really make? now Steve sings of AI panic and blood in the streets.
these things really make you think, about what an economy even is. it sure doesn't seem to have a lot to do with supply and demand, products and services, and all those archaisms.
So what we now have is a very expensive and energy-intensive method for inflating data in a lossy manner. Incredible.
Remarkably it has only cost a few trillion dollars to get here!
don't forget the insane costs to stay here
This reminds me of that "telephone" kids game.
For all the technology we develop, we rarely invest in processes. Once in a blue moon some country decides to revamp its bureaucracy, when it should really be a continuous effort (in the private sector too).
OTOH, what happens continuously is that technology is used to automate bureaucracy and even allows it to grow in complexity.
So a circular economy in which you add mistakes
An economy of the LLMs, by the LLMs, for the LLMs, shall not perish from the Earth.
Rather poignant actually. By replacing people with LLM's, you've just made the economy as a whole something which can be owned.
See, this is an opportunity. Company provides AI tool, monitors for cases where AI output is being fed as AI input. In such cases, flag the entire process for elimination.
And like the article says, early computerization produced way more output than anybody could handle. In my opinion, we realized the true benefits of IT when ordinary users were able to produce for themselves exactly the computations they needed. That is, when spreadsheets became widespread. LLMs haven’t had their spreadsheet moment yet; their outputs are largely directed outward, as if more noise meant more productivity.
Maybe the take is that those reports that people took a day to write were read by nobody in the first place and now those reports are being written faster and more of them are being produced but still nobody reads them. Thus productivity doesn't change. The solution is to get rid of all the people who write and process reports and empower the people who actually produce stuff to do it better.
The managerial class are like cats and closed doors.
Of course they don't read the reports; who has time to read them? But don't even think about not sending the report, they like to have the option of reading it if they choose to do so.
A closed door removes agency from a cat, an absent report removes agency from a manager.
> The solution is to get rid of all the people who write and process reports and empower the people who actually produce stuff to do it better.
That’s the solution if you’re the business owner.
That’s definitely not the solution if you’re a manager in charge of this useless activity; in fact, you should increase the amount of reports being written as much as humanly possible. The more underlings under you, the more power and prestige.
This is the principal-agent problem writ large. As the comment mentioned above, also see Graeber’s Bullshit Jobs essay and book.
> Thus productivity doesn't change.
Indeed, productivity has decreased, because now there’s more output that is waste and you are paying to generate that excess waste.
Not necessarily. You could have 100 FTE on reports instead of 300 FTE in a large company like a bank. That means 200 people who'd normally go into reporting jobs over the next decade will go into something else, producing something else on top of the reports that continue to be produced. The sum of this is more production.
Looking at job numbers that seems to be happening. A lot less employment needed, freeing up people to do other things.
> Looking at job numbers
That is a wild take given the recession we're basically in from bad US policy like tariffs.

I’m not in favor of these tariffs. At all. However, it seems that they haven’t had such an impact yet on the economy, at least regarding consumer prices. You’d expect much larger inflation given the tariffs IIUC.
My current understanding of the general consensus is that many companies have been eating the tariffs with the hope SCOTUS will strike them. If they are upheld, prices will likely rise significantly
Actually job numbers are depressed (hiring recession) and GDP numbers are still way up, both precisely due to the AI investment. More output with fewer people.
Wild take to cite a recession when last quarter growth was 4.4%.
"The economy" is not GDP.
What happens if (and I suspect this is increasingly the case now) you make someone 3x faster at producing a report that nobody reads, and the people who weren't reading it in person before now use LLMs to not read it for them?
Then everyone saves time, which they can spend producing more things which other people will not read and/or not reading the things that other people produce (using llms)?
Productivity through the roof.
Now you know why GDP is higher than ever and people are poorer than ever.
Mmm I can’t wait to get home and grill up some Productivity for dinner. We’ll have so much Productivity and no jobs. Hopefully our billionaire overlords deign to feed us.
> Hopefully our billionaire overlords deign to feed us.
Eat the Rich
> The real gains from AI show up when it changes what work gets done, not just how fast existing work happens.
Sadly AI is only capable of doing work that has already been done, thousands of times.
This is the natural result when the value of businesses is not strongly related to their actual output.
Stewart Butterfield calls these "hyper-realistic work-like activities".
The most hyped use cases for AI/LLM make me wonder, "why are we doing this activity to begin with? We could just not."
And the fact that you can make it 3x faster substantially increases the chances that nobody will read it in the first place.
I suspect that we are going to see managers say, "Hey, this request is BS. I'm just going to get ChatGPT to do it" while employees say, "Hey, this response is BS, I'm just going to get ChatGPT to do it" and then we'll just have ChatGPT talking to itself. Eventually someone will notice and fire them both.
"What would you say you do here?" --Office Space
What a load of nonsense, they won't be producing a report in a third of the time only to have no-one read it. They'll spend the same amount of time and produce a report three times the length, which will then go unread.
Not a phase, I’d argue that 90% of modern jobs are bullshit to keep cattle occupied and the economy rolling.
You know, that would almost be fine if everyone could afford a home and food and some pleasures.
Your claim and the claims that all white collar jobs are going to disappear in 12-18 months cannot both be true. I guess we will see.
It's possible to automate the pointless stuff without realising it's pointless.
Made me think of this.
The question is: did the fake numbers make any difference? Were the management decisions based on them better or worse?
Imgur is banned in the UK.
I recommend using https://catbox.moe/ which can even use remote-links so pasting the imgur link in it can also work.
> Imgur is banned in the UK.
It's the other way round, Imgur banned UK access so that they wouldn't have to worry about the UK's stupid, authoritarian Online "Safety" Act.
I think they can both be true. Perhaps the innovation of AI is not that it automates important work, but that it forces people to question whether the work has already been automated or is even necessary.
Well, if a lot of it is bullshit that can also be done more efficiently with AI, then 99% of white collar roles could be eliminated by the 1% using AI, and essentially both were very close to true.
Jobs you don’t notice or understand often look pointless. HR on the surface seems unimportant, but you’d notice if the company stopped having health insurance or sending your taxes to the IRS etc etc.
In the end, when jobs are done right they seem to disappear. We notice crappy software or a poorly done HVAC system, not clean carpets.
This just highlights the absurdity of having your employer responsible for your health insurance and managing your taxes for you.
These should be handled by the government, equally for all.
Moving some function to the government doesn’t eliminate the need for it. Something would still need to tell the government what you’re paid unless you’re advocating for anarchy or communism.
Also, part of that etc is doing payroll so there’s some reason for you to show up at work every day.
> These should be handled by the government, equally for all.
This is certainly possible, but it's called communism.
No. Private insurance could still be an option.
> HR on the surface seems unimportant, but you’d notice if the company stopped having health insurance or sending your taxes to the IRS etc etc.
That's not why companies have HR; sure, it's a nice side-effect, but it's not the reason for HR.
HR exists primarily to protect the company from the employees.
I emailed HR and asked what to do to best ask for leave in case of a future event (serious illness with a family member, I just wanted to be one step ahead and make sure I did everything right even in the state of grief).
HR wouldn't tell me what would be the best and most correct course of action; the only thing they said was that it was my responsibility as an employee to find out. Well, what did they think I was doing?
Side effect seems like an odd way to describe what’s going on when these functions are required for a company to operate.
Companies don’t survive if nobody is paid to show up every day or if they keep paying every single ex employee that ever worked for the company. It’s harder to attract new employees if you don’t offer competitive salaries or benefits. HR is a tiny part of most companies, but without that work being done the company would absolutely fail.
Similarly a specific ratio of flight attendants to passengers are required by the FAA in case of an emergency. Airlines use them for other stuff but they wouldn’t have nearly as many if the job was just passing out food.
> HR on the surface seems unimportant, but you’d notice if the company stopped having health insurance or sending your taxes to the IRS etc etc.
Interesting how the very example you give for "oh, this job isn't really bullshit" ultimately ends up being useless for the business itself, and exists only as a result of regulation.
No, health insurance being provided by employers, or tax withholding aren't useful things for anyone, except for the state who now offloads its costs onto private businesses.
"Only exists as a result of regulation": that statement invalidates probably a majority of modern work, and just about every legal professional.
i agree.
> Not a phase, I’d argue that 90% of modern jobs are bullshit to keep cattle occupied and the economy rolling.
Cattle? You actually think that about other people?
I think what he meant was that the top 1% ruling class is keeping those bullshit jobs around to keep the poor people (their cattle) occupied so they won't have time and energy to think and revolt.
Or for everyone in the chain of command to have people to rule over, a common want for many in leadership positions. Either way, you want to control people. And your value to your peers is the amount of people or resources you control.
It is a bullshit argument. The 1% is seeking to fire as many people as possible, and with pleasure.

We don't matter to them, one way or the other. They don't see us as a threat, just as bugs.
It seems more like they're implying it's those at the top think that about other people.
Nope, the entire statement betrays a combination of ignorance and arrogance that is best explained by them seeing most everyone else as beneath them.
Hard miss. GP is right, and your assumptions say more about you than about me. :^)
> Hard miss. GP is right, and your assumptions say more about you than about me. :^)
No. If that's the case, your statement was unclear: since you didn't specify who else thinks those people were cattle, the implication is that you think it. Especially since you prefaced your statement with "I’d argue."
And the interpretation...
> It seems more like they're implying it's those at the top think that about other people.
...beggars belief. What indication has "the top" given to show they have that kind of foresight and control? The closest is the AI-bros advocacy of UBI, which (for the record) has gone nowhere.
I had half a mind to point that out in my original comment, but didn't get around to it.
> No. If that's the case, your statement was unclear: since you didn't specify who else thinks those people were cattle, the implication is that you think it. Especially since you prefaced your statement with "I’d argue."
I never said it was clear? Two commenters got it right, two wrong, so it wasn’t THAT unobvious.
> What indication has "the top" given to show they have that kind of foresight and control? The closest is the AI-bros advocacy of UBI, which (for the record) has gone nowhere.
Tech bros selling “no more software engineers” to cost optimizers, dictatorships in US, Russia, China pressing with their heels on our freedoms, Europe cracking down on encryption, Dutch trying to tax unrealized (!) gains, do I really need to continue?
>> What indication has "the top" given to show they have that kind of foresight and control? The closest is the AI-bros advocacy of UBI, which (for the record) has gone nowhere.
> Tech bros selling “no more software engineers” to cost optimizers, dictatorships in US, Russia, China pressing with their heels on our freedoms, Europe cracking down on encryption, Dutch trying to tax unrealized (!) gains, do I really need to continue?
All those things are non sequiturs, though, some directly contradicting the statement I was responding to, as you claim it should be interpreted. If "90% of modern jobs are bullshit to keep cattle occupied" that implies "the top" deliberately engineered (or at least maintains) an economy where 90% jobs are bullshit (unnecessary). But that's obviously not the case, as the priority of "the top" is to gather more money to themselves in the short to medium term, and they very frequently cut jobs to accomplish that. "Tech bros selling “no more software engineers” to cost optimizers," is a new iteration of that. If "the top" was really trying "to keep cattle occupied" they wouldn't be cutting jobs left and right.
We don't live in a command economy, there's no group of people with an incentive to create "bullshit" jobs "to keep cattle occupied."
My observation is about what your assumptions say about you, and that's not a miss.
Nobody really understands a job they haven't done themselves, and "arguing" that 90% of them are "bullshit" has no other possible explanation than a combination of ignorance (you don't understand the jobs well enough to judge whether they are useful) and arrogance (you think you can make that judgement better than the 90% of people doing those jobs).
> Nobody really understands a job they haven't done themselves, and "arguing" that 90% of them are "bullshit" has no other possible explanation than a combination of ignorance (you don't understand the jobs well enough to judge whether they are useful) and arrogance (you think you can make that judgement better than the 90% of people doing those jobs).
That's fine if you disagree, I'm not aiming to be the authority on bullshit jobs.
This doesn't change the fact that you and I are cattle for corpo/neo-feudals.
> This is an underrated take. If you make someone 3x faster at producing a report nobody reads, you've improved nothing
In the private market are there really so many companies delivering reports no one reads? Why would management keep at it then? The goal is to maximize profits. Now sure, there are pockets of inefficiency even in the private sector, but surely not that much - whatever the companies are doing, someone is buying it from them, otherwise they fail. That's capitalism. Yes, there are perhaps 20% of employees who don't pull their weight, but it's not the majority.
I don't know what to tell you aside from "just go and work at a large private company and see".
I'm not smart enough to understand the macro-economics or incentive structures that lead to this happening, but I've seen many 100+ person teams whose output is something you could reasonably expect from a 5-person team.
Sorry I meant to say the private sector, not sure if it changes the argument though since you seem to believe inefficiencies are all over the place - in public companies, private etc. I've worked in tech all my life and in general if you were grossly inefficient you'd get fired. Now tech may be a high efficiency / low bullshit industry but I'm assuming in general if you are truly shit at your job you'd get fired no matter the industry.
Many of these companies are fairly close to the mechanisms of credit creation. That distortion can make a market work very counterintuitively.
> In the private market are there really so many companies delivering reports no one reads? Why would management keep at it then?
In finance, you have to produce truly astounding amounts of regulatory reports that won't be read... until there is a crash, or a lawsuit, or an investigation etc. And then they better have been right!
Got it that's a fair point - you're saying many companies deal with heaps of regulations and expediting that isn't really adding to productivity. I agree with you here. But even if 50% of what a company does is shit no one cares about - surely there's the other 50% that actually matters - no? Otherwise how does the company survive financially.
The implication is that companies in a private market can't possibly be hugely inefficient for irrational reasons that can ultimately be self-harming.
An interesting take.
They can be irrational and ineffective. Nevertheless, if LLMs are useful, they would still earn more than before.
Regardless of their effectiveness, it means LLMs are not useful for them.
I used the term "private market" when I actually meant the private sector. I just mean all labor that isn't government owned - public companies, private companies etc. So yes - in a reasonably functioning capitalist market (which the U.S still is in my eyes) I expect gross inefficiencies to not be prevalent.
> So yes - in a reasonably functioning capitalist market (which the U.S still is in my eyes) I expect gross inefficiencies to not be prevalent.
I am not sure that is true, though. Assume for a moment that Google wasted 50% of its profits. Truly, a huge inefficiency. However, would that make it likely some other corp could take their search/ad market share from them? I doubt it, given the abyss of a moat.
True. Therefore, what?
One could say: True, therefore search is not a reasonably functioning capitalist market.
Yeah, I know, this can turn into "no true capitalist market". Still, it seems reasonable to say that many markets work in a certain kind of way (with lots of competition), and search is not one of those markets.
The goal might be to maximize profits, but that only means that managers want to make sure everyone further down the chain is doing whatever they identify to be the best way to accomplish that. How do you do that? Reports.
> In the private market are there really so many companies delivering reports no one reads?
Just this month the hospital in my municipality submitted an application to put in a new concrete pad for a new generator beside the old one, which they, per the application, intend to retire/remove and replace with a storage shed on its pad once the new one is operational.

Full page intro about how the hospital is saving the world, such a great thing for the community, and all manner of vapid buzzword bullshit. Dozens of pages of re-hashed bullshit about the environmental conditions, water flows downhill, etc, etc (i.e. basically reiterating stuff from when they built the facility), etc, etc.
God knows how many people and hours it took to compile it (we'll ignore the labor wasted in the public sector circulating and reading it).
All for a project that 50yr ago wouldn't have required 1/100th of the labor expenditure just to be kicked off. All that labor, squandered on nothing that makes anyone any richer. No goods made. No services rendered.
Why should hospitals be for-profit organizations? Sounds like all the wrong incentives.
>Why should hospitals be for-profit organizations? Sounds like all the wrong incentives.
You're conflating private ownership with the organization's nominal financial structure. It has nothing to do with the structural model of the organization and everything to do with resources wasted on TPS reports. This waste has to come from somewhere. Something is necessarily being forgone, whether that's profit, reinvestment in the organization, or a competitive edge that benefits the customer (e.g. lower cost, or higher quality for the same cost). The same is true for a for-profit company, or any other organization.
FWIW the hospital is technically nonprofit as is typical for hospitals. And I assure you, they still have all the wrong incentives despite this.
The cost is it taking America a billion dollars to build what China can for 50 million. That's ultimately where the waste accumulates.
Best description of America's biggest long term problem I've read. This shit is exactly the reason we can't build anything in America anymore.
I find that highly unlikely, coding is AI's best-value use case by far. Right now office workers see marginal benefits, but it's not like it's an order of magnitude difference. AI drafts an email, you have to check and edit it, then send it. In many cases it's a toss-up whether that actually saved time, and then if it did, it's not like the pace of work is breakneck anyway, so the benefit is some office workers have a bit more idle time at the desk because you always tap some wall that's out of your control. Maybe AI saves you a Google search or a doc lookup here and there. You still need to check everything and it can cause mistakes that take longer too. Here's an example from today.
Assistant is dispatching a courier to get medical records. AI autocompletes to include the address. Normally they wouldn't put the address, the courier knows who we work with, but AI added it, so why not. Except it's the wrong address, because it's for a different doctor with the same name. At least they knew to verify it, but still, mistakes like this happening at scale make the other time savings pretty close to a wash.
Coding is a relatively verifiable and strict task: it has to pass the compiler, it has to pass the test suite, it has to meet the user's requests.
There are a lot of white-collar tasks that have far lower quality and correctness bars. "Researching" by plugging things into google. Writing reports summarizing how a trend that an exec saw a report on can be applied to the company. Generating new values to share at a company all-hands.
Tons of these that never touch the "real world." Your assistant story is like a coding task - maybe someone ran some tests, maybe they didn't, but it was verifiable. No shortage of "the tests passed, but they weren't the right test, this broke some customers and had to be fixed by hand" coding stories out there like it. There are pages and pages of unverifiable bullshit that people are sleepwalking through, too, though.
Nobody knows whether those things helped or hurt in the first place, so nobody will ever even notice a hallucination.
But everyone in all those fields is going to be trying really really hard to enumerate all the reasons it's special and AI won't work well for them. The "management says do more, workers figure out ways to be lazier" see-saw is ancient, but this could skew far towards the "management demands more from fewer people" end of the spectrum for a while.
Code may have to compile but that's a lowish bar and since the AI is writing the tests it's obvious that they're going to pass.
In all areas where there are fewer easy ways to judge output, there is going to be correspondingly more value in getting "good" people. Some AI that can produce readable reports isn't "good" - what matters is the quality of the work and the insight put into it, which can only be ensured by looking at the worker's reputation and past history.
We’ve had the sycophant problem for as long as people have held power over other people, and the answer has always been “put 3-5 workers in a room and make them compete for the illusion of favor.”
I have been doing this with coding agents across LLM providers for a while now, with very successful results. Grok seems particularly happy to tell Anthropic where it’s cutting corners, but I get great insights from O3 and Gemini too.
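The mechanics are simple enough to sketch; ask() below is just a hypothetical stand-in for whatever client wrapper you already use (it is not any real SDK call), and the provider list is whatever you happen to have keys for:

    # Rough sketch of cross-provider review; ask() is a placeholder,
    # not a real API call.
    def ask(provider: str, prompt: str) -> str:
        raise NotImplementedError("wire this up to your provider of choice")

    def cross_review(diff: str, providers: list[str]) -> dict[str, str]:
        """Have each model critique the same diff, then compare the answers."""
        prompt = (
            "Review this diff. List concrete problems, corners that were cut, "
            "and missing tests. Do not praise it.\n\n" + diff
        )
        return {p: ask(p, prompt) for p in providers}

    # e.g. reviews = cross_review(my_diff, ["grok", "o3", "gemini", "claude"])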
> since the AI is writing the tests it's obvious that they're going to pass
That's not obvious at all if the AI writing the tests is different than the AI writing the code being tested. Put into an adversarial and critical mode, the same model outputs very different results.
IMO the reason neither of them can really write entirely trustworthy tests is that they don't have domain knowledge so they write the test based on what the code does plus what they extract from some prompts rather than based on some abstract understanding of what it should do given that it's being used e.g. in a nuclear power station or for promoting cat videos or in a hospital or whatever.
Obviously this is only partially true but it's true enough.
It takes humans quite a long time to learn the external context that lets them write good tests IMO. We have trouble feeding enough context into AIs to give them equal ability. One is often talking about companies where nobody bothers to write down more than 1/20th of what is needed to be an effective developer. So you go to some place and 5 years later you might be lucky to know 80% of the context in your limited area after 100s of meetings and talking to people and handling customer complaints etc.
Yes, some kind of spec is always needed, and if the human programmer only has the spec in their head, then that's going to be a problem, but it's a problem for teams of humans as well.
Even if it's a different session, it can be enough. But that said, I had times where it rewrote tests "because my implementation was now different so the tests needed to be updated", so you have to prompt even that to tell it to not touch the tests.
and then verify that it obeyed the prompt!
Someone needs to build an agentic tool that does strict, enforced TDD.
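Even a crude wrapper would get most of the way there; a sketch of the enforcement loop (run_agent_implementation is a hypothetical hook into whatever agent you use; the only concrete parts are hashing the test files and shelling out to pytest):

    import hashlib
    import pathlib
    import subprocess

    def digest(test_files):
        """Hash the test files so the agent can't quietly rewrite them."""
        h = hashlib.sha256()
        for path in sorted(test_files):
            h.update(pathlib.Path(path).read_bytes())
        return h.hexdigest()

    def enforce_tdd(test_files, run_agent_implementation):
        """Lock the tests first, let the agent implement, then verify."""
        locked = digest(test_files)
        run_agent_implementation()  # hypothetical agent hook
        if digest(test_files) != locked:
            raise RuntimeError("agent modified the tests; rejecting the change")
        if subprocess.run(["pytest", "-q"]).returncode != 0:
            raise RuntimeError("tests still failing; send the agent back")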
>Coding is a relatively verifiable and strict task: it has to pass the compiler, it has to pass the test suite, it has to meet the user's requests.
Except the test suite isn't just something that appears, and the bugs don't necessarily get covered by the test suite.

The bugginess of a lot of the software I use has spiked in a very noticeable way, probably due to this.
>But everyone in all those fields is going to be trying really really hard to enumerate all the reasons it's special and AI won't work well for them.
No, not everyone. Half of them are trying to lean in to the changing social reality.
The gaslighting from the executive side, on the other hand, is nearly constant.
Not all code generates economic value. See Slack's, Jira's, etc. constant UI updates.
High-quality code that does exactly what it needs to do, and does it well, and that makes various actors and organizations far more efficient at their jobs... but their jobs are of negative economic value overall.
That makes it a perfect use case for AI, since now you don't need a dev for that. Any devs doing that would, imo, be effectively performing one of David Graeber's bullshit jobs.
LLMs might not save time but they certainly increase quality for at least some office work. I frequently use it to check my work before sending to colleagues or customers and it occasionally catches gaps or errors in my writing.
But that idealized example could also be offset by another employee who doubles their own output by churning out lower-quality unreviewed workslop all day without checking anything, while wasting other people's time.
Something I call the 'Generate First, Review Never' approach, seemingly favoured by my colleagues. It has the magical quality of increasing the overall amount of work done, because the N receivers of the low-quality document each have to spend more time reviewing, understanding, and fact-checking it.
See also: AI-Generated “Workslop” Is Destroying Productivity [1]
[1] https://hbr.org/2025/09/ai-generated-workslop-is-destroying-...
Yeah, but that's no different from any other aspect of office work, and more conventional forms of automation. Gains by one person are often offset to some extent by the laziness, inattentiveness, or ineptitude of others.
What AI has done is accelerate and magnify both the positives and the negatives.
Code is much much harder to check for errors than an email.
Consider, for example, the following Python code:

    x = (5)

vs

    x = (5,)

One is a literal 5, and the other is a single-element tuple containing the number 5. But more importantly, both are valid code.

Now imagine trying to spot that one missing comma among the 20kloc of code one so proudly claims AI helped them "write", especially if it's in a cold path. You won't see it.
> Code is much much harder to check for errors than an email.
Disagree.
Even though performing checks on dynamic PLs is much harder than on static ones, PLs are designed to be non-ambiguous. There should be exactly 1 interpretation for any syntactically valid expression. Your example will unambiguously resolve to an error in a standard-conforming Python interpreter.
On the other hand, natural languages are not restricted by ambiguity. That's why something like Poe's law exists. There's simply no way to resolve the ambiguity by just staring at the words themselves, you need additional information to know the author's intent.
In other words, an "English interpreter" cannot exist. Remove the ambiguities so that an "interpreter" becomes possible, and you'll end up with non-ambiguous, Python- or COBOL-like languages.
With that said, I agree with your point that blindly accepting 20kloc is certainly not a good idea.
Tell me you've never written any python without telling me you've never written any python...
Those are both syntactically valid lines of code. (it's actually one of python's many warts). They are not ambiguous in any way. one is a number, the other is a tuple. They return something of a completely different type.
My example will unambiguously NOT give an error because they are standard conforming. Which you would have noticed had you actually taken 5 seconds to try typing them in the REPL.
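For the record, here's what those 5 seconds get you:

    >>> x = (5)    # parentheses are just grouping here
    >>> type(x)
    <class 'int'>
    >>> y = (5,)   # the trailing comma is what makes it a tuple
    >>> type(y)
    <class 'tuple'>

A type checker like mypy can also flag the difference, but only where the value's type is actually pinned down by an annotation or by how it's used downstream.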
> Those are both syntactically valid lines of code. (it's actually one of python's many warts). They are not ambiguous in any way. one is a number, the other is a tuple. They return something of a completely different type.
You just demonstrated how hard it is to "check" an email or text message by missing the point of my reply.

> Now imagine trying to spot that one missing comma among the 20kloc of code

I assume your previous comment tries to bring up Python's dynamic typing & late binding nature and use it as an example of how it can be problematic when someone tries to blindly merge 20kloc of LLM-generated Python code.

My reply, "Your example will unambiguously resolve to an error in a standard-conforming Python interpreter.", tried to respond to the possibility of such an issue. Even though it's probably not the program behavior you want, Python, being a programming language, is 100% guaranteed to interpret it unambiguously.
I admit, I should have phrased it a bit more unambiguously than leaving it like that.
Even if it's hard, you can try running a type checker to statically catch such problems. Even if it's not possible in cases of heavy usage of Python's dynamic typing feature, you can just run it and check the behavior at runtime. It might be hard to check, but not impossible.
On the other hand, it's impossible to perform a perfectly consistent "check" on this reply or an email written in a natural language, the person reading it might interpret the message in a completely different way.
In my experience the example you give here is exactly the kind of problem that AI powered code reviews are really good at spotting, and especially amongst codebases with tens of thousands of lines of code in them where a human being might well get scrolling blindness when quickly moving around them to work.
The AI is the one which made the mistake in the first place. Why would you assume it's guaranteed to find it?
The few times I've tried giving LLMs a shot, I've had them warn me that I hadn't put some validation in, when that exact validation was exactly 1 line below where they stopped looking.
And even if it did pass an AI code review, that's meaningless anyway. It still needs to be reviewed by an actual human before putting it into production. And that person would still get scrolling blindness whether or not the ai "reviewer" actually detected the error or not.
> The AI is the one which made the mistake in the first place. Why would you assume it's guaranteed to find it?
I didn't say they were guaranteed to find it: I said they were really good at finding these sorts of errors. Not perfect: just really good. I also didn't make any assumption: I said in my experience, by which I mean the code you shared is similar to a portion of the errors that I've seen LLMs find.
Which LLMs have you used for code generation?
I mostly use claude-opus-4-6 at the moment for development, and have had mostly good experiences. This is not to say it never gets anything wrong, but I'm definitely more productive with it than without it. On GitHub I've been using Copilot for more limited tasks as an agent: I find it's decent at code review, but more variable at fixing problems it finds, and so I quite often opt for manual fixes.
And then the other question is, how do you use them? I tend to keep them on quite a short leash, so I don't give them huge tasks, and on those occasions where I am doing something larger or more complex, I tend to write out quite a detailed and prescriptive prompt (which might take 15 minutes to do, but then it'll go and spend 10 minutes to generate code that might have taken me several hours to write "the old way").
This is what I have been saying for some time. Working inside different government departments, you see this happening every day: emails and reports bouncing back and forth with no actual added value, while everyone feels extremely productive. That is why the private sector and the public sector generally don't mix well. It is also one reason why I said in some of my previous posts that LLMs could replace up to 70% of government jobs.

Edit: If anyone hasn't watched Yes Minister, you should go and watch it; it is a documentary on the UK government that is as true today as it was 40-50 years ago.
At least in my experience, there's another mechanism at play: people aren't making it visible if AI is speeding them up. If AI means a bugfix card that would have taken a day takes 15 minutes, well, that's the work day sorted. Why pull another card instead of doing... something that isn't work?
> but the work itself simply has no discernable economic value? This is argued at length in Grebber's Bullshit Jobs essay and book.
That book was very different than what I expected from all of the internet comment takes about it. The premise was really thin and didn't actually support the idea that the jobs don't generate value. It was comparing to a hypothetical world where everything is perfectly organized, everyone is perfectly behaved, everything is perfectly ordered, and therefore we don't have to have certain jobs that only exist to counter other imperfect things in society.
He couldn't even keep that straight, though. There's a part where he argues that open source work is valuable but corporate programmers are doing bullshit work that isn't socially productive because they're connecting disparate things together with glue code? It didn't make sense and you could see that he didn't really understand software, other than how he imagined it fitting into his idealized world where everything anarchist and open source is good and everything corporate and capitalist is bad. Once you see how little he understands about a topic you're familiar with, it's hard to unsee it in his discussions of everything else.
That said, he still wasn't arguing that the work didn't generate economic value. Jobs that don't provide value for a company are cut, eventually. They exist because the company gets more benefit out of the job existing than it costs to employ those people. The "bullshit jobs" idea was more about feelings and notions of societal impact than economic value.
> There's a part where he argues that open source work is valuable but corporate programmers are doing bullshit work that isn't socially productive because they're connecting disparate things together with glue code?
I don't know if maybe he wasn't explaining it well enough, but that kind of reasoning makes some sense.
A lot of code is written because you want the output from Foo to be the input to Bar and then you need some glue to put them together. This is pretty common when Foo and Bar are made by different people. With open source, someone writes the glue code, publishes it, and then nobody else has to write it because they just use what's published.
In corporate bureaucracies, Company A writes the glue code but then doesn't publish it, so Company B which has the same problem has to write it again, but they don't publish it either. A hundred companies are then doing the work that only really needed to be done once, which makes for 100 times as much work, a 1% efficiency rate and 99 bullshit jobs.
"They exist because the company gets more benefit out of the job existing than it costs to employ those people."
Sure, but there's no such thing as "the company." That's shorthand - a convenient metaphor for a particular bunch of people doing some things. So those jobs can exist if some people - even one person - gets more benefit out of the job existing than it costs that person to employ them. For example, a senior manager padding his department with non-jobs to increase headcount, because it gives him increased prestige and power, and the cost to him of employing that person is zero. Will those jobs get cut "eventually"? Maybe, but I've seen them go on for decades.
Hmmm, I got something different. I thought that Bullshit Jobs was based on people who self-reported that their jobs were pointless. He detailed these types of jobs, the negative psychological impact this can have on employees, and the kicker was that these jobs don't make sense economically (the bureaucratization of the health care and education sectors, for example), in contrast to so many other professions that actually are useful. Other examples were status-symbol employees, sycophants, duct-tapers, etc.
I thought he made a case for both societal and economic impact.
> They exist because the company gets more benefit out of the job existing than it costs to employ those people.
Not necessarily, I’ve seen a lot of jobs that were just flying under the radar. Sort of like a cockroach that skitters when light is on but roams freely in the dark.
It's just a self-built UBI.
> It was comparing to a hypothetical world where everything is perfectly organized, everyone is perfectly behaved, everything is perfectly ordered, and therefore we don't have to have certain jobs that only exist to counter other imperfect things in society.
> Jobs that don't provide value for a company are cut, eventually.
Uhm, seems like Graeber is not the only one drawing conclusions from a hypothetical perfect world
> The "bullshit jobs" idea was more about feelings and notions of societal impact than economic value.
But he states that expressis verbis, so your discovery is not that spectacular.
Although he gives examples of jobs, or some aspects of jobs, that don't help to deliver what specific institutions aim to deliver. Example would be bureaucratization of academia.
Graeber’s best book is his ethnography “Lost People” and it’s one of his least read works. Bullshit Jobs was never intended to be read as seriously as it is criticized.
Honestly this is how every critique of Graeber goes in my experience: As soon as his works are discussed beyond surface level, the goalposts start zooming around so fast that nothing productive can be discussed.
I tried to respond to the specific conversation about Bullshit Jobs above. In my experience, the way this book is brought up so frequently in online conversations is used as a prop for whatever the commenter wants it to mean, not what the book actually says.
I think Graeber did a fantastic job of picking "bullshit jobs" as a topic because it sounds like something that everyone implicitly understands, but how it's used in conversation and how Graeber actually wrote about the topic are basically two different things
Well said - thanks for the reply.
> What if LLMs are optimizing the average office worker's productivity but the work itself simply has no discernable economic value?
I think broadly that's a paradoxical statement; improving office productivity should translate to higher gdp; whatever it is you're doing in some office - even if you're selling paper or making bombs, if you're more productive it means you're selling more (or using less resources to sell the same amount); that should translate to higher gdp (at least higher gdp per worker, there's the issue of what happens to gdp when many workers get fired).
Exactly. Even if they're doing bullshit it's a "bullshit make number go up" situation.
Society as a whole is no better off since no value or wealth was generated, but the number did go up.
A whole bunch of our economy is broken windows like this to varying degrees.
And in the type of work where AI arguably yields productivity gains, the workers have high agency and may pay for their own tooling without telling their employers. Case in point: me. I have access to Copilot via my employer but don't use it because I prefer my self-paid ChatGPT subscription. If the AI lift in productivity is measured on the condition that I use Copilot, then the resulting metric misses my AI usage entirely and my productivity improvements are not attributed to their real cause.
> the work itself simply has no discernable economic value
i'm going to need you to go ahead and come in on sunday
David Graeber is the correct spelling. https://davidgraeber.org/books/
That would be a way to get "the right answer."
i.e. it's not the LLM, it's that they're not being used properly.
I get accused of the no true Scotsman argument because I think agile can be done right, for example. Is work bullshit because an LLM doesn't help it?
But they're not optimizing the average worker's productivity. That's a silicon valley talking point. The average worker, IF they use AI, ends up proofreading the text for the same amount of time as it would take to write the text themselves.
And it is this lowly commenter's opinion that proofreading for accuracy and clarity is harder than writing it yourself and defending it later.
I think it’s more likely that the same amount of work is getting done, just it’s far less taxing. And that averages are funny things, for developers it’s undeniably a huge boost, but for others it’s creating friction.
I think this is extremely common and nobody wants to admit to it!
And the reason the position and the busy work exist is to have someone who is knowledgeable on the topic/relationship/requirements/whatever for the edge cases that come up (you don't pay me to push the button, you pay me to know when to push the button). AI could be technically filling a role while defeating the whole point (familiarity, background knowledge) for a lot of these roles.
We made an under-the-radar optimization in a data flow in my company. A given task is now much more freshData-assisted than it used to be.
Was a LLM used during that optimization? Yes.
Who will correlate the sudden productivity improvement with our optimization of the data flow, and that optimization with the availability of an LLM to do it fast enough that no project+consultants+management is needed?
No one.
Just like no one is evaluating the value of a hammer or a ladder when you build a house.
But you would see more houses, or housing build costs/bids fall.
This is where the whole "show me what you built with AI" meme comes from, and currently there's no substitute for SWEs. Maybe next year or next next year, but mostly the usage is generating boring stuff like internal tool frontends, tests, etc. That's not nothing, but because actually writing the code was at best 20% of the time cost anyway, the gains aren't huge, and won't be until AI gets into the other parts of the SDLC (or the SDLC changes).
CONEXPO, World of Concrete, and NAHB IBS is where vendors go to show off their new ladders and the attendees totally evaluate the value of those ladders vs their competitors.
Is there a productivity improvement with tangible economic results coming from that optimization?
It’s easy to convince yourself that it is, and anyone can massage some internal metric enough to prove their desired outcome.
Have had job. Can confirm.
Bullshit Jobs is one of those "just so" stories that seems truthy but doesn't stand up to any critical evaluation. Companies are obviously not hesitant to lay off unproductive workers. While in large enterprises there is some level of empire building where managers hire more workers than necessary just to inflate their own importance, in the long run those businesses fall to leaner competitors.
> in the long run those businesses fall to leaner competitors
This is not true at all. You can find plenty of examples going either way, but it's far from being a universal reality.
> Companies are obviously not hesitant to lay off unproductive workers.
Companies are obviously not hesitant to lay off anyone, especially for cost saving. It is interesting how you think that people are laid off because they’re unproductive.
It's only after decades of experience and hindsight that you realize that a lot of the important work we spend our time on has extremely limited long-term value.
Maybe you're lucky enough to be doing cutting edge research or do something that really seriously impacts human beings, but I've done plenty of "mission critical right fucking now" work that a week from now (or even hours from now, when I worked for a content marketing business) is beyond irrelevant. It's an amazing thing watching marketing types set money on fire burning super expensive developer time (but salaried, so they discount the cost to zero) just to make their campaigns like 2-3% more efficient.
I've intentionally sat on plenty of projects that somebody was pushing really hard for because they thought it was the absolutely right and necessary thing at the time, and that the stakeholder realized was pointless/worthless after a good long shit and shower. This one move has saved literally man-years of work and IMO is the #1 most important skill people need to learn ("when to just do nothing").
Would hardly drag Graeber into this, there's a laundry list of issues with his research.
Most "Bullshit Jobs" can already be automated, but "can" isn't always "should" or "will". Graeber is a capex thinker in an opex world.
And that book sort of vaguely hints around at all these jobs that are surely bullshit but won’t identify them concretely.
Not recognizing the essential role of sales seemed to be a common mistake.
What counts as “concretely”? And I don’t recall it calling sales bullshit.
It identified advertising as part of the category it classed as heavily bullshit by reason of being zero-sum: your competitor spends more, so you spend more to avoid falling behind, the standard Red Queen's race. (Another in this category was the military, which is kinda the classic case of this; see also the Missile Gap, the dreadnought arms race, etc.) But not sales, IIRC.
> And I don’t recall it calling sales bullshit.
It says stuff like why can't a customer just order from an online form? The employee who helps them doesn't do anything except make them feel better. Must be a bullshit job. It talks specifically about many employees filling internal roles like this.
> advertising
I understand the arms race argument, but it’s really hard to see what an alternative looks like. People can spend money to make you more aware of something. You can limit some modes, but that kind of just exists.
I don’t see how they aren’t performing an important function.
It's an important function in a capitalist economy. Socialist economies are like "adblock for your life". That said, some advertising can be useful to inform consumers that a good exists, but convincing them they need it by synthesizing desires or fighting off competitors? Useless and socially detrimental.
> Socialist economies are like "adblock for your life".
There's nothing inherent to socialism that would preclude advertising. It's an economic system where the means of production (capital) is owned by the workers or the state. In market socialism you still have worker cooperatives competing on the market.
Have you participated in academia before?
Plus, a core part of what qualifies as a bullshit job is that the person doing it feels that it's a bullshit job. The book is a half-serious anthropological essay, not an economic treatise.
Yeah, the guy states that in multiple places, and yet here we are, with the impression that most people referencing the book apparently didn't read it.
An odd tendency I’ve noticed about Graeber is that the more someone apparently dislikes his work, the more it will seem like they’re talking about totally different books from the ones I read.
Because he uses private framings of concepts that are well understood. So if your first encounter is through Graeber you're going to have friction with every other understanding. If you've read much else you will say "hold on a minute, what about …"
> If you’ve read much else
If you've read much else you should be able to engage with a text properly, and construct charitable interpretations of an author's claims or arguments.
Please read my comments here engaging with the ideas in the text and specifically your concern that bullshit jobs are just jobs that don’t feel important.
You have written a bunch of comments regarding advertising, a single comment criticizing Graeber for using concepts in an uncommon way, and one reply to my comment that doesn't really connect with the content of that comment.
Did I miss something?
> And that book sort of vaguely hints around at all these jobs that are surely bullshit but won’t identify them concretely.
See what I mean? We push on where these fake jobs are and you fallback to a subjective internal definition we can’t inspect.
And now let me remind you of the context. If the real definition of bullshit isn’t economic slack, but internal dissatisfaction then this comment would be false:
> What if LLMs are optimizing the average office worker's productivity but the work itself simply has no discernable economic value? This is argued at length in Grebber's Bullshit Jobs essay and book.
"Socialist economies are like "adblock for your life"."
Ever actually lived in anything approaching one? Yeah, if the stores are empty, it does not make sense to produce ads for stuff that isn't there ...
... but we still had ads on TV, surprisingly, even for stuff that was in shortage (= almost everything). Why? Because the Plan said so, and disrespecting the Plan too openly would stray dangerously close to the crime of sabotage.
You have no idea.
None of that is inherent to socialism. There can be good and bad management, freedom and authoritarianism in any economic system.
Socialist economies larger than kibbutzes could only be created and sustained by totalitarian states. Socialism means collective ownership of means of production.
And people won't give up their shops and fields and other means of production to the government voluntarily, at least not en masse. Thus they have to be forced at gunpoint, and they always were.
All the subsequent horror is downstream from that. This is what is inherent to building a socialist economy: mass expropriation of the former "exploitative class". The bad management of the stolen assets is just a consequence, because ideologically brainwashed partisans are usually bad at managing anything including themselves.
This is exactly what I meant, a centrally-planned economy where the state owns everything and people are forced to give everything up is just one terrible (Soviet) model, not some defining feature of socialism.
Yugoslavia was extremely successful, with economic growth that matched or exceeded most capitalist European economies post-WW2. In some ways it wasn't as free as western societies are today but it definitely wasn't totalitarian, and in many ways it was more free - there's a philosophical question in there about what freedom really is. For example Yugoslavia made abortion a constitutionally protected right in the 70s.
I don't want to debate the nuances of what's better now and what was better then as that's beside the point, which is that the idiosyncrasies of the terrible Soviet economy are not inherent to "socialism", just like the idiosyncrasies of the US economy aren't inherent to capitalism.
"just one terrible (Soviet) model"
It is the model, introduced basically everywhere where socialism was taken seriously. It is like saying that cars with four wheels are just one terrible model, because there were a few cars with three wheels.
Yugoslavia was a mixed economy with a lot of economic power remaining in private hands. You cannot point at it and say "hey, successful socialism". Tito was a mortal enemy of Stalin, struck a balanced neither-East-nor-West but fairly West-friendly policy as early as 1950, and his collectivization efforts were a fraction of what Marxist-Leninist doctrine demands.
You also shouldn't discount the effect of sending young Yugoslavs to work in West Germany on the total balance sheet. A massive influx of remittances in Deutsche Mark was an important factor in Yugoslavia getting richer, and there was nothing socialist about it, it was an overflow of quick economic growth in a capitalist country.
You've created a tautology: Socialism is bad because bad models are socialism and better models are not-socialism.
> You cannot point at it and say "hey, successful socialism"
Yes I can because ideological purity doesn't exist in the real world. All of our countries are a mix of capitalist and socialist ideas yet we call them "capitalist" because that's the current predominant organization.
> Tito was a mortal enemy of Stalin, stroke a balanced neither-East-nor-West, but fairly friendly to the West policy already in 1950, and his collectivization efforts were a fraction of what Marxist-Leninist doctrine demands.
You're making my point for me, Yugoslavia was completely different from USSR yet still socialist. Socialism is not synonymous with Marxist-Leninist doctrine. It's a fairly simple core idea that has an infinite number of possible implementations, one of them being market socialism with worker cooperatives.
Aside from that short period post-WW2, no socialist or communist nation has been allowed to exist without interference from the US through oppressive economic sanctions that would cripple and destroy any economy regardless of its economic system, but people love nothing more than to draw conclusions from these obviously-invalid "experiments".
"You" (and I mean the collective you) are essentially hijacking the word "socialism" to simply mean "everything that was bad about the USSR". The system has been teaching and conditioning people to do that for decades, but we should really be more conscious and stop doing that.
" no socialist or communist nation has been allowed to exist without interference from the US through oppressive economic sanctions that would cripple and destroy any economy regardless of its economic system"
That is what COMECON was supposed to solve, but if you aggregate a heap of losers, you won't create a winning team.
"Socialism is not synonymous with Marxist-Leninist doctrine. It's a fairly simple core idea that has an infinite number of possible implementations, one of them being market socialism with worker cooperatives."
Of that infinite number, the violent Soviet-like version became the most widespread because it was the only one that was somewhat stable when implemented on a countrywide scale. That stability was bought by blood, of course.
No one is sabotaging worker cooperatives in Europe, and lefty parties used to give them extra support, but they just don't seem to be able to grow well. The largest one is located in Basque Country and it is debatable if its size is partly caused by Basque nationalism, which is not a very socialist idea. Aside from that one, worker cooperatives of more than 1000 people are rare birds.
"The system has been teaching and conditioning people to do that for decades, but we should really be more conscious and stop doing that."
No one in the former socialist bloc will experiment with that quagmire again. For some reason, socialism is catnip for intellectuals who continue to defend it, but real-world workers dislike it and defect from various attempts to build it at every opportunity.
We should stop trying to ride dead horses. Collective ownership of the means of production on a macro scale is every bit as dead as the divine right of kings to rule. There are still Curtis Yarvin types of intellectuals who subscribe to the latter idea, but it is pining for the fjords. So is socialism.
> That is what COMECON was supposed to solve, but if you aggregate a heap of losers, you won't create a winning team.
What kind of disingenuous argument is that? The existence of COMECON doesn't neutralize the enormous disadvantage and economic pressure of having sanctions imposed on you.
> Of that infinite number
I'm glad we agree that Soviet communism is not synonymous with "socialism".
> Aside from that one, worker cooperatives of more than 1000 people are rare birds.
You're applying pointless capitalist metrics to non-capitalist organizations and moralizing about how they don't live up to them.
> No one in the former socialist bloc will experiment with that quagmire again.
You're experimenting with socialist policies and values right now, you just don't want to call it by that name because of your weird fixation. Do public healthcare, transport, education, social security benefits ring any bells?
If you talked to people from ex-Yugoslavia, you'd know that many would be happy to return to that time.
> We should stop trying to ride dead horses.
We should stop declaring horses extinct when it's just your own horse that has died.
How does that make advertising a bullshit job? The only way advertising won't exist or won't be needed is when humanity becomes a hive mind and removes all competition.
Countries can just ban advertising, and hopefully we will slowly move towards this. There are already quite a few specific bans - tobacco advertising is banned, gambling and sex product advertising is only allowed in certain specific situations, billboards and other forms of advertising on public spaces are often banned in large European cities, and so on.
> Countries can just ban advertising
No. They can ban particular modes. They can’t stop people from using power and money to spread ideas.
In the US hedge funds are banned from advertising and all they did is change their forms of presentation to things like presenting at conferences or on podcasts.
If there were a socialist fantasy of a government review board to which all products were submitted before being listed in a government catalog, then advertising would be lobbying and jockeying to get that review board to view your product in a particular way. Or merely going through the process and ensuring correct information was kept.
The parts that are only done to maintain status quo with a competitor aren’t productive, and that’s quite a bit of it. Two (or more) sides spend money, nothing changes. No good is produced. The whole exercise is basically an accident.
Like when a competing country builds their tenth battleship, so you commission another one to match them. The world would have been better off if neither had been built. Money changed hands (one supposes) but the whole exercise had no effect on its aim. It was similar to paying people to dig holes and fill them back in again, to the tune of serious money. This was so utterly stupid and wasteful that there was a whole treaty about it, to try to prevent so many bullshit jobs from being created again.
Or when Pepsi increases their ad spending in Brazil, so Coca Cola counters, and much of the money ends up accomplishing little except keeping things just how they were. That component or quality of the ad industry, the book claims, is bullshit, on account of not doing any good.
The book treats of several ways in which a job might be bullshit, and just kinda mentions this one as an aside: the zero-sum activity. It mostly covers other sorts, but this is the closest I can recall it coming to declaring sales “bullshit” (the book rarely, bordering on never, paints even most of an entire industry or field as bullshit, and advertising isn’t sales, but it’s as close as it got, as I recall)
You’re describing redundant effort from competition as waste. Is it a waste of effort when both people try to get the same job and only one succeeds?
Is it a waste of effort when two companies try to make the best electric car?
If it is wasteful, what does a world look like where nobody ever spends resources on a goal which overlaps with someone else's?
I think you’ve misunderstood what I’ve written here. Graeber’s book might make it clearer for you, he probably did a better job of explaining it than I do. It’s about spendy Red Queen’s races, not just trying to make something better or trying to compete in general. The application of the idea to military spending is pretty much standard and uncontroversial stuff (hell, it was 100-plus years ago) while his observation of a similar effect in some notable component of ad spending is novel (at least, I’d not seen that connection made before).
The best product should be picked according to requirements by an LLM, without bullshit advertising.
See what I mean? What you see as a bullshit job is just completely misunderstanding how human beings work.
- Which products get included in the candidate list? Every product in existence which claims use?
- How many results can it return? And in what order?
- Which attributes or description of the product is provided to the LLM? Who provides it?
- How are the claims in those descriptions verified?
- What if my business believes the claims or description of our product is false?
- How will the LLM change its relative valuations based on demand?
I was replying to:
> The only way advertising won't exist or won't be needed is when humanity becomes a hive mind and removes all competition.
I don't need advertisement to pick the best product for myself. I have a list of requirements that I need fulfilled – why do I need advertisement for it?
You just don’t know what advertising is. How did you determine which products were available? How did you determine their attributes?
The thesis of Bullshit Jobs is almost universally rejected by economists, FYI. There’s not much of value to obtain from the book.
As a layman, I have to say the collective credibility of economists does not inspire confidence.
I love economics but it is a field of study that we don't have the proper tools to properly study the subject yet.
Modern economics is literally a bullshit job generating process or complex system.
What's particularly ironic is that economists are redundant from a mainstream economics perspective. They'd be the first job to cut.
Not surprising, because the thesis is not about the economy.
> There’s not much of value to obtain from the book.
Anthropological insight has much more value than anything economists may produce on the economy.
… what’s the thesis?
Why should I believe "economists" over "David Grabber"?
[dead]
I keep seeing this argument thrown around, but API usage on business plans is token spend based and can be orders of magnitude more than this $20/head per month.
My company is spending 20-50x that much per head easily from the firmwide cost numbers being reported.
They had to set circuit breakers because some users hit $200/day.
How viable are the $20/month subscriptions for actual work and are they loss making for Anthropic? I've heard both of people needing to get higher tiers to get anything done in Claude Code and also that the subscriptions are (heavily?) subsidized by Anthropic, so the "just another $20 SaaS" argument doesn't sound too good.
I am confident that Anthropic makes more revenue from that $20 than the electricity and server costs needed to serve that customer.
Claude Code has rate limits for a reason: I expect they are carefully designed to ensure that the average user doesn't end up losing Anthropic money, and that even extreme heavy users don't cause big enough losses for it to be a problem.
Everything I've heard makes me believe the margins on inference are quite high. The AI labs lose money because of the R&D and training costs, not because they're giving electricity and server operational costs away for free.
Nobody questions that Anthropic makes revenue from a $20 subscription. The opposite would be very strange.
Yeah it's the caching that's doing the work for them though honestly. So many cached queries saving the GPUs from hard hits.
How is caching implemented in this scenario? I find it unlikely that two developers are going to ask the same exact question, so at a minimum some work has to be done to figure out “someone’s asked this before, fetch the response out of the cache.” But then the problem is that most questions are peppered with specific context that has to be represented in the response, so there’s really no way to cache that.
From my understanding (which is poor at best), the cache is about the separate parts of the input context. Once the LLM reads a file, the content of that file is cached (i.e. some representation that the LLM creates for that specific file, though I really have no idea how that works). So the next time you bring that file into the context, directly or indirectly, the LLM doesn't have to do a full pass, but pulls its understanding/representation from the cache and uses that to answer your question/perform the task.
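The concrete mechanism providers expose is prefix caching: the request has to start with a byte-identical prefix (system prompt, big file contents) for the cached processing to be reused. A minimal sketch, assuming the Anthropic Python SDK and its cache_control content blocks; the model name is a placeholder, and exact fields and pricing should be checked against current docs:

    # Sketch of prompt caching via the Anthropic Messages API; the model name is
    # a placeholder and the exact fields/pricing should be checked against docs.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    big_file = open("src/big_module.py").read()

    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model name
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": "You are a code assistant.\n\n" + big_file,
                # Marks this prefix as cacheable: later requests that start with
                # the exact same prefix can skip reprocessing it and are billed
                # at a lower rate.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[{"role": "user", "content": "Explain what this module does."}],
    )
    print(response.content[0].text)

Nothing semantic is stored; change the prefix by one byte and it's a full pass again.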
A lot of people believe that Anthropic lose money selling tokens to customers because they are subsidizing it for growth.
But that has zero effect on revenue, it only affects profit.
I wrote a while ago on here that he should stick to his domain.
I was downvoted big time. Ah, I love it when people provide an example so it can finally be exposed without me having to say anything.
Unfortunately this is a huge problem on here - many people step outside of their domains, even if on the surface it seems simple, but post gibberish and completely mangled stuff. How does this benefit people who get exposed to crap?
If you don't know you are wrong but have an itch to polish your ego a bit then what's stopping you (them), right.
People form very strong opinions on topics they barely understand. I'd say that since they know little, the opinions come mostly from emotion, which is hardly a good path to objective and deeper knowledge.
I always assumed that with inference being so cheap, my subscription fees were paying for training costs, not inference.
Anthropic and OpenAI are both well documented as losing billions of dollars a year because their revenue doesn't cover their R&D and training costs, but that doesn't mean their revenue doesn't cover their inference costs.
Does it matter if they can't ever stop training though? Like, this argument usually seems to imply that training is a one-off, not an ongoing process. I could save a lot of money if I stopped eating, but it'd be a short lived experiment.
I'll be convinced they're actually making money when they stop asking for $30 billion funding rounds. None of that money is free! Whoever is giving them that money wants a return on their investment, somehow.
At some point the players will need to reach profitability. Even if they're subsidising it with other revenue - they'll only be willing to do that as long as it drives rising inference revenue.
Once that happens, whomever is left standing can dial back the training investment to whatever their share of inference can bear.
> Once that happens, whomever is left standing can dial back the training investment to whatever their share of inference can bear.
Or, if there's two people left standing, they may compete with each other on price rather than performance and each end up with cloud compute's margins.
Sure, but they will still need to dial it back to a point where they can fund it out of inference at some point. The point is that the fact they can't do that now is irrelevant - it's a game of chicken at the moment, and that might kill some of them, but the game won't last forever.
It matters because as long as they are selling inference for less than it costs to serve they have a potential path to profitability.
Training costs are fixed at whatever billions of dollars per year.
If inference is profitable they might conceivably make a profit if they can build a model that's good enough to sign up vast numbers of paying customers.
If they lose even more money on each new customer they don't have any path to profitability at all.
I'm curious just because you're well known in this space -- have you read Ed Zitron's work on the bubble, and if so what did you think of it? I'm somewhat in agreement with him that the financials of this just can't be reconciled, at least for OpenAI and Anthropic. But I also know that's not my field. I find his arguments a lot more convincing than the people just saying "ahh it'll work itself out" though.
But only if you ignore all the other market participants, right? How can we ever reach a point where, e.g., the smaller Chinese competitors perpetually trailing SOTA with a ~9 month lag but at a tiny fraction of the cost stop existing?
I mean we just have to look at old discussions about Uber for the exact same arguments. Uber, after all these years, is still at a negative 10% lifetime ROI, and that company doesn't even have to meaningfully invest in hardware.
IMO this will probably develop like the railroad boom in the first half of the 19th century: All the AI-only first movers like OpenAI and Anthropic will go bust, just like most railroad companies who laid the tracks, because they can't escape the training treadmill. But the tech itself will stay, and even become a meaningful productivity booster over the next decades.
I am also thinking long term where is the moat if it will inevitably lead to price competition? Like it's not a Microsoft product suite that your whole company is tied in multiple ways. LLMs can be quite easily swapped to another.
> If they lose even more money on each new customer they don't have any path to profitability at all.
In theory they can increase prices once the customers are hooked. That's how many startups work.
There's an argument to be made that a "return on investment by way of eliminating all workers" is a reasonable result for the capitalists.
At least until they are running out of customers. And/or societies with mass-unemployment destabilize to a degree that is not conducive for capitalists' operations.
That's a problem above most CEOs' pay grade.
Models are fixed. They do not learn post training.
Which means that training needs to be ongoing. So the revenue covers the inference? So what? All that means is that it doesn't cover your costs and you're operating at a loss. Because it doesn't cover the training that you can't stop doing either.
Training costs are fixed. Inference costs are variable. The difference matters.
No they are not. They are exponentially increasing. Due to the exponential scaling needed for linear gain. Otherwise they'd fall behind their competition.
Fixed cost here means that the training costs stay the same no matter how many customers you have - unlike serving costs which have to increase to serve more people.
Is inference really that cheap? Why can't I do it at home with a reasonable amount of money?
Capex vs opex?
Well, both? I need money for the equipment, and I need money for electricity.
Capex is probably the biggest hurdle, but I can see how electricity cost might become a factor under heavy use.
Doubtful
>make more revenue from that $20 than the electricity and server costs needed to serve that customer
Seems like a pretty dumb take. It’s like saying it only takes $X in electricity and raw materials to produce a widget that I sell for $Y. Since $Y is bigger than $X, I’m making money! Just ignore that I have to pay people to work the lines. Ignore that I had to pay huge amounts to build the factory. Ignore every other cost.
They can’t just fire everyone and stop training new models.
As I understand it:
Gross profit = revenues - cost of goods sold
Operating profit = Gross profit - operating expenses including depreciation & amortisation
Net profit = Operating profit - net interest expense - taxes
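With made-up numbers, just to show how the layers subtract (everything here is hypothetical):

    # Toy income statement with hypothetical figures, only to illustrate the layers.
    revenue = 1_000            # e.g. subscriptions sold
    cogs = 400                 # cost of goods sold: serving/inference costs for those customers
    operating_expenses = 500   # R&D / training, salaries, depreciation & amortisation
    net_interest = 50
    taxes = 20

    gross_profit = revenue - cogs                          # 600: inference can look "profitable" here
    operating_profit = gross_profit - operating_expenses   # 100
    net_profit = operating_profit - net_interest - taxes   # 30

    print(gross_profit, operating_profit, net_profit)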
See comment here: https://news.ycombinator.com/item?id=47057874
Merely for the viability part: I use the $20/mo plan now, but only as a part-time independent dev. I will hit rate-limits with Opus on any moderately complex app.
If I am on a roll, I will flip on Extra Usage. I prototyped a fully functional and useful niche app in ~6 total hours and $20 of extra usage, and it's solid enough and proved enough value to continue investing in and eventually ship to the App store.
Without Claude I likely wouldn't have gotten to the finished prototype version to use in the real world.
For indie dev, I think LLMs are a new source of solutions. This app is too niche to justify building and marketing without LLM assistance. It likely won't earn more than $25k/year but good enough!
I don't think the assumption that Anthropic is losing money on subscriptions holds up. I think each additional customer provides more revenue than the cost to run their inference, on average.
For people doing work with LLMs as an assistant for codebase searching, reviews, double checks, and things like that the $20/month plan is more than fine. The closer you get to vibecoding and trying to get the LLM to do all the work, the more you need the $100 and $200 plans.
On the ChatGPT side, the $20/month subscription plan for GPT Codex feels extremely generous right now. I tried getting to the end of my window usage limit one day and could not.
> so the "just another $20 SaaS" argument doesn't sound too good
Having seen several company's SaaS bills, even $100/month or $200/month for developers would barely change anything.
Why do you think that? Lots of people on this thread saying that, but zero citations.
I'd guess the $200 subscription is sufficient per person.
But at that point you could go for a bigger one and split it amongst headcount.
A $20 Claude subscription lets you scratch the surface. A $20 Claude subscription without training means you have a lot of people spending time figuring out how to use it, and then maybe getting a bit of payback, but earning back that training is going to take time.
Getting people to figure out how to enter questions is easy. Getting people to a point where they don't burn up all the savings by getting into unproductive conversations with the agent when it gets something wrong, is not so easy.
I see no reason to believe that just handing a Claude subscription to everyone in a company simply creates economic benefit. I don't think it's easier than "automating customer service". It's actually very strange.
I think it could definitely already create economic benefit, after someone clearly instructs people how to use it and how to integrate it into their work. Most people are really not good at figuring that out on their own, in a busy workday, when left to their own devices; companies are just finding out where the ball is moving and what to organize around, too.
So I can totally see a lot of failed experiments and people slowly figuring stuff out, and all of that not translating to measurable surpluses in a corp, in a setup similar to what OP laid out.
I wish people in my company even used their Claude Code or Cursor properly rather than asking me nonsense questions all the time that the model can easily answer with the connected data sources. And these people are developers.
This shit will take like 10 years to adopt properly, at least in most boomer companies.
I use these all the time nowadays and they are great tools when utilized properly, but I have a hard time seeing them replace functions completely, due to humans having limited cognitive capacity to multitask and still needing to review stuff and build infra to actually utilize all this.
> A Claude subscription is 20 bucks per worker
Only until the loans come due. We're still in the "Uber undercutting medallion cabs" part of the game.
$20 is not useable, need $100 plan at least for development purposes. That is a lot of money for some countries. In my country, that can be 1/10 of their monthly salary. Hard to get approval on it. It is still too expensive right now.
Yeah it's not obvious at first, but a big project will cause usage to skyrocket because of how much context it stuffs with reading files. I can use up my $20 subscription's 5-hour limit in mere seconds.
>I think there are enough short term supposed benefits that something should be showing there.
As measured by whom? The same managers who demanded we all return to the office 5 days a week because the only way they can measure productivity is butts in seats?
Productivity is the ratio of outputs to inputs, both measured in dollars.
What is the definition of input, what is the definition of output, and who is responsible for measuring them?
The answer is different depending on your target -- are you measuring the productivity of a firm, a country or some other sort of entity?
Productivity is the ratio of real GDP to total hours worked.
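Spelled out as a back-of-envelope calculation (numbers entirely made up, just to show the ratio):

    # Labor productivity is just a ratio; all numbers here are hypothetical.
    real_gdp = 23_000_000_000_000     # real GDP in dollars (made-up figure)
    hours_worked = 260_000_000_000    # total hours worked in the same year (made-up figure)

    print(f"output per hour worked: ${real_gdp / hours_worked:.2f}")  # ~$88/hour with these figures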
That's labor productivity, a different measure. But the original article references labor productivity, so your definition is more relevant.
If anything, the 'scariness' of an old computer probably protected the company in many ways. AI's approachability to the average office worker, specifically how it makes it seem easy to deploy/run/triage enterprise software, will continue to pwn.
I'm not sure about the comparison either, but the cost of operating the LLM should include the worker's wages.
I read an article yesterday about people working insane hours at companies that have bet heavily on AI. My interpretation is that a worker runs out of juice after a few hours, but the AI has no limit and can work its human tender to death.
I've never looked at enterprise licensing, but regular license wise, a Claude subscription is actually $200 a month. I don't count the $20 or $100 tiers because they're too limited to be useful (especially professionally!)
I think the subscription price is only the visible tip of the iceberg
"if using personal accounts"
InfoSec and Legal would like a word with you...
Agreed.
We do have a way to see the financial impact: just add up Anthropic and OpenAI's reported revenues, something like $30b in annual run rate. Given growth rates (stratospheric), it seems reasonable to conclude informed buyers see economic and/or strategic benefit in excess of their spend. I certainly do!
That puts the benefits to the economy at just around where Mastercard's benefits are, on a dollar basis. But with a lot more growth. Add something in there for MS and GOOG, and we're probably at least another $5b up. There are only like 30 US companies with > $100bn in revenues; at current growth rates, we'll see combined revenues in this range in a year.
All this is sort of peanuts though against 29 trillion GDP, 0.3%. Well not peanuts, it's boosting the US GDP by 10% of its historical growth rate, but the bull case from singularity folks is like 10%+ GDP growth; if we start seeing that, we'll know it.
All that said, there is real value being added to the economy today by these companies. And no doubt a lot of time and effort spent figuring out what the hell to do with it as well.
Investors are optimistic, but what will this new tech be used for? Advertising? Propaganda? Surveillance? Drone strikes?
Does profitable always equal useful? Might other cultures justifiably think differently, like the Amish?
The Amish are skilled at getting cash from the “English” as they call non-Amish. I imagine they also think that the money they receive is roughly tied to value they create. I wasn’t talking valuations, just revenue - money that CFOs and individuals spent so far, and are planning on spending.
I also didn’t talk profitable. Upshot, though, I don’t think it’s just a US thing to say that when money exchanges hands, generally both parties feel they are better off, and therefore there is value implied in a transaction.
As to what it will be used for: yes.
You did specify revenue. The original comment mentioned benefits. I was thinking that the two are different.
They are, but I was proposing they're closely correlated.
I think crazygringo misrepresents the Solow paradox. None of the main explanations say it's the cost that removed the productivity.
Not true at all, onboarding is complex too. E.g. you can't just connect Claude to your Outlook, or have it automate stuff in your CRM. As an office drone, you don't have the admin permissions to set up those connections at all.
And that's the point here: value is handicapped by the web interface, and we are stuck there for the foreseeable future until the tech teams get their priorities straight and build decent data integration layers, and workflow management platforms.
> A Claude subscription is 20 bucks per worker
Talking about macro economics, I don’t think that number is correct.
Problem is, just having a Claude subscription doesn't make you productive. Most of those conversations happen in "tech-ish" environments. Not every business is about coding.
Real life example: A client came to me asking how to compare orders against order confirmations from the vendor. They come as PDF files. Which made me wonder: Wait, you don't have any kind of API or at least structured data that the vendor gives you?
Nope.
And here you are. I am not talking about a niche business; I assume that's a broader problem. Tech could probably have automated all of this 30 years ago. Still, businesses lack "proper" IT processes, because in the end every company is unique and requires particular measures to be "fully" onboarded to IT-based improvements like that.
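For what it's worth, the comparison itself is thin plumbing once the text is out of the PDFs. A rough sketch, assuming pypdf for extraction and leaving the LLM step as a hypothetical stub:

    # Sketch: pull text out of order/confirmation PDFs and diff structured line items.
    # Library choice (pypdf) and the extraction step are assumptions, not a recommendation.
    from pypdf import PdfReader

    def pdf_text(path: str) -> str:
        return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)

    def extract_lines(text: str) -> list[dict]:
        # Hypothetical helper: send `text` to whatever LLM you use and ask for JSON
        # like [{"sku": "...", "qty": 3, "price": 9.99}, ...], then parse it.
        raise NotImplementedError

    def diff_orders(order_pdf: str, confirmation_pdf: str) -> list[str]:
        order = {l["sku"]: l for l in extract_lines(pdf_text(order_pdf))}
        conf = {l["sku"]: l for l in extract_lines(pdf_text(confirmation_pdf))}
        issues = []
        for sku, line in order.items():
            got = conf.get(sku)
            if got is None:
                issues.append(f"{sku}: missing from confirmation")
            elif got["qty"] != line["qty"] or got["price"] != line["price"]:
                issues.append(f"{sku}: ordered {line}, confirmed {got}")
        return issues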
Thing is, unless the order confirmations number high in the thousands, most businesses could just hire an old lady with secretary experience to crunch through that work manually for a fraction of what it would cost them to get the whole system automated.
I've seen this play out with time sheets. Every day factory floor workers write up every job they do on time sheets, paper and pencil, but management wants excel spreadsheets with pretty plots. Solution? One old typist who can type up all the time sheets every day. No OCR trouble, no expensive developer time, if she encounters illegible numbers she just cross references against the job fliers to figure it out instead of throwing errors the first time a 0 looks like a 6.
A computer lets you save a fortune in storage rooms, admin staff, delivery fees etc. It lets you reinvent how everything runs.
ChatGPT just lets you generate slop that may be helpful. For the vast majority of industries it doesn't actually offer much. Your meme departments like HR might be able to push out their slop quicker, but that doesn't improve profitability.
You still need to teach a 2020s employee how to use Claude.
- protect yourself from data loss / secret leaks
- what it can and can't do
- trust issues & hallucinations
You can't just enable Claude for Excel and expect people to become Excel wizards.
And nobody mentions that the "20 bucks per worker" is selling it at a loss. I'm waiting to see when they put a price on it that's expected to generate some net income...
20 bucks per worker could easily be profitable, depending on how much the workers actually use it..
Like Uber/Airbnb in early days, this is heavily subsidized.
Comment was deleted :(
> A Claude subscription is 20 bucks per worker
I mean, sure. If you want to use it for 20 minutes and wait for two hours at a time.
[dead]
It’s also pretty wild to me how people still don’t really even know how to use it.
On hacker news, a very tech literate place, I see people thinking modern AI models can’t generate working code.
The other day in real life I was talking to a friend of mine about ChatGPT. They didn’t know you needed to turn on “thinking” to get higher quality results. This is a technical person who has worked at Amazon.
You can’t expect revolutionary impact while people are still learning how to even use the thing. We’re so early.
I don't think "results don't match promises" is the same as "not knowing how to use it". I've been using Claude and OpenAI's latest models for the past two weeks now (probably moving at about 1000 lines of code a day, which is what I can comfortably review), and it makes subtle hard-to-find mistakes all over the place. Or it just misunderstands well known design patterns, or does something bone headed. I'm fine with this! But that's because I'm asking it to write code that I could write myself, and I'm actually reading it. This whole "it can build a whole company for me and I don't even look at it!" is overhype.
Prompting LLMs for code simply takes more than a couple of weeks to learn.
It takes time to get an intuition for the kinds of problems they've seen in pre-training, what environments they faced in RL, and what kind of bizarre biases and blind spots they have. Learning to Google was hard, learning to use other people's libraries was hard, and this is on par with those skills at least.
If there is a well known design pattern you know, thats a great thing to shout out. Knowing what to add to the context takes time and taste. If you are asking for pieces so large that you can't trust them, ask for smaller pieces and their composition. Its a force multiplier, and your taste for abstractions as a programmer is one of the factors.
In early Usenet/forum days, the XY problem described users asking for implementation details of their X solution to Y problem, rather than asking how to solve Y. In LLM prompting, people fall into the opposite. They have an X implementation they want to see, and rather than ask for it, they describe the Y problem and expect the LLM to arrive at the same X solution. Just ask for the implementation you want.
Asking bots to ask bots seems to be another skill as well.
Let me clarify, I've been using the latest models for the last two weeks, but I've been using AI for about a year now. I know how to prompt. I don't know why people think it's an amazing skill, it's not much different from writing a good ticket.
Do you use an agent harness to have it review code for you before you do?
If not, you don't know how to use it efficiently.
A large part of using AI efficiently is to significantly lower that review burden by having it do far more of the verification and cleanup itself before you even look at it.
I have it run tests and every few days I ask it to do a code quality analysis check on the codebase.
I'm unconvinced AI reviewing AI is the answer here, because all LLMs have the same flaws. To me, the harness/guard rails for AI should be different technologies that work differently and in a more formal sense, i.e. static code analysis, linters, tests, etc.
(Linting has actually been, by far, the BEST code quality enforcer for the agents I've run so far, and it's a lot cheaper and more configurable than running more agents.)
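A rough sketch of that kind of deterministic gate, where generate_patch is a hypothetical stand-in for whatever agent invocation you actually use:

    # Sketch: gate agent output on deterministic checks (lint + tests) rather than
    # on another LLM. generate_patch() is a hypothetical stand-in for your agent.
    import subprocess

    def generate_patch(task: str) -> None:
        # Hypothetical: invoke your coding agent/CLI here to edit the working tree.
        raise NotImplementedError

    def checks_pass() -> bool:
        """Both the linter and the test suite must succeed for a patch to count."""
        lint = subprocess.run(["ruff", "check", "."])   # or whatever linter you trust
        tests = subprocess.run(["pytest", "-q"])
        return lint.returncode == 0 and tests.returncode == 0

    def review_loop(task: str, max_attempts: int = 3) -> bool:
        for _ in range(max_attempts):
            generate_patch(task)
            if checks_pass():
                return True   # only now does a human look at the diff
            task += "\nThe previous attempt failed lint or tests; fix it and retry."
        return False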
> Do you use an agent harness to have it review code for you before you do?
Right now you need to be Uncle Moneybags to do this in your personal life.
If you're lucky, your employer is footing the bill but otherwise... Ugh. It's like converting your app running perfectly fine on a cheap VPS to AWS Lambda. In theory, it's fine but in reality the next bill you get could make you faint.
It's down to how much you value your time. If you value your time low enough, it doesn't pay to make AI take over. If you value it high enough, it does.
This is correct, but part of the issue is that it significantly increases token usage costs. Some companies are doing:
- PRD and spec fulfillment review
- code review + correction loops
- security review + corrections
- addl. test coverage and tidying
- addl. type checks and tidying
- addl. lint checks and tidying
- maybe more I haven't listed
And these are run after each commit, so you can only imagine the costs per engineer doing this 10, 20, 50+ times per day depending on how much work they're knocking out.
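Back-of-envelope, with every number an assumption you'd swap for your own prices and sizes:

    # Rough cost of running several review/cleanup passes per commit.
    # Every figure here is a made-up assumption; plug in your own numbers.
    price_per_million_tokens = 10.0   # blended input+output $/1M tokens (assumption)
    tokens_per_pass = 50_000          # context + output for one review pass (assumption)
    passes_per_commit = 6             # spec, code review, security, tests, types, lint
    commits_per_day = 20

    daily_cost = (tokens_per_pass * passes_per_commit * commits_per_day
                  / 1_000_000 * price_per_million_tokens)
    print(f"~${daily_cost:.0f}/day per engineer")  # ~$60/day with these assumptions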
Sure, it adds tokens. I've burnt 200 million tokens today on a single project.
The question is what your time is worth for the company, and which tasks costs less to have an agent automate than having you do.
If you know good architecture and you are testing as you go, I would say, it is probably pretty damn close to being able to build a company without looking at the code. Not without "risk" but definitely doable and plausible.
My current project that I started this weekend is a rust client server game with the client compiled into web assembly.
I do these projects without reading the code at all as a way to gauge what I can possibly do with AI without reading code, purely operating as a PM with technical intuition and architectural opinions.
So far Opus 4.6 has been capable of building it all out. I have to catch issues and I have asked it for refactoring analysis to see if it could optimize the file structure/components, but I haven't read the code at all.
At work I certainly read all the code. But would recommend people try to build something non trivial without looking at the code. It does take skill though, so maybe start small and build up the intuition on how they have issues, etc. I think you'll be surprised how much your technical intuition can scale even when you are not looking at the code.
Security auditors and criminals have a bright future ahead of them.
That is why I said "risk". Though the models are pretty good "if" you ask for security audits. Notice I didn't say you could do it without technical knowledge right now, so you need to know to ask for security review.
I have friends in security on major platforms who are impressed by the security review of the SOTA models. Certainly better than the average bootstrapped founder.
For a few years maybe, but I see little reason to think this stuff won't be coming for their jobs as well.
True, but you'd be surprised how much you can tighten up a codebase by asking a heftier model to do a security review and suggest fixes.
At what point do people really know if it has been tightened up if they never look at the code?
How does a PM know that the code has been tightened up by the offshore team?
What is the point of not reading the code? Even with very competent humans we have put in place systems for reviewing the code.
What’s the game? Genuinely curious!
Simultaneous turn-based top-down car combat where you design the cars first. Inspired by Car Wars, but taking advantage of computers, so spline-based path planning and a much more complicated way of calculating armor penetration and damage.
I'm building to play with my friends online.
And yet, this is exactly what my last job's engineering & product leadership did with their CEO at the helm, before they laid me off.
They vibe-coded a complete rewrite of their products in a few months without any human review. Hundreds of thousands of LOC. I feel sorry for the remaining engineers who have to learn everything that was just generated and is now in front of customers.
> On hacker news, a very tech literate place, I see people thinking modern AI models can’t generate working code.
I am completely flooded with comments and stories about how great LLMs are at coding. I am curious to see how you get a different picture than this? Can you point me to a thread or a story that supports your view? At the moment, individuals thinking AI cannot generate working code seem almost inexistent to me.
It's a real thing, but usually tied to IT folks that tried ChatGPT ~2 years ago (in a web browser) and had to "fix" whatever it output. That situation solidified their "understanding of AI" and they haven't updated their knowledge on the current situation (because... No pressing need).
Folks like this have never used AI inside of an IDE or one of the CLI AI tools. Without that perspective, AI seems mostly like a gimmick.
You are assuming that we all work on the same tasks and should have exactly the same experience with it, which is of course far from the truth. It's probably best to start from that base assumption and work on the implications from there.
As for the last example, for all the money being spent on this area, if someone is expected to perform a workflow based on the kind of question they're supposed to ask, that's a failure in the packaging and discoverability aspect of the product; the leaky abstraction only helps some of us who know why it's there.
I’ve been helping normal people at work use AI and there’s two groups that are really struggling:
1. People who only think of using AI in very specific scenarios. They don’t know when you use it outside of the obvious “to write code” situations and they don’t really use AI effectively and get deflated when AI outputs the occasional garbage. They think “isn’t AI supposed to be good at writing code?”
2. People who let AI do all the thinking. Sometimes they’ll use AI to do everything and you have to tell them to throw it all away because it makes no sense. These people also tend to dump analyses straight from AI into Slack because they lack the tools to verify if a given analysis is correct.
To be honest, I help them by teaching them fairly rigid workflows like “you can use AI if you are in this specific situation.” I think most people will only pick up tools effectively if there is a clear template. It’s basically on-the-job training.
In a WhatsApp group full of doctors, managers, journalists and engineers (including software) aged 30-60, I asked if anyone had heard of OpenClaw; only 3 people had heard of it from influencers, and none had used it.
But from my social feed the impression was that it is taking over the world:)
I asked because I have been building something similar for some time, and I thought it was over, they were faster than me, but as it appears there's no real adoption yet. Maybe there will be some once they release it as part of ChatGPT, but even then it looks too early, as few people are actually using the more advanced tools.
It's definitely at a very early stage. It appears that so far the mainstream success in AI is limited to slop generation, and even that is actually a small number of people generating huge amounts of slop.
> I asked if anyone heard of twitter vaporware and only 3 people heard of it from influencers, none used it.
Shocking results, I say!
No, these people ("managers, engineers" etc.) do just not work in tech & IT but in other fields and they do not read tech news in your country etc.
Most people just aren't "as deep in there" as most people on HN.
I spend between 1 and 2h a day on hn and I barely know what openclaw is. I've seen it mentioned once or twice and checked their website but that's all.
If one let AI FOMO drive them ever since the release of ChatGPT, they'd be glued to their screen 24/7.
What is happening is nonsensical.
OAI wants to keep the hype train going. That is all. OpenClaw is just a project that attracted the interests of people messing about with LLMs. Which as a proportion of economically active people is.... tiny.
They brought him (Pete) over as he seems to have some way of thinking about LLMs in the form of a product. Will he have repeatable success on a large scale? Who knows. I doubt it personally.
> “Tech news”
A guy attached Claude to his socials, groundbreaking tech.
Once I was working for a consulting & development company; they were trying to enter sector ABC by staffing up a team of people who, so I was told, had an interest in sector ABC stuff and wanted to do some projects there.
While they were deep in software development in general, none of them read any of the essential/required daily industry news (not even the news related to doing software development in sector ABC).
:-)
So no, even people somehow attached to a topic are not necessarily more deeply involved.
> I asked because I have been building something similar for some time, and I thought it was over, they were faster than me
If you have been working on a use case similar to OpenClaw for some time now, I'd actually say you are in a great position to start raising now.
Being first to market is not a significant moat in most cases. Few people want to invest in the first company in a category - it's too risky. If there are a couple of other early players then the risk profile has been reduced.
That said, you NEED to concentrate on GTM - technology is commodified, distribution is not.
> It appears that so far the mainstream success in AI is limited to slop generation and even that is actually small number of people generating huge amounts of slop
The growth of AI slop has been exponential, but the application of agents for domain specific usecases has been decently successful.
The biggest reason you don't hear about it on HN is because domain-specific applications are not well known on HN, and most enterprises are not publicizing the fact that they are using these tools internally.
Furthermore, almost anyone who is shipping something with actual enterprise usage is under fairly onerous NDAs right now and every company has someone monitoring HN like a hawk.
Do you think it is a good idea to release it first on iOS and announce on HN and Product Hunt? How would you go about it?
In my app the tech is based on running agent-generated code on JavaScriptCore to do things like OpenClaw. I'm wrapping the JS engine with the missing functionality like networking, file access and database access, so I believe I will not have a problem releasing it on the Apple App Store since I use their native stack. Then, since this stack is also open source, I'm making a version that will run on Linux, the idea being that users develop their solution on their device (iOS & Mac currently), see it working, and then deploy it to a server with a tap of a button, so it keeps running.
Who's your persona? How are you pricing and packaging? Who is your buyer? Are you D2C? Consumer? Replacing EAs? Replacing Project Managers? ...
You need to answer these questions in order to decide whether a Show HN makes sense versus a much more targeted launch.
If you do not know how to answer these questions you need to find a cofounder asap. Technology is commodified. GTM, sales, and packaging is what turns technology into products. Building and selling and fundraising as 1 person is a one-way ticket to burnout, which only makes you and your product less attractive.
I also highly recommend chatting with your network to understand common types of problems. Once you've identified a couple classes of problems and personas for whom your story resonates, then you can decide what approach to take.
Best of luck!
The persona is someone who knows what they are doing but needs someone to actually automate their work routine. E.g. maybe it's a crypto trader who makes decisions on signal interpretation, so they can create a trading bot that executes their method. Maybe it's a compliance officer who needs to automate some routine like checking details further when certain conditions arise. Or maybe a social media manager who needs to moderate their channels. Maybe someone who needs a tool for monitoring HN in that specific way?
Thanks for the advice! I'm at a stage where I want to have such a tool and see who else wants it. Not sure yet about its viability as a business or what the exact market is. Maybe I will find out by putting it into the wild, and that's why I'm considering releasing it as a mobile app first.
That persona still sounds too generic, too unfocused.
But even with that persona, it should already answer your question of whether posting on HN and Product Hunt should be a core part of your strategy. Not a lot of social media managers or compliance people around here. And even for crypto traders, there are better places to pitch products to them.
That's too broad. You aren't going to get any nibbles.
You need to narrow it down to a single and specific persona and business domain.
This is because it takes years to fully flesh out and productionize a workflow from scratch, so concentrating on a business domain you know intimately well helps you build that muscle, which you can then repeat if you are able to hit revenue metrics for a Series A/B.
> every company has someone monitoring HN like a hawk.
Monitoring specific user accounts or keywords? Is this typically done by a social media reputation management service?
> On hacker news, a very tech literate place
I think this is the prior you should investigate. That may be what HN used to be, but it's been a long time since that was an active reality. You can still see actual expert opinions on HN, but they are more and more in the minority.
I think one longtime HN user (Karrot_Kream I think) pinpointed the change in HN discourse to sometime in mid 2022 to early 2023 when the rate of new users spiked to 40k per month and remained at that elevated rate.
From personal experience, I've also noticed that some of the most toxic discourse and responses I've received on this platform are overwhelmingly from post-2022 users.
HN got a write-up in a highly political, non-technical magazine around that time.
And it will get worse once the UX people get ahold of it.
You got that right... imagine AI making more keyboard shortcuts, "helping" Wayland move off X even more, new window transitions, overhauling htmx... it'll be hell+ on earth.
We can indeed only imagine. For now, AI has been a curse for open source projects.
A neighbour of mine has a PhD and works in research at a hospital. He is super smart.
Last time we spoke he said: "yes yes, I know about ChatGPT, but I do not use it at work or at home."
Therefore, most people won't even know about Gemini, Grok or even Claude.
He said he knows about it, and your conclusion is that he doesn't know about the other ones...
> I see people thinking modern AI models can’t generate working code.
Really? Can you show any examples of someone claiming AI models cannot generate working code? I haven't seen anyone make that claim in years, even from the most skeptical critics.
I've seen it said plenty of times that the code might work eventually (after several cycles of prompting and testing), but even then the code you get might not be something you'd want to maintain, and it might contain bugs and security issues that don't (at least initially) seem to impact its ability to do whatever it was written to do, but which could cause problems later.
Yeah but that's a completely different thing.
Depends what they mean. Generate working code all the time, or after a few iterations of trying and prompting? It can very easily happen that an LLM generates something that is a straight error, because it hallucinates some keyword argument or something like that which doesn't actually exist. That happened to me only yesterday. So going from that, no, they are still not able to generate working code all the time. Especially when the basis is a shoddily made library that is simply missing something required.
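To make that failure mode concrete, here is a minimal, made-up Python illustration (json.dumps is real; the `pretty` argument is the invented, "hallucinated" part): the call looks plausible, but the keyword argument doesn't exist, so the "working code" fails the first time it runs.

    import json

    data = {"name": "example", "tags": ["a", "b"]}

    try:
        # Hallucinated keyword: json.dumps() has no `pretty` parameter.
        print(json.dumps(data, pretty=True))
    except TypeError as err:
        print(f"First attempt failed: {err}")

    # The parameter that actually exists is `indent`.
    print(json.dumps(data, indent=2))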
10 days ago someone was making this claim about copilot on legacy code: https://news.ycombinator.com/item?id=46932609
> Github Copilot has been great in getting that code coverage up marginally but ass otherwise.
That's a completely different claim. Or do you think an AI can always, without fail, produce working code in every situation? That's trivially false.
Scroll up a few comments, where someone said Claude is generating errors over and over again and that Claude can't work according to code guidelines etc :-))
That's not the same.
And really the problem isn’t that it can’t make working code, the problem is that it’ll never get the kind of context that is in your brain.
I started working today on a project I hadn't touched in a while, but now needed to, as it was involved in an incident where I had to address some shortcomings. I knew the fix I needed to make, but I went about my usual AI-assisted workflow because, of course, I'm lazy and the last thing I want to do is interrupt my normal work to fix this stupid problem.
The AI doesn’t know anything about the full scope of all the things in my head about my company’s environment and the information I need to convey to it. I can give it a lot of instructions but it’s impossible to write out everything in my head across multiple systems.
The AI did write working code, but despite writing the code way faster than me, it made small but critical mistakes that I wouldn’t have made on my first draft.
For example, it added a command flag that I knew it didn't need, and it probably should have known that, too. Basically, it changed a line of code that it didn't need to touch.
It also didn’t realize that the curled URL was going to redirect so we needed an -L flag. Maybe it should have but my brain knew it already.
It also misinterpreted some changes in direction in a way a human never would have. It confused my local repository for the remote one, because I originally thought I was going to set up a mirror, but I changed plans and used a manual package upload to curl from. So it put the remote URL in some places where the local one should have been.
Finally, it seems to have just created some strange text gore while editing the readme where it deleted existing content for seemingly no reason other than some kind of readline snafu.
So yes, it produced great code very fast, code that would have taken me way longer to write, but I had to go back and spend a very similar amount of time fixing so many things that I might as well have just done it manually.
But hey I’m glad my company is paying $XX/month for my lazy workday machine.
>>The AI doesn’t know anything about the full scope of all the things in my head about my company’s environment and the information I need to convey to it.<<
This is your problem: How should it know if you do not provide it?
Use Claude - in the Pro version you can submit files for each project to set the context: source code, SQL scripts, screenshots, whatever - and then the output will be based on the context given by those files.
And your problem is that you didn't understand the point of their post. The full context was so complex and would be so time-consuming to relay that they might as well code it themselves.
Is this process of brain dumping faster than me just writing the code?
If I was truly going to automate this one-time task I would have to give the AI access to my browser or an API token for the repository provider, so I’m either giving it dangerous modification capability via browser automation or I’m spending even more time setting up API access and trusting that it actually knows how to interact with the service via API calls.
My company doesn't provide Claude; they give me GitHub Copilot Pro or whatever it's called, and when I provided the website it needed to get the RPM files I was working with, it didn't actually do anything with it. It just wrote a readme file that told me what to do. Like I mentioned, it also eventually mistook the remote repository for my local internal repository.
And one of the specific commands it screwed up was in my existing script and was already correct, it just decided to change it for no discernible reason. I didn’t ask it to do anything related to that particular line.
With such a high error rate, I would be hesitant to actually integrate AI to other systems to try to achieve a more fully automated workflow.
I'll claim it. They can't generate working code for the things I am working on. They seem to be too complex or in languages that are too niche.
They can do a tolerable job with super popular/simple things like web dev and Python. It really depends on what you're doing.
For more on this exact topic and an answer to Solow's Paradox, see the excellent The Dynamo and the Computer by Paul David [0].
[0]: https://www.almendron.com/tribuna/wp-content/uploads/2018/03...
Stanford prof rebuts David's idea [0] that it's difficult to extract productivity from the data:
https://www.nber.org/system/files/working_papers/w25148/w251...
I don't agree that real GDP measures what he thinks it measures, but he opines
>Data released this week offers a striking corrective to the narrative that AI has yet to have an impact on the US economy as a whole. While initial reports suggested a year of steady labour expansion in the US, the new figures reveal that total payroll growth was revised downward by approximately 403,000 jobs. Crucially, this downward revision occurred while real GDP remained robust, including a 3.7 per cent growth rate in the fourth quarter. This decoupling — maintaining high output with significantly lower labour input — is the hallmark of productivity growth.
https://www.ft.com/content/4b51d0b4-bbfe-4f05-b50a-1d485d419...
[0] On the basis that IT and AI are not general technologies in the mold of the dynamo; keyword "intangibles", see section 4, p. 21, "A method to measure intangibles"
GDP growth measurements have a big bias right now, because tariffs-on, tariffs-off, tariffs-on-again policies are wrecking import and export numbers. Consumer spending is up, too, so I fail to see how GDP growth while job numbers are lower than expected points to AI making us more productive, rather than just people spending more after months of increased savings due to tariffs.
Paul Strassmann wrote a book in 1990 called "Business Value of Computers" that showed that it matters where money on computers is spent. Only firms that spent it on their core business processes showed increased revenues whereas the ones that spent it on peripheral business processes didn't.
This is my feeling about both IT and AI. It enables companies to do a lot of things which don't really bring value. One of the biggest use cases for AI in the company I work for now is Power BI report generation. Fine, but a couple of years ago we didn't even have all these graphs and reports. I'm not sure they bring actual value, since I see decisions still being made mostly on intuition.
FWIW, Fortune had another article this week saying this J-curve of "General Technology" is showing up in the latest BLS data:
https://fortune.com/2026/02/15/ai-productivity-liftoff-doubl...
Source of the Stanford-approved opinion: https://www.ft.com/content/4b51d0b4-bbfe-4f05-b50a-1d485d419...
An old office colleague used to tell us there was a time when he'd print a report prepared with Lotus 1-2-3 (ancient Excel) and his boss would verify the calculations on a calculator, saying computers are not reliable. :o
>And so we should expect AI to look the same -
Maybe! Or it might never pan out, or it may pan out way better. Complicated things like this rarely turn out the way people expect, no matter how smart.
I’m thinking survivorship bias here. “Information Technology” is such a wide term, and we immediately think of the IT we currently use. Many of us can’t even remember all the blind alleys we wasted resources on in the ‘80s, especially those of us who weren’t there. I count myself among that group because I was a kid and didn’t pay much attention to business.
But I can say that, judging by historical artifacts, a lot of it was along the same broad lines as AI. And we maybe don’t realize how serious people were about it back then. The technology that actually changed the world was so comparatively boring and pragmatic that the stuff that was being hyped back then seems comically overwrought. It’s easy to assume it must have been a joke all along.
Thanks! We've swapped the baity title for that phrase above.
OK, this article inspired some positivity in my view. Here comes, of course, a disclaimer that this is just "wishful thinking", but still.
So we are in the process of "adapting a technology". Welcome, keep calm, observe, don't be ashamed to feel emotions like fear, excitement, anger and all else.
While adapting, we learn how to use it better and better. At first, we try "do all the work for me", then "ok, that was bad, plan what you would do, good, adjust, ok do it like this" etc etc.
A couple of years into the future this knowledge is just "passed on". If productivity grew and we "figured out how to get more out of the universe", then no jobs had to be lost, just readapted. And "investors" get happy not by "replacing workers", but by "reaping win-win rewards" from the universe at large.
There are dangers of course, like "maybe this is truly a huge win-win, but some losses can be hidden, like ecology", but "I hope there are people really addressing these problems and this win-win will help them be more productive as well".
Yet with IT, the bottleneck was largely technical and capital-related, whereas with AI it feels more organizational and cognitive.
Wow I didn’t realize that. But I always thought it. I was bewildered that anyone got any real value out of any of that pre-VisiCalc (or even VisiCalc) computer tech for business. It all looked kinda clumsy.
(Pre) VisiCalc: You have to understand that the primary users (accountants etc.) do not care about how a thing looks in their working process: if a tool helps them, they will use it even if it's ugly by frontend aesthetic standards :-)
(Think about those old black/white or green mainframe screens - horrible looking, but they got the job done.)
The coding tools are not hard to pick up. Agent chat and autocomplete in IDEs are braindead simple, and even TUIs like Claude Code are extremely easy to pick up (I think it took me a day?). And despite what the vibers like to pretend, learning to prompt them isn't that hard either. Or, let me clarify: if you know how to code, and you know how you want something coded, prompting them isn't that hard. I can't imagine it'll take that long for an impact to be seen, if there is a major impact to be seen.
I think it's more likely that people "feel" more productive, and/or we're measuring bad things (lines of code is an awful way to measure productivity -- especially considering that these agents duplicate code all the time so bloat is a given unless you actively work to recombine things and create new abstractions)
It reminds me a lot of Adderall's effect on people without ADHD: a pretty universal feeling that it's making you smarter, paired with no measurable increase in test scores.
That's a good analogy. I've never done stimulants, but from what I've heard about them they make people very active but that isn't the same as productive.
> It wasn't until the mid-to-late 1990's that information technology finally started to show clear benefit to the economy overall.
The 1990s boom was in large part due to connectivity -- millions[1] of computers joined the Internet.
[1] In the 1990s. Today, there are billions of devices connected, most of them Android devices.
Android in the 90s? Not really.
Is this like the hotels that first jumped on the wifi bandwagon? They spent lots of money up front for expensive tech. Years later, anyone could buy a cheap router and set it up, so every hotel had wifi. But the original high-end hotels that were first out with wifi and paid a lot for it have the worst, oldest wifi and charge users for it, still trying to recoup the costs.
You’re missing the true value, the network.
Widespread internet access turned expensive toys (PCs) into useful assets.
I like your username.
I don't think LLMs are similar to computers in terms of productivity boost.
One part of the system moving fast doesn't change the speed of the system all that much.
The thing to note is that verifying whether something got done is hard and takes time in the same ballpark as doing the work.
If people are serious about AI productivity, let's start by addressing how we can verify program correctness quickly. Everything else is just a Ferrari between two red traffic lights.
Really? I disagree that verifying is as hard as doing the work yourself. It’s like P != NP.
Productivity may rise with time, and costs may come down. But the money is already spent.
Some of the money is spent. What happens when better models, more efficient cooling techniques, and other technologies hit? Seems like the best strategy at this point isn't dumping your entire FCF into datacenters, but wait and see if there's even a viable business efficiency improvement first.
Only it's much more exponential
If things like computer-aided design and improved supply chain management, for example, make manufactured goods last longer and cause less waste, I would expect IT to cause productivity to go down. I drive a 15 year old car and use a 12 year old PC. It's a good thing that productivity goes down, or stays the same.
> And so we should expect AI to look the same
Is that a substantiated assumption? I recall learning the history of AI at university in 2001: the initial frameworks were written in the 70's, and the prediction was that we would reach human-like intelligence by 2000. Just because Sama came up with this somewhat breakthrough AI, it doesn't mean that equal leaps of improvement will come on a monthly or annual basis going forward. We may well not make another huge leap, or reach what some call human-level intelligence, in 10 years or so.
> it's helping lots of people, but it's also costing an extraordinary amount of money
Is it fair to say that Wall Street is betting America's collective pensions on AI...
They're betting a lot more than that, but since all their chips are externalities they don't care.
Very few people have pensions anymore. People now direct their own retirement funds.
That's what he was saying. Wall Street (the stock market) is people's "pensions" now, because everyone has a 401k or equivalent, so their retirement is tied to the market. Thus, these companies are betting America's collective retirement on AI...
I thought he was talking about actual pension funds, which still exist and invest large sums of money in Wall Street. But make sure your 401k doesn't include AI funds then. You should have a choice over what part of the market to invest in.
I mean the productivity paradox was only temporarily remedied. Around 2005 we entered a second version of the paradox and it persists to this day. I'll note that 2005 was when the internet became dominated by walled-gardens and social-media, _and_ it was the last year that people got to use the internet without smartphones (in 2006 LG released a smartphone, with Apple releasing iPhone in 2007).
The combination of attention-draining social media walled gardens, and the high performance pocket-computers (which are really designed for consumption instead of productivity), created a positive feedback loop that helped destroy the productivity that we won by defeating the paradox in the 1990s. And we have been struggling against this new paradox for twenty years, since. AI seems like it should defeat the paradox because it is a kind of hands-free system, perfect for mobile phones -- but this is really just a very expensive solution to a problem that we have created and allowed to fester. We could just shun the walled gardens, and demand to be paid for our attention and data.
The new productivity paradox (which I do not think AI in its current form can fix[1][2]), is the price that we pay for a prosperous and valuable advertising industry. And as long as the web is seen as an ad-channel, and as long as the web is always vibrating in your pocket, we will keep paying this price. We will eventually end up (metaphorically) lobotomizing our children, and families, and communities, so that the grand-children of ad-executives and tech-bros and frat-bros can grow up healthy, psychologically stable, educated, and comfortably wealthy. (Brain drain: now available literally everywhere).
[1]: It is telling that most LLMs are centralized, and are most useful as search-engines/information-retrieval-systems. The centralization makes them _spyware_, and their ability to directly answer any question encourages users to actually ask direct questions, instead of stringing search-terms together. This makes the prompts high-signal advertising data (i.e. instead of inferring what you are looking for from the search-string, these companies can see _exactly_ what you are looking for and why -- and with LLMs, they can probably turn these prompts into joint-probability-tables or whatever other kind of serialization they need to figure out which products to sell you, either on the web or directly in the response to your prompt).
[2]: As far as copyright infringement goes, LLM outputs may require mass clean-room rewrites (so your productivity, as pathetic as it already is, now gets _halved_ long term) of text, prose, code, and anything else that is produced with them, because of how copyright law works. In legal arts this is called _the fruit of the poison tree_, and any short-term productivity gains, may become long term liabilities that need to be replaced due to _legal mandate_ -- so even if LLMs can eventually produce perfect and faultless outputs, the copyright laws _in all 200+ countries_ would have to be torn down and rebuilt (and this will certainly come at great expense).
My experience has been
* If I don't know how to do something, llms can get me started really fast. Basically it distills the time taken to research something to a small amount.
* if I know something well, I find myself trying to guide the llm to make the best decisions. I haven't reached the state of completely letting go and trusting the llm yet, because the llm doesn't make good long term decisions
* when working alone, I see the biggest productivity boost in ai and where I can get things done.
* when working in a team, llms are not useful at all and can sometimes be a bottleneck. Not everyone uses llms the same way, sharing context as a team is way harder than it should be. People don't want to collaborate. People can't communicate properly.
* so for me, solo engineers or really small teams benefit the most from llms. Larger teams and organizations will struggle because there's simply too much human overheads to overcome. This is currently matching what I'm seeing in posts these days
I suspect the real breakthrough for teams won't be better raw models, but better ways to make the "AI-assisted thinking" legible and shareable across the group, instead of trapped in personal prompt histories
This seems like a problem simply stated but not simply solved. I think Grokipedia or whatever it was called was a real exercise in “no one cares about cached LLM output”. The ephemeral nature of LLM output is somehow a core property of its utility. Kind of like I never share a Google search with a coworker, I share the link I found.
I sort of have this indirectly solved with a project I'm working on inspired by Beads. One thing I added is as you have the LLM work on tasks, you can sync them directly to GitHub, I would love to add other ticketing / task backends to it, but I mostly just use GitHub. You can also create them on GitHub and sync them down and claim a task (the tool will post a comment on GitHub that you've claimed the work). I can see people using it to collaborate easier, but for the time being it's just me using it for myself. ;)
These tasks become your prompt once refined. I basically braindump to Claude, have it make tasks from my brain dump. Then I tell Claude to ask me clarifying questions, it updates the tasks and then I have Claude do market research for some or all tasks to see what the most common path is to solve a given problem and then update the tasks.
The future of work is fewer human team members and way more AI assistants.
I think companies will need fewer engineers but there will be more companies.
Now: 100 companies who employ 1,000 engineers each
What we are transitioning to: 1,000 companies who employ 100 engineers each
What will happen in the future: 10,000 companies who employ 10 engineers each
Same number of engineers.
We are about to enter an era of explosive software production, not from big tech but from small companies. I don't think this will only apply to the software industry. I expect this to apply to every industry.
It will lead to hollowing out of the substance everywhere. The constant march to more abstraction and simplicity will inevitably end up with AI doing all the work and nobody understanding what is going on underneath, turning technology into magic again. We have seen people losing touch with how things work with every single move towards abstraction, machine code -> C -> Java -> JavaScript -> async/await -> ... -> LLM code generation, producing generations of devs that are more and more detached from the metal and living in a vastly simplified landscape not understanding trade-offs of the abstractions they are using, which leads to some unsolvable problems in production that inevitably arise due to the choices made for them by the abstractions.
> nobody understanding what is going on underneath
I think many developers, especially ones who come from EE backgrounds, grossly overestimate the number of people needed who understand what is going on underneath.
“Going on underneath” is a lot of interesting and hard problems, ones that true hackers are attracted to, but I personally don’t think that it’s a good use of talented people to have 10s or 100s of thousands of people working on those problems.
Let the tech geniuses do genius work.
Meanwhile, there is a massive need for many millions of people who can solve business problems with tech abstractions. As an economy (national or global), supply is nowhere close to meeting demand in this category.
The point is that LLMs can only replicate what existed somewhere; they aren't able to invent new things. Once humans lose their edge, there won't be any AI-driven progress, just a remix of existing stuff. That was the hollowing out I mentioned. Obviously, even these days there is tech that looks like magic (EUV litho etc.), but there are at least some people who understand how it all works.
And those companies will do what? Produce products in uber-saturated markets?
Or magically 9900 more products or markets will be created, all of them successful?
Go back to a time at the start of Youtube.
Now answer these questions:
And what will those people who make videos on Youtube do? Produce videos in uber-saturated categories?
Or magically 9900 more media channels will be created, all of them successful?
To follow this up, one of my favorite channels on Youtube is Outdoor Boys. It's just a father who made videos once a week doing outdoor things like camping with his family. He has amassed billions of views on his channel. Literally a one-man operation. He does all the filming, editing himself. No marketing. His channel got so popular that he had to quit to protect his family from fame.
Many large production companies in the 2000s would have been extremely happy with that many views. They would have laughed you out of the building if you had told them a single person could ever produce compelling enough video content and get that many viewers.
Serious question: but aren't there thousands of other guys doing almost the same thing and getting almost no views? Even if there are lots of new channels, there aren't going to be lots of winners
But there are many Youtubers making a decent living doing it as a one-person shop or a small team. In the past, you needed a large team with a large budget and buy-in from TV/DVD/VHS distributors to get an audience.
I don't understand the people who think more companies with fewer employees is a good thing.
I already feel spammed to death by desperate requests for my consumption as is.
Because: 1. One-person companies are better at giving the wealth to the worker. 2. With thousands of companies, products can be more targeted, and each product can serve fewer people and still be profitable.
Then companies won't need to spam you to convince you that you need something you don't. Or that their product will help you in ways it can't.
One-person companies will not have a 100-person marketing team trying to inject ads into every corner of your life.
> One-person companies will not have a 100-person marketing team to inject ads into every corner of your life.
But they could have a thousand-agent swarm connected via MCP to everything within our field of vision to bury us with ads.
It's been a long time since I read "The Third Wave", and up until 2026 not much had reminded me of its "small is beautiful" and "rise of the prosumer" themes besides the content creator economy (arguably the worst thing to ever happen to humanity's information environment) and LLM agent discussions.
> the content creator economy
This is exactly one of the things I find maddening at the moment. "Everyone" (except my actual friends) on social media is trying to sell me something.
Eg: I like dogs. It's becoming increasingly hard to follow dog accounts without someone trying to sell me their sponsor's stuff.
> One-person companies will not have a 100-person marketing team trying to inject ads into every corner of your life.
Because these one person companies will scale up everything with AI except marketing/advertising? Consider me skeptical.
> And those companies will do what? Produce products in uber-saturated markets?
> Or magically 9900 more products or markets will be created, all of them successful?
Yes. Products will become more tailored/bespoke rather than a lot of the one size fits all approach that is pervasive now.
> smaller companies
And large companies. The first half of my career was spent writing internal software for large companies. I believe it's still the case that the majority of software written is internal software. AI will be a boon for these use cases, as it will make it easier for every company, big and small, to have custom software for its exact use case(s).
> AI will be a boon for these use cases as it will make it easier for every company big and small to have custom software
Big cos often have the problem of defining the internal problems they're trying to solve. Once those are identified, they have to create organizational permission structures to allow the solutions. Then they need to stay on task long enough to build and use the software to solve the problem.
Only one of these steps is easily improved with AI.
And then potentially suffer from integration hell.
The benefit of using off the shelf software is that many of the integration problems get solved by other people. Heck you may not even know you have a problem and they may already have a solution.
Custom software, on the other hand, could just breed more demand for custom software. We gotta be careful how much custom stuff we do, lest it get completely out of hand.
Yeah, I agree.
When Engineering Budget Managers see their AI bills rising, they will fire the bottom 5-10% every 6-12 months and increase the AI assistant budget for the high performers, giving them even more leverage.
In my case, over the last 3 years, every dev who left was not replaced. We are doing more than ever.
Our team shrunk by 50% but we are serving 200% more customers. Every time a dev left, we thought we're screwed. We just leveraged AI more and more. We are also serving our customers better too with higher retention rates. When we onboard a customer with custom demands, we used to have meetings about the ROI. Now we just build the custom demands in the time we took to meet to discuss whether we should even do it.
Today, I maintain a few repos critical to the business without even knowing the programming language they are written in. The original developers left the company. All I know is what is supposed to go into the service and what is supposed to come out. When there is a bug, I ask the AI why. The AI almost always finds it. When I need to change something, I double and triple check the logic, and I know how to test the changes.
No, a normal person without a background in software engineering can't do this. That's why I still have a job. But how I spend my time as a software engineer has changed drastically and so has my productivity.
When a software dev says AI doesn't increase their productivity, it truly does feel like they're using it wrong or don't know how to use it.
> Today, I maintain a few repos critical to the business without even knowing the programming language they are written in. [...] No, a normal person without a background in software engineering can't do this.
Of course they can - if you don't know any of the tech-stack details (i.e. a "normal" user), why can't someone else who also doesn't know the tech-stack details replace you?
What magic sauce do you possess other than tech-stack chops?
In the future, they might be able to. Not yet though. I still have a job.
When a non software engineer can build a production app as well as I can, I know I won’t be working as a software engineer anymore. In that world, having great ideas, data, compute, and energy will be king.
I don’t think we will get there within the next 3-4 years. Beyond that, who knows.
Could you provide some details on your company, code base, etc? These are wild claims and don’t match the reality I’m seeing everywhere else.
How big is your team? How many customers? What’s your product? Can we see the code? How do you track defects? Etc.
Part of the reason I’m struggling with this is because we’d be seeing OpenAI, Anthropic, etc. plastering these case studies everywhere if they existed. Instead, I’m stuck using CC and all its poorly implemented warts.
Comment was deleted :(
Now if only companies knew how to correctly assess actual impact and not perceived impact.
I don't think this is an AI problem. Even before AI, FAANG companies famously optimized promotions for perceived impact.
During the promo review, people will look at how many projects were done and the impact of those projects.
Acquisition rate, retention rate, revenue, profit margin?
By those metrics, Microsoft lost 20% of its value due to hopping on the AI coding assistance train.
I'm not saying it is the case, just making it apparent how unreliable it is to measure productivity by comparing what's happening at the lowest level in a company to its financials.
This seems like a bot comment.
So is yours.
That means the system will collapse in the future. Right now, good programmers are made from a larger pool of people; the rest go into marketing, sales, agile or other not-really-technical roles. When the initial crowd is gone, there will be no experienced users of AI. Crappy, inexperienced developers will make more crap, without prior experience or the ability to judge design decisions. Basically: no seniors without juniors.
This implies that writing code by hand will remain the best way to create software.
The seniors today who have got to senior status by writing code manually will be different than seniors of tomorrow, who got to senior status using AI tools.
Maybe people will become more of generalists rather than specialists.
Generalist is not automatically bad. I design digital high speed hardware and write (probably crappy) Qt code. The thing is that I have experience to judge my work. Greenhorns can’t and this will lead to crapification of the whole industry. I often ask AI tools for an advice. Sometimes it’s very useful, sometimes it’s complete hallucination. On average it definitely makes me better developer. Having rather abstract answer I can derive exact solution. But that comes from my previous experience. Without experience it’s a macabre guessing game.
> The seniors today who have got to senior status by writing code manually will be different than seniors of tomorrow, who got to senior status using AI tools.
That’s putting it mildly. I think it’s going to be interesting to see what happens when an entire generation of software developers who’ve only ever known “just ask the LLM to do it” are unleashed on the world. I think these people will have close to no understanding of how computing works on a fundamental level. Sort of like the difference between Gen-X/millenial (and earlier) developers who grew up having to interact with computers primarily through CLIs (e.g., DOS), having to at least have some understanding of memory management, low-level programming, etc. versus the Gen-Z developers who’ve only ever known computers through extremely high level interfaces like iPads.
I barely know how assembly, CPUs, GPUs, compilers, or networking work. Yet software that I've designed and written has been used by hundreds of millions of people.
Sure, maybe you would have caught the bug if you wrote assembly instead of C. But the C programmer still released much better software than you, faster. By the time you shipped v1 in assembly, the C programmer had already iterated 100 times and found product-market fit.
Casey Muratori says that every programmer should understand how computers work and if you don't understand how computers work you can't be a good programmer.
I might not be a good programmer but I've been a very productive one.
Someone who is good at writing code isn't always good at making money.
The problem with much of this talk is that the receipts are always nowhere to be found.
But I don't see any receipts from the opposite side either.
AI slop books made more money than JK Rowling, too.
To be fair, I wouldn't be entirely surprised if they were better than what she barfs onto a page. She's not exactly Tolkien.
Maybe in the future, yeah. Most likely not, because creating books is much easier now, but total reading time can't increase nearly as fast. More books chasing the same amount of reading time.
I think we were headed that way before LLMs came onto the scene.
LLMs just accelerated this trend.
By and large "AI assistant" is not a real thing. Everyone talks about it but no one can point you to one, because it doesn't exist (at least not in a form that any fair non-disingenuous reading of that term would imply). It's one big collective hallucination.
> I think companies will need fewer engineers but there will be more companies.
This would be strange, because all other technology development in history has taken things the exact opposite direction; larger companies that can do things on scale and outcompete smaller ones.
> This would be strange, because all other technology development in history has taken things the exact opposite direction; larger companies that can do things on scale and outcompete smaller ones.
I don't think this has always been true. Youtube allowed many more small media production companies - sometimes just one person in their garage.
Shopify allowed many more small retailers.
Steam & cheap game engines allowed many more indie game developers instead of just a few big studios.
It likely depends on the stage of the tech development. I can see Youtube channels consolidating into a few very large channels. But today, there are far more media production companies than 30 years ago.
That’s an argument for giant companies at scale like Google/YouTube.
I don't think so. Are there more media production entities now or in the 2000s?
> llms can get me started really fast. Basically it distills the time taken to research something
> the llm doesn't make good long term decisions
What could possibly go wrong, using something you know makes bad decisions as the basis for learning something new?
It's like if a dietician instructed a client to go watch McDonald's staff, when they ask how to cook the type of meals that have been recommended.
I'm bearish on AI, but I still think this is disingenuous. My grade school math teachers were probably not well-versed in Calculus and Real Analysis, but they helped me learn my times tables just as well.
AI is great at exposing you to what you don’t even know you don’t know: your personal unknown unknowns, the complexity you’re completely unaware of.
> My grade school math teachers were probably not well-versed in Calculus and Real Analysis, but they helped me learn my times tables just as well.
Are you somehow equating basic multiplication to a "bad long term decision"?
No, I was focusing on this quote:
> llms can get me started really fast. Basically it distills the time taken to research
I was saying that it speeds up research by exposing you to the things you don’t know that you don’t know.
> exposing you to the things you don’t know that you don’t know
Or the things that it has hallucinated, or just referenced incorrectly.
That's my point. You're talking about learning basic maths from a school teacher who isn't a Calculus expert, but the thread is talking about learning maths from a kid with ADHD who completes the homework before the teacher has finished describing what needs to be done, but sometimes returns homework in an invented language with references to Cthulhu throughout it.
To me, the biggest benefit of LLMs has always been as a learning tool, be it for general queries or "build this so I can get an idea of how it works and get started quickly". There are so many little things that you need to know when trying anything new.
My compsci brain suggests large orgs are a distributed system running on faulty hardware (humans) with high network latency (communication). The individual people (CPUs) are plenty fast, we just waste time in meetings, or waiting for approval, or a lot of tasks can't be parallelized, etc. Before upgrading, you need to know if you're I/O Bound vs CPU Bound.
When my company first started pushing for devs to use AI, the most senior guy on my team was pretty vocal about coding not being the bottleneck that slowed down work. It was an I/O issue, and maybe a caching issue as well from too many projects going at the same time with no focus… which also makes the I/O issues worse.
Ironically, using AI on records of meetings across an org is amazing. If you can find out what everyone is talking about, you can talk to them.
Privacy is non existent, every word said and message sent at the office is recorded but the benefits we saw were amazing.
Yep. Previous org I was at had the fancy copilot integrated into teams. All meetings were auto transcribed and you could chat with copilot about the meeting directly in the chat. It was absolutely magical at extracting action points, decisions and other salient points. It was like having a secretary for each and every meeting.
So how is it going for him? Was he able to prove his point?
He was laid off… unrelated to these opinions.
He was correct though. For example, I’ve been waiting over a month for another team to set me up so I can test something they wanted me to develop. I’ve followed up multiple times. AI coding tools aren’t going to solve my blocker.
Maybe experienced people are the L2 cache? And the challenge is to keep the cache fresh and not too deep. You want institutional memory available quickly (cache hit) to help with whatever your CPU people need at that instant. If you don't have a cache, you can still solve the problem, but oof, is it gonna take you a long time. OTOH, if you get bad data in the cache, that is not good, as everyone is going to be picking that out of the cache instead of really figuring out what to do.
L2? I'm hot L1 material, dude.
But I like your and OP's analogy. Also, the productivity claims are coming from the guys in main memory or even disk, far removed from where the crunching is taking place. At those latency magnitudes, even riding a turtle would appear like a huge productivity gain.
In my opinion, you're very wrong. There is typically lots of good communication -- one way. The stuff that doesn't get communicated down to worker bees is intentional. "CPUs" aren't all that fast either, unless you make them fast by providing incentives. If you're a well-paid worker who likes their job, I can see why you would think that, but most people aren't that.
Meetings are work, as much as IPC and network calls are work. Just because they're not fun, or not what you like to do, doesn't mean they're any less work.
I think you're analyzing things from a tactical perspective, without considering strategy. For example, have you considered that it might not be desirable for CPUs to be just fast, or fast at all? Is CISC faster than RISC? Different architectural choices follow from different strategic goals, right?
If you're an order picker at an Amazon warehouse, raw speed is important: being able to execute a simpler and more fixed set of instructions (RISC), and at greater speed, is more desirable. If you're an IT worker, less so. IT is generally a cost center, except for companies that sell IT services or software. If you're in a cost center, then you exist for non-profit-related strategic reasons, such as to help the rest of the company work efficiently, be resilient, compete, be secure. Some people exist in case they're needed some day, others are needed critically but not frequently, yet others are needed frequently but not critically. Being able to execute complex and critical tasks reliably and in short order is more desirable for some workers. Being fast in a human context also means being easily bored, or it could mean lots of bullshit work needs to be invented to keep the person busy and happy.
I'd suggest taking that compsci approach but considering not just the varying tasks and workloads, but also the diversity of goals and use cases of the users (decision makers/managers in companies). There are deeper topics with regard to strategy and decision making surrounding the state machines of incentives and punishments, and decision-maker organization (hierarchical, flat, hub-and-spoke, full-mesh, etc.).
Meetings can be work, but often they are a waste of time. Often they are only held because the company has not found a better way to structure itself, which is also accepted by the management layer, which often has a profound fear of losing control and likes to micromanage. If you can zone out for most of the meeting and not experience negative effects from that, then the meeting was a waste of your time.
It's not a waste of your time if you're getting paid for it; that's why you are there. You're not there to be otherwise "productive", you're there to do what the company wants you to do. You might say they're inefficient, but are things like company outings efficient? B.S. townhall calls? Half the time it's just managers trying to hear their own voice in front of everyone. You're not there to write code or fix things, you're there to make those pay-check signers happy. Placate their egos if that's what they want.
That said, meetings are often a much more efficient means of syncing information than Slack/chat or email. Call it "real-time active communication with rich context" if that sounds more technical: you can communicate in voice tone, body language, timing, etc., in ways you can't using other means. And communicating doesn't mean just talking or listening for the sake of it; it can mean brainstorming, understanding requirements and expectations better, and preventing misunderstandings and other wasted effort.
In my experience, things that exist as patterns like this in systems are always important, but it's also important to use them as intended, and not abuse them excessively.
Simply extracting the most value out of individual contributors isn't typically the goal of white-collar management. As in my earlier example, you won't see order pickers at Amazon warehouses attend meetings all day; their time at work is valued differently than a white-collar worker's.
In some cases it might even make the mismatch worse. If one person can produce drafts, specs, or code much faster, you just create more work for reviewers, approvers, and downstream dependencies, which increases queueing.
Operationally, I think new startups have a big advantage in setting up to be agent-first. They might not be as good as the old human-first operations, but they'll be much cheaper and more nimble as models improve.
Startups mostly move fast by skipping the ceremony which large corps have to perform to prevent a billion-dollar product from melting down. It's possible for startups because they don't have a billion dollars to start with.
Once you do have a billion-dollar product, protecting it requires time, money and people to keep it running, because building a new one is a lot more effort than protecting the existing one from melting down.
This.
Once you have revenue you have downside to protect. Pre-revenue the worst that can happen is that you have to start again knowing more than you did.
Interesting analogy to explore: a distributed system as compared to organizational dynamics.
None of this fits
Then where are all the amazing open source programs written by individuals by themselves? Where are all the small businesses supposedly assisted by AI?
> 4% of GitHub public commits are being authored by Claude Code right now. At the current trajectory, we believe that Claude Code will be 20%+ of all daily commits by the end of 2026.
https://newsletter.semianalysis.com/p/claude-code-is-the-inf...
> Honestly, AI slop PRs are becoming increasingly draining and demoralizing for #Godot maintainers.
> If you want to help, more funding so we can pay more maintainers to deal with the slop (on top of everything we do already) is the only viable solution I can think of
https://www.pcgamer.com/software/platforms/open-source-game-...
> If you want to help, more funding so we can pay more maintainers to deal with the slop (on top of everything we do already) is the only viable solution I can think of
This is exactly the wrong approach! Funnel even more money away from productive tasks and into AI? Madness![1]
The only viable solution is being quick with a banhammer - maybe someone should start up a spamhaus type list of every github user who submitted AI slop.
Force them to burn these accounts on the very first spam.
------------
[1] Imagine if we chose this approach to deal with spam - we ask people for more money to hire a warm body to individually verify each email. Do you think spam would be the solved problem it is today?
There's lots of slop out there; that doesn't mean it's actually good or useful code.
Keep moving those goal posts.
Doesn’t look like goal-post moving to me. GP argued that AI isn’t making a difference, because if it was, we’d see amazing AI-generated open source projects. (Edit: taking a second look, that’s not exactly what GP said, but that’s what I took away from it. Obviously individuals create open source projects all the time.)
You rebutted by claiming 4% of open source contributions are AI generated.
GP countered (somewhat indirectly) by arguing that contributions don’t indicate quality, and thus wasn’t sufficient to qualify as “amazing AI-generated open source projects.”
Personally, I agree. The presence of AI contributions is not sufficient to demonstrate “amazing AI-generated open-source projects.” To demonstrate that, you’d need to point to specific projects that were largely generated by AI.
The only big AI-generated projects I’ve heard of are Steve Yegge’s GasTown and Beads, and by all accounts those are complete slop, to the point that Beads has a community dedicated to teaching people how to uninstall it. (Just hearsay. I haven’t looked into them myself.)
So at this point, I’d say the burden of proof is on you, as the original goalposts have not been met.
Edit: Or, at least, I don’t think 4% is enough to demonstrate the level of productivity GP was asking for.
It has been argued for a very long time that lines of code is largely meaningless as a metric. But now that AI is writing those lines... it seems to be meaningful again? I continue to be optimistically skeptical.
It's not a great ask. Who's going to quantify what is 'amazing open source work'?
4% for a single tool used in a particular way (many are out there using AI tools in a way that doesn't make it clear the code was AI authored) is an incredible amount. Don't see how you can look at that and see 'not enough'.
The vast majority of people using these tools aren't announcing it to the world. Why would they ? They use it, it works and that's that.
So we're suddenly going back into measuring lines of code as a useful metric?
Just because people are shitting out endless slop code that they never bothered to throw a second glance at doesn't mean it's good or that it's leading to better projects or tools; it literally just means people are pushing code out haphazardly. If I made a python script that everyone started using and all it did was create a repo, commit a README and push it every 5 seconds, we'd be seeing billions of lines of code added! But none of it is useful in any way.
Same with AI, sure we're generating endless piles of code, but how much of it is actually leading to better software?
It's not the be all end all and there are obviously issues with using it alone, but it's rather silly going the other extreme and pretending it isn't a major factor.
>If I made a python script that everyone started using and all it did was create a repo, commit a README and push it every 5 seconds we'd be seeing
1. Well you can't do that
2. Something like that won't register as Claude Code (or any other AI tool) usage anyway
3. Something like that won't come anywhere near 4%
> Well you or anyone else can't do that...
But that's what these tools are doing, in a large number of cases? At least the end result is basically the same. Like that Clawdbot guy (or whatever coattail-riding name they've decided on) who has 70k commits in the last few months, which I saw being touted as an impressive feat on HN the other day. How much broken, unusable code exists within those 70k commits that ultimately would've had the same effect as if he had just pushed a `--allow-empty` commit thousands of times?
Now whatever, if it's people pushing slop into their own codebase that they own, more power to them, my issue stems from OSS projects being inundated with endless spam MR/PRs from AI hypesters. It's just making maintainer's lives more difficult, and the most annoying part of it all is that they have to put up with people who don't see the effort disparity between them prompting their chatbot to write up some broken bullshit vs the effort required for maintainers to keep up with the spam. It hurts the maintainers, it hurts genuine beginners who would like to learn and contribute to projects, it hurts the projects themselves since they have to waste what precious little time and resources they already have digging through crap, it hurts quite literally everyone who has to deal with it other than the selfish AI-using morons who just take a huge dump over everyone and spouts shit like "Well 4% of all code on Github is now AI-generated!" as if more of that is somehow a good thing.
>But that's what these tools are doing, in a large number of cases?
I mean, no, not really. I'm not sure why you think that.
>How much broken, unusable code exists within those 70k commits that ultimately would've had the same effect as if he had just pushed a `--allow-empty` commit thousands of times?
How much stable usable code exists within those 70k commits ?
This is pretty much exactly why I said the original question was not a great ask. You have your biases. Show an example, and the default response for some - almost like a stochastic parrot - is "Must be slop!". How do you know? Did you examine it? No, you didn't. That's just what you want to believe, so it must be true. It makes for a very frustrating conversation.
> where are all the amazing open source programs
> amazing
Nobody moved the goal posts.
They didn’t, amazing open source was asked for, meaningless stats were given. Not that GitHub public repositories were amazing before AI, but nothing has changed since, except AI slop being a new category.
Even if this was goalpost moving, is it really an unreasonable ask to not have slop everywhere?
I deliberately asked for amazing open source projects. I’ve yet to see a single AI coded project i would use.
Keep licking those boots.
Here are a few of mine from the past month - for all of them, 90%+ of the code was written by Claude Code:
- https://github.com/simonw/sqlite-history-json
- https://github.com/simonw/sqlite-ast
- https://github.com/simonw/showboat - 292 stars
- https://github.com/simonw/datasette-showboat
- https://github.com/simonw/rodney - 290 stars and 4 contributors who aren't me or Claude
- https://github.com/simonw/chartroom
Noting the star counts here because they are a very loose indication that someone other than me has found them useful.
I quickly read through the `sqlite-history-json` project. It's only a few hundred lines of code, and the code doesn't use transactions, which means it can fail and leave the database in an inconsistent state.
Being only a few hundred lines of code is a pro, not a con (it's 2,800 including tests: https://tools.simonwillison.net/sloccount?repo=https%3A%2F%2... - lines counted by my vibe-coded port of the classic Perl SLOCCount tool to run in a browser using Perl-in-WebAssembly.)
It does use transactions in the form of savepoints which means they can be nested: https://github.com/simonw/sqlite-history-json/blob/53e66b279...
Transactions are tested here: https://github.com/simonw/sqlite-history-json/blob/53e66b279...
I lead with sqlite-history-json because I think it's the most impressive of the bunch - it solves a difficult problem in an elegant way with code I would have been proud to write by hand.
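For anyone who hasn't used nested savepoints before, here's a minimal sketch in Python's sqlite3 of how they nest and roll back. This is an illustrative example only, not code from the repo:

    # Minimal sketch of nested SQLite savepoints (illustrative, not from sqlite-history-json).
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.isolation_level = None  # manage transactions explicitly
    conn.execute("CREATE TABLE items (name TEXT)")

    conn.execute("SAVEPOINT outer_sp")
    conn.execute("INSERT INTO items VALUES ('kept')")

    conn.execute("SAVEPOINT inner_sp")            # savepoints can nest
    conn.execute("INSERT INTO items VALUES ('discarded')")
    conn.execute("ROLLBACK TO inner_sp")          # undoes only the inner work

    conn.execute("RELEASE outer_sp")              # commits everything still pending
    print(conn.execute("SELECT name FROM items").fetchall())  # [('kept',)]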
You could have easily made the same point and just not included the last sentence. Guidelines and all that.
I feel differently: the last line is very important in this context, since it communicates the underlying thoughts and values of the poster.
Asking for "amazing" open source projects in this case is not asking out of genuine curiosity or want for debate, it is a rhetorical question asked out of frustration at the general trajectory of AI and who profits off of it -- namely the boot-wearers.
Seemingly every day on Show HN?
Also small businesses aren't going to publish blog posts saying "we saved $500 on graphic design this week!"
But they could have saved that $500 by paying... a human
Is saving $500 by generating some shitty AI art the bar? I thought this was supposed to replace entire departments.
Someone asked “where are all the small businesses”, this was a reply to that. Small businesses don’t have entire art departments.
Gotcha, so the impact of AI is that small businesses get to save a couple hundred dollars, and the cost is only 2% of your country's GDP. That's good.
Prior to industrialization if you wanted to paint something you had to know how to mix your own paints.
And make your own brushes.
Before the printing presses came along, putting up flyers was not even imaginable.
Signs for businesses used to be hand carved.
Then printed. A store sign was still produced by a team of professionals, but small businesses could reasonably afford to print a sign. Not often updated, but it existed.
Then desktop publishing took off. Now lone graphic designers could design and send work off to a print shop. Small businesses could now afford regularly updated menus, signage, and even adverts and flyers.
Now small businesses can make their own creatives. AI can change stylesheets, write ad copy, and generate promotional photos.
Does any of this have the artistry of hand carved signs from 600 years ago? Of course not.
But the point is technology gives individuals control.
None of this is even slightly correct lol
People have been painting with red and yellow ochre and soot for at least 50K years for sure, and probably several hundred thousand years in truth. You don't need a brush, you have fingers or a twig.
The walls on the streets of Pompeii are full of advertising -- they had an election going on and people just scribbled slogans and such on walls. You don't need flyers lol.
The idea that signs or advertising was "artistry" is deeply ahistorical. The reason old stuff looks real fancy is because labor was extremely cheap and materials were expensive.
> People have been painting with red and yellow ochre and soot for at least 50K years for sure,
Compare those to the pigments used (mixed up!) by professional painters, and then to what printers could make.
If you wanted to paint fine art in the 1400s you were possibly making your own canvases, your own paint brushes, and your own paints.
And on top of that you had to be a skilled painter!
> The walls on the streets of Pompeii are full of advertising -- they had an election going on and people just scribbled slogans and such on walls. You don't need flyers lol.
The American revolution included a lot of propaganda courtesy of printing presses and some very rich financiers who had a vested interest in a revolution occurring.
Pamphlets everywhere. It is one thing to scribble on a wall, it is another to produce messages at a mass scale.
That sense of scale has been multiplied yet again by AI.
No, that's just the impact that you're not going to hear in the news ("Small business saves a couple of hundred dollars" is not a good headline). But that's not the only "impact of AI". The bigger impacts are reflected in the news and the stock market almost on a daily basis over the last two years.
Couple hundred dollars
..a month
..multiplied by how many small businesses globally?
I think both. Most organizations lack someone like Steve Jobs to prime their product lines. Microsoft is a good example: their products over the years have been mostly meh. Then there are the meetings, pervasive and even more so in most companies thanks to the convenience of MS Teams. But currently they face reduced demand due to a softer market compared with 2-3 years ago. If you observe no effect while they lay off many people and revenue still holds, or at least shows no negative growth, I would surmise that AI is helping. But in corporate, it only counts if it directly contributes to sales numbers.
I've sat in a room with a too-big-to-fail banker's VP happily telling me and my boss that "we're getting rid of this whole floor".
Dateline: ~2010. Location: NYC. Why: Indian outsourcing shops.
Now the zinger, dear hn, is this: He actually said to us (we ran a more boutique consulting firm) that "everything has to be done 3 times" and "their work is crap". But "we're getting rid of this floor".
That, imho, was due to geopolitical machinations aimed at inducing India to become part of the West. The immediate equation of "money for quality work" wasn't working, but our higher-ups had grander plans, and sacrificing and gutting the IT industry in the US was not a problem.
So, given the incentives these days, do not remotely pin your hopes on what these CEOs are saying. It means nothing whatsoever.
And India buys Russian oil, so what exactly is the goal there?
The slow part as a senior engineer has never been actually writing the code. It has been:
- reviews for code
- asking stakeholders opinions
- SDLC latency (things taking forever to test)
- tickets
- documentation/diagrams
- presentations
Many of these require review. The review hell doesn't magically stop at Open source projects. These things happen internally too.
This is because the vast majority of white collar activity in a large corporation produces no direct economic value.
Making it easier/better just means more/higher quality "worthless" work is performed. The incentives in the not-directly-productive parts of organizations are to keep busy and maintain a stream of signals of productivity. For this, AI just raises the bar. The 25% of the work that -is- important to producing economic value just gets reduced to 15%.
The workforce in large orgs that is most AI adjacent is already idling along in terms of production of direct economic value. Making them 10x more productive in nonproductive work will not impact critical metrics in a short timeframe.
It’s worth noting that these “not directly productive” activities actually can (and often do) produce value, eventually. Things like brand identity, culture, meta-innovation, and vision (search-space) are intangibles that present as cost centers but can prove invaluable on longer timescales if done right.
There are a lot of people who sit with their laptop open while streaming something, sleeping or messing with their phone while periodically waking up to join a new meeting or fiddle with something to make it look like they are active.
These are the people "shocked" when they are displaced.
There are many reasons why such people might be employed. E.g. preventing a competitor from hoarding talent, so you decide to do it too.
What's taught in economics textbooks doesn't always reflect reality, ha.
Principal-agent problem.
The manager wants a large team. The shareholder who ultimately employs the manager but does not control operations does not want that of course.
Hmm.
My company’s behind the curve, just got nudged today that I should make sure my AI use numbers aren’t low enough to stand out or I may have a bad time. Reckon we’re minimum six months from “oh whoops that was a waste of money”, maybe even a year. (Unless the AI market very publicly crashes first)
My manager mentioned that his manager (an executive) is not happy because our org is not using as many tokens as other orgs in the company. Pretty wild.
Just have Claude code churn out some Harry Potter fan fiction for an hour a day and you’ll meet your KPI easily
It could literally be internal marketing fan fiction on the org's intent to meet KPI's with a focus on synergistic evolution towards x-functional singularity between department hiveminds including a footer on projected outcomes for operational efficiency.
I think LinkedIn is in the dataset, right?
"You haven't hit your spell quota today, Harry. I'm putting you on a Wizard Improvement Plan."
You are in the "not-enough-AI" stage - keep increasing your usage but try to keep it very gradual to avoid entering the "do-more-AI-within-this-budget" stage too soon. It seems like the firing would be in following stages:
1. too old/expensive
2. not using AI
3. using AI but not productive
4. productive using AI but not within AI budget
5. reduce AI budget and GOTO 3
So management basically have no clue and want you to figure out how to use AI?
Do they also make you write your own performance review and set your own objectives?
> Do they also make you write your own performance review and set your own objectives?
Not to get off on a tangent but this has got to be a "tell" for how much a company is managed by formula and how much it's actually got thinking people running things. Every time I've had to write my own review I fill out the form with some corporatese bullshit, my supervisor approves it and adds some more bullshit, it disappears into HR and I never hear anything about it until it's time for the next review, and it starts over again. There isn't even reference to any of my "objectives" from the last review, because that review has simply disappeared.
But I'm sure some HR exec is checking boxes for following "best practices" in employee evaluation.
That’s exactly what we do at the Fortune 500 company where I work, and it’s surreal.
In my first year I didn’t know any better, so I tried to set myself some actual objectives (learn to use XYZ, improve test coverage by X%, measurable stuff that would actually help).
Fortunately my manager showed me how to do it correctly, so now my goals are to “differentiate with expertise” and to “empower through better solutions”.
Every year I open up the self-review, grade myself a 5/5 on these absurd, unmeasurable goals, my manager approves it, and it disappears off somewhere into the layers and layers of ever-higher management where nobody cares to look at it.
> So management basically have no clue and want you to figure out how to use AI?
This is basically the same story I have heard both my own place of employment and also from a number of friends. There is a "need" for AI usage, even if the value proposition is undefined (or, as I would expect, non-existent) for most businesses.
Look, to make something productive out of it: a job seeker with high-level skills in using LLM assistance will be much more valuable than one without the experience. Never mind your current company management's policies.
Management probably also wants them to figure out how to use the laptops, IDEs, and other resources provided to them. Getting a tool for your employees that you've been told is important but have no idea what to do with is a perfectly valid management task.
The thing with a lot of white collar work is that the thinking/talking is often the majority of the work… unlike coding, where thinking is (or, used to be, pre-agent) a smaller percentage of the time consumed. Writing the software, which is essentially working through how to implement the thought, used to take a much larger percentage of the overall time consumed from thought to completion.
Other white collar business/bullshit job (ala Graeber) work is meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate those thoughts, thinking about market positioning, etc.
Maybe tools like Cowork can help to find files, identify tickets, pull in information, write Excel formulas, etc.
What’s different about coding is no one actually cares about code as output from a business standpoint. The code is the end destination for decided business processes. I think, for that reason, that code is uniquely well adapted to LLM takeover.
But I’m not so sure about other white-collar jobs. If anything, AI tooling just makes everyone move faster. But an LLM automating a new feature release and drafting a press release and hopping on a sales call to sell the product is (IMO) further off than turning a detailed prompt into a fully functional codebase autonomously.
I’m confused what kind of software engineer jobs there are that don’t involve meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate that, thinking about market positioning, etc?
If you weren’t doing much of that before, I struggle to think of how you were doing much engineering at all, save for some more niche, extremely technical roles where many of those questions were already answered. But even then, I would expect you’re having those kinds of discussions, just more efficiently and with other engineers.
> I’m confused what kind of software engineer jobs there are that don’t involve meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate that, thinking about market positioning, etc?
The vast majority of software engineers in the world. The most widespread management culture is that where a team's manager is the interface towards the rest of the organization and the engineers themselves don't do any alignment/consensus/business thinking, which is the manager's exclusive job.
I used to work like that and I loved it. My managers were decent and they allowed me to focus on my technical skills. Then, due to those technical skills I'd acquired, I somehow got hired at Google, stayed there nearly a decade but hated all the OKR crap, perf and the continuous self-promotion I was obliged to do.
It seems that to some number of folks, "engineering" means "writing code."
> I’m confused what kind of software engineer jobs there are that don’t involve meeting with people, “aligning expectations”, getting consensus, making slides/decks to communicate that, thinking about market positioning, etc?
I'd suspect the kind that's going away.
That kind was already reserved for junior roles, contractors, and offshoring.
I’m not sure everyone would agree with that statement. As a more senior engineer at a big tech company, our execs still believe more code output is expected by level. Hell they even measure and rate you on lines of code deltas.
I don’t agree with it or believe it’s smart but it’s the world we live in
In a lot of larger organizations there is a whole stable of people whose job is to keep stakeholders and programmers from ever having to talk to each other. This was considered a best practice a quarter-century ago ("Office Space" makes fun of it), and in retrospect I concede it sometimes had a point.
In my case
* meeting with people, yes, on calls, on chats, sometimes even on phone
* “aligning expectations”, yes, because of the next point
* getting consensus, yes, inevitably or how else do we decide what to do and how to do it?
* making slides/decks to communicate that, not anymore, but this is a specific tool of the job, like programming in Java vs in Python.
* thinking about market positioning, no, but this is what only a few people in an organization have agency on.
* etc? Yes, for example: don't piss off other people, help customers using the product, identify new functionalities that could help us deliver a better product, prioritize them, and then back to getting consensus.
Well that’s why AI will not replace the software engineer!
IME a team or project lead does that, and the rest of the engineers maybe do it on a smaller scale but mostly implement.
>If you weren’t doing much of that before, I struggled to think of how you were doing much engineering at all
Isn't like half of our industry just churning out JS file after JS file to yet again change how facebook looks?
> making slides/decks to communicate those thoughts,
That use case is definitely delegated to LLMs by many people. That said, I don't think it translates into linear productivity gains. Most white collar work isn't so fast-paced that if you save an hour making slides, you're going to reap some big productivity benefit. What are you going to do, make five more decks about the same thing? Respond to every email twice? Or just pat yourself on the back and browse Reddit for a while?
It doesn't help that these LLM-generated slides probably contain inaccuracies or other weirdness that someone else will need to fix down the line, so your gains are another person's loss.
Yeah, but this is self-correcting. Eventually it will get to a point where the data that you use to prompt the LLM will have more signal than the LLM output.
But if you get deep into an enterprise, you'll find there are so many irreducible complexities (as Stephen Wolfram might coin them), that you really need a fully agentically empowered worker — meaning a human — to make progress. AI is not there yet.
Thinking is always the hardest part and the bottleneck for me.
It doesn’t capture everyone’s experience when you say thinking is the smaller part of programming.
I don’t even believe a regular person is capable of producing good quality code without thinking 2x the amount they are coding
Agree. I remember in school in the 1980s reading that a good programmer can write about 10 lines of code a day (citing The Mythical Man-Month) and I thought "that's ridiculous, I can write hundreds of lines a day" but didn't understand that's including all the time understanding requirements, thinking about design, testing, debugging, etc. Writing the code is a small portion of what a software engineer does.
Also remember that programs were much smaller, and code had to be typed in full and read accurately because compilers were slow and you didn't want to waste time on a syntax error. Anyway, it's common even today to work half a day thinking, debugging, and testing, and eventually git diff shows only two changed lines.
Most people (and most businesses) aren’t making good quality code though. Most tools we use have horrible codebases. Therefore now the code can often be a similar quality to before, just done far faster.
> The thing with a lot of white collar work is that the thinking/talking is often the majority of the work… unlike coding, where thinking is (or, used to be, pre-agent) a smaller percentage of the time consumed.
WHOAH WHOAH WHOAH WHOAH STOP. No coder I've ever met has thought that thinking was anything other than the BIGGEST allocation of time when coding. Nobody is putting their typing words-per-minute on their resume because typing has never been the problem.
I'm absolutely baffled that you think the job that requires some of the most thinking, by far, is somehow less cognitively intense than sending emails and making slide decks.
I honestly think a project manager's job is actually a lot easier to automate, if you're going to go there (not that I'm hoping for anyone's job to be automated away). It's a lot easier for an engineer to learn the industry and business than it is for a project manager to learn how to keep their vibe code from spilling private keys all over the internet.
Our job is not the intellectual exercise you think it is. We're not smarter than anyone else and software development is not automatically more thought-intensive than other jobs. The fact that programming is the first job task to be fully automated says it all.
When coders need a break from intense coding, what do they do with the remaining hours of the day? Usually, administrative stuff: sending emails, attending meetings (if they can choose when their meetings are), filing expense reports, etc. I.e., the stuff that's easy. Also, while I wasn't attempting to suggest that thinking more = higher IQ (just that it requires a lot of careful thought), average IQ scores per job are quite a bit higher in software engineering fields.
It’s weird that you equate time spent thinking with intelligence and egotism. Plenty of “normal people” jobs require lots of time spent thinking like art, writing, product and ad design. The only one implying taking time to think equals big brain master race is you
"Normal people"? "Big brain master race"? The only one implying weird things here is you.
> unlike coding, where thinking is (or, used to be, pre-agent) a smaller percentage of the time consumed. Writing the software, which is essentially working through how to implement the thought, used to take a much larger percentage of the overall time consumed from thought to completion.
huh? Maybe I'm in the minority, but the thinking:coding ratio has always been 80:20 for me: spend a ton of time thinking and drawing, then write once, debug a bit, and it works.
This hasn't really changed with LLM coding either, except that for the same amount of thinking, you get more code output.
Yeah, ratios vary depending on how productive you are with code. For me it was 50:50 and is now 80:20, but only because I was a relatively unproductive coder (struggled with language feature memorization, etc.) and a much more productive thinker/architect.
"Struggling with language feature memorization" is what we call "unemployed", not "relatively unproductive".
When the work involves navigating a bunch of rules with very ambiguous syntax, AI will automate it to the same degree that computers automated rules-based systems with very precise syntax in the 1990s.
This software (which I am not related to or promoting) is better at investment planning and tax planning than over 90% of RIAs in the US. It will automate RIAs to the same degree that trading software automated stockbroking. This will reduce the average RIA fee from 1% per year to 0.20% or even 0.10% per year, just like mutual fund fees dropped in the early '00s.
You could have beaten the returns of most financial professionals over the last several years by just parking your money in the S&P 500, and yet plenty of people are still making a lucrative career out of underperforming it. In some fields, “being better and cheaper” does not always spell victory.
You are right about beating money managers. When I said investment planning, I meant planning the size and tax structures of investments. This software automates all of the technical work that goes on inside financial planning firms, which is done by tens of thousands of white collar professionals in the US/UK/EU, etc. It will then lead to price competitiveness.
More expensive, silly companies will exist, but the cheap ones get the scale. S&P 500 index funds have over 1 trillion in the top 3 providers. Cathie Wood has like 6-7 billion.
BNYMellon is the custodian of $50 trillion of investment assets. robinhood has $324bn.
silly companies get the headlines though
Workers may see the LLM as a productivity boost because they can basically cheat at their homework.
As a CEO I see it as a massive clog up of vast amounts of content that somebody will need to check. A DDoS of any text-based system.
The other day I got a 155-page document on WhatsApp. Thanks. Same with pull requests. Who will check all this?
> Who will check all this?
The answer to that, for some, is more AI.
I had a peer explain that the PRs created by AI are now too large and difficult to understand. They were concerned that bugs would crop up after merging the code. Their solution was to use another AI to review the code... However, this did not solve the problem of not knowing what the code does. They had a solution for that as well: ask AI to prepare a quiz and then deliver it to the engineer to check their understanding of the code.
The question was asked - does using AI mean best-practices should no longer be followed? There were some in the conversation who answered, "probably yes".
> Who will check all this?
So yeah, I think the real answer to that is... no one.
Just yesterday one of my junior devs got an 800-line code review from an AI agent. It wasn't all bad, but is this kid literally going to have to read an essay every time he submits code?
Who gave you the 155 page doc? How quickly were they fired?
If I do something faster by pairing with AI, why should my employer reap the benefit? Why would I pass the savings on to my employer?
Could it be that employers are not seeing the difference because most employees are doing something else with the time they've saved by using AI?
There's been massive wage stagnation, benefits are crap, they play games with PTO. Most people I talk to who use AI as a part of their workflow are taking advantage of something nice that has come their way for a change.
I accept that AI-mediated productivity might not be what we expect it to be.
But really, are CEOs the best people to assess productivity? What do they _actually_ use to measure it? Annual reviews? GTFO. Perhaps more importantly, it's not like anything a C-level says can ever be taken at face value when it involves their own business.
The latest company I worked in had your typical fee-earners and fee-burners categories of employees.
The fee-earners had KPIs tied to the sales pipeline, from leads to contracts to work completed on fixed contracts or hours billed on variable-rate contracts. It's relatively easy to measure improvements here. Though it's harder to distill the causes of that and tie it to LLMs.
The fee-burners like in IT, legal, compliance, marketing, finance, typically had KPIs tied to the department objectives. This stuff is a LOT more subjective and a lot more prone to manipulation (Goodhart's law). But if you spend 60 hours a week on work in such a department, you tend to have a pretty good idea if things are speeding up or not at all. In a department I was involved in there was a lot of KYC that involved reviewing 300+ pages per case; we tracked case workload per person per day, as well as success rates (percentage of case reviews completed correctly), and could see meaningful changes one could attribute to LLM use.
Agreed though that I'm more interested in a few case studies in detail to understand how they actually measured productivity.
Most CEOs of large firms aren't all that involved in the details, so there's no way they can have a true and proper view of the day-to-day operations at the ground level.
Steve Jobs is the only CEO of a large firm I can recall who always remained intimately involved.
I think we are entering the phase where corporate is expecting more ROI than it is getting, but wants to remain in the arms race.
The firmwide AI guru at my shop who sends out weekly usage metrics and release notes started mentioning cost only in the last few weeks. At first it was just about engaging with individual business heads on setting budgets / rules and slowing the cost growth rate.
A few weeks later and he is mentioning automated cost reporting, model downgrading, and circuit breaking at a per-user level. The daily spend at which you immediately get locked out within 24 hours is pretty low.
I noticed something similar at my work. The CEO is hyping AI, but at the same time free access to the big models was taken away and rate limits seem to be much tighter.
Original paper https://www.nber.org/system/files/working_papers/w34836/w348...
Figure A6 on page 45: Current and expected AI adoption by industry
Figure A11 on page 51: Realised and expected impacts of AI on employment by industry
Figure A12 on page 52: Realised and expected impacts of AI on productivity by industry
These seem to roughly line up with my expectations that the more customer facing or physical product your industry is, the lower the usage and impact of AI. (construction, retail)
A little bit surprising is "Accom & Food" being 4th highest for productivity impact in A12. I wonder how they are using it.
Figure right after A6 is pretty striking. Ask people if they expect to use AI and a vast majority say yes. Ask if they expect to use AI for specific applications and no more than a third say yes in any industry. That should be telling imo. What we have is a tool that looks impressive to any non-SME for a lot of applications. I would caution against the idea that benefits are obvious.
It’s simple calculus for business leaders: admit they’re laying off workers because the fundamentals are bad and spook investors, admit they’re laying off workers because the economy is bad and anger the administration, or just say it’s AI making roles unnecessary and hope for the best.
If you include Microsoft Copilot trials in Fortune 500s, absolutely. A lot of major listed companies are still oblivious to the functionality of AI; their senior management doesn't even use it, out of laziness.
There's a lot of rote work in software development that's well-suited to LLM automation, but I think a lot of us overestimate the actual usefulness of a chatbot to the average white-collar worker. What's the point of making Copilot compose an email when your prompt would be longer than the email itself? You can tell ChatGPT to make you a slide deck, but slide decks are already super simple to make. You can use an LLM as a search engine, but we already have search engines. People sometimes talk about using a chatbot to brainstorm, but that seems redundant when you could simply think, free from the burden of explaining yourself to a chatbot.
LLMs are impressive and flexible tools, but people expect them to be transformative, and they're only transformative in narrow ways. The places they shine are quite low-level: transcription, translation, image recognition, search, solving clearly specified problems using well-known APIs, etc. There's value in these, but I'm not seeing the sort of universal accelerant that some people are anticipating.
That's probably true for some, but I think a lot of big orgs are simply risk-averse and see AI in general as a giant risk that isn't even fully baked enough to quantify yet. The security and confidentiality issues alone will make Operations hesitant, and Legal probably has some questions about IP (both the risk of a model outputting patented or otherwise protected code, and the huge legal gray area that is the copyrightability of the output of an LLM).
Give it a year or two and let things settle down and (assuming the music is still playing at that time) you might see more dinosaurs start to wander this way.
it turns out it's really hard to get a man to fish with a pole when you don't teach them how to use the reel
If AGI is coming, won't there just be autofishers and no one will ever have to fish again, completely devaluing one's fishing knowledge and the effort put in to learn it?
It’s not a great analogy but...
“Autofishers” are large boats with nets that bring in fish in vast quantities that you then buy at a wholesale market, a supermarket a bit later, or they flash freeze and sell it to you over the next 6-9 months.
Yet there’s still a thriving industry selling fishing gear. Because people like to fish. And because you can rarely buy fish as fresh as what you catch yourself.
Again, it’s not a great analogy, but I dunno. I doubt AGI, if it does come, will end up working the way people think it will.
> If AGI is coming
spoiler, it's not
In regard to Copilot, they’ve also been led on a fishing expedition to the middle of a desert.
I'm not sure if this was the intention of the analogy, but fishing poles don't have reels.
Or give them a stick with twine and a plastic fork as a hook, as is the case with Copilot.
100%. All of the people who are floored by AI capabilities right now are software engineers, and everyone who's extremely skeptical basically has any other office job. On investigating, their primary AI interaction surface is Microsoft Copilot, which has to be the absolute shittiest implementation of any AI system so far. As a progress-driven person, it's just super disappointing to see how few people are benefiting from the productivity gains of these systems.
I'm a SWE who's been using coding agents daily for the last 6 months and I'm still skeptical.
For my team at least, the productivity boost is difficult to quantify objectively. Our products and services have still tons of issues that AI isn't going to solve magically.
It's pretty clear that AI is allowing us to move faster on some tasks, but it's also detrimental for other things. We're going to learn how to use these tools more efficiently, but right now, I'm not convinced about the productivity gain.
> I'm a SWE who's been using coding agents daily for the last 6 months and I'm still skeptical.
What improvements have you noticed over that time?
It seems like the models coming out in the last several weeks are dramatically superior to those mid-last year. Does that match your experience?
Not the grandparent, but I've used most of the OpenAI models that have been released in the last year. Out of all of them, o3 was the best at the programming tasks I do. I liked it a lot more than I like GPT 5.2 Thinking/Pro. Overall, I'm not at all convinced that models are making forward progress in general.
Yes, it matches my experience. Now I can throw tasks at the agent and have it write a full PR, with tests, good summary. Or it can review things and make good suggestions that a casual reviewer or non-expert would have missed. It can also take a bunch of logs as input, find the issue, fix the code. I can't deny it's impressive and useful.
What I'm still skeptical about is how much more productive it makes us. In my case, coding is maybe 50% of my job, and I work on complex and novel systems. The agent gives me the illusion I don't need to think anymore, but it's not the case. Agents slow me down in many cases too, I'm not learning and improving as I used to.
Is your backlog and/or your velocity increasing, decreasing, or the same? That's really the ultimate question.
In a team of one at work I see clear benefits, but having worked in many different team sizes for most of my career I can see how it quickly would go down, especially if you care about quality. And even with the latest models it's a constant battle against legacy training data, which has gotten worse over time. "I have to spend 45 minutes explaining why a one-minute AI-generated PR is bad code" was how an old colleague summarized it.
I think Anthropic will succeed immensely here because, when integrated with Microsoft 365 and especially Excel, it basically does what Copilot said it would do.
The moment of realisation happens for a lot of normoid business people when they see Claude make a DCF spreadsheet or search emails.
Claude is also smart because it visually shows the user as it resizes the columns, changes colours, etc. Seeing the computer do things makes the normoid SEE the AI, despite it being much slower.
Hilarious lack of self-awareness. Calling others "normoids" yet you believe you can empathise with them enough to predict how they will adopt AI?
No one wants a chatbot “integrated” with Excel and Office 365 crap; it’s Clippy 2.0 bullshit.
Replace Excel and Office stuff with an AI model entirely, then people will pay attention.
That only works if you can one-shot, but nobody can one-shot.
Iterating over work in Excel and seeing it update correctly is exactly what people want. If they get it working in MS Word it will pick up even faster.
If the average office worker can get the benefit of AI by installing an add-on into the same office software they have been using since 2000 (the entire professional career of anyone under the age of 45), then they will do so. It's also really easy to sell to companies because they don't have to redesign their teams or software stack, or even train people that much. The board can easily agree to budget $20 a head for Claude Pro.
The other thing normies like is that they can put in huge legacy spreadsheets and find all the errors.
Microsoft 365 has 400 million paid seats.
> normoid
Do you work extra hard to be this arrogant or does it come naturally?
IMO Copilot was "we need to give these people rope, but not enough for them to hang themselves". A non technical person with no patience and access to a real AI agent inside a business is a bull in a china shop. Copilot Cowork is the closest thing we have to what Copilot should have been and is only possible now because models finally got good enough to be less supervised.
FWIW Gemini inside Google apps is just as bad.
This isn't my experience. I see many non-software people using AI regularly. What you may be seeing is more: organizations with no incentive to do things better never did anything to do things better. AI is no different. They were never doing things better with pencil and paper.
Many people are using AI as a slot machine, rerolling repeatedly until they get the result they want.
Once the tools help the AI to get feedback on what its first attempt got right and wrong, then we will see the benefits.
And the models people use en masse - eg. free tier ChatGPT - need to get to some threshold of capability where they’re able to do really well on the tasks they don’t do well enough on today.
There’s a tipping point there where models don’t create more work after they’re used for a task, but we aren’t there yet.
This title is so click-baity and misleading compared to what the actual article is about that it's tough not to feel disappointed this is on the front page. @dang
Belatedly fixed. Thanks!
p.s. @dang doesn't work reliably - hn@ycombinator.com is the way to get a message delivered
There was a recent post where someone said AI allows them to start and finish projects, and I find that exactly true. AI agents are helpful for starting proofs of concept, and for doing finishing fixes to an established codebase. For a lot of the work in the middle, it can still be useful, but the developer is more important there.
If you've ever undertaken the task of documenting entire workflows, then you know that you quickly put up the white flag at the word "entire".
When you actually talk to people about what they do there are often many, many nuances, micro-events, micro-decisions and micro-actions in their work. This is why it can take days/weeks/months to completely train a new person for a job.
This level of detail is barely documented - anywhere. There is a huge amount of information buried in workflows that AI has barely had access to for training. A lot of this is more in the realm of world models, rather than LLMs.
So imagine trying to use AI to improve these workflows it knows so little about. Then imagine AI trying to reinvent them across an organization.
We find these use cases where AI provides great value - totally true - but these barely scratch the surface of what goes on.
Perhaps something went wrong along the career path of a developer? Personally, during my education there was a severe lack of actual coding done during lectures, especially any sort of showcase of the tools that are available. We didn't even get taught how to use debuggers, and I see late-year students still struggling with basic navigation in a terminal.
And the biggest irony is that the "scariest" projects we had at our university ended up being maybe 500-1000 lines of code. Things really must go back to hands-on programming with real-time feedback from a teacher. LLMs only output what you ask for and won't really suggest concepts used by professionals unless you go out of your way to ask, and it all seems like a vicious cycle even though meaningful code blocks can range from 5 to 100 lines. When I use LLMs I just get information burnout trying to dig through all that info or code.
Productivity is a moveable feast and tricky to compare with the past. The productivity businesses talk about is the ratio of cost to profit.
As tech becomes available to help reduce your costs and drive up your profit, the same tech also reduces your competitors' costs and perhaps lets more competitors into the market. This drives down your product prices and reduces your profit.
So you invest but see no increase in productivity, but if you don't do it - you're toast.
One underexplored reason: companies can't give AI agents real authority. The moment an agent needs to do anything beyond summarizing text — update a CRM, transfer funds, modify infrastructure — the security question kills it. No one wants an agent that can take irreversible actions with no approval chain. Until the trust architecture problem is solved, AI stays in read-only mode for most enterprises.
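As a purely illustrative sketch of what such an approval chain implies (the tool names and the approver interface here are hypothetical, not any vendor's API):

    # Hypothetical human-approval gate in front of an agent's tool calls.
    # Tool names and risk tiers are made up for illustration.
    IRREVERSIBLE = {"transfer_funds", "delete_records", "modify_infrastructure"}

    def execute_tool(name, args, approver):
        # Irreversible actions wait for a human sign-off; read-only tools pass straight through.
        if name in IRREVERSIBLE and not approver(name, args):
            return {"status": "rejected", "tool": name}
        return {"status": "executed", "tool": name, "args": args}

    def cli_approver(name, args):
        answer = input(f"Agent wants to run {name}({args}). Approve? [y/N] ")
        return answer.strip().lower() == "y"

    print(execute_tool("summarize_text", {"doc": "q3_report"}, cli_approver))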
This is the biggest bottleneck. To realize the “replacement of white collar workers” fever dream, (which is, I still believe, technically feasible), you need the agent that replaces them to have all of the context they had. All of the emails, all of the Slacks, all of the meeting minutes, access to private corporate systems and files, etc. I can’t think of a single company that would want to turn all of that over to OpenAI.
> I can’t think of a single company that would want to turn all of that over to OpenAI
you’d be surprised… the largest IP in the majority of cases is the codebase itself. Once that hurdle is crossed, the rest is an easy decision.
Companies over a certain size (say more than 25+ employees), are universally bad at:
- measuring productivity
- adapting to change
This article just reinforces that. Past a certain headcount, executives have little to no understanding of what an IC's day-to-day is like.
AI tooling doesn't fix the bureaucracy the c-suite helped to create.
And there are incentives to misreport.
My team has gained a reputation of being some sort of firefighting crew.
We are being called by PMs when projects are failing, usually engineering-data and engineering-adjacent stuff. (Mechanical/Electrical).
We automate the heck out of the processes, using a mix of AI processing, RAGs, and AI-assisted coding.
We rescue the projects. Finish ahead of schedule. Make fewer mistakes. We gain additional scope. We win new projects. We bring new clients.
But when higher ups ask the people we helped about productivity gains, the most generous will say stuff like "it takes as long to review as it takes to do things manually", "They really helped on {inconsequential part of the deliverable}"
If that is the takeaway these people are giving, the higher-ups will be incredibly misled. Luckily for me, I have people who deal with the politics, while my team can focus on delivery.
Our reputation keeps growing, and we keep delivering faster. The heads of the departments we work with love us, the middle rank who were doing the laborious crap, maybe not so much.
The analogy that an LLM is simply an amplifier is apt for most general business.
If you've already got a very effective team with clear vision/goals, this technology will almost certainly help to some degree.
If you've got a sinking ship of a business, this technology will likely drag you down faster.
You always have to work backward from the customer into the technology. AI will never change that. I've found myself waffling on advice to some clients regarding AI because whether or not they can effectively leverage it depends more on what the people in the business are willing to do than what the technology can do.
This may mean the centaur era will be shorter than expected. If we take as a given that:
* AI is doing real work
* Humans using AI don't seem to get more done with AI than without
There is a huge economic pressure to remove humans and just let the AI do the work without them as soon as possible.
I suspect this may be the case. There’s inherent inefficiency in having a human forced to translate everything into context for the LLM. You don’t get the full benefit until you allow it to be fully plugged in.
Erik Brynjolfsson: Productivity is 'Much Higher' Due to AI
https://www.wsj.com/video/erik-brynjolfsson-productivity-is-...
If we assume people are somewhat rational (big ask I know), and the Efficient-market hypothesis, then we can estimate the value created by AI to be roughly equal to the revenue of these AI companies. That is: A professional who pays 20€/month likely believes that the AI product provides them with roughly 20€ each month in productivity gains, or else they wouldn't be paying, and similarly they would pay more for a bigger subscription if they thought there was more low hanging fruit available to grab.
Of course this doesn't take into account people who just pay to play around and learn, non professional use cases, or a few other things, but it's a rough ballpark estimate.
Assuming the above, current AI models would only increase the productivity for most workplaces by a relatively small amount, around 10-200 € per employee per month perhaps. Almost indistinguishable compared to salaries and other business expenses.
> A professional who pays 20€/month likely believes that the AI product provides them with roughly 20€ each month in productivity gains, or else [...] they would pay more for a bigger subscription
Unless I'm misunderstanding, shouldn't someone rational want to pay where (value - cost) is highest, opposed to increasing cost to the point where it equals value (which has diminishing returns)?
A $40 subscription creating $1000 worth of value would be preferred over a $200 subscription creating $1100 of value, for instance, and both preferred over a $1200 subscription creating $1200 of value.
True! (value - cost) would be better.
I was more so limiting myself to the simpler heuristic where people only pay roughly what they personally think something is worth, and not significantly more/less regardless of the options. But of course, as you've pointed out, in real life the options available really do matter, and someone might decline a 200:1200 trade if there are even more lopsided options available. It does complicate the thought experiment somewhat if you try to take this into account.
It's the same reason why I have, for more than a decade, been so frustrated with people refusing to consider proper pair programming and even mob programming, as they view the need to keep people busy churning lines of code individually as the most important part of the company.
That multiple AI agents can now churn out those lines nearly instantly, and yet project velocity does not go much faster, should start to make people aware that code generation is not actually the crucial cost in the time taken to deliver software and projects.
I ranted recently that small mob teams with AI agents may be my ideal team setup: https://blog.flurdy.com/2026/02/mob-together-when-ai-joins-t...
> ... and yet project velocity does not go much faster
1) The models, like us, have finite context windows and intelligence; even with good engineering practices, system complexity will eventually slow them down.
2) At the moment at least, the code still needs to be reviewed and signed off by us, and reading someone else's code is usually harder than writing it.
Are people still reading PRs in detail manually?
For me, after the automated PR agents have all passed a PR, I tend to let Claude Code and Codex give me a summary, with an MCP skill to read the requirement story. I trust their ability to catch edge cases and typos more than my own. I just check the general structure of the PR.
I may use my own skill to automate this... https://github.com/flurdy/agent-skills/blob/main/skills/revi...
Anthropic, OpenAI, Google et al. have EULAs and the best lawyers money can buy, ready to argue that any damage done by publicly releasing bad or malicious code produced or reviewed with their systems is the developer's responsibility for not checking properly.
> I trust their ability to catch edge cases and typos more than me.
Given the vendors' EULAs etc., if poop really hits the fan with released code, how is that likely to sound if the lawyers get involved?
> Are people still reading PRs in detail manually?
Ultimately it all depends on circumstance and appetite for risk, but yes, many/most places are still manually checking releases.
The Solow paradox is real, but there's another factor: most current AI tools are glorified autocomplete. You still have to prompt them, check their work, and integrate their output into your workflow.
The real productivity gains will come when AI runs autonomously - handling tasks without constant supervision. Think email triage that just happens, schedules that self-maintain, follow-ups that occur automatically.
Right now we're in the "AI as fancy search" phase. The jump to "AI as autonomous assistant" is where the productivity numbers will start showing up.
I read an article in FT just a couple days ago claiming that increased productivity was becoming visible in economic data
> My own updated analysis suggests a US productivity increase of roughly 2.7 per cent for 2025. This is a near doubling from the sluggish 1.4 per cent annual average that characterised the past decade.
good for 3 clicks: https://giftarticle.ft.com/giftarticle/actions/redeem/97861f...
I think the best point made in this conversation is that AI is often enough used to do things quickly that have little value, or just waste people’s time.
I am glad to see articles like this that evaluate impact, but I wish the following would get more public interest:
With LLMs we are chasing sort-of linear growth in capability at exponential cost increases for power and compute.
Were you mad when the government bailed out mismanaged banks? The mother of all government bailouts might be using the US taxpayer to fund idiot companies like Anthropic and OpenAI that are spending $1000 in costs to earn $100.
I am starting to feel like the entire industry is lazy: we need fundamental new research in energy and compute efficient AI. I do love seeing non-LLM research efforts and more being done with much smaller task-focused models, but the overall approach we are taking in the USA is f$cking crazy. I fear we are going to lose big-time on this one.
The article suggests that AI-related productivity gains could follow a J-curve: an initial decline, as initially happened with IT, followed by an exponential surge. They admit this is heavily dependent on the real value AI provides.
However, there's another factor. The J-curve for IT happened in a different era. No matter when you jumped on the bandwagon, things just kept getting faster, easier, and cheaper. Moore's law was relentless. The exponential growth phase of the J-curve for AI, if there is one, is going to be heavily damped by the enshittification phase of the winning AI companies. They are currently incurring massive debt in order to gain an edge on their competition. Whatever companies are left standing in a couple of years are going to have to raise the funds to service and pay back that debt. The investment required to compete in AI is so massive that cheaper competition may not arise, and a small number of winners (or a single one) could put anyone dependent on AI into a financial bind. Will growth really be exponential if this happens and the benefits aren't clearly worth it?
The best possible outcome may be for the bubble to pop, the current batch of AI companies to go bankrupt, and for AI capability to be built back better and cheaper as computation becomes cheaper.
But there already is cheaper competition? Open models may be behind, but only by ~6 months for every new generation.
I think the deluge of projects on Show HN points to something real: it's possible today to ship projects as a one-man shop that look like something that just a year or so ago would have required a team.
Personally I have noticed strange effects: where I previously would have reached for a software package to make something or solve an issue, it's now often faster for me to write a specific program just for my use case. Just this weekend I needed a reel with a specific look to post on Instagram, but instead of trying to use something like After Effects, I could quickly cobble together a program using CSS transforms that output a series of images I could tie together with ffmpeg.
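For what it's worth, the stitching step can be as small as a single ffmpeg call; here is a rough sketch (the filenames are made up, not the actual project):

    # Rough sketch: stitch numbered PNG frames into an MP4 with ffmpeg (hypothetical filenames).
    import subprocess

    subprocess.run([
        "ffmpeg",
        "-framerate", "30",        # output frame rate
        "-i", "frame_%04d.png",    # stills rendered from the CSS-transform step
        "-c:v", "libx264",         # H.264 for broad player support
        "-pix_fmt", "yuv420p",     # needed for playback in most apps
        "reel.mp4",
    ], check=True)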
About a month ago I was unhappy with the commercial ticketing systems; they were both expensive and opaque, so I made my own. Obviously for a case like that you need discipline and testing when you take people's money, so there was a lot of focus on end-to-end testing.
I have a few more examples like this, but to make this work you need to approach using LLMs with a certain amount of rigour. The hardest part is to prevent drift in the model. There are a certain number of things you can do to keep the model grounded in reality.
When the tool doesn’t have a reproducer, it’ll happily invent a story and you’ll debug the story. If you ground the root cause in for example a test, the model can get context enough to actually solve the problem.
Another issue is that you need to read and understand code quickly, but it's no different from working with other developers. When tests are passing I usually open a PR to myself and then review as I usually would.
A prerequisite is that you need tight specs, but those can also be generated if you are experienced enough. You need enough domain intuition to know what ‘done’ means and what to measure.
Personally I think the bottleneck will shift from trying to get into a flow state to write solutions, toward analyzing the problem space and verification.
> I think the deluge on projects on show HN points to something real, its possible today to ship projects as a one man shop that looks like something that just a year or so would have required a team.
Lots of these projects have a lifespan of a week and will never be maintained. When you pour blood and sweat into a project you get attached to it; when you vibe-code it in an afternoon and it's not an instant hit, you move on to the next one.
Large firms are extremely bureaucratic organizations largely isolated from the market by their monopolistic positions. Internal pressures rule over external ones, and thus, inefficiency abounds. AI undeniably is a productive tool, but large companies aren't really primarily concerned with productivity.
Indeed. Most large companies don't need AI to increase productivity - they just need to stop wasting time on stupid bullshit. However, figuring out what is stupid bullshit and what is not seems to be an impossible task, and I don't think AI is going to help here at all.
What are you talking about? If there is one thing at which LLMs shine, it's generating vast amounts of bullshit. That's extreme productivity gains.
Also, flame wars can be auto-fed ad nauseam now, so there's going to be less and less interest in engaging in them. In an act of desperation, idle trolls will turn to tasks tracked by KPIs.
Yeah, maybe the CEO doesn’t see any impact on productivity — but mine definitely changed. I actually have more time for my own stuff now, because AI quietly handles part of the work for me. Of course, if productivity is measured by how many PowerPoint slides get presented to the board, then sure — nothing changed. Especially when HR reports say “everything looks the same” — because no one is tracking how much work is silently being offloaded to AI. And just to avoid overworking myself, I even asked AI to write this comment so I could focus on something else in the meantime.
Maybe the CEOs do not realise that their workers are achieving great productivity, completing their tasks in 1 hour instead of 8, and spending time on the beach, rather than at their desks?
most likely scenario
most of these companies deployed microsoft copilot, watched it hallucinate meeting summaries for six months, and called that an AI strategy. source: current situation
I'm saying it over and over: AI is not killing dev jobs, offshoring is. The AI hype happens to coincide with the end of the pandemic; lots of companies went to work-from-home and are now hiring cheaper devs around the world.
I think the 'AI productivity gap' is mostly a state management problem. Even with great models, you burn so much time just manually syncing context between different agents or chat sessions.
Until the handoff tax is lower than the cost of just doing it yourself, the ROI isn't going to be there for most engineering workflows.
I am in strategy consulting and I can tell you the productivity gains are real in terms of research, model building, and summarising work. The result is price pressure from our clients.
I'm not sure how you even measure productivity going up or down, for many of us LLMs have allowed us to trim down the amount of effort required to scour through google search results.
My experience has been that AI is much more useful on my own systems than on company systems. For AI to (currently) be useful, I need to choose my own tooling and LLM models to support AI centered workflow. At work, I have to use whatever (usually Microsoft) tools my company has chosen to purchase and approve for my corporate computer, and usually nothing works as well as on my own machine where I get to set it up as I want.
What you are describing is a failure to integrate AI into said company systems. I have seen quite a few companies now that buy MS AI products with great hopes only to be severely disappointed, because they may as well have just used vanilla ChatGPT (in fact then they would at least get newer models faster). But there are counterexamples too. If you can pull all your company documentation into a vector db and build a RAG based assistant, you can potentially save countless hours across your workforce and possibly customers too. But this is not easy and also requires some level of UI interactivity that no one really offers right now. In fact they can't offer it, because you usually need to integrate ancient, arcane sources into your system. So you do have to write a lot of integration code yourself at every step. Not many companies are willing to spend that kind of money and effort, because managers just want to buy an MS product and be done with improving efficiency by next quarter.
I have been using vector-based RAG for about two years now, and I am not knocking the tech, but last year I started experimenting with going way back in time, trying plain BM25 search (or hybrid BM25 plus vector) in parallel. So: not even a very good example use case for LLMs, and the tech is not always applicable.
EDIT: I am on a mobile device and don’t have a reference handy but there have been good papers on RAG scaling issues - basically the embedding space gets saturated (too many document chunks cluster in small areas of the embedding space), if my memory is correct.
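For what it's worth, here is a minimal sketch of the kind of hybrid scoring I mean, assuming the rank_bm25 package and a hypothetical embed() helper for the dense side (the weighting and normalisation are purely illustrative, not a recommendation):

    import numpy as np
    from rank_bm25 import BM25Okapi

    def embed(text):
        # Hypothetical: call whatever embedding model you use; it must return a
        # unit-normalised numpy vector so the dot product below is cosine similarity.
        raise NotImplementedError

    def hybrid_search(query, docs, doc_vecs, alpha=0.5, top_k=5):
        # Lexical side: classic BM25 over whitespace-tokenised documents.
        bm25 = BM25Okapi([d.lower().split() for d in docs])
        lexical = bm25.get_scores(query.lower().split())
        # Rescale BM25 scores to [0, 1] so they are comparable to cosine similarity.
        span = lexical.max() - lexical.min()
        lexical = (lexical - lexical.min()) / (span if span > 0 else 1.0)
        # Semantic side: cosine similarity against pre-computed document embeddings.
        semantic = doc_vecs @ embed(query)
        # Blend the two signals and return the top_k documents with their scores.
        combined = alpha * lexical + (1 - alpha) * semantic
        ranked = np.argsort(combined)[::-1][:top_k]
        return [(docs[i], float(combined[i])) for i in ranked]

The interesting knob is alpha: 1 gives you plain BM25, 0 gives pure vector search, and which end of that range wins tends to depend heavily on the corpus.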
Depends on your use case. A system that can do full text and semantic search across a vast archive, open files based on that search to retrieve detail and generate an answer after sifting through hundreds of pages is pretty powerful. Especially if you manage to pair it with document link generation and page citation.
I agree with you, but then I am retired so my opinion is not that relevant.
I have a soft spot for using just Emacs, with no other IDEs except very occasional use of AntiGravity. For a fun project, or when researching how to use generative models in applications, I like to start low: see if a small local model with appropriate tool use or an agentic library will get the job done; if not, move up to something cheap like gemini-3-flash, and only if none of these approaches work, use an expensive model.
I was advising a friend’s company last month on their application that makes effective use of LLMs, but I was blown away by their zeal to spend lots of money on tokens.
I like AI and use it daily, but this bubble can’t pop soon enough so we can all return to normally scheduled programming.
CEOs are now on the downside of the hype curve.
They went from “Get me some of that AI!” after first hearing about it, to “Why are we not seeing any savings? Shut this boondoggle down!” now that we’re a few years into the bubble, the business math isn’t working, and they only see burning piles of cash.
"return to normally scheduled programming" is probably not the exact phrasing you want to use. :)
This is the standard stupidity here based on emotion and denial. This is the narrative that people want to hear.
Of course, trying to automate with chatGPT 4o was stupid. Trying to automate with Sonnet 4.6 will work better. Trying to automate with the models a year from now will work all the better.
To believe we are going to stop and go back to 2019 at this stage is seriously delusional.
I wish it were true. I would love to go back to 2019 but we obviously are not. We never go backwards.
“Regularly scheduled programming” includes progress, not stasis. It just means the AI hype firehose runs out of money and we return to normal, sane progress and productivity enhancements, without all the overpricing and under-delivering of the last few years.
I consume a lot of different content on a lot of different places. Every site or app has its vibe and communal beliefs. They rarely if ever agree on anything, but they all agree we're in a massive bubble.
I don't have a point, just that it's an unlikely unity.
I’m not sure about this. I’ve been 100% ai since jan/1 and I’m way more productive at producing code.
The non-code parts (about 90% of the work) are taking the same amount of time though.
Isn't it a bit early to draw such conclusions? We are just getting started with AI use, especially in tech / engineering teams, and have only scratched the surface with regard to what is possible.
At my current job I am in deep net LOC negative despite all new features... Somebody is getting fired and sued for stealing all these LOCs from the company...
I find this difficult to reconcile with things like for example freelance translation being basically wiped out wholesale
Or even the simple utility of having a chatbot. They’re not popular because they’re useless
Which to me says it’s more likely that people underestimate corporate inertia.
It's not just technology, it's very hard to detect the effect of inventions in general on productivity. There was a paper pointing out that the invention of the steam engine was basically invisible in the productivity statistics:
The first steam engine was invented by a Turk and he used it solely to make kebab spin. Never thought about using it for anything else.
General-purpose technologies tend to have long and uneven diffusion curves. The hype cycle moves faster than organizational change
BTW the study was from September 2024 to 2025, so it's the very earliest of adopters.
This article is mostly based on NBER working paper 34836, which was published this month, and the data was collected from September 2025 to January 2026[0]
[0]: See page 2: https://www.nber.org/system/files/working_papers/w34836/w348...
Every technology, whether it improved existing systems and productivity or not, created new wealth by creating new services and experiences. So that is what needs to happen with this wave as well.
If nothing else AI is making great strides in surveillance. It makes mistakes, but that only matters when there's accountability, and now it can make them at scales that were unthinkable before. Most of us are not going to enjoy the new experiences AI brings us, but a small number of people are already making a lot of money selling new services to government and law enforcement.
As we approach the singularity, things will become noisier and make less and less sense, since rapid change can look like chaos from inside the system. I recommend folks just take a deep breath and look around. Regardless of your stance on whether the singularity is real, or whether AI will revolutionize everything or not, forget all that noise. Just look around you and ask yourself: do things seem more or less chaotic? Are you able to predict better or worse what is going to happen? How far out can your predictions reach now versus, say, 10 or 20 years ago? Conflicting signals are exactly how all of this looks: one account says it's the end of the world, another says nothing ever changes and everything is the same as it always was.
I think the biggest problem is calling it AI to start with. It gives people a huge misrepresentation of what it is actually capable of. It is an impressive tool with many uses, but it is not AGI.
Mentioning AI in an earnings call means fuck all when what they’re actually referring to is toggling on the permissions for borderline useless copilot features across their enterprise 365 deployments or being convinced to buy some tool that’s actually just a wrapper around API calls to a cheap/outdated OpenAI model with a hidden system prompt.
Yeah, if your Fortune 500 workplace is claiming to be leveraging AI because it has a few dozen relatively tech illiterate employees using it to write their em dash/emoji riddled emails about wellness sessions and teams invites for trivia events… there’s not going to be a noticeable uptick in productivity.
The real productivity comes from tooling that no sufficiently risk-averse pubco IS department is going to let their employees use, because when all of their incentives point to saying no to installing anything ever, the idea of granting the permissions required for agentic AI to do anything useful is a non-starter.
I hope that RAM prices will drop soon as a result. Companies cease to exist; therefore, datacenters are unnecessary.
"Admitted" as the verb in a statement like this is blatant editorialization. Did they just finally "admit" what they had been reluctant to reveal? No doubt with their heads hung in shame?
Maybe this bothers me more than it should.
It's not that AI is ineffective, but it will take time to create solutions that are actually highly useful in real-world business scenarios.
Quickly slapping "AI features" on a bunch of existing products -- like almost every SW company seems to have done in an effort to appear "on the cutting edge" -- accomplishes almost nothing.
In other words, everybody is benefiting from AI, except CEOs.
Including 999 using Copilot.
Q: Fortune (and other sites) seem to be posted regularly, but they're inevitably paywalled; is this common for others or just me? If it's common, why are they posted here?
I was in the “AI is grossly overhyped” camp because I work on large distributed deep learning training jobs and AI is indeed worthless for those, and will likely always be worthless since the APIs change constantly and the iteration loop is too cumbersome to constantly resubmit broken jobs to a training cluster.
Then I started working on some basic grpc/fullstack crap that I absolutely do not care about, at all, but needs to be done and uses internal frameworks that are not well documented, and now Claude is my best friend at work.
The best part is everyone else’s AI code still sucks, because they ask it to do stupid crap and don’t apply any critical thinking skills to it, so I just tell AI to re-do it but don’t fuck up the error handling and use constants instead of hardcoding strings like a middle schooler, and now I’m a 100x developer fearlessly leading the charge to usher in the AI era as I play the new No Man’s Sky update on my other PC and wait for whatever agent to finish crap.
Ah, I see what my goal for this year is, then. I have a large Steam backlog to work through. Unfortunately we currently code in short bursts and are mostly trying to figure out how these integrations are supposed to happen and why the different teams tell us different things.
This weirdly skirts my own experience yet somehow still reads like sarcasm, hehe. I think if we just returned to calling it intelligent autocomplete, expectations for productivity gains would be better established.
Trying to hacksmash Claude into outputting something it simply can't just produces an endless mess. Or you get into a fight pointing out issues with what it's doing, and it just piles on layer upon layer of extra gunk. But meanwhile, if you ask it to boilerplate an entire SaaS around the hard part, it's done in about 15 seconds.
of course this says nothing about the costs of long term maintainability, and I think everyone by now recognises what that's going to look like
We just haven’t figured out how to use it. You wouldn’t try to create an entire project out of IDE templates, but how many “low code” attempts were there to do just that at some point?
I think there are phases in a project’s lifecycle where it’s more appropriate, at the very beginning and very late. I do not think junior developers should be using it, because it is much much harder to learn and it kills productivity having senior developers review 3000 lines of slop. Just stuff like that needs to be figured out.
I've had some luck with this idea of keeping the "Clauded" bits separate where possible. Do you really care if it creates a spaghetti mess, if the result is some visually beautiful low-trust site that lives in its own repo entirely? Versus letting it run in auto-approve mode inside a module where critical hand-written crypto code exists.
Anyone read The Goal lately?
How do they define productivity? How is it measured?
I think the reason tech didn't help productivity until the late 90s is pretty obvious: the internet was missing. Computers needed the internet to make them useful to everyone. So the question should be:
What is AI missing that will make it useful to everyone?
NVIDIA is doing circular finance deals with all of the top labs to pump up demand for its products and charging a monopoly rate on those products. Everything in computing is costing more.
Access to capital for everyone else is dropping. And the US economy is being managed by chaos monkeys, causing all kinds of supply chain disruptions. Oligopolies in almost every market are increasingly jacking up prices above market equilibrium rates as they are emboldened by a corrupted FTC.
Despite what Peter Thiel may have led you to believe, monopolies are not healthy for an economy in aggregate.
Of course the economy is slowing.
I think the article is very premature. Lots of companies are slow to adapt. And while there are a lot of early adopters, there are way more people still not really adapting what they do.
There are some real changes in day to day software development. Programmers seem to be spending a lot of time prompting LLMs these days. Some more than others. But the trend is pretty hard to deny at this point. That snowballed in just 6-7 months from mostly working in IDEs to mostly working in Agentic coding tools. Codex was barely usable before the summer (I'm biased to that since that is what I use but it wasn't that far behind Claude Code). Their cli tool got a lot more usable in autumn and by Christmas I was using it more and more. The Desktop app release and the new model releases only three weeks ago really spiked my usage. Claude Code was a bit earlier but saw a similar massive increase in utility and usability.
It is still early days. This report cannot possibly take into account the massive improvements that have been playing out over essentially just the last few months. This time last year, agentic coding was barely usable. You had isolated early adopters of Claude Code, Cursor, and similar tools. Compared to what we have now, those tools weren't very good.
In the business world things are delayed much more. We programmers have the advantage that many/most of our tools are highly scriptable (by design) and easy to figure out for LLMs. As soon as AI coders figured out how to patch tool calling into LLMs there was this massive leap in utility as LLMs suddenly gained feedback loops based on existing tools that it could suddenly just use.
This has not happened yet for the vast majority of business tools. There are lots of permission and security issues. Proprietary tools that are hard to integrate with. Even things like word processors, spreadsheets, presentation tools, and email/calendar tools remain poorly integrated. You can really see Apple, MS, and Google struggle with this. They are all taking baby steps here but the state of the art is still "copy this blob of text in your tool". Forget about it respecting your document theme, or structure. Agentic tool usage has not spread widely outside the software engineering community yet.
The net result is that the business world still has a lot of drudgery in the form of people manually copying data around between UIs that are mostly not accessible to agentic tools yet. Also many users aren't that tool savvy to begin with. It's unreasonable to expect people like that to be impacted a lot by AI this early in the game. There's a lot of this stuff that is in scope for automating with agentic tools. Most of it is a lot less hard than the type of stuff programmers already deal with in their lives.
Most of the effects this will have on the industry will play out over the next few years. We've seen nothing yet. Especially bigger companies will do so very conservatively. They are mostly incapable of rapid change. Just look at how slow the big trillion dollar companies are themselves with eating their own dog food. And they literally invented and bootstrapped most of this stuff. The rest of the industry is worse at this.
The good news is that the main challenges at this point are non technical: organizational lag, security practices, low level API/UI plumbing to facilitate agentic tool usage, etc. None of this stuff requires further leaps in AI model quality. But doing the actual work to make this happen is not a fast process. From proof of concept to reality is a slow process. Five years would be exceptionally fast. That might actually happen given the massive impact this stuff might have.
It’s funny because at work we have paid Codex and Claude but I rarely find a use for it, yet I pay for the $200 Max plan for personal stuff and will use it for hours!
So I’m not even in the “it’s useless” camp, but it’s frankly only situationally useful outside of new greenfield stuff. Maybe that is the problem?
Why do you find it useless for legacy code? I find I have to give it plenty of context but it does pretty well on legacy code.
And Ask DeepWiki is a great shortcut for finding the right context… Granted this is open source and DW is free.
Is it the specific nature of your work?
thank goodness! our jobs are safe lads!!
Yep, just a risk amplifier. We are having a global-warming-level event in computing and blindly walking into it.
As a small bespoke manufacturer of things made out of metal, I have recently begun implementing a policy of abandoning most online services, including banking (well, almost: customers can still send me money online, but I have to go to a branch to see or get funds, except for monthly reports). It is awesome. The web brings me customers via two web sites and searches using AI, but the whole thing is asymmetrical, as it has been more than a year since my last online purchase or filling out of a form, application, etc.; it is all done on paper, in person, or I live without whatever it is. The result is a work environment that is focused on customers and production, and external obligations and requirements are literal, as they must be managed efficiently in person and in such a way as to be finished or stable; none of the death-by-1000-emails brain rot. The mental state of having zero knowledge of what is happening on a millisecond-by-millisecond basis and letting everything go, and lo, the world grinds on just fine without me, and I get a few things done. Mr Solow called it long ago, and my intuition has always been that the busy work was shit, and I have now proven that in my one specific circumstance.
It's weird being on here and seeing so much naysaying, because I see a radical change already happening in software development. The future is here, it's just not equally distributed.
In the past 6 months, I've gone from Copilot to Cursor to Conductor. It's really the shift to Conductor that convinced me that I crossed into a new reality of software work. It is now possible to code at a scale dramatically higher than before.
This has not yet translated into shipping at far higher magnitude. There are still big friction points and bottlenecks. Some will need to be resolved with technology, others will need organizational solutions.
But this is crystal clear to me: there is a clear path to companies getting software value to the end customer much more rapidly.
I would compare the ongoing revolution to the advent of the Web for software delivery. When features didn't have to be scheduled for release in physical shipments, it unlocked radically different approaches to product development, most clearly illustrated in The Agile Manifesto. You could also do real-time experiments to optimize product outcomes.
I'm not here to say that this is all going to be OK. It won't be for a lot of people. Some companies are going to make tremendous mistakes and generate tremendous waste. Many of the concerns around GenAI are deadly serious.
But I also have zero doubt that the companies that most effectively embrace the new possibilities are going to run circles around their competition.
It's a weird feeling when people argue against me in this, because I've seen too much. It's like arguing with flat-earthers. I've never personally circumnavigated Antarctica, but me being wrong would invalidate so many facts my frame of reality depends on.
To me, the question isn't about the capabilities of the technology. It's whether we actually want the future it unlocks. That's the discussion I wish we were having. Even if it's hard for me to see what choice there is. Capitalism and geopolitical competition are incredible forces to reckon with, and AI is being driven hard by both.
Curious why you like Conductor. I’m trying it out, but since I primarily live in the CLI, I might not see much value in it.
Fair point. What it really does for me is give a better UX for having a bunch of parallel workstreams. I could achieve a similar effect with scripting, and maybe some clever way of getting something like the sidebar for seeing the status of everything in a single pane. But Conductor packaged it up in a way that I found much improved over multiple Cursor or VSCode windows.
At $dayjob GenAI has been shoved into every workflow and it's a constant source of noise and irritation, slop galore. I'm so close to walking away from the industry to resume being a mechanic, what a complete shit show.
Meanwhile in some auto shop,
"Perfect! Let's delve into the problem with the engine. Based on the symptoms you describe, the likely cause is a blown head gasket..."
Look, that's hardly the point, now, is it, CEOs? AI, or at least saying "AI" a lot, makes number go up.
CEOs have no clue what's going on at the IC level.
I bet many CEOs' PAs are using AI for many tasks. It's typically a role where AI is very useful: answering emails, moving meetings around, booking and buying a bunch of crap.
I ask this as a genuine question: who needs help "answering emails" and what part of it do they need help with?
A lot of people obsess over phrasing.
Like, for instance, you want to tell a coworker their work is shit, but you don't know how to put it in a way that isn't going to hurt them or make you look like an asshole.
I know people who'd potentially spend hours on a single email like this.
This study spans 3 years, so it goes back to ChatGPT 3.5 era. Not sure how valid it is, considering the breakneck speed at which everything moves.
The people who will be most productive with AI will be the entreprompteurs who whip up entire products and go to market faster than ever before, iterating at dangerous speeds. Lean Startup methodology on pure steroids basically.
Unfortunately I think most of the stuff they make will be shit, but they will build it very productively.
Software doesn't need to be good to be successful; it only needs to solve a problem and be better than the competition.
I predict a golden age for experienced developers! There will be an uncountable number of poorly designed apps with scaling issues. And many of them will be funded.
Meh, no. In a future where any app could be prompted, the only thing you’d get funding for is if you had managed to go viral and secure some large audience.
This is not good. When all that matters is how viral your app is, people no longer compete on features and quality of life.
There is probably a threshold effect above which the technology begins to be very useful for production (other than faking school assignments, one-off-scripts, spam, language translation, and political propaganda), but I guess we're not there yet. I'm not counting out the possibility of researchers finding a way to add long term memory or stronger reasoning abilities, which would change the game in a very disorienting way, but that would likely mean a change of architecture or a very capable hybrid tool.
The greatest step change will be when mainstream businesses realise they can use AI to accurately fill in PDF documents with information in any format.
Filling in PDF documents is effectively the job of millions of people around the world.
That would require accurate validation of said documents, which is extremely hard now. Pointing 1 million PDF LLM machine guns at current validation pipelines will not end well, especially since LLMs are inherently unreliable.
This is lost on people. A 98% accurate automation is useful if you can programmatically identify the 2% of cases that need human review. If you can’t, and it matters, then every case needs human review.
So you lose a lot of the benefit to the time sink, and since people's eyes tend to glaze over when the correction rate is low, you may still miss the 2% anyway.
This is going to put a stop to a lot of ideas that sound reasonable on paper.
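A sketch of what that gate can look like in practice (purely illustrative; the hard, assumed part is that the extraction step yields a per-field confidence signal you can actually trust, e.g. from log-probs, a verifier model, or cross-checks against business rules):

    REVIEW_THRESHOLD = 0.95  # assumed value; tune against what a missed error actually costs

    def route(extracted_fields, field_confidences):
        # Send a document straight through only if every extracted field clears the bar;
        # anything uncertain goes to a person instead.
        if all(field_confidences.get(name, 0.0) >= REVIEW_THRESHOLD
               for name in extracted_fields):
            return "auto"
        return "human_review"

If the confidence signal can't reliably flag the bad 2%, the gate is theatre and every case still needs review, which is exactly the point above.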
"What had promised to be a boom to workplace productivity.."
No. BOON. A BOON to workplace productivity.
And then the writer doubles down on the error by proving it was not a typo, ending the sentence with "...was for several years a bust."
The issue with framing this as a resurrection of the productivity paradox is that AI has never even theoretically increased productivity.
I think in retrospect it's going to look very silly.
And how do you know it is not going to be an inverted J-curve?
Ha ha. I suppose an inverted J-curve technology would offer short-term productivity but yield long-term slow-downs. I can see aspects of that: perhaps using agents, people write code that quickly adds the next feature but overall is not maintainable. Or they quickly write a project plan, but in the long term it doesn’t result in good payoffs. I’ve already observed both of these.
I think the hardest part to figure out will be delineating the illusion of productivity from actual productivity.
On the other hand, we do have empirical research (from the economist in this article), even from 2023, showing LLMs' ability to offer real productivity gains in certain tasks: https://www.nber.org/papers/w31161. And the models have become much better since then.
So it’s probably a question of whether we can use them for what they are good for without being lulled by the siren song of fake productivity.
Of course AI is bullshit. If you couldn't just use it yourself and figure that out then ask yourself why people like Bezos or Altman are perfectly happy "investing" other people's money but not their own. If they actually believed their own bullshit they would personally be investing all of their money AND taking on personal debt. Instead Bezos, a guy worth ~200B, sells 5B worth of stock to invest in "AI-adjacent" (power generation) industry, while making amazon invest 200B in data centers. Talk about conflict of interest! WTF!
These surveys don’t make sense. Ask the forward thinking companies and they’ll say the opposite. The flood of anti AI productivity articles almost feel like they’re meant to lull the population into not seeing what’s about to happen to employment.
> Ask the forward thinking companies and they’ll say the opposite.
Which ones? OpenAI? Microsoft? Anthropic?
Eh, try using Microsoft Copilot in Word or PowerPoint. It is worthless. If your experience with AI was a Microsoft product, you would think it was a scam too.
It’s not just that, though. When going through AI projects in an organization, you find that many times the process is manual for a reason. This isn’t the first wave of “automation” that has come through. Most things that can be fully automated already have been, long ago, and the manual parts get sold as “we can make AI do it”; then you see the specs, noodle around on the problem some, and realize it’s probably just going to remain manual, because the amount of model training requires as much time and effort as just doing it by hand.
I have a dystopian future vision where humans are cheaper machines than robots, so we become the disposable task force for grunt work that robots aren’t cheap enough for. To some degree this is already happening.
Yeah Microsoft has consistently been bragging about how so much code is written by AI, yet their products are worse than ever. Seems to indicate “using AI” is not enough. You have to be smart about when and where.
It’s comical that Microsoft inserted Copilot buttons throughout all of their productivity suite, and none of them are able to do the bare minimum that you would hope for.
“Oh cool, copilot is in excel! I’m going to ask it a question about the data in the spreadsheet that it’s literally appearing beside natively in-app, or for help with a formula!”
“Wait what, it’s saying it can’t see anything or read from the currently displayed worksheet? Why is it inside the application then? Why would I want an outdated version of ChatGPT with no useful context or ability to read/do anything inside all my Office applications?”
Meta's AI can't search posts on Meta's properties (or at least it couldn't a few months ago). I'm not really sure what its point is, unless it's meant as a kind of help desk for the site (which they already have anyway).
Thousands of companies will be replaced by leaner counterparts that learned to use AI towards greater employment and productivity.
Where are my LLM-crafted Google and Instagram?
Crafted by Rajat
Source Code