> the optimal amount to spend on software optimization is at least a substantial fraction of your hardware budget

This has been a banging-head-against-wall sort of struggle every place I've worked on software, without AI even coming into the picture.
Even if we imagined that the author's conspiracy theory were true, there would still be massive incentives for optimization because everyone is bottlenecked on compute despite expanding it as fast as is physically possible. Like, are we supposed to believe that nobody would run larger training runs if the compute was there? That they're intentionally choosing to be inefficient, and as a result having to rate-limit their paying customers? Of course not.
The reality is that any serious ML operation will have teams trying to make it more efficient, at all levels. If the author's services are not wanted, there are a few more obvious options than the outright moronic theory of intentional inefficiency. In this case most likely that their product is an on-edge speech to text model, which is not at all relevant to what is driving the capex.
> Spending a lot (on capex or opex) certainly is not providing any kind of signaling benefit at this level.
It's not providing any benefit now but there's still signalling going on, and it absolutely provided benefit at the beginning of this cycle of economy-shattering fuckwittery.
"OpenAI rejected me so the entire industry is going to collapse" is certainly a take. He is still probably one of the less arrogant engineers in Silicon Valley.
The take is that small incremental improvements to the hardware-software stack at that scale imply massive returns, yet there isn't much work for that use case.
It isn't really. The assumption that these companies aren't hiring any infrastructure engineers is absurd. They all have massive in-house teams doing GPU optimization and everything else that the author brings up. They just don't need an external consultant for it.
He didn't say they aren't hiring _any_, but that they are hiring few, and that he finds it strange that, despite his multi-decade record of squeezing performance out of the GPU-software stack, he isn't getting many collaboration proposals.
There are people who are experts in a generalist sense. When a new field opens up, they quickly snatch up the opportunity and make immense progress and a name for themselves in the evolving field. So in this case the first author is the mouse who ate the cheese and died.
Sorry, I should have said he died in the process of getting the cheese while the second mouse got the cheese.
The phrase "the second mouse gets the cheese" means that it can be beneficial to let others take the initial risk, as the first to act might trigger a negative outcome, leaving the opportunity for the second person to succeed without the same danger.
If they optimize though - and this is coming at some point - local AI becomes possible, and their entire business case as a cloud monopoly evaporates. I think they know they're in a race between centralized control, and widespread use and control, and that is what is really driving this.
Yes, if you see the LLM as a compressed dictionary of all available information.
But if they succeed with agentic reasoning models (we are absolutely not there yet), then I think meritocracy will be replaced with assetocracy. The better the model, the more expensive it will be, and the better the software will be.
I don’t worry about it myself, but I do worry for my kids. I'm not even sure what to teach them anymore to have a shot at early retirement (and they keep raising the retirement age too).
Teach them basic financial literacy. The time value of money, the power of compounding, the relationship between risk and expected returns. Grade school does not cover any of this.
It does not matter what your income is if you cannot budget and save.
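To make the compounding point concrete, here is a minimal sketch; the 7% rate, the amounts, and the horizons are illustrative assumptions, not advice:

```python
# Future value of a single deposit under annual compounding:
# fv = principal * (1 + rate) ** years
def future_value(principal: float, rate: float, years: int) -> float:
    return principal * (1 + rate) ** years

# $1,000 at an assumed 7% annual return roughly doubles every
# 72 / 7 ≈ 10 years (the "rule of 72").
print(round(future_value(1000, 0.07, 10), 2))  # ~1967, about 2x
print(round(future_value(1000, 0.07, 30), 2))  # ~7612, about 8x
```

The point a kid should take away is the shape of the curve: tripling the horizon from 10 to 30 years roughly quadruples the gain.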
Financial literacy is a red herring. If one only stores their savings in gold or an index fund, that gets them practically all the way there. It takes all of two minutes to teach it. It compounds itself.
Risk too is sort of a red herring. Just buy in whenever it dips, and you are set. Diversify just enough to dilute the aggregate risk, and it practically disappears.
Savings are not even possible with low income, only with medium to high income. The lesson to learn is to avoid wasteful excessive spending that benefits oneself only in the moment.
Plenty of people have bought what they thought was the dip, only to watch an instrument go to almost zero and never recover. Look at Bed Bath and Beyond. It’s not quite that simple.
Time in the market beats timing the market. But we’re also probably talking about broad index funds and not BB&B stock.
If you buy junk with all of your money, that's on you. I mentioned gold and (broad) index funds, although a few select cryptocurrencies also work. Buying junk must be restricted to very small amounts only, to what one is willing to risk.
Yes, keep focused on the future, deny the moment. Avoid testing your own experience about what waste and excess mean. Follow the herd and treat participation algorithmically. Buy in. All key points for a satisfying life well lived.
Military then a trade then a small business with employees doing the trade then done.
Hard disagree. How is putting guns in your children's hands a wise and loving first step?
This assumes you believe in the scaling hypothesis.
The best thing to teach them is how to use AI really well, how to be educated enough to do so, and perhaps how to improve AI.
>assetocracy
That's an interesting neologism, but the existing term for "rule by whoever controls the most expensive assets" is "capitalism"
You can spend $100B on assets, but it doesn't mean you'll turn a profit.
Capitalism certainly favors those with the most... capital, but there are quite a few other factors. Market fit, efficiency, etc. The Dutch East India Company had the most assets, yes, but also the best ships and a killer (literally) business model.
The notion of a sector where success is determined almost entirely by who can stockpile the most assets (GPUs in this case) is a somewhat unique situation and probably merits its own term.
I see a lot of comments here criticizing the author, and I think both sides have a point. There's definitely a bubble, because companies are buying up infrastructure which doesn't need to be used right now.
But also, the companies are buying up this infrastructure because whoever controls the infrastructure also controls the industry in around 5 years time.
Nobody knows what will be happening "in around 5 years time"!
Economic and social predictions beyond 2 years are sketchy at best.
...if the bubble hasn't already popped by then.
> There's definitely a bubble, because companies are buying up infrastructure which doesn't need to be used right now.
Source? Satya Nadella seems to disagree with your statement (at least as I understand both): https://uk.finance.yahoo.com/news/microsoft-ceo-satya-nadell...
Can Satya Nadella be honest regarding this subject?
"Ah yes we invested $13B into OpenAI but it's a bubble"
It's not just Satya, the CEOs of all the hyperscalers are consistent in their messaging over the last several quarters that they are backlogged on capacity. Not only are they saying it, they are putting their money where their mouths are by actively burning double-digit billions of their free cash flow on this buildout.
How probable is it that all these competitors are colluding on the same story while burning what would have otherwise been very healthy profits?
Occam's razor and all that.
Being the CEO of a notable publicly traded company (and liable if caught lying about what they do with billions of shareholders' dollars) surely counts for a little more than a random HN commentator without sources...?
And just when was the last time you saw the CEO of a company as big as Microsoft "caught lying to shareholders" about anything actually face any punishment?
CEOs of big public companies lie to their shareholders all the time. It would be fantastic if they could be held accountable for those lies, but AFAIK when the SEC has tried, they always weasel out of it by saying things like "well, from what I knew at the time, it was true" or "if you interpret it this (ridiculous) way it was true". It's very, very hard to prove malicious intent—that is, prove what was going on in the CEO's head when they said it—with something like this beyond doubt, and that's effectively what's required.
> And just when was the last time you saw CEO of a company as big as Microsoft "caught lying to shareholders" about anything actually face any punishment?
Actually curious, when was the last time we saw a CEO of a company as big as Microsoft caught lying to shareholders?
The one instance I can recall offhand of a big company doing fraud was Enron, which resulted in execs going to jail. More recent cases involved CEOs of companies that were not large, and they also ended up in prison, e.g. Nikola. (Sure, the guy's out again, but that was done, uh, outside the usual process of justice.)
This is very insightful. I remember the epoch of clueless startups wasting venture capital on Sun servers. I worked at one of those startups. Warden is clearly correct that if you want to train your AI faster then the optimal amount to spend on software optimization is at least a substantial fraction of your hardware budget.
However, clueless people who don't know how to optimize probably don't know where to spend money on optimization, either. So maybe it's just not a great fit for outsourcing, especially in a realm where there's no standard of correctness to measure the results of the supposedly "optimized" training against. And Warden seems to be pitching outsourcing rather than trying to get acquihired.
At one startup they were spending millions of dollars on AWS and complaining loudly to us about AWS spend and yet... god forbid the engineering team devote any resources to optimization passes instead of rolling out more poorly considered features, and hiring more engineers, because the existing engineers are struggling to be productive because everything is so unoptimized, and also because they have to spend a bunch of their time interviewing and training new hires.
I dunno... Consider:
1. Token prices keep plummeting even as models are getting stronger.
2. Most models are being offered for free at a significant loss, so reducing costs would be critical to maintain some path to sustainability.
3. Every hyperscaler has been consistently saying for the past several quarters that they are severely constrained on capacity and in fact have billions in booked backlogs. That is, if they had more capacity they would actually be making even more billions.
I can totally imagine the smaller players renting these cloud resources for their private model uses to be rather inefficient (which is where the 50% utilization number comes from), probably because they are prioritizing time-to-market over other aspects. But I would wager that resource efficiency, at least for inference, is absolutely a top priority for all the big players.
While I agree there's a lack of attention to the impact of software engineers on near-term industry growth (rather the opposite, with layoffs and attempted agentic automation, et cetera), the Scott Gray he mentions is working at OpenAI now, so the human capital angle is, I guess, just flying under the mainstream radar.
OTOH garage-startup acquisitions are acquihires.
What also cannot be ignored, is that transformer models are a great unifying force. It's basically one architecture that can be used for many purposes.
This eliminates the need for more specialized models and the associated engineering and optimizations for their infrastructure needs.
And if better models than transformers are found? Or if someone finds models that do not rely on GPUs or specialized hardware?
Neither the hyperscalers nor NVDA are safe from uncertainty.
He lost me a bit at the end, talking about running chatbots on CPUs. I know it's possible, but inference is inherently parallel computing, isn't it? Would that ever really make sense? I expected to hear something more like low-end consumer GPUs.
Recent-generation LLMs do seem to have some significant efficiency gains. And routers to decide if you really need all of their power on a given question. And Google is building their custom TPUs. So I'm not sure I buy the idea that everyone ignores efficiency.
(Hi, Tom!) Reread the article and look for “CPU”. The whole article is about doing deep learning on CPUs not GPUs. Moonshine, the open source project and startup he talks about, shows speech recognition and realtime translation on the device rather than on a server. My understanding is that doing The Math in parallel is itself a performance hack, but Doing Less Math is also a performance hack.
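A toy illustration of "Doing Less Math": the multiply-accumulate count of a dense layer scales with its dimensions, so a Moonshine-style small model simply has far less arithmetic to do per token. The layer sizes below are made-up assumptions for illustration, not the real architectures:

```python
def matmul_macs(m: int, k: int, n: int) -> int:
    """Multiply-accumulate operations in an (m, k) @ (k, n) matrix product."""
    return m * k * n

# One token through a server-sized hidden layer vs. a small edge-sized one
# (dimensions are illustrative assumptions).
big = matmul_macs(1, 4096, 4096)    # 16,777,216 MACs
small = matmul_macs(1, 288, 288)    #     82,944 MACs
print(big // small)  # the big layer does ~200x more work per token
```

At a ~200x gap, a CPU that is "only" tens of times slower per operation than a GPU can still keep up in wall-clock terms, which is roughly the bet the article describes.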
This is just not correct. Also, nobody is making optimization startups, because if you cared you'd have an in-house team working on it.
That is unfortunate, because these are special skills you may not find in-house. I know some guys who did it in-house for a long time, toured from project to project in the right phase, and saved bigcorp lots of money. Now they are doing it publicly.
https://efficientware.net/how-we-work/
Usually large companies attract or develop these skills through their scale. I do think there are a lot of smaller companies that are underserved in this area, though.
There are a few "optimization startups". But in this context I find it a bit ironic that pretty much everyone is working with the same architecture, and the same hardware for the most part, so actually there isn't really that much demand for bespoke optimizations.
Those that are serious are paying through the nose for their engineers to work on these optimizations. Your competitor working on "the same hardware" does not magically make your MFU go up.
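For anyone unfamiliar with the term, MFU (model FLOPs utilization) is just the FLOPs a training run actually achieves divided by the hardware's theoretical peak. A back-of-envelope sketch using the common ~6 FLOPs per parameter per trained token approximation; the model size, throughput, and peak numbers here are assumptions, not measurements:

```python
def training_mfu(params: float, tokens_per_sec: float, peak_flops: float) -> float:
    """Rough MFU: ~6 FLOPs per parameter per token covers the
    forward plus backward pass of a dense transformer."""
    achieved = 6 * params * tokens_per_sec
    return achieved / peak_flops

# Assumed: a 7B-parameter model training at 2,000 tokens/s on hardware
# with a 1e15 FLOP/s peak (all numbers illustrative).
print(f"{training_mfu(7e9, 2000, 1e15):.1%}")  # 8.4%
```

The engineers in question are the ones who push that ratio up; nothing about a competitor's hardware choices does it for you.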
And when you have enough spending to account for 1%+ of revenue for the AI hardware companies?
You can get the engineers from those very hardware companies to do bespoke optimizations for your specific high load use cases. That's something a startup would struggle to match.
It's good that you didn't give up! So if I understand correctly, they spend more even though they could optimize and spend less?
OP here, I didn't write the post, but found it interesting and posted it here.
> So if I understand correctly, they spend more even though they could optimize and spend less
This is what I understand as well: we could utilise the hw better today and make things more efficient, but instead we are focusing on making more. TBH I think both need to happen: money should be spent to make better, more performant hw, and at the same time we should squeeze out any performance we can from what we already have.
I believe the author is making the point that the companies spending all this money on hardware aren't concerned at all with how the hardware is actually used.
Optimization isn't even being considered, because it's the total cost spent on hardware that is the goal, not the output from the hardware.
I have slight trouble believing that Mr “Stop wasting tokens by saying please to LLMs” Altman is not considering how his models can be optimized. I suppose the real question is how accurate the utilization numbers in the article are.
I stopped paying attention to any specific thing Sam Altman says a while ago. I've seen too many examples of interviews or off the cuff interactions that make me think very little of him personally.
For example, I could see him saying not to waste tokens on "please" simply because he thinks that is a stupid way to use the LLM. I.e. a judgement on anyone that would say please, not a concern over token use in his data centers.
But can that really be the case? It takes a long time to train and tune the models; squeezing out even a low single-digit percentage improvement implies much faster iteration.
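As a rough sense of scale for why even small squeezes matter (the run size and the percentage are assumed for illustration):

```python
def gpu_hours_saved(total_gpu_hours: float, speedup_pct: float) -> float:
    """GPU-hours freed if throughput improves uniformly by speedup_pct."""
    s = speedup_pct / 100
    return total_gpu_hours * s / (1 + s)

# Assumed: a 1,000,000 GPU-hour training run and a 3% throughput gain.
print(round(gpu_hours_saved(1_000_000, 3)))  # ~29,000 GPU-hours back
```

At cloud GPU prices, a low single-digit gain on a run that size pays for a lot of engineer time.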
Until investors specifically incentivize speed or cost for the next iteration I wouldn't expect them to optimize for efficiency.
Right now it seems investment is primarily based on vibes, media hype, and total spend on hardware and infrastructure.
Sounds like someone who got lucky in the big picture (being in ML during the AlexNet era), but then unlucky in picking the sub-genre.
>I see hundreds of billions of dollars being spent on hardware
>I don’t see are people waving large checks at ML infrastructure engineers like me
Which seemed like a valid question until you look at the GitHub: <1B-parameter, Raspberry Pi-class edge speech models. That's not the game the hyperscalers are playing.
I don't think we can conclude much of anything about the datacenter build-out from that.
> That's not the game the hyperscalers are playing
The hyperscalers are playing the game hyperscalers are playing - and only them. Where do they expect to find talent, then? If the logic is that you need to work at a hyperscaler to work at a hyperscaler, no wonder they won't find any talent. That would be like NASA only hiring astronauts to send to space if they already had experience being in space.
That's not the logic, they obviously hire from outside. The author's complaint is not that he can't get hired. He doesn't want to get hired, even! The complaint is rather that investors aren't funding his startup.
Great insight & nice read! Thanks
The author is definitely correct in pointing out the incentives for companies to buy hardware. What the article misses is that there is in fact a reasonable economic incentive not to invest in software, even if LLMs were not an economic bubble. Every single company is developing the same thing; many even develop it as open source, and the closed ones (as well as any company that would hire this guy) have a bunch of industrial spies inside anyway. Buying hardware may increase your moat, but developing software just raises the sea level.
> When I look around, I see hundreds of billions of dollars being spent on hardware – GPUs, data centers, and power stations. What I don’t see are people waving large checks at ML infrastructure engineers like me and my team.
That doesn't seem to be the case to me. I guess the author wants to do everything on his own terms and maybe companies aren't interested in that.
There's probably a bit more to it. It would really only take one company betting on optimizing infrastructure, to the degree the author suggests, to undermine the entire house of cards currently being built on Nvidia GPUs - yet not one AI company is willing to take that bet?
The author could also be correct. Investors tend to be herd animals, and if you're not buying into the same tech as everyone else, your proposal is higher risk. It might very well be easier to say to an investor that you're going to buy a million Nvidia GPUs and stuff them in a datacenter in Texas like everyone else.
I'm interested in the one company that does take the bet on infrastructure optimization. If that works, then a lot of people are going to lose a lot of money really quickly.
Every part of this is nonsense.
Spending a lot (on capex or opex) certainly is not providing any kind of signaling benefit at this level. It's the opposite, because obviously every single financial analyst in the market is worried about the rapid increase in capex. The companies involved are cutting everything else to the bone to make sure they can still make those (necessary) investments without degrading their top-line numbers too much. Or in some cases actively working to hide the debt they're financing this with from their books.
Even if we imagined that the author's conspiracy theory were true, there would still be massive incentives for optimization because everyone is bottlenecked on compute despite expanding it as fast as is physically possible. Like, are we supposed to believe that nobody would run larger training runs if the compute was there? That they're intentionally choosing to be inefficient, and as a result having to rate-limit their paying customers? Of course not.
The reality is that any serious ML operation will have teams trying to make it more efficient, at all levels. If the author's services are not wanted, there are a few more obvious explanations than the outright moronic theory of intentional inefficiency. In this case, most likely it's that their product is an on-device speech-to-text model, which is not at all relevant to what is driving the capex.
> Spending a lot (on capex or opex) certainly is not providing any kind of signaling benefit at this level.
It's not providing any benefit now but there's still signalling going on, and it absolutely provided benefit at the beginning of this cycle of economy-shattering fuckwittery.
"OpenAI rejected me, so the entire industry is going to collapse" is certainly a take. He is still probably one of the less arrogant engineers in Silicon Valley.
That's not the take?
The take is that small incremental improvements to the hardware-software stack at that scale imply massive returns, yet there isn't much work for that use case.
There’s no sour grapes in this article. I went in expecting the same but found that the author actually makes a good point.
It isn't really. The assumption that these companies aren't hiring any infrastructure engineers is absurd. They all have massive in-house teams doing GPU optimization and everything else that the author brings up. They just don't need an external consultant for it.
He didn't say they aren't hiring _any_, but that they are hiring few, and that he finds it strange that despite his multi-decade record of squeezing performance out of the GPU-software stack, he isn't getting many collaboration proposals.
There are people who are experts in a generalist sense. When a new field opens up, they quickly snatch up the opportunity and make immense progress and a name for themselves in the evolving field. So in this case the first author is the mouse who ate the cheese and died.
>is the mouse who ate the cheese and died.
I don’t follow what this means
Sorry, I should have said he died in the process of getting the cheese while the second mouse got the cheese.
The phrase "the second mouse gets the cheese" means that it can be beneficial to let others take the initial risk: the first to act might trigger a negative outcome, leaving the opportunity for the second to succeed without the same danger.
They're referencing a famous book / parable about navigating change in business, "Who Moved My Cheese"
https://en.wikipedia.org/wiki/Who_Moved_My_Cheese%3F