> I noted that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers.
Do we know that AI providers are going to keep these per-token prices, or eventually lower them because of competition from China?
Many lower-budget individuals are now moving to China open weight models like DeepSeek. I wonder if China's really subsidising the providers, or if inferencing costs are actually much lower, and Anthropic/OpenAI are just making sure no money's left on the table for their eventual IPOs.
f311a
How many more months do we need to wait, until big companies realize that flash models work just fine if you:
1) Don't ask LLMs for big changes
2) Review everything and point them in the right direction
Large models still suck at big changes, they produce questionable architecture and you still have to review the code, if your project is serious enough.
The codebase quickly becomes a mess if you don't pay enough attention. Does not matter which model.
So why bother with big models, when flash models are 10x cheaper and much faster to iterate under guidance? Large models can be used for security and bug audits. Flash models work almost the same for changes under 300 LOC when you dictate how you want your code to look.
thundergolfer
> That means each employee's AI spending cap is ~11% of that median compensation package.
Probably better to use the fully-loaded cost of the engineer, which is much higher than their compensation package. The fully-loaded cost is the total cost paid for the labor power of the engineer, and it includes big ticket items such as office space, food, equipment, insurance, payroll tax, fringe benefits, recruiting costs.
If the median compensation package is $330k/year then the median fully loaded cost is probably around $450-500k.
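A rough sketch of that comparison, for anyone who wants to plug in their own numbers. It assumes the quoted ~11% figure corresponds to two capped tools at $1,500/month each ($36k/year), and uses the $450-500k fully-loaded guess from above. None of these are Uber's actual figures.

```python
# Share of an engineer's annual cost represented by the AI spending cap.
# All inputs are assumptions taken from the thread, not real Uber data.
MONTHLY_CAP_PER_TOOL = 1_500
TOOLS = 2  # assumption: the ~11% figure implies two capped tools
MEDIAN_COMP = 330_000


def cap_share(annual_cost: float) -> float:
    """Return the annual AI cap as a fraction of the given annual cost."""
    annual_cap = MONTHLY_CAP_PER_TOOL * TOOLS * 12
    return annual_cap / annual_cost


for label, cost in [
    ("compensation only", MEDIAN_COMP),
    ("fully loaded, low guess", 450_000),
    ("fully loaded, high guess", 500_000),
]:
    print(f"{label}: {cap_share(cost):.1%}")
```

Under those assumptions the cap drops from roughly 11% of compensation to somewhere around 7-8% of fully-loaded cost.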
tuesdaynight
Why are there still so many people who believe that AI coding is a fad? It's something that started less than two years ago and companies are already paying thousands per seat. I know one that gives you 5k per month. Which other tool went from nothing to this level of acceptance so quickly?
CharlieDigital
$1500/mo is $18,000/seat/annum.
Maybe Microsoft and Nvidia are on to something.
128 GB machines that can run local LLMs are a bargain even if priced $5-8k. Yes, tok/s is not quite there, but that's probably OK since the bottleneck really isn't the code; it's WTF did Uber build with all of that spend? How did it meaningfully impact their revenue in a positive direction?
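A minimal payback sketch for the local-machine argument, assuming the machine fully replaces $1,500/month of API spend (a generous assumption, since it ignores the tok/s and quality gap mentioned above). The electricity figure is a hypothetical parameter.

```python
import math

MONTHLY_API_SPEND = 1_500  # the per-tool cap discussed in the thread


def payback_months(machine_price: float, monthly_power_cost: float = 0.0) -> int:
    """Whole months until cumulative API savings cover the machine price."""
    monthly_saving = MONTHLY_API_SPEND - monthly_power_cost
    if monthly_saving <= 0:
        raise ValueError("running costs exceed the API spend being replaced")
    return math.ceil(machine_price / monthly_saving)


for price in (5_000, 8_000):
    print(f"${price:,} machine pays back in {payback_months(price)} months")
```

At the $5-8k price range that works out to about four to six months, before counting any productivity lost to slower or weaker models.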
siliconc0w
I use the $100/mo sub but my 30 day API cost is about $1700/mo.
It really depends how you use it, if you're using prompts to generate detailed designs, breaking those into lists of tasks, and then feeding those to multiple agents - it's really easy to burn through many thousands.
If you're being more deliberate and using a few agents at a time interactively, having it review PRs/resolve issues, automated clean-ups and performance optimization, etc it could be more like $1500.
If you're just throwing it one-off questions like a better Stack Overflow, that is well under $100.
I've really gotten into /goal, if you can find something verifiable and leave it overnight - it's kinda like christmas morning to see where it landed.
thesumofall
Plenty of comparisons here between salaries and token costs. All fair, but it very much assumes that salaries are rational. Why do we pay some engineers 10x as much for the same role just because they are in a different location? The WFH discussion surfaced some of that. If money is cheap, all sorts of funny things happen. Is it worth spending 1500 USD on AI? I don’t know. Is it worth paying engineers 300k USD instead of 30k? Honestly, I don’t know.
marcosdumay
Just to put this in context. If every company did this, all over the world, with that same limit, we are talking about something around $45B monthly in revenue for all AI companies to share.
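The comment doesn't state the headcount behind that figure, so here is a quick back-calculation. It only divides the quoted $45B by the $1,500 cap; the seat count it prints is implied by those two numbers, not a sourced estimate.

```python
# Back out the number of capped seats implied by the $45B/month figure.
MONTHLY_REVENUE = 45_000_000_000  # figure quoted in the comment
MONTHLY_CAP = 1_500               # per-seat cap discussed in the thread


def implied_seats(monthly_revenue: float, monthly_cap: float) -> float:
    """Seats needed for every seat spending exactly the cap to reach the revenue."""
    return monthly_revenue / monthly_cap


seats = implied_seats(MONTHLY_REVENUE, MONTHLY_CAP)
print(f"implied seats: {seats:,.0f}")
print(f"implied annual revenue: ${MONTHLY_REVENUE * 12:,}")
```

So the estimate amounts to about 30 million seats all spending right up to the cap, which is an upper bound rather than a forecast.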
jkwang
The $1500 number is less interesting than the fact that they hit a ceiling at all. Most engineering teams I've talked to have no idea what their AI spend is per developer because it's buried in a consolidated cloud bill. Having a hard cap forces two useful conversations: what workflows actually justify API calls vs local inference, and whether the output is being measured against any real productivity metric. Without that feedback loop it's just a race to see who can burn tokens fastest.
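As a sketch of what "knowing your per-developer spend" could look like once line items are pulled out of a consolidated bill: the record layout, developer names and amounts below are all hypothetical, and a real bill would need its own parsing.

```python
from collections import defaultdict

MONTHLY_CAP = 1_500.0  # per developer, per tool


def spend_per_developer(line_items):
    """Sum spend per (developer, tool) from (developer, tool, usd) records."""
    totals = defaultdict(float)
    for developer, tool, usd in line_items:
        totals[(developer, tool)] += usd
    return dict(totals)


def over_cap(totals, cap=MONTHLY_CAP):
    """Return the (developer, tool) pairs whose monthly total exceeds the cap."""
    return sorted(key for key, usd in totals.items() if usd > cap)


# Hypothetical line items for one month.
items = [
    ("alice", "tool_a", 900.0),
    ("alice", "tool_a", 700.0),
    ("alice", "tool_b", 200.0),
    ("bob", "tool_a", 40.0),
]
totals = spend_per_developer(items)
print(over_cap(totals))
```

The cap is checked per tool here, matching the "per tool" wording of the policy under discussion.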
c7b
1,5k. For two months of that spend you could buy a machine that can self-host decent models, plus a year's worth of electricity. It's not up there in terms of quality, but with a bit more effort it works pretty decently. I'm completely baffled that that's not way more common, is it really just the quality?
blobbers
I think the main thing companies should try to understand is avoiding the use of 'claude -p'.
I definitely have written a goal file, and then just ran claude in a loop over the goal in order to 'token max'... why not? I'm doing research and have some clear KPIs where research into all kinds of techniques / tuning can improve the results. I can spend my budget on an "experiment with blah blah blah to improve blah blah" or give it a list of things to try that I know will take a while.
It's no problem hitting hundreds of $ of API spend while sitting at a computer with 3 monitors and 6 windows of useful claude code interactive sessions, while working on 2 or 3 projects and using worktrees, and it's a little weird when you hit your limit by 2 o'clock and have to wait for token budgets to reset; god forbid, I manually edit code... which I did do for the first time in months.
You can also start to generate a lot of token spend if you do something like "hey make me a stylized slide deck using internal skill / agent XYZ based on commits A through C", which, as an engineer, makes building presentations much less painful.
This uber limit is not high compared to the big SV companies.
suncemoje
Lock-in / switching costs are increasingly concerning me. I am using Claude for a good year now and have been accumulating so much "knowledge" in there by now. If Claude became less favorable in terms of price/performance in the future, that would worry me. I've started to think about a distributed solution, where my storage is detached from the inference, but currently Claude is still the way to go for me. Wondering if anyone has similar concerns?
john01dav
Why isn't self-hosting (even just renting a GPU server, not necessarily on premise) at large companies, or hosting via something like Together AI to run the open-weight models, more common? I've tried the open-weight models and the premium models like Opus and Gemini Pro, and I find that the latter are a little better, but not nearly to the degree to justify the extreme price difference, since the differences largely don't matter for what I've tried them for, and I expect that many other users likely have similar use cases.
linuxhansl
I use Claude every day. Often for multiple hours a day.
Basically doing my job not worrying how many tokens I spend (as in too many or too few). This is a pretty complex code base (database optimizer and related).
Just looked at my spend for the past 30 days; it didn't even come to $600. 95% of my tokens are from cache. To reach even $1500 I would have to let claude run unsupervised overnight (and with the amount of mistakes it still makes and guidance it needs, I do not believe we are there yet).
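To illustrate why a 95% cache share matters so much, here is a small sketch of the input-token bill relative to an uncached baseline. It assumes cached reads are billed at a fraction of the normal input price; the 0.1 multiplier is an illustrative assumption, so check your provider's pricing before relying on it. Output tokens and cache-write charges are ignored.

```python
# Relative input-token bill when a share of tokens is served from cache.
# CACHE_READ_MULTIPLIER is an assumed discount, not a quoted price.
CACHE_READ_MULTIPLIER = 0.1


def relative_input_cost(cache_hit_share: float,
                        multiplier: float = CACHE_READ_MULTIPLIER) -> float:
    """Input cost as a fraction of what the same tokens would cost uncached."""
    if not 0.0 <= cache_hit_share <= 1.0:
        raise ValueError("cache_hit_share must be between 0 and 1")
    uncached_share = 1.0 - cache_hit_share
    return uncached_share + cache_hit_share * multiplier


for share in (0.0, 0.5, 0.95):
    print(f"{share:.0%} cached -> {relative_input_cost(share):.1%} of baseline")
```

Under that assumed discount, a 95% cache share cuts the input bill to well under a fifth of the uncached cost, which would go some way toward explaining the gap between heavy interactive users and the $1,500+ numbers.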
geodel
> A $1,500 monthly limit per tool strikes me as a rational policy response to over-spending,...
> I noted that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers.
This whole article seems to me like Multi level marketing "businesses" where 'Diamonds' have made their money by promoting MLM in seminars and telling hopefuls at bottom that "Buying AI subscription now is their one shot to be a winner in life"
Perhaps there is something to MLM vs LLM to create a FOMO effect.
dzonga
> That means each employee's AI spending cap is ~11% of that median compensation package.
when looking at costs - numbers make sense. however decisions as an org/company/solo founder - costs help you set prices, but to reach profitability you want to model around ROI.
now the question is what's the ROI for a $36K/investment per engineer or $90M for the total org ?
I bet the ROI is negative.
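For reference, the ROI framing can be written down directly. The $36k and $90M figures are the ones quoted above; the value-created inputs are placeholders, since nobody in the thread has real numbers for them.

```python
ANNUAL_COST_PER_ENGINEER = 36_000  # figure quoted in the comment
ORG_TOTAL = 90_000_000             # figure quoted in the comment


def implied_headcount(org_total: float, per_engineer: float) -> float:
    """Number of engineers implied by the two quoted totals."""
    return org_total / per_engineer


def roi(value_created: float, cost: float) -> float:
    """Simple return on investment: net gain divided by cost."""
    return (value_created - cost) / cost


print(f"implied headcount: {implied_headcount(ORG_TOTAL, ANNUAL_COST_PER_ENGINEER):,.0f}")
# Placeholder values for the extra output one engineer produces per year.
for value in (18_000, 36_000, 54_000):
    print(f"value ${value:,} -> ROI {roi(value, ANNUAL_COST_PER_ENGINEER):+.0%}")
```

ROI only turns positive once the extra value per engineer exceeds $36k a year, so the bet above is really a bet about whether that threshold gets cleared.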
watershawl
Do you think companies are gonna be like?:
Wait a minute. We didn’t save money by adding AI. We just added an expense.
Now we have to pay for employees AND AI.
CSMastermind
A blanket cap makes no sense to me. There's a power distribution of AI use in my company and I'd imagine it's the same at a much greater scale at Uber.
I'd guess there should be a few people Uber is basically allocating unlimited AI spending to and a large swath they're giving basically nothing.
szatkus
That's a lot. On my usual day I burn less than $1 on Opus. I could get beyond $10 only if I have a complex and well-defined problem, which is rare (the second part at least).
colonelspace
If a worker doesn't use their AI/LLM budget, can they get a raise?
PessimalDecimal
These are still at currently subsidized prices. We'll see if they think they're getting $1500/month of value when that buys significantly fewer tokens.
deviation
$300/day at Apple, with an increase to $500 with manager approval.
pmontra
I wonder what they are doing with $1500 per month. I'm on Claude Pro $20 plan and I'm doing well. That's 3 days per week. On the other 2 days I'm using a customer's Claude Max, I don't know if it's the $100 or the $200 plan, but I'm sharing it with some of its other developers.
throw0606
When blue-collar workers were losing jobs they were told to learn to code, and now engineers are vilifying AI for taking jobs.
newobj
It's also a useful signal for AI value. Looks like it's a max value add of $18,000 per engineer per year.
cmiles8
And $1500 a month is on the very high end of where most companies will land. When you run the numbers there isn’t a realistic path that connects the dots between likely market size and the claimed valuation of the AI companies. The math simply does not add up.
schnitzelstoat
How are people using so many tokens? I'm on the $200/month enterprise plan for Claude Code (because it's a better deal than the API pricing) and I don't come close to the limits.
If you use stuff like opusplan and /advisor so you use Sonnet for most of the work and only Opus for the really complex stuff then it's quite easy to keep costs low without affecting performance.
827a
This week an S&P 20 company with previously unlimited Claude limits also set a $250/mo/person limit, though it's unclear to me how widely the limits are being enforced; it may be the case that it's just non-software engineers. Do with this info what you will.
etothet
In my experience, this is far below the cost the average dev will incur per month so this seems very reasonable to me. And, no doubt there are exceptions for heavy users so they can get some extra token usage when they need it.
sameersri2004
It's a lot when using Chinese models, less when using Opus 4.8
andix
It finally puts a number on productivity gain of engineers with AI. This is probably less than 10% of the cost of an average uber developer. So they don't assume much more productivity gain from AI than 10%.
(Cost of an employee is much higher than their salary, it includes things like office space, supporting structures like HR/accounting, insurance, hardware/software, and much more)
epsteingpt
Uber engineers reported that loading their workspace and pulling recent commits exhausted that AI limit for Claude Code (4.8 x-high) immediately.
rasbmn
Uber is in the business of experimenting with robotaxis and automated food delivery.
They can't say that $0 per employee is the appropriate amount for AI spending. So they capped it, perhaps in order to "send a signal" that is eagerly picked up by the AI boosters.
There is no signal. Uber does not work any better since AI. They still want to promote AI, so they chose the highest number that doesn't bankrupt them so the press and AI promoters pick it up as the new price anchor.
Probably they'll quietly reduce the number more soon.
LurkandComment
1) This happened because they fundamentally misunderstand how to use AI and how AI is priced
2) Most organizations are throwing everything in for analyses and not limiting the answer they want. You need to be specific about what you analyze and what answers you want
3) People undervalue prompting or templated responses. I will have written, validated and sanity-checked a prompt several times and run it across several models before I say it's ready for use. But when it is, I know what it will give me, and that the scope of its research and answer is as close to what I want as it can be, with as little excess as possible. This all saves tokens
galaxyLogic
It's probably a good thing that Uber developers are now forced to do some coding on their own. Only use AI where it absolutely helps
zkmon
The big question is, will the productivity gains be absorbed by the needs? Societies don't have a need for infinite amount of luxury and laziness offered by the productivity of the machines. At some point, you would shake off things, get up from the couch and start walking again, breathing afresh.
meszmate
It still probably produces better results than some junior engineers in a lot of cases.
But yeah, for a company at Uber’s scale, I can see why they would want real engineering discipline around it.
sylwk
Due to recent Copilot price increase my friend was capped to $70 per month of usage. Not on a subscription…
My $100 subscription is not cheap. At the same time our product burns orders of magnitude more tokens.
packspro
The tool categories that pay for themselves fastest: (1) Anything that gets invoices out faster and makes it easier for clients to pay. (2) Scheduling links that eliminate email back-and-forth. Everything else is optimization. I keep notes on which freelancer tools hit each threshold at freelancerkit.surge.sh
Galanwe
I think the logical follow up will be for Uber to lay off a bunch of people so that the remaining ones can token maxx.
To the mooooon!
jwpapi
If you estimate a 10k salary per engineer, that means 1,500 is the point where it's cheaper for them to hire another engineer. That doesn't mean it's improving productivity by 15%, but if 15% is the point where it stops being better than another human, can we assume 7.5%?
Probably even less, because you would also spend those 1500 extra per employee if you just saved 10%, so 150 per employee; that's 1.5% of salary.
This is imho one of the best ranges we can assume for now. How much would that be across the whole SWE market?
ilia-a
Seems an odd limit, especially since it's highly dependent on the token provider used. With Opus this is not much and could easily be burnt in a week or less, but with something like DeepSeek the 1500 can literally be an annual budget.
That being said, I do have to wonder why someone as big as, say, Uber doesn't simply roll out an OSS model in the cloud for their team. I'd imagine that would be the cheapest and most flexible option, while also keeping all the data shared with the LLM private.
5701652400
eventually tokens will cost the price of energy. and china is miles ahead.
china will be major token exporter soon. mark my words.
easygenes
If I were paying API rates this year, I would have already burned through $20k in tokens. Looking forward to the costs of this level of capability coming down.
era-epoch
Reading the headline
Oh that's actually really economical! I wonder if they're doing a lot on locally running models or managing a shared context or knowledge-base in some clever way, maybe just encouraging employees to be efficient and mindful.
...
> each employee
...
> per AI coding tool
...
> I noted that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI
What on this godforsaken earth are all you rich idiots doing???
transitorykris
Is anyone doing story point estimation in terms of tokens? If you have a token budget, does this change how you prioritize?
ewangzzz
I'm curious how much of the usage comes from vibe coding vs using agents/harnesses in internal tooling
gck1
A lot of talk about cheaper models here. Just curious, is there any non-Anthropic model that can do UI well? GPT-5.5 is laughably bad, and I'm never restarting my Anthropic subscription after their 6-month sprint of gaslighting, even if Opus was really good at UI.
hrpnk
If budgeted at $1,500/month per user, power users still can get 5-10x of that allocation if the user pool is large enough.
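A small sketch of how that pooling works out, assuming the budget is set as a pool of $1,500 per head and light users leave most of theirs unspent. The user counts and the $500 light-user average are made-up inputs for illustration.

```python
CAP_PER_USER = 1_500  # budgeted per user per month


def power_user_multiple(n_light: int, n_power: int, light_avg_spend: float,
                        cap: float = CAP_PER_USER) -> float:
    """How many times the nominal cap each power user can spend from the pool."""
    pool = (n_light + n_power) * cap
    remaining = pool - n_light * light_avg_spend
    return (remaining / n_power) / cap


# Hypothetical team: 90 light users averaging $500/month, 10 power users.
print(f"{power_user_multiple(90, 10, 500):.1f}x the nominal cap")
```

With those made-up numbers each power user gets about 7x the nominal cap, which lands inside the 5-10x range; the multiple shrinks quickly as the share of heavy users grows.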
walthamstow
I think a lot of people are missing that this is $1500 _per tool_ which is still rather a lot of money.
LeicaLatte
If China captures the market now, well deserved. Way cheaper compared to US providers.
ChrisArchitect
Related:
Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing
They are also beholden to enterprise pricing and can't use the subsidized consumer max plans.
gck1
ccusage for codex tells me the medium feature I prompted in codex, with a $200 subscription, running for 72 hours and still not delivering full result would have cost ~ $2200 at API rates.
I also misconfigured something in my agent's configuration and a simple web tool request (maybe 4 turns) through OR went to GPT-5.5 accidentally and that cost me ~$0.4.
I have no idea how any business can afford API rates without having a mindset of casually setting money on fire.
cadamsdotcom
Token costs rising because data center build costs must be paid down.. is not the whole picture. It is actually possible for token costs to fall despite the spending frenzy.
Naively you’d expect to always keep paying more - but growth in token usage is what changes the equation. Amortizing debt over an exponentially growing amount of spend across a growing customer base (not per customer) lets the debt be paid off & costs covered even as each individual’s spend stays steady or even goes down - but it only works if there’s growth beyond some threshold that makes the whole thing hang together. No one on the outside knows how much growth that is, and everyone chases maximum growth.
Jevons Paradox ends up being your friend as well as the friend of the inference providers as well as the friend of the inference financiers.
If it’s a strong enough effect, it has potential to cancel out all the circular financing too, and let everyone ride out the bursting of the bubble.
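A toy model of the amortization argument above: hold the annual debt payment fixed and let token volume grow, then look at the debt cost carried by each million tokens. The payment, starting volume and growth rate are invented inputs, and real financing is obviously not a flat payment.

```python
def debt_cost_per_million_tokens(annual_payment: float, tokens_year0: float,
                                 growth: float, years: int) -> list[float]:
    """Debt service per million tokens for each year, with volume compounding."""
    costs = []
    for year in range(years):
        tokens = tokens_year0 * (1 + growth) ** year
        costs.append(annual_payment / (tokens / 1_000_000))
    return costs


# Hypothetical: $1B/year of debt service, 1e15 tokens in year 0, volume doubling yearly.
for year, cost in enumerate(debt_cost_per_million_tokens(1e9, 1e15, 1.0, 4)):
    print(f"year {year}: ${cost:.3f} per million tokens")
```

The per-token burden halves each year in this setup even though total debt service never falls, and with zero growth it stays flat, which is the threshold effect the comment describes.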
KnuthIsGod
China will bring down the price per million tokens.
edg5000
Why are people getting these high spending numbers? A 200 USD subscription for either Codex or Claude should give you plenty of usage. What am I missing? Are they just being dumb?
morpheos137
the really interesting way to address the question of token effectiveness would be internal alpha vs beta testing and measuring marginal revenue generated by similar teams using ai at different usage levels. right now $1500 a month is not a meaningful signal of anything beyond current executive willingness to spend. in the long run executives will cut spending where it does not support income generation.
nalekberov
What is the point of allowing a developer to spend $18,000 a year on AI subscriptions? Can't they hire a decent developer who is capable of producing a quality solution faster?
Clearly, these decisions are all made by high-level management team.
I was recently talking to an HR person from a European company, and she goes: 'We are forcing our developers to use AI coding agents, but they are still kind of hesitant.' This person had never written a single line of code, nor did she know what software engineering is. For these people, using AI coding agents = faster delivery without breaking anything.
insane_dreamer
I still have never hit a ceiling with my Claude Max $100 account, much less the Max $200 account. I'm not burning tokens needlessly, nor running it all day, but I do use CC almost daily. What are these devs doing that they are burning more than $1500 in tokens a month?
Maybe it's just me, but I still find that I really have to "shepherd" the AI and work with it to get the results I want. And I read every line of code added and challenge the model's logic. So that limits my token burning. Maybe these people are just "vibe-coding" without really checking the results?
sremani
I have strong conviction that companies will now choose tech stack/programming languages based on 'tokenomics'. I am vibe coding using Clojure, a language I can read but cannot write and I never hit the usage limits even when using the latest model on Claude. I have similar experience with F#, which is a bit more verbose than clojure but absolutely beats every OOP language, Python, Typescript etc.
The reason, I use F# & Clojure is they hit JVM and CLR, two popular enterprise stacks.
In my not so humble opinion Lisp(Clojure) still remains the language of AI.
noncoml
They want to replace employees with AI, then replace paid AI with unpaid AI.
Their wet dream was never automation. It was zero marginal cost labor. And that dream is starting to rot.
ipunchghosts
Why aren't they using Claude code 20x for 200/month?
nphardon
It's wild; at my shop in Silicon Valley they dropped us from unlimited use to 60% prem budget on copilot. People are walking around like zombies.
cyanydeez
no....the fact that you could buy a reasonably priced Mac or AMD 395+, that's AI tool pricing; it loads a big enough model and spits out tokens just fast enough that you can read what it's doing and comprehend it instead of magic.
That's the most useful signal. Pre OpenAI mafia RAM pricing, that comes out to $250/month.
> I noted that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers.
Do we know that AI providers are going to keep these per-token prices, or eventually lower them because of competition from China?
Many lower-budget individuals are now moving to China open weight models like DeepSeek. I wonder if China's really subsidising the providers, or if inferencing costs are actually much lower, and Anthropic/OpenAI are just making sure no money's left on the table for their eventual IPOs.
How many more months do we need to wait, until big companies realize that flash models work just fine if you:
1) Don't ask LLMs for big changes
2) Review everything and point them in the right direction
Large models still suck at big changes, they produce questionable architecture and you still have to review the code, if your project is serious enough.
The codebase quickly become a mess, if you don't pay enough attention. Does not matter which model.
So why bother with big models, when flash models are 10x cheaper and much faster to iterate under guidance? Large models can be used for security and bug audits. Flash models work almost the same for changes under 300 LOC when you dictate how you want your code to look.
> That means each employee's AI spending cap is ~11% of that median compensation package.
Probably better to use the fully-loaded cost of the engineer, which is much higher than their compensation package. The fully-loaded cost is the total cost paid for the labor power of the engineer, and it includes big ticket items such as office space, food, equipment, insurance, payroll tax, fringe benefits, recruiting costs.
If the median compensation package is $330k/year then the median fully loaded cost is probably around $450-500k.
Why there are so many people that still believe that AI coding is a fad? It's something that started less than two years ago and companies are already paying thousands per seat. I know one that gives you 5k per month. Which other tool went from nothing to this level of acceptance so quickly?
$1500/mo is $18,000/seat/annum.
Maybe Microsoft and Nvidia are on to something.
128 GB machines that can run local LLMs are a bargain even if priced $5-8k. Yes, tok/s is not quite there, but that's probably OK since the bottleneck really isn't the code; it's WTF did Uber build with all of that spend? How did it meaningfully impact their revenue in a positive direction?
I use the $100/mo sub but my 30 day API cost is about $1700/mo.
It really depends how you use it, if you're using prompts to generate detailed designs, breaking those into lists of tasks, and then feeding those to multiple agents - it's really easy to burn through many thousands.
If you're being more deliberate and using a few agents at a time interactively, having it review PRs/resolve issues, automated clean-ups and performance optimization, etc it could be more like $1500.
If you're just throwing it one-off questions like a better stack-overflow that is well under a $100.
I've really gotten into /goal, if you can find something verifiable and leave it overnight - it's kinda like christmas morning to see where it landed.
Plenty of comparisons here between salaries and token costs. All fair but very much assumes that salaries are rational. Why do we pay some engineers 10x as much for the same role just because they are in a different location? The WFH discussion surfaced some of that. If money is cheap, all sorts of funny things are happening. Is it worth to spend 1500 USD on AI? I don’t know. Is it worth paying engineers 300k USD instead of 30k? Honestly, I don’t know
Just to put this in context. If every company did this, all over the world, with that same limit, we are talking about something around $45B monthly in revenue for all AI companies to share.
The $1500 number is less interesting than the fact that they hit a ceiling at all. Most engineering teams I've talked to have no idea what their AI spend is per developer because it's buried in a consolidated cloud bill. Having a hard cap forces two useful conversations: what workflows actually justify API calls vs local inference, and whether the output is being measured against any real productivity metric. Without that feedback loop it's just a race to see who can burn tokens fastest.
1,5k. For two months of that spend you could buy a machine that can self-host decent models, plus a year's worth of electricity. It's not up there in terms of quality, but with a bit more effort it works pretty decently. I'm completely baffled that that's not way more common, is it really just the quality?
I think the main thing companies should try to understand is avoiding the use of 'claude -p'.
I definitely have written a goal file, and then just ran claude in a loop over the goal in order to 'token max'... why not? I'm doing research and have some clear KPIs where research into all kinds of techniques / tuning can improve the results. I can spend my budget on a "experiment with blah blah blah to improve blah blah" or give it a list of things to try that I know will take awhile.
Its no problem hitting hundreds of $ of API spend while sitting at a computer with 3 monitors have 6 windows of useful claude code interactive sessions, while working on 2 or 3 projects and using worktrees, and it's a little weird when you hit your limit by 2 o'clock and have to wait for token budgets to reset; god forbid, I manually edit code... which I did do for the first time in months.
You can also start to generate a lot of token spend if you do something like "hey make me a stylized slide deck using internal skill / agent XYZ based on commits A through C", which as an engineer, makes presentations building much less painful.
This uber limit is not high compared to the big SV companies.
Lock-in / switching costs are increasingly concerning me. I am using Claude for a good year now and have been accumulating so much "knowledge" in there by now. If Claude became less favorable in terms of price/performance in the future, that would worry me. I've started to think about a distributed solution, where my storage is detached from the inference, but currently Claude is still the way to go for me. Wondering if anyone has similar concerns?
Why isn't self hosting (even just renting a GPU server, not necessarily on premise) at large companies or hosting via something like together AI to run the open weight models not more common? I've tried the open weight models and the premium models like Opus and Gemini Pro, and I find that the latter are a little better, but not nearly to the degree to justify the extreme price difference, since the differences largely don't matter for what I've tried them for, and I expect that many other users likely have similar use cases.
I use Claude every day. Often for multiple hours a day. Basically doing my job not worrying how many tokens I spend (as in too many or too few). This is a pretty complex code base (database optimizer and related).
Just looked at spent for the past 30 day, didn't even come to $600. 95% of my tokens are from cache. If I were to reach even $1500 I have to let claude run unsupervised over night (and with the amount of mistakes it still makes and guidance it needs, I do not believe we are there yet.)
> A $1,500 monthly limit per tool strikes me as a rational policy response to over-spending,...
> I noted that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI - which currently costs me just $100 per provider thanks to their generous subsidized plans for individual subscribers.
This whole article seems to me like Multi level marketing "businesses" where 'Diamonds' have made their money by promoting MLM in seminars and telling hopefuls at bottom that "Buying AI subscription now is their one shot to be a winner in life"
Perhaps there is something to MLM vs LLM to create a FOMO effect.
> That means each employee's AI spending cap is ~11% of that median compensation package.
when looking at costs - numbers make sense. however decisions as an org/company/solo founder - costs help you set prices, but to reach profitability you want to model around ROI.
now the question is what's the ROI for a $36K/investment per engineer or $90M for the total org ?
I bet the ROI is negative.
Do you think companies are gonna be like?:
Wait a minute. We didn’t save money by adding AI. We just added an expense.
Now we have to pay for employees AND AI.
A blanket cap makes no sense to me. There's a power distribution of AI use in my company and I'd imagine it's the same at a much greater scale at Uber.
I'd guess there should be a few people Uber is bascially allocating unlimited AI spending to and a large swath they're giving basically nothing.
That's a lot. On my usual day I burn less than $1 on Opus. I could get beyond $10 only if I have a complex and well-defined problem, which is rare (the second part at least).
If a worker doesn't use their AI/LLM budget, can they get a raise?
These are still at currently subsidized prices. We'll see if they think they're getting $1500/month of value when that buys significantly fewer tokens.
$300/day at Apple, with an increase to $500 with manager approval.
I wonder what they are doing with $1500 per month. I'm on Claude Pro $20 plan and I'm doing well. That's 3 days per week. On the other 2 days I'm using a customer's Claude Max, I don't know if it's the $100 or the $200 plan, but I'm sharing it with some of its other developers.
When blue-collar workers were losing jobs they were told to learn to code, and now engineers are vilifying AI for taking jobs.
It's also a useful signal for AI value. Looks like it's a max value add of $18,000 per engineer per year.
And $1500 a month is on the very high end of where most companies will land. When you run the numbers there isn’t a realistic path that connects the dots between likely market size and the claimed valuation of the AI companies. The math simply does not add up.
How are people using so many tokens? I'm on the $200/month enterprise plan for Claude Code (because it's a better deal than the API pricing) and I don't come close to the limits.
If you use stuff like opusplan and /advisor so you use Sonnet for most of the work and only Opus for the really complex stuff then it's quite easy to keep costs low without affecting performance.
This week an S&P 20 company with previously unlimited Claude limits also set a $250/mo/person limit, though it's unclear to me how widely the limits are being enforced; it may be the case that it's just non-software engineers. Do with this info what you will.
In my experience, this is far below the cost the average dev will incur per month so this seems very reasonable to me. And, no doubt there are exceptions for heavy users so they can get some extra token usage when they need it.
It's a lot when using Chinese models, less when using Opus 4.8
It finally puts a number on the productivity gain of engineers with AI. This is probably less than 10% of the cost of an average Uber developer. So they don't assume much more than a 10% productivity gain from AI.
(Cost of an employee is much higher than their salary, it includes things like office space, supporting structures like HR/accounting, insurance, hardware/software, and much more)
Uber engineers reported that loading their workspace and pulling recent commits exhausted that AI limit for Claude Code (4.8 x-high) immediately.
Uber is in the business of experimenting with robotaxis and automated food delivery.
They can't say that $0 per employee is the appropriate amount for AI spending. So they capped it, perhaps in order to "send a signal" that is eagerly picked up by the AI boosters.
There is no signal. Uber does not work any better since AI. They still want to promote AI, so they chose the highest number that doesn't bankrupt them so the press and AI promoters pick it up as the new price anchor.
Probably they'll quietly reduce the number more soon.
1) This happened because they fundamentally misunderstand how to use AI and how AI is priced.
2) Most organizations are throwing everything in for analysis and not limiting the answer they want. You need to be specific about what you analyze and what answers you want.
3) People undervalue prompting or templated responses. I will have written, validated and sanity-checked a prompt several times and run it across several models before I say it's ready for use. But when it is, I know what it will give me, and that the scope of its research and answer is as close to what I want as it can be, with as little excess as possible. This all saves tokens.
It's probably a good thing that Uber developers are now forced to do some coding on their own. Only use AI where it absolutely helps.
The big question is, will the productivity gains be absorbed by the needs? Societies don't have a need for infinite amount of luxury and laziness offered by the productivity of the machines. At some point, you would shake off things, get up from the couch and start walking again, breathing afresh.
It still probably produces better results than some junior engineers in a lot of cases.
But yeah, for a company at Uber’s scale, I can see why they would want real engineering discipline around it.
Due to recent Copilot price increase my friend was capped to $70 per month of usage. Not on a subscription…
My $100 subscription is not cheap. At the same time our product burns orders of magnitude more tokens.
The tool categories that pay for themselves fastest: (1) Anything that gets invoices out faster and makes it easier for clients to pay. (2) Scheduling links that eliminate email back-and-forth. Everything else is optimization. I keep notes on which freelancer tools hit each threshold at freelancerkit.surge.sh
I think the logical follow up will be for Uber to lay off a bunch of people so that the remaining ones can token maxx.
To the mooooon!
If you estimate a 10k monthly salary per engineer, the 1500 cap is 15% of salary, which is the point where it becomes cheaper for them to hire another engineer. That doesn't mean it's improving productivity by 15%, but if 15% is the point where it stops being better than another human, can we assume 7.5%?
Probably even less, because you would also spend that extra 1500 per employee if you saved just 10%, so 150 per employee; that's 1.5% of salary.
This is imho one of the best ranges we can assume for now. How much would that be across the whole SWE market?
Seems like an odd limit, especially since it's highly dependent on the token provider used. With Opus this is not much and could easily be burnt in a week or less, but with something like DeepSeek the 1500 can literally be an annual budget.
That being said, I do have to wonder why someone as big as, say, Uber doesn't simply roll out an OSS model in the cloud for their team. I'd imagine that would be the cheapest and most flexible option, while also keeping all the data shared with the LLM private.
Eventually tokens will cost the price of energy, and China is miles ahead.
China will be a major token exporter soon. Mark my words.
If I were paying API rates this year, I would have already burned through $20k in tokens. Looking forward to the costs of this level of capability coming down.
Reading the headline
Oh that's actually really economical! I wonder if they're doing a lot on locally running models or managing a shared context or knowledge-base in some clever way, maybe just encouraging employees to be efficient and mindful.
...
> each employee
...
> per AI coding tool
...
> I noted that my own token usage comes to about $1,000/month against each of Anthropic and OpenAI
What on this godforsaken earth are all you rich idiots doing???
Is anyone doing story point estimation in terms of tokens? If you have a token budget, does this change how you prioritize?
I'm curious how much of the usage comes from vibe coding vs using agents/harnesses in internal tooling
A lot of talk about cheaper models here. Just curious, is there any non-Anthropic model that can do UI well? GPT-5.5 is laughably bad, and I'm never restarting my Anthropic subscription after their 6-month sprint of gaslighting, even if Opus was really good at UI.
If budgeted at $1,500/month per user, power users still can get 5-10x of that allocation if the user pool is large enough.
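A rough sketch of that pooling arithmetic, in case it helps. The user count, the share of light users, and their average spend are all made-up illustrative inputs, and the function name is hypothetical; only the $1,500 cap comes from the article.

```python
def power_user_headroom(
    n_users: int,
    cap_per_user: float,
    light_user_share: float,
    light_user_spend: float,
) -> float:
    """Monthly budget available to each heavy user when the whole pool is shared."""
    pool = n_users * cap_per_user
    n_light = round(n_users * light_user_share)
    n_heavy = n_users - n_light
    if n_heavy <= 0:
        raise ValueError("need at least one heavy user")
    # Whatever the light users leave unspent is split evenly among heavy users.
    remaining = pool - n_light * light_user_spend
    return remaining / n_heavy


if __name__ == "__main__":
    # Hypothetical org: 1,000 users, 90% of whom average $500/month.
    per_heavy = power_user_headroom(1000, 1500.0, 0.9, 500.0)
    print(f"Per heavy user: ${per_heavy:,.0f} ({per_heavy / 1500.0:.1f}x the cap)")
```

With those assumed inputs each heavy user gets 7x the nominal cap, which lands inside the 5-10x range. The result is very sensitive to how skewed the real usage distribution is.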
I think a lot of people are missing that this is $1500 _per tool_ which is still rather a lot of money.
If China captures the market now, well deserved. Way cheaper compared to US providers.
Related:
Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing
https://news.ycombinator.com/item?id=48268871
Uber torches 2026 AI budget on Claude Code in four months
https://news.ycombinator.com/item?id=47976415
Corporate America Is Starting to Ration AI as Cost Skyrockets
https://news.ycombinator.com/item?id=48335388
They are also beholden to enterprise pricing and can't use the subsidized consumer max plans.
ccusage for codex tells me the medium feature I prompted in codex, with a $200 subscription, running for 72 hours and still not delivering a full result, would have cost ~$2200 at API rates.
I also misconfigured something in my agent's configuration and a simple web tool request (maybe 4 turns) through OR went to GPT-5.5 accidentally and that cost me ~$0.4.
I have no idea how any business can afford API rates without having a mindset of casually setting money on fire.
"Token costs are rising because data center build costs must be paid down" is not the whole picture. It is actually possible for token costs to fall despite the spending frenzy.
Naively you’d expect to always keep paying more - but growth in token usage is what changes the equation. Amortizing debt over an exponentially growing amount of spend across a growing customer base (not per customer) lets the debt be paid off & costs covered even as each individual’s spend stays steady or even goes down - but it only works if there’s growth beyond some threshold that makes the whole thing hang together. No one on the outside knows how much growth that is, and everyone chases maximum growth.
Jevons Paradox ends up being your friend as well as the friend of the inference providers as well as the friend of the inference financiers.
If it’s a strong enough effect, it has potential to cancel out all the circular financing too, and let everyone ride out the bursting of the bubble.
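A toy sketch of the amortization argument, under loud assumptions: a fixed annual debt payment, token volume growing at a constant rate, and no operating costs or interest modeled. All numbers are hypothetical and the function name is made up for illustration.

```python
def capital_cost_per_million_tokens(
    annual_debt_payment: float,
    initial_tokens_millions: float,
    growth_rate: float,
    years: int,
) -> list[float]:
    """Debt payment per million tokens served, year by year, as volume grows."""
    if initial_tokens_millions <= 0:
        raise ValueError("initial_tokens_millions must be positive")
    costs = []
    volume = initial_tokens_millions
    for _ in range(years):
        # The same fixed payment is spread over this year's total volume.
        costs.append(annual_debt_payment / volume)
        volume *= 1 + growth_rate
    return costs


if __name__ == "__main__":
    # Hypothetical: volume doubles every year vs. stays flat.
    print("doubling:", capital_cost_per_million_tokens(100.0, 10.0, 1.0, 3))
    print("flat:    ", capital_cost_per_million_tokens(100.0, 10.0, 0.0, 3))
```

With doubling volume the capital cost per token halves every year; with flat volume it never moves. That is the "growth beyond some threshold" condition in miniature.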
China will bring down the price per million tokens.
Why are people getting these high spending numbers? A 200 USD subscription for either Codex or Claude should give you plenty of usage. What am I missing? Are they just being dumb?
The really interesting way to address the question of token effectiveness would be internal alpha vs beta testing and measuring the marginal revenue generated by similar teams using AI at different usage levels. Right now $1500 a month is not a meaningful signal of anything beyond current executive willingness to spend. In the long run executives will cut spending where it does not support income generation.
What is the point of allowing a developer to spend $18,000 a year on AI subscriptions? Can't they hire a decent developer who is capable of producing a quality solution faster? Clearly, these decisions are all made by the high-level management team.
I was recently talking to an HR person from a European company, and she goes: 'We are forcing our developers to use AI coding agents, but they are still kind of hesitant.' This person had never written a single line of code, nor did she know what software engineering is. For these people, using AI coding agents = faster delivery without breaking anything.
I still have never hit a ceiling with my Claude Max $100 account, much less the Max $200 account. I'm not burning tokens needlessly, nor running it all day, but I do use CC almost daily. What are these devs doing that they are burning more than $1500 in tokens a month?
Maybe it's just me, but I still find that I really have to "shepherd" the AI and work with it to get the results I want. And I read every line of code added and challenge the model's logic. So that limits my token burning. Maybe these people are just "vibe-coding" without really checking the results?
I have a strong conviction that companies will now choose tech stacks and programming languages based on 'tokenomics'. I am vibe coding using Clojure, a language I can read but cannot write, and I never hit the usage limits even when using the latest model on Claude. I have a similar experience with F#, which is a bit more verbose than Clojure but absolutely beats every OOP language, Python, TypeScript etc.
The reason I use F# and Clojure is that they target the CLR and the JVM, two popular enterprise stacks.
In my not so humble opinion Lisp (Clojure) still remains the language of AI.
They want to replace employees with AI, then replace paid AI with unpaid AI.
Their wet dream was never automation. It was zero marginal cost labor. And that dream is starting to rot.
Why aren't they using Claude Code 20x for $200/month?
It's wild; at my shop in Silicon Valley they dropped us from unlimited use to 60% prem budget on copilot. People are walking around like zombies.
No... the fact that you could buy a reasonably priced Mac or AMD 395+, that's AI tool pricing; it loads a big enough model and spits out tokens just fast enough that you can read what it's doing and comprehend it, instead of it being magic.
That's the most useful signal. Pre OpenAI mafia RAM pricing, that comes out to $250/month.
A lot of things can be done with local models.