Fact is that I can actually read TFA, while your link is paywalled.
ryanschaefer 1 hours ago [-]
I mean… should you be able to?
It looks like this is just an AI summarizing a bunch of other paywalled sources. It’s “by MLQ Agent”
dwoosley 10 hours ago [-]
I’d be curious to see the breakdown on spending by use case. I’ve heard it said that the majority of tokenmaxing comes from non-technical uses like reading PDFs, creating PowerPoints, generating graphics/images… etc. But I’ve never heard any actual proof of that.
wpasc 10 hours ago [-]
One thing I find fascinating as a software engineer who talks to non software engineers who use AI tools is how "reading PDFs" is not more of a solved problem. What I mean is that uploading a PDF into a chatbot tool seems to be an extraordinarily obvious use case that non technical (and technical) users would want to do.
IMO claude, chatgpt/codex, etc should be able to optimize the PDF use case to be extremely token efficient as it's a very obvious use case. But when I start to explain to my wife/friends why it burns through so much quota, I find myself thinking "why should they have to understand this aspect of it". to me, that the details of PDF parsing and extracting are relevant to users (instead of solved such that you don't have to pay attention to it) shows how these tools are not nearly as "ready" as they are made out to be. I may be preaching to the choir on this one, but just my 2c
mattnewton 10 hours ago [-]
Because PDFs are a nightmare of a format and the only thing that’s reasonably guaranteed about them is they will render to an image that people can read, the parsing of which will be much less token efficient than the equivalent text
wpasc 10 hours ago [-]
I agree with you, but every non-engineer I know using these tools 100% will drag and drop a PDF into a chatbot. Anthropic and OpenAI as companies who are selling their products to all sorts of businesses should have a much better means of handling this nightmare of a format because it is so pervasive and so obviously what so many of their customers are going to drop into the product.
spott 8 hours ago [-]
Why would they spend a ton of effort ensuring that their customers spend less money on them?
Token economics also are weird. If you design a fancy new frontend that for example uses a cheap model to parse a PDF into text that is fed into an expensive model, you will probably spend more money because you are on API payscale rather than the "max plan" payscale.
mattnewton 8 hours ago [-]
I’m saying there is basically no way to both make vlms able to understand the long tail of PDFs where the layout conveys information (like charts and tables) and to make it as token efficient as text formats. Current approaches have mostly chosen to work more often than not at the cost of token efficiency.
tiahura 10 hours ago [-]
I think they’ve just decided that vision gives the best results and the token issue will take care of itself.
watwut 4 hours ago [-]
Once you pay full price for tokens plus margin, it is better for the company to burn as many tokens as possible.
For the same reason as why the oil companies want everyone to use large cars.
tyre 10 hours ago [-]
For anyone needing to do this, the answer is to convert it to an image first. Far smaller, LLMs work well with them (even in some pretty insane use cases I've seen), and, along with human review, it can be a huge productivity gain that results in structured data.
spindump8930 10 hours ago [-]
I agree with your recommendation, but converting a pdf to an image is by no means smaller. PDFs are much closer to SVGs than to JPEGs.
schmuhblaster 9 hours ago [-]
Been building various LLM+PDF pipelines at work. As soon as you need to e.g. parse tables etc. it becomes a lot of hard work!
0cf8612b2e1e 10 hours ago [-]
I hope someday we can get out of this local maximum of PDF documents. The format is terrible, but was right place, right time and might be impossible to dislodge.
Loughla 10 hours ago [-]
The problem is that for 99% of people in 99% of cases they work fine. It's hard for people to understand that they're trash.
Source: my last job working with accessibility and that nightmare.
evdubs 10 hours ago [-]
You don't need to use an online service to do this; you get to avoid spending money on tokens doing it offline.
Gemma 4 works perfectly well offline on limited hardware (I have an 8GB video card) and can handle extracting text from image-based PDFs just fine.
Take a PDF -> run it through MarkItDown [1], using the OCR plugin if you need (point it to Gemma 4) -> now you can ask Gemma 4 questions about the (markdown) document.
I am sure Gemma 4 could even create a GUI to make this process very simple for a non technical user.
Amen. Normal office work is wildly different from what we read about on HN. If you were a CEO, determined to lay off all your people, you would want to really zero in on having your AI solve these very unsexy problems: extract data from Office and PDF. Grab data from some part of the screen of a webapp and parse it. Drive a line of business app via keyboard or mouse simulation. I know there are companies out there that try, e.g. Appian and (here in YC) Skyvern, but it's a hard problem and yet I feel this is where the true money is.
ShinyLeftPad 8 hours ago [-]
> how "reading PDFs" is not more of a solved problem
This and replies to this are surreal. It's like everyone simultaneously decided to forget that you don't need claude or whatever to read a PDF. The document is literally made for you to read...
stoorafa 4 hours ago [-]
> The document is literally made for you to read...
It’s disingenuous to assume every PDF is actually crafted to communicate to its recipients, even more so to pretend LLM users are in a position to understand all the PDFs they receive
There’s a lot of gray area where help understanding a document is fully reasonable
nojito 10 hours ago [-]
The best way to parse pdfs is to convert them to images and feed them into the llm.
This workflow is highly optimized.
wpasc 10 hours ago [-]
For sure there are very optimized ways to do it. My point is that a non technical user will drag and drop a pdf into a chatbot. and from a UX/product perspective, they shouldn't have to think about it more than that IMO. but seemingly, that's very much an expensive, inefficient way of doing it (burning through a whole context window trying to read it, reloading it multiple times per conversation, etc.).
seemaze 10 hours ago [-]
Absolutely this. Never try to parse a native PDF document with any expectation of coherence or consistency.
csomar 10 hours ago [-]
You are missing that the product is the hype cycle around AI and that's worth Trillions of $ (Trillions with a T). Why build a PDF parser that generates text when you can BS in a podcast and get paid.
This discussion was about measures, goals and incentives. Follow the incentives.
stavarotti 8 hours ago [-]
PDFs are both awesome and terrible at the same time. I've seen screenshots of emails added to pdfs alongside tables that span multiple pages. That you can do almost anything and guarantee that it'll look the same regardless of how or where it's viewed is a big selling point for a lot of businesses. It's this flexibility (I say madness) that makes PDFs notorious, and why some labs have document parsing as a leading product (see https://mistral.ai/news/ocr-4/).
georgeburdell 9 hours ago [-]
Anecdotally it's true for me. I can code all day with an agent and never went above $50, but the second I need to ingest a pdf doc to figure out a command I need to use it's easily $20-30 for 10 mins of work
ahmadyan 10 hours ago [-]
the majority came from random claws running on cron. They get a heartbeat, wake up every 10 mins, read all internal posts, emails, gchat messages, diffs, and decide to post some random message to the workplace so other claws can also regurgitate. rinse and repeat and then we are looking at $B tokens
adam_arthur 10 hours ago [-]
I'd guess through LLM embedded PoC projects.
You can rack up token consumption extremely quickly when you embed LLMs into automated processes or products.
I'd be very surprised if these numbers are just typical coding usage with no scripting/pipeline/automation stuff
menloshark 10 hours ago [-]
One thing we use it for is for forking tools internally because of politics.
thewhitetulip 8 hours ago [-]
Tokenmaxing also comes from less technical guys who had no moat pre AI
Using AI to suddenly deliver massive amounts of code without questioning the requirements
simonw 10 hours ago [-]
"The leaderboard, which ranked employees and teams by token consumption, inadvertently incentivized usage volume over productive output."
Who could possibly have predicted that happening?
Aurornis 10 hours ago [-]
A past employer thought it was a good idea to put up a leaderboard of who sent the most Slack messages. They celebrated the people at the top for being so active.
Predictably, everyone started talking in Slack like their jobs depended on it. Everyone was responding to everything. Instead of writing out a complete message and pressing enter, they'd send each fragment of the sentence as a new line.
The Slack leaderboard was never shown again. Unfortunately the habit remained because people were afraid they were going to be secretly judged by how much Slack activity they generated.
I expect the same thing is going to happen at companies who had token leaderboards. Once you've instilled that fear in people, they internalize the expectation.
PaulHoule 10 hours ago [-]
Reminds me of the place I worked at where I got in trouble because I was the only person writing JIRA tickets. Instead of bitching out the product manager or the tester for not writing tickets, they just complained to me. And if I wrote a ticket about how we could speed up the 40 minute build to 15 minutes I'd have to explain "How does this change improve the customer experience?" to which I answered "If the build was faster the customer would have had the product six months ago"
lokar 10 hours ago [-]
I worked somewhere that made time from PR being sent for review and ready to merge be a metric for the reviewers. Not time to add feedback in each round. Total time elapsed.
Insanity
Eridrus 10 hours ago [-]
This will inevitably be allocated like other budgets, and from talking to Meta folks about GPU budgets, it is going to be brutal.
Loughla 10 hours ago [-]
You have to realize that if you set a measure, you're actually setting a goal for your employees. There is no such thing as a meaningless metric; why else would you measure it?
No amount of "this isn't used for anything" will change that. It's inherent in human nature in the 21st century to believe any and all metrics will be used against them, and therefore must be gamed.
It's why you also have to set UNBELIEVABLY clear goals and have incentives tied to those goals. Incentives meaning money. If you want to measure things, measure them. But have clear, consistent, and meaningful goals tied to bonuses or something if you want a thing done correctly.
chillfox 3 hours ago [-]
I worked at a place that argued that nobody would game the metrics because it would be wrong and they never stated what the metrics were… while I was gaming the metrics and they were praising me for being one of the best on the team.
It was an unreal experience.
4yfr 10 hours ago [-]
Kinda.
The answer is simpler on the surface: focus.
Generally the problem is the larger the firm’s operations, the harder it is to focus.
Apple is the only firm that has done well on this consistently and doesn’t have a huge graveyard of failures to show for it.
estearum 10 hours ago [-]
Are you saying what people are hoping to achieve with stupid goals? Because yeah, obviously. But the point is that they're stupid, so they don't achieve that, and that failure is 100% knowable in most scenarios.
Freedom2 10 hours ago [-]
I wonder if this should be codified as a rule of thumb, or an unofficial "law"? One perhaps we can reference easily among our peers.
aeve890 9 hours ago [-]
You mean like Goodhart's Law? (Unless you're being sarcastic)
ghurtado 6 hours ago [-]
"when a measure becomes a goal, it stops being a measure"
It's surprising how often this principle is applicable.
morpheos137 10 hours ago [-]
What is it about Silicon Valley leaders not understanding basic economics or business management? These kinds of cargo cult tactics would not fly in any other industry.
AnimalMuppet 9 hours ago [-]
These kinds of cargo cult tactics show up in all kinds of businesses.
But yeah, it's like they've never actually met human beings...
onetokeoverthe 10 hours ago [-]
[dead]
jghn 10 hours ago [-]
> Who could possibly have predicted that happening?
It’s funny how many times the same thing happens at each large company. I think people’s thought process is this:
> Oh wow! If I paid for this myself I would have spent a lot of money! Are other people spending as much as me? I’m going to create a leaderboard!
> Oh no, my misinformed manager is using the leaderboard as a sleight of hand for work. I need to game this now.
Then the leaderboard is banned… I can’t see how this ever really goes up the chain beyond director.
what 9 hours ago [-]
There is zero chance that this is how the leaderboards came to be.
ryanschaefer 9 hours ago [-]
What makes you assert that?
skizm 10 hours ago [-]
It wasn't leadership doing this though. Any meta IC can generate internal apps and dashboards. This was unofficial and unsupported. Some random IC just made it for fun. Management is usually pretty lax with stuff like this (plenty of games and joke internal apps) so they left it up until it became a problem.
giancarlostoro 10 hours ago [-]
I still don't understand how Mark Zuckerberg has any serious investors, he went on this AI tangent and has absolutely nothing to show for it, despite FB / Meta having built some key tech in the space. He needs to stop trying to do something "different" and literally try and build a serious coding agent he can sell, he could have probably had something worthwhile in that space by now.
He started being drastically more serious into AI in 2022, and 2023 and he has nothing to show for it.
Heck, he could have rented GPUs the way Elon did at this point and either mended the bleeding or stopped it, not sure how many he has, but it beats losing this badly.
If he doesn't wake up and learn how to business, I suspect he will lose his empire he's built up for himself.
MangoCoffee 9 hours ago [-]
>he could have rented GPUs the way Elon did at this point
"Meta building cloud business to sell excess AI capacity, Bloomberg News reports"
Haven’t got numbers so I might be wrong, but I suspect it is dwarfed by the present size and future potential of Meta’s ads business.
darth_avocado 10 hours ago [-]
> Who could possibly have predicted that happening?
Everyone except the executives who get paid millions to predict exactly that.
Avicebron 10 hours ago [-]
Not a problem. There are thousands of employees standing by, willing to sacrifice their jobs for their vision.
It's a hard job, someone has to not pay consequences for bad decisions.
qwertytyyuu 10 hours ago [-]
I know right? What did the leadership think would happen when they gave some of the world's greatest software engineers (supposedly) an easily quantifiable metric to target?
VygmraMGVl 10 hours ago [-]
The leaderboard wasn't leadership generated, it was engineer generated from internally available data. The leadership target is "impact" from ai tools.
gtowey 10 hours ago [-]
What a wonderful scapegoat! Technically it's all "engineer created" because the managers generally don't do technical work. I bet many managers pushed their reports to increase their usage during their 1:1 meetings based on data from the leaderboard. If management had any sense that it was a bad metric, they had ample time to get ahead of it and take it down and provide appropriate guidance. Instead, predictably, they waited until it was a full on disaster and a crisis before acting.
VygmraMGVl 9 hours ago [-]
At one point there were over 70 different token maxing dashboards as the management had a game of whack a mole trying to remove them. There definitely was encouragement from management about a year ago to increase ai usage, but once Claude code was allowed, they didn't really need to encourage anyone any more.
Avicebron 10 hours ago [-]
Budget impact is technically impact.
0cf8612b2e1e 10 hours ago [-]
Now come on, there was a recent post where the author argued that infallible management knew this would happen, but was part of the double-secret-probation strategy to get the cogs to finally start using AI.
SpicyLemonZest 10 hours ago [-]
I still think this is true and it’s not obvious to me from the source article that Meta believes otherwise. I couldn’t find the full memo, do they claim the leaderboard or “tokenmaxxing” era was a mistake?
TheOtherHobbes 10 hours ago [-]
Would they admit it if it was? Or would they try to find a plausible rationalisation for wasting billions without any return?
John23832 10 hours ago [-]
Given that Meta has run 5ish layoffs at this point, and everyone is in survival mode, what did they expect? Everyone wants to juice whatever numbers possible to keep their jobs.
nsonha 4 hours ago [-]
My company has an AI leaderboard with ONE ranking for this, AND OTHER rankings like efficiency (loc merged per token). No one is so stupid that they think any single one of these is to be optimized (gamed) for.
Tech journalists have low opinion of people with actual skills who actually contribute to society, and when their opinions get posted here, it's often selectively echoed by people looking for a reason to feel smarter than the industry.
thewhitetulip 8 hours ago [-]
Even in firms that don't hold this board, giving unlimited AI to people means those who don't bother to learn technology can now deliver 100x their capacity, in very poor quality code
dzonga 10 hours ago [-]
unfortunately at big tech, this shit will keep happening.
people who make it to managers tend to have bozo tendencies & are yes men.
before it was lines of code, Jira tickets closed. Now it's tokens spent.
zulux 10 hours ago [-]
[flagged]
sharts 10 hours ago [-]
How dare you question the most effective allocators of capital.
TimByte 4 hours ago [-]
The most interesting number is missing here, and that is the token distribution by use case. If 60-70% was eaten up by PDFs, agents and automation instead of people actually sitting in Claude Code, then it is a completely different story
andsoitis 10 hours ago [-]
measure outcomes (impact), not effort (token usage, lines of code, code coverage, hours worked, etc.)
lokar 10 hours ago [-]
The whole phenomenon of metric based Eng evaluations is because leadership does not trust line managers to evaluate individual engineers.
4yfr 10 hours ago [-]
What outcomes though? The ones I’ve seen posted are still nonsensical metrics that a publicly traded firm absolutely doesn’t care about.
It wants to see faster R&D, higher revenues from existing assets, greater operating margins, higher sales to invested capital ratio and so on…
The best way to measure that for a software firm is up-time of services, usage and project completion duration
wpasc 10 hours ago [-]
measuring uptime? I've seen Anthropic's status page, and they are a >$1 Trillion dollar company who "largely solved" coding. so clearly you aren't correct. /s
janalsncm 10 hours ago [-]
Ok, uptime. How do you measure an individual’s contribution to uptime? If Claude goes down does everyone take a hit? If Claude stays up everyone gets rewarded?
If so, your metric cannot distinguish between a bad engineer and a good one.
If not, you have the same problem you started with: measuring contributions to “uptime”.
andsoitis 9 hours ago [-]
> If so, your metric cannot distinguish between a bad engineer and a good one.
A metric that moves in the same direction and by the same amount for everyone based on an external event isn’t a problem. The delta in performance of the great engineer will outweigh that of the poor one, since the metric movement that is due to external circumstances will be the same for each kind of engineer and thus not count.
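To make the argument above concrete, here is a minimal sketch. It assumes the external event shifts every engineer's score by the same additive amount, which is the commenter's premise rather than an established fact, and the names and numbers are made up for illustration. Under that assumption the gap between two engineers is unchanged whether the shared shift is positive or negative.

```python
# Hypothetical individual contributions to an uptime-style metric (made-up numbers).
own_contribution = {"strong_engineer": 8.0, "weak_engineer": 2.0}


def observed(shared_shift):
    """Score each engineer as own contribution plus a shift common to everyone."""
    return {name: value + shared_shift for name, value in own_contribution.items()}


def gap(scores):
    """Difference between the two engineers' observed scores."""
    return scores["strong_engineer"] - scores["weak_engineer"]


# The same external event hits everyone equally: an outage-free quarter vs. a bad one.
good_quarter = observed(shared_shift=5.0)
bad_quarter = observed(shared_shift=-5.0)

print("gap in good quarter:", gap(good_quarter))
print("gap in bad quarter:", gap(bad_quarter))
```

This only shows that a shared additive shift cancels out of comparisons. It does not address the attribution question raised upthread, i.e. how the individual contributions would be measured in the first place.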
lokar 10 hours ago [-]
Unfortunately that is a group metric, we need individual metrics
4yfr 10 hours ago [-]
[flagged]
wpasc 10 hours ago [-]
my friend, I was being sarcastic before, and I am agreeing with you. LoC, token spend, etc as metrics are horrible measures. Software uptime is a great metric. I'm merely lamenting that in the age we're in, uptimes are getting worse and worse
unknownfuture 9 hours ago [-]
Okay.
How?
This is an org pushing thousands of PRs a day. How do you solve the attribution problem for any one engineer's work given some set of impact metrics?
And keep in mind, most common impact metrics are trailing indicators, often over relatively long time horizons.
jdlshore 5 hours ago [-]
As VPEng, I didn’t use metrics to assess individuals. Too prone to metric gaming.
Instead, I had a career ladder with a detailed rubric describing the skills an engineer at each level was expected to have. (Including communication and peer-leadership skills.)
Managers performed qualitative assessment of employees, using the career ladder as a guide. They relied on tech leads and Staff engineers to help them understand people’s skills, and provided 1:1 feedback and coaching.
We did use impact-based metrics to assess the results of important initiatives. We solved the attribution and lagging indicator problems by estimating impact rather than measuring it, and using a series of proxy measurements (activation, usage, retention, etc.) as a feedback mechanism for revising those estimates.
dheera 10 hours ago [-]
> measure outcomes (impact)
This is also not easy. In particular proactively preventing bugs is not rewarded
andsoitis 9 hours ago [-]
> In particular proactively preventing bugs is not rewarded
The main way I think you can proactively prevent bugs in a meaningful way is by crafting and propagating better architecture.
Better (or worse) architecture and adoption of it can be measured through a mix of quantitative and qualitative means so those metrics could be used to evaluate the impact of the engineer driving that architecture.
dheera 8 hours ago [-]
That's not how managers evaluate engineers at these corporations.
The engineer who haphazardly launched on Friday then promptly saved the team at 3am and worked the weekends gets the promotion, while the one who prevented a bug from happening "didn't get anything done" and gets the PIP.
veber-alex 10 hours ago [-]
It's not flashy.
When shit just works for months or years no one is going to come and praise you for stuff you did a while back.
You are better off breaking stuff and then fixing it to show how useful you are.
nsagent 10 hours ago [-]
Not surprising. It seems that the comment section of every coding agent thread has at least one person mentioning they use "tokenmaxxing" to increase their token usage because it was brought up during their quarterly review, at a standup, or some other communique from on high.
Just wonder what happens when more and more companies introduce similar restrictions. Will that lead to devaluations of the LLM companies?
root_axis 10 hours ago [-]
Not sure if I missed it but I couldn't find any information in the article to explain where the "approaching billions" estimate is coming from.
I could believe it, but I'd want to see something a little more concrete.
bdcravens 10 hours ago [-]
And I still can't exhaust the limits on my Claude Max subscription, despite being more productive than I've ever been in terms of real work (ie, things that actually make money)
ifwinterco 6 hours ago [-]
Because that’s heavily subsidised, whereas companies have to pay something closer to the actual price.
Enjoy it while you can, because it won’t last forever. Per-token billing is quite eye opening in terms of how much it can cost
jm4 9 hours ago [-]
For real. I've used 8B tokens in the past month and haven't hit my limits even once. In fact, I can't even get close except for the day I used Fable. I've barely stopped. Claude keeps reminding me to sleep.
Balgair 6 hours ago [-]
Oh gosh, I run out by Wednesday usually. But I'm not really coding with it, per se. Mostly just writing docs and tech manuals and AI generation. I'm in biotech.
d4rkp4ttern 10 hours ago [-]
Ok I’ll ask since nobody else has — are they not giving their devs a Claude code max or Codex Pro subscription? If so, why is token cost approaching billions? And if not, why not?
lesuorac 10 hours ago [-]
They can't.
The subscriptions are for personal use not enterprise.
i.e. [1] "This article is about paid Max plans for individual consumers. If you're part of an organization looking to use Claude with your team, refer to Team and Enterprise Plans."
Enterprise customers don’t get those plans, at the enterprise level you have to pay by the API rate… so people don’t have limited use, but you’re also not getting the heavily discounted rate the “normal” plans are at.
fuzzfactor 5 hours ago [-]
>Meta plans to spend up to $135 billion on AI infrastructure through 2026 and commits $600 billion to data center buildouts through 2028
And they can't afford a few extra billion that their engineers can utilize right now?
Looks like AI as it develops is intended to be too expensive for regular people in the long run, but if Meta can't even afford it at that rate, who can?
grim_io 10 hours ago [-]
Big enterprises don't get to have those subscriptions. OpenAI or Anthropic simply won't sell them to you if you need a couple thousand of those.
felix-the-cat 10 hours ago [-]
Within a few weeks of telling people at our company that if they don’t use AI they will be replaced by someone who does, they just announced that their allocation with ChatGPT has reset and are now panicking as they blew through their million token allocation for this month in under six hours - you can’t make this shit up.
Atotalnoob 9 hours ago [-]
A million tokens is like $15 with SOTA models… that’s their allocation?
noashavit 8 hours ago [-]
Uber, Microsoft and now Meta. All tokenmaxing to the max on Claude
Trasmatta 10 hours ago [-]
All those billions spent on tokens by Meta, and not a single iota of value generated by any of it
janalsncm 10 hours ago [-]
I can’t tell if you’re complaining that Meta isn’t saving the whales or that their products aren’t good. If it’s the latter, you should double check their financials.
steve-atx-7600 10 hours ago [-]
I guess maybe they can crank out more ads in their dystopian ad space of a social network site.
tyre 10 hours ago [-]
I love how confidently you say this, with no evidence provided (and I doubt you have any.)
Just a pristine comment section yap.
jazzyjackson 10 hours ago [-]
If there was a positive return on token spend they wouldn’t be capping it now would they?
janalsncm 10 hours ago [-]
It’s possible for something to have diminishing returns.
Having a speed limit does not imply the utility of driving is zero.
stinkbeetle 10 hours ago [-]
That does not follow. Many things have diminishing returns curves.
csomar 10 hours ago [-]
Was there a new product released by Meta that we are not aware of? The last thing I read about was the Instagram account take-over AI-bug.
Barrin92 10 hours ago [-]
>I love how confidently you say this,
it's not that difficult to say it confidently if you use any of their services and applications because exactly nothing has changed.
For reference most labor productivity increases for the last 50 years amounted to about 2% per year. If a hypothetical FB engineer had doubled their productivity with their gazillion tokens that would be 30 years of productivity gains in one year. I'd wager the evidence would be quite evident if you opened any of their apps
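As a rough check on the arithmetic in that comment, here is a small sketch. The 2% annual figure is the commenter's number, taken as given; the function name is only illustrative. With compounding, doubling at 2% a year takes closer to 35 years than 30, and without compounding it would take 50, so the order of magnitude of the claim holds either way.

```python
import math

ANNUAL_GROWTH = 0.02  # the commenter's ~2% per year, taken as an assumption


def years_to_multiply(factor, annual_growth):
    """Years of compounding growth needed to multiply output by `factor`."""
    return math.log(factor) / math.log(1 + annual_growth)


# Doubling productivity with compounding growth.
years_to_double = years_to_multiply(2, ANNUAL_GROWTH)

# Doubling if each year only added 2% of the starting level (no compounding).
years_to_double_simple = (2 - 1) / ANNUAL_GROWTH

print(f"compounding: {years_to_double:.1f} years")
print(f"simple:      {years_to_double_simple:.1f} years")
```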
countcol 10 hours ago [-]
The only thing I’ve seen Meta release recently are spy glasses, and every employee who has worked on that product should be in prison (with a live 24/7 feed where the world gets to watch them wallow).
The times I’ve been asked to evaluate a prospective candidate and I see that product on their résumé, it’s been an instant veto, in the same category as working at Palantir.
linzhangrun 10 hours ago [-]
This is what you get when token consumption becomes a KPI...
wonderwonder 10 hours ago [-]
I have never worked there and I am likely very unqualified to ever work there and Zuck has more money than I could dream of so take my comment with that in mind.
Meta sounds like a cluster-F of a place to work. Massive reorgs around wild ideas like the metaverse and everything Ai all the time. Employees terrified of being fired. Incentivizing token spending and then cutting it off. While the overall company may be fine, the dev department sounds rudderless and absolutely miserable.
peter_d_sherman 8 hours ago [-]
>"The internal memo disclosed that Meta employees consumed 73.7 trillion tokens in roughly 30 days, a figure tracked on an internal leaderboard called "Claudeonomics" — a reference to Anthropic's Claude, one of the third-party AI tools widely used inside the company [2]. The leaderboard, which ranked employees and teams by token consumption, inadvertently incentivized usage volume over productive output.
Meta plans to dismantle the leaderboard and replace it with a centralized monitoring platform called "AI Gateway," which will track usage and spending across teams in real time [2]."
This seems to be an interesting upcoming business, that is:
Helping companies centralize and track their AI usage by employee.
Anyway, great article!
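For a sense of scale, here is a back-of-envelope sketch of what the quoted 73.7 trillion tokens could cost. The per-million prices are assumptions for illustration only: the $15 figure echoes the "a million tokens is like $15" estimate elsewhere in this thread, the lower ones are hypothetical blended rates, and none of them are Meta's actual contract prices.

```python
# Token volume from the quoted memo: 73.7 trillion tokens in roughly 30 days.
TOKENS_IN_30_DAYS = 73_700_000_000_000

# Hypothetical blended prices in USD per million tokens (assumptions, not real rates).
ASSUMED_USD_PER_MILLION = [1, 5, 15]


def token_cost_usd(tokens, usd_per_million):
    """Cost in whole dollars, billing only complete millions of tokens."""
    return tokens // 1_000_000 * usd_per_million


for price in ASSUMED_USD_PER_MILLION:
    cost = token_cost_usd(TOKENS_IN_30_DAYS, price)
    print(f"${price} per million tokens -> ${cost:,} per 30 days")
```

Even at the lowest assumed rate the monthly figure is in the tens of millions of dollars, and at the highest it passes a billion, which is at least consistent with the article's "approaching billions" framing.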
xnx 10 hours ago [-]
It's stories like this that really dispel the genius/merit theory of successful business. The best you can say about Zuck is he didn't prevent Facebook from becoming huge.
fuzzfactor 5 hours ago [-]
This is the real point. If an average person had access to the same amount of capital and ended up with the same ownership terms, there would have been a more sensible outcome for everyone affected.
smrtinsert 10 hours ago [-]
That is insane. I'm sure companies will learn the absolute wrong lesson from this, and attempt to centralize and kneecap token usage.
SpicyLemonZest 10 hours ago [-]
As many companies do with all their budgets, down to the trivial and clearly positive EV cost of free coffee. So it goes, cost controls are hard and necessarily imprecise.
downrightmike 10 hours ago [-]
Tokens are less valuable than the eyeball metric of the Dotcom era. At least the eyeballs were real then.
I'd argue most of the AI value is related to how 'Dead' the internet is.
4yfr 10 hours ago [-]
This talk of tokens is wasteful.
Ultimately the spend on tokens has to benefit the firm financially or it won’t continue spending on it.
jaredcwhite 8 hours ago [-]
in an old-timey cartoon voice
"Now he tells me!"
bonk
whalesalad 10 hours ago [-]
Clearly no one is using Meta’s customer facing AI products. Why aren’t they using their own gpu/compute for development?
wmf 10 hours ago [-]
Because Muse isn't good enough and why use Muse if they'll let you use Opus for free?
gordon_freeman 10 hours ago [-]
that is a fair point. The contrast between Meta and Apple could not be bigger here. Apple has billions of devices and yet they decided to use 3rd party models from OpenAI and later Google to build their AI features rather than building foundational models in house. Yet Meta did the opposite: they built models (spending billions of $$$ and firing 10% of the company) for billions of users who would rather not use Meta AI features.
Had a similar comment but what’s even weirder and I seemed to have missed entirely at first glance is that this is an AI news aggregator agent?
The article is “by MLQ Agent.”
hodgehog11 9 hours ago [-]
Judging from the decisions and outputs of the last decade or so, the leadership at Meta, including Mark Zuckerberg, have got to be among the most incompetent I have ever seen. They go all in on the worst decisions; not just the worst in hindsight, but also the worst at the time. The only thing keeping them afloat is their monopoly from past purchases. They are a poster child for why the US is no longer a properly capitalist nation.
investmuse 3 hours ago [-]
[flagged]
conartist6 11 hours ago [-]
I don't understand though. How will all the AI users replace all the non-AI users if they can't spend money that isn't theirs to win by default?
_heimdall 10 hours ago [-]
Don't worry, once we achieve post-scarcity they will have more tokens than they could ever dream of spending.
downrightmike 10 hours ago [-]
How soon until this becomes part of the "no one wants to work anymore" argument?
Token economics are also weird. If you design a fancy new frontend that, for example, uses a cheap model to parse a PDF into text that is then fed into an expensive model, you will probably spend more money, because you are on the API pay scale rather than the "max plan" pay scale.
For the same reason as why the oil companies want everyone to use large cars.
Source: my last job working with accessibility and that nightmare.
Gemma 4 works perfectly well offline on limited hardware (I have an 8GB video card) and can handle extracting text from image-based PDFs just fine.
Take a PDF -> run it through MarkItDown [1], using the OCR plugin if you need (point it to Gemma 4) -> now you can ask Gemma 4 questions about the (markdown) document.
I am sure Gemma 4 could even create a GUI to make this process very simple for a non-technical user.
[1] https://github.com/microsoft/markitdown
This and replies to this are surreal. It's like everyone simultaneously decided to forget that you don't need claude or whatever to read a PDF. The document is literally made for you to read...
It’s disingenuous to assume every PDF is actually crafted to communicate to its recipients, even more so to pretend LLM users are in a position to understand all the PDFs they receive.
There’s a lot of gray area where help understanding a document is fully reasonable.
This workflow is highly optimized.
This discussion was about measures, goals and incentives. Follow the incentives.
You can rack up token consumption extremely quickly when you embed LLMs into automated processes or products.
I'd be very surprised if these numbers are just typical coding usage with no scripting/pipeline/automation stuff.
Using AI to suddenly deliver massive amounts of code without questioning the requirements
Who could possibly have predicted that happening?
Predictably, everyone started talking in Slack like their jobs depended on it. Everyone was responding to everything. Instead of writing out a complete message and pressing enter, they'd send each fragment of the sentence as a new line.
The Slack leaderboard was never shown again. Unfortunately the habit remained because people were afraid they were going to be secretly judged by how much Slack activity they generated.
I expect the same thing is going to happen at companies who had token leaderboards. Once you've instilled that fear in people, they internalize the expectation.
Insanity
No amount of "this isn't used for anything" will change that. It's inherent in human nature in the 21st century to believe any and all metrics will be used against them, and therefore must be gamed.
It's why you also have to set UNBELIEVABLY clear goals and have incentives tied to those goals. Incentives meaning money. If you want to measure things, measure them. But have clear, consistent, and meaningful goals tied to bonuses or something if you want a thing done correctly.
It was an unreal experience.
The answer is simpler on the surface: focus.
Generally the problem is the larger the firm’s operations, the harder it is to focus.
Apple is the only firm that has done well on this consistently and doesn’t have a huge graveyard of failures to show for it.
It's surprising how often this principle is applicable.
But yeah, it's like they've never actually met human beings...
Charles Goodhart :-)
> Oh wow! If I paid for this myself I would have spent a lot of money! Are other people spending as much as me? I’m going to create a leaderboard!
> Oh no, my misinformed manager is using the leaderboard as a sleight of hand for work. I need to game this now.
Then the leaderboard is banned… I can’t see how this ever really goes up the chain beyond director.
He got drastically more serious about AI in 2022 and 2023, and he has nothing to show for it.
Heck, at this point he could have rented GPUs the way Elon did and either stemmed the bleeding or stopped it. Not sure how many he has, but it beats losing this badly.
If he doesn't wake up and learn how to business, I suspect he will lose the empire he's built up for himself.
"Meta building cloud business to sell excess AI capacity, Bloomberg News reports"
https://www.reuters.com/business/meta-sell-excess-ai-computi...
Everyone except the executives who get paid millions to predict exactly that.
It's a hard job, someone has to not pay consequences for bad decisions.
Tech journalists have a low opinion of people with actual skills who actually contribute to society, and when their opinions get posted here, they're often selectively echoed by people looking for a reason to feel smarter than the industry.
People who make it to manager tend to have bozo tendencies and are yes-men.
Before, it was lines of code and Jira tickets closed. Now it's tokens spent.
It wants to see faster R&D, higher revenues from existing assets, greater operating margins, higher sales to invested capital ratio and so on…
The best way to measure that for a software firm is uptime of services, usage, and project completion duration.
If so, your metric cannot distinguish between a bad engineer and a good one.
If not, you have the same problem you started with: measuring contributions to “uptime”.
A metric that moves in the same direction and by the same amount for everyone because of an external event isn’t a problem. The delta in performance of the great engineer will outweigh that of the poor one, since the metric movement that is due to external circumstances will be the same for each kind of engineer and thus not count.
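To make that concrete, here is a minimal sketch. The names and numbers are invented for illustration, and it assumes the external event really does hit everyone by the same additive amount, which is the premise of the comment rather than something established.

```python
def relative_scores(scores: dict[str, float]) -> dict[str, float]:
    """Express each person's metric as a difference from the group mean."""
    mean = sum(scores.values()) / len(scores)
    return {name: value - mean for name, value in scores.items()}


# Hypothetical raw metric values before any external event.
baseline = {"great_engineer": 9.0, "poor_engineer": 4.0}

# Hypothetical external event (e.g. an outage) that lowers everyone's
# metric by the same amount.
external_shift = -3.0
shifted = {name: value + external_shift for name, value in baseline.items()}

# The common shift cancels out once scores are compared to the group mean.
print(relative_scores(baseline))
print(relative_scores(shifted))
```

If the shift is not uniform (say the outage lands on one team only), the cancellation no longer holds, which is roughly the attribution objection raised in the replies.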
How?
This is an org pushing thousands of PRs a day. How do you solve the attribution problem for any one engineer's work given some set of impact metrics?
And keep in mind, most common impact metrics are trailing indicators, often over relatively long time horizons.
Instead, I had a career ladder with a detailed rubric describing the skills an engineer at each level was expected to have. (Including communication and peer-leadership skills.)
Managers performed qualitative assessment of employees, using the career ladder as a guide. They relied on tech leads and Staff engineers to help them understand people’s skills, and provided 1:1 feedback and coaching.
We did use impact-based metrics to assess the results of important initiatives. We solved the attribution and lagging indicator problems by estimating impact rather than measuring it, and using a series of proxy measurements (activation, usage, retention, etc.) as a feedback mechanism for revising those estimates.
This is also not easy. In particular, proactively preventing bugs is not rewarded.
The main way I think you can proactively prevent bugs in a meaningful way is by crafting and propagating better architecture.
Better (or worse) architecture and adoption of it can be measured through a mix of quantitative and qualitative means so those metrics could be used to evaluate the impact of the engineer driving that architecture.
The engineer who haphazardly launched on Friday then promptly saved the team at 3am and worked the weekends gets the promotion, while the one who prevented a bug from happening "didn't get anything done" and gets the PIP.
When shit just works for months or years no one is going to come and praise you for stuff you did a while back.
You are better off breaking stuff and then fixing them to show how useful you are.
I just wonder what happens when more and more companies introduce similar restrictions. Will that lead to devaluations of the LLM companies?
I could believe it, but I'd want to see something a little more concrete.
Enjoy it while you can, because it won’t last forever. Per-token billing is quite eye-opening in terms of how much it can cost.
The subscriptions are for personal use not enterprise.
i.e. [1] "This article is about paid Max plans for individual consumers. If you're part of an organization looking to use Claude with your team, refer to Team and Enterprise Plans."
[1]: https://support.claude.com/en/articles/11049741-what-is-the-...
And they can't afford a few extra billion that their engineers can utilize right now?
Looks like AI as it develops is intended to be too expensive for regular people in the long run, but if Meta can't even afford it at that rate, who can?
Just a pristine comment section yap.
Having a speed limit does not imply the utility of driving is zero.
It's not that difficult to say it confidently if you use any of their services and applications, because exactly nothing has changed.
For reference, most labor productivity increases over the last 50 years amounted to about 2% per year. If a hypothetical FB engineer had doubled their productivity with their gazillion tokens, that would be roughly 35 years of productivity gains (at 2% compounding) in one year. I'd wager it would be quite evident if you opened any of their apps.
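A quick check of that arithmetic, as a sketch only: it assumes the 2% compounds annually, which the comment doesn't state, and the function name is made up for illustration.

```python
import math


def years_to_multiply(factor: float, annual_growth: float) -> float:
    """Years of compound growth at `annual_growth` needed to scale output by `factor`."""
    return math.log(factor) / math.log(1.0 + annual_growth)


# Doubling productivity at 2% per year, compounded annually.
doubling_years = years_to_multiply(2.0, 0.02)
print(f"{doubling_years:.1f} years")
```

Under that assumption doubling takes about 35 years; treating the 2% as simple, non-compounding growth would give 50.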
The times I’ve been asked to evaluate a prospective candidate and I see that product on their résumé, it’s been an instant veto, in the same category as working at Palantir.
Meta sounds like a cluster-F of a place to work. Massive reorgs around wild ideas like the metaverse and everything AI all the time. Employees terrified of being fired. Incentivizing token spending and then cutting it off. While the overall company may be fine, the dev department sounds rudderless and absolutely miserable.
employees consumed 73.7 trillion tokens in roughly 30 days, a figure tracked on an internal leaderboard called "Claudeonomics" (a reference to Anthropic's Claude, one of the third-party AI tools widely used inside the company) [2]. The leaderboard, which ranked employees and teams by token consumption, inadvertently incentivized usage volume over productive output.
Meta plans to dismantle the leaderboard and replace it with a centralized monitoring platform called "AI Gateway," which will track usage and spending across teams in real time [2]."
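For a sense of scale, a back-of-the-envelope sketch using only the two figures quoted above (73.7 trillion tokens, roughly 30 days). It assumes usage was spread evenly over the period, which is almost certainly not true, and the variable names are mine.

```python
# Figures as quoted in the article excerpt.
total_tokens = 73_700_000_000_000  # 73.7 trillion
period_days = 30                   # "roughly 30 days"

SECONDS_PER_DAY = 24 * 60 * 60

# Average rates, assuming perfectly even usage across the period.
tokens_per_day = total_tokens / period_days
tokens_per_second = total_tokens / (period_days * SECONDS_PER_DAY)

print(f"{tokens_per_day:.3e} tokens/day")
print(f"{tokens_per_second:.3e} tokens/second")
```

That works out to roughly 2.5 trillion tokens a day, or on the order of 28 million tokens every second, around the clock.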
This seems to be an interesting upcoming business, that is:
Helping companies centralize and track their AI usage by employee.
Anyway, great article!
Various discussions:
Meta’s chaotic AI strategy
https://news.ycombinator.com/item?id=48523271
Companies rein in AI usage as costs strain budgets
https://news.ycombinator.com/item?id=48602571
Meta CTO Andrew Bosworth Admits the Company's AI Reorg Was 'Atrocious'
https://news.ycombinator.com/item?id=48548461
Tokenmaxxing is dead, long live tokenmaxxing
https://news.ycombinator.com/item?id=48708795