> Developed from design to production in nine months, accelerated by OpenAI’s models
> the use of OpenAI models to accelerate parts of the design and optimization process.
I wish there was more about this. As is I kind of have to assume that this is just meaningless marketing, like saying development was accelerated by Microsoft Office or their 5k LG Ultrafine 40-inch monitors.
Like, if this was as big a deal as it kind of vaguely implies, they would be making a bigger deal of it, right?
shellcromancer
Probably obvious but still omitted in the OpenAI post: chips are being made by TSMC [1]. Wasn't sure if Intel got it.
I wanna see an inference chip where the weights are part of the rom of the chip.
There would be 1 multiplier per weight (and since they're constant, the whole thing turns into a bunch of simple adders), and the total pipelined system throughput would be one token per clock cycle.
That means you can probably have millions of users simultaneously using a single bit of silicon, with perhaps 500 million tokens per second coming out the output bus.
Downside is this chip would be huuuuge - a whole wafer.
Wafer level faults probably won't matter though - neural nets are resistant to a few missing or wrong weights.
Due to the speed the industry moves, you'd want to race from model weights to production super fast, make 50 wafers, use them for a year, then bin them when that model is obsolete.
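To make the "constant weights turn into simple adders" point concrete, here is a minimal sketch, assuming non-negative integer (quantized) weights. Multiplying by a fixed constant reduces to summing shifted copies of the input, one per set bit of the constant. The function names and the example weight are hypothetical, and the 500 MHz clock is only the figure implied by "one token per clock cycle" and "500 million tokens per second", not a number from the announcement.

```python
def shift_add_terms(weight: int) -> list[int]:
    """Return the bit positions that are set in a non-negative constant weight."""
    if weight < 0:
        raise ValueError("this sketch handles non-negative weights only")
    return [i for i in range(weight.bit_length()) if (weight >> i) & 1]


def multiply_by_constant(x: int, weight: int) -> int:
    """Compute x * weight using only shifts and adds, as fixed wiring would."""
    total = 0
    for shift in shift_add_terms(weight):
        total += x << shift  # each set bit contributes one shifted copy of x
    return total


def adders_needed(weight: int) -> int:
    """Adders needed to sum the shifted copies: one fewer than the set bits."""
    return max(len(shift_add_terms(weight)) - 1, 0)


def tokens_per_second(clock_hz: float, tokens_per_cycle: float = 1.0) -> float:
    """Throughput of a fully pipelined design that emits tokens every cycle."""
    return clock_hz * tokens_per_cycle


if __name__ == "__main__":
    w = 0b1011010  # hypothetical quantized weight (decimal 90)
    print(multiply_by_constant(7, w), 7 * w)  # both values should match
    print(adders_needed(w))                   # adders for this one weight
    print(tokens_per_second(500e6))           # assumed 500 MHz clock
```

Real constant-multiplier synthesis usually does better than this (for example with signed-digit encodings that need fewer terms), so treat the adder count as an upper-bound illustration rather than what a chip would actually use.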
deweywsu
With the pace of AI, and with AI helping to pave the way for faster/better AI, I keep wondering if hardware like this will become obsolete well before it has a meaningful ROI. Huge AI models can already be run with fewer resources through quantization and offloading, but that's just the beginning.
One day, maybe not far from now, a breakthrough will allow huge LLMs (say 200B parameters) to run well on a 5-year-old Dell desktop. Think that's crazy? Look at the size of the first hard drives. The IBM 350 was a disk unit with 50 platters, 24 inches in diameter, that held about 3.75 MB, and was leased for today's equivalent of $35K.
Compare that to a multi-terabyte SSD. Now apply that improvement to how an LLM is architected and run now. With AI assisting, it won't be long before a leap occurs and these data centers with all their current ultra-cutting edge Nvidia cards are nearly obsolete overnight.
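For a rough sense of the numbers in that comparison, here is a back-of-the-envelope sketch. It assumes the IBM 350 stored 5 million 6-bit characters (about 3.75 MB), picks a 4 TB SSD as a stand-in for "multi-terabyte", and treats a "200B" model as 200 billion weights. The 4 TB figure, the bit widths, and the function names are illustrative assumptions, not figures from the thread.

```python
IBM_350_BYTES = 5_000_000 * 6 // 8  # 5 million 6-bit characters, about 3.75 MB
SSD_BYTES = 4 * 10**12              # assumed 4 TB "multi-terabyte" SSD


def capacity_ratio(new_bytes: int, old_bytes: int) -> float:
    """How many times more data the newer device holds."""
    return new_bytes / old_bytes


def model_bytes(num_params: int, bits_per_weight: int) -> int:
    """Memory for the weights alone, ignoring activations and KV cache."""
    return num_params * bits_per_weight // 8


if __name__ == "__main__":
    print(capacity_ratio(SSD_BYTES, IBM_350_BYTES))  # roughly a million-fold
    params = 200 * 10**9                             # "200B" model
    for bits in (16, 8, 4):                          # common quantization widths
        print(bits, model_bytes(params, bits) / 10**9, "GB")
```

Even at 4 bits per weight the weights alone come to about 100 GB, which is the gap a breakthrough would have to close for an old desktop.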
nickpinkston
This is very cool to see - seems like soooo much efficiency waiting to be unlocked at the chip level.
What's everyone think of Taalas?
They're actually burning the LLM model into the silicon, with some onboard memory for fine-tuning. They claim huge cost / latency wins.
Pretty huge move. Google and their TPUs are looking infinitely more prescient as I think they are on their 7th generation, along with the offshoots it inspired like the LPU and even others, perhaps like Cerebras and their Wafer Scale Engine.
However, based on first impressions, it seems like this is meant for the inference side, not training, which is also an interesting choice.
v5v3
>designed for initial deployment by the end of 2026 and expanding in the years ahead,
So deployment comes after the IPO, and it will be featured heavily in the IPO sales brochure as a future promise?
I'm sceptical over any pre-IPO announcements.
bogdiyan
I am not sure how much of the work is done by OpenAI, or whether it is basically a Broadcom chip specifically built for OpenAI models. It is a necessary step, but building a high-performance chip is not easy. Look at companies like Groq, Amazon, and Google.
cpldcpu
I had Opus 4.5 design an LLM inference engine in Verilog, including firmware and automated verification, a while ago: https://github.com/cpldcpu/smollm.c
It's of course far from optimal. But lowering the implementation through the abstraction levels turned out to be extremely powerful.
bluegatty
'braodcom' ha ha ... it's not OpenAI's chip then ...
chris_money202
Microsoft, Google, and Amazon also do this, but they also have the hyperscaler datacenter infrastructure to host the chips. Designing and taping out the chip is one thing; packaging, cooling, deploying, powering, and managing the fleet is another stack entirely. Wonder where that will come from?
digitaltrees
We’ve entered the “if you care about software, build hardware” phase of AI
MangoCoffee
Cheap tokens are more important now than ever. Chinese open-weight models are getting pretty good. The real cost of AI adoption will come down to who (China or the US) can provide cheap tokens for consumers and companies. Microsoft considering DeepSeek for their cowork is one example, and now OpenAI has its own AI inference chip.
kilroy123
I hope to see something like this, but in a small form factor like the NVIDIA spark.
I want a super fast LLM that is Opus 4.6+, like, in ability.
theowaway213456
This seems like more competition for Cerebras? Am I understanding correctly?
skyberrys
The new chip sounds like it's custom made to accelerate a few specific models they really need to run fast. The advantage is that it's truly an ASIC, not an xPU. There are several new startups targeting EDA tooling automation. Chip Agents is the biggest one I can think of, but there are smaller players too; Silimate is one I recall. These companies are focusing on building fast AI-powered tools to speed up the tape-out cycle.
Legend2440
The only surprising thing about this is that they didn't do it three years ago.
tehjoker
No information on how significant the reduction in energy per token is. No information on amortized price per request. Increasingly it's clear OpenAI must demonstrate order-of-magnitude reductions in cost to not die; this is investor story time without that information.
dadoum
> May we scale smoothly, exponentially and uneventfully through A[SI]
That sentence sounds weird to me. I can't really put my finger on why; maybe the combination of adverbs, or just the fact of stating the desire to scale as a company so directly. It feels (to me) like openly declaring their selfish goals. Or maybe I am just misinterpreting and they are referring to all of humanity as "We" (but knowing Broadcom's and, to a lesser extent, OpenAI's doings, I am not convinced).
kazinator
There is a never ending torrent of money coming, so why not make custom chips.
Whoo ... party!
satvikpendem
I'm assuming they used LLMs to (help humans) do custom circuit design. Even pre LLM there were various computer optimizations that didn't require humans like genetic algorithms. It'd be cool to see a paper on how they did it.
fennecbutt
I mean I'd love to be able to buy something like the 17k tps Taalas chip as a PCIe or M.2 card.
Imagine when we can roar along at that speed, low power. Can just have the model reason for a while about anything and everything. It reminds me of the "race to idle" for mcus etc.
OrvalWintermute
Word of Advice for OpenAI:
Never underestimate Broadcom’s ability to shaft their own customers
- VMware
- CA Technologies
- Symantec Enterprise Security
- Brocade
- LSI Corporation
delduca
Nvidia stock is red now
zuzululu
I'm very excited that frontier model companies now have so much money and revenue that they are releasing their own chips, which could change the relationships and the bottom line.
Africa-Ai
Wow, it sounds tempting to use OpenAI's newest chips.
renoir
Look at the SIZE of that chip.
Cerebras stock is down nearly 20% today.
Not only is the approach overlapping, OpenAI is also Cerebras's only major customer.
qsxfthnkp2322
aw shucks nvda has some spicy competition
Make sure you all use that fancy ñ
duendefm
If this is something that will hurt Nvidia, I'm all for it
gravypod
I wonder how close OpenAI is getting to using the memory they purchased. Are they planning to stack a huge amount of HBM2 into these chips?
fibonacci112358
So this is where all the memory they bought is going to.
This seems to be for inference only, not training. But inference is a recurring cost and training is a one-time cost, so to me, even if Nvidia keeps its moat on training, I don't think that could ever justify its massive valuation; for example, some Chinese models are actually trained on non-Nvidia hardware. That moat is incredibly thin (at the moment).
I think that if I were Nvidia, I would be a bit terrified, and I imagine the stock won't be doing great, as everyone online might start talking about this, for better or for worse.
I am a bit impressed by OpenAI, but can this be classified as a plan for OAI to salvage itself and all the commitments it has made, which from memory are nearing 1.4 trillion dollars? And this article [0] is from 2025.
But could OpenAI simply walk away from its commitments when necessary (for example, to Nvidia) if this chip works out? What exactly might happen in the future when these commitments come due? It's still smart for OAI to diversify with this chip and to have deeper sources of revenue than just being a simple middleman, but I imagine Nvidia and others that have invested in OpenAI must not be happy with this change.
The thing with AI deals is that they have become so complicated that it is hard for me to find the first-order impact of things, let alone second- or third-order impacts. Financial accountability seems to be heavily affected by all of it, and there is some sense that this is done intentionally.
> significantly better performance-per-watt than current state-of-the-art alternatives
An interesting example of how the current market dynamics incentivize low cost and therefore power efficiency and therefore lowering resource use.
jabedude
how much does this chip help with inference speed?
gaigalas
But nvidia's moat is software support, isn't it?
sehw
lol
flyinglizard
I call BS. It’s probably a white label around existing Broadcom IP, impossible to go from zero to this kind of chip in nine months. I doubt OpenAI had any significant contribution.
Mistletoe
The similarities between the AI world and the crypto world are so much closer than any AI fanboy would ever admit.
jerojero
One thing I don't like about California based companies is how cringe the names always are.
"Jalapeño" is such a bad name, having an "ñ" already makes it difficult and annoying to deal with in so many little ways. Good luck with that.
But also, there's the sort of "yes, let's use Mexican-related things because we're California" thought that I just really hate. I don't know, it's like Corporate Memphis to me. You see a product like this, you know it's an uppity California-based firm that came up with it.
1. https://www.investing.com/news/stock-market-news/openai-unve...
https://www.computerhistory.org/storageengine/first-commerci...
Super fast demo live at: https://chatjimmy.ai/
https://taalas.com/
https://www.reddit.com/r/singularity/comments/1r9frzk/taalas...
No surprise here. [0]
[0] https://news.ycombinator.com/item?id=45429514
https://techcrunch.com/2025/11/06/sam-altman-says-openai-has...