Cerebras has been a true revelation when it comes to inference. I have a lot of respect for their founder, team, innovation, and technology. The colossal WSE-3 chip, with on-chip SRAM at mind-boggling scale, is definitely ultra cool stuff.
I also wonder why they have not been acquired yet. Or is it intentional?
I will say, their pricing and deployment strategy is a bit murky. Paying $1500-$10,000 per month plus usage costs? I'm assuming it has to do with chasing and optimizing for higher-value contracts and deeper-pocketed customers, hence the minimum monthly spend they require.
I'm not claiming to be an expert, but as a CEO/CTO I found other providers in the market with relatively comparable inference speed (obviously Cerebras is #1), easier onboarding, and more responsive people (all of my interactions with Cerebras have been days or weeks late, or simply ignored). IMHO, if Cerebras wants to gain more mindshare, they'll have to look into this aspect.
Shakahs
Sonnet/Claude Code may technically be "smarter", but Qwen3-Coder on Cerebras is often more productive for me because it's just so incredibly fast. Even if it takes more LLM calls to complete a task, those calls are all happening in a fraction of the time.
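The wall-clock arithmetic holds up even if the faster model needs several times as many calls. A back-of-envelope sketch in Python, with every number a made-up placeholder rather than a benchmark:

    # Illustrative only: a faster-but-dumber model vs a slower-but-smarter one.
    fast_tps, fast_calls = 2000, 8    # e.g. Qwen3-Coder on Cerebras: more retries, huge tok/s
    smart_tps, smart_calls = 60, 3    # e.g. a slower frontier model: fewer calls needed
    tokens_per_call = 1500            # assumed average output per call

    fast_time = fast_calls * tokens_per_call / fast_tps     # ~6 s total
    smart_time = smart_calls * tokens_per_call / smart_tps  # ~75 s total
    print(f"fast: {fast_time:.0f}s, smart: {smart_time:.0f}s")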
mythz
Running Qwen3 Coder at speed is great, but I would also prefer access to other leading OSS models like GLM 4.6, Kimi K2, and DeepSeek v3.2 before considering switching subs.
Groq also runs OSS models at speed, which is my preferred way to access Kimi K2 on their free quotas.
fcpguru
Their core product is the Wafer Scale Engine (WSE-3) — the largest single chip ever made for AI, designed to train and run models much faster and more efficiently than traditional GPUs.
Just tried https://cloud.cerebras.ai. Wow, is it fast!
I'm surprised how under-the-radar Cerebras is. Being able to get near-instantaneous responses from Qwen3 and gpt-oss is pretty incredible.
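For anyone who wants to try it, here's a minimal sketch using the OpenAI Python client pointed at Cerebras's OpenAI-compatible endpoint. The base URL and model id are my assumptions from their docs, so verify both against https://cloud.cerebras.ai before relying on them:

    import os, time
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.cerebras.ai/v1",  # assumed OpenAI-compatible endpoint
        api_key=os.environ["CEREBRAS_API_KEY"],
    )

    start = time.perf_counter()
    resp = client.chat.completions.create(
        model="gpt-oss-120b",  # assumed model id; a Qwen3 coder variant is also offered
        messages=[{"role": "user", "content": "Write a haiku about wafers."}],
    )
    elapsed = time.perf_counter() - start

    print(resp.choices[0].message.content)
    print(f"~{resp.usage.completion_tokens / elapsed:.0f} output tokens/sec")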
JLO64
My experience with Cerebras is pretty mixed. On the one hand, for simple requests, it truly is mind-blowing how fast it is. That said, I've had nothing but issues and empty responses whenever I try to use them for coding tasks (Opencode via OpenRouter, GPT-OSS). It's gotten to the point where I've disabled them as a provider on OpenRouter.
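For reference, OpenRouter lets you do this per-request as well as in account settings. A sketch of what I believe the routing preferences look like (treat the exact field names as assumptions and check OpenRouter's provider-routing docs):

    import os, requests

    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "openai/gpt-oss-120b",
            "messages": [{"role": "user", "content": "Refactor this function..."}],
            # Skip a provider that keeps returning empty responses:
            "provider": {"ignore": ["Cerebras"]},
        },
    )
    print(resp.json()["choices"][0]["message"]["content"])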
arjie
I just tried out Qwen-3-480B-Coder on them yesterday, and to be honest it's not good enough. It's very fast but has trouble with lots of tasks that Claude Code just solves. Perhaps part of it is that I'm using Charm's Crush instead of Claude Code.
ramshanker
I can't guess what is preventing Cerebras from replacing a few of the cores in the wafer-scale package with HBM memory. The only constraint with their WSE-3 seems to be memory capacity. Considering the size of NVDA's chips, HBM on only a small subset of the wafer area should easily exceed the memory footprint of contemporary models.
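The capacity gap is easy to quantify. A rough sketch: the 44 GB of on-chip SRAM per WSE-3 is Cerebras's published spec, while the rest is back-of-envelope arithmetic, not an actual deployment layout:

    WSE3_SRAM_GB = 44  # published on-chip SRAM per wafer

    def weights_gb(params_billion: float, bytes_per_param: float) -> float:
        """Weights-only footprint, ignoring KV cache and activations."""
        return params_billion * bytes_per_param

    for name, p in [("gpt-oss-120b", 120), ("Qwen3-Coder-480B", 480)]:
        for bpp in (2.0, 1.0):  # FP16 vs 8-bit
            gb = weights_gb(p, bpp)
            print(f"{name} @ {bpp:.0f} B/param: {gb:.0f} GB = {gb / WSE3_SRAM_GB:.1f} wafers of SRAM")

So even an 8-bit 480B model needs on the order of ten wafers' worth of SRAM just for weights, which is presumably why capacity, not compute, sets the limit.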
lvl155
Last I tried, their service was spotty and unreliable. I would wait maybe a year or so to retry.
redwood
Would be interesting if IBM were to acquire them. Seems like the big-iron approach to GPUs.
Does Guillaume Verdon from https://www.extropic.ai/ have thoughts on Cerebras?
(Or other people that read the litepaper: https://www.extropic.ai/future)
If the idiots at AMZN have any brains left, they would acquire this and make it the center of their inference offerings. But considering how lackluster their performance and strategy as a company have been of late, I doubt that.
Disappointed quite a bit with this fundraise. They were expected to IPO this year and give us poor retail investors a chance at investing in them.
Valued at 8.1 billion dollars.
https://www.cerebras.ai/pricing
$50/month for one person for code (daily token limit), or pay per token, or $1500/month for small teams, or an enterprise agreement (contact for pricing).
Seems high.
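Whether it's high depends on volume. A quick break-even sketch; the per-token rate below is a placeholder, not Cerebras's actual price, so plug in the real numbers from the pricing page:

    MONTHLY_SUB_USD = 50.0
    PRICE_PER_M_TOKENS_USD = 2.0  # hypothetical blended rate, not a quoted price

    breakeven_m_tokens = MONTHLY_SUB_USD / PRICE_PER_M_TOKENS_USD
    per_workday = breakeven_m_tokens / 22  # assuming ~22 workdays/month
    print(f"Subscription breaks even past ~{breakeven_m_tokens:.0f}M tokens/month")
    print(f"That's ~{per_workday:.1f}M tokens per workday")

An agentic coding loop can easily burn through a million tokens a day, so the flat tier may pencil out for heavy users; the $1500 team tier is a different calculus.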
tibbydudeza
Damn, they are fast.
rvz
Sooner or later, lots of competitors, including Cerebras, are going to take apart Nvidia's data center market share, and that will cause many AI model firms to question the unnecessary spend and hoarding of GPUs.
OpenAI is still developing their own chips with Broadcom, but those are not operational yet. So for now, they're buying GPUs from Nvidia to build up their revenue (to later spend on their own chips).
By 2030, many companies will be looking for alternatives to Nvidia, like Cerebras or Lightmatter, for both training and inference use cases.
For example, Meta just acquired a chip startup for this exact reason [0]: "An alternative to training AI systems" and "to cut infrastructure costs linked to its spending on advanced AI tools".
[0] https://www.reuters.com/business/meta-buy-chip-startup-rivos...