Yeah this is key, a lot of people are still just looking at the number of params and thinking these models are toys. What Qwen 3.6 has shown is that reasoning and tool calling are just as important if not more.

That's absolutely possible. It's just that as we move forward, we'll soon see small models smart enough to be judged not by parameter count but by their reasoning and intelligence. You can see examples like Qwen 3.6 27B.

I see that going around, and either the test cases are too simplistic or I'm doing something wrong. I have a server with a 3090 in it, enough to run qwen3.6, but I haven't had much luck using it with either codex or oh-my-pi. They work, but the model gets really slow with ~64k context and the attention degrades quickly. You'll sometimes execute a prompt, the model will load a test file and say something like "I was presented with a test file but no command. What should I do with it?" So yeah, while it's true that qwen3.6 is good for agentic coding, it's not very good for exploring the codebase and coming up with plans. You need to pair it today with a model capable of ingesting the whole context and providing a detailed plan, and even then the implementation might take 10x the amount of time it'd take for sonnet or Gemini 3 to crunch through the plan.

It's pretty close already. Check qwen3.6 27b if you haven't already. People are vibe and agentic coding with it on a single GPU. It is more finicky than Claude but if you hand hold it a bit it's crazy.

> The math and coding part is impressive but the agentic one is not.
I think this is very important for it to eventually become a viable replacement for coding models, because most of the time coding harnesses are leveraging tool calls to gather the context and then write a solution. I am hopeful that one day we can replace Claude and OpenAI models with local SOTA LLMs.

Announcement blog post: https://www.zyphra.com/post/zaya1-8b

I disagree. I think people can make very good software by balancing their use of AI and their market knowledge. I still believe for the foreseeable future people can make wildly loved or mission-critical software with zero AI and have it be met with market interest. I think we are going to see a surge in software claiming to do everything and becoming bloated and unsustainable. I already see single-GPU local models one-shotting games via vibe coding. I see people doing agentic programming, granted more slowly and cheaply than 12 Claude sessions. The difference isn't as big as it was 2 months ago. In the past 45 days so many model releases have happened. Meanwhile frontier performance has stagnated and degraded. If it's a taste of what is to come I welcome it.

using C was 100 times as productive as assembly. what happened was not that we finished software 100 times faster, but that we did projects 100 times bigger in the same time. same thing with smol local LLMs versus the big ones in the sky. your smol local LLM will only be able to tackle projects which are not commercially valuable anymore, because people expect 100x scope and features. which is fine as a hobby/art project. yes, we'll do amazing things with local LLMs in 2 years, but the big LLMs will do things beyond imagination (assembly vs C)

I've been saying it for a long time now. I think small models are the future for LLMs. It's been fun seeing experiments to see just how much better models get by making them insanely large but it's not sustainable. No, I am not saying this model is a drop-in Claude replacement. But I think in 2 years we might be really surprised what can be done on a desktop with commodity hardware, no connection to the internet, and a few models that span a subset of tasks. Really happy to see AMD throw their hat in the ring. It's a good day for AMD investors. I know a lot of AI bros will scoff at this, but having your first training run is a big deal for a new lab. AMD is on their way despite Nvidia having years of runway.

ZAYA1-8B: An 8B MoE Model with 760M Active Params Matching DeepSeek-R1 on Math