abeppu

Diffusion model papers are always interesting to read, but I always feel like they need some mechanism to insert or delete tokens. In the example in the figure in this post, once it has fixed "British munchkin cats _ _ and ...", you _can't_ get to "British munchkin cats are a new and controversial breed." because there isn't the right number of tokens between "cats" and "and". In a coding context, if your model samples a paren or a comma or something that is entirely plausible at that position, it can still close off an expansion that would have been syntactically correct.
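A toy sketch of the fixed-length limitation being described (hypothetical tokens and helper, not any real model's API): in masked-diffusion decoding the sequence length is fixed up front, so unmasking can only fill existing slots, and a continuation that needs more tokens than the reserved slots is unreachable.

```python
# The post's example: two masked slots were reserved between "cats" and "and".
template = ["British", "munchkin", "cats", "[MASK]", "[MASK]", "and", "controversial"]

def unmask(seq, pos, token):
    """Fill one masked slot; the number of positions never changes."""
    assert seq[pos] == "[MASK]", "can only fill an existing slot"
    out = list(seq)
    out[pos] = token
    return out

step1 = unmask(template, 3, "are")
step2 = unmask(step1, 4, "new")
print(" ".join(step2))
# prints: British munchkin cats are new and controversial
# The five-token continuation "are a new and controversial breed" can never
# appear: there is no operation that inserts a position between "cats" and "and".
```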

MASNeo

I wish there were more of this research into speeding things up, rather than building ever-larger models.

yjftsjthsd-h

Is anyone doing any form of diffusion language model that's actually practical to run today on the actual machine under my desk? There are loads of more "traditional" .gguf options (well, quants) that are practical even on shockingly weak hardware, and I've been seeing things that give me hope that diffusion is the next step forward, but so far it's all been early research prototypes.

simonw

I'd love to know what's going on with the Gemini Diffusion model: they had a preview last May and it was crazy fast, but I've not heard anything since then.

fumeux_fume

Seeing half of an AR LLM's output tokens go to generating a predefined JSON schema bothers me so much. I would love to have an option to use diffusion for infilling.

nl

Releasing this on the same day as Taalas's 16,000-token-per-second acceleration of the roughly comparable Llama 8B model must hurt!

I wonder how far down they can scale a diffusion LM? I've been playing with in-browser models, and the speed is painful.

https://taalas.com/products/

LarsDu88

A lot of this post-training recipe feels reminiscent of DINO training (teacher/student, use of stop-gradients). I wonder if the more recent LeJEPA SIGReg regularization research might be relevant here for simpler post-training.

bjt12345

I do wonder why diffusion models aren't used alongside constrained decoding for programming; surely it makes better sense than using an auto-regressive model.
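A rough sketch of the constrained-decoding idea (toy grammar and vocabulary, all hypothetical): mask the model's token probabilities down to grammar-legal options and renormalize. An AR decoder applies this left to right; a diffusion sampler could in principle apply such a mask per masked position at every denoising step.

```python
vocab = ["(", ")", "x", "+"]

def legal_next(prefix):
    """Toy grammar for balanced arithmetic over 'x': tokens that may follow."""
    depth = prefix.count("(") - prefix.count(")")
    last = prefix[-1] if prefix else None
    legal = set()
    if last in (None, "(", "+"):
        legal |= {"(", "x"}          # an operand may start here
    if last in ("x", ")"):
        legal.add("+")               # an operator may follow an operand
        if depth > 0:
            legal.add(")")           # close only if something is open
    return legal

def mask_probs(probs, prefix):
    """Zero out grammar-illegal tokens, then renormalize the rest."""
    allowed = legal_next(prefix)
    masked = {t: (p if t in allowed else 0.0) for t, p in probs.items()}
    z = sum(masked.values())
    return {t: p / z for t, p in masked.items()} if z else masked

probs = {"(": 0.4, ")": 0.3, "x": 0.2, "+": 0.1}
print(mask_probs(probs, ["(", "x"]))  # only ")" and "+" survive the mask
```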

LarsDu88

Google is working on a similar line of research. I wonder why they haven't rolled out a GPT-4o-scale version of this yet.

WiSaGaN

I think diffusion makes much more sense than auto-regressive (AR) generation specifically for code generation, as compared to chatbot use.

hanifbbz

Is this available as open source anywhere to try?

cubefox

This doesn't mention the main drawback of diffusion language models, the reason nobody is using them: they score significantly lower on benchmarks than autoregressive models of similar size.

LoganDark

Can't wait for the day I can actually try a diffusion model on my own machine (128GB M4 Max) rather than as a hosted service. So far I haven't seen a single piece of software that supports it.

refulgentis

If this means there’s a 2x-7x speedup available to a scaled diffusion model like Inception Mercury, that’ll be a game changer. It feels 10x faster already…
