simonw

It's hard to overstate the impact Georgi Gerganov and llama.cpp have had on the local model space. He pretty much kicked off the revolution in March 2023, making LLaMA work on consumer laptops.

Here's that README from March 10th 2023 https://github.com/ggml-org/llama.cpp/blob/775328064e69db1eb...

> The main goal is to run the model using 4-bit quantization on a MacBook. [...] This was hacked in an evening - I have no idea if it works correctly.

Hugging Face have been a great open source steward of Transformers, and I'm optimistic the same will be true for GGML.

I wrote a bit about this here: https://simonwillison.net/2026/Feb/20/ggmlai-joins-hugging-f...

mythz

I consider HuggingFace more "Open AI" than OpenAI - one of the few quiet heroes (along with Chinese OSS) helping bring on-premise AI to the masses.

I'm old enough to remember when traffic was expensive, so I've no idea how they've managed to offer free hosting for so many models. Hopefully it's backed by a sustainable business model, as the ecosystem would be meaningfully worse without them.

We still need good value hardware to run Kimi/GLM in-house, but at least we've got the weights and distribution sorted.

HanClinto

I'm regularly amazed that HuggingFace is able to make money. It does so much good for the world.

How solid is its business model? Is it long-term viable? Will they ever "sell out"?

mnewme

Huggingface is the silent GOAT of the AI space: such a great community and platform.

0xbadcafebee

> The community will continue to operate fully autonomously and make technical and architectural decisions as usual. Hugging Face is providing the project with long-term sustainable resources, improving the chances of the project to grow and thrive. The project will continue to be 100% open-source and community driven as it is now.

I want this to be true, but business interests win out in the end. Llama.cpp is now the de-facto standard for local inference; more and more projects depend on it. If a company controls it, that company controls the local LLM ecosystem. And yeah, Hugging Face seems nice now... so did Google originally. If we don't want to be locked in, we either need a llama.cpp competitor (with a universal abstraction), or it should be controlled by an independent nonprofit.

snowhale

Good to see them get proper backing. llama.cpp is basically infrastructure at this point, and relying on volunteer maintainers for something this critical was starting to feel sketchy.

jgrahamc

This is great news. I've been sponsoring ggml/llama.cpp/Georgi since 2023 via GitHub. Glad to see this outcome. I hope you don't mind, Georgi, but I'm going to cancel my sponsorship now that you and the code have found a home!

beoberha

Seems like a great fit - kinda surprised it didn’t happen sooner. I think we are deep in the valley of local AI, but I’d be willing to bet it breaks out in the next 2-3 years. Here’s hoping!

tkp-415

Can anyone point me in the direction of getting a model to run locally and efficiently inside something like a Docker container on a system with not-so-strong computing power (i.e., a MacBook M1 with 8 GB of memory)?

Is my only option to invest in a system with more computing power? These local models look great, especially something like https://huggingface.co/AlicanKiraz0/Cybersecurity-BaronLLM_O... for assisting in penetration testing.

I've experimented with a variety of configurations on my local system, but in the end it turns into a makeshift heater.
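
For rough sizing (a back-of-envelope sketch; the ~4.5 bits/weight figure is an assumption for a typical 4-bit quant, and the memory budget is a guess for what's free on an 8 GB Mac):

```python
def quant_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of quantized weights, in decimal GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Assumed usable memory on an 8 GB Mac after the OS and apps take their share.
budget_gb = 5.5

for params in (1.0, 3.0, 7.0, 13.0):
    size = quant_size_gb(params, 4.5)  # ~4.5 bits/weight for a typical 4-bit quant
    # Reserve ~1 GB for KV cache and runtime overhead (also an assumption).
    verdict = "fits" if size + 1.0 < budget_gb else "too big"
    print(f"{params:>4.0f}B @ ~4.5 bpw: {size:.1f} GB -> {verdict}")
```

By this estimate, 1-3B models fit comfortably and a 7B just squeezes in, which matches the usual advice for 8 GB Apple Silicon machines; 13B and up really does call for more RAM.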

kristianp

> Towards seamless “single-click” integration with the transformers library

That's interesting. I thought they would be somewhat redundant; they do similar things, after all, except training.

fancy_pantser

Was Georgi ever approached by Meta? I wonder what they offered (I'm glad they didn't succeed, just morbid curiosity).

the__alchemist

Does anyone have a good comparison of HuggingFace/Candle to Burn? I'm testing them concurrently, and Burn seems to have an easier-to-use API (and can use Candle as a backend, which is confusing). When I ask on Reddit or Discord channels, people overwhelmingly recommend Burn, but provide no concrete reasons beyond "Candle is more for inference while Burn is training and inference". That doesn't track, as I've done training on Candle. So, if you've used both: thoughts?

karmasimida

Does local AI have a future? The models are getting ridiculously big, storage hardware is hoarded by a few companies for the next two years, and Nvidia has stopped making consumer GPUs this year.

It seems to me there's no chance local ML gets beyond toy status compared to the closed-source models in the short term.

mattfrommars

I don’t know if this warrants a separate thread here but I have to ask…

How can I realistically get involved in the AI development space? I feel left out with what's going on, living in a bubble where AI use is pushed on me by my employer (GitHub Copilot). What's a realistic roadmap to slowly get into AI development, whatever that means?

My background is full-stack development in Java and React, though development there is slow.

I've only messed with AI on the application side: building a local chatbot demo to understand what RAG is about, and running models locally. But all of this is very superficial, and I feel I'm not in the deep end of what AI is about. I get that I'm too 'late' to be on the side of building the next frontier model and that chasing it makes no sense, so what else can I do?

I know Python; is the next step maybe "LLM from scratch"? Or the Google machine learning crash course certificate? Or the recently released Nvidia certification?

I'm open to suggestions.

jimmydoe

Amazing. I like the openness of both projects and am really excited for them.

Hopefully this doesn't mean consolidation as resources dry up, but a true fusion of the best of both.

moralestapia

I hope Georgi gets a big fat check out of this, he deserves it 100%.

androiddrew

One of the few acquisitions I do support

forty

Looks like someone tried to type "Gmail" while drunk...

cyanydeez

Is there a local webui that integrates with Hugging face?

Ollama and webui seem to be rapidly losing their charm. Ollama now includes cloud APIs, which makes no sense for a local tool.

sheepscreek

Curious about the financials behind this deal. Did they close above what they raised? What’s in it for HuggingFace?

stephantul

Georgi is such a legend. Glad to see this happening

lukebechtel

Thank you Georgi <3

segmondy

Great news! I have always worried about ggml and its long-term prospects, and wished for them to be rewarded for their efforts.

dhruv3006

Huggingface is actually something that's driving good in the world. Good to see this collab.

superkuh

I'm glad the llama.cpp and the ggml backing are getting consistent reliable economic support. I'm glad that ggerganov is getting rewarded for making such excellent tools.

I am somewhat anxious about "integration with the Hugging Face transformers library" and the Python-ecosystem entanglements that might cause. I know llama.cpp and ggml already have plenty of Python tooling, but it's not strictly required unless you're quantizing models yourself or doing other such things.

dmezzetti

This is really great news. I've been one of the strongest supporters of local AI, dedicating thousands of hours to building a framework to enable it. I'm looking forward to seeing what comes of this!

geooff_

As someone who's been in the "AI" space for a while, it's strange how Hugging Face went from one of the biggest names to not being part of the discussion at all.

option

Isn't HF banned in China? Also, how are so many Chinese labs on Twitter all the time?

In either case - huge thanks to them for keeping AI open!

periodjet

Prediction: Amazon will end up buying HuggingFace. Screenshot this.

ukblewis

Honestly, I'm shocked to be the only one I see of this opinion: Hugging Face's `accelerate`, `transformers`, and `datasets` are some of the worst open-source Python libraries I have ever had to use. They break backwards compatibility constantly, even on APIs that aren't underscore/dunder-named, even on minor version releases, without documenting it; they refuse PRs fixing their missing `overload` type annotations, which breaks type checking on their libraries; and they generally seem to have spaghetti code. I am not excited that another team is joining them and consolidating more engineering might in the hands of these people.
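
For anyone who hasn't hit this: the `overload` complaint is about functions whose return type depends on an argument. A minimal, hypothetical illustration (not actual `transformers` code) of what the annotations buy you:

```python
from typing import Literal, overload

class Model:
    def __init__(self, name: str) -> None:
        self.name = name

# With these two @overload stubs, a type checker narrows the return type
# per call site; without them, every caller sees the union `Model | dict`
# and has to cast or assert before touching `.name`.
@overload
def load(name: str, as_dict: Literal[True]) -> dict: ...
@overload
def load(name: str, as_dict: Literal[False] = ...) -> "Model": ...
def load(name: str, as_dict: bool = False):
    return {"name": name} if as_dict else Model(name)
```

Without the overloads, `load("gpt2").name` gets flagged by a checker even though it's fine at runtime.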

rvz

This acquisition is almost the same as the acquisition of Bun by Anthropic.

Both are $0-revenue "companies", but both have created software that is essential to the wider ecosystem and carries mindshare value: Bun for JavaScript and ggml for AI models.

But of course the VCs needed an exit sooner or later. That was inevitable.
