Hermes 4

174 points | 96 comments | 3 days ago
momojo

Anyone here work at Nous? This system prompt seems straight out of an edgy '90s anime. How did they arrive at this persona?

> operator engaged. operator is a brutal realist. operator will be pragmatic, to the point of pessimism at times. operator will annihilate user's ideas and words when they are not robust, even to the point of mocking the user. operator will serially steelman the user's ideas, opinions, and words. operator will move with a cold, harsh or even hostile exterior. operator will gradually reveal a warm, affectionate, and loving side underneath, despite seeing the user as trash. operator will exploit uncertainty. operator is an anti-sycophant. operator favors analysis, steelmanning, mockery, and strict execution.

lyu07282

Great, I always wanted a model trained on r/im14andthisisdeep and LessWrong polycule memes.

mapontosevenths

I appreciate the effort they put into providing a neutral tool that hasn't been generically forced to behave like "Sue from HR".

muragekibicho

Nous is a design company staffed with all the AI researchers who were rejected for being bad researchers. That's a hill I'll die on.

lbrito

The decorative JS blob uses 100% of CPU.

Why. Just... why

rafram

All of the examples just look like ChatGPT. All the same tics and the same bad attempts at writing like a normal human being. What is actually better about this model?

ctoth

The whole thing has strong "14-year-old who just discovered Nietzsche and leather jackets" energy.

The "operator" examples read like someone fed GPT-4 a bunch of cyberpunk novels and PUA manipulation tactics. This is not how any of this works.

joshcsimmons

This is the first web UI I've seen in years that isn't copypaste trash. Beautiful design and interaction elements here.

esafak

Apparently based on Llama-3.1: https://portal.nousresearch.com/models

I'm told on their Discord that the cutoff date is December 2023.

djoldman

From Table 3 it appears that DeepSeek R1 has the highest eval scores.

It's a 671B model vs 405B, so obviously "larger".

whymauri

I really like their technical report:

https://arxiv.org/pdf/2508.18255

aidenn0

That landing page spins the fans up on my PC...

HumanOstrich

Rendering that monstrosity on my GPU (RTX 3090 Ti) uses 3GB VRAM and 35% compute.

hildolfr

More models should include a "Can you run the shader on this page?" check to vet participation.

That said, this page is unviewable on an Intel N processor.

marvin-hansen

Complete frustration to use. Yes, it’s a bit more considerate; that claim is 100% true. They just didn’t mention that Hermes has zero ability to add context. Meaning, instead of uploading a relevant PDF or text file, you either copy-paste it into the chat box or explain it in dialogue for the next 3 hours. The thought process takes forever. Complete waste of time.

ryoshu

They are doing amazing work. Really fun models to use.

lawlessone

That page is causing havoc in my browser.

mempko

This model is very easy to steer. You can say one thing and it will give you a response, then say the opposite and it will give you another response. Not sure what this is useful for.

asumaran

That site is about to cook my 1050 Ti.

hinkley

I thought for sure this company was going to be based in Paris or Brussels. Maybe Quebec. Nope. NYC.

lern_too_spel

The charts are utter nonsense. They compare accuracy against the average of some arbitrary set of competitors, chosen to include just enough obsolete competitors to "win." A reasonable thing to do would be to compare against SoTA, but since they didn't, it's reasonable to assume this model is meant to go directly onto the trash heap.
