I’ve been trying out various LLMs for working on assembly code in my toy OS kernel for a few months now. It’s mostly low-level device setup and bootstrap code, and I’ve found they’re pretty terrible at it generally. They’ll often generate code that won’t quite assemble, they’ll hallucinate details like hardware registers, and very often they’ll come up with inefficient code. The LLM attempt at an AP bootstrap (real mode to long mode) was almost comical.
All that said, I’ve recently started a RISC-V port, and I’ve found they’re actually quite good at porting bits of low-level init code from x86 (NASM) to RISC-V (GAS) - I guess because it’s largely a simple translation job and they already have the logic to work from.
userbinator
I wonder how many demoscene productions it was trained on. Probably not many, because stuff like this sticks out like a sore thumb:
Might be interesting to try this in ARM assembly where it's a lot less likely to be existing code in the training set.
broken_broken_
The x64 assembly would probably work natively on the Mac, no need for Docker, provided the two syscall numbers (write and exit) are adjusted, which LLMs can likely do.
If it’s an ARM Mac, under Rosetta. Otherwise directly.
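For reference, a minimal sketch of the adjustment being described, done here through libc's syscall() wrapper from Python rather than in assembly (Linux x86-64 assumed; the constants are the actual ABI numbers, the rest is illustrative):

```python
# Hedged sketch (Linux x86-64 assumed): invoking write(2) by raw syscall
# number via libc's syscall() wrapper. macOS x86-64 uses the BSD numbers
# plus a 0x2000000 "class" offset, so swapping these two constants is the
# whole adjustment the port needs.
import ctypes

LINUX_WRITE, LINUX_EXIT = 1, 60                  # Linux x86-64 syscall numbers
MACOS_WRITE, MACOS_EXIT = 0x2000004, 0x2000001   # 0x2000000 + BSD write(4)/exit(1)

libc = ctypes.CDLL(None, use_errno=True)
msg = b"hello, syscall\n"
written = libc.syscall(LINUX_WRITE, 1, msg, len(msg))  # write(1, msg, len)
assert written == len(msg)
```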
revskill
LLMs are useless in a real-world codebase. Tons of hallucination and nonsense. Garbage everywhere. The dangerous thing is that they mess things up randomly, with no consistency at all.
It is fine to treat them as a better autocompletion tool.
worldsayshi
Given the price of Claude Code, I'm surprised that more people don't go the route of using Claude through aider with Copilot or something like that. Is Claude Code the tool worth the extra expense?
piker
I actually expected the struggle to continue based on experience. Though these things can produce some magical results sometimes.
ur-whale
The code seems to be doing its calculations with integers instead of floats.
It can also do a pretty good 3d star field: https://godbolt.org/z/a7v4xnbef
First try worked but didn't use the correct terminal size.
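A sketch of that particular fix, assuming the effect draws plain text to stdout: query the terminal instead of hard-coding 80x24 (`shutil.get_terminal_size()` falls back to the COLUMNS/LINES environment variables, then to 80x24, when stdout isn't a real terminal):

```python
# Size the ASCII output to the actual terminal rather than assuming 80x24.
import shutil

cols, rows = shutil.get_terminal_size()
width, height = cols, rows - 1  # keep one row free for the shell prompt
print(width, height)
```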
Not to be confused with the excellent Mandelbook[0] and related work on the Mandelbrot[1] by Claude Heiland-Allen :)
[0]: https://mathr.co.uk/mandelbrot/book-draft-2017-11-10.pdf
[1]: https://mathr.co.uk/web/mandelbrot.html
Googling "Mandelbrot set in assembly" returns a bunch of examples of this.
OP may want to test this setup here [0]. This is a bit more challenging than replacing a Google query with an LLM pipeline.
[0] https://code.golf/mandelbrot#assembly
If so, why?
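One plausible answer (a guess - I haven't read the generated code): fixed-point arithmetic, a demoscene staple that avoids FPU setup in assembly. A quick sketch of the same escape-time loop in floats and in fixed point; the 16.16 format is my assumption, not taken from the generated code:

```python
# Same Mandelbrot escape-time test in floats and in 16.16 fixed point.
# In fixed point, every multiply is followed by a shift to drop the
# extra fraction bits - cheap integer ops, no floating-point unit needed.
FRAC = 16
ONE = 1 << FRAC  # 1.0 in 16.16 fixed point

def escape_float(cr, ci, max_iter=100):
    zr = zi = 0.0
    for i in range(max_iter):
        if zr * zr + zi * zi > 4.0:
            return i
        zr, zi = zr * zr - zi * zi + cr, 2 * zr * zi + ci
    return max_iter

def escape_fixed(cr, ci, max_iter=100):
    cr = int(cr * ONE)  # convert the float inputs to fixed point once
    ci = int(ci * ONE)
    zr = zi = 0
    for i in range(max_iter):
        zr2 = (zr * zr) >> FRAC  # zr^2, rescaled back to 16.16
        zi2 = (zi * zi) >> FRAC
        if zr2 + zi2 > 4 * ONE:
            return i
        zr, zi = zr2 - zi2 + cr, ((2 * zr * zi) >> FRAC) + ci
    return max_iter

# The two loops agree at these sample points.
assert escape_float(0.0, 0.0) == escape_fixed(0.0, 0.0) == 100
assert escape_float(1.0, 1.0) == escape_fixed(1.0, 1.0) == 2
```

Near the set boundary the two can disagree by an iteration or two due to rounding, which is usually invisible in ASCII output.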
DeepSeek actually does this in one go.