There isn't any attempt to falsify the "clean room" claim in the article - a rational approach would be to not provide any documents about the Z80 and the Spectrum, and just ask it to one-shot an emulator and compare the outputs...
If the one-shot output resembles anything working (and I am betting it will), then obviously this isn't clean room at all.
show comments
stevekemp
I grew up with the Spectrum, and wrote a CP/M emulator a while back. I'd be curious to see how complete it would get.
I struggled a lot with some complex software, which worked on some emulators and failed on others (and mine).
For example one bug I had, which is still outstanding, relates to the Hisoft C compiler:
But I see that my cpm-dist repository is referenced in the download script so that made me happy!
It's great to see people still using CP/M, writing software for it, and sharing the knowledge. Though I do think the choice to implement the CCP in C, rather than using a genuine one, is an interesting one, and a bit of a cheat. It means that you cannot use "SUBMIT" and other common-place binaries/utilities.
show comments
avadodin
So what you're saying is that it's not just the machine-readable documentation built over decades of the officially undocumented behavior of Z80 opcodes—often provided under restrictive licenses—it's also the "known techniques and patterns" of emulator code—often provided under restrictive licenses.
hoc
Great project and write-up. I wonder whether most of those "hints" are really needed, though, as you are already using Claude CODE. Aren't things like "simple" and "clean" assumed to be part of its system prompt already (idnividual documentation style etc can't be, of course). While they were useful when using a general LLM for coding, I would think that they are now part of the overall setup of any coding agent. These days I run more into problems with language and api version drifts, even when specified beforehand.
itomato
All the design hints required for this or any other type of agentic "set it and forget it" development are interesting to me, because they enable the result but also lock in less-than-desirable results that exhibit a miss "like simulating a 2Mhz clock".
What if Agents were hip enough to recognize that they have navigated into a specialized area and need additional hinting? "I'm set up for CP/M development, but what I really need now is Z80 memory management technique. Let me swap my tool head for the low-level Z80 unit..."
We can throw RAGs on the pile and hope the context window includes the relevant tokens, but what if there were pointers instead?
cbolton
I asked Gemini to reproduce the poem "The Road Not Taken". I got it in full (as far as I can tell without Gemini fetching anything from the web). I didn't provide any verse of the poem so I guess that counts as a clean room "implementation"?
rjh29
No Carmack or Stallman. Just the right person at the right time.
ontouchstart
Is it possible to build a full OS emulator on top of MMIX?
> The above tools could theoretically be used to compile, build, and bootstrap an entire FreeBSD, Linux, or other similar operating system kernel onto MMIX hardware, were such hardware to exist.
The problem is that it will have been trained on multiple open source spectrum emulators. Even "don't access the internet" isn't going to help much if it can parrot someone else's emulator verbatim just from training.
Maybe a more sensible challenge would be to describe a system that hasn't previously been emulated before (or had an emulator source released publicly as far as you can tell from the internet) and then try it.
For fun, try using obscure CPUs giving it the same level of specification as you needed for this, or even try an imagined Z80-like but swapping the order of the bits in the encodings and different orderings for the ALU instructions and see how it manages it.
show comments
le-mark
Who else had ai implement an emulator? Raises hand. A 6502 emulator in JavaScript was my first Gemini experiment.
kazinator
What'a a "clear room"? A clean room, but with plagiarized code, laundered through an LLM?
dist-epoch
> I believe automatic programming to be already super-human, not in the sense it is currently capable of producing code that humans can’t produce, but in the concurrent usage of different programming languages, system programming techniques, DSP stuff, operating system tricks, math, and everything needed to reach the result in the most immediate way.
As HN likes to say, only a amateur vibe-coder could believe this.
Which is wrong. It's x4-x0. Comment does not match the code below.
> static inline uint16_t zx_pixel_addr(int y, int col) {
It computes a pixel address with 0x4000 added to it only to always subtract 0x4000 from it later. The ZX apparently has ROM at 0x0000..0x3fff necessitating the shift in general but not in this case in particular.
This and the other inline function next to it for attributes are only ever used once.
> During the
> * 192 display scanlines, the ULA fetches screen data for 128 T-states per
> * line.
Yep.. but..
> Instead of a 69,888-byte lookup table
How does that follow? The description completely forgets to mention that it's 192 scan lines + 64+56 border lines * 224 T-States.
I'm bored. This is a pretty muddy implementation. It reminds me of the way children play with Duplo blocks.
show comments
xcf_seetan
I had Claude make an quad core 32 bits z80 just for fun.
What is "clear room"? If he means clean room, no, this doesn't qualify.
I wish people would stop using this phrase altogether for LLM-assisted coding. It has a specific legal and cultural meaning, and the giant amount of proprietary IP that has been (illegally?) fed to the model during training completely disqualifies any LLM output from claiming this status.
airza
You use clean room everywhere in the article and clear room in the title. Is this on purpose?
show comments
jlarcombe
How on earth does this count as "clean room" in any way, when many open-source Z80 emulators will without doubt have been part of its training data?
There isn't any attempt to falsify the "clean room" claim in the article - a rational approach would be to not provide any documents about the Z80 and the Spectrum, and just ask it to one-shot an emulator and compare the outputs...
If the one-shot output resembles anything working (and I am betting it will), then obviously this isn't clean room at all.
I grew up with the Spectrum, and wrote a CP/M emulator a while back. I'd be curious to see how complete it would get.
I struggled a lot with some complex software, which worked on some emulators and failed on others (and mine).
For example one bug I had, which is still outstanding, relates to the Hisoft C compiler:
https://github.com/skx/cpmulator/issues/250
But I see that my cpm-dist repository is referenced in the download script so that made me happy!
It's great to see people still using CP/M, writing software for it, and sharing the knowledge. Though I do think the choice to implement the CCP in C, rather than using a genuine one, is an interesting one, and a bit of a cheat. It means that you cannot use "SUBMIT" and other common-place binaries/utilities.
So what you're saying is that it's not just the machine-readable documentation built over decades of the officially undocumented behavior of Z80 opcodes—often provided under restrictive licenses—it's also the "known techniques and patterns" of emulator code—often provided under restrictive licenses.
Great project and write-up. I wonder whether most of those "hints" are really needed, though, as you are already using Claude CODE. Aren't things like "simple" and "clean" assumed to be part of its system prompt already (idnividual documentation style etc can't be, of course). While they were useful when using a general LLM for coding, I would think that they are now part of the overall setup of any coding agent. These days I run more into problems with language and api version drifts, even when specified beforehand.
All the design hints required for this or any other type of agentic "set it and forget it" development are interesting to me, because they enable the result but also lock in less-than-desirable results that exhibit a miss "like simulating a 2Mhz clock".
What if Agents were hip enough to recognize that they have navigated into a specialized area and need additional hinting? "I'm set up for CP/M development, but what I really need now is Z80 memory management technique. Let me swap my tool head for the low-level Z80 unit..."
We can throw RAGs on the pile and hope the context window includes the relevant tokens, but what if there were pointers instead?
I asked Gemini to reproduce the poem "The Road Not Taken". I got it in full (as far as I can tell without Gemini fetching anything from the web). I didn't provide any verse of the poem so I guess that counts as a clean room "implementation"?
No Carmack or Stallman. Just the right person at the right time.
Is it possible to build a full OS emulator on top of MMIX?
> The above tools could theoretically be used to compile, build, and bootstrap an entire FreeBSD, Linux, or other similar operating system kernel onto MMIX hardware, were such hardware to exist.
https://en.wikipedia.org/wiki/MMIX
The problem is that it will have been trained on multiple open source spectrum emulators. Even "don't access the internet" isn't going to help much if it can parrot someone else's emulator verbatim just from training.
Maybe a more sensible challenge would be to describe a system that hasn't previously been emulated before (or had an emulator source released publicly as far as you can tell from the internet) and then try it.
For fun, try using obscure CPUs giving it the same level of specification as you needed for this, or even try an imagined Z80-like but swapping the order of the bits in the encodings and different orderings for the ALU instructions and see how it manages it.
Who else had ai implement an emulator? Raises hand. A 6502 emulator in JavaScript was my first Gemini experiment.
What'a a "clear room"? A clean room, but with plagiarized code, laundered through an LLM?
> I believe automatic programming to be already super-human, not in the sense it is currently capable of producing code that humans can’t produce, but in the concurrent usage of different programming languages, system programming techniques, DSP stuff, operating system tricks, math, and everything needed to reach the result in the most immediate way.
As HN likes to say, only a amateur vibe-coder could believe this.
in spectrum.c
> Address bits for pixel (x, y): > * 010 Y7 Y6 Y2 Y1 Y0 | Y5 Y4 Y3 X7 X6 X5 X4 X3
Which is wrong. It's x4-x0. Comment does not match the code below.
> static inline uint16_t zx_pixel_addr(int y, int col) {
It computes a pixel address with 0x4000 added to it only to always subtract 0x4000 from it later. The ZX apparently has ROM at 0x0000..0x3fff necessitating the shift in general but not in this case in particular.
This and the other inline function next to it for attributes are only ever used once.
> During the > * 192 display scanlines, the ULA fetches screen data for 128 T-states per > * line.
Yep.. but..
> Instead of a 69,888-byte lookup table
How does that follow? The description completely forgets to mention that it's 192 scan lines + 64+56 border lines * 224 T-States.
I'm bored. This is a pretty muddy implementation. It reminds me of the way children play with Duplo blocks.
I had Claude make an quad core 32 bits z80 just for fun.
<https://pastebin.com/Z2b82LHG>
It is "clean room"
What is "clear room"? If he means clean room, no, this doesn't qualify.
I wish people would stop using this phrase altogether for LLM-assisted coding. It has a specific legal and cultural meaning, and the giant amount of proprietary IP that has been (illegally?) fed to the model during training completely disqualifies any LLM output from claiming this status.
You use clean room everywhere in the article and clear room in the title. Is this on purpose?
How on earth does this count as "clean room" in any way, when many open-source Z80 emulators will without doubt have been part of its training data?
Wow