Did you reread what Claude Code produced before publishing? This line in one of the first paragraphs jumps out:
> you end up with about 44 terabytes — roughly what fits on a single hard drive
No normal person would think that 44 TB is a usual hard drive size (I don't think it even exists? 32 TB seems to be the max at my retailer of choice). I don't think it's wrong per se to use an LLM to produce cool visualizations, but this lack of proofreading doesn't inspire confidence (especially since the 44 TB is displayed prominently in a different color).
gushogg-blake
I haven't found an explanation yet that answers a couple of seemingly basic questions about LLMs:
What does the input side of the neural network look like? Is it enough bits to represent N tokens where N is the context size? How does it handle inputs that are shorter than the context size?
I think embedding is one of the more interesting concepts behind LLMs but most pages treat it as a side note. How does embedding treat tokens that can have vastly different meanings in different contexts - if the word "bank" were a single token, for example, how does embedding account for the fact that it can mean river bank or money bank? Do the elements of the vector point in both directions? And how exactly does embedding interact with the training and inference processes - does inference generate updated embeddings at any point or are they fixed at training time?
(Training vs inference time is another thing explanations are usually frustratingly vague on.)
lateral_cloud
This is completely AI-generated... don't bother reading.
lukeholder
The page keeps annoyingly scroll-jumping a few pixels on iOS Safari.
Barbing
Left-hand labels (like Introduction) can overlap the main text content on the right in the central panel. You may be able to trigger it by reducing the window width.
endymion-light
I really dislike the default AI slop CSS. If you're going to do this, please have a design language and some taste ideas beforehand. It can help so much in refining the look.
Genuine piece of feedback: as soon as I see those gradients and quirks, my perception immediately becomes that you put no effort into finding your own style, and therefore will not have put effort into creating this website.
5asHajh
"Retrieved chunks are prepended to the prompt before the LLM sees the question. The model generates from injected facts rather than relying on memorized training data — dramatically reducing hallucination on knowledge-intensive tasks."
So plagiarism is even explicit now. A stolen database relying on cosine similarity to parse the prompts.
Why doesn't The Pirate Bay have a $1 trillion valuation?
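For anyone unsure what the quoted passage describes mechanically, here is a minimal sketch of "retrieve by cosine similarity, then prepend to the prompt". Everything in it is an assumption for illustration: `embed` is a toy bag-of-words counter standing in for a learned embedding model, and `build_prompt`, `cosine`, and the sample chunks are hypothetical names and data, not anything taken from the site being discussed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed(text, vocab):
    """Toy embedding (assumption): word counts over a shared vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]


def build_prompt(question, chunks, k=1):
    """Rank chunks by similarity to the question and prepend the top k to the prompt."""
    vocab = sorted({w for t in chunks + [question] for w in t.lower().split()})
    q_vec = embed(question, vocab)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c, vocab)), reverse=True)
    context = "\n".join(ranked[:k])
    return f"{context}\n\nQuestion: {question}"


chunks = [
    "the river bank floods in spring",
    "interest rates at the bank rose",
]
prompt = build_prompt("why did interest rates rise", chunks)
print(prompt)
```

The model only ever sees the final string, so the retrieved text is ordinary prompt content rather than anything stored in the weights. Real systems would likely swap the toy `embed` for a trained embedding model and a vector index.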
hansmayer
> and used Claude Code to generate the entire interactive site from it
Hard pass on AI slop. First, on principle, it brings no real value: anyone can iterate over some prompts to generate a version of this. Second, and more specifically, don't you know that LLMs are particularly prone to mistakes when summarising, where subtle changes in wording can have a much wider impact on meaning?
If you insist on being the human part of a centaur, then at least do your human slave part: inspect the excreted "content", fix inconsistencies, etc.
arcza
Another low-effort, dark-mode slop site. You lost me at "44 terabytes" before I even got to the em dash in that sentence.
@dang, when is the 'flag as slop' button coming?
PeakScripter
Currently working on something similar myself.
learningToFly33
I’ve had a look, and it’s very well explained! If you ever want to expand it, you could also add how embedded data is fed at the very final step for specific tasks, and how it can affect prediction results.