I pay for Kagi to get better search results. Lately, I’ve felt that Kagi’s search has been just as full of low-information and AI generated results as Google. I’ve been wondering why I’m still paying for it. This seemed like a good litmus test. Unfortunately, Kagi displays pretty much the same results as Google for nanoclaw.
Growtika
A couple years back John Reilly posted on HN "How I ruined my SEO" and I helped him fix it for free. He wrote about the whole thing here: https://johnnyreilly.com/how-we-fixed-my-seo
Happy to do the same for you if you want.
The quickest win in your case: map all the backlinks the .net site got (happy to pull this for you), then email every publication that linked to it. "Hey, you covered NanoClaw but linked to a fake site, here's the real one." You'd be surprised how many will actually swap the link. That alone could flip things.
Beyond that there's some technical SEO stuff on nanoclaw.dev that would help - structured data, schema, signals for search engines and LLMs. Happy to walk you through it.
update: ok this is getting more traction than I expected so let me give some practical stuff.
1. Google Search Console - did you add and verify nanoclaw.dev there? If not, do it now and submit your sitemap. Basic but critical.
2. I checked the fake site and it actually doesn't have that many backlinks, so the situation is more winnable than it looks.
3. Your GitHub repo has tons of high-quality backlinks, which is great. Reach out to those places and tell the story; I'm sure a few will add a link to your actual site. That alone makes you way more resilient to fakers going forward. This is only happening because everything is so new. Here's a list with all the backlinks pointing to your repo: https://docs.google.com/spreadsheets/d/1bBrYsppQuVrktL1lPfNm...
4. Open social profiles for the project - Twitter/X, LinkedIn page if you want. This helps search engines build a knowledge graph around NanoClaw. Then add Organization and sameAs schema markup to nanoclaw.dev connecting all the dots (your site, the GitHub repo, the social profiles). This is how you tell Google "these all belong to the same entity."
5. One more thing - you had a chance to link to nanoclaw.dev from this HN thread, but you linked to your tweet instead. Totally get it, but a strong link from a front-page HN post with all this traffic and engagement would do real work for your site's authority. If it's not crossing any rule (specific use case here, so maybe check with the mods haha), drop a comment here with a link to nanoclaw.dev. I don't think anyone here would mind if it gets you a few steps closer to beating that fake site.
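For point 4 above, here's a minimal sketch of what that Organization + sameAs markup could look like, generated with Python's json module. The repo and social URLs below are illustrative placeholders, not the project's real ones:

```python
import json

# Minimal Organization schema tying the site, repo, and socials together.
# Every URL except nanoclaw.dev is a made-up placeholder.
schema = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "NanoClaw",
    "url": "https://nanoclaw.dev",
    "sameAs": [
        "https://github.com/OWNER/nanoclaw",  # real repo URL goes here
        "https://x.com/PROJECT_HANDLE",       # project social profiles
    ],
}

# Emit the payload for a <script type="application/ld+json"> tag in <head>.
print(json.dumps(schema, indent=2))
```

The output goes into a `<script type="application/ld+json">` tag in the page head; Google's structured-data docs cover the Organization type and sameAs property.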
jackfranklyn
The structured data point in the top comment is spot on. Added Organization and SoftwareApplication schema to my own project recently and the shift in how Google indexes you is real - went from being treated as a random domain to Google actually understanding what the site represents.
What's maddening about this whole situation though is that Google already has every signal it needs. The GitHub repo links to nanoclaw.dev. The npm package links to it. The commit history proves authorship. But apparently domain age and raw backlink count still trump verified ownership signals. The system rewards whoever stakes out the domain first, not whoever actually built the thing.
AznHisoka
I'm looking at this from a third-party point of view (definitely not claiming the .net "deserves" to rank higher)
1) the .net version has a couple of very high authority links, namely from theregister and thenewstack (both of which have had lots of engagement).
I highly doubt it would have ranked without those links.
2) it's only been a week. Give Google time to understand which pages should rank higher.
3) Google is biased towards sites that cover a topic earlier than others.
I’ve seen pages that are still top 3 for a particular competitive query years later, simply because they were one of the first to write about it.
Suggestions: give it time. Meanwhile I would recommend linking to your website rather than your github everywhere you mention it, to give it a boost
uyzstvqs
I did some experimenting using different search engines and AIs. Here are the results:
Google and Brave linked to the official GitHub repo followed by the fake domain. DuckDuckGo and Bing linked to the fake domain first, followed by the official GitHub. Mojeek gave higher ranking to two third party articles, but linked to both the official GitHub and website without fakes. Qwant was the worst, as the official website was the second result amongst multiple fake websites and an unrelated GitHub repo.
Then there are the AIs. ChatGPT, Google AI mode, Gemini, Grok, Perplexity, and Brave Search "Ask" all linked to the official website, and some added the GitHub repo as well. DuckDuckGo Search Assist linked to just the official GitHub. Google AI mode, Gemini, and Grok also explicitly warned about the fake websites. Copilot got the official website and GitHub right, but linked to a presumably fake X account as well.
Conclusion: Google, Brave and Mojeek win in search. AI is very good and clearly beats search overall. Google AI mode, Gemini and Grok stand out in quality.
markus_zhang
My advice to all OSS developers: if you open source your project, expect it to be abused in all possible ways. Don't open source if you have anxiety over it. It is how the world works, whether we like it or not.
I appreciate that you open source your projects for us to study. But TBH, please help yourself first.
ariehkovler
It's worse than that. There's a SECOND imitator that I actually stumbled on today while looking something up about nanoclaw - nanoclawS [dot] io - and that one's harvesting email addresses.
The obvious risk here is a bait and switch, where one of these sites switches their link to the Github repo to point to a malicious imitator repo instead.
One approach would be to go after the sites themselves, not their Google ranking. See if their hosts are willing to take them down. Is there anything you can assert copyright over to hang a DMCA request on? That's hard for an open source project, I guess. And the fake sites aren't (yet) doing any actual scamming.
Good luck, though!
bob1029
Losing the SEO battle is a lot like losing money on the stock market. The system you are fighting is incredibly efficient and will never in a trillion years give a single shit about your specific concerns. You can hire lawyers and spend time complaining about it all day on social media. But you'll rarely get a drop of blood out of this stone. The best you can do is to step back, reevaluate your understanding of the market, and adjust your strategy.
allthetime
Piggybacking on the Claw hype, surprised when someone piggybacks on you...
GeoAtreides
And I'm losing the sanity battle for my own mind with all these AI-generated posts. Please, I beg you: two lines by your own hand are worth 100,000 generated tokens.
dirk94018
We had a similar experience — looks like someone used AI to clone our site's design and structure at linuxtoaster.com. The real issue Gavriel is highlighting goes beyond SEO. The cost of creating a convincing copycat site just went to zero. Anyone can feed a successful page to an LLM and get a polished clone in minutes. And for open source projects it's even worse — they can clone your website AND clone your code, have an AI rebrand it, and ship a convincing-looking alternative overnight.
MarkSweep
The link on GitHub to the real site is marked with rel="nofollow". I wonder if it would make sense for GitHub to remove nofollow in some circumstances. Perhaps based on some sort of reputation system or if the site links back to the repo with a <link rel="self" href="..." /> in the header? Presumably that would help the real site rank higher when the repo ranks highly.
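The reciprocal-link check proposed here could be sketched with Python's stdlib HTML parser. Note that `rel="self"` pointing back at a repo is this comment's hypothetical convention, not an existing standard, and all URLs below are made up:

```python
from html.parser import HTMLParser

class LinkBackChecker(HTMLParser):
    """Scans a page for <link rel="self" href=REPO_URL>, in the spirit of
    the reputation check suggested above (rel="self" is hypothetical)."""

    def __init__(self, repo_url: str):
        super().__init__()
        self.repo_url = repo_url
        self.links_back = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "self" and a.get("href") == self.repo_url:
            self.links_back = True

# Made-up repo URL and page for illustration.
repo = "https://github.com/example/nanoclaw"
page = '<html><head><link rel="self" href="https://github.com/example/nanoclaw"></head></html>'

checker = LinkBackChecker(repo)
checker.feed(page)
print(checker.links_back)  # prints True
```

A site passing a check like this could, under the proposal, earn a followed link from its repo page.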
Sweepi
> When you Google "NanoClaw," a fake website ranks #2 globally, right below the project's GitHub.
Unfortunately, the fake website [.net] is also #3 on Kagi, and #1 on Duckduckgo.
On Kagi, the Github is #1 and nanoclaw.dev is #4, but only if you count "Interesting Finds".
On Duckduckgo, the Github is #2 and nanoclaw.dev is nowhere to be found.
Neither of these projects requires payment anywhere, but tons of sites pop up trying to "sell" them. I wouldn't even know what that means, and I'm kind of tempted to drop in a credit card to see what happens. Would they just auto-send you a link to the public repo?
Most of it is quite lazy and hasn't quite kept up with modern AI capabilities. They mostly just scrape the text I wrote and present it with some screenshots that I created. I can imagine a future where
- really nice landing pages are generated
- the product is entirely rebranded
- marketing is automated (linkedin, google ads, etc)
and someone develops an autonomous system that finds high-quality yet unknown open source projects, and redeploys and sells them online for actual money.
signorovitch
> This isn't an SEO problem. This is a Google problem.
I've tested a few of the big search engines, and nanoclaw.dev is never on the first page.
Gemini was also unable to find the .dev, even in "Research Mode." The only way I was able to get a direct link to nanoclaw.dev was with chatgpt, which found it by scraping the GitHub (it also spat out links to a couple of other copies it found from google.)
Seems this is a wider SEO issue, one which infiltrates even the technology supposed to replace it.
tracker1
Do what Louis Rossman did... just ask Google's AI what you need to change on your site... Apparently that's the secret now.
networkcat
Before installing new software, I usually visit its GitHub page or Wikipedia entry first and click through to the official site from there. I just don't trust the 'official' sites that pop up in Google search results. How many of you do the same?
youknownothing
> I've done everything you're supposed to do and more.
By the sound of it, everything except reporting it? Winning at SEO just means appearing above them in search results, but the fake page shouldn't just lose the race; it should be taken down.
lol. This gets worse with AI search. If Google can't figure out the canonical source from a GitHub repo linking directly to the official site, LLMs definitely can't. And once an AI overview bakes the fake site into its knowledge graph, you're not just losing Google rankings, imo; you're losing the models too. Registering every TLD on day 1 is now just table stakes for any OSS project, which still doesn't seem fair.
throwaway85825
People forget that Google is a malware services company. A significant part of their revenue is fake OBS malware and the like.
samuelknight
Copycats are not a new problem. You can be completely open source and have a trademark on the project name.
azangru
> So I built a real website.
That was two weeks ago.
Is Google supposed to make drastic updates to its index within two weeks?
lucasluitjes
I've been annoyed with Google search quality lately and was wondering how the others fared on this specific issue. Turns out, mostly not much better.
Bing, DuckDuckGo, Qwant, Ecosia, and Brave all had the github repo and nanoclaw.net (the fake homepage) in first or second place. Marginalia had fascinating results about biology, but only tangentially related Nanoclaw results: not the github repo, nor the fake or real homepage.
Mojeek was the exception, sort of. It had some random news sites up top, but the github repo in 2nd place and nanoclaw.dev (the real homepage) in the 4th place. The fake nanoclaw.net did not show.
Kagi is the only one I couldn't try because apparently I used up my free credits a year back. Can anyone see how they compare?
WD-42
Is there an acronym for “AI generated, didn’t read”?
jccooper
I don't see that Google cares much about backlinks any more. Seems like it's all about "content" keywords and maybe a little time-on-site. The domain is a huge signal, which is probably where the problem comes from here.
Sadly, Google's generally better against all the new AI-generated content farms than other players, so maybe they're still running PageRank somewhere.
vegasbrianc
SEO is broken at the moment. With Google Overviews just killing organic SEO, it is becoming less and less relevant, unfortunately.
theanonymousone
I saw this some time ago with Bing and OpenCode:
"If I search for "opencode GitHub" in Bing, a random fork is returned"
Just an FYI, but I don't know if being in the website field of GitHub really helps since there's a rel nofollow on the link.
bubblewand
Yeah, Google stopped even trying to usefully index most of the web around ‘08 or ‘09 or so. Was super obvious when it happened and it’s been that way ever since. Your GitHub is up there because it’s a blessed website, your personal site isn’t and will struggle mightily to rank even when you search exact, unusual phrases on it, if it’s like most of the rest of the Web on Google these days.
Get more traffic (make sure google analytics sees it, IDK but that probably matters because monopoly) and it might help.
Most of the other indices aren’t much better. Turns out fighting spam is expensive, easier to just do a combo of boosting really big sites and blessed spammers that use your ad network.
elevation
This project was launched very quickly and may not have had a large budget for extra domains.
But for entities with a bit more time, you can prevent this scenario by acquiring the .com/.net variant domains before launching.
roywiggins
I'll be honest, I'd take this more seriously if this post didn't read like ChatGPT output. If you won't spend the effort to use your own words why should I stir myself to care?
Sorry, I'll put it in hand-crafted ChatGPTese:
## The Slop Problem
Every post sounds the same. No intelligence. No individuality. Just pure, clean LLM slop. Let's dive in.
- Every post has LLM tells. This is key.
- Posts get upvoted anyway. Nobody seems to notice or indeed care.
- People acclimate to the slop. This isn't just a coincidence. This is a real shift in standards. When people read enough of this, they begin to think it sounds normal.
## The Replying Dilemma
Should you engage with the content, when there is a real person involved? On the one hand, they put their name on it, and probably the details are drawn from their prompt, so it can be said to fairly represent what they wanted to say. So maybe ragging on their ChatGPT prose is being mean. On the other hand, if nobody ever mentions this, the acclimatization will only get worse as the rising tide of slop overwhelms any other style of writing.
## The "Snobbery is good actually" Option
Relentlessly bully people for their half-baked LLM copy. Make it your whole personality. Go insane.
## The "Giving Up" Solution
Learn to stop worrying and love the LLM.
ryandrake
> I don't want to be playing this game. I want to be writing code, building community, pushing features, fixing bugs.
Then just write code, build features, and fix bugs. Nobody is forcing you to fix search engines' problems. If you're not making money off of traffic, then why worry so much about SEO? Just do your thing. If it really bothers you, put a little note on your GitHub warning people about the fake site, and get on with your life.
iamacyborg
Google is absolutely idiotic sometimes.
We (as in the team that helped fork and migrate the PoE1 wiki) set up a new domain for the Path of Exile 2 wiki, which is being hosted by the folks at Grinding Gear Games and linked on the official website and in multiple places on the highly trafficked subreddit.
Despite this, Google has decided that the site is not relevant and shouldn't appear anywhere in search results, despite the wiki for the first game appearing everywhere.
tmaly
Wasn't one of the original ideas of NFTs to essentially identify the original creator?
alexpham14
Oof, this is exactly the nightmare scenario for “repo-first” OSS.
The weird bit isn’t that a scraper site exists, it’s that Google can’t do the obvious graph join: query == project name, #1 result is the repo, repo declares Homepage = X, yet Google still boosts an imposter domain. That’s not “SEO”, that’s the ranking system refusing to treat maintainer-declared canonical as a strong signal. Early domain squatters get to “set the default” purely by being first, then they can flip the content later once trust is baked in.
People keep saying “tell users to bookmark the real URL” like that scales. Most people will click the second link and assume it’s official. If Google can’t solve this class of problem, their “AI answers” are going to be a bigger mess than blue links ever were.
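The "graph join" being described is simple enough to sketch: if the top result is a repo whose declared homepage also appears in the results, promote that homepage above lookalike domains. A toy version in Python, with made-up URLs standing in for a search index and repo metadata:

```python
def promote_declared_homepage(results, repo_homepages):
    """results: ranked list of URLs.
    repo_homepages: repo URL -> the homepage that repo declares.
    If the #1 result is a repo, move its declared homepage to #2."""
    top = results[0]
    homepage = repo_homepages.get(top)
    if homepage and homepage in results:
        rest = [u for u in results if u not in (top, homepage)]
        return [top, homepage] + rest
    return results

# Toy data mirroring the thread (the repo URL is a placeholder).
ranked = [
    "https://github.com/example/nanoclaw",  # the repo, ranked #1
    "https://nanoclaw.net",                 # imposter domain
    "https://nanoclaw.dev",                 # maintainer-declared homepage
]
declared = {"https://github.com/example/nanoclaw": "https://nanoclaw.dev"}

print(promote_declared_homepage(ranked, declared))
# the declared homepage now outranks the imposter
```

This is obviously not how a real ranker works; the point is just that the signal (repo declares Homepage = X) is already machine-readable.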
bakugo
> I don't want to be playing this game. I want to be writing code
I assume the "I" here refers to Claude, who seemingly wrote the entire project AND the linked post.
ZoomZoomZoom
This is a Google problem, but only secondarily.
The crux of the matter is that there's nothing that protects an open project besides reputation, and nowadays in the digital space it can be cheaply farmed.
Laws could help, but they only work when you undertake purposeful actions to be covered by them, like registering a trademark, and that's never cheap.
Imagine you're in a local band playing shows. It's 3 months old and you have no issued records. A second band, tighter with venues, takes your name and starts performing under your moniker. You have no money to take that to court, and good luck making a case. You can't do anything besides screaming on the web or, I don't know, kicking a few butts. You change your name.
renegat0x0
- I think I was upset when Google allowed a fake ad for VLC to appear high in the ranking
- I hate that Google returns content farms instead of product web pages
- I hate that Google provides a page of 10 useful links, later links are just pure garbage. I think that something in Google engine is profoundly broken
- I maintain my own search index, but it requires a lot of effort, and attention. I do insert links if I find them worthy. I think more people should have their personal search indexes. Mine is below. I am quite happy that problems like these do not affect me that much
> This isn't an SEO problem. This is a Google problem.
Sorry, but this is an SEO problem. The fake site has probably been linked to by a number of high-SEO outlets. What you should do is contact them and ask them to fix the links (to point to your site), which they should be happy to do.
rocketvole
I think OrcaSlicer suffers from the same issue. Not really sure why some OSS projects struggle with this and others (Notepad++) don't.
MagicMoonlight
A guy that stole someone else’s idea by making a shinier website getting mad that someone stole his idea by making a shinier website. Such is life.
boredhedgehog
> The person running nanoclaw[.]net can put anything they want on that page tomorrow. A crypto scam. A phishing page. Malicious download links. They could fork the GitHub repo, inject malicious code, and link to it from the site that Google is telling thousands of people is legitimate.
A lot of handwringing about hypotheticals. The page is up there because it links to the official repo. Changing that will quickly tank its search rank.
barelysapient
The more things change the more they stay the same.
shevy-java
I noticed this a few years ago. Google has been ruining its search engine deliberately. I could explain the things Google did here, but other websites and videos already explain it, including the why (though there is some speculation as to why).
These days I even find, e.g., Qwant sometimes having better results than Google search. I see it as a positive thing though: I can soon stop using Google search, so that's one less Google product. One day I will be Google-free. It will be a happy day. I really think Google must cease to exist.
(The only sad thing is how bad the other search engines are. While Google search sucks nowadays, I consistently get even worse results with, e.g., DuckDuckGo. And I think part of the reason is that the world wide web also sucks a LOT more compared to the old days. Google is partially responsible for this too, by the way, which just reinforces the idea that Google must die.)
keybored
Live by bots, die by bots.
imp0cat
It's simple really, .net > .dev.
keiferski
Suddenly the pre-Google Yahoo model of curated links is starting to seem relevant again.
Curation in general is probably a skill that will become more and more in demand as the Internet fills up with AI slop.
ChrisArchitect
Two weeks? Hardly enough for the correct url to take over. A correct url with no history/presence that came out of nowhere as far as the engine is concerned. It will happen most likely tho, thanks to the links from the project etc, but might take a bit of time since the other url is established. "losing the battle" now perhaps, but not for long most likely.
Imustaskforhelp
Duckduckgo actually shows nanoclaw.net as the first result and the github page as second.
Another point: DDG's AI feature actually references nanoclaw.net as a source.
Damn, I booted up Orion (Kagi) and even Kagi shows nanoclaw.net as the third result, after the github page with qwibitai and another github page under your (previous?) github username, i.e. gavrielc, which when clicked also leads to the same github page.
There is an "Interesting Finds" section in Kagi which references the website, but it still shows the nanoclaw.net page earlier, and the nanoclaw.dev entry is so easy to miss that the first time I didn't even notice it.
I expected better from DDG/Kagi, to be honest. I also tried Brave and it had the same issue; Brave even has its own independent index, and even that struggles with this.
Let's hope this gets fixed quickly. Also a good reminder to prefer opening github links over websites: I must admit that even as a tech-savvy person I could have fallen for the nanoclaw.net link, given it's second in basically every search engine.
dumbfounder
DMCA?
Imustaskforhelp
Another comment from me, but here are all the search engines I looked at:
From 1-5, all referenced .net before .dev, and DDG referenced .net before the github page. Marginalia didn't give me the .net, .dev, or github link, but rather docker.com and some other tech articles.
Mojeek and Yandex.ru DID give me .dev links before .net at the time of writing.
I literally opened these two as a joke, especially Mojeek, not expecting much, but I just know the names of lots of search engines, so I tried.
Mojeek and Yandex.ru have surprised me, although I think Yandex.ru might have referenced the .dev because of https://nanoclaw.dev/ru/, as it points to this.
Mojeek seems interesting now from this observation.
I also wanted to try Swisscows, but it looks like they have become 100% premium; I remember being able to search for free, but now a popup comes up.
I also tried Baidu (a Chinese search engine) and it gave results in Chinese; Firefox Translate stuttered and didn't work when I tried to translate. I don't know Chinese, so I pasted the results into Claude, and they link to neither .net nor .dev, just Chinese sites.
With all of this, I think we know one provider (Mojeek) that won. A lot of the engines on these lists are actually not independent, except Mojeek, Brave, and probably Yandex.ru.
So I guess the main takeaway is that independent search engines can be interesting. They can still be hit or miss, but the more independent search engines the merrier: some might miss, but some will also hit.
My comment definitely reads like a reputation bonus for Mojeek. Well, anything for more independent search engines, imo. I looked at their about page and it seems that they are a single person (Marc Smith). Fascinating stuff.
I know marginalia_nu is on HN, so maybe Marginalia and Mojeek can share some index together. Anyway, this was a fun experiment. I hope the community tries out any search engines I may have missed and shares insights if a particular engine gives interesting results.
Drupon
Sorry Gavriel Cohen, but this Google search placement was promised to the other person thousands of years ago.
newswasboring
I fell for this yesterday, but for zeroclaw, not nanoclaw. I found this website[1] through Brave search, I think. I was not paying too much attention as I was under the influence; it points to the wrong repo[2] and the instructions install from that. I didn't like zeroclaw anyway, so I tried to uninstall it and only then realized I was on a forked repo.
Gavriel is freaking out over nothing while making rookie mistakes, pretending not to be in an SEO war.
It's literally not his problem that some people click a scam link; he still has 18,000 github stars. It's just a bifurcated audience of undiscerning people.
He's overly worried about a perfect, unanimous impression when he shouldn't be.
Now he's wasting his money on SEO tweaks and domain names while saying he only wants to code. Then focus on coding! Not buying obscure TLDs and vibecoding sitemaps while wondering what he did wrong.
yeesh, some people can't handle a little fame
csomar
It's worse. I wrote about this a couple of weeks ago [1]. With AI responses and Google pulling results from different sources, you could potentially hijack other brands with your own fake content (e.g., a phone number).
>We trust Google to surface reliable information about elections. Vaccines. Medical conditions. Financial decisions. And they can't get this right?
Actually I don't trust Google and I don't expect it to surface reliable information. I expect it to surface information and I will dig through it and judge for myself whether it is reliable or not.
gjsman-1000
Steve Jobs famously never allowed free meals at Apple.
Humans are psychologically incapable of assigning respect to things that are free; across the board - not donating to open-source, maxing out every dollar of food stamps, refusing to pay a dollar for an app if it has a free tier, even companies like AWS ripping off open source without any qualms. If you got an offer for a free relationship no strings attached, would you take it seriously? If someone on a street corner has artwork for $5 or $500, it could be the same piece of art, but which one gets more attention on first glance?
If you want your work to be respected, do not make it open source. Your odds are slightly better at succeeding at acting. Remember that 97% of public GitHub repos have zero external users.
I pay for Kagi to get better search results. Lately, I’ve felt that Kagi’s search has been just as full of low-information and AI generated results as Google. I’ve been wondering why I’m still paying for it. This seemed like a good litmus test. Unfortunately, Kagi displays pretty much the same results as Google for nanoclaw.
A couple years back John Reilly posted on HN "How I ruined my SEO" and I helped him fix it for free. He wrote about the whole thing here: https://johnnyreilly.com/how-we-fixed-my-seo
Happy to do the same for you if you want.
The quickest win in your case: map all the backlinks the .net site got (happy to pull this for you), then email every publication that linked to it. "Hey, you covered NanoClaw but linked to a fake site, here's the real one." You'd be surprised how many will actually swap the link. That alone could flip things.
Beyond that there's some technical SEO stuff on nanoclaw.dev that would help - structured data, schema, signals for search engines and LLMs. Happy to walk you through it.
update: ok this is getting more traction than I expected so let me give some practical stuff.
1. Google Search Console - did you add and verify nanoclaw.dev there? If not, do it now and submit your sitemap. Basic but critical.
2. I checked the fake site and it actually doesn't have that many backlinks, so the situation is more winnable than it looks.
3. Your GitHub repo has tons of high quality backlinks which is great. Outreach to those places, tell the story. I'm sure a few will add a link to your actual site. That alone makes you way more resilient to fakers going forward. This is only happening because everything is so new. Here's a list with all the backlinks pointing to your repo:
https://docs.google.com/spreadsheets/d/1bBrYsppQuVrktL1lPfNm...
4. Open social profiles for the project - Twitter/X, LinkedIn page if you want. This helps search engines build a knowledge graph around NanoClaw. Then add Organization and sameAs schema markup to nanoclaw.dev connecting all the dots (your site, the GitHub repo, the social profiles). This is how you tell Google "these all belong to the same entity."
5. One more thing - you had a chance to link to nanoclaw.dev from this HN thread but you linked to your tweet instead. Totally get it, but a strong link from a front page HN post with all this traffic and engagement would do real work for your site's authority. If it's not crossing any rule (specific use case here so maybe check with the mods haha) drop a comment here with a link to nanoclaw.dev. I don't think anyone here would mind if it will get you few steps closer towards winning that fake site
The structured data point in the top comment is spot on. Added Organization and SoftwareApplication schema to my own project recently and the shift in how Google indexes you is real - went from being treated as a random domain to Google actually understanding what the site represents.
What's maddening about this whole situation though is that Google already has every signal it needs. The GitHub repo links to nanoclaw.dev. The npm package links to it. The commit history proves authorship. But apparently domain age and raw backlink count still trump verified ownership signals. The system rewards whoever stakes out the domain first, not whoever actually built the thing.
I’m looking at this from a 3rd party of view (definitely not claiming the .net “deserves” to rank higher)
1) the .net version has a couple of very high authority links, namely from theregister and thenewstack (both of which have had lots of engagement).
I highly doubt it would have ranked without those links.
2) its only been a week. Give Google time to understand which pages should rank higher.
3) Google is biased towards sites that cover a topic earlier than others.
I’ve seen pages that are still top 3 for a particular competitive query years later, simply because they were one of the first to write about it.
Suggestions: give it time. Meanwhile I would recommend linking to your website rather than your github everywhere you mention it, to give it a boost
I did some experimenting using different search engines and AIs. Here's the results:
Google and Brave linked to the official GitHub repo followed by the fake domain. DuckDuckGo and Bing linked to the fake domain first, followed by the official GitHub. Mojeek gave higher ranking to two third party articles, but linked to both the official GitHub and website without fakes. Qwant was the worst, as the official website was the second result amongst multiple fake websites and an unrelated GitHub repo.
Then there the AIs. ChatGPT, Google AI mode, Gemini, Grok, Perplexity, and Brave Search "Ask" all linked to the official website, and some added the GitHub repo as well. DuckDuckGo Search Assist linked to just the official GitHub. Google AI mode, Gemini and Grok also explicitly warned about the fake websites. Copilot got the official website and GitHub right, but linked to a presumably fake X account as well.
Conclusion: Google, Brave and Mojeek win in search. AI is very good and clearly beats search overall. Google AI mode, Gemini and Grok stand out in quality.
My advice to all OSS developers: if you open source your project, expect it to be abused in all possible ways. Don't open source if you have anxiety over it. It is how the world works, whether we like it or not.
I appreciate that you open source your projects for us to study. But TBH, please help yourself first.
It's worse than that. There's a SECOND imitator that I actually stumbled on today while looking something up about nanoclaw - nanoclawS [dot] io - and that one's harvesting email addresses.
The obvious risk here is a bait and switch, where one of these sites switches their link to the Github repo to point to a malicious imitator repo instead.
One approach would be to go after the sites themselves, not their Google ranking. See if their hosts are willing to take them down. Is there anything you can assert copyright over to hang a DMCA request on? That's hard for an open source project, I guess. And the fake sites aren't (yet) doing any actual scamming.
Good luck, though!
Losing the SEO battle is a lot like losing money on the stock market. The system you are fighting is incredibly efficient and will never in a trillion years give a single shit about your specific concerns. You can hire lawyers and spend time complaining about it all day on social media. But you'll rarely get a drop of blood out of this stone. The best you can do is to step back, reevaluate your understanding of the market, and adjust your strategy.
Piggybacking on the Claw hype, surprised when someone piggybacks on you...
And I'm losing the sanity battle for my own mind with all these AI-generated posts. Please, I beg you: two lines by your hand are worth 100,000 generated tokens.
We had a similar experience — looks like someone used AI to clone our site's design and structure at linuxtoaster.com. The real issue Gavriel is highlighting goes beyond SEO. The cost of creating a convincing copycat site just went to zero. Anyone can feed a successful page to an LLM and get a polished clone in minutes. And for open source projects it's even worse — they can clone your website AND clone your code, have an AI rebrand it, and ship a convincing-looking alternative overnight.
The link on GitHub to the real site is marked with rel="nofollow". I wonder if it would make sense for GitHub to remove nofollow in some circumstances. Perhaps based on some sort of reputation system or if the site links back to the repo with a <link rel="self" href="..." /> in the header? Presumably that would help the real site rank higher when the repo ranks highly.
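A minimal sketch of the reciprocal check this comment proposes, assuming a hypothetical `<link rel="self">` convention (GitHub has no such mechanism today; the page content, URLs, and function names below are all made up for illustration):

```python
from html.parser import HTMLParser

class RelSelfFinder(HTMLParser):
    """Collects href values of <link rel="self"> tags found in a page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "self" and "href" in a:
            self.hrefs.append(a["href"])

def site_claims_repo(page_html: str, repo_url: str) -> bool:
    """True if the page declares the given repo as its canonical project home."""
    finder = RelSelfFinder()
    finder.feed(page_html)
    return repo_url in finder.hrefs

# Hypothetical homepage for the real project site:
page = '''<html><head>
<link rel="self" href="https://github.com/gavrielc/nanoclaw" />
</head><body>NanoClaw</body></html>'''

print(site_claims_repo(page, "https://github.com/gavrielc/nanoclaw"))  # True
```

GitHub could run a check like this before dropping nofollow: the repo declares a homepage, and the homepage declares the repo back, so the claim is mutual rather than one-sided.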
> When you Google "NanoClaw," a fake website ranks #2 globally, right below the project's GitHub.
Unfortunately, the fake website [.net] is also #3 on Kagi, and #1 on Duckduckgo. On Kagi, the Github is #1 and nanoclaw.dev is #4, but only if you count "Interesting Finds". On Duckduckgo, the Github is #2 and nanoclaw.dev is nowhere to be found.
I've been developing and maintaining https://canine.sh and https://hellocsv.github.io/HelloCSV/ for some time now, and it's really odd what pops up when you google these.
Neither of these projects requires payment anywhere, but tons of sites pop up trying to "sell" them. I wouldn't even know what that means, and I'm kind of tempted to drop in a credit card to see what happens. Would they just auto-send you a link to the public repo?
Most of it is quite lazy and hasn't kept up with modern AI capabilities. They mostly just scrape the text I wrote and present it with some screenshots that I created. I can imagine a future where
- really nice landing pages are generated
- the product is entirely rebranded
- marketing is automated (linkedin, google ads, etc)
and someone develops an autonomous system that finds high-quality yet unknown open source projects, redeploys them, and sells them online for actual money.
> This isn't an SEO problem. This is a Google problem.
I've tested on a few of the big search engines, and nanoclaw.dev is never in the first page.
Gemini was also unable to find the .dev, even in "Research Mode." The only way I was able to get a direct link to nanoclaw.dev was with ChatGPT, which found it by scraping the GitHub repo (it also spat out links to a couple of other copies it found via Google).
Seems this is a wider SEO issue, one which infiltrates even the technology supposed to replace it.
Do what Louis Rossman did... just ask Google's AI what you need to change on your site... Apparently that's the secret now.
Before installing new software, I usually visit its GitHub page or Wikipedia entry first and click through to the official site from there. I just don't trust the 'official' sites that pop up in Google search results. How many of you do the same?
> I've done everything you're supposed to do and more.
By the sound of it, everything except reporting it? Winning at SEO just means appearing above them in search results, but the fake page shouldn't just lose the race; it should be taken down.
ICANN specifies how to deal with this kind of issue: https://www.icann.org/en/system/files/files/submitting-dns-a...
lol, this gets worse with AI search. If Google can't figure out the canonical source from a GitHub repo linking directly to the official site, LLMs definitely can't. And once an AI overview bakes the fake site into its knowledge graph, you're not just losing Google rankings imo, you're losing the models too. Registering every TLD on day 1 is now just table stakes for any OSS project, which still doesn't seem fair.
People forget that Google is a malware services company. A significant part of their revenue is fake OBS malware and the like.
Copycats are not a new problem. You can be completely open source and have a trademark on the project name.
> So I built a real website. That was two weeks ago.
Is Google supposed to have drastic updates to its index over 2 weeks?
I've been annoyed with Google search quality lately and was wondering how the others fared on this specific issue. Turns out, mostly not much better.
Bing, DuckDuckGo, Qwant, Ecosia, Brave all had the github repo and nanoclaw.net (the fake homepage) in the first or second place. Marginalia had fascinating results about biology but only tangentially related Nanoclaw results, not the github repo or either the fake or real homepage.
Mojeek was the exception, sort of. It had some random news sites up top, but the github repo in 2nd place and nanoclaw.dev (the real homepage) in the 4th place. The fake nanoclaw.net did not show.
Kagi is the only one I couldn't try because apparently I used up my free credits a year back. Can anyone see how they compare?
Is there an acronym for “AI generated, didn’t read”?
I don't see that Google cares much about backlinks any more. Seems like it's all about "content" keywords and maybe a little time-on-site. The domain is a huge signal, which is probably where the problem comes from here.
Sadly, Google's generally better against all the new AI-generated content farms than other players, so maybe they're still running PageRank somewhere.
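For what it's worth, the backlink effect described a few comments up (two high-authority links propping up the fake .net site) falls straight out of classic PageRank. Here's a toy power-iteration sketch; the page names, link graph, and damping factor are all made up for illustration and are obviously nothing like Google's real system:

```python
# Toy PageRank over a 5-page graph. Two high-authority outlets link to the
# fake site; the repo's homepage link is rel=nofollow, so in this model it
# contributes nothing to the official site.
links = {
    "register":     ["fake.net"],   # outlet links to the fake site
    "newstack":     ["fake.net"],   # another outlet, same mistake
    "fake.net":     ["github"],     # fake site links to the real repo
    "github":       [],             # homepage link is nofollow: no followable outlinks
    "official.dev": ["github"],     # real site links back to the repo
}
pages = sorted(set(links) | {t for ts in links.values() for t in ts})
n = len(pages)
d = 0.85  # damping factor
rank = {p: 1.0 / n for p in pages}

for _ in range(50):
    new = {p: (1 - d) / n for p in pages}
    # Spread rank from dangling pages (no outlinks) uniformly.
    dangling = sum(rank[p] for p in pages if not links.get(p))
    for p in pages:
        new[p] += d * dangling / n
    for src, targets in links.items():
        if targets:
            share = d * rank[src] / len(targets)
            for t in targets:
                new[t] += share
    rank = new

top = sorted(rank, key=rank.get, reverse=True)
print(top[:2])  # ['github', 'fake.net']
```

Even in this toy, the result mirrors the real SERP: repo first, fake site second, and the official site stuck at the no-inlinks baseline — which is why swapping the outlets' backlinks over to the real site matters more than any on-page tweak.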
SEO is broken at the moment. With Google Overviews just killing organic SEO, it is becoming less and less relevant, unfortunately.
I saw this some time ago with Bing and OpenCode:
"If I search for "opencode GitHub" in Bing, a random fork is returned"
https://news.ycombinator.com/item?id=46573286
Just an FYI, but I don't know if being in the website field on GitHub really helps, since there's a rel="nofollow" on the link.
Yeah, Google stopped even trying to usefully index most of the web around ‘08 or ‘09 or so. Was super obvious when it happened and it’s been that way ever since. Your GitHub is up there because it’s a blessed website, your personal site isn’t and will struggle mightily to rank even when you search exact, unusual phrases on it, if it’s like most of the rest of the Web on Google these days.
Get more traffic (make sure Google Analytics sees it; IDK, but that probably matters, because monopoly) and it might help.
Most of the other indices aren’t much better. Turns out fighting spam is expensive, easier to just do a combo of boosting really big sites and blessed spammers that use your ad network.
This project was launched very quickly, and may not have had a large budget for extra domains.
But for entities with a bit more time, you can prevent this scenario by acquiring the .com/.net variant domains before launching.
I'll be honest, I'd take this more seriously if this post didn't read like ChatGPT output. If you won't spend the effort to use your own words why should I stir myself to care?
Sorry, I'll put it in hand-crafted ChatGPTese:
## The Slop Problem
Every post sounds the same. No intelligence. No individuality. Just pure, clean LLM slop. Let's dive in.
- Every post has LLM tells. This is key.
- Posts get upvoted anyway. Nobody seems to notice or indeed care.
- People acclimate to the slop. This isn't just a coincidence. This is a real shift in standards. When people read enough of this, they begin to think it sounds normal.
## The Replying Dilemma
Should you engage with the content, when there is a real person involved? On the one hand, they put their name on it, and probably the details are drawn from their prompt, so it can be said to fairly represent what they wanted to say. So maybe ragging on their ChatGPT prose is being mean. On the other hand, if nobody ever mentions this, the acclimatization will only get worse as the rising tide of slop overwhelms any other style of writing.
## The "Snobbery is good actually" Option
Relentlessly bully people for their half-baked LLM copy. Make it your whole personality. Go insane.
## The "Giving Up" Solution
Learn to stop worrying and love the LLM.
> I don't want to be playing this game. I want to be writing code, building community, pushing features, fixing bugs.
Then just write code, build features, and fix bugs. Nobody is forcing you to fix search engines' problems. If you're not making money off of traffic, then why worry so much about SEO? Just do your thing. If it really bothers you, put a little note on your GitHub warning people about the fake site, and get on with your life.
Google is absolutely idiotic sometimes.
We (as in the team that helped fork and migrate the PoE1 wiki) set up a new domain for the Path of Exile 2 wiki, which is being hosted by the folks at Grinding Gear Games and linked on the official website and in multiple places on the highly trafficked subreddit.
Despite this, Google has decided that the site is not relevant and shouldn't appear anywhere in search results, despite the wiki for the first game appearing everywhere.
Wasn't one of the original ideas of NFTs to essentially identify the original creator?
Oof, this is exactly the nightmare scenario for “repo-first” OSS.
The weird bit isn’t that a scraper site exists, it’s that Google can’t do the obvious graph join: query == project name, #1 result is the repo, repo declares Homepage = X, yet Google still boosts an imposter domain. That’s not “SEO”, that’s the ranking system refusing to treat maintainer-declared canonical as a strong signal. Early domain squatters get to “set the default” purely by being first, then they can flip the content later once trust is baked in.
People keep saying “tell users to bookmark the real URL” like that scales. Most people will click the second link and assume it’s official. If Google can’t solve this class of problem, their “AI answers” are going to be a bigger mess than blue links ever were.
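The "graph join" described above can be sketched as a trivial re-ranking pass. This is purely a hypothetical heuristic with made-up URLs — real rankers weigh far more signals, and that's partly the point: the signal exists, it's just not being used:

```python
def rerank(results: list[str], repo_homepages: dict[str, str]) -> list[str]:
    """If the top result is a repo whose maintainer-declared homepage also
    appears in the results, promote that homepage to directly below the repo."""
    if not results:
        return results
    top = results[0]
    homepage = repo_homepages.get(top)  # homepage declared on the repo, if any
    if homepage and homepage in results[1:]:
        rest = [r for r in results[1:] if r != homepage]
        return [top, homepage] + rest
    return results

# Hypothetical SERP for the query "nanoclaw":
serp = [
    "github.com/gavrielc/nanoclaw",  # repo ranks first
    "nanoclaw.net",                  # imposter ranks second
    "nanoclaw.dev",                  # maintainer-declared homepage, buried
]
declared = {"github.com/gavrielc/nanoclaw": "nanoclaw.dev"}
print(rerank(serp, declared))
# ['github.com/gavrielc/nanoclaw', 'nanoclaw.dev', 'nanoclaw.net']
```

One join on a signal the engine already has, and the imposter drops below the canonical site.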
> I don't want to be playing this game. I want to be writing code
I assume the "I" here refers to Claude, who seemingly wrote the entire project AND the linked post.
This is a google problem, but only secondary.
The crux of the matter is that there's nothing that protects an open project besides reputation, and nowadays in the digital space it can be cheaply farmed.
Laws could help, but they only work when you take purposeful actions to be covered by them, like registering a trademark, and that's never cheap.
Imagine you're in a local band playing shows. It's three months old and has no released records. A second band, tighter with venues, takes your name and starts performing under your moniker. You have no money to take that to court, and good luck making a case. You can't do anything besides screaming on the web or, I don't know, kicking a few butts. You change your name.
- I think I was upset when Google allowed fake ad for VLC to appear high in ranking
- I hate that Google returns content farms instead of product web pages
- I hate that Google provides one page of ~10 useful links; everything after that is pure garbage. I think something in Google's engine is profoundly broken
- I maintain my own search index, but it requires a lot of effort, and attention. I do insert links if I find them worthy. I think more people should have their personal search indexes. Mine is below. I am quite happy that problems like these do not affect me that much
https://github.com/rumca-js/Internet-Places-Database
> This isn't an SEO problem. This is a Google problem.
Sorry, but this is an SEO problem. The fake site has probably been linked to by a number of high-SEO outlets. What you should do is contact them and ask them to fix the links (to point to your site), which they should be happy to do.
I think OrcaSlicer suffers from the same issue. Not really sure why some OSS projects struggle with this issue and others (Notepad++) don't.
A guy that stole someone else’s idea by making a shinier website getting mad that someone stole his idea by making a shinier website. Such is life.
> The person running nanoclaw[.]net can put anything they want on that page tomorrow. A crypto scam. A phishing page. Malicious download links. They could fork the GitHub repo, inject malicious code, and link to it from the site that Google is telling thousands of people is legitimate.
A lot of handwringing about hypotheticals. The page is up there because it links the official repo. Changing that will quickly tank its search rank.
The more things change the more they stay the same.
I noticed this a few years ago. Google has been ruining its search engine, deliberately so. I could explain the things Google did here, but other websites and videos already explain it, including the why (though there is some speculation as to why).
These days I even find e.g. Qwant sometimes having better results than Google Search. I see it as a positive thing, though: I can soon stop using Google Search, so that's one less Google product. One day I will be Google-free. It will be a happy day. I really think Google must cease to exist.
(The only sad thing is how bad the other search engines are. While Google Search sucks nowadays, I consistently get even worse results with e.g. DuckDuckGo. I think part of the reason is that the world wide web also sucks a LOT more compared to the old days. Google is partially responsible for that too, by the way, which just reinforces the idea that Google must die.)
Live by bots, die by bots.
It's simple really, .net > .dev.
Suddenly the pre-Google Yahoo model of curated links is starting to seem relevant again.
Curation in general is probably a skill that will become more and more in demand as the Internet fills up with AI slop.
Two weeks? Hardly enough time for the correct URL to take over: a URL with no history or presence that, as far as the engine is concerned, came out of nowhere. It will most likely happen, thanks to the links from the project etc., but it might take a bit of time since the other URL is established. "Losing the battle" now, perhaps, but most likely not for long.
DuckDuckGo actually shows nanoclaw.net as the first result and the GitHub page as second.
Another point: DDG's AI feature actually references nanoclaw.net as a source.
Damn, I booted up Orion (Kagi), and even Kagi shows nanoclaw.net as the third result after the GitHub page, alongside qwibitai and another GitHub page under your (previous?) GitHub username (gavrielc), which when clicked also leads to the same GitHub repo.
There is an "interesting finds" section in Kagi which references the website, but it still shows the nanoclaw.net page earlier, and the nanoclaw.dev entry is so inconspicuous that the first time through I didn't even notice it.
I honestly expected better from DDG/Kagi. I also tried Brave and it had the same issue; Brave even has its own independent index, and even that struggles with this.
Let's hope this gets patched quickly. It's also a good reminder to prefer opening GitHub links over websites: even as a tech-savvy person, I admit I could have fallen for the nanoclaw.net link, given that it's second in pretty much all search engines.
DMCA?
Another comment here but here are all the search engines I looked at:
1. DDG 2. Kagi 3. Brave 4. Ecosia 5. Startpage 6. Marginalia 7. Mojeek 8. Yandex.ru
Engines 1-5 all ranked .net before .dev, and DDG even ranked .net before the GitHub repo. Marginalia gave me neither .net, .dev, nor the GitHub link, just docker.com and some other tech articles.
Mojeek and Yandex.ru DID give me .dev links before .net at the time of writing.
I literally opened these two as a joke, especially Mojeek, not expecting much; I just happen to know the names of lots of search engines, so I tried them.
Mojeek and Yandex.ru have surprised me, although I think Yandex.ru might have surfaced the .dev because of https://nanoclaw.dev/ru/, as that's what it points to.
Mojeek seems interesting after this observation.
I also wanted to try Swisscows, but it looks like they've gone fully premium; I remember being able to search for free, but now a popup appears.
I also tried Baidu (a Chinese search engine). It gave results in Chinese; Firefox Translate stuttered and didn't work when I tried it, and since I don't know Chinese I pasted the results into Claude. It links to neither .net nor .dev, just Chinese sites.
With all of these observations, I think we do know one provider (Mojeek) that won. A lot of the engines on this list aren't actually independent, except Mojeek, Brave, and probably Yandex.ru.
So I guess the main takeaway is that independent search engines can be interesting. They can still be hit or miss, but the more independent search engines the merrier: some might miss, but some will also hit.
My comment definitely reads like a reputation bonus for Mojeek. Well, anything for more independent search engines, imo. I looked at their about page, and it seems they were started by a single person (Marc Smith). Fascinating stuff.
I know marginalia_nu is on HN, so maybe Marginalia and Mojeek can share some index data. Anyway, this was a fun, exciting experiment to do. I hope the community tries out other search engines I may have missed and shares insights if a particular engine gives interesting results.
Sorry Gavriel Cohen, but this Google search placement was promised to the other person thousands of years ago.
I fell for this yesterday, but for ZeroClaw, not NanoClaw. I found this website[1] through Brave Search, I think. I wasn't paying too much attention (I was under the influence); it points to the wrong repo[2], and its instructions install from that. I didn't like ZeroClaw anyway, so I tried to uninstall it, and only then realized I was on a forked repo.
[1] https://zeroclaw.net/ [2] https://github.com/openagen/zeroclaw
Gavriel is freaking out over nothing while making rookie mistakes and pretending not to be in an SEO war.
It's literally not his problem that some people click a scam link; he still has 18,000 GitHub stars. It's just a bifurcated audience of undiscerning people.
He's overly worried about a perfect, unanimous impression when he shouldn't be.
Now he's wasting his money on SEO tweaks and domain names while saying he only wants to code. Then focus on coding! Not buying obscure TLDs and vibecoding sitemaps while wondering what he did wrong.
Yeesh, some people can't handle a little fame.
It’s worse. I wrote about this a couple weeks ago [1]. With AI responses and Google pulling results from different sources, you could potentially hijack other brands with your own fake content (e.g., a phone number).
1: https://codeinput.com/blog/google-seo
>We trust Google to surface reliable information about elections. Vaccines. Medical conditions. Financial decisions. And they can't get this right?
Actually I don't trust Google and I don't expect it to surface reliable information. I expect it to surface information and I will dig through it and judge for myself whether it is reliable or not.
Steve Jobs famously never allowed free meals at Apple.
Humans are psychologically incapable of assigning respect to things that are free; across the board - not donating to open-source, maxing out every dollar of food stamps, refusing to pay a dollar for an app if it has a free tier, even companies like AWS ripping off open source without any qualms. If you got an offer for a free relationship no strings attached, would you take it seriously? If someone on a street corner has artwork for $5 or $500, it could be the same piece of art, but which one gets more attention on first glance?
If you want your work to be respected, do not make it open source. Your odds are slightly better at succeeding at acting. Remember that 97% of public GitHub repos have zero external users.