This happened to me and I found this tool super helpful to get my site unblocked: https://dnsblacklist.org/
I purchased a valuable premium domain to host a personal art collection (of anime cels). For some bizarre reason, the site was inaccessible from my work computer and it was de-listed from Google, even when I typed the URL itself into search.
I hired a Squarespace specialist to figure out why, to no avail. I then begged our company’s CISO to investigate, and it turns out we had some firewall setting on UniFi that blocked the domain because it appeared on a list. Once I checked way back, it turns out that it was an anime porn aggregator years back. I personally reached out to all the web filters out there (Google, Symantec, Bing) and one by one filed tickets for them to mark it as art instead of pornography, and it worked. I am now properly crawled on Google but still MIA on Bing; search console is giving me some BS error that’s incomprehensible, typical of MSFT.
I... actually remember that address floating around and it indeed was hentai.
We're talking like 20 years back. Holy shit, my brain is getting jostled by this sudden tsunami of forgotten memories.
EDIT: Digging around on Wayback Machine (obviously NSFW, for the curious), apparently it was actually still around until somewhere between 2018 and '19 when it finally died. The snapshots from around 2007 are peak Web 1.5 design with stuff like affiliate buttons and table layouts. Man I miss that era.
You have some awesome cels, thanks for sharing them online.
Had completely forgotten about Robot Carnival and neat to see you have a few pieces from some of the shorts (episodes?)
Also the resources->galleries was useful, found some new but actually old sites to check out.
I love RC and many of my wishlist items are from it. I regret I was relatively late into collecting it. Glad you appreciate the old galleries, many are internet relics which I love.
Yahoo Auctions is more popular over there and proxy services (I use Buyee) make it pretty simple to bid/buy and not too much more expensive if you wait for their (Buyee) coupons.
If you can set up your own domain why would you need someone that specializes in a super limited non technical frontend for customizing prebuilt web templates?
In hindsight I didn’t need him. I am pretty technical but I couldn’t figure out what happened so I hired some Squarespace SEO guy to make sure I had everything configured properly. It was the first and only time I heard of this happening.
Another "haunted domain" check is by trying to post about it on social media. I ran into this with my current project's domain name. After building an MVP and trying to test the social sharing functionality, I found that Facebook was blocking the domain outright. Turns out there was some spamming from it years ago. Getting it unblocked was extra fun, as the page to request manual review was itself broken! Thankfully I knew someone on the inside who alerted the relevant team, but the whole experience was quite the novel speedbump.
It's not that the smooth path you can get via nepotism is the base way things work which people who don't "know a guy" are excluded from. Rather, everything is falling apart and shitty, and if you're lucky, you occasionally get to circumvent that shittyness.
> It's not that the smooth path you can get via nepotism is the base way things work
Well, obviously it isn't if you're not in the 1%. If you're in the 1% then that's the way the world has always worked and you don't know anything differently.
Meritocracy is great and all, but there's a gap between having merit and others seeing the merit.
I don't believe that human society can, practically, get particularly close to the ideal. I question the choice of fatty meat as a substrate for minds.
For my money, I'd suggest that merit will get you further today than in the days of letters of recommendation, but that failures of meritocracy are more visible.
I would really like to see it fixed too, especially as regards these faceless behemoths which nevertheless worm themselves into dictating important parts of real peoples' real lives with absolute authority and no recourse
The fix is called "legal system", or rather, also making it accessible for individuals and small businesses against the large mega corporations without risking getting bankrupt in case of losing. And companies that continuously lose in judgements get fined progressively until they establish enough support infrastructure to not be a burden on society.
Small claims court often works, depending upon jurisdiction.
Where I am there is no forced disclosure, no costs assigned, and it is $150 to file.
And while a lawyer can represent a large firm, an employee has to be present, and the lawyer cannot use excessive legalese. The court is carried on in plain language, with the judge explaining things to you if you don't understand.
That's pretty accessible.
The biggest risk is not knowing about no required discovery, and costs. Lawyers for big corp will ask for things, and hope you work your tail off. I just say no.
They will also allude to how expensive this will be, to which I typically snort.
Said large companies typically spend 50k to 100k on lawyers, and I spend $150 and a dozen or two hours of my personal time.
Depends on the context. Forming a real human connection with someone who has proven they can be trusted is a feature. However, people oftentimes feel they are connected to others based on identity, and then treat those people favorably regardless of merit. The latter is such a major detriment to society that it needs to be actively countered by regulation (and is to some extent).
> Ideally, search engine algorithms would give new domain owners a fresh start.
Sadly, I think this would be instantly gamed by abusers. They would release the domain name and attempt to register as a new owner or start repeatedly doing handoffs. It's difficult to tell who the owner is changing between and whether or not the new one is a better actor than the former.
> It's difficult to tell who the owner is changing between and whether or not the new one is a better actor than the former.
This doesn't seem like that hard of a problem to solve, because these are domains with negative reputation, i.e. worse than zero.
So if a) the domain is no longer hosting any of the stuff previously complained about and b) is no longer receiving new complaints over a period of a year, it costs you nothing to reset the domain to zero. Because the bad actors don't have to behave for a year to get back to zero, they can just register a new domain.
All you're doing is giving the new owner the same fresh start that anybody can get by buying a never before registered domain for the same price as a year's renewal on the existing one.
Using a domain every second year in that environment would get it a gradually rising rank where it isn't penalized/sanitized (by accident, on principle, etc) so every restart after a $30 pause year would be much more effective than a new domain.
The search index knows when the first time it saw that old link was. If it was before the reset, regard it as pointing to a different domain than the current one.
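A minimal sketch of that rule, assuming the index keeps a first-seen date for every backlink. The `Backlink` type, the `reset_date` parameter and the example URLs are hypothetical names for illustration; nothing here reflects how any real search engine stores its data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backlink:
    source_url: str   # page that links to the domain
    first_seen: date  # when the index first crawled this link


def links_counting_for_current_owner(links, reset_date):
    """Keep only backlinks first seen on or after the reputation reset."""
    return [link for link in links if link.first_seen >= reset_date]


links = [
    Backlink("https://old-forum.example/thread/1", date(2009, 5, 1)),
    Backlink("https://new-blog.example/post", date(2024, 3, 10)),
]
current = links_counting_for_current_owner(links, reset_date=date(2023, 1, 1))
print([link.source_url for link in current])
```

Older links are not deleted in this sketch, they are just treated as pointing at the previous incarnation of the domain.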
Google can take various actions to put pressure but it ultimately doesn't control how the entire world treats archived text.
A google rank at zero and lots of 2 hop routes to your site that google can either penalize for being an accurate historical record or not is better than a rank of zero and a domain that has never been in historical artifacts.
The historical artifacts exist independently of the search ranking. Actual bad guys can get a new domain to get a clean slate without taking the old one down. The reason they care about the cost of domains is their domains get a bad reputation immediately and they have to cycle through far more than one domain a year.
If they were going to consistently use the same domain for links while they churn through hundreds/thousands a year for Google, the extra cost for one extra renewal for the persistent domain would be entirely negligible. And on top of that would make it trivial for Reddit/Facebook/etc. to disable all the historical links because they all go to the same scam site.
How about not even look for a new owner, and just... check the content and complaint levels? If I was hacked and hosted spam, getting blocked/banned for months at a time when... the spam is cleaned and the hole that allowed it is fixed ASAP... that gives folks less incentive to fix/clean/remediate.
Three assumptions that, from my read, are baked into your comment:
- Any empty domain starts with the same reputation
- Registering a new domain is a 0 cost action
- The eng effort to reset domain reputation is 0
Certain domains are used by abusers more often, usually due to them being cheaper. Forcing them to move domains is extra friction to the abusers which "haunted" domains force more than the proposed new system.
For the last point, I think it's simplifying a complex system change. Even if the new system was marginally better, it could be a large eng effort and not worth pursuing.
> Any empty domain starts with the same reputation
What basis would you have to do otherwise, and if there is something (like TLD), why wouldn't "resetting to zero" in terms of past content just mean resetting to that zero?
> Registering a new domain is a 0 cost action
No, that registering a new domain has a similar cost to renewing an existing domain, which is a valid assumption. In fact, the new domains are often cheaper because registrars often discount the initial registration as a loss leader with the expectation that people will make future renewals at a higher price.
> The eng effort to reset domain reputation is 0
It is the job of the party operating that system to make it operate as correctly as feasible. Needlessly causing collateral damage purely out of laziness and unaccountability is how you get people showing up at government offices demanding for you to be regulated or broken up, if not showing up at your offices with a disposition to cause bodily harm.
> Certain domains are used by abusers more often, usually due to them being cheaper.
Running out of domain names is physically impossible. There are more possible domain names in any given TLD than there are atoms in the observable universe. So the low price is going to be the price set by the registry for that TLD.
Whether the TLD itself has some reputation is orthogonal to the reputation of one domain in that TLD relative to another one in the same TLD. Moreover, you would presumably do the same thing for the TLD -- if one TLD is doing promotion and has $1 registrations this year and then gets used for a lot of scams, and then next year it costs $15 and so do the renewals so the scammers move to a different TLD, the reputation of the TLD should be reset just the same as the individual domains.
> Even if the new system was marginally better, it could be a large eng effort and not worth pursuing.
If the primary goal is to reduce engineering effort then the obvious solution is to delete the entire reputation system so it doesn't have to be maintained anymore. If the primary goal is to make it work well then you have to, well, you know.
> What basis would you have to do otherwise, and if there is something (like TLD), why wouldn't "resetting to zero" in terms of past content just mean resetting to that zero?
Fair enough, but I'm not sure it resolves "haunted" domains, as a TLD which is often abused could have a lower "0" reputation and thus by default is "haunted". Perhaps it lessens the impact, though by how much is quite opaque to us.
> Whether the TLD itself has some reputation is orthogonal to the reputation of one domain in that TLD relative to another one in the same TLD.
I think this depends on how reputation works and is not so clear. Registrars for these TLDs also have a responsibility but have no incentives to stop abusers. If TLD reputation is not orthogonal to the reputation of individual domains on that TLD, then that could be an incentive for them to also crack down on abuse, as their domains have bad SEO etc.
> If the primary goal is to reduce engineering effort then the obvious solution is to delete the entire reputation system so it doesn't have to be maintained anymore. If the primary goal is to make it work well then you have to, well, you know.
I think this is the most uncharitable interpretation. The eng effort could go to features that improve other customer experiences affecting more people.
Google product manager interview question - Write some code with an LLM tool that leverages an LLM to determine if the new owner of a domain is doing (a) same dodgy thing as prior owner that got flagged (b) different dodgy thing as prior owner but should be flagged (c) something completely innocuous (d) needs further review.
(Therefore, this has a one-way function of improving the status of haunted domains and why I think anything is better than nothing unless it blocks a better strategy.)
Follow up interview question. Update the code using your LLM code gen tool of choice that, when someone submits a complaint via an online form, feeds that complaint text back into your LLM to score it again. Points deduction if the candidate ever mentions informing the complainant of anything.
If it's instantly released, then yes. But in this thread are reports where the offensive actions happened 15 years ago. After such a long time of "good behavior" it makes no sense for me to still keep the domain blocked/downranked.
Honestly, these days, with domains in general being nearly free compared to the profit potential of a single successful spammer grift, I’m not sure I even see the point of blacklisting domains at all. 25 years ago maybe a spammer would be devastated that he had to “start all over and buy a new domain and build up its reputation.” Now, spammers launch and abandon what, a million new domains a day? Google or anyone spitefully holding onto hard feelings about what a domain “did” years ago is pointless because the spammers will move on anyway. They wouldn’t reuse abcqwertuiop26abc dot xyz anyway because it’s safer to make up a new gibberish domain anyway. Only people who acquire domains legitimately are hurt by this.
I would want to experiment judging them based on what they’ve been seen to do in the past month.
I'm imagining/advocating for blacklisting them for say, 12 months, and re-evaluating them at that point. This imposes the identical cost on the spammer as now (each "detection" costs them a year's domain registration) while allowing a reputation "reset" for innocent people who acquire haunted domains.
Yes, the spammers can sit on their domains once blacklisted, renew them, and redeploy their spam on them 12 months later, but they'd have nothing to gain from the reuse, since the names of their domains are just nonsense anyway.
I’m guessing that would complicate blacklist maintenance quite a bit, which is why we aren’t seeing it work that way.
Most of these blacklists (at least initially) were emergency type measures - ‘block these spammers’, then move on with life.
Blacklist maintainers would need to maintain date first seen/date last seen info, and purge/re-add correctly.
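For illustration, a rough sketch of what that bookkeeping could look like, assuming the 12-month window suggested earlier in the thread. `ExpiringBlocklist`, `Entry` and `QUARANTINE` are made-up names, and a real list would need persistence and an appeals path that this leaves out.

```python
from dataclasses import dataclass
from datetime import date, timedelta

QUARANTINE = timedelta(days=365)  # assumed 12-month window, not a standard


@dataclass
class Entry:
    first_seen: date
    last_seen: date


class ExpiringBlocklist:
    def __init__(self):
        self._entries = {}

    def report(self, domain, seen_on):
        """Record an abuse sighting; re-adds the domain if it was purged."""
        entry = self._entries.get(domain)
        if entry is None:
            self._entries[domain] = Entry(seen_on, seen_on)
        else:
            entry.last_seen = max(entry.last_seen, seen_on)

    def is_blocked(self, domain, today):
        """Blocked only while the last sighting is inside the window."""
        entry = self._entries.get(domain)
        return entry is not None and today - entry.last_seen <= QUARANTINE

    def purge(self, today):
        """Drop entries with no sightings inside the window; return them."""
        expired = [d for d, e in self._entries.items()
                   if today - e.last_seen > QUARANTINE]
        for domain in expired:
            del self._entries[domain]
        return expired


bl = ExpiringBlocklist()
bl.report("spam.example", date(2022, 1, 1))
print(bl.is_blocked("spam.example", date(2022, 6, 1)))  # inside the window
print(bl.is_blocked("spam.example", date(2023, 6, 1)))  # window has elapsed
```

The point of keeping `last_seen` is that a fresh report simply restarts the clock, so a domain that keeps misbehaving never ages out.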
Technically, seems like an ‘append only’ type thing is what they’ve been doing for the most part.
As this evolves, and it becomes more widely known that these lists need some kind of expiration or we end up with more maintenance headaches, maybe, eh?
Or if there are some kind of legal rules around it.
A tweak to that could be along the lines of "if the DNS lookup of the domain responds with NXDOMAIN for more than x days, give it a fresh start".
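A small sketch of how that check might work, assuming someone already probes the domain periodically and logs the DNS response code. The function name, the `(date, rcode)` log format and the threshold are all assumptions, and gaps between probes are treated as part of the streak, which a real system might not want.

```python
from datetime import date


def qualifies_for_fresh_start(observations, min_days):
    """observations: list of (date, rcode) pairs sorted oldest first.

    Returns True when the most recent unbroken run of NXDOMAIN answers
    spans more than min_days.
    """
    streak_start = None
    for day, rcode in observations:
        if rcode == "NXDOMAIN":
            if streak_start is None:
                streak_start = day
        else:
            streak_start = None  # any other answer breaks the run
    if streak_start is None:
        return False
    last_day = observations[-1][0]
    return (last_day - streak_start).days > min_days


log = [
    (date(2023, 1, 1), "NOERROR"),
    (date(2023, 2, 1), "NXDOMAIN"),
    (date(2023, 6, 1), "NXDOMAIN"),
]
print(qualifies_for_fresh_start(log, min_days=90))
```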
I'm not up to date with SEO so unsure whether Google would (or is able to) reset the domain's backlink profile, I'd guess it would be possible. A lot of the value of using expired domains is for backlinks (or at least was)
Require a deposit then, say 1000$, that is to be refunded after a year of probationary period. You get caught being a scammer/spammer, you lose the deposit.
Some time ago I noticed that my side project (with a domain that is not haunted) shows up fine on Google but not Bing/DuckDuckGo.
So I checked the Bing Webmaster Tools. URL Inspection says "Discovered but not crawled - The inspected URL is known to Bing but has some issues which are preventing indexation. We recommend you to follow Bing Webmaster Guidelines to increase your chances of indexation."
That's quite unhelpful. What's more, when I open the "Live URL" tab, it says, in green: "URL can be indexed by Bing."
It's a simple static Hugo site hosted on Cloudflare R2 (DNS mapped directly to bucket). https://pagespeed.web.dev gives it a score of 100 in every category.
Yup. I've regularly had problems with a static site [0]. Sometimes it's a top hit for my name on Bing, sometimes completely unlisted. Seems to flip back and forth - with that same message you get.
It's a handwritten HTML website, enhanced with JS but not reliant on it, hosted on Cloudflare. Not quite a 100 in every PageSpeed category, but just about.
OP here, and yes, I've been getting that same message for musicbox.fun. I thought it just needed some time but I requested a fresh index two weeks ago, and nothing seems to have changed. :/
A side effect of negative seo is that some stuff that hasn't worked on Google for a long time still does on Bing (They, Bing, obviously, not being the real target of the attack).
I've seen a few sites become de-indexed and the 'give away' is the type of results that first appear when the penalty is eventually lifted.
For example, just a dozen or so urls with really weird query strings that never existed before. The real stuff does come back after time though and, in my limited experience, it's a one-off incident.
Just to add, not many sites are insignificant enough not to attract negative seo - especially this type of low-level, zero cost malarkey.
Another variant of this is cached or preloaded security configurations.
HSTS (which tells browsers to connect only over HTTPS) asks browsers to cache the configuration for a set "max-age". Some sites set huge values here, like Twitter's 20 year max-age[1]. There are also the preload lists [2] to consider. This creates a problem if you want to serve non-HTTPS/unencrypted HTTP on your new domain and the previous owner didn't.
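To get a feel for how long a cached policy can linger, here is a minimal sketch that parses a `Strict-Transport-Security` header value and works out when a browser that last saw it would forget it. `parse_hsts` and `cached_until` are hypothetical helpers, the header value is a made-up example with a 20-year max-age, and preloaded entries are not covered because they don't expire this way.

```python
from datetime import datetime, timedelta, timezone


def parse_hsts(header_value):
    """Parse a Strict-Transport-Security header value into a dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for part in header_value.split(";"):
        directive = part.strip()
        if not directive:
            continue
        name, _, value = directive.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def cached_until(header_value, last_visit):
    """When a browser that saw this header at last_visit would drop it."""
    max_age = parse_hsts(header_value)["max_age"]
    if max_age is None:
        return None
    return last_visit + timedelta(seconds=max_age)


header = "max-age=630720000; includeSubDomains; preload"  # 20 * 365 days
last_visit = datetime(2020, 1, 1, tzinfo=timezone.utc)
print(cached_until(header, last_visit))
```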
MTA-STS [3] is another variant that's becoming more popular. It limits which mail servers your domain uses and enforces TLS certificate verification. "max_age" is capped to a year by the RFC. If you don't set your own policy, then the previous domain owner's policy would still apply for any senders who previously cached it.
Thankfully HPKP (key pinning) is obsolete, otherwise you'd also need to worry about old pinned keys too. That RFC recommended, but did not enforce, a 60 day max-age limit.
These are especially tricky as the old security policy only lives in the caches of any end-user devices that previously connected to the domain. Double haunted.
The worst part about HSTS is that the spec doesn't just define the interaction between the browser and the website but also goes as far as mandating that the browser restrict the options it provides to the user ... and would-be user agents actually go along with that.
FWIW, you can invalidate MTA-STS cache by updating the DNS assertion record to a different 'id' value. This is how you indicate a policy has changed.
So the sender is supposed to obey the normal DNS TTL caching period, and re-query the assertion record if TTL expired. It should re-fetch the MTA-STS policy if the 'id' value in the DNS assertion changed, or the max_age in the previously fetched policy has expired.
> RFC 8461 section 3.3: Conversely, if no "live" policy can be [...] fetched via HTTPS, but a valid (non-expired) policy exists in the sender's cache, the sender MUST apply that cached policy.
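The sender-side logic described above can be sketched roughly like this. `CachedPolicy`, `parse_txt_id` and `should_refetch` are illustrative names rather than part of any mail server, and the DNS lookup and HTTPS fetch themselves are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CachedPolicy:
    policy_id: str        # "id" from the _mta-sts TXT record at fetch time
    max_age: int          # seconds, from the fetched policy file
    fetched_at: datetime


def parse_txt_id(txt_record):
    """Extract the id from a record such as 'v=STSv1; id=20240101T000000'."""
    fields = dict(
        item.strip().split("=", 1)
        for item in txt_record.split(";")
        if "=" in item
    )
    if fields.get("v") != "STSv1":
        return None
    return fields.get("id")


def should_refetch(cached, txt_record, now):
    """True when the sender should fetch the policy file over HTTPS again."""
    if cached is None:
        return True
    expired = now >= cached.fetched_at + timedelta(seconds=cached.max_age)
    current_id = parse_txt_id(txt_record)
    id_changed = current_id is not None and current_id != cached.policy_id
    return expired or id_changed


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
cached = CachedPolicy("a1", 30 * 86400, datetime(2024, 5, 20, tzinfo=timezone.utc))
print(should_refetch(cached, "v=STSv1; id=a1", now))  # same id, not expired
print(should_refetch(cached, "v=STSv1; id=b2", now))  # id changed
```

Note that in this sketch a missing or empty TXT record does not trigger a refetch, so the old cached policy keeps applying until its max_age runs out, which is exactly the situation a new domain owner can inherit.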
A client of mine once swapped over to a new domain that was coincidentally one letter away from another major domain. It wasn't an attempt to typosquat or anything nefarious, but Chrome started automatically showing everyone a big scary warning page before entering the site. We looked into appealing it but there was no guarantee of it getting whitelisted in a timely manner, so we ended up canceling the domain migration before they lost too much traffic.
I wonder if it would be a reasonable requirement of registrars to not allow domains to be purchased if they are within some small edit distance of existing/active domains. It's fine if Google wants to protect its users, but ideally this would be caught sooner.
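As a rough idea of what such a check involves, here is a sketch using plain Levenshtein distance. The `lookalikes` helper, the threshold of 1 and the example names are assumptions; whatever Chrome or a registrar would actually use is not public in the thread and is likely more elaborate.

```python
def edit_distance(a, b):
    """Levenshtein distance using a single rolling row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def lookalikes(candidate, existing, max_distance=1):
    """Existing names that are close to, but not equal to, the candidate."""
    return [name for name in existing
            if 0 < edit_distance(candidate, name) <= max_distance]


registered = ["example.com", "other.net"]
print(lookalikes("examp1e.com", registered))
```

Even this toy version shows the trade-off: with short names almost everything is within one edit of something else.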
Look at the milka.fr problems... Milka is also a female name over here, and that already proved to be a problem in France. But so are Mirka and Minka so yeah... no domain for them? Also Micka. Oh and mivka is (beach) sand. Want to sell beach sand? It's just one letter away from milka, so no domain for you either.
Is it really better if Mirka, Minka and Micka get to pay for a domain but won't be able to use it because the dominant webbrowser shows super scary warnings?
Still seems better to raise the issue as early as possible so they can find a solution (appeal or choose a different domain) before investing into the unusable domain name. It would also mean that the dispute is at a layer (ICANN) where you at least theoretically have some rights instead of at the hands of a megacorporation that thinks the best way to reduce customer support costs is to make it impossible to get support.
This can also happen with IP addresses. We recently moved one of our sites to a new IP and got a trickle of complaints about it being inaccessible from various authoritarian countries. After some digging, the new IP was used as a Tor bridge (not even an exit node) over _ten years ago_. I gave up any hope of fixing that and just ordered a different IP address.
In their defense (and I don’t defend Google often), addressing this really well means:
- knowing all the complexities of every local, state, federal, international jurisdiction that might interfere with the whitelist
- awareness of the content in question which could be millions of subpages
- a customer support team that is definitely not incentivized based on tickets triaged per day, but is somehow incentivized to spend hours on “whale” tickets.
- going through ticket history and solving the problem for everyone now that it's policy to solve this
- dealing with the inevitable rush of fraud that follows every tiny change in google systems
If it was easy to reset reputation with search engines what's stopping people from saying "under new management" every once in a while for an existing poor reputation domain? Probably better to just cut their losses and find another domain.
I've had an opposite experience. One domain I bought was used for an entirely different purpose in the past, which got linked on a Wikipedia article in references. This gives me some good link juice and at least matches the geo area of the previous business. Since it's an extremely niche entry and low on the list of references, I decided to be slightly naughty and not touch it for a couple of years. Not sure what's the opposite of haunted in this case, but it was just as surprising.
I have a lot of sites (all SaaS) and more and more people send me cease and desists and lawyer threats because they go to Google, enter 'something' that's remotely phonetically similar to a domain I run and then click on my site. They paid on some site that sounds a LITTLE bit (if you squint) like my domain and now they have been scammed and want to sue me. Now I understand scammers do this as well, but I actually had someone turn up at our office (which is my business partner's home) with bank receipts with a really not so similar name. However, if you type it in Google we pop up first even though our businesses are not at all related.
"Ideally, search engine algorithms would give new domain owners a fresh start."
I don't think it's possible to fix this problem without also helping bad actors. Maybe it's a problem that just isn't worth fixing. Just don't buy preexisting domains unless it's a project big enough to justify the necessary cost of due diligence.
Then help them. If a few bad actors is the price of a free internet, so be it. I'd rather deal with those than have a whitelisted internet where you need permission to start a website.
The really bad actors just buy and discard new domains daily and silly blacklisting techniques are powerless to prevent that. I don’t think they renew and come back to try to use their domains years later.
This happens with physical addresses too, for similar reasons. The ABC (Alcoholic Beverages Commission) tracks complaints against physical addresses, and too many violations will get an address banned from permits. Then a new owner comes in with a new business and gets mysteriously denied for a liquor license, even years later.
How can a business function without a name? So much tax paperwork requires a name. Is it just a sole proprietor that files everything under the owner's name?
> It wasn’t until I had redirected all of my musicboxfun.com traffic to musicbox.fun that I noticed that something wasn’t right: my web traffic from organic search dropped to zero.
Some practical advice here: do not change your canonical domain[1] name unless you really really have to.
If he had just set his fun new domain to redirect to the existing domain, instead of making the new domain the canonical, it likely would have had no negative effect.
I’m not saying this is how things should work. But the practical reality is that your domain name is like a Social Security number: it’s the basis for assigning a type of reputation score, even though it was not intended to do that originally.
[1] The domain at which your web pages finally load, after all redirects have completed.
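To make the footnote concrete, a tiny sketch that walks a redirect table and reports where a URL finally lands. The `redirects` dict stands in for real HTTP responses and `canonical_host` is a made-up helper; the example shows the "new redirects to old" setup suggested above.

```python
from urllib.parse import urlsplit


def canonical_host(start_url, redirects, max_hops=10):
    """Follow a redirect table until a URL with no further redirect."""
    url = start_url
    seen = set()
    for _ in range(max_hops):
        if url in seen:
            raise ValueError("redirect loop at " + url)
        seen.add(url)
        if url not in redirects:
            return urlsplit(url).hostname
        url = redirects[url]
    raise ValueError("too many redirects")


# Vanity domain points at the established one, which stays canonical.
redirects = {"https://musicbox.fun/": "https://musicboxfun.com/"}
print(canonical_host("https://musicbox.fun/", redirects))
```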
Basic SEO stuff, you have marketplaces that check history, you have domain search engines aggregating data from multiple sources - not only ahrefs.
Checking the web archive is a basic operation to test if a site was hosting anything fishy, not only pirated stuff or porn. Often websites have been hacked and changed into link farms, or were simply bought on the aftermarket to use their SEO value to pass strength to other domains.
Interesting. Domain as a unit of trust makes sense until it doesn't. Buying a second hand domain is like a second hand car. But you may not know it is second hand!
I think the mistake here is the redirect from old to new. That is always risky so only do it if desperate. In this case I would have done the redirect from new to old. Then just use the new as a vanity url.
one other thing i would suggest is to set up a catch-all email for the domain and see what gets sent to it, sometimes you can access accounts associated with the domain, socials etc
I set up a catch-all for personal use and wasn't expecting to get flooded with emails.
I was getting business emails, people trying to send money by Zelle, etc.
I was kind of hoping to get something good that I could take action on in the market, so I left it on for a little bit, but then I felt bad that people's emails were not getting answered (at least bouncing), so I turned off the catch-all. Oh well.
This sort of thing is also an issue for phone numbers, some other company could have used your new number for robocalls and gotten it spam blocked on Truecaller and similar services.
Years ago I bought the carelessly discarded domain of a defence contractor that was acquired by another one. And set up a catch all email forwarder. Had weeks of fun reading all the emails that I got sent. There was nothing "secret" but plenty of social and business stuff still going on.
One risk of pre-validating a domain before purchase is that it's not a good idea to tell strangers about your interest in such a property.
Even automated queries are likely to spill the beans. Someone else could snag the purchase before you, or bid up the price. But it's a risk you may need to calculate.
Conversely, when you drop a domain, don’t forget you might have accounts tied to its email addresses, or DNS verification set up in some services, that you had better explicitly discontinue before just dropping the domain.
My very first domain was haunted. The warning sign was firewall blocks against the domain at both school and the public library. As it turned out... a previous owner in the early 2000's was running a sort of proto-Netflix, but with VHS instead of DVD, and that was exclusively targeting the... erm... "adult entertainment" market.
Wayback machine would've saved me there, had I done my due diligence!
Not quite haunted but I've had people report that my website hosted on a .quest domain is blocked on their work computer. My best guess is that their filter thinks it's gaming related (it's not) or maybe they just block all "weird" domains.
The individual IPs may not all have too bad a reputation, but you don't control who shares the block with you and don't have any control over new neighbors, and that is enough for some aggressive organizations (Microsoft) to block you.
Find out the IP address of the machine hosting the domain, then do a reverse lookup on that IP address. It might show the last domain hosted on that IP address.
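A quick sketch of that lookup with Python's standard `socket` module. `reverse_lookup` is a hypothetical wrapper; the two resolver parameters exist only so it can be exercised without network access. Keep in mind the PTR record is whatever the operator of the IP block configured, so it may or may not name a previous tenant.

```python
import socket


def reverse_lookup(hostname,
                   resolve=socket.gethostbyname,
                   reverse=socket.gethostbyaddr):
    """Return (ip, ptr_name) for a hostname; ptr_name is None if no PTR."""
    ip = resolve(hostname)
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        ptr_name, _aliases, _addresses = reverse(ip)
    except (socket.herror, socket.gaierror):
        ptr_name = None
    return ip, ptr_name


# With network access: ip, ptr = reverse_lookup("example.com")
```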
Calling a domain “haunted” is an awful, terrible way to frame it. It places all the badness of the domain on the domain itself, as if the domain name had something wrong with it which could be removed or fixed by the domain owner. Instead, what has actually happened is that the domain is blacklisted by entirely too powerful entities. The problem lies with these blacklisting entities, not with the domain, and the solution must be done there, too. It should not be a domain owner’s responsibility to get out of being unfairly blacklisted.
It’s like when cars took over the streets, and instead of blaming cars for being dangerous for regular people using the streets for walking, the concept of “jaywalking” was invented by car companies to place the blame on people for daring to obstruct cars. Or the concept of “personal carbon footprint”, commonly used to move blame from companies to individuals, when in reality whatever individuals, even in aggregate, could do is utterly insignificant compared to what companies and legislation could accomplish.
> what has actually happened is that the domain is blacklisted by entirely too powerful entities. The problem lies with these blacklisting entities, not with the domain, and the solution must be done there, too. It should not be a domain owner’s responsibility to get out of being unfairly blacklisted.
These kinds of blacklists exist because these domains have been used to host scams or distribute spam (or some other malicious activity) in the past. They're there to protect people (e.g. so that Firefox can display a "warning: this site is a scam") and reduce abuse. They're not just there so people at Google can get a good kick out of blacklisting random domains.
I'm guessing here because I'm not the author, but I believe this statement is directed towards the blocklisting entities because they don't provide transparency or a method to reach them to resolve issues with a domain once it's acquired by someone else. That absolutely is the issue of those entities.
At one point of time when I had to deal with people submitting phishing links to a web service I owned, I learned some of the tricks that phishers use to get around reports, such as using IP geolocation or the accept-language and accept-encoding header to determine if the phishing page should be served.
With tricks like this, it's not a surprise to see why the companies operating blocklists are hesitant to make this process easy; after all, what's to prevent the phishers from temporarily stating that the issue has been resolved to get out of the denylist, and then restarting their campaign again?
If the process required you to verify ID, e.g. a passport + video selfie, some accountability might be possible. But that might be too invasive for many folks.
This doesn't work because there's a nearly unlimited supply of people willing (out of desperation, drug addiction, or just plain poor decision making) to let bad actors use their IDs.
I really disagree with pulling the power dynamic angle into focus here. Injustice can also be carried out by the "little man", sometimes even at scale, and is every bit as awful to remedy if not even more so.
The issue is with the issue: people/systems (big and small) blacklisting an ownable identifier pointing to some ownable content without any care for the lifecycle of either.
Painting this with a social brush is extremely unhelpful and is guaranteed to derail conversations for no benefit whatsoever.
> The issue is with the issue: people/systems (big and small) blacklisting an ownable identifier pointing to some ownable content without any care for the lifecycle of either.
Does the lifecycle matter much, though?
Kind of like a carfax report. Tells you whether a vehicle you’re buying has been in an accident before (if it has, the value goes down because maybe there’s some latent issue that isn’t obvious at the time of purchase)
It would be nice if ICANN had some equivalent of a carfax for domains, perhaps even with a requirement that registrars expose at time of purchase whether a domain has been misused in the past (and who the prior owners were, or at the very minimum what the historical DNS records were).
Basically you want to avoid buying a “lemon” domain by accident.
I place zero fault/blame on “powerful entities” maintaining lists of domains used for spam/scams. How else will we protect grandma?
A carfax report lists issues with the actual car. You don’t want a car with “car exploded” in the carfax report, since this would translate to actual damage in the car, damage which could actually affect you if you were to drive the car.
On the other hand, a domain reputation at Google et al. is more like Carfax reporting “This car was once parked at the same street where a horrific mass murder took place.” If this was a problem since, let’s assume for the sake of argument, the police would pull you over all the time if you drove it, it would still not be a problem with the actual car; the problem would be the police, and fixing police behavior would be the only workable solution. Using Carfax as an analogy still places the blame on the domain owner, not on Google et al.
But in this scenario there are many more parties involved than just "the police". So you can't "just fix the police behavior" for a "solution". You'd have to "fix" any and every party that already exists or pops up in the future.
This kind of issue is inherent to any system where identifiers are recycled, particularly when that recycling happens on demand. It's not "fixable", at best it's combatable. And trying to language police away the symptom and blaming it all on the pivotal participants supports and achieves neither.
The analogy is not perfect, but there aren’t myriads of parties, there’s basically only Google, plus a handful of others of greatly decreasing importance.
If it was a reputation problem where, say, end clients with web browsers would each have a separate and uniquely derived negative opinion about domain names, this would indeed be a “bad reputation” problem and not a Google problem, since the problem could not be fixed at the Google side. But with domain reputation being so centralized, the problem is at the center.
How could it not? It's essentially the same issue as an unmaintained phonebook or a map. What's at a given address or phone number changes, and if your solution is not equipped to handle that change, your solution is bad.
But that’s not a fixable problem in my eyes. At least not without extreme and sweeping changes driven by some kind of government regulation or ICANN mandates which, if enacted, would probably be highly criticized on HN.
There are just too many block lists for domains (literally thousands if you include open source ad blockers).
The lifecycle “should” matter in a perfect world, I agree.
Oh I don't think it's full-on fixable either. What I wanted to challenge was just the characterization of the issue itself.
As you say there are plenty of volunteer maintained blocklists as well, and there are also the countless privately deployed filters using those lists, which may or may not get updated properly. That's the "little man" part, and is why I think the characterization the thread starter was trying to push is ill-fitting.
Who says it's the fault of the domain in some abstract sense? A house becomes haunted when something bad happens in it. It's not the fault of the rafters and joists. I think "haunted" is an apt description.
“Haunted” still implies that the problem exists at the house/domain, and can be fixed there. But a domain being blacklisted is not something which a domain owner can fix by themselves, they have to beg the blacklister to de-list them.
You'd usually describe a house as haunted if something bad has happened in the past (e.g. a murder, evil spirits, etc) and people are superstitious about this (e.g. believe some ghosts are still living in the house). Hard to see how an owner can fix this. All the usual problems the owner can fix (floorboards need replacing, gutters need cleaning, general repairs) aren't really examples of a house being "haunted".
In a perfect world, when your legitimately good content isn’t being surfaced by Google, it’s a failure on their part, and their problem to solve, not yours. In practice, it is your problem and you have to do a bunch of work to help them see that their current assessment of your domain name is no longer accurate.
You're right, the fault lies with the search engines, but in practice it sure feels like the domain itself is tainted somehow.
Yes, but they are on some blacklist somewhere. One could say greylisted. The point is that whatever term describes the issue shouldn't be mystical.
Haunted implies a supernatural condition that just isn't helpful in system administration.
If something isn't working with a service there is always a method to troubleshoot and isolate the issue. Contact the appropriate people when needed.
This is how NeoTokyo restored his "listed" domain.
This happened to me and I found this tool super helpful to get my site unblocked: https://dnsblacklist.org/
I purchased a valuable premium domain to host a personal art collection (of anime cels). For some bizarre reason, the site was inaccessible from my work computer and it was de-listed from Google even if I typed the url itself into search.
I hired a square space specialist to figure out why, to no avail. I then begged our company’s CISO to investigate and it turns out we had some firewall setting on UniFi that blocked the domain because it appeared on a list. Once I checked way back, it turns out that it was an anime porn aggregator years back. I personally reached out to all the web filters out there (Google, Symantec, Bing) and one by one filed tickets for them to mark it as art instead of pornography and it worked. I am now properly crawled on Google but still MIA on Bing, search console is giving me some BS error that’s incomprehensible, typical of MSFT.
I'd be somewhat interested in seeing the cels. :)
https://www.neotokyo.com
I have a +100 cel backlog that I need to catalog and photograph. Was planning to do it this holiday season so check back in.
I... actually remember that address floating around and it indeed was hentai.
We're talking like 20 years back. Holy shit, my brain is getting jostled by this sudden tsunami of forgotten memories.
EDIT: Digging around on Wayback Machine (obviously NSFW, for the curious), apparently it was actually still around until somewhere between 2018 and '19 when it finally died. The snapshots from around 2007 are peak Web 1.5 design with stuff like affiliate buttons and table layouts. Man I miss that era.
It is also blocked by the UK ISP porn filter.
Does that still exist? I got a decent ISP (Zen) so they don't block anything.
Ah great! Such nostalgia for that site, they had the -best- porn back in the old days, one of my favorite pron sites.
You have some awesome cels, thanks for sharing them online. Had completely forgotten about Robot Carnival and neat to see you have a few pieces from some of the shorts (episodes?)
Also the resources->galleries was useful, found some new but actually old sites to check out.
I love RC and many of my wishlist items are from it. I regret I was relatively late into collecting it. Glad you appreciate the old galleries, many are internet relics which I love.
Great domain name! I can see why you went through the effort of contacting all the web filters.
Did you get anything from the Heritage auction last week? They had a ton of good stuff.
I watched closely and bid on a few but didn’t pull trigger. I am eyeing a few private pieces and saving my budget.
Where does one buy cels, apart from eBay?
There's actually a page on their site under resources for that: https://www.neotokyo.com/anime-cel-dealers-and-resources
Yahoo Auctions is more popular over there and proxy services (I use Buyee) make it pretty simple bid/buy and not too much more expensive if you wait for their (Buyee) coupons.
It’s wild how these past associations can stick and haunt a domain, even after it’s changed hands entirely.
> I hired a square space specialist
I had no idea such a thing existed.
If you can set up your own domain why would you need someone that specializes in a super limited non technical frontend for customizing prebuilt web templates?
In hindsight I didn’t need him. I am pretty technical but I couldn’t figure out what happened so I hired some squarespace seo guy to make sure I had everything configured properly. It was the first and only time I heard of this happening.
Another "haunted domain" check is by trying to post about it on social media. I ran into this with my current project's domain name. After building an MVP and trying to test the social sharing functionality, I found that Facebook was blocking the domain outright. Turns out there was some spamming from it years ago. Getting it unblocked was extra fun, as the page to request manual review was itself broken! Thankfully I knew someone on the inside who alerted the relevant team, but the whole experience was quite the novel speedbump.
I faced the same issue with one of my projects. But as I don't know anybody at Facebook, I abandoned the domain and bought a new one.
So much of the world is still based on who you know. This is a bug in our society I would really, really like to see fixed in my lifetime.
Reframe:
It's not that the smooth path you can get via nepotism is the base way things work which people who don't "know a guy" are excluded from. Rather, everything is falling apart and shitty, and if you're lucky, you occasionally get to circumvent that shittyness.
> It's not that the smooth path you can get via nepotism is the base way things work
Well, obviously it isn't if you're not in the 1%. If you're in the 1% then that's the way the world has always worked and you don't know anything differently.
Meritocracy is great and all, but there's a gap between having merit and others seeing the merit.
I don't believe that human society can, practically, get particularly close to the ideal. I question the choice of fatty meat as a substrate for minds.
For my money, I'd suggest that merit will get you further today than in the days of letters of recommendation, but that failures of meritocracy are more visible.
I think with AI it is going to become the opposite. You only trust who you know in real life and ignore everything else.
Huh? Weird. I only trust the AI and ignore everyone in real life. (/s for the humor impaired)
I would really like to see it fixed too, especially as regards these faceless behemoths which nevertheless worm themselves into dictating important parts of real peoples' real lives with absolute authority and no recourse
Sadly, the most likely "fix" would be to remove the "who you know" path and just make things shit for everyone. :(
But would that not introduce pressure for the official paths to become better oiled and working better than before?
The fix is called "legal system", or rather, making it accessible for individuals and small businesses against the large mega corporations without risking going bankrupt in case of losing. And companies that continuously lose in judgements get fined progressively until they establish enough support infrastructure to not be a burden on society.
Small claims court often works, depending upon jurisdiction.
Where I am there is no forced disclosure, no costs assigned, and it is $150 to file.
And while a lawyer can represent a large firm, an employee has to be present, and the lawyer cannot use excessive legalese. The court is carried on in plain language, with the judge explaining things to you if you don't understand.
That's pretty accessible.
The biggest risk is not knowing that there is no required discovery and no costs assigned. Lawyers for big corp will ask for things, and hope you work your tail off. I just say no.
They will also allude to how expensive this will be, to which I typically snort.
Said large companies typically spend 50k to 100k on lawyers, and I spend $150 and a dozen or two hours of my personal time.
All very amusing.
Anyhow, a good equalizer.
Is this a bug? I think this is a built in feature since version 1.0.
Depends on the context. Forming a real human connection with someone who has proven they can be trusted is a feature. However, people oftentimes feel they are connected to others based on identity, and then treat those people favorably regardless of merit. The latter is such a major detriment to society that it needs to be actively countered by regulation (and is to some extent).
Social media platforms can be some of the biggest canaries in the coal mine when it comes to a domain’s “haunted” reputation
I had that one happen as well, after launching a project. I couldn't even post it in messages to friends.
I have a fairly boring consulting business, blocked by Twitter for being malware. Fortunately FB / LinkedIn / WhatsApp all work.
> Ideally, search engine algorithms would give new domain owners a fresh start.
Sadly, I think this would be instantly gamed by abusers. They would release the domain name and attempt to register as a new owner or start repeatedly doing handoffs. It's difficult to tell who the owner is changing between and whether or not the new one is a better actor than the former.
> It's difficult to tell who the owner is changing between and whether or not the new one is a better actor than the former.
This doesn't seem like that hard of a problem to solve, because these are domains with negative reputation, i.e. worse than zero.
So if a) the domain is no longer hosting any of the stuff previously complained about and b) is no longer receiving new complaints over a period of a year, it costs you nothing to reset the domain to zero. Because the bad actors don't have to behave for a year to get back to zero, they can just register a new domain.
All you're doing is giving the new owner the same fresh start that anybody can get by buying a never before registered domain for the same price as a year's renewal on the existing one.
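To make the reset rule above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration (the `DomainRecord` fields, the 365-day `QUIET_PERIOD`, and the idea that a reputation is a single number); no real blocklist is known to work this way.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

QUIET_PERIOD = timedelta(days=365)  # assumed "a year without complaints" window


@dataclass
class DomainRecord:
    score: float                       # below zero means "worse than a new domain"
    last_complaint: Optional[date]     # most recent abuse report, if any
    still_hosts_flagged_content: bool  # is the previously reported content still up?


def effective_score(record: DomainRecord, today: date) -> float:
    """Reset a negative score to zero once conditions (a) and (b) both hold."""
    if record.score >= 0:
        return record.score  # neutral or positive scores are left alone
    if record.still_hosts_flagged_content:
        return record.score  # condition (a) not met
    if record.last_complaint is not None and today - record.last_complaint < QUIET_PERIOD:
        return record.score  # condition (b) not met
    return 0.0
```

The reset only ever moves a score up to zero, never above it, which is the point of the argument: it hands out nothing that a freshly registered domain would not already have.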
Using a domain every second year in that environment would get it a gradually rising rank where it isn't penalized/sanitized (by accident, on principle, etc) so every restart after a $30 pause year would be much more effective than a new domain.
It gets reset every year so how would it be more effective?
A system gets reset, what happens in obscure places like old HN content?
The search index knows when the first time it saw that old link was. If it was before the reset, regard it as pointing to a different domain than the current one.
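A rough sketch of that filter, assuming the index stores a first-seen date per link. The `Backlink` type and `links_for_current_owner` name are made up for illustration and do not describe how any real search index stores links.

```python
from datetime import date
from typing import List, NamedTuple


class Backlink(NamedTuple):
    source_url: str
    first_seen: date  # when the crawler first saw this link (assumed to be recorded)


def links_for_current_owner(backlinks: List[Backlink], reset_date: date) -> List[Backlink]:
    """Keep only links first seen on or after the reputation reset."""
    return [link for link in backlinks if link.first_seen >= reset_date]
```

Links older than the reset are simply attributed to the previous incarnation of the domain, so they neither help nor hurt the new owner.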
Google can take various actions to put pressure but it ultimately doesn't control how the entire world treats archived text.
A google rank at zero and lots of 2 hop routes to your site that google can either penalize for being an accurate historical record or not is better than a rank of zero and a domain that has never been in historical artifacts.
The historical artifacts exist independently of the search ranking. Actual bad guys can get a new domain to get a clean slate without taking the old one down. The reason they care about the cost of domains is their domains get a bad reputation immediately and they have to cycle through far more than one domain a year.
If they were going to consistently use the same domain for links while they churn through hundreds/thousands a year for Google, the extra cost for one extra renewal for the persistent domain would be entirely negligible. And on top of that would make it trivial for Reddit/Facebook/etc. to disable all the historical links because they all go to the same scam site.
How about not even look for a new owner, and just... check the content and complaint levels? If I was hacked and hosted spam, getting blocked/banned for months at a time when... the spam is cleaned and the hole that allowed it is fixed ASAP... that gives folks less incentive to fix/clean/remediate.
3 assumptions that from my read are baked into your comment.
- Any empty domain starts with the same reputation
- Registering a new domain is a 0 cost action
- The eng effort to reset domain reputation is 0
Certain domains are used by abusers more often, usually due to them being cheaper. Forcing them to move domains is extra friction to the abusers which "haunted" domains force more than the proposed new system.
For the last point, I think it's simplifying a complex system change. Even if the new system was marginally better, it could be a large eng effort and not worth pursuing.
edit: styling
> Any empty domain starts with the same reputation
What basis would you have to do otherwise, and if there is something (like TLD), why wouldn't "resetting to zero" in terms of past content just mean resetting to that zero?
> Registering a new domain is a 0 cost action
No, that registering a new domain has a similar cost to renewing an existing domain, which is a valid assumption. In fact, the new domains are often cheaper because registrars often discount the initial registration as a loss leader with the expectation that people will make future renewals at a higher price.
> The eng effort to reset domain reputation is 0
It is the job of the party operating that system to make it operate as correctly as feasible. Needlessly causing collateral damage purely out of laziness and unaccountability is how you get people showing up at government offices demanding for you to be regulated or broken up, if not showing up at your offices with a disposition to cause bodily harm.
> Certain domains are used by abusers more often, usually due to them being cheaper.
Running out of domain names is physically impossible. There are more possible domain names in any given TLD than there are atoms in the observable universe. So the low price is going to be the price set by the registry for that TLD.
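For anyone who wants to check that arithmetic, a quick sketch. It assumes the usual hostname rules for a single label (1 to 63 characters from a-z, 0-9 and hyphen, no hyphen at either end), ignores extra restrictions such as reserved hyphen positions, and uses the commonly cited rough estimate of 10^80 atoms.

```python
ALNUM = 36           # a-z plus 0-9, allowed in any position
LDH = 37             # letters, digits, hyphen: allowed in interior positions
MAX_LABEL_LEN = 63   # maximum length of a single DNS label
ATOMS_ESTIMATE = 10 ** 80  # rough, commonly cited order of magnitude


def count_labels(max_len: int = MAX_LABEL_LEN) -> int:
    """Count possible labels of length 1..max_len with no leading/trailing hyphen."""
    total = ALNUM  # length-1 labels
    for length in range(2, max_len + 1):
        # first and last characters are alphanumeric, the middle can include hyphens
        total += ALNUM * LDH ** (length - 2) * ALNUM
    return total


total = count_labels()
print(total > ATOMS_ESTIMATE)  # True
print(len(str(total)))         # number of decimal digits in the count
```

The count is dominated by the longest labels, so the comparison holds by many orders of magnitude even if the atom estimate is off.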
Whether the TLD itself has some reputation is orthogonal to the reputation of one domain in that TLD relative to another one in the same TLD. Moreover, you would presumably do the same thing for the TLD -- if one TLD is doing promotion and has $1 registrations this year and then gets used for a lot of scams, and then next year it costs $15 and so do the renewals so the scammers move to a different TLD, the reputation of the TLD should be reset just the same as the individual domains.
> Even if the new system was marginally better, it could be a large eng effort and not worth pursuing.
If the primary goal is to reduce engineering effort then the obvious solution is to delete the entire reputation system so it doesn't have to be maintained anymore. If the primary goal is to make it work well then you have to, well, you know.
> What basis would you have to do otherwise, and if there is something (like TLD), why wouldn't "resetting to zero" in terms of past content just mean resetting to that zero?
Fair enough, but I'm not sure it resolves "haunted" domains, as a TLD which is often abused could have a lower "0" reputation and thus by default is "haunted". Perhaps it lessens the impact, though by how much is quite opaque to us.
> Whether the TLD itself has some reputation is orthogonal to the reputation of one domain in that TLD relative to another one in the same TLD.
I think this depends on how reputation works and is not so clear. Registrars for these TLDs also have a responsibility but have no incentives to stop abusers. If TLD reputation is not orthogonal to the reputation of individual domains on that TLD, then that could be an incentive for them to also crack down on abuse, as their domains have bad SEO etc.
> If the primary goal is to reduce engineering effort then the obvious solution is to delete the entire reputation system so it doesn't have to be maintained anymore. If the primary goal is to make it work well then you have to, well, you know.
I think this is the most uncharitable interpretation. The eng effort could go to features that improve other customer experiences affecting more people.
Google product manager interview question - Write some code with an LLM tool that leverages an LLM to determine if the new owner of a domain is doing (a) same dodgy thing as prior owner that got flagged (b) different dodgy thing as prior owner but should be flagged (c) something completely innocuous (d) needs further review.
Please don't give Google ideas for more ways they can have an algorithm arbitrarily screw you over with no recourse, they're listening.
Well, current approach guarantees you’re getting screwed over. Any improvement is beneficial unless it blocks a better approach?
You're looking at this from the perspective of a haunted domain owner. And from that perspective your idea is fine.
A good technique to evaluate ideas though is to try and view it from different perspectives.
In this case from the owner of a non-haunted domain. Can you see any potential problem with your idea when viewed from that perspective?
Now, if there are potential problems, consider the relative sizes of the two groups. Do the benefits to one outweigh harm to the other?
This technique can be used every day with pretty much any idea.
The parent's rules seemed to indicate only reevaluating the status of a haunted domain. I see nothing about evaluating a normal domain.
(Therefore, this has a one-way function of improving the status of haunted domains and why I think anything is better than nothing unless it blocks a better strategy.)
Follow up interview question. Update the code using your LLM code gen tool of choice that, when someone submits a complaint via an online form, feeds that complaint text back into your LLM to score it again. Points deduction if the candidate ever mentions informing the complainant of anything.
Why would they care?
If it's instantly released, then yes. But in this thread are reports where the offensive actions happened 15 years ago. After such a long time of "good behavior" it makes no sense for me to still keep the domain blocked/downranked.
Honestly, these days, with domains in general being nearly free compared to the profit potential of a single successful spammer grift, I’m not sure I even see the point of blacklisting domains at all. 25 years ago maybe a spammer would be devastated that he had to “start all over and buy a new domain and build up its reputation.” Now, spammers launch and abandon what, a million new domains a day? Google or anyone spitefully holding onto hard feelings about what a domain “did” years ago is pointless because the spammers will move on anyway. They wouldn’t reuse abcqwertuiop26abc dot xyz anyway because it’s safer to make up a new gibberish domain anyway. Only people who acquire domains legitimately are hurt by this.
I would want to experiment judging them based on what they’ve been seen to do in the past month.
The only reason they go to those new domains is because of the blacklist.
If you remove the blacklist, they’d just stop doing that and it would be even easier for them.
I'm imagining/advocating for blacklisting them for say, 12 months, and re-evaluating them at that point. This imposes the identical cost on the spammer as now (each "detection" costs them a year's domain registration) while allowing a reputation "reset" for innocent people who acquire haunted domains.
Yes, the spammers can sit on their domains once blacklisted, renew them, and redeploy their spam on them 12 months later, but they'd have nothing to gain from the reuse, since the names of their domains are just nonsense anyway.
Fair point.
I’m guessing that would complicate blacklist maintenance quite a bit, which is why we aren’t seeing it work that way.
Most of these blacklists (at least initially) were emergency type measures - ‘block these spammers’, then move on with life.
Blacklist maintainers would need to maintain date first seen/date last seen info, and purge/re-add correctly.
Technically, seems like an ‘append only’ type thing is what they’ve been doing for the most part.
As this evolves and the idea that these do need some kind of expiration or we end up with more maintenance headaches becomes more widely known, maybe eh?
Or if there is some kind of legal rules around it.
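A minimal sketch of what "first seen / last seen with purge and re-add" could look like, as opposed to an append-only list. The class, method names, and the 365-day `LISTING_TTL` are all assumptions for illustration and are not taken from any real blocklist.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List

LISTING_TTL = timedelta(days=365)  # assumed 12-month listing period


@dataclass
class Listing:
    first_seen: date
    last_seen: date


class ExpiringBlocklist:
    def __init__(self) -> None:
        self._entries: Dict[str, Listing] = {}

    def report(self, domain: str, seen_on: date) -> None:
        """Add a domain, or refresh last_seen if it is already listed."""
        entry = self._entries.get(domain)
        if entry is None:
            self._entries[domain] = Listing(seen_on, seen_on)
        else:
            entry.last_seen = max(entry.last_seen, seen_on)

    def is_listed(self, domain: str, today: date) -> bool:
        """A listing counts only while the last sighting is within the TTL."""
        entry = self._entries.get(domain)
        return entry is not None and today - entry.last_seen < LISTING_TTL

    def purge(self, today: date) -> List[str]:
        """Drop expired entries and return the domains that were removed."""
        expired = [d for d, e in self._entries.items() if today - e.last_seen >= LISTING_TTL]
        for domain in expired:
            del self._entries[domain]
        return expired
```

A fresh report after a purge simply re-adds the domain with a new `first_seen`, which is the "re-add correctly" part; the extra bookkeeping is small, but it is bookkeeping that an append-only list never needed.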
A tweak to that could be along the lines of "if the DNS lookup of the domain responds with NXDOMAIN for more than x days, give it a fresh start".
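As a sketch, assuming someone already records one DNS response code per domain per check (the observation log format and the function name are hypothetical):

```python
from datetime import date, timedelta
from typing import List, Tuple


def qualifies_for_fresh_start(observations: List[Tuple[date, str]], min_days: int) -> bool:
    """Return True if the log contains an unbroken NXDOMAIN streak spanning more than min_days.

    observations: (day, rcode) pairs sorted by day, e.g. (date(2024, 1, 1), "NXDOMAIN").
    """
    streak_start = None
    for day, rcode in observations:
        if rcode == "NXDOMAIN":
            if streak_start is None:
                streak_start = day
            if (day - streak_start).days > min_days:
                return True
        else:
            streak_start = None  # any other answer breaks the streak
    return False
```

Note that this only looks at the recorded checks, so gaps in the log are treated as if the domain stayed unregistered in between; a real system would need to decide how often it has to look.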
I'm not up to date with SEO so unsure whether Google would (or is able to) reset the domain's backlink profile, I'd guess it would be possible. A lot of the value of using expired domains is for backlinks (or at least was)
Require a deposit then, say 1000$, that is to be refunded after a year of probationary period. You get caught being a scammer/spammer, you lose the deposit.
The deposit would be either too high for normal people to pay, or too low to matter to bad actors
Given that spammers cycle through thousands of domains, they'd run into serious cash flow issues very soon.
Who holds the deposit, and what is to stop them from having someone report your domain as a spammer so they can keep your money?
Sadly, the same holds true for IP addresses.
Some time ago I noticed that my side project (with a domain that is not haunted) shows up fine on Google but not Bing/DuckDuckGo.
So I checked the Bing Webmaster Tools. URL Inspection says "Discovered but not crawled - The inspected URL is known to Bing but has some issues which are preventing indexation. We recommend you to follow Bing Webmaster Guidelines to increase your chances of indexation."
That's quite unhelpful. What's more, when I open the "Live URL" tab, it says, in green: "URL can be indexed by Bing."
It's a simple static Hugo site hosted on Cloudflare R2 (DNS mapped directly to bucket). https://pagespeed.web.dev gives it a score of 100 in every category.
Anyone else had something like this happen?
Yup. I've regularly had problems with a static site [0]. Sometimes it's a top hit for my name on Bing, sometimes completely unlisted. Seems to flip back and forth - with that same message you get.
It's a handwritten HTML website, enhanced with JS but not reliant on it, hosted on Cloudflare. Not quite a 100 in every PageSpeed category, but just about.
[0] https://jamesmilne.org/
OP here, and yes, I've been getting that same message for musicbox.fun. I thought it just needed some time but I requested a fresh index two weeks ago, and nothing seems to have changed. :/
A side effect of negative seo is that some stuff that hasn't worked on Google for a long time still does on Bing (They, Bing, obviously, not being the real target of the attack).
I've seen a few sites become de-indexed and the 'give away' is the type of results that first appear when the penalty is eventually lifted. For example, just a dozen or so urls with really weird query strings that never existed before. The real stuff does come back after time though and, in my limited experience, it's a one-off incident.
Just to add, not many sites are insignificant enough not to attract negative seo - especially this type of low-level, zero cost malarkey.
Another variant of this is cached or preloaded security configurations.
HSTS (which forces browsers to use HTTPS when connecting) asks browsers to cache the configuration for a set "max-age". Some sites set huge values here, like Twitter's 20 year max-age[1]. There's also the preload lists [2] to consider. This creates a problem if you want to serve non-HTTPS/unencrypted HTTP on your new domain and the previous owner didn't.
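To see how long a previous owner's policy could linger in a browser cache, here is a small sketch that parses a `Strict-Transport-Security` header value and computes when a cached copy would expire. The function names are made up, and the parsing is deliberately simplified compared with what browsers actually do.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_hsts(header_value: str) -> dict:
    """Parse the max-age, includeSubDomains and preload directives."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def cached_until(last_seen: datetime, header_value: str) -> Optional[datetime]:
    """When a browser that last saw this header would forget the policy."""
    max_age = parse_hsts(header_value)["max_age"]
    if not max_age:  # missing, or max-age=0 which tells the browser to drop the policy
        return None
    return last_seen + timedelta(seconds=max_age)


seen = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(cached_until(seen, "max-age=31536000; includeSubDomains"))
```

The expiry is per visitor, counted from that visitor's last connection, which is why a new owner cannot tell from the server side when the last cached copy disappears.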
MTA-STS [3] is another variant that's becoming more popular. It limits which mail servers your domain uses and enforces TLS certificate verification. "max_age" is capped to a year by the RFC. If you don't set your own policy, then the previous domain owners policy would impact any senders who previously cached the policy.
Thankfully HPKP (key pinning) is obsolete, otherwise you'd also need to worry about old pinned keys too. That RFC recommended, but did not enforce, a 60 day max-age limit.
These are especially tricky as the old security policy only lives in the caches of any end-user devices that previously connected to the domain. Double haunted.
[1] https://alexsci.com/blog/hsts-adoption/
[2] https://hstspreload.org/
[3] https://alexsci.com/blog/smtp-downgrade-attacks-and-mta-sts/
The worst part about HSTS is that the spec doesn't just define the interaction between the browser and the website but also goes as far as mandating that the browser restricts the options it provides to the user ... and would-be user agents actually go along with that.
FWIW, you can invalidate MTA-STS cache by updating the DNS assertion record to a different 'id' value. This is how you indicate a policy has changed.
So the sender is supposed to obey the normal DNS TTL caching period, and re-query the assertion record if TTL expired. It should re-fetch the MTA-STS policy if the 'id' value in the DNS assertion changed, or the max_age in the previously fetched policy has expired.
Almost, it's a little more involved.
> RFC 8461 section 3.3: Conversely, if no "live" policy can be [...] fetched via HTTPS, but a valid (non-expired) policy exists in the sender's cache, the sender MUST apply that cached policy.
You'll also need to host a "none" policy doc. Full instructions are here: https://www.rfc-editor.org/rfc/rfc8461.html#section-8.3
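Putting the last few comments together, a sender's decision logic might look roughly like the sketch below. The function names and the cache representation are assumptions; only the `v=STSv1; id=...` TXT record format and the "apply the non-expired cached policy if the fetch fails" rule come from RFC 8461.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_sts_id(txt_record: str) -> Optional[str]:
    """Extract the id from a _mta-sts TXT record such as 'v=STSv1; id=20240101000000Z;'."""
    fields = {}
    for part in txt_record.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    if fields.get("v") != "STSv1":
        return None
    return fields.get("id")


def should_refetch(cached_id, cached_expiry, txt_record, now) -> bool:
    """Refetch when nothing is cached, the cache expired, or the DNS id changed."""
    if cached_id is None or now >= cached_expiry:
        return True
    current_id = parse_sts_id(txt_record) if txt_record else None
    return current_id is not None and current_id != cached_id


def policy_to_apply(fetched_policy, cached_policy, cached_expiry, now):
    """Prefer a freshly fetched policy; otherwise fall back to a non-expired cached one."""
    if fetched_policy is not None:
        return fetched_policy
    if cached_policy is not None and now < cached_expiry:
        return cached_policy  # the MUST from section 3.3 quoted above
    return None
```

The fallback branch is the "haunted" part: if the new owner publishes nothing, senders that cached the old policy keep applying it until it expires, which is why publishing a new id together with a "none" policy is the way out.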
A client of mine once swapped over to a new domain that was coincidentally one letter away from another major domain. It wasn't an attempt to typosquat or anything nefarious, but Chrome started automatically showing everyone a big scary warning page before entering the site. We looked into appealing it but there was no guarantee of it getting whitelisted in a timely manner, so we ended up canceling the domain migration before they lost too much traffic.
I wonder if it would be a reasonable requirement of registrars to not allow domains to be purchased if they are some edit distance away from existing/active domains. It's fine if Google wants to protect its users, but ideally this would be caught sooner.
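For a sense of what such a check would involve, here is a minimal sketch using plain Levenshtein distance. The `too_similar` helper and its threshold of 1 are assumptions; how Chrome or any registrar actually measures lookalike domains is not something this thread establishes.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def too_similar(candidate: str, existing: Iterable[str], max_distance: int = 1) -> List[str]:
    """Return existing names within max_distance edits of the candidate."""
    return [
        name for name in existing
        if name != candidate and edit_distance(candidate, name) <= max_distance
    ]


print(too_similar("examp1e", ["example", "sample", "unrelated"]))
```

Even this toy version hints at the problem with making it a hard rule: short names sit within one edit of a great many other short names, so the threshold and the definition of "active" would do most of the work.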
That would be a pain...
Look at the milka.fr problems... Milka is also a female name over here, and that already proved to be a problem in France. But so are Mirka and Minka so yeah... no domain for them? Also Micka. Oh and mivka is (beach) sand. Want to sell beach sand? It's just one letter away from milka, so no domain for you either.
Is it really better if Mirka, Minka and Micka get to pay for a domain but won't be able to use it because the dominant webbrowser shows super scary warnings?
Still seems better to raise the issue as early as possible so they can find a solution (appeal or choose a different domain) before investing into the unusable domain name. It would also mean that the dispute is at a layer (ICANN) where you at least theoretically have some rights instead of at the hands of a megacorporation that thinks the best way to reduce customer support costs is to make it impossible to get support.
Defining “active” seems like the tricky part
This can also happen with IP addresses. We recently moved one of our sites to a new IP and got a trickle of complaints about it being inaccessible from various authoritarian countries. After some digging, the new IP was used as a Tor bridge (not even an exit node) over _ten years ago_. I gave up any hope of fixing that and just ordered a different IP address.
Not always the easiest thing to do. A haunted domain could have been haunted 15 years ago. And Google refuses to tell you why or fix their system.
Just one more place where the web gets screwed by a company too big to have to do basic customer service.
In their defense (and I don’t defend Google often), addressing this really well means:
- knowing all the complexities of every local, state, federal, international jurisdiction that might interfere with the whitelist
- awareness of the content in question which could be millions of subpages
- a customer support team that is definitely not incentivized based on tickets triaged per day, but is somehow incentivized to spend hours on “whale” tickets.
- going through ticket history and solving the problem for everyone now that it's policy to solve this
- dealing with the inevitable rush of fraud that follows every tiny change in Google systems
The usual version of this is the popular SEO technique of buying an aged domain with a few backlinks and slapping a wordpress on it.
If it was easy to reset reputation with search engines what's stopping people from saying "under new management" every once in a while for an existing poor reputation domain? Probably better to just cut their losses and find another domain.
I've had an opposite experience. One domain I bought was used for an entirely different purpose in the past, which got linked on a Wikipedia article in references. This gives me some good link juice and at least matches the geo area of the previous business. Since it's an extremely niche entry and low on the list of references, I decided to be slightly naughty and not touch it for a couple of years. Not sure what's the opposite of haunted in this case, but it was just as surprising.
Enchanted?
I have a lot of sites (all SaaS) and more and more people send me cease and desists and lawyer threats because they go to Google, enter 'something' that's remotely phonetically similar to a domain I run and then click on my site. They paid on some site that sounds a LITTLE bit (if you squint) like my domain and now they are scammed and want to sue me. Now I understand scammers do this as well, but I actually had someone turn up at our office (which is my business partner's home) with bank receipts with a really not so similar name; however, if you type it into Google we pop up first even though our businesses are not at all related.
"Ideally, search engine algorithms would give new domain owners a fresh start."
I don't think it's possible to fix this problem without also helping bad actors. Maybe it's a problem that just isn't worth fixing. Just don't buy preexisting domains unless it's a project big enough to justify the necessary cost of due diligence.
Then help them. If a few bad actors is the price of a free internet, so be it. I'd rather deal with those than have a whitelisted internet where you need permission to start a website.
"Maybe it's a problem that just isn't worth fixing."
There is a finite amount of short, memorisable names.
But also an ever-increasing number of TLDs under which to register them.
But only .com actually matters.
The really bad actors just buy and discard new domains daily and silly blacklisting techniques are powerless to prevent that. I don’t think they renew and come back to try to use their domains years later.
This happens with physical addresses too, for similar reasons. The ABC (Alcoholic Beverages Commission) tracks complaints against physical addresses, and too many violations will get an address banned from permits. Then a new owner comes in with a new business and gets mysteriously denied for a liquor license, even years later.
It is customary to revoke the right of a business to name itself if there were too many violations.
If you've ever gone to a nightclub or bar which has no name, only its street address number, that's what has happened there.
How can a business function without a name? So much tax paperwork requires a name. Is it just a sole proprietor that files everything under the owner's name?
It has a name, but that name cannot be different from the address, like "The 1415 Club" on 1415 Main St.
Sounds like a very stupid custom
> It wasn’t until I had redirected all of my musicboxfun.com traffic to musicbox.fun that I noticed that something wasn’t right: my web traffic from organic search dropped to zero.
Some practical advice here: do not change your canonical domain[1] name unless you really really have to.
If he had just set his fun new domain to redirect to the existing domain, instead of making the new domain the canonical, it likely would have had no negative effect.
I’m not saying this is how things should work. But the practical reality is that your domain name is like a Social Security number: it’s the basis for assigning a type of reputation score, even though it was not intended to do that originally.
[1] The domain at which your web pages finally load, after all redirects have completed.
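A toy illustration of the footnote, in case the direction of the redirect is confusing. The redirect table is hypothetical and hand-written; nothing here touches the network. The only point is that the canonical domain is wherever the chain ends.

```python
from typing import Mapping


def canonical_host(start: str, redirects: Mapping[str, str], max_hops: int = 10) -> str:
    """Follow host-level redirects until one no longer redirects.

    redirects maps a host to the host it redirects to (illustrative data only).
    """
    host = start
    seen = {host}
    for _ in range(max_hops):
        nxt = redirects.get(host)
        if nxt is None:
            return host  # no further redirect: this is the canonical host
        if nxt in seen:
            raise ValueError(f"redirect loop at {nxt}")
        seen.add(nxt)
        host = nxt
    raise ValueError("too many redirects")


# The safer setup suggested above: the new vanity name points at the old domain.
safe = {"musicbox.fun": "musicboxfun.com"}
print(canonical_host("musicbox.fun", safe))

# What the article's author did: the old domain points at the new one.
risky = {"musicboxfun.com": "musicbox.fun"}
print(canonical_host("musicboxfun.com", risky))
```

In the first setup the reputation stays attached to the domain that earned it; in the second, everything now hangs off whatever history the new name carries.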
Basic SEO stuff, you have marketplaces that check history, you have domain search engines aggregating data from multiple sources - not only ahrefs.
Checking the web archive is a basic operation to test whether a site was hosting anything fishy, and not only pirated stuff or porn: often websites have been hacked and turned into link farms, or were bought on the aftermarket simply to use their SEO value to pass strength to other domains.
Anyways good point regarding email filters.
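If you want to script the web archive check instead of clicking through the calendar, something like this sketch should work against the Wayback Machine's CDX endpoint. The parameter choices (`matchType=domain`, `collapse=timestamp:4` to thin results to roughly one capture per URL per year) are my assumptions about what is useful here, and the helper names are made up. You still have to open the snapshots and look at them yourself.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain: str, from_year: int, to_year: int) -> str:
    """Build a CDX query listing captures of a domain between two years."""
    params = {
        "url": domain,
        "matchType": "domain",       # include subdomains
        "output": "json",
        "from": str(from_year),
        "to": str(to_year),
        "collapse": "timestamp:4",   # collapse adjacent captures from the same year
        "fl": "timestamp,original,statuscode",
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows: List[List[str]]) -> List[Dict[str, str]]:
    """The JSON output is a list of rows whose first row is the header."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def snapshot_url(capture: Dict[str, str]) -> str:
    """Playback URL for one capture, for manual inspection in a browser."""
    return f"https://web.archive.org/web/{capture['timestamp']}/{capture['original']}"


def fetch_captures(domain: str, from_year: int, to_year: int,
                   timeout: float = 30.0) -> List[Dict[str, str]]:
    """Network call; not exercised here."""
    with urlopen(cdx_query_url(domain, from_year, to_year), timeout=timeout) as resp:
        return parse_cdx_rows(json.load(resp))
```

Something like `for c in fetch_captures("example.com", 2005, 2020): print(snapshot_url(c))` gives you a short list of links to skim before buying.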
For running a mail server every new domain is haunted.
Not really an issue on the same scale because it resolves itself once the domain registration has aged a bit. IP reputation is stickier.
And cloud server IP…
Interesting. Domain as a unit of trust makes sense until it doesn't. Buying a second hand domain is like a second hand car. But you may not know it is second hand!
I think the mistake here is the redirect old to new. That is always risky so only do it if desperate. In this case I would have done the redirect from new to old. Then just use the new as a vanity url.
> Buying a second hand domain is like a second hand car.
I have never heard of anyone being denied business because their car has a bad reputation from a previous owner.
one other thing i would suggest is to set up a catch-all email for the domain and see what gets sent to it, sometimes you can access accounts associated with the domain, socials etc
I have an interesting 3-letter.net
I set up a catch-all for personal use and wasn't expecting to get flooded with emails.
I was getting business emails, people trying to send money by Zelle, etc.
I was kind of hoping to get something good that I could take action on in the market, so I left it on for a little bit, but then I felt bad that people's emails were not getting answered (at least bouncing), so I turned off the catch-all. Oh well.
I do that and get the occasional account signup. I also ban addresses that get sent spam, which happens more than the account signups.
This sort of thing is also an issue for phone numbers, some other company could have used your new number for robocalls and gotten it spam blocked on Truecaller and similar services.
Years ago I bought the carelessly discarded domain of a defence contractor that was acquired by another one. And set up a catch all email forwarder. Had weeks of fun reading all the emails that I got sent. There was nothing "secret" but plenty of social and business stuff still going on.
One risk of pre-validating a domain before purchase is that it's not a good idea to tell strangers about your interest in such a property.
Even automated queries are likely to spill the beans. Someone else could snag the purchase before you, or bid up the price. But it's a risk you may need to calculate.
I wonder if there’s a market for rehabilitating domain names
*exorcizing domain names
IP addresses can be haunted too, like if they were previously used for spamming.
Conversely, when you drop a domain, don't forget that you might have accounts tied to its email addresses, or DNS verification records set up with some services, which you'd better explicitly discontinue before just dropping the domain.
My very first domain was haunted. The warning sign was firewall blocks against the domain at both school and the public library. As it turned out... a previous owner in the early 2000's was running a sort of proto-Netflix, but with VHS instead of DVD, and that was exclusively targeting the... erm... "adult entertainment" market.
Wayback machine would've saved me there, had I done my due diligence!
Automattic.com was bought (no idea if it was unregistered / acquired) by Matt Mullenweg when he set up the company. He also bought https://a8c.com.
Here in the UK with EE/BT that correctly redirects to automattic.com, but it might not for you depending on your ISP.
The wayback machine shows adult content links prior to the domain being put on sale, hence the blocking.
see also landslide.com - a domain that should never have been reused imo
I've had this with anti-virus flagging domains and VirusTotal was helpful: https://virustotal.com
But it does require manually reporting false positives to each vendor
Not quite haunted but I've had people report that my website hosted on a .quest domain is blocked on their work computer. My best guess is that their filter thinks it's gaming related (it's not) or maybe they just block all "weird" domains.
unfortunately, blocking newer TLDs altogether seems common
I’ll add: and if you lease a VPS, check out its address reputation and reverse DNS record.
Isn't it pretty safe to just assume that any IP addresses belonging to public clouds, especially cheap ones, have bad reputations?
The individual IPs may not all have too bad a reputation, but you don't control who shares the block with you and don't have any control over new neighbors, and that is enough for some aggressive organizations (Microsoft) to block you.
How?
I'm not the person you were replying to, but in the past, I've just used an IP reputation checking website, such as:
https://www.apivoid.com/tools/ip-reputation-check/
Website unusable: Captcha forever waits using latest Firefox on latest iPhone13/iOS 18.0
Find out the IP address of the machine hosting the domain, then do a reverse lookup on that IP address. It might show the last domain hosted on that IP address.
Using dig:
$> dig +short yourdomain.tld
1.2.3.4
$> dig +short -x 1.2.3.4
evilcorp.com.
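The same two steps can be scripted if you have a list of addresses to check. This is a minimal sketch using only the Python standard library; `ptr_name` and `reverse_lookup` are names I made up. Keep in mind that the PTR record is whatever the owner of the IP block configured, so it may show a stale hostname, a generic provider name, or nothing at all.

```python
import ipaddress
import socket
from typing import Optional


def ptr_name(ip: str) -> str:
    """Name that a reverse lookup (dig -x) actually queries for this address."""
    return ipaddress.ip_address(ip).reverse_pointer


def forward_lookup(domain: str) -> Optional[str]:
    """First IPv4 address for the domain, or None if it does not resolve."""
    try:
        return socket.gethostbyname(domain)
    except socket.gaierror:
        return None


def reverse_lookup(ip: str) -> Optional[str]:
    """Hostname from the PTR record, or None if there is no usable record."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


if __name__ == "__main__":
    ip = forward_lookup("yourdomain.tld")  # placeholder domain from the comment above
    if ip is not None:
        print(ip, ptr_name(ip), reverse_lookup(ip))
```

`ptr_name("1.2.3.4")` gives `4.3.2.1.in-addr.arpa`, which is the record name the reverse query asks for under the hood.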
> search engines treat links to your site as a massive signal of relevance and trust
I am admittedly a bit distant from SEO. The above is not true and hasn't been true for a long time.
Haunted is a weird way to call them, these are stigmatized domains.
Stigmatised would be when it commonly/publicly has a bad rep.
That’s pretty much what happened to those domains.
No, those domains are completely fine, they are just marked as untrustworthy on some obscure google list.
That’s a contradictory statement.
No. There's no general stigma. It's just the one list.
A risk that’s easy to overlook until it bites you
I feel like this should be the registrar's responsibility. Least they could do is give a disclaimer and/or a heavy discount.
The domain could also have been used to run spam email campaigns, meaning that it is blacklisted by email servers
sounds like the makings of a business service
Also be careful connecting new domains to cloudflare. It has a habit of adding old info from presumably a previous owner.
Managed to get a takedown notice thanks to that idiotic "feature" while not even aware the domain is serving anything
Please drop me an email with what you’re seeing - justin (at) cloudflare.com ?
That doesn’t sound like old info - that sounds like someone might still be reporting it for abuse even after the domain changed owners.
Especially on an .io TLD; it's haunted by the lovely US taking advantage of Chagossian exploitation.
that is amazing
Yet another valuable use for the WayBack Machine, glad it got a mention.
Calling a domain “haunted” is an awful, terrible way to frame it. It places all the badness of the domain on the domain itself, as if the domain name had something with it which could be removed or fixed by the domain owner. Instead, what has actually happened is that the domain is blacklisted by entirely too powerful entities. The problem lies with these blacklisting entities, not with the domain, and the solution must be done there, too. It should not be a domain owner’s responsibility to get out of being unfairly blacklisted.
It’s like when cars took over the streets, and instead of blaming cars for being dangerous for regular people using the streets for walking, the concept of “jaywalking” was invented by car companies to place the blame on people for daring to obstruct cars. Or the concept of “personal carbon footprint”, commonly used to move blame from companies to individuals, when in reality whatever individuals, even in aggregate, could do is utterly insignificant compared to what companies and legislation could accomplish.
> what has actually happened is that the domain is blacklisted by entirely too powerful entities. The problem lies with these blacklisting entities, not with the domain, and the solution must be done there, too. It should not be a domain owner’s responsibility to get out of being unfairly blacklisted.
These kinds of blacklists exist because these domains have been used to host scams or distribute spam (or some other malicious activity) in the past. They're there to protect people (e.g. so that Firefox can display a "warning: this site is a scam") and reduce abuse. They're not just there so people at Google can get a good kick out of blacklisting random domains.
I'm guessing here because I'm not the author, but I believe this statement is directed at the blocklisting entities because they don't provide transparency or a way to reach them to resolve issues with a domain once it's acquired by someone else. That absolutely is the issue of those entities.
At one point of time when I had to deal with people submitting phishing links to a web service I owned, I learned some of the tricks that phishers use to get around reports, such as using IP geolocation or the accept-language and accept-encoding header to determine if the phishing page should be served.
With tricks like this, it's not a surprise to see why the companies operating blocklists are hesitant to make this process easy; after all, what's to prevent the phishers from temporarily stating that the issue has been resolved to get out of the denylist, and then restarting their campaign again?
If the process required you to verify ID, e.g. a passport + video selfie, some accountability might be possible. But that might be too invasive for many folks.
This doesn't work because there's a nearly unlimited supply of people willing (out of desperation, drug addiction, or just plain poor decision making) to let bad actors use their IDs.
Also, all that info has been leaked a billion times now, and there are tools to allow real-time filter/overlays of faces to make it even easier.
It's what banks are using now.
These two things are concerning, not reassuring.
Still, an improvement over what they were previously using I guess?
If you could get out of blacklists by transferring ownerships then people can “wash” domains by fake transfers.
I really disagree with pulling the power dynamic angle into focus here. Injustice can also be carried out by the "little man", sometimes even at scale, and is every bit as awful to remedy if not even more so.
The issue is with the issue: people/systems (big and small) blacklisting an ownable identifier pointing to some ownable content without any care for the lifecycle of either.
Painting this with a social brush is extremely unhelpful and is guaranteed to derail conversations for no benefit whatsoever.
> The issue is with the issue: people/systems (big and small) blacklisting an ownable identifier pointing to some ownable content without any care for the lifecycle of either.
Does the lifecycle matter much, though?
Kind of like a carfax report. Tells you whether a vehicle you’re buying has been in an accident before (if it has, the value goes down because maybe there’s some latent issue that isn’t obvious at the time of purchase)
It would be nice if ICANN had some equivalent of a carfax for domains, perhaps even with a requirement that registrars expose at time of purchase whether a domain has been misused in the past (and who the prior owners were, or at the very minimum what the historical DNS records were).
Basically you want to avoid buying a “lemon” domain by accident.
I place zero fault/blame on “powerful entities” maintaining lists of domains used for spam/scams. How else will we protect grandma?
For readers: you could build Namefax as a startup! Pure-partnerships based model... distribute it through registrars.
"Heads up, this is a pre-owned domain. Do you want to get the Namefax for $0.99 before you buy?"
A carfax report lists issues with the actual car. You don’t want a car with “car exploded” in the carfax report, since this would translate to actual damage in the car, damage which could actually affect you if you were to drive the car.
On the other hand, a domain reputation at Google et al. is more like Carfax reporting “This car was once parked at the same street where a horrific mass murder took place.” If this was a problem since, let’s assume for the sake of argument, the police would pull you over all the time if you drove it, it would still not be a problem with the actual car; the problem would be the police, and fixing police behavior would be the only workable solution. Using Carfax as an analogy still places the blame on the domain owner, not on Google et al.
But in this scenario there are many more parties involved than just "the police". So you can't "just fix the police behavior" for a "solution". You'd have to "fix" any and every party that already exists or pops up in the future.
This kind of issue is inherent to any system where identifiers are recycled, particularly when that recycling happens on demand. It's not "fixable", at best it's combatable. And trying to language police away the symptom and blaming it all on the pivotal participants supports and achieves neither.
The analogy is not perfect, but there aren’t myriads of parties, there’s basically only Google, plus a handful of others of greatly decreasing importance.
If it was a reputation problem where, say, end clients with web browsers would each have a separate and uniquely derived negative opinion about domain names, this would indeed be a “bad reputation” problem and not a Google problem, since the problem could not be fixed at the Google side. But with domain reputation being so centralized, the problem is at the center.
> Does the lifecycle matter much, though?
How could it not? It's essentially the same issue as an unmaintained phonebook or a map. What's at a given address or phone number changes, and if your solution is not equipped to handle that change, your solution is bad.
I agree.
But that’s not a fixable problem in my eyes. At least not without extreme and sweeping changes driven by some kind of government regulation or ICANN mandates which, if enacted, would probably be highly criticized on HN.
There are just too many block lists for domains (literally thousands if you include open source ad blockers).
The lifecycle “should” matter in a perfect world, I agree.
Oh I don't think it's full-on fixable either. What I wanted to challenge was just the characterization of the issue itself.
As you say there are plenty of volunteer maintained blocklists as well, and there are also the countless privately deployed filters using those lists, which may or may not get updated properly. That's the "little man" part, and is why I think the characterization the thread starter was trying to push is ill-fitting.
I couldn't disagree more. What you've written is both apologetics and simply untrue.
Sorry to hear you feel that way.
Who says it's the fault of the domain in some abstract sense? A house becomes haunted when something bad happens in it. It's not the fault of the rafters and joists. I think "haunted" is an apt description.
“Haunted” still implies that the problem exists at the house/domain, and can be fixed there. But a domain being blacklisted is not something which a domain owner can fix by themselves, they have to beg the blacklister to de-list them.
You'd usually describe a house as haunted if something bad has happened in the past (e.g. a murder, evil spirits, etc) and people are superstitious about this (e.g. believe some ghosts are still living in the house). Hard to see how an owner can fix this. All the usual problems the owner can fix (floorboards need replacing, gutters need cleaning, general repairs) aren't really examples of a house being "haunted".
Oh, I know people who spray holy water all around the house as a "possible remedy".
Houses are also not haunted, so it's fine. It's also fine to have fun.
The post talks a bit about this:
In a perfect world, when your legitimately good content isn’t being surfaced by Google, it’s a failure on their part, and their problem to solve, not yours. In practice, it is your problem and you have to do a bunch of work to help them see that their current assessment of your domain name is no longer accurate.
You're right, the fault lies with the search engines, but in practice it sure feels like the domain itself is tainted somehow.
We should avoid words and concepts which place the blame unfairly on mostly powerless individuals.
"Haunted" is actually a pretty good descriptor.
Something terrible happened here in the past.
The intangible spirits from this terrible event remain.
The new owner discovers his pictures scream at him and his closet constantly fills up with blood.
The fault, ultimately, belongs with the one who did the terrible deed.
blacklisted would be a good description as well.
Blacklist is too concrete.
With some domains, you merely will find a higher % of your emails land in spam, or your content ranks a bit worse, etc.
There's a somewhat random continuum. Haunting is a funny word that does sort of include some variability.
Yes, but they are on some blacklist somewhere. One could say greylisted. The point is that whatever term describes the issue shouldn't be mystical.
Haunted implies a supernatural condition that just isn't helpful in system administration.
If something isn't working with a service there is always a method to troubleshoot and isolate the issue. Contact the appropriate people when needed. This is how NeoTokyo restored his "listed" domain.
Maybe, but it's not "blacklisted" per se. You can go to the URL and do whatever.
It's not getting SEO blessings, true, but it's not disappeared.
Domains aren't individuals. Owners of domains aren't necessarily individuals either.
> by entirely too powerful entities
So, haunted then?
TLDR: when you rent anything, double check who rented it before you and what they did with it to make sure it’s in good condition.
As someone who knows what active persecution on this site is I relish the opportunity to say what I really know under a pseudonym.