You don’t want to be on Cloudflare’s naughty list

jgrahamc · on Sept 20, 2022

> Well into the second day of Cloudflare’s blockade of my home internet connection, Google Search also began blocking requests. It required me to resolve a CAPTCHA challenge for every other search. This luckily only lasted a day.

> Cloudflare shares IP reputation data with partners like Google, coordinated through a program called the Bandwidth Alliance. So, my original offense might not even have been against Cloudflare. It might have received the reputation data from a partner, and it just propagated through the Bandwidth Alliance network.

That's not what Bandwidth Alliance is at all. It's about reducing or eliminating egress fees between a cloud provider and Cloudflare. Not sure where the idea that it's about sharing IP reputation data comes from.

https://www.cloudflare.com/bandwidth-alliance/

So, if Google Search started showing a CAPTCHA, that's not Cloudflare.

tomxor · on Sept 20, 2022

FYI, this guy is far from alone, your "protection" has given me a lot of grief over the past few years, particularly on highly NATed mobile networks.

I've been gradually removing cloudflare based CDNs from services I develop and control because I don't want my users being arbitrarily discriminated against.

There was a good article posted on HN recently titled "The ideal level of fraud is non-zero" which I think is highly relevant here... In essence, any mechanism employed to prevent illegitimate use comes with a cost to legitimate users; if that cost is too high, it defeats the purpose. I.e., what's the point in a website that is completely immune to a botnet and also cannot be accessed by anyone else? Unplugging the ethernet cable also effectively protects against botnets. More subtly, the cost of outright rejecting some legitimate users is usually not worth the savings of rejecting 100% of illegitimate ones. I think Cloudflare's service has it the wrong way around: it currently accepts blocking legitimate users far too easily, which is not an acceptable cost; whereas you should be letting a higher level of bots through to avoid pissing off legitimate users - if it's not obviously a DDoS, it's probably worth the bandwidth cost.

Consider the bigger picture: if you save a sliver of a penny by blocking a bot, but also end up blocking or seriously inconveniencing 10 real users... is it worth it?

dmix · on Sept 20, 2022

Cloudflare just isn't worth the tradeoffs: the risks associated with their centralization, how they made Tor basically unusable on non-onion sites, the lack of transparency when content-moderating the internet, etc.

The space is in need of solid competitors to break the stranglehold they have on the internet. Whether it's the right combination of services, documentation, etc.

thaumaturgy · on Sept 20, 2022

Tor made Tor unusable on non-onion sites. I feed a netfilter table with the list of exit node IPs that Tor publishes (https://check.torproject.org/torbulkexitlist) as a standard part of server deployment, and it's the single most effective way to reduce form and login abuse on hosted sites. I like the idea of Tor, but there's no denying that it's a huge source of nuisances.
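
For illustration only (this is not the commenter's actual setup): a minimal Python sketch of the list-handling step, assuming the bulk exit list is plain text with one IP address per line. The function names `parse_exit_list` and `is_tor_exit` are hypothetical, the sample addresses are documentation-range placeholders, and downloading the file and loading the result into a netfilter set are left out.

```python
import ipaddress


def parse_exit_list(text):
    """Return the set of valid IP addresses found in a one-per-line list."""
    exits = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            exits.add(ipaddress.ip_address(line))
        except ValueError:
            continue  # skip anything that does not parse as an IP address
    return exits


def is_tor_exit(addr, exits):
    """Check whether a client address appears in the parsed exit set."""
    return ipaddress.ip_address(addr) in exits


# Placeholder data standing in for the downloaded list (assumption).
sample = "192.0.2.10\n198.51.100.7\n# comment\nnot-an-ip\n"
exits = parse_exit_list(sample)
```

Parsing through `ipaddress` rather than keeping raw strings means malformed lines are dropped before anything reaches a firewall rule.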

shaky-carrousel · on Sept 20, 2022

I live in a country with censored internet. What you are doing is harmful. I can only hope whatever you provide is irrelevant enough.

thaumaturgy · on Sept 20, 2022

I'm sorry. I have a colleague based out of Venezuela. We've had to work together to get tunnels and vpns configured so that he can get uncensored and secure internet access.

But Tor is an enormous source of abusive traffic and if I don't filter it, then that's harmful to site owners. I'm being forced to choose between the needs of people that I know, work with, and depend on financially, and the needs of people in countries with issues that are far outside my ability to resolve. It's not a hard decision.

Zak · on Sept 20, 2022

There are probably more sophisticated options that would solve your problems than simply blocking it.

plumeria · on Sept 20, 2022

Is using CAPTCHAs one of those?

throwaway0x7E6 · on Sept 21, 2022

captchas are fine. recaptcha is not.

between 2015 and ~2020, my home ISP was blessed with every recaptcha being 3 rounds of slow fade-in bullshit. I have also seen infuriating gaslighting of "please try again" after certainly correct solutions, as well as 5+ rounds followed by a notification that my network is entirely blocked.

I've developed a reflex to Ctrl+W upon seeing it, unless that is absolutely vital for me to get past it - which is exceedingly rare.

if I had a genie lamp, I'd waste one of my 3 wishes to do terrible things to the people responsible for that shit.

Schroedingersat · on Sept 21, 2022

Most captcha services are just used to force users identified as having few other options into giving free tagging labour.

judge2020 · on Sept 20, 2022

Such as?

cowtools · on Sept 20, 2022

The answer depends on the type of service you host. I don't know what you need to do, but I do know that filtering IP space is merely security-by-obscurity, it is a cheap and broken solution to the hard problems of sybil resistance. If you need IP filtering to operate on a day-to-day basis, then the security of your service is fundamentally broken.

Tor users do not have any special properties over clear-net users besides low accountability for their IP space. There are other ways to acquire this type of setup that don't involve broadcasting a public list of known exit nodes as an act of good faith. Any sophisticated attacker will be able to easily get ahold of the IP space and bandwidth they need to do their work, whether it's through a botnet or simply because they operate out of some less-accountable country like China or Russia.

IP filtering: now you have two problems!

thaumaturgy · on Sept 20, 2022

This is why I'm strongly against spam filtering for email. Spam filters are fundamentally security-through-obscurity. I mean, they don't protect your email from targeted bombing attacks or phishing. If you need spam filters to operate your email on a day-to-day basis, then the security of your email is fundamentally broken.

/s, obviously, I hope.

Blocking Tor isn't a security measure, it's a nuisance reduction measure.

cowtools · on Sept 21, 2022

>This is why I'm strongly against spam filtering for email. Spam filters are fundamentally security-through-obscurity. I mean, they don't protect your email from targeted bombing attacks or phishing. If you need spam filters to operate your email on a day-to-day basis, then the security of your email is fundamentally broken.

You kid, but this is completely true. Email is simply an incredibly flawed, outdated and broken system, especially when used without PGP. Phishing is a massive problem, and it has only continued to grow in scale because spam, uh... finds a way. At the same time, spam filters regularly create false positives, making email an unreliable transport (leading to "oops, it got lost in my spam folder").

>Blocking Tor isn't a security measure, it's a nuisance reduction measure.

You should block all IP space, this will reduce nuisances by 100%. In fact, this will save you from having to consider any real security practices or do your job properly.

matjet · on Sept 21, 2022

The correct analogy here would be implementing spam filtering by blocking large segments of email addresses. Eg, dropping mail from all non microsoft/gmail domains (as a nuisance reduction measure!), with predictable impact on smaller providers and self hosted email.

thaumaturgy · on Sept 21, 2022

You're reframing this to make Tor look a lot better than it is. The signal:noise ratio for Tor is epsilon. It's almost entirely garbage. If a network generated spam at rates analogous to network traffic from Tor, yes, I guarantee that network would be on every single email service's block list.

Tor's advocates in this thread keep trying to argue it from ideology, as though anybody's obligated to deal with Tor traffic on principle alone, and not one of them so far has tried to argue that Tor is not 90+% bots and garbage. Funny, that.

swores · on Sept 21, 2022

With all the blocks in place, is it ever possible to know whether the 90% is still an innate effect of Tor, or actually an effect of sites blocking Tor?

I have Tor installed, figured it would be worth adding my boring browsing to the mix sometimes, but since most sites I try to load block Tor exits, Tor browser now sits unused.

On the other hand, if I woke up tomorrow deciding to start a bot farm or whatever other malicious thing, of course I'd be interested in hiding through Tor and might try it again (don't worry, I won't wake up that way).

So even if a hypothetical 100% of global internet users really wanted to do all their browsing through Tor, they might all reach the same conclusion as me that too many sites are blocked and therefore leave Tor to mostly bad traffic. Of course it's nowhere near 100%, but hopefully you see my point that the sites blocking Tor IPs (and I absolutely appreciate why) can become a self-fulfilling prophecy - and I'm not sure how you'd get out of that loop?

account42 · on Sept 21, 2022

And if everyone blocks all non-gmail addresses then soon enough the snr of non-gmail addresses will also be garbage because you are actively preventing any legitimate user from using them.

plumeria · on Sept 21, 2022

I think it would be sensible to block new account registrations with addresses from email address aliasing services (e.g. duck.com) or disposable email address services (e.g. mailinator.com).

account42 · on Sept 21, 2022

I am strongly against any kind of spam filtering that drops/rejects messages that the recipient did not intentionally configure for those kinds of messages. Sorting suspicious mail into a separate folder is fine; preventing two humans from communicating based on heuristics, IP block reputation and other such bs is not.

twno1 · on Oct 2, 2022

Outlook did that actually (preventing two humans from communicating without reason)

https://www.linode.com/community/questions/22305/entire-ip-r...

justsomehnguy · on Sept 20, 2022

> It's not a hard decision.

Depends on what you imply under 'hard'.

As an IaaS provider I endured all the hurdles about that and ten years later - I don't care, at least not until my outbound bill is bigger than usual.

Like some of the clients are on CentOS6, on public-facing machines.

parroteal · on Sept 20, 2022

I'm a noob, can you give me a pointer?

What kind of abusive traffic is coming through Tor and why do they do it?

thaumaturgy · on Sept 20, 2022

Mainly forms -- login forms, comment forms, signup forms. Bots use Tor pretty heavily because it's anonymous and hard to block them without blocking the entire network. Login form abuse is mildly irritating but not a huge deal if you have other measures in place. Comment spam is annoying but there are some options that deal with it pretty well.

But the signup spam was a headache. I didn't want to just blackhole Tor traffic, and tried to reduce the abuse with other tools, including some custom stuff. The final straw was a customer's small business site that had a MailChimp or Constant Contact signup form. Those vendors want you to embed their code by default to render the form, so you have less control over the form itself. There were workarounds, but they all sucked.

Tor bots would sign up email addresses through this newsletter form, and then I'd have to go through and manually scrub them before newsletters went out, or the service would penalize my client for too many bounces/unsubscribes/complaints. Very nearly 100% of the abuse on that particular form came from Tor IPs.

I do not want to spend my limited time on this Earth manually sorting out bots from humans because of one particular network. Blackholing Tor made that problem disappear immediately.

VPNs are dime-a-dozen now, cheap VPSs are available from lots of vendors, there's Wireguard, there's ssh, a clever person could even set up Apache or nginx as a forward proxy with ssl from LetsEncrypt. Tor is well over 90% abusive traffic (https://blog.cloudflare.com/the-trouble-with-tor/). This is a Tor problem, not a me problem. There are better alternatives available.

mjevans · on Sept 20, 2022

I think the workflow is the issue with http(s)-based email list sign-ups.

Solution: Require sign-ups by email, so the end account must actively send your mailserver a registration message. This also turns an open-loop control system into a closed loop control system, which is inherently easier to secure / keep safe.

gregmac · on Sept 21, 2022

How would this be better? It's trivially easy to spoof email addresses. Someone could sign you up easily, for example.

It's also easy to send "from" an address that passes SPF/DKIM but bounces inbound mail -- not sure what reason someone would have for this other than hurting the service reputation or acting as a DoS of sorts, but it can be done.

3np · on Sept 21, 2022

> It's trivially easy to spoof email addresses. Someone could sign you up easily, for example.

Proper DMARC configuration is table stakes to send e-mail, which makes that anything but trivial.

viraptor · on Sept 21, 2022

But neither the newsletter host nor the email user has any input into how dmarc/dkim/spf are implemented. Only the user's email provider does. And if that's a small business domain, it's likely not very strict with the rules.

namibj · on Sept 21, 2022

I thought DMARC/DKIM was necessary for delivering to Gmail for years now; in any case, there should be few who can't use a backup email to subscribe, as your newsletter won't be the only thing that has these anti-spoof requirements.

viraptor · on Sept 22, 2022

Not necessary. Just very highly recommended. I can still deliver my cron emails from a rando host successfully.

namibj · on Sept 25, 2022

That doesn't rule out DKIM, which only requires the `From:` header's domain to list a pubkey and the email to include a DKIM signature from a matching private key. SPF is the one that regulates which hosts a domain's outbound SMTP servers are on.
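
To make the DKIM half of that concrete, here is a hedged Python sketch of where a verifier looks for the public key: it reads the signing domain (`d=`) and selector (`s=`) tags from a `DKIM-Signature` header value and builds the DNS name `<selector>._domainkey.<domain>`, whose TXT record holds the key. The helper name and the sample header are made up for illustration; real verification also needs the DNS lookup and the signature check, which are omitted here. (DMARC is what additionally ties the `d=` domain to the `From:` domain.)

```python
def dkim_key_record_name(dkim_signature_value):
    """Build the DNS name of the TXT record holding the DKIM public key."""
    tags = {}
    for part in dkim_signature_value.split(";"):
        if "=" not in part:
            continue  # ignore empty trailing segments
        name, _, value = part.partition("=")
        # Header values may be folded across lines; drop all whitespace.
        tags[name.strip()] = "".join(value.split())
    return f"{tags['s']}._domainkey.{tags['d']}"


# Made-up header value for illustration (assumption, not real mail).
header = (
    "v=1; a=rsa-sha256; d=example.com; s=mail2022; "
    "h=from:to:subject; bh=abc=; b=xyz="
)
record = dkim_key_record_name(header)
```

Nothing in that lookup depends on which host sent the message, which is why a cron email from an arbitrary machine can still carry a valid DKIM signature while SPF is the check that cares about the sending host.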

Schroedingersat · on Sept 21, 2022

> Mainly forms -- login forms, comment forms, signup forms. Bots use Tor pretty heavily because it's anonymous and hard to block them without blocking the entire network. Login form abuse is mildly irritating but not a huge deal if you have other measures in place. Comment spam is annoying but there are some options that deal with it pretty well.

Then put the form behind your monopolistic internet gatekeeper. There's no reason for a GET to redirect to a Sisyphean captcha treadmill.

judge2020 · on Sept 20, 2022

https://blog.cloudflare.com/the-trouble-with-tor/

> . Based on data across the CloudFlare network, 94% of requests that we see across the Tor network are per se malicious. That doesn’t mean they are visiting controversial content, but instead that they are automated requests designed to harm our customers. A large percentage of the comment spam, vulnerability scanning, ad click fraud, content scraping, and login scanning comes via the Tor network. To give you some sense, based on data from Project Honey Pot, 18% of global email spam, or approximately 6.5 trillion unwanted messages per year, begin with an automated bot harvesting email addresses via the Tor network.

remus · on Sept 20, 2022

Say you're running an account take over script that spams login forms with a list of known username and password combos. If a website owner sees thousands of login attempts coming from a single IP address they're likely to block you to prevent abuse on their website. This is annoying for you as you then need to rotate your IP address.

Using tor hides your IP address from the website and makes switching exit nodes very straightforward, so you can run your account take over script in peace.
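
As a rough sketch of the defence being sidestepped here, the per-IP blocking described above can be modelled as a sliding-window counter. The class name, thresholds and addresses below are hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict, deque


class LoginAttemptLimiter:
    """Allow at most max_attempts logins per IP within window_seconds."""

    def __init__(self, max_attempts=5, window_seconds=60.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # ip -> timestamps of attempts

    def allow(self, ip, now):
        attempts = self._attempts[ip]
        # Forget attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # too many recent attempts from this address
        attempts.append(now)
        return True


limiter = LoginAttemptLimiter(max_attempts=3, window_seconds=10.0)
results = [limiter.allow("192.0.2.10", t) for t in (0, 1, 2, 3)]
other_ip = limiter.allow("198.51.100.7", 3)
later = limiter.allow("192.0.2.10", 11)
```

Because the counter is keyed on the source address, a script that changes exit node after every few attempts never hits the limit, which is the behaviour described above.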

viraptor · on Sept 21, 2022

That's not that easy in practice. There are fewer than 2k exits normally, not all of them usable. Your abuse script competes with other malicious traffic for those exits and their reputation gets burned pretty much immediately.

So yes, you can switch exits easily, but effectively you're switching from one known bad IP to another bad IP.

plumeria · on Sept 20, 2022

How often is the list of exit nodes updated?

thaumaturgy · on Sept 20, 2022

Daily, I believe. I don't have the file git-controlled. That would be a good idea, though.

matheusmoreira · on Sept 21, 2022

> how they made Tor basically unusable on non-onion sites

I wonder if that's such a bad thing. Tor is safer when the traffic never leaves the network. In the ideal world, everything that matters would be inside the Tor network instead of being merely accessible through it.

andrewnyr · on Sept 20, 2022

there are many solid competitors: Amazon, Fastly, Akamai, Imperva to name a few

WirelessGigabit · on Sept 21, 2022

Fuck Akamai. Have you worked with them? They are the most archaic internet company you can think of. Their UI is stuck in 2000. Just like their procedures.

hoppla · on Sept 21, 2022

There is an easy way to get the banhammer from Amazon, and it is possible to host a JavaScript page that triggers it for any visiting user.

I did tell Amazon about it, but it fell on deaf ears. The ban lasts for about a week and the internet is mostly unusable in that period.

wahnfrieden · on Sept 20, 2022

Bunny

thaumaturgy · on Sept 20, 2022

Just 10 minutes ago, I got the following email from a housemate (I'm not home at the moment):

> The past few weeks I've been getting tons of redirects to verify my humanity before being allowed to view a webpage. Usually I just have to click the box that says human, not find all the ladders in a photo. SoFi is doing it every single time I log in. Petco, too, along with others who are more sporadic. This is happening with and without uBlock on. Same browser I've always used. ...

SoFi and Petco both use Cloudflare. I do exactly zero web crawling / scraping / abusive anything from my home connection.

I'm noticing a recent increase in volume of complaints about Cloudflare's human verification filter. I'm starting to wonder if they touched a dial.

I had already started pulling some infra back from Cloudflare after their last appearance in the tech news cycle. Now I've got an additional reason to continue doing that.

thephyber · on Sept 21, 2022

> I do exactly zero web crawling / scraping / abusive anything from my home connection.

That you know about. Your house mates share the internet connection.

I’m guessing you have WiFi, so you may have unintended guests.

You probably have lots of devices, one of which may be infected.

Your ISP may have issued you a different IP which may have a negative reputation score.

You could be using a malware infected browser or browser extension.

There are lots of variables. You haven’t isolated all of the ones in your control, so assuming CloudFlare is the only possible cause isn’t rational.

winkeltripel · on Sept 21, 2022

It's simple, really. Cloudflare is the single root cause of the issue. All the others are not a huge issue until Cloudflare notices. It's perfectly rational and reasonable to blame the company trying to gatekeep the whole internet, without taking any responsibility for false positives.

Scaling to infinity isn't a right, it is a privilege. Any company that builds this sort of no-human-decision system is abusing that privilege and hoping that anyone who suffers wrongly under its systems doesn't have enough voice (Google seems to be the worst for this, though Cloudflare seems set to follow).

thaumaturgy · on Sept 21, 2022

I'm not sure how to phrase this without sounding like a prick, but I'm not exactly new at this stuff. You missed on all of those examples. I appreciate the point you're trying to make, but Cloudflare is in fact the primary factor here.

tomxor · on Sept 23, 2022

> That you know about. Your house mates share the internet connection.

This is actually the least likely these days... the no.1 cause would be CGNAT, the vast majority of residential endpoints share an IPv4 address with a huge number of users, mobile networks are even worse... that's before we even get to IP recycling for dynamic IPs which happens at high frequency with mobile networks again, so you will inevitably get affected eventually.

This is why it's a bad idea to block IPs outright, because today one IP address never equates to a single individual or the same set of individuals over time. The other problem with blocking IP addresses based on abuse is considering them equal in user weight, yet one IP might have 2 users, another might have 10000 users - Blocking a TOR exit node is a good extreme example of this... people think of it as an effective defence because of the concentration of abuse on that single IP address, but they fail to consider the concentration of users behind that IP address - TOR exit nodes probably are a slightly higher source of abuse per user, but nowhere near as high as per IP - if you measure abuse per IP you are more likely looking at a rough picture of users per IP for highly NATed IPs.
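
A toy calculation of that per-IP versus per-user distinction, with entirely made-up numbers (the user counts and request counts below are assumptions chosen only to show the arithmetic):

```python
def abuse_rates(users, abusive_requests):
    """Return abusive requests counted per IP and averaged per user."""
    return {
        "per_ip": abusive_requests,
        "per_user": abusive_requests / users,
    }


# Hypothetical figures: a home connection with 2 users versus a
# carrier-grade NAT address shared by 10000 users.
home = abuse_rates(users=2, abusive_requests=4)
cgnat = abuse_rates(users=10000, abusive_requests=500)

print(home)   # {'per_ip': 4, 'per_user': 2.0}
print(cgnat)  # {'per_ip': 500, 'per_user': 0.05}
```

With these numbers the shared address looks far worse per IP while its average user is far better behaved, so a per-IP block there rejects mostly legitimate people.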

patrec · on Sept 20, 2022

> I had already started pulling some infra back from Cloudflare after their last appearance in the tech news cycle.

What triggered your reaction? That they terminated a customer with zero notice?

tarakat · on Sept 20, 2022

You're looking at it all wrong. From Cloudflare's point of view, this kind of blocking is a feature. Anyone doing legitimate web crawling, or offering alternative web services such as Starlink, now needs Cloudflare's permission.

Essentially, for a broad class of web-based businesses, they have made themselves gatekeepers. I'm sure they'll find a profitable use for this position. Charging outright would look bad, but investing in businesses that just happen to not run into Cloudflare-based trouble, but whose competitors do...

tomxor · on Sept 20, 2022

I'm familiar with that perspective, and biased towards it... Cloudflare is certainly in such a position, but they are a relatively young company (for their size and reach) and I've seen good things come from them.

I'd guess the intent is unlikely to be anti-competitive or monopolistic, just over-aggressive. However regardless of intent their position does cause an absence of market forces to put pressure on fixing such issues - Similar to how it's become acceptable to have downtime when it's on AWS, because "everyone is affected".

hypertele-Xii · on Sept 21, 2022

It's true that any wall around you that protects you from something unavoidably comes with a gate that someone else guards, unless you want to guard your gate personally (and that entails filtering out armies of spambots, worst case manually).

As with any power and control you delegate to any entity, only time and good behavior will earn them your trust. That's theoretically what companies are competing for, your loyalty.

zxcvbn4038 · on Sept 21, 2022

Isn’t there a config option to dial down the anti-bot stuff so that you still get the benefit of Cloudflare’s caching but with much less chance of dropping legit traffic from schools, VPNs, etc? I think their lowest setting only really kicks in if they think an IP is participating in a DDoS attack.

eek2121 · on Sept 21, 2022

My dude, it isn't about money. At least not directly.

I encourage those of you attempting to block Cloudflare to try and host your own website for a bit. Make sure you don't do it on a metered/paid connection. I know one eCommerce site with 1,300 employees that went bankrupt overnight thanks to the AWS bill (and lack of options to get back online, this was prior to companies such as CF). Bankruptcy as in the company filed for bankruptcy and no longer exists. They were profitable for a decade prior. One DDoS attack...

Also make sure you don't have a democratic opinion if you are in the US, like a 50 person manufacturing company. They were shut down completely thanks to saying a single wrong thing about Republicans. CF existed there, but they weren't aware thanks to not having IT folks. They were a non profit.

CF may be evil to some, but there is a reason they exist. I use CF. I don't like throwing money at them every month, however, many of my websites have also been attacked, usually via competitors. We can either deanonymize the internet or allow companies like CF to exist. There is really no other way.

tomxor · on Sept 21, 2022

> the company filed for bankruptcy and no longer exists. They were profitable for a decade prior. One DDoS attack...

Being milked dry by a single DDoS is a hosting issue in my opinion, there should be sensible limits in place, AWS is notorious for making it very hard to understand and control this...

Even if you disagree and consider it a problem that must be solved with a separate DDoS protection service, this is not what I am talking about, I think it's a good idea - if there is a clear ongoing targeted DDoS attack, that system needs to engage and do its best to try to filter through only legitimate users (which is the only point in time it makes sense to potentially block regular users - because the alternative is that no one can access the site).

The problem is this is not how cloudflare's protection operates, there is no throughput trigger, it's always on, it attempts to block bots at all times and has a very high false positive rate.

mike_d · on Sept 21, 2022

> I know one eCommerce site with 1,300 employees that went bankrupt overnight thanks to the AWS bill

Had they just called AWS and explained the situation, they would likely still be in business.

I keep a backup DDoS mitigation service for my entire network that costs me less than $200/mo to mitigate up to 100 Gbps.

d2wa · on Sept 20, 2022

> That's not what Bandwidth Alliance is at all. It's about reducing or eliminating egress fees between a cloud provider and Cloudflare. Not sure where the idea that it's about sharing IP reputation data comes from.

It comes from the Cloudflare blog. https://blog.cloudflare.com/cleaning-up-bad-bots/

There’s a support page about it too. https://developers.cloudflare.com/bots/get-started/free/

jgrahamc · on Sept 20, 2022

I need to look into that. Thanks for pointing it out. I had totally forgotten about that post.

Edit: team tells me this idea never got off the ground. Did talk with some potential partners (which did NOT include Google) but didn’t happen. So if Google was throwing CAPTCHAs it wasn’t because of our IP reputation.

d2wa · on Sept 20, 2022

Dear John. What am I — as a normal human being/end-user — supposed to do in this situation? People can’t do anything without any information about why they’re blocked. Who do you contact? Where do you go? What to do? The challenge page doesn’t help the end user understand why this is happening to them. It’s okay if you only see it for two seconds. But the page stays on screen for over a minute. When this happens for every website — what do you do? You’d be furious if this had happened to you. I’m just trying to read my online comics and look up some stuff about some interests and hobbies. It reduced my quality of life/sanity for a week. The last two days, I started worrying that this was going to be the new normal. I even looked into switching ISPs to get a new IP address.

PS: I love all the innovation and engineering stuff you guys regularly share on the Cloudflare blog. It’s [almost] always an interesting read. Even though I’m no fan of the massive centralization your company has caused.

jgrahamc · on Sept 20, 2022

Once upon a time Matthew made us set the IP reputation of every Cloudflare office to bad so that we experienced the worst case scenario. Helped a lot.

I don’t understand why you saw one minute block screens. That’s not right. Should be seconds.

I’m talking with the team about your other points.

tinus_hn · on Sept 20, 2022

The main problem of course, and it isn’t limited to Cloudflare and I won’t pretend to have the solution, is that if you are caught in this kind of web, you have no recourse but go public and hope the spotlight lands on you. For every problem we see in an upvoted post there’s tons that nobody sees.

northwest65 · on Sept 20, 2022

What about answering his actual question?

easrng · on Sept 20, 2022

I haven't been getting challenges that last that long, but I have noticed that the redesigned "security check" challenge pages with the spinner do seem much slower than the old design with the loader that was made of 3 orange dots.

JohnFen · on Sept 20, 2022

> People can’t do anything without any information about why they’re blocked. Who do you contact? Where do you go? What to do?

This is the most serious problem with all of the major companies these days. Cloudflare, Google, Apple, etc. When you get on their "bad side", you're just screwed. You'll never even know what got them mad at you, and there's nothing you can do to recover.

The only reasonable way to deal with this is to avoid them all to the greatest extent possible. You have no control over whether or not you deal with Cloudflare, unfortunately, which makes them the worst of the lot.

adammartinetti · on Sept 20, 2022

> It’s okay if you only see it for two seconds. But the page stays on screen for over a minute.

That doesn't sound right. You shouldn't see a loading page for over a minute. If you're open to providing more details privately I'd love to help troubleshoot. You can drop me an email at amartinetti @ cloudflare.

throwaway742 · on Sept 20, 2022

What I do when I want a new IP is change my router's MAC address and reboot the modem.

d2wa · on Sept 20, 2022

I edited and added a second link to a support page that mentions it too.

jgrahamc · on Sept 20, 2022

Thanks. I'm talking with the team.

Edit: see comment above.

cvwright · on Sept 20, 2022

You block this guy from the internet for a week -- for no apparent reason -- and then you come in here with a nitpick about how another related system works?

Really?

judge2020 · on Sept 20, 2022

The point is that Cloudflare does not beam IP reputation data to Google. If Google and CF are blocking this IP separately, what's the chance there's some malicious device or hacked IoT device on the network, participating in DDOS attacks or unauthorized vulnerability scanning of random websites?

pessimizer · on Sept 20, 2022

According to another comment, it's a wrong point: https://blog.cloudflare.com/cleaning-up-bad-bots/

> Once enabled, when we detect a bad bot, we will do three things: (1) we’re going to disincentivize the bot maker economically by tarpitting them, including requiring them to solve a computationally intensive challenge that will require more of their bot’s CPU; (2) for Bandwidth Alliance partners, we’re going to hand the IP of the bot to the partner and get the bot kicked offline; and (3) we’re going to plant trees to make up for the bot’s carbon cost.

judge2020 · on Sept 20, 2022

I'm pretty sure this was for a situation like Digitalocean themselves hosting a bot, but such IP sharing very well might be currently (ab)used by partners, if it's happening here.

jgrahamc · on Sept 20, 2022

Yeah. I'm looking into that.

zinekeller · on Sept 20, 2022

Yeah, if for example Spamhaus (which both Cloudflare and Google consult) has detected that a subnet is bad then that could be the cause.

Still, it doesn't excuse Cloudflare that there's no redress if you are caught in a block, or even a clue about what you can do to reduce it (especially given that Spamhaus does have redress procedures).

cvwright · on Sept 20, 2022

Fair point

stefan_ · on Sept 20, 2022

A wrong nitpick, even! Way to look like the asshole.

noasaservice · on Sept 20, 2022

[flagged]

tshtf · on Sept 20, 2022

[flagged]

acdha · on Sept 20, 2022

It makes a very broad claim which makes it sound like an extortion racket but doesn't have anything to back it up. I would bet that if it included some evidence it would fare much better. For example, they have a ton of large organizations which are customers. The very first question the average reader is going to have is whether it's really the case that these sites are predominantly attacked by booter services which use Cloudflare for hosting? That seems unlikely and as general rule here the broader the claim the more people are going to expect you to show that you did your homework first.

gusgus01 · on Sept 20, 2022

The claim was discussed in this post: https://news.ycombinator.com/item?id=32709329

Basically DDOS booters use Cloudflare to protect their websites from competitors, since Cloudflare is one of the best. The same people Cloudflare is protecting (and claims to do so on an ethical neutrality basis) are furthering the need for Cloudflare to exist.

acdha · on Sept 20, 2022

Note that I’m not saying whether or not this is true, only that a comment which links to something like that will generally fare better than one which begs the question.

cma · on Sept 20, 2022

It's like finding the worst videos on youtube and saying that's their business model.

throwawayays · on Sept 20, 2022

The tone of this reply is a bit shit from a PR perspective.

How about _also_ pointing to a knowledge base article for how an end user could go about working out what network activity from their IP might be flagging Cloudflare’s systems?

bogomipz · on Sept 21, 2022

>"Not sure where the idea that it's about sharing IP reputation data comes from."

One source of that would be a blog post on your company's website that was actually authored by you! Point 2 below:

>"Once enabled, when we detect a bad bot, we will do three things: (1) we’re going to disincentivize the bot maker economically by tarpitting them, including requiring them to solve a computationally intensive challenge that will require more of their bot’s CPU; (2) for Bandwidth Alliance partners, we’re going to hand the IP of the bot to the partner and get the bot kicked offline; and (3) we’re going to plant trees to make up for the bot’s carbon cost. [1]

So it's not such a far-fetched notion is it?

[1] https://blog.cloudflare.com/cleaning-up-bad-bots/

O__________O · on Sept 20, 2022

They do have a threat score

https://developers.cloudflare.com/firewall/recipes/block-ip-...

I was surprised to learn Cloudflare was born out of Project Honeypot, so I am guessing Cloudflare does share data with them:

https://www.projecthoneypot.org/cloudflare_beta.html

elcomet · on Sept 20, 2022

FYI you're responding to the cloudflare CTO

trasz · on Sept 20, 2022

It’s naive to assume Cloudflare CTO would not be lying if beneficial to him or Cloudflare.

nemothekid · on Sept 20, 2022

I wonder if HN posters have ever held a job before. Can you explain why it's beneficial for Cloudflare to block legitimate users? Why is the simplest explanation "Cloudflare just hates this one user in particular?"

saurik · on Sept 20, 2022

The story I've heard is--because their direct customers are websites, not end users--that Cloudflare loves to be ostentatious with these branded blocks and has a vested interest in offering services which punish users because it makes people feel like the product really really does something. Do you constantly hear about people being hosted by Akamai or CDNetworks or whatever going down due to DDoS attacks? No. However, despite a bajillion websites being hosted by Akamai--including, for example, virtually everything from Apple--have you ever accidentally been blocked or severely rate limited by one, or been given a CAPTCHA... even behind Tor? I doubt it (and this is coming from someone who nigh unto tried to cause themselves problems with Apple and all I ever got was a subtle speed cap); and yet, I feel like everyone I know has experienced being stuck behind Cloudflare at various points in their lives :/.

lmm · on Sept 20, 2022

Well, apparently they scared this user into installing their browser extension, so it sounds like this incident was a win for them.

webmobdev · on Sept 21, 2022

That is indeed their goal - this kind of targeted harassment is done deliberately to collect more personal data of the user.

This tactic is quite common among BigTech and something I've experienced with both Google and Amazon - once you are hooked onto their product, one day they will suddenly deny some aspect of their service to you and force you to share more personal data with them to get access to it. For example, Amazon will one day start to ask you to click a link sent to your mobile to access their account, or Google or Microsoft will block your account and ask you for your mobile number to "verify" you etc. When you are blocked from using a service suddenly, however privacy conscious you are, in your desperation you will be forced to comply.

I have experienced this with CloudFlare too once when many websites were suddenly blocked for me by CloudFlare on all browsers and I was forced to install their extension to access some information I needed from a website urgently. I have no doubt in my mind that even otherwise, they just deliberately and randomly blocked access to some sites and displayed their "captcha" page just for PR and "brand awareness". Now that CloudFlare has realised this is backfiring on them because of the negative emotions being associated with their brand, they have now redesigned their "captcha harassment" page to give less prominence to their branding than before.

stevewatson301 · on Sept 21, 2022

Privacy pass uses [VOPRFs](https://datatracker.ietf.org/doc/draft-irtf-cfrg-voprf/) with the express goal of avoiding tracking, so all this talk about "targeted harassment" is a bit much.

Sebb767 · on Sept 20, 2022

And they also got them to write a blog post, giving massively negative press on HN. I doubt his PII (assuming they collect it) is worth the trouble this thread is causing.

SkeuomorphicBee · on Sept 21, 2022

That is easy to explain: because it is easier/cheaper for Cloudflare to build a solution that works for 99.99% of the people and simply throw that extra 0.01% under the bus. So the simplest explanation is "Cloudflare knows random users will be locked off the internet, and is happy with the trade-off".

Veen · on Sept 20, 2022

It's even more naive to assume Cloudflare's CTO would tell lies that can be trivially shown to be untrue.

trasz · on Sept 20, 2022

How would you show they are untrue? Ask? :-D

elcomet · on Sept 20, 2022

I don't assume anything. The previous comment was just trying to teach something about cloudflare to its CTO

TakeBlaster16 · on Sept 20, 2022

Can you acknowledge the main point of the article? What should someone do if they find themselves misclassified by Cloudflare's systems?

mh- · on Sept 20, 2022

(not the parent commenter)

That person should start with the assumption they haven't been misclassified and eliminate the possibility that a device on their network is compromised.

d2wa · on Sept 20, 2022

(Author here.) That’s missing from the article. But I have logs of the network. There’s nothing out of the ordinary. “I don’t know what I did wrong,” as I started the article, means “I’ve checked logs and such and there’s no indication of anything wrong on my end.”

JohnFen · on Sept 20, 2022

A task that would be made much easier and less likely to miss something if the affected person had some indication as to what the problem was.

buildbot · on Sept 20, 2022

Devil's advocate - would it not then be pretty easy to engineer malicious bots to avoid detection?

JohnFen · on Sept 20, 2022

Depends on the level of detail provided. That much detail isn't necessary in order to provide a helpful pointer to innocent bystanders.

Ferret7446 · on Sept 21, 2022

Do you expect the average user to know how to "eliminate the possibility that a device on their network is compromised"? That is untenable.

mh- · on Sept 21, 2022

No, but I wouldn't expect the average user to write a blog post with unsubstantiated technical claims, either.

I do think Cloudflare could do better here to let the owner of an IP know why they're suffering from poor reputation.

However, it's not immediately clear to me how they could accomplish this without weakening their side of the cat vs. mouse game.

phantom_of_cato · on Sept 20, 2022

But that's beside the main point. You guys are essentially the "single point of failure" for half the internet. [1] Being competent and smart doesn't really help too much, as demonstrated by how you guys had to give in to the pressure to censor recently.

[1]: https://easydns.com/blog/2020/07/20/turns-out-half-the-inter...

3np · on Sept 21, 2022

What happened to PrivacyPass? It seems to have stopped working completely when connecting over Tor several months back. I say this from having spent several hours trying to get it to work on multiple devices with different OS/client software (chromium/FF), both with the store versions and bundling the extension from source.

We did have it working mostly fine for some time back in 2021 but haven't been able to since.

There are multiple open issues reporting this on the GH repo with no real follow-up from maintainers apart from maybe a "should be fixed, open again if still an issue".

ie https://github.com/privacypass/challenge-bypass-extension/is...

xani_ · on Sept 20, 2022

> Not sure where the idea that it's about sharing IP reputation data comes from.

Probably from scam called mail blacklists

plumeria · on Sept 20, 2022

It is interesting that the Bandwidth Alliance partners list shows pretty much every big cloud provider except AWS and Akamai [0]

[0] https://www.cloudflare.com/bandwidth-alliance/

pilif · on Sept 21, 2022

Yep. That paragraph made me pause and consider that maybe OP is the victim of some compromised device running on their network.

If two independent sites believe you are a bot, you or something at your address just might be.

shiomiru · on Sept 20, 2022

If you'd like to experience this treatment first-hand, try surfing the web using the Tor Browser.

Spoiler alert: many websites simply refuse to load at all (e.g. any google service, and lots of websites "protected" by CF). Captchas are everywhere: in many cases, you can't even complete simple GETs of blogs without donating free labor to CF.

And the most infuriating part, you get CF marketing messages right in your face while your browser is calculating hashcash (I guess?)... At this point I can recognize every single one of them: something about bots making up 40% of all internet traffic, something about their web scraper protection racket, something about small businesses (???), etc etc...

To be fair, Tor exit nodes have an awful reputation for sure. Nevertheless, I have a hard time forgiving how CF makes browsing the Internet hell for those who actually need Tor.
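
For anyone wondering what "calculating hashcash" would actually involve, here is a minimal sketch of a hashcash-style proof of work. It assumes SHA-256 and a "leading zero bits" target; nothing in this thread says what Cloudflare's challenge really computes, so `solve`, `verify` and the difficulty value are purely illustrative.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # A non-zero byte contributes (8 - bit_length) leading zeros, then we stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Cheap check for the server: one hash, regardless of difficulty."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Expensive search for the client: about 2**difficulty hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    nonce = solve(b"example-challenge", 16)
    print(nonce, verify(b"example-challenge", nonce, 16))
```

The asymmetry is the point: the client pays for the search, the server pays for a single hash. It also shows why this hurts slow devices and Tor users as much as bots.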

yjftsjthsd-h · on Sept 20, 2022

> And the most infuriating part, you get CF marketing messages right in your face while your browser is calculating hashcash (I guess?)... At this point I can recognize every single one of them: something about bots making up 40% of all internet traffic,

Yeah, there's something amazingly aggravating about CF telling you how much traffic is bots while showing that they can't distinguish you from a bot.

robocat · on Sept 20, 2022

CloudFlare are creating a new division for advertising to bots. They have projected that in the near future, bots will be 90% of spending, so the bot demographic is the most important to target, marketingwise.

The fact that humans are seeing the traffic meant for bots is an unfortunate side-effect.

I personally welcome our future bot overlords (not only because being unwelcome might be unhealthy for me — why would I publicly disagree with an overlord or not want to be their friend?).

rvdca · on Sept 21, 2022

Someone has seen a basilisk...

synthetigram · on Sept 20, 2022

Cloudflare has mixed up the definitions of "bot" and "abuse". Tor users may or may not be bots, but as long as they don't abuse (spamming or DoS), they ought to be treated the same.

thephyber · on Sept 21, 2022

Citation needed.

Kab1r · on Sept 21, 2022

I think this is more of an opinion than a matter of fact

thephyber · on Sept 21, 2022

It wasn’t framed as an opinion. And even if it was, I’m saying I think it is wrong and I want to know why I should change my mind.

The fact is that CloudFlare distinguishes abuse (DDoS at IP layers 3 and 4) completely separately from bot detection. And it allows user controls to domain owners to allow some bots like Google Search Crawler.

So my statement stands: I want to see a citation of evidence that CloudFlare doesn’t have the ability to distinguish abuse.

wraptile · on Sept 21, 2022

You don't even need TOR. Try a public wifi that is not in the "preferred geographical location" (i.e. US or Europe). The gaming cafes in SEA are probably responsible for 90% of all AI training datasets lol

jasonfarnon · on Sept 20, 2022

I routinely use Youtube with Tor. I will occasionally get kicked off with a "suspicious traffic" message, but it isn't my experience that it "refuses to load at all".

yamtaddle · on Sept 20, 2022

Harsh blocking/limiting/challenging is way too valuable to sites that are actually trying to make money online. It's not going away short of legislation banning it. Losing 1/10,000 legitimate customers to cut fraud attempts, spam, exploit attempts, and so on, by 90% or more, is just too good a trade-off.

I have bad news about the most-likely fix for it, longer term, so we can lay off the IP-based reputation stuff and the geo-blocking: it's tying some form of personal ID to your browsing activity, so that bears the reputation instead of the address.

Sorry. Said it was bad news.

jabbany · on Sept 20, 2022

An alternative that preserves some privacy also doesn't seem that hard to imagine... though it probably has its own can of worms*.

Basically, the core problem is digital identities (accounts, IPs, phone #s etc.) are cheap to create (even considering captchas and all) so fraud is easy. The solution could be just to make it "costly" to create new digital identities. For example, you could get a "verified but anonymous" identity issued by locking some assets (could be real world money, or maybe something intangible like community reputation) as collateral with a trusted party (or, for the crypto people, the blockchain). If you misbehave, you lose your reputation on that identity (and essentially your collateral) and have to start over. This lets anyone bootstrap a "minimal" level of trust at the beginning before they can use time to prove themselves trustworthy.

Note: This model might remind some of things like staking in crypto. However the idea is really not anything new... Putting money on the line is really how most low-trust bootstrapping happens.

*: To name a few:(1) this can result in participation being gated by wealth, which can be unfair. (2) it makes accounts more valuable to hack so people need better security practices [re: twitter checkmark]. (3) one would need some authority to decide how accounts lose their collateral or maybe the collateral is just burned to create that initial credibility...
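
A rough sketch of the collateral idea described above, assuming a single trusted party that holds deposits. `StakeRegistry`, `register`, `slash` and the units are hypothetical names for illustration; who decides that an identity "misbehaved" is exactly the open question in footnote (3) and is not modelled here.

```python
import secrets
from dataclasses import dataclass


@dataclass
class StakedIdentity:
    token: str        # opaque, not linked to a real-world name
    collateral: int   # arbitrary units, e.g. cents
    revoked: bool = False


class StakeRegistry:
    """Hypothetical trusted party that issues identities against a deposit."""

    def __init__(self, min_collateral: int):
        self.min_collateral = min_collateral
        self._identities: dict[str, StakedIdentity] = {}

    def register(self, deposit: int) -> str:
        if deposit < self.min_collateral:
            raise ValueError("deposit below minimum collateral")
        token = secrets.token_hex(16)
        self._identities[token] = StakedIdentity(token, deposit)
        return token

    def is_trusted(self, token: str) -> bool:
        ident = self._identities.get(token)
        return ident is not None and not ident.revoked

    def slash(self, token: str) -> int:
        """Revoke a misbehaving identity and return the forfeited collateral."""
        ident = self._identities[token]
        forfeited = ident.collateral
        ident.collateral = 0
        ident.revoked = True
        return forfeited
```

Creating a thousand throwaway identities now costs a thousand deposits, which is the whole mechanism. It is also where the wealth-gating concern in footnote (1) comes from.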

Sebb767 · on Sept 20, 2022

> Basically, the core problem is digital identities [...] are cheap to create [...] so fraud is easy. The solution could be just to make it "costly" to create new digital identities.

We already use this model in practice. It's why so many services require a phone number verification now - they are hard enough to get en-masse, especially if you block things like Google Voice. They even have a big advantage in that they are comparatively hard to hack, as the SIM card is effectively a weak form of physical security key.

I think the big problems this causes are discussed on HN quite often.

georgyo · on Sept 20, 2022

Your idea comes from a good place, but identity theft is already a thing in the real world. Digital identities would also be very stealable. This makes malware more harmful in the long term. Imagine if your Twitter gets hacked and your digital identity makes it so your Gmail gets blocked.

Similarly, the internet is already very difficult for people with limited means. This would make it even harder.

Schroedingersat · on Sept 21, 2022

Easy solution.

Go down to your local post office.

They physically hand you an identity token on a physical $2 2fa device if you give some evidence you live nearby. You can put down the deposit or hand over the device for an old id which is cleared and reused.

It's traceable to the post office but no further, nothing is recorded other than that the token is deployed and roughly when.

Local communities can be responsible for cleaning up local messes. No need for the scammers two cities over to affect your reputation. No need for a corrupt employee handing out tokens to affect the reputation of the token you got ten years ago.

georgyo · on Sept 21, 2022

So every country in the world should simultaneously roll out this $2 2FA token?

And the governments of the world are going to do this in an anonymous way?

Who is going to manufacture these 8 billion (Or at least 3 billion if we only count Facebook's MAU) tokens?

And there still needs to be a global database of valid identifiers, else anyone could just create a software token that they can reprogram every second.

And we expect all people to carry these 2FA tokens perfectly?

And what happens when someone loses this token? The post office has no way to prove you owned that token in your proposal.

Same thing for revoking a token. There is no identity out of the token, so how do you revoke it after it is lost? People are not willing to store a piece of paper in a security deposit box.

This "easy" solution is impossible in practice.

Schroedingersat · on Sept 22, 2022

You're projecting use cases that weren't proposed.

The only purpose is to provide evidence of not being a bot. Not to log in or verify identity. You don't need a server or proof that a particular token is owned by a particular person, just a cert chain and a list of postcodes with current public keys. The post office has a private key. They sign a message saying 'the holder of this token walked into the store'. Let servers make whatever judgements they wish about the chain's credibility. If a particular key signs lots of bots then you know where to look for the source of the bot farm and the people that live there know where to look to fix their reputation.

It doesn't need to roll out simultaneously. Just be an alternative to captcha that isn't as abusive as device attestation.

The manufacturers will be the same ones that manufacture the hundreds of billions of usb drives and phones and smart light bulbs.

The only problems are it's not as useful for abusing users or spying on citizens as revoking access to general purpose computing, and idiots who project problems onto it that come from use cases that are not proposed or say 'big number make thing impossible'.
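
The "let servers make whatever judgements they wish" part can be sketched without any cryptography. The snippet below assumes signature and cert-chain verification has already happened elsewhere, and only tracks how often tokens from a given issuing key turn out to be bots. `IssuerReputation`, the thresholds and the key IDs are all made up for illustration.

```python
from collections import Counter


class IssuerReputation:
    """Per issuing key: how many presented tokens were later judged to be bots."""

    def __init__(self, max_bot_ratio: float = 0.2, min_samples: int = 10):
        self.max_bot_ratio = max_bot_ratio
        self.min_samples = min_samples
        self.seen = Counter()
        self.bots = Counter()

    def record(self, issuer_key_id: str, was_bot: bool) -> None:
        # Call only AFTER the token's signature has been verified (not shown).
        self.seen[issuer_key_id] += 1
        if was_bot:
            self.bots[issuer_key_id] += 1

    def is_credible(self, issuer_key_id: str) -> bool:
        total = self.seen[issuer_key_id]
        if total < self.min_samples:
            return True  # too little evidence to distrust this issuer
        return self.bots[issuer_key_id] / total <= self.max_bot_ratio
```

Reputation here attaches to the issuing post office rather than to an individual token, which matches the "you know where to look" argument. It also means honest users of a bad office share the penalty.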

jabbany · on Sept 21, 2022

It's not really my idea (has been proposed multiple times long ago to the point it has even been implemented in many places).

As for identity theft, it's actually not that common/easy except in the US (which has no centralized national ID issuer and largely depends on hacks building on the SSN).

And besides, protecting digital identity is already important even without this bootstrapping.

> Imagine if your Twitter gets hacked and your digital identity makes it so your Gmail gets blocked.

Flip the services around and you have the reality of today.

mhink · on Sept 20, 2022

> Basically, the core problem is digital identities (accounts, IPs, phone #s etc.) are cheap to create (even considering captchas and all) so fraud is easy. The solution could be just to make it "costly" to create new digital identities. For example, you could get a "verified but anonymous" identity issued by locking some assets (could be real world money, or maybe something intangible like community reputation) as collateral with a trusted party (or, for the crypto people, the blockchain). If you misbehave, you lose your reputation on that identity (and essentially your collateral) and have to start over. This lets anyone bootstrap a "minimal" level of trust at the beginning before they can use time to prove themselves trustworthy.

I've always thought that client certs would be an interesting solution to this problem. Any given certificate can carry signatures from multiple signing authorities, right? So we could imagine a world where there are many different certificate authorities, each of whom have their own criteria for signing a particular certificate and each of whom offer different varieties of assurance regarding the signature-holder's identity.

From here, the question of "should I allow the user identified by this client cert to use my service" simply becomes a question of 1.) checking the validity of the signatures of the client cert and 2.) deciding if the CA's criteria for signing certs aligns with my desired userbase.

For example, a particular CA might insist that their users go through some real-world process to renew their certification every few years, but when they sign a cert it means that the bearer has been strongly vetted as a real person.

An interesting side effect of this auth model is that a service provider accepting certs from a particular CA has someone to complain to if a user bearing their signature acts improperly on their platform. You could imagine a CA which has a code of conduct expected of the users whose certs they sign, and would perhaps revoke a user's certification if too many websites complain.

unwise-exe · on Sept 20, 2022

That's not safe for a lot of sites, though.

I hear that porn tends to be officially frowned on in a fair number of places.

Reading non-approved news is dangerous in some places.

Honestly debating political topics can be super dangerous if you're identifiable.

Sometimes even having a login on a site is dangerous, I think I heard about this after a non-mainstream discussion site got hacked like a year and a half ago.

mhink · on Sept 30, 2022

So, my thought process here was: given a fairly robust selection of certificate authorities, a given user could have a number of different client certs for use in different trust scenarios. Contrast the following:

- A user bearing a client cert with the name "Jonathan Grant", signed by a U.S. government agency which is known to verify that its signees are citizens of the United States.

- A user bearing a client cert with the name "Michael Black", signed by Alice, who is known to only sign certs after verifying that the real-world identity of the signee matches the name on the cert.

- A user bearing a client cert with the pseudonym "c00ln4m3", signed by Bob, who is known to only sign a single cert for any given real-world person. (To do so, he verifies the person's real-world identity but does not reveal which cert corresponds to which person.)

- A user bearing a client cert with the pseudonym "hunter217", signed by Charlie, who is known to sign certs without verifying the real-world identity of his signees at all, but who is also known to revoke his signature on certificates if a service provider complains about the user bearing that cert.

- A user bearing a client cert with the pseudonym "cypr3ss", signed by David, who is known to charge $1000/year for a cert bearing his signature but performs no other identity verification.

The point of listing out these different scenarios is that the underlying mechanism (client certs) is the same, but the end-user and the service provider don't actually have to trust each other: they only have to agree on a CA with mutually acceptable policies.
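
The "agree on a CA with mutually acceptable policies" step could look roughly like the sketch below. One caveat: a standard X.509 certificate carries a single issuer signature, so the multi-signer cert is part of the hypothetical. The `Assurance` flags and the `CA_POLICIES` table are invented here to mirror the examples above, and cryptographic verification of each signature is assumed to have happened already.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Assurance(Flag):
    NONE = 0
    REAL_NAME = auto()         # name on the cert matches a verified identity
    ONE_PER_PERSON = auto()    # at most one cert per real-world person
    REVOKES_ON_ABUSE = auto()  # CA revokes after service-provider complaints
    COSTLY = auto()            # cert is expensive to obtain


# Hypothetical published policies, modelled on the CAs in the comment.
CA_POLICIES = {
    "Alice": Assurance.REAL_NAME,
    "Bob": Assurance.ONE_PER_PERSON,
    "Charlie": Assurance.REVOKES_ON_ABUSE,
    "David": Assurance.COSTLY,
}


@dataclass(frozen=True)
class ClientCert:
    subject: str
    signers: tuple[str, ...]  # CAs whose signatures were ALREADY verified


def accepts(cert: ClientCert, required: Assurance) -> bool:
    """True if the cert's known signers jointly provide every required assurance."""
    granted = Assurance.NONE
    for ca in cert.signers:
        granted |= CA_POLICIES.get(ca, Assurance.NONE)  # unknown CAs grant nothing
    return (granted & required) == required


if __name__ == "__main__":
    pseudonymous = ClientCert("c00ln4m3", ("Bob",))
    print(accepts(pseudonymous, Assurance.ONE_PER_PERSON))  # forum: one account each
    print(accepts(pseudonymous, Assurance.REAL_NAME))       # bank: needs a real name
```

The service never learns who "c00ln4m3" is. It only learns that Bob vouches for there being exactly one of them.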

Waterluvian · on Sept 20, 2022

I think this is true. It also reminds me of one possible purpose of regulation and government, given the majority will usually be happy to throw any sort of minority under the bus for the "greater good."

This also reminds me of the anxiety of Google deciding to just ban my account for some reason. They can't be bothered to commit resources to making sure mistakes can be resolved. They don't care to lose a fleetingly small percentage of customers.

Not sure I have an answer. Just a thought.

akira2501 · on Sept 20, 2022

> Harsh blocking/limiting/challenging is way too valuable to sites that are actually trying to make money online.

I'm not understanding the generalized sentiment here. How would, for example, a retailer benefit from this strategy? How does it protect their bottom line?

I can see how a particular kind of "facilitated user economy," such as games, gambling and promotional companies could benefit, but it doesn't seem that broadly applicable to what most people would consider a "mainstream" business.

> so we can lay off the IP-based reputation stuff and the geo-blocking: it's tying some form of personal ID to your browsing activity

And a new market for identity theft is born.

Also, as someone who serves content and geo blocks it, that's not up to me, that's up to the owner of the content or whoever happens to be licensing it for them. So, even if you sent me a picture of your government ID, it changes nothing.

yamtaddle · on Sept 20, 2022

> I'm not understanding the generalized sentiment here. How would, for example, a retailer benefit from this strategy? How does it protect their bottom line?

The amount of automated and apparently-manual attempted credit card fraud (and exploit attempts, for that matter) any halfway-prominent site with a CC form is subjected to is hard to appreciate if you've never seen it. It's a whole lot. They aren't even necessarily trying to buy what you have, but to validate that their stolen cards work. And they're quite busy. If too much of that gets through—really, any more than a very tiny amount of it gets through—you're gonna have an extremely bad time.

Various CC service providers like Stripe do provide tools to try to block those attempts, but defense in depth is usually a very good idea, including fairly aggressive firewall-level blocking.

les_diabolique · on Sept 20, 2022

> a retailer benefit from this strategy? How does it protect their bottom line?

A couple of examples I can think of are blocking bots from scraping their site for pricing and details, and blocking resellers from buying up all of the stock (see sneakers, electronics, etc). The last example doesn't directly impact their bottom line, but it will make customers go elsewhere.

ajb · on Sept 21, 2022

That's not a solution, it would be way worse. Companies would then make automated decisions and associate them with your personal ID, and spammers/DDOSers would be spending serious effort to hack their way to using the IDs of innocents. So rather than just your home network or whatever getting a sh*t reputation with no recourse, you would.

JohnFen · on Sept 20, 2022

> it's tying some form of personal ID to your browsing activity

That wouldn't just be bad news, it would be disastrous news. It would immediately render the entirety of the web worthless to me.

tboyd47 · on Sept 20, 2022

How does having a personal ID tied to browsing activity help with spam? Are spammers not real people with IDs?

adamckay · on Sept 20, 2022

Of course, but the theory is it's restricting 1 real person to 1 account, versus 1 spammer creating 1,000 accounts via automation.

And once your spammer has been identified then that's them banned/removed, unable to sign up again.

tboyd47 · on Sept 20, 2022

What's to stop them from using fake IDs

MichaelZuo · on Sept 21, 2022

Airports seem to be able to spot fake passports pretty reliably.

tboyd47 · on Sept 21, 2022

Try forcing 100% of online traffic through an airport security checkpoint.

MichaelZuo · on Sept 22, 2022

Presumably there can be more than one, like real life airport checkpoints? What are you even trying to say?

les_diabolique · on Sept 20, 2022

Spammers typically implement bots to carry out tasks. I mean, technically at some point a spammer is a real person, but when you're automating tasks and using bots, it's not at the same scale.

notsapiensatall · on Sept 20, 2022

So what happens when your ID gets hacked and reused for fraudulent activity?

Would you have to submit a dispute with the internet credit agencies? Maybe join a class action suit against the entity that leaked your ID so that they're forced to give you a year of free internet identity monitoring?

smsm42 · on Sept 20, 2022

The same that happens now when somebody steals your identity and ruins your credit history. You'll have to live in a bureaucratic hell for the next couple of years. And yes, as a compensation, you'll get the $6.99 worth of services from the guilty party. If you win the class action suit, that is.

notsapiensatall · on Sept 20, 2022

Exactly. Why on earth would we want to replicate such a terrible system online?

We should be reforming our current credit agency system, not empowering it with a new mandate of judging somebody's social or political creditworthiness.

jamie_ca · on Sept 20, 2022

Then you need to deal with levels of rate-limiting that are fine for individuals but make it not feasible for spammers.

Keeping with the cloudflare topic, if Cloudflare only permits you 10 requests per second (HTML + JS/images) that's still usable for web browsing, but someone running a cloud of hundreds of bots would be effectively shut down. Similarly with email, an individual probably doesn't need to send more than one email per 10 seconds but email spammers wouldn't find any ROI at that rate - business needs being different might necessitate a different registry or something in that case.
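
A per-identity limit like the "10 requests per second" above is commonly built as a token bucket. This is a generic sketch, not anything Cloudflare is known to run; the `TokenBucket` class and the injectable `clock` parameter are illustrative choices. In practice you would keep one bucket per identity, for example in a dict keyed by ID.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the behaviour can be tested
        self.tokens = capacity      # start full: a fresh identity may burst
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=10, capacity=10)
    burst = [bucket.allow() for _ in range(15)]
    print(burst.count(True), "of 15 back-to-back requests allowed")
```

A page load that fires ten requests at once fits inside the burst, while a bot holding a single identity is capped at the refill rate. It does nothing against a spammer who holds hundreds of identities, which is why it only works together with the one-ID-per-person assumption.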

mcguire · on Sept 20, 2022

Nobody said it wouldn't suck. The only question is whether it sucks less than the alternatives.

les_diabolique · on Sept 21, 2022

If you have a better solution, I'm sure it would be very lucrative.

tboyd47 · on Sept 21, 2022

Looks like Cloudflare beat us to it.

smsm42 · on Sept 20, 2022

They are already testing out digital IDs. Now link that to the social score... and make the browsers and the sites exchange these data in the background, and make frontend services providers refuse connections from non-supporting browsers as "bots"...

hot_gril · on Sept 20, 2022

The other not-so-great approach is to act like a normal user. This stuff doesn't tend to happen to the average Joe who browses the WWW. It's when you're doing unusual (albeit harmless) things.

NelsonMinar · on Sept 20, 2022

Cloudflare is a regular problem for Starlink users. We're on CGNAT so users share IPv4 addresses. I see CAPTCHAs when using Starlink ten times as often as on my other ISP. I don't think it actually breaks things the way this article describes, it seems like a gentler behavior, but it's annoying.

A few months ago I got on Akamai's naughty list (with my other ISP) for some very light automated website downloading. That was a straight block with HTTP errors and I had to use a proxy to access the Web. It cleared up after a few days.

The lack of any user feedback or support for this situation is really annoying. Reminds you how much power the CDNs have. It'd be really bad if loading websites got as difficult as sending email through all the layers of spam filtering.

ThatPlayer · on Sept 20, 2022

I feel like Starlink could at least partially mitigate this by supporting IPv6. T-mobile US supports IPv6, and I hardly notice this as an issue on my phone. Or the time my work ran the business over a 4G mobile while waiting for ISP install.

tomjakubowski · on Sept 21, 2022

A genuine question from an ignoramus: how on earth did Starlink launch a brand new ISP in 2020 which doesn't support IPv6? Is IPv6 really so difficult? Does actually nobody care about IPv6 still, after all these years?

jeremyvisser · on Sept 21, 2022

Not an answer to your question, but an indicator of shared culture: Tesla vehicles also don’t support IPv6 whatsoever.

Things you might use an internet connection for in your Tesla include triggering air con remotely, live traffic and satellite maps, streaming music or online radio, web browsing, or YouTube/Netflix/Disney+ clients.

It completely refuses to use IPv6 over mobile or wi-fi. Also it refuses to access anything over IPv4 (apart from DNS) which resolves to an RFC1918 address, even if it's connected to said RFC1918 network.

So yes, Starlink and Tesla are different companies, but I see cultural parallels which I'm sure surprise nobody.

tomjakubowski · on Sept 21, 2022

That's wild. I wonder if Musk has some wealth tied up in the scarcity of IPv4 addresses, like an IP NIMBY.

NelsonMinar · on Sept 21, 2022

I've been wondering that too, it's kind of confounding. Particularly since Starlink started with CGNAT to manage a limited IPv4 address allocation.

FWIW there've been hints from time to time that Starlink was working on IPv6; users reported being given working addresses. That mostly stopped though when they took over ISP operations from Google last year.

Syonyk · on Sept 20, 2022

> Cloudflare is a regular problem for Starlink users. We're on CGNAT so users share IPv4 addresses. I see CAPTCHAs when using Starlink ten times as often as on my other ISP. I don't think it actually breaks things the way this article describes, it seems like a gentler behavior, but it's annoying.

I've been noticing this too, and it's why Starlink remains my secondary ISP/bulk transfer connection. If I had to drop one connection, I'd drop Starlink for this reason alone.

There are some sites that I simply can't browse, and it's not Cloudflare errors, either. Lowes, in particular, simply returns error pages for anything but the main landing page on a regular enough basis. Of course, my observed public IP changes so it's not consistent, but it's genuinely annoying.

cma · on Sept 20, 2022

> I've been noticing this too, and it's why Starlink remains my secondary ISP/bulk transfer connection. If I had to drop one connection, I'd drop Starlink for this reason alone.

Could Cloudflare legally charge them a bribe to captcha their users less? It isn't good to have a company in this position of power if so.

somedude895 · on Sept 20, 2022

> If I had to drop one connection, I'd drop Starlink for this reason alone.

Why are you using Starlink at all if you have other options?

Syonyk · on Sept 20, 2022

Because my other connection is a 25/3 WISP link that mostly doesn't. I generally see about 5/1 in the evenings, if that.

I've had several area WISP connections, as there's no wired infrastructure to my area, and they vary in quality. I work full time remote, so I need two connections as a general habit - I can work with one, but when that one is down for a week straight, I have problems. I like being able to fail over.

I typically keep one connection for "interactive" traffic, and one for "bulk transfer/failover" - things like my local Ubuntu repo mirror, offsite backup traffic, etc. And I can fail to it if needed, which I do often enough.

On a good day, Starlink is far better than my WISP connection, and I have some machines routed out it persistently. On a bad day, I can't hit much from it, because that particular public IP has been blocked from large parts of the internet. It's very hit and miss, and overall bandwidth has definitely dropped from the early days, though reliability of getting packets where they need to go is drastically improved.

throwaway742 · on Sept 20, 2022

I wonder if an IPv6 tunnel broker to get IPv6 addresses would help with your Starlink problems.

Syonyk · on Sept 21, 2022

Starlink supports IPv6 addresses, I'm sure it would help. My network infrastructure is just lacking general IPv6 support, as I've not cared to get it set up in great detail, and my testing has demonstrated that my IPv6 addresses behind my router are publicly reachable, so... I'll get around to proper firewalling at some point.

I could do a range of things to solve it, but as I have two ISPs, I typically just switch to the one that works better for the situation. I'm aware it's not a technically fancy solution, but it's quick and easy (change the gateway on the machine), and works fine.

causi · on Sept 20, 2022

What archival tool were you using? I've been looking for a replacement for HTTRACK forever.

NelsonMinar · on Sept 20, 2022

A combination of shotscraper and metascraper; really more web previews than archives. And in a single thread, to different hostnames, maybe one every 10 seconds? Honestly surprised Akamai or anything even noticed. I fake my user agent now, lesson learned.
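A rough Python sketch of that access pattern (one request at a time, a fixed pause between requests, an explicit User-Agent header). This is not the actual shotscraper/metascraper setup; `fetch`, `fetch_slowly` and their parameters are hypothetical names, and the 10-second default just mirrors the figure mentioned above.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable


def fetch(url: str, user_agent: str) -> bytes:
    """Download one URL, sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def fetch_slowly(
    urls: Iterable[str],
    user_agent: str,
    interval_s: float = 10.0,
    fetch_one: Callable[[str, str], bytes] = fetch,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Fetch URLs sequentially, pausing `interval_s` seconds between requests."""
    results: Dict[str, bytes] = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(interval_s)  # pause before every request except the first
        results[url] = fetch_one(url, user_agent)
    return results
```

`fetch_one` and `sleep` are injectable only so the loop can be exercised without touching the network or actually waiting.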

justoreply · on Sept 20, 2022

But any automated tool won't work. I have a similar problem with my self hosted feed reader, my vps hosting ip doesn't have 100% reputation with Cloudflare and I can't download some feeds

Edit: spelling

hedora · on Sept 21, 2022

I moved from a local CGNAT'ed WISP to starlink.

Starlink is at least 10x better (fewer captchas).

I'm really hoping Cloudflare gets busted for having backroom deals with big ISPs or something. (For instance, if the CGNAT had a Cloudflare CDN cache endpoint behind / associated with it, I suspect the IP would be whitelisted.)

diebeforei485 · on Sept 20, 2022

Cloudflare said they're working on this- https://blog.cloudflare.com/eliminating-captchas-on-iphones-...

hedora · on Sept 21, 2022

That's... The opposite of working on this. It's moving the internet further away from being an interoperable, endpoint-agnostic medium.

viraptor · on Sept 21, 2022

They're working on double dipping by providing both the problem and the solution. Somehow this is not a recurrent issue for every other CDN / ddos shield. They're not even mentioning any other hosting company collaborating on this open solution that requires hardware from a specific company they totally don't have a deal with...

Vorh · on Sept 22, 2022

Anecdote: I've been using Starlink for about a year now, and I've had no trouble with Cloudflare.

therealmarv · on Sept 20, 2022

If you surf desktop sites from the Philippines on a mobile phone plan (which is often the best Internet connection in that country) you also get Cloudflare's captchas everywhere.

I've said it before and I'll say it again: Cloudflare is dividing the World between first and second/third World countries with their captchas. I call it discrimination against second/third World countries! If you are from the US or Europe you will never notice it, but if you travel a little bit more you see these blocking captchas everywhere.

chrismorgan · on Sept 20, 2022

I’ve had a similar experience in India with wired internet from a local ISP: CGNAT is used so there are who knows how many customers on the same IPv4 address, https://iknowwhatyoudownload.com/ shows at least forty hours of movies being downloaded every day, the IP address is on half the blacklists out there because someone is part of an email-sending botnet, and yeah, Cloudflare hates you.

MichaelZuo · on Sept 21, 2022

Is there even any way to reliably identify individual users behind a CGNAT without invasive fingerprinting?

ReptileMan · on Sept 20, 2022

I am from Europe and I notice it if I use a non-residential IP. The captchas are extremely annoying, especially when trying to access a site I have already logged into with 2FA. Who is protected in this case?

thewebcount · on Sept 20, 2022

I get it browsing from a major ISP in the US. I have the gall to browse in private mode and to block trackers and ads because of all the malware they contain. (And I don't use a browser that requires me to login just to browse the web - gasp!) And apparently, that means I'm worthy of this sort of punishment as well.

thephyber · on Sept 21, 2022

> I have the gall to browse in private mode and to block trackers and ads because of all the malware they contain.

I do these things as well. It’s been months since I’ve seen a CloudFlare challenge page.

thewebcount · on Sept 22, 2022

I don't get them as frequently as discussed in the article, but they come up a few times per week.

aendruk · on Sept 20, 2022

The other side of this story is that PLDT stands out from other residential networks as a persistent source of web form spam. I’d love to learn what’s going on differently there.

Dma54rhs · on Sept 20, 2022

I get these a lot and I'm from EU. But it's "seasonal".

Jamie9912 · on Sept 20, 2022

Maybe your mobile ISPs don't do enough to stop malicious/spam traffic. That's not Cloudflare's fault.

therealmarv · on Sept 20, 2022

It only affects Cloudflare hosted sites though.

Jamie9912 · on Sept 21, 2022

That's true, but it's the Cloudflare customer who decides what to block by default with their Firewall setting mode, and custom rules etc

therealmarv · on Sept 21, 2022

The default is to block most malicious sites or something, which in their opinion is everything outside of regular US and Europe networks. And that's wrong. Many people also do not change defaults.

thephyber · on Sept 21, 2022

No, it affects visitors of “Cloudflare hosted sites”. It also affects all sites not hosted by CloudFlare.

You are complaining about the use of IP as an imperfect signal. Everyone involved knows that. But it’s still better than the current alternatives.

DethNinja · on Sept 20, 2022

There is a chance you might’ve been hacked.

You would be surprised to see how easy it is to hack domestic routers.

1. Find and disinfect the devices, including the router. If you don’t have enough technical knowledge, then buy a new router.

2. Use a 30-character random password on the router.

3. Disable UPnP.

4. Anything with Wi-Fi and a weak password can be hacked within minutes, so check your other devices as well, especially IoT ones.
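For step 2, one way to get a 30-character random password is Python's `secrets` module. A minimal sketch; the function name `random_password` and the letters-plus-digits alphabet are my own choices, not part of the advice above.

```python
import secrets
import string


def random_password(length: int = 30) -> str:
    """Return a random password drawn from ASCII letters and digits.

    Uses `secrets`, which draws from the OS's cryptographically secure
    random source (unlike the `random` module).
    """
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(random_password())
```

Sticking to letters and digits avoids router admin pages that choke on punctuation; at 30 characters that is still far beyond guessable.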

mh- · on Sept 20, 2022

My assumption is also that something on his network is compromised, and getting his IP into reputation issues.

Tarpitting (serving content slowly from the edge, in order to slow down bots) is necessarily one of the most expensive tools in a WAF/CDN's toolbox.
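To make the cost concrete, here is a toy Python sketch of the idea: the response is dripped out in small chunks with a pause between them, so the server side holds the connection open for the whole duration. The names `tarpit_chunks` and `hold_time_s`, and the chunk size and delay values, are invented for illustration; this is not how any particular WAF/CDN implements it.

```python
import time
from typing import Callable, Iterator


def tarpit_chunks(
    body: bytes,
    chunk_size: int = 16,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield `body` in small chunks, pausing `delay_s` seconds between chunks."""
    for start in range(0, len(body), chunk_size):
        if start > 0:
            sleep(delay_s)  # the connection stays open while we wait
        yield body[start:start + chunk_size]


def hold_time_s(body_len: int, chunk_size: int = 16, delay_s: float = 1.0) -> float:
    """Seconds spent sleeping (connection held open) for one tarpitted response."""
    chunks = -(-body_len // chunk_size)  # ceiling division
    return max(chunks - 1, 0) * delay_s
```

With the defaults, a 1 KiB body is 64 chunks and ties up a connection for 63 seconds, per client being slowed down, which is where the expense comes from.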

It's much more likely that something on his network is sending sketchy traffic to CF-fronted/Google sites, and the slow loading he's experiencing elsewhere is because his upstream is being saturated by whatever is happening on his network.

d2wa · on Sept 20, 2022

(Author here.) My router isn’t a domestic router. It’s a MikroTik running RouterOS, completely unsupported by the ISP. Outgoing connections and DNS are logged. UPnP is only allowed for the Xbox, PS4, and off-most-of-the-time gaming PC. Nothing out of the ordinary in the logs.

alexforster · on Sept 20, 2022

> It’s a MikroTik running RouterOS

https://google.com/search?q=mikrotik+botnet

These things are the absolute scourge of the internet.

d2wa · on Sept 22, 2022

They're a powerful tool that lets you shoot off your foot and half your brains with the same bullet. However, my router isn't compromised. MikroTik routers can easily be misconfigured to be insecure or misbehave. It's a Cisco clone, so that is the product you're buying.

I don't recommend them to anyone who doesn't enjoy and isn't familiar with the lower-level intricacies of network operations.

aaronmdjones · on Sept 20, 2022

> It’s a MikroTik running RouterOS

It's almost certainly compromised.

d2wa · on Sept 22, 2022

No. It isn't.

malfist · on Sept 20, 2022

Why would you disable UPnP? You're gonna break most collaboration tools/video games/etc.

kunwon1 · on Sept 20, 2022

Disabling UPnP doesn't break much. I've used enterprise firewalls at home for years, none of them have UPnP, and I've never noticed a problem arising from that lack. I don't have a problem with video games or collaboration tools.

UPnP allows devices inside your network to open ports to the outside world without your knowledge. I think everyone should avoid it if they can get by without it

d2wa · on Sept 20, 2022

It’s absolutely required for most multiplayer games. Many need random ports and some even refuse to work if UPnP is blocked even if you manually open a port for them.

aaronmdjones · on Sept 20, 2022

I've never had UPnP enabled and I don't have any problems doing online gaming / flight sim / video chatting / etc.

emikulic · on Sept 21, 2022

Same. I've found the biggest problem was SNAT rewriting the (source) port number. netfilter, by default, doesn't do this. pf does but you can configure it not to.

Zizizizz · on Sept 21, 2022

On series X you can set up port forwarding really easily. I had to do it for openwrt

malfist · on Sept 20, 2022

What's your solution for the grandmother who just wants to make a Zoom call to her grandson? Have her log into her router portal and set up a static IP for her laptop and then port forwarding rules for Zoom?