> After config fetch, the SDK opens a persistent WebSocket to:
wss://proxyjs.brdtnet.com:443
This hostname resolves to AWS Global Accelerator IPs
There is some irony that both the scrapers and the websites being scraped are probably hosted on AWS, while playing an elaborate cat-and-mouse game pretending that they weren't.
Cloudflare sells DDoS protection services to DDoS control panels.
Adding to DNS block list immediately.
Don't forget the config endpoint before as well.
> On every launch the SDK calls:
GET https://clientsdk.bright-sdk.com/sdk_config_ios.json?appid=<bundle>&ver=<sdk-version>&uuid=sdk-ios-<32hex>
Kind of like how the American government needs commercial businesses, which it poorly regulates, so that those businesses can provide privacy invasions as a legal means for it to wash its hands.
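For anyone who wants to act on the two hostnames quoted above, here is a minimal sketch that turns them into hosts-file style entries and checks whether a queried name falls under them. The function names and the `0.0.0.0` sink address are my own choices for illustration, and the list is almost certainly incomplete. As noted elsewhere in this thread, name-based blocking does nothing against hardcoded IPs or an app's own DoH resolver.

```python
# Hostnames quoted in the thread; treat this list as incomplete.
BLOCKED_HOSTS = ["proxyjs.brdtnet.com", "clientsdk.bright-sdk.com"]


def hosts_entries(hosts, sink="0.0.0.0"):
    """Return hosts-file lines that point each hostname at a sink address."""
    return [f"{sink} {host}" for host in sorted(set(hosts))]


def is_blocked(name, hosts=BLOCKED_HOSTS):
    """True if name equals a blocked host or is a subdomain of one."""
    name = name.lower().rstrip(".")
    return any(name == h or name.endswith("." + h) for h in hosts)


if __name__ == "__main__":
    print("\n".join(hosts_entries(BLOCKED_HOSTS)))
```

The output can be pasted into a hosts file or adapted to whatever format your DNS sinkhole expects.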
Same for arms dealing, and every other industry.
I never connect any “smart” device to wifi. If it doesn’t work without connectivity, I don’t want it. I use my TVs as display devices. They have HDMI-in and that’s it.
I have a smart TV that's never spoken to the internet after exiting the factory, but it's a pretty tenuous state of affairs. I have this fear that someone staying over is going to see the "Services unavailable, press [menu] to troubleshoot" toast that shows up overtop the HDMI feed for a few seconds and think they're helping me by connecting it. 4-5 years' worth of firmware updates all at once... half a decade of watch data somehow extracted from the HDMI feed and stored for precisely this moment... ads everywhere. Even if it doesn't happen instantly, I can only assume there's some flag deep in the OS called makeEverythingWorse just waiting to be flipped on the femtosecond The Beast catches a whiff of a slightly-higher patch number; now content in its doomed state after having fulfilled its one true purpose of telling someone at Samsung my favourite show is HDMI2.
I have had to back my mother down from that precipice on her own TV so I know it's worth worrying about. The siren call of an entirely empty TV homescreen beckoning us with a struck-out radio tower icon. "We have Disney+ and CraveTV too... press [menu]... pay no attention to the sticky note your son put on the coffee table"
I fear the same, but I make sure to block basically everything at the network level. First thing I do is hook the TV to the network and black-hole its MAC address.
Or open up the TV and disable its wifi hardware.
That is the only solution, otherwise there is no guarantee that someone just won't connect the TV to their phone's hotspot.
> I have this fear that someone staying over is going to
This happened to me. After they left, I tried a factory reset, but I don't have confidence there's not some code to remember previously saved wifi connections because my tinfoil hat is firmly in place. However, as you've said I only use the TV as an HDMI receiver. None of the TV's apps are used again. So I'm not sure how much they can detect from just the use of the HDMI port as the only thing being used. The games we play to get the subsidized pricing.
HDMI is heavily used for ACR (automatic content recognition) in smart TVs:
"Our findings indicate that (1) ACR operates even when it is used as a “dumb” display via HDMI"
"For both LG (a) and Samsung (b)TVs, the scenarios with the highest ACR traffic are Linear and HDMI."
* https://dl.acm.org/doi/epdf/10.1145/3646547.3689013
Find the TV’s MAC address and block it on your router. My brother's home network had a system where your MAC address had to be whitelisted on the router to communicate with the network; as the days go by, I see in hindsight how this might be for the best.
I’m paranoid that actually blocking internet access to the TV will result in the TV’s disk filling up with all of the intrusive data it has collected and is waiting to upload, until it eventually runs out of space and bricks the TV. This could be just bad software, or actually malicious, where they intentionally break something if it loses connectivity for too long and they can see you using it with other connected devices.
We really need normies to care enough about this that manufacturers feel they need to advertise their TVs as privacy-friendly, collecting nothing, as a selling point. Until then, they don’t really care. I just wish someone like Apple made a TV with their Apple TV functionality baked in that I could trust.
Lots of people do it and I haven't seen anybody reporting this. Given the miserly hardware specs most smart TVs have, if this were a problem, it wouldn't take years to fill up the small storage space most of those TVs come with.
I split my network(s) into subnets (sharing the same wire, not to be confused with the actual subnets which don't share the same wire) which correspond to routability policies. This in turn involves firewall rules, routing table entries, and DHCP configs corresponding to those subnets.
I give away the software which does the following. I get this (and a lot more) for every host on my network, and I know what every host is.
That's fine! With technologies like Amazon Sidewalk and cheap and cheerful 4g/5g radios nobody NEEDS to ask your permission to connect any more. You COULD use an older device but you don't want to be left behind, do you? And your peers will think you're poor or possibly a C.H.U.D. if you don't just accept your digital yoke.
Frustratingly, I do want some of the functionality that comes with connecting my TV to the network - specifically the ability to control things like turning it on and choosing which input it's set to via an API it exposes. That's manageable by putting it on a VLAN which isn't allowed access to the outside world, but it's also really annoying to me that I have to do that.
On my TCL TV, you have to connect it to read the Google policies you are agreeing to. If you don't, you agree to policies unread.
Thankfully, the blast radius of this is nothing without connectivity.
If it has an Ethernet port I would use that then unplug it. It still gets to phone home once but you don't have to worry about it maliciously saving your Wi-Fi password for later
You can create a guest wifi with temporary password, I do that when I need to connect devices that might store the password like kindle or such.
But it lets you continue without reading them? There are a lot of questionable terms of service rules, but this one has to be unenforceable.
You must check a checkbox in agreement to continue. To read the policies one agrees to, an internet connection is required. You may check the checkbox without reading.
As far as I have found from a lot of menu spelunking, this agreement is irrevocable. If I ever go online, it will be used.
That's the kind of thing that doesn't always hold up in court.
If I don't connect to the internet ever, my agreement to Google policies is probably a moot point.
> The SDK’s config ships a flag “use_netifs”: true. That flag triggers code in the SDK binary that constructs its NWConnection with a specific required interface: en0 (WiFi) or pdp_ip0 (cellular), rather than using the system default route.
> On iOS, this bypasses any configured VPN’s tun0 interface entirely. The peer tunnel does not cross a user-configured VPN, even when the rest of the app’s HTTPS traffic does.
What's a legitimate use case for this API? When/why should an app be allowed to bypass a user-configured VPN?
> What's a legitimate use case for this API?
When you're the application providing the VPN or when you're any app built to communicate with something on a local-ish network, not something actually reachable globally.
> When/why should an app be allowed to bypass a user-configured VPN?
Temporarily: if full tunnelling isn't working, one can split-tunnel to route around issues caused by the VPN.
But imo an app should never bypass something like a network boundary.
Look at how far TikTok can go if you try blocking DNS. The hardcoded IPs, self-DNS-resolution and cat-and-mouse game of blocking is quite... interesting.
Is there anywhere I could read more about this?
https://github.com/M4jx/TikTokBlocklist
I think they may have scaled back from this, but they were running a 100% malware-style playbook to hit the TikTok servers like it was some kinda sketchy C2 package. Lots of attempts of their own DoH (and DoT!) and normal DNS servers to try to get into the TikTok network.
Naive question: what would I search for to find a tutorial on how to detect this on my devices, which are mostly iOS, or in my home network?
I'd love to find and remove any apps from my devices that have this SDK active.
There could be better, but this looked reasonable at first glance if you also have a Mac.
https://www.thequantizer.com/tutorials/wireshark-iphone-traf...
It has been a while since I personally did such traces, but Wireshark was very simple to use and once the network is exposed, it has lots of information available online if you need more.
I found bypassing your VPN particularly appalling, as is the whole thing. Personally, it would be amazing if there were a limit on how much can be in Terms of Service, as no one wants to read that much anymore.
Not if my firewall blocks it from accessing the outside world. (But allows HomeAssistant to control it)
One of the problems I can see here is the problem that running a Tor exit node has: badly behaved users are going to be using it to hide their location.
Imagine having the police show up at your door because they've figured out that you're trafficking child porn, when the actual culprit is someone that is using your TV as a proxy to trade child porn.
I genuinely dislike how user-hostile everything has become. I effectively have to become an expert in nearly everything and track all news on the off chance something major upends previous assumptions. And if I miss it somehow and complain about it, defenders will come out of the woodwork to defend, deflect or derail the conversation.
If there is any good news about this, it is that the fatigue seems to be hitting normal people. A buddy from work complained to me about how he is now forced to be a full-blown wifi/internet admin so that his kids' restrictions/limits are appropriately enforced.
I am just venting, because I am not entirely certain what an appropriate solution here is.
Solution is more regulation, stronger consumer organizations, and privacy watchdogs with actual teeth.
Groups like Bright Data have pretty good KYC. After the scare of the police visit, the actual perpetrator would go to prison.
> Groups like Bright Data have pretty good KYC.
They don't. I use their residential proxies without ever having KYC'd.
Are there any defenses I can put in front of my websites that are good for stopping these things? The amount of traffic I see from residential proxies is just killing me. In particular defense against residential proxies.
Make your server so efficient that a few extra requests don't bring it down.
Alternatively, if it's the first time the IP is seen and it's a deep linked page with no referer, send a neverending chunked gzip data stream.
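A rough sketch of what the never-ending chunked gzip stream could look like, using only Python's `zlib`. The generator name, the `limit` parameter (there only so it can be stopped in a test) and the 64 KiB block of zero bytes are my own assumptions. A real server would send `Content-Encoding: gzip` and `Transfer-Encoding: chunked` headers first and then write these frames to the socket for as long as the client keeps reading.

```python
import zlib


def gzip_chunks(block=b"\0" * 65536, limit=None):
    """Yield HTTP/1.1 chunked-encoding frames carrying one continuous gzip stream.

    With limit=None the generator never ends and the gzip stream is never closed.
    """
    # wbits = 16 + MAX_WBITS asks zlib for a gzip header instead of a raw zlib one.
    comp = zlib.compressobj(9, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    sent = 0
    while limit is None or sent < limit:
        # Z_SYNC_FLUSH pushes out everything compressed so far without ending the stream.
        data = comp.compress(block) + comp.flush(zlib.Z_SYNC_FLUSH)
        sent += 1
        # Chunk frame: hex length, CRLF, payload, CRLF.
        yield b"%x\r\n" % len(data) + data + b"\r\n"
```

Zeros compress extremely well, so each frame is tiny on the wire while the client has to inflate 64 KiB per frame. Whether a given scraper actually decompresses the body is something you would have to find out by trying.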
The bots used by these proxies are detectable in a few ways. Remember the bot itself doesn't run on the proxy...
There is discernible lag from proxy to c&c node. The individual bots don't have access to a lot of compute, and are sometimes restricted wrt feature set (e.g. proprietary video codecs).
There are a few other techniques. It's a cat and mouse game though. And the bot owners are usually more motivated than you are.
Add a captcha or proof-of-work challenge in front of your website. Those are pretty much your only options.
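For the proof-of-work option, a hashcash-style sketch: the server hands out a random challenge, the client has to find a nonce whose SHA-256 digest starts with a given number of zero bits, and the server checks it with a single hash. The function names and the challenge/nonce encoding are made up for illustration; this is not how any particular product does it.

```python
import hashlib
import itertools
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """Server side: one hash to check the client's work."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= bits


def solve(challenge: bytes, bits: int) -> int:
    """Client side: brute-force a nonce, about 2**bits hashes on average."""
    for nonce in itertools.count():
        if verify(challenge, nonce, bits):
            return nonce


if __name__ == "__main__":
    challenge = os.urandom(16)  # issued per visitor by the server
    nonce = solve(challenge, 16)
    print(nonce, verify(challenge, nonce, 16))
```

The cost is asymmetric: the visitor pays thousands of hashes per page while the server pays one. That fits the earlier point that the individual bots don't have much compute, though a motivated operator can always pay the cost.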
If this kind of proxying isn't illegal, in my opinion it should be -- saying it's bordering on circumvention of fundamental assumptions about Internet routing and IP address leasing (and ownership) would be a sorry understatement compared to what Bright Data has managed to package into a product payment:
> you are allowing Bright Data to occasionally use your device’s free resources and _IP address to download public web data from the internet_. (emphasis mine)
I think the misleading part -- to the end-user -- is the "download public web data" part. If the data is public, why can't Bright Data download it themselves? Well, because the other end doesn't want them to, apparently. The product is to make you help Bright Data circumvent the undesired properties of the "public" data providers, on behalf of someone who happens to have the cash but as of yet is at the short end of the Internet stick (for all the right reasons, I'd say).
This is absolutely deplorable, but knowing the directions this is heading, I am neither surprised nor concerned, frankly. People have long voted with their wallet -- it's not the privacy-conscious Joe the Hacker that is being proxied through here, it's our parents and millions of people who just want entertainment at the end of the working day, including _parents_ of small children.
Day by day the dark Internet theory sounds more plausible, and frankly I am all there for it. The Internet will collapse into a feudal internetwork where any routing will need a hop-by-hop key, so real people (and agents, frankly) can maintain a measure of trust that right now is being actively circumvented.
It's completely legal and the law you mentioned about IP routing and address ownership does not exist.
I found some 60 iOS apps that have the SDK mentioned in the article: https://appgoblin.info/sdks/brdsdk.framework (sorry this requires a free login due to heavy scraping, feel free to contact me for list)
I was unable to find related Android SDKs. I tried looking at the various apps on AppGoblin to find the android versions, then looking through their unmapped SDK parts but didn't see anything.
https://github.com/BrightSDK/bright-sdk-gradle-plugin-docs
This looks like it should just be "com.brightdata" but I did not find anything. With 60 iOS apps there must be apps with Android SDK, but I'm not sure why I am not finding any.
If anyone knows, or would like to chat feel free to connect. I'm happy to share data.
Android apps are usually obfuscated, ostensibly to make them smaller, and now obviously to hide what's inside. Only a basic level of obfuscation is typically used, so you would have more luck searching for strings.
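Searching for strings can be as simple as treating the APK as the zip archive it is and scanning every entry's raw bytes. The sketch below does that; the needles are just the domains quoted elsewhere in this thread, and it is only an assumption that an Android build would reference the same ones. The `find_strings` name is mine.

```python
import zipfile

# Domains quoted in this thread; an Android SDK may well use different ones.
NEEDLES = ["brdtnet.com", "bright-sdk.com", "luminatinet.com"]


def find_strings(apk, needles=NEEDLES):
    """Map each needle to the archive entries whose raw bytes contain it.

    `apk` can be a path or a file-like object; an APK is a plain zip archive.
    """
    hits = {needle: [] for needle in needles}
    with zipfile.ZipFile(apk) as zf:
        for info in zf.infolist():
            data = zf.read(info)
            for needle in needles:
                if needle.encode() in data:
                    hits[needle].append(info.filename)
    return hits
```

Class and package names get renamed by the obfuscator, but hostnames usually have to survive as literal strings somewhere (typically in `classes*.dex` or a native `.so`) unless the SDK goes out of its way to encrypt them.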
Why can't you just post the list as a comment?
> The TLS certificate is CN=*.luminatinet.com — the domain for Luminati Networks, Bright Data’s pre-2018 corporate name
Ah yes. The big privacy scraping company called themselves The Luminati. It’s like they are side-investing in tin foil hats or something.
Having never owned a television because of how much I disliked advertising when TV was the primary delivery method, the feeling of having avoided a life sentence of being lashed to the tube is weird. I know that people might catch me looking all too intently into their eyes, trying to see if they are really in there.
Phones do the same thing...
Years ago I had a smart TV, and while I never used anything “smart”, one day I connected it to the network to update it and forgot about it. Two days later I was checking my DNS, and 80% of the traffic and blocked queries in the past two days were from one device. After tracking it down, it was the TV!
So what I have now is a pre-smart TV I found at the thrift, still very good picture that’s more than enough for the few times I use it.
There should be a way to disable the “smart” garbage in new TVs, or an option to buy normal ones at least.
I find Cloudflare to be more unethical than Bright Data.
Both are causing a dynamic that will lock down the internet evermore for everything straying slightly from the corporate-approved line.
If the divide was data center vs residential IPs, fine, but thanks to Bright Data and friends, residential IPs are getting suspicious as well, so I guess the next step is full-on client verification then...
DC and residential IPs aren't real categories that exist either. They are guesses by IP reputation companies. Nothing except practicality stops an ISP from mixing them both into the same DHCP pool.
I wish federal or state laws could force transparency, because asking for privacy is a dead end at this point. Just force products and providers that run in my home to disclose where they phone home to. Then I can decide what to do with that, whether I send them to a black hole or let them pass.
These are legitimate client devices. Good luck with that.
FTA:
> MDM, mobile EDR
Anyone care to ELI5 these?
MDM: Mobile Device Management. Software that helps ops folks control a fleet of mobile devices like tablets, phones, etc…
Mobile EDR: Endpoint detection and response. This is cybersecurity software to monitor and deal with network activity happening in mobile devices like tablets, phones, etc…
So wait a second then, it connects out using a websocket to its bot C&C server, right?
Which presumably passes it a URL to scrape and waits for it to return the data.
What happens if I write my own tool that connects to that C&C server, waits for a URL to scrape, and returns gigabytes of freshly brewed hot horseshit?
Most scraped websites use HTTPS, so you would need to perform a MITM attack. Scrapers will probably notice that.
No, you just need to stand up your own website and feed the scraper a URL to it.
I would just generate scads of Markov chain output and make it look like a plausible web page.
That's pretty much what the bots are scraping now, with all the AI slop websites out there.
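The Markov chain idea above could be sketched like this: build a word-level chain from any text you have lying around, walk it to produce filler, and wrap the result in minimal HTML. All names here are hypothetical, and the order-2 chain is just a common default, not something from the thread.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words seen following it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain


def babble(chain, length=50, seed=None):
    """Walk the chain to produce `length` words of plausible-looking filler."""
    rng = random.Random(seed)
    states = list(chain)
    state = rng.choice(states)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (the state only occurred at the end of the corpus): restart.
            state = rng.choice(states)
            out.extend(state)
            continue
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out[:length])


def as_page(body, title="Article"):
    """Wrap generated text in a bare-bones HTML document."""
    return (f"<!doctype html><html><head><title>{title}</title></head>"
            f"<body><p>{body}</p></body></html>")
```

Feed `build_chain` a corpus with more words than the chain order, then serve `as_page(babble(chain, 2000))` for as many URLs as the scraper cares to request.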
How would https affect it?
If they're making a request to my machine to go and curl a page, how do they even know whether or not it was https?
Not sure about Bright Data but these are usually SOCKS or HTTP CONNECT proxies because that's most flexible. But the customer might be paying by the gigabyte, so you can still feed them nonsense, maybe a 4 gigabyte TLS certificate.
Not the one in my living room.