The Smart TV in Your Living Room Is a Node in the AI Scraping Economy

167 points | by nikcub 8 hours ago

68 comments

xg15 6 hours ago
> After config fetch, the SDK opens a persistent WebSocket to:
wss://proxyjs.brdtnet.com:443
This hostname resolves to AWS Global Accelerator IPs
There is some irony that both the scrapers and the websites being scraped are probably hosted on AWS, while playing an elaborate cat-and-mouse game pretending that they weren't.
- trumpdong an hour ago
  Cloudflare sells DDoS protection services to DDoS control panels.
- BLKNSLVR 2 hours ago
  Adding to DNS block list immediately.
  - xg15 2 hours ago
    Don't forget the config endpoint before as well.
    > On every launch the SDK calls:
    GET https://clientsdk.bright-sdk.com/sdk_config_ios.json?appid=<bundle>&ver=<sdk-version>&uuid=sdk-ios-<32hex>
- cyanydeez 5 hours ago
  Kind of like how the American government relies on commercial businesses, which it regulates poorly, so that those businesses can carry out privacy invasions and give it a legal means to wash its hands.
  - rootsudo 3 hours ago
    Same for arms dealing, and every other industry.
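For the DNS block list mentioned upthread, here is a minimal sketch that turns the two endpoints quoted from the article into hosts-file style entries. The sink address `0.0.0.0` and the function name `hosts_entries` are illustrative choices, not anything from the article, and a hosts entry only covers the exact hostname, so it would not stop an app that uses hardcoded IPs or its own resolver.

```python
from urllib.parse import urlsplit

# Endpoints quoted upthread from the article; nothing else is assumed.
ENDPOINTS = [
    "wss://proxyjs.brdtnet.com:443",
    "https://clientsdk.bright-sdk.com/sdk_config_ios.json",
]


def hosts_entries(urls, sink="0.0.0.0"):
    """Return hosts-file lines that point each URL's hostname at a sink address."""
    names = sorted({urlsplit(u).hostname for u in urls})
    return [f"{sink} {name}" for name in names]


if __name__ == "__main__":
    print("\n".join(hosts_entries(ENDPOINTS)))
```

The output can be appended to whatever block list format your resolver accepts; most DNS-level blockers take the bare hostnames as well.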
cobbzilla 6 hours ago
I never connect any “smart” device to wifi. If it doesn’t work without connectivity, I don’t want it. I use my TVs as display devices. They have HDMI-in and that’s it.
- graypegg 3 hours ago
  I have a smart TV that's never spoken to the internet after exiting the factory, but it's a pretty tenuous state of affairs. I have this fear that someone staying over is going to see the "Services unavailable, press [menu] to troubleshoot" toast that shows up over the HDMI feed for a few seconds and think they're helping me by connecting it. 4-5 years' worth of firmware updates all at once... half a decade of watch data somehow extracted from the HDMI feed and stored for precisely this moment... ads everywhere. Even if it doesn't happen instantly, I can only assume there's some flag deep in the OS called makeEverythingWorse just waiting to be flipped on the femtosecond The Beast catches a whiff of a slightly-higher patch number; now content in its doomed state after having fulfilled its one true purpose of telling someone at Samsung my favourite show is HDMI2.
  I have had to back my mother down from that precipice on her own TV so I know it's worth worrying about. The siren call of an entirely empty TV homescreen beckoning us with a struck-out radio tower icon. "We have Disney+ and CraveTV too... press [menu]... pay no attention to the sticky note your son put on the coffee table"
  - dawnerd an hour ago
    I fear the same, but I made sure I block basically everything at the network level. First thing I do is hook the TV to the network and black-hole its MAC address.
    - dreamcompiler 40 minutes ago
      Or open up the TV and disable its wifi hardware.
      - rationalist a minute ago
        That is the only solution; otherwise there is no guarantee that someone won't just connect the TV to their phone's hotspot.
  - dylan604 2 hours ago
    > I have this fear that someone staying over is going to
    This happened to me. After they left, I tried a factory reset, but I don't have confidence there's not some code to remember previously saved wifi connections because my tinfoil hat is firmly in place. However, as you've said I only use the TV as an HDMI receiver. None of the TV's apps are used again. So I'm not sure how much they can detect from just the use of the HDMI port as the only thing being used. The games we play to get the subsidized pricing.
    - Eisenstein 2 hours ago
      HDMI is heavily used for ACR (automatic content recognition) in smart TVs:
      "Our findings indicate that (1) ACR operates even when it is used as a “dumb” display via HDMI"
      "For both LG (a) and Samsung (b) TVs, the scenarios with the highest ACR traffic are Linear and HDMI."
      * https://dl.acm.org/doi/epdf/10.1145/3646547.3689013
  - archerx 2 hours ago
    Find the TV’s MAC address and block it on your router. My brother’s home network had a system where your MAC address had to be whitelisted on the router to communicate with the network. As the days go by, I see in hindsight how this might be for the best in the end.
    - onesociety2022 2 hours ago
      I’m paranoid that actually blocking internet access to the TV will result in the TV’s disk filling up with all of the intrusive data it has collected and is waiting to upload, until it eventually runs out of space and bricks the TV. This could be just bad software, or it could be actually malicious, where they intentionally break something if the TV loses connectivity for too long and they can see you using it with other connected devices.
      We really need normies to care enough about this that manufacturers feel they need to advertise their TVs as privacy-friendly, with not collecting anything as a selling point. Until then, they don’t really care. I just wish someone like Apple made a TV with their Apple TV functionality baked in that I could trust.
      - elzbardico 2 hours ago
        Lots of people do it and I haven't seen anybody reporting this. Given the miserly hardware specs most smart TVs have, if this were a problem, it wouldn't take years to fill up the small storage space most of those TVs come with.
    - m3047 43 minutes ago
      I split my network(s) into subnets (sharing the same wire, not to be confused with the actual subnets which don't share the same wire) which correspond to routability policies. This in turn involves firewall rules, routing table entries, and DHCP configs corresponding to those subnets.
      I give away the software which does the following. I get this (and a lot more) for every host on my network, and I know what every host is.
      # peers upstairs-roku.m3047 +addr +serv
      dns.google [8.8.4.4] domain [53]
      dns.google [8.8.8.8] domain [53]
      athena.m3047 [10.0.0.220] domain [53]
      mediaservices.cdn-apple.com [23.46.228.133] https [443]
      mediaservices.cdn-apple.com [23.46.228.134] https [443]
      mediaservices.cdn-apple.com [23.46.228.135] https [443]
      mediaservices.cdn-apple.com [23.46.228.137] https [443]
      mediaservices.cdn-apple.com [23.46.228.138] https [443]
      mediaservices.cdn-apple.com [23.46.228.139] https [443]
      mediaservices.cdn-apple.com [23.46.228.140] https [443]
      mediaservices.cdn-apple.com [23.46.228.142] https [443]
      mediaservices.cdn-apple.com [23.46.228.143] https [443]
      mediaservices.cdn-apple.com [23.46.228.144] https [443]
      mediaservices.cdn-apple.com [23.46.228.145] https [443]
      mediaservices.cdn-apple.com [23.213.34.169] https [443]
      mediaservices.cdn-apple.com [23.213.34.176] https [443]
      mediaservices.cdn-apple.com [23.213.34.178] https [443]
      mediaservices.cdn-apple.com [23.213.34.185] https [443]
      mediaservices.cdn-apple.com [23.213.34.186] https [443]
      mediaservices.cdn-apple.com [23.213.34.187] https [443]
      mediaservices.cdn-apple.com [23.213.34.188] https [443]
      mediaservices.cdn-apple.com [23.213.34.193] https [443]
      mediaservices.cdn-apple.com [23.213.34.196] https [443]
      mediaservices.cdn-apple.com [23.213.34.201] https [443]
      mediaservices.cdn-apple.com [23.213.34.203] https [443]
      nrdp.push.prod.netflix.com [35.81.198.46] www [80]
      ec2-35-86-100-253.us-west-2.compute.amazonaws.com [35.86.100.253] psbserver [2350]
      austin.logs.roku.com [35.212.27.142] https [443]
      scribe.logs.roku.com [35.212.34.174] https [443]
      austin.logs.roku.com [35.212.72.105] https [443]
      austin.logs.roku.com [35.212.119.44] https [443]
      display.ravm.tv [35.212.178.254] https [443]
      logs.netflix.com [44.226.179.188] https [443]
      logs.netflix.com [44.228.67.58] https [443]
      nrdp.push.prod.netflix.com [44.229.50.4] www [80]
      logs.netflix.com [44.229.122.169] https [443]
      nrdp.push.prod.netflix.com [44.232.75.216] www [80]
      api.roku.com [44.249.213.211] https [443]
      nrdp.prod.ftl.netflix.com [45.57.40.1] https [443]
      nrdp.prod.ftl.netflix.com [45.57.41.1] https [443]
      nrdp.push.prod.netflix.com [52.24.26.117] www [80]
      logs.netflix.com [52.33.247.19] https [443]
      themes-service.sr.roku.com [54.200.214.141] https [443]
      occ-0-1009-1007.1.nflxso.net [198.38.112.135] www [80]
      occ-0-1009-1007.1.nflxso.net [198.38.112.144] www [80]
      occ-0-1009-1007.1.nflxso.net [198.38.112.145] www [80]
      occ-0-1009-1007.1.nflxso.net [198.38.112.165] www [80]
      occ-0-1009-1007.1.nflxso.net [198.38.112.169] www [80]
      occ-0-1009-1007.1.nflxso.net [198.38.112.170] www [80]
      occ-0-1009-1007.1.nflxso.net [198.38.112.172] www [80]
      occ-0-1009-1007.1.nflxso.net [198.38.112.178] www [80]
      mdns.mcast.net [224.0.0.251] mdns [5353]
      239.255.255.250 [239.255.255.250] ssdp [1900]
- Innittech an hour ago
  That's fine! With technologies like Amazon Sidewalk and cheap and cheerful 4G/5G radios, nobody NEEDS to ask your permission to connect any more. You COULD use an older device but you don't want to be left behind, do you? And your peers will think you're poor or possibly a C.H.U.D. if you don't just accept your digital yoke.
- jon-wood 2 hours ago
  Frustratingly, I do want some of the functionality that comes with connecting my TV to the network - specifically the ability to control things like turning it on and choosing which input it's set to via an API it exposes. That's manageable by putting it on a VLAN which isn't allowed access to the outside world, but it's also really annoying to me that I have to do that.
- lelandfe 6 hours ago
  On my TCL TV, you have to connect it to read the Google policies you are agreeing to. If you don't, you agree to policies unread.
  Thankfully, the blast radius of this is nothing without connectivity.
  - drhike 5 hours ago
    If it has an Ethernet port I would use that then unplug it. It still gets to phone home once but you don't have to worry about it maliciously saving your Wi-Fi password for later
    - tamimio 3 hours ago
      You can create a guest wifi network with a temporary password. I do that when I need to connect devices that might store the password, like a Kindle or such.
  - idiotsecant 5 hours ago
    But it lets you continue without reading them? There's a lot of questionable terms of service rules but this one has to be unenforceable.
    - lelandfe 4 hours ago
      You must check a checkbox in agreement to continue. To read the policies one agrees to, an internet connection is required. You may check the checkbox without reading.
      As far as I have found from a lot of menu spelunking, this agreement is irrevocable. If I ever go online, it will be used.
      - trumpdong an hour ago
        That's the kind of thing that doesn't always hold up in court.
  - elzbardico 2 hours ago
    If I don't connect to the internet ever, my agreement to Google policies is probably a moot point.
calcifer 5 hours ago
> The SDK’s config ships a flag “use_netifs”: true. That flag triggers code in the SDK binary that constructs its NWConnection with a specific required interface: en0 (WiFi) or pdp_ip0 (cellular), rather than using the system default route.
> On iOS, this bypasses any configured VPN’s tun0 interface entirely. The peer tunnel does not cross a user-configured VPN, even when the rest of the app’s HTTPS traffic does.
What's a legitimate use case for this API? When/why should an app be allowed to bypass a user-configured VPN?
- chmod775 5 hours ago
  > What's a legitimate use case for this API?
  When you're the application providing the VPN or when you're any app built to communicate with something on a local-ish network, not something actually reachable globally.
- picofarad 5 hours ago
  > When/why should an app be allowed to bypass a user-configured VPN?
  temporarily if full tunnelling isn't working, one can split tunnel to route around issues due to VPN
  But imo an app should never bypass something like a network boundary.
  - kotaKat 3 hours ago
    Look at how far TikTok can go if you try blocking DNS. The hardcoded IPs, self-DNS-resolution and cat-and-mouse game of blocking is quite... interesting.
    - vsgherzi 3 hours ago
      Is there anywhere I could read more about this ?
      - kotaKat 3 hours ago
        https://github.com/M4jx/TikTokBlocklist
        I think they may have scaled back from this, but they were running a 100% malware-style playbook to hit the TikTok servers like it was some kinda sketchy C2 package. Lots of attempts at using their own DoH (and DoT!) and normal DNS servers to try to get into the TikTok network.
yodon 5 hours ago
Naive question: what would I search for to find a tutorial on how to detect this on my devices, which are mostly iOS, or in my home network?
I'd love to find and remove any apps from my devices that have this SDK active.
- tisdadd 5 hours ago
  There could be better, but this looked reasonable at first glance if you also have a Mac.
  https://www.thequantizer.com/tutorials/wireshark-iphone-traf...
  It has been a while since I personally did such traces, but Wireshark was very simple to use and once the network is exposed, it has lots of information available online if you need more.
  I found bypassing your VPN particularly appalling, as is the whole thing. Personally, it would be amazing if there were a limit on how much can be in Terms of Service, as no one wants to read that much anymore.
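On the home-network side of the question, one rough approach is to scan whatever DNS query log your router or resolver keeps for the domains quoted in this thread. The sketch below assumes a hypothetical log format of one `<client> <queried-hostname>` pair per line, which you would need to adapt to your resolver's real output. The indicator list comes only from hostnames quoted here (`luminatinet.com` is from the TLS certificate name, so it may never show up as a DNS query), and an app that resolves names itself would not appear in such a log at all.

```python
# Domain suffixes taken from hostnames quoted in this thread.
INDICATORS = ("brdtnet.com", "bright-sdk.com", "luminatinet.com")


def matches_indicator(hostname, indicators=INDICATORS):
    """True if hostname equals an indicator domain or is a subdomain of one."""
    name = hostname.rstrip(".").lower()
    return any(name == d or name.endswith("." + d) for d in indicators)


def flag_clients(log_lines):
    """Map each client to the set of flagged hostnames it looked up.

    Assumes a hypothetical log format of '<client> <queried-hostname>' per line.
    """
    hits = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that do not fit the assumed format
        client, hostname = parts
        if matches_indicator(hostname):
            hits.setdefault(client, set()).add(hostname.rstrip(".").lower())
    return hits


if __name__ == "__main__":
    sample = [
        "192.168.1.20 proxyjs.brdtnet.com",
        "192.168.1.20 api.roku.com",
        "192.168.1.31 clientsdk.bright-sdk.com.",
    ]
    for client, names in flag_clients(sample).items():
        print(client, sorted(names))
```

Any client that shows up in the result is a device worth inspecting app by app.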
skinwill 5 hours ago
Not if my firewall blocks it from accessing the outside world. (But allows HomeAssistant to control it)
NewCzech 6 hours ago
One of the problems I can see here is the problem that running a Tor exit node has: badly behaved users are going to be using it to hide their location.
Imagine having the police show up at your door because they've figured out that you're trafficking child porn, when the actual culprit is someone that is using your TV as a proxy to trade child porn.
- iugtmkbdfil834 5 hours ago
  I genuinely dislike how user-hostile everything has become. I effectively have to become an expert in near everything and track all news on the off chance something major upends previous assumptions. And if I miss it somehow and complain about it, defenders will come out of the woodwork to defend, deflect or derail the conversation.
  If there is any good news about this, it is that the fatigue seems to be hitting normal people. A buddy from work complained to me about how he is now forced to be a full-blown wifi/internet admin so that his kids' restrictions/limits are appropriately enforced.
  I am just venting, because I am not entirely certain what an appropriate solution here is.
  - amelius 4 hours ago
    Solution is more regulation, stronger consumer organizations, and privacy watchdogs with actual teeth.
- trumpdong 44 minutes ago
  Groups like Bright Data have pretty good KYC. After the scare of the police visit, the actual perpetrator would go to prison.
  - KomoD 20 minutes ago
    > Groups like Bright Data have pretty good KYC.
    They don't. I use their residential proxies without ever having KYC'd.
blakesterz 4 hours ago
Are there any defenses I can put in front of my websites that are good for stopping these things? The amount of traffic I see from residential proxies is just killing me. In particular defense against residential proxies.
- trumpdong 44 minutes ago
  Make your server so efficient that a few extra requests don't bring it down.
  Alternatively, if it's the first time the IP is seen and it's a deep linked page with no referer, send a neverending chunked gzip data stream.
- jappgar 3 hours ago
  The bots used by these proxies are detectable in a few ways. Remember the bot itself doesn't run on the proxy...
  There is discernible lag from proxy to C&C node. The individual bots don't have access to a lot of compute, and are sometimes restricted wrt feature set (e.g. proprietary video codecs).
  There are a few other techniques. It's a cat and mouse game though. And the bot owners are usually more motivated than you are.
- bakugo 2 hours ago
  Add a captcha or proof-of-work challenge in front of your website. Those are pretty much your only options.
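To make the proof-of-work suggestion concrete, here is a minimal hashcash-style sketch. It is not how any particular product does it: the SHA-256 construction, the function names, and the difficulty value are all illustrative assumptions. The idea is that the server hands out a random challenge, the client must find a nonce whose hash has a given number of leading zero bits, and the server checks the answer with a single hash.

```python
import hashlib
import itertools
import os


def leading_zero_bits(digest):
    """Count leading zero bits in a bytes digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge, nonce, difficulty):
    """Server side: one hash to check that the client did the work."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge, difficulty):
    """Client side: brute-force a nonce; expected cost is about 2**difficulty hashes."""
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    challenge = os.urandom(16)
    nonce = solve(challenge, 12)
    print(nonce, verify(challenge, nonce, 12))
```

The asymmetry is the point: each request costs the client thousands of hashes and the server one. As noted above, the individual bots reportedly have little compute, which is where a scheme like this would bite, though it taxes legitimate low-end devices too.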
hackrmn 4 hours ago
If this kind of proxying isn't illegal, in my opinion it should be -- saying it's bordering on circumvention of fundamental assumptions about Internet routing and IP address leasing (and ownership), would be a sorry understatement compared to what Bright Data has managed to package into a product payment:
> you are allowing Bright Data to occasionally use your device’s free resources and _IP address to download public web data from the internet_. (emphasis mine)
I think the misleading part -- to the end-user -- is the "download public web data" part. If the data is public why can't Bright Data download it themselves? Well, because the other end doesn't want them to, apparently. The product is making you help Bright Data circumvent the undesired properties of the "public" data providers, on behalf of someone who happens to have the cash but as of yet is at the short end of the Internet stick (for all the right reasons, I'd say).
This is absolutely deplorable, but knowing the directions this is heading, I am neither surprised nor concerned, frankly. People have long voted with their wallet -- it's not the privacy-conscious Joe the Hacker that is being proxied through here, it's our parents and millions of people who just want entertainment at the end of the working day, including _parents_ of small children.
Day by day the dark Internet theory sounds more plausible, and frankly I am all there for it. The Internet will collapse into a feudal internetwork where any routing will need hop-by-hop key, so real people (and agents, frankly) can maintain a measure of trust that right now is being actively circumvented.
- trumpdong 43 minutes ago
  It's completely legal and the law you mentioned about IP routing and address ownership does not exist.
ddxv 3 hours ago
I found some 60 iOS apps that have the SDK mentioned in the article: https://appgoblin.info/sdks/brdsdk.framework (sorry this requires a free login due to heavy scraping, feel free to contact me for list)
I was unable to find related Android SDKs. I tried looking at the various apps on AppGoblin to find the android versions, then looking through their unmapped SDK parts but didn't see anything.
https://github.com/BrightSDK/bright-sdk-gradle-plugin-docs
This looks like it should just be "com.brightdata" but I did not find anything. With 60 iOS apps there must be apps with Android SDK, but I'm not sure why I am not finding any.
If anyone knows, or would like to chat feel free to connect. I'm happy to share data.
- trumpdong 42 minutes ago
  Android apps are usually obfuscated, ostensibly to make them smaller, and now obviously to hide what's inside. Only a basic level of obfuscation is typically used, so you would have more luck searching for strings.
- NewsaHackO an hour ago
  Why can't you just post the list as a comment?
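On the suggestion above to search obfuscated Android apps for strings rather than package names: an APK is a zip archive, so a rough first pass is to scan every member for byte strings. The needles below are the domains quoted in this thread from the iOS SDK; whether an Android build contains the same strings is an unverified assumption, and string encryption in the obfuscator would defeat this entirely. `find_strings` and `make_sample_apk` are hypothetical helper names.

```python
import io
import zipfile

# Assumption: strings seen in the iOS SDK traffic quoted in this thread
# would also appear in an Android build. That is unverified.
NEEDLES = (b"brdtnet.com", b"bright-sdk.com", b"luminatinet.com")


def find_strings(apk, needles=NEEDLES):
    """Return {member_name: [needles found]} for a zip/APK path or file object."""
    found = {}
    with zipfile.ZipFile(apk) as zf:
        for info in zf.infolist():
            data = zf.read(info)
            hits = [n.decode() for n in needles if n in data]
            if hits:
                found[info.filename] = hits
    return found


def make_sample_apk():
    """Build a tiny in-memory zip that stands in for an APK (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("classes.dex", b"\x00junk wss://proxyjs.brdtnet.com:443 junk")
        zf.writestr("res/raw/readme.txt", b"nothing here")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(find_strings(make_sample_apk()))
```

For a real app you would pass the path to the downloaded `.apk` instead of the in-memory sample.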
rdtsc 2 hours ago
> The TLS certificate is CN=*.luminatinet.com — the domain for Luminati Networks, Bright Data’s pre-2018 corporate name
Ah yes. The big privacy scraping company called themselves The Luminati. It’s like they are side-investing in tin foil hats or something.
metalman 2 hours ago
I have never owned a television because of how much I disliked advertising back when TV was the primary delivery method. The feeling of having avoided a life sentence of being lashed to the tube is weird; I know that people might catch me looking all too intently into their eyes, trying to see if they are really in there.
- trumpdong 42 minutes ago
  Phones do the same thing...
tamimio 3 hours ago
Years ago I had a smart TV, and while I never used anything “smart”, one day I connected it to the network to update it and forgot about it. Two days later I was checking my DNS, and 80% of the traffic and blocked queries in the past two days were from one device. After tracking it down, it was the TV!
So what I have now is a pre-smart TV I found at the thrift, still very good picture that’s more than enough for the few times I use it.
There should be a way to disable the “smart” garbage in new TVs, or an option to buy normal ones at least.
trumpdong 6 hours ago
I find Cloudflare to be more unethical than Bright Data.
- xg15 6 hours ago
  Both are causing a dynamic that will lock down the internet evermore for everything straying slightly from the corporate-approved line.
  If the divide was data center vs residential IPs, fine, but thanks to Bright Data and friends, residential IPs are getting suspicious as well, so I guess the next step is full-on client verification then...
  - trumpdong 39 minutes ago
    DC and residential IPs aren't real categories that exist either. They are guesses by IP reputation companies. Nothing except practicality stops an ISP from mixing them both into the same DHCP pool.
  - clvx 6 hours ago
    I wish federal or state laws could force transparency, because asking for privacy is a dead end at this point. Just force products and providers that run in my home to disclose where they phone home. Then I can decide what to do with that, whether I send them to a black hole or let them pass.
  - trumpdong 4 hours ago
    These are legitimate client devices. Good luck with that.
everybodyknows 3 hours ago
FTA:
> MDM, mobile EDR
Anyone care to ELI5 these?
- boilerupnc 2 hours ago
  MDM: Mobile Device Management. Software that helps ops folks control a fleet of mobile devices like tablets, phones, etc…
  Mobile EDR: Endpoint detection and response. This is cybersecurity software to monitor and deal with network activity happening in mobile devices like tablets, phones, etc…
ErroneousBosh 5 hours ago
So wait a second then, it connects out using a websocket to its bot C&C server, right?
Which presumably passes it a URL to scrape and waits for it to return the data.
What happens if I write my own tool that connects to that C&C server, waits for a URL to scrape, and returns gigabytes of freshly brewed hot horseshit?
- woffoor 4 hours ago
  Most scraped websites have https, so you need to perform a MITM attack. Scrapers will probably notice that.
  - voakbasda 4 hours ago
    No, you just need to stand up your own website and feed the scraper a URL to it.
    - ErroneousBosh 2 hours ago
      I would just generate scads of Markov chain output and make it look like a plausible web page.
      - dreamcompiler 22 minutes ago
        That's pretty much what the bots are scraping now, with all the AI slop websites out there.
  - ErroneousBosh 2 hours ago
    How would https affect it?
    If they're making a request to my machine to go and curl a page, how do they even know whether or not it was https?
    - trumpdong 41 minutes ago
      Not sure about Bright Data but these are usually SOCKS or HTTP CONNECT proxies because that's most flexible. But the customer might be paying by the gigabyte, so you can still feed them nonsense, maybe a 4 gigabyte TLS certificate.
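For the Markov chain idea upthread, a word-level chain is only a few lines. This is a minimal sketch with hypothetical function names. It learns which words follow which in a seed text, then random-walks that table to produce filler that is locally plausible and globally meaningless. It expects a seed text of at least two words.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def babble(chain, length, seed=None):
    """Generate `length` words by walking the chain at random."""
    rng = random.Random(seed)
    word = rng.choice(sorted(chain))
    out = [word]
    while len(out) < length:
        followers = chain.get(word)
        if not followers:
            word = rng.choice(sorted(chain))  # dead end: restart anywhere
        else:
            word = rng.choice(followers)
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    corpus = "the tv phones home and the tv shows ads and the router logs it"
    print(babble(build_chain(corpus), 30))
```

Feeding it a larger corpus and wrapping the output in ordinary page markup would get closer to the "plausible web page" described above.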
skywhopper 6 hours ago
Not the one in my living room.