> Google’s hash match may well have established probable cause for a warrant to allow police to conduct a visual examination of the Maher file.
Very reasonable. Google can flag accounts as CP, but then a judge still needs to issue a warrant for the police to actually go and look at the file. Good job court. Extra points for reasoning about hash values.
> a judge still needs to issue a warrant for the police to actually go and look at the file
Only in the future. Maher's conviction, based on the warrantless search, still stands because the court found that the "good faith exception" applies--the court affirmed the District Court's finding that the police officers who conducted the warrantless search had a good faith belief that no warrant was required for the search.
I wonder what happened to fruit of the poisonous tree? It seems a lot more liberty-oriented than a "good faith exception" when police don't think they need a warrant (because police never seem to "think" they need a warrant).
This exactly. Bad people have to go free in order to incentivize good behavior by cops.
You and I (as innocent people) are more likely to be affected by bad police behavior than the few bad people themselves and so we support the bad people going free.
I think it's okay that we expect cops to be good _after_ the rule exists, rather than set the bad guys free to (checks notes) incentivize cops to take our new rule super seriously.
It would seem that the inverse would need to apply in order for the justice system to have any semblance of impartiality. That is that we now have to let both of them off the hook, since neither had been specifically informed they weren’t allowed to do the thing beforehand.
That is why many people think this should be tossed out. Ignorance that an action was a crime is almost never an acceptable defense, so it should not be an acceptable offense either.
> we now have to let both of them off the hook, since neither had been specifically informed they weren’t allowed to do the thing beforehand.
I'm not trying to be funny, or aggressive, or passive-aggressive, seriously: there are two entities in the discussion, the cops, and the person with a photograph with a hash matching child porn. I'm phrasing that as passively as possible because I want to avoid the tarpit of looking like I'm appealing to emotion:
Do you mean the hash-possessor wasn't specifically informed it was illegal to possess said hash?
> It would seem that the inverse would need to apply in order for the justice system to have any semblance of impartiality...That is why many people think this should be tossed out.
Of course, I could be missing something here because I'm making a hash of parsing the first bit. But, no, if the cops in good faith make a mistake, there's centuries of jurisprudence behind not letting people go free for it, not novel with this case.
> Do you mean the hash-possessor wasn't specifically informed it was illegal to possess said hash?
This is literally the doctrine behind the good faith argument and qualified immunity. If they have not been informed that this specific act, done in this specific way, is not allowed, then it is largely permissible.
A stupid but equivalent defense from the possessor would be "it's in Google's possession, not mine, so I had a good faith belief that I did not possess the files". It's clearly wrong based on case law, but I wouldn't expect the average person to have a great grasp of how possession works legally (nor would I claim to be an expert on it).
This is effectively what the good faith doctrine establishes for police, even though they really ought to at least have an inkling given that the law is an integral part of their jobs. As long as they can claim to be sufficiently stupid, it is permissible. That is not extended to the defense, for whom stupidity is never a defense.
> But, no, if the cops in good faith make a mistake, there's centuries of jurisprudence behind not letting people go free for it, not novel with this case.
Acting in good faith would be getting a warrant regardless, because the issue is not that time-sensitive and there are clear ambiguities here. They acted brashly under the assumption that if they were wrong, they could claim stupidity. It encourages the police to push the boundaries of legal behavior, because they still get to keep the evidence even if they are wrong and have committed an illegal search.
It is, yet again, rules for thee but not for me. Frankly, with the asymmetry of responsibility and experience with laws, the police should need to clear a MUCH higher bar to come within throwing distance of “good faith”.
Your argument is a bit disingenuous because it's not applicable in situations where there is clear law clarifying that something can't be done.
You're pretending that cops are using this in situations where it's known that a warrant is needed, as opposed to it being an exception to "fruit of the poisonous tree" doctrine when new caselaw is being made.
> Acting in good faith would be getting a warrant regardless
That's not what "good faith" means, that's just something entirely made up by you. From a reasonable perspective that could be described as foolish and a waste of time and the public's resources.
> It encourages the police to push the boundaries of legal behavior, because they still get to keep the evidence even if they are wrong and have committed an illegal search.
There's a constant tension between technology, crime and the police that's reflected in the history of 4th amendment jurisprudence and it's not at all like what you describe. The criminals are pushing the boundaries to which the police must catch up, and the law must determine what is fair as society changes over time. I'm not particularly pro cop, but you don't seem to be reasonable about any of this.
The rule established in this case is new, hence TFA, and all the time the lawyers and judge wasted on it :)
If I may suggest where wires are getting crossed:
You are sort of assuming it's like a logic gate: if 4th amendment violation, bad evidence, criminal must go free. So when you say "the rule", you mean "the 4th amendment", not the actual ruling.
That's not how it works, because that simple ultimatum also has edge cases. So we built up this whole system around nominating juries and judges, and paying lawyers, over centuries, to argue out complicated things like weighing intentionality.
The opinion says at the time the warrantless search occurred, one appellate court had already held "that no warrant was required in those circumstances" (p 42). Only a year after the search occurred, did another appellate court rule the other way.
This is the main argument that the search met the good faith exception to the exclusionary rule (i.e. the rule that says you have to exclude evidence improperly obtained). This exception is supported in the opinion (at p41) with several citations including United States v. Ganias, 824 F.3d 199, 221–22 (2d Cir. 2016)
IANAL, but as I understand it, this exception is specifically about cases where precedent is established. This same trick, or others substantially like it, won't work in the future, but because it was not a "known trick", the conviction still stands.
I'm trying to imagine a more "real-world" example of this to see how I feel about it. I dislike that there is yet another loophole to gain access to peoples' data for legal reasons, but this does feel like a reasonable approach and a valid goal to pursue.
I guess it's like if someone noticed you had a case shaped exactly like a machine gun, told the police, and they went to check if it was registered or not? I suppose that seems perfectly reasonable, but I'm happy to hear counter-arguments.
The main factual components are as follows: Party A has rented out property to Party B. Party A performs surveillance on or around the property with Party B's knowledge and consent. Party A discovers very high probability evidence that Party B is committing crimes within the property, and then informs the police of their findings. Police obtain a warrant, using Party A's statements as evidence.
The closest "real world" analogy that comes to mind might be a real estate management company uses security cameras or some other method to determine that there is a crime occurring in a space that they are renting out to another party. The real estate management company then sends evidence to the police.
In the case of real property -- rental housing and warehouse/storage space in particular -- this happens all the time. I think that this ruling is eminently reasonable as a piece of case law (ie, the judge got the law as it exists correct). I also think this precedent would strike a healthy policy balance (ie, the law as it exists, interpreted how the judge in this case interprets it, would produce a good policy situation).
Is there any such thing as this surveillance applying to the inside of the renter's bedroom, bathroom, or a filing cabinet with medical or financial documents, or political ones for that matter?
I don't think there is, and I don't think you can reduce reality to being as simple as "owner has more right over property than renter". The renter absolutely has at least a few rights over the owner, in at least a few defined contexts, because the owner "consented" to accept money in trade for use of the property.
> Is there any such thing as this surveillance applying to the inside of the renter's bedroom, bathroom, or a filing cabinet with medical or financial documents, or political ones for that matter?
Yes. Entering property for regular maintenance. Any time a landlord or his agent enters a piece of property, there is implicit surveillance. Some places are more formal about this than others, but anyone who has rented, owned rental property, or managed rental property knows that any time maintenance occurs there's an implicit examination of the premises also happening...
But here is a more pertinent example: the regular comings and goings of people or property can be and often are observed from outside of a property. These can contribute to probable cause for a search of those premises even without direct observation. (E.g., large numbers of disheveled children moving through an apartment, or an exterior camera shot of a known fugitive entering the property.)
Here the police could obtain a warrant on the basis of the landlord's testimony without the landlord actually seeing the inside of the unit. This is somewhat similar to the case at hand, since Google alerted the police to a hash match without actually looking at the image (ie, entering the bedroom).
> I don't think you can reduce reality to being as simple as "owner has more right over property than renter"
But I make no such reduction, and neither does the opinion. In fact, quite the opposite -- this is part of why the court determines that a warrant is required!
> ...Google alerted the police to a hash match without actually looking at the image (ie, entering the bedroom).
Google cannot have calculated that hash without examining the data in the image. They, or systems under their control, obviously looked at the image.
It should not legally matter whether the eyes are meat or machine... if anything, machine inspection should be MORE strictly regulated, because of how much easier and cheaper it tends to make surveillance (mass or otherwise).
> It should not legally matter whether the eyes are meat or machine
But it does matter, and, perhaps ironically, it matters in a way that gives you STRONGER (not weaker) fourth amendment rights. That's the entire TL;DR of the fine article.
If the court accepted this sentence of yours in isolation, then the court would have determined that no warrant was necessary in any case.
> if anything, machine inspection should be MORE strictly regulated, because of how much easier and cheaper it tends to make surveillance (mass or otherwise).
I don't disagree. In particular: I believe that the "Reasonable Person", to the extent that we remain stuck with the fiction, should be understood as having stronger privacy expectations in their phone or cloud account than they do even in their own bedroom or bathroom.
With respect to Google's actions in this case, this is an issue for your legislator and not the courts. The fourth amendment does not bind Google's hands in any way, and judges are not lawmakers.
If I import hundreds of pounds of poached ivory and store it in a shipping yard or move it to a long term storage unit, the owner and operator of those properties are allowed to notify police of suspected illegal activities and unlock the storage locker if there is a warrant produced.
Maybe the warrant uses some abstraction of the contents of that storage locker like the shipping manifest or customs declaration. Maybe someone saw a shadow of an elephant tusk or rhino horn as I was closing the locker door.
I don't think that argument supports the better analogy of breaking into a computer or filing cabinet owned by someone renting the space. Just because someone is renting space doesn't give you the right to do whatever you want to them. Cameras in bathrooms of a rented space would be another example.
But he wasn’t running a computer in a rented space, he was using storage space on google’s computers.
In an older comment I argued against analogies to rationalize this. I think honestly at face value it is possible to evaluate the goodness or badness of the decision.
> In an older comment I argued against analogies to rationalize this. I think honestly at face value it is possible to evaluate the goodness or badness of the decision.
I generally do agree that analogies became anti-useful in this thread relatively quickly.
However, I am not sure that avoiding analogies is actually possible for the courts. I mean, they can try, but at some point analogies are unavailable because most of the case law -- and, hell, the fourth amendment itself -- is written in terms of the non-digital world. Judges are forced to reason by analogy, because legal arguments will be advanced in terms of precedent that is inherently physical.
So there is value in hashing out the analogies, even if at some point they become tenuous, primarily because demonstrating the breaking points of the analogies is step zero in deviating from case law.
Yes, that is why I presented an alternative to the analogy of "import hundreds of pounds of poached ivory and store it in a shipping yard or move it to a long term storage unit".
Like having the right to avoid being videoed in the bathroom, we have the right to avoid unreasonable search of our files by authorities, whether stored locally or in the cloud.
I have this weird experience where people that get all their legal news from tech websites have really pointed views about fourth amendment jurisprudence and patent law.
I agree. This is a case where the physical analogy leads us to (imo) the correct conclusion: compelling major property management companies to perform regular searches of their tenant's properties, and then to report any findings to the police, is hopefully something that most judges understand to be a clear violation of the fourth amendment.
> The issue of course being the government then pressuring or requiring these companies to look for some sort of content as part of routine operations.
>> Party A discovers very high probability evidence that Party B is committing crimes within the property ...
> This isn't accurate: the hashes were purposefully compared to a specific list. They didn't happen to notice it, they looked specifically for it.
1. I don't understand how the text that comes on the right side of the colon substantiates the claim on the left side of the colon... I said "discovers", without mention of how it's discovered.
2. The specificity of the search cuts in exactly the opposite direction than you suggest; specificity makes the search far less invasive -- BUT, at the same time, the "everywhere and always" nature of the search makes it more invasive. The problem is the pervasiveness, not the specificity. See https://news.ycombinator.com/user?id=aiforecastthway
> And of course, what happens when it's a different list?
The fact that the search is targeted, that the search is highly specific, and that the conduct is plainly criminal are all, in fact, highly material. The decision here is not relevant to most of the "worst case scenarios" or even "bad scenarios" in your head, because prior assumptions would have been violated before this moment in the legal evaluation.
But with respect to your actual argument here... it's really a moot point. If the executive branch starts compelling companies to help them discover political enemies on basis of non-criminal activity, then the court's opinions will have exactly as much force as the army that court proves capable of raising, because such an executive would likely have no respect for the rule of law in any case...
It is reasonable for legislators to draft laws on a certain assumption of good faith, and for courts to interpret law on a certain assumption of good faith, because without that good faith the law is nothing more than a sequence of forceless ink blotches on paper anyways.
I don't think that changes anything. I think it's entirely reasonable for Party A to be actively watching the rented property to see if crimes are being committed, either by the renter (Party B) or by someone else.
The difference I do see, however, is that many places do have laws that restrict this sort of surveillance. If we're talking about an apartment building, a landlord can put cameras in common areas of the building, but cannot put cameras inside individual units. And with the exception of emergencies, many places require that a landlord give tenants some amount of notice before entering their unit.
So if Google is checking user images against known CSAM image hashes, are those user images sitting out in the common areas, or are they in an individual tenant's unit? I think it should be obvious that it's the latter, not the former.
Maybe this is more like a company that rents out storage units. Do storage companies generally have the right to enter their customers' storage units whenever they want, without notice or notification? Many storage companies allow customers to put their own locks on their units, so even if they have the right to enter whenever they want, regularly, in practice they certainly do not.
But like all analogies, this one is going to have flaws. Even if we can't match it up with a real-world example, maybe there's still no inconsistency or problem here. Google's ToS says they can and will do this sort of scanning, users agree to it, and there's no law saying Google can't do that sort of thing. Google itself has no obligation to preserve users' 4th Amendment rights; they passed along evidence to the police. I do think the police should be required to obtain a warrant before gaining access to the underlying data; the judge agrees on this, but the police get away with it in the original case due to the bullshit "good faith exception".
Ok. But that would also be an invasion of privacy. If the property you rented out was being used for trafficking and you don't want to be involved with trafficking, then the terms would have to first explicitly set what is not allowed. Then it would also have to explicitly mention what measures are taken to enforce it and what punishments are imposed for violations. It should also mention steps that are taken for compliance.
Without full documentation of compliance measures, enforcement measures, and punishments imposed, violations of the rule cannot involve law enforcement who are restricted to acting on searches with warrants.
> If the property you rented out was being used for trafficking and you don’t want to be involved with trafficking, then the terms would have to first explicitly set what is not allowed.
I don't believe that's the case. You don't need to state that illegal activities are not allowed; that's the default.
> Then it would also have to explicitly mention what measures are taken to enforce it
When Airbnb used to allow cameras indoors, they did -- after some backlash -- require hosts to disclose the presence of the cameras.
> ... and what punishments are imposed for violations.
No, I don't think that is or should be necessary. If you do illegal things, the possible punishments don't need to be enumerated by the person who reports you to the police.
Put another way: if I'm hosting someone on Airbnb in the case where I'm living in the same property, and I walk into the kitchen to see my Airbnb guest dealing drugs, I am well within my rights to call the police, without having ever said anything up-front to my guest about whether or not that's acceptable behavior, or what the consequences might be. Having the drug deal instead caught on camera is no different, though I would agree that the presence of the cameras should have to be disclosed beforehand.
In Google's case, the "camera" (aka CSAM scanning) appears to have been disclosed beforehand.
>Without full documentation of compliance measures, enforcement measures, and punishments imposed, violations of the rule cannot involve law enforcement who are restricted to acting on searches with warrants.
I think the real-world analogy would be to say that the case is shaped exactly like a machine gun and the hotel calls the police, who then open the case without a warrant. The "private search" doctrine allows the police to repeat a search done by a private party, but here (as in the machine gun case), the case was not actually searched by a private party.
But this court decision is a real world example, and not some esoteric edge case.
This is something I don’t think needs analogies to understand. SA/CP image and video distribution is an ongoing moderation, network, and storage issue. The right to not be under constant digital surveillance is somewhat protected in the constitution.
I like speech and privacy and am paranoid of corporate or government overreach, but I arrive at the same conclusion as you taking this court decision at face value.
Wait until Trump is in power and corporations are masterfully using these tools to “mow the grass” (if you want an existing example of this, look at Putin’s Russia, where people get jail time for any pro-Ukraine mentions on social media).
Yeah, I'm paranoid like I said, but in this case it seems like the hash of a file on Google's remote storage was flagged as a potential match, and that was used as justification to request a warrant. That seems like common sense and did not involve employees snooping pre-warrant.
The Apple CSAM hash detection process (the launch of which was rolled back) concerned me, mainly because it ran on-device with no opt-out. If this is running on cloud storage then it sort of makes sense. You need to ensure you are not aiding or harboring actually harmful illegal material.
I get there are slippery slopes or whatever, but the fact is you cannot just store whatever you wish in a rental. I don't see this as opening mass regex surveillance of our communication channels. We have the Patriot Act to do that lol.
I think the better option is a system where the cloud provider cannot decrypt the files, and they’re not obligated to lift a finger to help the police because they have no knowledge of the content at all
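A minimal sketch of that idea in Python, assuming the `cryptography` package; the key is generated and kept on the client, so the provider only ever stores ciphertext it cannot read (the upload/download calls are hypothetical):

    from cryptography.fernet import Fernet

    # Generated and stored on the client; the provider never sees it.
    key = Fernet.generate_key()
    f = Fernet(key)

    # Encrypt locally, then hand the provider only the ciphertext.
    with open("photo.jpg", "rb") as fh:
        ciphertext = f.encrypt(fh.read())
    # upload(ciphertext)  # hypothetical upload call

    # Later: fetch the ciphertext back and decrypt locally.
    # plaintext = f.decrypt(download())  # hypothetical download call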
In my opinion, despite the technical merits of an algorithm, encryption is only as trustworthy as the computer that generates and holds the private key.
I would personally not knowingly use a cloud provider to commit a crime. It is a fairly naive take to assume that because your browser uses HTTPS, data at rest and in process isn't somehow observable.
And I see where you’re coming from but I am afraid that position severely overestimates the will of US people to trade freedom/privacy for security and the legislature to hold citizens’ privacy in such high regard.
I only worry that, in the case that renting becomes a roundabout way of granting more oversight ability to the government, then as home ownership rates decrease, government surveillance power increases.
Sure, it's facilitated through a third party (the owner), but the extrapolated pattern seems to be: "1. Only people in group B will have fewer rights, so people in group A shouldn't worry" followed closely by "2. Sorry, you've been priced out of group A."
In the case of renting, we end up in the situation where those who have enough wealth to own their own home are afforded extra privileges of privacy.
Now to bring this back to the cloud; the cynical part of me looks towards a future of cheap, cloud-only storage devices. Or an intermediate future of devices where cloud is first party and local storage is just enough of a hassle that people don't use it. And the result is that basically everyone now has the present day equivalent of local storage scanning.
If renting de-facto grants fewer rights, then in the future where "you'll own nothing and be happy", you'll also have no rights, and all the way people will say "as a renter, what did you expect?"
OK, I agree with you about setting a precedent that future storage will be scanned by default. Additionally, who will control the reference hash list? Making one necessitates hashing that illicit material.
I only hope the court systems escalate it and manage to protect free speech, or guard against unreasonable search and seizure, or self-incrimination, or whatever, if the CSAM hash comparisons are used against political opponents or music piracy or tax evasion or whatever.
I'm unsure; I wrote that from, like, an ethics standpoint. The Silk Road guy got taken down on conspiracy for attempted murder, not drug or human trafficking charges. So I'm unsure of the legal side.
I think if you knowingly provided a platform to distribute SA/CP/CSAM and the feds become involved you will be righteously fucked.
Reddit clamped down on the creepy *bait subreddits years ago. Maybe it was self-preservation on the business side or maybe it was forward looking about legal issues.
I'm not a lawyer; I was just mentioning things that I would follow for ethics, morals, and my sense of self-preservation.
I don't think the analogy holds for two reasons (which cut in opposite directions from the perspective of fourth amendment jurisprudence, fwiw).
First, the dragnet surveillance that Google performs is very different from the targeted surveillance that can be performed by a drug dog. Drug dogs are not used "everywhere and always"; rather, they are mostly used in situations where people have a less reasonable expectation of privacy than the expectation they have over their cloud storage accounts.
Second, the nature of the evidence is quite different. Drug-sniffing dogs are inscrutable and non-deterministic and transmit handler bias. Hashing algorithms can be interrogated and are deterministic and do not have such bias transferal issues; collisions do occur, but are rare, especially because the "search key" set is so minuscule relative to the space of possible hashes. The narrowness and precision of the hashing method preserves most of the privacy expectations that society is currently willing to recognize as objectively reasonable.
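To put rough numbers on "rare": assuming (generously) a uniform 256-bit cryptographic hash rather than the perceptual hashes actually in use, the expected rate of accidental matches is vanishingly small even at Google scale. The list size and scan count below are assumptions for the envelope:

    # Back-of-envelope odds of an accidental match, assuming a uniform
    # 256-bit cryptographic hash (NOT the perceptual hashes actually used).
    k = 10**7   # size of the known-bad hash list (assumed)
    n = 10**15  # images scanned (assumed)
    print(n * k / 2**256)  # ~8.6e-56: effectively never by accident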
Here we get directly to the heart of the problem with the fictitious "reasonable person" used in tests like the Katz test, especially in cases where societal norms and technology co-evolve at a pace far more rapid than that of the courts.
This analogy can have two opposite meanings. Drug dogs can be anything from a prop used by the police to search your car without a warrant (a cop can always say in court the dog "alerted" them) to a useful drug detection tool.
Don't they? If you tell the cops that your neighbor has drugs in significant quantity in their house, would they not still need a warrant to actually go into your neighbor's house?
There are a lot of nuances to these situations of third-party involvement and the ruling discusses these at length. If you’re interested in the precise limits of the 4th amendment you should really just read the linked document.
They should, as a matter of course. But I guess "papers" you entrust to someone else are a gray area. I personally think that it goes against the separation of police state and democracy, but I'm a nobody, so it doesn't matter, I suppose.
Is it reasonable? Even if the hash were MD5, given valid image files, the chances of it being an accidental collision are way lower than the chance that any other piece of evidence given to a judge was false or misinterpreted.
This is NOT a secure hash. This is an image-similarity hash, which has many, many matches among unrelated images.
Unfortunately the decision didn't mention this at all, even though it is important. If it were even as good as an MD5 hash (which is broken), I think the search should be allowed without a warrant: even though an accidental collision is possible, the odds are so strongly against it that the courts can safely assume there isn't one (and of course if there is, the police would close the case). However, since this hash is not that good, the police cannot look at the image unless Google does.
I wish I could get access to the "App'x 29" being referenced so that I could better understand the judges' understanding here. I assume this is Federal Appendix 29 (in which case a more thorough reference would've been appreciated). If the Appeals Court is going to cite the Federal Appendix in a decision like this and in this manner, then the Federal Appendix is as good as case law and West Publishing's copyright claims should be ripped away. Either the Federal Appendix should not be cited in Appeals Court and Supreme Court opinions, or the Federal Appendix is part of the law and belongs to the people. There is no middle there.
> I think the search should be allowed without a warrant: even though an accidental collision is possible, the odds are so strongly against it that the courts can safely assume there isn't one
The footnote in the decision bakes this property into the definition of a hash:
A “hash” or “hash value” is “(usually) a short string of characters generated from a much larger string of data (say, an electronic image) using an algorithm—and calculated in a way that makes it highly unlikely another set of data will produce the same value.”
(Importantly, this is NOT an accurate definition of a hash for anyone remotely technical... hashing algorithms with significant hash collisions exist, of course, and collision tolerance is even a design criterion for some hashing algorithms...)
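To make that concrete: a short checksum like CRC32 collides after only about 2^16 random inputs (the birthday bound), which a few lines of Python will demonstrate in well under a second:

    import os
    import zlib

    # Brute-force a CRC32 collision. By the birthday bound we expect
    # one after roughly 2**16 random inputs, so this finishes quickly.
    seen = {}
    while True:
        data = os.urandom(8)
        h = zlib.crc32(data)
        if h in seen and seen[h] != data:
            print("collision:", seen[h].hex(), data.hex(), hex(h))
            break
        seen[h] = data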
>I wish I could get access to the "App'x 29" being referenced so that I could better understand the judges' understanding here. I assume this is Federal Appendix 29 (in which case a more thorough reference would've been appreciated). If the Appeals Court is going to cite the Federal Appendix in a decision like this and in this manner, then the Federal Appendix is as good as case law and West Publishing's copyright claims should be ripped away. Either the Federal Appendix should not be cited in Appeals Court and Supreme Court opinions, or the Federal Appendix is part of the law and belongs to the people. There is no middle there.
Just go to a law library.
Do you know that judges routinely make decisions based on confidential documents not in the public record? Is that also bad?
You're assuming an accidental collision. Images can be generated that intentionally trigger the hash algorithm while they still appear as something else (a meme, funny photo, etc.) to a person looking at them. This opens many possibilities for "bad people" to use against people they hate (like an alternative to swatting, etc.)
So you're saying that I craft a file that has the same hash as a CSAM one, I give it to you, you upload it to Google, but it also happens to be CSAM, and I've somehow framed you?
My point is that a hash (granted, I'm assuming that we're talking about a cryptographic hash function, which is not clear) is much closer to "This is the file" than someone actually looking at it, and that it's definitely more proof of them having that sort of content than any other type of evidence.
I don't understand. If you contend that it's even better evidence than actually having the file and looking at it, how is not reasonable to then need a judge to issue a warrant to look at it? Are you saying it would be more reasonable to skip that part and go directly to arrest?
It seems like a large part of the ruling hinges on the fact that Google matched the image hash to a hash of a known child pornography image, but didn't require an employee to actually look at that image before reporting it to the police. If they had visually confirmed it was the image they suspected it was based on the hash then no warrant would have been required, but the judge reads that the image hash match is not equivalent to a visual confirmation of the image. Maybe there's some slight doubt in whether or not the image could be a hash collision, which depends on the hash method. It may be incredibly unlikely (near impossible?) for any hash collision depending on the specific hash strategy.
I think it would obviously be less than ideal for Google to require an employee visually inspect child pornography identified by image hash before informing a legal authority like the police. So it seems more likely that the remedy to this situation would be for the police to obtain a warrant after getting the tip but before requesting the raw data from Google.
Would the image hash match qualify as probable cause enough for a warrant? On page 4 the judge stops short of setting precedent on whether it would have or not. It seems likely to me that it would be solid probable cause, but sometimes judges or courts have a unique interpretation of technology that I don't always share, and leaving it open to individual interpretation can lead to conflicting results.
The hashes involved in stuff like this, as with copyright auto-matching, are perceptual hashes (https://en.wikipedia.org/wiki/Perceptual_hashing), not cryptographic hashes. False matches are common enough that perceptual hashing attacks are already in use to manipulate search engine results (see the example in a random paper on the subject: https://gangw.cs.illinois.edu/PHashing.pdf).
It seems like that is very relevant information that was not considered by the court. If this were a cryptographic hash, I would say with high confidence that this is the same image and so Google examined it - there is a small chance that some unrelated file (which might not even be a picture) matches, but odds are the universe will end before that happens, and so the courts can consider it the same image for search purposes. However, because there are many false-positive cases, there are reasonable odds that the image is legal, and so a higher standard for search is needed - a warrant.
>so the courts can consider it the same image for search purposes
An important part of the ruling seems to be that neither Google nor the police had the original image or any information about it, so the police viewing the image gave them more information than Google matching the hash gave Google: for example, consider how the suspect being in the image would have changed the case, or what might happen if the image turned out not to be CSAM, but showed the suspect storing drugs somewhere, or was even, somehow, something entirely legal but embarrassing to the suspect. This isn't changed by the type of hash.
It shouldn't. Google hasn't otherwise seen the image, so the employee couldn't have witnessed a crime. There are reportedly many perfectly legal images that end up in these almost perfectly unaccountable databases.
That makes sense - if they were using a cryptographic hash then people could get around it by making tiny changes to the file. I've used some reverse image search tools, which use perceptual hashing under the hood, to find the original source for art that gets shared without attribution (SauceNAO is pretty solid). They're good, but they definitely have false positives.
Now you’ve got me interested in what’s going on under the hood, lol. It’s probably like any other statistical model: you can decrease your false negatives (images people have cropped or added watermarks/text to), but at the cost of increased false positives.
Rather simple methods are surprisingly effective [1]. There's sure to be more NN fanciness nowadays (like Apple's proposed NeuralHash), but I've used the algorithms described by [1] to great effect in the not-too-distant past. The HN discussion linked in that article is also worth a read.
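For a sense of how simple: here is a toy sketch of the "average hash" approach described in [1], using Pillow (my own simplification, not a production implementation):

    from PIL import Image

    def average_hash(path, size=8):
        # Shrink to an 8x8 grayscale thumbnail, then emit one bit per
        # pixel: is it brighter than the mean?
        img = Image.open(path).convert("L").resize((size, size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        bits = 0
        for p in pixels:
            bits = (bits << 1) | (1 if p > mean else 0)
        return bits

    def distance(h1, h2):
        # Hamming distance; small values survive rescaling,
        # recompression, and minor edits.
        return bin(h1 ^ h2).count("1")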
This submission is the first I've heard of the concept. Are there OSS implementations available? Could I use this, say, to deduplicate resized or re-jpg-compressed images?
The hash functions used for these purposes are usually not cryptographic hashes. They are "perceptual hashes" that allows for approximate matches (e.g. if the image has been scaled or brightness-adjusted). https://en.wikipedia.org/wiki/Perceptual_hashing
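And to the deduplication question upthread: yes, e.g. the open-source Python `imagehash` library. A hedged sketch (the 5-bit threshold is an assumption to tune against your own data):

    import imagehash
    from PIL import Image

    # Perceptual hashes of an original and a resized/recompressed copy
    # differ by only a few bits, unlike cryptographic hashes.
    h1 = imagehash.phash(Image.open("original.jpg"))
    h2 = imagehash.phash(Image.open("resized_copy.jpg"))

    # Subtracting two hashes gives their Hamming distance.
    if h1 - h2 <= 5:  # threshold is a tunable assumption
        print("likely the same underlying image")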
> Maybe there's some slight doubt in whether or not the image could be a hash collision, which depends on the hash method. It may be incredibly unlikely (near impossible?) for any hash collision depending on the specific hash strategy.
If it was a cryptographic hash (apparently not), this mathematical near-certainty is necessary but not sufficient. Like cryptography used for confidentiality or integrity, the math doesn't at all guarantee the outcome; the implementation is the most important factor.
Each entry in the illegal hash database, for example, relies on some person characterizing the original image as illegal - there is no mathematical formula for defining illegal images - and that characterization could be inaccurate. It also relies on the database's integrity, the user's application and its implementation, even the hash calculator. People on HN can imagine lots of things that could go wrong.
If I were a judge, I'd just want to know if someone witnessed CP or not. It might be unpleasant, but we're talking about arresting someone for CP, which even sans conviction can be highly traumatic (including time in jail, waiting for bail or trial, as a ~child molester) and can destroy people's lives and reputations. Do you fancy appearing at a bail hearing about your CP charge, even if you are innocent? 'Kids, I have something to tell you ...'; 'Boss, I can't work for a couple weeks because ...'.
It seems like there just needs to be case law about the qualifications of an image hash in order to be counted as probable cause for a warrant. Of course you could make an image hash be arbitrarily good or bad.
I am not at all opposed to any of this "get a damn warrant" pushback from judges.
I am also not at all opposed to Google searching its cloud storage for this kind of content. There are a lot of fishing expeditions to find potentially illegal activity that I would mind a cloud provider going on, but this one I am fine with.
I do strongly object to companies searching content for illegal activity on devices in my possession absent probable cause and a warrant (that they would have to get in a way other than searching my device). Likewise I object to the pervasive and mostly invisible delivery to the cloud of nearly everything I do on devices I possess.
In other words, I want custody of my stuff and for the physical possession of my stuff to be protected by the 4th amendment and not subject to corporate search either. Things that I willingly give to cloud providers that they have custody of I am fine with the cloud provider doing limited searches and the necessary reporting to authorities. The line is who actually has the bits present on a thing they hold.
I think if the hashes were made available to the public, we should just flood the internet with matching but completely innocuous images so they can no longer be used to justify a search
>please use the original title, unless it is misleading or linkbait; don't editorialize. (@dang)
On topic, I like this quote from the first page of the opinion:
>A “hash” or “hash value” is “(usually) a short string of characters generated from a much larger string of data (say, an electronic image) using an algorithm—and calculated in a way that makes it highly unlikely another set of data will produce the same value.” United States v. Ackerman, 831 F.3d 1292, 1294 (10th Cir. 2016) (Gorsuch, J.).
It's amusing to me that they use a supreme court case as a reference for what a hash is rather than eg. a textbook. It makes sense when you consider how the court system works but it is amusing nonetheless that the courts have their own body of CS literature.
Maybe someone could publish a "CS for Judges" book that teaches as much CS as possible using only court decisions. That could actually have a real use case when you think of it. (As other commenters pointed out, the hashing definition given here could use a bit more qualification, and should at least differentiate between neural hashes and traditional ones like MD5, especially as it relates to the likeliness that "another set of data will produce the same value." Perhaps that could be an author's note in my "CS for Judges" book.)
> Maybe someone could publish a "CS for Judges" book
At last, a form of civic participation which seems both helpful and exciting to me.
That said, I am worried that lot of necessary content may not be easy to introduce with hard precedent, and direct advice or dicta might somehow (?) not be permitted in a case since it's not adversarial... A new career as a professional expert witness--even on computer topics--sounds rather dreary.
What's so weird about this? CS literature is not legally binding in any way. Of course a judge would rather quote a previous ruling by fellow judge than a textbook, Wikipedia, or similar sources.
From what I understand, a judge is free to decide matters of fact on his own, which could include from a textbook. Also, it is not clear that matters of fact decided by the Supreme Court are binding to lower courts. Additionally, facts and even meanings of words themselves can change, which makes previous findings of fact no longer applicable. That's actually true in this case as well. "Hash" as used in the context of images generally meant something like an MD5 hash (which itself is now more prone to collisions than before). The "hash" in the Google case appears to be a perceptual hash, which I don't think was as commonly used until recently (I could be wrong here). So whatever findings of fact were made by the Supreme Court about how reliable a hash is are not necessarily relevant to begin with. Looking at this specific case, here is the full quote from United States v. Ackerman:
>How does AOL's screening system work? It relies on hash value matching. A hash value is (usually) a short string of characters generated from a much larger string of data (say, an electronic image) using an algorithm—and calculated in a way that makes it highly unlikely another set of data will produce the same value. Some consider a hash value as a sort of digital fingerprint. See Richard P. Salgado, Fourth Amendment Search and the Power of the Hash, 119 Harv. L. Rev. F. 38, 38-40 (2005). AOL's automated filter works by identifying the hash values of images attached to emails sent through its mail servers.[0]
I don't have access to this issue of Harvard Law Review but looking at the first page, it says:
>Hash algorithms are used to confirm that when a copy of data is made, the original is unaltered and the copy is identical, bit-for-bit.[1]
This is clearly referring to a cryptographic hash like MD5, not a perceptual hash/neural hash as in Google. So the actual source here is not necessarily dealing with the same matters of fact as the source of the quote here (although there could be valid comparisons between them).
All this said, judges feel more confident in citing a Supreme Court case than a textbook because 1. it is easier to understand for them 2. the matter of fact is then already tied to a legal matter, instead of the judge having to make that leap himself and also 3. judges are more likely to read relevant case law to begin with since they will read it to find precedent in matters of law – which are binding to lower courts. This is why a "CS for Judges" could be a useful reference book.
Lastly, I should have looked a bit more closely at the quoted case. This is actually not a supreme court case at all. Gorsuch was nominated in 2017 and this case is from 2016.
> As the district court correctly ruled in the alternative, the good faith exception to the exclusionary rule supports denial of Maher’s suppression motion because, at the time authorities opened his uploaded file, they had a good faith basis to believe that no warrant was required
So this means this conviction is upheld but future convictions may be overturned if they similarly don't acquire a warrant?
> the good faith exception to the exclusionary rule supports denial of Maher’s suppression motion because, at the time authorities opened his uploaded file, they had a good faith basis to believe that no warrant was required
This "good faith exception" is so absurd I struggle to believe that it's real.
Ordinary citizens are expected to understand and scrupulously abide by all of the law, but it's enough for law enforcement to believe that what they're doing is legal even if it isn't?
What that is is a punch line from a Chappelle bit[1], not a reasonable part of the justice system.
The courts accept good-faith arguments at times. They will give reduced sentences, or even none at all, if they think you acted in good faith. There are enough situations where it is legal to kill someone that we have laws spelling out when one person can legally kill another (hopefully they never apply to you).
Note that this case is not about ignorance of the law. This is "I knew the law and was trying to follow it - I just honestly thought it didn't apply because of some tricky situation that isn't 100% clear."
The difference between "I don't know" and "I thought it worked like this" is purely a matter of degrees of ignorance. It sounds like the cops were ignorant of the law in the same way as someone who is completely unaware of it, just to a lesser degree. Unless they were misinformed about the origins of what they were looking at, it doesn't seem like it would be a matter of good faith, but purely negligence.
“Mens rea” is a key component of most crimes. Some crimes can only be committed if the perpetrator knows they are doing something wrong. For example, fraud or libel.
> “Mens rea” is a key component of most crimes. Some crimes can only be committed if the perpetrator knows they are doing something wrong. For example, fraud or libel.
We're talking about orthogonal issues.
Mens rea applies to whether the person performs the act on purpose. Not whether they were aware that the act was illegal.
Let's use fraud as an example since you brought it up.
If I bought an item from someone and used counterfeit money on purpose, that would be fraud. Even if I truly believed that doing so was legal. But it wouldn't be fraud if I didn't know that the money was counterfeit.
At the time, what they did was assumed to be legal because no one had ruled on it.
Now, there is prior case law declaring it illegal.
The ruling is made in such a way to say “we were allowing this, but we shouldn’t have been, so we wont allow it going forward”.
I am not a legal scholar, but that’s the best way I can explain it. The way that the judicial system applies to law is incredibly complex and inconsistent.
This is a deeply problematic way to operate. En masse, it has the right result, but, for the individual that will have their life turned upside down, the negative impact is effectively catastrophic.
This ends up feeling a lot like gambling in a casino. The casino can afford to bet and lose much more than the individual.
I don't care nearly as much about the 4th amendment when the person is guilty. I care a lot when the person is innocent. Searches of innocent people are costly for the innocent person, and so we require warrants to ensure such searches are minimized (even though most warrants are approved, the act of getting one forces the police to be careful). If a search were completely costless to the innocent I wouldn't be against them, but there are many ways a search that finds nothing is costly to the innocent.
If the average person is illegally searched, but turns out to be innocent, what are the chances they bother to take the police to court? It's not like they're going to be jailed or convicted, so many people would prefer to just try to move on with their life rather than spend thousands of dollars litigating a case in the hopes of a payout that could easily be denied if the judge decides the cops were too stupid to understand the law rather than maliciously breaking it.
Because of that, precedent is largely going to be set with guilty parties, but will apply equally to violations of the rights of the innocent.
I want guilty people to go free if their 4th amendment rights are violated; that's the only way to ensure police are meticulous about protecting people's rights.
Thus, it's not clear that any harm was caused because the right wasn't clearly enshrined and had the police known that it was, they likely would have followed the correct process. There was no intention to violate rights, and no advantage gained from even the inadvertent violation of rights. But the process is updated for the future.
This specific conviction upheld, yes. But no, this ruling doesn't speak to whether or not any future convictions may be overturned.
It simply means that at the trial court level, future prosecutions will not be able to rely on the good faith exception to the exclusionary rule if warrantless inculpatory evidence is obtained under similar circumstances. If the government were to try to present such evidence at trial and the trial judge were to admit it over the objection of the defendant, then that would present a specific ground for appeal.
This ruling merely bolsters the 'better to get a warrant' spirit of the Fourth Amendment.
The harshness of sentence is not for the action of keeping the photos in itself, but the individual suffering and social damage caused by the actions that he incentivizes when he consumes such content.
Consumption per se does not incentivize it, though; procurement does. It's not unreasonable to causally connect one to the other, but I still think that it needs to be done explicitly. Strict liability for possession in particular is nonsense.
There's also an interesting question wrt simulated (drawn, rendered etc) CSAM, especially now that AI image generators can produce it in bulk. There's no individual suffering nor social damage involved in that at any point, yet it's equally illegal in most jurisdictions, and the penalties aren't any lighter. I've yet to see any sensible arguments in favor of this arrangement - it appears to be purely a "crime against nature" kind of moral panic over the extreme ickiness of the act as opposed to any actual harm caused by it.
It can. In several public cases it seems fairly clear that there is a "community" aspect to these productions and many of these sites highlight the number of downloads or views of an image. It creates an environment where creators are incentivized to go out of their way to produce "popular" material.
> Strict liability for possession in particular is nonsense.
I entirely disagree. Offenders tend to increase their level of offense. This is about preventing the problem from becoming worse and new victims being created. It's effectively the same reason we harshly prosecute people who torture animals.
> nor social damage involved in that at any point,
That's a bold claim. Is it based on any facts or study?
> over the extreme ickiness of the act as opposed to any actual harm caused by it.
It's about the potential class of victims and the outrageous life long damage that can be done to them. The appropriate response to recognizing these feelings isn't to hand them AI generated material to sate their desires. It's to get them into therapy immediately.
Icky things were historically made illegal all the time, but most of those historical examples have not fared well in retrospect. Modern justice systems are generally predicated on some quantifiable harm for good reasons.
Given the extremely harsh penalties at play, I am not at all comfortable about punishing someone with a multi-year prison sentence for possession of a drawn or computer generated image. What exactly is the point, other than people getting off from making someone suffer for reasons they consider morally justifiable?
There's no room for sensible discussion like this in these matters. Not demanding draconian sentences for morally outraging crimes is morally outraging.
GP is saying that people who want this to be a crime are morally outraged that someone else might disagree, and so it's impossible to have a reasonable debate with them about it. They're probably correct, but it never hurts to try.
Assuming the person is a passive consumer with no messages / money exchanged with anyone, it is very hard to prove social harm or damage. Sentences should be proportional to the crime. Treating possession of CP as equivalent to literally raping a child just seems absurd to me. IMO, just for the legal protection of the average citizen, simple possession should never warrant jail time.
For the record, I'm against any kind of child abuse, and 25 years for an actual abuser would not be a problem.
But...
Should you go to prison for possessing images of an adult being raped? What if you don't even know it's rape? What if the person is underage, but you don't know (they look adult to you)? What about a murder video instead of rape? What if the child porn is digitally created (AI, Photoshop, whatever)? What if a murder scene is digitally created (fake bullets, holes+blood made in video editing software)? What if you go to a mainstream porno store, buy a mainstream professional porno video, and later find out that the actress was a 15-year-old Traci Lords?
> the individual suffering and social damage caused by the actions that he incentivizes
That's some convoluted way to say he deserves 25 years because he may (or may not) at some point in his life molest a kid.
Personally I think that the idea of convicting a man for his thoughts is borderline crazy.
Users of child pornography need to be arrested, treated, flagged, and receive psychological follow-up all along their lives, but sending them away for 25 years is lazy and dangerous, because when they get out they will be even worse than before and won't have much to lose.
The language is defined by how people actually use it, not by how a handful of activists try to prescribe its use. Ask any random person on the street, and most of them have no idea what CSAM is, but they know full well what "child porn" is. Dictionaries, encyclopedias etc also reflect this common sense usage.
The justification for this attempt to change the definition doesn't make any sense, either. Just because some porn is child porn, which is bad, doesn't in any way imply that all porn is bad. In fact, I would posit that making this argument in the first place is detrimental to sex-positive outlook on porn.
> Just because some porn is child porn, which is bad, doesn't in any way imply that all porn is bad.
I think people who want others to stop using the term "child porn" are actually arguing the opposite of this. Porn is good, so calling it "child porn" is making a euphemism or otherwise diminishing the severity of "CSAM" by using the positive term "porn" to describe it.
I don't think the established consensus on the meaning of the word "porn" itself includes some kind of inherent implied positivity, either; not even among people who have a generally positive attitude towards porn.
"Legitimate" is probably a better word. I think you can get the point though. Those I have seen preferring the term CSAM are more concerned about CSAM being perceived less negatively when it is called child porn than they are about consensual porn being perceived more negatively.
Stop doing this. You are confusing the perfectly noble aspect of calling it abuse material to make it victim centric with denying the basic purpose of the material. The people who worked hard to get it called CSAM do not deny that it’s pornography for its users.
The distinction you went on to make was necessary specifically for this reason.
It's a reasonable argument, but a concerning one because it hinges on a couple of layers of indirection between the person engaging in consuming the content and the person doing the harm / person who is harmed.
That's not outside the purview of US law (especially in the world post-reinterpretation of the Commerce Clause), but it is perhaps worth observing how close to the cliff of "For the good of Society, you must behave optimally, Citizen" such reasoning treads.
For example: AI-generated CP (or hand-drawn illustrations) is viscerally repugnant, but does the same "individual suffering and social damage" reasoning apply to making it illegal? The FBI says yes to both in spite of the fact that we can name no human who was harmed or was unable to give consent in its fabrication (handwaving the source material for the AI; if one chooses not to handwave it, drop that question on the floor and focus on under what reasoning we make hand-illustrated cartoons illegal to possess that couldn't be applied to pornography in general).
> The FBI says yes to both in spite of the fact that we can name no
They have two arguments for this (that I am aware of). The first argument is a practical one, that AI-generated images would be indistinguishable from the "real thing", but that the real thing still being out there would complicate their efforts to investigate and prosecute. While everyone might agree that this is pragmatic, it's not necessarily constitutionally valid. We shouldn't prohibit activities based on whether these activities make it more difficult for authorities to investigate crimes. Besides, this one's technically moot... those producing the images could do so in such a way (from a technical standpoint) that they were instantly, automatically, and indisputably provable as being AI-generated.
All images could be mandated to require embedded metadata which describes the model, seed, and so forth necessary to regenerate it. Anyone who needs to do so could push a button, the computer would attempt to regenerate the image from that seed, and the computer could even indicate that the two images matched (the person wouldn't even need to personally view the image for that to be the case). If the application indicated they did not match, then authorities could investigate it more thoroughly.
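As a rough sketch of how that push-button check could work, assuming a deterministic generation pipeline (the `regenerate` callable below is hypothetical, a stand-in for whatever model/seed pipeline produced the image; no such standard exists today):

```python
import hashlib
from typing import Callable

def verify_provenance(image_bytes: bytes, metadata: dict,
                      regenerate: Callable[[dict], bytes]) -> bool:
    # Re-run the (assumed deterministic) generation pipeline from the
    # embedded model/seed metadata, then compare cryptographic hashes.
    # A match proves the file is reproducible from the declared
    # parameters -- no human ever needs to view the image itself.
    regenerated = regenerate(metadata)  # hypothetical pipeline hook
    return (hashlib.sha256(regenerated).digest()
            == hashlib.sha256(image_bytes).digest())
```

On a mismatch, the file would simply be escalated for closer investigation, as described above.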
The second argument is an economic one. That is, if a person "consumes" such material, they increase economic demand for it to be created. Even in a post-AI world, some "creation" would be criminal. Thus, the consumer of such imagery does (indirectly) cause more child abuse, and the government is justified in prohibiting AI-generated material. This is a weak argument on the best of days. One thing law enforcement efforts excel at is exactly this: when there are two varieties of a behavior, one objectionable and the other not, but similar enough that they might at a glance be mistaken for one another, enforcement can greatly disincentivize one without infringing on the other. And since this is an economic argument, note that economic actors seek to reduce their risk of doing business, and so would gravitate toward creating the legal variety of material.
While their arguments are dumb, this filth is as reprehensible as anything. The only question worth asking or answering is: were the AI-generated variety legal, would it result in fewer children being harmed or not? It's commonly claimed that the easy availability of mainstream pornography has reduced the rate of rape since the mid-20th century.
The problem with the internet nowadays is that a few big players are making up their own law. Very often it goes against local laws, but nobody can fight it. For example, someone created some content, but another person uploaded it and got better scores, which got the original poster blocked. Another example: children were playing a violin concert and the audio got removed due to an alleged copyright violation. No possibility to appeal; nobody sane would go to court. It just goes this way...
> the private search doctrine, which authorizes a government actor to repeat a search already conducted by a private party without securing a warrant.
IANAL, etc. Does that mean that if someone breaks into your house in search of drugs, finds and steals some, and is caught by the police and confesses all, the police can then search your house without a warrant?
IANAL either, but from what I've read before the courts treat searches of your home with extra care under the 4th Amendment. At least one circuit has pushed back on applying private search cases to residences, and that was for a hotel room[0]:
> Unlike the package in Jacobsen, however, which "contained nothing but contraband," Allen's motel room was a temporary abode containing personal possessions. Allen had a legitimate and significant privacy interest in the contents of his motel room, and this privacy interest was not breached in its entirety merely because the motel manager viewed some of those contents. Jacobsen, which measured the scope of a private search of a mail package, the entire contents of which were obvious, is distinguishable on its facts; this Court is unwilling to extend the holding in Jacobsen to cases involving private searches of residences.
So under your hypothetical, I'd expect the police would be able to test "your drugs" that they confiscated from the thief, and use any findings to apply for a warrant for a search of your house, but any search without a warrant would be illegal.
"That, however, does not mean that Maher is
entitled to relief from conviction. As the district court correctly ruled in the
alternative, the good faith exception to the exclusionary rule supports denial of
Maher’s suppression motion because, at the time authorities opened his uploaded
file, they had a good faith basis to believe that no warrant was required."
"Defendant [..] stands convicted following a guilty plea in
the United States District Court for the Northern District of New York
(Glenn T. Suddaby, Judge) of both receiving and possessing approximately
4,000 images and five videos depicting child pornography"
A win for google, for the us judicial system, and for constitutional rights.
You forgot your IANAL, but thankfully it's obvious.
That's a ridiculous desire. In that world, if I delete your comment, and you kill me in retaliation, you should be let free if you argue that my deleting your comment infringed your right to free speech?
What I mean specifically is that because the police saw illegally obtained evidence, all evidence collected after that point should be considered fruit of the poisoned tree and inadmissible.
I was using those MD5 sums to flag images 20 years ago for the government. There were occasional false positives, but the safety team would review those, not operations. My only role was to burn the user's account to a DVD (via a script) and have the police officer pick up the DVD. We never touched the disk, and only burned it with a warrant. (We never saw or touched the user's data...)
I figured this was the common industry standard for chain of custody of evidence. Same with police videos: they are uploaded to the courts' digital evidence repository, and everyone who looks at the evidence is logged.
> It feels like it incentivizes the police to minimize their understanding of the law so that they can believe they are following it.
That's a bingo. That's exactly what they do, and why so many cops know less about the law than random citizens. A better society would have high standards for the knowledge expected of police officers, including things like requiring a 4-year criminal justice or pre-law degree to be eligible for hire, rather than capping IQ and preferring people who have had prior experience in conducting violent actions.
Yes, this likely explains part of why the Norwegian police behave like professionals who are trying to do their job with high standards of performance and behavior, while the police in the US behave like a bunch of drinking buddies who used to be bullies in high school, looking for their next target to harass.
The good faith exception requires the belief be reasonable. Ignorance of clearly settled law is not reasonable, it should be a situation where the law was unclear, had conflicting interpretations or could otherwise be interpreted the way the police did by a reasonable person.
It's crazy that the most dangerous people one regularly encounters can do anything they want as long as they believe they can do it. The good faith exemption has to be one of the most fascist laws on the books today.
> "the good faith exception to the exclusionary rule supports denial of Maher’s suppression motion because, at the time authorities opened his uploaded file, they had a good faith basis to believe that no warrant was required."
In no other context or career can you do anything you want and get away with it just as long as you say you thought you could. You'd think police officers would be held to a higher standard, not no standard.
And specifically with respect to the law, breaking a law and claiming you didn't know you did anything wrong as an individual is not considered a valid defense in our justice system. This same type of standard should apply even more to trained law enforcement, not less, otherwise it becomes a double standard.
No, this is breaking the law by saying "this looked like one of the situations where I already know the law doesn't apply." If Google had looked at the actual image and said it was child porn, instead of just saying it was similar to some image that is child porn, this would be 100% legal, as the courts have already said. That difference is subtle enough that I can see how someone would get it wrong (and in fact I would expect other courts to rule differently).
That's not what this means. One can ask whether the belief is reasonable, that is justifiable by a reasoning process. The argument for applying the GFE in this case is that the probability of false positives from a perceptual hash match is low enough that it's OK to assume it's legit and open the image to verify that it was indeed child porn. They then used that finding to get warrants to search the guy's gmail account and later his home.
If I'm not a professional and I hurt someone while trying to save their life by doing something stupid, that's understandable ignorance.
If a doctor stops to help someone and hurts them because the doctor did something stupid, that is malpractice and could get them sued and maybe get their license revoked.
Would you hire a programmer who refused to learn how to code and claimed "good faith" every time they screwed things up? Good faith shouldn't cover willful ignorance. A cop is hired to know, understand, and enforce the law. If they can't do that, they should be fired.
It's not exactly the same imo, since Good Samaritan laws are meant to protect someone who is genuinely trying to do what a reasonable person could consider "something positive"
In this case you're correct. But the good faith exemption is far broader than this and applies even to officers' completely personal false beliefs in their authority.
I think the judge chose to relax a lot on this one due to the circumstances. Releasing into society a man found with 4,000 child porn photos on his computer would be a shame.
But yeah, this opens the gates of precedent too wide for tyranny, unfortunately...
- Defendant was in fact sending CP through his gmail.
- gmail correctly detects and flags it based on hash value
- Google sends message to NCMEC based on hash value
- NCMEC sends it to police based on hash value
Now police are facing the obvious question, is this actually CP? They open the image, determine it is, then get a warrant to search his gmail account, and (later) another warrant to search his home.
The court here is saying they should have got a warrant to even look at the image in the first place. But warrants only issue on probable cause. What's the PC here? The hash value. What's the probability of hash collisions? Non-zero but very low.
The practical upshot of this is that all reports from NCMEC will now go through an extra step of the police submitting a copy of the report with the hash value and some boilerplate document saying 'based on my law enforcement experience, hash values are pretty reliable indicators of fundamental similarity', and the warrant application will then be rubber stamped by a judge.
An analogous situation would be where I send a sealed envelope with some documents to the police, writing on the outside 'I believe the contents of this envelope are proof that John Doe committed [specific crime]', and the police have to get a warrant to open the envelope. It's arguably more legally consistent, but in practice it just creates an extra stage of legal bureaucracy/delay with no appreciable impact on the eventual outcome.
Recall that the standard for issuance of a warrant is 'probable cause', not 'mathematically proven cause'. Hash collisions are a possibility, but a sufficiently unlikely one that it doesn't matter. Probable cause means 'a fair probability' based on independent evidence of some kind - testimony, observation, forensic results or so. Even a shitty hash function that's only 90% reliable is going to meet that threshold. In the 10% of cases where the opened file turns out to be a random image with no pornographic content it's a 'no harm no foul' situation.
and a more detailed examination of common perceptual hashing algorithms (skip to table 3 for the collision probabilities): https://ceur-ws.org/Vol-2904/81.pdf
I think what a lot of people are implicitly arguing here is that the detection system needs to be perfect before anyone can do anything. Nobody wants the job of examining images to check if they're CP or not, so we've outsourced it to machines that do so with good-but-not-perfect accuracy and then pass the hot potato around until someone has to pollute their visual cortex with it.
Obviously we don't want to arrest or convict people based on computer output alone, but how good does it have to be (in % or odds terms) in order to begin an investigation - not of the alleged criminal, but of the evidence itself? Should companies like Google have to submit an estimate of the probability of hash collisions using their algorithm and based on the number of image hashes that exist on their servers at any given moment? Should they be required to submit source code used to derive that? What about the microcode of the silicon substrate on which the calculation is performed?
All other things being equal, what improvement will result here from adding another layer of administrative processing, whose outcome is predetermined?
And there was a whole lot of explanation of how probable cause works and how it's different from programmers' aspirations to perfection.
> Recall that the standard for issuance of a warrant is 'probable cause', not 'mathematically proven cause'. Hash collisions are a possibility, but a sufficiently unlikely one that it doesn't matter. Probable cause means 'a fair probability' based on independent evidence of some kind - testimony, observation, forensic results or so. Even a shitty hash function that's only 90% reliable is going to meet that threshold. In the 10% of cases where the opened file turns out to be a random image with no pornographic content it's a 'no harm no foul' situation.
But do we actually know that? Do we know what the thresholds of "similarity" in use by Google and others are, and how many false positives they trigger? Billions of photos are processed daily by Google's services (Google Photos, chat programs, Gmail, Drive, etc.), and very few people actually send such stuff via Gmail, so what if the reality is that 99.9% of the matches are actually false positives? What about intentional matches, like someone intentionally creating some random SFW meme image that (when hashed) matches some illegal image hash, and that photo then being sent around intentionally? Should police really be checking all those emails, photos, etc., without warrants?
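To make the base-rate worry concrete, here's a toy Bayes calculation (every number is invented for illustration; only Google knows the real figures):

```python
def p_csam_given_match(prevalence: float, tpr: float, fpr: float) -> float:
    # Bayes' rule: P(actually CSAM | hash match), given the base rate
    # of CSAM among scanned files and the detector's true/false
    # positive rates.
    p_match = tpr * prevalence + fpr * (1.0 - prevalence)
    return (tpr * prevalence) / p_match

# Invented numbers: 1 in a million scanned files is CSAM; the detector
# catches 99% of them with a 1-in-100,000 false-positive rate. Even
# then, roughly 9 out of 10 matches would be false positives:
print(p_csam_given_match(prevalence=1e-6, tpr=0.99, fpr=1e-5))  # ~0.09
```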
Well, that's why I'm asking what threshold of certainty people want to apply. The hypotheticals you cite are certainly possible, but are they likely?
> what if the reality is that 99.9% of the matches are actually false positives
Don't you think that if Google were deluging the cops with false positive reports that turned out to be perfectly innocuous 999 times out of 1000, that police would call them up and say 'why are you wasting our time with this?' Or that defense lawyers wouldn't be raising hell if there were large numbers of clients being investigated over nothing? And how would running it through a judge first improve that process?
> What about intentional matches, like someone intentionally creating some random SFW meme image [...]
OK, but what is the probability of that happening? And if such images are being mailed in bulk, what would be the purpose other than to provide cover for CSAM traders? The tactic would only be viable for as long as it takes a platform operator to change up their hashing algorithm. And again, how would the extra legal step of consulting a judge alleviate this?
> should police really be checking all those emails, photos, etc., without warrants?
But that's not happening. As I pointed out, police examined the submitted image evidence to determine if it was CP (it was). Then they got a warrant to search the gmail account, and following that another warrant to search his home. They didn't investigate the criminal first; they investigated an image file submitted to them, to determine whether it was evidence of a crime.
And yet again, how would bouncing this off a judge improve the process? The judge will just look at the report submitted to the police and a standard police letter saying 'reports of this kind are reliable in our experience' and then tell the police yes, go ahead and look.
The old example is the email server administrator. If the email administrator has to view the contents of user messages as a part of regular maintenance, and that email administrator notices violations of law in those user messages, they can report it to law enforcement. In that case law enforcement can receive the material without a warrant, but only if law enforcement never asked for it before it was gifted to them. There are no fourth amendment protections provided to offenders in this scenario of third-party accidental discovery. Typically, in these cases the email administrator does not have an affirmative requirement to report violations of law to law enforcement unless specific laws claim otherwise.
If on the other hand law enforcement approaches that email administrator to fish for illegal user content then that email administrator has become an extension of law enforcement and any evidence discovered cannot be used in a criminal proceeding. Likewise, if the email administrator was intentionally looking through email messages for violations of law even not at the request of law enforcement they are still acting as agents of the law. In that case discovery was intentional and not an unintentional product of system maintenance.
There is a third scenario: obscenity. Obscenity is illegal intellectual property, whether digital or physical, as defined by criminal code. Possession of obscene materials is a violation of criminal law for all persons, businesses, and systems in possession. In that case an email administrator who accidentally discovers obscene material does have a required obligation to report the discovery, typically through their employer's corporate legal process, to law enforcement. Failure to disclose such discoveries potentially aligns the system provider with the illegal conduct of the violating user.
Google's discovery, though, was not accidental as a result of system maintenance. It was due to an intentional discovery mechanism based on stored hashes, which puts Google's conduct in line with law enforcement even if they specified their conduct in their terms of service. That is why the appeals court claims the district court erred by denying the defendant's right to suppression on fourth amendment grounds.
The saving grace for the district court was a good faith exception, such as inevitable discovery. The authenticity and integrity of the hash algorithm was never in question by any party, so no search for violating material was necessary; that established probable cause, allowing law enforcement reasonable grounds to proceed to trial. No warrant was required because the evidence was likely sufficient at trial even if law enforcement did not directly view the image in question (though they did verify the image). None of that was challenged by either party. What was challenged was just Google's conduct.
The judge doesn't really understand hashes well. They say things like "Google assigned a hash", which is not true: Google calculated the hash.
Also, I'm surprised the third-party doctrine doesn't apply. The "private search doctrine" is mentioned, but generally you don't have an expectation of privacy for things you share with Google.
"More simply, a hash value is a string of characters obtained by processing the contents of a given computer file and assigning a sequence of numbers and letters that correspond to the file’s contents."
Google assigned the hashing algorithm (maybe; assuming it wasn't chosen in some law somewhere, I know this CSAM hashing is something the big tech companies work on together).
Once the hashing algorithm was assigned, individual values are computed or calculated.
I don't think the judge's wording is all that bad but the word "assigned" is making it sound like Google exercised some agency when really all it did was apply a pre-chosen algorithm.
And it should be effectively injective under most conditions. (Perfect injectivity is obviously impossible in practice, but hashes with common collisions shouldn't be allowed as legal evidence.) Also, neural/visual hashes like those used by big tech make things tricky.
The hash in question has many collisions. It is probably enough to put on a warrant application, but it may not be enough to get a warrant without some other evidence (it can be enough evidence to justify looking for other public signs of evidence, or perhaps a warrant could issue because a number of images match different hashes).
There's a password on my Google account, I totally expect to have privacy for anything I didn't choose to share with other people.
The hash is a kind of metadata recorded by Google; I feel like Google using it to keep child porn off their systems should be reasonable. Same ballpark as limiting my storage to 1GB based on file sizes. Sharing metadata without a warrant is a different question, though.
As should be expected from the lawyer world, it seems like whether you have an expectation of privacy using gmail comes down to very technical word choices in the ToS, which of course neither this guy nor anyone else has ever read. Specifically, it may be legally relevant to your expectation of privacy whether Google says they "may" or "will" scan for this stuff.
Out of curiosity, what is false positive rate of a hash match?
If the FPR is comparable to asking a human "are these the same image?", then it would seem to be equivalent to a visual search. I wonder if (or why) human verification is actually necessary here.
The reason human verification is necessary is that the government is relying on something called the "private search" doctrine to conduct the search without a warrant. This doctrine allows them to repeat a search already conducted by a private party (i.e., Google) without getting a warrant. Since Google didn't actually look at the file, the government is not able to look at the file without a warrant, as that search exceeds the scope of the initial search Google performed.
I doubt SHA-1 hashes are used for this. Those image hashes should match files regardless of orientation, cropping, resizing, re-compression, color correction, etc. Collisions could be far more frequent with these hashes.
The hash should ideally match even if you use photoshop to cut the one person out of the picture and put that person into a different photo. I'm not sure if that is possible, but that is what we want.
Naively, 1/(2^{hash_size_in_bits}). Which is about 1 in 4 billion odds for a 32 bit hash, and gets astronomically low at higher bit counts.
Of course, that's assuming a perfect, evenly distributed hash algorithm. And that's just the odds that any given pair of images has the same hash, not the odds that a hash conflict exists somewhere on the internet.
Normal hash functions have pseudo-random outputs and they can collide even when the input space is much smaller than the output space.
In fact, I'll go run ten million values, encoded into 24 bits each, through a 40 bit hash and count the collisions. My hash of choice will be a truncated sha256.
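For anyone who wants to reproduce that experiment, here's roughly what it looks like (same setup as described: ten million distinct 24-bit inputs, SHA-256 truncated to 40 bits; expect it to take a few minutes and a good chunk of RAM):

```python
import hashlib

N = 10_000_000          # ten million distinct values
seen = set()
collisions = 0
for i in range(N):
    # Encode each value into 24 bits, hash it, keep the first 40 bits.
    digest = hashlib.sha256(i.to_bytes(3, "big")).digest()[:5]
    if digest in seen:
        collisions += 1
    else:
        seen.add(digest)

# The birthday bound predicts roughly N^2 / 2^41 ~= 45 collisions,
# despite the input space (2^24) being smaller than the output (2^40).
print(collisions)
```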
> Out of curiosity, what is false positive rate of a hash match?
No way to know without knowledge of the 'proprietary hashing technology'.
Theoretically though, a hash can have infinitely many inputs that produce the same output.
Mismatching hash values from the same hashing algorithm can prove mismatching inputs, but matching hash values don't ensure matching inputs.
> I wonder if (or why) human verification is actually necessary here
It's not about frequency, it's about criticality of getting it right. If you are going to make a negatively life-altering report on someone, you'd better make sure the accusation is legitimate.
I'd say the focus on hashing is a bit of a red herring.
Most anyone would agree that the hash matching should probably form probable cause for a warrant, allowing a judge to sign off on the police searching (i.e., viewing) the image. So, if it's a collision, the cops get a warrant and open up your linux ISO or cat meme, and it's all good. Probably the ideal case is that they get a warrant to search the specific image, and are only able to obtain a warrant to search your home and effects, etc. if the image does appear to be CSAM.
At issue here is the fact that no such warrant was obtained.
I think it'll prove far more likely that the government creates incentives to lead Google/other providers to fully do the search on their behalf.
The entire appeal seems to hinge on the fact that Google didn't actually view the image before passing it to NCMEC. Had Google policy been that all perceptual hash hits were reviewed by employees first, this would've likely been a one page denial.
If the hash algorithm were CRC8, then obviously it should not be probable cause for anything. If it were SHA-3, then it's basically proof beyond reasonable doubt of what the file is. It seems reasonable to question how collisions behave.
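To put toy numbers on that gap, a quick birthday-bound calculation (treating CRC8 as an ideal 8-bit hash, which flatters it):

```python
import math

def collision_prob(n_items: int, n_buckets: int) -> float:
    # Standard birthday-problem approximation: probability that at
    # least two of n_items uniformly random hashes collide.
    return 1.0 - math.exp(-n_items * (n_items - 1) / (2.0 * n_buckets))

print(collision_prob(20, 2**8))        # CRC8: ~52% with just 20 files
print(collision_prob(10**12, 2**256))  # SHA-3-256: ~0 at a trillion files
```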
I don't agree that it would be proof beyond reasonable doubt, especially because neither google nor law enforcement can produce the original image that got tagged.
By original do you mean the one in the database or the one on the device?
If the device spit out the same SHA3, then either it had the exact same image, or the SHA3 was planted somehow. The idea that it's actually a different file is not a reasonable doubt. It's too unlikely.
By the original, I mean the image that was used to produce the initial hash, which Google (rightly) claimed to be CSAM. Without some proof that an illicit image that has the same hash exists, I wouldn't accept a claim based on hash alone.
Oh definitely you need someone to examine the image that was put in the database to show it's CSAM, if the legal argument depends on that. But that's an entirely different question from whether the image on the device is that image.
> Most anyone would agree that the hash matching should probably form probable cause for a warrant
I disagree with this. Yes, if we were talking MD5, SHA, or some similar true hash algo, then the probability of a natural collision is small enough that I agree in principle.
But if the hash algo is of some other kind then I do not know enough about it to assert that it can justify probable cause. Anyone who agrees without knowing more about it is a fool.
That's fair. I came away from reading the opinion that this was not a perceptual hash, but I don't think it is explicitly stated anywhere. I would have similar misgivings if indeed it is a perceptual hash.
For non-broken cryptographic hashes (e.g., SHA-256), the false-positive rate is negligible. Indeed, cryptographic hashes were designed so that even nation-state adversaries do not have the resources to generate two inputs that hash to the same value.
These are not the kinds of hashes used for CSAM detection, though, because that would only work for the exact pixel-by-pixel copy - any resizing, compression etc would drastically change the hash.
Instead, systems like these use perceptual hashing, in which similar inputs produce similar hashes, so that one can test for likeness. Those have much higher collision rates, and are also much easier to deliberately generate collisions for.
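For a concrete taste of the difference, here's a minimal perceptual hash (dHash); this is a textbook toy for illustration, not the proprietary algorithm any provider actually uses:

```python
# Requires Pillow (pip install Pillow).
from PIL import Image

def dhash(path: str, size: int = 8) -> int:
    # Shrink to a (size+1) x size grayscale grid, then record whether
    # each pixel is brighter than its right-hand neighbor: a 64-bit
    # fingerprint that survives resizing, recompression, etc.
    img = Image.open(path).convert("L").resize((size + 1, size))
    px = list(img.getdata())
    bits = 0
    for row in range(size):
        for col in range(size):
            bits = (bits << 1) | (px[row * (size + 1) + col]
                                  > px[row * (size + 1) + col + 1])
    return bits

def hamming(a: int, b: int) -> int:
    # Similar images yield nearby hashes; matches are declared by
    # Hamming distance below a threshold, not by exact equality.
    return bin(a ^ b).count("1")
```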
So now an algorithm can interpret the law better than a judge. It’s amazing how technology becomes judge and jury while privacy rights are left to a good faith interpretation. Are we really okay with letting an algorithmic click define the boundaries of privacy?
> But, no, if the cops in good faith make a mistake, there's centuries of jurisprudence behind not letting people go free for it, not novel with this case.
Acting in good faith would be getting a warrant regardless, because the issue is not that time-sensitive and there are clear ambiguities here. They acted brashly under the assumption that if they were wrong, they could claim stupidity. It encourages the police to push the boundaries of legal behavior, because they still get to keep the evidence even if they are wrong and have committed an illegal search.
It is, yet again, rules for thee but not for me. Frankly, with the asymmetry of responsibility and experience with laws, the police should need to clear a MUCH higher bar to come within throwing distance of “good faith”.
Your argument is a bit disingenuous, because it's not applicable in situations where there is clear law clarifying that something can't be done.
You're pretending that cops are using this in situations where it's known that a warrant is needed, as opposed to it being an exception to "fruit of the poisonous tree" doctrine when new caselaw is being made.
> Acting in good faith would be getting a warrant regardless
That's not what "good faith" means, that's just something entirely made up by you. From a reasonable perspective that could be described as foolish and a waste of time and the public's resources.
> It encourages the police to push the boundaries of legal behavior, because they still get to keep the evidence even if they are wrong and have committed an illegal search.
There's a constant tension between technology, crime and the police that's reflected in the history of 4th amendment jurisprudence and it's not at all like what you describe. The criminals are pushing the boundaries to which the police must catch up, and the law must determine what is fair as society changes over time. I'm not particularly pro cop, but you don't seem to be reasonable about any of this.
That rule has been around for quite a while, and looks worse for wear now
> That rule has been around for quite a while
The rule established in this case is new, hence TFA, and all the time the lawyers and judge wasted on it :)
If I may suggest where wires are getting crossed:
You are sort of assuming it's like a logic gate: if 4th amendment violation, bad evidence, criminal must go free. So when you say "the rule", you mean "the 4th amendment", not the actual ruling.
That's not how it works, because that simple ultimatum also has edge cases. So we built up this whole system around nominating juries and judges, and paying lawyers, over centuries, to argue out complicated things like weighing intentionality.
The opinion says at the time the warrantless search occurred, one appellate court had already held "that no warrant was required in those circumstances" (p 42). Only a year after the search occurred, did another appellate court rule the other way.
This is the main argument that the search met the good faith exception to the exclusionary rule (i.e. the rule that says you have to exclude evidence improperly obtained). This exception is supported in the opinion (at p41) with several citations including United States v. Ganias, 824 F.3d 199, 221–22 (2d Cir. 2016)
IANAL, but as I understood it, this exception is specifically about cases where precedent supported the search at the time. This same trick, or others substantially like it, won't work in the future; but because it was not a "known trick", the conviction still stands.
I'm trying to imagine a more "real-world" example of this to see how I feel about it. I dislike that there is yet another loophole to gain access to peoples' data for legal reasons, but this does feel like a reasonable approach and a valid goal to pursue.
I guess it's like if someone noticed you had a case shaped exactly like a machine gun, told the police, and they went to check if it was registered or not? I suppose that seems perfectly reasonable, but I'm happy to hear counter-arguments.
The main factual components are as follows: Party A has rented out property to Party B. Party A performs surveillance on or around the property with Party B's knowledge and consent. Party A discovers very high probability evidence that Party B is committing crimes within the property, and then informs the police of their findings. Police obtain a warrant, using Party A's statements as evidence.
The closest "real world" analogy that comes to mind might be a real estate management company uses security cameras or some other method to determine that there is a crime occurring in a space that they are renting out to another party. The real estate management company then sends evidence to the police.
In the case of real property -- rental housing and warehouse/storage space in particular -- this happens all the time. I think that this ruling is eminently reasonable as a piece of case law (ie, the judge got the law as it exists correct). I also think this precedent strikes a healthy policy balance (ie, the law as it exists, interpreted how the judge in this case interprets it, yields a good policy situation).
Is there any such thing as this surveillance applying to the inside of the renter's bedroom, bathroom, or a filing cabinet with medical, financial, or political documents, for that matter?
I don't think there is, and I don't think you can reduce reality to being as simple as "owner has more rights over property than renter". The renter absolutely has at least a few rights, in at least a few defined contexts, over the owner, because the owner "consented" to accept money in trade for use of the property.
> Is there any such thing as this surveillance applying to the inside of the renter's bedroom, bathroom, or a filing cabinet with medical, financial, or political documents, for that matter?
Yes. Entering property for regular maintenance. Any time a landlord or his agent enters a piece of property, there is implicit surveillance. Some places are more formal about this than others, but anyone who has rented, owned rental property, or managed rental property knows that any time maintenance occurs there's an implicit examination of the premises also happening...
But here is a more pertinent example: the regular comings and goings of people or property can be and often are observed from outside of a property. These can contribute to probable cause for a search of those premises even without direct observation. (E.g., large numbers of disheveled children moving through an apartment, or an exterior camera shot of a known fugitive entering the property.)
Here the police could obtain a warrant on the basis of the landlord's testimony without the landlord actually seeing the inside of the unit. This is somewhat similar to the case at hand, since Google alerted the police to a hash match without actually looking at the image (ie, entering the bedroom).
> I don't think you can reduce reality to being as simple as "owner has more rights over property than renter"
But I make no such reduction, and neither does the opinion. In fact, quite the opposite -- this is part of why the court determines a warrant is required!
> ...Google alerted the police to a hash match without actually looking at the image (ie, entering the bedroom).
Google cannot have calculated that hash without examining the data in the image. They, or systems under their control, obviously looked at the image.
It should not legally matter whether the eyes are meat or machine... if anything, machine inspection should be MORE strictly regulated, because of how much easier and cheaper it tends to make surveillance (mass or otherwise).
> It should not legally matter whether the eyes are meat or machine
But it does matter, and, perhaps ironically, it matters in a way that gives you STRONGER (not weaker) fourth amendment rights. That's the entire TL;DR of the fine article.
If the court accepted this sentence of yours in isolation, then the court would have determined that no warrant was necessary in any case.
> if anything, machine inspection should be MORE strictly regulated, because of how much easier and cheaper it tends to make surveillance (mass or otherwise).
I don't disagree. In particular: I believe that the "Reasonable Person", to the extent that we remain stuck with the fiction, should be understood as having stronger privacy expectations in their phone or cloud account than they do even in their own bedroom or bathroom.
With respect to Google's actions in this case, this is an issue for your legislator and not the courts. The fourth amendment does not bind Google's hands in any way, and judges are not lawmakers.
> Yes. Entering property for regular maintenance.
In every state that I've lived in they must give advance notice (except for emergencies). They can't just show up and do a surprise check.
Only in residential properties, typically. There are also states that have no such requirement even on residential rentals.
In any case, I think it's a bit of a red herring and that the "regular comings and goings" case is more analogous.
But also that, at this point in the thread, we have reached a point where analogy stops being helpful and the actual thing has to be analyzed.
Fair enough.
If I import hundreds of pounds of poached ivory and store it in a shipping yard or move it to a long term storage unit, the owner and operator of those properties are allowed to notify police of suspected illegal activities and unlock the storage locker if there is a warrant produced.
Maybe the warrant uses some abstraction of the contents of that storage locker like the shipping manifest or customs declaration. Maybe someone saw a shadow of an elephant tusk or rhino horn as I was closing the locker door.
I don't think that argument supports the better analogy of breaking into a computer or filing cabinet owned by someone renting the space. Just because someone is renting space doesn't give you the right to do whatever you want to them. Cameras in bathrooms of a rented space would be another example.
But he wasn’t running a computer in a rented space, he was using storage space on Google’s computers.
In an older comment I argued against analogies to rationalize this. I think honestly at face value it is possible to evaluate the goodness or badness of the decision.
> In an older comment I argued against analogies to rationalize this. I think honestly at face value it is possible to evaluate the goodness or badness of the decision.
I generally do agree that analogies became anti-useful in this thread relatively quickly.
However, I am not sure that avoiding analogies is actually possible for the courts. I mean, they can try, but at some point analogies are unavailable because most of the case law -- and, hell, the fourth amendment itself -- is written in terms of the non-digital world. Judges are forced to reason by analogy, because legal arguments will be advanced in terms of precedent that is inherently physical.
So there is value in hashing out the analogies, even if at some point they become tenuous, primarily because demonstrating the breaking points of the analogies is step zero in deviating from case law.
Yes, that is why I presented an alternative to the analogy of "import hundreds of pounds of poached ivory and store it in a shipping yard or move it to a long term storage unit".
Like having the right to avoid being videoed in the bathroom, we have the right to avoid unreasonable search of our files by authorities, whether stored locally or on the cloud
Wait until you hear about third party doctrine.
I have this weird experience where people that get all their legal news from tech websites have really pointed views about fourth amendment jurisprudence and patent law.
The issue of course being the government then pressuring or requiring these companies to look for some sort of content as part of routine operations.
I agree. This is a case where the physical analogy leads us to (imo) the correct conclusion: compelling major property management companies to perform regular searches of their tenant's properties, and then to report any findings to the police, is hopefully something that most judges understand to be a clear violation of the fourth amendment.
> The issue of course being the government then pressuring or requiring these companies to look for some sort of content as part of routine operations.
Was that the case here?
Not requiring, but certainly pressure. See https://www.nytimes.com/2013/12/09/technology/tech-giants-is... for example. Also all of the heat Apple took over rolling back its perceptual hashing.
> Party A discovers very high probability evidence that Party B is committing crimes within the property ...
This isn't accurate: the hashes were purposefully compared to a specific list. They didn't happen to notice it, they looked specifically for it.
And of course, what happens when it's a different list?
>> Party A discovers very high probability evidence that Party B is committing crimes within the property ...
> This isn't accurate: the hashes were purposefully compared to a specific list. They didn't happen to notice it, they looked specifically for it.
1. I don't understand how the text that comes on the right side of the colon substantiates the claim on the left side of the colon... I said "discovers", without mention of how it's discovered.
2. The specificity of the search cuts in exactly the opposite direction than you suggest; specificity makes the search far less invasive -- BUT, at the same time, the "everywhere and always" nature of the search makes it more invasive. The problem is the pervasiveness, not the specificity. See https://news.ycombinator.com/user?id=aiforecastthway
> And of course, what happens when it's a different list?
The fact that the search is targeted, that the search is highly specific, and that the conduct is plainly criminal are all, in fact, highly material. The decision here is not relevant to most of the "worst case scenarios" or even "bad scenarios" in your head, because prior assumptions would have been violated before this moment in the legal evaluation.
But with respect to your actual argument here... it's really a moot point. If the executive branch starts compelling companies to help them discover political enemies on basis of non-criminal activity, then the court's opinions will have exactly as much force as the army that court proves capable of raising, because such an executive would likely have no respect for the rule of law in any case...
It is reasonable for legislators to draft laws on a certain assumption of good faith, and for courts to interpret law on a certain assumption of good faith, because without that good faith the law is nothing more than a sequence of forceless ink blotches on paper anyways.
I don't think that changes anything. I think it's entirely reasonable for Party A to be actively watching the rented property to see if crimes are being committed, either by the renter (Party B) or by someone else.
The difference I do see, however, is that many places do have laws that restrict this sort of surveillance. If we're talking about an apartment building, a landlord can put cameras in common areas of the building, but cannot put cameras inside individual units. And with the exception of emergencies, many places require that a landlord give tenants some amount of notice before entering their unit.
So if Google is checking user images against known CSAM image hashes, are those user images sitting out in the common areas, or are they in an individual tenant's unit? I think it should be obvious that it's the latter, not the former.
Maybe this is more like a company that rents out storage units. Do storage companies generally have the right to enter their customers' storage units whenever they want, without notice or notification? Many storage companies allow customers to put their own locks on their units, so even if they have the right to enter whenever they want, regularly, in practice they certainly do not.
But like all analogies, this one is going to have flaws. Even if we can't match it up with a real-world example, maybe there's still no inconsistency or problem here. Google's ToS says they can and will do this sort of scanning, users agree to it, and there's no law saying Google can't do that sort of thing. Google itself has no obligation to preserve users' 4th Amendment rights; they passed along evidence to the police. I do think the police should be required to obtain a warrant before gaining access to the underlying data; the judge agrees on this, but the police get away with it in the original case due to the bullshit "good faith exception".
This is an excellent example, I think I get it now and I'm fully on-board. Thanks.
I could easily see an AirBNB owner calling the cops if they saw, for instance, child abuse happening on their property.
Ok. But that would also be an invasion of privacy. If the property you rented out was being used for trafficking and you don’t want to be involved with trafficking, then the terms would have to first explicitly set what is not allowed. Then it would also have to explicitly mention what measures are taken to enforce it and what punishments are imposed for violations. It should also mention steps that are taken for compliance.
Without full documentation of compliance measures, enforcement measures, and punishments imposed, violations of the rule cannot involve law enforcement who are restricted to acting on searches with warrants.
> If the property you rented out was being used for trafficking and you don’t want to be involved with trafficking, then the terms would have to first explicitly set what is not allowed.
I don't believe that's the case. You don't need to state that illegal activities are not allowed; that's the default.
> Then it would also have to explicitly mention what measures are taken to enforce it
When Airbnb used to allow cameras indoors, they did -- after some backlash -- require hosts to disclose the presence of the cameras.
> ... and what punishments are imposed for violations.
No, I don't think that is or should be necessary. If you do illegal things, the possible punishments don't need to be enumerated by the person who reports you to the police.
Put another way: if I'm hosting someone on Airbnb in the case where I'm living in the same property, and I walk into the kitchen to see my Airbnb guest dealing drugs, I am well within my rights to call the police, without having ever said anything up-front to my guest about whether or not that's acceptable behavior, or what the consequences might be. Having the drug deal instead caught on camera is no different, though I would agree that the presence of the cameras should have to be disclosed beforehand.
In Google's case, the "camera" (aka CSAM scanning) appears to have been disclosed beforehand.
>Without full documentation of compliance measures, enforcement measures, and punishments imposed, violations of the rule cannot involve law enforcement who are restricted to acting on searches with warrants.
That's not the only way police get information...
With their hidden camera in the bathroom.
I just meant it as an analogy, not that I'm specifically on-board with AirBNB owners putting cameras in bathrooms.
Anyways, that's why I just rent hotel rooms, personally. :)
I think the real-world analogy would be to say that the case is shaped exactly like a machine gun and the hotel calls the police, who then open the case without a warrant. The "private search" doctrine allows the police to repeat a search done by a private party, but here (as in the machine gun case), the case was not actually searched by a private party.
But this court decision is a real world example, and not some esoteric edge case.
This is something I don’t think needs analogies to understand. SA/CP image and video distribution is an ongoing moderation, network, and storage issue. The right to not be under constant digital surveillance is somewhat protected in the constitution.
I like speech and privacy and am paranoid of corporate or government overreach, but I arrive at the same conclusion as you taking this court decision at face value.
Wait until Trump is in power and corporations are masterfully using these tools to “mow the grass” (if you want an existing example of this, look at Putin’s Russia, where people get jail time for any pro-Ukraine mentions on social media).
Yeah, I’m paranoid like I said, but in this case it seems like the hash of a file on Google’s remote storage was flagged as a potential match, and that was used as justification to request a warrant. That seems common sense and did not involve employees snooping pre-warrant.
The Apple CSAM hash detection process (the launch of which was rolled back) concerned me mainly because it ran on-device with no opt-out. If this is running on cloud storage then it sort of makes sense. You need to ensure you are not aiding or harboring actually harmful illegal material.
I get there are slippery slopes or whatever but the fact is you cannot just store whatever you wish in a rental. I don’t see this as opening mass regex surveillance of our communication channels. We have the patriot act to do that lol.
I think the better option is a system where the cloud provider cannot decrypt the files, and they’re not obligated to lift a finger to help the police because they have no knowledge of the content at all
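A minimal sketch of that design, using the `cryptography` package's Fernet purely for illustration (a real E2E cloud service adds key management, sharing, and metadata protection on top):

```python
from cryptography.fernet import Fernet

# The client generates and keeps the key; the provider never sees it.
key = Fernet.generate_key()
f = Fernet(key)

ciphertext = f.encrypt(b"photo bytes")  # this is all that gets uploaded

# The provider stores only `ciphertext`: it cannot hash-match, scan,
# or hand over readable content without the client-held key.
assert f.decrypt(ciphertext) == b"photo bytes"
```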
In my opinion, despite the technical merits of an algorithm, encryption is only as trustworthy as the computer that generates and holds the private key.
I would personally not knowingly use a cloud provider to commit a crime. It is a fairly naive take to assume that because your browser uses HTTPS, data at rest and in process isn't somehow observable.
And I see where you’re coming from but I am afraid that position severely overestimates the will of US people to trade freedom/privacy for security and the legislature to hold citizens’ privacy in such high regard.
I only worry that, in the case that renting becomes a roundabout way of granting more oversight ability to the government, then as home ownership rates decrease, government surveillance power increases.
Sure, it's facilitated through a third party (the owner), but the extrapolated pattern seems to be: "1. Only people in group B will have fewer rights, so people in group A shouldn't worry" followed closely by "2. Sorry, you've been priced out of group A."
In the case of renting, we end up in the situation where those who have enough wealth to own their own home are afforded extra privileges of privacy.
Now to bring this back to the cloud; the cynical part of me looks towards a future of cheap, cloud-only storage devices. Or an intermediate future of devices where cloud is first party and local storage is just enough of a hassle that people don't use it. And the result is that basically everyone now has the present day equivalent of local storage scanning.
If renting de-facto grants fewer rights, then in the future where "you'll own nothing and be happy", you'll also have no rights, and all the way people will say "as a renter, what did you expect?"
OK, I agree with you about setting a precedent that future storage will be scanned by default. Additionally, who will control the reference hash list, since making one necessitates hashing that illicit material?
I only hope the court systems escalate it and manage to protect free speech or unreasonable search and seizure or self incrimination or whatever if the CSAM hash comparisons are used against political opponents or music piracy or tax evasion or whatever.
Good point.
> You need to ensure you are not aiding or harboring actually harmful illegal material.
Is this actually true, legally speaking?
I’m unsure; I wrote that from an ethics standpoint. The Silk Road guy got hit with a murder-for-hire conspiracy charge, not just drug or human trafficking charges. So I’m unsure of the legal side.
I think if you knowingly provided a platform to distribute SA/CP/CSAM and the feds become involved you will be righteously fucked.
Reddit clamped down on the creepy *bait subreddits years ago. Maybe it was self-preservation on the business side or maybe it was forward looking about legal issues.
I’m not a lawyer; I was just mentioning things that I would follow for ethics, morals, and my sense of self-preservation.
It is worse. Trump will actually put people in concentration camps! Glenn Greenwald explains the issue here:
https://www.youtube.com/watch?v=8EjkstotxpE
It's like a digital 'smell'; Google is a drug sniffing dog.
I don't think the analogy holds for two reasons (which cut in opposite directions from the perspective of fourth amendment jurisprudence, fwiw).
First, the dragnet surveillance that Google performs is very different from the targeted surveillance that can be performed by a drug dog. Drug dogs are not used "everywhere and always"; rather, they are mostly used in situations where people have a less reasonable expectation of privacy than the expectation they have over their cloud storage accounts.
Second, the nature of the evidence is quite different. Drug-sniffing dogs are inscrutable and non-deterministic and transmit handler bias. Hashing algorithms can be interrogated and are deterministic and do not have such bias transferal issues; collisions do occur, but are rare, especially because the "search key" set is so minuscule relative to the space of possible hashes. The narrowness and precision of the hashing method preserves most of the privacy expectations that society is currently willing to recognize as objectively reasonable.
Here we get directly to the heart of the problem with the fictitious "reasonable person" used in tests like the Katz test, especially in cases where societal norms and technology co-evolve at a pace far more rapid than that of the courts.
This analogy can have two opposite meanings. Drug dogs can be anything from a prop used by the police to search your car without a warrant (a cop can always say in court the dog "alerted" them) to a useful drug detection tool.
If the police “wanted” to look. But what if they were notified of the material? Then the police should not need a warrant, right?
Don't they? If you tell the cops that your neighbor has drugs of significant quantity in their house, would they not still need a warrant to actually go into your neighbor's house?
There are a lot of nuances to these situations of third-party involvement and the ruling discusses these at length. If you’re interested in the precise limits of the 4th amendment you should really just read the linked document.
They should, as a matter of course. But I guess "papers" you entrust to someone else are a gray area. I personally think that it goes against the separation of police state and democracy, but I'm a nobody, so it doesn't matter I suppose.
No. What I send through my email is between me and God.
Is it reasonable? Even if the hash were md5, given valid image files, the chances of it being an accidental collision are way lower than the chance that any other evidence given to a judge is false or misinterpreted.
This is NOT a secure hash. This is an image-similarity hash, which has many, many matches among unrelated images.
Unfortunately the decision didn't mention this at all even though it is important. If it were even as good as an MD5 hash (which is broken), I think the search should be allowed without a warrant: even though an accidental collision is possible, the odds are so strongly against it that the courts can safely assume there isn't one (and of course if there is, the police would close the case). However, since this hash is not that good, the police cannot look at the image unless Google does.
I wish I could get access to the "App'x 29" being referenced so that I could better understand the judges' understanding here. I assume this is Federal Appendix 29 (in which case a more thorough reference would've been appreciated). If the Appeals Court is going to cite the Federal Appendix in a decision like this and in this manner, then the Federal Appendix is as good as case law and West Publishing's copyright claims should be ripped away. Either the Federal Appendix should not be cited in Appeals Court and Supreme Court opinions, or the Federal Appendix is part of the law and belongs to the people. There is no middle there.
> I think the search should be allowed without warrant because even though a accidental collision is possible odds are so strongly against it that the courts can safely assume there isn't
The footnote in the decision bakes this property into the definition of a hash:
A “hash” or “hash value” is “(usually) a short string of characters generated from a much larger string of data (say, an electronic image) using an algorithm—and calculated in a way that makes it highly unlikely another set of data will produce the same value.”
(Importantly, this is NOT an accurate definition of a hash for anyone remotely technical... of course hashing algorithms with significant hash collisions exist, and is even a design criterion for some hashing algorithms...)
>I wish I could get access to the "App'x 29" being referenced so that I could better understand the judges' understanding here. [...]
Just go to a law library.
Do you know that judges routinely make decisions based on confidential documents not in the public record? Is that also bad?
We can’t access appendix 29? Is that what you are saying?
You're assuming an accidental collision. Images can be generated that intentionally trigger the hash algorithm while they still appear as something else (a meme, funny photo, etc.) to a person looking at them. This brings many possibilities for "bad people" to use against people they hate (like an alternative to swatting, etc.)
Yes. How else would you prevent framing someone?
So you're saying that I craft a file that has the same hash as a CSAM one, I give it to you, you upload it to google, but it also happens to be CSAM, and I've somehow framed you?
My point is that a hash (granted, I'm assuming that we're talking about a cryptographic hash function, which is not clear) is much closer to "This is the file" than someone actually looking at it, and that it's definitely more proof of them having that sort of content than any other type of evidence.
These are perceptual hashes designed on purpose to be a little vague and broad so they catch transformed images. Not cryptographic hashes.
I don't understand. If you contend that it's even better evidence than actually having the file and looking at it, how is not reasonable to then need a judge to issue a warrant to look at it? Are you saying it would be more reasonable to skip that part and go directly to arrest?
It seems like a large part of the ruling hinges on the fact that Google matched the image hash to a hash of a known child pornography image, but didn't require an employee to actually look at that image before reporting it to the police. If they had visually confirmed it was the image they suspected it was based on the hash then no warrant would have been required, but the judge reads that the image hash match is not equivalent to a visual confirmation of the image. Maybe there's some slight doubt in whether or not the image could be a hash collision, which depends on the hash method. It may be incredibly unlikely (near impossible?) for any hash collision depending on the specific hash strategy.
I think it would obviously be less than ideal for Google to require an employee visually inspect child pornography identified by image hash before informing a legal authority like the police. So it seems more likely that the remedy to this situation would be for the police to obtain a warrant after getting the tip but before requesting the raw data from Google.
Would the image hash match qualify as probable cause enough for a warrant? On page 4 the judge stops short of setting precedent on whether it would have or not. Seems likely that it would be solid probable cause to me, but sometimes judges or courts have a unique interpretation of technology that I don't always share, and leaving it open to individual interpretation can lead to conflicting results.
The hashes involved in stuff like this, as with copyright auto-matching, are perceptual hashes (https://en.wikipedia.org/wiki/Perceptual_hashing), not cryptographic hashes. False matches are common enough that perceptual hashing attacks are already in use to manipulate search engine results (see the example in a random paper on the subject: https://gangw.cs.illinois.edu/PHashing.pdf).
It seems like that very relevant information was not considered by the court. If this were a cryptographic hash, I would say with high confidence that this is the same image, and so effectively Google examined it - there is a small chance that some unrelated file (which might not even be a picture) matches, but odds are the universe will end before that happens, and so the courts can consider it the same image for search purposes. However, because there are many false-positive cases, there are reasonable odds that the image is legal, and so a higher standard for search is needed - a warrant.
>so the courts can consider it the same image for search purposes
An important part of the ruling seems to be that neither Google nor the police had the original image or any information about it, so the police viewing the image gave them more information than Google matching the hash gave Google: for example, consider how the suspect being in the image would have changed the case, or what might happen if the image turned out not to be CSAM, but showed the suspect storing drugs somewhere, or was even, somehow, something entirely legal but embarrassing to the suspect. This isn't changed by the type of hash.
That's the exact conclusion that was reached - the search required a warrant.
The court implied even a hash without collisions would not count, when it should.
It shouldn't. Google hasn't otherwise seen the image, so the employee couldn't have witnessed a crime. There are reportedly many perfectly legal images that end up in these almost perfectly unaccountable databases.
That makes sense - if they were using a cryptographic hash then people could get around it by making tiny changes to the file. I’ve used some reverse image search tools, which use perceptual hashing under the hood, to find the original source for art that gets shared without attribution (SauceNAO is pretty solid). They’re good, but they definitely have false positives.
Now you’ve got me interested in what’s going on under the hood, lol. It’s probably like any other statistical model: you can decrease your false negatives (images people have cropped or added watermarks/text to), but at the cost of increased false positives.
> what's going on under the hood
Rather simple methods are surprisingly effective [1]. There's sure to be more NN fanciness nowadays (like Apple's proposed NeuralHash), but I've used the algorithms described by [1] to great effect in the not-too-distant past. The HN discussion linked in that article is also worth a read.
[1] https://www.hackerfactor.com/blog/index.php?/archives/432-Lo...
This submission is the first I've heard of the concept. Are there OSS implementations available? Could I use this, say, to deduplicate resized or re-jpg-compressed images?
Probably, yeah, though there’s a significant tradeoff between how much distortion you accept and the number of false positives.
The hash functions used for these purposes are usually not cryptographic hashes. They are "perceptual hashes" that allows for approximate matches (e.g. if the image has been scaled or brightness-adjusted). https://en.wikipedia.org/wiki/Perceptual_hashing
These hashes are not collision-resistant.
They should be called embeddings.
> Maybe there's some slight doubt in whether or not the image could be a hash collision, which depends on the hash method. It may be incredibly unlikely (near impossible?) for any hash collision depending on the specific hash strategy.
If it was a cryptographic hash (apparently not), this mathematical near-certainty is necessary but not sufficient. Like cryptography used for confidentiality or integrity, the math doesn't at all guarantee the outcome; the implementation is the most important factor.
Each entry in the illegal hash database, for example, relies on some person characterizing the original image as illegal - there is no mathematical formula for defining illegal images - and that characterization could be inaccurate. It also relies on the database's integrity, the user's application and its implementation, even the hash calculator. People on HN can imagine lots of things that could go wrong.
If I were a judge, I'd just want to know if someone witnessed CP or not. It might be unpleasant, but we're talking about arresting someone for CP, which even sans conviction can be highly traumatic (including time in jail, waiting for bail or trial, as a ~child molester) and can destroy people's lives and reputations. Do you fancy appearing at a bail hearing about your CP charge, even if you are innocent? 'Kids, I have something to tell you ...'; 'Boss, I can't work for a couple weeks because ...'.
It seems like there just needs to be case law about the qualifications of an image hash in order to be counted as probable cause for a warrant. Of course you could make an image hash be arbitrarily good or bad.
I am not at all opposed to any of this "get a damn warrant" pushback from judges.
I am also not at all opposed to Google searching its cloud storage for this kind of content. There are a lot of fishing expeditions for potentially illegal activity that I would mind a cloud provider going on, but this I am fine with.
I do strongly object to companies searching content for illegal activity on devices in my possession absent probable cause and a warrant (that they would have to get in a way other than searching my device). Likewise I object to the pervasive and mostly invisible delivery to the cloud of nearly everything I do on devices I possess.
In other words, I want custody of my stuff and for the physical possession of my stuff to be protected by the 4th amendment and not subject to corporate search either. Things that I willingly give to cloud providers that they have custody of I am fine with the cloud provider doing limited searches and the necessary reporting to authorities. The line is who actually has the bits present on a thing they hold.
I think if the hashes were made available to the public, we should just flood the internet with matching but completely innocuous images so they can no longer be used to justify a search
>please use the original title, unless it is misleading or linkbait; don't editorialize. (@dang)
On topic, I like this quote from the first page of the opinion:
>A “hash” or “hash value” is “(usually) a short string of characters generated from a much larger string of data (say, an electronic image) using an algorithm—and calculated in a way that makes it highly unlikely another set of data will produce the same value.” United States v. Ackerman, 831 F.3d 1292, 1294 (10th Cir. 2016) (Gorsuch, J.).
It's amusing to me that they use a supreme court case as a reference for what a hash is rather than eg. a textbook. It makes sense when you consider how the court system works but it is amusing nonetheless that the courts have their own body of CS literature.
Maybe someone could publish a "CS for Judges" book that teaches as much CS as possible using only court decisions. That could actually have a real use case when you think of it. (As other commenters pointed out, the hashing definition given here could use a bit more qualification, and should at least differentiate between neural hashes and traditional ones like MD5, especially as it relates to the likeliness that "another set of data will produce the same value." Perhaps that could be an author's note in my "CS for Judges" book.)
> Maybe someone could publish a "CS for Judges" book
At last, a form of civic participation which seems both helpful and exciting to me.
That said, I am worried that a lot of the necessary content may not be easy to introduce with hard precedent, and direct advice or dicta might somehow (?) not be permitted in a case since it's not adversarial... A new career as a professional expert witness--even on computer topics--sounds rather dreary.
I bet that book would end up with some very strange content, like attributing the invention of all sorts of obvious things to patent trolls.
What's so weird about this? CS literature is not legally binding in any way. Of course a judge would rather quote a previous ruling by fellow judge than a textbook, Wikipedia, or similar sources.
I think the operative word was "amusing"--which it is--but even then there's a difference between:
1. That's weird and represents an operational error that breaks the rules.
2. That's weird and represents a potential deficiency in how the system or rules have been made.
I don't think anyone is suggesting #1, and #2 is a lot more defensible.
They didn't say it was weird.
From what I understand, a judge is free to decide matters of fact on his own, which could include from a textbook. Also, it is not clear that matters of fact decided by the Supreme Court are binding on lower courts. Additionally, facts and even meanings of words themselves can change, which makes previous findings of fact no longer applicable. That's actually true in this case as well. "Hash" as used in the context of images generally meant something like an MD5 hash (which itself is now more prone to collisions than before). The "hash" in the Google case appears to be a perceptual hash, which I don't think was as commonly used until recently (I could be wrong here). So whatever findings of fact were made by the Supreme Court about how reliable a hash is are not necessarily relevant to begin with. Looking at this specific case, here is the full quote from United States v. Ackerman:
>How does AOL's screening system work? It relies on hash value matching. A hash value is (usually) a short string of characters generated from a much larger string of data (say, an electronic image) using an algorithm—and calculated in a way that makes it highly unlikely another set of data will produce the same value. Some consider a hash value as a sort of digital fingerprint. See Richard P. Salgado, Fourth Amendment Search and the Power of the Hash, 119 Harv. L. Rev. F. 38, 38-40 (2005). AOL's automated filter works by identifying the hash values of images attached to emails sent through its mail servers.[0]
I don't have access to this issue of Harvard Law Review but looking at the first page, it says:
>Hash algorithms are used to confirm that when a copy of data is made, the original is unaltered and the copy is identical, bit-for-bit.[1]
This is clearly referring to a cryptographic hash like MD5, not a perceptual hash/neural hash as in Google. So the actual source here is not necessarily dealing with the same matters of fact as the source of the quote here (although there could be valid comparisons between them).
All this said, judges feel more confident in citing a Supreme Court case than a textbook because 1. it is easier to understand for them 2. the matter of fact is then already tied to a legal matter, instead of the judge having to make that leap himself and also 3. judges are more likely to read relevant case law to begin with since they will read it to find precedent in matters of law – which are binding to lower courts. This is why a "CS for Judges" could be a useful reference book.
Lastly, I should have looked a bit more closely at the quoted case. This is actually not a supreme court case at all. Gorsuch was nominated in 2017 and this case is from 2016.
[0] https://casetext.com/case/united-states-v-ackerman-12
[1] https://heinonline.org/HOL/LandingPage?handle=hein.journals/...
> As the district court correctly ruled in the alternative, the good faith exception to the exclusionary rule supports denial of Maher’s suppression motion because, at the time authorities opened his uploaded file, they had a good faith basis to believe that no warrant was required
So this means this conviction is upheld but future convictions may be overturned if they similarly don't acquire a warrant?
> the good faith exception to the exclusionary rule supports denial of Maher’s suppression motion because, at the time authorities opened his uploaded file, they had a good faith basis to believe that no warrant was required
This "good faith exception" is so absurd I struggle to believe that it's real.
Ordinary citizens are expected to understand and scrupulously abide by all of the law, but it's enough for law enforcement to believe that what they're doing is legal even if it isn't?
What that is is a punch line from a Chappelle bit[1], not a reasonable part of the justice system.
---
1. https://www.youtube.com/watch?v=0WlmScgbdws
The courts accept good faith arguments at times. They will give reduced sentences, or even none at all, if they think you acted in good faith. There are enough situations where it is legal to kill someone that we have laws spelling out when one person can legally kill another (hopefully they never apply to you).
Note that this case is not about ignorance of the law. This is "I knew the law and was trying to follow it; I just honestly thought it didn't apply because of some tricky situation that isn't 100% clear."
The difference between "I don't know" and "I thought it worked like this" is purely a matter of degrees of ignorance. It sounds like the cops were ignorant of the law in the same way as someone who is completely unaware of it, just to a lesser degree. Unless they were misinformed about the origins of what they were looking at, it doesn't seem like it would be a matter of good faith, but purely negligence.
There was a circuit split and a matter of first impression in this circuit.
“Mens rea” is a key component of most crimes. Some crimes can only be committed if the perpetrator knows they are doing something wrong. For example, fraud or libel.
> “Mens rea” is a key component of most crimes. Some crimes can only be committed if the perpetrator knows they are doing something wrong. For example, fraud or libel.
We're talking about orthogonal issues.
Mens rea applies to whether the person performs the act on purpose. Not whether they were aware that the act was illegal.
Let's use fraud as an example since you brought it up.
If I bought an item from someone and used counterfeit money on purpose, that would be fraud. Even if I truly believed that doing so was legal. But it wouldn't be fraud if I didn't know that the money was counterfeit.
At the time, what they did was assumed to be legal because no one had ruled on it.
Now, there is prior case law declaring it illegal.
The ruling is made in such a way as to say “we were allowing this, but we shouldn’t have been, so we won’t allow it going forward”.
I am not a legal scholar, but that’s the best way I can explain it. The way that the judicial system applies to law is incredibly complex and inconsistent.
This is a deeply problematic way to operate. En masse, it has the right result, but, for the individual that will have their life turned upside down, the negative impact is effectively catastrophic.
This ends up feeling a lot like gambling in a casino. The casino can afford to bet and lose much more than the individual.
I don't care nearly as much about the 4th amendment when the person is guilty. I care a lot when the person is innocent. Searches of innocent people are costly for the innocent person, and so we require warrants to ensure such searches are minimized (even though most warrants are approved, the act of getting one forces the police to be careful). If a search were completely costless to the innocent I wouldn't be against them, but there are many ways a search that finds nothing is costly to the innocent.
If the average person is illegally searched, but turns out to be innocent, what are the chances they bother to take the police to court? It's not like they're going to be jailed or convicted, so many people would prefer to just try to move on with their life rather than spend thousands of dollars litigating a case in the hopes of a payout that could easily be denied if the judge decides the cops were too stupid to understand the law rather than maliciously breaking it.
Because of that, precedent is largely going to be set with guilty parties, but will apply equally to violations of the rights of the innocent.
I want guilty people to go free if their 4th amendment rights are violated; that's the only way to ensure police are meticulous about protecting people's rights.
I think the full reasoning here is something like
1. It was unclear if a warrant was necessary
2. Any judge would have given a warrant
3. You didn't get a warrant
4. A warrant was actually required.
Thus, it's not clear that any harm was caused because the right wasn't clearly enshrined and had the police known that it was, they likely would have followed the correct process. There was no intention to violate rights, and no advantage gained from even the inadvertent violation of rights. But the process is updated for the future.
It doesn’t seem like it was wrong in this specific case however.
This specific conviction upheld, yes. But no, this ruling doesn't speak to whether or not any future convictions may be overturned.
It simply means that at the trial court level, future prosecutions will not be able to rely on the good faith exception to the exclusionary rule if warrantless inculpatory evidence is obtained under similar circumstances. If the government were to try to present such evidence at trial and the trial judge were to admit it over the objection of the defendant, then that would present a specific ground for appeal.
This ruling merely bolsters the 'better to get a warrant' spirit of the Fourth Amendment.
Yep, that's basically it.
The Fourth Amendment didn't help here, unfortunately. Or, perhaps fortunately.
Still, 25 years for possessing kiddie porn, damn.
The harshness of the sentence is not for the act of keeping the photos in itself, but for the individual suffering and social damage caused by the actions he incentivizes when he consumes such content.
Consumption per se does not incentivize it, though; procurement does. It's not unreasonable to causally connect one to the other, but I still think that it needs to be done explicitly. Strict liability for possession in particular is nonsense.
There's also an interesting question wrt simulated (drawn, rendered etc) CSAM, especially now that AI image generators can produce it in bulk. There's no individual suffering nor social damage involved in that at any point, yet it's equally illegal in most jurisdictions, and the penalties aren't any lighter. I've yet to see any sensible arguments in favor of this arrangement - it appears to be purely a "crime against nature" kind of moral panic over the extreme ickiness of the act as opposed to any actual harm caused by it.
> Consumption per se does not incentivize it,
It can. In several public cases it seems fairly clear that there is a "community" aspect to these productions and many of these sites highlight the number of downloads or views of an image. It creates an environment where creators are incentivized to go out of their way to produce "popular" material.
> Strict liability for possession in particular is nonsense.
I entirely disagree. Offenders tend to increase their level of offense. This is about preventing the problem from becoming worse and new victims being created. It's effectively the same reason we harshly prosecute people who torture animals.
> nor social damage involved in that at any point,
That's a bold claim. Is it based on any facts or study?
> over the extreme ickiness of the act as opposed to any actual harm caused by it.
It's about the potential class of victims and the outrageous life long damage that can be done to them. The appropriate response to recognizing these feelings isn't to hand them AI generated material to sate their desires. It's to get them into therapy immediately.
It's not an interesting question at all.
Icky things are made illegal all the time. There's no need to have a 'sensible argument'.
Icky things were historically made illegal all the time, but most of those historical examples have not fared well in retrospect. Modern justice systems are generally predicated on some quantifiable harm for good reasons.
Given the extremely harsh penalties at play, I am not at all comfortable about punishing someone with a multi-year prison sentence for possession of a drawn or computer generated image. What exactly is the point, other than people getting off from making someone suffer for reasons they consider morally justifiable?
There's no room for sensible discussion like this in these matters. Not demanding draconian sentences for morally outraging crimes is morally outraging.
I think their point was they think the law should be based off of harms, not necessarily "morals" (since no one can seem to decide on those).
GP is saying that people who want this to be a crime are morally outraged that someone else might disagree, and so it's impossible to have a reasonable debate with them about it. They're probably correct, but it never hurts to try.
Oof, I fell victim to Poe's law
Assuming the person is a passive consumer with no messages / money exchanged with anyone, it is very hard to prove social harm or damage. Sentences should be proportional to the crime. Treating possession of CP as equivalent to literally raping a child just seems absurd to me. IMO, just for the legal protection of the average citizen, simple possession should never warrant jail time.
CP is better described as "images of child abuse", and the argument is that the viewing is revictimising the child.
You appear to be suggesting that you shouldn't go to prison for possessing images of babies being raped?
You don't go to prison for possessing images of adults being raped, last I checked. Or adults being murdered. Or children being murdered.
I don't think making the images illegal is a good way to handle things.
For the record, i'm against any kind of child abuse, and 25 years for an actual abuser would not be a problem.
But...
Should you go to prison for possessing images of an adult being raped? What if you don't even know it's rape? What if the person is underage, but you don't know (looks adult to you)? What about a murder video instead of rape? What if the child porn is digitally created (AI, photoshop, whatever)? What if a murder scene is digitally created (fake bullets, holes+blood made in video editing software)? What if you go to a mainstream porno store, buy a mainstream professional porno video, and you later find out that the actress was a 15yo Traci Lords?
> the individual suffering and social damage caused by the actions that he incentivizes
That's some convoluted way to say he deserves 25 years because he may (or may not) at some point in his life molest a kid.
Personally I think that the idea of convicting a man for his thoughts is borderline crazy.
Users of child pornography need to be arrested, treated, flagged, and receive psychological follow-up all their lives, but sending them away for 25 years is lazy and dangerous, because when they get out they will be even worse than before and won't have much to lose.
Respectfully, it's not pornography, it's child sexual abuse material.
Porn of/between consenting adults is fine. CSAM and sexual abuse of minors is not pornography.
EDIT: I intended to reply to the grandparent comment
Pornography is any multimedia content intended for (someone's) sexual arousal. CSAM is obviously a subset of that.
That is out of date.
The language has changed: as we (in civilised countries) stop punishing sex work, "porn" is different from CSAM.
In the bad old days pornographers were treated the same as sadists.
The language is defined by how people actually use it, not by how a handful of activists try to prescribe its use. Ask any random person on the street, and most of them have no idea what CSAM is, but they know full well what "child porn" is. Dictionaries, encyclopedias etc also reflect this common sense usage.
The justification for this attempt to change the definition doesn't make any sense, either. Just because some porn is child porn, which is bad, doesn't in any way imply that all porn is bad. In fact, I would posit that making this argument in the first place is detrimental to sex-positive outlook on porn.
> Just because some porn is child porn, which is bad, doesn't in any way imply that all porn is bad.
I think people who want others to stop using the term "child porn" are actually arguing the opposite of this. Porn is good, so calling it "child porn" is making a euphemism or otherwise diminishing the severity of "CSAM" by using the positive term "porn" to describe it.
I don't think the established consensus on the meaning of the word "porn" itself includes some kind of inherent implied positivity, either; not even among people who have a generally positive attitude towards porn.
"Legitimate" is probably a better word. I think you can get the point though. Those I have seen preferring the term CSAM are more concerned about CSAM being perceived less negatively when it is called child porn than they are about consensual porn being perceived more negatively.
> The language is defined by how people actually use it,
Precisely
Which is how it is used today
A few die hard conservatives cannot change that
Stop doing this. You are confusing the perfectly noble aspect of calling it abuse material to make it victim centric with denying the basic purpose of the material. The people who worked hard to get it called CSAM do not deny that it’s pornography for its users.
The distinction you went on to make was necessary specifically for this reason.
Who do such harsh punishments benefit?
In that case, we should all get 25 years for buying products made with slave labour.
It's a reasonable argument, but a concerning one because it hinges on a couple of layers of indirection between the person engaging in consuming the content and the person doing the harm / person who is harmed.
That's not outside the purview of US law (especially in the world post-reinterpretation of the Commerce Clause), but it is perhaps worth observing how close to the cliff of "For the good of Society, you must behave optimally, Citizen" such reasoning treads.
For example: AI-generated CP (or hand-drawn illustrations) are viscerally repugnant, but does the same "individual suffering and social damage" reasoning apply to making them illegal? The FBI says yes to both in spite of the fact that we can name no human that was harmed or was unable to give consent in their fabrication (handwaving the source material for the AI, which if one chooses not to handwave it: drop that question on the floor and focus on under what reasoning we make hand-illustrated cartoons illegal to possess that couldn't be applied to pornography in general).
> The FBI says yes to both in spite of the fact that we can name no
They have two arguments for this (that I am aware of). The first argument is a practical one, that AI-generated images would be indistinguishable from the "real thing", but that the real thing still being out there would complicate their efforts to investigate and prosecute. While everyone might agree that this is pragmatic, it's not necessarily constitutionally valid. We shouldn't prohibit activities based on whether these activities make it more difficult for authorities to investigate crimes. Besides, this one's technically moot... those producing the images could do so in such a way (from a technical standpoint) that they were instantly, automatically, and indisputably provable as being AI-generated.
All images could be mandated to require embedded metadata which describes the model, seed, and so forth necessary to regenerate it. Anyone who needs to do so could push a button, the computer would attempt to regenerate the image from that seed, and the computer could even indicate that the two images matched (the person wouldn't even need to personally view the image for that to be the case). If the application indicated they did not match, then authorities could investigate it more thoroughly.
The second argument is an economic one. That is, if a person "consumes" such material, they increase economic demand for it to be created. Even in a post-AI world, some "creation" would be criminal. Thus, the consumer of such imagery does cause (indirectly) more child abuse, and the government is justified in prohibiting AI-generated material. This is a weak argument on the best of days... one of the things that law enforcement efforts excel at is just this: when there are two varieties of a behavior, one objectionable and the other not, but both similar enough that they might at a glance be mistaken for one another, enforcement can greatly disincentivize one without infringing the other. And being an economic argument, it cuts the other way too: economic actors seek to reduce their risk of doing business, and so would gravitate toward creating the legal variety of material.
While their arguments are dumb, this filth's as reprehensible as anything. The only question worth asking or answering is, were (AI-generated) it legal, would it result in fewer children being harmed or not? It's commonly claimed that the easy availability of mainstream pornography has reduced the rate of rape since the mid-20th century.
The problem with the internet nowadays is that a few big players are making up their own law. Very often it is against local laws, but nobody can fight it. For example, someone created some content, but another person uploaded it and got better scores, which got the original poster blocked. Another example: children were playing a violin concert and the audio got removed due to an alleged copyright violation. No possibility to appeal; nobody sane would go to court. It just goes this way...
> the private search doctrine, which authorizes a government actor to repeat a search already conducted by a private party without securing a warrant.
IANAL, etc. Does that mean that if someone breaks in to your house in search of drugs, finds and steals some, and is caught by the police and confesses all that the police can then search your house without a warrant?
IANAL either, but from what I've read before the courts treat searches of your home with extra care under the 4th Amendment. At least one circuit has pushed back on applying private search cases to residences, and that was for a hotel room[0]:
> Unlike the package in Jacobsen, however, which "contained nothing but contraband," Allen's motel room was a temporary abode containing personal possessions. Allen had a legitimate and significant privacy interest in the contents of his motel room, and this privacy interest was not breached in its entirety merely because the motel manager viewed some of those contents. Jacobsen, which measured the scope of a private search of a mail package, the entire contents of which were obvious, is distinguishable on its facts; this Court is unwilling to extend the holding in Jacobsen to cases involving private searches of residences.
So under your hypothetical, I'd expect the police would be able to test "your drugs" that they confiscated from the thief, and use any findings to apply for a warrant for a search of your house, but any search without a warrant would be illegal.
[0] https://casetext.com/case/us-v-allen-167
I think the private search would have to be legal.
"That, however, does not mean that Maher is entitled to relief from conviction. As the district court correctly ruled in the alternative, the good faith exception to the exclusionary rule supports denial of Maher’s suppression motion because, at the time authorities opened his uploaded file, they had a good faith basis to believe that no warrant was required."
"Defendant [..] stands convicted following a guilty plea in the United States District Court for the Northern District of New York (Glenn T. Suddaby, Judge) of both receiving and possessing approximately 4,000 images and five videos depicting child pornography"
A win for google, for the us judicial system, and for constitutional rights.
A loss for child abusers.
Constitutional rights did not win enough, it should be that violating constitutional rights means the accused goes free, period, end of story
You forgot your IANAL, but thankfully it's obvious.
That's a ridiculous desire. In that world, if I delete your comment, and you kill me in retaliation, you should be let free if you argue that my deleting your comment infringed your right to free speech?
What I mean specifically is that because the police saw illegally obtained evidence, all evidence collected after that point should be considered fruit of the poisoned tree and inadmissible.
The 9th circuit ruled the same way a few years ago: https://www.insideprivacy.com/data-privacy/ninth-circuits-in...
I was using those MD5 sums for flagging images 20 years ago for the government; occasional false positives, but the safety team would review those, not operations. My only role was to burn the user's account to a DVD (via a script) and have the police officer pick up the DVD. We never touched the disk, and only burned it with a warrant (we never saw/touched the user's data...)
I figured this is the common industry standard for chain of custody of evidence. Same with police videos: they are uploaded to the courts' digital evidence repository, and everyone who looks at the evidence is logged.
Seems like a standard legal process was followed.
Wow, do I ever not know how I feel about the "good faith exception."
It feels like it incentivizes the police to minimize their understanding of the law so that they can believe they are following it.
> It feels like it incentivizes the police to minimize their understanding of the law so that they can believe they are following it.
That's a bingo. That's exactly what they do, and why so many cops know less about the law than random citizens. A better society would have high standards for the knowledge expected of police officers, including things like requiring 4-year criminal justice or pre-law degree to be eligible to be hired, rather than capping IQ and preferring people who have had prior experience in conducting violent actions.
In some countries you are required to study the law in order to become a police officer. It's part of the curriculum in the three year bachelor level course you must pass to become a police officer in Norway for instance. See https://en.wikipedia.org/wiki/Norwegian_Police_University_Co... and https://en.wikipedia.org/wiki/Norwegian_Police_Service
Yes, this likely explains part of why the Norwegian police behave like professionals who are trying to do their job with high standards of performance and behavior and the police in the US behave like a bunch of drinking buddies that used to be bullies in high school trying to find their next target to harass.
The good faith exception requires the belief be reasonable. Ignorance of clearly settled law is not reasonable, it should be a situation where the law was unclear, had conflicting interpretations or could otherwise be interpreted the way the police did by a reasonable person.
It's crazy that the most dangerous people one regularly encounters can do anything they want as long as they believe they can do it. The good faith exemption has to be one of the most fascist laws on the books today.
> "the good faith exception to the exclusionary rule supports denial of Maher’s suppression motion because, at the time authorities opened his uploaded file, they had a good faith basis to believe that no warrant was required."
In no other context or career can you do anything you want and get away with it just as long as you say you thought you could. You'd think police officers would be held to a higher standard, not no standard.
And specifically with respect to the law, breaking a law and claiming you didn't know you did anything wrong as an individual is not considered a valid defense in our justice system. This same type of standard should apply even more to trained law enforcement, not less, otherwise it becomes a double standard.
No, this is breaking the law while believing "this looks like one of the situations where I already know the law doesn't apply." If Google had looked at the actual image and said it was child porn, instead of just saying it was similar to some image that is child porn, this would be 100% legal, as the courts have already said. That difference is subtle enough that I can see how someone would get it wrong (and in fact I would expect other courts to rule differently).
Doesn’t address the point. Does everyone get a good faith exception from laws they don’t know or misunderstand, or just the police?
When the law isn't clear, so that reasonable people would understand it as you did, you should get a pass.
> you do anything you want and get away with it just as long as you say you thought you could.
Isn't that the motto of VC? Uber, AirBnB, WeWork, etc...
Sorry, I should have been more explicit. I thought the context provided it.
> you do any illegal action you want and get away with it just as long as you say you thought you could.
And as for corporations: that's the point of incorporating. Reducing liability.
That's not what this means. One can ask whether the belief is reasonable, that is justifiable by a reasoning process. The argument for applying the GFE in this case is that the probability of false positives from a perceptual hash match is low enough that it's OK to assume it's legit and open the image to verify that it was indeed child porn. They then used that finding to get warrants to search the guy's gmail account and later his home.
Good Samaritan laws tend to function similarly
If I'm not a professional and I hurt someone while trying to save their life by doing something stupid, that's understandable ignorance.
If a doctor stops to help someone and hurts them because the doctor did something stupid, that is malpractice and could get them sued and maybe get their license revoked.
Would you hire a programmer who refused to learn how to code and claimed "good faith" every time they screwed things up? Good faith shouldn't cover willful ignorance. A cop is hired to know, understand, and enforce the law. If they can't do that, they should be fired.
It's not exactly the same imo, since GS laws are meant to protect someone who is genuinely trying to do what a reasonable person could consider "something positive"
It is not "you say you thought you could", it is "you have reasonable evidence a crime is happening".
The reasonable evidence here is "Google said it", and it was true.
If the police arrive at a house on a domestic abuse call, and hears screams for help, is breaking down the door done in good faith?
In this case you're correct. But the good faith exemption is far broader than this and applies to even officer's completely personal false beliefs in their authority.
> In no other context or career can you do anything you want and get away with it just as long as you say you thought you could
Many white collar crimes, financial and securities fraud/violations can be thwarted this way
Basically, ignorance of the law is no excuse except when you specifically write the law to say it is an excuse
Something that contributes to the DOJ not really trying to bring convictions against individuals at bigger financial institutions
And yeah, a lot of people make sure to write their industry’s laws that way
I think the judge chose to relax a lot on this one due to the circumstances. Releasing into society a man found with 4,000 child porn photos on his computer would be a shame.
But yeah, this opens the gates of precedent too wide for tyranny, unfortunately...
Is this because Google is compelled to report this to the authorities?
I would have thought any company could voluntarily submit anything to the police.
Well let's look at how this actually played out.
Now police are facing the obvious question: is this actually CP? They open the image, determine it is, then get a warrant to search his gmail account, and (later) another warrant to search his home. The court here is saying they should have got a warrant to even look at the image in the first place. But warrants only issue on probable cause. What's the PC here? The hash value. What's the probability of hash collisions? Non-zero but very low.
The practical upshot of this is that all reports from NCMEC will now go through an extra step of the police submitting a copy of the report with the hash value and some boilerplate document saying 'based on my law enforcement experience, hash values are pretty reliable indicators of fundamental similarity', and the warrant application will then be rubber stamped by a judge.
An analogous situation would be where I send a sealed envelope with some documents to the police, writing on the outside 'I believe the contents of this envelope are proof that John Doe committed [specific crime]', and the police have to get a warrant to open the envelope. It's arguably more legally consistent, but in practice it just creates an extra stage of legal bureaucracy/delay with no appreciable impact on the eventual outcome.
Recall that the standard for issuance of a warrant is 'probable cause', not 'mathematically proven cause'. Hash collisions are a possibility, but a sufficiently unlikely one that it doesn't matter. Probable cause means 'a fair probability' based on independent evidence of some kind - testimony, observation, forensic results or so. Even a shitty hash function that's only 90% reliable is going to meet that threshold. In the 10% of cases where the opened file turns out to be a random image with no pornographic content it's a 'no harm no foul' situation.
For reference, a primer on hash collision probabilities: https://preshing.com/20110504/hash-collision-probabilities/
and a more detailed examination of common perceptual hashing algorithms (skip to table 3 for the collision probabilities): https://ceur-ws.org/Vol-2904/81.pdf
I think what a lot of people are implicitly arguing here is that the detection system needs to be perfect before anyone can do anything. Nobody wants the job of examining images to check if they're CP or not, so we've outsourced it to machines that do so with good-but-not-perfect accuracy and then pass the hot potato around until someone has to pollute their visual cortex with it.
Obviously we don't want to arrest or convict people based on computer output alone, but how good does it have to be (in % or odds terms) in order to begin an investigation - not of the alleged criminal, but of the evidence itself? Should companies like Google have to submit an estimate of the probability of hash collisions using their algorithm and based on the number of image hashes that exist on their servers at any given moment? Should they be required to submit source code used to derive that? What about the microcode of the silicon substrate on which the calculation is performed?
All other things being equal, what improvement will result here from adding another layer of administrative processing, whose outcome is predetermined?
As many others have said, Google isn’t using a cryptographic hash here. It’s using perceptual hashing, which isn’t collision-safe at all.
Did you read the whole thing?
and a more detailed examination of common perceptual hashing algorithms (skip to table 3 for the collision probabilities): https://ceur-ws.org/Vol-2904/81.pdf
And there was a whole lot of explanation of how probable cause works and how it's different from programmers' aspirations to perfection.
> Recall that the standard for issuance of a warrant is 'probable cause', not 'mathematically proven cause'. Hash collisions are a possibility, but a sufficiently unlikely one that it doesn't matter. Probable cause means 'a fair probability' based on independent evidence of some kind - testimony, observation, forensic results or so. Even a shitty hash function that's only 90% reliable is going to meet that threshold. In the 10% of cases where the opened file turns out to be a random image with no pornographic content it's a 'no harm no foul' situation.
But do we actually know that? Do we know what the thresholds of "similarity" in use by Google and others are, and how many false positives they trigger? Billions of photos are processed daily by Google's services (Google Photos, chat programs, Gmail, Drive, etc.), and very few people actually send such stuff via Gmail, so what if the reality is that 99.9% of the matches are actually false positives? What about intentional matches, like someone intentionally creating some random SFW meme image that (when hashed) matches some illegal image hash, and that photo then being sent around intentionally... should police really be checking all those emails, photos, etc., without warrants?
Well, that's why I'm asking what threshold of certainty people want to apply. The hypotheticals you cite are certainly possible, but are they likely?
> what if the reality is that 99.9% of the matches are actually false positives
Don't you think that if Google were deluging the cops with false positive reports that turned out to be perfectly innocuous 999 times out of 1000, that police would call them up and say 'why are you wasting our time with this?' Or that defense lawyers wouldn't be raising hell if there were large numbers of clients being investigated over nothing? And how would running it through a judge first improve that process?
> What about intentional matches, like someone intentionally creating some random SFW meme image [...]
OK, but what is the probability of that happening? And if such images are being mailed in bulk, what would be the purpose other than to provide cover for CSAM traders? The tactic would only be viable for as long as it takes a platform operator to change up their hashing algorithm. And again, how would the extra legal step of consulting a judge alleviate this?
> should police really be checking all those emails, photos, etc., without warrants?
But that's not happening. As I pointed out, police examined the submitted image evidence to determine if it was CP (it was). Then they got a warrant to search the gmail account, and following that another warrant to search his home. They didn't investigate the criminal first; they investigated an image file submitted to them to determine whether it was evidence of a crime.
And yet again, how would bouncing this off a judge improve the process? The judge will just look at the report submitted to the police and a standard police letter saying 'reports of this kind are reliable in our experience' and then tell the police yes, go ahead and look.
What is the context of this?
Status quo, there is no change here.
The old example is the email server administrator. If the email administrator has to view the contents of user messages as a part of regular maintenance and notices violations of law in those messages, they can report it to law enforcement. In that case law enforcement can receive the material without a warrant only if law enforcement never asked for it before it was gifted to them. There are no fourth amendment protections provided to offenders in this scenario of third-party accidental discovery. Typically, in these cases the email administrator does not have an affirmative requirement to report violations to law enforcement unless specific laws say otherwise.
If on the other hand law enforcement approaches that email administrator to fish for illegal user content then that email administrator has become an extension of law enforcement and any evidence discovered cannot be used in a criminal proceeding. Likewise, if the email administrator was intentionally looking through email messages for violations of law even not at the request of law enforcement they are still acting as agents of the law. In that case discovery was intentional and not an unintentional product of system maintenance.
There is a third scenario: obscenity. Obscenity is illegal intellectual property, whether digital or physical, as defined by the criminal code. Possession of obscene materials is a violation of criminal law for all persons, businesses, and systems in possession. In that case an email administrator who accidentally discovers obscene material does have an obligation to report the discovery, typically through their employer's corporate legal process, to law enforcement. Failure to disclose such discoveries potentially aligns the system provider with the illegal conduct of the violating user.
Google's discovery, though, was not accidental as a result of system maintenance. It was due to an intentional discovery mechanism based on stored hashes, which puts Google's conduct in line with law enforcement even if they specified their conduct in their terms of service. That is why the appeals court claims the district court erred by denying the defendant's right to suppression on fourth amendment grounds.
The saving grace for the district court was a good faith exception, such as inevitable discovery. The authenticity and integrity of the hash algorithm was never questioned by any party, so no search for the violating material was necessary; that established probable cause, giving law enforcement reasonable grounds to proceed to trial. No warrant was required because the evidence was likely sufficient at trial even if law enforcement did not directly view the image in question (though they did verify the image). None of that was challenged by either party. What was challenged was only Google's conduct.
The judge doesn't really understand hashes. They say things like "Google assigned a hash," which is not true; Google calculated the hash.
Also, I'm surprised the third-party doctrine doesn't apply. The "private search" doctrine is mentioned, but generally you don't have an expectation of privacy in things you share with Google.
Erm, "Assigned" in this context is not new: https://law.justia.com/cases/federal/appellate-courts/ca5/17...
"More simply, a hash value is a string of characters obtained by processing the contents of a given computer file and assigning a sequence of numbers and letters that correspond to the file’s contents."
From 2018 in United States v. Reddick.
The calculation is what assigns the value.
No. The calculation is what determines what the assignation should be. It does not actually assign anything.
This FOIA litigation by ACLU v ICE goes into this topic quite a lot: https://caselaw.findlaw.com/court/us-2nd-circuit/2185910.htm...
Yes, Google's calculation.
Did Google invent this hash?
Why is that relevant? Google used a hashing function to persist a new record within a database. They created a record for this.
Like I said in a sib. comment, this FOIA lawsuit goes into questions of hashing pretty well: https://caselaw.findlaw.com/court/us-2nd-circuit/2185910.htm...
Google at some point decided how to calculate that hash, and that influences what the value is, right? "Assigned" seems appropriate in that context.
Either way I think the judge's wording makes sense.
Google assigned the hashing algorithm (maybe; assuming it wasn't chosen in some law somewhere, I know this CSAM hashing is something the big tech companies work on together).
Once the hashing algorithm was assigned, individual values are computed or calculated.
I don't think the judge's wording is all that bad but the word "assigned" is making it sound like Google exercised some agency when really all it did was apply a pre-chosen algorithm.
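To illustrate with a trivial sketch (using SHA-256 as a stand-in, since whatever Google actually uses here isn't public): once the algorithm is fixed, the value is fully determined by the file's bytes, so there's no discretion left to exercise.

```python
# Minimal sketch: SHA-256 standing in for whatever hash Google uses.
# The value is fully determined by the input bytes; nobody "chooses" it.
import hashlib

data = b"the exact bytes of some file"
print(hashlib.sha256(data).hexdigest())  # same bytes -> same hash, always
print(hashlib.sha256(data).hexdigest())  # identical on every run, anywhere
```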
A hash can be arbitrary; the only requirement is that it's a deterministic one-way function.
And it should be effectively injective (collision-free) in practice. (True injectivity is impossible when inputs are larger than the hash, but hashes with common collisions shouldn't be allowed as legal evidence.) Also, neural/visual hashes like those used by big tech make things tricky.
The hash in question has many collisions. It is probably enough to put on a warrant application, but it may not be enough to get a warrant without some other evidence. (It can be enough to justify looking for other public signs of evidence, or perhaps a warrant is possible when a number of images match different hashes.)
There's a password on my Google account; I totally expect to have privacy for anything I didn't choose to share with other people.
The hash is a kind of metadata recorded by Google; I feel like Google using it to keep child porn off their systems is reasonable. Same ballpark as limiting my storage to 1GB based on file sizes. Sharing metadata without a warrant is a different question though.
As should be expected from the lawyer world, it seems like whether you have an expectation of privacy using Gmail comes down to very technical word choices in the ToS, which of course neither this guy nor anyone else has ever read. Specifically, it may be legally relevant to your expectation of privacy whether Google says they "may" or "will" scan for this stuff.
Does a lab assign your DNA to you, or does it calculate it?
Are two different labs' DNA analyses exactly equal?
Remember that you can use multiple different algorithms to calculate a hash.
Out of curiosity, what is the false positive rate of a hash match?
If the FPR is comparable to asking a human "are these the same image?", then it would seem to be equivalent to a visual search. I wonder if (or why) human verification is actually necessary here.
The reason human verification is necessary is that the government is relying on something called the "private search" doctrine to conduct the search without a warrant. This doctrine allows them to repeat a search already conducted by a private party (i.e., Google) without getting a warrant. Since Google didn't actually look at the file, the government is not able to look at the file without a warrant, as that search exceeds the scope of the initial search Google performed.
I doubt SHA-1 hashes are used for this. These image hashes should match files regardless of orientation, cropping, resizing, re-compression, color correction, etc. Collisions could be far more frequent with such hashes.
The hash should ideally match even if you use Photoshop to cut one person out of the picture and put that person into a different photo. I'm not sure if that is possible, but that is what we want.
Naively, 1/(2^{hash_size_in_bits}). Which is about 1 in 4 billion odds for a 32 bit hash, and gets astronomically low at higher bit counts.
Of course, that's assuming a perfect, evenly distributed hash algorithm. And that's just the odds that any given pair of images has the same hash, not the odds that a hash conflict exists somewhere on the internet.
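For the next step of that arithmetic, the standard birthday approximation gives the chance that at least one collision exists among n images (a sketch; the bit sizes below are just illustrative):

```python
# Birthday approximation: P(at least one collision among n inputs under
# an ideal b-bit hash) ~= 1 - exp(-n*(n-1) / 2^(b+1)).
import math

def p_any_collision(n: int, bits: int) -> float:
    return 1.0 - math.exp(-n * (n - 1) / 2.0 / 2.0**bits)

print(p_any_collision(100_000, 32))  # ~0.69: 32-bit hashes collide quickly
print(p_any_collision(10**9, 128))   # ~1.5e-21: 128 bits, effectively never
```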
You need to know the input space as well as the output space (hash size).
If you have a 32bit hash but your input is only 16bit, you'll never have a collision (and you'll be wasting a ton of space on your hashes!).
Image files can get into the megabytes though, so unless the output hash is large the potential for collisions is probably not all that low.
You do not need to know the input space.
Normal hash functions have pseudo-random outputs and they can collide even when the input space is much smaller than the output space.
In fact, I'll go run ten million values, encoded into 24 bits each, through a 40 bit hash and count the collisions. My hash of choice will be a truncated sha256.
... I got 49 collisions.
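For anyone who wants to reproduce it, here's roughly what I ran (a sketch; birthday math predicts about n²/2⁴¹ ≈ 45 collisions for these parameters, so 49 is right on the nose):

```python
# Ten million distinct 24-bit values through SHA-256 truncated to 40 bits.
import hashlib
import random

n = 10_000_000
values = random.sample(range(1 << 24), n)  # distinct 24-bit inputs

seen = set()
collisions = 0
for v in values:
    h = hashlib.sha256(v.to_bytes(3, "big")).digest()[:5]  # keep 40 bits
    if h in seen:
        collisions += 1
    else:
        seen.add(h)
print(collisions)  # expect ~45; I got 49
```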
> Out of curiosity, what is the false positive rate of a hash match?
No way to know without knowledge of the 'proprietary hashing technology'. Theoretically though, a hash can have infinitely many inputs that produce the same output.
Mismatching hash values from the same hashing algorithm can prove mismatching inputs, but matching hash values don't ensure matching inputs.
> I wonder if (or why) human verification is actually necessary here
It's not about frequency, it's about criticality of getting it right. If you are going to make a negatively life-altering report on someone, you'd better make sure the accusation is legitimate.
I'd say the focus on hashing is a bit of a red herring.
Most anyone would agree that the hash match should constitute probable cause for a warrant, allowing a judge to sign off on the police searching (i.e., viewing) the image. So, if it's a collision, the cops get a warrant and open up your Linux ISO or cat meme, and it's all good. Probably the ideal case is that they get a warrant to search the specific image, and are only able to obtain a warrant to search your home and effects, etc. if the image does appear to be CSAM.
At issue here is the fact that no such warrant was obtained.
I think it'll prove far more likely that the government creates incentives to lead Google/other providers to fully do the search on their behalf.
The entire appeal seems to hinge on the fact that Google didn't actually view the image before passing it to NCMEC. Had Google policy been that all perceptual hash hits were reviewed by employees first, this would've likely been a one page denial.
If the hash algorithm were CRC8, then obviously it should not be probable cause for anything. If it were SHA-3, then it's basically proof beyond reasonable doubt of what the file is. It seems reasonable to question how collisions behave.
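To make the CRC8 point concrete, here's a toy sketch. (I'm using the low byte of CRC-32 as a stand-in for a true CRC-8; any 8-bit checksum behaves the same way for this purpose.) With only 256 possible outputs, collisions appear after roughly √256 = 16 random inputs:

```python
# Any 8-bit checksum has only 256 possible outputs, so by the birthday
# bound a collision shows up after a couple dozen random inputs.
import zlib

def crc8ish(data: bytes) -> int:
    return zlib.crc32(data) & 0xFF  # stand-in for a real CRC-8

seen = {}
for i in range(300):  # pigeonhole guarantees a hit within 257 inputs
    data = f"file-{i}".encode()
    h = crc8ish(data)
    if h in seen:
        print(f"collision after {i + 1} inputs: {seen[h]!r} vs {data!r}")
        break
    seen[h] = data
```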
I don't agree that it would be proof beyond reasonable doubt, especially because neither Google nor law enforcement can produce the original image that got tagged.
By original do you mean the one in the database or the one on the device?
If the device spit out the same SHA-3, then either it had the exact same image, or the SHA-3 was planted somehow. The idea that it's actually a different file is not a reasonable doubt. It's too unlikely.
By the original, I mean the image that was used to produce the initial hash, which Google (rightly) claimed to be CSAM. Without some proof that an illicit image that has the same hash exists, I wouldn't accept a claim based on hash alone.
Oh definitely you need someone to examine the image that was put in the database to show it's CSAM, if the legal argument depends on that. But that's an entirely different question from whether the image on the device is that image.
> Most anyone would agree that the hash match should constitute probable cause for a warrant
I disagree with this. Yes, if we were talking MD5, SHA, or some similar true hash algo, then the probability of a natural collision is small enough that I agree in principle.
But if the hash algo is of some other kind then I do not know enough about it to assert that it can justify probable cause. Anyone who agrees without knowing more about it is a fool.
That's fair. I came away from reading the opinion that this was not a perceptual hash, but I don't think it is explicitly stated anywhere. I would have similar misgivings if indeed it is a perceptual hash.
For non-broken cryptographic hashes (e.g., SHA-256), the false-positive rate is negligible. Indeed, cryptographic hashes were designed so that even nation-state adversaries do not have the resources to generate two inputs that hash to the same value.
See also:
https://en.wikipedia.org/wiki/Collision_resistance
https://en.wikipedia.org/wiki/Preimage_attack
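One ingredient of that design is the avalanche property; a quick sketch of what it looks like (the inputs are arbitrary placeholders):

```python
# Avalanche effect: a one-byte change yields a completely unrelated digest,
# so a full-length SHA-256 match is overwhelming evidence of identical bytes.
import hashlib

print(hashlib.sha256(b"image bytes v1").hexdigest())
print(hashlib.sha256(b"image bytes v2").hexdigest())  # no resemblance
```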
These are not the kinds of hashes used for CSAM detection, though, because that would only work for an exact pixel-by-pixel copy; any resizing, compression, etc. would drastically change the hash.
Instead, systems like these use perceptual hashing, in which similar inputs produce similar hashes, so that one can test for likeness. Those have much higher collision rates, and are also much easier to deliberately generate collisions for.
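For a feel of how that differs, here's one of the simplest perceptual schemes, a "difference hash" (a sketch using Pillow; real systems like PhotoDNA are proprietary and far more sophisticated). Matching means Hamming distance below some threshold, not exact equality, which is exactly where the extra collision risk comes from:

```python
# Difference hash (dHash): compare adjacent pixels of a downscaled
# grayscale image. Mild resizing/re-encoding barely changes the bits.
from PIL import Image  # assumes Pillow is installed

def dhash(path: str, size: int = 8) -> int:
    img = Image.open(path).convert("L").resize((size + 1, size))
    px = list(img.getdata())
    bits = 0
    for row in range(size):
        for col in range(size):
            left = px[row * (size + 1) + col]
            right = px[row * (size + 1) + col + 1]
            bits = (bits << 1) | (left > right)
    return bits  # 64-bit hash

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")  # "match" = distance under a threshold
```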
So now an algorithm can interpret the law better than a judge. It’s amazing how technology becomes judge and jury while privacy rights are left to a good faith interpretation. Are we really okay with letting an algorithmic click define the boundaries of privacy?
The algorithm isn't interpreting the law; it's assessing whether one image file matches another, a purely probabilistic question.