On July 2nd, Viacom’s lawsuit against Google’s YouTube unit saw a significant ruling, potentially troubling for user privacy. Viacom asked for, and Judge Louis L. Stanton ordered Google to turn over, the logs of each viewing of all videos in the YouTube database, showing the username and IP address of the user who was viewing the video, a timestamp, and a code identifying the video. The judge found that Viacom “need[s] the data to compare the relative attractiveness of allegedly infringing videos with that of non-infringing videos.” The fraction of views that involve infringing videos bears on Viacom’s claim that Google should have vicarious copyright liability–if the infringing videos appear to be an important draw for YouTube users, this implies a financial benefit to Google from the infringement, which would weigh in favor of a claim of vicarious liability.
As Doug Tygar has observed, the judge’s optimistic belief that disclosure of these logs won’t harm privacy seems to be based in part on the conflicting briefs of the parties. Viacom, desiring the logs, told the judge that “the login ID is an anonymous pseudonym that users create for themselves when they sign up with YouTube” which without more data “cannot identify specific individuals.” After quoting this claim and noting that Google did not refute it, the judge goes on to quote a Google employee’s blog post arguing that “in most cases, an IP address without additional information cannot” identify a particular user.
Each of these claims–first, that the login IDs of users are anonymous pseudonyms, and second, that IP addresses alone don’t suffice to identify individuals–is debatable. I haven’t reviewed the briefs that led Judge Stanton to believe each of the assertions. I suppose that his conclusions are reasonable in light of the material presented. It might be the case that the briefs should have led him to a different conclusion. Then again, as the blog post quoted above suggests, Google has at times found itself downplaying the privacy risks associated with certain data. A victory in this argument, causing the judge to take a more expansive view of the possible privacy harms, might have been a mixed blessing for Google in the longer run.
In any case, when he combined the two claims to compel the turnover of the logs, the judge made a significant mistake of his own. Even granting for the sake of argument that login IDs alone don’t compromise privacy, and that IP addresses alone also don’t compromise privacy, it doesn’t follow that the two combined are equally innocuous. Earlier cases like the AOL debacle have shown us that information that may seem privacy-safe in isolation can be privacy-compromising when it is combined. The fact of combination–the fact that some viewing by a particular login ID happened at a certain IP address, and conversely that a viewing from a particular IP address occurred under the login of a particular user–is itself a potentially important further piece of information. If the judge considered the further privacy risk involved in combining IPs and login IDs, I couldn’t find any evidence of it in his ruling.
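To make the combination point concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the log rows, the login IDs, the IP addresses (drawn from ranges reserved for documentation) and the helper name `link_by_shared_ip`. It is not a description of YouTube's actual log format, only of the kind of cross-referencing that becomes possible once both fields sit in the same record.

```python
from collections import defaultdict

# Hypothetical log rows: (login_id, ip_address, video_id).
# All values are made up; the real log layout is not known here.
LOG = [
    ("skater99", "198.51.100.7", "vidA"),
    ("skater99", "203.0.113.20", "vidB"),
    ("quietreader", "198.51.100.7", "vidC"),
    ("anon_fan", "192.0.2.44", "vidD"),
]


def link_by_shared_ip(rows):
    """Build both directions of the login <-> IP association."""
    ips_by_login = defaultdict(set)
    logins_by_ip = defaultdict(set)
    for login, ip, _video in rows:
        ips_by_login[login].add(ip)
        logins_by_ip[ip].add(login)
    return ips_by_login, logins_by_ip


ips_by_login, logins_by_ip = link_by_shared_ip(LOG)

# Logins that appeared from the same address are now tied to each other.
shared = {ip: users for ip, users in logins_by_ip.items() if len(users) > 1}
```

In this toy data, neither column alone says that "skater99" and "quietreader" have anything to do with each other. The joined record shows that they watched from the same address, and it also shows every address from which "skater99" has watched. That linkage is exactly the extra information that exists only in the combination.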
Google wants to be permitted to modify the data to reduce the privacy risk before handing it over to Viacom, but it’s not yet clear what agreement, if any, the parties will reach that would do more to protect privacy than Judge Stanton’s ruling requires. It’s also not yet apparent exactly how the judge’s protective order will be constructed. But if the logs are turned over unaltered, as they may yet be, the result could be significant risk: YouTube’s users would then face extreme privacy harm in the event that the data were to leak from Viacom’s possession.
[As always, this post is the opinion of the author (David Robinson) only.]