Privacy Falls into YouTube's Data Tar Pit

As a big lawsuit grinds forward, its parties engage in discovery, a wide-ranging search for information "reasonably calculated to lead to the discovery of admissible evidence." (FRCP Rule 26(b)) And so Viacom has calculated that scouring YouTube's data dumps would help provide evidence in its copyright lawsuit.

According to a discovery order released Wednesday, Viacom asked for discovery of YouTube source code and of logs of YouTube video viewership; Google refused both. The dispute came before Judge Stanton, in the Southern District of New York, who ordered the video viewing records -- but not the source code -- disclosed.

The order shows the difficulty we have protecting personally sensitive information. The court could easily see the economic value of Google's secret source code for search and video ID, and so it refused to compel disclosure of that "vital asset," the "product of over a thousand person-years of work."

But the user privacy concerns proved harder to evaluate. Viacom asked for "all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website," including users' viewed videos, login IDs, and IP addresses. Google contended that users' privacy concerns should shield these records from compelled production; the court rejected that argument.

The court erred both in its assessment of the personally identifying nature of these records and in its assessment of the scope of the harm. It makes no sense to ask whether an IP address is or is not "personally identifying" without considering the context to which it is connected. An IP address may not be a name, but it is often only one search step away from one. Moreover, even "anonymized" records often yield profiles deep enough to be traced back to individuals, as researchers armed with the AOL and Netflix data releases have shown.
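To make the linkage point concrete, here is a minimal sketch in Python. Everything in it (field names, addresses, video titles) is hypothetical and invented for illustration, not drawn from the actual Logging database; it simply shows why a stable pseudonym plus an IP address is rarely anonymous: once any single view can be tied to a person, that person's whole viewing history comes with it.

from collections import defaultdict

# Toy stand-in for a viewing log: (pseudonymous login ID, IP address, video ID).
# These rows and identifiers are invented for illustration only.
log_rows = [
    ("user_8f3a", "203.0.113.7", "protest-rally-clip"),
    ("user_8f3a", "203.0.113.7", "union-organizing-howto"),
    ("user_8f3a", "198.51.100.2", "health-question-video"),
    ("user_2c91", "192.0.2.44", "cat-video"),
]

# Auxiliary data an outside party might already hold, such as a web-server log
# or a geolocation/WHOIS lookup tying an IP address to a household or employer.
ip_owner = {"203.0.113.7": "Example Corp office", "198.51.100.2": "Doe residence"}

# Group each pseudonym's full viewing history...
history = defaultdict(list)
for login, ip, video in log_rows:
    history[login].append((ip, video))

# ...then link it: a single matchable IP address exposes the entire profile.
for login, views in history.items():
    owners = {ip_owner[ip] for ip, _ in views if ip in ip_owner}
    if owners:
        print(login, "likely belongs to", owners, "and watched", [v for _, v in views])

Note that nothing in the sketch ever reverses the pseudonym itself; the surrounding columns do the identifying, which is the lesson of the AOL and Netflix episodes.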

Viewers "gave" their IP address and username information to YouTube for the purpose of watching videos. They might have expected the information to be used within Google, but not anticipate that it would be shared with a corporation busily prosecuting copyright infringement. Viewers may not be able to quantify economic harm, but if communications are chilled by the disclosure of viewing habits, we're all harmed socially. The court failed to consider these third party interests in ordering the disclosure.

Trade secret wins, privacy loses. Google has said it will not appeal the order.

Is there hope for the end users here, concerned about disclosure of their video viewing habits? First, we see the general privacy problem with "cloud" computing: by conducting our activities at third-party sites, we place a great deal of information about ourselves in their hands. We may do so because Google is indispensable, or because it tells us its motto is "don't be evil." But discovery demands show that it's not enough for Google to follow good precepts.

Google, like most companies, says it will share data where it has "a good faith belief that access, use, preservation or disclosure of such information is reasonably necessary to (a) satisfy any applicable law, regulation, legal process or enforceable governmental request." Its reputation as a good actor is important, but the company is not going to risk contempt charges over user privacy.

I worry that this discovery demand is just the first of a wave, as more litigants recognize the data gold mines that online service providers have been gathering: search terms, blog readership and posting habits, video viewing, and browsing history might all "lead to the discovery of admissible evidence." If the privacy barriers are as low as Judge Stanton indicates, won't others follow Viacom's lead? A gold mine for litigants becomes a tar pit for online services' users.

Economic concerns (the cost of producing data in response to a wave of subpoenas) and reputational concerns (the fear that users will abandon a service that leaves their sensitive data vulnerable) may exert some constraint, but they're unlikely to be enough to match our privacy expectations.

We need the law to supply protection against unwanted data flows, to declare that personally sensitive information -- or the profiles from which identity may be extracted and correlated -- deserves consideration at least on par with "economically valuable secrets." We need better assurance that the data we provide in the course of communicative activities will be kept in context. There is room for that consideration in the "undue burden" discovery standard, but statutory clarification would help both users and their Internet service providers to negotiate privacy expectations better.

Is there a law? In this particular context, there might actually be law on the viewers' side. The Video Privacy Protection Act, passed after reporters looked into Judge Bork's video rental records, gives individuals a cause of action against "a video tape service provider who knowingly discloses, to any person, personally identifiable information concerning any consumer of such provider." ("Video tape" includes similar audiovisual materials.) Will any third parties intervene to ask that the discovery order be quashed?

Further, Bloomberg notes the concerns of Europeans, whose privacy regime is far more user-protective than that of the United States. Is this one case where "harmonization" can work in favor of individual rights?

Comments

VPPA and other grounds to quash

A number of commenters around the 'net have expressed the opinion that the definition of “consumer” may make the Video Privacy Protection Act inapplicable. Specifically, those commenters argue that most (or all) YouTube users are not “subscribers” within the meaning of the act.

From 18 USC § 2710(a):

(1) the term “consumer” means any renter, purchaser, or subscriber of goods or services from a video tape service provider;

Those commenters construe “subscriber” narrowly. However, there is precedent to construe the VPPA broadly.

In Dirkes v. Borough of Runnemede (D.N.J. 1996), a federal court opined that the VPPA, as a remedial statute, “should be construed broadly.” The district court wrote:

[T]he Supreme Court in Local 28 of Sheet Metal Workers' v. E.E.O.C., (1986), reinforced the principle that remedial statutes should be construed broadly. Local 28 involved a violation of Title VII, a statute designed to address employment discrimination. Upon examining the legislative history of Title VII, the Court determined that “Congress reaffirmed the breadth of the [district] court's remedial powers under § 706(g) by adding language authorizing courts to order ‘any other equitable relief as the court deems appropriate.’” This added language is identical to that used in subsection (c)(2)(D) of the Videotape Privacy Protection Act. It is evident throughout the Local 28 opinion that the Supreme Court intended to give effect to the legislators' intent to provide as broad remedial powers as possible to the district courts to eliminate the effects of illegal discrimination. This Court will exercise the same broad powers to give effect to the intent of Videotape Privacy Protection Act's U.S. Senate sponsors. The importance of maintaining the privacy of an individual's personally identifiable information mandates that people who obtain such information from a violation of the Act be held as proper defendants to prevent the further disclosure of the information.

(Cites omitted.)

But that opinion addresses a different portion of the statute: the district court was determining whether a party to whom personally identifiable information had been disclosed was a proper defendant, not whether the plaintiff was a “consumer” as defined by the act.

I myself tend to think that here, in the YouTube case, the First Amendment arguments are stronger than the VPPA arguments.

There is a lot of core political speech published on YouTube. And the precedents protecting both anonymous publication of political speech and anonymous access to it seem firm.

First Amendment

Thanks, via_tor (and I appreciate your anonymity interests, too; glad the page was configured so that you could post).
You're right to point out the strong First Amendment protections for anonymous speech. If a YouTube poster or viewer were accused of defamation, for example, that person would have the right to challenge the release of his or her identity (e.g., Doe v. 2theMart). The protections shouldn't be any weaker when the posters and viewers aren't even parties to the lawsuit and their identities are obtained only incidentally.

Privacy of YouTube data

Interesting case. I think it's also important not to overlook the context here: civil discovery in a lawsuit. For better or worse, disclosure of highly confidential information is often compelled in discovery, even though that information may be subject to all manner of confidentiality protections in the ordinary course. Examples: trade secrets; personal medical information (normally protected by HIPAA); individuals' tax returns.

The principle would seem to be that civil discovery, subject to protective orders enforced by a federal judge, is a huge exception to all rules of confidentiality. The assumption behind the principle would seem to be that a judicial protective order is sufficient to protect confidentiality/privacy interests. I, for one, would certainly question that assumption.

I'm puzzled that Google isn't appealing. Any thoughts on why? Is it worried that a favorable ruling based on the Video Privacy Protection Act, while helpful in this case, could expose Google to significant liability in the future?