A Failure To Communicate: When a Privacy Seal Doesn't Help

An articulate privacy policy helps, but if reality and the policy don't agree, you still have a problem. That's what TRUSTe is all about: helping people identify sites with privacy policies that reflect reality. Too bad TRUSTe's own policy didn't tell you who was tracking users of its site. Oops.

Matt Curtin
Interhack Corporation
http://www.interhack.net/

August 25, 2000 
Revision: 1.3

Abstract:

TRUSTe seeks to establish itself as an organization for establishing and promoting good privacy practices online. End users are supposed to have confidence in web sites displaying the TRUSTe ``trustmark'' symbol, as it presumably indicates that the site adheres to good privacy policy. We describe how the TRUSTe organization itself is breaking its own stated privacy policy by planting on its own web site a third-party ``web bug'' that makes multisite user profiling technically feasible, and we discuss the implications.

1 Introduction

  Internet privacy is a hotly debated topic and it's easy to react to discoveries before fully understanding their implications. We believe this situation significant enough to report and important enough for everyone who uses the Web to consider, but we urge thoughtful consideration of the facts presented and the questions raised so that we can engage in intelligent debate about the real significance of the situation. While we believe this discovery to be demonstrative of a serious flaw in the TRUSTe definition of privacy, we respect the right of each reader to decide for himself what is and is not acceptable practice for handling of information about him and his activity online.

We jump quickly into discussion of different parties and many terms; we'll give an overview of each of these to ensure that readers can understand this report irrespective of how closely they follow the Internet.

1.1 TRUSTe

  TRUSTe[*] claims to promote good online privacy practices, helping both web site operators and web users to understand how to approach the thorny issue of privacy online. The TRUSTe trustmark seal can be displayed by TRUSTe member sites whose privacy policies adhere to a standard for handling of private information. TRUSTe's trademarked slogan, ``Building a Web you can believe in'', clearly indicates that TRUSTe intends to build consumer confidence in the Web.

1.2 TheCounter.com

  TheCounter.com is a free ``hit counter'' service run by Internet.com LLC; it allows web site operators to see how their site is being used without having to look at their own access logs.

1.3 Cookies

  Cookies are small tokens passed between Web servers and clients. They allow the server to maintain state, making possible such things as remembering what items the client has in an online shopping cart. Some cookies are stored on the client's drive until their specified expiration time has passed. Cookies without expiration times are never written to disk; they're only memory-resident and will be destroyed when the browser is exited. EPIC maintains a Web page on cookies and their privacy implications [1].
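
The distinction between memory-resident and stored cookies can be read directly off the Set-Cookie header. The following is a minimal sketch rather than a description of any browser's actual parser: the function name `classifyCookie` is ours, the parsing is deliberately loose, and the second sample header is hypothetical. The first sample header is the one shown in Appendix C.

```javascript
// Classify a Set-Cookie header value as a session cookie (memory only)
// or a persistent cookie (kept until its expiration time).
// Simplified sketch: real browsers parse attributes far more carefully.
function classifyCookie(setCookieValue) {
  const parts = setCookieValue.split(";").map((p) => p.trim());
  const [name, ...valueParts] = parts[0].split("=");
  const attrs = {};
  for (const attr of parts.slice(1)) {
    if (attr === "") continue;
    const eq = attr.indexOf("=");
    const key = (eq === -1 ? attr : attr.slice(0, eq)).toLowerCase();
    attrs[key] = eq === -1 ? true : attr.slice(eq + 1);
  }
  // With neither Expires nor Max-Age, the cookie dies with the browser session.
  const persistent = "expires" in attrs || "max-age" in attrs;
  return { name, value: valueParts.join("="), persistent, attrs };
}

// Header taken from Appendix C: no expiration time, so a session cookie.
const sessionCookie = classifyCookie("VTC1323213=0;PATH=/");

// Hypothetical header with an expiration time, for contrast.
const persistentCookie = classifyCookie(
  "id=abc123; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/"
);

console.log(sessionCookie.persistent, persistentCookie.persistent);
```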

1.4 Web Bugs

  The term ``Web bug'' was coined by Richard Smith, CTO of the Privacy Foundation, who identified how some web site operators place tiny invisible images on their sites in such a way that third parties can track clients' use of web servers [6].

2 TRUSTe Seems To Violate Its Own Policy

  TRUSTe pays a great deal of attention to policy, which is a critical part of the privacy puzzle. However, equally important but apparently less well-understood is technology and the practices that result from its use.

At the very least, it is extremely curious that TRUSTe--an organization that has made attempts to establish itself as the place for consumers and web publishers alike to turn for guidance for online privacy--would dare to send any information about a user online to a third party. It's especially strange that TRUSTe would do this merely to see how many people are using its site and how they're using the site when all of that information is already available to them in the form of their own web server's access logs. Looking at TRUSTe's own privacy policy, it isn't exactly clear to what degree this sort of covert monitoring by a third party constitutes a policy violation.

2.1 Stated Policy

  Observe what the TRUSTe Privacy Policy[*] has to say about logs of the system's use. Nowhere is there any mention of a third party's involvement in the collection of these data. Furthermore, this statement fails to disclose what information beyond IP address and browser type is recorded.

Logged Files
The TRUSTe Web site logs IP addresses and browser types for systems administration purposes and these logs will be analyzed to constantly improve the value of the materials available on the website. We do not link IP addresses to anything personally identifiable. This means that a user's session will be tracked, but the user will be anonymous.

TRUSTe's opt-out policy is commendable; it's actually an opt-in policy. We argue that this is the only workable means of protecting online privacy. However, TRUSTe's implementation of the opt-in policy would seem flawed if any information is being sent to any third parties without the user explicitly allowing it.

Opt-out Policy
TRUSTe provides its visitors the option of opting-in as opposed to opting-out. Therefore, when there is a choice of using information for purposes other than what the information was originally collected for, TRUSTe will present the user with an opt-in box. For example, when visitors go to the "enter the drawing", they are presented with an opt-in box, asking if they would like to receive information about TRUSTe. By default, the box is NOT checked. If visitors check the box, a TRUSTe informational package will be emailed to them.

2.2 What Actually Happens

 

When a user views the TRUSTe web site, information about how the user moves around the site and about the user's computer is sent to TheCounter.com, which bills itself as a free hit counting service. However, there's much more to this system than one might initially think.

2.2.1 Web Bug and JavaScript Used Together

This system's implementation requires that some object--in this case, a tiny invisible image, also known as a ``web bug''--be fetched from a server run by TheCounter.com so that a ``hit'' can be recorded.

In addition, JavaScript code is used to report information about the user's web browser and computer system to TheCounter.com. The variables reported to TheCounter.com by a client viewing TRUSTe.org are listed in Table 1.

Interestingly, TheCounter.com uses cookies. We wish to highlight an important fact about the cookie that comes from c2.thecounter.com: it is not a persistent cookie, one that will exist on the user's hard drive until an appointed time has passed. (Generally, cookies used for tracking individuals' online activity will not expire for many years, most typically in the year 2039.) The cookie in this case has no expiration time defined, which means that it will only be present until the browser is closed. In practice, this means that a given profile is built up over the course of days or weeks instead of years.

Though this tends to suggest that the privacy implications of using such a system are much less significant than in other systems, where cookies can live for years, we would do well to remind ourselves of a few facts. In addition to the cookie and the values placed in the web bug's query string, the web server learns other information about the user and the client software by the nature of the IP and HTTP protocols, including:

  • Which ISP or company the client is coming from.
  • The browser type.
  • The browser version.
  • The language edition of the browser.
  • The language preference configuration of the browser.
  • The media types the browser accepts (which would indicate which browser plug-ins are active).
  • The time and day of the request.
  • The user's operating system type and version.

None of this is by itself a big deal, but when considering the bewildering number of possible combinations, this means that a great deal of information about the client and its user is being directed to TheCounter.com. The fact that the cookie will last only as long as the browser is running in a given session, i.e., until it's closed or crashes, means that it's not possible to put all of this information into a single dossier from browser session to browser session using the cookie.
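
To make the point about combinations concrete, here is a hedged sketch of how a server could fold the attributes listed above into a single key that does not depend on any cookie. We have no evidence that TheCounter.com does anything of the kind; the function name `sessionIndependentKey`, the shape of the `request` object, and all sample values are assumptions made purely for illustration.

```javascript
const { createHash } = require("node:crypto");

// Combine passively observable request attributes into one stable key.
// Two requests that present the same attributes yield the same key,
// whether or not a cookie is present.
function sessionIndependentKey(request) {
  const attributes = [
    request.remoteNetwork,              // ISP or company, from the IP address
    request.headers["user-agent"],      // browser type, version, OS
    request.headers["accept-language"], // language preference
    request.headers["accept"],          // accepted media types
  ];
  return createHash("sha256").update(attributes.join("\n")).digest("hex");
}

// Hypothetical requests from two separate browser sessions of the same user.
const monday = {
  remoteNetwork: "dialup.example.net",
  headers: {
    "user-agent": "Mozilla/4.7 [en] (X11; U; Linux 2.2.16 i686)",
    "accept-language": "en",
    "accept": "image/gif, image/jpeg, image/png, */*",
  },
};
const tuesday = { remoteNetwork: monday.remoteNetwork, headers: { ...monday.headers } };

console.log(sessionIndependentKey(monday) === sessionIndependentKey(tuesday));
```

How well such a key distinguishes one user from another depends entirely on how unusual the user's combination of attributes is, which is exactly why the number of possible combinations matters.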

2.2.2 Profiling Capability

We cannot help but wonder about the current state of the art of stylometry and how effective such techniques would be in linking a profile from one browser session to the profile from another browser session for the same user. Irrespective of the technology available today or in the future, the fact of the matter is that today the system is capable of limited profiling.

Careful examination of the Last-Modified HTTP header set in responses coming from c2.thecounter.com reveals that the header is being used in a manner contrary to its stated purpose [2], as demonstrated in Appendix D. The Meantime exploit [4] describes how to use this behavior to track users across sessions.
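
The general idea behind cache-based tracking of this kind can be sketched without any network traffic. The simulation below is our own illustration, not code from the Meantime report or from TheCounter.com: the class name `MeantimeStyleTracker` and its methods are hypothetical, and it assumes a client that keeps a cached copy of the resource and revalidates it by echoing the Last-Modified value in an If-Modified-Since request header.

```javascript
// Simulated server that abuses Last-Modified as a per-client identifier.
class MeantimeStyleTracker {
  constructor(startMs) {
    this.nextStamp = startMs; // next unused timestamp, in ms since the epoch
    this.visits = new Map();  // Last-Modified string -> number of visits
  }

  // Handle one request, given its headers (lower-cased names).
  handle(requestHeaders) {
    const echoed = requestHeaders["if-modified-since"];
    if (echoed !== undefined && this.visits.has(echoed)) {
      // The client's cache has handed back its unique timestamp.
      this.visits.set(echoed, this.visits.get(echoed) + 1);
      return { status: 304, headers: {}, visitor: echoed };
    }
    // New client: issue a timestamp nobody else has received.
    const stamp = new Date(this.nextStamp).toUTCString();
    this.nextStamp += 1000; // HTTP dates have one-second resolution
    this.visits.set(stamp, 1);
    return { status: 200, headers: { "last-modified": stamp }, visitor: stamp };
  }
}

const tracker = new MeantimeStyleTracker(Date.UTC(2000, 7, 24, 15, 43, 53));
const firstVisit = tracker.handle({});
// A later session: the browser revalidates its cached copy.
const returnVisit = tracker.handle({
  "if-modified-since": firstVisit.headers["last-modified"],
});
console.log(returnVisit.status, returnVisit.visitor === firstVisit.visitor);
```

Because the identifier lives in the browser's cache rather than in a cookie, it can survive the end of a browser session in which a memory-resident cookie is destroyed.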

Furthermore, because essentially any site can use TheCounter.com, there is no guarantee that some other site will not send something like a name, address, or telephone number to TheCounter.com with the same cookie that was used to report activity on the TRUSTe web site.

 
Table 1: Variables Collected by TheCounter.com

Name            Description
id              The site's ID number for TheCounter.com.
size            The width of the client's screen, in pixels.
colors          The number of colors the monitor can display.
referer [sic]   The URL of the page visited before the bugged page.
java            Whether Java is enabled.
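
The variables in Table 1 can be recovered from any request for the web bug. As a rough sketch, the function below splits such a URL back into its named fields; the name `decodeWebBug` is ours, and the use of `decodeURIComponent` assumes the referrer contains no `%uXXXX` sequences of the kind the older `escape` function can produce. The sample URL is the one requested in Appendix C.

```javascript
// Split a web bug URL of the form http://host/id=...&size=...&... into fields.
// Note that the parameters follow the "/" directly; there is no "?".
function decodeWebBug(url) {
  const path = new URL(url).pathname.slice(1); // drop the leading "/"
  const fields = {};
  for (const pair of path.split("&")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // ignore anything that is not name=value
    fields[pair.slice(0, eq)] = decodeURIComponent(pair.slice(eq + 1));
  }
  return fields;
}

const reported = decodeWebBug(
  "http://c2.thecounter.com/id=1323213&size=1280&colors=8&referer=&java=false"
);
console.log(reported);
```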
 

2.2.3 TheCounter.com Owns TRUSTe Visitor Information

Apparently in violation of its opt-in policy, TRUSTe's site is constructed such that information is sent to a third party without the user's consent. More importantly, by contract, this information actually becomes joint property of a third party (see item 12 of TheCounter.com's Terms and Conditions of use, reproduced in Appendix A).

3 Questions and Conclusions

  Any system that relies on an authority to protect the interests of others has a serious problem to overcome: why should the authority be trusted?

3.1 What Is Privacy?

  A German court succinctly defined privacy as ``informational self-determination'' [cite]. Each of us must decide who may and may not know what about us and our business. If any such information is distributed without our approval--especially without our knowledge--it has the potential to constitute a violation of our privacy.

Industry attempts to limit protections to information like a name or telephone number that is reasonably unique and therefore easy to link to the individual fail to acknowledge that a great deal of damage can occur when a detailed pseudonymous dossier is collected. (Many organizations, including TRUSTe, call information that does not contain an individual's name ``anonymous'', though where there is another unique token in its place--such as a cookie--the system isn't anonymous; it's pseudonymous. This is an important distinction, especially in light of what kinds of analysis can be performed on pseudonymous data [5].)

Though each of the participants in building the dossier might be committed not to provide any personally-identifiable information, it only takes one data collector that doesn't have such a commitment to add a real name (or other personal identifiers) to the pseudonymous profile. Such exposures can be intentional, whether by combining anonymous information from one database with non-anonymous information in another or through changes in policy that allow non-anonymous information to be collected into a profile that was formerly anonymous. Such exposures can also take place unintentionally through human error or some sort of system failure.

Thus, the industry's view of privacy stands in stark contrast with the view of Americans who use the Internet. A recent report has shown that American Internet users have grave concerns about their privacy online and the majority believe that Web tracking invades their privacy [3]. The level of media interest in Internet privacy issues gives additional support to the claim of the issue's importance to end users.

Collection of this kind of information is a dangerous game. Unfortunately, those who profit by the collection, analysis, and distribution of information bear almost none of the risk.

3.2 Questions Raised

  We have identified that it is technically feasible for both TRUSTe and TheCounter.com to do much more with the information they collect than is stated in their privacy policies. We suspect that this is simply an oversight, perhaps due to a failure to understand how these systems work and the consequences of their use under these circumstances. Nevertheless, there are many hard questions that now must be asked.

  • If TRUSTe is in a position of ensuring member sites' compliance with good privacy policy, who will ensure that TRUSTe is in compliance?
  • Is the TRUSTe definition of privacy consistent with what users--not the online vendors who want to market to them--think?
  • Why is TRUSTe using a third party service to collect information about its visitors, when they have all they need in their own hands to generate the kind of statistical site usage information that TheCounter.com provides?
  • Why does TRUSTe not disclose its relationship with TheCounter.com?
  • Why does TRUSTe not disclose that their use of TheCounter.com contractually makes TheCounter.com an owner of the data collected?
  • Why does TheCounter.com's Last-Modified header behave in a way that's apparently inconsistent with the purpose as defined in the HTTP specification?
  • What exactly does TheCounter.com do with the information it collects?
  • What plans does TheCounter.com have for the information it has collected?
  • If TheCounter.com or its parent company is sold, is the information available to the new company owner?
  • If TheCounter.com or its parent company liquidates its assets, can the information collected be auctioned off to the highest bidder?
  • In light of this report's findings, does TRUSTe consider its use of TheCounter.com to be a breach of its privacy policy?
  • Do TRUSTe users consider TRUSTe's use of TheCounter.com to be an invasion of their privacy?
  • If an invasion of privacy has taken place, how can information that has been collected without TRUSTe users' knowledge be reclaimed?
  • How does this compare to other web sites and other hit counters?

3.3 Whom Can We Trust?

  Many seem to be in search of a ``silver bullet'', a single solution that will solve the privacy problem for everyone, forever. There is no silver bullet. There is no single source from which we can expect protection. Even those with established track records of diligence, competence, and vigilance can (and do) make mistakes.

Building systems that collect information about people--even ``anonymous'' information--seems a dangerous proposition. At the very least, protecting privacy--real privacy, by allowing individuals to determine for themselves who may know what--is a very complicated problem and might well be beyond our reach given the current state of technology and economics.

A TheCounter.com's Terms and Conditions of Use

  The Terms and Conditions of Use can be seen by going to http://www.thecounter.com/ and following the links on how to get a new counter. These are the terms and conditions from August 23, 2000.

By using thecounter.com service you agree to these terms and conditions.

1. We reserve the right to change these terms and conditions without notice by posting the changes to our Web site.

2. We or you may terminate your account and remove your site from our listings at any time for any reason.

3. The following types of sites are NOT allowed to participate in thecounter.com: sites encouraging illegal activity or racism, sites providing instructions or discussions about performing illegal activities, sites that promote or utilize software or services designed to deliver unsolicited email, or any other sites we deem to be inappropriate.

4. You agree not to change thecounter.com programming code.

5. You have read our general copyright notice and terms and conditions and you agree to them.

6. Users acknowledge and agree that their Web site information (name, URL, traffic counts, etc.) may be utilized by thecounter.com. Possible uses include (but are not limited to) lists of the busiest sites, lists of member sites, general promotional uses, etc.

7. You agree to use our services at your own risk. Our services are provided on an "as is" and "as available" basis. You agree that you have made your own determination regarding the usefulness of the service. We disclaim all warranties including, but not limited to, warranties of merchantability and fitness for a particular purpose.

8. We are not liable for damages, direct or consequential, resulting from your use of the service, any failure to provide service, suspension of service, or termination of service. We do not guarantee the availability of the service. You agree not to hold us responsible for data loss or interruption of service of any kind.

9. We retain ownership and all rights to thecounter.com logos, trademarks, software, trade secrets, databases, reports, and Web site.

10. If this agreement is terminated by us or by you for any reason, you agree to remove our code, logos and trademarks from all of your Web sites and other items.

11. YOU AGREE TO DEFEND, INDEMNIFY AND HOLD US HARMLESS FROM AND AGAINST ANY AND ALL CLAIMS, LOSSES, LIABILITY COSTS AND EXPENSES (INCLUDING BUT NOT LIMITED TO ATTORNEY'S FEES) ARISING FROM YOUR VIOLATION OF THIS AGREEMENT OR ANY THIRD-PARTY'S RIGHTS, INCLUDING BUT NOT LIMITED TO INFRINGEMENT OF ANY COPYRIGHT, VIOLATION OF ANY PROPRIETARY RIGHT AND INVASION OF ANY PRIVACY RIGHTS. THIS OBLIGATION SHALL SURVIVE ANY TERMINATION OF THIS AGREEMENT. OUR LIABILITY WILL NOT EXCEED THE PURCHASE PRICE OF THE SERVICES.

12. We both own the data regarding visitors to your Web site that we collect. You can use the data we provide for any legal purposes. We will use the data in compliance with our privacy policy.

13. This Agreement will be construed and enforced in accordance with the laws of the State of Connecticut without regard to its conflict of law principles. Venue for any dispute under this Agreement will be the State of Connecticut, USA.

B TheCounter.com Tracking Code

  This code block, taken from http://www.truste.org/users/, shows JavaScript probing the system to determine the screen width, the document's referrer, the number of colors the system can display, and whether Java is enabled. That information is then formatted for database entry by standardizing it in a query string and writing the resulting HTML into the current page, causing the browser to make a request for the web bug with all of the discovered information in the query string.

<!-- Start of TheCounter.com Code -->
<SCRIPT TYPE="text/javascript" LANGUAGE="javascript">
s="na";c="na";j="na";f=""+escape(document.referrer)
</SCRIPT>
<SCRIPT TYPE="text/javascript" LANGUAGE="javascript1.2">
s=screen.width;v=navigator.appName
if (v != "Netscape") {c=screen.colorDepth}
else {c=screen.pixelDepth}
j=navigator.javaEnabled()
</SCRIPT>
<SCRIPT TYPE="text/javascript" LANGUAGE="javascript">
function pr(n) {document.write(n,"\n");}
NS2Ch=0
if (navigator.appName == "Netscape" &&
navigator.appVersion.charAt(0) == "2") {NS2Ch=1}
if (NS2Ch == 0) {
r="&size="+s+"&colors="+c+"&referer="+f+"&java="+j+""
pr("<A HREF=\"http://www.TheCounter.com\" TARGET=\"_top\"><IMG"+
" BORDER=0 SRC=\"http://c2.thecounter.com/id=1323444"+r+"\"><\/A>")}
</SCRIPT>
<NOSCRIPT><A HREF="http://www.TheCounter.com" TARGET="_top"><IMG
SRC="http://c2.thecounter.com/id=1323444" ALT="TC" BORDER=0></A>
</NOSCRIPT>
<!-- End of TheCounter.com Code -->
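
For readers who find the markup above hard to follow, the URL construction can be restated as a single function. This is a sketch for illustration only, not TheCounter.com's code: the function name `buildWebBugUrl` and the shape of its `env` argument are our assumptions, and the Netscape 2 special case is omitted. Like the original, it uses the legacy `escape` function and falls back to "na" for values the browser cannot report.

```javascript
// Rebuild the web bug URL from the values the original script probes.
function buildWebBugUrl(env) {
  const size = env.screenWidth ?? "na";   // screen.width
  const colors = env.colorDepth ?? "na";  // screen.colorDepth or screen.pixelDepth
  const java = env.javaEnabled ?? "na";   // navigator.javaEnabled()
  const referer = escape(env.referrer ?? ""); // document.referrer, escaped
  return (
    "http://c2.thecounter.com/id=" + env.siteId +
    "&size=" + size +
    "&colors=" + colors +
    "&referer=" + referer +
    "&java=" + java
  );
}

// Hypothetical visitor arriving at the bugged page from the TRUSTe home page.
const bugUrl = buildWebBugUrl({
  siteId: 1323444,
  screenWidth: 1280,
  colorDepth: 8,
  referrer: "http://www.truste.org/",
  javaEnabled: false,
});
console.log(bugUrl);
```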

C TheCounter.com Web Bug Setting a Cookie

  This can easily be demonstrated with the standard telnet program. Observe the Set-Cookie header in the response.

$ telnet c2.thecounter.com 80
Trying 63.236.73.251...
Connected to c2.thecounter.com.
Escape character is '^]'.
HEAD /id=1323213&size=1280&colors=8&referer=&java=false HTTP/1.0

HTTP/1.0 200 OK
Date: Wed, 23 Aug 2000 21:00:33 GMT
Server: TheCounter/2.0
Last-Modified: Wed, 23 Aug 2000 21:00:33 GMT
Pragma: no-cache
Cache-control: no-cache, must-revalidate
Expires: Wed, 23 Aug 2000 21:00:33 GMT
Set-Cookie: VTC1323213=0;PATH=/
Accept-Ranges: bytes
Content-Length: 43
Connection: close
Content-Type: image/gif

Connection closed by foreign host.

D TheCounter.com Using Cache Negotiation for Tracking?

  Each request made for the identical resource (i.e., the same web bug with exactly the same URL) is returned with a different answer in the Last-Modified header. Normally, when a resource is returned to the client, that header is used to identify when the resource was last updated, so that caches will be able to recognize when a resource can safely be served from cache and when it must be refreshed.

Observe that this is not the case in TheCounter.com's use of the header. Each time the same request is made, the value is different, thus, that is not the correct ``last modification'' time of the resource. Additionally, the Pragma and Cache-control headers direct caches not to cache this information; each time the resource is requested the client must make contact with the c2.thecounter.com server.

Here we make several sequential requests--all made within a period of less than five seconds--for the same URL. Observe the Last-Modified header in each response. Discussing this matter with Internet.com folks, we found that there seems to be a reasonable explanation for this behavior. Though still violating the purpose of the header, the Last-Modified header always matches the Date header. This is (yet another) attempt (in addition to the Pragma and Cache-control headers) to ensure that the object will not be cached. As it turns out, it would seem that c2.thecounter.com, although having only one IP address, is actually a group of servers, and if their clocks are out of synchronization, it could cause a series of requests to exhibit the same behavior as that which we'd see in the Meantime tracking exploit.

So are they tracking? The system seems technically capable of doing so--within some margin of error--but we have no reason to believe that they are. Herein lies the issue: if the system is capable of tracking, whether by design or by oversight, how do we know whether such is taking place?

HEAD http://c2.thecounter.com/id=1323213&size=1280&colors=8&referer=&java=false HTTP/1.0

HTTP/1.0 200 OK
Date: Thu, 24 Aug 2000 15:43:53 GMT
Server: TheCounter/2.0
Last-Modified: Thu, 24 Aug 2000 15:43:53 GMT
Pragma: no-cache
Cache-control: no-cache, must-revalidate
Expires: Thu, 24 Aug 2000 15:43:53 GMT
Set-Cookie: VTC1323213=0;PATH=/
Accept-Ranges: bytes
Content-Length: 43
Connection: close
Content-Type: image/gif

HEAD http://c2.thecounter.com/id=1323213&size=1280&colors=8&referer=&java=false HTTP/1.0

HTTP/1.0 200 OK
Date: Thu, 24 Aug 2000 12:43:18 GMT
Server: TheCounter/2.0
Last-Modified: Thu, 24 Aug 2000 12:43:18 GMT
Pragma: no-cache
Cache-control: no-cache, must-revalidate
Expires: Thu, 24 Aug 2000 12:43:18 GMT
Set-Cookie: VTC1323213=0;PATH=/
Accept-Ranges: bytes
Content-Length: 43
Connection: close
Content-Type: image/gif

HEAD http://c2.thecounter.com/id=1323213&size=1280&colors=8&referer=&java=false HTTP/1.0

HTTP/1.0 200 OK
Date: Thu, 24 Aug 2000 15:56:20 GMT
Server: TheCounter/2.0
Last-Modified: Thu, 24 Aug 2000 15:56:20 GMT
Pragma: no-cache
Cache-control: no-cache, must-revalidate
Expires: Thu, 24 Aug 2000 15:56:20 GMT
Set-Cookie: VTC1323213=0;PATH=/
Accept-Ranges: bytes
Content-Length: 43
Connection: close
Content-Type: image/gif
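
The three responses above can also be compared mechanically. The sketch below copies the Date and Last-Modified values from the transcripts and checks two things: whether Last-Modified always mirrors Date, and how far apart the reported times are. The function name `summarizeResponses` is ours, introduced for illustration.

```javascript
// Date and Last-Modified values copied from the three responses above.
const observed = [
  { date: "Thu, 24 Aug 2000 15:43:53 GMT", lastModified: "Thu, 24 Aug 2000 15:43:53 GMT" },
  { date: "Thu, 24 Aug 2000 12:43:18 GMT", lastModified: "Thu, 24 Aug 2000 12:43:18 GMT" },
  { date: "Thu, 24 Aug 2000 15:56:20 GMT", lastModified: "Thu, 24 Aug 2000 15:56:20 GMT" },
];

function summarizeResponses(responses) {
  const times = responses.map((r) => Date.parse(r.date));
  return {
    // True when every Last-Modified value simply repeats the Date header.
    lastModifiedMirrorsDate: responses.every((r) => r.date === r.lastModified),
    // Number of different "last modification" times for one unchanged URL.
    distinctLastModified: new Set(responses.map((r) => r.lastModified)).size,
    // Gap between the earliest and latest server clock readings, in seconds.
    spreadSeconds: (Math.max(...times) - Math.min(...times)) / 1000,
  };
}

const summary = summarizeResponses(observed);
console.log(summary);
// { lastModifiedMirrorsDate: true, distinctLastModified: 3, spreadSeconds: 11582 }
```

A spread of 11582 seconds, a little over three hours, across requests made within five seconds is consistent with the explanation that several servers with unsynchronized clocks answer for the same address.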

References

[1] EPIC. The cookies page. [online] http://www.epic.org/privacy/cookies/, February 2000.

[2] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext transfer protocol - HTTP/1.1. RFC 2616, June 1999. [online] http://www.ietf.org/rfc/rfc2616.txt.

[3] Susannah Fox, Lee Rainie, John Horrigan, Amanda Lenhart, Tom Spooner, and Cornelia Carter. Trust and privacy online: Why Americans want to rewrite the rules. Technical report, The Pew Internet & American Life Project, August 2000.

[4] Martin Pool. meantime: non-consensual http user tracking using caches. Technical report, March 2000. [online] http://www.linuxcare.com.au/mbp/meantime/.

[5] Josyula R. Rao and Pankaj Rohatgi. Can pseudonymity really guarantee privacy? In Proceedings of the 9th USENIX Security Symposium, pages 85-96. IBM T.J. Watson Research Center, USENIX Association, August 2000. [online] http://www.usenix.org/publications/library/proceedings/sec2000/rao.html.

[6] Richard M. Smith. Web bug FAQ. Technical report, Privacy Foundation, 2000. [online] http://www.privacyfoundation.org/education.html.

Footnotes

...TRUSTe
On the Web at http://www.truste.org/ .

...Policy
http://www.truste.org/TRUSTe_privacy.html