PSA: Tor exposes all traffic by design. Do NOT use it for normal web browsing.

As news of PRISM and other top-secret domestic surveillance programs has been reported, many Americans have sought out means to prevent the government’s prying eyes from gaining access to their data. One of the most frequently cited methods of circumvention is Tor. NPR’s Science Friday, for instance, discussed Tor as a potential way to circumvent PRISM on July 12, and the Tor Browser Bundle is one of the first things promoted on PRISM Break.

This is very bad. Tor should not, I repeat, NOT, be used as a default wrapper for one’s browsing traffic. I’ve had to stop several friends from making this mistake after they were misled by pseudo-technical sources, and now I’m here to stop you.

This is not about a flaw in the Tor protocol; rather, it is a correction of the myth that Tor can protect your conversations from random listeners. This belief is in fact the opposite of the truth; using Tor guarantees that at least one random party will have full access to all packets in both directions going over a specific node chain, because Tor is about hiding your IP address, not hiding your packet contents. As this is the effect that most people are attempting to avoid, Tor is not only counterproductive but dangerous for the average user.

PRISM can only be beaten by not playing

Before we discuss the specific mechanics of why it’s such a big no-no to wrap your web traffic in Tor by default, we should address a more fundamental point. PRISM is a voluntary program of data submission. This means that PRISM participants have been invited by the NSA to upload the contents of their database, and that the vendors have chosen to accept this invitation. It doesn’t matter how you access a PRISM participant’s resources, because they upload all the data they have on you anyway. Therefore, the only way to prevent your data from getting submitted to the NSA, whether you’re connecting from your home DSL or the Starship Enterprise, is to not give any data to the entities that are wrapping it in a neat bow and dropping it on the NSA’s doorstep. Tor will not help with this. Tor will do nothing to prevent this. Tor makes it harder for an endpoint to discover the data’s originating IP address, which is a fairly minor detail when we’re discussing something on the scale of PRISM, since they already have all the emails, IMs, photos, cell phone information, etc. of basically everyone.

I repeat: the only thing that will protect someone from PRISM is refusal to utilize the products of PRISM participants. It does not matter how or why or when or where you access it. If you upload any data to the service of a vendor who participates in PRISM, the NSA has it, and that’s the end of the story. As far as the U.S. government is concerned, using Tor will just result in a flag on your account that makes the guys who’re reading your email laugh and say, “Ha! This guy thinks we care so much about his boring emails that he should try to hide from us. What a jokester.”

However, it is very important that one doesn’t use Tor to do mundane things that are just as well done on a direct connection, because Tor’s infrastructure is inherently insecure for most ordinary uses.

Your traffic is visible to the exit node.

Tor is an acronym for “The Onion Router”. It is so named because it works by wrapping your request in several layers of encryption and then sending this request through an automatically generated chain of nodes. At some point, the request must be unwrapped to be sent to its final destination because most people are trying to communicate with an ordinary online service that doesn’t understand Tor’s methods.

The Tor node that performs the final unwrapping is called an exit node. The exit node decrypts the packet it received from its sibling on the chain of nodes and recovers your full, plaintext request, which it submits on your behalf to the intended destination. The exit node waits for the response, encrypts it, and sends the encrypted response back up through the node chain until it reaches you, the dear user at the termination of the chain, where your Tor client decrypts the packet from your chain-sibling and presents your application with a comprehensible piece of data.
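To make the layering concrete, here is a toy sketch of the wrap-and-peel structure. It is not Tor’s actual cryptography (Tor negotiates ephemeral session keys with each relay while building a circuit); it just uses pre-shared symmetric Fernet keys to show that whoever peels the final layer necessarily ends up holding your plaintext request.

```python
# Toy illustration of onion layering -- NOT Tor's actual cryptography.
# Tor negotiates ephemeral session keys with each relay; this sketch just
# uses pre-shared symmetric Fernet keys to show the structural idea.
from cryptography.fernet import Fernet

# One key per hop in the circuit: entry -> middle -> exit.
hop_keys = [Fernet.generate_key() for _ in range(3)]

def wrap(request: bytes, keys) -> bytes:
    """Encrypt for the exit node first, then wrap each earlier hop around it."""
    blob = request
    for key in reversed(keys):      # exit key applied first, entry key last
        blob = Fernet(key).encrypt(blob)
    return blob

def relay(blob: bytes, keys) -> bytes:
    """Each hop peels one layer; the exit node ends up with the plaintext."""
    for key in keys:
        blob = Fernet(key).decrypt(blob)
    return blob                     # this is what the exit node sees

onion = wrap(b"GET /login?user=alice&pass=hunter2 HTTP/1.1", hop_keys)
print(relay(onion, hop_keys))       # the full plaintext request, at the exit
```

The last line is the whole point: the exit position recovers the request exactly as you wrote it. The onion hides which client the request came from, not what the request says.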

There is no way to restrict what an exit node can do with your session’s plaintext, and anyone can run an exit node. There is no qualification process and there are no restrictions. Barack Obama could be running an exit node within minutes, and so could Edward Snowden, and there’d be no way for either of them to ensure that the other couldn’t see the requests they were sending. The user simply checks a box in Vidalia and he’s running an exit node, relaying plaintext data between conversants. Exit nodes automatically change every few minutes, so many exit nodes will be relaying pieces of your conversation, possibly re-exposing sensitive data to many entities over the course of a single session. Anyone running a Tor exit node is a potential listener.

The Tor project attempts to scare exit node operators straight by citing the possibility of prosecution under wiretap laws, but this is a purely legal restriction; under Tor’s design, there is no possible technical implementation that would prevent the exit node operator from being able to save both incoming and outgoing messages as sent between conversants. Only the threat of prosecutorial pressure (which is basically non-existent for certain parties) stands betwixt an exit node operator and your data. Thus, Tor is extremely dangerous for the ordinary user. It must be used only for specific, carefully-planned sessions, or you risk exposing sensitive personal data to anyone running an exit node.

In principle, Tor is not very complex. It simply automates what would otherwise be a very cumbersome manual process of chaining proxies and encrypting a message for each proxy’s public key. Tor’s directories and announce mechanisms mean that one no longer must trawl for private proxies, but they also mean that anyone can register a node as a proxy and do whatever they like with the traffic they’re passing. Tor puts no restrictions on any of this — literally anyone running the Tor software can volunteer to pass along traffic and will automatically begin receiving the traffic of other users.

You are much safer with just the NSA spying on you than with all the people you invite to spy on you when you use Tor indiscriminately.

What about SSL/TLS?

Encryption protocols implemented by browsers may mitigate this issue to varying degrees, depending on the details of how the cryptography is implemented and negotiated (and on the assumption that an exit node isn’t tampering with the negotiation handshakes to allow easier interception of the encrypted conversation), the validity and trustworthiness of the certificates in use, the server’s proper use of security flags, and other variables. That’s sure a lot of stuff to have to assume is in place when you’re broadcasting your packet-level conversations out to potentially any Joe Blow on the street.
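If you do route a specific session over Tor, end-to-end TLS with certificate verification left on is the only barrier between the exit node and your conversation. A minimal sketch, assuming a local Tor client listening on the conventional SOCKS port at 127.0.0.1:9050 and the requests library with its optional SOCKS support installed (pip install "requests[socks]"); the destination URL is just a placeholder:

```python
# Minimal sketch: one deliberate request over Tor's local SOCKS proxy.
# Assumes a Tor client is listening on 127.0.0.1:9050 and that requests'
# optional SOCKS support (pip install "requests[socks]") is present.
import requests

TOR_PROXIES = {
    "http":  "socks5h://127.0.0.1:9050",   # socks5h: resolve DNS inside Tor too
    "https": "socks5h://127.0.0.1:9050",
}

resp = requests.get(
    "https://example.com/",    # placeholder destination
    proxies=TOR_PROXIES,
    timeout=60,
    verify=True,               # certificate verification is the only barrier
                               # between the exit node and your plaintext;
                               # never set this to False over Tor
)
print(resp.status_code)
```

Even then, the exit node still learns which host you are contacting and when; TLS protects the payload, not the fact of the connection.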

Why does Tor exist if it’s so unsafe?

Because Tor is not designed to be a universal privacy tool. It was built for a specific purpose, which was the circumvention of restrictive firewalls. The default example is China; Tor could be used by Chinese dissidents to post or access information that is censored in China, but available in the “free world”. Tor would make it impossible for the Chinese government to tell which computer was used to post a certain piece of information, and would hide the fact that other information was being accessed at all. Tor is meant as a lifeline to the outside world. Tor actually makes it much easier to spy on random conversations between entities, if you’re into that kind of thing (and the government obviously is), because the idea is to get public information in and out of a locked-down environment. And it works very well for that.

With this in mind, it’s ironic to look back on the way that certain persons have clung to Tor as a solution to domestic spying, because in actual fact, Tor makes such spying easier for an adversary that is only slightly removed from many of Tor’s biggest participants (universities), and opens the user’s traffic up to the possibility of tampering or recording from a potentially infinite collection of more ignominious foes.

OK, when can I use Tor?

Assume any data you pass through Tor, including usernames and passwords, will be publicly visible. If you have a use case where you’re OK with that happening, you’re OK to use Tor; if not, you aren’t. As most people do many things that they don’t want publicized, Tor is a very bad solution for most people.

Reddit the open-source software

ketralnis has responded to this post here and throughout this thread.

Occurrences of “reddit the open-source software” have been abbreviated to “reddit OSS”. – Nov. 19, 2010

I use reddit, as in reddit the open-source software, for a website that doesn’t get much traffic, for several reasons; reddit OSS is one of the bigger ones. I want to talk about reddit OSS and its management for a moment.

reddit OSS is published at http://github.com/reddit and http://code.reddit.com. reddit.com usually works pretty well, but reddit OSS is very unfriendly to anyone who is not reddit.com.

There has not been a push for about a month, and before that there had not been a push since mid-July, despite “planning on a much more sane release schedule for future patches (much closer to ‘weekly’ rather than ‘epoch modulo 10Ms’).” The long lag between pushes makes merging a serious undertaking when a new version eventually does land, especially for those who run hobby or part-time sites (as most people running the reddit open-source platform would be). Each time I have updated my reddit installation to a new HEAD, it has taken two or three days of configuration, re-merging, and bug-squashing before the updated codebase was working as expected; recently, subtle failures occurred while running ads for the site and essentially made it impossible to post comments. If changes were pushed in smaller increments, the same necessary merges would be much easier to handle; merging three or four changes is much simpler than merging 60-70+.

Merges get even more complicated because, to customize reddit in even the most basic ways, you have to hack up several base code files that contain a lot of other stuff. When you clone reddit from git, the clone comes with the same ads that run on reddit, and the only way to remove them is to edit the file that defines them, a file that git tracks and that clashes on merges (unless you mark it --assume-unchanged, which is probably safe in this case since that file hasn’t been updated in over two years, but it is still extra hassle and it excludes all future changes to that file from applying automatically — changes which may be important).
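For what it’s worth, most of this upkeep can be scripted. Here is a rough sketch in Python, assuming your checkout has the official repo added as a remote named upstream tracking a master branch; the ads-template path at the end is a stand-in, not reddit’s actual file name:

```python
# Hedged sketch of the upkeep described above, driven from Python.
# Assumes a git checkout with the official repo added as a remote named
# "upstream"; the ads-template path below is a stand-in, not reddit's real file.
import subprocess

def git(*args: str) -> str:
    """Run a git command in the current checkout and return its output."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout.strip()

# How much has piled up since the last merge? Big gaps are what turn each
# sync into a multi-day job instead of a quick rebase.
git("fetch", "upstream")
behind = git("rev-list", "--count", "HEAD..upstream/master")
print(f"{behind} upstream commits to merge")

# Stop git from flagging the locally edited ads template on every merge.
# (--assume-unchanged also silently skips future upstream edits to the file.)
git("update-index", "--assume-unchanged", "r2/templates/ads.html")  # hypothetical path
```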

There are several other instances of things that really should have been cleaned up for reddit OSS but still linger, and as you go through removing all of them, you build up quite a few local changes — changes that cause problems when it’s time to pull. You shouldn’t have to sanitize the codebase of the OSS version in the first place; that’s the maintainer’s job.

Most obvious among these things that should have been stripped is the reddit alien. It is all over the place — under the submit link button, under the create a subreddit button, as the thumbnail placeholder, and so on. As far as I know, the reddit alien is still held by Conde Nast/reddit corporate under an All Rights Reserved copyright license, as one might expect for a company’s logo. The terms “reddit”, “subreddit”, etc. appear throughout the site, creating potential trademark liabilities.

If a website that runs reddit OSS starts to gain momentum, how long do we expect the lawyers at Conde Nast to tolerate use of the reddit name and logo on a website over which they have no control, especially if that site encroaches on reddit.com’s primary audience? Why can’t they draw a distinct alien for reddit OSS, or just include generic images and icons from Tango et al.? That would be a much better approach. My site has been going for almost a year and I’m still finding the term “reddit” sprinkled in odd places, despite going through the translation file a few times; it’s hard-coded in some spots.

Then, to run reddit OSS, one must use memcached, Cassandra, an AMQP server like rabbitmq, PostgreSQL, and a handful of paster daemons included with reddit, which are currently configured to run with daemontools, so unless you want to spend a while converting the current scripts/daemons, you must also install and use daemontools. Furthermore, running these daemons is non-obvious, and it was not required when I originally pulled, so it took me a while to figure out that a lot of the weird bugs I hit resulted from not running them. These daemons are mostly for caching as far as I know, but if you don’t have them in place, weird things like disappearing thumbnails and comments will befall you. The commit messages I saw did not announce this in big shiny letters, and the overall documentation is poor.

reddit.com does almost no testing of reddit OSS. They just push out what they run on reddit.com. Many times in #reddit-dev I have seen “we haven’t tested it that way but it should work…” in response to someone describing a bug or submitting a patch. reddit does not test reddit OSS in a conventional environment.

In the October update, reddit merged several contributed patches, but before that it was rather rare, happening only a couple of times on a couple of patches (judging from the github history). There are still a lot of changes out there that deserve to be merged, but reddit.com is trying to keep the codebase unified (despite the super-ugly squash commits that get pushed out in the “weekly” updates), so if your patch would help most users of reddit OSS but not reddit.com, it won’t get merged. This can be good in some cases — it forced me to produce a more scalable database reconnect priority patch, for instance — but it can also mean that more sensible defaults or caching mechanisms for sites that are not reddit.com are rejected.

The reddit guys insist that their number one priority is reddit.com, and almost any time someone brings up pushing reddit.com’s code to the OSS version, or merging a patch, or whatever in #reddit-dev, ketralnis is adamant that there is just no time for it. reddit is clearly understaffed, and reddit OSS is largely neglected.

There’s not necessarily anything wrong with that, but all of this means that reddit OSS is in prime condition for a fork. However, ketralnis does not think a fork is a good idea. Here is a snippet from IRC, with pieces omitted for brevity and coherence:

(01:03:12 AM) sjuxax: I am planning on forking reddit sometime soon fyi

(01:05:07 AM) ketralnis: I wouldn’t recommend that

(01:05:12 AM) sjuxax: why?

(01:05:21 AM) ketralnis: It’d be a nightmare to maintain against our code-releases, for one

(01:05:47 AM) ketralnis: For another, the license make it difficult to divorce from our brand

(01:05:55 AM) sjuxax: Well it’s already a nightmare to merge with the six-month release cycles and big changes you guys make.

(01:06:12 AM) ketralnis: Agreed, and we should do less of that

(01:06:14 AM) sjuxax: The license basically just requires the attribution at the bottom, right?

(01:06:39 AM) ketralnis: If you’re planning on forking it, you should actually read it. It’s not a long one

(01:06:43 AM) sjuxax: so we can leave that, but the alien is all over. Obviously the license won’t let us get rid of the powered by reddit logo, but the rest should be free to go

(01:06:51 AM) sjuxax: I have read it in the past, but it’s been a while

(01:07:16 AM) ketralnis: I understand where you’re coming from, but it would harm our open-source development to have it forked

(01:08:43 AM) sjuxax: Well I would prefer to keep upstream and the fork at least somewhat compatible

(01:08:58 AM) sjuxax: so hopefully most patches could still go both ways

(01:10:17 AM) sjuxax: but yeah, uh, sorry. reddit has neglected its open-source users imo so a fork is inevitable when you get serious users; that’s why we use OSS software; if the maintainer isn’t taking care of it, someone else can

(01:10:47 AM) ketralnis: We are taking care of it, in that it’s what’s running our live site, right now. 14 million pageviews yesterday.

(01:11:04 AM) ketralnis: So I’d say it’s holding up rather well under its current maintanence

(01:11:21 AM) sjuxax: OK, you are taking care of your reddit installation, you are running reddit for reddit which is fine if that’s what you want to do

(01:11:25 AM) ketralnis: The right solution is for me to set aside a day to merge up with public, not to go forking it

(01:11:28 AM) sjuxax: but it is not attractive as an option for not-reddit

(01:11:43 AM) ketralnis: I’m telling you, forking us will hurt reddit.

(01:11:50 AM) sjuxax: but you don’t set aside that day often enough; you were going to do it weekly but now it’s been months again

(01:12:57 AM) sjuxax: reddit as an open-source project is either going to get forked or going to continue to limp on. it will be nice for reddit’s reddit, but if things keep going how they have been going, virtually no one is going to use the code you publish.

(01:12:57 AM) ketralnis: I don’t have time to argue this right now. But trust me, you forking reddit will fuck up my week, and probably stall any future open source contribution to reddit.

(01:13:53 AM) ketralnis: Forking it will make that situation worse by losing the only developers *paid* to contribute to it from your fork, and any open source developers from either

So reddit corporate would not be happy to see a fork rise up, but what choice do users of reddit OSS have? Things are definitely not good the way they are now and I think that a fork is ultimately inevitable unless reddit revises their policies, allows some divergence, and finally takes the open-source side of things seriously.

Is there much interest in a fork out there? There are lots of good contributions on github that remain unmerged, and a fork would be more active about merging them, especially changes that enhance the platform for smaller sites. If someone does eventually reach reddit.com-level traffic, they can switch back to the official reddit OSS, and then all of the onerous/tricky/annoying/monstrous stuff reddit employs to allow caching and survival under that kind of traffic will be beneficial.

The paths before reddit.com/reddit corporate are A) take reddit OSS seriously: get patching and merging fixed up, make it easier to push out changes, and then maintain the open-source version frequently and well, including possible divergences where they benefit the OSS user; B) stay the course until someone forks, with unclear ultimate consequences (ketralnis seems to think it would mean the cessation of commercially-funded development entirely); or C) stay the course until everyone gives up on reddit OSS and the project withers and dies. What’ll it be?