Reddit the open-source software

ketralnis has responded to this post here and throughout this thread.

Occurrences of “reddit the open-source software” have been abbreviated to “reddit OSS”. – Nov. 19, 2010

I use reddit, as in reddit the open-source software, for a website that doesn’t get much traffic for several reasons. reddit OSS is one of the bigger reasons. I want to talk about reddit OSS and its management for a moment.

reddit OSS is published at http://github.com/reddit and http://code.reddit.com. reddit.com usually works pretty well, but reddit OSS is very unfriendly to anyone who is not reddit.com.

There has not been a push for about a month, and before that, there had not been a push since mid-July, despite “planning on a much more sane release schedule for future patches (much closer to ‘weekly’ rather than ‘epoch modulo 10Ms’).” The long lag time between pushing changes makes code merges when a new version eventually does get pushed a serious undertaking, especially for those who run hobby or part-time sites (as most running the reddit open-source platform would be). Each time I have updated my reddit installation to a new HEAD it has been two or three days of configuration, re-merging, and bug-squashing before the updated codebase was working as expected; recently, subtle failures occurred while running ads for the site and essentially made it impossible to post comments. If changes were pushed in smaller increments, the same necessary merges would be much easier to handle; merging three or four changes is much simpler than merging 60-70+.

Merges get even more complicated because customizing reddit in even the most basic ways means hacking up several base code files that contain a lot of other stuff. When you clone reddit from git, the clone comes with the same ads that run on reddit, and the only way to remove them is to edit the file that contains them, a file that git tracks and that clashes on merges. You can mark it with git update-index --assume-unchanged, which is probably safe in this case because that file hasn’t been updated in over two years, but that is still extra hassle, and it excludes all future changes to the file from applying automatically, changes which may be important.
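For anyone who has not used the assume-unchanged flag, here is a minimal sketch in a throwaway repository. The file name ads.html is a hypothetical stand-in for illustration, not the actual path in the reddit tree, and the commit identity is made up so the script runs without any global git configuration.

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# ads.html is a hypothetical stand-in for the tracked file that carries the ads.
echo "upstream ad markup" > ads.html
git add ads.html
git -c user.name=demo -c user.email=demo@example.com commit -q -m "import upstream"

# Ask git to assume the tracked file never changes in the working tree.
git update-index --assume-unchanged ads.html

# Local customization: strip the ads.
echo "no ads here" > ads.html

# The edit is now invisible to status, so this comes back empty.
hidden=$(git status --porcelain)
echo "status while flagged: [$hidden]"

# Flagged files show up with a lowercase tag in `git ls-files -v`.
flagged=$(git ls-files -v ads.html)
echo "ls-files -v: $flagged"

# Clear the flag when upstream touches the file and a real merge is needed.
git update-index --no-assume-unchanged ads.html
restored=$(git status --porcelain)
echo "status after clearing: [$restored]"
```

The flag lives only in the local index, so it has to be set again on every fresh clone, and it is easy to forget it is there. Running git ls-files -v and looking for lowercase tags is one way to audit which files are being hidden.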

There are several other instances of things that really should have been cleaned up for reddit OSS but still linger, and as you go through removing them all, you get quite a few changes built up, changes that cause problems when it’s time to pull. You shouldn’t have to sanitize the codebase of the OSS version in the first place; that’s the maintainer’s job.

Most obvious among these things that should have been stripped is the reddit alien. It is all over the place: under the submit link button, under the create a subreddit button, in the thumbnail placeholder, and so on. As far as I know the reddit alien is still held by Conde Nast/reddit corporate under an All Rights Reserved copyright license, as one might expect for a company’s logo. The terms “reddit”, “subreddit”, etc., appear throughout the site, causing potential trademark liabilities.

If a website that runs reddit OSS starts to gain momentum, how long do we expect the lawyers at Conde Nast to abide usage of the reddit name and logo on a website over which they have no control, especially if that site infringes on reddit.com’s primary audience? Why can’t they draw a distinct alien for reddit OSS or just include generic images and icons from Tango et al? That would be a much better approach. My site has been going for almost a year and I’m still finding the term “reddit” sprinkled in odd places, despite going through the translation file a few times. It’s hard-coded in some spots.

Then, to run reddit OSS, one must use memcached, Cassandra, an AMQP server like rabbitmq, PostgreSQL, and a handful of paster daemons included with reddit, which are currently configured to run with daemontools, so unless you want to spend a while converting the current scripts/daemons, you must also install and use daemontools. Furthermore, running these daemons is non-obvious and it was not required when I originally pulled, so it took me a while to figure out a lot of the weird bugs I got resulted from not running these daemons. These daemons are mostly for caching as far as I know, but if you don’t have them in place weird things like disappearing thumbnails and comments will befall you. The commit messages I saw did not make big shiny letters about it, and the overall documentation is poor.

reddit.com does almost no testing of reddit OSS. They just push out what they run on reddit.com. Many times in #reddit-dev I have seen “we haven’t tested it that way but it should work…” before someone describes a bug or submits a patch. reddit does not test reddit in a conventional environment.

In the October update, reddit merged several contributed patches, but prior thereto it was rather rare, only occurring a couple of times on a couple of patches (from the github history). There are still a lot of changes out there that would do well to be merged, but reddit.com is trying to keep the codebase unified (despite its super-ugly squash commits that get pushed out in the “weekly” updates), so if your patch would help most users of reddit OSS but not reddit.com, it won’t get merged. This can be good in some cases — it forced me to produce a more scalable database reconnect priority patch, for instance — but it can also mean that more sensible defaults or caching mechanisms for sites that are not reddit.com would be rejected.

The reddit guys insist that their number one priority is reddit.com and almost any time someone brings up a push of reddit.com to the OSS version or merging of a patch or whatever in #reddit-dev, ketralnis is adamant that there is just no time for that. reddit is clearly understaffed and reddit OSS is largely neglected.

There’s not necessarily anything wrong with that, but all of this means that reddit OSS is in prime condition for a fork. However, ketralnis does not think a fork is a good idea. Here is a snippet from IRC, with pieces omitted for brevity and coherence:

(01:03:12 AM) sjuxax: I am planning on forking reddit sometime soon fyi
(01:05:07 AM) ketralnis: I wouldn’t recommend that

(01:05:12 AM) sjuxax: why?

(01:05:21 AM) ketralnis: It’d be a nightmare to maintain against our code-releases, for one

(01:05:47 AM) ketralnis: For another, the license make it difficult to divorce from our brand

(01:05:55 AM) sjuxax: Well it’s already a nightmare to merge with the six-month release cycles and big changes you guys make.

(01:06:12 AM) ketralnis: Agreed, and we should do less of that

(01:06:14 AM) sjuxax: The license basically just requires the attribution at the bottom, right?

(01:06:39 AM) ketralnis: If you’re planning on forking it, you should actually read it. It’s not a long one

(01:06:43 AM) sjuxax: so we can leave that, but the alien is all over. Obviously the license won’t let us get rid of the powered by reddit logo, but the rest should be free to go

(01:06:51 AM) sjuxax: I have read it in the past, but it’s been a while

(01:07:16 AM) ketralnis: I understand where you’re coming from, but it would harm our open-source development to have it forked

(01:08:43 AM) sjuxax: Well I would prefer to keep upstream and the fork at least somewhat compatible

(01:08:58 AM) sjuxax: so hopefully most patches could still go both ways

(01:10:17 AM) sjuxax: but yeah, uh, sorry. reddit has neglected its open-source users imo so a fork is inevitable when you get serious users; that’s why we use OSS software; if the maintainer isn’t taking care of it, someone else can

(01:10:47 AM) ketralnis: We are taking care of it, in that it’s what’s running our live site, right now. 14 million pageviews yesterday.

(01:11:04 AM) ketralnis: So I’d say it’s holding up rather well under its current maintanence

(01:11:21 AM) sjuxax: OK, you are taking care of your reddit installation, you are running reddit for reddit which is fine if that’s what you want to do

(01:11:25 AM) ketralnis: The right solution is for me to set aside a day to merge up with public, not to go forking it

(01:11:28 AM) sjuxax: but it is not attractive as an option for not-reddit

(01:11:43 AM) ketralnis: I’m telling you, forking us will hurt reddit.

(01:11:50 AM) sjuxax: but you don’t set aside that day often enough; you were going to do it weekly but now it’s been months again

(01:12:57 AM) sjuxax: reddit as an open-source project is either going to get forked or going to continue to limp on. it will be nice for reddit’s reddit, but if things keep going how they have been going, virtually no one is going to use the code you publish.

(01:12:57 AM) ketralnis: I don’t have time to argue this right now. But trust me, you forking reddit will fuck up my week, and probably stall any future open source contribution to reddit.

(01:13:53 AM) ketralnis: Forking it will make that situation worse by losing the only developers *paid* to contribute to it from your fork, and any open source developers from either

So reddit corporate would not be happy to see a fork rise up, but what choice do users of reddit OSS have? Things are definitely not good the way they are now and I think that a fork is ultimately inevitable unless reddit revises their policies, allows some divergence, and finally takes the open-source side of things seriously.

Is there much interest in a fork out there? There’s lots of good contributions on github that remain unmerged, and a fork would be more active about merging these and especially merging changes that enhance the platform for smaller sites. Once someone gets reddit.com-level traffic, they can switch the platform to the official reddit OSS and then all of the onerous/tricky/annoying/monstrous stuff that is employed by reddit to allow caching and survival under that kind of traffic will be beneficial.

The paths before reddit.com/reddit corporate are A) take reddit OSS seriously, get patching and merging fixed up and make it easier to push out changes, and then maintain the open-source version frequently and well, including possible divergences where it benefits the OSS user; B) stay the course until someone forks, with unclear ultimate consequences (ketralnis seems to think it would mean a cessation of commercially-funded development entirely); or C) stay the course until everyone gives up on reddit OSS and the project withers and dies. What’ll it be?

11 thoughts on “Reddit the open-source software”

  1. haulden

    I can definitely get where you’re coming from. I remember using reddit open source (fixxit) for a pretty low traffic site a couple of years ago, and it was a huge pain in the ass. As a matter of fact I stopped using it and developed an in-house solution after the first merge, because, honestly, I was just disheartened by the amount of bug-fixing needed to get it back to a working state.

    I agree with you that a fork is not only in order, but pretty much necessary if fixxit is to gain further traction. I can understand ketralnis’ worries, but as is thoroughly explained in your interesting article, the project’s management is pretty appalling (in the fixxit front, not the reddit.com).

    Long story short, count me in for a fork!

  2. joehillen

    Personally, I would be very interested in a maintained reddit fork, even if they branch so far apart that it would be unreasonable to try to merge them.

    The massive dependency stack is my biggest concern. It is simply unnecessary and unwieldy.

  3. Paul Hammant

    You want to keep one file (the one with ads in it) divergent from the origin?

    Cherry-pick the commit you don’t want to that file first, but choose --strategy=ours for the merge. Following that, merge the rest in the normal way.

  4. David W

    I don’t see what the big deal is from either side here. Go ahead and create your fork. Fix up the small things that are annoying you. Advertise the existence of the fork, offer up (clean) patches.

    If upstream still isn’t accepting those patches (e.g. a fix for your complaint about manually editing a file to disable ads) before, say, their next public release, then your fork was legitimate. Otherwise, just abandon the fork and praise the lords of open source licensing that you had a second option to wield at ketralnis.

  5. JC Brand

    Glad to read that someone else had a similar experience.

    I would be interested in a well-maintained fork with more generic defaults and fewer dependencies.

    I’ve looked at the reddit code to build a small community site recently but gave up eventually to instead create the site from scratch.

  6. Rick M

    I tried setting up an instance of reddit myself about 18 months ago and with my low level of technical expertise found it all but impossible.

    I now have many more reasons to set up instances of reddit (I have some experiments I’d like to run) but I don’t even consider trying again, my thoughts at the moment involve improvising my ideas so that I could do something with sub-reddits on reddit.com (decidedly sub-optimal for a number of reasons).

    So, I’d definitely be interested in making use of a cleaned up version of reddit’s source… but I don’t have the skills to help make it happen.

  7. Hrishi B

    I also thought of running a reddit clone for a website. I did not do it because modifying reddit’s code to remove all trademarks and “reddit.com” specific code seemed daunting.

    A fork would be useful even for reddit.com since they will also have access to the patches/bug fixes.

  8. Kevin

    If it’s going to take a fork, a new open source project, and new management to have a quality source code base of Reddit, then that is what should happen.

  9. alive1

    Popping in to support a fork :) I don’t see what the big deal is from either side. Why is Reddit so opposed to a fork? Why is it such a big deal? They are obviously not interested in supporting a “Reddit the OSS Software” because they are busy with supporting “Reddit The Site”. A fork could give VALUABLE contributions to both sides. The community gets a reddit software it can use, and reddit gets golden input on features for reddit the site.

    Why is it so bad? People want to run reddit for themselves, then let them. It’s an awesome software package! There’s none other like it!

    If people can write patches for reddit for their own benefit, then these patches will be better tested and better written. More patches will be written as it will make sense to write them, because you can use them, and know they /will/ be used instead of having to first be approved by someone before it’ll see any production environment. A fork doesn’t have to be a competitor, it can be a cooperator.

    A fork could be a valuable benefit to Reddit Website. And to the community.

  10. iWebAll

    Why depend on Reddit when you can have your own code in place and gain all the benefits for yourself? Just write your own code. Technology these days is so advanced that the only reason you’d want to use third-party code is if you don’t wanna work on writing your own stuff…
