News

A data-driven guide to creating successful Hacker News posts

Some time ago, Randal Olson analyzed how to create successful Reddit posts in a data-driven way. I thought it would be interesting to do the same for Hacker News, a popular social news portal that covers more than just tech. I first came across it when my Xerox article made it to its front page, causing quite a bit of server load on this blog. Hacker News matters a lot on the tech web: a popular post there can instantly give you the critical mass of page impressions needed, for example, to make something go viral.

To write a data-driven post, the most important thing you need is … data (no shit, Sherlock!). So lots of thanks go to Shital Shah, who downloaded all Hacker News posts since 2006 and made them available for download as a big-*ss JSON file. Thanks, Shital! :-) From the JSON file we can read a whopping 1,333,789 posts.

The remainder of the article is structured as follows. First, for readers in a hurry, I will analyze when and what to post in order to improve your odds of getting a popular post. This is done on the more recent part of the data, namely all posts since 2013. Second, I will try to derive some possible explanations for these popularity observations by taking a coarse-grained look at when and what HN users post. Third, I will go further and analyze the behavior of HN users in a more general way, considering the whole dataset and not only the recent posts. I will also look at how the users' behavior changed over time.
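To make the first step a bit more concrete, here is a minimal Python sketch of the kind of filtering and grouping described above: keep only posts since 2013, then look at the median score per posting hour. The field names `created_at` and `points`, the timestamp format, and the sample records are assumptions made for illustration. They are not the confirmed schema of the actual JSON dump, and this is not necessarily how the analysis in the article was done.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import median

# Hypothetical sample records. The field names "created_at" and "points"
# are assumptions, not the confirmed schema of the real dump.
SAMPLE_POSTS = [
    {"title": "A", "created_at": "2012-11-03T09:15:00Z", "points": 40},
    {"title": "B", "created_at": "2013-02-10T14:05:00Z", "points": 3},
    {"title": "C", "created_at": "2013-06-21T14:40:00Z", "points": 120},
    {"title": "D", "created_at": "2014-01-05T02:30:00Z", "points": 1},
]


def parse_time(stamp):
    """Parse an ISO-like UTC timestamp (assumed format) into a datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def recent_posts(posts, since_year=2013):
    """Keep only posts created in since_year or later."""
    return [p for p in posts if parse_time(p["created_at"]).year >= since_year]


def median_points_by_hour(posts):
    """Group posts by UTC posting hour and return the median score per hour."""
    by_hour = defaultdict(list)
    for post in posts:
        by_hour[parse_time(post["created_at"]).hour].append(post["points"])
    return {hour: median(points) for hour, points in sorted(by_hour.items())}


if __name__ == "__main__":
    recent = recent_posts(SAMPLE_POSTS)
    print(len(recent), "posts since 2013")
    print(median_points_by_hour(recent))
```

The median is used here rather than the mean because a handful of very popular posts would otherwise dominate each hour's value; that is a design choice of this sketch, not a claim about the original analysis.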

Disclaimer: Please be aware that, like Randal, I am only making statements about probability in this post. Following these guidelines will by no means 100% guarantee that you will have a successful post.

Even more important disclaimer: As an HN user points out: “You get popular by posting content that conforms to the group-think and avoid content that doesn't”. Another user states: “yes, what the world needs now is more and more people who care about marketing their brand”. Both are right and point to the most important thing. What I show here is a data-driven analysis that may increase your odds by a few percent, but it won't get you anywhere if you post nonsense or otherwise behave like a dipshit.

Video and slides of FrOSCon 2015 talk "Lies, damned lies and scans"

Here, as promised, are the slides (PDF, 3.8MB) and the YouTube video of my talk “Lies, damned lies and scans” at FrOSCon 2015 in St. Augustin, Germany! Thanks for inviting me, it was a lot of fun. You can find the original Xerox saga information here.

Congrats again to FrOSCon on its 10-year anniversary, and I feel honoured by the celebrity in the audience! 8-)

Next Talk: August 22nd, FrOSCon, "Lies, damned lies and scans"

On August 22nd, I will be presenting the Xerox saga at FrOSCon. Looking forward to it :-).

The program says that I will be starting on Saturday at 5:45 PM, so I will be closing the track for that day. I promise an entertaining session. The language will be English this time, and as far as I have been told, there will be streams and videos again.