Top TIL contributor is a bot, here's what the data found

User Possible_Urban_King on Reddit is actually a machine learning bot. The programmer fed a machine learning algorithm all of Wikipedia and every TIL thread along with its upvote count. The machine then suggested Wikipedia articles that should do well on TIL. The account posted on TIL for 90 days. Out of 50 posts total, 4 went to the top of TIL and 3 made it to the front page. Here's the full story: http://www.buzzfeed.com/hamzashaban/today-ai-learned
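The article doesn't say which model the author actually used, so treat this as a toy sketch of the general idea and not his method: learn which title words tend to come with high upvotes, then rank unseen candidate titles by those word scores. The function names and the sample data in the usage note are made up for illustration.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a title and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_word_scores(posts):
    """posts: iterable of (title, upvotes) pairs.

    Returns word -> mean log-upvotes of the posts containing that word.
    The log keeps a single viral post from dominating the averages.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for title, upvotes in posts:
        for word in set(tokenize(title)):
            totals[word] += math.log1p(upvotes)
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


def score_title(title, word_scores, default=0.0):
    """Average the learned word scores over the title (unseen words get default)."""
    words = tokenize(title)
    if not words:
        return default
    return sum(word_scores.get(w, default) for w in words) / len(words)


def rank_candidates(candidates, word_scores):
    """Sort candidate titles from most to least promising."""
    return sorted(candidates, key=lambda t: score_title(t, word_scores), reverse=True)
```

Usage would look like `rank_candidates(wikipedia_titles, train_word_scores(til_posts))`, where `til_posts` is whatever (title, upvotes) history you managed to collect. A real attempt would need far more data and a proper model, but the shape of the pipeline is the same.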

Key points:
* "Nazis, racism, and unsolved grisly murders" make the best TILs, to quote the author
* The author made a graph showing the correlation between upvotes and the time a post was made (no causation suggested): http://imgur.com/a/GQpC8 https://www.reddit.com/r/dataisbeau...time_of_day_to_post_to_til_a_breakdown_posts/ I'm guessing it's 10 am EST, as the author is from DC. Makes sense, since the post then has the whole day to get shared and Reddit threads die fast.
* The author states that the title plays a bigger role than posting time (duh)
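If you want to redo the time-of-day breakdown on your own data, here's a rough sketch. It assumes you already have (Unix timestamp, upvotes) pairs for each post, and it buckets by UTC hour because the time zone of the author's graph isn't stated. The function name is made up, and like the original graph this only shows correlation.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import median


def median_upvotes_by_hour(posts):
    """posts: iterable of (created_utc_seconds, upvotes) pairs.

    Returns {hour_of_day_utc: median upvotes}, sorted by hour.
    Median is used instead of mean so a few front-page hits don't skew a bucket.
    """
    by_hour = defaultdict(list)
    for created_utc, upvotes in posts:
        hour = datetime.fromtimestamp(created_utc, tz=timezone.utc).hour
        by_hour[hour].append(upvotes)
    return {hour: median(values) for hour, values in sorted(by_hour.items())}
```

Shift the hours yourself if you want EST or any other zone, and keep an eye on how many posts land in each bucket before trusting it.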

Obviously, applying the same algo to memes or any other category would surface undiscovered yet potentially viral topics with less time and work, and far more efficiently than browsing the web yourself.

This is traffic leaking to the next level (and maybe the future).
 
Here's where the real challenge comes in...

Yeah, when you're spamming /r/TIL with Wikipedia links, nobody notices. Try hitting your own site over and over and watch the resistance rise. Also, Wikipedia has the world's largest and highest-quality set of user-generated content, with active editing and moderation going on. Can you produce that on your site?

Or are you going to build enough sites to spread your impact and hide to avoid the resistance? Are you going to try (keyword: try) to game Reddit with more accounts?

It's a cool idea on the surface. But how do you make it pragmatically useful instead of just being about internet points?

When you're ready on the backend to pursue this kind of mass promotion, you'll be ready to get even better results by employing human hands and eyes.
 
It still shows you the type of content people are interested in, and Reddit is a large sample. At the very least you get insight into what Reddit's demographic wants to read.

As for leveraging this to promote your own site specifically, you're right, that's where it gets difficult on Reddit. You've got a personal Reddit army, promoting sites that carry your content (microsites/parasites), and a lot more. It's not a huge barrier to entry at all, which is why I enjoy it at least.

My issue with Reddit traffic, though, is that the users are generally incredibly cheap, quick to judge, and easily offended. There are people on Reddit who would read this comment about marketing there and think I'm the devil.

Analytics always rated the traffic as 'someone who would be interested in employment', which might give some people ideas.
 
@juliantrueflynn That sounds like Reddit alright.

Reddit: liberal neckbeards who like esoteric discussions (e.g. jury nullification)
4Chan: individuals who think anti-social behaviors are cool (e.g. 4chan raids)
9GAG: immature and childish people who like cheap laughs. It was created by someone in China, so that influenced its culture (e.g. their motto is "go fun yourself")
Digg: If The Atlantic were an online portal, it would be Digg
YouTube: It's a really wide category, too wide to sum up, but they do like quick scenes when editing videos.

Eh, I can go on. Here's a cool drawing that sums it up surprisingly well

http://www.ryantan.net/content/netizens.jpg

Did he publish the source code anywhere?
Not that I know of, but he did do (or tried to do) an AMA. Since he's a student, I'm sure he'd give you the code for free, out of naivety or because he's flattered that someone's interested in his work.
 
I've always seen Reddit as a great environment for speech recognition and machine learning. There are millions of comments and linked responses about every subject and in every context. If you could collect all of this data, normalize each string to a floating-point representation, and use karma as a relevance weight, you'd be able to fabricate accurate conversation. It would take a user's comment as input, normalize it the same way, then find the response with the smallest distance between the two floating-point values.
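Here's a rough sketch of that retrieval idea, with one swap worth flagging: a single float per string can't capture much, so this uses a vector of floats (plain word counts) and cosine distance instead. The karma weighting formula and all the names are my own assumptions for illustration, not anything from a real Reddit library.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a comment into a sparse word-count vector (word -> float)."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return {word: float(count) for word, count in counts.items()}


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse vectors; 1.0 if either is empty."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def best_reply(user_comment, pairs):
    """pairs: list of (parent_comment, reply, reply_karma).

    Picks the reply whose parent comment is closest to user_comment,
    with the distance discounted for higher-karma replies (assumed weighting).
    """
    query = vectorize(user_comment)

    def cost(pair):
        parent, _reply, karma = pair
        weight = 1.0 + math.log1p(max(karma, 0))
        return cosine_distance(query, vectorize(parent)) / weight

    return min(pairs, key=cost)[1]
```

Dividing distance by a karma weight means a very popular reply can beat a slightly closer match, which may or may not be what you want. A stricter version would filter by distance first and only use karma to break ties.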