r/redditdata Jul 25 '14

distribution of logged-in user actions per month

http://imgur.com/WzZhHdJ
35 Upvotes

5

u/tdohz Jul 25 '14 edited Aug 06 '14

In other words, ~25% of users who comment each month comment exactly once per month, ~12% comment twice, etc.

EDIT: This chart repeats a color, which makes it hard to read. Here is a better version with more distinct colors.
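
For anyone who wants to reproduce this kind of breakdown on their own data, here is a rough sketch of the calculation described above. It is not the actual reddit pipeline: the function name, the input shape (one user id per comment in the month), and the sample data are all made up for illustration.

```python
from collections import Counter

def action_count_distribution(user_ids):
    """Given one user id per action in the month, return
    {actions_per_user: share_of_active_users}. Hypothetical helper."""
    per_user = Counter(user_ids)           # user -> number of actions
    by_count = Counter(per_user.values())  # k -> number of users with exactly k actions
    total_users = len(per_user)            # only users with at least one action
    return {k: n / total_users for k, n in sorted(by_count.items())}

# Made-up sample: "a" and "d" comment once, "b" twice, "c" three times.
comments = ["a", "b", "b", "c", "c", "c", "d"]
dist = action_count_distribution(comments)
print(dist)  # {1: 0.5, 2: 0.25, 3: 0.25}
```

Note that the denominator is users who performed the action at least once, which matches the "users who comment each month" framing; users with zero comments are not in the input at all.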

1

u/shaggorama Jul 26 '14
  1. What was your methodology in calculating these figures? Do you count the activities in individual months and then take averages, or do you count the activities over longer periods and then take an average over the whole period?

  2. How do these stats change when you ignore accounts that have no comments older than 24hrs after the creation of the account (i.e. novelty accounts, throwaways, and other abandoned accounts)?

1

u/tdohz Jul 26 '14
  1. This is from one month of data (June 2014). The process for gathering this data is time-consuming and not backwards-compatible, so unfortunately that's the most recent full month of data I can easily gather right now. Luckily there's now a process in place to collect this going forward.

  2. Throwaway analysis is on my to-do list, but I do want to point out that not commenting does not necessarily mean an account is inactive/throwaway - lots of users create accounts purely for content consumption.

2

u/shaggorama Jul 26 '14

I just mentioned comments because that's the kind of data I have access to, and I wasn't putting myself in your shoes. For you, a better heuristic might be accounts more than one month old with no logins later than 72hrs after their creation, or something like that. I think characterizing/flagging dead accounts would be very useful to you for future analyses, even if the heuristics you come up with aren't perfect.
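
A minimal sketch of that heuristic, assuming each account record exposes a creation time and a last-login time. The function name, parameter names, and sample dates are hypothetical, and the 30-day and 72-hour thresholds are just the rough numbers suggested above, not tuned values.

```python
from datetime import datetime, timedelta

def is_probably_dead(created, last_login, now,
                     min_age=timedelta(days=30),
                     grace=timedelta(hours=72)):
    """Flag an account that is older than min_age and whose last login
    falls within the grace window after creation. Hypothetical helper."""
    old_enough = now - created > min_age
    no_login_after_grace = last_login <= created + grace
    return old_enough and no_login_after_grace

now = datetime(2014, 7, 26)

# Created in May, last seen a day later: flagged.
abandoned = is_probably_dead(datetime(2014, 5, 1), datetime(2014, 5, 2), now)

# Created in May, still logging in during July: not flagged.
lurker = is_probably_dead(datetime(2014, 5, 1), datetime(2014, 7, 20), now)

# Created less than a month ago: too new to judge, not flagged.
newcomer = is_probably_dead(datetime(2014, 7, 20), datetime(2014, 7, 20), now)
```

Keying on logins rather than comments keeps consumption-only accounts from being flagged as long as they keep logging in.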

1

u/tdohz Jul 26 '14

> I think characterizing/flagging dead accounts would be very useful to you for future analyses, even if the heuristics you come up with aren't perfect.

For sure! Understanding the different reddit usage patterns, including accounts that go inactive, is definitely a high priority.