My Digital Identity — Reddit

@PatrickYoon
6 min read · Jul 26, 2020

This is a piece I wrote for a university English course that focused on how algorithms are derived and how they affect the real world. The project was about gathering large amounts of data about yourself and visualizing it to show how we make choices when reading and writing online.

I’ve heard stories of people closing Instagram or Twitter on their phones and unconsciously reopening the app right away. For me, that app is Reddit. I will be on my computer working on homework, then open a new tab and go to Reddit without thinking about it. What I don’t realize is that I already have Reddit open on my second and third monitors, plus another two tabs on my main monitor. When I first heard about this project, I knew I wanted to do something with Reddit, since it is the only site I have used continuously since middle school, and I was curious what trends it would show about my life. I wanted to know whether my subreddit usage noticeably followed my interests, or whether it was still dominated by the League of Legends subreddit (a video game that I started in middle school and still actively play after seven years). Since I spend more time lurking than posting, I was also curious whether something specific about a submission prompts me to respond: maybe I am more likely to post in certain subreddits, or I prefer posting in subreddits with less user activity. Regardless, I knew that I wanted to gather the data from my Reddit history.

Unfortunately, the Reddit API is quite limited, so I had to get a little creative. With a simple GET request, the most you can get is basic information about a user, submission, or comment, and nothing the API returns is unavailable on the site itself. In fact, the API gives you less: you cannot pull every post from a user at once, whereas on the site you can manually page back to the beginning of a user’s submission history. After doing some research, I came across pushshift.io, which can retrieve older and larger amounts of Reddit data and offers many more parameters to narrow down what I was looking for.
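As a rough sketch of what collecting a full comment history through Pushshift might look like, the snippet below pages backwards through results using a `before` timestamp. The endpoint URL and the `author`, `size`, and `before` parameters reflect how Pushshift's comment search was commonly used around 2020 and should be treated as assumptions, as should the helper names `build_url`, `fetch_page`, and `fetch_all_comments`. It is an illustration rather than the script behind the charts in this post, and access to Pushshift may have changed since.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: Pushshift's comment search as commonly used around 2020.
PUSHSHIFT_URL = "https://api.pushshift.io/reddit/search/comment/"


def build_url(author, before=None, size=100):
    """Build the query URL for one page of a user's comments."""
    params = {"author": author, "size": size}
    if before is not None:
        # Only ask for comments older than this Unix timestamp.
        params["before"] = before
    return PUSHSHIFT_URL + "?" + urlencode(params)


def fetch_page(url):
    """Download one page and return its list of comment dicts."""
    with urlopen(url) as response:
        return json.load(response).get("data", [])


def fetch_all_comments(author, fetch=fetch_page):
    """Page backwards through a user's history until a page comes back empty."""
    comments = []
    before = None
    while True:
        page = fetch(build_url(author, before))
        if not page:
            break
        comments.extend(page)
        # Continue from the oldest comment seen so far.
        before = min(c["created_utc"] for c in page)
    return comments
```

Passing the downloader in as the `fetch` argument keeps the paging logic separate from the network call, so it can be tried against canned data before pointing it at a live service.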

The first data I gathered was the number of comments I had made per subreddit over my entire Reddit history. I had a feeling that the League of Legends subreddit would be my most commented, as it’s what got me into Reddit and I still check it daily, but I was curious about the distribution of comments between the League of Legends subreddit and the other subreddits I use.

Here is the distribution of my comments per subreddit in a pie chart. According to the API, I’ve made exactly 426 comments from the beginning of my account to when I collected this data. That number is actually quite small given how often and how long I have been using Reddit. If you do the math, it’s only about 5 comments per month. As expected, the League of Legends subreddit takes the majority of the distribution with an astounding 62%. The rest of the pie chart consists of whatever I was interested in at the time or subreddits that promote a lot of engagement. At certain points throughout my life, I would be super interested in something such as Minecraft, high school band, or TWICE (music group), so I would go on their respective subreddits and be involved for a couple of months before going back to the League of Legends subreddit. There are also subreddits that encourage commenting and user activity, such as r/RandomKindness (users can request small things for free), r/User_Simulator (a bot that uses machine learning to mimic your posts), and r/chanceme (users ask others to estimate their chances of getting into a particular college). I included an “Other” category because there were about 25 subreddits where I had made only one comment, and they would have cluttered the pie chart and made it unreadable. After taking a look at this pie chart, I was interested in which subreddits I was most active in each month, since the pie chart doesn’t really display my activity over time. Using the API, I gathered the most commented subreddit for each month since the start of my account back in October 2013.
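For anyone curious how a breakdown like this could be computed, here is a minimal sketch that counts comments per subreddit and folds every subreddit with a single comment into “Other”. The function name `subreddit_shares`, the `min_count` threshold, and the sample list are all made up for illustration; the sample is not my real history, and the month count in the last lines assumes the data was collected around the time this post was published.

```python
from collections import Counter


def subreddit_shares(subreddits, min_count=2):
    """Return {label: percent of all comments}, folding rare subreddits into 'Other'."""
    counts = Counter(subreddits)
    total = sum(counts.values())
    grouped = Counter()
    for name, n in counts.items():
        # Subreddits below the threshold are merged into one 'Other' slice.
        label = name if n >= min_count else "Other"
        grouped[label] += n
    return {label: 100 * n / total for label, n in grouped.most_common()}


# Made-up sample: one subreddit name per comment.
sample = ["leagueoflegends"] * 6 + ["Minecraft"] * 2 + ["twice", "chanceme"]
shares = subreddit_shares(sample)
print(shares)

# Average comments per month: 426 comments over roughly 81 months
# (October 2013 to about July 2020; the collection date is an assumption).
print(round(426 / 81, 1))
```

The resulting dictionary maps directly onto pie chart labels and slice sizes.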

Here are eight horizontal bar charts, one per year, showing my most commented subreddit for each month and the number of comments I made there. I think these charts show a lot about my posting habits over the years. You can see what I was like at each stage just by looking at the trends. For example, the summers of my middle school years (2014–2015) primarily consisted of League of Legends. You can even see when I didn’t post as much on Reddit, such as the summer of 2016, when I got into my first romantic relationship, or the summer of 2019, when I had just graduated high school and spent every day out with friends. Other noticeable trends line up with joining the marching band (early 2017), applying to colleges (late 2018 to early 2019), and the beginning of the pandemic (when I frequently visited the Northeastern University subreddit).
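The per-month view behind charts like these can be sketched as a group-by: bucket each comment by year and month, then keep the subreddit with the highest count in each bucket. The sketch assumes each comment is a dictionary with a `created_utc` Unix timestamp and a `subreddit` name; the function `top_subreddit_by_month`, the helper `ts`, and the sample comments (including the subreddit names) are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def top_subreddit_by_month(comments):
    """Map 'YYYY-MM' to (most commented subreddit, comment count) for that month."""
    per_month = defaultdict(Counter)
    for c in comments:
        when = datetime.fromtimestamp(c["created_utc"], tz=timezone.utc)
        per_month[when.strftime("%Y-%m")][c["subreddit"]] += 1
    # most_common(1) returns a one-item list holding the (name, count) pair.
    return {month: counts.most_common(1)[0] for month, counts in sorted(per_month.items())}


def ts(year, month, day):
    """Unix timestamp for midnight UTC on the given date."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()


# Made-up comments for illustration only.
sample_comments = [
    {"created_utc": ts(2014, 7, 3), "subreddit": "leagueoflegends"},
    {"created_utc": ts(2014, 7, 9), "subreddit": "leagueoflegends"},
    {"created_utc": ts(2014, 7, 20), "subreddit": "Minecraft"},
    {"created_utc": ts(2017, 2, 11), "subreddit": "marchingband"},
]
monthly = top_subreddit_by_month(sample_comments)
print(monthly)
```

Months with no comments simply do not appear in the result, which matches the quiet stretches visible in the charts.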

While it’s easy to spot trends in which subreddits I tend to engage with and why, it’s not clear what actually prompts me to post. This pushed me to gather more data. Here are two graphs that aren’t quite as interesting but are still informative.

As you can see, neither graph shows any correlation with upvotes. It’s reassuring to know that the upvotes on my comments are affected by neither when I posted nor how long the comment is (it is actually a meme on Reddit that one-liner comments do significantly better than well-thought-out paragraphs).
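One way to put a number on “no correlation” would be the Pearson correlation coefficient, which ranges from -1 to 1 and sits near 0 when two quantities have no linear relationship. The sketch below is only an illustration and not the analysis behind the graphs; the function name `pearson` and the sample lengths and upvote counts are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up numbers: comment length in characters and upvotes received.
lengths = [12, 340, 58, 95, 210, 33]
upvotes = [4, 2, 1, 7, 3, 2]
print(round(pearson(lengths, upvotes), 2))
```

The same function works for the time-based graph by passing comment timestamps instead of lengths.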

So there is no specific habit or trend that gets me to comment more, but learning about Foucault’s “panopticon” has me thinking about Reddit’s anonymity. Since anyone can make a Reddit account without tying their real life to it (unlike mainstream social media platforms), anyone can post and comment whatever they desire without putting their real-life identity on the line. To me, this makes sense, as I comment significantly more on Reddit than on sites such as Facebook or Instagram, where my real name and picture are publicly tied to my posts. However, Cheney-Lippold claims that we as users are powerless to break away from our online identities. This is quite scary to imagine: Reddit and other anonymous sites are supposed to allow us to be our true selves, yet still having our data sold and tied to some digital identity feels quite limiting. As Beck says, “We may never be able to realize a web where tracking is not part of the package, but being educated about tracking technologies and how to limit the data stored and shared helps protect us” (Beck 130).

Overall, Reddit is a site I visit frequently but rarely comment or post on. There isn’t a specific subreddit or pattern that pushes me to comment, and it likely depends on the specific submission that I’m responding to. While writing this over the past week, I became more self-conscious about my Reddit usage. I want to know what urges me to respond and to search for different subreddits, but I’m sure the companies behind these platforms can already predict that from the digital identity I have left on Reddit.
