Reddit Data Collection Guide
This guide explains how to collect qualitative data from Reddit for research purposes.
Overview
Reddit is a valuable source of qualitative data because users share detailed personal experiences in topic-specific communities (subreddits). This guide covers best practices for systematic data collection.
Step 1: Set Up a Shared Google Sheet
Create a Google Sheet with the following columns:
| Column | Description |
|---|---|
| Post ID | Reddit post/comment ID for reference |
| Subreddit | Source subreddit |
| Date | When posted |
| Quote/Summary | Key content (paraphrase, don’t copy entire posts) |
| Category/Theme | Categorize the issue |
| Context | Industry, job type, demographics if mentioned |
| Notes | Your observations |
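If you would rather start from a file than type the headers by hand, the column list above can be turned into a one-row CSV that you import into Google Sheets. This is only a minimal sketch: the function name `make_template_csv` is a hypothetical helper, not part of any tool this guide relies on.

```python
import csv
import io

# Column names copied from the table above.
COLUMNS = [
    "Post ID",
    "Subreddit",
    "Date",
    "Quote/Summary",
    "Category/Theme",
    "Context",
    "Notes",
]


def make_template_csv() -> str:
    """Return CSV text that contains only the header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the header row; save the output as a .csv file and import it.
    print(make_template_csv(), end="")
```

Keeping the column names in one list means every team member starts from an identical sheet layout.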
Step 2: Identify Target Subreddits
Finding Relevant Subreddits
- Search Reddit for your topic keywords
- Identify subreddits where users share personal experiences
- Prioritize active subreddits with substantial post history
Example Subreddit Categories
- Work/Career: r/jobs, r/careerguidance, r/antiwork, r/AskHR
- Health: r/ChronicPain, r/mentalhealth, r/ADHD
- Parenting: r/Parenting, r/workingmoms, r/daddit
- Industry-Specific: r/nursing, r/Teachers, r/accounting
Step 3: Search Strategy
Using Reddit Search
Within your target subreddit, use the search bar to find relevant posts.
Keyword Examples
Adapt these to your research topic:
- Role-related: “work”, “job”, “boss”, “manager”, “HR”, “coworker”
- Policy-related: “leave”, “accommodations”, “remote work”, “policy”
- Experience-related: “told my”, “asked for”, “struggled with”
Search Tips
- Use quotes for exact phrases: “told my manager”
- Try variations: “boss” vs “supervisor” vs “manager”
- Sort results by relevance, by new, or by top to surface different posts
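The first two tips (exact phrases and variations) can be combined by generating one quoted query per variant before you start searching. The sketch below assumes a hypothetical helper, `build_queries`, and uses straight double quotes, since that is what you would type into the search bar.

```python
def build_queries(template: str, variants: list[str]) -> list[str]:
    """Fill each variant into the template and wrap the phrase in quotes."""
    return [f'"{template.format(term=variant)}"' for variant in variants]


if __name__ == "__main__":
    # One exact-phrase query per wording variation.
    for query in build_queries("told my {term}", ["boss", "supervisor", "manager"]):
        print(query)
```

Pasting each generated query into the subreddit search bar in turn keeps the variations systematic rather than ad hoc.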
Step 4: Data Collection Process
What to Look For
- Personal experiences - First-hand accounts rather than posts that only ask for advice
- Detailed stories - Posts with context and specifics
- Workplace interactions - How situations were handled
- Policy mentions - What helped or was missing
- Emotional content - Challenges, frustrations, successes
How to Record
- Paraphrase or summarize - Don’t copy entire posts verbatim
- Note the context - Industry, role, company size if mentioned
- Flag insightful posts - Mark particularly rich data for deeper analysis
- Record post ID - For reference and verification
Avoiding Duplicates
- Before adding a post, search your Google Sheet for its post ID, or for distinctive keywords from the post
- If working in a team, divide keywords or time periods
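If the sheet is exported to CSV, or rows are collected offline first, the duplicate check can be done on the Post ID column. The following is a minimal sketch under assumptions: `find_new_rows` is a hypothetical helper, rows are plain dictionaries keyed by the column names from Step 1, and IDs are compared after trimming whitespace and lowercasing.

```python
def find_new_rows(existing_ids, candidate_rows):
    """Return only the candidate rows whose Post ID is not already recorded."""
    # Normalize IDs so stray spaces or capitalization do not hide a duplicate.
    seen = {post_id.strip().lower() for post_id in existing_ids}
    new_rows = []
    for row in candidate_rows:
        post_id = row["Post ID"].strip().lower()
        if post_id in seen:
            continue  # already in the sheet, or repeated within this batch
        seen.add(post_id)
        new_rows.append(row)
    return new_rows


if __name__ == "__main__":
    existing = ["abc123", "def456"]  # made-up IDs for illustration
    batch = [
        {"Post ID": "ABC123 ", "Subreddit": "r/jobs"},
        {"Post ID": "ghi789", "Subreddit": "r/AskHR"},
        {"Post ID": "ghi789", "Subreddit": "r/AskHR"},
    ]
    print(find_new_rows(existing, batch))
```

Matching on Post ID does not catch cross-posts, which have different IDs, so a keyword search of the sheet is still worth doing.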
Step 5: Team Coordination (If Applicable)
Dividing Work
| Approach | Example |
|---|---|
| By keyword | Team member A searches keywords 1-5, B searches 6-10 |
| By time period | A collects 2024 posts, B collects 2025 posts |
| By subreddit | A focuses on r/subreddit1, B focuses on r/subreddit2 |
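The "by keyword" split can be worked out mechanically once the keyword list is final. This sketch assumes a hypothetical helper, `split_keywords`, which gives each member a contiguous block of keywords and hands any remainder to the first members in the list.

```python
def split_keywords(keywords, members):
    """Divide keywords into contiguous, near-equal blocks, one per member."""
    if not members:
        raise ValueError("at least one team member is required")
    base, extra = divmod(len(keywords), len(members))
    assignments = {}
    start = 0
    for index, member in enumerate(members):
        # The first `extra` members take one additional keyword each.
        size = base + (1 if index < extra else 0)
        assignments[member] = keywords[start:start + size]
        start += size
    return assignments


if __name__ == "__main__":
    keywords = [f"keyword {n}" for n in range(1, 11)]
    for member, block in split_keywords(keywords, ["A", "B"]).items():
        print(member, block)
```

With ten keywords and two members, A receives keywords 1-5 and B receives 6-10, matching the example in the table.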
Communication
- Meet regularly to discuss findings and avoid overlap
- Share interesting patterns you notice
- Update the shared sheet in real time as you collect data
Ethical Considerations
- Public Data Only - Only use posts from public subreddits
- Anonymity - Never include usernames in your final report
- Paraphrase - Summarize rather than quote extensively to protect privacy, since a verbatim quote can be searched to find the original post and its author
- No Direct Contact - Do not message Reddit users
- Research Purpose - Data is for academic analysis only
- IRB Compliance - Follow your institution’s research ethics guidelines
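To support the anonymity point, username mentions can be stripped from summaries before they reach a report. The sketch below is an assumption-laden illustration: `redact_usernames` is a hypothetical helper, and the pattern only catches mentions written as `u/name` or `/u/name`. It does not replace a manual read-through.

```python
import re

# Matches "u/name" or "/u/name" when not preceded by a letter, digit, or underscore.
USERNAME_PATTERN = re.compile(r"(?<!\w)/?u/[A-Za-z0-9_-]+")


def redact_usernames(text: str, placeholder: str = "[user]") -> str:
    """Replace Reddit-style username mentions with a neutral placeholder."""
    return USERNAME_PATTERN.sub(placeholder, text)


if __name__ == "__main__":
    # The usernames here are made up for illustration.
    sample = "As u/example_user1 said, and /u/another-one agreed."
    print(redact_usernames(sample))
```

Names written without the `u/` prefix, or other identifying details such as an employer and city, still need to be removed by hand.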
Step 6: Coding and Analysis
After data collection, analyze the data using qualitative methods.
Coding Process
- Open coding - Read through data, identify initial patterns
- Axial coding - Group related codes into categories
- Selective coding - Identify overarching themes
Creating a Thematic Table
Organize your findings into a table showing:
- 1st-order concepts (raw data patterns)
- 2nd-order themes (grouped categories)
- Aggregate dimensions (broader theoretical concepts)
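One way to keep the three levels consistent is to record each coded excerpt as a (concept, theme, dimension) triple and group the triples into a nested structure. The following is a sketch only: `build_thematic_table` is a hypothetical helper, and the labels in the usage example are invented placeholders, not findings.

```python
from collections import defaultdict


def build_thematic_table(coded_rows):
    """Group (concept, theme, dimension) triples as dimension -> theme -> concepts."""
    table = defaultdict(lambda: defaultdict(list))
    for concept, theme, dimension in coded_rows:
        concepts = table[dimension][theme]
        if concept not in concepts:  # list each 1st-order concept once
            concepts.append(concept)
    # Convert to plain dictionaries for printing or export.
    return {dimension: dict(themes) for dimension, themes in table.items()}


if __name__ == "__main__":
    # Placeholder labels for illustration only.
    rows = [
        ("concept 1", "theme A", "dimension X"),
        ("concept 2", "theme A", "dimension X"),
        ("concept 3", "theme B", "dimension X"),
    ]
    for dimension, themes in build_thematic_table(rows).items():
        for theme, concepts in themes.items():
            print(dimension, "|", theme, "|", "; ".join(concepts))
```

Each printed line corresponds to one row of the thematic table, read from right to left: concepts roll up into a theme, and themes roll up into a dimension.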
See Qualitative Research Paper Guide for detailed guidance on analysis and presenting findings.
Data Quality Tips
Signs of Good Data
- Detailed personal narratives
- Specific examples and situations
- Emotional authenticity
- Relevant to research question
Data to Skip
- Very short posts without detail
- Posts that are purely advice-seeking without experience sharing
- Duplicate or cross-posted content
- Posts that seem fabricated or trolling
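Of the criteria above, only post length lends itself to an automatic first pass; the others require reading the post. As a rough sketch, a word-count check can flag candidates for a closer look. The helper name `is_too_short` and the 50-word default are assumptions for illustration, not thresholds recommended by this guide.

```python
def is_too_short(text: str, min_words: int = 50) -> bool:
    """Flag a post whose word count falls below a chosen threshold."""
    return len(text.split()) < min_words


if __name__ == "__main__":
    print(is_too_short("Same thing happened to me."))
    print(is_too_short("word " * 60))
```

Treat a flag as a prompt to reread the post rather than a reason to discard it, since a short post can still contain a specific, usable detail.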
Checklist
- Target subreddits identified
- Google Sheet set up with appropriate columns
- Search keywords developed
- Team responsibilities assigned (if applicable)
- Data collection completed
- Duplicates removed
- Coding/analysis completed
- Thematic table created
Related Guides
- Qualitative Research Paper Guide - How to write up your findings
- Literature Review Guide - How to review existing research