TLDR: A research paper by Mateusz Kmak et al. investigated whether social media sentiment from Reddit’s r/wallstreetbets could predict stock prices for GameStop (GME) and AMC Entertainment (AMC). Surprisingly, sentiment scores, even those from a model fine-tuned on ChatGPT-annotated posts, showed only a weak correlation with stock prices. Instead, simpler metrics like the volume of comments and Google search trends exhibited stronger predictive signals and a symmetric (two-way) Granger-causal relationship with stock price changes. The study suggests that traditional sentiment analysis may not fully capture the nuances of market-moving online discussions.
The rise of social media has fundamentally reshaped how information, and even financial decisions, are shared among individuals. This shift became particularly evident during the 2021 GameStop short squeeze, where a surge of retail investor activity on platforms like Reddit’s r/wallstreetbets seemingly influenced stock prices. This phenomenon raised a crucial question: can sentiment derived from online discussions truly predict stock market movements?
A recent research paper, titled "Predicting stock prices with ChatGPT-annotated Reddit sentiment: Hype or reality?", delves into this very question. Authored by Mateusz Kmak, Kamil Chmurzyński, Kamil Matejuk, Paweł Kotzbach, and Jan Kocoń from the Department of Artificial Intelligence at Wrocław University of Science and Technology, the study investigates the influence of social media sentiment on the stock prices of two prominent ‘meme stocks’: GameStop (GME) and AMC Entertainment (AMC).
Methodology: Diving Deep into Online Discourse
To assess the role of online sentiment, the researchers employed a comprehensive approach. They collected a vast dataset of Reddit posts and comments mentioning GME and AMC from the r/wallstreetbets subreddit, spanning from January 4, 2021, to March 31, 2021, totaling about 7 million posts. Alongside this, they gathered Google search trends data for these companies.
For sentiment analysis, three methods were utilized: TextBlob, a general-purpose Python library for text processing; Financial-RoBERTa, a language model specifically fine-tuned for financial texts; and a newly developed, fine-tuned version of Financial-RoBERTa. This new model was uniquely trained using Reddit posts annotated by ChatGPT, allowing it to better interpret the informal language, slang, and prevalent emojis found in social media discussions. The paper highlights the significant role of emojis in the dataset, with common ones like the ‘rocket’ emoji (meaning ‘to the moon’) and ‘diamond hands’ conveying specific investor sentiments.
To determine predictive power, the study used several statistical metrics: the Pearson correlation coefficient and Kendall Tau correlation to measure linear and ordinal relationships, respectively, and Granger causality to assess whether past values of one time series help predict another. The analysis also considered modifications to stock price data, such as shifting it by one day (SHIFTED) or focusing on daily price changes (STATIONARY), to better capture dynamic relationships.
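To make these metrics concrete, here is a minimal sketch in plain Python. It is not the authors' code: the function names, the direction of the one-day shift, and the toy numbers are all assumptions made for illustration, and the Kendall Tau shown is the simple tau-a variant, which makes no correction for ties.

```python
import math

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / (n * (n - 1) / 2)

def shifted(signal, price, lag=1):
    """Assumed SHIFTED variant: pair the signal on day t with the price on day t + lag."""
    return signal[:-lag], price[lag:]

def stationary(price):
    """Assumed STATIONARY variant: day-over-day price changes."""
    return [b - a for a, b in zip(price, price[1:])]

# Toy numbers invented for illustration only (not data from the paper).
daily_comments = [120, 340, 900, 400, 250, 600]
closing_price = [20.0, 22.0, 35.0, 90.0, 50.0, 40.0]

print("raw:", pearson(daily_comments, closing_price),
      kendall_tau(daily_comments, closing_price))
sig, px = shifted(daily_comments, closing_price)
print("SHIFTED:", pearson(sig, px))
# Drop the first signal value so both series cover the same days.
print("STATIONARY:", pearson(daily_comments[1:], stationary(closing_price)))
```

Comparing the three printed values on real data is one way to see whether a signal lines up better with today's price, tomorrow's price, or the day's price change.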
Surprising Findings: Sentiment’s Weak Link
The results of the study presented some unexpected insights. For both GameStop and AMC, the sentiment values derived from all three analysis models showed only a very weak correlation with stock prices. This suggests that traditional sentiment analysis, even with advanced models, may not fully capture the complex factors influencing stock movements in these cases.
However, simpler metrics told a different story. The volume of comments on Reddit and Google search trends exhibited a stronger correlation with stock prices. Furthermore, Granger causality tests, particularly when applied to daily price changes (the STATIONARY modification), indicated a statistically significant Granger-causal relationship for these simpler metrics. Interestingly, this relationship appeared to be symmetric: increased online discussions and search trends could predict stock price movements, and conversely, stock price fluctuations also drove online engagement.
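The idea behind a Granger causality test can be sketched in a few lines: fit one regression that predicts today's value from its own past, fit a second that also includes the other series' past, and check how much the error shrinks. The sketch below is a simplified single-lag version written for illustration, not the procedure or lag choice used in the paper; the helper names and the toy series are assumptions. In practice a library routine such as `grangercausalitytests` from statsmodels would typically be used, since it also reports p-values.

```python
def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (rows include the intercept column)."""
    k = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y)
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(cause, effect):
    """F-statistic for 'cause helps predict effect' using a single lag."""
    y = effect[1:]
    lagged = range(len(effect) - 1)
    restricted = [[1.0, effect[t]] for t in lagged]              # own past only
    unrestricted = [[1.0, effect[t], cause[t]] for t in lagged]  # plus the other series
    rss_r = ols_rss(restricted, y)
    rss_u = ols_rss(unrestricted, y)
    n, k = len(y), 3
    return (rss_r - rss_u) / (rss_u / (n - k))  # one restriction tested

# Toy series invented for illustration; assumed to be already differenced.
comment_change = [1, 5, 2, 8, 3, 7, 4, 6]
price_change = [4.0, 1.1, 4.9, 2.1, 7.9, 3.1, 6.9, 4.1]

print("comments -> price:", granger_f(comment_change, price_change))
print("price -> comments:", granger_f(price_change, comment_change))
```

Running the test in both directions, as in the last two lines, is what allows a relationship to be described as symmetric. A large F-statistic only says that one series carries predictive information about the other; it does not establish real-world causation.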
For AMC, the study found that the number of emojis used in Reddit posts had a measurable causal effect on the stock price, suggesting that non-textual sentiment indicators might play a unique role in market dynamics. This implies that the concise, emotional expression of emojis could be a more immediate signal of investor sentiment compared to more complex text-based analyses.
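A daily emoji count of this kind is straightforward to build. The sketch below is only an approximation and not the paper's method: it treats every character in the Unicode "Symbol, other" category as an emoji, which catches common pictographs but also a few non-emoji symbols such as the copyright sign. The posts are invented examples.

```python
import unicodedata
from collections import Counter

def count_emojis(text):
    """Approximate emoji count: characters in Unicode category 'So' (Symbol, other)."""
    return sum(1 for ch in text if unicodedata.category(ch) == "So")

# Hypothetical (date, text) pairs, invented for illustration.
posts = [
    ("2021-01-27", "GME 🚀🚀🚀 to the moon"),
    ("2021-01-27", "holding 💎🙌"),
    ("2021-01-28", "AMC next? 🚀"),
]

# Aggregate per day so the counts form a time series comparable to daily prices.
daily_emojis = Counter()
for day, text in posts:
    daily_emojis[day] += count_emojis(text)

print(dict(daily_emojis))
```

The resulting per-day series can then be compared against price changes with the same correlation and Granger-style tests used for comment volume.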
Conclusion: Beyond Hype
The research concludes that while social media sentiment reflects investor emotions, it does not necessarily translate into a direct, strong predictor of stock price movements. The models, even the one fine-tuned with ChatGPT-annotated data, yielded statistically insignificant results for sentiment’s direct impact. Instead, simpler metrics like the sheer volume of comments and Google search interest proved to be more meaningfully correlated and causally related to stock prices, especially when considering daily price changes.
These findings underscore the complexity of retail investor behavior and the multifaceted nature of market-moving online discussions. The study suggests that future research should explore alternative methods for determining sentiment that might provide a stronger causal relationship with stock prices, and also investigate if these findings hold true for other companies and time periods.


