AI Models Leverage Language Model Sentiments for Enhanced Stock Market Forecasting

TLDR: A new study demonstrates that combining sentiment scores from ten large language models (LLMs) with minute-level stock data significantly improves short-term stock prediction. The Mamba deep learning model consistently outperformed the Reformer model, achieving the lowest error rate when paired with LLaMA 3.3–70B, highlighting the effectiveness of integrating LLM-based semantic analysis with efficient temporal modeling for real-time financial forecasting.

Predicting the stock market, especially in the short term, is notoriously challenging due to its high volatility, constant news cycles, and the complex, non-linear nature of financial data. However, new research is exploring how the power of large language models (LLMs) can be harnessed to improve these predictions, even down to the minute level.

A recent study by Lokesh Antony Kadiyala and Amir Mirzaeinia from the University of North Texas introduces a novel framework that combines semantic sentiment scores from ten different LLMs with minute-interval intraday stock price data. Their goal was to enhance the accuracy of minute-level stock predictions, focusing specifically on Apple Inc. (AAPL) stock prices and news articles from April 4 to May 2, 2025.

The core idea involves using advanced LLMs like DeepSeek-V3, various GPT models, LLaMA, Claude, Gemini, Qwen, and Mistral to analyze financial news articles. Each article received a sentiment score from all ten LLMs, scaled to a range of 0 to 1 (0 being strongly negative, 1 strongly positive). These sentiment scores were then integrated with traditional stock price data and technical indicators such as the Relative Strength Index (RSI), Rate of Change (ROC), and Bollinger Band Width.
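To make the three indicators concrete, here is a minimal sketch of how they could be computed from a list of 1-minute closing prices. The article does not give the study's exact formulas or window lengths, so the function names, the `period` arguments, and the simple-average form of RSI (rather than Wilder's smoothed version) are assumptions for illustration only.

```python
import math


def rate_of_change(closes, period):
    """Percent change versus the close `period` bars earlier.

    Entries without enough history are left as None.
    """
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        out[i] = (closes[i] - closes[i - period]) / closes[i - period] * 100.0
    return out


def rsi(closes, period):
    """Simple-average RSI over the last `period` price changes (assumed variant)."""
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        gains = 0.0
        losses = 0.0
        for j in range(i - period + 1, i + 1):
            change = closes[j] - closes[j - 1]
            if change > 0:
                gains += change
            else:
                losses -= change
        if losses == 0:
            # Convention chosen here: all gains -> 100, completely flat -> 50.
            out[i] = 100.0 if gains > 0 else 50.0
        else:
            rs = gains / losses
            out[i] = 100.0 - 100.0 / (1.0 + rs)
    return out


def bollinger_band_width(closes, period, num_std=2.0):
    """(upper band - lower band) / middle band, using the population std."""
    out = [None] * len(closes)
    for i in range(period - 1, len(closes)):
        window = closes[i - period + 1 : i + 1]
        mean = sum(window) / period
        std = math.sqrt(sum((c - mean) ** 2 for c in window) / period)
        out[i] = (2.0 * num_std * std) / mean
    return out
```

Each function returns a list aligned with the input prices, so the values can be placed next to the sentiment score for the same minute.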

To process this dataset, the researchers employed two deep learning sequence models: Reformer and Mamba. Each model was trained separately on each LLM’s sentiment scores, so every model and LLM pairing could be compared directly. Hyperparameters for both models were optimized using Optuna, and performance was evaluated over a three-day period.

Mamba consistently outperformed Reformer, both in speed and in prediction accuracy, across all ten LLMs tested. Mamba achieved its best performance when combined with LLaMA 3.3–70B, yielding the lowest reported error, 0.137. While Reformer was capable of identifying broader trends within the data, it tended to smooth over sudden changes indicated by the LLMs, making it less responsive to rapid market shifts.

This research highlights the potential of integrating LLM-based semantic analysis with efficient temporal modeling to improve real-time financial forecasting. It suggests that the context, emotion, and intent that LLMs extract from financial news can offer a more precise signal for market direction, especially when paired with a model such as Mamba that is designed to handle long, noisy, and fine-grained time series data.

The study involved a meticulous data collection process, gathering financial news articles about Apple Inc. and 1-minute interval stock prices using the News API and Polygon.io API, respectively. Timestamps were carefully aligned, and articles published outside trading hours were adjusted. Feature engineering played a crucial role, incorporating various technical indicators and temporal encodings (like minute of the day, minute offset, and sine/cosine transformations of the minute index) to enrich the dataset.
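The temporal encodings mentioned above can be sketched as follows. This is an illustration, not the study's code: the article does not state the period used for the sine and cosine transformations, so the 09:30 market open, the 390-minute regular session, and the feature names are assumptions.

```python
import math
from datetime import datetime

MARKET_OPEN_MINUTE = 9 * 60 + 30  # 09:30, assumed regular US session open
SESSION_MINUTES = 390             # 09:30 to 16:00, assumed session length


def temporal_features(ts):
    """Build minute-level time features for one bar timestamp (exchange local time)."""
    minute_of_day = ts.hour * 60 + ts.minute
    # Minutes elapsed since the assumed market open.
    minute_offset = minute_of_day - MARKET_OPEN_MINUTE
    # Map the position within the session onto a circle so the encoding is smooth.
    angle = 2.0 * math.pi * minute_offset / SESSION_MINUTES
    return {
        "minute_of_day": minute_of_day,
        "minute_offset": minute_offset,
        "minute_sin": math.sin(angle),
        "minute_cos": math.cos(angle),
    }


features_at_open = temporal_features(datetime(2025, 4, 4, 9, 30))
```

The sine and cosine pair gives the model a continuous notion of time of day, which a raw minute counter alone does not provide.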

Both Mamba and Reformer models were configured to accept input sequences of 60 consecutive minutes, each represented by a vector of 10 features (9 engineered indicators and 1 sentiment score). The models then predicted the closing price for the subsequent minute. Mamba, based on state-space modeling, proved particularly adept at capturing long-range dependencies with linear complexity, making it ideal for high-resolution time series. Reformer, a Transformer variant, used locality-sensitive hashing to handle long sequences efficiently.
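The input layout described above, 60 consecutive minutes of 10 features each with the next minute's close as the target, can be sketched as a sliding window. The function name `make_windows` and the plain-list data layout are hypothetical; the study's actual data pipeline is not described in this article.

```python
def make_windows(features, closes, seq_len=60):
    """Pair each run of `seq_len` feature rows with the following minute's close.

    `features` is a list of per-minute feature vectors and `closes` is the
    list of closing prices for the same minutes.
    """
    if len(features) != len(closes):
        raise ValueError("features and closes must cover the same minutes")
    inputs, targets = [], []
    for end in range(seq_len, len(features)):
        inputs.append(features[end - seq_len : end])  # minutes end-60 .. end-1
        targets.append(closes[end])                   # close at minute `end`
    return inputs, targets


# Toy data: 65 minutes, 10 features per minute (9 indicators + 1 sentiment score).
toy_features = [[float(i)] * 10 for i in range(65)]
toy_closes = [100.0 + i for i in range(65)]
windows, next_closes = make_windows(toy_features, toy_closes)
```

With 65 minutes of data this yields 5 training examples, each a 60 x 10 block of inputs and a single target price.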

The results show Mamba’s superior responsiveness to minor market trends and price surges: its predictions stayed closely aligned with actual prices during rapid shifts. Reformer, while good at capturing broader trends, showed less sensitivity to quick fluctuations. The study paves the way for future research to extend the time period, include additional stocks, and further refine LLM prompts for financial texts. More details are available in the full research paper.
