
Unlocking Real-Time Trading: How Feature Decomposition Distills LLM Power into Small Models

TLDR: A new framework called Cooperative Market Making (CMM) enables Large Language Models (LLMs) to perform real-time market making by distilling their complex features into smaller, faster models. It uses a ‘Normalized Fluorescent Probe’ to understand how LLMs process financial data, revealing that different layers specialize in different predictions (mid-price, spread, volume) and that features vary with market volatility. Based on this, CMM employs ‘Orthogonal Feature Decomposition Distillation’ to break down LLM features across the layer hierarchy, task objectives, and market regimes, training a specialized small model for each. Finally, a ‘Hájek Projection-based Mixture-of-Experts’ dynamically integrates the small models’ outputs, assigning each expert a confidence score based on how closely it aligns with the consensus. This approach significantly boosts profitability, reduces risk, and offers superior speed and energy efficiency compared to traditional methods, making LLM-driven market making viable for real-time financial applications.

Large Language Models (LLMs) have shown remarkable potential in complex financial tasks like market making, where they can predict market movements and generate trading strategies with impressive accuracy. However, their sheer size and computational demands make them too slow for the subsecond latency required in real-time financial trading environments. This challenge has led researchers to explore knowledge distillation, a technique to transfer knowledge from a large ‘teacher’ model to a smaller ‘student’ model. Yet, existing distillation methods often focus on distilling to other LLMs, which are still too slow, or fail to effectively transfer the complex features of LLMs to much smaller, lightweight models.

A new research paper titled “Two Heads are Better than One: Distilling Large Language Model Features Into Small Models with Feature Decomposition and Mixture” by Tianhao Fu, Xinxin Xu, Weichen Xu, Jue Chen, Ruilong Ren, Bowen Deng, Xinyu Zhao, Jian Cao, and Xixin Cao from Peking University introduces a novel framework called Cooperative Market Making (CMM) to address these limitations. CMM aims to leverage the power of LLMs for market making while achieving the speed and efficiency of smaller models, making real-time application feasible. You can read the full paper here.

Understanding LLM Features with a Normalized Fluorescent Probe

The foundation of CMM lies in a deep understanding of how LLMs process financial information. The researchers developed a ‘Normalized Fluorescent Probe’ to analyze the internal workings of LLMs. This probe helps identify which parts of the LLM are responsible for different predictions and how they react to various types of market data. Their analysis revealed two crucial insights:

  • Different layers within an LLM specialize in different aspects of market prediction. For instance, shallow layers are more attuned to predicting the mid-price, middle layers focus on the spread, and deeper layers are geared towards forecasting total trading volume.
  • Even within the same layer, the LLM’s features behave differently when processing data from markets with varying volatility (e.g., low, medium, or high volatility).

These findings suggest that an LLM’s complex decision-making process can be broken down into simpler, more specialized components.
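As a toy illustration of the probing idea (not the paper's actual Normalized Fluorescent Probe), one can fit a simple ridge-regression readout on each layer's hidden states and score how much of a target quantity it explains. Here the layer activations and the mid-price target are simulated so that the "shallow" layer carries the signal:

```python
import numpy as np

rng = np.random.default_rng(0)

def probe_layer(hidden, target, reg=1e-2):
    """Fit a ridge-regression readout on one layer's hidden states and
    return its R^2 on held-out data, a proxy for how much that layer
    'knows' about the target quantity."""
    n = len(target)
    split = n // 2
    X_tr, X_te = hidden[:split], hidden[split:]
    y_tr, y_te = target[:split], target[split:]
    d = X_tr.shape[1]
    # Ridge solution: w = (X^T X + reg*I)^-1 X^T y
    w = np.linalg.solve(X_tr.T @ X_tr + reg * np.eye(d), X_tr.T @ y_tr)
    pred = X_te @ w
    ss_res = np.sum((y_te - pred) ** 2)
    ss_tot = np.sum((y_te - y_te.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Simulated activations for three layers; the mid-price target is
# deliberately made to depend on the "shallow" layer's features only.
n, d = 400, 16
shallow = rng.normal(size=(n, d))
mid = rng.normal(size=(n, d))
deep = rng.normal(size=(n, d))
mid_price = shallow[:, 0] * 2.0 + rng.normal(scale=0.1, size=n)

scores = {name: probe_layer(h, mid_price)
          for name, h in [("shallow", shallow), ("mid", mid), ("deep", deep)]}
best = max(scores, key=scores.get)  # which layer "knows" the mid-price
```

Running probes like this per layer and per prediction target is what surfaces the specialization pattern described above.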

Orthogonal Feature Decomposition Distillation (OFDD)

Building on these insights, CMM introduces Orthogonal Feature Decomposition Distillation (OFDD). Instead of trying to distill all of an LLM’s complex features into a single small model, OFDD decomposes these features along three ‘orthogonal’ (independent) dimensions:

  1. Layer Feature Hierarchy: Recognizing that different LLM layers handle different prediction tasks, OFDD assigns specialized small models to learn features from specific layers. For example, one small model might focus on the shallow layer’s mid-price prediction features, while another learns the deep layer’s total volume features.
  2. Task Objectives: The overall market-making task is broken down into sub-tasks: predicting mid-price, spread, and total volume. Each sub-task is then handled by a dedicated small model, which learns the specific LLM features relevant to that task.
  3. Data Type/Market Regime: Since LLM features vary with market volatility, OFDD categorizes input data into low, medium, and high volatility types. This allows for the training of specialized small models that are optimized for specific market conditions, leading to more robust and adaptive strategies.
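The regime split in step 3 can be approximated with a standard realized-volatility proxy. This sketch (the paper's exact regime definition may differ) buckets each observation into low/medium/high terciles of rolling return volatility:

```python
import numpy as np

def realized_vol(prices, window=20):
    """Rolling standard deviation of log returns, a simple
    volatility proxy (the paper's definition may differ)."""
    rets = np.diff(np.log(prices))
    return np.array([rets[max(0, i - window):i].std()
                     for i in range(1, len(rets) + 1)])

def regime_labels(vol):
    """Bucket each observation into low/medium/high volatility
    using the empirical terciles of the volatility series."""
    lo, hi = np.quantile(vol, [1 / 3, 2 / 3])
    return np.where(vol <= lo, "low", np.where(vol <= hi, "medium", "high"))

rng = np.random.default_rng(1)
# Simulated price path: a calm first half, then a turbulent second half.
calm = rng.normal(0, 0.001, 500)
wild = rng.normal(0, 0.01, 500)
prices = 100 * np.exp(np.cumsum(np.concatenate([calm, wild])))
labels = regime_labels(realized_vol(prices))
```

Each training sample would then be routed to the small model specialized for its regime label.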

This decomposition simplifies the learning task for each small model, allowing them to effectively capture specific, simpler LLM features. The result is an ensemble of lightweight models, each an ‘expert’ in a particular aspect of the market-making problem.
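The article does not reproduce the distillation objective itself, but the generic feature-distillation pattern (penalizing the gap between a student's projected features and its assigned teacher layer's features, alongside the sub-task loss) can be sketched as follows. The `alpha` weighting and all shapes here are illustrative, not the paper's values:

```python
import numpy as np

def distill_loss(student_feat, teacher_feat, proj, y_pred, y_true, alpha=0.5):
    """Combined objective: match the assigned teacher layer's features
    (through a learned projection `proj`) plus the sub-task loss.
    `alpha` balances feature imitation against task accuracy."""
    feat_term = np.mean((student_feat @ proj - teacher_feat) ** 2)
    task_term = np.mean((y_pred - y_true) ** 2)
    return alpha * feat_term + (1 - alpha) * task_term

rng = np.random.default_rng(2)
s = rng.normal(size=(32, 8))        # student features (narrow)
proj = rng.normal(size=(8, 16))     # projection into teacher width
t = s @ proj                        # teacher features the student matches exactly
y = rng.normal(size=32)

perfect = distill_loss(s, t, proj, y, y)              # both terms vanish
imperfect = distill_loss(s, t + 1.0, proj, y + 1.0, y)
```

In CMM, one such objective would be minimized per (layer, task, regime) cell of the decomposition, each by its own lightweight student.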

Hájek Projection-based Mixture-of-Experts (Hájek-MoE)

After OFDD, CMM has multiple specialized small models. The next challenge is to effectively combine their outputs to make a comprehensive market-making decision. CMM employs a novel ‘Hájek Projection-based Mixture-of-Experts (Hájek-MoE)’ framework, which differs from traditional Mixture-of-Experts approaches.

Hájek-MoE uses a kernel function to project the features and predictions of each small model into a shared feature space. In that space, it calculates a ‘confidence score’ for each expert based on how well its output aligns with a ‘consensus vector’ (an average of all expert outputs). This dynamic weighting lets CMM adaptively integrate the experts in real time, prioritizing the ones most relevant to current market conditions.
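The actual Hájek-projection construction is detailed in the paper; the consensus-weighting idea it implements can be sketched as follows, with a Gaussian kernel standing in for the paper's kernel function:

```python
import numpy as np

def consensus_mixture(expert_outputs, temperature=1.0):
    """Weight each expert by how closely its output agrees with the
    consensus (the mean of all expert outputs), then blend.
    A simplified stand-in for the paper's Hajek-projection weighting;
    a Gaussian kernel turns distance-to-consensus into confidence."""
    outputs = np.asarray(expert_outputs, dtype=float)   # (n_experts, dim)
    consensus = outputs.mean(axis=0)
    dist2 = np.sum((outputs - consensus) ** 2, axis=1)
    conf = np.exp(-dist2 / temperature)                 # kernel confidence
    weights = conf / conf.sum()
    return weights @ outputs, weights

# Three experts; the third disagrees sharply and should be down-weighted.
experts = [[1.0, 0.0], [1.1, 0.1], [5.0, -3.0]]
blended, w = consensus_mixture(experts)
```

The key property is that an outlier expert gets a near-zero weight automatically, so the blended decision stays close to the agreeing majority without any hand-tuned thresholds.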


Superior Performance and Efficiency

Extensive experiments on four real-world market datasets (FU, RB, CU, and AG futures contracts from the Shanghai Futures Exchange) demonstrated CMM’s significant advantages:

  • Higher Profitability and Reduced Risk: CMM consistently outperformed both traditional Reinforcement Learning (RL) based market-making strategies and other knowledge distillation methods in terms of profitability (Episodic PnL, Return Per Trade, PnLMAP) while also substantially reducing inventory risk (Mean Absolute Position).
  • Robustness in Extreme Conditions: The framework proved more resilient under simulated flash crashes and sudden market reversals, maintaining a healthier risk-adjusted return compared to baseline models.
  • Long-Term Adaptability: CMM showed superior performance across various market regimes (bull, bear, sideways markets) over a one-week period, indicating its ability to adapt to changing market conditions.
  • Data Efficiency: Even when trained on a minimal fraction (10%, 20%, 50%) of the available data, CMM consistently delivered higher returns and maintained stricter risk control than baselines, suggesting it learns more generalizable market behaviors.
  • Energy Efficiency and Inference Speed: CMM achieved significantly lower power consumption and much faster inference speeds (6.3 times lower latency than the original LLM, at just 0.3 seconds), making it highly suitable for real-time trading environments where speed and efficiency are paramount.

In conclusion, CMM offers a powerful solution for deploying LLM-driven market-making strategies in real-time. By intelligently decomposing LLM features and dynamically integrating specialized small models, it achieves superior accuracy, reduced computational cost, and greater sample efficiency, paving the way for more advanced and practical AI applications in finance.

Nikhil Patel
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
