TLDR: Researchers have developed a novel auditing framework based on martingale theory to detect when cloud-based Large Language Model (LLM) providers might be overcharging users by misreporting the number of tokens used. The framework is guaranteed to detect unfaithful providers regardless of their misreporting strategy while keeping the false positive rate low. In experiments, it identified misreporting within roughly 70 reported outputs, enhancing transparency and fairness in LLM-as-a-service.
Large Language Models (LLMs) have become indispensable tools, powering a vast array of cloud-based services. Millions of users interact with these advanced AI models through APIs, typically paying based on the number of tokens generated in an output. However, this seemingly straightforward ‘pay-per-token’ pricing model harbors a significant vulnerability: a financial incentive for providers to potentially misreport the actual token count, leading to overcharging.
A recent research paper, titled “Auditing Pay-Per-Token in Large Language Models,” by Ander Artola Velasco, Stratis Tsirtsis, and Manuel Gomez-Rodriguez from the Max Planck Institute for Software Systems, delves into this critical issue. The authors highlight that the core of the problem lies in an information asymmetry: while the provider observes the entire generative process of the LLM, the user only sees and pays for the final output. Furthermore, the tokenization of a string is not always unique, meaning the same output string can be represented by different numbers of tokens. This creates an opportunity for a provider to claim a longer, more expensive tokenization than what was actually generated by the model.
Consider an example: if an LLM generates “Tangier, Morocco” as an output, it might internally use four tokens. A self-serving provider, however, could claim a tokenization that results in seven tokens for the same string, thereby overcharging the user. This practice, known as token misreporting, poses a significant threat to the fairness and transparency of the LLM-as-a-service market.
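To see why tokenizations are not unique, here is a minimal sketch using the tiktoken library and its cl100k_base encoding (an illustrative choice, not the tokenizer of any model studied in the paper). The same string can be recovered from token sequences of very different lengths, and only the provider knows which one the model actually produced.

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Tangier, Morocco"

# The tokenizer's standard encoding of the string.
canonical = enc.encode(text)

# A longer but still valid tokenization: one token per character.
# A provider could report any valid tokenization of intermediate length.
inflated = [tok for ch in text for tok in enc.encode(ch)]

# Both token sequences decode to exactly the same output string.
assert enc.decode(canonical) == enc.decode(inflated) == text
print(len(canonical), len(inflated))
```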
A Novel Auditing Framework
To address this challenge, the researchers propose an innovative auditing framework built upon martingale theory. It is designed to enable a trusted third-party auditor to sequentially query an LLM provider and detect instances of token misreporting. A crucial property of the framework is that it is guaranteed to detect token misreporting, regardless of the provider’s specific misreporting strategy, while keeping the probability of falsely flagging a faithful provider below a user-specified level.
The auditing process works by framing the detection of misreporting as a sequential hypothesis test. The auditor, with access to the LLM’s next-token probability distribution (a realistic scenario for open-weight models or under regulatory requirements), compares the length of the reported token sequence with an estimated average length of token sequences the LLM would use to encode the same output string. The paper introduces a novel, unbiased, and efficient estimator for this average length.
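As a rough illustration of the idea, and emphatically not the paper’s unbiased estimator, the following sketch estimates the average tokenization length of a fixed output string by sampling tokenizations that reproduce it, choosing each token in proportion to a stand-in next-token distribution. The toy vocabulary and the uniform distribution are hypothetical placeholders for a real tokenizer and model.

```python
import random

# Toy vocabulary: single characters plus a few multi-character tokens.
VOCAB = ["T", "a", "n", "g", "i", "e", "r", ",", " ", "M", "o", "c",
         "Tang", "ier", "Tangier", " Mor", "occo", " Morocco"]

def next_token_probs(prefix_tokens):
    # Stand-in for the LLM's next-token distribution: uniform over the vocabulary.
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}

def sample_tokenization(target, rng):
    """Sample one tokenization of `target` by repeatedly choosing, among the
    vocabulary tokens that match the remaining suffix, in proportion to the
    model's next-token probabilities."""
    tokens, pos = [], 0
    while pos < len(target):
        probs = next_token_probs(tokens)
        valid = {t: p for t, p in probs.items() if target.startswith(t, pos)}
        choices, weights = zip(*valid.items())
        tok = rng.choices(choices, weights=weights, k=1)[0]
        tokens.append(tok)
        pos += len(tok)
    return tokens

def estimate_avg_length(target, n_samples=2000, seed=0):
    rng = random.Random(seed)
    return sum(len(sample_tokenization(target, rng)) for _ in range(n_samples)) / n_samples

# An auditor would compare the provider's reported token count against this estimate.
print(estimate_avg_length("Tangier, Morocco"))
```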
The framework then aggregates this statistical evidence over time using a stochastic process, referred to as ‘M’. If this aggregated evidence exceeds a predefined threshold, the provider is flagged as unfaithful. The mathematical underpinnings ensure that if the provider is faithful, the evidence for misreporting does not increase on average, making the process a martingale. This property is key to controlling the false positive rate.
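The following minimal betting-style sketch shows how such evidence can be aggregated and thresholded. It assumes the auditor can reduce each reported output to a score in [-1, 1] (for instance, a normalized gap between the reported length and the estimated average length) whose conditional mean is at most zero when the provider is faithful; the bet size, the simulated scores, and the 1/alpha threshold are illustrative choices rather than the paper’s exact construction.

```python
import numpy as np

# If each score g_t lies in [-1, 1] and has conditional mean <= 0 under a
# faithful provider, then M_t = prod_s (1 + lam * g_s) with lam in [0, 1] is a
# nonnegative supermartingale under faithfulness. By Ville's inequality,
# flagging when M_t >= 1/alpha keeps the false positive rate at most alpha.
rng = np.random.default_rng(0)
alpha, lam = 0.05, 0.5
M, flagged_at = 1.0, None

for t in range(1, 1001):
    # Simulated score from a misreporting provider: positive on average.
    g = np.clip(rng.normal(loc=0.15, scale=0.3), -1.0, 1.0)
    M *= 1.0 + lam * g            # accumulate multiplicative evidence
    if M >= 1.0 / alpha:          # Ville's inequality threshold
        flagged_at = t
        break

if flagged_at is None:
    print("provider not flagged within the horizon")
else:
    print(f"provider flagged as unfaithful after {flagged_at} reported outputs")
```

With alpha set to 0.05, the evidence threshold is 20, and under the stated assumption a faithful provider crosses it with probability at most 0.05 no matter how long the audit runs.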
Experimental Validation and Results
To validate their framework, the researchers conducted extensive experiments using several large language models from the Llama, Gemma, and Ministral families. They used input prompts from the popular LMSYS Chatbot Arena dataset and simulated various misreporting policies, including ‘random’ and ‘heuristic’ strategies that aim to increase token counts.
The results were compelling: the auditing framework successfully detected unfaithful providers after observing fewer than approximately 70 reported outputs. Importantly, the probability of falsely flagging a faithful provider remained negligible, well below the specified threshold of 0.05. The study also found that the average number of reported outputs needed for detection decreased rapidly as the intensity of misreporting increased, aligning with theoretical predictions.
Implications and Future Directions
While highly effective, the framework does have certain requirements and limitations. It requires the auditor to have access to the LLM’s next-token probabilities; future research could explore ways to lift this requirement. Additionally, the current theoretical analysis assumes a static reporting policy from the provider, so handling misreporting strategies that change over time is an interesting avenue for future work. The paper also suggests that better methods for setting the framework’s parameters could further reduce the detection time.
This research represents a significant step towards building trust and ensuring fairness in the rapidly expanding LLM-as-a-service market. By providing a robust tool for algorithmic auditing, it helps to align the incentives of providers with those of end-users, protecting them from potential overcharging. The full paper can be accessed here: Auditing Pay-Per-Token in Large Language Models.


