TLDR: PLEX is a novel method for explaining Large Language Model (LLM) text classifications. Unlike LIME and SHAP, which rely on slow, computationally expensive perturbations, PLEX uses a Siamese neural network trained once to directly map contextual word embeddings to importance scores. This makes PLEX dramatically faster and more efficient, while still accurately identifying influential words and showing high agreement with traditional explanation methods across various tasks.
Large Language Models (LLMs) have become incredibly powerful tools for tasks like text classification, excelling at understanding and categorizing text. However, their complex internal workings often make it difficult to understand why they make certain predictions. This lack of transparency can be a major hurdle, especially in sensitive areas like healthcare or finance, where trust and accountability are paramount.
To address this, the field of Explainable AI (XAI) has developed methods to shed light on these “black box” models. Two popular local explanation methods, LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), work by identifying the most influential words in a sentence that contribute to a model’s prediction. For example, if a model predicts a sentence is about “joy,” LIME or SHAP might highlight words like “happy” or “celebrate” as key contributors.
While effective, LIME and SHAP face a significant challenge: they are computationally intensive. These methods typically generate thousands of slightly altered versions of a sentence (perturbations) and then run the LLM on each altered sentence to see how the prediction changes. This process can be incredibly time-consuming and resource-heavy, especially with the large and complex LLMs we use today.
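To make the cost concrete, here is a minimal sketch of the perturbation-based workflow using the `lime` package, with a Hugging Face sentiment pipeline standing in for the LLM classifier (the model choice and sample count are illustrative assumptions, not details from the paper):

```python
# Sketch: perturbation-based explanation with LIME. Every perturbed sentence
# triggers a full forward pass through the classifier -- the expensive step.
import numpy as np
from lime.lime_text import LimeTextExplainer
from transformers import pipeline

clf = pipeline("sentiment-analysis")  # stand-in for the LLM classifier

def predict_proba(texts):
    # Called by LIME on every perturbed copy of the input sentence.
    outputs = clf(list(texts), top_k=None)
    # Keep a fixed label order so each column is a consistent class probability.
    return np.array(
        [[d["score"] for d in sorted(out, key=lambda d: d["label"])] for out in outputs]
    )

explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance(
    "I was so happy to celebrate with my friends",
    predict_proba,
    num_features=5,     # top influential words to report
    num_samples=1000,   # perturbed sentences -> 1000 extra model forward passes
)
print(explanation.as_list())  # e.g. [("happy", 0.41), ("celebrate", 0.30), ...]
```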
Introducing PLEX: A Faster, Smarter Way to Explain LLMs
A new approach called PLEX (Perturbation-free Local Explanation) offers a compelling solution to this problem. PLEX is designed to provide local explanations for LLM-based text classification without the need for these expensive perturbations. Instead, it takes a different route.
PLEX works by leveraging the “contextual embeddings” that LLMs naturally generate for words within a sentence. These embeddings are rich numerical representations that capture the meaning of words based on their surrounding context. PLEX then uses a special type of neural network, inspired by “Siamese networks,” which is trained to directly connect these word embeddings with their importance scores. Think of it as teaching a network to understand how much each word contributes to the overall meaning or classification of a sentence.
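The paper's exact architecture isn't reproduced here, but the core idea can be sketched as a small scoring head that maps each token's contextual embedding to a scalar importance score. The encoder choice, layer sizes, and the name `ImportanceScorer` below are illustrative assumptions:

```python
# Sketch of the PLEX idea: score each token's contextual embedding directly,
# with no perturbations of the input sentence.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

class ImportanceScorer(nn.Module):
    """Maps a contextual embedding to a scalar importance score per token."""
    def __init__(self, dim=768, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, embeddings):               # (batch, seq_len, dim)
        return self.net(embeddings).squeeze(-1)  # (batch, seq_len)

scorer = ImportanceScorer()

inputs = tokenizer("I was so happy to celebrate with my friends", return_tensors="pt")
with torch.no_grad():
    embeddings = encoder(**inputs).last_hidden_state  # contextual word embeddings
    scores = scorer(embeddings)                       # one importance score per token
```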
The key innovation here is the “one-off training.” Once this Siamese network is trained, it can generate an explanation for any new sentence almost instantly, without needing to create and process thousands of perturbed versions. This dramatically cuts down on the time and computational power required.
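Continuing the sketch above, the one-off training might look like the loop below. The supervision targets and loss are illustrative assumptions (the paper defines the actual objective), and `train_loader` is a hypothetical DataLoader of (token embeddings, target importance scores):

```python
# Illustrative one-off training loop for the scoring head defined above.
optimizer = torch.optim.Adam(scorer.parameters(), lr=1e-3)

for epoch in range(10):
    for token_embeddings, target_scores in train_loader:  # hypothetical DataLoader
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(scorer(token_embeddings), target_scores)
        loss.backward()
        optimizer.step()

# Once trained, explaining a new sentence is a single encoder pass plus one
# pass through the scorer -- no perturbed sentences, no repeated LLM calls.
new_inputs = tokenizer("Breaking: miracle cure found overnight", return_tensors="pt")
with torch.no_grad():
    new_scores = scorer(encoder(**new_inputs).last_hidden_state)
```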
Demonstrated Effectiveness and Efficiency
The effectiveness of PLEX was rigorously tested across four different text classification tasks: sentiment analysis, fake news detection, COVID-19 fake news detection, and depression prediction. The results were impressive: PLEX showed over 92% agreement with the explanations provided by LIME and SHAP. This means that PLEX largely identifies the same influential words as these established methods.
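The summary doesn't spell out how agreement was measured; one simple way to quantify agreement between two explainers is the overlap of their top-k words, sketched below with made-up scores:

```python
# Hypothetical agreement metric: fraction of top-k words shared by two explainers.
def top_k_agreement(scores_a, scores_b, k=5):
    top_a = {w for w, _ in sorted(scores_a.items(), key=lambda x: -abs(x[1]))[:k]}
    top_b = {w for w, _ in sorted(scores_b.items(), key=lambda x: -abs(x[1]))[:k]}
    return len(top_a & top_b) / k

plex_scores = {"happy": 0.82, "celebrate": 0.61, "friends": 0.20, "so": 0.11, "was": 0.04}
lime_scores = {"happy": 0.74, "celebrate": 0.55, "so": 0.28, "friends": 0.09, "I": 0.02}
print(top_k_agreement(plex_scores, lime_scores, k=3))  # shared fraction of top-3 words
```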
A “stress test” further validated PLEX’s accuracy. This test involved removing the words identified as most important by each explanation method and observing how much the classification accuracy dropped. PLEX caused a similar decline in accuracy as LIME and SHAP, confirming its ability to accurately pinpoint truly influential words. In some cases, PLEX even showed superior performance in capturing the impact of key features.
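The stress test itself can be expressed compactly: remove each explainer's top-ranked words and compare accuracy before and after. The function and variable names below are illustrative, not taken from the paper:

```python
# Sketch of the stress test: remove the top-k words an explainer flags and
# measure how far classification accuracy drops.
def ablate_top_words(sentence, word_scores, k=3):
    top = {w for w, _ in sorted(word_scores.items(), key=lambda x: -abs(x[1]))[:k]}
    return " ".join(w for w in sentence.split() if w not in top)

def accuracy_drop(sentences, labels, explain_fn, predict_fn, k=3):
    base = sum(predict_fn(s) == y for s, y in zip(sentences, labels)) / len(labels)
    ablated = [ablate_top_words(s, explain_fn(s), k) for s in sentences]
    after = sum(predict_fn(s) == y for s, y in zip(ablated, labels)) / len(labels)
    return base - after  # a larger drop means the explainer found truly influential words
```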
Where PLEX truly shines is in its computational efficiency. It accelerates the explanation process by two orders of magnitude in time and four orders of magnitude in computational overhead compared to LIME and SHAP. For instance, explaining a long sentence with a complex LLM might take tens of seconds or even minutes with traditional methods, but PLEX can do it in a few seconds. This makes PLEX a highly practical solution for real-time applications and environments with limited computing resources.
This research offers a promising path forward for making powerful LLMs more transparent and trustworthy, without sacrificing performance or incurring prohibitive costs. For more technical details, refer to the full research paper.


