Unmasking Prompt Injection Risks in Web Chatbot Plugins

TLDR: A new study reveals widespread prompt injection vulnerabilities in third-party AI chatbot plugins used by over 10,000 public websites. Researchers found that many plugins fail to verify conversation history integrity, allowing attackers to forge messages and directly manipulate chatbots. Additionally, plugins often indiscriminately scrape untrusted third-party website content, creating opportunities for indirect prompt injection. These insecure practices undermine built-in LLM safeguards, making web chatbots susceptible to tasks like system prompt extraction, task hijacking, and tool hijacking. The study highlights an urgent need for better security practices and proposes lightweight defenses to protect this rapidly growing ecosystem.

Large language models (LLMs) are increasingly integrated into everyday web applications, from personal assistants to customer service chatbots. While much attention has been paid to the security of cutting-edge LLM applications, a recent study sheds light on a significant, yet often overlooked, area of vulnerability: third-party AI chatbot plugins used by thousands of public websites.

Titled “When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins”, this research by Yigitcan Kaya, Anton Landerer, Stijn Pletinckx, Michelle Zimmermann, Christopher Kruegel, and Giovanni Vigna from the University of California, Santa Barbara, uncovers critical prompt injection risks in these widespread chatbot implementations. The study, which analyzed 17 third-party chatbot plugins deployed on over 10,000 websites, reveals how insecure practices at the plugin level can undermine the built-in safeguards of LLMs.

The Dual Threat: Direct and Indirect Prompt Injection

The researchers identified two primary categories of prompt injection vulnerabilities:

1. Direct Prompt Injection via History Forging: Eight of the studied plugins, used by approximately 8,000 websites, were found to transmit conversation history between the website visitor and the chatbot without enforcing integrity checks. This critical oversight allows attackers to forge conversation histories, including fake system messages, which can significantly boost their ability to elicit unintended behaviors from the chatbot (e.g., generating unauthorized code). This vulnerability essentially bypasses the LLM’s instruction hierarchy, which is designed to prioritize developer-defined system messages over user inputs.

2. Indirect Prompt Injection via Website Content Manipulation: Fifteen plugins offer tools, such as web-scraping, to enrich the chatbot’s context with website-specific content. However, these tools often fail to distinguish between trusted content (like product descriptions) and untrusted, third-party content (such as customer reviews). This creates a pathway for indirect prompt injection, where adversaries can embed malicious instructions within publicly viewable content. The study found that about 13% of e-commerce websites had already exposed their chatbots to such third-party content, making them susceptible to persistent attacks where malicious prompts can be retrieved and trigger harmful behavior during benign user interactions.

Real-World Impact and Experimental Validation

The research goes beyond identifying vulnerabilities, providing a large-scale characterization of the AI chatbot plugin ecosystem. It highlights a rapid growth, with the ecosystem expanding by nearly 50% in 2025 alone. Chatbots are deployed across diverse sectors, including e-commerce, education, health, and even government websites, underscoring the broad impact of these security flaws.

Through controlled experiments, the researchers quantified the impact of these vulnerabilities. They found that injecting prompts into non-user roles (made possible by insecure plugins) was far more effective than user-role injections. While hardened system prompts could reduce the success of task hijacking attacks (like coercing a chatbot into coding), they offered limited protection against tool hijacking, where attackers manipulate the chatbot into misusing external tools (e.g., sending malicious notifications). This suggests that hardening system prompts alone is insufficient to secure chatbots with tool-use capabilities.

Also Read:

Additional Weaknesses and Mitigation Strategies

Beyond prompt injection, the study also uncovered other security weaknesses, including the verbatim exposure of system prompts, leakage of LLM provider API keys, and privacy risks associated with chatbots inadvertently leaking sensitive information due to inadequate data sanitization.

To address these issues, the paper proposes several mitigation strategies:

Storing conversation history server-side or using LLM provider features to isolate and authenticate message state.
Implementing digital signatures for legitimate messages to prevent forgery.
Developing tools like “UGC-Buster” to isolate untrusted user-generated content on webpages, preventing it from being scraped into the chatbot’s knowledge base.
Automatically strengthening tool instructions with explicit anti-hijacking rules using LLMs, which significantly reduced attack success rates in experiments.

The authors emphasize that many plugin developers may lack understanding of LLM security best practices, leading to implementations that inadvertently undermine model-level defenses. While responsible disclosures were made to affected plugin developers, some vulnerabilities remain unaddressed. This research serves as a crucial call to action for the AI security community to broaden its focus and secure this rapidly expanding class of AI-enabled web applications before these vulnerabilities become deeply entrenched. You can read the full research paper here: When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins.

Financial Sector Fortifies Against Surging AI-Powered Scams

Deloitte’s 2025 Outlook: Navigating Escalating AI Challenges in Human Capital

Salesforce Study Reveals Data Quality is Pivotal for Employee Trust in AI Adoption

Top Executives Sidestep Company AI Guidelines, Fueling Shadow AI Risks

Intel’s Evolving IP Strategy: A Calculated Shift Towards Core AI Innovation

Generative AI Prompts Increased Workforce Surveillance in Indian IT Sector

Financial Sector Fortifies Against Surging AI-Powered Scams

Deloitte’s 2025 Outlook: Navigating Escalating AI Challenges in Human Capital

Salesforce Study Reveals Data Quality is Pivotal for Employee Trust in AI Adoption

Top Executives Sidestep Company AI Guidelines, Fueling Shadow AI Risks

Intel’s Evolving IP Strategy: A Calculated Shift Towards Core AI Innovation

Generative AI Prompts Increased Workforce Surveillance in Indian IT Sector

Unmasking Prompt Injection Risks in Web Chatbot Plugins

The Dual Threat: Direct and Indirect Prompt Injection

Real-World Impact and Experimental Validation

Additional Weaknesses and Mitigation Strategies

Gen AI News and Updates

OneShield Achieves Landmark Registration Under Cloud Security Alliance AI Controls Matrix, Setting New Industry Standard

Vesl AI Recognized for AI Infrastructure Innovation with ASOCIO Digital Summit Award

AT&T Unleashes Agentic AI Across Business Operations for Enhanced Efficiency and Innovation

Boosting Business Efficiency: A New AI and Big Data Model for Process Optimization

AlphaCast: A New Approach to Time Series Prediction Through Human-AI Collaboration

New Graph Neural Networks Improve Reasoning in Assumption-Based Argumentation

Enhancing AI Reasoning: How Recursive Refinement and Multi-Agent Systems Improve Language Model Performance

ARGUS: A Proactive Framework for Enhancing Autonomous Driving Safety

Generative AI Powers Next-Gen Autonomous Emergency Response

OR-R1: Advancing Automated Optimization with Smart, Data-Efficient AI

Enhancing GUI Agents with Memory: A New Framework for History-Aware Reasoning

ProBench: A Deeper Look into How We Evaluate AI Agents for Mobile Apps

Enhancing Large Language Model Reasoning with Concise Outputs

Ensuring Trust in Autonomous AI: A Two-Layered Monitoring Approach for Agentic Systems

MedFuse: A Multiplicative Approach to Understanding Irregular Clinical Time Series Data

HyperD: A New Framework for More Accurate and Robust Traffic Predictions

Beyond Training: Researchers Propose ‘Model Raising’ for AI with Intrinsic Values

Bridging the Divide: Why AI Needs a Qualitative Revolution

Language Models Enhance Safety Certificate Synthesis for Dynamic Systems

Subscribe to get the latest news and updates