TLDR: A research paper introduces a comprehensive taxonomy of 50 attacks on Large Language Models (LLMs), categorizing them by complexity and whether they target the model or its infrastructure. Applying the DREAD risk assessment framework, the study identifies token smuggling, adversarial prompts, direct injection, and multi-step jailbreak as critical threats to educational LLMs (eLLMs). It also proposes mitigation strategies for educational institutions to enhance eLLM security.
Large Language Models (LLMs) are rapidly changing how we work and learn, and the education sector is no exception. These “Educational Large Language Models” or eLLMs, are being adopted for everything from personalized learning to automated grading and research assistance. However, as their use grows, so do the cybersecurity risks they introduce. A new research paper sheds light on these threats, offering a comprehensive look at the types of attacks LLMs face and how severe they could be in an educational setting.
The paper, titled “Securing Educational LLMs: A Generalised Taxonomy of Attacks on LLMs and DREAD Risk Assessment,” by Farzana Zahid, Anjalika Sewwandi, Lee Brandon, Vimal Kumar, and Roopak Sinha, addresses a critical gap in understanding the cybersecurity landscape for LLMs in education. It introduces a broad classification of fifty different attacks, categorizing them based on whether they target the LLM models themselves or the underlying infrastructure that supports them.
Understanding the Attacks: A New Classification
The researchers propose a novel way to classify these attacks based on their “sophistication level” or complexity: Low, Medium, or High. This helps in understanding how much effort an attacker would need to compromise an LLM.
Attacks on LLM Models directly target the core of the AI. These can include:
- Prompt Injection: This is when malicious instructions are inserted into a prompt, tricking the LLM into doing something unintended, like revealing sensitive information or generating harmful content. Direct prompt injection is considered low complexity, while more advanced forms like prompt divergence are high complexity.
- Jailbreak Attacks: These bypass the LLM’s safety features, allowing it to generate inappropriate or restricted content. Examples include “Do Anything Now” (DAN) mode or multi-step jailbreaks, which are often medium complexity.
- Poisoning Attacks: Attackers can subtly introduce bad or biased data into the LLM’s training process, leading to skewed or incorrect outputs. Pre-training poisoning can be low complexity, while fine-tuning poisoning is high complexity.
- Token Smuggling: This involves encoding banned words or instructions in a way that evades the LLM’s filters, allowing for the creation of harmful content or access to restricted information. This is identified as a critical risk due to its low complexity and high impact.
Attacks on LLM Infrastructure target the systems and environments where LLMs operate. These include:
- Supply Chain Attacks: Infiltrating various stages of the LLM’s development and deployment, such as injecting poisoned data into training or manipulating third-party libraries. These are generally high complexity.
- Ransomware Attacks: Encrypting or locking down LLM functionality or data until a ransom is paid, leading to significant operational disruption and financial losses. This is also a high complexity attack.
- Unbounded Consumption Attacks: Designed to make LLM services unavailable to legitimate users or to deplete financial resources by forcing the LLM to perform high-volume, resource-intensive operations. These are typically medium complexity.
Assessing Risk in Education with DREAD
To evaluate the severity of these attacks specifically within the education sector, the researchers applied the DREAD risk assessment framework. DREAD stands for: Damage, Reproducibility, Exploitability, Affected Users, and Discoverability. By scoring each attack across these criteria, the paper provides a quantitative measure of risk.
The assessment revealed that several attacks pose a “critical” risk to educational institutions. These include token smuggling, adversarial prompts, direct injection, and multi-step jailbreak. For example, token smuggling could allow students to bypass plagiarism detectors or access exam questions, severely impacting academic integrity. Content manipulation, another high-risk attack, could lead to the spread of misinformation in learning materials, undermining the quality of education and public trust.
Ransomware attacks, while technically more complex for attackers, are also deemed a high risk due to their potential for widespread operational disruption, significant financial losses, and severe reputational damage to institutions. Imagine an entire university’s learning management system or virtual tutors being locked down – the impact on students and staff would be immense.
Also Read:
- Persistent Peril: How Multi-Turn Jailbreaking Amplifies LLM Vulnerabilities
- Unmasking AI: New Strategies for Identifying and Protecting Large Language Models
Building Resilient Educational LLMs
The paper emphasizes that securing eLLMs is an ongoing effort. To mitigate these identified risks, educational institutions are urged to adopt several key strategies:
- Enforce eLLM Usage Policies: Clear guidelines and accountability for ethical and sensible use, along with auditing and monitoring for unusual interactions.
- Threat Modeling and Risk Assessment: Regularly identifying and prioritizing threats to eLLMs and their infrastructure to allocate resources effectively.
- Rapid Training and Awareness: Educating staff and students about the responsible use of eLLMs, how to identify misuse, and the potential risks.
- Regular Security Updates, Patching, and Response Plans: Continuously updating models and infrastructure, monitoring for vulnerabilities, and having clear incident response plans.
- Implement Strong Access Controls: Using multi-factor authentication, role-based access, and secure key management to restrict unauthorized access.
This research provides a vital framework for academic and industrial practitioners to build more resilient LLM solutions, ensuring the safety and integrity of learners and institutions as AI becomes an increasingly integral part of education. For more in-depth information, you can read the full research paper here: Securing Educational LLMs: A Generalised Taxonomy of Attacks on LLMs and DREAD Risk Assessment.


