Enhancing Legal AI: A Structured Prompting Method for Long Documents

TLDR: A new study presents a structured prompting methodology for Large Language Models (LLMs) to effectively handle long legal documents. By combining document chunking with augmentation, engineered prompts for QWEN-2, and two novel heuristics (Distribution-based Localisation and Inverse Cardinality Weighting) for candidate selection, the approach achieves state-of-the-art performance in information retrieval from legal texts. This method offers a cost-effective alternative to fine-tuning, improving reliability and transparency in legal AI, and performs up to 9% better than previous methods.

Large Language Models (LLMs) are transforming many fields, but their adoption in the legal sector has faced unique hurdles. Key concerns include ensuring reliability and transparency, especially when dealing with the vast and complex nature of legal documents. A recent study by Strahinja Klem and Noura Al Moubayed introduces an innovative approach to overcome these challenges, offering a structured prompting methodology that allows general-purpose LLMs to effectively process and retrieve information from lengthy legal texts.

Addressing Legal Document Challenges with LLMs

Legal documents are notoriously long and intricate, often exceeding the context window of most LLMs. This “long document problem” means models struggle to process an entire document at once. Furthermore, the task of accurately retrieving specific information from these documents, known as the “information retrieval problem,” is a time-consuming and repetitive part of a legal professional’s job. The researchers recognized the need for trustworthy AI tools that can assist legal practitioners without becoming decision-makers themselves, prioritizing human agency and responsibility.

A Novel Structured Prompting Methodology

Instead of relying on expensive fine-tuning, which is common for specialized AI models, Klem and Al Moubayed propose a structured prompting methodology. This approach uses a general-purpose model, QWEN-2 (a 7-billion-parameter variant), which makes the solution more accessible and scalable. The method involves three key steps:

  • Chunking and Augmentation: To tackle the long document problem, legal documents are first split into smaller, manageable “chunks.” A crucial “augmentation” step is then applied, where redundancy is added between chunks. This helps to relink context that might otherwise be lost when a document is divided, significantly reducing the risk of missing critical information.
  • Engineered Prompting: The study emphasizes the importance of carefully crafted prompts. Through a systematic process of creation, testing, and optimization, the researchers developed prompts that guide the LLM to perform information retrieval tasks more accurately. This prompt engineering step is designed to increase the reliability of the model’s outputs.
  • Candidate Selection Heuristics: After the LLM processes each chunk and generates potential answers, a “candidate selection problem” arises – how to choose the most accurate answer from multiple possibilities. The researchers introduced two heuristics:
    • Distribution-based Localisation (DBL): This heuristic uses patterns from existing data to predict where answers are most likely to appear within a document. Chunks containing these likely locations are given higher weight.
    • Inverse Cardinality Weighting (ICW): This method groups similar answers and weights each answer inversely to the size of its group. The idea is that a correct answer may appear less often than incorrect or noisy responses, so down-weighting large groups of repeated answers helps isolate the most probable correct one.
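
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: this article does not give the paper's exact formulas, so the function names (`chunk_with_overlap`, `dbl_weights`, `icw_weights`, `select_candidate`), the exact-match grouping, the histogram of answer positions, and the choice to multiply the two weights are all hypothetical.

```python
from collections import Counter


def chunk_with_overlap(words, size, overlap):
    """Split a list of words into windows of `size` that share `overlap` words.

    The shared words are the "augmentation": redundancy between neighbouring
    chunks, so text cut at a boundary still appears whole in one of them.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last window already reaches the end of the document
    return chunks


def dbl_weights(n_chunks, position_hist):
    """Distribution-based Localisation (assumed form).

    `position_hist` is a hypothetical histogram over equal-width bins of
    relative document position, estimated from existing labelled data.
    Each chunk receives the frequency of the bin holding its midpoint.
    """
    bins = len(position_hist)
    weights = []
    for i in range(n_chunks):
        midpoint = (i + 0.5) / n_chunks
        bin_index = min(int(midpoint * bins), bins - 1)
        weights.append(position_hist[bin_index])
    return weights


def icw_weights(answers):
    """Inverse Cardinality Weighting (assumed form).

    Answers are grouped by case-insensitive exact match (an assumption; the
    paper may group by similarity), and each answer is weighted by
    1 / (size of its group).
    """
    keys = [a.strip().lower() for a in answers]
    counts = Counter(keys)
    return [1.0 / counts[k] for k in keys]


def select_candidate(candidates, n_chunks, position_hist):
    """Pick one answer from (chunk_index, answer) pairs.

    Combining the two heuristics by multiplication is an assumption
    made for this sketch.
    """
    icw = icw_weights([answer for _, answer in candidates])
    dbl = dbl_weights(n_chunks, position_hist)
    scored = [
        (icw[j] * dbl[chunk_index], answer)
        for j, (chunk_index, answer) in enumerate(candidates)
    ]
    return max(scored, key=lambda pair: pair[0])[1]


# Usage with made-up data: three chunks return a noisy "n/a", one returns a
# real answer, and the histogram says answers usually sit in the third quarter.
candidates = [(0, "N/A"), (1, "n/a"), (2, "Delaware"), (3, "n/a")]
best = select_candidate(candidates, n_chunks=4, position_hist=[0.1, 0.2, 0.6, 0.1])
```

In this toy run the repeated "n/a" responses form a group of three and are each down-weighted to one third, while the single "Delaware" answer keeps full weight and also sits in the most likely region of the document, so it is selected.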

Performance and Implications

The methodology was tested on CUAD (the Contract Understanding Atticus Dataset), an American legal dataset designed for contract review and information retrieval. The approach achieved state-of-the-art performance, scoring up to 9% better than previously presented methods. According to the study, this corresponds to an average increase in correctness of about 9% per question and 250 more correct answers in total.

While the study highlights the immense potential of structured prompt engineering in the legal domain, it also points out the limitations of current automatic evaluation metrics for question answering. This calls for future research into more specialized metrics that can accurately assess the nuanced and variable nature of legal text outputs.

Ultimately, this research suggests that by combining structured prompt engineering with simple selection heuristics, generalist LLMs can become reliable and transparent tools for navigating long legal documents, while keeping accountability and responsibility with the human practitioner. For more details, see the full research paper by Klem and Al Moubayed.

Ananya Rao
https://blogs.edgentiq.com
Ananya Rao is a tech journalist with a passion for dissecting the fast-moving world of Generative AI. With a background in computer science and a sharp editorial eye, she connects the dots between policy, innovation, and business. Ananya excels in real-time reporting and specializes in uncovering how startups and enterprises in India are navigating the GenAI boom. She brings urgency and clarity to every breaking news piece she writes. You can reach her at: [email protected]
