Natural language processing in legal document analysis software: A systematic review of current approaches, challenges, and opportunities
Abstract
Natural Language Processing (NLP) techniques are increasingly integrated into legal software systems to address the growing volume and complexity of legal content in regulatory compliance, litigation, and contract management. This systematic review examines recent advances in NLP applications for legal document analysis, focusing on critical tasks such as contract review, case law summarization, legal question answering, and compliance verification. Ten primary studies published between 2019 and 2025 were selected from academic sources including IEEE Xplore, the ACM Digital Library, SpringerLink, and arXiv, following the PRISMA methodology. The review traces the evolution from rule-based and statistical models to deep learning architectures and large language models (LLMs) tailored to legal text, such as Legal-BERT and GPT-based systems. Despite NLP's potential to automate routine legal work and support legal reasoning, significant challenges persist: the scarcity of annotated legal datasets, difficulty interpreting domain-specific terminology, model bias, limited output transparency, and ethical concerns about automation in high-stakes settings. Many systems also lack explainability, which undermines trust and regulatory approval. This work summarizes current achievements, compares model performance on common legal NLP tasks, and identifies significant gaps and directions for future research. The findings are intended to support the development of legally competent, auditable, and domain-adaptive NLP systems that integrate smoothly into judicial and commercial legal workflows.
Authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.