Phi-3 Meets Law: Fine-tuning Small Language Models for Legal Document Understanding
Abstract
This study explores the application of Microsoft’s Phi-3 mini 4K model to legal document understanding
using the CaseHOLD dataset. We fine-tuned Phi-3, a compact yet powerful language model, on CaseHOLD’s 53,000+
multiple-choice questions designed to test legal case holding identification. Employing Low-Rank Adaptation (LoRA) and
Quantized Low-Rank Adaptation (QLoRA) techniques, we optimized the fine-tuning process for efficiency. Our results
demonstrate that the fine-tuned Phi-3 mini 4K model achieves an F1 score of 76.89, surpassing previous state-of-the-art
models in legal document understanding. This performance was achieved with only 3,600 training steps, highlighting
Phi-3’s rapid adaptability to domain-specific tasks. The base Phi-3 model, without fine-tuning, achieved an F1 score of
64.89, outperforming some specialized legal models. Our findings underscore the potential of compact language models
like Phi-3 in specialized domains when properly fine-tuned, offering a balance between model size, training efficiency,
and performance. This research contributes to the growing body of work on adapting large language models to specific
domains, particularly in the legal field.