
Artificial Intelligence Reaches Human Level in Basic Legal Annotations but Stumbles in Complex References

Artificial intelligence is already capable of making basic annotations of legal cases at a human level, but falls behind in complex legal references, according to a recent study published in the journal Artificial Intelligence and Law.

The study compared the performance of the large language model GPT-4o with that of law students and experienced legal professionals. The task was to annotate decisions of the United Nations Committee on Economic, Social and Cultural Rights, for example by classifying the content of decisions and extracting legal references from them.

GPT-4o achieved human-level accuracy in basic annotations. It was able to classify decisions and make simple annotations as well as trained human annotators. Differences emerged when the model had to identify and extract legal references from the decisions, such as statutes and precedents.

The AI model proved precise but cautious: it made few incorrect references but also missed several correct ones, which in information-retrieval terms means high precision but lower recall. This was particularly evident in complex legal references, where human annotators were clearly more reliable.
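The trade-off described above can be made concrete with a minimal sketch of precision and recall over extracted references. The function name and the reference strings are hypothetical and chosen only for illustration. They are not data from the study, and this is not necessarily how the authors scored annotations.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of predicted references
    against a gold-standard set. Illustrative helper, not from the study."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of extracted references that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of correct references that were actually extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold-standard references for one decision.
gold = {"ref A", "ref B", "ref C", "ref D"}

# A cautious annotator: nothing wrong, but half the references are missed.
cautious = {"ref A", "ref C"}

# A broader annotator: finds everything, plus one spurious reference.
broad = {"ref A", "ref B", "ref C", "ref D", "ref X"}

print(precision_recall(cautious, gold))  # (1.0, 0.5)
print(precision_recall(broad, gold))     # (0.8, 1.0)
```

In this toy example the cautious annotator makes no incorrect references yet retrieves only half of the correct ones, which mirrors the pattern the article attributes to the model.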

Human annotators, in contrast, struggled with inconsistent formatting and careless errors. They found references more comprehensively, but the consistency of their results suffered.

The study also highlights another weakness of AI: GPT-4o's responses varied when the same task was repeated, raising questions about the reproducibility of the results. Despite this, the researchers emphasize that cost-effectiveness is a clear advantage of the model. AI can perform large volumes of annotation at significantly lower cost than human labor, which could broadly change how legal material is handled.
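One simple way to quantify the run-to-run variation mentioned above is the average pairwise Jaccard overlap between the reference sets produced by repeated runs. The sketch below is an assumption for illustration: the metric, the function names, and the sample runs are hypothetical and are not taken from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_agreement(runs):
    """Average Jaccard similarity over all pairs of runs.
    A value of 1.0 means every run returned the same references."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical outputs of three repeated runs on the same decision.
runs = [
    {"ref A", "ref B", "ref C"},
    {"ref A", "ref B"},
    {"ref A", "ref B", "ref C"},
]

print(round(mean_pairwise_agreement(runs), 3))  # 0.778
```

A score well below 1.0 across repeated runs would indicate the kind of reproducibility concern the article raises.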

Source: The price of automated case law annotation: comparing the cost and performance of GPT-4o and student annotators, Artificial Intelligence and Law.

This text was generated with AI assistance and may contain errors. Please verify details from the original source.

Original research: The price of automated case law annotation: comparing the cost and performance of GPT-4o and student annotators
Publisher: Artificial Intelligence and Law
Authors: Iris Schepers, Michelle Bruijn, ... Michel Vols
December 23, 2025