JAPAS: A Benchmark and Neural Approach for Japanese Patent Support Relation Extraction

Abstract

Efficient analysis of patent literature is crucial for technological development and protecting intellectual property. A key task is verifying the “support requirement,” which mandates that the detailed description must fully describe the claimed invention. This requirement is fundamental to a patent’s validity. Manual verification is a labor-intensive process that demands technical and legal expertise, making automation highly desirable. However, research on this task has been hampered by two key challenges: (1) the absence of a public benchmark, and (2) the reliance of prior work on lexical matching, which fails to capture semantic equivalence. To address these issues, we introduce JAPAS, the first public benchmark for this task, comprising over 2,000 instances manually annotated for Japanese patents. Each instance is labeled with a claim span, a supporting description paragraph, a relation type, and the annotator’s confidence level. Using this benchmark, we also establish modern baselines that capture semantic similarity, such as embeddings and LLMs. Our experiments show that a fine-tuned Qwen3-14B model achieves an F1 score of 0.50, outperforming the conventional lexical-based baseline. This result, which demonstrates that the task is feasible yet challenging, highlights the utility of JAPAS as a research foundation and provides a performance target for future work.

Publication
Proceedings of the 15th biennial Language Resource and Evaluation Conference
