Automatic Classification of Rhetorical Roles for Sentences: Comparing Rule-Based Scripts with Machine Learning

Vern R. Walker, Krishnan Pillaipakkamnatt, Alexandra M. Davidson, Marysa Linares, Domenick J. Pesce
2019 International Conference on Artificial Intelligence and Law  
Automatically mining patterns of reasoning from evidence-intensive legal decisions can make legal services more efficient, and it can increase the public's access to justice, through a range of use cases (including semantic viewers, semantic search, decision summarizers, argument recommenders, and reasoning monitors). Important to these use cases is the task of automatically classifying those sentences that state whether the conditions of applicable legal rules have been satisfied or not in a particular legal case. However, insufficient quantities of gold-standard semantic data, and the high cost of generating such data, threaten to undermine the development of such automatic classifiers. This paper tests two hypotheses: whether distinctive phrasing enables the development of automatic classifiers on the basis of a small sample of labeled decisions, with adequate results for some important use cases, and whether semantic attribution theory provides a general methodology for developing such classifiers. The paper reports promising results from using a qualitative methodology to analyze a small sample of classified sentences (N = 530) to develop rule-based scripts that can classify sentences that state findings of fact ("Finding Sentences"). We compare those results with the performance of standard machine learning (ML) algorithms trained and tested on a larger dataset (about 5,800 labeled sentences), which is still relatively small by ML standards. This methodology and these test results suggest that some access-to-justice use cases can be adequately addressed at much lower cost than previously believed. The datasets, the protocols used to define sentence types, the scripts and ML codes will be publicly available.
dblp:conf/icail/WalkerPDLP19 fatcat:jkbkoj4zf5hddcaibqvc7obhdi