Cost-Sensitive Algorithms for Text Classification in the Legal Domain:
Addressing Imbalanced Lawsuit Themes
DOI:
https://doi.org/10.5540/tcam.2025.026.e01859
Keywords:
imbalanced classification, cost-sensitive learning, machine learning, resampling, text classification
Abstract
This article discusses the challenges of imbalanced classification in machine learning, where algorithms often assume, incorrectly, that instances are evenly distributed across classes. In real-world data this assumption often fails, and minority classes are poorly represented in the training data. To combat this, Cost-Sensitive Learning techniques have been developed, focussing on minimising the overall cost of misclassification rather than merely optimising accuracy. These techniques are categorised into three types: Cost-Sensitive Resampling, Cost-Sensitive Algorithms, and Hybrid techniques. The research presents a case study on classifying lawsuits into repetitive themes in São Paulo Court, Brazil, applying these cost-sensitive approaches to an imbalanced dataset. The goal is to automate the classification of lawsuits in order to save time, use human resources more effectively, and speed up the resolution of lawsuits. The study highlights the effectiveness of cost-sensitive techniques in handling imbalanced classification and their benefits in real-world applications, particularly in the legal field, where they improve efficiency and reduce manual workload and processing time for lawsuits.
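The difference between minimising misclassification cost and optimising accuracy can be illustrated with a minimal sketch. It is not the authors' method: the cost matrix, the toy labels, and the class probabilities below are hypothetical values chosen only for illustration, and the function names are assumptions. The sketch compares the usual most-probable-class rule with a rule that picks the class with the lowest expected cost.

```python
# Hypothetical cost matrix, COST[true_class][predicted_class].
# Assumption: missing a minority-class lawsuit (class 1) is five times
# as costly as a false alarm on the majority class (class 0).
COST = [
    [0.0, 1.0],  # true class 0: predicting 1 costs 1
    [5.0, 0.0],  # true class 1: predicting 0 costs 5
]


def argmax_class(probs):
    """Accuracy-oriented rule: pick the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


def min_cost_class(probs, cost=COST):
    """Cost-sensitive rule: pick the class with the lowest expected cost."""
    n = len(cost)
    expected = [sum(probs[t] * cost[t][p] for t in range(n)) for p in range(n)]
    return min(range(n), key=expected.__getitem__)


def total_cost(y_true, y_pred, cost=COST):
    """Sum of the costs of all predictions under the cost matrix."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy imbalanced data: eight majority instances, two minority instances.
# Each row holds made-up probabilities [P(class 0), P(class 1)].
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [[0.9, 0.1]] * 6 + [[0.7, 0.3]] * 2 + [[0.7, 0.3], [0.4, 0.6]]

argmax_pred = [argmax_class(p) for p in probs]
cs_pred = [min_cost_class(p) for p in probs]

print("argmax:         accuracy", accuracy(y_true, argmax_pred),
      "cost", total_cost(y_true, argmax_pred))
print("cost-sensitive: accuracy", accuracy(y_true, cs_pred),
      "cost", total_cost(y_true, cs_pred))
```

On this toy data the cost-sensitive rule has lower accuracy (0.8 against 0.9) but also lower total cost (2 against 5), because it accepts two cheap false alarms in order to avoid one expensive miss on the minority class.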
License
Copyright (c) 2025 daniela

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Copyright
Authors of articles published in the journal Trends in Computational and Applied Mathematics retain the copyright of their work. The journal uses Creative Commons Attribution (CC-BY) in published articles. The authors grant the TCAM journal the right to first publish the article.
Intellectual Property and Terms of Use
The content of the articles is the exclusive responsibility of the authors. The journal uses Creative Commons Attribution (CC-BY) in published articles. This license allows published articles to be reused without permission for any purpose as long as the original work is correctly cited.
The journal encourages authors to self-archive their accepted manuscripts by publishing them on personal blogs, institutional repositories, and social media, as long as a full citation to the version on the journal's website is included.




