Analysis of Document Retrieval for Online Public Administrative Procedure Services
Abstract
Public Administrative Procedures (APs) are processes, implementation methods, documents, and requirements or conditions prescribed by government agencies or authorized individuals to address specific tasks related to individuals or organizations. However, organizations and citizens still have difficulty accessing information and public administrative services easily and conveniently. This paper investigates advanced techniques for document retrieval in public administrative services. We implement a hybrid retrieval process that combines traditional retrieval models such as TF-IDF and BM25 with modern models such as SBERT and its fine-tuned variants. Results demonstrate that combining these models significantly enhances retrieval performance. The ensemble of BM25 and fine-tuned SBERT achieved the highest F2 score, indicating superior effectiveness in information retrieval.
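To make the two ideas in the abstract concrete, the following minimal sketch shows one common way to combine a lexical score (such as BM25) with a dense similarity score (such as SBERT cosine similarity), and how an F2 score is computed from a retrieved set. The abstract does not specify the fusion rule, so the min-max normalization, the weighted sum, the weight `alpha`, the function names, and the toy scores are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Set


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so lexical and dense scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no document is preferred by this model.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(lexical: Dict[str, float], dense: Dict[str, float],
         alpha: float = 0.5) -> List[str]:
    """Rank documents by alpha * lexical + (1 - alpha) * dense.

    The weighted sum is an assumed fusion rule, chosen for simplicity.
    """
    lex_n, den_n = min_max(lexical), min_max(dense)
    docs = set(lex_n) | set(den_n)
    combined = {
        d: alpha * lex_n.get(d, 0.0) + (1 - alpha) * den_n.get(d, 0.0)
        for d in docs
    }
    # Sort by descending score; break ties by document id for determinism.
    return sorted(combined, key=lambda d: (-combined[d], d))


def f_beta(retrieved: Set[str], relevant: Set[str], beta: float = 2.0) -> float:
    """F-beta score; beta = 2 weights recall more heavily than precision."""
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy scores for three hypothetical procedure documents.
bm25_scores = {"ap1": 12.0, "ap2": 7.0, "ap3": 2.0}
sbert_scores = {"ap1": 0.40, "ap2": 0.85, "ap3": 0.30}

ranking = fuse(bm25_scores, sbert_scores, alpha=0.5)
top2 = set(ranking[:2])
score = f_beta(top2, relevant={"ap2", "ap3"})
```

Because F2 weights recall more heavily than precision, it suits settings where missing a relevant procedure document costs more than returning an extra one.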