Determining the Temporal Order between Two Vietnamese Process Sentences for Summarization
Abstract
In this paper we introduce a method for summarizing the meaning of two consecutive Vietnamese sentences that express a sequence of processes, each belonging to one of three process types (according to Functional Grammar [26, 41]): the state of the subject changes, the position of the subject changes, or the state or position of the subject is affected by an agent. Sentence generation proceeds in two main steps: (i) resolve anaphoric pronouns and represent the semantics of the source sentence pair; (ii) determine the temporal order of the processes and generate a new, reduced Vietnamese sentence. To evaluate summarization quality, we compare our generated sentences with sentence fusions generated using K. Filippova’s method [31] and its enhancement by F. Boudin and E. Morin [16]. Using ROUGE measures [6–9], the results show that our method’s summaries are more precise and natural overall.
References
A. KHAN and N. SALIM, “A Review on Abstractive Summarization Methods”, Journal of Theoretical and Applied Information Technology, vol. 59, no.1, 2014, pp. 64–72.
B. SANTORINI, Part-of-speech Tagging Guidelines for the Penn Treebank Project, Technical Report MS-CIS-90-47, Department of Computer and Information Science, University of Pennsylvania, 1990.
C. D. MANNING and H. SCHÜTZE, Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, USA, 1999.
C. S. LEE, Z. W. JIAN and L. K. HUANG, “A Fuzzy Ontology and Its Application to News Summarization”, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 35, no. 5, 2005, pp. 859–880.
C. S. SARANYAMOL and L. SINDHU, “A Survey on Automatic Text Summarization”, International Journal of Computer Science and Information Technologies, vol. 5, no. 6, 2014, pp. 7889–7893.
C. Y. LIN, “ROUGE: A Package for Automatic Evaluation of Summaries”, Proceedings of the Workshop on Text Summarization Branches Out, Post-Conference Workshop of ACL 2004, Barcelona, Spain, 2004.
C. Y. LIN, “Looking for a Few Good Metrics: ROUGE and its Evaluation”, Proceedings of NTCIR Workshop 2004, Tokyo, Japan, 2004.
C. Y. LIN and E. H. HOVY, “Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics”, Proceedings of 2003 Language Technology Conference (HLT-NAACL 2003), Edmonton, Canada, 2003.
C. Y. LIN and F. J. OCH, “Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics”, Proceedings of the 42nd Annual Meeting of ACL (ACL 2004), Barcelona, Spain, 2004.
D. DAS and A. F. T. MARTINS, A survey on automatic text summarization, Language Technologies Institute, Carnegie Mellon University, 2007.
D. M. W. POWERS, “Evaluation: From Precision, Recall and F-MEASURE to ROC, Informedness, Markedness & Correlation”, Journal of Machine Learning Technologies, vol. 2, no. 1, 2011, pp. 37–63.
D. R. RADEV, E. HOVY and K. MCKEOWN, “Introduction to the special issue on summarization”, Computational Linguistics, vol. 28, no. 4, 2002, pp. 399–408.
E. LLORET, Text summarization: an overview, paper supported by the Spanish Government under the project TEXT-MESS (TIN2006-15265-C06-01), 2008.
E. LLORET and M. PALOMAR, “Analyzing the Use of Word Graphs for Abstractive Text Summarization”, Proceedings of the 1st International Conference on Advances in Information Mining and Management (IMMM 2011), Barcelona, Spain, 2011, pp. 61–66.
E. REITER and R. DALE, Building Natural Language Generation Systems, Cambridge University Press, 1997.
F. BOUDIN and E. MORIN, “Keyphrase extraction for n-best reranking in multi-sentence compression”, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013), Atlanta, Georgia, 2013, pp. 298–305.
F. LIU, J. FLANIGAN, S. THOMSON, N. SADEH and N. A. SMITH, “Toward Abstractive Summarization Using Semantic Representations”, Accepted by the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2015).
G. CARENINI and J. C. K. CHEUNG, “Extractive vs. NLG-based Abstractive Summarization of Evaluative Text: The Effect of Corpus Controversiality”, Proceedings of the 5th International Natural Language Generation Conference, Salt Fork, Ohio, 2008.
H. KAMP, “A theory of truth and semantic representation”, in: Groenendijk, Jeroen, Janssen, Theo M. V and Stokhof, Martin (eds.), Formal Methods in the Study of Language, Part 1, pp. 277–322, 1981, Mathematical Centre Tracts.
H. P. LUHN, “The automatic creation of literature abstracts”, IBM Journal of Research and Development, vol. 2, no. 2, 1958, pp. 159–165.
H. P. EDMUNDSON, “New methods in automatic extracting”, Journal of the ACM, vol. 16, no. 2, 1969, pp. 264–285.
H. SAGGION and G. LAPALME, “Generating Indicative-Informative Summaries with SumUM”, Computational Linguistics, vol. 28, no. 4, 2002, pp. 497–526.
H. T. LE, R. C. SAM and P. T. NGUYEN, “Extracting Phrases in Vietnamese Document for Summary Generation”, Proceedings International Conference on Asian Language Processing (IALP), Harbin, China, 2010, pp. 207–210.
H. T. T. NGUYEN and Q. H. NGUYEN, “A semi-supervised learning method combined with dimensionality reduction in Vietnamese text summarization”, International Journal of Innovative Computing, Information and Control, vol. 9, no. 12, pp. 4903–4915.
H. T. T. NGUYEN, Q. H. NGUYEN and T. N. T. NGUYEN, “A supervised learning method combine with dimensionality reduction in Vietnamese text summarization”, Proceedings 2013 Computing, Communications and IT Applications Conference (ComComAp), Hong Kong, 2013, pp. 69–73.
H. X. CAO, Tiếng Việt: Sơ thảo ngữ pháp chức năng, Nhà xuất bản giáo dục, 2006.
I. F. MOAWAD and M. AREF, “Semantic graph reduction approach for abstractive Text Summarization”, Proceedings of the 2012 Seventh International Conference on Computer Engineering & Systems (ICCES), 2012, pp. 132–138.
I. MANI, Automatic Summarization, John Benjamins Publishing Company, 2001.
I. MANI and M. T. MAYBURY, Advances in Automatic Text Summarization, MIT Press, 1999.
K. A. GANESAN, C. X. ZHAI and J. HAN, “Opinosis: A Graph-Based Approach to Abstractive Summarization of Highly Redundant Opinions”, Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China, 2010, pp. 340–348.
K. FILIPPOVA, “Multi-Sentence Compression: Finding Shortest Paths in Word Graphs”, Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China, 2010, pp. 322–330.
K. FILIPPOVA and M. STRUBE, “Dependency Tree Based Sentence Compression”, Proceedings of the 5th International Natural Language Generation Conference, Salt Fork, Ohio, 2008.
K. FILIPPOVA and M. STRUBE, “Sentence Fusion via Dependency Graph Compression”, Proceedings of the Conference on Empirical Methods in Natural Language Processing, Honolulu, Hawaii, 2008.
K. JEZEK and J. STEINBERGER, “Automatic Text summarization”, Vaclav Snasel (Ed.): Znalosti 2008, ISBN 978-80-227-2827-0, FIIT STU Bratislava, Ustav informatiky a softveroveho inzinierstva, pp. 1–12, 2008.
K. S. JONES, “Automatic summarizing: factors and directions”, in: I. Mani and M. Maybury, editors, Advances in Automatic Text Summarization, MIT Press, 1999.
K. S. JONES, Automatic summarising: a review and discussion of the state of the art, Technical Report 679, Computer Laboratory, University of Cambridge, 2007.
M. A. COVINGTON, GULP 4: An Extension of Prolog for Unification Based Grammar, Research Report AI-1994-06. USA: Artificial Intelligence Center, The University of Georgia, 2007.
M. A. COVINGTON and N. SCHMITZ, An Implementation of Discourse Representation Theory, ACMC Research Report 01-0023. USA: Advanced Computational Methods Center, The University of Georgia, 1989.
M. A. COVINGTON, D. NUTE, N. SCHMITZ and D. GOODMAN, From English to Prolog via Discourse Representation Theory, ACMC Research Report 01-0024. USA: The University of Georgia, 1988.
M. A. FATTAH and F. REN, “Automatic Text Summarization”, Proceedings of World Academy of Science, Engineering and Technology, vol. 27, ISSN 13076884, 2008, pp. 192–195.
M. A. K. HALLIDAY and C. M. I. M. MATTHIESSEN, An Introduction to Functional Grammar, Third Edition, Hodder Arnold, 2004.
N. R. KASTURE, N. YARGAL, N. N. SINGH, N. KULKARNI and V. MATHUR, “A Survey on Methods of Abstractive Text Summarization”, International Journal for Research in Emerging Science and Technology, vol. 1, iss. 6, 2014, pp. 53–57.
O. CHAOWALIT and O. SORNIL, “An Automatic Approach to Generating Abstractive Summary for Thai Opinions”, International Journal of Advancements in Computing Technology, vol. 6, no. 3, 2014, pp. 142–150.
P. BAXENDALE, “Machine-made index for technical literature - an experiment”, IBM Journal of Research and Development, vol. 2, no. 4, 1958, pp. 354–361.
P. BLACKBURN and J. BOS, Representation and Inference for Natural Language – Volume II: Working with Discourse Representation Structures, Germany: Department of Computational Linguistics, University of Saarland, 1999.
P. E. GENEST and G. LAPALME, “Framework for Abstractive Summarization using Text-to-Text Generation”, Proceedings of the Workshop on Monolingual Text-to-Text Generation, Portland, Oregon, 2011, pp. 64–73.
P. E. GENEST and G. LAPALME, “Text Generation for Abstractive Summarization”, Proceedings of the 3rd Text Analysis Conference, Gaithersburg, Maryland, USA, 2010.
P. E. GENEST and G. LAPALME, “Fully Abstractive Approach to Guided Summarization”, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers – Volume 2, Jeju Island, Korea, 2012, pp. 354–358.
P. T. NGUYEN and H. T. LE, “Vietnamese text summarisation using discourse structures”, ICT.rda Conference, Hanoi, Vietnam, 2008.
R. BARZILAY, K. R. MCKEOWN and M. ELHADAD, “Information fusion in the context of multi-document summarization”, Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics, 1999, pp. 550–557.
R. BARZILAY and K. R. MCKEOWN, “Sentence fusion for multidocument news summarization”, Computational Linguistics, vol. 31, 2005, pp. 297–328.
S. GERANI, Y. MEHDAD, G. CARENINI, R. T. NG and B. NEJAT, “Abstractive Summarization of Product Reviews Using Discourse Structure”, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), Doha, Qatar, 2014, pp. 1602–1613.
S. K. JAGADISH, K. G. SRINIVASA and R. B. ESWARA, “A Comprehensive Analysis of Guided Abstractive Text Summarization”, International Journal of Computer Science Issues, vol. 11, iss. 6, no. 1, 2014, pp. 115–121.
S. NIVATTANAKUL, J. SINGTHONGCHAI, E. NAENUDORN and S. WANAPU, “Using of Jaccard coefficient for keywords similarity”, Proceedings of the International MultiConference of Engineers and Computer Scientists, Hong Kong, 2013, pp. 380–384.
S. M. SHIEBER, An introduction to unification-based approaches to grammar, Microtome Publishing, Brookline, Massachusetts, 2003.
T. T. TANIMOTO, An elementary mathematical theory of classification, Technical report, I.B.M. Research, New York, NY, USA, 1958. Internal report.
T. TRAN and D. T. NGUYEN, “A Solution for Resolving Inter-sentential Anaphoric Pronouns for Vietnamese Paragraphs Composing Two Single Sentences”, Proceedings of The 5th IEEE International Conference of Soft Computing and Pattern Recognition (SoCPaR 2013), Hanoi, Vietnam, 2013, pp. 172–177.
T. TRAN and D. T. NGUYEN, “The Solution for Resolving Inter-Sentential Anaphoric Pronoun “nó” in Vietnamese Paragraphs Composing 3 to 5 Simple Sentences”, International Journal of Advanced Science and Technology, vol. 65, 2014, pp. 95–112.
T. TRAN and D. T. NGUYEN, “Merging Two Vietnamese Sentences Related by Inter-sentential Anaphoric Pronouns for Summarizing”, Proceedings of The 1st NAFOSTED Conference on Information and Computer Science, Hanoi, Vietnam, 2014, pp. 371–381.
T. TRAN and D. T. NGUYEN, “Improving Techniques for Summarizing the Meaning of Two Vietnamese Sentences by Adding a Meaningful Relationship between Two Actions”, Proceedings of The 16th ACM International Conference on Information Integration and Web-based Applications & Services (iiWAS 2014), Hanoi, Vietnam, 2014, pp. 484–488.
T. TRAN and D. T. NGUYEN, “Enhancement of Sentence-Generation Based Summarization Method By Modelling Inter-Sentential Consequent-Relationships”, Proceedings of the 16th ACM International Conference on Information Integration and Web-based Applications & Services (iiWAS 2014), Hanoi, Vietnam, 2014, pp. 302–309.
T. TRAN and D. T. NGUYEN, “Modelling Consequence Relationships between Two Action, State or Process Vietnamese Sentences for Improving the Quality of New Meaning-Summarizing Sentence”, International Journal of Pervasive Computing and Communications, vol. 11, no. 2, 2015, pp. 169–190. Emerald Group Publishing Limited. ISSN 1742-7371.
T. TRAN and D. T. NGUYEN, “Semantic Predicative Analysis for Resolving Some Cases of Ambiguous Referents of Pronoun “Nó” in Summarizing Meaning of Two Vietnamese Sentences”, Proceedings of the 17th UKSIM-AMSS International Conference on Modelling and Simulation (UKSIM 2015), Cambridge, United Kingdom, 2015, pp. 340–345.
T. TRAN and D. T. NGUYEN, “Combined Method of Analyzing Anaphoric Pronouns and Inter-sentential Relationships between Transitive Verbs for Enhancing Pairs of Sentences Summarization”, Proceedings of the 4th Computer Science On-line Conference (CSOC 2015) – Vol 1: Artificial Intelligence Perspectives and Applications, in: R. Silhavy et al. (eds), Advances in Intelligent Systems and Computing – Vol. 347, 2015, pp. 67–77.
V. GUPTA and G. S. LEHAL, “A Survey of Text Summarization Extractive Techniques”, Journal of Emerging Technologies in Web Intelligence, vol. 2, no. 3, 2010, pp. 258–268.
V. SORNLERTLAMVANICH, T. POTIPITI and T. CHAROENPORN, “UNL Document Summarization”, Proceedings of the 1st International Workshop on Multimedia Annotation (MMA 2001), Tokyo, Japan, 2001.