An Automatic Answering System under Limited-Resource Conditions
Abstract
Interactive Voice Response (IVR) systems with embedded spoken language processing are emerging as one of the most prominent research areas in the world. To construct an IVR system for Vietnamese, this paper proposes a spoken dialog system architecture with four components: Automatic Speech Recognition, the Text-to-Speech Synthesiser, the Dialog Manager, and the Telephone Network Interface. This architecture allows a new IVR system to be built for any domain more rapidly, more effectively, and at low cost.