VOS: the Corpus-Based Vietnamese Text-to-Speech System

Vo Quang Dieu Ha; Nguyen Manh Tuan; Cao Xuan Nam; Pham Minh Nhut; Vu Hai Quan

doi:10.32913/mic-ict-research.v3.n7.285

Abstract

This paper presents a complete specification of VOS (Voice of Southern Vietnam), a Vietnamese speech synthesis system. Because current Vietnamese text-to-speech (TTS) systems produce synthetic speech that lacks naturalness, VOS is built on the unit selection approach, which aims for maximum naturalness. VOS consists of three main parts: a corpus manager, a synthesizer, and a transliteration model. The corpus manager handles automated speech indexing and segmentation for the unit selection performed by the synthesizer, while the transliteration model handles the pronunciation of foreign-language words. A comparative experimental evaluation of VnSpeech, VietVoice, and VOS is conducted using the ITU-T P.85 standard. The results show that VOS outperforms the other two TTS systems.
