A Variable Break Prediction Method Using CART in a Japanese Text-to-Speech System
- Authors
- Na, Deok-Su; Bae, Myung-Jin
- Issue Date
- Feb-2009
- Publisher
- Institute of Electronics, Information and Communication Engineers (IEICE)
- Keywords
- text-to-speech system; break prediction; variable break
- Citation
- IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E92D, no.2, pp.349 - 352
- Journal Title
- IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
- Volume
- E92D
- Number
- 2
- Start Page
- 349
- End Page
- 352
- URI
- http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/15883
- DOI
- 10.1587/transinf.E92.D.349
- ISSN
- 0916-8532
- Abstract
- Break prediction is an important step in text-to-speech systems because break indices (BIs) strongly influence how correctly prosodic phrase boundaries are represented. Accurate prediction is difficult, however, since BIs are often chosen according to the meaning of a sentence or the reading style of the speaker. In Japanese, distinguishing an accentual phrase boundary (APB) from a major phrase boundary (MPB) is particularly difficult. This paper therefore presents a method that compensates for APB and MPB prediction errors. First, we define a subtle BI, for which it is difficult to decide clearly between an APB and an MPB, as a variable break (VB), and an explicit BI as a fixed break (FB). VBs are identified using a classification and regression tree (CART), and multiple prosodic targets for pitch and duration are then generated for them. Finally, unit selection is conducted using the multiple prosodic targets. The experimental results show that the proposed method improves the naturalness of synthesized speech.
- Appears in Collections
- College of Information Technology > ETC > 1. Journal Articles
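
The pipeline outlined in the abstract (classify a break as VB or FB with CART, generate several prosodic targets for a VB, then run unit selection against all of them) can be illustrated with a minimal sketch. This is not the authors' implementation. The features, training rows, pitch and duration values, cost weights, and all function names below are hypothetical, and the tree is reduced to a single Gini-based split for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical prosodic target at a phrase boundary."""
    label: str       # "APB" or "MPB"
    pitch: float     # assumed boundary pitch in Hz
    duration: float  # assumed boundary duration in ms


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """One CART step: (weighted Gini, feature index, threshold) of the best split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def train_stump(rows, labels):
    """Depth-1 tree: the best split plus the majority label on each side."""
    _, f, t = best_split(rows, labels)
    left = [y for r, y in zip(rows, labels) if r[f] <= t]
    right = [y for r, y in zip(rows, labels) if r[f] > t]
    return f, t, max(set(left), key=left.count), max(set(right), key=right.count)


def classify(stump, row):
    """Return "VB" or "FB" for one feature row."""
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def cost(unit, target, w_pitch=1.0, w_dur=0.5):
    """Assumed target cost: weighted absolute pitch and duration error."""
    _, pitch, duration = unit
    return w_pitch * abs(pitch - target.pitch) + w_dur * abs(duration - target.duration)


def select_unit(candidates, targets):
    """Pick the (unit name, target label) pair with the lowest cost over all targets."""
    pairs = [(c, t) for c in candidates for t in targets]
    unit, target = min(pairs, key=lambda p: cost(p[0], p[1]))
    return unit[0], target.label


# Toy training data: (phrase length in morae, ambiguity score) -> break type.
rows = [(3, 0.1), (4, 0.2), (5, 0.8), (6, 0.9)]
labels = ["FB", "FB", "VB", "VB"]
stump = train_stump(rows, labels)

APB = Target("APB", pitch=180.0, duration=40.0)
MPB = Target("MPB", pitch=150.0, duration=120.0)

# Candidate units as (name, pitch in Hz, duration in ms); values are made up.
candidates = [("u1", 170.0, 60.0), ("u2", 152.0, 115.0), ("u3", 200.0, 30.0)]

# A VB keeps both targets open; an FB commits to one (APB here, as an example).
kind = classify(stump, (5, 0.5))
targets = [APB, MPB] if kind == "VB" else [APB]
chosen = select_unit(candidates, targets)
```

In this toy setting, committing to the APB target alone would force the selector onto a poorly matching unit, whereas keeping both targets open for a VB lets it choose the candidate that fits one of them closely. That is the intuition behind deferring the APB/MPB decision to unit selection.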