A statistical model-based voice activity detection using multiple DNNs and noise awareness
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Hwang, I. | - |
| dc.contributor.author | Sim, J. | - |
| dc.contributor.author | Kim, S.-H. | - |
| dc.contributor.author | Song, K.-S. | - |
| dc.contributor.author | Chang, J.-H. | - |
| dc.date.accessioned | 2021-08-02T18:27:05Z | - |
| dc.date.available | 2021-08-02T18:27:05Z | - |
| dc.date.issued | 2015-01 | - |
| dc.identifier.issn | 1990-9772 | - |
| dc.identifier.issn | 2308-457X | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/25659 | - |
| dc.description.abstract | In this paper, we propose an ensemble of deep neural networks (DNNs) that uses acoustic environment classification for statistical model-based voice activity detection (VAD). Since conventional decision functions for statistical model-based VAD are based on shallow models and cannot take advantage of the diversity of the feature space distribution, we propose using multiple DNNs, separately trained on different noise conditions, as the decision function for the statistical model-based VAD. Environmental noise classification is also performed by a separate DNN, since acoustic environment classification makes it possible to achieve high detection performance in various types of noise environments by using a different algorithm according to the current noise condition. In the training stage, a number of DNNs are independently trained on different types of noise environments, and a separate DNN is trained to detect which of the environmental conditions is present. In the online stage, the environmental knowledge for each frame is used to combine the speech presence probabilities, which are derived from the ensemble of DNNs trained for the individual environments. Our approach for VAD was evaluated in terms of objective measures and showed significant improvement compared to the conventional algorithm. Copyright © 2015 ISCA. | - |
| dc.format.extent | 5 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.title | A statistical model-based voice activity detection using multiple DNNs and noise awareness | - |
| dc.type | Article | - |
| dc.identifier.scopusid | 2-s2.0-84959159917 | - |
| dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2015-January, pp 2277 - 2281 | - |
| dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
| dc.citation.volume | 2015-January | - |
| dc.citation.startPage | 2277 | - |
| dc.citation.endPage | 2281 | - |
| dc.type.docType | Conference Paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Acoustic noise | - |
| dc.subject.keywordPlus | Speech communication | - |
| dc.subject.keywordPlus | Deep neural networks | - |
| dc.subject.keywordPlus | Ensemble | - |
| dc.subject.keywordPlus | Environment classification | - |
| dc.subject.keywordPlus | Statistical modeling | - |
| dc.subject.keywordPlus | Voice activity detection | - |
| dc.subject.keywordPlus | Speech recognition | - |
| dc.subject.keywordAuthor | Acoustic environment classification | - |
| dc.subject.keywordAuthor | Deep neural network | - |
| dc.subject.keywordAuthor | Ensemble | - |
| dc.subject.keywordAuthor | Statistical model | - |
| dc.subject.keywordAuthor | Voice activity detection | - |
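
The online stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record does not state the exact combination rule, so the posterior-weighted average used here is an assumption, and the names `combine_speech_probability`, `detect_voice`, `env_classifier`, `vad_models`, and the `0.5` threshold are all hypothetical. The trained DNNs are replaced by plain callables.

```python
from typing import Callable, Sequence

Frame = Sequence[float]


def combine_speech_probability(
    env_posteriors: Sequence[float],
    speech_probs: Sequence[float],
) -> float:
    """Combine per-environment speech presence probabilities.

    ASSUMPTION: a weighted average, with the environment classifier's
    posteriors as weights. The abstract only says the probabilities are
    combined using per-frame environmental knowledge.
    """
    if len(env_posteriors) != len(speech_probs):
        raise ValueError("one speech probability is needed per environment")
    total = sum(env_posteriors)
    if total <= 0:
        raise ValueError("environment posteriors must sum to a positive value")
    weighted = sum(w * p for w, p in zip(env_posteriors, speech_probs))
    return weighted / total


def detect_voice(
    frame: Frame,
    env_classifier: Callable[[Frame], Sequence[float]],
    vad_models: Sequence[Callable[[Frame], float]],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> bool:
    """Return True if the combined speech presence probability reaches the threshold."""
    env_posteriors = env_classifier(frame)               # stand-in for the environment DNN
    speech_probs = [model(frame) for model in vad_models]  # stand-ins for per-noise DNNs
    return combine_speech_probability(env_posteriors, speech_probs) >= threshold
```

For example, with two hypothetical noise environments, `combine_speech_probability([0.8, 0.2], [0.9, 0.1])` weights the first model's output more heavily and yields a combined probability of about 0.74, which `detect_voice` would label as speech at the assumed threshold.
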
