Detailed Information

Cited 7 times in Web of Science · Cited 11 times in Scopus

Modeling Speech Emotion Recognition via Attention-Oriented Parallel CNN Encoders

Full metadata record
DC Field | Value
dc.contributor.author | Fazliddin, Makhmudov
dc.contributor.author | Kutlimuratov, Alpamis
dc.contributor.author | Akhmedov, Farkhod
dc.contributor.author | Abdallah, Mohamed S.
dc.contributor.author | Cho, Young-Im
dc.date.accessioned | 2023-01-19T00:40:19Z
dc.date.available | 2023-01-19T00:40:19Z
dc.date.created | 2023-01-18
dc.date.issued | 2022-12
dc.identifier.issn | 2079-9292
dc.identifier.uri | https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/86640
dc.description.abstract | Meticulous learning of human emotions through speech is an indispensable function of modern speech emotion recognition (SER) models. Consequently, deriving and interpreting the various crucial speech features from raw speech data is a complicated modeling task when it comes to improving performance. Therefore, in this study, we developed a novel SER model using attention-oriented parallel convolutional neural network (CNN) encoders that acquire, in parallel, the important features used for emotion classification. Specifically, MFCC, paralinguistic, and speech-spectrogram features were derived and encoded by CNN architectures designed individually for each feature, and the encoded features were fed to attention mechanisms for further representation and then classified. An empirical evaluation was conducted on the EMO-DB and IEMOCAP open datasets, and the results showed that the proposed model is more efficient than the baseline models. In particular, the weighted accuracy (WA) and unweighted accuracy (UA) of the proposed model were 71.8% and 70.9%, respectively, in the EMO-DB scenario; with the IEMOCAP dataset, the WA and UA rates were 72.4% and 71.1%.
dc.language | English
dc.language.iso | en
dc.publisher | MDPI
dc.relation.isPartOf | ELECTRONICS
dc.title | Modeling Speech Emotion Recognition via Attention-Oriented Parallel CNN Encoders
dc.type | Article
dc.type.rims | ART
dc.description.journalClass | 1
dc.identifier.wosid | 000896175000001
dc.identifier.doi | 10.3390/electronics11234047
dc.identifier.bibliographicCitation | ELECTRONICS, v.11, no.23
dc.description.isOpenAccess | Y
dc.identifier.scopusid | 2-s2.0-85143658341
dc.citation.title | ELECTRONICS
dc.citation.volume | 11
dc.citation.number | 23
dc.contributor.affiliatedAuthor | Fazliddin, Makhmudov
dc.contributor.affiliatedAuthor | Kutlimuratov, Alpamis
dc.contributor.affiliatedAuthor | Akhmedov, Farkhod
dc.contributor.affiliatedAuthor | Abdallah, Mohamed S.
dc.contributor.affiliatedAuthor | Cho, Young-Im
dc.type.docType | Article
dc.subject.keywordAuthor | speech emotion recognition
dc.subject.keywordAuthor | convolution neural network
dc.subject.keywordAuthor | attention
dc.subject.keywordAuthor | deep learning
dc.subject.keywordAuthor | modeling
dc.relation.journalResearchArea | Computer Science
dc.relation.journalResearchArea | Engineering
dc.relation.journalResearchArea | Physics
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic
dc.relation.journalWebOfScienceCategory | Physics, Applied
dc.description.journalRegisteredClass | scie
dc.description.journalRegisteredClass | scopus
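The abstract above describes parallel CNN encoders, one per feature stream (MFCC, paralinguistic, spectrogram), whose outputs are fused by an attention mechanism before classification. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the encoders, filter sizes, attention scoring, and random placeholder "features" are all illustrative assumptions (real features would come from an audio library such as librosa).

```python
import numpy as np

rng = np.random.default_rng(0)
EMB, N_CLASSES = 4, 7  # embedding size per encoder; e.g. 7 EMO-DB emotion classes

def cnn_encoder(signal, filters):
    """Toy 1-D 'CNN encoder': one conv layer (EMB filters), ReLU,
    then global average pooling -> a fixed-size embedding of shape (EMB,)."""
    convs = [np.convolve(signal, f, mode="valid") for f in filters]
    return np.array([np.maximum(c, 0.0).mean() for c in convs])

def attention_fuse(embeddings, query):
    """Soft attention over the parallel encoder outputs: score each
    embedding against a query vector, softmax the scores, and return
    the weights plus the attention-weighted sum of embeddings."""
    scores = np.array([e @ query for e in embeddings])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    fused = sum(w * e for w, e in zip(weights, embeddings))
    return weights, fused

# Random placeholders standing in for the three feature streams
# (MFCC, paralinguistic, spectrogram) -- hypothetical, for shape only.
streams = [rng.standard_normal(100) for _ in range(3)]
filter_banks = [rng.standard_normal((EMB, 5)) for _ in streams]

# Each stream gets its own individually parameterized encoder.
embeddings = [cnn_encoder(s, fb) for s, fb in zip(streams, filter_banks)]
weights, fused = attention_fuse(embeddings, query=rng.standard_normal(EMB))

# Linear classifier head with softmax over the fused representation.
W = rng.standard_normal((N_CLASSES, EMB))
logits = W @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()  # a probability distribution over emotion classes
```

The design point mirrored here is that each feature stream keeps its own encoder (parameters are not shared), and the attention weights let the classifier lean on whichever stream is most informative for a given utterance.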
Files in This Item
There are no files associated with this item.
Appears in
Collections
College of IT Convergence > Department of Computer Engineering > 1. Journal Articles

