Enhancing User Experience on Q&A Platforms: Measuring Text Similarity Based on Hybrid CNN-LSTM Model for Efficient Duplicate Question Detection (open access)
- Authors
- Faseeh, Muhammad; Khan, Murad Ali; Iqbal, Naeem; Qayyum, Faiza; Mehmood, Asif; Kim, Jungsuk
- Issue Date
- Jan-2024
- Publisher
- IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
- Keywords
- Deep learning; Semantics; Brain modeling; Task analysis; Feature extraction; Convolutional neural networks; Syntactics; Natural language processing; Question answering (information retrieval); Duplicate question identification; stack overflow; deep learning (DL); word embeddings; natural language processing (NLP); question-and-answer (QA) platforms
- Citation
- IEEE ACCESS, v.12, pp 34512 - 34526
- Pages
- 15
- Journal Title
- IEEE ACCESS
- Volume
- 12
- Start Page
- 34512
- End Page
- 34526
- URI
- https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/90911
- DOI
- 10.1109/ACCESS.2024.3358422
- ISSN
- 2169-3536
- Abstract
- This research introduces an innovative approach for identifying duplicate questions within the Stack Overflow community, a challenging task in NLP. Leveraging deep learning techniques, our proposed methodology combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to capture both local and long-term dependencies in textual data. We employ word embeddings, specifically Google's Word2Vec and GloVe, to enhance text representation. Extensive experiments on the Stack Overflow dataset demonstrate the effectiveness of our approach, achieving an impressive accuracy of 87.09% and a recall rate of 87.%. The integration of CNN and LSTM models significantly streamlines preprocessing, making it a valuable tool for detecting duplicate questions. Future directions include extending the model to multiple languages and exploring alternative word embedding techniques. Our approach presents promising applications beyond Stack Overflow, offering solutions for identifying similar questions on various QA platforms.
- Appears in Collections
- ETC > 1. Journal Articles
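
The abstract reports results as accuracy and recall for duplicate question detection. As a reading aid, the sketch below shows how these two metrics are conventionally computed for a binary classifier, assuming label 1 means "duplicate pair" and 0 means "not duplicate". The function name `accuracy_and_recall` and the sample labels are hypothetical and are not taken from the paper; the snippet illustrates the standard definitions only, not the authors' evaluation code.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) for binary labels, where 1 = duplicate pair."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # duplicates found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-duplicates kept apart
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # duplicates missed

    # Accuracy: share of all question pairs classified correctly.
    accuracy = (tp + tn) / len(pairs)
    # Recall: share of true duplicates that the model detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical labels for eight question pairs (illustrative data only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2%} recall={rec:.2%}")
```

Recall matters on a Q&A platform because each missed duplicate (a false negative) leaves a redundant question on the site, while accuracy also credits the model for correctly leaving distinct questions unlinked.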
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.