A study of corresponding words in TED talks using Word2vec
- Authors
- Lee, J.H.; Cha, K.-W.
- Issue Date
- 2020
- Publisher
- 글로벌영어교육학회
- Keywords
- Word2vec; TED Talks; word correlation; cosine similarity; big data
- Citation
- Studies in English Education, v.25, no.3, pp 241 - 269
- Pages
- 29
- Journal Title
- Studies in English Education
- Volume
- 25
- Number
- 3
- Start Page
- 241
- End Page
- 269
- URI
- https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/52189
- DOI
- 10.22275/SEE.25.3.01
- ISSN
- 1223-3451
- Abstract
- Word2vec is a widely used group of related models for generating word embeddings. These models are useful for analyzing word correlations. The purpose of this research is to establish the usefulness of Word2vec in English education. The Continuous Bag-of-Words (CBOW) model was used to produce the word vectors from which cosine similarity was determined. The materials selected for this research are TED Talks on education: of the 323 clips available in the education section, 292 were chosen. Using Wordsmith 7, the 40 most frequently used words were singled out and divided by part of speech: 10 nouns, 10 verbs, 10 adjectives, and 10 adverbs. Only content words were included; function words were excluded. Using Word2vec, 10 words from the target words were displayed in a vector chart. These correlated words all share the highest cosine similarity with their target word, and they reveal distinctive word patterns: some word pairs are collocations, while others are synonyms or antonyms. Identifying and classifying such words can help learners develop their vocabulary and learn how to use words properly in sentences.
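The ranking step described in the abstract, finding the words with the highest cosine similarity to a target word, can be illustrated with a minimal sketch. Cosine similarity between two vectors u and v is their dot product divided by the product of their lengths. The code below is not the authors' implementation: the function names `cosine_similarity` and `most_similar` and the three-dimensional `toy_embeddings` are hypothetical, invented only for illustration. In the study the vectors came from a CBOW model trained on the TED Talk transcripts, which this sketch does not reproduce.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def most_similar(target, embeddings, topn=10):
    """Rank every other word by cosine similarity to the target word."""
    target_vec = embeddings[target]
    scored = [
        (word, cosine_similarity(target_vec, vec))
        for word, vec in embeddings.items()
        if word != target
    ]
    # Highest similarity first, then keep only the top `topn` entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical toy vectors (assumption); real Word2vec embeddings
# typically have on the order of a hundred or more dimensions.
toy_embeddings = {
    "school": [0.9, 0.1, 0.3],
    "teacher": [0.8, 0.2, 0.4],
    "student": [0.7, 0.3, 0.2],
    "quickly": [0.1, 0.9, 0.1],
}

neighbours = most_similar("school", toy_embeddings, topn=2)
for word, score in neighbours:
    print(f"{word}\t{score:.3f}")
```

With `topn=10` and a vocabulary drawn from real transcripts, the same ranking would yield ten correlated words per target word, matching the list size reported in the abstract.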
- Appears in Collections
- College of Education > Department of English Education > 1. Journal Articles