Effective similarity discovery from semi-structured documents
- Authors
- Moon, H.; Kim, K.; Park, G.; Yoo, C.-W.
- Issue Date
- 2006
- Keywords
- Clustering; DTD; Ontology; Similarity detection; WordNet; XML
- Citation
- International Journal of Multimedia and Ubiquitous Engineering, v.1, no.3, pp.12 - 18
- Journal Title
- International Journal of Multimedia and Ubiquitous Engineering
- Volume
- 1
- Number
- 3
- Start Page
- 12
- End Page
- 18
- URI
- http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/19158
- ISSN
- 1975-0080
- Abstract
- Semi-structured data in XML format has spread widely with the growth of the internet. To support the storage and retrieval of huge collections of such documents, reconciling similar DTDs within a cluster and using an effective similarity function are the keys to a successful data management process. XClust introduced the WordNet ontology system to broaden its word-compatibility matching. Using the ontology extends semantic compatibility, but it also considerably increases the time taken by the semantic similarity detection process. This paper proposes a fast and effective method that offers the same ontological similarity flexibility as XClust without a large speed penalty. For practicality, we use a simple and very fast structural similarity detection method in the frequency domain, which greatly improves the performance of our similarity detection method. Our straightforward structural similarity detection method is especially fast, and gives good results, on databases that contain a large number of similar documents.
- Appears in Collections
- College of Information Technology > School of Computer Science and Engineering > 1. Journal Articles
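The abstract's idea of comparing document structure in the frequency domain can be illustrated with a minimal sketch. This record does not describe the paper's actual encoding or distance measure, so everything here is an assumption made for illustration: the hypothetical `depth_signal` encodes a document as the nesting depth of each element in document order, `dft_magnitudes` takes a plain discrete Fourier transform of that signal, and `structural_distance` compares the resulting magnitude spectra. Tag names are deliberately ignored, since in the abstract's framing semantic matching (the WordNet part) is a separate concern.

```python
import cmath
import math
import xml.etree.ElementTree as ET


def depth_signal(xml_text):
    """Encode a document as the nesting depth of each element, in document order.

    Assumption: this encoding is illustrative only, not the paper's.
    """
    root = ET.fromstring(xml_text)
    signal = []

    def walk(node, depth):
        signal.append(float(depth))
        for child in node:
            walk(child, depth + 1)

    walk(root, 1)
    return signal


def dft_magnitudes(signal, size):
    """Magnitude spectrum of the signal, zero-padded to `size` samples (naive DFT)."""
    padded = signal + [0.0] * (size - len(signal))
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * n / size)
                for n, x in enumerate(padded)))
        for k in range(size)
    ]


def structural_distance(xml_a, xml_b):
    """Normalised Euclidean distance between the two magnitude spectra."""
    a, b = depth_signal(xml_a), depth_signal(xml_b)
    size = max(len(a), len(b))
    mag_a, mag_b = dft_magnitudes(a, size), dft_magnitudes(b, size)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(mag_a, mag_b))) / size


# Two documents with the same shape but different tag names, and one with a different shape.
doc_a = "<book><title/><author><name/></author></book>"
doc_b = "<paper><heading/><writer><fullname/></writer></paper>"
doc_c = "<a><b><c><d/></c></b></a>"

same_shape = structural_distance(doc_a, doc_b)
different_shape = structural_distance(doc_a, doc_c)
```

In this sketch `doc_a` and `doc_b` produce the same depth signal, so their distance is zero, while `doc_c` nests more deeply and yields a positive distance. A practical implementation would likely use a fast Fourier transform rather than the quadratic-time sum shown here.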