TY - JOUR
T1 - Exploring topics in the field of data science by analyzing Wikipedia documents
T2 - A preliminary result
AU - Wang, Yanyan
AU - Joo, Soohyung
AU - Lu, Kun
PY - 2014
Y1 - 2014
AB - This poster explores topics in the field of Data Science by applying hierarchical clustering, principal component analysis (PCA), and topic modeling to Wikipedia documents. As a pilot study, we analyzed a subset of the Wikipedia document dataset to make an initial identification of the topics discussed in Data Science. Hierarchical clustering yielded six clusters of topics, while PCA identified eleven dimensions in the Data Science field. In addition, topic modeling based on latent Dirichlet allocation (LDA) produced fifty topics related to Data Science. We plan to further examine the hierarchical and structural relationships between topics using structural equation modeling and social network analysis. The findings of this study will help clarify which topics are currently discussed in the area of Data Science.
KW - Data science
KW - Hierarchical clustering
KW - Latent Dirichlet allocation
KW - Principal component analysis
KW - Structural equation modeling
KW - Topic modeling
KW - Wikipedia
UR - http://www.scopus.com/inward/record.url?scp=84961904366&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961904366&partnerID=8YFLogxK
U2 - 10.1002/meet.2014.14505101116
DO - 10.1002/meet.2014.14505101116
M3 - Article
AN - SCOPUS:84961904366
SN - 1550-8390
VL - 51
JO - Proceedings of the ASIST Annual Meeting
JF - Proceedings of the ASIST Annual Meeting
IS - 1
ER -