Title | Topic Detection and Classification in Consumer Web Communication Data |
Authors | Nakayama, Atsuho |
Year | 2018 |
Volume | Archives of Data Science, Series A 5(1) / 2018 |
Abstract | In this study, we examined temporal variation in topics regarding new products by classifying words into clusters based on their co-occurrence in Twitter entries. Analysis of consumer tweet data has received much attention as a way to identify market trends. We collected Twitter entries about new products based on their specific expressions of sentiment or interest. The matrix obtained from the Twitter entries is sparse and high-dimensional, so dimensionality reduction is needed. We therefore analyzed the matrix using non-negative matrix factorization. We also clarified temporal variation by using the weight coefficients, which show the strength of association between entries and topics. When detecting trending topics by clustering words based on their co-occurrence, it is important to consider how these topics vary over time. |
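
The pipeline described in the abstract (entry-by-word matrix, non-negative matrix factorization, per-period topic weights) might be sketched as follows. This is an illustration only, not the author's implementation. The abstract does not say which NMF algorithm was used, so the sketch assumes Lee-Seung multiplicative updates for the squared-error objective. The vocabulary, counts, period labels, and function names are all made up for the example.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative v (entries x words) as w @ h, where
    w is entries x topics and h is topics x words.
    Assumed algorithm: multiplicative updates (squared-error objective)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

def topic_weight_by_period(entries, w):
    """Sum each topic's weight coefficients over the entries of each period."""
    totals = {}
    for (period, _), row in zip(entries, w):
        acc = totals.setdefault(period, [0.0] * len(row))
        for k, val in enumerate(row):
            acc[k] += val
    return totals

# Hypothetical toy data: (period label, word counts over `vocab`).
vocab = ["taste", "sweet", "price", "cheap"]
entries = [
    ("week1", [2, 1, 0, 0]),
    ("week1", [1, 2, 0, 0]),
    ("week2", [0, 0, 2, 1]),
    ("week2", [0, 0, 1, 2]),
]
v = [counts for _, counts in entries]
w, h = nmf(v, rank=2)
trend = topic_weight_by_period(entries, w)
for period in sorted(trend):
    print(period, [round(x, 3) for x in trend[period]])
```

In this sketch, each row of `h` is a topic expressed as weights over words, and each row of `w` holds one entry's weight coefficients over topics. Aggregating `w` by period is one simple way to inspect how topic strength changes over time. The study may aggregate or normalize differently.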