Topic Extraction in Social Networks
Keywords: Twitter, topic extraction, big data, data crawling, social networks
The number of Twitter users keeps increasing, and with it the volume of data they produce. Using this big data to analyze user behavior has become a very active research field. This paper addresses two key challenges: collecting data from Twitter and extracting topics from user tweets. The proposed approach uses data crawling to collect data from Twitter and a set of natural language processing techniques to extract information from the collected data and build a dataset. We then apply K-means clustering and Latent Dirichlet Allocation, the two methods most commonly used in the literature, to extract the prevalent topics from this dataset. Our approach is generic and can be reused by scientists to annotate any text collection.
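To make the clustering step concrete, the following is a minimal sketch of tweet preprocessing followed by K-means, written with the Python standard library only. It is not the implementation used in this work: the sample tweets, the stop-word list, the function names (`tokenize`, `vectorize`, `kmeans`, `top_terms`), the farthest-first seeding, and the choice of `k=2` are all assumptions made for illustration. Data crawling and Latent Dirichlet Allocation are not shown.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real pipeline would use a fuller one).
STOP_WORDS = {"the", "and", "for", "are", "was", "this", "that"}


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, keep alphabetic tokens longer than 2 characters."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return [t for t in re.findall(r"[a-z]+", text) if len(t) > 2 and t not in STOP_WORDS]


def vectorize(token_lists):
    """Build L2-normalised term-frequency vectors over a shared, sorted vocabulary."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for tokens in token_lists:
        vec = [0.0] * len(vocab)
        for term, count in Counter(tokens).items():
            vec[index[term]] = float(count)
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=50):
    """Lloyd's algorithm with deterministic farthest-first seeding (seeding is an assumption)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=3):
    """Return the n highest-weighted terms of a centroid as a rough topic label."""
    ranked = sorted(zip(centroid, vocab), key=lambda p: (-p[0], p[1]))
    return [term for weight, term in ranked[:n] if weight > 0]


# Hypothetical sample tweets, standing in for crawled data.
tweets = [
    "Great football match tonight, what a goal! https://example.com/x",
    "@fan The football team scored a late goal",
    "Football season: goal after goal for the team",
    "New phone released with a better camera",
    "The phone camera is amazing, battery too",
    "Comparing phone battery and camera quality",
]

vocab, vectors = vectorize([tokenize(t) for t in tweets])
labels, centroids = kmeans(vectors, k=2)
for j, centroid in enumerate(centroids):
    print(j, top_terms(vocab, centroid))
```

In this sketch the highest-weighted terms of each centroid serve as a rough topic description for the cluster. On a real corpus one would more likely rely on an established library and a TF-IDF weighting rather than raw term frequencies.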