A Novel Graph Based Clustering Approach to Document Topic Modeling


Clustering is the task of assigning a set of objects to groups so that objects within the same cluster are more similar to each other than to those in other clusters, according to some similarity measure. Clustering documents by research topic is an important task in text mining. In this field, cluster analysis groups a set of documents so that documents in the same cluster share a similar topic and documents in different clusters have different topics. We propose a novel graph-based clustering method that assigns each document an importance factor, computed with a mathematical formulation intended to improve on well-known classical methods. The document with the maximum importance factor in a cluster is taken as the centroid of that cluster. A publicly available synthetic dataset is used to evaluate the performance of the proposed algorithm, and the method is compared with several traditional graph-based methods to demonstrate its accuracy.
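
The centroid-selection idea described above can be illustrated with a minimal sketch. The abstract does not define the importance factor, so the code below makes loud assumptions: documents are represented as raw term-frequency vectors, edge weights are cosine similarities, and the importance factor is approximated by the weighted degree of a document within its cluster. The function names (`similarity_graph`, `importance`, `centroid`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector of a whitespace-tokenized document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_graph(docs):
    """Dense adjacency matrix: edge weight = cosine similarity, no self-loops."""
    vecs = [tf_vector(d) for d in docs]
    n = len(vecs)
    return [
        [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]


def importance(graph, members):
    """ASSUMPTION: importance factor = weighted degree inside the cluster.

    This is a stand-in; the paper's actual formula is not given in the abstract.
    """
    return {i: sum(graph[i][j] for j in members if j != i) for i in members}


def centroid(graph, members):
    """Index of the document with the maximum importance factor in the cluster."""
    scores = importance(graph, members)
    return max(scores, key=scores.get)
```

Given a list of document strings `docs` and a cluster expressed as a list of document indices, `centroid(similarity_graph(docs), cluster)` returns the index of the document most strongly connected to the rest of the cluster under these assumptions. Any other importance measure could be substituted in `importance` without changing the centroid rule.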

In International Conference on Computing, Communication and Networking Technologies, IEEE Xplore.