Multi-attribute identity resolution for online social network

Shalini Yadav, Adwitiya Sinha, Pawan Kumar
2019 SN Applied Sciences  
Social media usage has grown enormously with the explosion of online communication. A defining feature of social media is the virality of information, which, when generated from redundant and false user identities, may cause chaos across online social communities. Duplicate user profiles are common in social networks, arising from unintentional faults or intentional deception. Such fraudulent motives may involve cyberbullying, fake boosting, fake ...g, cyberstalking, etc. Hence, revealing malicious users who hold multiple profiles in social networks has become significant for downstream analysis and for ensuring cyber safety. Identity resolution is considered one of the pivotal techniques for revealing redundant identities, that is, profiles found to be co-referent to the same real-world user. As per social media theory, automated identity resolution proceeds in stages, namely identity searching, identity linking and identity merging. Our research focuses on developing a novel identity resolution framework for detecting redundant user profiles on Twitter, with networks ranging from small-scale to massive-scale. Our proposed solution extracts Twitter user profiles with various attributes, for instance, first name, last name, username, user id, tag line, location, language, profile URL and tweets. We developed various algorithms for matching and merging redundant user profiles using the Jaro-Winkler similarity technique. Similar profiles are linked so that they no longer have a separate impact on the originally constructed social network. Our experimental outcomes illustrate the iteration-wise reduction of irrelevant and redundant profiles for the close community of a randomly selected user, which is then extended to the entire user-based and trend-based Twitter network. The results of our approach were compared with existing counterparts and were found to achieve higher accuracy in detecting redundant identities. Our approach could assist viral marketing, terrorist screening, and social media trending.
doi:10.1007/s42452-019-1701-z fatcat:ftjqtwiczzgxfgtbfa6tyoc5pu
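
To make the matching step in the abstract concrete, the following is a minimal sketch of Jaro-Winkler similarity applied to profile attributes. It is not the authors' implementation: the abstract does not say how per-attribute scores are combined, so the unweighted mean in `profile_similarity`, the field names, and the default prefix scale of 0.1 with a four-character prefix cap are all assumptions made for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def profile_similarity(p1: dict, p2: dict, fields: list) -> float:
    """Hypothetical combiner: unweighted mean of per-field scores.

    The field list and the averaging rule are assumptions, not taken
    from the paper.
    """
    scores = [
        jaro_winkler(str(p1.get(f, "")).lower(), str(p2.get(f, "")).lower())
        for f in fields
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, `jaro_winkler("MARTHA", "MARHTA")` evaluates to roughly 0.961, the standard textbook value. A pair of profiles could then be flagged as candidate duplicates when `profile_similarity` exceeds a chosen threshold; the threshold would need tuning against labelled data.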