An automated privacy information detection approach for protecting individual online social network users

Weihua LI, Jiaqi WU, Quan BAI
2019
A massive number of private messages are posted unconsciously by online social network users every day, and some users may face undesirable consequences as a result. Many studies have therefore been dedicated to privacy leakage analysis. However, very few studies detect privacy disclosure for individual users. With this motivation, this paper proposes an automated privacy information detection approach to effectively detect and prevent privacy leakage for individual users. Based on the experimental results and case studies, the proposed model achieves considerable performance.

Online social networks (OSNs) have become ubiquitous in people's activities. The popularization of OSNs has turned out to be a double-edged sword. On the one hand, OSNs make it convenient for people to communicate, collaborate, and share information. On the other hand, they also come with serious privacy issues. Because users pay little attention to it, a massive amount of private information can be accessed publicly through OSNs. Users may expose themselves to a wide range of "observers", which include not only relatives and close friends, but also strangers and even stalkers. This raises a serious cybersecurity issue, namely online privacy leakage. Online privacy leakage means that an individual user shares private information with people whom he/she does not know well, or even with strangers on the Internet. This can be very dangerous for general Internet users, especially given the rapid growth of OSNs.

A tool is needed to help general users make better use of OSNs and to protect them from leaking private information [Wang 11] [Hasan 13]. Hence, it is essential to detect privacy leakage in OSNs and to remind individual users before they post any privacy-related message. With this motivation, in this paper we propose a novel privacy detection framework for individual users of OSNs based on a Deep Learning approach. Twitter is used as the source of data for training and validating the proposed framework, since it is the largest microblogging platform in the world [Mao 11]. Based on the generic definition of privacy and the characteristics of OSNs, "individual privacy" in OSNs is formally defined. Furthermore, a deep learning-based approach has been developed and used to extract privacy-related entities from the messages posted by users.

The rest of the paper is organized as follows.
Section 2 reviews existing research on data leaks in OSNs. Section 3 introduces the automated privacy information detection framework. Section 4 presents two experiments that evaluate the proposed framework using a real-world dataset collected from Twitter. Section 5 concludes the study and discusses its limitations and future work.
doi:10.11517/pjsai.jsai2019.0_3h3e305 fatcat:jlfsd5xg4jda7oaiojya52ur7e