Privacy preserving sentiment analysis on multiple edge data streams with Apache NiFi

Abhinay Pandya, Panos Kostakos, Hassan Mahmood, Marta Cortes, Ekaterina Gilman, Mourad Oussalah, Susanna Pirttikangas
2019 Zenodo  
Sentiment analysis, also known as opinion mining, plays an important role in both private- and public-sector Business Intelligence (BI), where it is used to improve public and customer experience. Nevertheless, de-identified sentiment scores from public social media posts can compromise individual privacy because they are vulnerable to record linkage attacks. Established privacy-preserving methods like k-anonymity, l-diversity and t-closeness are offline models designed exclusively for data at rest. Recently, a number of online anonymization algorithms (CASTLE, SKY, SWAF) have been proposed to meet the functional requirements of streaming applications, but without open-source implementations. In this paper, we present a reusable Apache NiFi dataflow that buffers tweets from multiple edge devices and performs anonymized sentiment analysis in real time, using randomization. The solution can be easily adapted to suit different scenarios, enabling researchers to deploy custom anonymization algorithms.
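The buffering-plus-randomization idea can be illustrated with a minimal Python sketch. It is not the authors' NiFi dataflow and does not implement CASTLE, SKY, or SWAF. The names `ScoredTweet` and `RandomizingBuffer`, the score range of [-1, 1], the fixed buffer size, and the choice of uniform additive noise are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ScoredTweet:
    """A tweet that already carries a sentiment score (hypothetical record)."""
    device_id: str    # label of the edge device that produced it (assumed)
    user_id: str      # identifier that must not leave the buffer
    sentiment: float  # assumed to lie in [-1.0, 1.0]


class RandomizingBuffer:
    """Collects tweets from several streams and releases perturbed scores.

    Assumption: randomization here means shuffling the batch and adding
    uniform noise to each score. The paper may use a different scheme.
    """

    def __init__(self, size, noise_scale, seed=None):
        self.size = size
        self.noise_scale = noise_scale
        self.rng = random.Random(seed)
        self.pending = []

    def add(self, tweet):
        """Buffer one tweet; return a released batch once the buffer is full."""
        self.pending.append(tweet)
        if len(self.pending) < self.size:
            return []
        batch, self.pending = self.pending, []
        # Shuffle so that release order does not reveal arrival order.
        self.rng.shuffle(batch)
        released = []
        for t in batch:
            noise = self.rng.uniform(-self.noise_scale, self.noise_scale)
            # Clamp to the assumed score range; identifiers are dropped.
            released.append(max(-1.0, min(1.0, t.sentiment + noise)))
        return released


if __name__ == "__main__":
    buf = RandomizingBuffer(size=3, noise_scale=0.2, seed=42)
    stream = [
        ScoredTweet("edge-a", "user-1", 0.8),
        ScoredTweet("edge-b", "user-2", -0.4),
        ScoredTweet("edge-a", "user-3", 0.1),
    ]
    for tweet in stream:
        out = buf.add(tweet)
        if out:
            print(out)
```

Only perturbed scores leave the buffer, so a downstream consumer never sees user or device identifiers. A custom anonymization algorithm could replace the body of the release step without changing how tweets are buffered.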
doi:10.5281/zenodo.4298915 fatcat:jq4utsrcwzgtjiu2iqe36fymy4