Lifelong aspect extraction from big data: knowledge engineering

M. Taimoor Khan, Mehr Durrani, Shehzad Khalid, Furqan Aziz
2016 Complex Adaptive Systems Modeling  
Probabilistic topic models perform statistical evaluations on words co-occurrence to extract popular words and group them in topics. A topic can be considered as a concept represented through its top words. In aspect based sentiment analysis (ABSA), topics are used to represent product aspects or sentiment category. Due to the amount of content produced online, there is rich information available that can be processed for decision making. It can help to identify public trends, e.g., popular
more » ... ucts and their features. In social sciences it is used to aggregate opinions of a group of people as opinion mining or sentiment analysis. Its sub-domains are hazard analysis, threat analysis, bias analysis, etc. Government bodies can use it to make policies that address the concern of majority of the people and can even plan their speeches and official statements accordingly. Sentiment analysis in Khan et al. (2015) stress on user centered health care facilities. The importance of ABSA can be realized from the fact that its practical applications are available while it is far from mature. Sentiment analysis aggregated at aspect level is more informative than product level analysis, as shown in Fig. 1 . Abstract Traditional machine learning techniques follow a single shot learning approach. It includes all supervised, semi-supervised, transfer learning, hybrid and unsupervised techniques having a single target domain known prior to analysis. Learning from one task is not carried to the next task, therefore, they cannot scale up to big data having many unknown domains. Lifelong learning models are tailored for big data having a knowledge module that is maintained automatically. The knowledge-base grows with experience where knowledge from previous tasks helps in current task. This paper surveys topic models leading the discussion to knowledge-based topic models and lifelong learning models. The issues and challenges in learning knowledge, its abstraction, retention and transfer are elaborated. The state-of-the art models store word pairs as knowledge having positive or negative co-relations called must-links and cannotlinks. The need for innovative ideas from other research fields is stressed to learn more varieties of knowledge to improve accuracy and reveal more semantic structures from within the data.
doi:10.1186/s40294-016-0018-7 fatcat:bs5qo3f3lnfb7bwarbvfxxdr2m