Towards semantic methodologies for automatic regulatory compliance support

Krishna Sapkota, Arantza Aldea, David A. Duce, Muhammad Younas, René Bañares-Alcántara
2011 Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management - PIKM '11  
Abstract

Regulatory Compliance Management (RCM) is a management process that an organisation implements to conform to regulatory guidelines. Two processes that contribute towards automating RCM are (i) extraction of meaningful entities from the regulatory text and (ii) mapping of regulatory guidelines to organisational processes. These processes help keep RCM up to date with changes in regulatory guidelines. The update process is still manual, since there has been comparatively little research in this direction. Semantic Web technologies are potential candidates for automating the update process. There are stand-alone frameworks that use Semantic Web technologies such as Information Extraction, Ontology Population, Similarities and Ontology Mapping. However, the integration of these approaches in semantic compliance management has not yet been explored. Considering these two processes as crucial constituents, the aim of this thesis is to automate the processes of RCM. It proposes a framework called RegCMantic, which is designed and developed in two main phases.

The first part of the framework extracts the regulatory entities from regulatory guidelines. Extracting meaningful entities from the guidelines helps in relating them to organisational processes. The framework identifies the document-components and extracts the entities from them using four components: (i) parser, (ii) definition terms, (iii) ontological concepts and (iv) rules. The parser breaks a sentence down into useful segments. The extraction is then carried out by applying the definition terms, ontological concepts and rules to the segments. The entities extracted are the core-entities (subject, action and obligation) and the aux-entities (time, place, purpose, procedure and condition).

The second part of the framework relates the regulatory guidelines to organisational processes. The framework uses a mapping algorithm that considers three types of entities in the regulatory-domain and two types of entities in the process-domain. In the regulatory-domain, the considered entities are regulation-topic, core-entities and aux-entities, whereas in the process-domain, the considered entities are subject and action.
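
To make the entity types above concrete, the following is a minimal Python sketch of rule-based extraction from a single guideline sentence. It is not the RegCMantic implementation: the abstract does not give the parser, definition terms, ontology or rule set, so the obligation markers, the regular expressions and the names `RegulatoryEntities`, `OBLIGATION_WORDS`, `AUX_RULES` and `extract_entities` are all assumptions made for illustration only.

```python
import re
from dataclasses import dataclass, field

# Hypothetical obligation markers (assumption; longest phrases listed first
# so that "must not" is preferred over "must").
OBLIGATION_WORDS = ("must not", "shall not", "should not", "must", "shall", "should")

# Hypothetical toy rules for a few aux-entities (assumption, not from the thesis).
AUX_RULES = [
    ("time", r"\b(?:within|before|after)\s+[^,;]+"),
    ("purpose", r"\bin order to\s+[^,;]+"),
    ("condition", r"\bif\s+[^,;]+"),
]


@dataclass
class RegulatoryEntities:
    # Core-entities named in the abstract.
    subject: str = ""
    obligation: str = ""
    action: str = ""
    # Aux-entities (time, purpose, condition, ...) keyed by type.
    aux: dict = field(default_factory=dict)


def extract_entities(sentence: str) -> RegulatoryEntities:
    """Split a sentence around its obligation marker and pull out aux phrases."""
    pattern = r"\b(" + "|".join(OBLIGATION_WORDS) + r")\b"
    match = re.search(pattern, sentence, flags=re.IGNORECASE)
    if match is None:
        # No obligation marker found: return an empty result.
        return RegulatoryEntities()

    subject = sentence[:match.start()].strip()
    action = sentence[match.end():].strip().rstrip(".")

    aux = {}
    for name, rule in AUX_RULES:
        found = re.search(rule, action, flags=re.IGNORECASE)
        if found:
            aux[name] = found.group(0).strip()
            # Remove the aux phrase so that only the action remains.
            action = (action[:found.start()] + action[found.end():]).strip()

    return RegulatoryEntities(
        subject=subject,
        obligation=match.group(1).lower(),
        action=action,
        aux=aux,
    )


example = extract_entities("The manufacturer must record each batch within 30 days.")
print(example)
```

A real system would replace the regular expressions with a syntactic parser and ontology lookups, as the abstract describes; the sketch only shows how core-entities and aux-entities can be held in one record per guideline sentence.
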
Using these entities, it computes an aggregate of three types of similarity scores: topic-score, core-score and aux-score. The aggregate similarity score determines whether a regulatory guideline is related to an organisational process.

The RegCMantic framework is validated through the development of a prototype system. The prototype implements a case study involving regulatory guidelines that govern the pharmaceutical industry in the UK. The evaluation of the case-study results has shown improved accuracy in extracting the regulatory entities and in relating regulatory guidelines to organisational processes. This research contributes to extracting meaningful entities from regulatory guidelines that are provided as unstructured text, and to semantically mapping the regulatory guidelines to organisational processes.

Acknowledgement

My sincere gratitude goes to my first supervisor, Dr Arantza Aldea, for her inspiration and motivation, without which I could not have started this PhD. I am very much indebted and thankful to her for her continuous help, support, supervision, inspiration and motivation during this thesis. Likewise, I would like to thank my second supervisor, Dr Muhammad Younas, for his continuous help, support, supervision and motivation. Despite their busy schedules, Arantza and Younas gave me valuable feedback regularly (mostly weekly), which kept me motivated throughout this thesis.
doi:10.1145/2065003.2065021 dblp:conf/cikm/SapkotaADYB11a fatcat:kdx3cvk6xrhjpfj4ae4n6hsfey