Enhanced PROBCONS for Multiple Sequence Alignment in Cloud Computing

Eman M. Mohamed, Faculty of Computers and Information, Menoufia University, Egypt, Hamdy M. Mousa, Arabi E. keshk
2019 International Journal of Information Technology and Computer Science  
Multiple protein sequence alignment (MPSA) intend to realize the similarity between multiple protein sequences and increasing accuracy. MPSA turns into a critical bottleneck for large scale protein sequence data sets. It is vital for existing MPSA tools to be kept running in a parallelized design. Joining MPSA tools with cloud computing will improve the speed and accuracy in case of large scale data sets. PROBCONS is probabilistic consistency for progressive MPSA based on hidden Markov models.
more » ... ROBCONS is an MPSA tool that achieves the maximum expected accuracy, but it has a time-consuming problem. In this paper firstly, the proposed approach is to cluster the large multiple protein sequences into structurally similar protein sequences. This classification is done based on secondary structure, LCS, and amino acids features. Then PROBCONS MPSA tool will be performed in parallel to clusters. The last step is to merge the final PROBCONS of clusters. The proposed algorithm is in the Amazon Elastic Cloud (EC2). The proposed algorithm achieved the highest alignment accuracy. Feature classification understands protein sequence, structure and function, and all these features affect accuracy strongly and reduce the running time of searching to produce the final alignment result.
doi:10.5815/ijitcs.2019.09.05 fatcat:diipeg7yyrctnolfvxo6xo6wwq