Discovery of DNA Motif Utilising an Integrated Strategy Based on Random Projection and Particle Swarm Optimization
Mathematical Problems in Engineering
During gene expression and regulation, the genetic information in DNA is transferred to protein by means of transcription. Recognizing transcription factor binding sites helps to reveal the evolutionary relations among different sequences. Thus, the recognition of transcription factor binding sites, i.e., motif recognition, plays an important role in understanding the biological functions or meanings of sequences. However, when the established search
... ce processes much noise subsequences, many optimization algorithms tend to become trapped in local optima. To solve this problem, a particle swarm optimization and random projection-based algorithm (PSORPS) is proposed for recognizing DNA motifs. First, a random projection strategy is employed to filter out noise subsequences in order to construct the objective space. In addition, the sequence segments that occur in the majority of the DNA sequences are obtained and used to initialize the PSO population. Then, the motifs of the DNA sequences are searched automatically by a purpose-designed PSO algorithm in the constructed l-mer objective space. Finally, to alleviate base deviation and further improve recognition accuracy, two operators, associated drift and independent drift, are applied to the optimization results obtained by PSO. Experiments are conducted on real-world biological datasets, and the results verify the effectiveness of the proposed algorithm.
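The filtering step described above can be illustrated with a small sketch. The abstract does not give the details of the projection, so the code below follows the general random projection idea for motif finding: every l-mer is hashed by the letters at k randomly chosen positions, and only buckets whose l-mers come from a large enough share of the input sequences are kept. The function names, the parameters `k`, `seed`, and `min_fraction`, and the toy sequences are all illustrative assumptions and are not taken from PSORPS itself.

```python
import random
from collections import defaultdict


def random_projection_buckets(sequences, l, k, seed=0):
    """Group every l-mer by the letters at k randomly chosen positions.

    Hypothetical helper: returns the chosen positions and a mapping from
    projected key to a list of (sequence index, start offset) pairs.
    """
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(l), k))  # k distinct positions in [0, l)
    buckets = defaultdict(list)
    for seq_idx, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            lmer = seq[start:start + l]
            key = "".join(lmer[p] for p in positions)
            buckets[key].append((seq_idx, start))
    return positions, buckets


def filter_buckets(buckets, n_sequences, min_fraction=0.5):
    """Keep buckets whose l-mers cover at least min_fraction of the sequences.

    The threshold is an assumption made for this sketch; buckets below it
    are treated as noise and dropped.
    """
    kept = {}
    for key, hits in buckets.items():
        covered = {seq_idx for seq_idx, _ in hits}
        if len(covered) >= min_fraction * n_sequences:
            kept[key] = hits
    return kept


if __name__ == "__main__":
    # Toy input: the 6-mer "ACGTAC" is planted in all three sequences.
    seqs = ["ACGTACGTAA", "TTACGTACGG", "GGGACGTACC"]
    positions, buckets = random_projection_buckets(seqs, l=6, k=4)
    kept = filter_buckets(buckets, n_sequences=len(seqs), min_fraction=1.0)
    for key, hits in kept.items():
        print(key, hits)
```

In a pipeline of the kind the abstract outlines, the surviving buckets would serve as the reduced objective space, and the l-mers in well-covered buckets would be natural candidates for seeding the PSO population. How PSORPS actually scores and selects them is not stated here.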