Recognition of splice junctions on DNA sequences by BRAIN learning algorithm

S. Rampone
1998 Bioinformatics  
Motivation: The problem addressed in this paper is the prediction of splice site locations in human DNA. The aims of the proposed approach are explicit splicing rule description, high recognition quality, and robust and stable 'one shot' data processing. Results: These results are achieved by means of a new learning algorithm [BRAIN (Batch Relevance-based Artificial INtelligence)], described in the paper, inferring Boolean formulae from examples, and by considering the splicing rules as disjunctive normal form (DNF) formulae. The formula terms are computed in an iterative way, by identifying from the training set a relevance coefficient for each attribute. The classification is then refined by a neural network and combined with a discriminant analysis procedure. This splice site recognition method shows low error rates (0.0002 and 0.0003) and high correlation coefficient measures (0.83 and 0.81) for donor and acceptor sites, respectively; better than other methods. Availability: The BRAIN package (Borland Turbo Pascal for Windows) is available on the EMBL file server.
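The idea of building DNF terms iteratively from a per-attribute relevance score can be illustrated with a small sketch. The abstract does not define the relevance coefficient or the exact iteration, so everything below is an assumption made for illustration and should not be read as the BRAIN package itself: the one-hot encoding, the function names (`encode`, `learn_dnf`, `predict`), and the scoring rule (each positive/negative pair splits one unit of weight equally among the literals that tell the two apart) are all hypothetical choices. The neural network refinement and discriminant analysis steps are not covered.

```python
from itertools import product

# Hypothetical one-hot encoding of nucleotides (an assumption, not from the paper).
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0), "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}


def encode(seq):
    """Turn a DNA string into a flat tuple of 0/1 attributes."""
    return tuple(bit for base in seq for bit in ONE_HOT[base])


def satisfies(example, term):
    """A term is a dict {attribute index: required bit}; all must match."""
    return all(example[i] == bit for i, bit in term.items())


def predict(formula, example):
    """A DNF formula is a list of terms; it fires if any term is satisfied."""
    return any(satisfies(example, term) for term in formula)


def learn_dnf(positives, negatives):
    """Greedy DNF learner driven by an assumed relevance score."""
    uncovered = list(positives)
    formula = []
    while uncovered:
        term, pos, neg = {}, list(uncovered), list(negatives)
        # Add literals until the term rejects every negative example.
        while neg:
            relevance = {}
            for p, n in product(pos, neg):
                diff = [(i, p[i]) for i in range(len(p)) if p[i] != n[i]]
                if not diff:
                    raise ValueError("an example is labeled both positive and negative")
                for literal in diff:
                    relevance[literal] = relevance.get(literal, 0.0) + 1.0 / len(diff)
            # Sorting first makes tie-breaking deterministic.
            index, bit = max(sorted(relevance), key=relevance.get)
            term[index] = bit
            pos = [p for p in pos if p[index] == bit]
            neg = [n for n in neg if n[index] == bit]
        formula.append(term)
        uncovered = [p for p in uncovered if not satisfies(p, term)]
    return formula
```

Given encoded training windows, `learn_dnf(positives, negatives)` returns a formula that accepts every training positive and rejects every training negative, and each term can be read back as an explicit rule over sequence positions. How well such rules generalize depends entirely on the data and the scoring rule, neither of which is taken from the paper.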
doi:10.1093/bioinformatics/14.8.676 pmid:9789093 fatcat:7d35uyjtpbdr5aes5cqvuhvfha