Graphical Representation [chapter]

Encyclopedia of Public Health  
We develop a method for finding protein-coding genes based on a 3D graphical representation of the DNA sequence. The method is simple and robust. We illustrate it on the yeast genome; it may be extended to find genes in prokaryotic genomes or in eukaryotic genomes with fewer introns. Three-fold cross-validation tests demonstrate that the accuracy of the algorithm is better than 96%. On this basis, we estimate that the yeast genome contains between 5891 and 5920 protein-coding genes. The ORFs annotated in the MIPS database that the present algorithm recognizes as non-coding are listed in detail in this paper.
doi:10.1007/978-1-4020-5614-7_1298 fatcat:qrfi56usfvgwvjrjmxx76mypii