Gordon Life Science Institute and Its Impacts on Computational Biology and Drug Development
Gordon Life Science Institute is the first Internet Research Institute ever established in the world. It is a non-profit institute. Scientists who truly dedicate themselves to science and love science more than anything else can become its members. In this friendly, open-door Institute, they can devote their maximum time and energy to scientific creativity. They also believe that science would be more truthful and wonderful if scientists did not have to spend a lot of
time on funding applications, and that great scientific findings and creations in history were often made by those who were the least supported or funded but were driven by imagination and curiosity. This review article recollects the Institute's establishment and development, as well as its philosophy and accomplishments. In particular, its products and by-products cover the following five very hot topics in bioinformatics and drug development: 1) PseAAC and PseKNC; 2) Distorted key theory; 3) Wenxiang diagram; 4) Multi-label system prediction; 5) 5-steps rule. Their impacts on proteomics and genomics, as well as on drug development, are substantial and awesome.

Natural Science

ACCOMPLISHMENTS

With the explosive growth of biological sequences in the post-genomic era, one of the most challenging problems in computational biology is how to express a biological sequence as a discrete model or a vector while still keeping considerable sequence-order information or key pattern characteristics. This is because all the existing machine-learning algorithms (such as the "Optimization" algorithm, the "Covariance Discriminant" or "CD" algorithm [2, 3], the "Nearest Neighbor" or "NN" algorithm, and the "Support Vector Machine" or "SVM" algorithm [4, 5]) can only handle vectors, as elaborated in a comprehensive review. However, a vector defined in a discrete model may completely lose all the sequence-pattern information. To avoid completely losing the sequence-pattern information for proteins, the pseudo amino acid composition, or PseAAC, was proposed. Ever since then, it has been widely used in nearly all the areas of computational proteomics [3, 9-61, 58-60, 62-272].
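
To make the contrast between a plain composition vector and a PseAAC-style vector concrete, here is a minimal Python sketch. It is not taken from this article, and several choices in it are assumptions made for illustration only: a single physicochemical property (the Kyte-Doolittle hydropathy scale) stands in for the several properties a full PseAAC would use, the weight `w = 0.05` and the number of correlation tiers `lam = 3` are arbitrary defaults, and the function names `aac` and `pse_aac` are hypothetical.

```python
# Illustrative sketch only: a simplified PseAAC-style encoding that uses one
# property (Kyte-Doolittle hydropathy) instead of several. The names aac and
# pse_aac and the defaults lam=3, w=0.05 are assumptions for this example.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index for the 20 native amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Standardize the property to zero mean and unit standard deviation
# over the 20 amino acids before computing correlations.
_values = list(KD.values())
_mu = sum(_values) / len(_values)
_sd = (sum((v - _mu) ** 2 for v in _values) / len(_values)) ** 0.5
H = {a: (v - _mu) / _sd for a, v in KD.items()}


def aac(seq):
    """Plain amino acid composition: 20 frequencies, no order information."""
    seq = seq.upper()
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def pse_aac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional vector: composition plus lam
    sequence-order correlation factors, jointly normalized to sum to 1."""
    seq = seq.upper()
    if any(a not in H for a in seq):
        raise ValueError("sequence contains a non-standard residue")
    n = len(seq)
    if not 0 <= lam < n:
        raise ValueError("lam must satisfy 0 <= lam < len(seq)")

    freqs = aac(seq)
    # k-th tier factor: mean squared property difference between
    # residues that are k positions apart along the chain.
    thetas = [
        sum((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    s1, s2 = "AILVKRDE", "AKIRLDVE"    # same residues, different order
    print(aac(s1) == aac(s2))          # composition cannot tell them apart
    print(pse_aac(s1) == pse_aac(s2))  # the extra components can
```

The two toy peptides contain exactly the same residues, so their composition vectors are identical, while the additional correlation components differ because they depend on which residues sit near each other. Note that correlation factors defined this way do not change when a sequence is reversed, so they retain only part of the order information.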