High-Throughput Analysis [entry]

2004 Encyclopedic Dictionary of Genetics, Genomics and Proteomics   unpublished
Modern biology is experiencing a rapid increase in data volumes that challenges our analytical skills and existing cyberinfrastructure. The exponential expansion of the Protein Sequence Universe (PSU), a protein sequence space, together with the complexities of manual creation, creates a major bottleneck in biomedical research, which requires a fusion of novel analytical approaches and computational means. A comprehensive visualization tool can be instrumental in meeting the need for functional annotation. Existing resources lack scalable visualization tools to study the structure of the PSU. Here, we describe a multi-dimensional scaling (MDS) implementation to create a 3D embedding of the PSU. Applying the method to the prokaryotic PSU shows that MDS is capable of preserving important grouping structure, such as the relative proximity of functionally similar clusters and a clear structural separation between clusters with specific and general functions. We also discuss the merits of the method, including its scalable implementation and its role as a protein annotation tool that could help alleviate a major bottleneck in modern biology. In conclusion, we emphasize the need for a transdisciplinary approach to quickly and efficiently translate the influx of new data into tangible innovations and long-awaited treatments.
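To illustrate the kind of 3D embedding the abstract refers to, here is a minimal sketch of classical (Torgerson) MDS in Python with NumPy. The abstract does not say which MDS variant or distance measure the authors used, so the choice of classical MDS, the function name `classical_mds`, and the random toy points standing in for sequence dissimilarities are all assumptions made for illustration only.

```python
import numpy as np


def classical_mds(dist, n_components=3):
    """Embed items in n_components dimensions from a pairwise distance matrix.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order, so pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues that arise from rounding or non-Euclidean input.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy data (assumption): six random 3D points stand in for sequence dissimilarities.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 3))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(dist, n_components=3)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Because the toy distances come from points that already live in three dimensions, the pairwise distances between the embedded coordinates should match the input up to rounding. Real sequence dissimilarities are generally not Euclidean, so an embedding of actual protein data would only approximate them, and the dense eigendecomposition used here would not scale to a full sequence space without further engineering.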
doi:10.1002/0471684228.egp05698 fatcat:kxirlo4iijgqdgnphpz6w4b4xu