CD-HIT Suite: a web server for clustering and comparing biological sequences

Ying Huang, Beifang Niu, Ying Gao, Limin Fu, Weizhong Li
2010 Computer applications in the biosciences : CABIOS  
CD-HIT is a widely used program for clustering and comparing large biological sequence datasets. To further assist CD-HIT users, we significantly improved the program with more functions and better accuracy, scalability and flexibility. Most importantly, we developed a new web server, CD-HIT Suite, for clustering a user-uploaded sequence dataset or comparing it to another dataset at different identity levels. Users can now interactively explore the clusters within web browsers. We also provide downloadable clusters for several public databases (NCBI NR, Swissprot and PDB) at different identity levels. Availability: Free access at http://cd-hit.org
The server allows users to cluster or compare sequences without installing and executing the command-line version of CD-HIT locally. It provides an interactive interface and additional visualization tools. It also provides precalculated and regularly updated sequence clusters for several widely used databases, including NCBI NR, Swissprot and PDB.
doi:10.1093/bioinformatics/btq003 pmid:20053844 pmcid:PMC2828112 fatcat:k2hldpq3izhm3mu55bwcyudali