Seqs-Extractor: Automated sequences extraction to reduce tedious manual corrections of large datasets [post]

Patrick D C Pereira, Cleyssian Dias, Mauro A D Melo, Nara G M Magalhães, Cristovam G Diniz, Cristovam W P Diniz
2017 unpublished
Analyzing large numbers of sequences requires reducing ambiguities during the analytical work, so that effort is focused only on confirmed sequences. Performing this work automatically may help minimize the potential errors associated with tedious manual correction and yield more effective results. The Basic Local Alignment Search Tool (BLAST) appears to be the most widely used sequence analysis program. It is free, but commercial parties have enhanced BLAST applications and charge a fee for their use. Some public-domain tools can search for microsatellites in next-generation sequencing (NGS) data, such as the microsatellite identification tool (MISA), which has features for discovering microsatellites in large datasets. Here, we developed a basic shell script (BASH script), to be run under a Linux environment, that extracts from a sequence dataset only the confirmed (BLASTed) sequences from both nucleotide (BLASTN) and protein (BLASTX) databases, and extracts sequences that contain microsatellites using the MISA tool, with a friendly interface and at no charge. Seqs-Extractor is a helpful, free tool that may enhance the analysis of large datasets in BLAST+ and MISA by minimizing management time and reducing potential errors caused by manipulating data. Seqs-Extractor is available at https://github.com/patrick-douglas/Seqs-Extractor/wiki .
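The core idea of keeping only "BLASTed" sequences can be illustrated with a minimal sketch. This is not the Seqs-Extractor BASH script itself. It assumes BLAST+ was run with tabular output (`-outfmt 6`), where the first column is the query ID, and the function names `blast_hit_ids` and `extract_confirmed`, as well as the sample records, are hypothetical and chosen only for illustration.

```python
import io


def blast_hit_ids(blast_tab):
    """Collect query IDs (first column) from BLAST tabular output.

    Assumes tab-separated lines as produced by `-outfmt 6`;
    blank lines and '#' comment lines are skipped.
    """
    ids = set()
    for line in blast_tab:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ids.add(line.split("\t")[0])
    return ids


def _record_id(header):
    """Return the first whitespace-delimited token of a FASTA header."""
    parts = header.split()
    return parts[0] if parts else ""


def extract_confirmed(fasta, hit_ids):
    """Yield (header, sequence) for FASTA records whose ID has a BLAST hit."""
    header, chunks = None, []
    for line in fasta:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Emit the previous record before starting a new one.
            if header is not None and _record_id(header) in hit_ids:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    # Emit the final record, if it qualifies.
    if header is not None and _record_id(header) in hit_ids:
        yield header, "".join(chunks)


# Made-up demo data: three sequences, two of which have BLAST hits.
demo_fasta = io.StringIO(">seq1 sample\nACGT\nACGT\n>seq2\nTTTT\n>seq3\nGGCC\n")
demo_blast = io.StringIO(
    "seq1\tsubjA\t98.50\t8\t0\t0\t1\t8\t10\t17\t1e-5\t50.0\n"
    "seq3\tsubjB\t95.00\t4\t0\t0\t1\t4\t3\t6\t1e-3\t30.0\n"
)

for header, seq in extract_confirmed(demo_fasta, blast_hit_ids(demo_blast)):
    print(f">{header}\n{seq}")
```

Reading the hit IDs into a set first keeps the FASTA pass to a single scan, which matters for the large datasets the abstract targets. The same filter could be applied to a list of IDs reported by MISA, under the assumption that its output also carries the sequence ID in the first column.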
doi:10.7287/peerj.preprints.3364 fatcat:m3cmorpshradhplen2ve3va7qy