An XML application for genomic data interoperation

Kei-Hoi Cheung, Yang Liu, A. Kumar, M. Snyder, M. Gerstein, P. Miller
2001 Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001)  
As the eXtensible Markup Language (XML) becomes a popular, or even standard, language for exchanging data over the Web, a growing number of genome Web sites make their data available in XML format. Publishing genomic data in XML is of limited use, however, without software applications that take advantage of XML technology to process these data. This paper illustrates the usefulness of XML in representing and interoperating genomic data between two different data sources (Snyder's laboratory at Yale and SGD at Stanford). In particular, we compare the locations of transposon insertions in yeast DNA sequences, as identified by BLAST searches, with the chromosomal locations of the yeast open reading frames (ORFs) stored in SGD. This comparison lets us characterize the transposon insertions by indicating whether they fall into any ORFs (which may encode proteins with essential biological functions). To implement this XML-based interoperation, we used NCBI's "blastall" (which offers an XML output option) and SGD's yeast nucleotide sequence dataset to establish a local BLAST server. We also converted SGD's ORF location data file (which is available in tab-delimited format) into an XML document based on the BIOML (BIOpolymer Markup Language) standard.
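The two steps described in the abstract, converting a tab-delimited ORF location file to XML and testing whether an insertion position falls inside an ORF, might be sketched as follows. This is only an illustration under stated assumptions: the column layout (name, chromosome, start, end), the ORF names and coordinates, and the element and attribute names are all hypothetical. They are not taken from the SGD file or from the BIOML schema, and the functions `orfs_to_xml` and `orfs_containing` are invented for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up tab-delimited rows: name, chromosome, start, end.
# The real SGD column layout is not given in the abstract.
ORF_TSV = (
    "ORF_A\t1\t100\t500\n"
    "ORF_B\t1\t900\t600\n"   # start > end stands in for a reverse-strand ORF
    "ORF_C\t2\t100\t500\n"
)


def orfs_to_xml(tsv_text):
    """Convert tab-delimited ORF rows into an XML tree.

    Element and attribute names are illustrative, not the BIOML schema.
    """
    root = ET.Element("orfs")
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        name, chrom, start, end = row
        ET.SubElement(root, "orf", name=name, chromosome=chrom,
                      start=start, end=end)
    return root


def orfs_containing(root, chrom, position):
    """Return names of ORFs on `chrom` whose span includes `position`."""
    hits = []
    for orf in root.iter("orf"):
        if orf.get("chromosome") != str(chrom):
            continue
        # Sort the endpoints so either strand orientation is handled.
        lo, hi = sorted((int(orf.get("start")), int(orf.get("end"))))
        if lo <= position <= hi:
            hits.append(orf.get("name"))
    return hits


root = orfs_to_xml(ORF_TSV)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
print(orfs_containing(root, 1, 750))
```

In a fuller pipeline, the insertion positions would come from parsing the BLAST XML output rather than being passed in by hand; that step is omitted here because the abstract does not describe the output structure that was used.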
doi:10.1109/bibe.2001.974417 dblp:conf/bibe/CheungLKSGM01 fatcat:iy6rjajokvgdpdwomyyw22zyty