Resource description framework technologies in chemistry. Egon L Willighagen and Martin P Brändle. Editorial. The Resource Description Framework (RDF) is providing the life sciences with new standards ... doi:10.1186/1758-2946-3-15 pmid:21569523 pmcid:PMC3118380 fatcat:psv6nlqssvh6bmi5bvkcovhewa
WikiPathways (www.wikipathways.org) is a community-curated pathway database that enables researchers to capture rich, intuitive models of biological pathways. Importantly, pathway models from WikiPathways are a valuable source for pathway and network analysis approaches and the content is provided in different formats (e.g. RDF), via dedicated apps for Cytoscape, and on the network data exchange platform NDEx. This knowledge distribution enables the simple integration of pathway and interaction data in network analysis as highlighted in recent publications [4-6]. In response to the COVID-19 pandemic, WikiPathways established the COVID-19 community portal (http://covid.wikipathways.org), which currently contains over 20 COVID-19 related pathway models. We focus on pathway curation and the development of data analysis workflows, which are continuously executed with the newest knowledge and data. By analyzing and visualizing variations in pathway activity between cell-types, tissues, populations (e.g. male vs female, young vs old, healthy vs diseased, different ethnicities), and species, we want to provide insights into the exact mechanisms underlying differences in disease severity. As part of the international COVID-19 Disease Map project, curation efforts are aligned, conversions between formats are improved (e.g. between WikiPathways and Minerva), and new software features are being developed. For our pathway editor PathVisio, we are planning to add detailed evidence and provenance information for interactions and support multi-species pathways with identifier mapping for both host (human) and virus (COVID-19) provided by BridgeDb. In addition to ongoing curation efforts to grow and maintain the database, we have identified publication figures as a valuable resource. We estimate ~1000 pathway figures are published and indexed by PubMed Central each month. These figures contain novel pathway content not present in the text nor captured in pathway databases. We performed optical chara [...] doi:10.5281/zenodo.4269617 fatcat:zcf46fzdbnfufbezl2wmdtsxtq
The web has seen an explosion of chemistry and biology related resources in the last 15 years: thousands of scientific journals, databases, wikis, blogs and resources are available with a wide variety of types of information. There is a huge need to aggregate and organise this information. However, the sheer number of resources makes it unrealistic to link them all in a centralised manner. Instead, search engines to find information in those resources flourish, and formal languages like ... doi:10.1186/1471-2105-8-487 pmid:18154664 pmcid:PMC2222660 fatcat:s7pxd755sfeajjq4bw3iwbzvbq
Small molecules are of increasing interest for bioinformatics in areas such as metabolomics and drug discovery. The recent release of large open access chemistry databases generates a demand for flexible tools to process them and discover new knowledge. To freely support open science based on these data resources, it is desirable for the processing tools to be open source and available for everyone. Results: Here we describe a novel combination of the workflow engine Taverna and the cheminformatics library Chemistry Development Kit (CDK) resulting in an open source workflow solution for cheminformatics. We have implemented more than 160 different workers to handle specific cheminformatics tasks. We describe the applications of CDK-Taverna in various usage scenarios. Conclusions: The combination of the workflow engine Taverna and the Chemistry Development Kit provides the first open source cheminformatics workflow solution for the biosciences. With the Taverna community working towards a more powerful workflow engine and a more user-friendly user interface, CDK-Taverna has the potential to become a free alternative to existing proprietary workflow tools. doi:10.1186/1471-2105-11-159 pmid:20346188 pmcid:PMC2862046 fatcat:hjde2ok7uzg2tp7zgeviolooa4
Curation, Formal Analysis, Validation, Writing - Original Draft Preparation, Writing - Ehrhart F Review & Editing; : Data Curation, Writing - Review & Editing; : Conceptualization, Supervision; : Willighagen ... doi:10.12688/f1000research.14613.1 pmid:31489175 pmcid:PMC6707396 fatcat:34tzrw6jwveelgms6znbbi7vwa
Curation, Formal Analysis, Validation, Writing - Original Draft Preparation, Writing - Ehrhart F Review & Editing; : Data Curation, Writing - Review & Editing; : Conceptualization, Supervision; : Willighagen ... doi:10.12688/f1000research.14613.2 fatcat:qr654mccujeavnmmdb5jyzfilq
Xenobiotic metabolism is an active research topic but the limited amount of openly available high-quality biotransformation data constrains predictive modeling. Current databases often default to commonly available information: which enzyme metabolizes a compound, but neither experimental conditions nor the atoms that undergo metabolization are captured. We present XMetDB, an open access database for drugs and other xenobiotics and their respective metabolites. The database contains chemical structures of xenobiotic biotransformations with substrate atoms annotated as reaction centra, the resulting product formed, and the catalyzing enzyme, type of experiment, and literature references. Associated with the database is a web interface for the submission and retrieval of experimental metabolite data for drugs and other xenobiotics in various formats, and a web API for programmatic access is also available. The database is open for data deposition, and a curation scheme is in place for quality control. An extensive guide on how to enter experimental data into the database is available from the XMetDB wiki. XMetDB formalizes how biotransformation data should be reported, and the openly available, systematically labeled data is a big step forward towards better models for predictive metabolism. Acknowledgements: Funding was received from Stiftelsen Olle Engkvist Byggmästare, and the Swedish strategic research program eSSENCE. During the XMetDB project, we were shocked by the tragic death of Patrik Rydberg. We would here like to acknowledge his scientific contributions in the field of xenobiotic metabolism and will continue the XMetDB project in his memory. doi:10.1186/s13321-016-0161-3 pmid:27651835 fatcat:xnoxkdmdmbcwvh4tvzattn2nwa
The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent [...] upon which chemistry specific text-mining tools can be built, and its development and usage are discussed. doi:10.1186/1758-2946-3-41 pmid:21999457 pmcid:PMC3205045 fatcat:txtflbq26zaltd7kunj3dcyuie
QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is [...]ed by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constraining collaborations and reuse of data. Results: We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Conclusions: Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusion regarding descriptors by defining them crisply. This makes it easy to join, extend, and combine datasets and hence work collectively, but also allows for analyzing the effect descriptors have on the statistical model's performance. The presented Bioclipse plugins equip scientists with graphical tools that make QSAR-ML easily accessible for the community. doi:10.1186/1758-2946-2-5 pmid:20591161 pmcid:PMC2909924 fatcat:y4rup2562jbstib25lewyq7u2e
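The idea in the QSAR-ML abstract above, that a dataset setup is only reproducible if every descriptor is pinned to a unique identifier plus a named, versioned implementation, can be illustrated with a small sketch. This is not the QSAR-ML XML format or the Bioclipse plugin API; the class names, field names, identifiers, and the JSON serialization are all hypothetical and chosen only for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DescriptorRef:
    """One descriptor, pinned to an implementation and version (hypothetical fields)."""
    ontology_id: str      # unique descriptor identifier, e.g. from a descriptor ontology
    implementation: str   # software that computes the descriptor
    version: str          # version of that software


@dataclass(frozen=True)
class QsarDatasetSetup:
    """The setup of a dataset: structures, descriptors, and the modelled response."""
    structures: tuple     # e.g. SMILES strings
    descriptors: tuple    # DescriptorRef entries
    response_name: str


def setup_fingerprint(setup: QsarDatasetSetup) -> str:
    """Serialize the setup deterministically so two setups can be compared."""
    return json.dumps(asdict(setup), sort_keys=True)


# Two setups that differ only in the descriptor implementation version.
v1 = QsarDatasetSetup(("CCO",), (DescriptorRef("desc:xlogp", "toolA", "1.0"),), "logS")
v2 = QsarDatasetSetup(("CCO",), (DescriptorRef("desc:xlogp", "toolA", "1.1"),), "logS")
print(setup_fingerprint(v1) == setup_fingerprint(v2))
```

Because the implementation version is part of the record, the two setups compare as different even though the structures and descriptor identifiers are identical, which is the kind of distinction a versioned exchange format is meant to preserve.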
Willighagen (email@example.com) all contributed material and reviewed content. Data Intelligence. Taking FAIR on the ChIN: The Chemistry Implementation Network ... doi:10.1162/dint_a_00035 fatcat:yv4ccozzrnbb5nenbmwdphoiki
BMC Research Notes
getToken() Willighagen et al. BMC Research Notes 2011, 4:487 http://www.biomedcentral.com/1756-0500/4/487 ... doi:10.1186/1756-0500-4-487 pmid:22075173 pmcid:PMC3264531 fatcat:4qolz3bn2rcw3fycx7fsmaqxcu
The InChI algorithms are written in C++ and not available as a Java library. Integration into software written in Java therefore requires a bridge between C and Java libraries, provided by the Java Native Interface (JNI) technology. Results: We here describe how the InChI library is used in the Bioclipse workbench and the Chemistry Development Kit (CDK) cheminformatics library. To make this possible, a JNI bridge to the InChI library was developed, JNI-InChI, allowing Java software to access the InChI algorithms. By using this bridge, the CDK project packages the InChI binaries in a module and offers easy access from Java using the CDK API. The Bioclipse project packages and offers InChI as a dynamic OSGi bundle that can easily be used by any OSGi-compliant software, in addition to the regular Java Archive and Maven bundles. Bioclipse itself uses the InChI as a key component and calculates it on the fly when visualizing and editing chemical structures. We demonstrate the utility of InChI with various applications in CDK and Bioclipse, such as decision support for chemical liability assessment, tautomer generation, and knowledge aggregation using a linked data approach. Conclusions: These results show that the InChI library can be used in a variety of Java library dependency solutions, making the functionality easily accessible by Java software, such as in the CDK. The applications show various ways the InChI has been used in Bioclipse to enrich its functionality. doi:10.1186/1758-2946-5-14 pmid:23497723 pmcid:PMC3674901 fatcat:ilw6pmpwkncs5kcrftd6vkplqi
The computational processing and analysis of small molecules is at the heart of cheminformatics and structural bioinformatics and their application in e.g. metabolomics or drug discovery. Pipelining or workflow tools allow for the Lego™-like, graphical assembly of I/O modules and algorithms into a complex workflow which can be easily deployed, modified and tested without the hassle of implementing it into a monolithic application. The CDK-Taverna project aims at building a free open-source cheminformatics pipelining solution through combination of different open-source projects such as Taverna, the Chemistry Development Kit (CDK) or the Waikato Environment for Knowledge Analysis (WEKA). A first integrated version 1.0 of CDK-Taverna was recently released to the public. Results: The CDK-Taverna project was migrated to the most up-to-date versions of its foundational software libraries with a complete re-engineering of its worker architecture (version 2.0). 64-bit computing and multi-core usage by parallel threads are now supported to allow for fast in-memory processing and analysis of large sets of molecules. Earlier deficiencies like workarounds for iterative data reading are removed. The combinatorial chemistry related reaction enumeration features are considerably enhanced. Additional functionality for calculating a natural product likeness score for small molecules is implemented to identify possible drug candidates. Finally, the data analysis capabilities are extended with new workers that provide access to the open-source WEKA library for clustering and machine learning as well as training and test set partitioning. The new features are outlined with usage scenarios. Conclusions: CDK-Taverna 2.0 as an open-source cheminformatics workflow solution matured to become a freely available and increasingly powerful tool for the biosciences. The combination of the new CDK-Taverna worker family with the already available workflows developed by a lively Taverna community and published on myexperiment.org enables molecular scientists to quickly calculate, process and analyse molecular data as typically found in e.g. today's systems biology scenarios. doi:10.1186/1758-2946-3-54 pmid:22166170 pmcid:PMC3292505 fatcat:qvycgcayqvde5kb2rtysw7mu6y
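The abstract above mentions workers for training and test set partitioning. As a rough illustration of that step only, here is a minimal random hold-out split in plain Python. It does not use Taverna, CDK, or WEKA, and it is not the CDK-Taverna worker; the function name, the default test fraction, and the seed are assumptions made for this sketch.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=42):
    """Randomly partition items into (training, test) lists.

    Hypothetical helper: a seeded shuffle followed by a cut, so the same
    seed always yields the same partition.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one item for testing.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Example with ten placeholder molecule identifiers.
molecules = [f"mol-{i}" for i in range(10)]
train, test = split_train_test(molecules)
print(len(train), len(test))
```

Fixing the seed keeps the partition reproducible between runs, which matters when models built on the training set are compared later.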
= creator.getOrcidUri("0000-0001-7542-0286") freddie = creator.getOrcidUri("0000-0002-7770-620X") creator.addCreator(freddie) creator.addCreator(egon) trustedPub = creator.finalizeTrustyNanopub() outputBuffer ... creator.addPubinfoStatement(rights, cczero) egon ... doi:10.1101/595819 fatcat:zvd3s77ltjh7nbyi2ntgjhwyr4