Resource description framework technologies in chemistry

Egon L Willighagen, Martin P Brändle
2011 Journal of Cheminformatics  
The Resource Description Framework (RDF) is providing the life sciences with new standards  ... 
doi:10.1186/1758-2946-3-15 pmid:21569523 pmcid:PMC3118380 fatcat:psv6nlqssvh6bmi5bvkcovhewa

Expanding the eNanoMapper Ontology (short paper)

Laurent A. Winckers, Egon L. Willighagen
2020 International Conference on Biomedical Ontology  
dblp:conf/icbo/WinckersW20 fatcat:44tnrjw77bfnle6cpy6deoqjbe

The WikiPathways COVID-19 Community Portal

Martina Kutmon, Friederike Ehrhart, Denise N. Slenter, Kristina Hanspers, Egon L. Willighagen, Alexander R. Pico, Chris T. Evelo
2020 Zenodo  
WikiPathways is a community-curated pathway database that enables researchers to capture rich, intuitive models of biological pathways. Importantly, pathway models from WikiPathways are a valuable source for pathway and network analysis approaches and the content is provided in different formats (e.g. RDF [1]), via dedicated apps for Cytoscape [2], and on the network data exchange platform NDEx [3]. This knowledge distribution enables the simple integration of pathway and
... interaction data in network analysis as highlighted in recent publications [4-6]. In response to the COVID-19 pandemic, WikiPathways established the COVID-19 community portal, which currently contains over 20 COVID-19 related pathway models. We focus on pathway curation and the development of data analysis workflows, which are continuously executed with the newest knowledge and data. By analyzing and visualizing variations in pathway activity between cell-types, tissues, populations (e.g. male vs female, young vs old, healthy vs diseased, different ethnicities), and species, we want to provide insights into the exact mechanisms underlying differences in disease severity. As part of the international COVID-19 Disease Map project [7], curation efforts are aligned, conversions between formats are improved (e.g. between WikiPathways and Minerva [8]), and new software features are being developed. For our pathway editor PathVisio [9], we are planning to add detailed evidence and provenance information for interactions and support multi-species pathways with identifier mapping for both host (human) and virus (COVID-19) provided by BridgeDb [10]. In addition to ongoing curation efforts to grow and maintain the database, we have identified publication figures as a valuable resource. We estimate ~1000 pathway figures are published and indexed by PubMed Central each month. These figures contain novel pathway content not present in the text nor captured in pathway databases. We performed optical chara [...]
doi:10.5281/zenodo.4269617 fatcat:zcf46fzdbnfufbezl2wmdtsxtq

Userscripts for the Life Sciences

Egon L Willighagen, Noel M O'Boyle, Harini Gopalakrishnan, Dazhi Jiao, Rajarshi Guha, Christoph Steinbeck, David J Wild
2007 BMC Bioinformatics  
The web has seen an explosion of chemistry and biology related resources in the last 15 years: thousands of scientific journals, databases, wikis, blogs and resources are available with a wide variety of types of information. There is a huge need to aggregate and organise this information. However, the sheer number of resources makes it unrealistic to link them all in a centralised manner. Instead, search engines to find information in those resources flourish, and formal languages like
... Description Framework and Web Ontology Language are increasingly used to allow linking of resources. A recent development is the use of userscripts to change the appearance of web pages, by on-the-fly modification of the web content. This opens possibilities to aggregate information and computational results from different web resources into the web page of one of those resources. Results: Several userscripts are presented that enrich biology and chemistry related web resources by incorporating or linking to other computational or data sources on the web. The scripts make use of Greasemonkey-like plugins for web browsers and are written in JavaScript. Information from third-party resources is extracted using open Application Programming Interfaces, while common Universal Resource Locator schemes are used to make deep links to related information in that external resource. The userscripts presented here use a variety of techniques and resources, and show the potential of such scripts. Conclusion: This paper discusses a number of userscripts that aggregate information from two or more web resources. Examples are shown that enrich web pages with information from other resources, and show how information from web pages can be used to link to, search, and process information in other resources. Due to the nature of userscripts, scientists are able to select those scripts they find useful on a daily basis, as the scripts run directly in their own web browser rather than on the web server. This flexibility allows the scientists to tune the features of web resources to optimise their productivity.
doi:10.1186/1471-2105-8-487 pmid:18154664 pmcid:PMC2222660 fatcat:s7pxd755sfeajjq4bw3iwbzvbq

CDK-Taverna: an open workflow environment for cheminformatics

Thomas Kuhn, Egon L Willighagen, Achim Zielesny, Christoph Steinbeck
2010 BMC Bioinformatics  
Small molecules are of increasing interest for bioinformatics in areas such as metabolomics and drug discovery. The recent release of large open access chemistry databases generates a demand for flexible tools to process them and discover new knowledge. To freely support open science based on these data resources, it is desirable for the processing tools to be open source and available for everyone. Results: Here we describe a novel combination of the workflow engine Taverna and the
... ics library Chemistry Development Kit (CDK) resulting in an open source workflow solution for cheminformatics. We have implemented more than 160 different workers to handle specific cheminformatics tasks. We describe the applications of CDK-Taverna in various usage scenarios. Conclusions: The combination of the workflow engine Taverna and the Chemistry Development Kit provides the first open source cheminformatics workflow solution for the biosciences. With the Taverna-community working towards a more powerful workflow engine and a more user-friendly user interface, CDK-Taverna has the potential to become a free alternative to existing proprietary workflow tools.
doi:10.1186/1471-2105-11-159 pmid:20346188 pmcid:PMC2862046 fatcat:hjde2ok7uzg2tp7zgeviolooa4

CyTargetLinker app update: A flexible solution for network extension in Cytoscape

Martina Kutmon, Friederike Ehrhart, Egon L. Willighagen, Chris T. Evelo, Susan L. Coort
2018 F1000Research  
Curation, Formal Analysis, Validation, Writing -Original Draft Preparation, Writing - Ehrhart F Review & Editing; : Data Curation, Writing -Review & Editing; : Conceptualization, Supervision; : Willighagen  ... 
doi:10.12688/f1000research.14613.1 pmid:31489175 pmcid:PMC6707396 fatcat:34tzrw6jwveelgms6znbbi7vwa

CyTargetLinker app update: A flexible solution for network extension in Cytoscape

Martina Kutmon, Friederike Ehrhart, Egon L. Willighagen, Chris T. Evelo, Susan L. Coort
2019 F1000Research  
Curation, Formal Analysis, Validation, Writing -Original Draft Preparation, Writing - Ehrhart F Review & Editing; : Data Curation, Writing -Review & Editing; : Conceptualization, Supervision; : Willighagen  ... 
doi:10.12688/f1000research.14613.2 fatcat:qr654mccujeavnmmdb5jyzfilq

XMetDB: an open access database for xenobiotic metabolism

Ola Spjuth, Patrik Rydberg, Egon L. Willighagen, Chris T. Evelo, Nina Jeliazkova
2016 Journal of Cheminformatics  
Xenobiotic metabolism is an active research topic but the limited amount of openly available high-quality biotransformation data constrains predictive modeling. Current databases often default to commonly available information: which enzyme metabolizes a compound, but neither experimental conditions nor the atoms that undergo metabolization are captured. We present XMetDB, an open access database for drugs and other xenobiotics and their respective metabolites. The database contains chemical
... ctures of xenobiotic biotransformations with substrate atoms annotated as reaction centra, the resulting product formed, and the catalyzing enzyme, type of experiment, and literature references. Associated with the database is a web interface for the submission and retrieval of experimental metabolite data for drugs and other xenobiotics in various formats, and a web API for programmatic access is also available. The database is open for data deposition, and a curation scheme is in place for quality control. An extensive guide on how to enter experimental data into the database is available from the XMetDB wiki. XMetDB formalizes how biotransformation data should be reported, and the openly available systematically labeled data is a big step forward towards better models for predictive metabolism. Acknowledgements: Funding was received from Stiftelsen Olle Engkvist Byggmästare, and the Swedish strategic research program eSSENCE. During the XMetDB project, we were shocked by the tragic death of Patrik Rydberg. We would here like to acknowledge his scientific contributions in the field of xenobiotic metabolism and will continue the XMetDB project in his memory.
doi:10.1186/s13321-016-0161-3 pmid:27651835 fatcat:xnoxkdmdmbcwvh4tvzattn2nwa

OSCAR4: a flexible architecture for chemical text-mining

David M Jessop, Sam E Adams, Egon L Willighagen, Lezan Hawizy, Peter Murray-Rust
2011 Journal of Cheminformatics  
The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent
... upon which chemistry specific text-mining tools can be built, and its development and usage are discussed.
doi:10.1186/1758-2946-3-41 pmid:21999457 pmcid:PMC3205045 fatcat:txtflbq26zaltd7kunj3dcyuie

Towards interoperable and reproducible QSAR analyses: Exchange of datasets

Ola Spjuth, Egon L Willighagen, Rajarshi Guha, Martin Eklund, Jarl ES Wikberg
2010 Journal of Cheminformatics  
QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is
... ed by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constrain collaborations and reuse of data. Results: We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Conclusions: Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. This makes it easy to join, extend, combine datasets and hence work collectively, but also allows for analyzing the effect descriptors have on the statistical model's performance. The presented Bioclipse plugins equip scientists with graphical tools that make QSAR-ML easily accessible for the community.
doi:10.1186/1758-2946-2-5 pmid:20591161 pmcid:PMC2909924 fatcat:y4rup2562jbstib25lewyq7u2e

Taking FAIR on the ChIN: The Chemistry Implementation Network

Simon J. Coles, Jeremy G. Frey, Egon L. Willighagen, Stuart J. Chalk
2019 Data Intelligence  
Willighagen ( all contributed material and reviewed content. Data Intelligence Taking FAIR on the ChIN: The Chemistry Implementation Network  ... 
doi:10.1162/dint_a_00035 fatcat:yv4ccozzrnbb5nenbmwdphoiki

Computational toxicology using the OpenTox application programming interface and Bioclipse

Egon L Willighagen, Nina Jeliazkova, Barry Hardy, Roland C Grafström, Ola Spjuth
2011 BMC Research Notes  
getToken() Willighagen et al. BMC Research Notes 2011, 4:487  ... 
doi:10.1186/1756-0500-4-487 pmid:22075173 pmcid:PMC3264531 fatcat:4qolz3bn2rcw3fycx7fsmaqxcu

Applications of the InChI in cheminformatics with the CDK and Bioclipse

Ola Spjuth, Arvid Berg, Samuel Adams, Egon L Willighagen
2013 Journal of Cheminformatics  
The InChI algorithms are written in C++ and not available as Java library. Integration into software written in Java therefore requires a bridge between C and Java libraries, provided by the Java Native Interface (JNI) technology. Results: We here describe how the InChI library is used in the Bioclipse workbench and the Chemistry Development Kit (CDK) cheminformatics library. To make this possible, a JNI bridge to the InChI library was developed, JNI-InChI, allowing Java software to access the
... nChI algorithms. By using this bridge, the CDK project packages the InChI binaries in a module and offers easy access from Java using the CDK API. The Bioclipse project packages and offers InChI as a dynamic OSGi bundle that can easily be used by any OSGi-compliant software, in addition to the regular Java Archive and Maven bundles. Bioclipse itself uses the InChI as a key component and calculates it on the fly when visualizing and editing chemical structures. We demonstrate the utility of InChI with various applications in CDK and Bioclipse, such as decision support for chemical liability assessment, tautomer generation, and for knowledge aggregation using a linked data approach. Conclusions: These results show that the InChI library can be used in a variety of Java library dependency solutions, making the functionality easily accessible by Java software, such as in the CDK. The applications show various ways the InChI has been used in Bioclipse, to enrich its functionality.
doi:10.1186/1758-2946-5-14 pmid:23497723 pmcid:PMC3674901 fatcat:ilw6pmpwkncs5kcrftd6vkplqi

New developments on the cheminformatics open workflow environment CDK-Taverna

Andreas Truszkowski, Kalai Jayaseelan, Stefan Neumann, Egon L Willighagen, Achim Zielesny, Christoph Steinbeck
2011 Journal of Cheminformatics  
The computational processing and analysis of small molecules is at the heart of cheminformatics and structural bioinformatics and their application in e.g. metabolomics or drug discovery. Pipelining or workflow tools allow for the Lego™-like, graphical assembly of I/O modules and algorithms into a complex workflow which can be easily deployed, modified and tested without the hassle of implementing it into a monolithic application. The CDK-Taverna project aims at building a free open-source
... rmatics pipelining solution through combination of different open-source projects such as Taverna, the Chemistry Development Kit (CDK) or the Waikato Environment for Knowledge Analysis (WEKA). A first integrated version 1.0 of CDK-Taverna was recently released to the public. Results: The CDK-Taverna project was migrated to the most up-to-date versions of its foundational software libraries with a complete re-engineering of its worker's architecture (version 2.0). 64-bit computing and multi-core usage by paralleled threads are now supported to allow for fast in-memory processing and analysis of large sets of molecules. Earlier deficiencies like workarounds for iterative data reading are removed. The combinatorial chemistry related reaction enumeration features are considerably enhanced. Additional functionality for calculating a natural product likeness score for small molecules is implemented to identify possible drug candidates. Finally the data analysis capabilities are extended with new workers that provide access to the open-source WEKA library for clustering and machine learning as well as training and test set partitioning. The new features are outlined with usage scenarios. Conclusions: CDK-Taverna 2.0 as an open-source cheminformatics workflow solution matured to become a freely available and increasingly powerful tool for the biosciences. The combination of the new CDK-Taverna worker family with the already available workflows developed by a lively Taverna community and published on enables molecular scientists to quickly calculate, process and analyse molecular data as typically found in e.g. today's systems biology scenarios.
doi:10.1186/1758-2946-3-54 pmid:22166170 pmcid:PMC3292505 fatcat:qvycgcayqvde5kb2rtysw7mu6y

History of rare diseases and their genetic causes - a data driven approach [article]

Friederike Ehrhart, Egon L. Willighagen, Martina Kutmon, Max van Hoften, Nasim Bahram Sangani, Leopold G.M. Curfs, Chris T. Evelo
2019 bioRxiv   pre-print
= creator.getOrcidUri("0000-0001-7542-0286") freddie = creator.getOrcidUri("0000-0002-7770-620X") creator.addCreator(freddie) creator.addCreator(egon) trustedPub = creator.finalizeTrustyNanopub() outputBuffer  ...  license It is made available under a (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. creator.addPubinfoStatement(rights, cczero) egon  ... 
doi:10.1101/595819 fatcat:zvd3s77ltjh7nbyi2ntgjhwyr4