vec2sparql.pdf [article]

Maxat Kulmanov, Şenay Kafkas, Andreas Karwath, Alexander Malic, Georgios V Gkoutos, Michel Dumontier, Robert Hoehndorf
2018 Figshare  
Recent developments in machine learning have led to a large number of methods for extracting features from structured data. The features are represented as vectors and may encode some semantic aspects of the data. They can be used in machine learning models for different tasks or to compute similarities between the entities of the data. SPARQL is a query language for structured data originally developed for querying Resource Description Framework (RDF) data. It has been in use for over a ... decade as a standardized NoSQL query language. Many different tools have been developed to enable data sharing with SPARQL. For example, SPARQL endpoints make your data interoperable and available to the world. SPARQL queries can be executed across multiple endpoints. We have developed Vec2SPARQL, a general framework for integrating structured data and their vector space representations. Vec2SPARQL allows jointly querying vector functions such as computing similarities (cosine, correlations) or classifications with machine learning models within a single SPARQL query. We demonstrate applications of our approach for biomedical and clinical use cases.
doi:10.6084/m9.figshare.7423673.v1 fatcat:oazrqudbznfvlj2uusc2ob432q

Ontology based mining of pathogen-disease associations from literature [article]

Senay Kafkas, Robert Hoehndorf
2018 bioRxiv   pre-print
Infectious diseases claim millions of lives each year, especially in developing countries, and resistance to drugs is an emerging threat worldwide. Accurate and rapid identification of causative pathogens plays a key role in the success of treatment. To support research on infectious diseases and mechanisms of infection, there is a need for an open resource on pathogen-disease associations that can be utilized in computational studies. A large number of pathogen-disease associations is ... available from the literature in unstructured form, and we need automated methods to extract the data. Results: We developed a text mining system designed for extracting pathogen-disease relations from literature. Our approach utilizes background knowledge from an ontology and statistical methods for extracting associations between pathogens and diseases. In total, we extracted 3,420 pathogen-disease associations from literature. We integrated our literature-derived associations into a database which links pathogens to their phenotypes for supporting infectious disease research. Conclusions: To the best of our knowledge, we present the first study focusing on extracting pathogen-disease associations from publications. We believe the text-mined data can be utilized as a valuable resource for infectious disease research. All the data is publicly available from https://github.com/bio-ontology-research-group/padimi and through a public SPARQL endpoint from http://patho.phenomebrowser.net/.
doi:10.1101/437558 fatcat:mqqxgpepm5b45hnk4l2uriqr2e

Literature Evidence in Open Targets - a target validation platform [article]

Senay Kafkas, Ian Dunham, Johanna McEntyre
2017 bioRxiv   pre-print
We present the Europe PMC literature component of Open Targets - a target validation platform that integrates various evidence to aid drug target identification and validation. The component identifies target-disease associations in documents from the Europe PMC literature database and ranks the documents by confidence, using rules based on expert-provided heuristic information, and has served the platform regularly with up-to-date data since December 2015. Results: Currently, ... there are a total of 1168365 distinct target-disease associations text mined from >26 million PubMed abstracts and >1.2 million Open Access full text articles. Our comparative analyses of the evidence data currently available in the platform revealed that 850179 of these associations are exclusively identified by literature mining. Conclusion: This component helps the platform's users by providing the most relevant literature hits for a given target and disease. The text mining evidence, along with the other types of evidence, can be explored visually through https://www.targetvalidation.org and all the evidence data is available for download in JSON format from https://www.targetvalidation.org/downloads/data.
doi:10.1101/124719 fatcat:em3dvz735nfh3p3ys2k43inaam

An Technical Apporach to Günter Grass Novel Crabwalk
GÜNTER GRASS'IN YENGEÇ YÜRÜYÜŞÜ NUVELİNE ANLATIM TEKNİKLERİ AÇISINDAN BİR YAKLAŞIM

Şenay KAYĞIN
2014 Kafkas Universitesi Sosyal Bilimler Enstitüsü Dergisi  
Abstract (Turkish original, translated): The main aim of this article is to identify the narrative techniques in Günter Grass's novella Crabwalk while approaching the work from a different angle and offering a different interpretation. With the techniques he uses in this novella, the author enables events, facts, and people to leave strong impressions on the reader. The narrative techniques Grass uses in the novella both match his literary art and greatly ease the analysis of the characters' thoughts and psychology ... These techniques contribute greatly, in particular, to understanding the narrator's psychological state and the problems the narrator faces. Keywords: Günter Grass, Narrative Techniques, Narrative Art. Abstract (English): A Technical Analysis of the Novel Crabwalk by Günter Grass. This study aims to identify the narrative techniques in Günter Grass's Crabwalk and thus provide a different interpretation of the novel. Through the techniques he uses in the novel, Grass makes it possible for the events, facts, and characters to leave strong impressions on the reader. These narrative techniques both fit Grass's literary craftsmanship and help the reader to analyse the characters' thoughts and psychology easily. They especially enable the reader to understand the psychological states of the narrators and the problems they face.
doi:10.9775/kausbed.2014.010 fatcat:j7b5gaf7fvh7jjmzxfxvtqellm

Phenotypic, functional and taxonomic features predict host-pathogen interactions [article]

Wang Liu-Wei, Şenay Kafkas, Robert Hoehndorf
2018 bioRxiv   pre-print
We obtained phenotypes associated with pathogens from the PathoPhenoDB (Kafkas et al., 2018), a database of manually curated and text-mined associations of pathogens, diseases and phenotypes. ... For pathogens, we use the phenotype annotations from the PathoPhenoDB (Kafkas et al., 2018), a database of pathogen-phenotype associations, and taxonomic information from the NCBI Taxonomy (Sayers et ...
doi:10.1101/508762 fatcat:ep5cueyetzcavn746i4tcl3ju4

Usage of cell nomenclature in biomedical literature

Şenay Kafkas, Sirarat Sarntivijai, Robert Hoehndorf
2017 BMC Bioinformatics  
Cell lines and cell types are extensively studied in biomedical research, yielding a significant number of publications each year. Identifying cell lines and cell types precisely in publications is crucial for scientific reproducibility and knowledge integration. There are efforts to standardise cell nomenclature based on ontology development to support the FAIR principles for cell knowledge. However, it is important to analyse the usage of cell nomenclature in publications at a large scale to understand the level of uptake of cell nomenclature in the literature by scientists. In this study, we analyse the usage of cell nomenclature, both in vivo and in vitro, in biomedical literature by using text mining methods and present our results. Results: We identified 59% of the cell type classes in the Cell Ontology and 13% of the cell line classes in the Cell Line Ontology in the literature. Our analysis showed that cell line nomenclature is much more ambiguous than cell type nomenclature. However, trends indicate that standardised nomenclature for cell lines and cell types is being increasingly used in publications by scientists. Conclusions: Our findings provide insight into how experimental cells are described in publications, may allow for improved standardisation of cell type and cell line nomenclature, and can be utilised to develop efficient text mining applications on cell types and cell lines. All data generated in this study is available at https://github.com/shenay/CellNomenclatureStudy.
doi:10.1186/s12859-017-1978-0 pmid:29322912 pmcid:PMC5763300 fatcat:wanjem72zfasxnm2c4x3rt6j7m

The Traces His Own Life Story in Stefan Zweig's Novel Called "Bitter Feelings"
STEFAN ZWEİG'IN "ACI DUYGULAR" ADLI ROMANINDA ÖZYAŞAMÖYKÜSÜNDEN İZLER

Şenay KAYĞIN
2014 Kafkas Universitesi Sosyal Bilimler Enstitüsü Dergisi  
Abstract (Turkish original, translated): This study attempts to interpret Stefan Zweig's novel "Bitter Feelings", which we believe carries traces of his own life story. It tries to determine the function of the narrative techniques Zweig frequently uses in the novel under discussion, and it also touches on the author's mastery of biography. In addition, it discusses the attitude of world and Turkish literature toward Zweig. Keywords: Stefan Zweig, Bitter Feelings, Biography. Abstract (English): In this study, we tried to interpret the ... Stefan Zweig's novel called Bitter Feelings, which, in our opinion, carries traces of his own life story. We examined the narrative techniques commonly used by Zweig, tried to determine their function in the novel, and discussed his mastery of biography. In addition, we discussed the attitudes of world and Turkish literature toward Zweig.
doi:10.9775/kausbed.2014.002 fatcat:rxrl5s5fqvgkjexaasx57osu7e

Database Citation in Full Text Biomedical Articles

Şenay Kafkas, Jee-Hyub Kim, Johanna R. McEntyre, Vincent Larivière
2013 PLoS ONE  
Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.
doi:10.1371/journal.pone.0063184 pmid:23734176 pmcid:PMC3667078 fatcat:cix3dhrmw5f3ppwk5s5yrojwr4

Vec2SPARQL: integrating SPARQL queries and knowledge graph embeddings [article]

Maxat Kulmanov, Senay Kafkas, Andreas Karwath, Alexander Malic, Georgios V Gkoutos, Michel Dumontier, Robert Hoehndorf
2018 biorxiv/medrxiv   pre-print
Recent developments in machine learning have led to a large number of methods for extracting features from structured data. The features are represented as vectors and may encode some semantic aspects of the data. They can be used in machine learning models for different tasks or to compute similarities between the entities of the data. SPARQL is a query language for structured data originally developed for querying Resource Description Framework (RDF) data. It has been in ... use for over a decade as a standardized NoSQL query language. Many different tools have been developed to enable data sharing with SPARQL. For example, SPARQL endpoints make your data interoperable and available to the world. SPARQL queries can be executed across multiple endpoints. We have developed Vec2SPARQL, a general framework for integrating structured data and their vector space representations. Vec2SPARQL allows jointly querying vector functions such as computing similarities (cosine, correlations) or classifications with machine learning models within a single SPARQL query. We demonstrate applications of our approach for biomedical and clinical use cases. Our source code is freely available at https://github.com/bio-ontology-research-group/vec2sparql and we make a Vec2SPARQL endpoint available at http://sparql.bio2vec.net/.
doi:10.1101/463778 fatcat:x3qi6swk3jedhevqdpza7cayq4

MOESM1 of Combining lexical and context features for automatic ontology extension

Sara Althubaiti, Şenay Kafkas, Marwa Abdelhakim, Robert Hoehndorf
2020 Figshare  
Additional file 1 Different conducted experiments based on different classification tasks.
doi:10.6084/m9.figshare.11598228.v1 fatcat:agcknksotbcezivgcti3fyesqa

Vec2SPARQL: integrating SPARQL queries and knowledge graph embeddings

Maxat Kulmanov, Şenay Kafkas, Andreas Karwath, Alexander Malic, Georgios V Gkoutos, Michel Dumontier, Robert Hoehndorf
2018 Figshare  
Recent developments in machine learning have led to a large number of methods for extracting features from structured data. The features are represented as vectors and may encode some semantic aspects of the data. They can be used in machine learning models for different tasks or to compute similarities between the entities of the data. SPARQL is a query language for structured data originally developed for querying Resource Description Framework (RDF) data. It has been in use for over ... decade as a standardized NoSQL query language. Many different tools have been developed to enable data sharing with SPARQL. For example, SPARQL endpoints make your data interoperable and available to the world. SPARQL queries can be executed across multiple endpoints. We have developed Vec2SPARQL, a general framework for integrating structured data and their vector space representations. Vec2SPARQL allows jointly querying vector functions such as computing similarities (cosine, correlations) or classifications with machine learning models within a single SPARQL query. We demonstrate applications of our approach for biomedical and clinical use cases. Our source code is freely available at https://github.com/bio-ontology-research-group/vec2sparql and we make a Vec2SPARQL endpoint available at http://sparql.bio2vec.net/
doi:10.6084/m9.figshare.7356416.v2 fatcat:iqvn3jbotzhc3ne4wfchrqfh4q

DDIEM: Drug Database for Inborn Errors of Metabolism [article]

Marwa Abdelhakim, Eunice McMurray, Ali Raza Syed, Senay Kafkas, Allan Anthony Kamau, Paul N Schofield, Robert Hoehndorf
2020 biorxiv/medrxiv   pre-print
Inborn errors of metabolism (IEM) represent a subclass of rare inherited diseases caused by a wide range of defects in metabolic enzymes or their regulation. Of over a thousand characterized IEMs, only about half are understood at the molecular level, and overall the development of treatment and management strategies has proved challenging. An overview of the changing landscape of therapeutic approaches is helpful in assessing strategic patterns in the approach to therapy, but the information ... scattered throughout the literature and public data resources. Results: We gathered data on therapeutic strategies for 299 diseases into the Drug Database for Inborn Errors of Metabolism (DDIEM). Therapeutic approaches, including both successful and ineffective treatments, were manually classified by their mechanisms of action using a new ontology. Conclusions: We present a manually curated, ontologically formalized knowledgebase of drugs, therapeutic procedures, and mitigated phenotypes. DDIEM is freely available through a web interface and for download at http://ddiem.phenomebrowser.net.
doi:10.1101/2020.01.08.897223 fatcat:oftqmw5eerfcfkuwpnpyzgkkme

Ontology based mining of pathogen–disease associations from literature

Şenay Kafkas, Robert Hoehndorf
2019 Journal of Biomedical Semantics  
Infectious diseases claim millions of lives especially in the developing countries each year. Identification of causative pathogens accurately and rapidly plays a key role in the success of treatment. To support infectious disease research and mechanisms of infection, there is a need for an open resource on pathogen-disease associations that can be utilized in computational studies. A large number of pathogen-disease associations is available from the literature in unstructured form and we need automated methods to extract the data.
doi:10.1186/s13326-019-0208-2 pmid:31533864 pmcid:PMC6751637 fatcat:r3draezgqzdm3bykvlnof2iira

Literature evidence in open targets - a target validation platform

Şenay Kafkas, Ian Dunham, Johanna McEntyre
2017 Journal of Biomedical Semantics  
We present the Europe PMC literature component of Open Targets - a target validation platform that integrates various evidence to aid drug target identification and validation. The component identifies target-disease associations in documents from the Europe PMC literature database and ranks the documents by confidence, using rules based on expert-provided heuristic information, and has served the platform regularly with up-to-date data since December 2015.
doi:10.1186/s13326-017-0131-3 pmid:28587637 pmcid:PMC5461726 fatcat:umy4zcbdbje3rmvptv4h3k5wga

Monitoring named entity recognition: the League Table

Dietrich Rebholz-Schuhmann, Senay Kafkas, Jee-Hyub Kim, Antonio Yepes, Ian Lewin
2013 Journal of Biomedical Semantics  
Named entity recognition (NER) is an essential step in automatic text processing pipelines. A number of solutions have been presented and evaluated against gold standard corpora (GSC). Benchmarking against GSCs is crucial, but is left to the individual researcher. Herewith we present a League Table web site, which benchmarks NER solutions against selected public GSCs, maintains a ranked list and archives the annotated corpus for future comparisons. Results: The web site enables access to the ... different GSCs in a standardized format (IeXML). Upon submission of the annotated corpus, the user has to describe the specification of the solution used and then uploads the annotated corpus for evaluation. The performance of the system is measured against one or more GSCs and the results are then added to the web site ("League Table"). It currently displays the results from publicly available NER solutions from the Whatizit infrastructure for future comparisons. Conclusion: The League Table enables the evaluation of NER solutions in a standardized infrastructure and monitors the results long-term. For access please go to http://wwwdev.ebi.ac.uk/Rebholz-srv/calbc/assessmentGSC/. Contact: rebholz@ifi.uzh.ch.
doi:10.1186/2041-1480-4-19 pmid:24034148 pmcid:PMC4015903 fatcat:blv6w5lrfba7ti5vzcuw67g55a