Multi-class Cancer Classification and Biomarker Identification using Deep Learning [article]

Fariha Muazzam
2020 bioRxiv   pre-print
This research revolves around multi-class cancer classification, feature extraction and relevant gene identification through deep learning methods for 12 different types of cancers using RNA-SEQ from The  ...  Genetic data is important for analysing cellular functions whose disruption gives rise to various kinds of cancer.  ...  Acknowledgements This study could not have been without the guidance and support of my supervisor Dr. Saira Karim.  ... 
doi:10.1101/2020.12.24.424317 fatcat:o4cjyititjcnpj7h43ewmzlm44

Evaluation of data discretization methods to derive platform independent isoform expression signatures for multi-class tumor subtyping

Segun Jung, Yingtao Bi, Ramana V Davuluri
2015 BMC Genomics  
For models trained on exon-array data and tested on RNA-seq data, the addition of data discretization step dramatically improved the classification accuracies with Equal-frequency binning showing the highest  ...  (GBM) when the classification algorithms were trained on the isoform-level gene expression profiles from exon-array platform and tested on the corresponding profiles from RNA-seq data.  ...  Declarations The publication costs for this article were funded by the NIH-NLM grant R01LM011297. This article has been published as part of BMC Genomics  ... 
doi:10.1186/1471-2164-16-s11-s3 pmid:26576613 pmcid:PMC4652565 fatcat:l4r7bmmhava5zgqliuuszyjy6u

A Regularized Multi-Task Learning Approach for Cell Type Detection in Single-Cell RNA Sequencing Data

Piu Upadhyay, Sumanta Ray
2022 Frontiers in Genetics  
Cell type prediction is one of the most challenging goals in single-cell RNA sequencing (scRNA-seq) data.  ...  Learning the structure of subpopulations is treated as a separate task in the multi-task learner. Regularization is used to modulate the multi-task model (e.g., W1, W2, ...  ...  T; here, each task represents the learning of expression data from individual cells. Here, the reference data are the scRNA-seq expression matrix over all the cellular identities.  ... 
doi:10.3389/fgene.2022.788832 pmid:35495159 pmcid:PMC9043858 fatcat:22aoer7ajnfqvlcrv4iaar6bpy

Detecting false positive sequence homology: a machine learning approach

M. Stanley Fujimoto, Anton Suvorov, Nicholas O. Jensen, Mark J. Clement, Seth M. Bybee
2016 BMC Bioinformatics  
false positives inferred by heuristic algorithms especially among proteomes recovered from low-coverage RNA-seq data.  ...  We demonstrate that our machine learning method trained on both known homology clusters obtained from OrthoDB and randomly generated sequence alignments (non-homologs), successfully determines apparent  ...  Lord for the generation of sequence data, T. Heath Ogden for providing specimens and Eric Ringger for his help with machine learning model selection and valuable discussion.  ... 
doi:10.1186/s12859-016-0955-3 pmid:26911862 pmcid:PMC4765110 fatcat:yfo3w3jdqrbmvhrbdnozfbmmx4

Deep Learning for Genomics: A Concise Overview [article]

Tianwei Yue, Haohan Wang
2018 arXiv   pre-print
We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications.  ...  of developing modern deep learning architectures for genomics.  ...  A collaboratively written review paper on deep learning, genomics, and precision medicine, now available at  ... 
arXiv:1802.00810v2 fatcat:u6s7pz2p6jdxzodz5k34it2hiu

Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development [article]

Kexin Huang, Tianfan Fu, Wenhao Gao, Yue Zhao, Yusuf Roohani, Jure Leskovec, Connor W. Coley, Cao Xiao, Jimeng Sun, Marinka Zitnik
2021 arXiv   pre-print
, multi-scale modeling of heterogeneous data, and robust generalization to novel data points.  ...  Here, we introduce Therapeutics Data Commons (TDC), the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics.  ...  of K (RP@K). • Multi-Class and Multi-label Classification: TDC includes Micro-F1, Macro-F1, and Cohen's Kappa. • Token-Level Classification conducts binary classification for each token in a sequence.  ... 
arXiv:2102.09548v2 fatcat:i5f5vrbaxnehhmhqiuwkkx2s6y

Microbial Forensics: Predicting Phenotypic Characteristics and Environmental Conditions from Large-Scale Gene Expression Profiles

Minseung Kim, Violeta Zorraquino, Ilias Tagkopoulos, Olga G. Troyanskaya
2015 PLoS Computational Biology  
Results show that gene expression is an excellent predictor of environmental structure, with multi-class ensemble models achieving balanced accuracy between 70.0% (±3.5%) to 98.3% (±2.3%) for the various  ...  To investigate this relationship, we created an extensive normalized gene expression compendium for the bacterium Escherichia coli that was further enriched with meta-information through an iterative learning  ...  Acknowledgments We thank the Tagkopoulos lab for the helpful discussions. Author Contributions  ... 
doi:10.1371/journal.pcbi.1004127 pmid:25774498 pmcid:PMC4361189 fatcat:4ay45uxx6fgoja3t6bejkn4mka

Deep Learning for Human Disease Detection, Subtype Classification, and Treatment Response Prediction Using Epigenomic Data

Thi Mai Nguyen, Nackhyoung Kim, Da Hae Kim, Hoang Long Le, Md Jalil Piran, Soo-Jong Um, Jin Hee Kim
2021 Biomedicines  
Deep learning (DL) is a distinct class of machine learning that has achieved first-class performance in many fields of study.  ...  DNA methylation and RNA-sequencing data are most frequently used to train the predictive models.  ...  The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.  ... 
doi:10.3390/biomedicines9111733 pmid:34829962 pmcid:PMC8615388 fatcat:oqgigce2bvaitl5sjflutsccsm

Machine learning for integrating data in biology and medicine: Principles, practice, and opportunities

Marinka Zitnik, Francis Nguyen, Bo Wang, Jure Leskovec, Anna Goldenberg, Michael M. Hoffman
2019 Information Fusion  
The key challenge in developing such approaches is the identification of effective models to provide a comprehensive and relevant systems view.  ...  No single data type, however, can capture the complexity of all the factors relevant to understanding a phenomenon such as a disease.  ...  It also uses data from RNA-seq, a method for determining steady-state gene expression.  ... 
doi:10.1016/j.inffus.2018.09.012 pmid:30467459 pmcid:PMC6242341 fatcat:mjhnzxxv4fbrlgufb7vkg3pz5u

Deciphering serous ovarian carcinoma histopathology and platinum response by convolutional neural networks

Kun-Hsing Yu, Vincent Hu, Feiran Wang, Ursula A Matulonis, George L Mutter, Jeffrey A Golden, Isaac S Kohane
2020 BMC Medicine  
We analyzed the whole-slide histopathology images, RNA-Seq, and proteomics data from 587 primary serous ovarian adenocarcinoma patients and developed a systematic algorithm to integrate histopathology  ...  Functional omics analysis revealed that expression levels of proteins participated in innate immune responses and catabolic pathways are associated with tumor grade.  ...  The authors would like to acknowledge the Amazon Web Services (AWS) Cloud Credit for Research, the Microsoft Azure Award, and the NVIDIA GPU Grant Program for their support on computational infrastructure  ... 
doi:10.1186/s12916-020-01684-w pmid:32807164 pmcid:PMC7433108 fatcat:buqpgsmaavbwbexb34kcdhhx2y

Prediction of activity and specificity of CRISPR-Cpf1 using convolutional deep learning neural networks

Jiesi Luo, Wei Chen, Li Xue, Bin Tang
2019 BMC Bioinformatics  
Trained on published data sets, DeepCpf1 is superior to other machine learning algorithms and reliably predicts the most efficient and less off-target effects guide RNAs for a given gene.  ...  CRISPR-Cpf1 has recently been reported as another RNA-guided endonuclease of class 2 CRISPR-Cas system, which expands the molecular biology toolkit for genome editing.  ...  Acknowledgements We would like to acknowledge the members of Center for Bioinformatics and Systems Biology at Wake Forest School of Medicine.  ... 
doi:10.1186/s12859-019-2939-6 fatcat:nec3ett22ncxddox2itkefluom

GARS: Genetic Algorithm for the identification of a Robust Subset of features in high-dimensional datasets

Mattia Chiesa, Giada Maioli, Gualtiero I. Colombo, Luca Piacentini
2020 BMC Bioinformatics  
Here, we propose an innovative implementation of a genetic algorithm, called GARS, for fast and accurate identification of informative features in multi-class and high-dimensional datasets.  ...  Feature selection is a crucial step in machine learning analysis.  ...  Authors' contributions MC and GM designed the GARS core algorithm. MC, GM, and LP implemented the scripts, performed the tests, and developed the package for Bioconductor submission.  ... 
doi:10.1186/s12859-020-3400-6 pmid:32046651 fatcat:t7qe4lih6feslepn675vg3t6ue

Molecular signature comprising 11 platelet-genes enables accurate blood-based diagnosis of NSCLC

Chitrita Goswami, Smriti Chawla, Deepshi Thakral, Himanshu Pant, Pramod Verma, Prabhat Singh Malik, Jayadeva, Ritu Gupta, Gaurav Ahuja, Debarka Sengupta
2020 BMC Genomics  
We also discussed a strategy for boosting the predictive model performance by artificial augmentation of gene expression data.  ...  When applied to platelet-gene expression data from a published study, our machine learning model could accurately discriminate between non-metastatic NSCLC cases and healthy samples.  ...  Acknowledgements The authors would like to thank the lab members of Laboratory Oncology Unit, All India Institute of Medical Sciences, New Delhi, India who helped us to carry out the RT-qPCR experiments  ... 
doi:10.1186/s12864-020-07147-z pmid:33287695 fatcat:jbdguhmb2zdvrkh4wf3hjwmfla

Brain Immunoinformatics: A Symmetrical Link between Informatics, Wet Lab and the Clinic

Ismini Papageorgiou, Daniel Bittner, Marios Nikos Psychogios, Stathis Hadjidemetriou
2021 Symmetry  
The intermingling of machine learning with wet lab applications and clinical results has hatched the newly defined immunoinformatics society.  ...  The new classification of microglia, the brain's innate immune cells, was an NII achievement.  ...  Data Availability Statement: No original data applied. Conflicts of Interest: The authors declare no competing financial interest.  ... 
doi:10.3390/sym13112168 fatcat:jzvdq66k2bai5hbzdmenxst66e

Computational analysis of regulatory mechanism and interactions of microRNAs

Takaya Saito, Pål Sætrom
2011 Zenodo  
To solve this possible fault, we developed a two step support vector machine (SVM) model.  ...  RNAi is a regulatory process that uses small non-coding RNAs (ncRNAs) to suppress gene expression at the post-transcriptional level.  ...  Although some machine learning algorithms are strictly limited to binary classification, there are several SVM approaches that can handle multi-class problems (179).  ... 
doi:10.5281/zenodo.4902326 fatcat:2z2um4cglfaydanvwtvzu3ufn4