4,466 Hits in 6.4 sec

Identifying consistent statements about numerical data with dispersion-corrected subgroup discovery

Mario Boley, Bryan R. Goldsmith, Luca M. Ghiringhelli, Jilles Vreeken
2017 Data mining and knowledge discovery  
Existing algorithms for subgroup discovery with numerical targets do not optimize the error or target variable dispersion of the groups they find.  ...  This often leads to unreliable or inconsistent statements about the data, rendering practical applications, especially in scientific domains, futile.  ...  Goldsmith acknowledges support from the Alexander von Humboldt-Foundation with a Postdoctoral Fellowship. Additionally,  ... 
doi:10.1007/s10618-017-0520-3 fatcat:cwwiej6jgnckpl23qwtyqqz2rm

Uncovering structure-property relationships of materials by subgroup discovery

Bryan R Goldsmith, Mario Boley, Jilles Vreeken, Matthias Scheffler, Luca M Ghiringhelli
2017 New Journal of Physics  
Subgroup discovery (SGD) is presented here as a data-mining approach to help find interpretable local patterns, correlations, and descriptors of a target property in materials-science data.  ...  Specifically, we will be concerned with data generated by density-functional theory calculations.  ...  Acknowledgments The project received funding from the European Union's Horizon 2020 research and innovation program under grant agreement no. 676580 with The Novel Materials Discovery (NOMAD) Laboratory  ... 
doi:10.1088/1367-2630/aa57c2 fatcat:suw3iux54jgrxj7dnotw37i3tm

Functional Connectivity and Regional Homogeneity Alterations in Migraine Patients: A Protocol of Systematic Review and Meta-Analysis

2022 Advances in Machine Learning & Artificial Intelligence  
Conclusion: This study will reveal cerebral functional changes of migraine patients based on current literature to identify consistent conclusions and to describe potential future direction.  ...  We will perform a systematic review and meta-analysis of this body of literature, aiming to identify consistent conclusions regarding cerebral functional changes in migraine patients and to describe potential  ...  The altered areas of FC disperse throughout multiple cerebral regions of migraine patients, without many consistently summarized brain regions.  ... 
doi:10.33140/amlai.03.01.06 fatcat:gny36sgotvepjhxrnouwss2zlq

Robust subgroup discovery [article]

Hugo Manuel Proença, Peter Grünwald, Thomas Bäck, Matthijs van Leeuwen
2022 arXiv   pre-print
First, we formulate the broad model class of subgroup lists, i.e., ordered sets of subgroups, for univariate and multivariate targets that can consist of nominal or numeric variables, including traditional  ...  Furthermore, we empirically show on 54 datasets that SSD++ outperforms previous subgroup discovery methods in terms of quality, generalisation on unseen data, and subgroup list size.  ...  Acknowledgements This work is part of the research programme Indo-Dutch Joint Research Programme for ICT 2014 with project number 629.002.201, SAPPAO, which is (partly) financed by the Netherlands Organisation  ... 
arXiv:2103.13686v4 fatcat:6njsxtwx3bce7chcyplgo7daxq

Methylation differences reveal heterogeneity in preterm pathophysiology: results from bipartite network analyses

Suresh K. Bhavnani, Bryant Dang, Varun Kilaru, Maria Caro, Shyam Visweswaran, George Saade, Alicia K. Smith, Ramkumar Menon
2018 Journal of Perinatal Medicine  
Conclusions: The results demonstrate that unsupervised bipartite networks helped to identify complex but comprehensible data-driven hypotheses related to patient subgroups and inferences about their  ...  Methods: The data consisted of DNA methylation across the genome (HumanMethylation450 BeadChip) in cord blood from 50 African-American subjects consisting of 22 cases of early spontaneous PTB (24–34 weeks  ...  However, as shown on the right, there was a dispersed group of cases with mainly thin edges connecting them to all the methylation sites, suggesting that this subgroup was hypomethylated at all the 10  ... 
doi:10.1515/jpm-2017-0126 pmid:28665803 fatcat:kzsuyj76yrcvpg3yhxvouf6jwi

Unsupervised clustering reveals new prostate cancer subtypes

Shaowei Gao, Zeting Qiu, Yiyan Song, Chengqiang Mo, Wulin Tan, Qinchang Chen, Dong Liu, Mengyu Chen, Huaqiang Zhou
2017 Translational Cancer Research  
Conclusions: We established a PCS classifier (183 genes) based on RNA-Seq data, and identified three PCSs.  ...  Three subgroups based on the classifier were tested whether to have significant differences in the clinical data.  ...  We appreciate Desousa's methods, which provided us with so much guidance. Footnote  ... 
doi:10.21037/tcr.2017.05.15 fatcat:5cvpygxcobatxd5f56vaqwmf5q

Ab initio data-analytics study of carbon-dioxide activation on semiconductor oxide surfaces [article]

Aliaksei Mazheika, Yanggang Wang, Rosendo Valero, Luca M. Ghiringhelli, Francesc Vines, Francesc Illas, Sergey V. Levchenko, Matthias Scheffler
2020 arXiv   pre-print
Using artificial intelligence (AI) trained on high-throughput first principles based data for a broad family of oxides, we develop a strategy for a rational design of catalytic materials for converting  ...  Instead, our AI model identifies the common feature of these surfaces in the binding of a molecular O atom to a surface cation, which results in a strong elongation and therefore weakening of one molecular  ...  the many-body dispersion correction [17].  ... 
arXiv:1912.06515v2 fatcat:oea54ondrnamlosmol754zg25a

OB Associations [article]

Nicholas J. Wright
2022 arXiv   pre-print
the subgroups were more compact in the past.  ...  The kinematics of associations have shown them to be globally unbound and expanding, with the majority of recent studies revealing evidence for clear expansion patterns in the association subgroups, suggesting  ...  A more correct statement, and one more consistent with Lada & Lada's view, would be 'most stars form at significantly higher densities than the field', followed by 'and around 10% of stars remain in bound  ... 
arXiv:2203.10007v1 fatcat:7tp223mguve6pi2ncvjcigffda

A study of subgroup discovery approaches for defect prediction

Daniel Rodriguez, Roberto Ruiz, Jose C. Riquelme, Rachel Harrison
2013 Information and Software Technology  
Method: We describe two well-known subgroup discovery algorithms, the SD algorithm and the CN2-SD algorithm to obtain rules that identify defect prone modules.  ...  Subgroup discovery algorithms mitigate against characteristics of datasets that hinder the applicability of classification algorithms and so remove the need for preprocessing techniques.  ...  Petra Kralj Novak for answering some of our questions about the Orange tool and the anonymous reviewers for their useful comments while improving this manuscript.  ... 
doi:10.1016/j.infsof.2013.05.002 fatcat:oazfdzfwjnebpmqo6vgp5fm7e4

Contemporary Methods and Evidence for Species Delimitation

David M. Hillis, E. Anne Chambers, Thomas J. Devitt
2021 Ichthyology & Herpetology  
Testing the alternative hypotheses requires more than speculation about the fate of the hybrid zone based on rough estimates of dispersal distance, generation time, and unpublished data on how long these  ...  with an F1 hybrid; and hybrid indices near 0.25 and 0.75 are consistent with the respective backcrosses).  ... 
doi:10.1643/h2021082 fatcat:dyrcqvz7g5enbn2wt2iolsto74

Gene flow in Argentinian sunflowers as revealed by genotyping-by-sequencing data

Ana Mondon, Gregory L. Owens, Mónica Poverene, Miguel Cantamutto, Loren H. Rieseberg
2017 Evolutionary Applications  
While many hybrids are F1s, there were signals consistent with introgression from the domesticated sunflower into H. petiolaris.  ...  previously published data from samples from the native range (North America), to determine the native source populations of the Argentinian samples and to detect admixture.  ...  ACKNOWLEDGEMENTS DATA ARCHIVING STATEMENT All sequence data are archived in the Sequence Read Archive: BioProject PRJNA359995.  ... 
doi:10.1111/eva.12527 pmid:29387155 pmcid:PMC5775495 fatcat:wshmngzvifb43j254biycwyghy

Reproducibility for Hepatocellular Carcinoma CT Radiomic Features: Influence of Delineation Variability Based on 3D-CT, 4D-CT and Multiple-Parameter MR Images

Jinghao Duan, Qingtao Qiu, Jian Zhu, Dongping Shang, Xue Dou, Tao Sun, Yong Yin, Xiangjuan Meng
2022 Frontiers in Oncology  
Quartile coefficient of dispersion (QCD) and intraclass correlation coefficient (ICC) were applied to assess the variability of each radiomic feature.  ...  However, the number of radiomic features (mean 89) with ICC≥0.75 was the highest in the multiple-parameter MR group, followed by the 3DCT group (mean 77) and the MIP group (mean 73).  ...  Data Analysis and Statistics Variation of Radiomics Features Quartile coefficient of dispersion (QCD) was used to assess the variation of radiomics features.  ... 
doi:10.3389/fonc.2022.881931 pmid:35494061 pmcid:PMC9047864 fatcat:hgwtnl3xzfbpdempilec4tpw4y

Inflammatory profile of patients with tuberculosis with or without HIV-1 co-infection: a prospective cohort study and immunological network analysis

Elsa Du Bruyn, Kiyoshi F Fukutani, Neesha Rockwood, Charlotte Schutz, Graeme Meintjes, María B Arriaga, Juan M Cubillos-Angulo, Rafael Tibúrcio, Alan Sher, Catherine Riou, Katalin A Wilkinson, Bruno B Andrade (+1 others)
2021 The Lancet Microbe  
In addition to the discovery cohort, a validation cohort of patients with HIV-1 admitted to hospital with CD4 counts less than 350 cells per μL and a high clinical suspicion of new tuberculosis were recruited  ...  Through network analysis we identified IL-17A as an important node in HIV-tuberculosis co-infection, thus implicating this cytokine's capacity to correlate with, and regulate, other inflammatory markers  ...  that this hypothesis was correct (figure 3A, appendix 1 p 23).  ... 
doi:10.1016/s2666-5247(21)00037-9 pmid:34386782 pmcid:PMC8357308 fatcat:cuiljjjihrfovasvjlb6azz4zq

Bonobos Extract Meaning from Call Sequences

Zanna Clay, Klaus Zuberbühler, Martine Hausberger
2011 PLoS ONE  
We address this issue with a first playback study on the natural vocal behaviour of bonobos.  ...  Rather than attending to individual calls, bonobos attended to the entire sequences to make inferences about the food encountered by a caller.  ...  We are grateful to Maren Mende for help with data collection, to Brian Kirk and Andy Burnley for technical support, and to Katie Slocombe for valuable advice.  ... 
doi:10.1371/journal.pone.0018786 pmid:21556149 pmcid:PMC3083404 fatcat:b5p2r3bf6zg5bly67dpde3jl7q

In-season prediction of batting averages: A field test of empirical Bayes and Bayes methodologies

Lawrence D. Brown
2008 Annals of Applied Statistics  
data.  ...  A newly proposed nonparametric empirical Bayes procedure performs particularly well in the basic analysis of the full data set, though less well with analyses involving more homogeneous subsets of the  ...  Shirley (who also prepared the original data set used here), S. Jensen, D. Small, A. Wyner and L. Zhao.  ... 
doi:10.1214/07-aoas138 fatcat:lpsn43bhlfeyznxqx3ghb2se5y