
Beyond Prediction: First Steps Toward Automatic Intervention in MOOC Student Stopout

Jacob Whitehill, Joseph Jay Williams, Glenn Lopez, Cody Austun Coleman, Justin Reich
2015 Social Science Research Network  
High attrition rates in massive open online courses (MOOCs) have motivated growing interest in the automatic detection of student "stopout". Stopout classifiers can be used to orchestrate an intervention before students quit, and to survey students dynamically about why they ceased participation. In this paper we expand on existing stop-out detection research by (1) exploring important elements of classifier design such as generalizability to new courses; (2) developing a novel framework
more » ... d by control theory for how to use a classifier's outputs to make intelligent decisions; and (3) presenting results from a "dynamic survey intervention" conducted on 2 HarvardX MOOCs, containing over 40000 students, in early 2015. Our results suggest that surveying students based on an automatic stopout classifier achieves higher response rates compared to traditional post-course surveys, and may boost students' propensity to "come back" into the course.
doi:10.2139/ssrn.2611750 fatcat:k7ukmcmfmnbgxm65ibajkvsxwy

Beyond Time-on-Task: The Relationship between Spaced Study and Certification in MOOCs

Yohsuke R. Miyamoto, Cody Austun Coleman, Joseph Jay Williams, Jacob Whitehill, Sergiy O Nesterko, Justin Reich
2015 Social Science Research Network  
A long history of laboratory and field experiments have demonstrated that dividing study time into many sessions is often superior to massing study time into few sessions, a phenomenon known as the "spacing effect." We use this well-established finding from the psychology literature as inspiration for investigating how students distribute their study sessions across an entire Massive Open Online Course (MOOC). Drawing on observational tracking log data from 20 HarvardX courses, we examine the
more » ... lationship between students' allocation of their time in MOOCs and their performance. While controlling for the effect of total time, we show that the number of sessions students initiate is correlated with certification rate, across students in all courses. A one-unit change in session count is positively associated with an estimated 3.4% change in certification odds. When individual students spend similar amounts of time in multiple courses, they perform better in courses where that time is distributed among more sessions, suggesting that the benefit of spacing MOOC study sessions is independent of student characteristics. Our study demonstrates that well-established learning theories can be combined with massive new datasets and innovative approaches to learning analytics to advance our understanding of student practice and learning.
doi:10.2139/ssrn.2547799 fatcat:zzkhqajy2ne5nbwsps3ycaq2eq

ASXL1 Directs Neutrophilic Differentiation via Modulation of MYC and RNA Polymerase II [article]

Theodore P Braun, Joseph Estabrook, Daniel J Coleman, Zachary Schonrock, Brittany M Smith, Trevor Enright, Cody Coblentz, Rowan Callahan, Hisham Mohammed, Brian J Druker, Theresa Lusardi, Julia E Maxson
2020 bioRxiv   pre-print
Mutations in the gene Additional Sex-Combs Like 1 (ASXL1) are recurrent in myeloid malignancies as well as the pre-malignant condition clonal hematopoiesis, where they are universally associated with poor prognosis. An epigenetic regulator, ASXL1 canonically directs the deposition of H3K27me3 via the polycomb repressive complex 2. However, its precise role in myeloid lineage maturation is incompletely described. We utilized single cell RNA sequencing (scRNA-seq) on a murine model of
more » ... ic-specific ASXL1 deletion and identified a specific role for ASXL1 in terminal granulocyte maturation. Terminal maturation is accompanied by down regulation of Myc expression and cell cycle exit. ASXL1 deletion leads to hyperactivation of Myc in granulocyte precursors and a quantitative decrease in neutrophil production. This failure of normal developmentally-associated Myc suppression is not accompanied by significant changes in the landscape of covalent histone modifications including H3K27me3. Examining the genome-wide localization of ASXL1 in myeloid progenitors revealed strong co-localization with RNA Polymerase II (RNAPII) at the promoters and spread across the gene bodies of transcriptionally active genes. ASXL1 deletion results in a decrease in RNAPII promoter-proximal pausing in granulocyte progenitors, indicative of a global increase in productive transcription, consistent with the known role of ASXL1 as a mediator of RNAPII pause release. These results suggest that ASXL1 inhibits productive transcription in granulocyte progenitors, identifying a new role for this epigenetic regulator and highlighting a novel potential oncogenic mechanism for ASXL1 mutations in myeloid malignancies.
doi:10.1101/2020.09.14.295295 fatcat:ln3qq5pferb6jnrtpg7mrqmceq

Beyond time-on-task: The relationship between spaced study and certification in MOOCs

Yohsuke R. Miyamoto, Cody A. Coleman, Joseph Jay Williams, Jacob Whitehill, Sergiy Nesterko, Justin Reich
2015 Journal of Learning Analytics  
A long history of laboratory and field experiments have demonstrated that dividing study time into many sessions is often superior to massing study time into few sessions, a phenomenon known as the "spacing effect." We use this well-established finding from the psychology literature as inspiration for investigating how students distribute their study sessions across an entire Massive Open Online Course (MOOC). Drawing on observational tracking log data from 20 HarvardX courses, we examine the
more » ... lationship between students' allocation of their time in MOOCs and their performance. While controlling for the effect of total time, we show that the number of sessions students initiate is correlated with certification rate, across students in all courses. A one-unit change in session count is positively associated with an estimated 3.4% change in certification odds. When individual students spend similar amounts of time in multiple courses, they perform better in courses where that time is distributed among more sessions, suggesting that the benefit of spacing MOOC study sessions is independent of student characteristics. Our study demonstrates that well-established learning theories can be combined with massive new datasets and innovative approaches to learning analytics to advance our understanding of student practice and learning.
doi:10.18608/jla.2015.22.5 fatcat:uym556eclbcpvbemm7hturyr4m

PATTERNS OF HABITAT USE BY BATS ALONG A RIPARIAN CORRIDOR IN NORTHERN UTAH

Duke S. Rogers, Mark C. Belk, Malinda W. González, Brent L. Coleman, Cody W. Edwards
2006 The Southwestern naturalist  
These points were identified with global positioning systems and ground verification to ensure they were consistently monitored for the duration of the study (Coleman, 2002).  ...  were as follows: riparian forest = 10 nights, 40 acoustic survey h; wetlands = 15 nights, 60 h; agriculture fields = 9 nights, 36 h; edge areas = 19 nights, 76 h; restoration site = 8 nights, 32 h (Coleman  ... 
doi:10.1894/0038-4909(2006)51[52:pohubb]2.0.co;2 fatcat:vy6is3rdm5empkxljvgfeapd2i

Similarity Search for Efficient Active Learning and Search of Rare Concepts [article]

Cody Coleman, Edward Chou, Julian Katz-Samuels, Sean Culatana, Peter Bailis, Alexander C. Berg, Robert Nowak, Roshan Sumbaly, Matei Zaharia, I. Zeki Yalniz
2021 arXiv   pre-print
A simple framework for contrastive learning of visual representations. arXiv preprint arXiv:2002.05709, 2020. [12] Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis,  ... 
arXiv:2007.00077v2 fatcat:iadi75ltgrhobp6ocobssdmrwu

Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark [article]

Cody Coleman, Daniel Kang, Deepak Narayanan, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, Kunle Olukotun, Chris Re, Matei Zaharia
2019 arXiv   pre-print
Researchers have proposed hardware, software, and algorithmic optimizations to improve the computational performance of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision), and can impact the final model's accuracy on unseen data. Due to a lack of standard evaluation criteria that considers these trade-offs, it is difficult to directly compare
more » ... hese optimizations. To address this problem, we recently introduced DAWNBench, a benchmark competition focused on end-to-end training time to achieve near-state-of-the-art accuracy on an unseen dataset---a combined metric called time-to-accuracy (TTA). In this work, we analyze the entries from DAWNBench, which received optimized submissions from multiple industrial groups, to investigate the behavior of TTA as a metric as well as trends in the best-performing entries. We show that TTA has a low coefficient of variation and that models optimized for TTA generalize nearly as well as those trained using standard methods. Additionally, even though DAWNBench entries were able to train ImageNet models in under 3 minutes, we find they still underutilize hardware capabilities such as Tensor Cores. Furthermore, we find that distributed entries can spend more than half of their time on communication. We show similar findings with entries to the MLPERF v0.5 benchmark.
arXiv:1806.01427v2 fatcat:oaxei3hhufa2dkqgqhs2jftmmu

Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark

Cody Coleman, Matei Zaharia, Daniel Kang, Deepak Narayanan, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, Kunle Olukotun, Chris Ré
2019 ACM SIGOPS Operating Systems Review  
Researchers have proposed hardware, software, and algorithmic optimizations to improve the computational performance of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision), and can impact the final model's accuracy on unseen data. Due to a lack of standard evaluation criteria that considers these trade-offs, it is difficult to directly compare
more » ... hese optimizations. To address this problem, we recently introduced DAWNBENCH, a benchmark competition focused on end-to-end training time to achieve near-state-of-the-art accuracy on an unseen dataset, a combined metric called time-to-accuracy (TTA). In this work, we analyze the entries from DAWNBENCH, which received optimized submissions from multiple industrial groups, to investigate the behavior of TTA as a metric as well as trends in the best-performing entries. We show that TTA has a low coefficient of variation and that models optimized for TTA generalize nearly as well as those trained using standard methods. Additionally, even though DAWNBENCH entries were able to train ImageNet models in under 3 minutes, we find they still underutilize hardware capabilities such as Tensor Cores. Furthermore, we find that distributed entries can spend more than half of their time on communication. We show similar findings with entries to the MLPERF v0.5 benchmark.
doi:10.1145/3352020.3352024 fatcat:qqiqkgwgunbutj673q4dqxie6u

HarvardX and MITx: Two Years of Open Online Courses Fall 2012-Summer 2014

Andrew Dean Ho, Isaac Chuang, Justin Reich, Cody Austun Coleman, Jacob Whitehill, Curtis G Northcutt, Joseph Jay Williams, John D Hansen, Glenn Lopez, Rebecca Petersen
2015 Social Science Research Network  
doi:10.2139/ssrn.2586847 fatcat:3wy3naushrbopbtvie6qea2fzy

Selection via Proxy: Efficient Data Selection for Deep Learning [article]

Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia
2020 arXiv   pre-print
Data selection methods, such as active learning and core-set selection, are useful tools for machine learning on large datasets. However, they can be prohibitively expensive to apply in deep learning because they depend on feature representations that need to be learned. In this work, we show that we can greatly improve the computational efficiency by using a small proxy model to perform data selection (e.g., selecting data points to label for active learning). By removing hidden layers from
more » ... target model, using smaller architectures, and training for fewer epochs, we create proxies that are an order of magnitude faster to train. Although these small proxy models have higher error rates, we find that they empirically provide useful signals for data selection. We evaluate this "selection via proxy" (SVP) approach on several data selection tasks across five datasets: CIFAR10, CIFAR100, ImageNet, Amazon Review Polarity, and Amazon Review Full. For active learning, applying SVP can give an order of magnitude improvement in data selection runtime (i.e., the time it takes to repeatedly train and select points) without significantly increasing the final error (often within 0.1%). For core-set selection on CIFAR10, proxies that are over 10x faster to train than their larger, more accurate targets can remove up to 50% of the data without harming the final accuracy of the target, leading to a 1.6x end-to-end training time improvement.
arXiv:1906.11829v4 fatcat:jspmbcf7mngllnmoezxc5utg74

A simplified design for the C. elegans lifespan machine

Mark Abbott, Stephen A Banse, Ilija Melentijevic, Cody M Jarrett, Jonathan St Ange, Christine A Sedore, Ron Falkowski, Benjamin W Blue, Anna L Coleman-Hulbert, Erik Johnson, Max Guo, Gordon J Lithgow (+2 others)
2020 Journal of Biological Methods  
Caenorhabditis elegans (C. elegans) lifespan assays constitute a broadly used approach for investigating the fundamental biology of longevity. Traditional C. elegans lifespan assays require labor-intensive microscopic monitoring of individual animals to evaluate life/death over a period of weeks, making large-scale high throughput studies impractical. The lifespan machine developed by Stroustrup et al. (2013) adapted flatbed scanner technologies to contribute a major technical advance in the
more » ... iciency of C. elegans survival assays. Introducing a platform in which large portions of a lifespan assay are automated enabled longevity studies of a scope not possible with previous exclusively manual assays and facilitated novel discovery. Still, as initially described, constructing and operating scanner-based lifespan machines requires considerable effort and expertise. Here we report on design modifications that simplify construction, decrease cost, eliminate certain mechanical failures, and decrease assay workload requirements. The modifications we document should make the lifespan machine more accessible to interested laboratories.
doi:10.14440/jbm.2020.332 pmid:33204740 pmcid:PMC7666331 fatcat:63445ayhzjgctcwxbbvf7ga56a

Platelet-derived growth factor receptor alpha (PDGFRα) targeting and relevant biomarkers in ovarian carcinoma

Koji Matsuo, Masato Nishimura, Kakajan Komurov, Mian M.K. Shahzad, Rouba Ali-Fehmi, Ju-Won Roh, Chunhua Lu, Dianna D. Cody, Prahlad T. Ram, Nick Loizos, Robert L. Coleman, Anil K. Sood
2014 Gynecologic Oncology  
Objective-Platelet-derived growth factor receptor alpha (PDGFRα) is believed to be associated with cell survival. We examined (i) whether PDGFRα blockade enhances the antitumor activity of taxanes in ovarian carcinoma and (ii) potential biomarkers of response to anti-PDGFRα therapy. Methods-PDGFRα expression in 176 ovarian carcinomas was evaluated with tissue microarray and correlated to survival outcome. Human-specific monoclonal antibody to PDGFRα (IMC-3G3) was used for in vitro and in vivo
more » ... periments with or without docetaxel. Gene microarrays and reverse-phase protein arrays with pathway analyses were performed to identify potential predictive biomarkers. Results-When compared to low or no PDGFRα expression, increased PDGFRα expression was associated with significantly poorer overall survival of patients with ovarian cancer (P = 0.014). Although treatment with IMC-3G3 alone did not affect cell viability or increase apoptosis, concurrent use of IMC-3G3 with docetaxel significantly enhanced sensitization to docetaxel and apoptosis. In an orthotopic mouse model, IMC-3G3 monotherapy had no significant antitumor effects in SKOV3-ip1 (low PDGFRα expression), but showed significant antitumor effects in HeyA8-MDR (high PDGFRα expression). Concurrent use of IMC-3G3 with docetaxel, compared with use of docetaxel alone, significantly reduced tumor weight in all tested cell lines. In protein ontology, the EGFR and AKT pathways were downregulated by IMC-3G3 therapy. MAPK and CCNB1 were downregulated only in the HeyA8-MDR model. Conclusion-These data identify IMC-3G3 as an attractive therapeutic strategy and identify potential predictive markers for further development.
doi:10.1016/j.ygyno.2013.10.027 pmid:24183729 pmcid:PMC3946949 fatcat:dgvzgooc7beu7e5qrmryzlxiqm

Text Classification and Tagging of United States Army Ground Vehicle Fault Descriptions in Support of Data-Driven Prognostics

Brandon Hansen, Cody Coleman, Yi Zhang, Maria Seale
2020 Proceedings of the Annual Conference of the Prognostics and Health Management Society, PHM  
The manner in which a prognostics problem is framed is critical for enabling its solution by the proper method. Recently, data-driven prognostics techniques have demonstrated enormous potential when used alone, or as part of a hybrid solution in conjunction with physics-based models. Historical maintenance data constitutes a critical element for the use of a data-driven approach to prognostics, such as supervised machine learning. The historical data is used to create training and testing data
more » ... ets to develop the machine learning model. Categorical classes for prediction are required for machine learning methods; however, faults of interest in US Army Ground Vehicle Maintenance Records appear as natural language text descriptions rather than a finite set of discrete labels. Transforming linguistically complex data into a set of prognostics classes is necessary for utilizing supervised machine learning approaches for prognostics. Manually labeling fault description instances is effective, but extremely time-consuming; thus, an automated approach to labelling is preferred. The approach described in this paper examines key aspects of the fault text relevant to enabling automatic labeling. A method was developed based on the hypothesis that a given fault description could be generalized into a category. This method uses various natural language processing (NLP) techniques and a priori knowledge of ground vehicle faults to assign classes to the maintenance fault descriptions. The core component of the method used in this paper is a Word2Vec word-embedding model. Word embeddings are used in conjunction with a token-oriented rule-based data structure for document classification. This methodology tags text with user-provided classes using a corpus of similar text fields as its training set. With classes of faults reliably assigned to a given description, supervised machine learning with these classes can be applied using related maintenance information that preceded the fault. This method was developed for labeling US Army Ground Vehicle Maintenance Records, but is general enough to be applied to any natural language data sets accompanied with a priori knowledge of its contents for consistent labeling. In addition to applications in machine learning, generated labels are also conducive to general summarization and case-by-case analysis of faults. 
The maintenance components of interest for this current application are alternators and gaskets, with future development directed towards determining the RUL of these components based on the labeled data.
doi:10.36001/phmconf.2020.v12i1.1154 fatcat:pllys6q2rvdcvbgr5vvfhnye6y

Influence of social determinants of health and county vaccination rates on machine learning models to predict COVID-19 case growth in Tennessee

Lukasz S Wylezinski, Coleman R Harris, Cody N Heiser, Jamieson D Gray, Charles F Spurlock
2021 BMJ Health & Care Informatics  
R Harris @colemanrharris, Cody N Heiser @cody_heiser, Jamieson D Gray @jamiesongray and Charles F Spurlock @cfspurlock Contributors CFS had full access to all of the data in the study and takes responsibility  ...  governments to improve policy and resource allocation to mitigate outbreaks, enhance resilience to future public health threats, and capture evolving risk profiles as novel virus variants emerge. 8 Twitter Coleman  ... 
doi:10.1136/bmjhci-2021-100439 pmid:34580088 pmcid:PMC8478575 fatcat:s7wcr5hstfhh3p7biazp5vigea

Preclinical Evaluation of a Genetically Engineered Herpes Simplex Virus Expressing Interleukin-12

J. M. Markert, J. J. Cody, J. N. Parker, J. M. Coleman, K. H. Price, E. R. Kern, D. C. Quenelle, A. D. Lakeman, T. R. Schoeb, C. A. Palmer, S. C. Cartner, G. Y. Gillespie (+1 others)
2012 Journal of Virology  
Herpes simplex virus 1 (HSV-1) mutants that lack the γ(1)34.5 gene are unable to replicate in the central nervous system but maintain replication competence in dividing cell populations, such as those found in brain tumors. We have previously demonstrated that a γ(1)34.5-deleted HSV-1 expressing murine interleukin-12 (IL-12; M002) prolonged survival of immunocompetent mice in intracranial models of brain tumors. We hypothesized that M002 would be suitable for use in clinical trials for patients
more » ... with malignant glioma. To test this hypothesis, we (i) compared the efficacy of M002 to three other HSV-1 mutants, R3659, R8306, and G207, in murine models of brain tumors, (ii) examined the safety and biodistribution of M002 in the HSV-1-sensitive primate Aotus nancymae following intracerebral inoculation, and (iii) determined whether murine IL-12 produced by M002 was capable of activating primate lymphocytes. Results are summarized as follows: (i) M002 demonstrated superior antitumor activity in two different murine brain tumor models compared to three other genetically engineered HSV-1 mutants; (ii) no significant clinical or magnetic resonance imaging evidence of toxicity was observed following direct inoculation of M002 into the right frontal lobes of A. nancymae; (iii) there was no histopathologic evidence of disease in A. nancymae 1 month or 5.5 years following direct inoculation; and (iv) murine IL-12 produced by M002 activates A. nancymae lymphocytes in vitro. We conclude that the safety and preclinical efficacy of M002 warrants the advancement of a Δγ(1)34.5 virus expressing IL-12 to phase I clinical trials for patients with recurrent malignant glioma.
doi:10.1128/jvi.06998-11 pmid:22379082 pmcid:PMC3347348 fatcat:esij5daprrglpj43lxr5b3s2vi