3,211 Hits in 5.4 sec

Comparison of Classification Methods for Very High-Dimensional Data in Sparse Random Projection Representation [article]

Anton Akusok, Emil Eirola
2019 arXiv   pre-print
Unstructured data produces very large binary matrices with millions of columns when converted to vector form.  ...  This work studies efficient non-iterative and iterative methods suitable for such data, evaluating the results on two representative machine learning tasks with millions of samples and features.  ...  Acknowledgements This work was supported by Tekes, the Finnish Funding Agency for Innovation, as part of the "Cloud-assisted Security Services" (CloSer) project.  ... 
arXiv:1912.08616v1 fatcat:jadqegkygfeexn4m55zs7vo424
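As a rough illustration of the record above (not the authors' actual pipeline; the mock data, dimensions, and the choice of scikit-learn are assumptions for the sketch), sparse random projection compresses a huge sparse binary matrix into a few thousand columns before an ordinary classifier is fit:

```python
# Minimal sketch, assuming mock binary data and illustrative dimensions.
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.random_projection import SparseRandomProjection
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = sparse_random(10_000, 1_000_000, density=1e-5, format="csr", random_state=rng)
X.data[:] = 1.0                      # binarize: bag-of-words-style features
y = rng.randint(0, 2, size=10_000)   # mock labels

proj = SparseRandomProjection(n_components=1_000, random_state=0)
X_low = proj.fit_transform(X)        # 10,000 x 1,000 projected matrix

clf = LogisticRegression(max_iter=1_000).fit(X_low, y)
print(clf.score(X_low, y))
```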

maxent: An R Package for Low-memory Multinomial Logistic Regression with Support for Semi-automated Text Classification

Timothy P. Jurka
2012 The R Journal  
maxent is a package with tools for data classification using multinomial logistic regression, also known as maximum entropy.  ...  The focus of this maximum entropy classifier is to minimize memory consumption on very large datasets, particularly sparse document-term matrices represented by the tm text mining package.  ...  Acknowledgements This project was made possible through financial support from the University of California at Davis, University of Antwerp, and Sciences Po Bordeaux.  ... 
doi:10.32614/rj-2012-007 fatcat:mpro6mhn3jbbdbo4hw5uq5semu
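The package itself is R with a C++ core; purely as a sketch of the same idea in another stack (scikit-learn and the toy corpus below are my assumptions, and the maxent R API differs), multinomial logistic regression can be fit directly on a sparse document-term matrix so that memory scales with the number of nonzeros:

```python
# Minimal sketch on an invented toy corpus; not the maxent package's API.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["the cat sat", "dogs bark loudly", "cats purr softly", "dogs fetch"]
labels = ["cat", "dog", "cat", "dog"]

vec = CountVectorizer()
X = vec.fit_transform(docs)          # sparse CSR document-term matrix
clf = LogisticRegression(solver="saga", max_iter=1_000).fit(X, labels)
print(clf.predict(vec.transform(["dogs bark"])))
```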

Cross-Lingual Adaptation using Structural Correspondence Learning [article]

Peter Prettenhofer, Benno Stein
2010 arXiv   pre-print
In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation.  ...  The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification  ...  A common choice for R is L2-regularization, which imposes an L2-norm penalty on w: $R(\mathbf{w}) = \frac{1}{2}\lVert\mathbf{w}\rVert_2^2 = \frac{1}{2}\mathbf{w}^{\top}\mathbf{w}$.  ... 
arXiv:1008.0716v2 fatcat:qygcn7nvuvea3erjgllg7vndlq
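To make the penalty quoted in this record (and in the journal version below) concrete, a tiny numerical check, not code from the paper:

```python
# R(w) = 0.5 * ||w||_2^2; its gradient is w itself, which is what
# shrinks the weights at every update. Values are arbitrary.
import numpy as np

w = np.array([0.5, -1.2, 3.0])
R = 0.5 * np.dot(w, w)   # 0.5 * (0.25 + 1.44 + 9.0) = 5.345
grad_R = w               # d/dw [0.5 * w^T w] = w
print(R, grad_R)
```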

Cross-Lingual Adaptation Using Structural Correspondence Learning

Peter Prettenhofer, Benno Stein
2011 ACM Transactions on Intelligent Systems and Technology  
In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation in the context of text classification  ...  The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification  ...  A common choice for R is L2-regularization, which imposes an L2-norm penalty on w: $R(\mathbf{w}) = \frac{1}{2}\lVert\mathbf{w}\rVert_2^2 = \frac{1}{2}\mathbf{w}^{\top}\mathbf{w}$.  ... 
doi:10.1145/2036264.2036277 fatcat:5xsjwtvlh5cx7iucai3wstrvba

Convolutional Neural Network: Text Classification Model for Open Domain Question Answering System [article]

Muhammad Zain Amin, Noman Nadeem
2019 arXiv   pre-print
Recently, machine learning has been applied to almost every data domain, one of which is Question Answering Systems (QAS).  ...  The neural network classifier can be trained on large datasets. We report a series of experiments conducted on a Convolutional Neural Network (CNN) by training it on two different datasets.  ...  Learning algorithms: classifiers such as naïve Bayes (NB), logistic regression (LR), and support vector machines (SVM) are used.  ... 
arXiv:1809.02479v2 fatcat:uvhmz27tlfcljkvytjvks55x5i
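For readers who want the shape of such a model, here is a generic one-dimensional convolutional text classifier; PyTorch is my choice, and the layer sizes and kernel widths are illustrative, not the architecture reported in the paper:

```python
# Generic Kim-style text CNN sketch; hyperparameters are illustrative.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, n_filters=100,
                 kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, n_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))   # (batch, n_classes)

logits = TextCNN()(torch.randint(0, 10_000, (8, 50)))  # 8 mock token sequences
print(logits.shape)                                     # torch.Size([8, 2])
```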

Sparse Named Entity Classification using Factorization Machines [article]

Ai Hirata, Mamoru Komachi
2017 arXiv   pre-print
A bottleneck in named entity classification, however, is the problem of data sparseness, because new named entities continually emerge, making it rather difficult to maintain a dictionary for named entity  ...  Experimental results show that our proposed model, with fewer features and a smaller size, achieves accuracy competitive with state-of-the-art models.  ...  They reported that their regularization achieved higher accuracy than L1 and L2 regularization, which are frequently used in natural language processing (Okanohara and Tsujii, 2009).  ... 
arXiv:1703.04879v1 fatcat:yfmmz22dnjbwvj4rtmzfzvq6re
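The factorization machine score behind this kind of model (a generic FM, not the authors' exact system; the data below are random stand-ins) can be evaluated in O(kn) via the standard sum-of-squares identity for the pairwise term:

```python
# fm_score(x) = w0 + <w, x> + sum_{i<j} <v_i, v_j> x_i x_j, with the pairwise
# sum computed as 0.5 * (||V^T x||^2 - sum_f sum_i v_if^2 x_i^2).
import numpy as np

def fm_score(x, w0, w, V):
    """x: (n,) features; w0: bias; w: (n,) linear weights; V: (n, k) factors."""
    s = V.T @ x                                        # (k,)
    pairwise = 0.5 * (s @ s - ((V ** 2).T @ (x ** 2)).sum())
    return w0 + x @ w + pairwise

rng = np.random.default_rng(0)
n, k = 6, 3
print(fm_score(rng.random(n), 0.1, rng.normal(size=n), rng.normal(size=(n, k))))
```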

Formal Models of the Network Co-occurrence Underlying Mental Operations

Danilo Bzdok, Gaël Varoquaux, Olivier Grisel, Michael Eickenberg, Cyril Poupon, Bertrand Thirion, Danielle S Bassett
2016 PLoS Computational Biology  
This idea is supported by evidence that proportional default-mode network recruitment impairs task performance, which is believed to be subserved by other large-scale networks [25, 26].  ...  In line with this contention, the onset of a given cognitive task might induce characteristic changes in the functional coupling of large-scale networks.  ...  The support vector machines were penalized by L2-regularization because classifier fitting was preceded by automatic selection of the k most relevant networks for each task (cf. methods section).  ... 
doi:10.1371/journal.pcbi.1004994 pmid:27310288 pmcid:PMC4911040 fatcat:2cj2eehfh5emvj5ggkihbaegry
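The analysis the snippet describes maps naturally onto a select-then-classify pipeline; a generic scikit-learn analogue (the synthetic data and k are assumptions, not the study's settings):

```python
# Univariate selection of the k most informative features, then an
# L2-penalized linear SVM, as in the quoted methods description.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=100, random_state=0)
clf = make_pipeline(SelectKBest(f_classif, k=20),
                    LinearSVC(penalty="l2", max_iter=5_000))
print(clf.fit(X, y).score(X, y))
```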

A Comparative Study on Different Types of Approaches to Bengali document Categorization [article]

Md. Saiful Islam, Fazla Elahi Md Jubayer, Syed Ikhtiar Ahmed
2017 arXiv   pre-print
In this paper, three well-known supervised learning techniques, Support Vector Machine (SVM), Naïve Bayes (NB), and Stochastic Gradient Descent (SGD), are compared for Bengali document categorization.  ...  Document categorization is a technique where the category of a document is determined.  ...  It is effectively applied to large-scale and sparse machine learning problems frequently encountered in DC.  ... 
arXiv:1701.08694v1 fatcat:z2my45womnh5tjy7qlnfkvksyu
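A minimal version of such a comparison on a sparse document-term matrix might look as follows (toy English corpus; the paper's Bengali data and preprocessing are not reproduced):

```python
# Compare SVM, NB, and SGD on the same sparse tf-idf features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

docs = ["team wins the match", "election results announced",
        "players score goals", "parliament passes budget"]
labels = ["sports", "politics", "sports", "politics"]

X = TfidfVectorizer().fit_transform(docs)
for model in (LinearSVC(), MultinomialNB(), SGDClassifier(loss="hinge")):
    print(type(model).__name__, model.fit(X, labels).predict(X))
```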

A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing [article]

Quan Zhou, Wenlin Chen, Shiji Song, Jacob R. Gardner, Kilian Q. Weinberger, Yixin Chen
2014 arXiv   pre-print
The past years have witnessed many dedicated open-source projects that build and maintain implementations of Support Vector Machines (SVM), parallelized for GPUs, multi-core CPUs, and distributed systems  ...  With a simple wrapper, consisting of only 11 lines of MATLAB code, we obtain an Elastic Net implementation that naturally utilizes GPUs and multi-core CPUs.  ...  For example, deep (convolutional) neural networks can naturally take advantage of multiple GPUs [18]; support vector machines (SVM) have been ported to GPUs [8, 26], multi-core CPUs [7, 26], and even  ... 
arXiv:1409.1976v1 fatcat:r3bt6ir2uvfvhaxlwgli4s5sbe
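The paper's point is that the elastic net can be re-expressed as an SVM instance so that existing GPU SVM solvers do the work; its 11-line MATLAB wrapper is not reproduced here. For orientation, here is the model being solved, fit with an ordinary CPU solver on illustrative data:

```python
# Plain elastic net fit; alpha and l1_ratio blend the L1 and L2 penalties.
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=100, n_features=500, noise=0.1, random_state=0)
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print((model.coef_ != 0).sum(), "nonzero coefficients out of", X.shape[1])
```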

Unsupervised Feature Learning for Visual Sign Language Identification

Binyam Gebrekidan Gebre, Onno Crasborn, Peter Wittenburg, Sebastian Drude, Tom Heskes
2014 Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
In this paper, we focus on the visual modality and present a method for identifying sign languages solely from short video samples.  ...  Given that sign languages are under-resourced, unsupervised feature learning techniques are the right tools, and our results indicate that this is realistic for sign language identification.  ...  Learning: learn a linear classifier to predict the labels given the feature vectors. We use a logistic regression classifier and support vector machines (Pedregosa et al., 2011).  ... 
doi:10.3115/v1/p14-2061 dblp:conf/acl/GebreCWDH14 fatcat:37rqfzfgfzchlm4wbdd6z7s7ba
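The recipe the snippet outlines, unsupervised features followed by a linear classifier, can be sketched generically; the K-Means codebook and random arrays below are stand-ins I chose, not the paper's learned video features:

```python
# Learn a codebook without labels, encode each sample against it,
# then fit a linear classifier on the resulting feature vectors.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
patches = rng.normal(size=(500, 64))              # mock local descriptors
codebook = KMeans(n_clusters=32, n_init=10, random_state=0).fit(patches)

videos = rng.normal(size=(40, 64))                # one mock descriptor per clip
features = codebook.transform(videos)             # distances to 32 centroids
labels = rng.integers(0, 2, size=40)              # mock language labels
clf = LogisticRegression(max_iter=1_000).fit(features, labels)
print(clf.score(features, labels))
```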

A family of large margin linear classifiers and its application in dynamic environments

Jianqiang Shen, Thomas G. Dietterich
2009 Statistical analysis and data mining  
Real-time problems, in which the learning must be fast and the importance of the features might be changing, pose a challenge to machine learning algorithms.  ...  We solve the problems by combining regularization mechanisms with online large margin learning algorithms.  ...  They have limited requirements for CPU time and memory. Their efficiency has made them popular for large-scale learning problems, such as natural language processing.  ... 
doi:10.1002/sam.10055 fatcat:nrpq2takebdylpukefpz4xwo3q
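A generic sketch of the combination this record (and its conference version listed next) describes, online large-margin updates with L2 regularization; scikit-learn's SGDClassifier stands in for the authors' algorithm, and the data stream is simulated:

```python
# Hinge loss + L2 penalty, updated one mini-batch at a time via partial_fit.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(loss="hinge", penalty="l2", alpha=1e-4)
classes = np.array([0, 1])

for _ in range(100):                              # simulated data stream
    X = rng.normal(size=(10, 20))
    y = (X[:, 0] > 0).astype(int)
    clf.partial_fit(X, y, classes=classes)        # one online update per batch

X_test = rng.normal(size=(200, 20))
print(clf.score(X_test, (X_test[:, 0] > 0).astype(int)))
```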

A Family of Large Margin Linear Classifiers and Its Application in Dynamic Environments [chapter]

Jianqiang Shen, Thomas G. Dietterich
2009 Proceedings of the 2009 SIAM International Conference on Data Mining  
Real-time problems, in which the learning must be fast and the importance of the features might be changing, pose a challenge to machine learning algorithms.  ...  We solve the problems by combining regularization mechanisms with online large margin learning algorithms.  ...  They have limited requirements for CPU time and memory. Their efficiency has made them popular for large-scale learning problems, such as natural language processing.  ... 
doi:10.1137/1.9781611972795.15 dblp:conf/sdm/ShenD09 fatcat:3x2lzzqmyjckrhwon6sdadlt3m

Deep Learning on Big, Sparse, Behavioral Data

Sofie De Cnudde, Yanou Ramon, David Martens, Foster Provost
2019 Big Data  
...has been found to be the most accurate machine learning technique generally for sparse behavioral data.  ...  The outstanding performance of deep learning (DL) for computer vision and natural language processing has fueled increased interest in applying these algorithms more broadly in both research and practice  ...  They compare variations of support vector machines, naive Bayes, LR, and a relational classifier called pseudosocial network (PSN).  ... 
doi:10.1089/big.2019.0095 pmid:31860341 fatcat:p5pe36v5srgerdh7rwmfi7oq6m

Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach [article]

Alexandros Haliassos, Kriton Konstantinidis, Danilo P. Mandic
2021 arXiv   pre-print
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, characterized by a lack of inherent ordering of features (variables).  ...  The brute force approach of learning a parameter for each interaction of every order comes at an exponential computational and memory cost (Curse of Dimensionality).  ...  A.H. is supported by an Imperial College London President's Scholarship. K.K. is supported by an EPSRC International Doctoral Scholarship.  ... 
arXiv:2001.10109v3 fatcat:yclrwbdtbjh4tpibimlicnvvz4
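For a concrete sense of the memory gap the snippet mentions (the numbers are illustrative, not from the paper): a full order-3 interaction weight tensor over $n$ features stores $n^3$ values, while a rank-$R$ canonical polyadic factorization of that tensor stores only $3nR$:

$$n = 10^3,\ R = 50:\qquad n^3 = 10^9 \quad\text{vs.}\quad 3nR = 1.5\times10^5 .$$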

Sparse regularization techniques provide novel insights into outcome integration processes

Holger Mohr, Uta Wolfensteller, Steffi Frimmel, Hannes Ruge
2015 NeuroImage  
L2-regularized (dense) Support Vector Machine on this whole-brain between-subject classification task.  ...  While the beneficial effect of differential outcomes is well-studied in trial-and-error learning, outcome integration in the context of instruction-based learning has remained largely unexplored.  ...  Acknowledgment This work was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) SFB940 sub-project Z2.  ... 
doi:10.1016/j.neuroimage.2014.10.025 pmid:25467302 fatcat:o6c7n3zrazg2hfqherjdtki4iu
Showing results 1–15 of 3,211 results