Comparison of Classification Methods for Very High-Dimensional Data in Sparse Random Projection Representation
[article]
2019
arXiv
pre-print
Unstructured data produces very large binary matrices with millions of columns when converted to vector form. ...
This work studies efficient non-iterative and iterative methods suitable for such data, evaluating the results on two representative machine learning tasks with millions of samples and features. ...
Acknowledgements This work was supported by Tekes, the Finnish Funding Agency for Innovation, as part of the "Cloud-assisted Security Services" (CloSer) project. ...
arXiv:1912.08616v1
fatcat:jadqegkygfeexn4m55zs7vo424
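The entry above reduces millions of binary columns with sparse random projection before classification. A minimal Python sketch of that general idea, assuming scikit-learn and synthetic data (not the paper's datasets or exact pipeline):

# Sparse random projection of a large sparse binary matrix, followed by a
# linear classifier on the low-dimensional representation (illustrative only).
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.random_projection import SparseRandomProjection
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = sparse_random(1000, 100_000, density=1e-4, format="csr", random_state=rng)
X.data[:] = 1.0                          # make it a binary indicator matrix
y = rng.randint(0, 2, size=1000)         # synthetic labels

proj = SparseRandomProjection(n_components=256, random_state=0)
X_low = proj.fit_transform(X)            # 1000 x 256 projected matrix
clf = LogisticRegression(max_iter=1000).fit(X_low, y)
print(clf.score(X_low, y))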
maxent: An R Package for Low-memory Multinomial Logistic Regression with Support for Semi-automated Text Classification
2012
The R Journal
maxent is a package with tools for data classification using multinomial logistic regression, also known as maximum entropy. ...
The focus of this maximum entropy classifier is to minimize memory consumption on very large datasets, particularly sparse document-term matrices represented by the tm text mining package. ...
Acknowledgements This project was made possible through financial support from the University of California at Davis, University of Antwerp, and Sciences Po Bordeaux. ...
doi:10.32614/rj-2012-007
fatcat:mpro6mhn3jbbdbo4hw5uq5semu
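maxent itself is an R package; as a rough analogue only (an assumption, not that package's API), the same kind of model, multinomial logistic regression fit on a sparse document-term matrix that is never densified, can be sketched in Python with scikit-learn:

# Multinomial logistic regression on a sparse document-term matrix
# (illustrative Python analogue of the approach, not the maxent R package).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["cheap pills online", "meeting at noon",
        "win money now", "project status update"]
labels = ["spam", "ham", "spam", "ham"]

vec = CountVectorizer()
X = vec.fit_transform(docs)               # CSR sparse document-term matrix
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(vec.transform(["free money pills"])))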
Cross-Lingual Adaptation using Structural Correspondence Learning
[article]
2010
arXiv
pre-print
In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation. ...
The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification ...
A common choice for R is L2-regularization, which imposes an L2-norm penalty on w, R(w) = ½‖w‖₂² = ½ wᵀw. ...
arXiv:1008.0716v2
fatcat:qygcn7nvuvea3erjgllg7vndlq
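For reference, the penalty quoted in the snippet above, written out together with its gradient and its place in a generic regularized training objective (standard facts, not specific to this paper):

% L2 penalty, its gradient, and a generic regularized objective
R(w) = \tfrac{1}{2}\lVert w \rVert_2^2 = \tfrac{1}{2} w^\top w,
\qquad \nabla R(w) = w,
\qquad \min_{w}\; L(w) + \lambda R(w)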
Cross-Lingual Adaptation Using Structural Correspondence Learning
2011
ACM Transactions on Intelligent Systems and Technology
In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation in the context of text classification ...
The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification ...
A common choice for R is L2-regularization, which imposes an L2-norm penalty on w, R(w) = ½‖w‖₂² = ½ wᵀw. ...
doi:10.1145/2036264.2036277
fatcat:5xsjwtvlh5cx7iucai3wstrvba
Convolutional Neural Network: Text Classification Model for Open Domain Question Answering System
[article]
2019
arXiv
pre-print
Recently, machine learning is being applied to almost every data domain, one of which is Question Answering Systems (QAS). ...
The neural network classifier can be trained on large datasets. We report a series of experiments conducted on a Convolutional Neural Network (CNN) by training it on two different datasets. ...
Learning Algorithms: use classifiers such as naïve Bayes (NB), logistic regression (LR) and support vector machines (SVM). ...
arXiv:1809.02479v2
fatcat:uvhmz27tlfcljkvytjvks55x5i
Sparse Named Entity Classification using Factorization Machines
[article]
2017
arXiv
pre-print
A bottleneck in named entity classification, however, is the data problem of sparseness, because new named entities continually emerge, making it rather difficult to maintain a dictionary for named entity ...
Experimental results show that our proposed model, with fewer features and a smaller size, achieves competitive accuracy to state-of-the-art models. ...
They reported that their regularization achieved higher accuracy than L1 and L2 regularization, which are frequently used in natural language processing (Okanohara and Tsujii, 2009). ...
arXiv:1703.04879v1
fatcat:yfmmz22dnjbwvj4rtmzfzvq6re
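The model family named in the title is the standard factorization machine; a minimal numpy sketch of the second-order FM score using the O(kn) identity for the pairwise term, purely illustrative and not the paper's model or training code:

# Second-order factorization machine score on a sparse, high-dimensional
# input vector (illustrative sketch, no training loop).
import numpy as np

def fm_score(x, w0, w, V):
    """x: (n,) features, w0: bias, w: (n,) linear weights,
    V: (n, k) factor matrix with one k-dim embedding per feature."""
    linear = w0 + w @ x
    s = V.T @ x                       # (k,)  sum_i v_i * x_i
    s_sq = (V ** 2).T @ (x ** 2)      # (k,)  sum_i v_i^2 * x_i^2
    pairwise = 0.5 * np.sum(s ** 2 - s_sq)
    return linear + pairwise

rng = np.random.default_rng(0)
n, k = 10_000, 8                      # many features, small factor dimension
x = np.zeros(n)
x[rng.choice(n, 5, replace=False)] = 1.0   # only a few active features
print(fm_score(x, 0.0, rng.normal(0, 0.01, n), rng.normal(0, 0.01, (n, k))))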
Formal Models of the Network Co-occurrence Underlying Mental Operations
2016
PLoS Computational Biology
This idea is supported by evidence that proportional default-mode network recruitment impairs task performance, which is believed to be subserved by other large-scale networks [25, 26]. ...
In line with this contention, the onset of a given cognitive task might induce characteristic changes in functional coupling of large-scale networks. ...
The support vector machines were penalized by L2-regularization because classifier fitting was preceded by automatic selection of the k most relevant networks for each task (cf. methods section). ...
doi:10.1371/journal.pcbi.1004994
pmid:27310288
pmcid:PMC4911040
fatcat:2cj2eehfh5emvj5ggkihbaegry
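As an illustration of the analysis choice described above (selecting the k most relevant features, then fitting an L2-penalized linear SVM), a hedged scikit-learn sketch on synthetic data; the study's actual pipeline, features, and parameters are assumptions here:

# Univariate selection of the k best features followed by an L2-penalized
# linear SVM (illustrative pipeline, not the study's code).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))        # e.g. 60 subjects x 200 network features
y = rng.integers(0, 2, size=60)       # two task conditions

pipe = make_pipeline(SelectKBest(f_classif, k=20),
                     LinearSVC(penalty="l2", C=1.0, max_iter=10_000))
pipe.fit(X, y)
print(pipe.score(X, y))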
A Comparative Study on Different Types of Approaches to Bengali document Categorization
[article]
2017
arXiv
pre-print
In this paper, three well-known supervised learning techniques, Support Vector Machine (SVM), Naïve Bayes (NB) and Stochastic Gradient Descent (SGD), are compared for Bengali document categorization. ...
Document categorization is a technique where the category of a document is determined. ...
It is effectively applied to large-scale and sparse machine learning issues frequently experienced in DC. ...
arXiv:1701.08694v1
fatcat:z2my45womnh5tjy7qlnfkvksyu
A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing
[article]
2014
arXiv
pre-print
The past years have witnessed many dedicated open-source projects that built and maintain implementations of Support Vector Machines (SVM), parallelized for GPU, multi-core CPUs and distributed systems ...
With a simple wrapper, consisting of only 11 lines of MATLAB code, we obtain an Elastic Net implementation that naturally utilizes GPU and multi-core CPUs. ...
For example, deep (convolutional) neural networks can naturally take advantage of multiple GPUs [18]; support vector machines (SVM) have been ported to GPUs [8, 26], multi-core CPUs [7, 26] and even ...
arXiv:1409.1976v1
fatcat:r3bt6ir2uvfvhaxlwgli4s5sbe
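For context, the standard elastic net objective that the paper maps to an SVM instance (the reduction itself, and the 11-line MATLAB wrapper, are not reproduced here):

% Elastic net regression objective in its usual form
\min_{w}\; \tfrac{1}{2}\lVert y - Xw \rVert_2^2
        + \lambda_1 \lVert w \rVert_1
        + \tfrac{\lambda_2}{2} \lVert w \rVert_2^2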
Unsupervised Feature Learning for Visual Sign Language Identification
2014
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
In this paper, we focus on the visual modality and present a method for identifying sign languages solely from short video samples. ...
Given that sign languages are underresourced, unsupervised feature learning techniques are the right tools and our results indicate that this is realistic for sign language identification. ...
Learning: Learn a linear classifier to predict the labels given the feature vectors. We use a logistic regression classifier and support vector machines (Pedregosa et al., 2011). ...
doi:10.3115/v1/p14-2061
dblp:conf/acl/GebreCWDH14
fatcat:37rqfzfgfzchlm4wbdd6z7s7ba
A family of large margin linear classifiers and its application in dynamic environments
2009
Statistical analysis and data mining
Real-time problems, in which the learning must be fast and the importance of the features might be changing, pose a challenge to machine learning algorithms. ...
We solve the problems by combining regularization mechanisms with online large margin learning algorithms. ...
They have limited requirements for CPU time and memory. Their efficiency has made them popular for large-scale learning problems, such as natural language processing. ...
doi:10.1002/sam.10055
fatcat:nrpq2takebdylpukefpz4xwo3q
A Family of Large Margin Linear Classifiers and Its Application in Dynamic Environments
[chapter]
2009
Proceedings of the 2009 SIAM International Conference on Data Mining
Real-time problems, in which the learning must be fast and the importance of the features might be changing, pose a challenge to machine learning algorithms. ...
We solve the problems by combining regularization mechanisms with online large margin learning algorithms. ...
They have limited requirements for CPU time and memory. Their efficiency has made them popular for large-scale learning problems, such as natural language processing. ...
doi:10.1137/1.9781611972795.15
dblp:conf/sdm/ShenD09
fatcat:3x2lzzqmyjckrhwon6sdadlt3m
Deep Learning on Big, Sparse, Behavioral Data
2019
Big Data
... has been found to be the most accurate machine learning technique generally for sparse behavioral data. ...
The outstanding performance of deep learning (DL) for computer vision and natural language processing has fueled increased interest in applying these algorithms more broadly in both research and practice ...
They compare variations of support vector machines, naive Bayes, LR, and a relational classifier called pseudosocial network (PSN). ...
doi:10.1089/big.2019.0095
pmid:31860341
fatcat:p5pe36v5srgerdh7rwmfi7oq6m
Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach
[article]
2021
arXiv
pre-print
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, characterized by a lack of inherent ordering of features (variables). ...
The brute force approach of learning a parameter for each interaction of every order comes at an exponential computational and memory cost (Curse of Dimensionality). ...
A.H. is supported by an Imperial College London President's Scholarship. K.K. is supported by an EPSRC International Doctoral Scholarship. ...
arXiv:2001.10109v3
fatcat:yclrwbdtbjh4tpibimlicnvvz4
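For orientation, the canonical polyadic (CP) decomposition in its generic form; the paper's exact interaction model may differ, but the idea is that an order-d interaction tensor over N features is replaced by R rank-1 terms, cutting the parameter count from N^d to R·d·N:

% Generic CP decomposition of an order-d tensor into R rank-1 terms
\mathcal{W} \;\approx\; \sum_{r=1}^{R}
  \mathbf{w}_r^{(1)} \circ \mathbf{w}_r^{(2)} \circ \cdots \circ \mathbf{w}_r^{(d)}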
Sparse regularization techniques provide novel insights into outcome integration processes
2015
NeuroImage
... L2-regularized (dense) Support Vector Machine on this whole-brain between-subject classification task. ...
While the beneficial effect of differential outcomes is well-studied in trial-and-error learning, outcome integration in the context of instruction-based learning has remained largely unexplored. ...
Acknowledgment This work was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) SFB940 sub-project Z2. ...
doi:10.1016/j.neuroimage.2014.10.025
pmid:25467302
fatcat:o6c7n3zrazg2hfqherjdtki4iu
Showing results 1 — 15 out of 3,211 results