### Newton Trees [chapter]

2010 *Lecture Notes in Computer Science*

This paper presents Newton trees, a redefinition of probability estimation trees (PETs) based on a stochastic understanding of decision trees that follows the principle of attraction (relating mass and distance through the Inverse Square Law). The structure, application and graphical representation of Newton trees provide a way to make their stochastically driven predictions compatible with users' intelligibility, thus preserving one of the most desirable features of decision trees, comprehensibility. Unlike almost all existing decision tree learning methods, which use different kinds of partitions depending on the attribute datatype, the construction of prototypes and the derivation of probabilities from distances are identical for every datatype (nominal and numerical, but also structured). We present a way of graphically representing the original stochastic probability estimation trees using a user-friendly gravitation simile. We include experiments showing that Newton trees outperform other PETs in probability estimation and accuracy.

doi:10.1007/978-3-642-17432-2_18
fatcat:nw5tg6tvtjbzbexto3vj7utbqu
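The attraction principle the abstract describes can be sketched in a few lines: each class prototype in a node pulls a test instance with a force proportional to its mass (e.g., training examples per class) and inversely proportional to squared distance, and the normalised forces act as class probabilities. This is a minimal illustration under that reading, not the paper's exact formulation; the function name and the `eps` smoothing term are my own.

```python
def attraction_probs(masses, distances, eps=1e-9):
    """Class probabilities from an Inverse-Square-Law attraction:
    each class prototype pulls with force mass / distance**2."""
    forces = [m / (d * d + eps) for m, d in zip(masses, distances)]
    total = sum(forces)
    return [f / total for f in forces]

# An instance equidistant from two prototypes is attracted in
# proportion to their masses (here 30 vs 10 training examples).
probs = attraction_probs(masses=[30, 10], distances=[1.0, 1.0])
```

Note that this scheme needs no attribute-specific partitioning: any datatype with a distance function (nominal, numerical or structured) yields probabilities the same way.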
### Using negotiable features for prescription problems

2010 *Computing*

Data mining is usually concerned with the construction of accurate models from data, which are then applied to well-defined problems that can be clearly isolated and formulated independently from other problems. Although much computational effort is devoted to their training and statistical evaluation, model deployment can also represent a scientific problem when several data mining models have to be used together, constraints appear on their application, or they have to be included in decision processes based on different rules, equations and constraints. In this paper we address the problem of combining several data mining models for objects and individuals in a common scenario, where not only can we affect decisions as the result of a change in one or more data mining models, but we also have to solve several optimisation problems, such as choosing one or more inputs to get the best overall result, or readjusting probabilities after a failure. We illustrate the point in the area of Customer Relationship Management (CRM), where we deal with the general problem of prescription between products and customers. We introduce the concept of negotiable feature, which leads to an extended taxonomy of CRM problems of greater complexity, since each new negotiable feature implies a new degree of freedom. In this context, we introduce several new problems and techniques, such as data mining model inversion (by ranging over the inputs or by changing classification problems into regression problems by function inversion), expected profit estimation and curves, global optimisation through a Monte Carlo method, and several negotiation strategies in order to solve this maximisation problem.

doi:10.1007/s00607-010-0129-5
fatcat:f25lrbqfinhhbhtn73zhzi4qfq
### Bagging Decision Multi-trees [chapter]

2004 *Lecture Notes in Computer Science*

Ensemble methods improve accuracy by combining the predictions of a set of different hypotheses. A well-known method for generating hypothesis ensembles is Bagging. One of the main drawbacks of ensemble methods in general, and Bagging in particular, is the huge amount of computational resources required to learn, store, and apply the set of models. Another problem is that, even using the bootstrap technique, many simple models are similar, limiting the ensemble diversity. In this work, we investigate an optimisation technique based on sharing the common parts of the models of an ensemble formed by decision trees, in order to mitigate both problems. Concretely, we employ a structure called a decision multi-tree, which can simultaneously contain a set of decision trees and hence consider the "repeated" parts just once. A thorough experimental evaluation is included to show that the proposed optimisation technique pays off in practice.

doi:10.1007/978-3-540-25966-4_4
fatcat:6shcv2wvlbfvhb3xtwdyxrp3ee
### Forgetting and consolidation for incremental and cumulative knowledge acquisition systems [article]

2015 *arXiv* pre-print

The absence of forgetting was masterfully described by Jorge Luis Borges in his tale "Funes the Memorious" (1942): "To think is to forget a difference, to generalise, to abstract. ..."

arXiv:1502.05615v1
fatcat:tx65kjibczbbhbvv6pfzhifnwq
### Shared Ensemble Learning Using Multi-trees [chapter]

2002 *Lecture Notes in Computer Science*

Decision tree learning is a machine learning technique that allows us to generate accurate and comprehensible models. Accuracy can be improved by ensemble methods, which combine the predictions of a set of different trees. However, a large amount of resources is necessary to generate the ensemble. In this paper, we introduce a new ensemble method that minimises the usage of resources by sharing the common parts of the components of the ensemble. For this purpose, we learn a decision multi-tree instead of a decision tree. We call this new approach shared ensembles. The use of a multi-tree produces an exponential number of hypotheses to be combined, which provides better results than boosting/bagging. We performed several experiments showing that the technique allows us to obtain accurate models and improves the use of resources with respect to classical ensemble methods.

doi:10.1007/3-540-36131-6_21
fatcat:uk4c5v2bsbf2dcu65lxjtvnwt4
### Aggregative quantification for regression

2013 *Data Mining and Knowledge Discovery*

The problem of estimating the class distribution (or prevalence) for a new unlabelled dataset (from a possibly different distribution) is a very common problem which has been addressed in one way or another over the past decades. This problem has recently been reconsidered as a new task in data mining, renamed quantification, when the estimation is performed as an aggregation (and possible adjustment) of a single-instance supervised model (e.g., a classifier). However, the study of quantification has been limited to classification, while it is clear that this problem also appears, perhaps even more frequently, with other predictive problems, such as regression. In this case, the goal is to determine a distribution or an aggregated indicator of the output variable for a new unlabelled dataset. In this paper, we introduce a comprehensive new taxonomy of quantification tasks, distinguishing between the estimation of the whole distribution and the estimation of some indicators (summary statistics), for both classification and regression. This distinction is especially useful for regression, since predictions are numerical values that can be aggregated in many different ways, as in multi-dimensional hierarchical data warehouses. We focus on aggregative quantification for regression and see that the approaches borrowed from classification do not work. We present several techniques based on segmentation which are able to produce accurate estimations of the expected value and the distribution of the output variable. We show experimentally that these methods especially excel in the relevant scenarios where training and test distributions dramatically differ.

doi:10.1007/s10618-013-0308-z
fatcat:fuutiyazfzb77lgqedydrtezzm
### Quantification via Probability Estimators

2010 *IEEE International Conference on Data Mining*

Quantification is the name given to a novel machine learning task which deals with correctly estimating the number of elements of one class in a set of examples. The output of a quantifier is a real value; since the training instances are the same as in a classification problem, a natural approach is to train a classifier and derive a quantifier from it. Some previous works have shown that just classifying the instances and counting the examples belonging to the class of interest (classify & count) typically yields bad quantifiers, especially when the class distribution may vary between training and test. Hence, adjusted versions of classify & count have been developed by using modified thresholds. However, previous works have explicitly discarded (without a deep analysis) any possible approach based on the probability estimations of the classifier. In this paper, we present a method based on averaging the probability estimations of a classifier, with a very simple scaling, that does perform reasonably well, showing that probability estimators for quantification capture a richer view of the problem than methods based on a threshold.

doi:10.1109/icdm.2010.75
dblp:conf/icdm/BellaFHR10
fatcat:dy7iprpesnaobog6sgs5xmpxqq
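The two baseline prevalence estimators contrasted in the abstract are easy to state side by side: classify & count thresholds each score and counts positives, while probability averaging sums the posteriors directly. A minimal sketch (the paper's additional scaling step is omitted here, and the function names are my own):

```python
def classify_and_count(probs, threshold=0.5):
    """Estimate positive prevalence by thresholding each score and counting."""
    return sum(p >= threshold for p in probs) / len(probs)

def probability_average(probs):
    """Estimate prevalence by averaging the posterior estimates directly."""
    return sum(probs) / len(probs)

scores = [0.9, 0.6, 0.55, 0.2]      # classifier posteriors on a test set
cc = classify_and_count(scores)     # 3 of 4 scores cross the threshold
pa = probability_average(scores)    # uses the full scores, not just the sign
```

The averaging estimator keeps the information that the 0.55 instance was barely positive, which is exactly the "richer view" the abstract argues a threshold throws away.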
### Similarity-Binning Averaging: A Generalisation of Binning Calibration [chapter]

2009 *Lecture Notes in Computer Science*

In this paper we revisit the problem of classifier calibration, motivated by the issue that existing calibration methods ignore the problem attributes (i.e., they are univariate): they only use the estimated probability as input and ignore other important information, such as the original attributes of the problem. We propose a new calibration method, inspired by binning-based methods, in which the calibrated probabilities are obtained from k instances of a dataset. Bins are built by including the k most similar instances, considering not only estimated probabilities but also the original attributes. This method has been experimentally evaluated with respect to two calibration measures, including a comparison with other traditional calibration methods. The results show that the new method outperforms the most commonly used calibration methods.

doi:10.1007/978-3-642-04394-9_42
fatcat:kwdrt4kzgzfcthli63di4g5wea
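The binning idea described above can be sketched directly: augment each instance with its estimated probability, take the k nearest neighbours in that augmented space as the "bin", and average their labels. This is a toy reading of the method, assuming Euclidean distance and a labelled calibration set; names and data are illustrative, not the paper's.

```python
import math

def sba_calibrate(instance, est_prob, calib_set, k=3):
    """Similarity-Binning Averaging, sketched: the bin for an instance is
    its k nearest neighbours in the augmented space (attributes + estimated
    probability); the calibrated probability is the mean label in that bin."""
    query = instance + [est_prob]
    def dist(row):
        attrs, p, _label = row
        return math.dist(query, attrs + [p])
    neighbours = sorted(calib_set, key=dist)[:k]
    return sum(label for _a, _p, label in neighbours) / k

# toy calibration set: (attributes, estimated probability, true label)
data = [([0.0], 0.9, 1), ([0.1], 0.8, 1), ([0.2], 0.7, 0), ([2.0], 0.2, 0)]
cal = sba_calibrate([0.05], 0.85, data, k=3)
```

Because the original attributes enter the distance, two instances with the same raw score can land in different bins and get different calibrated probabilities, which is what makes the method multivariate.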
### On the effect of calibration in classifier combination

2012 *Applied Intelligence (Boston)*

*Ramírez-Quintana*, DSIC-ELP, Universitat Politècnica de València, Camí de Vera s/n, 46022 Valencia, Spain
### Data Mining Strategies for CRM Negotiation Prescription Problems [chapter]

2010 *Lecture Notes in Computer Science*

In some data mining problems, there are input features that can be freely modified at prediction time. Examples happen in retailing, prescription or control (prices, warranties, medicine doses, delivery times, temperatures, etc.). If a traditional model is learned, many possible values for the special attribute have to be tried to attain the maximum profit. In this paper, we exploit the relationship between these modifiable (or negotiable) input features and the output to (1) change the problem presentation, possibly turning a classification problem into a regression problem, and (2) maximise profits and derive negotiation strategies. We illustrate our proposal with a paradigmatic Customer Relationship Management (CRM) problem: maximising the profit of a retailing operation where the price is the negotiable input feature. Different negotiation strategies have been experimentally tested to estimate optimal prices, showing that strategies based on negotiable features obtain higher profits.

doi:10.1007/978-3-642-13022-9_52
fatcat:yfhpxlswsvg4nncr335npj2mfe
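The core optimisation over a negotiable feature can be illustrated in a few lines: given a model of acceptance probability as a function of price, expected profit is acceptance times margin, and the prescription is the price maximising it. The acceptance curve and all names below are hypothetical, chosen only to make the mechanics concrete:

```python
def expected_profit(price, cost, acceptance_model):
    """Expected profit of offering `price`: P(accept | price) * margin."""
    return acceptance_model(price) * (price - cost)

def best_price(cost, acceptance_model, candidates):
    """Grid-search the negotiable feature (price) for maximum expected profit."""
    return max(candidates,
               key=lambda p: expected_profit(p, cost, acceptance_model))

# hypothetical acceptance curve: willingness to buy falls linearly with price
accept = lambda price: max(0.0, 1.0 - price / 100.0)
star = best_price(cost=20.0, acceptance_model=accept,
                  candidates=[float(p) for p in range(20, 101, 5)])
```

With this linear curve the optimum sits midway between cost and the price where acceptance vanishes; a learned classifier turned into a price-to-probability function would slot in for `accept` unchanged.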
### CASP-DM: Context Aware Standard Process for Data Mining [article]

2017 *arXiv* pre-print

We propose an extension of the Cross Industry Standard Process for Data Mining (CRISP-DM) which addresses the specific challenges of machine learning and data mining for context and model reuse handling. This new general context-aware process model is mapped to the CRISP-DM reference model, proposing some new or enhanced outputs.

arXiv:1709.09003v1
fatcat:giwxqiy7rbc63bdftzghqnfe2i
### Learning with Configurable Operators and RL-Based Heuristics [chapter]

2013 *Lecture Notes in Computer Science*

In this paper, we push forward the idea of machine learning systems whose operators can be modified and fine-tuned for each problem. This allows us to propose a learning paradigm where users can write (or adapt) their operators according to the problem, the data representation and the way the information should be navigated. To achieve this goal, data instances, background knowledge, rules, programs and operators are all written in the same functional language, Erlang. Since changing operators affects how the search space needs to be explored, heuristics are learnt as the result of a decision process based on reinforcement learning, where each action is defined as a choice of operator and rule. As a result, the architecture can be seen as a 'system for writing machine learning systems' or as a way to explore new operators.

doi:10.1007/978-3-642-37382-4_1
fatcat:7oqe67suyjegxaevdiucnqx7du
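The heuristic-learning loop the abstract describes — each action a choice of operator and rule, with values kept in a Q table — boils down to the standard tabular Q-learning update. A generic sketch (the states, actions and parameters are illustrative; the actual system abstracts states and actions further):

```python
def q_update(Q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: the 'action' taken in a search state
    is the choice of an operator to apply to a rule."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

Q = {}
actions = ["generalise", "specialise"]          # hypothetical operator choices
q_update(Q, state="s0", action="generalise",
         reward=1.0, next_state="s1", actions=actions)
```

Over many episodes the table comes to rank operator choices per search state, which is what turns the RL process into a learned exploration heuristic.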
### On the definition of a general learning system with user-defined operators [article]

2013 *arXiv* pre-print

In this paper, we push forward the idea of machine learning systems whose operators can be modified and fine-tuned for each problem. This allows us to propose a learning paradigm where users can write (or adapt) their operators according to the problem, the data representation and the way the information should be navigated. To achieve this goal, data instances, background knowledge, rules, programs and operators are all written in the same functional language, Erlang. Since changing operators affects how the search space needs to be explored, heuristics are learnt as the result of a decision process based on reinforcement learning, where each action is defined as a choice of operator and rule. As a result, the architecture can be seen as a 'system for writing machine learning systems' or as a way to explore new operators where policy reuse (as a kind of transfer learning) is allowed. States and actions are represented in a Q matrix, which is actually a table from which a supervised model is learnt. This makes it possible to have a more flexible mapping between old and new problems, since we work with an abstraction of rules and actions. We include some examples of reuse and the application of the system gErl to IQ problems. In order to evaluate gErl, we test it against some structured problems: a selection of IQ test tasks and some structured prediction problems (list patterns).

arXiv:1311.4235v1
fatcat:7pqwyo7vpzhmtimdumfcfowayu
### Probabilistic class hierarchies for multiclass classification

2018 *Journal of Computational Science*

The improvement of classifier performance has been the focus of many researchers over the last few decades. Obtaining accurate predictions becomes more complicated as the number of classes increases. Most families of classification techniques generate models that define decision boundaries trying to separate the classes as well as possible. As an alternative, in this paper, we propose to hierarchically decompose the original multiclass problem by reducing the number of classes involved in each local subproblem. This is done by deriving a similarity matrix from the misclassification errors of a first classifier that is learned for this purpose, and then using the similarity matrix to build a tree-like hierarchy of specialised classifiers. We then present two approaches to solve the multiclass problem: the first traverses the tree of classifiers in a top-down manner, similar to the way some hierarchical classification methods deal with hierarchical domains; the second is inspired by the way probabilistic decision trees compute class membership probabilities. To improve the efficiency of our methods, we propose a criterion to reduce the size of the hierarchy. We experimentally evaluate all of the proposals on a collection of multiclass datasets, showing that, in general, the generated classifier hierarchies outperform the original (flat) multiclass classification.

doi:10.1016/j.jocs.2018.01.006
fatcat:k5zhijhhlbgo7kq7qm7ivfznpi
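The first step of the decomposition above — turning a confusion matrix into a class-similarity matrix and picking the classes to group — can be sketched simply. The symmetrisation used here (summing the two off-diagonal counts) is one plausible choice, not necessarily the paper's exact derivation:

```python
def class_similarity(confusion):
    """Derive a symmetric class-similarity matrix from misclassification
    counts: classes a first classifier confuses often are deemed similar."""
    n = len(confusion)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i][j] = confusion[i][j] + confusion[j][i]
    return sim

def most_confusable_pair(sim):
    """The first pair of classes to merge when grouping bottom-up."""
    n = len(sim)
    return max(((i, j) for i in range(n) for j in range(i + 1, n)),
               key=lambda ij: sim[ij[0]][ij[1]])

conf = [[50, 8, 1],   # rows: true class, columns: predicted class
        [9, 48, 2],
        [0, 3, 55]]
pair = most_confusable_pair(class_similarity(conf))
```

Repeatedly merging the most confusable pair yields the tree-like hierarchy, with a specialised classifier trained at each internal node on its reduced set of (super)classes.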
### Identifying the Machine Learning Family from Black-Box Models [chapter]

2018 *Lecture Notes in Computer Science*

We address the novel question of determining which kind of machine learning model is behind the predictions when we interact with a black-box model. This may allow us to identify families of techniques whose models exhibit similar vulnerabilities and strengths. In our method, we first consider how an adversary can systematically query a given black-box model (oracle) to label an artificially generated dataset. This labelled dataset is then used for training different surrogate models (each one trying to imitate the oracle's behaviour). The method has two different approaches. First, we assume that the family of the surrogate model that achieves the maximum Kappa metric against the oracle labels corresponds to the family of the oracle model. The other approach, based on machine learning, consists in learning a meta-model that is able to predict the model family of a new black-box model. We compare these two approaches experimentally, giving us insight into how explanatory and predictable our concept of family is.

doi:10.1007/978-3-030-00374-6_6
fatcat:c2abzp4eojcvxl3w224eky4xfe
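The first approach above reduces to a simple decision rule: score each surrogate's labels against the oracle's with Cohen's Kappa and output the family of the best imitator. A minimal binary-label sketch, with made-up surrogate predictions standing in for trained models:

```python
def cohen_kappa(y1, y2):
    """Chance-corrected agreement between two binary label sequences."""
    n = len(y1)
    po = sum(a == b for a, b in zip(y1, y2)) / n          # observed agreement
    pe = (sum(y1) / n) * (sum(y2) / n) \
         + (1 - sum(y1) / n) * (1 - sum(y2) / n)          # chance agreement
    return (po - pe) / (1 - pe) if pe < 1 else 1.0

def guess_family(oracle_labels, surrogate_preds):
    """Pick the family whose surrogate best imitates the oracle (max Kappa)."""
    return max(surrogate_preds,
               key=lambda fam: cohen_kappa(oracle_labels, surrogate_preds[fam]))

oracle = [1, 1, 0, 0, 1, 0]                 # oracle labels on queried points
preds = {"tree":   [1, 1, 0, 0, 1, 0],      # hypothetical surrogate outputs
         "linear": [1, 0, 0, 1, 1, 0]}
family = guess_family(oracle, preds)
```

Kappa rather than raw accuracy is used so that a surrogate cannot score well merely by matching the oracle's label distribution by chance.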
*Showing results 1 — 15 out of 2,270 results*