
Newton Trees [chapter]

Fernando Martínez-Plumed, Vicent Estruch, César Ferri, José Hernández-Orallo, María José Ramírez-Quintana
2010 Lecture Notes in Computer Science  
This paper presents Newton trees, a redefinition of probability estimation trees (PETs) based on a stochastic understanding of decision trees that follows the principle of attraction (relating mass and distance through the Inverse Square Law). The structure, application and graphical representation of Newton trees provide a way to make their stochastically driven predictions compatible with the user's intelligibility, thus preserving one of the most desirable features of decision trees, comprehensibility. Unlike almost all existing decision tree learning methods, which use different kinds of partitions depending on the attribute datatype, the construction of prototypes and the derivation of probabilities from distances are identical for every datatype (nominal and numerical, but also structured). We present a way of graphically representing the original stochastic probability estimation trees using a user-friendly gravitation simile. We include experiments showing that Newton trees outperform other PETs in probability estimation and accuracy.
doi:10.1007/978-3-642-17432-2_18 fatcat:nw5tg6tvtjbzbexto3vj7utbqu
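
To make the attraction principle concrete, here is a minimal sketch (a hypothetical reading of the abstract, not the authors' implementation; the prototypes, masses and the `branch_probabilities` helper are invented for illustration) in which an instance is routed down each branch with probability proportional to mass divided by squared distance to the branch prototype:

```python
import numpy as np

def branch_probabilities(x, prototypes, masses, eps=1e-9):
    """Route an instance stochastically: attraction = mass / squared distance
    to each branch prototype (Inverse Square Law), normalised to sum to 1."""
    d2 = np.array([np.sum((x - p) ** 2) for p in prototypes]) + eps
    attraction = np.array(masses, dtype=float) / d2
    return attraction / attraction.sum()

# Toy usage: one numeric attribute, two branches with prototypes at 0 and 5.
probs = branch_probabilities(np.array([1.0]),
                             prototypes=[np.array([0.0]), np.array([5.0])],
                             masses=[30, 70])
print(probs)  # closer to the first prototype, but the second branch is heavier
```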

Using negotiable features for prescription problems

Antonio Bella, Cèsar Ferri, José Hernández-Orallo, María José Ramírez-Quintana
2010 Computing  
Data mining is usually concerned with the construction of accurate models from data, which are usually applied to well-defined problems that can be clearly isolated and formulated independently from other problems. Although much computational effort is devoted to their training and statistical evaluation, model deployment can also represent a scientific problem when several data mining models have to be used together, constraints appear on their application, or they have to be included in decision processes based on different rules, equations and constraints. In this paper we address the problem of combining several data mining models for objects and individuals in a common scenario, where not only can we affect decisions as the result of a change in one or more data mining models, but we have to solve several optimisation problems, such as choosing one or more inputs to get the best overall result, or readjusting probabilities after a failure. We illustrate the point in the area of Customer Relationship Management (CRM), where we deal with the general problem of prescription between products and customers. We introduce the concept of a negotiable feature, which leads to an extended taxonomy of CRM problems of greater complexity, since each new negotiable feature implies a new degree of freedom. In this context, we introduce several new problems and techniques, such as data mining model inversion (by ranging over the inputs or by changing classification problems into regression problems by function inversion), expected profit estimation and curves, global optimisation through a Monte Carlo method, and several negotiation strategies in order to solve this maximisation problem.
doi:10.1007/s00607-010-0129-5 fatcat:f25lrbqfinhhbhtn73zhzi4qfq
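
As an illustration of the global optimisation step, the sketch below uses a simple Monte Carlo search over the negotiable feature; the `p_buy` stand-in for the learned model, the cost and the price range are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_buy(price):
    """Hypothetical stand-in for a learned model P(buy | customer, price):
    probability of purchase decreases smoothly with price."""
    return 1.0 / (1.0 + np.exp(0.08 * (price - 100.0)))

# Monte Carlo search over the negotiable feature (price): sample candidate
# prices, estimate expected profit = P(buy | price) * (price - cost), keep best.
cost = 60.0
candidates = rng.uniform(cost, 200.0, size=10_000)
expected_profit = p_buy(candidates) * (candidates - cost)
best = candidates[np.argmax(expected_profit)]
print(f"best price ~ {best:.2f}, expected profit ~ {expected_profit.max():.2f}")
```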

Bagging Decision Multi-trees [chapter]

Vicent Estruch, César Ferri, José Hernández-Orallo, Maria José Ramírez-Quintana
2004 Lecture Notes in Computer Science  
Ensemble methods improve accuracy by combining the predictions of a set of different hypotheses. A well-known method for generating hypothesis ensembles is Bagging. One of the main drawbacks of ensemble methods in general, and Bagging in particular, is the huge amount of computational resources required to learn, store, and apply the set of models. Another problem is that, even using the bootstrap technique, many simple models are similar, thus limiting ensemble diversity. In this work, we investigate an optimisation technique based on sharing the common parts of the models from an ensemble formed by decision trees in order to mitigate both problems. Concretely, we employ a structure called a decision multi-tree, which can simultaneously contain a set of decision trees and hence consider the "repeated" parts just once. A thorough experimental evaluation is included to show that the proposed optimisation technique pays off in practice.
doi:10.1007/978-3-540-25966-4_4 fatcat:6shcv2wvlbfvhb3xtwdyxrp3ee
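
A minimal sketch of the shared structure, assuming an invented `MultiNode` representation (the paper's actual data structure may differ): each node stores alternative splits, so all trees that agree on the path so far share every node above the point where they diverge:

```python
from dataclasses import dataclass, field
from math import prod

@dataclass
class MultiNode:
    """A decision multi-tree node: instead of committing to one split, a node
    keeps alternative splits, each with its own children; trees that agree on
    the path so far share this node instead of storing it repeatedly."""
    alternatives: list = field(default_factory=list)  # [(split, [children])]

    def n_trees(self):
        """Number of distinct ordinary decision trees encoded below this node."""
        if not self.alternatives:            # a leaf encodes exactly one tree
            return 1
        return sum(prod(child.n_trees() for child in children)
                   for _split, children in self.alternatives)

# Two alternative root splits encode two trees for roughly the storage cost of
# one extra split, rather than two fully duplicated trees.
root = MultiNode(alternatives=[("x1 < 3", [MultiNode(), MultiNode()]),
                               ("x2 < 7", [MultiNode(), MultiNode()])])
print(root.n_trees())  # -> 2
```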

Forgetting and consolidation for incremental and cumulative knowledge acquisition systems [article]

Fernando Martínez-Plumed, Cèsar Ferri, José Hernández-Orallo, María José Ramírez-Quintana
2015 arXiv   pre-print
The absence of forgetting was masterfully described by Jorge Luis Borges in his tale "Funes, the Memorious" (1942): "To think is to forget a difference, to generalise, to abstract.  ... 
arXiv:1502.05615v1 fatcat:tx65kjibczbbhbvv6pfzhifnwq

Shared Ensemble Learning Using Multi-trees [chapter]

Victor Estruch, Cesar Ferri, Jose Hernández-Orallo, Maria Jose Ramírez-Quintana
2002 Lecture Notes in Computer Science  
Decision tree learning is a machine learning technique that allows us to generate accurate and comprehensible models. Accuracy can be improved by ensemble methods, which combine the predictions of a set of different trees. However, a large amount of resources is necessary to generate the ensemble. In this paper, we introduce a new ensemble method that minimises the usage of resources by sharing the common parts of the components of the ensemble. For this purpose, we learn a decision multi-tree instead of a decision tree. We call this new approach shared ensembles. The use of a multi-tree produces an exponential number of hypotheses to be combined, which provides better results than boosting/bagging. We performed several experiments showing that the technique allows us to obtain accurate models and improves the use of resources with respect to classical ensemble methods.
doi:10.1007/3-540-36131-6_21 fatcat:uk4c5v2bsbf2dcu65lxjtvnwt4

Aggregative quantification for regression

Antonio Bella, Cèsar Ferri, José Hernández-Orallo, María José Ramírez-Quintana
2013 Data mining and knowledge discovery  
The problem of estimating the class distribution (or prevalence) for a new unlabelled dataset (from a possibly different distribution) is a very common problem which has been addressed in one way or another over the past decades. This problem has recently been reconsidered as a new task in data mining, renamed quantification when the estimation is performed as an aggregation (and possible adjustment) of a single-instance supervised model (e.g., a classifier). However, the study of quantification has been limited to classification, while it is clear that this problem also appears, perhaps even more frequently, with other predictive problems, such as regression. In this case, the goal is to determine a distribution or an aggregated indicator of the output variable for a new unlabelled dataset. In this paper, we introduce a comprehensive new taxonomy of quantification tasks, distinguishing between the estimation of the whole distribution and the estimation of some indicators (summary statistics), for both classification and regression. This distinction is especially useful for regression, since predictions are numerical values that can be aggregated in many different ways, as in multi-dimensional hierarchical data warehouses. We focus on aggregative quantification for regression and see that the approaches borrowed from classification do not work. We present several techniques based on segmentation which are able to produce accurate estimations of the expected value and the distribution of the output variable. We show experimentally that these methods especially excel in the relevant scenarios where training and test distributions dramatically differ.
doi:10.1007/s10618-013-0308-z fatcat:fuutiyazfzb77lgqedydrtezzm
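
For intuition, the naive aggregative baseline ("regress and aggregate") can be sketched as below; the data, the model choice and the shifted test distribution are all hypothetical, and the paper's segmentation-based techniques refine precisely this kind of estimate when distributions differ:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 3))
y_train = X_train @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=500)
X_test = rng.normal(loc=0.8, size=(200, 3))   # test distribution has shifted

# Naive aggregative baseline: predict each unlabelled instance, then aggregate
# the predictions into an indicator (mean) or a distribution (percentiles).
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
preds = model.predict(X_test)
print("estimated mean of y on the test set:", preds.mean())
print("estimated 10/50/90 percentiles:", np.percentile(preds, [10, 50, 90]))
```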

Quantification via Probability Estimators

Antonio Bella, Cesar Ferri, Jose Hernandez-Orallo, Maria Jose Ramirez-Quintana
2010 2010 IEEE International Conference on Data Mining  
Quantification is the name given to a novel machine learning task which deals with correctly estimating the number of elements of one class in a set of examples. The output of a quantifier is a real value; since the training instances are the same as in a classification problem, a natural approach is to train a classifier and derive a quantifier from it. Some previous works have shown that just classifying the instances and counting the examples belonging to the class of interest (classify & count) typically yields bad quantifiers, especially when the class distribution may vary between training and test. Hence, adjusted versions of classify & count have been developed by using modified thresholds. However, previous works have explicitly discarded (without a deep analysis) any possible approach based on the probability estimations of the classifier. In this paper, we present a method based on averaging the probability estimations of a classifier with a very simple scaling that does perform reasonably well, showing that probability estimators for quantification capture a richer view of the problem than methods based on a threshold.
doi:10.1109/icdm.2010.75 dblp:conf/icdm/BellaFHR10 fatcat:dy7iprpesnaobog6sgs5xmpxqq
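
One plausible reading of "averaging with a very simple scaling" is sketched below; the `scaled_probability_average` helper and its calibration against average scores on known positives and negatives are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def scaled_probability_average(test_probs, pos_probs_val, neg_probs_val):
    """Average the classifier's probability estimates on the test set, then
    rescale linearly between the average scores the classifier assigns to
    known negatives and known positives on a validation set."""
    pa = np.mean(test_probs)                      # raw probability average
    tp_pa = np.mean(pos_probs_val)                # avg score on positives
    fp_pa = np.mean(neg_probs_val)                # avg score on negatives
    return float(np.clip((pa - fp_pa) / (tp_pa - fp_pa), 0.0, 1.0))

# Toy usage with hypothetical scores: estimated positive prevalence ~ 0.64.
print(scaled_probability_average(
    test_probs=[0.9, 0.7, 0.4, 0.2, 0.8],
    pos_probs_val=[0.85, 0.9, 0.8],
    neg_probs_val=[0.1, 0.2, 0.15]))
```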

Similarity-Binning Averaging: A Generalisation of Binning Calibration [chapter]

Antonio Bella, Cèsar Ferri, José Hernández-Orallo, Marïa José Ramírez-Quintana
2009 Lecture Notes in Computer Science  
In this paper we revisit the problem of classifier calibration, motivated by the issue that existing calibration methods ignore the problem attributes (i.e., they are univariate). These methods only use the estimated probability as input and ignore other important information, such as the original attributes of the problem. We propose a new calibration method, inspired by binning-based methods, in which the calibrated probabilities are obtained from k instances of a dataset. Bins are constructed by including the k most similar instances, considering not only estimated probabilities but also the original attributes. This method has been experimentally evaluated with respect to two calibration measures, including a comparison with other traditional calibration methods. The results show that the new method outperforms the most commonly used calibration methods.
doi:10.1007/978-3-642-04394-9_42 fatcat:kwdrt4kzgzfcthli63di4g5wea
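
A minimal sketch of this idea, assuming scikit-learn's `NearestNeighbors` and an invented `sba_calibrate` helper (the paper's exact procedure may differ): similarity is computed over the original attributes extended with the estimated probability, and the calibrated probability is the positive rate within the k-neighbour bin:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def sba_calibrate(X_cal, p_cal, y_cal, X_new, p_new, k=10):
    """Bin = the k most similar calibration instances, where similarity uses
    the original attributes extended with the estimated probability; the
    calibrated probability is the average label within the bin."""
    Z_cal = np.column_stack([X_cal, p_cal])
    Z_new = np.column_stack([X_new, p_new])
    nn = NearestNeighbors(n_neighbors=k).fit(Z_cal)
    _, idx = nn.kneighbors(Z_new)
    return y_cal[idx].mean(axis=1)

# Toy usage with a hypothetical calibration set.
rng = np.random.default_rng(2)
X_cal = rng.normal(size=(200, 2))
p_cal = rng.uniform(size=200)
y_cal = (p_cal + 0.1 * rng.normal(size=200) > 0.5).astype(float)
print(sba_calibrate(X_cal, p_cal, y_cal, X_cal[:3], p_cal[:3]))
```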

On the effect of calibration in classifier combination

Antonio Bella, Cèsar Ferri, José Hernández-Orallo, María José Ramírez-Quintana
2012 Applied intelligence (Boston)  
doi:10.1007/s10489-012-0388-2 fatcat:nidj62jkzjaxrn3eumsnl6ygli

Data Mining Strategies for CRM Negotiation Prescription Problems [chapter]

Antonio Bella, Cèsar Ferri, José Hernández-Orallo, María José Ramírez-Quintana
2010 Lecture Notes in Computer Science  
In some data mining problems, there are some input features that can be freely modified at prediction time. Examples occur in retailing, prescription or control (prices, warranties, medicine doses, delivery times, temperatures, etc.). If a traditional model is learned, many possible values for the special attribute will have to be tried to attain the maximum profit. In this paper, we exploit the relationship between these modifiable (or negotiable) input features and the output to (1) change the problem presentation, possibly turning a classification problem into a regression problem, and (2) maximise profits and derive negotiation strategies. We illustrate our proposal with a paradigmatic Customer Relationship Management (CRM) problem: maximising the profit of a retailing operation where the price is the negotiable input feature. Different negotiation strategies have been experimentally tested to estimate optimal prices, showing that strategies based on negotiable features obtain higher profits.
doi:10.1007/978-3-642-13022-9_52 fatcat:yfhpxlswsvg4nncr335npj2mfe

CASP-DM: Context Aware Standard Process for Data Mining [article]

Fernando Martínez-Plumed, Lidia Contreras-Ochando, Cèsar Ferri, Peter Flach, José Hernández-Orallo, Meelis Kull, Nicolas Lachiche, María José Ramírez-Quintana
2017 arXiv   pre-print
We propose an extension of the Cross Industry Standard Process for Data Mining (CRISP-DM) which addresses the specific challenges of machine learning and data mining for handling context and model reuse. This new general context-aware process model is mapped onto the CRISP-DM reference model, proposing some new or enhanced outputs.
arXiv:1709.09003v1 fatcat:giwxqiy7rbc63bdftzghqnfe2i

Learning with Configurable Operators and RL-Based Heuristics [chapter]

Fernando Martínez-Plumed, Cèsar Ferri, José Hernández-Orallo, María José Ramírez-Quintana
2013 Lecture Notes in Computer Science  
In this paper, we push forward the idea of machine learning systems for which the operators can be modified and fine-tuned for each problem. This allows us to propose a learning paradigm where users can write (or adapt) their operators according to the problem, the data representation and the way the information should be navigated. To achieve this goal, data instances, background knowledge, rules, programs and operators are all written in the same functional language, Erlang. Since changing operators affects how the search space needs to be explored, heuristics are learnt as the result of a decision process based on reinforcement learning where each action is defined as a choice of operator and rule. As a result, the architecture can be seen as a 'system for writing machine learning systems' or as a way to explore new operators.
doi:10.1007/978-3-642-37382-4_1 fatcat:7oqe67suyjegxaevdiucnqx7du

On the definition of a general learning system with user-defined operators [article]

Fernando Martínez-Plumed and Cèsar Ferri and José Hernández-Orallo and María-José Ramírez-Quintana
2013 arXiv   pre-print
In this paper, we push forward the idea of machine learning systems whose operators can be modified and fine-tuned for each problem. This allows us to propose a learning paradigm where users can write (or adapt) their operators according to the problem, the data representation and the way the information should be navigated. To achieve this goal, data instances, background knowledge, rules, programs and operators are all written in the same functional language, Erlang. Since changing operators affects how the search space needs to be explored, heuristics are learnt as the result of a decision process based on reinforcement learning where each action is defined as a choice of operator and rule. As a result, the architecture can be seen as a 'system for writing machine learning systems' or as a way to explore new operators where policy reuse (as a kind of transfer learning) is allowed. States and actions are represented in a Q matrix, which is actually a table, from which a supervised model is learnt. This makes it possible to have a more flexible mapping between old and new problems, since we work with an abstraction of rules and actions. We include some examples of policy reuse and of the application of the system gErl to IQ problems. In order to evaluate gErl, we test it on some structured problems: a selection of IQ test tasks and some structured prediction problems (list patterns).
arXiv:1311.4235v1 fatcat:7pqwyo7vpzhmtimdumfcfowayu
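
A minimal sketch of the underlying decision process, with invented states, actions and rewards (gErl itself is written in Erlang; this tabular Q-learning toy only illustrates treating a choice of (operator, rule) as an action):

```python
import random
from collections import defaultdict

# A state abstracts the current rule/program; an action is a choice of
# (operator, rule). Q-values live in a table, from which, as the abstract
# notes, a supervised model could later be learnt for policy reuse.
Q = defaultdict(float)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

def choose_action(state, actions):
    if random.random() < EPSILON:                        # explore
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])     # exploit

def update(state, action, reward, next_state, next_actions):
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Toy usage: two operators applicable to one rule.
actions = [("generalise", "rule1"), ("specialise", "rule1")]
a = choose_action("s0", actions)
update("s0", a, reward=1.0, next_state="s1", next_actions=actions)
print(a, dict(Q))
```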

Probabilistic class hierarchies for multiclass classification

Daniel Silva-Palacios, Cèsar Ferri, María José Ramírez-Quintana
2018 Journal of Computational Science  
The improvement of classifier performance has been the focus of attention of many researchers over the last few decades. Obtaining accurate predictions becomes more complicated as the number of classes increases. Most families of classification techniques generate models that define decision boundaries trying to separate the classes as well as possible. As an alternative, in this paper we propose to hierarchically decompose the original multiclass problem by reducing the number of classes involved in each local subproblem. This is done by deriving a similarity matrix from the misclassification errors given by a first classifier that is learned for this purpose, and then using the similarity matrix to build a tree-like hierarchy of specialised classifiers. We then present two approaches to solve the multiclass problem: the first traverses the tree of classifiers in a top-down manner, similar to the way some hierarchical classification methods deal with hierarchical domains; the second is inspired by the way probabilistic decision trees compute class membership probabilities. To improve the efficiency of our methods, we propose a criterion to reduce the size of the hierarchy. We experimentally evaluate all of the proposals on a collection of multiclass datasets, showing that, in general, the generated classifier hierarchies outperform the original (flat) multiclass classification.
doi:10.1016/j.jocs.2018.01.006 fatcat:k5zhijhhlbgo7kq7qm7ivfznpi
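
The first step, deriving a hierarchy from misclassification errors, can be sketched as follows; the confusion matrix values are hypothetical, and the symmetrisation and linkage choices are assumptions (the paper may derive similarity differently):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical confusion matrix from a first, flat classifier over 3 classes.
conf = np.array([[50.0,  8.0,  2.0],
                 [10.0, 45.0,  5.0],
                 [ 1.0,  4.0, 55.0]])
sim = (conf + conf.T) / 2.0            # classes confused often are "similar"
np.fill_diagonal(sim, 0.0)
dist = sim.max() - sim                 # more confusion -> smaller distance
iu = np.triu_indices_from(dist, k=1)   # condensed form expected by linkage
hierarchy = linkage(dist[iu], method="average")
print(hierarchy)  # merge order defines the tree of specialised classifiers
```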

Identifying the Machine Learning Family from Black-Box Models [chapter]

Raül Fabra-Boluda, Cèsar Ferri, José Hernández-Orallo, Fernando Martínez-Plumed, María José Ramírez-Quintana
2018 Lecture Notes in Computer Science  
We address the novel question of determining which kind of machine learning model is behind the predictions when we interact with a black-box model. This may allow us to identify families of techniques whose models exhibit similar vulnerabilities and strengths. In our method, we first consider how an adversary can systematically query a given black-box model (oracle) to label an artificially generated dataset. This labelled dataset is then used for training different surrogate models (each one trying to imitate the oracle's behaviour). The method has two different approaches. In the first, we assume that the family of the surrogate model that achieves the maximum Kappa metric against the oracle's labels corresponds to the family of the oracle model. The other approach, based on machine learning, consists in learning a meta-model that is able to predict the model family of a new black-box model. We compare these two approaches experimentally, giving us insight into how explanatory and predictable our concept of family is.
doi:10.1007/978-3-030-00374-6_6 fatcat:c2abzp4eojcvxl3w224eky4xfe
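
A minimal sketch of the first approach, with an invented oracle and a small set of candidate families (the query-generation scheme and the families here are assumptions for illustration):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)

# Invented oracle: in practice this is the black-box model we can only query.
oracle = DecisionTreeClassifier(random_state=0).fit(
    rng.normal(size=(300, 4)), rng.integers(0, 2, size=300))

X_query = rng.normal(size=(1000, 4))        # artificially generated queries
y_oracle = oracle.predict(X_query)          # dataset labelled by the oracle

# Fit one surrogate per candidate family; the family whose surrogate agrees
# most with the oracle (maximum Kappa on held-out queries) is our guess.
X_tr, X_te, y_tr, y_te = train_test_split(X_query, y_oracle, random_state=0)
families = {"tree": DecisionTreeClassifier(random_state=0),
            "linear": LogisticRegression(max_iter=1000),
            "instance-based": KNeighborsClassifier()}
kappas = {name: cohen_kappa_score(y_te, clf.fit(X_tr, y_tr).predict(X_te))
          for name, clf in families.items()}
print(max(kappas, key=kappas.get), kappas)
```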