GO Gradient for Expectation-Based Objectives
[article]
2019
arXiv
pre-print
Within many machine learning algorithms, a fundamental problem concerns efficient calculation of an unbiased gradient with respect to parameters γ for expectation-based objectives E_q_γ(y) [f(y)]. ...
We find that the GO gradient often works well in practice based on only one Monte Carlo sample (although one can of course use more samples if desired). ...
We also wish to thank Chenyang Tao, Liqun Chen, and Chunyuan Li for helpful discussions. ...
arXiv:1901.06020v1
fatcat:sqvz2zbonbabdmx565rgx63f7a
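As a rough illustration of what a Monte Carlo gradient estimate for an expectation-based objective looks like, the sketch below uses the standard reparameterization estimator on a toy problem. It is not the GO gradient from the paper: the Gaussian distribution, the test function f(y) = y^2, and the helper name `reparam_grad_mu` are all assumptions chosen for illustration. For y ~ N(mu, sigma^2), the exact gradient of E[y^2] with respect to mu is 2*mu, which gives a reference value to compare against.

```python
import random


def reparam_grad_mu(mu, sigma, n_samples, rng):
    """Monte Carlo estimate of d/dmu E[y^2] for y ~ N(mu, sigma^2).

    Hypothetical helper for illustration only (standard reparameterization
    estimator, not the GO gradient).
    """
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise that does not depend on mu
        y = mu + sigma * eps        # reparameterized sample, dy/dmu = 1
        total += 2.0 * y            # f'(y) * dy/dmu for the toy f(y) = y^2
    return total / n_samples


rng = random.Random(0)
one_sample = reparam_grad_mu(1.5, 0.5, 1, rng)         # noisy but unbiased
many_samples = reparam_grad_mu(1.5, 0.5, 200_000, rng)  # close to 2 * mu
print(one_sample, many_samples)
```

A single-sample estimate is unbiased but noisy; averaging more samples only reduces the variance around the exact value 2*mu.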
GO Hessian for Expectation-Based Objectives
[article]
2020
arXiv
pre-print
An unbiased low-variance gradient estimator, termed GO gradient, was proposed recently for expectation-based objectives E_q_γ(y) [f(y)], where the random variable (RV) y may be drawn from a stochastic ...
Upgrading the GO gradient, we present for E_q_γ(y) [f(y)] an unbiased low-variance Hessian estimator, named GO Hessian. ...
Acknowledgments and Disclosure of Funding We thank the anonymous reviewers for their constructive comments. The research was supported in part by DARPA, DOE, NIH, NSF and ONR. ...
arXiv:2006.08873v1
fatcat:hrvztiwjynbzbmuztvzu7fcq3q
Toward Robust Material Recognition for Everyday Objects
2011
Proceedings of the British Machine Vision Conference 2011
Let σ o z be the standard deviation of gradient orientation around z (using angles), and σ o z ...
Large-Margin Nearest Neighbor learning is used for a 30-fold dimension reduction. We improve the state-of-the-art accuracy on the Flickr dataset [16] from 45% to 54%. ...
Local binary patterns are then combined with a position kernel and a gradient-based weighting scheme to produce a patch-level shape descriptor that is very effective for shape-based recognition. ...
doi:10.5244/c.25.48
dblp:conf/bmvc/HuBR11
fatcat:wkbntlckfzd3rpjmclrnqt7k5a
Surface Dependent Representations for Illumination Insensitive Image Comparison
2007
IEEE Transactions on Pattern Analysis and Machine Intelligence
Previous work has shown the effectiveness of comparing the image gradient direction for surfaces with material properties that change rapidly in one direction. ...
This suggests that a combination of these strategies should be employed to compare general objects. ...
Therefore, it will yield similar results to those obtained by gradient direction comparison. We will call image comparison based on odd Gabors GO. ...
doi:10.1109/tpami.2007.250602
pmid:17108386
fatcat:f4o3f5qwpbhrtjkbx2iiy7scta
Monte-Carlo simulation balancing
2009
Proceedings of the 26th Annual International Conference on Machine Learning - ICML '09
We develop two algorithms for balancing a simulation policy by gradient descent. ...
We test each algorithm in the domain of 5 × 5 and 6 × 6 Computer Go, using a softmax policy that is parameterised by weights for a hundred simple patterns. ...
The objective is to maximise the expected cumulative reward from start state s. ...
doi:10.1145/1553374.1553495
dblp:conf/icml/SilverT09
fatcat:76z2xpqbgbh53bb65ul76v2b3a
Implicitly Constrained Semi-supervised Linear Discriminant Analysis
2014
2014 22nd International Conference on Pattern Recognition
Based on this objective, it turns out one can efficiently find the optimal classifier in this set of possible classifiers by allowing for soft label assignments to the unlabeled objects. ...
One way to know how well any of these classifiers is going to perform is to estimate its performance using the supervised objective function evaluated on labeled objects alone. ...
doi:10.1109/icpr.2014.646
dblp:conf/icpr/KrytheL14
fatcat:thnyiafh4jao3fggjtxk2ckxpi
Learning Neural Parsers with Deterministic Differentiable Imitation Learning
[article]
2018
arXiv
pre-print
From another perspective, our approach is a variant of the Deterministic Policy Gradient suitable for the imitation learning setting. ...
We explore the problem of learning to decompose spatial tasks into segments, as exemplified by the problem of a painting robot covering a large object. ...
Acknowledgments The authors would like to thank Wen Sun, Anirudh Vemula, and Arjun Sharma for technical discussions, and Marinus Analytics for providing us access to computing resources for our experiments ...
arXiv:1806.07822v2
fatcat:q34dcagzondpzmcbbj32nu2qnm
Faded-Experience Trust Region Policy Optimization for Model-Free Power Allocation in Interference Channel
[article]
2020
arXiv
pre-print
Policy gradient reinforcement learning techniques enable an agent to directly learn an optimal action policy through the interactions with the environment. ...
We apply our method to the trust-region policy optimization (TRPO), primarily developed for locomotion tasks, and propose faded-experience (FE) TRPO. ...
In practice, the above expectation should be estimated over a batch of data collected from the current policy via the Monte Carlo (MC) technique (a sample-based estimate of the policy gradient). ...
arXiv:2008.01705v1
fatcat:rimi5rekfffypjomdufo5hmfxy
Optimum Functionally Gradient Materials for Dental Implant Using Simulated Annealing
[chapter]
2012
Simulated Annealing - Single and Multiple Objective Problems
A multi-objective approach based on simulated annealing and its application to nuclear fuel management, 5th International Conference on Nuclear Engineering, Nice, France, pp. 416-423.
References ...
Acknowledgement The authors would like to acknowledge the Ministry of Higher Education of Malaysia and the University of Malaya, Kuala Lumpur, Malaysia for the financial support under UM.TNC2/IPPP/ ...
Going back to Figs. 4b and 4c for minimizing both objective functions (f2 and f3), we expect to have one optimal solution as the two functions have a similar trend to reach the optimal point. ...
doi:10.5772/45640
fatcat:sevf5vw6jraevir5c6ma4h4sem
Upper bounds for the 0-1 stochastic knapsack problem and a B&B algorithm
2009
Annals of Operations Research
Here, the former is used to approximate the gradient of the objective function that is a function in expectation. ...
Based on this observation, we use in the following a stopping criterion for the stochastic gradient algorithm of 500 iterations. ...
doi:10.1007/s10479-009-0577-5
fatcat:bwh4tthcvbc2bpot4ansfzw64u
Variance Adjusted Actor Critic Algorithms
[article]
2013
arXiv
pre-print
We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return. ...
We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function. ...
For a policy π_θ the expected reward-to-go J_θ : X → R, also known as the value function, is given by J_θ(x) = E_θ[B | x_0 = x], where E_θ denotes an expectation when following policy π_θ. ...
arXiv:1310.3697v1
fatcat:atcu74wwtvgvljjmwfvz224gde
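To make the expected reward-to-go concrete, the following sketch estimates the mean and variance of the episodic return B by Monte Carlo rollouts and combines them into a variance-adjusted score. Everything here is an assumption for illustration: the toy episode (reward 1 per step, stopping with probability `p_stop` after each step), the function names, and the form mean minus `penalty` times variance are not taken from the paper.

```python
import random


def sample_return(p_stop, rng):
    """Return B of one toy episode: reward 1 per step, stop with prob p_stop."""
    total = 0.0
    while True:
        total += 1.0
        if rng.random() < p_stop:
            return total


def mean_and_variance(p_stop, n_episodes, rng):
    """Monte Carlo estimates of E[B] and Var[B] from independent episodes."""
    returns = [sample_return(p_stop, rng) for _ in range(n_episodes)]
    mean = sum(returns) / n_episodes
    var = sum((b - mean) ** 2 for b in returns) / (n_episodes - 1)
    return mean, var


def variance_adjusted(mean, var, penalty):
    """Hypothetical variance-adjusted score: mean return minus penalty * variance."""
    return mean - penalty * var


mean, var = mean_and_variance(0.5, 100_000, random.Random(0))
print(mean, var, variance_adjusted(mean, var, 0.1))
```

In this toy episode the return is geometrically distributed, so the estimates can be checked against the closed forms 1/p_stop for the mean and (1 - p_stop)/p_stop^2 for the variance.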
Morphological Segmentation of Image Sequences
[chapter]
1994
Computational Imaging and Vision
In image compression, object-based approaches are adapted to high compression rates, since they take into account the geometry of the objects and the human eye characteristics. ...
This paper presents a method to segment image sequences, first step of an object-oriented compression system, based on Mathematical Morphology. ...
Some segmentations at different resolutions, based on a contrast-size criterion are shown in fig. 2.
Fig. 2. 2D segmentations.
Fig. 4. 3D gradient of a moving object. ...
doi:10.1007/978-94-011-1040-2_14
dblp:conf/ismm/MarcoteguiM94
fatcat:a2onkjsqg5hhblmssscgtpe3ua
Learning from Heterogeneous Sources via Gradient Boosting Consensus
[chapter]
2012
Proceedings of the 2012 SIAM International Conference on Data Mining
Multiple data sources containing different types of features may be available for a given task. For instance, users' profiles can be used to build recommendation systems. ...
the gradient residual of the objective function. ...
For example, the genre database does not have the record for "Monster a-Go Go"; the running times database does not have any record of "Apocalypse Now". ...
doi:10.1137/1.9781611972825.20
dblp:conf/sdm/ShiPGY12
fatcat:jqiq5mepxbefrnout2vt7q2ade
Trajectory-Based Off-Policy Deep Reinforcement Learning
[article]
2019
arXiv
pre-print
The resulting objective is amenable to standard neural network optimization strategies like stochastic gradient descent or stochastic gradient Hamiltonian Monte Carlo. ...
Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. ...
Trajectory based objective estimate Whilst evaluation of the Monte Carlo based expected cost estimate is possible also for deterministic policies, the off-policy evaluation is no longer feasible since ...
arXiv:1905.05710v1
fatcat:6po2azo7yndsrjmh4ewcdnfmum
Marine boundary layer refractive effects in the infrared
1979
Applied Optics
This is the case to be expected for flir systems operated from periscopes near the ocean surface. ...
gradient for various distances. ...
doi:10.1364/ao.18.002532
pmid:20212700
fatcat:2kxsaq6iyvhgbn7k5wn4v3mcz4