218,472 Hits in 2.7 sec

GO Gradient for Expectation-Based Objectives [article]

Yulai Cong, Miaoyun Zhao, Ke Bai, Lawrence Carin
2019 arXiv   pre-print
Within many machine learning algorithms, a fundamental problem concerns efficient calculation of an unbiased gradient with respect to parameters for expectation-based objectives E_{q_γ(y)}[f(y)].  ...  We find that the GO gradient often works well in practice based on only one Monte Carlo sample (although one can of course use more samples if desired).  ...  We also wish to thank Chenyang Tao, Liqun Chen, and Chunyuan Li for helpful discussions.  ... 
arXiv:1901.06020v1 fatcat:sqvz2zbonbabdmx565rgx63f7a

GO Hessian for Expectation-Based Objectives [article]

Yulai Cong, Miaoyun Zhao, Jianqiao Li, Junya Chen, Lawrence Carin
2020 arXiv   pre-print
An unbiased low-variance gradient estimator, termed GO gradient, was proposed recently for expectation-based objectives E_{q_γ(y)}[f(y)], where the random variable (RV) y may be drawn from a stochastic  ...  Upgrading the GO gradient, we present for E_{q_γ(y)}[f(y)] an unbiased low-variance Hessian estimator, named GO Hessian.  ...  Acknowledgments and Disclosure of Funding We thank the anonymous reviewers for their constructive comments. The research was supported in part by DARPA, DOE, NIH, NSF and ONR.  ... 
arXiv:2006.08873v1 fatcat:hrvztiwjynbzbmuztvzu7fcq3q

Toward Robust Material Recognition for Everyday Objects

Diane Hu, Liefeng Bo, Xiaofeng Ren
2011 Proceedings of the British Machine Vision Conference 2011  
Let σ^o_z be the standard deviation of gradient orientation around z (using angles), and σ^o_z  ...  Large-Margin Nearest Neighbor learning is used for a 30-fold dimension reduction. We improve the state-of-the-art accuracy on the Flickr dataset [16] from 45% to 54%.  ...  Local binary patterns are then combined with a position kernel and a gradient-based weighting scheme to produce a patch-level shape descriptor that is very effective for shape-based recognition.  ... 
doi:10.5244/c.25.48 dblp:conf/bmvc/HuBR11 fatcat:wkbntlckfzd3rpjmclrnqt7k5a

Surface Dependent Representations for Illumination Insensitive Image Comparison

Margarita Osadchy, David W. Jacobs, Michael Lindenbaum
2007 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Previous work has shown the effectiveness of comparing the image gradient direction for surfaces with material properties that change rapidly in one direction.  ...  This suggests that a combination of these strategies should be employed to compare general objects.  ...  Therefore, it will yield similar results to those obtained by gradient direction comparison. We will call image comparison based on odd Gabors GO.  ... 
doi:10.1109/tpami.2007.250602 pmid:17108386 fatcat:f4o3f5qwpbhrtjkbx2iiy7scta

Monte-Carlo simulation balancing

David Silver, Gerald Tesauro
2009 Proceedings of the 26th Annual International Conference on Machine Learning - ICML '09  
We develop two algorithms for balancing a simulation policy by gradient descent.  ...  We test each algorithm in the domain of 5 × 5 and 6 × 6 Computer Go, using a softmax policy that is parameterised by weights for a hundred simple patterns.  ...  The objective is to maximise the expected cumulative reward from start state s.  ... 
doi:10.1145/1553374.1553495 dblp:conf/icml/SilverT09 fatcat:76z2xpqbgbh53bb65ul76v2b3a

Implicitly Constrained Semi-supervised Linear Discriminant Analysis

Jesse H. Krijthe, Marco Loog
2014 2014 22nd International Conference on Pattern Recognition  
Based on this objective, it turns out one can efficiently find the optimal classifier in this set of possible classifiers by allowing for soft label assignments to the unlabeled objects.  ...  One way to know how well any of these classifiers is going to perform is to estimate its performance using the supervised objective function evaluated on labeled objects alone.  ... 
doi:10.1109/icpr.2014.646 dblp:conf/icpr/KrytheL14 fatcat:thnyiafh4jao3fggjtxk2ckxpi

Learning Neural Parsers with Deterministic Differentiable Imitation Learning [article]

Tanmay Shankar, Nicholas Rhinehart, Katharina Muelling, Kris M. Kitani
2018 arXiv   pre-print
From another perspective, our approach is a variant of the Deterministic Policy Gradient suitable for the imitation learning setting.  ...  We explore the problem of learning to decompose spatial tasks into segments, as exemplified by the problem of a painting robot covering a large object.  ...  Acknowledgments The authors would like to thank Wen Sun, Anirudh Vemula, and Arjun Sharma for technical discussions, and Marinus Analytics for providing us access to computing resources for our experiments  ... 
arXiv:1806.07822v2 fatcat:q34dcagzondpzmcbbj32nu2qnm

Faded-Experience Trust Region Policy Optimization for Model-Free Power Allocation in Interference Channel [article]

Mohammad G. Khoshkholgh, Halim Yanikomeroglu
2020 arXiv   pre-print
Policy gradient reinforcement learning techniques enable an agent to directly learn an optimal action policy through interactions with the environment.  ...  We apply our method to the trust-region policy optimization (TRPO), primarily developed for locomotion tasks, and propose faded-experience (FE) TRPO.  ...  In practice, the above expectation should be estimated over a batch of data collected from the current policy via the Monte Carlo (MC) technique (sample-based estimate of the policy gradient).  ... 
arXiv:2008.01705v1 fatcat:rimi5rekfffypjomdufo5hmfxy

Optimum Functionally Gradient Materials for Dental Implant Using Simulated Annealing [chapter]

Ali Sadollah, Ardeshir Bahreininej
2012 Simulated Annealing - Single and Multiple Objective Problems  
A multi-objective approach based on simulated annealing and its application to nuclear fuel management, 5th International Conference on Nuclear Engineering, Nice, France, pp. 416-423. References  ...  Acknowledgement The authors would like to acknowledge the Ministry of Higher Education of Malaysia and the University of Malaya, Kuala Lumpur, Malaysia for the financial support under UM.TNC2/IPPP/  ...  Going back to Figs. 4b and 4c for minimizing both objective functions (f2 and f3), we expect to have one optimal solution as the two functions have a similar trend to reach the optimal point.  ... 
doi:10.5772/45640 fatcat:sevf5vw6jraevir5c6ma4h4sem

Upper bounds for the 0-1 stochastic knapsack problem and a B&B algorithm

Stefanie Kosuch, Abdel Lisser
2009 Annals of Operations Research  
Here, the former is used to approximate the gradient of the objective function that is a function in expectation.  ...  Based on this observation, we use in the following a stopping criterion for the stochastic gradient algorithm of 500 iterations.  ... 
doi:10.1007/s10479-009-0577-5 fatcat:bwh4tthcvbc2bpot4ansfzw64u

Variance Adjusted Actor Critic Algorithms [article]

Aviv Tamar, Shie Mannor
2013 arXiv   pre-print
We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return.  ...  We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function.  ...  For a policy π_θ the expected reward-to-go J_θ : X → R, also known as the value function, is given by J_θ(x) = E_θ[B | x_0 = x], where E_θ denotes an expectation when following policy π_θ.  ... 
arXiv:1310.3697v1 fatcat:atcu74wwtvgvljjmwfvz224gde

Morphological Segmentation of Image Sequences [chapter]

B. Marcotegui, F. Meyer
1994 Computational Imaging and Vision  
In image compression, object-based approaches are adapted to high compression rates, since they take into account the geometry of the objects and the human eye characteristics.  ...  This paper presents a method to segment image sequences, first step of an object-oriented compression system, based on Mathematical Morphology.  ...  Some segmentations at different resolutions, based on a contrast-size criterion, are shown in Fig. 2. Fig. 2. 2D segmentations. Fig. 4. 3D gradient of a moving object.  ... 
doi:10.1007/978-94-011-1040-2_14 dblp:conf/ismm/MarcoteguiM94 fatcat:a2onkjsqg5hhblmssscgtpe3ua

Learning from Heterogeneous Sources via Gradient Boosting Consensus [chapter]

Xiaoxiao Shi, Jean-Francois Paiement, David Grangier, Philip S. Yu
2012 Proceedings of the 2012 SIAM International Conference on Data Mining  
Multiple data sources containing different types of features may be available for a given task. For instance, users' profiles can be used to build recommendation systems.  ...  the gradient residual of the objective function.  ...  For example, the genre database does not have the record for "Monster a-Go Go"; the running times database does not have any record of "Apocalypse Now".  ... 
doi:10.1137/1.9781611972825.20 dblp:conf/sdm/ShiPGY12 fatcat:jqiq5mepxbefrnout2vt7q2ade

Trajectory-Based Off-Policy Deep Reinforcement Learning [article]

Andreas Doerr, Michael Volpp, Marc Toussaint, Sebastian Trimpe, Christian Daniel
2019 arXiv   pre-print
The resulting objective is amenable to standard neural network optimization strategies like stochastic gradient descent or stochastic gradient Hamiltonian Monte Carlo.  ...  Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks.  ...  Trajectory based objective estimate Whilst evaluation of the Monte Carlo based expected cost estimate is possible also for deterministic policies, the off-policy evaluation is no longer feasible since  ... 
arXiv:1905.05710v1 fatcat:6po2azo7yndsrjmh4ewcdnfmum

Marine boundary layer refractive effects in the infrared

R. Feinberg, H. G. Hughes
1979 Applied Optics  
This is the case to be expected for FLIR systems operated from periscopes near the ocean surface.  ...  gradient for various distances.  ... 
doi:10.1364/ao.18.002532 pmid:20212700 fatcat:2kxsaq6iyvhgbn7k5wn4v3mcz4