Privacy-Preserving Multi-Party Contextual Bandits
[article]
2020
arXiv
pre-print
This paper develops a privacy-preserving multi-party contextual bandit for this learning setting by combining secure multi-party computation with a differentially private mechanism based on epsilon-greedy ...
Contextual bandits are commonly used to solve recommendation or ranking problems. ...
arXiv:1910.05299v3
fatcat:4yt2qxezifgo3ofqylodcpa6cy
Locally Differentially Private (Contextual) Bandits Learning
[article]
2021
arXiv
pre-print
Note that given the existing Ω(T) lower bound for DP contextual linear bandits (Shariff & Sheffet, 2018), our result shows a fundamental difference between LDP and DP contextual bandits learning. ...
Based on our frameworks, we can improve previous best results for private bandits learning with one-point feedback, such as private Bandits Convex Optimization, and obtain the first result for Bandits ...
Note the non-linearity of g makes things much more complicated either from the view of bandits learning or privacy preservation. ...
arXiv:2006.00701v4
fatcat:quapec7ss5fl7hlzzurpbvin3i
Privacy-Preserving Bandits
[article]
2020
arXiv
pre-print
Contextual bandit algorithms (CBAs) often rely on personal data to provide recommendations. ...
This paper proposes a technique we call Privacy-Preserving Bandits (P2B); a system that updates local agents by collecting feedback from other local agents in a differentially-private manner. ...
CONCLUSIONS This paper presents P2B, a privacy-preserving approach for machine learning with contextual bandits. ...
arXiv:1909.04421v4
fatcat:ynffzmb3czc33dxlkncocevyja
Dynamic Privacy Pricing For Timely Rewards
2018
International Journal for Research in Applied Science and Engineering Technology
Conclusion: It is useful to protect an individual's privacy and to set the proper payoff. ...
Setting a price for an individual's privacy is one measure to counter these threats, though it is a difficult issue. ...
Contextual Bandit Approach Instead of estimating the cumulative distribution, here we view the time-variant characteristic of the bandit problem from a different perspective. G. ...
doi:10.22214/ijraset.2018.2134
fatcat:npvtposimnevti2qffjso226pa
Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes
[article]
2021
arXiv
pre-print
To protect the users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. ...
To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation. ...
In the LDP setting, the privacy-preserving mechanism $M$ generates the privatized version of the context $x_t$, denoted by $\tilde{x}_t = M(x_t)$, to the contextual linear bandit algorithm. ...
arXiv:2110.10133v1
fatcat:gvvkdlzxr5eylcawjdp5xrmgcy
Mitigating Bias in Adaptive Data Gathering via Differential Privacy
[article]
2018
arXiv
pre-print
hypothesis tests on complex data gathered via contextual bandit algorithms leads to false discovery. ...
Moreover, there exist differentially private bandit algorithms with near optimal regret bounds: we apply existing theorems in the simple stochastic case, and give a new analysis for linear contextual bandits ...
Contextual Bandit Problems In the contextual bandit problem, decisions are endowed with observable features. ...
arXiv:1806.02329v1
fatcat:jj7jifrysjd35l3f52umsnzj54
Differentially Private Contextual Linear Bandits
[article]
2018
arXiv
pre-print
Our goal is to devise private learners for the contextual linear bandit problem. We first show that using the standard definition of differential privacy results in linear regret. ...
We study the contextual linear bandit problem, a version of the standard stochastic multi-armed bandit (MAB) problem where a learner sequentially selects actions to maximize a reward which depends also ...
[1] gives an instance dependent bound for linear bandits, which we convert to the contextual setting. Differential Privacy. Differential privacy, first introduced by Dwork et al. ...
arXiv:1810.00068v1
fatcat:rdvsvyj54ratlkstlxqfebupt4
Privacy-Preserving Dynamic Personalized Pricing with Demand Learning
[article]
2021
arXiv
pre-print
Using the fundamental framework of differential privacy from computer science, we develop a privacy-preserving dynamic pricing policy, which tries to maximize the retailer revenue while avoiding information ...
Our policy achieves both the privacy guarantee and the performance guarantee in terms of regret. ...
In fact, this is still an open problem for generalized linear contextual bandit under the DP guarantee. ...
arXiv:2009.12920v2
fatcat:ibwk2m4ptrdxdhnptlal262loe
Federated Bandit: A Gossiping Approach
[article]
2020
arXiv
pre-print
We then propose Fed_UCB, a differentially private version of Gossip_UCB, in which the agents preserve ϵ-differential privacy of their local data while achieving O(max{poly(N,M)/ϵlog^2.5 T, poly(N,M) (log_λ ...
In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. ...
Future work may include extending this framework to contextual bandits [46] with local features or bandits with continuous arms [53] . ...
arXiv:2010.12763v1
fatcat:skq7homjy5cs5phj3ikbcda2r4
Online learning with Corrupted context: Corrupted Contextual Bandits
[article]
2020
arXiv
pre-print
In order to address the corrupted-context setting, we propose to combine the standard contextual bandit approach with a classical multi-armed bandit mechanism. ...
We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the context used at each decision may ...
In this framework, motivated by privacy preserving in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of transformation of these rewards ...
arXiv:2006.15194v1
fatcat:aipupxx235gybdgytq2cnfbcue
Cascading Bandit under Differential Privacy
[article]
2021
arXiv
pre-print
This paper studies differential privacy (DP) and local differential privacy (LDP) in cascading bandits. ...
Our results extend to combinatorial semi-bandit. We show respective lower bounds for DP and LDP cascading bandits. Extensive experiments corroborate our theoretic findings. ...
Conservative contextual combinatorial cascading bandit. arXiv preprint arXiv:2104.08615, 2021. ...
arXiv:2105.11126v2
fatcat:mwmdsqfelrhh7ni3onk5h4vzxe
Multi-Armed Bandits with Local Differential Privacy
[article]
2020
arXiv
pre-print
This paper investigates the problem of regret minimization for multi-armed bandit (MAB) problems with local differential privacy (LDP) guarantee. ...
To handle this dilemma, we adopt differential privacy and study the regret upper and lower bounds for MAB algorithms with a given LDP guarantee. ...
In [25], the authors studied privacy-preserving adversarial bandits. ...
arXiv:2007.03121v1
fatcat:eehdco6p4ndpffatetx47o4nzi
Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?
[article]
2020
arXiv
pre-print
Based on differential privacy (DP) framework, we introduce and unify privacy definitions for the multi-armed bandit algorithms. ...
We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions. We leverage a unified proving technique to achieve all the lower bounds. ...
Shariff and Sheffet (2018) prove a finite-time problem-dependent lower bound for contextual bandits. ...
arXiv:1905.12298v2
fatcat:eiv2s4b7yrgajjr2vl3m6z2udi
Privacy-Aware Online Task Offloading for Mobile-Edge Computing
[chapter]
2020
Lecture Notes in Computer Science
Firstly, we formulate the joint optimization problem of task offloading and privacy preservation as a semiparametric contextual multi-armed bandit (MAB) problem, which has a relaxed reward model. ...
In order to tackle these challenges, a privacy-preserving and device-managed task offloading scheme is proposed in this paper for MEC. ...
In the cloud layer, authors proposed a privacy-preserving and contextual online learning algorithm to manage the participants' reputation. ...
doi:10.1007/978-3-030-59016-1_21
fatcat:wmjd24uhurgrdl4xq4udbv6zla
Federated Linear Contextual Bandits
[article]
2021
arXiv
pre-print
This paper presents a novel federated linear contextual bandits model, where individual clients face different K-armed stochastic bandits coupled through common global parameters. ...
Li et al. (2020) and Zhu et al. (2021) focus on differential privacy based local data privacy protection in federated bandits. ...
arXiv:2110.14177v1
fatcat:y322gkg7cvfcrnqckr5lfrdljm