Online Active Regression [article]

Cheng Chen, Yi Li, Yiming Sun
2022 arXiv pre-print
Active regression considers a linear regression problem where the learner receives a large number of data points but can only observe a small number of labels. Since online algorithms can handle incremental training data at low computational cost, we consider an online extension of the active regression problem: the learner receives data points one by one and immediately decides whether to collect the corresponding label. The goal is to efficiently maintain the regression of the received data points with a small budget of label queries. We propose novel algorithms for this problem under ℓ_p loss where p∈[1,2]. To achieve a (1+ϵ)-approximate solution, our proposed algorithms require only 𝒪̃(ϵ^-1 d log(nκ)) label queries, where n is the number of data points and κ is the condition number of the data points. The numerical results verify our theoretical results and show that our methods perform comparably to offline active regression algorithms.
arXiv:2207.05945v2 fatcat:wlu3uottjbflldcvdi43gt6yjy
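
The online setting described in the abstract can be illustrated with a minimal sketch for the p = 2 case. This is not the authors' algorithm: it swaps in plain online ridge leverage score sampling, a standard technique, as a stand-in for the query rule. The function name `online_active_l2_regression`, the label oracle `query_label`, and the parameters `budget_scale` and `ridge` are hypothetical and chosen only for illustration.

```python
import numpy as np

def online_active_l2_regression(rows, query_label, budget_scale=4.0, ridge=1e-3, seed=0):
    """Illustrative sketch only (assumed scheme, not the paper's method).

    Rows arrive one at a time. For each row we decide immediately, and
    irrevocably, whether to query its label, with probability proportional
    to the row's online ridge leverage score. The final estimate is a
    reweighted least-squares fit on the queried rows.
    """
    rng = np.random.default_rng(seed)
    d = rows.shape[1]
    gram = ridge * np.eye(d)  # regularized Gram matrix of the rows seen so far
    sampled_rows, sampled_labels = [], []
    for i, a in enumerate(rows):
        gram += np.outer(a, a)
        # Online ridge leverage score of `a` against the rows seen so far,
        # including `a` itself; it always lies in [0, 1).
        score = float(a @ np.linalg.solve(gram, a))
        prob = min(1.0, budget_scale * score)
        if rng.random() < prob:
            # Rescale so the sampled least-squares objective is unbiased.
            w = 1.0 / np.sqrt(prob)
            sampled_rows.append(w * a)
            sampled_labels.append(w * query_label(i))  # the only place a label is read
    S = np.array(sampled_rows).reshape(-1, d)
    y = np.array(sampled_labels)
    x_hat, *_ = np.linalg.lstsq(S, y, rcond=None)
    return x_hat, len(sampled_labels)
```

Passing the labels through a callable such as `lambda i: b[i]` makes the query budget explicit: the second return value counts how many labels were actually read. On well-conditioned data the scores decay roughly like d/i, so the number of queries grows only logarithmically in n, which mirrors the flavor of the bound stated in the abstract without reproducing its analysis.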