Parity Queries for Binary Classification [article]

Hye Won Chung, Ji Oon Lee, Doyeon Kim, Alfred O. Hero
2019 arXiv   pre-print
Consider a query-based data acquisition problem that aims to recover the values of k binary variables from parity (XOR) measurements of chosen subsets of the variables. Assume the response model where only a randomly selected subset of the measurements is received. We propose a method for designing a sequence of queries so that the variables can be identified with high probability using as few (n) measurements as possible. We define the query difficulty d̅ as the average size of the query
more » ... s and the sample complexity n as the minimum number of measurements required to attain a given recovery accuracy. We obtain fundamental trade-offs between recovery accuracy, query difficulty, and sample complexity. In particular, the necessary and sufficient sample complexity required for recovering all k variables with high probability is n = c_0 max{k, (k log k)/d̅} and the sample complexity for recovering a fixed proportion (1-δ)k of the variables for δ=o(1) is n = c_1max{k, (k log(1/δ))/d̅}, where c_0, c_1>0.
arXiv:1809.00901v2 fatcat:rwl2wudwqfhphhsi32dovzzos4