Multi-armed Bandits for Link Configuration in Millimeter-wave Networks [article]

Yi Zhang, Robert W. Heath
Establishing and maintaining millimeter-wave (mmWave) links is challenging due to the changing environment and the high sensitivity of mmWave signals to user mobility and channel conditions. MmWave link configuration problems often involve a search for the optimal system parameters under environmental uncertainty, from a finite set of alternatives that are supported by the system hardware and protocol. For example, beam sweeping aims at identifying the optimal beam(s) for data transmission from a discrete codebook. Selecting parameters such as the beam sweeping period and the beamwidth is crucial to achieving high overall system throughput. In this article, we motivate the use of the multi-armed bandit (MAB) framework to intelligently search for the optimal configuration when establishing mmWave links. MAB is a reinforcement learning framework that guides a decision-maker to sequentially select one action from a set of actions. As an example, we show that within the MAB framework, the optimal beam sweeping period, beamwidth, and beam directions can be learned dynamically with sample-efficient and computationally efficient bandit algorithms. We conclude by highlighting some future research directions on enhancing mmWave link configuration design with MAB.
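
To make the sequential selection idea concrete, the following is a minimal sketch of one standard bandit algorithm, UCB1, choosing a beam from a discrete codebook. It is an illustration only and is not taken from the article: the function name `ucb1_beam_selection`, the Bernoulli reward model (1 for a successful transmission, 0 otherwise), and the per-beam success probabilities are all assumptions made for this example.

```python
import math
import random


def ucb1_beam_selection(mean_rewards, rounds, seed=0):
    """Run UCB1 over a discrete beam codebook (illustrative sketch).

    mean_rewards: hypothetical per-beam success probabilities, which the
                  learner does not know and must estimate from feedback.
    Returns how many times each beam was selected.
    """
    rng = random.Random(seed)
    n_beams = len(mean_rewards)
    counts = [0] * n_beams   # number of times each beam has been tried
    sums = [0.0] * n_beams   # accumulated reward per beam

    for t in range(1, rounds + 1):
        if t <= n_beams:
            # Initialization: try every beam once.
            beam = t - 1
        else:
            # Pick the beam with the largest upper confidence bound:
            # empirical mean plus an exploration bonus.
            beam = max(
                range(n_beams),
                key=lambda b: sums[b] / counts[b]
                + math.sqrt(2.0 * math.log(t) / counts[b]),
            )
        # Assumed Bernoulli feedback for the chosen beam.
        reward = 1.0 if rng.random() < mean_rewards[beam] else 0.0
        counts[beam] += 1
        sums[beam] += reward
    return counts


# Hypothetical codebook of four beams; beam index 2 has the best link quality.
counts = ucb1_beam_selection([0.1, 0.3, 0.9, 0.2], rounds=2000)
best_beam = max(range(len(counts)), key=counts.__getitem__)
```

In this sketch each arm is a beam, but the same loop would apply if the arms were candidate beamwidths or beam sweeping periods, with the reward replaced by a suitable throughput measure.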
doi:10.48550/arxiv.2202.01196 fatcat:4te6nq5pu5hpjcqknrqxwuct2i