Abstract: A restless multi-armed bandit problem that arises in
multichannel opportunistic communications is considered, where
channels are modeled as independent and identical Gilbert–Elliot
channels and channel state detection is subject to errors. A simple
structure of the myopic policy is established under a certain condition
on the false alarm probability of the channel state detector.
It is shown that myopic actions can be obtained by maintaining
a simple channel ordering without knowing the underlying Markovian
model. The optimality of the myopic policy is proved for
the case of two channels and conjectured for general cases. Lower
and upper bounds on the performance of the myopic policy are
obtained in closed form; these bounds characterize the scaling behavior
of the achievable throughput of the multichannel opportunistic
system. The approximation factor of the myopic policy is also
analyzed to bound its worst-case performance loss with respect to
the optimal performance.
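The ordering-based structure of the myopic policy described above lends itself to a very short simulation. The Python sketch below is only illustrative: it assumes positively correlated Gilbert–Elliot transitions (p11 >= p01), a detector whose only error is a false alarm on a good channel, and per-slot ACK feedback; the function name `simulate_myopic_ordering` and all parameter values are placeholders, not the paper's exact model.

```python
from collections import deque
import random

def simulate_myopic_ordering(num_channels=4, horizon=20, seed=0):
    """Sketch of the ordering-based myopic policy: sense the channel at the
    head of an ordered list; after a successful (ACKed) slot keep it at the
    head, otherwise rotate it to the tail.  Channel dynamics and the detector
    are simplified assumptions, not the paper's exact model."""
    rng = random.Random(seed)
    p01, p11 = 0.2, 0.8        # assumed positively correlated transitions (p11 >= p01)
    false_alarm = 0.1          # detector flags a good channel as bad with this probability
    state = [rng.random() < 0.5 for _ in range(num_channels)]   # True = good state
    order = deque(range(num_channels))                          # maintained channel ordering
    throughput = 0

    for _ in range(horizon):
        chan = order[0]                                  # myopic action: head of the ordering
        detected_good = state[chan] and rng.random() >= false_alarm
        if detected_good:
            throughput += 1                              # successful slot: keep channel at head
        else:
            order.rotate(-1)                             # no ACK: move the channel to the tail
        # every channel evolves as an independent two-state Markov chain
        state = [rng.random() < (p11 if s else p01) for s in state]
    return throughput

print(simulate_myopic_ordering())    # number of successful slots out of `horizon`
```

The point the abstract emphasizes is visible in the update rule: the policy uses only the ACK feedback and the maintained ordering, never the transition probabilities p01 and p11 themselves.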