Dissertation Defense
Online Learning and Decision Making for Resource Allocation
This event is free and open to the publicAdd to Google Calendar
PASSCODE: defense
Resource allocation is critical in various contexts like wireless networks, transportation, and recommendation systems, where system dynamics are often unknown or change over time. This dissertation proposes algorithms with theoretical guarantees that simultaneously learn and make decisions for resource allocation. First, we introduce a multi-armed bandit model with abandonment for recommendation systems, capturing user abandonment in short video and online education platforms. We propose algorithms that do more exploration when users like the previous recommendation and become more conservative when they do not. These algorithms achieve sharp regret bounds and outperform traditional methods in simulations. Second, we propose the MaxWeight with discounted upper confidence bound algorithm for multi-server queueing systems, which simultaneously learns service rates and schedules jobs. We prove an order-wise optimal average queue-length bound and an anytime queue-length bound, applicable to both stationary and nonstationary service rates. Simulations show that our algorithm significantly improves delay performance compared to previous algorithms. Lastly, we study pricing and matching in two-sided queues where the demand and supply curves are unknown. Using a learning-based pricing algorithm that combines zero-order stochastic projected gradient ascent with bisection search, we achieve sublinear profit regret and queue lengths, establishing a tradeoff between regret and queue length.
CHAIR: Professor Lei Ying