Book contents
- Frontmatter
- Dedication
- Contents
- Preface
- Acknowledgements
- Notation
- 1 Introduction
- 2 Ski-Rental
- 3 List Accessing
- 4 Bin-Packing
- 5 Paging
- 6 Metrical Task System
- 7 Secretary Problem
- 8 Knapsack
- 9 Bipartite Matching
- 10 Primal–Dual Technique
- 11 Facility Location and k-Means Clustering
- 12 Load Balancing
- 13 Scheduling to Minimize Flow Time (Delay)
- 14 Scheduling with Speed Scaling
- 15 Scheduling to Minimize Energy with Job Deadlines
- 16 Travelling Salesman
- 17 Convex Optimization (Server Provisioning in Cloud Computing)
- 18 Multi-Commodity Flow Routing
- 19 Resource Constrained Scheduling (Energy Harvesting Communication)
- 20 Submodular Partitioning for Welfare Maximization
- Appendix
- Bibliography
- Index
11 - Facility Location and k-Means Clustering
Published online by Cambridge University Press: 07 May 2024
- Frontmatter
- Dedication
- Contents
- Preface
- Acknowledgements
- Notation
- 1 Introduction
- 2 Ski-Rental
- 3 List Accessing
- 4 Bin-Packing
- 5 Paging
- 6 Metrical Task System
- 7 Secretary Problem
- 8 Knapsack
- 9 Bipartite Matching
- 10 Primal–Dual Technique
- 11 Facility Location and k-Means Clustering
- 12 Load Balancing
- 13 Scheduling to Minimize Flow Time (Delay)
- 14 Scheduling with Speed Scaling
- 15 Scheduling to Minimize Energy with Job Deadlines
- 16 Travelling Salesman
- 17 Convex Optimization (Server Provisioning in Cloud Computing)
- 18 Multi-Commodity Flow Routing
- 19 Resource Constrained Scheduling (Energy Harvesting Communication)
- 20 Submodular Partitioning for Welfare Maximization
- Appendix
- Bibliography
- Index
Summary
Introduction
In this section, we consider two related combinatorial online problems that have wide applications in the area of operations research and machine learning, called the facility location problem, and the k-clustering problem. With the facility location problem, requests arrive sequentially whose locations belong to a metric space. On the arrival of a new request, the decision to be made is whether to assign this request to any one of the currently open facilities or open a new facility. The cost of assigning a request to an open facility is equal to the distance between the location of the request and the location of the open facility, while opening a new facility incurs a fixed cost. The cost of an online algorithm is the sum of the costs of all requests plus the total facility opening cost, and the objective is to find online algorithms to minimize the competitive ratio.
The facility location problem is a rich object and captures important problems such as: where to install charging stations for electric vehicles with routing and infrastructure costs. In this chapter, we first derive lower bounds on the competitive ratios of both deterministic and randomized algorithms, and show that the best competitive ratio possible for any online algorithm is at least, where n is the number of requests. On the positive side, we present a randomized algorithm whose competitive ratio is at most O(log n). We also consider a secretarial input setting where the order of arrival of requests is uniformly random, for which the same randomized algorithm is at most 8-competitive. A deterministic algorithm with a competitive ratio of at most O(log n) is also established for a more general setting, where the facility-opening cost depends on the location.
Next, we consider a related problem called the k-clustering problem, where requests arrive online and the objective is to partition the set of requests into at most k-clusters that minimizes the total cost defined as follows. The cost of each cluster is the distance of all requests that belong to the cluster from its centroid (called the centre), and the total cost is the sum of the cost of all the clusters. The k-clustering problem essentially models the classification problem, a fundamental object in machine learning.
- Type
- Chapter
- Information
- Online Algorithms , pp. 217 - 246Publisher: Cambridge University PressPrint publication year: 2023