
- DSA - Home
- DSA - Overview
- DSA - Environment Setup
- DSA - Algorithms Basics
- DSA - Asymptotic Analysis
- Data Structures
- DSA - Data Structure Basics
- DSA - Data Structures and Types
- DSA - Array Data Structure
- DSA - Skip List Data Structure
- Linked Lists
- DSA - Linked List Data Structure
- DSA - Doubly Linked List Data Structure
- DSA - Circular Linked List Data Structure
- Stack & Queue
- DSA - Stack Data Structure
- DSA - Expression Parsing
- DSA - Queue Data Structure
- DSA - Circular Queue Data Structure
- DSA - Priority Queue Data Structure
- DSA - Deque Data Structure
- Searching Algorithms
- DSA - Searching Algorithms
- DSA - Linear Search Algorithm
- DSA - Binary Search Algorithm
- DSA - Interpolation Search
- DSA - Jump Search Algorithm
- DSA - Exponential Search
- DSA - Fibonacci Search
- DSA - Sublist Search
- DSA - Hash Table
- Sorting Algorithms
- DSA - Sorting Algorithms
- DSA - Bubble Sort Algorithm
- DSA - Insertion Sort Algorithm
- DSA - Selection Sort Algorithm
- DSA - Merge Sort Algorithm
- DSA - Shell Sort Algorithm
- DSA - Heap Sort Algorithm
- DSA - Bucket Sort Algorithm
- DSA - Counting Sort Algorithm
- DSA - Radix Sort Algorithm
- DSA - Quick Sort Algorithm
- Matrices Data Structure
- DSA - Matrices Data Structure
- DSA - Lup Decomposition In Matrices
- DSA - Lu Decomposition In Matrices
- Graph Data Structure
- DSA - Graph Data Structure
- DSA - Depth First Traversal
- DSA - Breadth First Traversal
- DSA - Spanning Tree
- DSA - Topological Sorting
- DSA - Strongly Connected Components
- DSA - Biconnected Components
- DSA - Augmenting Path
- DSA - Network Flow Problems
- DSA - Flow Networks In Data Structures
- DSA - Edmonds Blossom Algorithm
- DSA - Maxflow Mincut Theorem
- Tree Data Structure
- DSA - Tree Data Structure
- DSA - Tree Traversal
- DSA - Binary Search Tree
- DSA - AVL Tree
- DSA - Red Black Trees
- DSA - B Trees
- DSA - B+ Trees
- DSA - Splay Trees
- DSA - Range Queries
- DSA - Segment Trees
- DSA - Fenwick Tree
- DSA - Fusion Tree
- DSA - Hashed Array Tree
- DSA - K-Ary Tree
- DSA - Kd Trees
- DSA - Priority Search Tree Data Structure
- Recursion
- DSA - Recursion Algorithms
- DSA - Tower of Hanoi Using Recursion
- DSA - Fibonacci Series Using Recursion
- Divide and Conquer
- DSA - Divide and Conquer
- DSA - Max-Min Problem
- DSA - Strassen's Matrix Multiplication
- DSA - Karatsuba Algorithm
- Greedy Algorithms
- DSA - Greedy Algorithms
- DSA - Travelling Salesman Problem (Greedy Approach)
- DSA - Prim's Minimal Spanning Tree
- DSA - Kruskal's Minimal Spanning Tree
- DSA - Dijkstra's Shortest Path Algorithm
- DSA - Map Colouring Algorithm
- DSA - Fractional Knapsack Problem
- DSA - Job Sequencing with Deadline
- DSA - Optimal Merge Pattern Algorithm
- Dynamic Programming
- DSA - Dynamic Programming
- DSA - Matrix Chain Multiplication
- DSA - Floyd Warshall Algorithm
- DSA - 0-1 Knapsack Problem
- DSA - Longest Common Sub-sequence Algorithm
- DSA - Travelling Salesman Problem (Dynamic Approach)
- Hashing
- DSA - Hashing Data Structure
- DSA - Collision In Hashing
- Disjoint Set
- DSA - Disjoint Set
- DSA - Path Compression And Union By Rank
- Heap
- DSA - Heap Data Structure
- DSA - Binary Heap
- DSA - Binomial Heap
- DSA - Fibonacci Heap
- Tries Data Structure
- DSA - Tries
- DSA - Standard Tries
- DSA - Compressed Tries
- DSA - Suffix Tries
- Treaps
- DSA - Treaps Data Structure
- Bit Mask
- DSA - Bit Mask In Data Structures
- Bloom Filter
- DSA - Bloom Filter Data Structure
- Approximation Algorithms
- DSA - Approximation Algorithms
- DSA - Vertex Cover Algorithm
- DSA - Set Cover Problem
- DSA - Travelling Salesman Problem (Approximation Approach)
- Randomized Algorithms
- DSA - Randomized Algorithms
- DSA - Randomized Quick Sort Algorithm
- DSA - Karger’s Minimum Cut Algorithm
- DSA - Fisher-Yates Shuffle Algorithm
- Miscellaneous
- DSA - Infix to Postfix
- DSA - Bellmon Ford Shortest Path
- DSA - Maximum Bipartite Matching
- DSA Useful Resources
- DSA - Questions and Answers
- DSA - Selection Sort Interview Questions
- DSA - Merge Sort Interview Questions
- DSA - Insertion Sort Interview Questions
- DSA - Heap Sort Interview Questions
- DSA - Bubble Sort Interview Questions
- DSA - Bucket Sort Interview Questions
- DSA - Radix Sort Interview Questions
- DSA - Cycle Sort Interview Questions
- DSA - Quick Guide
- DSA - Useful Resources
- DSA - Discussion
Set Cover Problem
The set cover algorithm provides solution to many real-world resource allocating problems. For instance, consider an airline assigning crew members to each of their airplanes such that they have enough people to fulfill the requirements for the journey. They take into account the flight timings, the duration, the pit-stops, availability of the crew to assign them to the flights. This is where set cover algorithm comes into picture.
Given a universal set U, containing few elements which are all divided into subsets. Considering the collection of these subsets as S = {S1, S2, S3, S4... Sn}, the set cover algorithm finds the minimum number of subsets such that they cover all the elements present in the universal set.

As shown in the above diagram, the dots represent the elements present in the universal set U that are divided into different sets, S = {S1, S2, S3, S4, S5, S6}. The minimum number of sets that need to be selected to cover all the elements will be the optimal output = {S1, S2, S3}.
Set Cover Algorithm
The set cover takes the collection of sets as an input and and returns the minimum number of sets required to include all the universal elements.
The set cover algorithm is an NP-Hard problem and a 2-approximation greedy algorithm.
Algorithm
Step 1 − Initialize Output = {} where Output represents the output set of elements.
Step 2 − While the Output set does not include all the elements in the universal set, do the following −
Find the cost-effectiveness of every subset present in the universal set using the formula, $\frac{Cost\left ( S_{i} \right )}{S_{i}-Output}$
Find the subset with minimum cost effectiveness for each iteration performed. Add the subset to the Output set.
Step 3 − Repeat Step 2 until there is no elements left in the universe. The output achieved is the final Output set.
Pseudocode
APPROX-GREEDY-SET_COVER(X, S) U = X OUTPUT = while U select Si S which has maximum |SiU| U = U S OUTPUT = OUTPUT {Si} return OUTPUT
Analysis
assuming the overall number of elements equals the overall number of sets (|X| = |S|), the code runs in time O(|X|3)
Example

Let us look at an example that describes the approximation algorithm for the set covering problem in more detail
S1 = {1, 2, 3, 4} cost(S1) = 5 S2 = {2, 4, 5, 8, 10} cost(S2) = 10 S3 = {1, 3, 5, 7, 9, 11, 13} cost(S3) = 20 S4 = {4, 8, 12, 16, 20} cost(S4) = 12 S5 = {5, 6, 7, 8, 9} cost(S5) = 15
Step 1
The output set, Output =
Find the cost effectiveness of each set for no elements in the output set,
S1 = cost(S1) / (S1 Output) = 5 / (4 0) S2 = cost(S2) / (S2 Output) = 10 / (5 0) S3 = cost(S3) / (S3 Output) = 20 / (7 0) S4 = cost(S4) / (S4 Output) = 12 / (5 0) S5 = cost(S5) / (S5 Output) = 15 / (5 0)
The minimum cost effectiveness in this iteration is achieved at S1, therefore, the subset added to the output set, Output = {S1} with elements {1, 2, 3, 4}
Step 2
Find the cost effectiveness of each set for the new elements in the output set,
S2 = cost(S2) / (S2 Output) = 10 / (5 4) S3 = cost(S3) / (S3 Output) = 20 / (7 4) S4 = cost(S4) / (S4 Output) = 12 / (5 4) S5 = cost(S5) / (S5 Output) = 15 / (5 4)
The minimum cost effectiveness in this iteration is achieved at S3, therefore, the subset added to the output set, Output = {S1, S3} with elements {1, 2, 3, 4, 5, 7, 9, 11, 13}.
Step 3
Find the cost effectiveness of each set for the new elements in the output set,
S2 = cost(S2) / (S2 Output) = 10 / |(5 9)| S4 = cost(S4) / (S4 Output) = 12 / |(5 9)| S5 = cost(S5) / (S5 Output) = 15 / |(5 9)|
The minimum cost effectiveness in this iteration is achieved at S2, therefore, the subset added to the output set, Output = {S1, S3, S2} with elements {1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13}
Step 4
Find the cost effectiveness of each set for the new elements in the output set,
S4 = cost(S4) / (S4 Output) = 12 / |(5 11)| S5 = cost(S5) / (S5 Output) = 15 / |(5 11)|
The minimum cost effectiveness in this iteration is achieved at S4, therefore, the subset added to the output set, Output = {S1, S3, S2, S4} with elements {1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 16, 20}
Step 5
Find the cost effectiveness of each set for the new elements in the output set,
S5 = cost(S5) / (S5 Output) = 15 / |(5 14)|
The minimum cost effectiveness in this iteration is achieved at S5, therefore, the subset added to the output set, Output = {S1, S3, S2, S4, S5} with elements {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 16, 20}
The final output that covers all the elements present in the universal finite set is, Output = {S1, S3, S2, S4, S5}.
Implementation
Following are the implementations of the above approach in various programming langauges −
#include <stdio.h> #define MAX_SETS 100 #define MAX_ELEMENTS 1000 int setCover(int X[], int S[][MAX_ELEMENTS], int numSets, int numElements, int output[]) { int U[MAX_ELEMENTS]; for (int i = 0; i < numElements; i++) { U[i] = X[i]; } int selectedSets[MAX_SETS]; for (int i = 0; i < MAX_SETS; i++) { selectedSets[i] = 0; // Initialize all to 0 (not selected) } int outputIdx = 0; while (outputIdx < numSets) { // Ensure we don't exceed the maximum number of sets int maxIntersectionSize = 0; int selectedSetIdx = -1; // Find the set Si with the maximum intersection with U for (int i = 0; i < numSets; i++) { if (selectedSets[i] == 0) { // Check if the set is not already selected int intersectionSize = 0; for (int j = 0; j < numElements; j++) { if (U[j] && S[i][j]) { intersectionSize++; } } if (intersectionSize > maxIntersectionSize) { maxIntersectionSize = intersectionSize; selectedSetIdx = i; } } } // If no set found, break from the loop if (selectedSetIdx == -1) { break; } // Mark the selected set as "selected" in the array selectedSets[selectedSetIdx] = 1; // Remove the elements covered by the selected set from U for (int j = 0; j < numElements; j++) { U[j] = U[j] - S[selectedSetIdx][j]; } // Add the selected set to the output output[outputIdx++] = selectedSetIdx; } return outputIdx; } int main() { int X[MAX_ELEMENTS] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; int S[MAX_SETS][MAX_ELEMENTS] = { {1, 1, 0, 0, 0, 0, 0, 0, 0, 0}, {0, 1, 1, 1, 0, 0, 0, 0, 0, 0}, {0, 0, 0, 1, 1, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 1, 1, 1, 0, 0}, {0, 0, 0, 0, 0, 0, 0, 1, 1, 1} }; int numSets = 5; int numElements = 10; int output[MAX_SETS]; int numSelectedSets = setCover(X, S, numSets, numElements, output); printf("Selected Sets: "); for (int i = 0; i < numSelectedSets; i++) { printf("%d ", output[i]); } printf("\n"); return 0; }
Output
Selected Sets: 1 2 3 4 0
#include <iostream> #include <vector> using namespace std; #define MAX_SETS 100 #define MAX_ELEMENTS 1000 // Function to find the set cover using the Approximate Greedy Set Cover algorithm int setCover(int X[], int S[][MAX_ELEMENTS], int numSets, int numElements, int output[]) { int U[MAX_ELEMENTS]; for (int i = 0; i < numElements; i++) { U[i] = X[i]; } int selectedSets[MAX_SETS]; for (int i = 0; i < MAX_SETS; i++) { selectedSets[i] = 0; // Initialize all to 0 (not selected) } int outputIdx = 0; while (outputIdx < numSets) { // Ensure we don't exceed the maximum number of sets int maxIntersectionSize = 0; int selectedSetIdx = -1; // Find the set Si with maximum intersection with U for (int i = 0; i < numSets; i++) { if (selectedSets[i] == 0) { // Check if the set is not already selected int intersectionSize = 0; for (int j = 0; j < numElements; j++) { if (U[j] && S[i][j]) { intersectionSize++; } } if (intersectionSize > maxIntersectionSize) { maxIntersectionSize = intersectionSize; selectedSetIdx = i; } } } // If no set found, break from the loop if (selectedSetIdx == -1) { break; } // Mark the selected set as "selected" in the array selectedSets[selectedSetIdx] = 1; // Remove the elements covered by the selected set from U for (int j = 0; j < numElements; j++) { U[j] = U[j] - S[selectedSetIdx][j]; } // Add the selected set to the output output[outputIdx++] = selectedSetIdx; } return outputIdx; } int main() { int X[MAX_ELEMENTS] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; int S[MAX_SETS][MAX_ELEMENTS] = { {1, 1, 0, 0, 0, 0, 0, 0, 0, 0}, {0, 1, 1, 1, 0, 0, 0, 0, 0, 0}, {0, 0, 0, 1, 1, 1, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 1, 1, 1, 0, 0}, {0, 0, 0, 0, 0, 0, 0, 1, 1, 1} }; int numSets = 5; int numElements = 10; int output[MAX_SETS]; int numSelectedSets = setCover(X, S, numSets, numElements, output); cout << "Selected Sets: "; for (int i = 0; i < numSelectedSets; i++) { cout << output[i] << " "; } cout << endl; return 0; }
Output
Selected Sets: 1 2 3 4 0
import java.util.*; public class SetCover { public static List<Integer> setCover(int[] X, int[][] S) { Set<Integer> U = new HashSet<>(); for (int x : X) { U.add(x); } List<Integer> output = new ArrayList<>(); while (!U.isEmpty()) { int maxIntersectionSize = 0; int selectedSetIdx = -1; for (int i = 0; i < S.length; i++) { int intersectionSize = 0; for (int j = 0; j < S[i].length; j++) { if (U.contains(S[i][j])) { intersectionSize++; } } if (intersectionSize > maxIntersectionSize) { maxIntersectionSize = intersectionSize; selectedSetIdx = i; } } if (selectedSetIdx == -1) { break; } for (int j = 0; j < S[selectedSetIdx].length; j++) { U.remove(S[selectedSetIdx][j]); } output.add(selectedSetIdx); } return output; } public static void main(String[] args) { int[] X = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; int[][] S = { {1, 2}, {2, 3, 4}, {4, 5, 6}, {6, 7, 8}, {8, 9, 10} }; List<Integer> selectedSets = setCover(X, S); System.out.print("Selected Sets: "); for (int idx : selectedSets) { System.out.print(idx + " "); } System.out.println(); } }
Output
Selected Sets: 1 3 4 0 2
def set_cover(X, S): U = set(X) output = [] while U: max_intersection_size = 0 selected_set_idx = -1 for i, s in enumerate(S): intersection_size = len(U.intersection(s)) if intersection_size > max_intersection_size: max_intersection_size = intersection_size selected_set_idx = i if selected_set_idx == -1: break U = U - set(S[selected_set_idx]) output.append(selected_set_idx) return output if __name__ == "__main__": X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] S = [ {1, 2}, {2, 3, 4}, {4, 5, 6}, {6, 7, 8}, {8, 9, 10} ] selected_sets = set_cover(X, S) print("Selected Sets:", selected_sets)
Output
Selected Sets: 1 3 4 0 2