Greedy Algorithms
This article provides a beginner-friendly overview of greedy algorithms. Greedy algorithms simulate the decision-making process of a very greedy individual. At every step, this person chooses the option that appears best at that moment, without worrying about future consequences. While this approach can lead to optimal solutions in some problems, it does not work universally. Therefore, it is crucial to verify the correctness of a greedy solution for each specific problem.
Introduction
A greedy algorithm makes the best possible choice at each step with the hope of finding a global optimum. However, because the algorithm only considers the present without taking into account future outcomes, it is essential to ensure that the problem has properties (such as optimal substructure) that guarantee an optimal solution from a series of locally optimal choices.
Explanation
When to Use Greedy Algorithms
Greedy methods work particularly well when the problem exhibits the property of optimal substructure. This means that the optimal solution of the overall problem can be constructed from the optimal solutions of its subproblems. In other words, if you can divide a problem into smaller, manageable pieces and the best solution for each piece leads to the best overall solution, then a greedy approach might be suitable.
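As a small illustration (not part of the original text), consider making change with a canonical coin system such as US denominations: repeatedly taking the largest coin that still fits is a locally optimal choice that happens to yield a globally optimal answer here, though it can fail for arbitrary denominations. The denominations below are an assumption for the sketch:

```python
def greedy_change(amount, coins=(25, 10, 5, 1)):
    """Make change by always taking the largest coin that still fits.

    Optimal for canonical systems like (25, 10, 5, 1); for arbitrary
    denominations the greedy choice can fail, which is why correctness
    must be verified per problem.
    """
    used = []
    for coin in coins:  # coins assumed sorted in descending order
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used

print(greedy_change(63))  # [25, 25, 10, 1, 1, 1]
```

Each subproblem here is "make change for the remaining amount", and the optimal answer to the whole problem is built from optimal answers to those smaller amounts.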
Proving Greedy Algorithms
There are two common proof strategies for greedy algorithms:
- Proof by Contradiction (exchange argument): assume the greedy solution is not optimal, then show that swapping any two elements (or any two adjacent elements) in the greedy sequence cannot yield a better answer. The contradiction indicates that the current sequence is already optimal.
- Inductive Proof: first identify the optimal solution for a base case (for example, when there is only one element, i.e., $n = 1$). Then show that for every $n + 1$, the optimal solution $F(n + 1)$ can be derived from the optimal solution $F(n)$.
Key Points
Common Problem Patterns
In many introductory problems, you'll encounter two main greedy strategies:
- Sort the elements in a specific order (for example, ascending) and then select based on that order.
- Continuously pick the maximum or minimum element from a collection and update the dataset. (Often, using a priority queue can help optimize the second approach.)
The primary difference between these methods is that the first is an "offline" strategy (process and sort all elements first, then choose), whereas the second is "online" (make choices incrementally as you process data).
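To make the "online" pattern concrete, here is a hypothetical example (not from the original text): repeatedly merge the two smallest piles from a collection, paying the combined size each time, with a min-heap maintaining the current minimum. This is the classic Huffman-style merging problem:

```python
import heapq

def min_merge_cost(piles):
    """Repeatedly pop the two smallest piles, merge them, and push the
    result back; the total of all merge costs is minimized by always
    merging the two smallest piles first."""
    heap = list(piles)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b          # cost of this merge
        heapq.heappush(heap, a + b)
    return total

print(min_merge_cost([1, 2, 9]))  # merge 1+2 (cost 3), then 3+9 (cost 12): 15
```

Note that the choice at each step depends on earlier merges, so no one-time sort suffices; the priority queue updates the dataset incrementally, which is exactly the "online" flavor.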
Sorting-Based Solutions
A typical scenario involves receiving an array with one or two weights per element. By sorting the array and then traversing it, you can simulate the selection process to achieve the optimal solution.
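A minimal sketch of this sort-then-scan pattern (the scenario is an assumption, not from the original text): given one weight per element, namely each customer's service time, serving shorter jobs first minimizes the total time everyone spends waiting for their own completion.

```python
def min_total_waiting(times):
    """Sort service times ascending (shortest job first) and accumulate
    each customer's completion time; the sorted order minimizes the sum."""
    total = 0
    elapsed = 0
    for t in sorted(times):
        elapsed += t     # this customer finishes at time `elapsed`
        total += elapsed
    return total

print(min_total_waiting([3, 1, 2]))  # 1 + 3 + 6 = 10
```

The single upfront sort fixes all decisions at once, which is what makes this an "offline" strategy.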
The Regret Approach
In the regret approach, you initially accept each option irrespective of whether it is the best. You then compare the outcome after making the selection. If you find that the choice was suboptimal (i.e., not yielding the best result), you "regret" the decision and discard that option. This process is repeated until the optimal set of decisions is determined.
A Comparison with Dynamic Programming
One of the main differences between greedy algorithms and dynamic programming is that the greedy strategy makes immediate decisions for each subproblem, with no chance of backtracking. In contrast, dynamic programming stores results of previous computations and can adjust subsequent decisions based on those stored values, effectively allowing it to revise earlier choices.
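To see the difference concretely, here is a small comparison (an illustration of my own, not from the original text) on coin change with the non-canonical denominations {1, 3, 4}: the greedy strategy commits to the largest coin and cannot backtrack, while dynamic programming consults stored subresults and effectively revises that choice.

```python
def greedy_coins(amount, coins=(4, 3, 1)):
    """Greedy: always take the largest coin; decisions are final."""
    count = 0
    for c in coins:
        count += amount // c
        amount %= c
    return count

def dp_coins(amount, coins=(1, 3, 4)):
    """DP: best[v] = fewest coins summing to v, built from stored
    answers to smaller subproblems."""
    best = [0] + [float("inf")] * amount
    for v in range(1, amount + 1):
        for c in coins:
            if c <= v:
                best[v] = min(best[v], best[v - c] + 1)
    return best[amount]

print(greedy_coins(6), dp_coins(6))  # greedy: 3 coins (4+1+1); DP: 2 coins (3+3)
```

For amount 6, greedy grabs a 4 and is stuck with two 1s, while DP finds 3 + 3; this is the kind of problem where the greedy property fails and must not be assumed.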
Detailed Examples
Example 1: Neighbour Exchange Method
Imagine a scenario where a king organizes a game for his ministers during a national celebration. Each minister, as well as the king, writes an integer on each hand. After arranging the ministers in a line with the king at the front, every minister receives an award based on the product of the left-hand numbers of everyone in front of them divided by their own right-hand number, rounded down.
The objective is to rearrange the ministers so that the maximum award any minister receives is as small as possible.
Assume that after rearrangement, the $i$-th minister has values $a_i$ and $b_i$ on his left and right hands respectively. Let $S$ be the product of the left-hand numbers of all ministers before the $i$-th minister. Then, the award for the $i$-th minister is

$$\left\lfloor \frac{S}{b_i} \right\rfloor$$

and the award for the $(i+1)$-th minister is

$$\left\lfloor \frac{S \cdot a_i}{b_{i+1}} \right\rfloor$$

If we swap these two ministers, the awards become

$$\left\lfloor \frac{S}{b_{i+1}} \right\rfloor \quad\text{and}\quad \left\lfloor \frac{S \cdot a_{i+1}}{b_i} \right\rfloor$$

Swapping is beneficial when

$$\max\left(\frac{S}{b_{i+1}}, \frac{S \cdot a_{i+1}}{b_i}\right) < \max\left(\frac{S}{b_i}, \frac{S \cdot a_i}{b_{i+1}}\right)$$

After canceling the common factor $S$ and multiplying both sides by $b_i b_{i+1}$, we obtain the condition

$$\max(b_i,\; a_{i+1} b_{i+1}) < \max(b_{i+1},\; a_i b_i)$$
This inequality can be transformed into a form that is easier to work with during implementation. One way to represent this logic in code is to define a custom comparator for each minister. Below is a Python demonstration that mimics this comparison:
```python
class Minister:
    def __init__(self, a, b):
        self.a = a  # left-hand number
        self.b = b  # right-hand number

    def __lt__(self, other):
        # Minister `self` should stand before `other` exactly when the
        # derived inequality holds.
        return max(other.b, self.a * self.b) < max(self.b, other.a * other.b)
```
In this example, each minister’s data is encapsulated within a class. The `__lt__` method (which Python uses when comparing objects during sorting) implements the comparison logic, enabling you to sort ministers according to the criterion derived from the inequality.
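End to end, one can sort with such a comparator and then scan the resulting order to compute the largest award. Below is a self-contained sketch using `functools.cmp_to_key` on plain `(a, b)` tuples instead of the class; the hand values are made up for illustration:

```python
from functools import cmp_to_key

def compare(m1, m2):
    # Keep m1 before m2 when doing so yields the smaller worst-case
    # award, per the inequality derived above.
    return -1 if max(m2[1], m1[0] * m1[1]) < max(m1[1], m2[0] * m2[1]) else 1

def smallest_max_award(ministers):
    ministers = sorted(ministers, key=cmp_to_key(compare))
    product = 1  # running product of left-hand numbers in front
    best = 0
    for a, b in ministers:
        best = max(best, product // b)  # this minister's award, floored
        product *= a
    return best

print(smallest_max_award([(2, 3), (7, 4), (4, 6), (5, 2)]))  # 10
```

On these sample values the comparator orders the ministers by ascending $a_i b_i$, and the largest award in that arrangement is 10.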
Example 2: The Regret Method – Work Scheduling
Consider a scheduling problem where John starts working at time 0 and has a vast amount of time available. There are $n$ jobs, each with a deadline $d_i$ and a profit $p_i$. In any unit of time, John can complete any one job. Your task is to choose a subset of jobs so that the total profit is maximized without missing any deadlines.
The idea behind the regret approach is as follows:
- Assume that every job is completed.
- Sort all jobs by their deadlines.
- Use a priority queue (min-heap keyed on profit) to keep track of the jobs accepted so far. As you process each job:
  - If the number of jobs already scheduled is less than the current job's deadline, simply add the job to your schedule.
  - If the schedule is full (i.e., the number of scheduled jobs equals the deadline) but the current job's profit is higher than the smallest profit among the scheduled jobs, replace that smallest-profit job with the current job. This replacement is the “regret” – you wish you hadn’t taken the lower-profit job.
Below is a sample Python implementation illustrating this approach:
```python
import heapq

def max_profit(jobs):
    """
    Calculate the maximum profit from a list of jobs.
    Each job is represented as a tuple (deadline, profit).
    """
    # Sort jobs by their deadlines.
    jobs.sort(key=lambda job: job[0])
    profit_heap = []  # min-heap of accepted profits
    for deadline, profit in jobs:
        heapq.heappush(profit_heap, profit)
        # If the number of accepted jobs exceeds the current deadline,
        # "regret" and drop the job with the smallest profit.
        if len(profit_heap) > deadline:
            heapq.heappop(profit_heap)
    return sum(profit_heap)

# Example usage:
if __name__ == "__main__":
    # List of jobs: (deadline, profit)
    jobs = [(4, 20), (2, 10), (4, 15), (3, 30)]
    print("Maximum Profit:", max_profit(jobs))
```
Complexity Analysis
- Space Complexity: In the worst case, the priority queue may store all $n$ jobs, so the space complexity is $O(n)$.
- Time Complexity: Sorting the jobs takes $O(n \log n)$ time, and each insertion or removal operation on the priority queue takes $O(\log n)$ time. Thus, the overall time complexity is $O(n \log n)$.
Exercises
Test your understanding by trying out these problems:
- A job scheduling problem where you repair a barn.
- A challenge that involves a queen’s game.
- Explore more problems tagged with greedy algorithms on various problem-solving platforms.
Feel free to experiment with these approaches and see which method best fits the problem at hand. Happy coding!