Introduction to Greedy Algorithms

Greedy is a paradigm of algorithms that finds a solution by choosing the best known option at each step, with no regard for the future consequences of that decision.

While this sounds all fine and dandy, greedy is not always the way to go! (just like in real life) For example, say your goal in life is to maximize your happiness. A greedy approach may be to cuddle with your cat 24/7. Cuddling with your cat seems like the best option at each step, and it's especially better than doing your work, but (unfortunately) this may not be the best decision in the long term.

In this article, we'll go over how to implement a greedy algorithm, and some examples of good (and of course, bad) greedy algorithms.

Let's take a look at an actual example of how a greedy algorithm can work.

How "Greedy" Works

Let's say you have a 4x4 grid of cash amounts.

$30	$10	$35	$5
$30	$100	$90	$1
$20	$50	$20	$40
$10	$75	$15	$10

Let's say you goal is to pick 4 cells from the table, but you can't select two numbers from the same row or column. How can you maximize your earnings?

Here's a greedy strategy. Pick the highest number on the board, and then remove that row/column from contention. Repeat until you've picked 4 numbers.

Not only is this strategy very easy to implement and visualize, it sounds like it could work, right? Well let's do it with the table above:

$30	$10	$35	$5
$30	$100	$90	$1
$20	$50	$20	$40
$10	$75	$15	$10

This gives us a total of 100+40+35+10=185 dollars! Not too bad, right?

But here's why you shouldn't always be greedy. Look at this solution:

$30	$10	$35	$5
$30	$100	$90	$1
$20	$50	$20	$40
$10	$75	$15	$10

This solution gives us a total of 90+75+40+30=235! Way better than our greedy strategy😔

What you should take away from this is that greedy cannot always be trusted!!! In general, greedy can be good at finding a somewhat good solution, but won't always get the very best one.

This is why if you really want the very best answer, and you want to use a greedy strategy, you have to prove that it always gives the best answer, (I'll show you how later!) and while implementing the algorithm is relatively simple, proving that it's optimal isn't quite so and requires careful analysis.

So why choose greedy algorithms if they're so unreliable? Well, there are many famous (and effective) algorithms that use the greedy approach, such as:

Dijkstra's Algorithm
Prim's Algorithm
Kruskal's Algorithm
Huffman Coding
Interval/Event Scheduling (which we will take a look at next)

I will eventually cover Dijkstra's, Prim's, and Kruskal's in an article about graph algorithms, so stay tuned for that!

Example: Event Scheduling

Suppose you're running a programming conference and you have a list of events, each with their corresponding start and end times. This list is given to you in the form:

$$[(s_1,f_1),(s_2,f_2),...,(s_n,f_n)]$$

You want to schedule the most events possible, but events can't overlap! (I guess the programming conference could only afford one room...)

Brute Force

The brute force approach is to check every single possibility of event combinations. First, set max to be 0. Next, check every combination of events, first checking if it is valid, and if it is, checking if that configuration has more events than max, in that case set max to that number.

It is easy to see that this approach is exponential in time. Way too slow! We've got a programming conference to plan!

Let's see if there is a greedy strategy that can speed this up, while also getting the best answer.

Approach 1: Earliest Start Time

Intuitively, maybe it makes sense that in order to jam as many events as possible into the conference, we want to schedule events as early as possible.

So let's try a greedy algorithm that repeatedly picks the event with the earliest start time in the list (that doesn't conflict with an event we've already chosen), removes it from the list, and then continues on until no events are left. This seems like a reasonable approach.

Let's take a look at an example of this algorithm producing the correct result. The set of events our algorithm choses are highlighted in green.

Cool, it worked! It chose 2 events, and the maximum it could've chosen was 2. But does this always work?

When determining whether or not your greedy algorithm always picks the right answer, you can show that it doesn't by finding a counterexample.

Here, we can see an example where the greedy algorithm above does not pick the most optimal answer. The optimal solution set is highlighted in orange, while the set of events our algorithm picked are highlighted in green.

Not ideal. Let's see if there's another greedy approach we can come up with.

Approach 2: Fewest Conflicts

Alright, instead of repeatedly picking the event with the earliest start time at each step, maybe it makes sense to minimize the amount of conflicts in each pick.

So this time, our algorithm will repeatedly pick the event that has the fewest amount of conflicts (that is still available), until no events are available.

Let's see this in action. Here, each event is labelled with the amount of other events it conflicts with. The ones labelled in green are the ones our greedy algorithm would choose.

Do you think you can find a counterexample that would pick more events than this one?

Well, if you haven't already, here's one:

Looks like this greedy approach didn't work either!

Approach 3: Earliest Finish Time

Let's try something a little unintuitive. Let's do the same thing as the earliest-start-time-approach, but pick the event with the earliest finish time at each step.

Let's take a look at an example of this in action:

This is the same list of events that tripped up our "Fewest Conflicts" approach, but you can see here that it actually picked the correct set!

Does this always work, though? Let's take a look at another example of this algorithm in practice:

It picked 4 events, which is the best you can do for this list of events. And as it turns out, this approach actually always picks the optimal solution for the set! Try any configuration of lists, and you'll see that our algorithm always picks the most events!

We'll get into proving why it does later, but for now let's look at an implementation of this algorithm:

Sort events by end times
events = (empty set)
time = 0
for i=1,...,n:
- if s_i > time, then:
  - add [s_i, f_i] to events
  - time = f_i
return events

This algorithm runs in O(n log n) time (pretty much bounded by the sorting), much better than exponential time! So it looks like being greedy can sometimes be great!

Part 2: Proving Your Greedy Algorithm is Optimal! (Coming soon)

Part 2 of this article will be coming soon! Stay tuned...

Here are other articles from my Algorithms series:

Dynamic Programming Algorithms

Jeremy Quinto