Greedy Algorithms
Learn how greedy algorithms make locally optimal decisions that can lead to globally optimal solutions. Explore intuition, correctness proofs, optimization strategies and competitive programming applications.
Greedy algorithms are a powerful algorithmic technique that constructs a solution step by step, always choosing the best local option at the current moment. The main idea is simple: make the optimal immediate choice and hope that a sequence of such choices leads to a globally optimal solution.
Unlike backtracking or exhaustive search, greedy algorithms never reconsider previous decisions. Because of this, they are often extremely fast and efficient, but proving their correctness can sometimes be challenging.
How Does a Greedy Algorithm Think?
A greedy algorithm always chooses what appears to be the best option at the current moment.
Instead of exploring all possible combinations, it makes an immediate decision and continues forward.
- takes the largest coin first
- chooses the shortest next step
- selects the activity that finishes earliest
The main philosophy is:
"Choose the best option now — hope it leads to the best overall solution."
Example 1: Step-by-step process and visualization — Making Change
Imagine you need to return 62 dinars using the smallest possible number of coins (denominations: 50, 20, 10, 5, 1). A greedy algorithm would follow these steps:
- Step 1: Take the largest coin not exceeding 62 → 50. (Remaining: 12)
- Step 2: Take the largest coin not exceeding 12 → 10. (Remaining: 2)
- Step 3: Take the largest coin not exceeding 2 → 1. (Remaining: 1)
- Step 4: Take one more coin of 1. (Remaining: 0)
In this case, we obtained an optimal solution using a total of 4 coins.
Implementation: Coin Change Problem in C++
This code uses a greedy strategy: it always takes the largest possible coin until the remaining amount becomes zero. Pay attention to the comments explaining each step of the logic.
#include <iostream>
#include <vector>
using namespace std;
void makeChange(int amount) {
// Set of available coin denominations (sorted descending)
// Greedy algorithm always starts from the largest value
vector<int> coins = {50, 20, 10, 5, 1};
vector<int> result;
int remaining = amount;
// Iterate through each coin
for (int i = 0; i < coins.size(); i++) {
// While we can still use this coin
while (remaining >= coins[i]) {
remaining -= coins[i]; // Reduce remaining amount
result.push_back(coins[i]); // Add coin to result
}
}
// Output result
if (remaining == 0) {
cout << "Change for " << amount << " is: ";
for (int coin : result) {
cout << coin << " ";
}
cout << endl << "Total coins used: " << result.size() << endl;
} else {
// This happens if greedy cannot form exact change
// (e.g., missing coin of value 1)
cout << "Error: Greedy algorithm could not form exact change!" << endl;
}
}
int main() {
int amount = 62;
makeChange(amount);
return 0;
}
Code analysis: The time complexity of this solution is O(n), where n is the number of coin types, since we iterate through all denominations once. Although very fast, this algorithm may fail to produce an optimal solution for certain coin systems (e.g. {25, 20, 1} with amount 40), which is an important lesson about greedy methods.
Greedy algorithms do not always guarantee an optimal solution. A choice that looks best at the moment may lead to a suboptimal final result.
We have coins: 1, 3, 4
We need to make the amount 6 Greedy approach:
4 + 1 + 1 = 3 coins ❌ Optimal solution:
3 + 3 = 2 coins ✅
Is the greedy approach always correct?
The answer is: NO. Greedy algorithms are "short-sighted". A choice that looks optimal at the moment can prevent reaching the true global optimum later.
Failure example: We need to make 40 units, and available coin denominations are 25, 20, and 1.
• Greedy: Takes 25 first, leaving 15, which must be formed using fifteen 1-unit coins. Total: 16 coins.
• Optimal: Two 20-unit coins. Total: 2 coins.
Why does greedy sometimes work and sometimes fail?
As we have seen, the greedy algorithm does not always produce the best solution. To understand when it can be used safely, we need to consider certain conditions.
This technique is correct only when the problem satisfies two key properties:
- Greedy-choice property: A globally optimal solution can be achieved by making locally optimal choices.
- Optimal substructure: The optimal solution of a problem contains optimal solutions to its subproblems.
Examples where greedy works well: Activity selection, Huffman coding, Dijkstra’s algorithm, Kruskal’s and Prim’s algorithms.
Examples where greedy usually fails: 0/1 Knapsack problem, Traveling Salesman Problem (TSP).
Basic code structure
while (there is a possible decision):
x = choose the best local option (e.g. largest coin or shortest step)
if (x is feasible):
add x to the solution
update the remaining problem
Example 2: Activity Selection Problem
The goal is to select the maximum number of non-overlapping activities. Greedy strategy: always choose the activity that finishes earliest.
// Sort activities by finishing time (a.second)
sort(activities.begin(), activities.end(),
[](auto &a, auto &b){ return a.second < b.second; });
int last_end = -1, count = 0;
for (auto &a : activities){
if (a.first >= last_end){
count++;
last_end = a.second;
}
}
Logic: The earlier an activity finishes, the more space we leave for others.
Advantages and disadvantages
| Advantages | Disadvantages |
|---|---|
| Very fast (often linear time complexity) | No guarantee of global optimality without proof |
| Simple to implement | If it fails, the error can be significant |
| Low memory usage | Requires a specific problem structure |
Conclusion
Greedy algorithms are elegant and efficient, but require caution. If a local optimum does not guarantee a global optimum, more advanced techniques such as dynamic programming should be used.
Advanced Greedy: Covering the Entire Alphabet
In some problems, greedy is not used to pick a minimum or maximum value, but to track the completion of entire sets of elements.
A typical example involves strings where we want to monitor how many times we manage to "collect" all letters of the alphabet.
Idea:
- Traverse the string from left to right
- Keep track of which characters have been seen
- When we collect all
Kdistinct characters → perform a "reset" - The number of such completions directly affects the answer
set = empty set
count_sets = 0
for each character c in string:
add c to set
if set contains all K letters:
count_sets++
clear set
Key idea: every time we collect all K letters,
we know that we have "completed" one full segment.
Until we collect all letters, there is always some "missing piece" in the current segment. Only when the set becomes complete do we move to the next level — this is why the answer changes only after each full set is formed.
This technique is often used in problems involving:
- minimum length of a missing pattern
- number of required segments
- or coverage analysis of all possible elements
Instead of directly constructing the solution, we track how many times we manage to fully cover the set — the final answer is derived from that.
In the next example, we will see how this idea is applied to solve a real contest problem.
Why does greedy work here?
It may seem that we need to try all possible subsequences, which would be too slow (exponential time).
However, greedy works because the following property holds:
- If we have not collected all
Kletters → there exists a missing string of length 1 - When we first collect all
Kletters → all strings of length 1 become achievable - When we collect them twice → all strings of length 2 can be formed
In other words:
the number of complete alphabet passes determines the minimal length of a missing subsequence
Therefore, the greedy decision:
- “keep collecting until all K letters are obtained”
leads to a globally optimal solution.
Example 3: Smallest Missing Subsequence
We are given a long string S consisting of characters from an alphabet of size K. We want to find the length of the shortest possible string that is NOT a subsequence (not necessarily contiguous) of S.
This is a perfect problem for an advanced greedy approach based on set completion: to make the missing subsequence as long as possible, we must "spend" as many complete alphabet cycles as we can.
Local decision: every time we collect all K distinct characters,
we treat that segment as one completed "level".
Our answer is the number of completed levels plus one.
#include <iostream>
#include <string>
#include <set>
using namespace std;
// Function finds the minimum length of a subsequence that does NOT appear
int shortestMissingSubsequence(string text, int K) {
set<char> seen;
int completed = 0;
// Greedy: scan from left to right
for (char c : text) {
// Add current character to the set
seen.insert(c);
// Key check: if we have all K alphabet characters
if (seen.size() == K) {
completed++; // One full segment is used
seen.clear(); // RESET for the next level
}
}
// Result: number of completed levels + 1
// This is the smallest length of a string that we cannot guarantee exists as a subsequence
return completed + 1;
}
int main() {
// Example: DNA sequence with K=4 alphabet (A, G, C, T)
string text = "AGCTAGGCTACGTCA";
int K = 4;
// For this text:
// [AGCT] (1), [AGGCTA] (2), [CGT] (incomplete) -> 2 + 1 = 3
// The shortest missing subsequence has length 3 (e.g. "AAA" or "GGG")
cout << shortestMissingSubsequence(text, K) << endl;
return 0;
}
Complexity analysis: The time complexity is $O(N \log K)$ due to the use of `std::set`, which is highly efficient. Without this greedy insight, the problem would become exponential. The illustration below shows the scan-and-reset process over a long string, highlighting how completed alphabet levels are formed.
Algorithm Explanation: Smallest Missing Subsequence
The task is to find, for a given string, the length of the shortest string that is not its subsequence.
A subsequence is a string that can be obtained by deleting some characters without changing the relative order.
We do not directly search for the missing subsequence. Instead, we track how many times we are able to see all characters of the alphabet.
How does the algorithm work?
We traverse the string from left to right and maintain a set of distinct characters we have seen.
- Add each new character to the set
- When the set contains all K characters → we complete one full "cycle"
- We increase the counter and reset the set
Reset means we start collecting a new full set of characters — a new level of completion.
Connection to the result
If we manage to complete C full sets, it means:
- all subsequences of length ≤ C can still be formed
- but at least one subsequence of length C + 1 cannot be guaranteed
The smallest missing subsequence has length C + 1.
Why does the result change only after a full set?
Let’s observe what happens to missing subsequences as we scan the string.
- a → missing subsequences: b, c → length = 1
- ab → missing subsequence: c → length = 1
- abc → now we have all letters → missing subsequence becomes length = 2
We can see an important pattern:
- before collecting all letters → there always exists a missing subsequence of length 1
- after collecting all letters → no missing subsequence of length 1 exists anymore
Only after completing a full set of K characters do we "consume" all short missing subsequences, forcing us to move to longer ones.
What happens next?
The same process repeats:
- after the first full set → we move to length 2
- after the second full set → we move to length 3
Each completed set increases the minimum missing subsequence length by 1.
Step-by-step illustration
Let’s consider the string: abcbaca, with K = 3 (a, b, c)
- Step 1: {a} → result = 1
- Step 2: {a, b} → result = 1
- Step 3: {a, b, c} → full set → reset → result = 2
- Step 4: {b} → result = 2
- Step 5: {b, a} → result = 2
- Step 6: {b, a, c} → full set → reset → result = 3
- Step 7: {a} → result = 3
Intuition (most important)
Think of it like collecting sets of stickers:
- once you collect all letters → you complete one level
- each level guarantees that all shorter subsequences are already "covered"
The number of completed levels determines how "long" the unavoidable missing subsequences become.
Why is the algorithm efficient?
- we scan the string only once → O(N)
- we maintain a set of size K → O(K)
This type of approach is common when we do not construct the answer directly, but instead track structural properties of the input (here: how often we complete the full alphabet).
How to read this visualization?
The image shows the process of finding the shortest missing subsequence for the string abcabca when K = 3 (letters a, b, c).
We scan the string from left to right and split it into segments (levels), where each level contains all alphabet characters.
Level 1
In the first part we see the characters a, b, c. Once we collect all three letters, we complete the first full set.
- This means that all subsequences of length 1 exist
Level 2
Next, we again collect letters: b, a, c. We again have all letters → second full set is completed.
- Now we know that all subsequences of length 2 exist
Level 3 (incomplete)
At the end, only a remains, which is not enough to complete a new full set.
If we have 2 completed levels, it means that all subsequences of length ≤ 2 exist.
Therefore, the shortest missing subsequence has length 3.
What is a missing subsequence?
A missing subsequence is a string that cannot be obtained by deleting characters from the original string while preserving order.
In this case, at least one string of length 3 does not exist as a subsequence — and that is the answer.
Instead of searching for the missing subsequence directly, we track how many times we can collect all alphabet letters.
Solution: Smallest Missing Subsequence for Every Prefix
We solve this problem efficiently using the idea of "layering".
We scan the string and keep track of which characters we have seen. Whenever we collect all K distinct letters, we complete one full level.
For every prefix, the answer is: number of completed levels + 1
#include <iostream>
#include <string>
using namespace std;
// Array for tracking whether a character has been seen
bool seen[26];
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int N, K;
cin >> N >> K;
string S;
cin >> S;
int completed_levels = 0; // how many times we collected all K letters
int distinct_count = 0; // how many distinct letters in current set
for (int i = 0; i < N; i++) {
int c = S[i] - 'a'; // map character to index (0 to K-1)
// If we have not seen this character in the current set
if (!seen[c]) {
seen[c] = true;
distinct_count++;
}
// If we have collected all K letters
if (distinct_count == K) {
completed_levels++; // one full level is completed
// reset for the next level
distinct_count = 0;
for (int j = 0; j < K; j++) {
seen[j] = false;
}
}
// Result for current prefix:
// number of completed levels + 1
cout << completed_levels + 1;
if (i != N - 1) cout << " ";
}
cout << endl;
return 0;
}
How the algorithm works
Let’s consider the string: abcbaca with K = 3 (letters a, b, c)
- a → {a} → 1 letter → result = 1
- b → {a, b} → 2 letters → result = 1
- c → {a, b, c} → full set → reset → result = 2
- b → {b} → result = 2
- a → {b, a} → result = 2
- c → {b, a, c} → full set → reset → result = 3
- a → {a} → result = 3
Why does this work?
When we collect all K letters, it means that:
- all subsequences of length 1 exist
- combining multiple levels gives all subsequences up to length C
If we have C completed levels:
- all subsequences of length ≤ C exist
- but at least one subsequence of length C + 1 does not exist
The shortest missing subsequence has length C + 1
Algorithm complexity
- Time complexity: O(N)
- Memory complexity: O(K)
Instead of directly searching for the missing subsequence, we observe the structure of the string and count how many times we can "cover" all letters.
Mini Challenge
We are given activities with time intervals: (1,3), (2,5), (4,7), (6,9) Task: Select the maximum number of non-overlapping activities. Question: Which one do you choose first, and why?Greedy Optimizations with Accumulated Effects
In this lesson we cover a special class of problems where the greedy choice is not immediately obvious, but instead becomes clear through analyzing how effects accumulate over time.
These problems often look like simulation or dynamic programming, but in essence they reduce to a smart step-by-step choice + accumulation of effects across the process.
Key idea
Instead of simulating the entire process step by step, we focus on:
- how each decision affects future steps
- how effects accumulate over time
- and how we can precompute the total contribution
In this way, the problem is often reduced to a simple per-step optimization.
Visual intuition — effect accumulation
Imagine the following process over time:
Move: 1 2 3 4 5
|-----|-----|-----|-----|
Poison: +2 +2 +2 +2 +2
(accumulates over all future moves)
Direct attack: affects only the current move
□ Key idea:
- poison has a persistent effect
- direct attack has a instant effect
- therefore the decision is not local, but time-dependent
What a “naive” simulation looks like
for each move:
compute current damage
add new effects
re-scan all previous poison effects
❌ This is too slow because we repeatedly recompute the same contributions.
How we think efficiently
Instead of simulation, we consider:
- the total contribution of each move in advance
- how valuable each choice is on a global level
This leads to the following transformation:
- time simulation problem → contribution aggregation problem
Connection to dynamic programming
Although the solution looks like greedy, behind it lies a DP idea:
- we consider the optimal result for the first k moves
- each step depends on previous decisions
However, we do not need to explicitly store a DP table because:
- the state can be compressed into cumulative values
- transitions can be replaced by a mathematical formula
□ This is a very common pattern in competitive programming: DP idea → optimized into a greedy/mathematical solution
Mini intuition for students
Remember the following rules:
- If an effect lasts over multiple steps → you cannot evaluate only the current move
- If effects accumulate → try to “pull them out of loops”
- If you see simulation → ask whether a formula exists instead
Typical problem patterns
- poison / damage-over-time effects
- buff/debuff systems in games
- cumulative contribution of decisions
- offline computation problems
Conclusion
Advanced greedy algorithms are not just about “pick the best now”, but rather:
transform the problem so that global effects can be computed via local decisions
This is exactly what separates basic greedy problems from competitive programming-level greedy techniques.
Independent Practice Problems
Try applying the greedy techniques you learned to new problems on your own.
These problems are designed to help you think like in competitive programming: local decisions, optimization, and reasoning about correctness.
Try to solve each problem without looking at the solution first. The focus is not only on implementation, but on problem-solving intuition.
Task 1: Elden Ring — Minimum Number of Moves
You are playing the latest installment of the popular game "Elden Ring". There are Q bosses, where the i-th boss has Hi health points. To complete the game, you must defeat all of them.
Each boss fight lasts at most N moves. Every boss is fought over the same N moves. During the k-th move, you have two options:
-
Direct attack — immediately deals
Dkdamage. -
Poison attack — from that move onward (including the current one),
the boss loses
Pkhealth after every move.
Poison attacks stack, so after each move the boss loses the sum of all active poisons so far.
If after any move the boss has at most 0 health points, it is defeated immediately.
For each boss, output the minimum number of moves required to defeat it.
If it is not possible within N moves, output -1.
Instructions — how to think
At first glance, it seems we need to simulate all N moves for each boss and track health changes step by step.
This is too slow because both N and Q can be large.
So we must reformulate the problem.
Key idea
Consider a fixed time t. We ask:
What is the maximum damage we can deal after exactly t moves?
For each move k, we choose between:
- direct attack:
Dk - poison attack:
Pkapplied on every move fromktot
If we choose poison at move k, its total contribution until time t is:
P[k] * (t - k + 1)
If we choose direct attack instead:
D[k]
So the gain of poison over direct attack is:
P[k] * (t - k + 1) - D[k]
We rewrite this into a linear form:
P[k] * t - (P[k] * (k - 1) + D[k])
Thus we precompute:
A = P[k]B = P[k] * (k - 1) + D[k]
and contribution becomes:
A * t - B
Visual intuition
k-th move:
Direct attack: D[k] (one-time damage)
Poison: P[k] (repeated every move until the end)
Timeline:
k → k+1 → k+2 → ... → t
+P +P +P +P
Poison is a persistent effect, while direct attack is instant.
Break-even point
Poison becomes better when:
A * t - B >= 0
which implies:
t >= B / A
So each poison move becomes useful only after some threshold time.
If P[k] = 0, it is never useful.
Efficient computation
We maintain:
A— sum of active poison ratesB— sum of constants
Then extra damage at time t is:
t * sum_A - sum_B
Total damage:
prefix_direct[t] + extra_damage
DP intuition
This looks like a DP over time, but we avoid full DP by:
- precomputing contributions
- evaluating damage per
t - binary searching the answer per boss
Algorithm steps
- read all moves
- build prefix sum of direct damage
- compute poison activation thresholds
- sort by threshold
- compute max damage for each
t - binary search for each boss
Complexity
O(N log N + Q log N)
#include
using namespace std;
using ll = long long;
struct Potez {
ll A;
ll B;
double prag;
};
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
int N, Q;
cin >> N >> Q;
vector direct(N + 1), poison(N + 1);
for (int i = 1; i <= N; i++) {
cin >> direct[i] >> poison[i];
}
vector prefix(N + 1, 0);
for (int i = 1; i <= N; i++) {
prefix[i] = prefix[i - 1] + direct[i];
}
vector cand;
for (int k = 1; k <= N; k++) {
if (poison[k] == 0) continue;
ll A = poison[k];
ll B = poison[k] * (k - 1) + direct[k];
double prag = (double)B / A;
cand.push_back({A, B, prag});
}
sort(cand.begin(), cand.end(),
[](auto &a, auto &b){ return a.prag < b.prag; });
vector best(N + 1, 0);
ll sumA = 0, sumB = 0;
int ptr = 0;
for (int t = 1; t <= N; t++) {
while (ptr < (int)cand.size() && cand[ptr].prag <= t) {
sumA += cand[ptr].A;
sumB += cand[ptr].B;
ptr++;
}
best[t] = prefix[t] + sumA * t - sumB;
}
for (int i = 1; i <= N; i++)
best[i] = max(best[i], best[i - 1]);
while (Q--) {
long long H;
cin >> H;
int l = 1, r = N, ans = -1;
while (l <= r) {
int m = (l + r) / 2;
if (best[m] >= H) {
ans = m;
r = m - 1;
} else l = m + 1;
}
cout << ans << "\n";
}
return 0;
}
Explanation of the solution
For every t, we precompute the maximum damage we can deal.
This means we decide for each move whether to:
- use direct attack
- or activate poison that becomes more valuable over time
Since poison contribution is linear in t, it is enough to know:
- when it becomes beneficial
- how much it contributes when active
Then the problem reduces to:
- preprocessing candidates
- computing damage for each
t - binary searching the answer for each boss
Why is this efficient?
We never simulate fights individually. Instead, we compute once how much damage we can do in 1..N moves. Then each boss query is answered with binary search.
This makes the solution fast enough for all constraints.
Key intuition
- direct attack = immediate effect
- poison = accumulating effect over time
- so decisions must consider future impact, not just current move
This is a classic case where naive simulation suggests DP, but careful analysis leads to a much faster greedy-based solution.