Link: https://leetcode.com/problems/minimum-deletions-to-make-character-frequencies-unique/

Solution:

Topics: hash map

Intuition
The basic approach to this problem is understanding that a duplicate frequency must be decremented (or deleted) until we reach a frequency that has not been seen.

For example:

s = aabbccccdddd
freq = {
	a:2,
	b:2,
	c:4,
	d:4
}

We can see that counts of 2 and 4 are repeated once, so they must be reduced. How can they be reduced?

Consider the set of seen frequencies: {2, 4}. Notice that there are two frequencies that are available...1 and 3. 

Optimally, the character 'c' or 'd' can be recuced to 3 and the character 'a' or 'b' can be reduced to 1. Observe that it would be far less optimal to reduce 'c' to 1, as that would result in spurious deletions. 

We should only reduce to zero (full deletion) if there are no available frequencies.

NOTE: The problem states that we can only delete, so while its also possible to make the frequencies unique by incrementing, this operation is forbidden. 

Implementation

def min_deletions(s):
	freq = {}	
	for char in s:
		freq[char] = freq.get(char, 0) + 1
		
	counts = freq.values()
	seen = set()
	res = 0
	for count in counts:
		if count not in seen:
			seen.add(count)
		else:
			while count and count in seen:
				count -= 1
				res += 1
			if count:
				seen.add(count)
	return res
 
#time: o(n)
#memory: o(n)

Visual

Review 1
This problem “got” me again. My mind initially went to sorting be frequency and then iterating backwards and decrementing by one when the next frequency is equal to the current. This does not work. Why?

counts = [2,2,2]
            ^       the next value is also 2, so lets make this 1. 

		 [2,1,2]
		  ^        the next value is 1, so this is fine?

No, because we have just spoiled our sorted order! There are still duplicates in the string! If we wanted to do it this way, we could use a strictly decreasing monotonic stack!

Implementation (stack)

res = 0
freq = {}
for char in s:
	freq[char] = freq.get(char, 0) + 1
 
counts = sorted(freq.values())
stack = [counts.pop()]
 
while counts:
	curr = counts.pop()
	while curr and curr >= stack[-1]:
		res += 1
		curr -= 1
	stack.append(curr)
 
return res

Or we just keep a used_freq set and decrement as needed. This is the same implementation as in my first note but this one is cleaner:

Implementation (set)

res = 0
freq = {}
for char in s:
	freq[char] = freq.get(char, 0) + 1
	
used = set()
for count in freq.values()
	while count and count in used:
		count -= 1
		res += 1
	used.add(count)
	
return res

Review 1
Interesting problem! My analysis above is partially wrong! You actually can just sort the frequencies and iterate backwards while incrementing the result if count[i] == count[i+1] and then set count[i] = count[i+1]-1! The important edge is if count[i] > count[i+1]. How can this happen if you sorted counts?

s = abc
counts = [1, 1, 1]

after setting counts[1] to 0 we get:

counts = [1, 0, 1]

So in the case that counts[i] > counts[i+1] we must add the entire value of counts[i] to the result and set counts[i] = counts[i+1]-1…in the case that counts[i+1] = 0, we add the entire value of counts[i] to the result and set it to 0!

Of course the hash map solution is more clever and more efficient, but this was not the first thing that came to mind. Pretty hard!

review
hard