Link: https://leetcode.com/problems/minimum-cost-to-convert-string-ii

Solution:

Topics: Dijkstra’s, DP

Intuition
This is a very challenging problem, and I honestly could not come up with a way to approach it. Fortunately, after reading some solutions I found an approach that makes sense to me.

Transformations will be referred to as the tuple (original, changed, cost)

Lets think about this problem on a high level. Clearly this will involve a search and we will be branching out with all the legal transformations for that position. So what is a legal transformation in this context? A legal transformation will be one that can convert a substring starting at a particular index into the target substring of that same range.

Lets look at a base-case:

trans = [('a', 'b', 1)]
source = a
target = b

We can see ('a', 'b', 1) is a legal transformation here because it will have converted the source into the target at cost 1. 

Of course there is more to this problem, because we could potentially need many transformations to turn source into target.

For example:

trans = [('a', 'b', 1), ('b', 'a', 1)]
source = ab
target = ba

In this case, we will have to use both transformations in order to convert source into target at a total cost of 2. 

However, we can imagine that there are multiple ways to convert source into target, and we are interested in the one with the lowest cost!

For example:

trans = [('a', 'b', 1), ('b', 'a', 1), (ab, ba, 1)]
source = ab
target = ba

In this case, we can see that using the third transformation is the most optimal.

So the idea here is to look at each substring from i to len(source), and determine the lowest cost of converting source[i:j] into target[i:j] if such a transformation exists! Or in other words, we are looking for the cheapest path between source[i:j] and target[i:j]. If such a path exists, we can branch off and shift i to j+1 because everything up until i would be considered transformed.

We can use either Dijkstra’s or optimized BFS to find the shortest path in a weighted graph! But first lets understand how exactly we can build this graph. We would generally use an adjacency list, but there is one important caveat. There could me many legal transformations that are useless to us. Why?

Because of this:

trans = [('a', 'b', 1), ('a', 'b', 2)]

If we look at the two transformations here, we can see that the second one is useless to us because its the same transformation as the first, but with a higher cost, so it should be discarded. 

So essentially, as we build our adjacency list, we must keep the minimum cost for duplicate transformations if the transformation already exists in the list! What this means is that using a list to store neighbouring nodes is very inconvenient because we potentially need to update transformations with smaller costs, and we want to do this in o(1) time.

For example:

trans = [('a', 'b', 2), ('a', 'c', 1), ('a', 'c', 1)]

i = 0
adj = {
	'a': [('b', 2)]
}

i = 1
adj = {
	'a': [('b', 2), ('c', 1)]
}

i = 2
adj = {
	'a': [('b', 2), ('c', 1)]
	            ^ we must iterate through the whole list (potentially)
		          to find 'b' and update its new cost to 1. 
}

The logical solution to avoid o(n) lookups is to store neighbours in a hash map, that way we can lookup neighbours in o(1) time and update costs. We can convert these neighbours back to a list of tuples, but it will not make much difference. So we get to our first block of code:

Building the list:

adj = {}
for parent, neighbor, price in zip(original, changed, cost):
	if parent not in adj:
		adj[parent] = {} #init the hash map
		
	if neighbor in adj[parent]: #found duplicate transformation
		adj[parent][neighbor] = min(adj[parent][neighbor], price) #take the min
	else:
		adj[parent][neighbor] = price

Now, we have a graph that can be used to run a Dijkstra’s between two nodes, so lets write a function that returns the minimum cost between two strings, or infinity if no such connection is possible.

Dijkstra:

def min_cost(start, end):
	queue = [(0, start)] #init the heap with cost = 0 and start string.
	costs = {start: 0} #init the costs with the same
	while queue:
		cost, curr = heappop(queue)
		if curr == end:
			return cost
			
		if curr not in adj:
			continue #no transformations
			
		for neighbor, price in adj[curr].items(): #iterate through neigbors
			new_cost = cost + price
			if neighbor not in costs:
				costs[neighbor] = float('inf')
			if new_cost < costs[neighbor]:
				costs[neighbor] = new_cost
				heappush(queue, (new_cost, neighbor))
				
	return float('inf')	

Now we can build our recursive DP function and use our Dijkstra function to verify (and get min cost of) valid transformations:

@cache
def dfs(i):
	if i >= len(source):
		return 0
	cost = float('inf') if source[i] != target[i] else dfs(i+1)
	for j in range(i+1, len(source)+1):
		sub_source = source[i:j]
		sub_target = target[i:j]
		min_c = min_cost(sub_source, sub_target)
		if min_c != float('inf'):
			cost = min(cost, min_c + dfs(j))
	return cost

In the above, we are checking every substring to find a valid transformation, but we can optimize this greatly with a trie! The key is to find all valid transformations in that range, and we can do it much faster by converting all transformations into trie! As it happens, we are already iterating through all the transformations when we build our graph, so we can modify that code to build our trie.

trie = {}
adj = {}
for parent, neighbor, price in zip(original, changed, cost):
	if parent not in adj:
		adj[parent] = {} #init the hash map
	if neighbor in adj[parent]: #found duplicate transformation
		adj[parent][neighbor] = min(adj[parent][neighbor], price) #take the min
	else:
		adj[parent][neighbor] = price
 
	curr = trie
	for char in parent:
		if char not in curr:
			curr[char] = {}
		curr = curr[char]
	curr['*'] = parent

And now lets implement trie logic into the DP function:

@cache
def dfs(i):
	if i >= len(source):
		return 0
	cost = float('inf') if source[i] != target[i] else dfs(i+1)
 
	curr_trie = trie
	sub_target = ''
	for j in range(i, len(source)):
		char = source[j]
		sub_target += target[j]
		if char not in curr_trie:
			break
		if '*' in curr_trie[char]:
			sub_source = curr_trie[char]['*']
			min_c = min_cost(sub_source, sub_target)
			if min_c != float('inf'):
				cost = min(cost, min_c + dfs(i+len(sub_source)))	
		curr_trie = curr_trie[char]
	return cost

Now putting it all together…

Implementation

trie = {}
adj = {}
for parent, neighbor, price in zip(original, changed, cost):
	if parent not in adj:
		adj[parent] = {} #init the hash map
	if neighbor in adj[parent]: #found duplicate transformation
		adj[parent][neighbor] = min(adj[parent][neighbor], price) #take the min
	else:
		adj[parent][neighbor] = price
 
	curr = trie
	for char in parent:
		if char not in curr:
			curr[char] = {}
		curr = curr[char]
	curr['*'] = parent
 
def min_cost(start, end):
	queue = [(0, start)] #init the heap with cost = 0 and start string.
	costs = {start: 0} #init the costs with the same
	while queue:
		cost, curr = heappop(queue)
		if curr == end:
			return cost
			
		if curr not in adj:
			continue #no transformations
			
		for neighbor, price in adj[curr].items(): #iterate through neigbors
			new_cost = cost + price
			if neighbor not in costs:
				costs[neighbor] = float('inf')
			if new_cost < costs[neighbor]:
				costs[neighbor] = new_cost
				heappush(queue, (new_cost, neighbor))
				
	return float('inf')	
 
@cache
def dfs(i):
	if i >= len(source):
		return 0
	cost = float('inf') if source[i] != target[i] else dfs(i+1)
 
	curr_trie = trie
	sub_target = ''
	for j in range(i, len(source)):
		char = source[j]
		sub_target += target[j]
		if char not in curr_trie:
			break
		if '*' in curr_trie[char]:
			sub_source = curr_trie[char]['*']
			min_c = min_cost(sub_source, sub_target)
			if min_c != float('inf'):
				cost = min(cost, min_c + dfs(i+len(sub_source)))	
		curr_trie = curr_trie[char]
	return cost
 
res = dfs(0)
return res if res != float('inf') else -1
 
#time: o(n*m*(mlogm)) n=len(source) m = len(transformations)
#memory: max(o(n), o(m))

Mnemonic
Think one string at a time. You gave a graph of strings and we want the shortest path in the weighted graph between start_string and end_string! ([Dijkstra’s])

Visual

Review 1
Truly an awesome problem! I found the right approach and every optimization fairly quickly but I missed trie search to optimize the DP! Very pleased with my progress on this problem. This problem is pretty close to Word break II.

review
hard
insane