Link: https://leetcode.com/problems/longest-duplicate-substring/

Solution:

Topics: binary search, rabin-karp

Intuition
Very tricky problem. Binary search is used because if there exists a duplicate substring of length n, then there is guaranteed to be a duplicate substring of length n-1. The idea is to perform binary search with a sliding window to check if there exists a duplicate substring.

The optimal solution requires the rabin-karp algorithm because a simple sliding window and set requires slicing the string, and slicing is o(length) time complexity. Rabin-karp allows us to slice and hash a substring in O(1) time.

Rabin-karp is pretty complicated to code by hand, but python provides a very nice class memoryview that allows for O(1) slicing! The reason being is that when you slice the memoryview object, it does not create a new object, only a reference to the underlying data.

Implementation

def longest_dup(s):
	def has_dup(length):
		has = set()
		l = 0
		for r in range(length, len(s)+1):
			curr = s_data[l:r]
			if curr in has:
				return s[l:r]
			has.add(curr)
			l += 1
		return ''
 
	s_data = memoryview(s.encode()) #remember this!
	res = ''
	l = 1
	r = len(s)-1
	while l <= r:
		mid = (l + r) // 2
		dup = has_dup(mid)
		if len(dup):
			l = mid + 1
			res = dup if len(dup) > len(res) else res
		else:
			r = mid - 1
	return res
	
#time: o(nlogn)
#memory: o(n)

Visual

Review 1
Nice problem. I actually came up with a pretty clever and somewhat o(n*n) solution using a trie search…unfortunately this approach results in MLE. I forgot that we must use rabin-karp and binary search for this class of problems. Use memoryview(s.encode()) for o(1) slicing. TODO: learn how to implement rabin-karp rolling hash from scratch.

review
hard