This post explains how to solve leetcode problem – 21. merge two sorted list – using python 3. Let’s take a look at an example. Let’s say there are two lists: 1 -> 3 -> 5 and 2 -> 4. After merging those lists, the expected answer is 1 -> 2 -> 3 -> 4 -> 5. You are supposed to merge the list in sorted order. This problem doesn’t require any tricks but you just need to be able to operate on iterating and merging the list. With that said, the algorithm is very simple. 1. iterate nodes in each list at the same time until at least one list is completely consumed 2. push the smaller node to the result list (merged) 3. After #1, 2, if there is any left over from either of the list, just append to the result list

Let’s take a look at working solution here. In the solution below, I used a dummy node so that I can blindly set the next pointer without having to worry about if it’s first node or not. You just need to make sure not to return the dummy node.

def mergeTwoLists(self, list1: Optional[ListNode], list2: Optional[ListNode]) -> Optional[ListNode]:
merged = ListNode(-1)
iter = merged
while list1 and list2:
if list1.val < list2.val:
iter.next = list1
list1 = list1.next
else:
iter.next = list2
list2 = list2.next
iter = iter.next
if list1:
iter.next = list1
elif list2:
iter.next = list2
return merged.next

As you see in the solution above, I am iterating the two lists at the same time, setting the smaller value node to the result list. Once at least one list is completely consumed, I need to check if there are any leftovers and append them to the result.

Time complexity is O(n + m) where n is the length of list1 and m is the length of list2. Space complexity is O(1) since I don’t need any extra space.

This post explains leetcode problem 206. reverse linked list using python 3. For example, let’s say there is a linked list: 1 -> 2-> 3-> 4 -> 5. After reversing the list, the expected result is 5 -> 4 -> 3 -> 2 -> 1. This is a fairly easy problem and there is a couple of ways to solve this.

Using stack

This method uses extra space (stack) as the name indicates. The algorithm is: 1. As iterating the original linked list, create the node and push it to the top of the stack. Note that the next pointer should be None 2. After #1, start popping the node from the stack. As you pop the node, connect it with previous (or next depending on your implementation) 3. After #2, you got the reversed linked list

Time complexity: O(n) Space complexity: O(n) because you have to store all the nodes to the stack

This method is very simple and has pretty decent time complexity. However, even though time complexity is O(n), you are iterating each node twice: 1st when iterating the list, 2nd when popping from the stack. In addition to that, it also uses O(n) space complexity.

The question is, can we do better?

Optimized linked list reversal

We can actually reverse the list in a single traversal without using any extra space. Since code is quite simple, I will just show you the code first then explain.

def reverseList(self, head: Optional[ListNode]) -> Optional[ListNode]:
reversed_list = None
while head:
next = head.next
head.next = reversed_list
reversed_list = head
head = next
return reversed_list

As you are iterating the original list, all you need to do is push the current node to the head of the new list. reversed_list is the head of the new list. While you are iterating the original list, save the next node because you are going to pop the current node from the original list and push to the head of the new list (line 4). At line 5,6 you are pushing the current node to the new list. Because you are setting the next pointer to the head of reversed_list, you are automatically disconnecting the node from the original list. At the line 6, it’s making sure to set the new head. Then you keep iterating by setting head to next node which was previously saved.

Time complexity is O(n) which is better than the method above because it only iterates the list one time. Space complexity is O(1) because it doesn’t require any extra space.

This post explains what minimum spnning tree is and how to construct minimum spanning tree using Kruskal’s algorithm

What is minimum spanning tree

Given you have a list of nodes in a graph and weighted edges connecting the nodes. What is the minimum cost to connect the nodes that all the nodes are connected? For example, looking at the graph below, all the nodes are connected. However, some edges can be removed so that total weight of the graph can be minimized and still connecting all the nodes. Minimum spanning tree is a graph that all the nodes are connected with the minimum cost of edges.

There are some characteristics of minimum spanning tree. 1. Any subset of solution cannot have cycle. Cycle means there is a redundant connection that can be dropped 2. Need to span all nodes 3. connected

How to construct minimum spanning tree?

Given a list nodes and weighted edges, how to construct minimum spanning tree? In this section, we will use Kruskal’s algorithm – greedy approach. Here is the algorithm.

Start with lists of nodes and edges

At each step, add the cheapest edge to the tree

Skip the edge if it creates the cycle

The algorithm itself is surprisingly simple and you might wonder if this actually results in a correct minimum spanning tree. Since greedy algorithm is typically wrong for many problems, it requires proof that the algorithm is correct. I will explain the proof later in the post.

Now, let’s look at the code. The key part of the algorithm is 1. Need to be able to select the cheapest edges, which requires the edges to be sorted based on the cost 2. Need to be able to detect if the given edge creates any cycle

#1 is fairly easy since you just need to sort the algorithm. But what about detecting a cycle? Would you you typical BFS or DFS for this? Running BFS or DFS every time you add the edge could be quite expensive as it will always result in quadratic time complexity. Union Find will be used in this algorithm since time complexity of the union find is practically O(n). Please refer to this post if you are unfamiliar with it. Pseudo code is used for simplicity

// n is a list of nodes from id 0 to n-1
n = [0,1,2,...n-1]
// edges is a list of edges. u is source, v is dest, w is weight
// assume any n log n sorting algorithm is used to sort the edges in ascending order of w
edges = [(u,v,w)...]
// parents are sizes are used for union find. Refer to the union find post above
parents = [0,1,2,...,n-1]
sizes = [1,1,1,...1]
total_cost = 0
num_components = n
mst = []
for u,v,w in edges:
rootU = find(u)
rootV = find(v)
// if rootU and rootV are same, adding this edge will create a cycle
if rootU != rootV:
if sizes[rootU] < sizes[rootV]:
parents[rootU] = rootV
sizes[rootV] += sizes[rootU]
else:
parents[rootV] = rootU
sizes[rootU] += sizes[rootV]
// decrease every time connecting different components. This will check if all the nodes are connected
num_components--
// keeping track of the total cost
total_cost += w
// edges used in minimum spanning tree
mst.add((u,v))
// not all nodes are connected
if num_components != 1:
return -1
// found minimum spanning tree
print (total_cost, mst)

Following things are happening in the above code.

Iterating all the edges which is sorted in ascending order of weights

Using union find, check if adding the edge will create any cycle

If creating a cycle, skip

else add the edge to the graph

If all the nodes are connected (checked by num_components), minimum spanning tree is found

Time complexity of the algorithm: 1. O(n log n) for sorting the edges. Assume the edges are sparse 2. O(n) running the edges using union find Total Time complexity is O(n log n)

Proof of Kruskal’s algorithm

I will prove the algorithm using proof by contradiction.

Suppose I constructed minimum spanning tree using Kruskal’s algorithm (greedy approach) – I will call this K. And suppose there is another minimum spanning tree that has less total cost – I will call this B. Now, looking at all the edges used in both K and B, it will look like this K1, K2, K3, K(n-1) edges for graph K, sorted in ascending order B1, B2, B3, B(n-1) edges for graph B, sorted in ascending order

Since we are claiming graph B is better than graph K, there will be some i-th edge that is different between K and B. Let’s call that K(i) and B(i). It means all the edges until i-1 are the same. There are two cases possible.

Case 1. K(i) > B(i) At K(i), let’s say the edge value is 12 and edge value of B(i) is 9. Is this actually possible? Up to this point, all the edges are same. Now given I add edge in increasing order, the fact K(i) edge value 12 and B(i) edge value 9 means I didn’t add edge 9 to K. What would be the case I didn’t add it? The only case I skip the edge is when it creates a cycle. If you added all the edges up to this point and add edge 9, it creates a cycle which means the edge is redundant. Therefore, this case is not possible.

Case 2. K(i) < B(i) Now let’s say edge value of K(i) is 9 and B(i) is 12 instead. Given all the edges up to this point exactly matching, adding edge 9 to K is the best choice because it doesn’t create a cycle. Since the edges are sorted in ascending order of weights, there isn’t any way graph B will be better than graph K. So adding edge 9 to K is the best choice indeed.

As you see in both cases, Kruskal’s algorithm actually works. The implementation is simple yet works perfectly using union find.

There are two methods in order to traverse a graph – BFS, DFS. Let’s say there is a graph problem to count the number of connected components. Here are the steps you can take. 1. Build the graph 2. Perform BFS or DFS 3. Outer loop to check each unvisited node 4. Number of calls to BFS or DFS is equivalent to the number of connected components because BFS or DFS will mark the connected component as visited

Time complexity would be O(V+E) where V is the node and E is the edge. It is linear complexity.

This is a simple problem with a simple solution: BFS or DFS works perfectly.

Now, let’s tweak the problem a little bit. Let’s say there is a stream of edges instead of the entire graph from the beginning. And you are asked to provide the number of connected components as the new edges are coming in. You have to perform BFS or DFS every time new edges come in because the new edge may or may not decrease the number of connected components. Let’s say there is an edge connecting node between u and v. Two possibilities of this edge is 1. if u, v belong to the same connected component, the number of connected components is unchanged 2. if u, v belong to the different connected components, the number of connected components will decrease by one because the component which u belongs will be connected to component which v belongs (or vice versa)

Time complexity of naive approach would be O(V+E) * O(E) = O(VE + E^2). It would be quadratic regardless this is dense or sparse graph.

The question we ask is can there be a better way to solve the problem?

Naive approach of union find

Let’s have an array to keep track of its component ids. The size of the array will be the size of the nodes each node will be assigned a component id it belongs. With the setting, when new edges come in, we look up the table and perform the following actions. 1. if they (two nodes in the edge) belong to the same component (have the same component id), then skip. 2. if they belong to different component, then update the component id.

Here is the psuedo code for the updating component id

componentid = array(size V), initialized to node id (node id will be component id)
for each edge(u,v):
if componentid[u] != componentid[v]: // find operation
cid = componentid[u]
for each i in 0 to n-1: // n is number of nodes. line 6-8 is merge (or union) operation
if componentid[i] == cid:
componentid[i] = componentid[v]

For each edge, number of union operation is n-1. The total cost of union operation would be O(VE) which is O(V^2) assuming this is sparse graph. For each edge, number of find operation is E and the total cost of find operation would be O(E) since find operation itself is O(1). All together, the total cost would be O(V^2) + O(E) = O(V^2). The code is simpler yet the complexity is still the same because union operation is expensive. Is there any way to improve union operation?

Optimized union operation

If you think about it, we actually don’t need to iterate every single node in a component and update the id. Let’s say there is a root node in a component, then we only need to update the root node component id. Let’s take a look at the example below.

In the above graph, there will be an edge between u and v nodes. Now, when updating connected component, you only need to update component id of root node of u. Each find operation will look up the component id of root node then update. How do you find the root node of component? Just keep looking at parents where node id == parent[node id]. Find operation will not be O(1) but union operation will be O(1) now because you always update 1 node for the operation. Here is the optimized version of pseudo code.

for each edge(u,v) in edges:
rootU = find(u)
rootV = find(v)
if rootU != rootV:
componentid[rootV] = componentid[rootU]
function find(x):
while componentid[x] != x:
x = componentid[x]
return x

Let’s look at the complexity of the operation. The cost of each union operation is O(1) and there are O(E) edges. Total cost of union operation is O(E). The cost of find operation is O(V) in worst case (where the tree is like a linked list, single chain) and there are O(E) edges. The total cost of find operation in worst case would be O(VE) which is O(V^2) assuming sparse graph.

We are heading to the correct direction but the time complexity didn’t improve which means there needs further optimization.

Union find with merging rules

If you blindly merge one tree to another, you might end up a single chain, linked list. The best case of tree structure would be a balanced tree where the height is O(log n). How can we achieve almost balanced tree to make find operation O(log n)?

Here is the claim of the rule: When merging together two components, you will achieve O(log n) if you make the root of the smaller component point to the root of the larger component.

Let’s prove if the claim is correct. Let’s say there are two connected components (trees) – u and v

With the above setting, we need to prove: h <= log(N(u) + N(v)) where h is height of the new component (tree) This will be a proof by induction.

Base case: 1. All nodes are disconnected: since all nodes are disconnected there is only one node per component. height of the component is 0 and it is true that 0 <= log 1 (only 1 node in the tree) 2. There is an edge between two nodes from step 1. In this case, h=1 and n=2. Claim is still true after first union operation.

After the above steps, there are two possible cases 1. two disconnected nodes are joined. Same as 1st step above 2. 2 nodes and 1 node join which becomes n=3, h=1 and 1 <= log3 according to the rule.

Assuming the claim is true after i union operations, we need to prove: h <= log n after i+1 union operation. Let’s note height of V tree as H(v) and height of U tree as H(u). There are two possible cases of the trees (u and v) where H(v) > H(u) or H(v) < H(u). With that said, the height of newly joined tree will be like this: h = max(H(v), H(u)+1). Note that number of nodes in V are greater than u here.

Case 1. H(v) > H(u). In this case, joining u tree to v doesn’t change the height and new height of the result tree is still H(v). We know that H(v) <= log N(v) (induction hypothesis) which also makes the following statement true H(v) <= log (N(v) + N(u)). New height h is H(v) so finally this guarantees h <= log(N(v) + N(u)).

Case 2. H(v) < H(u). h (new height) = H(u) + 1. based on induction hypothesis, we know H(u) <= log N(u) is true. Let’s add some tweak to the hypothesis: H(u) + 1 <= log N(u) + 1. H(u) + 1 is new height h. log N(u) + 1 would be log 2N(u). Then, it leads to this equality: h <= log 2N(u). Given N(u) < N(v), this is also true: h <= log (N(u) + N(v)), which guarantees new height will be lower than log (N(u) + N(v)).

Based on the above proof, we know that the height will be always balanced if two trees are merged according to the rule.

With that said, the total complexity will be (n-1) * cost of union op + E * cost of find op = (n-1) * O(1) + E * O(log n) = O (n log n)

Here is the pseudo code based on all the optimization

parents = a 1D array of size n
sizes = a 1D array of size n
for i in 0 to n-1:
parents[i] = i
sizes[i] = 1
for (u, v) in edges:
rootU = find(u)
rootV = find(v)
if rootU != rootV:
if sizes[rootU] < sizes[rootV]:
parents[rootU] = rootV
sizes[rootV] += sizes[rootU]
else:
parents[rootV] = rootU
sizes[rootU] += sizes[rootV]
function find(x):
while parents[x] != x:
x = parents[x]
return x

Time complexity is O(n log n), slightly slower than DFS or BFS but you don’t need to build graph for traversal. In addition to that, when graph changes dynamically, union find is better than DFS or BFS

Final optimization – Path Compression

There is one final optimization. You might have noticed that we are traversing to find the parent node every time we merge. However, what if we update the parent node id to the root node after traversing? It is slightly more work in the beginning but will save time for next root node search. In other words, we can compress the path (flattening the tree). The updated code will use recursion for simplicity.

function find(x):
// base case
if x == parents[x]:
return x
// recursive case
rootX = find(parent[x])
parent[x] = rootX
return rootX

Time complexity of the find is still O(log n). However, with many traversals, this practically makes it O(1)

Conclusion

This is it for the union find. DFS or BFS is typically used for many graph problems. However, there are cases when DFS or BFS doesn’t work perfectly and Union Find can actually improve for the better performance with simple code. I hope this helped you understand union find better. Thank you for reading.

In this post, solution to leetcode problem – 518, coin change II – is explained. Please refer to this link for detail problem description.

This is very similar to combination sum IV problem. The only difference is that the order doesn’t matter and this is essentially a subset problem with duplicates. Let’s see how we can approach the problem.

Brute Force

Exhaustively enumerate all subsets and check if it makes the amount. This will have exponential cost.

Optimized version

We can follow the similar approach as combination sum IV problem. The only difference is that order doesn’t matter and if you blindly add results from previous coins, you will end up counting the duplicate as you see in the diagram below.

In the above diagram, the recurrence relation is simply adding the results from previous amount with all the coin denominations. How can we avoid counting the duplicate? Instead of looking at all the coins for each amount, we can control how many coins each node has. For example, each node is responsible for keeping track of the number of coins for only one coin. Let’s look at the diagram below.

Now, it is not counting the duplicate and you just need to add up all the nodes with valid combinations to get the final result. The recurrence relation would be f(i, a) = sum(f(i-1, a – c*coins[i])) where i is coin index, a is amount, c is number of coins. The range for c is 0 to max coins. How do you calculate max coin? It is simply amount / coins[i]. As there are overlapping subproblems, we can use dynamic programming to solve this. For this, as there are two variables, we would need two dimensional array. row represents coin index and col represents amount. (you can change row/col if you want). What’s the base case? For the first column, when amount = 0, there is only one combination to make the amount – empty subset. For the first row, when no coins are used – I remember I said row is coin index, but since index makes problem slightly more difficult, let’s say row is ith coin that 0th coin means no coin, and the first coin is the first coin in coins – there isn’t any combination possible to make any amount. So everything is 0 except the first column. Although there are more optimizations possible, let’s take a look at the pseudo-code

table: 2d array of row (num coins + 1) col (amount + 1)
initialze first column to 1. everything else is 0
for i in 1 to n: // n is number of coins
for a in 1 to amount:
maxCoins = a / coins[i-1] // note that i is index of table. to access corresponding coin, need to subtract 1
// check all the possible number of coins can be included
for c in 0 to maxCoins:
table[i][a] += table[i-1][a - c*coins[i-1]]
return table[n][amount]

Solution is very simple but look at the time complexity. We already have a 2D table and there is an additional for-loop for each cell in the table. Can we do better?

More optimized version

If you closely look at the code, we can eliminate the inner for-loop (c) while achieving the goal. Instead of checking possible number of coins in the for-loop, let’s look at from exclusion/inclusion point of view. At each amount and coin[i], we have a choice – exclude or include. If you exclude, then you check the next coin. If you include, you have another choice – exclude or include the same coin again.

With that, we have a better time complexity. Here is the code written in C++.

int change(int amount, vector<int>& coins) {
vector<vector<int>> table(coins.size()+1, vector<int>(amount+1, 0));
for (int i = 0; i <= coins.size(); ++i)
{
table[i][0] = 1;
}
for (int i = 1; i <= coins.size(); ++i)
{
for (int j = 1; j <= amount; ++j)
{
// exclude
table[i][j] = table[i-1][j];
// include
if (j >= coins[i-1])
{
table[i][j] += table[i][j-coins[i-1]];
}
}
}
return table[coins.size()][amount];
}

Time complexity=O(n*amount) where n is length of coins and amount is initial amount. Space complexity=(n*amount) because we create 2D array of n*amount. We are very close but there is one more optimization we can achieve – space complexity. If you look at the all the update, exclude code at line 13 is completely unnecessary because after processing the row, next row can start exactly same as previous row. In addition to that, when you look at inclusion code at line 18, you only look at previous column because you are including the same coin!

It means that we only need a 1D array. Here is the C++ code.

int change(int amount, vector<int>& coins) {
vector<int> table(amount+1, 0);
table[0] = 1;
for (int i = 1; i <= coins.size(); ++i)
{
for (int j = 1; j <= amount; ++j)
{
if (j >= coins[i-1])
{
table[j] += table[j-coins[i-1]];
}
}
}
return table[amount];
}

Time complexity is unchanged because you have to iterate all the coins with all the amount. However, we reduced the space complexity to O(amount).

This post explains how to solve leetcode 377. combination sum IV problem. It starts with brute force idea and optimize to efficiently solve the problem. Please refer to this link for problem detail.

In this problem, you have two arguments – target and nums array. You are supposed to find the number of combinations to make the target amount only using numbers in nums array. In this problem, the order matters as you see in the example.

Let’s think about brute force idea – enumerate all the possible subsets of nums array. For each subset, you also need to check how many of each number can make the target. This is very expensive with exponential time complexity.

Can we do better? Let’s suppose nums = [1,2,3] and target is 10. Now, let’s also suppose that we have collected a number of combinations right before 10. In other words, we have the result at 9, 8, 7 which are (10 – 1), (10 – 2), (10 – 3). Since the order matters, we need to count all three cases. Then, the answer would be result(10 – 1) + result(10 – 2) + result(10 – 3). We can recursively derive the same relationship for 9, 8, 7 values. Let’s take a look at the below diagram.

We just need to recursively look for children until the base case is reached, which is 0. This is better than enumerating all the subsets and finding combinations. However, there is still room to improve because there are overlapping subproblems. This overlapping subproblem is like calculating fibonacci number recursively. We prevent calculating the same subproblem by storing the result. If subproblem is found, then we just need to reuse that one. Ultimately this becomes dynamic programming.

There are two ways – bottom up, top down – to solve the problem. In this post, I will just explain bottom up approach.

Let’s define f(t) as “number of combinations based on number in nums at target amount t”

Subsequently, the recurrence relation would be f(t) = sum(f(t – nums[i])), where i is 0..nums.length() (index). If t – nums[i] < 0, then it would be 0 as you can’t have negative value.

With the recurrence relation clearly defined, the code becomes very simple. The solution is based on C++.

Base case is t=0. There is only 1 way to make amount 0 which is empty subset. Based on the base case, we need to build up from the bottom until reach the target amount.

int combinationSum4(vector<int>& nums, int target) {
vector<unsigned int> table(target+1);
table[0] = 1;
for (int i = 0; i <= target; ++i)
{
for (int num : nums)
{
if (i >= num)
{
table[i] += table[i-num];
}
}
}
return table[target];
}

The time complexity is O(nt) where n is length of nums and t is target amount. One gotcha is that if t is really large (ex. > 2^n) this essentially becomes same as brute force solution.

This is an easy level coding interview problem from leetcode. The problem asks you to find the maximum number of repeating string which is given. Please refer to the problem description for more detail. For the solution, you need to iterate the sequence. At each index, you check if the substring matches the word. If it matches, then continue to the next index to find the end of the repeated word. If it doesn’t match, then just move on to the next index.

I provided solutions in C++ and Python3. You can try iterative solution but I provided recursive solutions here. For recursive solutions, it is following the same principles. But note that I keep track of the start index of the comparison because that will be used when the substring doesn’t match.

void maxRepeating(const string &seq, int currIdx, int matchIdx, const string &word, int numRepeat, int &maxRepeat)
{
// this is base case. make sure you get the max repeat.
if (seq.size() - matchIdx < word.size())
{
maxRepeat = max(maxRepeat, numRepeat);
}
// word doesn't match. get the max repeat and then continue comparing the string from the currIdx which is the start of the matching
else if (seq.substr(matchIdx, word.size()) != word)
{
maxRepeat = max(maxRepeat, numRepeat);
maxRepeating(seq, currIdx + 1, currIdx + 1, word, 0, maxRepeat);
}
// word matches. keep checking. make sure currIdx is passed to note the start of the pattern
else
{
maxRepeating(seq, currIdx, matchIdx + word.size(), word, numRepeat + 1, maxRepeat);
}
}
int maxRepeating(string sequence, string word) {
int maxRepeat = 0;
maxRepeating(sequence, 0, 0, word, 0, maxRepeat);
return maxRepeat;
}

class Solution:
result = 0
def max_repeating(self, sequence, curr_idx, match_idx, word, num_repeat):
compare = sequence[match_idx : match_idx + len(word)]
if len(compare) < len(word):
self.result = max(self.result, num_repeat)
return
elif compare != word:
self.result = max(self.result, num_repeat)
self.max_repeating(sequence, curr_idx + 1, curr_idx + 1, word, 0)
return
self.max_repeating(sequence, curr_idx, match_idx + len(word), word, num_repeat + 1)
def maxRepeating(self, sequence: str, word: str) -> int:
self.max_repeating(sequence, 0, 0, word, 0)
return self.result

This is an easy level leetcode problem. Please refer to this link for the detail of the problem description.

Basically, the idea is to find the longest consecutive 1’s and 0’s and return True if 1’s > 0’s.

There are a couple of ways to achieve the problem.

The first one is a pythonic way. Since the string is a binary string that only has 1 and 0, you can split the string based on them. After the split, it’s really to find the longest string in the array. But this solution is only applicable to python.

class Solution:
def checkZeroOnes(self, s: str) -> bool:
return max((len(digit) for digit in s.split('0'))) > max((len(digit) for digit in s.split('1')))

Another solution is to explicitly iterate the string and count num 1’s and 0’s which are applicable to all other languages. You can do it in one pass as you see the solution below.

This is an easy level leetcode problem, which you can use stack to solve.

Please refer to this link for more detail of the problem.

Essentially, you cannot accept 3 or more consecutive duplicate letters. There could be many ways but using a stack seems to be the most elegant approach. Basically, you accumulate each letter if it doesn’t match with the top two entries in the stack or stack has less than 2 elements. This way, you are only accumulating letters that are not duplicate of 3 or more consecutive characters.

class Solution:
def makeFancyString(self, s: str) -> str:
fancy = []
for letter in s:
if len(fancy) < 2 or fancy[-1] != letter or fancy[-2] != letter:
fancy.append(letter)
return "".join(fancy)

This is a leetcode interview problem that requires a hash table with strings. Please refer to this link for more detail of the problem description.

Let me first show you the solution code.

class Solution:
def checkAlmostEquivalent(self, word1: str, word2: str) -> bool:
def default_val():
return list((0, 0))
count = defaultdict(default_val)
for letter in word1:
count[letter][0] += 1
for letter in word2:
count[letter][1] += 1
for letter, freq in count.items():
if abs(freq[0] - freq[1]) > 3:
return False
return True

A hash table is necessary in order to keep track of the frequencies of the letter in the word. Note that I used a list of which size is 2 – the first index is frequencies for word1 and the second index is frequencies for word2.

Once you collected the frequencies, you just need to check if the frequency difference is greater than 3.

This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Cookie settingsOK

Privacy & Cookies Policy

Privacy Overview

This website uses cookies to improve your experience while you navigate through the website. Out of these cookies, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may have an effect on your browsing experience.

Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.

Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.