Max deviation among all substrings
Let's have a string: abbbcacbcdce
For substring abbb, you have most frequent letter b -> 3 and least frequent letter a -> 1. So the deviation is = most frequent - least frequent = 3 - 1 = 2. You need to look at all the substrings and find the max deviation.
Here substring cacbcdc has the max deviation. Frequency is like below: c -> 4, a ->1, b->1, d->1. So max freq - min freq = 4 - 1 = 3.
Solution:
We take all possible distinct as ( c1 & c2 pairs ) characters possible. eg. 'a' & 'b' , 'a' & 'c' ... 'z' & 'z'
We consider c1 as the character with maximum Freq and c2 as the character with minimum Freq
Then we construct array of only c1 and c2 from the string( coz we are only interested in max and min Freq characters). We take 1 for c1 and -1 for c2. Further if we have consecutive c1's we simply add their frequency. eg. ababcccaac c1 = 'c' and c2 = 'a' the array formed is -1 0 -1 0 1 1 1 -1 -1 1
Now the deviation will be the length of maximum sum of subarray( LC 53) in the generate array. Note : that the size of subarray should be greater than 1. Why? In above example max subarray sum is 3( of length 1 ) but this is not correct as all characters will be same( result should be 0) So consider subarray of size>1.
Time complexity : O( n * 26* 26 ) ~ O(n)
Space Complexity : O(n) ( used for constructing array )
Version 2:
Idea: The max deviation must happen between some pairs of distinct letters. So compute max deviation for each pair then take max.
Step 1: Build a dictionary containing the lists of indexes of the same letters. For example: for 'abbbcacbcdce' a appears in positions [0, 5] b appears in positions [1,2,3,7] c appears in positions [4, 6, 8, 10] d appears in positions [9] e appears in positions [11] So the dictionary is {0: [0, 5], 1: [1, 2, 3, 7], 2: [4, 6, 8, 10], 3: [9], 4: [11]}) (key is the ascii for the letter)
Step 2: For all pairs of different letters, take the corresponding two lists in the dictionary in Step1 then iterate from smallest index to the biggest index in both lists. In the meanwhile, keep a count variable which is the difference of the numbers of two letters in substring [0:index + 1].
In each step if the current index belongs to the first list, count increases by 1. If the index belong to the second list, count decreases by 1. Needs to take care of the cases when there are no more indexes in a list too. For example, if the two letters are 'a', 'c', then take the two lists [0, 5], [4, 6, 8, 10] then do index = 0, letter = 'a', count = 1 index = 4, letter = 'c', count = 0 index = 5, letter = 'a', count = 1 index = 6, letter = 'c', count = 0 index = 8, letter = 'c', count = -1 index = 10, letter = 'c', count = -2 max count is 1 in index 0 and 5, min count is -2 in index 0. This means the max deviation for 'a' and 'c' happens in [6, 10] and is the absolute value |max(count) - min(count)| =3. Do this for all pairs of distinct letters. Take max. Time complexity: O(n) (each index is only computed once in building dictionary and once in iterating through lists) Space complexity: O(n) (each index is only stored once in the dictionary)
Last updated