692. Top K Frequent Words

1.Description(Medium)

Given a list of words and an integer k, return the top k frequent words in the list.

Notice

You should order the words by the frequency of them in the return list, the most frequent one comes first. If two words has the same frequency, the one with lower alphabetical order come first.

Example

Given

[
    "yes", "lint", "code",
    "yes", "code", "baby",
    "you", "baby", "chrome",
    "safari", "lint", "code",
    "body", "lint", "code"
]

for k =3, return["code", "lint", "baby"].

for k =4, return["code", "lint", "baby", "yes"],

Challenge

Do it in O(nlogk) time and O(n) extra space.

Extra points if you can do it in O(n) time with O(k) extra space approximation algorithms.

Tags

Hash Table Heap Priority Queue

2.Code

Version 1: HashMap + PriorityQueue(MinHeap)

遍历数组来统计词频。用一个Word class来表示每个单词和其词频。

用一个HashMap来存每个单词和其Word。

用一个PriorityQueue来对HashMap中Word进行排序,重构Comparator来实现

1)按词频重大到小排序

2)词频相同时,按字典序排序。(String.compareTo())

然后依次取k个queue顶元素。

注意一开始要判断k==0,否则在初始化priorityqueue的时候会报错。

class Word{
    String value;
    int freq;
    public Word(String value,int freq){
        this.value=value;
        this.freq=freq;
    }   
}

public class Solution_471 {
     /**
     * @param words an array of string
     * @param k an integer
     * @return an array of string
     */



    public String[] topKFrequentWords(String[] words, int k) {
        String[] result=new String[k];
        //一定注意这里要记得判断k==0,不然在定义priorityqueue的时候就会报错
        if(words==null || words.length==0 ||k==0){
            return result;
        }
        //min heap
        PriorityQueue<Word> queue=new PriorityQueue<Word>(k,new Comparator<Word>(){
            public int compare(Word a,Word b){
                int diff=a.freq-b.freq;
                if(diff==0){
                    //注意这里比较string的方法,相当于b-a,下面这个是错的
                    return b.value.compareTo(a.value);
                    //return b.value.charAt(0)-a.value.charAt(0);
                }
                return diff;
             }
        });

        HashMap<String,Word> map=new HashMap<String,Word>();//注意hashmap存放的是string和word的映射

        for(int i=0;i<words.length;i++){
            String current=words[i];
            if(!map.containsKey(current)){
                Word node=new Word(current,1);
                map.put(current, node);
            }else{
                Word update=map.get(current);
                update.freq++;
                map.put(current,update);
            }
        }
        for(String element:map.keySet()){
            Word node=map.get(element);
            queue.offer(node);
            if(queue.size()>k){
                queue.poll();
            }
        }

        int len=k-1;
        while(!queue.isEmpty()){
            result[len--]=queue.poll().value;
        }
        return result;
    }
}

Version 2: Quick Select

https://www.jiuzhang.com/problem/top-k-frequent-words/

最优解法:quick select

  • Time: O(n + klogk)

  • Space: O(n) HashMap

  • HashMap 记录元素的频数

public class Solution {
    public String[] topKFrequentWords(String[] words, int k) {
        if (words == null || words.length == 0 || k == 0) {
            return new String[0];
        }
        
        // O(n) hashing
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        String[] wordSet = counts.keySet().toArray(new String[0]);
        
        // most frequent -> least frequent, then alphabetical order
        Comparator<String> comp = Comparator
            .comparing((String w) -> -counts.get(w)) 
            .thenComparing(Comparator.naturalOrder());
            
        // O(n) quick select
        quickSelect(wordSet, comp, 0, wordSet.length - 1, k - 1); 
        Arrays.sort(wordSet, 0, k - 1, comp);  // O(klogk) sort
        
        return Arrays.copyOf(wordSet, k); 
    }
    
    // Quick select with custom Comparator
    private String quickSelect(String[] words,  
                               Comparator<String> comp,
                               int start, 
                               int end, 
                               int k) {
        if (start >= end) {
            return words[k];
        }
        
        String pivot = words[start + (end - start) / 2];
        int left = start, right = end; 
        while (left <= right) {
            while (left <= right && comp.compare(words[left], pivot) < 0) {
                left++;
            }
            
            while (left <= right && comp.compare(words[right], pivot) > 0) {
                right--;
            }
            
            if (left <= right) {
                String temp = words[left];
                words[left++] = words[right];
                words[right--] = temp;
            }
        }
        
        if (k <= right) {
            return quickSelect(words, comp, start, right, k);
        }
        
        if (k >= left) {
            return quickSelect(words, comp, left, end, k);
        }
        
        return words[k];
    }
}

Last updated