362. Design Hit Counter

Design a hit counter which counts the number of hits received in the past 5 minutes (i.e., the past 300 seconds).

Your system should accept a timestamp parameter (in seconds granularity), and you may assume that calls are being made to the system in chronological order (i.e., timestamp is monotonically increasing). Several hits may arrive roughly at the same time.

Implement the HitCounter class:

  • HitCounter() Initializes the object of the hit counter system.

  • void hit(int timestamp) Records a hit that happened at timestamp (in seconds). Several hits may happen at the same timestamp.

  • int getHits(int timestamp) Returns the number of hits in the past 5 minutes from timestamp (i.e., the past 300 seconds).

Example 1:

Input
["HitCounter", "hit", "hit", "hit", "getHits", "hit", "getHits", "getHits"]
[[], [1], [2], [3], [4], [300], [300], [301]]
Output
[null, null, null, null, 3, null, 4, 3]

Explanation
HitCounter hitCounter = new HitCounter();
hitCounter.hit(1);       // hit at timestamp 1.
hitCounter.hit(2);       // hit at timestamp 2.
hitCounter.hit(3);       // hit at timestamp 3.
hitCounter.getHits(4);   // get hits at timestamp 4, return 3.
hitCounter.hit(300);     // hit at timestamp 300.
hitCounter.getHits(300); // get hits at timestamp 300, return 4.
hitCounter.getHits(301); // get hits at timestamp 301, return 3.

Constraints:

  • 1 <= timestamp <= 2 * 10^9

  • All the calls are being made to the system in chronological order (i.e., timestamp is monotonically increasing).

  • At most 300 calls will be made to hit and getHits.

Follow up: What if the number of hits per second could be huge? Does your design scale?

Solution:

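This problem is similar to 933. Number of Recent Calls and 346. Moving Average from Data Stream: all of them test the use of a queue.

The problem states that the timestamps passed to getHits are increasing, so on each getHits call it is enough to keep only the last 300 seconds of data in the queue. Everything older can be deleted.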

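import java.util.LinkedList;
import java.util.Queue;

class HitCounter {
    // Timestamps of the hits still inside the 300-second window, oldest first.
    Queue<Integer> q = new LinkedList<>();

    public void hit(int timestamp) {
        q.offer(timestamp);
    }

    public int getHits(int timestamp) {
        // Keep only the last 300 seconds of data in the queue.
        while (!q.isEmpty() && timestamp - q.peek() >= 300) {
            q.poll();
        }
        return q.size();
    }
}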

Followup:

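Solution 2 - Large scale: What if the number of hits per second could be very large? Does your design scale?

Instead of one queue entry per hit, keep one queue entry per distinct second plus a per-second count. The queue then holds at most 300 entries no matter how many hits arrive.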

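import java.util.HashMap;
import java.util.LinkedList;

class HitCounter {
    // One entry per distinct second, oldest first.
    LinkedList<Integer> queueTimestamp = new LinkedList<>();
    // Number of hits recorded for each second still in the queue.
    HashMap<Integer, Integer> freq = new HashMap<>();
    // Running total of hits inside the window.
    int hitCount = 0;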

    /** Initialize your data structure here. */
    public HitCounter() {

    }

    /** Record a hit.
     @param timestamp - The current timestamp (in seconds granularity). */
    public void hit(int timestamp) {
        if (!queueTimestamp.isEmpty() && queueTimestamp.peekLast() == timestamp) {
            freq.put(timestamp, freq.get(timestamp) + 1);
        } else {
            freq.put(timestamp, 1);
            queueTimestamp.addLast(timestamp);
        }
        hitCount++;
        rollOutOldData(timestamp);
    }

    /** Return the number of hits in the past 5 minutes.
     @param timestamp - The current timestamp (in seconds granularity). */
    public int getHits(int timestamp) {
        rollOutOldData(timestamp);
        return hitCount;
    }
    
    void rollOutOldData(int timestamp) {
        while (!queueTimestamp.isEmpty() && timestamp - queueTimestamp.peek() + 1 > 300) {
            int victim = queueTimestamp.poll();
            hitCount -= freq.get(victim);
            freq.remove(victim);
        }
    }
}

Note: this design does not work well in a multi-threaded environment. Although the input is guaranteed to be in chronological order, your program may not process it in that order, so a larger timestamp may be inserted into the queue before a smaller one.

How can this be made to work in a multithreaded environment? By using ConcurrentHashMap and an AtomicInteger for the counter?

@connect2krish Thanks for your ReadWriteLock solution. I have a question about the ReadLock (e.g. in getHits): what if there are multiple calls to getHits with different timestamps? Under the ReadLock concept, multiple threads can acquire the ReadLock at the same time, so the rollOutOldData method called from getHits will also run simultaneously. In this problem the timestamp has seconds granularity, so when multiple getHits calls use the same timestamp the result is the same. But what if the timestamp is in ms and rollOutOldData takes longer? Would we then need a single lock to limit the read part as well? Thanks.

You can try a simple ReentrantReadWriteLock:

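import java.util.HashMap;
import java.util.LinkedList;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class HitCounter {
    LinkedList<Integer> queueTimestamp = new LinkedList<>();
    HashMap<Integer, Integer> freq = new HashMap<>();
    int hitCount = 0;
    ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    Lock r = rw.readLock();
    Lock w = rw.writeLock();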

    /** Initialize your data structure here. */
    public HitCounter() {

    }

    /** Record a hit.
     @param timestamp - The current timestamp (in seconds granularity). */
    public void hit(int timestamp) {
        w.lock();
        try {

            if (!queueTimestamp.isEmpty() && queueTimestamp.peekLast() == timestamp) {
                freq.put(timestamp, freq.get(timestamp) + 1);
            } else {
                freq.put(timestamp, 1);
                queueTimestamp.addLast(timestamp);
            }
            hitCount++;
            rollOutOldData(timestamp);
        } finally {
            w.unlock();
        }
    }

    /** Return the number of hits in the past 5 minutes.
     @param timestamp - The current timestamp (in seconds granularity). */
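    public int getHits(int timestamp) {
        // Caveat: rollOutOldData mutates the queue, the map and hitCount,
        // but a read lock admits several readers at once, so concurrent
        // getHits calls can still race here.
        r.lock();
        try {
            rollOutOldData(timestamp);
        } finally {
            r.unlock();
        }
        // Caveat: hitCount is read after the lock has been released.
        return hitCount;
    }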
    
    void rollOutOldData(int timestamp) {
        while (!queueTimestamp.isEmpty() && timestamp - queueTimestamp.peek() + 1 > 300) {
            int victim = queueTimestamp.poll();
            hitCount -= freq.get(victim);
            freq.remove(victim);
        }
    }
}
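
The concern raised above about concurrent getHits calls comes from getHits also modifying state through rollOutOldData, so it is not a pure read. One way to sidestep it is to guard both methods with a single exclusive lock. The following is only a minimal sketch under that assumption: the class name LockedHitCounter is hypothetical, ArrayDeque is swapped in for LinkedList, and timestamps are assumed to reach the lock in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical variant, not the original poster's code: one exclusive lock
// protects every access to the queue, the map and the running total.
class LockedHitCounter {
    private final ArrayDeque<Integer> queueTimestamp = new ArrayDeque<>();
    private final Map<Integer, Integer> freq = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();
    private int hitCount = 0;

    public void hit(int timestamp) {
        lock.lock();
        try {
            // merge returns the new count; 1 means this second was not tracked yet.
            if (freq.merge(timestamp, 1, Integer::sum) == 1) {
                queueTimestamp.addLast(timestamp);
            }
            hitCount++;
            rollOutOldData(timestamp);
        } finally {
            lock.unlock();
        }
    }

    public int getHits(int timestamp) {
        lock.lock();
        try {
            rollOutOldData(timestamp);
            // Read the total while still holding the lock.
            return hitCount;
        } finally {
            lock.unlock();
        }
    }

    // Must only be called while holding the lock.
    private void rollOutOldData(int timestamp) {
        while (!queueTimestamp.isEmpty() && timestamp - queueTimestamp.peekFirst() >= 300) {
            int victim = queueTimestamp.pollFirst();
            hitCount -= freq.remove(victim);
        }
    }
}
```

The trade-off is that readers no longer run in parallel. Since each call does a small, bounded amount of work (at most 300 seconds are ever tracked), that may be acceptable, but it is a design choice rather than a measured result.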

Followup 2:

http://www.michael-noll.com/blog/2013/01/18/implementing-real-time-trending-topics-in-storm/

Basic ideas:

  • Rolling Counter

  • Staged Aggregator

  • Partition
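
The rolling counter idea can be illustrated on this problem with a fixed ring of 300 one-second buckets. This is a minimal single-threaded sketch, not code from the linked article: the class name RollingHitCounter and the field names are hypothetical, and the staged aggregator and partition ideas are not shown.

```java
// Hypothetical sketch of a rolling counter: memory stays at 300 buckets
// regardless of how many hits arrive per second.
class RollingHitCounter {
    private static final int WINDOW = 300;
    // times[i] is the second currently stored in bucket i (0 means unused,
    // which is safe because the problem guarantees timestamp >= 1).
    private final int[] times = new int[WINDOW];
    // hits[i] is the number of hits recorded for second times[i].
    private final int[] hits = new int[WINDOW];

    public void hit(int timestamp) {
        int i = timestamp % WINDOW;
        if (times[i] != timestamp) {
            // The bucket holds a different second: reuse it for the new one.
            times[i] = timestamp;
            hits[i] = 0;
        }
        hits[i]++;
    }

    public int getHits(int timestamp) {
        int total = 0;
        for (int i = 0; i < WINDOW; i++) {
            // Count only buckets whose second is within the last 300 seconds.
            if (timestamp - times[i] < WINDOW) {
                total += hits[i];
            }
        }
        return total;
    }
}
```

Both hit and getHits take constant time with respect to the number of hits, since getHits always scans exactly 300 buckets. For very high volumes the int counters would need to become long, and making the class thread-safe would still require synchronization or atomic arrays.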
