A Coding Interview Question I Ran Into

2021/11/19

Preface

Lately I’ve suddenly wanted to move to a first-tier city, so I’m back to prepping for interviews. During one interview I ran into a coding question. It’s not really an “algorithm” problem—more like something practical from real business scenarios, with a bit of data-structure checking mixed in. Similar problems on LeetCode are usually easy, and during the interview I AC’d it using Java 8 Stream APIs. Since I’ve also been writing about Java 8 functional programming recently, I’m recording it here on purpose.

The Question

String compression - compress the number of occurrences in a string
For example aabbcd -> a2b2c1d1
For example abcbc -> a1b1c2d1
Assume the string only contains uppercase and lowercase letters

Conventional Solution - Hash Counting

The conventional approach is a very common hash-counting exercise—pure manual work. Of course, this is the first “appetizer” question in an interview. The idea is simple: build a map and use it for deduplication. Every time you see the same key, increment the map’s value by 1. I’ll just paste the code for the conventional solution.

@Test
void strCompress() {
    String text = "aabcdsdsa";
    Map<Character, Integer> map = new HashMap<>(text.length());
    for (char c : text.toCharArray()) {
        Integer count = map.getOrDefault(c, 0);
        map.put(c, ++count);
    }
    StringBuilder builder = new StringBuilder(text.length());
    map.forEach((k, v) -> builder.append(k).append(v));
    System.out.println(builder.toString());
}

output: a3b1c1s2d2

Three things worth noting:

  1. I’m using the map’s native getOrDefault() to fetch the value. This avoids having to handle the “cold start” case where the key doesn’t exist in the map, and it unifies the ++count operation.
  2. For concatenation I’m using StringBuilder. If you use "" + "", it doesn’t look great to the interviewer—because you don’t know how many characters there are, and concatenating inside a for loop is not GC-friendly. Using StringBuffer also doesn’t look great, because this isn’t a multi-threaded environment and there’s no race condition; StringBuffer performs worse than StringBuilder. Details in a small snippet still matter. With such an easy question, it shouldn’t just be “do some manual work”—there should be more considerations.
  3. When iterating the map, you can directly use forEach(). Using an iterator or entrySet isn’t “wrong”, it’s just less concise. Since Java 8, there’s a default forEach method where you can pass a BiConsumer and do the concatenation via side effects.

Java 8 Stream Grouping Solution

They give you a String and call it “compression”, but in reality it’s just classifying all the chars and counting them. If this were SQL, you’d just do: select char, count(1) as count from table group by char.

In Java 8 Streams you can achieve something similar. Don’t blame me for being wordy—I’ll break the code down into points (pros can jump straight to the code):

  1. The first problem is: how do you turn a String into a stream? Here I recommend a default method added to String in JDK 8—chars(). It turns the chars in the string into ints and puts them into an IntStream.
  2. What we get is an IntStream containing ASCII codes for the chars. But the thing we want to group and count is essentially still char, so we use IntStream’s mapToObj(). The Function inside converts each element into a char (it will auto-box to Character), mapToObj(e -> (char) e).
  3. Finally, we collect and group. Collecting is of course collect(), which takes a Collector. In Collectors there’s a groupingBy() method that returns a Collector. groupingBy() groups for you. It takes two parameters: (the first is a Function—input is the stream element, and the return value is used as the grouping key), (the second is another Collector—it defines what to do after elements of the same group are gathered together; we still use a method from the Collectors utility class: counting(), which accumulates the grouped elements). This part is a bit twisty—hope you read it twice; I tried my best. In the end: collect(Collectors.groupingBy(Function.identity(), Collectors.counting())); where Function.identity() is equivalent to e -> e. Since it’s used so often, the JDK provides it as a dedicated static method.

Final code:

@Test
void strCompress() {
    String text = "aabcdsdsa";
    Map<Character, Long> map = text.chars()
            .mapToObj(e -> (char) e)
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    StringBuilder builder = new StringBuilder(text.length());
    map.forEach((k, v) -> builder.append(k).append(v));
    System.out.println(builder.toString());
}

output: a3b1s2c1d2

One last note: the two parameters of Collectors.groupingBy()the first parameter is a mapping function. In this problem it’s directly e -> e, so we use Function.identity() to simplify. The second parameter is a Collector interface (note Collectors vs Collector: the former is a static utility class). In general we use the library functions provided by the Collectors utility class. Common ones include toList() to collect data into a list, counting() to count elements, and summingInt() to sum data into an int.

END

All articles in this blog, unless otherwise stated, are licensed under @Oreoft . Please indicate the source when reprinting!

Table of Contents