Preface
Lately I’ve suddenly wanted to move to a first-tier city, so I’m back to prepping for interviews. During one interview I ran into a coding question. It’s not really an “algorithm” problem—more like something practical from real business scenarios, with a bit of data-structure checking mixed in. Similar problems on LeetCode are usually easy, and during the interview I AC’d it using Java 8 Stream APIs. Since I’ve also been writing about Java 8 functional programming recently, I’m recording it here on purpose.
The Question
String compression - compress the number of occurrences in a string
For example aabbcd -> a2b2c1d1
For example abcbc -> a1b1c2d1
Assume the string only contains uppercase and lowercase letters
Conventional Solution - Hash Counting
The conventional approach is a very common hash-counting exercise—pure manual work. Of course, this is the first “appetizer” question in an interview. The idea is simple: build a map and use it for deduplication. Every time you see the same key, increment the map’s value by 1. I’ll just paste the code for the conventional solution.
@Test
void strCompress() {
String text = "aabcdsdsa";
Map<Character, Integer> map = new HashMap<>(text.length());
for (char c : text.toCharArray()) {
Integer count = map.getOrDefault(c, 0);
map.put(c, ++count);
}
StringBuilder builder = new StringBuilder(text.length());
map.forEach((k, v) -> builder.append(k).append(v));
System.out.println(builder.toString());
}
output: a3b1c1s2d2
Three things worth noting:
- I’m using the map’s native
getOrDefault()to fetch the value. This avoids having to handle the “cold start” case where the key doesn’t exist in the map, and it unifies the++countoperation. - For concatenation I’m using
StringBuilder. If you use"" + "", it doesn’t look great to the interviewer—because you don’t know how many characters there are, and concatenating inside aforloop is not GC-friendly. UsingStringBufferalso doesn’t look great, because this isn’t a multi-threaded environment and there’s no race condition;StringBufferperforms worse thanStringBuilder. Details in a small snippet still matter. With such an easy question, it shouldn’t just be “do some manual work”—there should be more considerations. - When iterating the map, you can directly use
forEach(). Using an iterator orentrySetisn’t “wrong”, it’s just less concise. Since Java 8, there’s a defaultforEachmethod where you can pass aBiConsumerand do the concatenation via side effects.
Java 8 Stream Grouping Solution
They give you a String and call it “compression”, but in reality it’s just classifying all the chars and counting them. If this were SQL, you’d just do: select char, count(1) as count from table group by char.
In Java 8 Streams you can achieve something similar. Don’t blame me for being wordy—I’ll break the code down into points (pros can jump straight to the code):
- The first problem is: how do you turn a
Stringinto a stream? Here I recommend a default method added toStringin JDK 8—chars(). It turns the chars in the string into ints and puts them into anIntStream. - What we get is an
IntStreamcontaining ASCII codes for the chars. But the thing we want to group and count is essentially stillchar, so we useIntStream’smapToObj(). TheFunctioninside converts each element into achar(it will auto-box toCharacter),mapToObj(e -> (char) e). - Finally, we collect and group. Collecting is of course
collect(), which takes aCollector. InCollectorsthere’s agroupingBy()method that returns aCollector.groupingBy()groups for you. It takes two parameters: (the first is aFunction—input is the stream element, and the return value is used as the grouping key), (the second is anotherCollector—it defines what to do after elements of the same group are gathered together; we still use a method from theCollectorsutility class:counting(), which accumulates the grouped elements). This part is a bit twisty—hope you read it twice; I tried my best. In the end:collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));whereFunction.identity()is equivalent toe -> e. Since it’s used so often, the JDK provides it as a dedicated static method.
Final code:
@Test
void strCompress() {
String text = "aabcdsdsa";
Map<Character, Long> map = text.chars()
.mapToObj(e -> (char) e)
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
StringBuilder builder = new StringBuilder(text.length());
map.forEach((k, v) -> builder.append(k).append(v));
System.out.println(builder.toString());
}
output: a3b1s2c1d2
One last note: the two parameters of Collectors.groupingBy()—the first parameter is a mapping function. In this problem it’s directly e -> e, so we use Function.identity() to simplify. The second parameter is a Collector interface (note Collectors vs Collector: the former is a static utility class). In general we use the library functions provided by the Collectors utility class. Common ones include toList() to collect data into a list, counting() to count elements, and summingInt() to sum data into an int.
END
All articles in this blog, unless otherwise stated, are licensed under @Oreoft . Please indicate the source when reprinting!