Improving Map Lookup Performance in ClickHouse
Problem
A Map lookup such as a['key'] has linear complexity (mentioned here) and can be inefficient. To select the value for a specific key, ClickHouse has to scan the keys (~M per row) in every row (N) of the Map column, which results in ~M×N key comparisons.
A lookup on a Map column can be 10x slower than the same lookup on a String column. The experiment below also shows a ~10x slowdown for a cold query, and a difference of multiple orders of magnitude in data processed (7.21 MB vs. 5.65 GB).
Solution
To improve the query, we can add another column whose default value is the value of a particular key in the Map column, and then materialize that column to populate the value for existing rows. This way, the needed value is extracted and stored at insertion time, which speeds up the lookup at query time.
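The approach described above could look roughly like the following sketch. The table name (events), the Map column (attrs), the new column (attr_key), and the filter value are hypothetical and chosen only for illustration; adjust the types to match your schema. Note that MATERIALIZE COLUMN runs as a mutation, so existing parts are rewritten in the background and large tables may take a while to finish.

```sql
-- Hypothetical example table with a Map column (names are assumptions).
CREATE TABLE events
(
    id    UInt64,
    attrs Map(String, String)
)
ENGINE = MergeTree
ORDER BY id;

-- Add a dedicated column whose default value is extracted from the Map.
-- New inserts compute and store attrs['key'] automatically.
ALTER TABLE events
    ADD COLUMN attr_key String DEFAULT attrs['key'];

-- Populate the new column for rows that already exist on disk.
ALTER TABLE events
    MATERIALIZE COLUMN attr_key;

-- Query the dedicated column instead of the Map lookup.
SELECT count()
FROM events
WHERE attr_key = 'some_value';
```

After the column is materialized, queries that filter on attr_key read only that column rather than the whole Map, which is where the reduction in data processed would be expected to come from.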
