Sunday, July 8

Tips & Tricks with Clojure Reducers

After noticing a few others encounter the same issues I ran into when first playing with Clojure's reducers, I thought I should collect some of what I discovered into a quick blog post.

First off, reducers are *AWESOME* and fully realize the power of parallel sequences but as they are still alpha have a few rough edges to work around. I'm sure in the near future it will become much more polished but until then be aware of the follow.

Note, clojure.core.reducers will be abbreviated as "r" below;

  • In many cases r/map and r/reduce are faster than their core counterparts, but not in all cases. At this point in time I only use the r/ variants for larger collections (> 1000 elements) and that has worked well.
  • The parallel part of reducers doesn't kick in until you use r/fold instead of reduce. For r/fold to be faster than r/reduce though, it needs to be able to "divide-n-conquer", and hence needs to be able to quickly divide up the collection.
    Summary: r/fold is only faster than r/reduce when operating on a vector.
  • (into [] (r/map inc (vec (range 10)))) works, but uses r/reduce instead of r/fold.
    Instead it's faster to use the following to take advantage of fold;

    (fold-into-vec (r/map inc (vec (range 2000))))
  • Reducing into map is many times slower than reducing into a single value or vector. Unfortunately merge and merge-with become the bottlenecks here, and this is the biggest rock in my sandal at the moment. :(
  • Don't forget mapv, which was introduced in Clojure 1.4 I believe. It provides a nice middle ground between core/map and core.reducers/map.

The function I'd specifically like to tune further with reducers is the following;

No comments: