Herewith, all in one place, are Clojuresque definitions of:
- reducing function
Longer elaborations of these definitions follow in the subsequent section.
Anything that can be used as the first argument of the
reduce function, e.g.
conj. Generally, it's a binary function that returns something of
the type of its first argument, which is supposed to be a kind of accumulation.
Anything that can be used as the second argument of the
For the purposes of the next few column inches, it's a collection - a vector, list, etc.
(reduce reducing-function reducible)
A function that turns one reducing function into another, by modifying the second argument or invoking the input function a number of times other than one.
I lied a little when I defined reducible; actually, it's anything that implements
CollReduce protocol, whose
coll-reduce method is invoked by
Thus, the exact behavior of
reduce is a property of the specific thing being reduced.
A reducer is a just a special kind of reducible that has a transducer
embedded in it, ready to be invoked whenever
reduce is finally called.
A reducible that also implements
CollFold, enabling you to to use
fold in lieu of
reduce; the two functions yield
the same answer, but
fold can be parallelized.
Basically, a synonym for "separated," as in "separation of concerns."
Here's a transducer that doubles all incoming elements before passing them to the input reducing function:
(defn double1 [reduction-function] (fn [result input] (reduction-function result (* 2 input))))
(reduce (double1 +) 0 c) would produce $2\sum_i c_i$
(reduce (double1 *) 1 c) would return
$\prod^n_i 2 c_i = 2^n \prod^n_i c_i$.
And here's one that doubles incoming elements in a different way, by repeating them twice:
(defn double2 [reduction-function] (fn [result input] (reduction-function (reduction-function result input) input)))
(reduce (double2 +) 0 coll) will return
$\sum_i (c_i + c_i) = 2 \sum_i c_i$, the same as with
(reduce (double2 *) 1 coll) would return the
$\prod_i c_i c_i = (\prod c_i)^2$.
Three cool things to note:
- The things can be composed, e.g.
(reduce ((comp double1 double2) +) 0 c)gives you $\sum_i 2(c_i +c_i) = 4\sum_i c_i$.
The needs of
reduceare a superset of those of
map. You could create a kind of
mapthat takes reducing functions instead of unary functions, and, unlike
reduce, it can be lazy:
defn process [xfn c] (lazy-seq (when-let [s (seq c)] (concat ((xfn #(concat %1 (list %2))) '() (first s)) (process xfn (rest s)))))) ;; (process double1 [1 2 3]) ; yields (2 4 6) ;; (process double2 [1 2 3]) ; yields (1 1 2 2 3 3) ;; (take 10 (process double2 (range))) ;; yields (0 0 1 1 2 2 3 3 4 4)
In fact, the
sequencefunction in Clojure 1.7 will have a two-argument form
([xform coll] (clojure.lang.LazyTransformer/create xform coll))
that does the same thing as my
processbut much more efficiently.
As long as the transducer doesn't diddle with
result, it will be totally oblivious to how it is called, from what kind of a collection, or even if it's a traditional collection at all. The earliest compelling examples of this generality are
core.async/channels, which, in their latest incarnation can take transducer arguments to process data that flows through them.
(def fatrange (reducer (range 10) double1))
is a reducible with a multiplication hiding out, waiting to be invoked
(reduce + fatrange) Since you can compose any number of
transducers, and they'll all get invoked at the time of reduction,
it's a lot more efficient than the conventional
(reduce + (map f1 (map f2 (map f3 ...))))
which creates a vast apparatus of lazy
sequences feeding each other. Moreover, of course, the transducers
can do more than just modify individual elements, as illustrated above
core.reducers namespace ships with a bevvy of functions like
r/filter - all convenience functions creating reducers with specific
transducers already attached.
The differences between the
fold are that the latter:
- assumes that its reducing function is associative, so that it can
- attack the problem in parallel by repeatedly dividing the collection in two, following the approach famously outlined by Guy Steele and
- using the magic of Java's fork/join framework to perform this work efficiently with a finite number of threads.
If you call
reduce on a
fold on a
reducer, the reduction
proceeds normally. Only if you use
fold, and the collection implements
the parallelization occur.
Some of the
r/whatever methods actually create folders, allowing
them to be used with either
It's interesting (to me) to notice that, while the redicible is the 2nd argument to
reduce (as is the convention in functional languages), it has to be the first
argument in the internal implementations, so that protocol delegation can occur properly.
Under some circumstances, reduction is handled in Java, via the
IReduce interface. This illustrates the "best of both worlds" approach that
Clojure takes to residence on the JVM. We're straight functional lisp when we want to be,
and we're OO when we need to be.
Search engines chiefly come up with blog posts about transducers, plus a few curricula vitae. It's definitely not the opposite of complected, so it definitely has nothing do with this. One is given to believe that this repurposing in nomenclature is less confusing than fancy math words like "monoid," by I'm not so sure.