Purely functional transducers - where does state belong?

After my recent attempt to provide type annotations for transducers, several people pointed out that I wasn't accounting for state. The signature of a pure function transformation, whether in Clojure

  (t/defalias Transducer (t/TFn [[a :variance :covariant]
                                 [b :variance :contravariant]]
                (t/All [r] [[r a -> r] -> [r b ->r]])))

or Haskell

  type Transducer a b = forall r . (r -> a -> r) -> (r -> b -> r)

nowhere acknowledges that the transducer might need to maintain, for example, the previous value in the series, in order to remove duplicates.

The failure is most obvious to hardcore Haskell programmers, who, as a rule, would rather eat broken glass than modify a hidden state variable in place. State is kept explicitly, though one might use monad sugar to avoid having to look at it all the time.

Hardness of core is a relative thing, and Clojure programmers profess more than a little reverence for referential transparency, so it was a bit surprising to see this:

(defn t-dedup []
  (fn [xf]
    (let [prev (volatile! ::none)]
      (fn
        ([result input]
          (let [prior @prev]
           (vreset! prev input)    ;; <= gack!
              (if (= prior input)
                result
                (xf result input))))))))

In case my ostensibly clarifying excisions make it not obvious, this de-duplicator is the suggested exemplar of a stateful transducer, taken straight from http://clojure.org/transducers (not including the comment).¹ When applied to a reducing function, it will produce a new function, closing over a volatile prev variable. Not only is the resulting function impure, but it is cryptically so. Having no way to detect whether the "stateful function" has been contaminated by prior use, one must take special care to insure that it is not used in more than one reduction, to say nothing of more than one thread.

Some of my best friends are Python programmers, but there have to be limits.

The answer is right in front of us

Reduction may already be the world's best example of state done right! Novelty accrues in the reduced quantity, successive versions of which are passed as the left argument to the reduction function. The quantity is never modified; rather, as usual in functional programming, new versions are created. If that reduction function is +, state is accumulated in the form of a sum; that the sum also happens to be the value we're interested in at the end of the reduction is a useful coincidence.

In fact, Clojure already uses the reduction value to hold state other than result, via the reduced function, used to indicate that reduction may terminate early (as in take). What reduced does is wrap the value up in a container class

  (defn reduced [x] (clojure.lang.Reduced. x))

Subsequently, we find out whether to bail by calling reduced?, which ultimately does nothing more than check what class we have:

  static public boolean isReduced(Object r){ return r instanceof Reduced;  }

As Reduced implements IDeref, the wrapped value can ultimately be extracted with @, which just calls the class's deref method.

This happens to be the one and only way in which state is currently embedded in the reduction, but if I were king, things would be different.

L'etat, c'est moi

In Clojure, the obvious way to attach state to some arbitrary value to the accumulated reduction is with the inbuilt metadata facility. The object returned by (with-meta obj {:boffo "foo"} is identical to obj from all perspectives other than that of silly individuals who put (:boffo (meta obj)) in their code, so we could have have our state, without the reduce consumer having to eat it. The only problem is that you can't attach metadata to scalar primitives (like numbers), which tend to figure prominently in numerical reductions. This, presumably, is why reduced didn't take that route.

Let's define a type that holds state for the deduplicator. It should hold the previous value, and of course we still have to keep the accumulated reduction value:

(deftype DedupStateWrapper [acc prev])

In the transducer, we'll check to see whether the result is already wrapped

(defn t-dedup1 [rf]
  (fn [result input]
    (if (instance? DedupStateWrapper result)
      ...

and, if so, check whether the new input matches the previous:

      (if (= input (.prev result))
        ...

If it's a duplicate, the result shouldn't change; otherwise, call the original reduction function and wrap its return value, setting the state to new input:

        result
        (DedupStateWrapper. (rf (.acc result) input) input))
      ...

Finally, if the result hasn't been wrapped yet, input can't possibly be duplicate, so invoke the reduction function and wrap it up:

      (DedupStateWrapper. (rf result input) input))))

Let's try it out:

playground.transducers> (reduce (t-dedup1 +) 0 [1 1 2 2])
#<DedupStateWrapper playground.transducers.DedupStateWrapper@1bf0ed5c>

Oh right.

playground.transducers> (.acc (reduce (t-dedup1 +) 0 [1 1 2 2]))
3

I'd hoped to avoid the manual unwrapping - before realizing that metadata wouldn't work - but we can argue that it's a good thing, forcing some acknowledgment of recent statefulness.

In any case, without modifying reduce to do our unwrapping (as it already does for Reduced), this is something we just have to live with.

So let's clean up. First, we might as well make a more general-sounding StateWrapper, since all transducers basically want the same thing. We'll make it print all pretty like,

(deftype StateWrapper [value state]
  Object
  (toString [this] (str "(" (.value this) "," (.state this) ")")))

and write some convenience functions to extract its innards:²

(defn get-state [r]
(when (instance? StateWrapper r)
  [(.value r) (.state r)]))

(defn unwrap [r]
  (if (instance? StateWrapper r) (unwrap (.value r)) r))

The deduplicator is now:

(defn t-dedup [rf]
  (fn [result input]
    (debug result input)
    (if-let [[result-val input-prev] (get-state result)]
      (if (= input input-prev)
        result
        (StateWrapper. (rf result-val input) input))
      (StateWrapper. (rf result input) input))))

and we'll also write an uncontroversially stateless transducer that truncates numbers to integers, which will be useful for testing.

(defn t-trunc [reduction-function]
(fn [result input]
    (debug result input)
    (reduction-function result (int input))))

Composition

We'll now take a series of floating point numbers [1.0 2.0 2.5 2.5 3.0] and run them through a stack of composed transducers to deduplicate, truncate and deduplicate again:

playground.transducers> (unwrap (reduce ((comp t-dedup t-trunc t-dedup) conj) [] [1.0 2.0 2.5 2.5 3.0]))
[1 2 3]

Had we applied corresponding transformations to the entire series, rather than as composed transducers, we would see the following steps:

[1.0 2.0 2.5 2.5 3.0]
[1.0 2.0 2.5 3.0]      ; duplicates removed
[1 2 2 3]              ; truncated, revealing more duplication
[1 2 3]                ; duplicates resulting from truncation removed

In transducer land, we're going to get to this result by a different route, having composed the transformation to run on each input iteratively. With so much wrapping and unwrapping going on, it's helpful to turn on debugging. Note that we can tell whether we're in the first or second deduplication by whether the input is, respectively, a double or an integer.

On the first round,

t-dedup [] 1.0  ;; first t-dedup receives empty vector and 1.0; passes down to t-trunc
t-trunc [] 1.0  ;; t-trunc gets empty vector and 1.0; truncates and passes to t-dedup
t-dedup [] 1    ;; second t-dedup receives empty vector and truncated value; passes to conj
test-conj [] 1  ;; performs conj, returns [1]

On the way back up the composed function stack, the second t-dedup wraps the conj'd result and the current input as ([1],1) and passes it up to t-trunc, which passes it up, unmodified, to the first t-dedup, which wraps it again, together with its first input. The current result, which will be passed to the next iteration is thus (([1],1),1.0):

t-dedup (([1],1),1.0) 2.0     ;; First t-dedup compares 1.0 and 2.0;
                                  unwraps and invokes t-trunc
t-trunc ([1],1) 2.0           ;; t-trunc converts to int; invokes 2nd t-dedup
t-dedup ([1],1) 2             ;; 2nd t-dedup compares 1 and 2; unwraps and invokes conj
test-conj [1] 2

t-dedup (([1 2],2),2.0) 2.5  ;; First t-dedup compares 2.0, 2.5;
                                unwraps and invokes t-trunc
t-trunc  ([1 2],2) 2.5       ;; t-trunc converts to int; invokes 2nd t-dedup
t-dedup ([1 2],2) 2          ;; 2nd t-dedup finds 2==2; DOESN'T invoke conj

Skipping over conj, we move forward:

t-dedup (([1 2],2),2.5) 2.5  ;; First t-dedup finds 2.5==2.5; DOESN'T invoke t-trunc
t-dedup (([1 2],2),2.5) 3.0  ;; First t-dedup compares 2.5, 3.0; invokes t-trunc
t-trunc ([1 2],2) 3.0        ;; t-trunc converts to int; invokes 2nd t-dedup
t-dedup ([1 2],2) 3          ;; 2nd dedup compares 2,3; invokes conj
test-conj [1 2] 3

The final [1 2 3] passes up through two layers of wrapping, so the value returned by reduce is (([1 2 3] 3), 3.0), which unwrap peels apart recursively.

Unsolved problems

All this wrapping and unwrapping doesn't come for free, computationally speaking. It seems possible that the JIT will optimize some of it away, but I haven't checked yet.
Having abolished crypto-state, we're a little closer to solving the type problem, but not quite there yet.

To be continued...

Yes, I know I've left off the 0 and 1-arity forms for the returned reduction function. While necessary, in real life, they don't affect the state discussion. ↩
The arity-1 form could in principle embed unwrap, but that wouldn't help for non-terminating processes. ↩

The answer is right in front of us

L'etat, c'est moi

Composition

Unsolved problems

Comments