Lenses are a construct for getting, "setting" or "modifying" values within data structures, especially deeply nested data structures. The quotation marks have the usual meaning when they show up in funktionsprache:[1] not mutating anything per se, but instead producing an object, or reference thereto, that is identical except for the requested change.

In Scala, the need for lenses is pretty glaring, as illustrated in Eugene Yokota's great explanation of lenses in Scalaz, because of the centrality of case classes. In his example problem, a turtle is represented with three case classes:

  case class Point(x: Double, y: Double)
  case class Color(r: Byte, g: Byte, b: Byte)
  case class Turtle(position: Point, heading: Double, color: Color)
  val t = Turtle(Point(2.0, 3.0), 0.0, Color(255.toByte, 255.toByte, 255.toByte))
  // t: Turtle = Turtle(Point(2.0,3.0),0.0,Color(-1,-1,-1))

This is lovely, but if you wanted to change just the x position, you'd have to write

  t.copy(position=t.position.copy(x=42.0))

and you can imagine how truly awful this would look for even more deeply nested case classes.

In Clojure, this sort of thing is a lot easier out of the gate, because records implement map protocols (i.e. clojure.lang.Associative) and because of the ultra-slick assoc-in:

  (defrecord Point [^double x ^double y])
  (defrecord Color [^short r ^short g ^short b])
  (defrecord Turtle [^Point position ^double heading ^Color color])
  (def t (->Turtle (->Point 1.0 2.0) (/ Math/PI 4) (->Color 255 0 0)))
  (assoc-in t [:position :x] 42.)
  ;; #user.Turtle{:position #user.Point{:x 42.0, :y 2.0}, :heading 0.7853981633974483, :color #user.Color{:r 255, :g 0, :b 0}}

but tastiness like this just makes us hungrier.
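The same Associative behaviour means get-in and update-in work on records too, which covers the getting and "modifying" halves of the story. A minimal sketch, using cut-down records (no type hints, no Color) purely to keep it short:

```clojure
;; Simplified stand-ins for the records above.
(defrecord Point [x y])
(defrecord Turtle [position heading])

(def t (->Turtle (->Point 1.0 2.0) 0.0))

;; Reading a nested field.
(get-in t [:position :x])                      ; 1.0

;; "Modifying": apply a function to the nested field, producing a new Turtle.
(def t2 (update-in t [:position :x] + 41.0))
(get-in t2 [:position :x])                     ; 42.0

;; The original turtle is untouched.
(get-in t [:position :x])                      ; 1.0
```

Note that the result is still a Turtle, not a plain map, because assoc on a record field returns a record.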

Before proceeding, I should mention a very nice presentation that takes a more formal (functorish) approach to lenses than I do here.

Note: 2014-09-29

Since this post has been tweeted a bit (well, more than zero times, which is what I expected), I should note that there are two follow-up posts: One responding to some of the comments, and another offering an alternate implementation that, among other things, is compatible with core.typed. Neither, though, will make sense if you haven't read this one.

Use case: Amazon Web Services

I've recently been spending a lot of time wrangling AWS with Clojure. The main route to AWS is Amazon's Java SDK, around which amazonica provides a complete wrapper. In fact, it's better than a wrapper, because the plain SDK is mind-bogglingly tedious - a model citizen in the kingdom of nouns.

For example, to bid on an instance in the spot auction market, you call a method on the AmazonEC2 client:

 RequestSpotInstancesResult requestSpotInstances(RequestSpotInstancesRequest requestSpotInstancesRequest)

where the RequestSpotInstancesRequest class has a do-nothing constructor and a dukedom of .setXYZ methods, the most important of which is

  void  setLaunchSpecification(LaunchSpecification launchSpecification)

in which LaunchSpecification has yet more .set methods, including

  void setNetworkInterfaces(Collection<InstanceNetworkInterfaceSpecification> networkInterfaces)

and the InstanceNetworkInterfaceSpecification is where setSubnetId(String subnetId) lives, so you really end up needing all of these classes.

Amazonica, by contrast, turns this logically nested data into explicitly nested hash-maps, so an entire request can be summarized and constructed as:

  (def req  [:spot-price 0.01, 
                :instance-count 1, 
                :type "one-time", 
                :launch-specification
                     {:image-id      "ami-something",
                      :instance-type "t1.micro",
                      :placement     {:availability-zone "us-east-1a"},
                      :key-name      "your-key"
                      :user-data     "aGVsbG8gc2FpbG9y"
                      :network-interfaces
                            [{:device-index 0
                              :subnet-id    "subnet-yowsa"
                              :groups       ["sg-hubba"]}]
                      :iam-instance-profile
                            {:arn "arn:aws:iam::123456789:instance-profile/name-you-chose"}}])
  (request-spot-instances req)

You can see where this is going. (assoc-in req [:launch-specification :network-interfaces 0 :groups 0] "sg-foo") is nicer than the equivalent Java jive, but it is not a thing of beauty.
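To make that concrete, here is a small sketch of assoc-in reaching through both maps and vectors. It assumes the request is held as a map (called req-map here, a name chosen for illustration) and trims it down to the fields that matter:

```clojure
;; A cut-down request held as a nested map; req-map is an illustrative name.
(def req-map
  {:spot-price 0.01
   :launch-specification
   {:instance-type "t1.micro"
    :network-interfaces [{:device-index 0
                          :subnet-id    "subnet-yowsa"
                          :groups       ["sg-hubba"]}]}})

;; Keywords index into maps, integers index into vectors.
(def req-map2
  (assoc-in req-map
            [:launch-specification :network-interfaces 0 :groups 0]
            "sg-foo"))

(get-in req-map2 [:launch-specification :network-interfaces 0 :groups])  ; ["sg-foo"]
(get-in req-map  [:launch-specification :network-interfaces 0 :groups])  ; ["sg-hubba"]
```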

Also, take a look at that :user-data field. The erudition of my readers is such that they will immediately recognize this as the base-64 encoding of "hello sailor", but Clojure is not that clever (yet). And it gets worse. Using amazonica with Amazon's Simple Notification and Simple Queue Services (SNS/SQS), you may encounter a response that

  • comes to us as a map
  • one of whose values is a JSON-encoded string
  • which in turn contains another base-64 encoded string,
  • which for me happened to contain yet more name-value pairs.

Can we find a way to deal with such complexity in a way that feels simple?

Want, want, want!

What I want is:

  1. Paths that allow arbitrary transformations along the lookup/retrieval path.
  2. A convenient way to specify a dictionary of aliases to such paths.
  3. The usual lensy guff of special-purpose getters, setters and updaters.

Paths with arbitrary transformations

In the example AWS request above, we would like to set and get the :user-data field as a plain old string, without worrying about the encoding. Specifically, I'd like two functions:

    (ph-get-in m ks)
    (ph-assoc-in m ks v)

which work just like their ph-less forebears, but accommodate special transformation entries of the form

   [f-incoming f-outgoing]

where f-incoming transforms data on its way into the map, and f-outgoing transforms entries read from the map. For :user-data, we'll have[2]

   (def upath [:launch-specification :user-data [s->b64 b64->s]])

so (ph-get-in req-map upath) should return "hello sailor", and (ph-assoc-in req-map upath "lord love a duck") returns a map where the :user-data field is "bG9yZCBsb3ZlIGEgZHVjaw==".
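As a sanity check on those strings, here is a sketch of the two conversion functions. The footnote builds them on clojure.data.codec.base64; to stay dependency-free this version assumes Java 8 or later and uses java.util.Base64 instead:

```clojure
(import 'java.util.Base64)

;; Assumption: java.util.Base64 stands in for clojure.data.codec.base64.
(defn s->b64 [^String s]
  (.encodeToString (Base64/getEncoder) (.getBytes s "UTF-8")))

(defn b64->s [^String s]
  (String. (.decode (Base64/getDecoder) s) "UTF-8"))

(s->b64 "hello sailor")                 ; "aGVsbG8gc2FpbG9y"
(b64->s "bG9yZCBsb3ZlIGEgZHVjaw==")     ; "lord love a duck"

;; The pair goes at the end of the path: incoming first, outgoing second.
(def upath [:launch-specification :user-data [s->b64 b64->s]])
```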

Let's start by looking at the recursive definition of the existing clojure.core/assoc-in:

(defn assoc-in [m [k & ks] v]
  (if ks
    (assoc m k (assoc-in (get m k) ks v))
    (assoc m k v)))

which we'll modify very slightly to detect a final k of the form [f-incoming f-outgoing]:

(defn ph-assoc-in-v1 [m [k & ks] v]
  (cond
   (vector? k) ((first k) v)
   ks          (assoc m k (ph-assoc-in-v1 (get m k) ks v))
   k           (assoc m k v)
   :else       v))

and just apply f-incoming to the value v.
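A quick sketch of version 1 in use, with the recursive call pointing at ph-assoc-in-v1 itself so that the block stands alone. The [inc dec] pair is just a toy transformation:

```clojure
(defn ph-assoc-in-v1 [m [k & ks] v]
  (cond
    (vector? k) ((first k) v)                               ; end of path: apply f-incoming
    ks          (assoc m k (ph-assoc-in-v1 (get m k) ks v)) ; descend one level
    k           (assoc m k v)                               ; ordinary last key
    :else       v))

;; Without a transformation it behaves like assoc-in.
(ph-assoc-in-v1 {:a {:b 1}} [:a :b] 5)             ; {:a {:b 5}}

;; With a trailing [f-incoming f-outgoing] pair, the value is
;; passed through f-incoming (here inc) on its way in.
(ph-assoc-in-v1 {:a {:b 1}} [:a :b [inc dec]] 5)   ; {:a {:b 6}}
```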

But... that isn't exactly what I asked for! I want to allow arbitrary transformations anywhere along the path, not just at the very end. For instance, I want to be able to get at the :c value in {:a 1 :b "{:c 2}"} so that

(ph-assoc-in {:a 1 :b "{:c 2}"}
             [:b [pr-str read-string] :c [inc dec]] 5)

should return {:b "{:c 6}", :a 1}, both incrementing the :c value and putting it back into string form.

In version 1, the (vector? k) clause just assumed we were at the end of the path and applied f-incoming, but if we're in the middle of the path (e.g. we just encountered "{:c 2}"), we need to

  1. transform whatever weird thing we found (in the example above, a string) back into a clojure.lang.Associative (in the example, a map) by calling the outgoing function (read-string, in this case),

    (let [[f-incoming f-outgoing] k
          m  (f-outgoing m)
    
  2. insert the value into the next inner layer by recursively calling ourselves (exactly as assoc-in does),

          m  (ph-assoc-in m ks v)]
    
  3. and transform the modified map back into its original form (here, with pr-str),

       (f-incoming m))
    
  4. to be returned to our recursive caller.

Putting that all together, along with handling the case where we are at the end of the path:

(defn ph-assoc-in [m [k & ks] v]
  ;;(println "   m=" (pr-str m) "k=" k)
  (cond
   (vector? k) (let [[f-incoming f-outgoing] k]
                  (f-incoming (if-not ks v (ph-assoc-in (f-outgoing m) ks v))))
   ks          (assoc m k (ph-assoc-in (get m k) ks v))
   k           (assoc m k v)
   :else       v))

That's better. We can now transform whatever we find in the nested map, wherever we find it. Uncommenting the println, we can see what happens very clearly:

acyclic.utils.pinhole> (ph-assoc-in {:a 1 :b "{:c 2}"} [:b [pr-str read-string] :c [inc dec]] 5)
   m= {:b "{:c 2}", :a 1} k= :b
   m= "{:c 2}" k= [#<core$pr_str clojure.core$pr_str@6006f3fa> #<core$read_string clojure.core$read_string@246a3dc1>]
   m= {:c 2} k= :c
   m= 2 k= [#<core$inc clojure.core$inc@196fe8b1> #<core$dec clojure.core$dec@7c9ab3af>]
{:b "{:c 6}", :a 1}

When we encounter "{:c 2}", the next "key" in the sequence is [pr-str read-string], so read-string gets applied to it before it's passed recursively to ph-assoc-in, which means that the :c that's next gets applied to {:c 2}. At the innermost recursion the 5 gets incremented, and then we tumble back up the stack assoc-ing everything back into place.
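The same function handles the :user-data case from the AWS request. A sketch, assuming a trimmed-down request map (req-map, an illustrative name) and base-64 helpers built on java.util.Base64 (Java 8 or later) rather than the clojure.data.codec versions in the footnote:

```clojure
(import 'java.util.Base64)

;; Assumption: java.util.Base64 stands in for clojure.data.codec.base64.
(defn s->b64 [^String s]
  (.encodeToString (Base64/getEncoder) (.getBytes s "UTF-8")))
(defn b64->s [^String s]
  (String. (.decode (Base64/getDecoder) s) "UTF-8"))

(defn ph-assoc-in [m [k & ks] v]
  (cond
    (vector? k) (let [[f-incoming f-outgoing] k]
                  (f-incoming (if-not ks v (ph-assoc-in (f-outgoing m) ks v))))
    ks          (assoc m k (ph-assoc-in (get m k) ks v))
    k           (assoc m k v)
    :else       v))

;; Illustrative, cut-down request.
(def req-map {:spot-price 0.01
              :launch-specification {:instance-type "t1.micro"
                                     :user-data     "aGVsbG8gc2FpbG9y"}})

(def upath [:launch-specification :user-data [s->b64 b64->s]])

;; We hand over a plain string; the map ends up holding the encoded form.
(def req-map2 (ph-assoc-in req-map upath "lord love a duck"))
(get-in req-map2 [:launch-specification :user-data])   ; "bG9yZCBsb3ZlIGEgZHVjaw=="
```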

The corresponding get-in is a bit simpler, because we have no need for f-incoming, but f-outgoing is still used to transform weird intermediate values into Associatives:

(defn ph-get-in [m [k & ks]]
  ;;(println "   m=" (pr-str m) "k=" k)
  (cond
   (vector? k) (let [[_ f-outgoing] k] (ph-get-in (f-outgoing m) ks))
   k           (ph-get-in (get m k) ks)
   :else       m))

Turning on printing:

acyclic.utils.pinhole> (ph-get-in {:a 1 :b "{:c 2}"} [:b [pr-str read-string] :c [inc dec]])
   m= {:b "{:c 2}", :a 1} k= :b
   m= "{:c 2}" k= [#<core$pr_str clojure.core$pr_str@6006f3fa> #<core$read_string clojure.core$read_string@246a3dc1>]
   m= {:c 2} k= :c
   m= 2 k= [#<core$inc clojure.core$inc@196fe8b1> #<core$dec clojure.core$dec@7c9ab3af>]
   m= 1 k= nil
1
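A path need not end on a transformation, and it need not end at a leaf. A small sketch of ph-get-in stopping at various depths; in this version f-outgoing is applied to the current value m before recursing on the rest of the path:

```clojure
(defn ph-get-in [m [k & ks]]
  (cond
    (vector? k) (let [[_ f-outgoing] k] (ph-get-in (f-outgoing m) ks))
    k           (ph-get-in (get m k) ks)
    :else       m))

(def m {:a 1 :b "{:c 2}"})

(ph-get-in m [:b])                               ; "{:c 2}"  (the raw string)
(ph-get-in m [:b [pr-str read-string]])          ; {:c 2}    (parsed into a map)
(ph-get-in m [:b [pr-str read-string] :c])       ; 2
(ph-get-in m [:b [pr-str read-string] :c [inc dec]])  ; 1

;; Missing keys yield nil, as with get-in.
(ph-get-in m [:x :y])                            ; nil
```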

A dictionary of aliases

In my AWS example, there are several fields I will want to change frequently, and it would be handy to have a dictionary of aliases to the complex paths to these fields, e.g.:

(def path-dict
  {:zone    [:launch-specification :placement :availability-zone]
   :public? [:launch-specification :network-interfaces 0 :associate-public-ip-address]
   :udata   [:launch-specification :user-data [s->b64 b64->s]]} )

Moreover, I shouldn't have to remember when I'm using an alias, so it would be nice to have functions that work like regular get/assoc when called with a keyword that isn't an alias for something else, and that also accept a full path. Let's pull this massaging process into its own function, which returns a path, almost irrespective of what it's fed:

(defn condition-key [path-dict k]
     (cond
      (sequential? k) k
      (path-dict k)   (path-dict k)
      :else           [k]))

With this, ph-get is essentially trivial.

(defn ph-get [path-dict o k] (ph-get-in o (condition-key path-dict k)))

Now (ph-get path-dict req-map :spot-price) will return the unadulterated field, while (ph-get path-dict req-map :udata) does a fancy base-64 transformation.

As clojure.core/assoc can take an arbitrary number of interleaved key-value arguments, we should be able to do so as well. We'll knead the arguments into a list of key-value pairs and reduce over ph-assoc-in to insert the values sequentially.

(defn ph-assoc [paths m & kvs]
  (reduce (fn [m [k v]]
            (ph-assoc-in m (condition-key paths k) v)) m (partition 2  kvs)))
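Here is a sketch of the dictionary functions working together on aliases, plain keys and full paths. The request map and the :notes field (an EDN string, with :owner aliased into it) are made up for illustration:

```clojure
(defn ph-get-in [m [k & ks]]
  (cond
    (vector? k) (let [[_ f-outgoing] k] (ph-get-in (f-outgoing m) ks))
    k           (ph-get-in (get m k) ks)
    :else       m))

(defn ph-assoc-in [m [k & ks] v]
  (cond
    (vector? k) (let [[f-incoming f-outgoing] k]
                  (f-incoming (if-not ks v (ph-assoc-in (f-outgoing m) ks v))))
    ks          (assoc m k (ph-assoc-in (get m k) ks v))
    k           (assoc m k v)
    :else       v))

(defn condition-key [path-dict k]
  (cond
    (sequential? k) k               ; already a path
    (path-dict k)   (path-dict k)   ; an alias
    :else           [k]))           ; a plain key

(defn ph-get [path-dict o k] (ph-get-in o (condition-key path-dict k)))

(defn ph-assoc [paths m & kvs]
  (reduce (fn [m [k v]] (ph-assoc-in m (condition-key paths k) v))
          m (partition 2 kvs)))

;; Hypothetical data: :notes is an invented field holding an EDN string.
(def req-map {:spot-price 0.01
              :launch-specification {:placement {:availability-zone "us-east-1a"}
                                     :notes "{:owner \"me\"}"}})

(def path-dict
  {:zone  [:launch-specification :placement :availability-zone]
   :owner [:launch-specification :notes [pr-str read-string] :owner]})

(ph-get path-dict req-map :zone)         ; "us-east-1a"  (alias)
(ph-get path-dict req-map :spot-price)   ; 0.01          (plain key)
(ph-get path-dict req-map :owner)        ; "me"          (alias with a transformation)

(def req-map2
  (ph-assoc path-dict req-map :zone "us-west-2b" :spot-price 0.02 :owner "you"))
(get-in req-map2 [:launch-specification :notes])   ; "{:owner \"you\"}"
```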

The lensy guff

The scalaz tutorial shows how to create a lens, turtleX, which focuses directly on terrapinic abscissae:

scala> turtleX.set(t0, 5.0)
res17: scalaz.Id.Id[Turtle] = Turtle(Point(5.0,3.0),0.0,Color(-1,-1,-1))

Take a look, if you like, at how turtleX is built. To my taste, it's a bit lengthy and boilerplated,[3] though admittedly you get lots of type goodness in the bargain.

I'd like to be able to make the lens with a one-liner:

(def txget (mk-ph-get [:position :x]))

and then use (or pass about) txget as a normal function, (txget t), for retrieving the x position.

With what we have so far, the necessary machinery is a two-liner:

(defn mk-ph-get [ks] (fn [o] (ph-get-in o ks)))
(defn mk-ph-set [ks] (fn [o v] (ph-assoc-in o ks v)))
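A sketch of the two makers in use, with cut-down turtle records (no Color, no type hints) to keep the block short:

```clojure
(defn ph-get-in [m [k & ks]]
  (cond
    (vector? k) (let [[_ f-outgoing] k] (ph-get-in (f-outgoing m) ks))
    k           (ph-get-in (get m k) ks)
    :else       m))

(defn ph-assoc-in [m [k & ks] v]
  (cond
    (vector? k) (let [[f-incoming f-outgoing] k]
                  (f-incoming (if-not ks v (ph-assoc-in (f-outgoing m) ks v))))
    ks          (assoc m k (ph-assoc-in (get m k) ks v))
    k           (assoc m k v)
    :else       v))

(defn mk-ph-get [ks] (fn [o] (ph-get-in o ks)))
(defn mk-ph-set [ks] (fn [o v] (ph-assoc-in o ks v)))

;; Simplified stand-ins for the turtle records.
(defrecord Point [x y])
(defrecord Turtle [position heading])
(def t (->Turtle (->Point 1.0 2.0) (/ Math/PI 4)))

(def txget (mk-ph-get [:position :x]))
(def txset (mk-ph-set [:position :x]))

(txget t)                       ; 1.0
(txget (txset t 42.0))          ; 42.0

;; Being ordinary functions, they can be handed to map and friends.
(map txget [t (txset t 5.0)])   ; (1.0 5.0)
```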

What about modification in place? To start, imagine a function to turn the turtle to the right by some amount. Our user would provide a function that takes the original angle and the amount to turn and returns the new angle,[4]

  (defn new-heading [old-heading amt] (mod (- old-heading amt) (* Math/PI 2)))

and we'll provide mk-ph-mod such that

  (def turn (mk-ph-mod new-heading [:heading]))

so that

acyclic.utils.pinhole> (turn t (/ Math/PI 2))
   m= #user.Turtle{:position #user.Point{:x 1.0, :y 2.0}, :heading 0.7853981633974483, :color #user.Color{:r 255, :g 0, :b 0}} k= :heading
#user.Turtle{:position #user.Point{:x 1.0, :y 2.0}, :heading 5.497787143782138, :color #user.Color{:r 255, :g 0, :b 0}}

Cake city. We ph-get-in the heading, apply their function to it, and ph-assoc-in it back:

(defn mk-ph-mod-v1 [f arg-path]
  (fn [o & more-args] (ph-assoc-in o  arg-path (apply f (ph-get-in o arg-path) more-args))))
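A self-contained sketch of this first version, using a plain nested map in place of the Turtle record (any Associative will do). The double-x function is a made-up second example showing that extra run-time arguments are optional:

```clojure
(defn ph-get-in [m [k & ks]]
  (cond
    (vector? k) (let [[_ f-outgoing] k] (ph-get-in (f-outgoing m) ks))
    k           (ph-get-in (get m k) ks)
    :else       m))

(defn ph-assoc-in [m [k & ks] v]
  (cond
    (vector? k) (let [[f-incoming f-outgoing] k]
                  (f-incoming (if-not ks v (ph-assoc-in (f-outgoing m) ks v))))
    ks          (assoc m k (ph-assoc-in (get m k) ks v))
    k           (assoc m k v)
    :else       v))

(defn mk-ph-mod-v1 [f arg-path]
  (fn [o & more-args]
    (ph-assoc-in o arg-path (apply f (ph-get-in o arg-path) more-args))))

(defn new-heading [old-heading amt] (mod (- old-heading amt) (* Math/PI 2)))

;; A plain map standing in for the turtle.
(def t {:position {:x 1.0 :y 2.0} :heading (/ Math/PI 4)})

(def turn (mk-ph-mod-v1 new-heading [:heading]))
(:heading (turn t (/ Math/PI 2)))   ; roughly 5.4978, i.e. 7π/4

;; Hypothetical second modifier: no run-time arguments at all.
(def double-x (mk-ph-mod-v1 #(* 2 %) [:position :x]))
(get-in (double-x t) [:position :x])   ; 2.0
```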

But we can do better...

Turtles all the way

The scalaz lens tutorial ends with a lens for moving the turtle forward by some amount in whatever direction it's currently pointed, so both the x and y coordinates change:

scala> forward(10.0)(t0)
res31: (Turtle, (Double, Double)) = (Turtle(Point(12.0,3.0),0.0,Color(-1,-1,-1)),(12.0,3.0))

Let's see if we can do the same, ideally in an intuitive, Clojurey way:

We'll tweak mk-ph-mod a bit, to accept a user function that consumes and returns an arbitrary number of path-specified values. Our turtle enthusiast will provide the trigonometry in the form of a function that now returns a vector of values:

(defn movexy [x y dir dist] [(+ x (* dist (Math/cos dir)))
                             (+ y (* dist (Math/sin dir)))])

And we'd like to be able to use it like this:

user> (def turtle-forward (mk-ph-mod movexy [:position :x] [:position :y] [:heading]))
user> (turtle-forward t 100.)
#user.Turtle{:position #user.Point{:x 71.71067811865476, :y 72.71067811865474}, :heading 0.7853981633974483, :color #user.Color{:r 255, :g 0, :b 0}}

All we need is mk-ph-mod, which will

  1. return a function of the object and maybe some run-time arguments,

    (defn mk-ph-mod [f & arg-paths]
       (fn [o & more-args]
    
  2. that extracts values for all the paths specified as the 2nd and subsequent arguments to mk-ph-mod,

          (let [args (map (partial ph-get-in o) arg-paths)
    
  3. passes them to the user function, expecting it to return a vector of new values for one or more of those paths,

                vs   (apply f (concat args more-args))
    
  4. creates path/value pairs for all the values returned, and

                kvs  (map vector arg-paths vs)]
    
  5. reduces over the pairs, tucking them into the structure with ph-assoc-in.

              (reduce (fn [m [k v]] (ph-assoc-in m k v)) o kvs))))
    

All together:

(defn mk-ph-mod [f & arg-paths]
    (fn [o & more-args]
      (let [args (map (partial ph-get-in o) arg-paths)
            vs   (apply f (concat args more-args))
            kvs  (map vector arg-paths vs)]
        (reduce (fn [m [k v]] (ph-assoc-in m k v)) o kvs))))

Et voila.
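Two more small sketches of mk-ph-mod, on toy data rather than turtles. The first swaps two fields; the second shows that the argument paths may contain transformations, so a modifier can reach into a string-encoded map. Both swap-ab and bump-c are names invented for this illustration:

```clojure
(defn ph-get-in [m [k & ks]]
  (cond
    (vector? k) (let [[_ f-outgoing] k] (ph-get-in (f-outgoing m) ks))
    k           (ph-get-in (get m k) ks)
    :else       m))

(defn ph-assoc-in [m [k & ks] v]
  (cond
    (vector? k) (let [[f-incoming f-outgoing] k]
                  (f-incoming (if-not ks v (ph-assoc-in (f-outgoing m) ks v))))
    ks          (assoc m k (ph-assoc-in (get m k) ks v))
    k           (assoc m k v)
    :else       v))

(defn mk-ph-mod [f & arg-paths]
  (fn [o & more-args]
    (let [args (map (partial ph-get-in o) arg-paths)
          vs   (apply f (concat args more-args))
          kvs  (map vector arg-paths vs)]
      (reduce (fn [m [k v]] (ph-assoc-in m k v)) o kvs))))

;; Two paths in, two values out.
(def swap-ab (mk-ph-mod (fn [a b] [b a]) [:a] [:b]))
(swap-ab {:a 1 :b 2})                 ; {:a 2, :b 1}

;; One path through a string-encoded map, plus a run-time argument.
(def bump-c (mk-ph-mod (fn [c n] [(+ c n)]) [:b [pr-str read-string] :c]))
(bump-c {:a 1 :b "{:c 2}"} 10)        ; {:a 1, :b "{:c 12}"}
```

Because the path/value pairs are built with (map vector arg-paths vs), a user function that returns fewer values than there are paths only updates the leading paths, which is how the turtle example gets away with reading the heading without writing it.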

We can now trivially build functions that do nearly anything to a structure of nested Associatives.

Idiomatic Lenses for Clojure

I'm anticipating complaints that these aren't real lenses, because those would be defined by the Haskell and Scalaz varieties that came first, and a huge piece of what they do is ensure type correctness. The "pinhole" sobriquet is supposed to preempt such whiners by proudly owning the more primitive technology.

Another possible complaint is that Clojure already has assoc-in and get-in, and it's true that they are quite sufficient for many cases, and if they're not lenses, then neither is a slightly more complicated variety.

Nonetheless, whatever it is that we have here is useful for the thing that lenses are useful for, irrespective of language: focusing on data deep within a structure. What's more, thanks to the expressive power of dynamic Clojure, and higher order functions, these lenses are not just simple to use but simple to create.

The code discussed all lives on github along with some other utilities I'm still hacking at. At some point, they'll make their way to clojars.

    (ph-get-in m path)
    (ph-assoc-in m path v)
    (ph-assoc path-dict m path1 v1 path2 v2 ...)
    (ph-get path-dict m path-or-key-or-alias)
    (ph-assoc path-dict m path-or-key-or-alias v)
    (mk-ph-set path) or                                 (mk-ph-set path-dict path-or-key-or-alias)
    (mk-ph-get path) or                                 (mk-ph-get path-dict path-or-key-or-alias)
    (mk-ph-mod f arg-path1 arg-path2 ...) or            (mk-ph-mod path-dict f arg-key1 arg-key2 ...)

  1. I don't know German. 

  2. The functions are defined using clojure.data.codec.base64 as (defn s->b64 [s] (String. (b64/encode (.getBytes s)))) and (defn b64->s [s] (String. (b64/decode (.getBytes s))))

  3. An etymologically naive back-formation, but I like it. 

  4. Subtraction because, by convention, zero radians points to 3:00, with increases moving counterclockwise, so a right turn decreases the angle. Modulus, because the circle is 2π around.


