Around a year ago, there was a lively debate about the type invariance of the immutable Set in Scala. Dogpile argumentation on a subject far outside the popular interest is of course thrilling in itself, but the topic also provides a nice focal point for exploring and clarifying some important aspects of the Scala type system.

We recall that Scala collections (and other higher kinded classes) can be invariant, covariant or contravariant in their type parameters, corresponding repectively to declarations as class Whatever[A], class Whatever[+A] or class Whatever[-A].

  • In the case of covariance, a Whatever[B] will be a subclass of Whatever[A] if B is a subclass of A.
  • With contravariance, a Whatever[B] will be a superclass of Whatever[A] if B is a subclass of A.
  • With invariance, Whatever[A] and Whatever[B] are totally separate classes having nothing to do with each other.

To be even more precise about what this means, consider the Liskov Substitution Principle, which has an intimidating ring to it, but comes down to something simple: B is a subclass of A if objects of type B can be substituted everywhere an A is expected. It's unlikely that you'll need to recite this to yourself when dealing with simple, unparameterized types, but it's but it starts to get interesting for, say, functions. The classic example is a callback, e.g.

   class Event ...
   class ClickEvent extends Event ...
   def registerCallback(cb : ClickEvent => Status) : Unit = ...
   registerClickCallback(cb : ClickEvent => new Status(s"We just had a ${cb}"))

When somebody clicks, a ClickEvent object will be created and passed to the cb, which will examine it and somehow convert it into a Status object.

But really, we should be allowed to register a callback that can deal with any Event, not just ClickEvents, i.e. this should be OK:

  registerClickCallback(cb : Event => new Status(s"We just had a ${cb}"))

Indeed it is OK, i.e.

  1. we can substitute an Event => Status when a ClickEvent => Status is expected, which means that
  2. Event => Status is a subclass of ClickEvent => Status, even though
  3. Event is a superclass of ClickEvent.
  4. Thus

function types are contravariant in their argument type.

Indeed, the Scala library defines

  trait Function1[-T1, +R] {
     abstract def apply(v : T1) : R
  }

Of course this declaration says not just that functions are contravariant in their argument type, but also that they're covariant in their return type. That means we could do:

  class SimpleStatus extends Status ...
  registerClickCallback(cb : Event => new SimpleStatus(s"We just had a ${cb}"))

which is right and just, as the callback registerer can handle any kind of Status, and we should be able to go ahead and return a specific sort. In short, Function[ClickEvent,Status] is a subclass of Function1[Event,SimpleStatus].

It should now not be surprising that

immutable List is declared covariantly

as List[+T]. If there were a function

  def printOutEvents(es : List[Event])

we should be able to substitute a List[ClickEvent]. And we can. Note that the mutable list is declared invariantly as mutable.List[T], which reflects the fact that all hell would break loose if

  def appendNewEvents(es : mutable.List[Event])

could be passed mutable.list[ClickEvent] and then shove KeyboardEvents into it.

As usual, life in Java-land is less beautiful, but life in general is too short to talk about such things.

Instead, let's go back to the original question: Why are Sets declared invariantly as Set[T] when immutable collections like List and Vector are covariant? Why can't I define

  def printOutEvents(es : Set[Event]) { for (e <- es) println(e) }

and pass it a Set[ClickEvent]?

To answer that question (without reading the mailthread),

it will be nice to have a little tooling.

Given a class, I want to be able to trace its entire inheritance, trait and interface implementation graph all the way up to Object, in a nice outline form, like this:

scala> inhGraph(classOf[Integer])
class java.lang.Integer
 interface java.lang.Comparable
 class java.lang.Number
  interface java.io.Serializable
  class java.lang.Object

So we see not just that Integer is Serializable and Comparable, but that it got to the former via Number. This simple outline form can become much more complicated, not only as the hierarchy gets deeper, but as it recombines due to diamond inheritance. In the case of recombination, we'll print out ellipses "..." the second time we encounter a class, just to keep things tidy.

Circularly enough, keeping track of what we've seen will require a Set, and if your hair-trigger functional conscience requires that Set to be immutable, there are implementation implications. Should you care, you can read about them.

To cut to the chase,

here's the scoop on Set:

scala> inhGraph(classOf[Set[_]])
interface scala.collection.immutable.Set
 interface scala.collection.immutable.Iterable
  interface scala.collection.immutable.Traversable
   interface scala.collection.Traversable
    interface scala.collection.TraversableLike
     interface scala.collection.generic.HasNewBuilder
     interface scala.collection.generic.FilterMonadic
     interface scala.collection.TraversableOnce
      interface scala.collection.GenTraversableOnce
     interface scala.collection.GenTraversableLike
      interface scala.collection.GenTraversableOnce
      ...
      interface scala.collection.Parallelizable
    interface scala.collection.GenTraversable
     interface scala.collection.GenTraversableLike
     ...
     interface scala.collection.generic.GenericTraversableTemplate
      interface scala.collection.generic.HasNewBuilder
      ...
   interface scala.Immutable
  interface scala.collection.Iterable
   interface scala.collection.Traversable
   ...
   interface scala.collection.GenIterable
    interface scala.collection.GenIterableLike
     interface scala.collection.GenTraversableLike
     ...
    interface scala.collection.GenTraversable
    ...
   interface scala.collection.IterableLike
    interface scala.Equals
    interface scala.collection.TraversableLike
    ...
    interface scala.collection.GenIterableLike
    ...
 interface scala.collection.Set
  interface scala.collection.Iterable
  ...
  interface scala.collection.GenSet
   interface scala.collection.GenSetLike
    interface scala.collection.GenIterableLike
    ...
    interface scala.Function1
    interface scala.Equals
    ...
   interface scala.collection.GenIterable
   ...
   interface scala.collection.generic.GenericSetTemplate
    interface scala.collection.generic.GenericTraversableTemplate
    ...
  interface scala.collection.SetLike
   interface scala.collection.IterableLike
   ...
   interface scala.collection.GenSetLike
   ...
   interface scala.collection.generic.Subtractable

If you look carefully, you'll see that Set inherits from Function1 across five generations. Deleting a lot of lines, it looks like

interface scala.collection.immutable.Set
 interface scala.collection.Set
  interface scala.collection.GenSet
   interface scala.collection.GenSetLike
    interface scala.Function1

More specifically, a Set[T] may be substituted anywhere a Function1[T,Boolean] is expected. Since, as we know, Function1 is contravariant, there is no way that Set can be covariant. On the other hand, it directly implements the immutable Iterable, which is covariant, so there's no way that Set can be contravariant. The only option is invariance. To hammer in the general principal here: a class cannot be substituted for another if its variance is incompatible.

Tada.

The mystery isn't quite cleared up though.

First of all, Martin Odersky's explanation is slightly different - essentially blaming invariance on an implementation detail. Still, no matter what the implementation, you still couldn't provide Function1 and still be covariant. I hesitate to say that he's wrong, but he might be wrong.

Second, it isn't obvious why Set has to implement Function at all. Presumably, it's so we can use it as a sort of validator, for example as an argument to List.filter

   aListOfThings.filter(setOfThings)

but I can't imagine a national day of mourning if one had to write

   aListOfThings.filter(setOfThings.contains(_))

instead. There is nothing intrinsic about sets that requires them to pass as functions. At best, the feature is syntactic sugar - and sugar of diminished utility in a language that lets you define functions as compactly as Scala does.

All that said,

this should not matter to you if you lead a virtuous life,

where virtue here is all about using type signatures that enforce what you care about and not what you don't care about. I complained above that you couldn't pass Set[ClickEvent] to a function printOutEvents(es : Set[Event]), but in truth that function has very little reason to exist. What I should write instead is

  def printOutEvents(es : TraversableOnce[Event]) { for (e <- es) println(e) }

Set inherits from TraversableOnce

interface scala.collection.immutable.Set
 interface scala.collection.immutable.Iterable
  interface scala.collection.immutable.Traversable
   interface scala.collection.Traversable
    interface scala.collection.TraversableLike
     interface scala.collection.TraversableOnce

so we can call this new function with Set[Event] if we like, and TraversableOnce is covariant, so we can call it with Set[AmusingEvent]. We express no opinion at all on some other points, like whether we could have looped over the contents more than once, or whether the container needs a union method.

If you don't have TraversableOnce at the tips of our typing fingers, get thee to the API and spend some time there. The scaladoc web app is actually a bit fancier than javadocs, and you might want to read a good introduction to its capabilities.



Comments

comments powered by Disqus