Tuesday, July 04, 2006

Covariance and Contravariance in .NET, Java and C++

Prompted by a Microsoft Research paper I read recently and a post on the MS .NET newsgroups, I investigated covariance and contravariance support in .NET, and contrasted it with the support in Java.

I won't try to describe covariance or contravariance directly; hopefully they'll be clear from the examples. A brief description of covariance and contravariance can be found here (Wikipedia).

To begin with, I'll enumerate some of the different kinds of variance supported in C#, Common Language Infrastructure (CLI), C++ and Java, as far as I know them, along with the definitions I'll use in this article. These definitions aren't official in any sense I'm aware of.

  1. Override Variance

    This variance refers to the parameters and return types of an overridden method in a descendant class. C++ supports override covariance of return types.

  2. Definition-site Generic Variance

    With this kind of variance, the generic type, as part of its definition, defines how the subtype relation applies to instantiations of the generic type when the type arguments are themselves related by the subtype relation. The CLI (and thus the CLR) supports definition-site generic variance.

  3. Use-site Generic Variance

    With this kind of variance, the declaration of a generic variable (i.e. a parameter, local or field) defines whether it is assignment-compatible with generic instantiations whose type arguments are more derived (covariant) or less derived (contravariant). Java wildcards are an implementation of use-site generic variance.

  4. Array covariance

    Java supports covariance of arrays of object types. This covariance isn't fully sound with respect to the type system at compile time, because arrays are mutable. Thus, run-time checks are used to patch up the hole. C# and the CLI support this feature chiefly to support Java on the CLI. To demonstrate the hole:

    Dog[] dogs = new Dog[10];
    Mammal[] mammals = dogs;
    mammals[0] = new Cat();
    
    The above code type-checks at compile time under the Java definition of array covariance, but of course it isn't actually type-safe: in Java, the assignment on the last line fails at run time with an ArrayStoreException.

  5. Delegate variance

    C# supports delegate variance only at the point of binding. The method to which a delegate value is bound may have a covariant return type and contravariant argument types. Once the delegate is bound, it is not assignment-compatible with another delegate type even where the underlying method would be compatible according to variance rules.
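
The array covariance hole from item 4 can be reproduced directly in Java. What follows is a minimal sketch: the class ArrayCovarianceDemo and the method storeIsRejected are names made up for this illustration. The store compiles, and the run-time check then rejects it.

```java
class Mammal {}
class Dog extends Mammal {}
class Cat extends Mammal {}

// Hypothetical demo class, not part of the original article.
class ArrayCovarianceDemo {
    // Returns true if storing a Cat through the Mammal[] view is rejected at run time.
    static boolean storeIsRejected() {
        Dog[] dogs = new Dog[10];
        Mammal[] mammals = dogs;        // allowed: Java arrays are covariant
        try {
            mammals[0] = new Cat();     // compiles, but the array is really a Dog[]
            return false;
        } catch (ArrayStoreException e) {
            return true;                // the run-time check patches the hole
        }
    }

    public static void main(String[] args) {
        System.out.println("store rejected: " + storeIsRejected());
    }
}
```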

I'll drill a bit deeper into the first three of these, since most developers should be familiar with the last two.

Override Variance

This variance refers to the parameters and return types of an overridden method in a descendant class. In principle, return types and out parameters may be covariant, input parameters may be contravariant, and in-out parameters must be invariant. Of these, C++ supports only covariance of return types. C++ example:
class Mammal
{
public:
    virtual Mammal* GetValue();
};

class Dog : public Mammal
{
public:
    virtual Dog* GetValue();
};
The equivalent example to demonstrate contravariance of input parameters can't be written in C++, since C++ doesn't support it. If one could declare it, it would look a bit like the following C#:
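class DogComparer
{
    public virtual int Compare(Dog left, Dog right)
    {
        return 0; // comparison logic elided
    }
}

class MammalComparer : DogComparer
{
    // Hypothetical: C# rejects this override because the parameter types differ.
    public override int Compare(Mammal left, Mammal right)
    {
        return 0; // comparison logic elided
    }
}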
Note the difference: the inheritance relationship is the other way around. That's where the 'contra' comes in. The arguments' subtype relationship is the opposite of the outer type's subtype relationship.

It's also intuitively true. A comparer of mammals is naturally also a comparer of dogs, since dogs are a subtype of mammal - thus a comparer of mammals is a subtype of a comparer of dogs!
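
The intuition can be seen at work in the Java library, although there it goes through a wildcard rather than through overriding: Collections.sort declares its comparator parameter as Comparator<? super T>, so a comparer of mammals is accepted when sorting dogs. A minimal sketch, in which the weight field and the classes MammalByWeight and ComparerDemo are assumptions made up for the example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class Mammal {
    final int weight;   // hypothetical field, only here to give the comparer something to compare
    Mammal(int weight) { this.weight = weight; }
}

class Dog extends Mammal {
    Dog(int weight) { super(weight); }
}

// A comparer written purely in terms of Mammal.
class MammalByWeight implements Comparator<Mammal> {
    public int compare(Mammal left, Mammal right) {
        return Integer.compare(left.weight, right.weight);
    }
}

class ComparerDemo {
    static List<Integer> sortedDogWeights() {
        List<Dog> dogs = new ArrayList<Dog>();
        dogs.add(new Dog(30));
        dogs.add(new Dog(10));
        dogs.add(new Dog(20));
        // A Comparator<Mammal> is used where a comparer of dogs is needed.
        Collections.sort(dogs, new MammalByWeight());
        List<Integer> weights = new ArrayList<Integer>();
        for (Dog dog : dogs) {
            weights.add(dog.weight);
        }
        return weights;
    }

    public static void main(String[] args) {
        System.out.println(sortedDogWeights());
    }
}
```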

Neither C# nor the CLI supports override variance. Java 5 supports covariant return types on overrides, but not contravariant parameter types. C++ likewise supports return type override covariance, but not contravariance of input arguments. C++/CLI doesn't support covariant return types on overrides for managed classes. It gives this error:

error C2392: 'Dog ^Dog::GetValue(void)' : covariant returns types are
not supported in managed types, otherwise 'Mammal ^Mammal::GetValue(void)' would be overridden
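
To make the Java case concrete, here is a minimal sketch. Java 5 and later accept a more derived return type on an override, while parameter types must match exactly (a different parameter type produces an overload, not an override). The class CovariantReturnDemo and the method returnsDog are names invented for the illustration.

```java
class Mammal {
    Mammal getValue() { return new Mammal(); }
}

class Dog extends Mammal {
    @Override
    Dog getValue() { return new Dog(); }   // covariant return type on the override
}

// Hypothetical demo class, not part of the original article.
class CovariantReturnDemo {
    static boolean returnsDog() {
        // Through a Dog reference the result is statically a Dog: no cast needed.
        Dog viaDog = new Dog().getValue();
        // Through a Mammal reference the override still runs and yields a Dog.
        Mammal base = new Dog();
        Mammal viaBase = base.getValue();
        return viaDog != null && viaBase instanceof Dog;
    }

    public static void main(String[] args) {
        System.out.println("override returns Dog: " + returnsDog());
    }
}
```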

Definition-site Generic Variance

The CLI (II 9.5) supports definition-site variance for interfaces and delegates, but not for reference classes and value types. It uses the syntax <+T> to denote covariance and <-T> for contravariance. Because covariance is only safe for outputs, a generic type can specify covariance only on type parameters which are used in output positions. Similarly, contravariance is allowed only on type parameters which are used for input.

Within these constraints, and pretending that C# supported this CLI feature, we could envision these types:

class Mammal { }
class Dog : Mammal { }

interface IReader<+T> // allows covariance
{
    T GetValue();
}

interface IWriter<-T> // allows contravariance
{
    void SetValue(T value);
}
With these definitions, covariance of generic parameters would allow this:
  IReader<Dog> dogReader = null;
  IReader<Mammal> mammalReader = dogReader;
Contravariance of generic parameters would allow this:
  IWriter<Mammal> mammalWriter = null;
  IWriter<Dog> dogWriter = mammalWriter;
These are both disallowed in C#, but allowed at the IL level. Here's an IL translation of the above imaginary C# which assembles and passes PEVerify:
.assembly extern mscorlib {}
.assembly Test {}

.class private auto ansi beforefieldinit Mammal
       extends [mscorlib]System.Object {}

.class private auto ansi beforefieldinit Dog
       extends Mammal {}

.class interface private abstract auto ansi IReader`1<+T>
{
  .method public hidebysig newslot abstract virtual 
          instance !T  GetValue() cil managed {}
}

.class interface private abstract auto ansi IWriter`1<-T>
{
  .method public hidebysig newslot abstract virtual 
          instance void  SetValue(!T 'value') cil managed {}
}

.class private auto ansi beforefieldinit App
       extends [mscorlib]System.Object
{
  .method private hidebysig static void Main() cil managed
  {
    .entrypoint
    .locals init (
             [0] class IReader`1<class Dog> dogReader,
             [1] class IReader`1<class Mammal> mammalReader,
             [2] class IWriter`1<class Mammal> mammalWriter,
             [3] class IWriter`1<class Dog> dogWriter)
    
    ldnull
    stloc.0

    ldloc.0
    stloc.1

    ldnull
    stloc.2

    ldloc.2
    stloc.3

    ret
  }
}
If one switches around the assignments, to try and treat covariance contravariantly and vice versa, changing the body of the Main method to:
    ldnull
    stloc.1
    
    ldloc.1
    stloc.0
    
    ldnull
    stloc.3
    
    ldloc.3
    stloc.2
    
    ret
One then gets the following errors from PEVerify:
[IL]: Error: [App::Main][found ref 'IReader`1[Mammal]'][expected ref
'IReader`1[Dog]'] Unexpected type on the stack.
[IL]: Error: [App::Main][found ref 'IWriter`1[Dog]'][expected ref
'IWriter`1[Mammal]'] Unexpected type on the stack.
Likewise, if one tries to make IReader<+T> contravariant (i.e. change it to IReader<-T>) and IWriter<-T> covariant, the types fail to load and one gets the following errors from PEVerify:
[token  0x02000004] Type load failed.
[token  0x02000005] Type load failed.
So, the covariant and contravariant support is there.

Use-site Generic Variance

This is the definition of variance at the use site, rather than the definition site. That means that when declaring variables of a generic type, one can make the variable declaration open to instances of generic types with more (covariant) or less (contravariant) derived type arguments.

To make the example concrete, I'll use Java 5, which supports covariance and contravariance through wildcards.

class Mammal {
    public Mammal() {
    }
}

// ---

class Dog extends Mammal {
    public Dog() {
    }
}

// ---

class Cat extends Mammal {
    public Cat() {
    }
}

// ---

public class Holder<T> {
    T _value;
    
    public Holder() {
    }
    
    public T getValue() {
        return _value;
    }
    
    public void setValue(T value) {
        _value = value;
    }
}
Given these definitions, I can make use of covariance thusly:
Holder<Cat> catHolder = new Holder<Cat>();
catHolder.setValue(new Cat());
// Use covariance to fit cat-holder into mammal-holder.
Holder<? extends Mammal> mammalHolder = catHolder;
// Can now access return values (covariance is only safe for output).
Mammal mammal = mammalHolder.getValue();
System.out.println(mammal);
// This won't compile: through the covariant view, setValue accepts
// no argument other than null.
mammalHolder.setValue(new Dog());
Similarly, I can make use of contravariance:
Holder<Mammal> mammalHolder = new Holder<Mammal>();
// Use contravariance to fit mammal-holder into cat-holder.
Holder<? super Cat> catHolder = mammalHolder;
// Can now access input parameters (contravariance is only
// safe for input).
catHolder.setValue(new Cat());
// This won't compile: through the contravariant view, getValue is
// only known to return Object.
Cat cat = catHolder.getValue();
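
The two wildcards are often combined in a single method signature: the argument that is only read is declared covariantly and the argument that is only written is declared contravariantly. The following is a minimal sketch along those lines. HolderUtil, copy and demo are hypothetical names, and Holder is restated so that the block stands alone.

```java
class Mammal {}
class Dog extends Mammal {}
class Cat extends Mammal {}

class Holder<T> {
    private T value;
    T getValue() { return value; }
    void setValue(T value) { this.value = value; }
}

// Hypothetical helper, not part of the original article.
class HolderUtil {
    // source is only read (covariant), destination is only written (contravariant).
    static <T> void copy(Holder<? extends T> source, Holder<? super T> destination) {
        destination.setValue(source.getValue());
    }

    static boolean demo() {
        Cat cat = new Cat();
        Holder<Cat> catHolder = new Holder<Cat>();
        catHolder.setValue(cat);
        Holder<Mammal> mammalHolder = new Holder<Mammal>();
        // A cat-holder can feed a mammal-holder; the reverse call would not compile.
        copy(catHolder, mammalHolder);
        return mammalHolder.getValue() == cat;
    }

    public static void main(String[] args) {
        System.out.println("copied: " + demo());
    }
}
```

The same shape appears in the standard library's Collections.copy, which takes a List<? super T> destination and a List<? extends T> source.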

Summary

The CLI and Java have two quite different generic variance capabilities, and C# hasn't exposed any of the CLI's generic variance. The delegate variance exposed by C# appears to be more a feature of the CLR / CLI's loosening of delegate binding restrictions, since it doesn't use the generic variance functionality. Intuitively, use-site generic variance appears to be a superset of definition-site variance: any definition-site variance scheme can be replaced by an equivalent use-site version, while the converse isn't true. Definition-site variance strictly disallows input arguments of covariant type parameters and output arguments of contravariant type parameters in the definition itself, whereas use-site variance disallows them only case by case, at the point of use. I haven't tried to prove that, though.
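
As an informal illustration of one direction of that claim (a sketch, not a proof), the imaginary IReader<+T> and IWriter<-T> from earlier can be approximated in Java by leaving the interfaces invariant and writing the wildcard at every use. The names Reader, Writer and EmulationDemo are assumptions made for this example.

```java
class Mammal {}
class Dog extends Mammal {}

// Invariant as far as Java is concerned, but T appears only in output position.
interface Reader<T> { T getValue(); }

// Invariant as far as Java is concerned, but T appears only in input position.
interface Writer<T> { void setValue(T value); }

// Hypothetical demo class, not part of the original article.
class EmulationDemo {
    // The wildcard at the use site plays the role of <+T> at the definition site.
    static Mammal readMammal(Reader<? extends Mammal> reader) {
        return reader.getValue();
    }

    // The wildcard at the use site plays the role of <-T> at the definition site.
    static void writeDog(Writer<? super Dog> writer, Dog dog) {
        writer.setValue(dog);
    }

    static boolean demo() {
        final Dog rex = new Dog();
        final Mammal[] slot = new Mammal[1];
        Reader<Dog> dogReader = new Reader<Dog>() {
            public Dog getValue() { return rex; }
        };
        Writer<Mammal> mammalWriter = new Writer<Mammal>() {
            public void setValue(Mammal value) { slot[0] = value; }
        };
        Mammal read = readMammal(dogReader);   // Reader<Dog> used as a reader of mammals
        writeDog(mammalWriter, rex);           // Writer<Mammal> used as a writer of dogs
        return read == rex && slot[0] == rex;
    }

    public static void main(String[] args) {
        System.out.println("emulation works: " + demo());
    }
}
```

The cost is that every consumer has to remember the wildcard, whereas a definition-site annotation is written once.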

There is, however, a cost associated with generic type variance - conceptual complexity. Covariance and contravariance are simple enough once one gets used to them, but they represent yet another barrier to be overcome for newcomers to a language.
