Wednesday, August 20, 2008

Tiburon: fun with generics and anonymous methods

Update: I've added more code to the end of this entry.

The next release, Delphi 2009 (codename TiburĂ³n) has some new language features: generics and anonymous methods in particular. I've decided to write up some of my experiments, playing with the new tools to see what can be done with them.

First, though, a disclaimer: I am of course working with the development tip for both compiler and RTL, so some details mentioned here (RTL in particular) may change before final release. However, the basics of generics and anonymous methods are pretty solid and should not change, so any gaps should be fillable if needs be.

Here's a generic method reference:

type
  TFunc<T> = reference to function: T;

This guy (a "method reference") can refer to any global function, method or anonymous method which takes no parameters and returns a value. It's a managed type, like strings and interfaces, because it can contain state that needs to be freed when the last reference to it goes out of scope. Here's a trivial example of a method reference in use with an anonymous method:

{$APPTYPE CONSOLE}

uses
  SysUtils;

function MakeCounter(Start, Increment: Integer): TFunc<Integer>;
begin
  Result := function: Integer
  begin
    Result := Start;
    Inc(Start, Increment);
  end;
end;

procedure WriteList(const Source: TFunc<Integer>; Count: Integer);
begin
  while Count > 1 do
  begin
    Write(Source, ', ');
    Dec(Count);
  end;
  Writeln(Source);
end;

procedure DoIt;
var
  evenNumbers: TFunc<Integer>;
  decades: TFunc<Integer>;
begin
  evenNumbers := MakeCounter(0, 2);
  decades := MakeCounter(0, 10);

  WriteList(evenNumbers, 10);
  WriteList(decades, 10);
end;

begin
  try
    DoIt;
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
end.

The code should be pretty easy to follow. It prints out the first 10 odd numbers, and the first 10 numbers evenly divisible by 10, starting at 0. Here's the exact output:

0, 2, 4, 6, 8, 10, 12, 14, 16, 18
0, 10, 20, 30, 40, 50, 60, 70, 80, 90

Notice how the value returned from MakeCounter has captured its parameters, and that these parameters have become internal state of the return value. This action is called variable capture, and an anonymous method that performs variable capture is a closure. TFunc<T> is useful enough that it is predefined in SysUtils; there are similar versions taking up to 4 arguments, and there are also TProc method references, also taking up to 4 arguments.

This example suggests to me an easy way of creating enumerators in Delphi. Rather than having to define a separate enumerator class, and implement the GetCurrent and MoveNext functions, as well as a property that reads from GetCurrent, we can use an anonymous method that simply returns each value in sequence. To signal termination, we'll use a special exception. With these tools, we can write a counter generator that has an upper bound too:

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  EEnumerationFinished = class(Exception)
    constructor Create;
  end;
  
  TSimpleEnumerator<T> = class
  private
    FSource: TFunc<T>;
    FCurrent: T;
    FFinished: Boolean;
    function GetCurrent: T;
  public
    constructor Create(const ASource: TFunc<T>);
    function MoveNext: Boolean;
    property Current: T read GetCurrent;
  end;

{ EEnumerationFinished }

constructor EEnumerationFinished.Create;
begin
  inherited Create('Enumeration finished');
end;
  
{ TSimpleEnumerator<T> }

constructor TSimpleEnumerator<T>.Create(const ASource: TFunc<T>);
begin
  FSource := ASource;
end;

function TSimpleEnumerator<T>.GetCurrent: T;
begin
  Result := FCurrent;
end;

function TSimpleEnumerator<T>.MoveNext: Boolean;
begin
  if FFinished then
    Exit(False);
  try
    FCurrent := FSource;
  except
    on e: EEnumerationFinished do
    begin
      FFinished := True;
      Exit(False);
    end;
  end;
  Exit(True);
end;

function MakeCounter(Start, Increment, Count: Integer): TSimpleEnumerator<Integer>;
begin
  Result := TSimpleEnumerator<Integer>.Create(function: Integer
  begin
    if Count <= 0 then
      raise EEnumerationFinished.Create;
    Result := Start;
    Inc(Start, Increment);
    Dec(Count);
  end);
end;

function ToString(e: TSimpleEnumerator<Integer>): string;
var
  sb: TStringBuilder;
begin
  sb := TStringBuilder.Create;
  try
    while e.MoveNext do
      sb.Append(e.Current).Append(', ');
    if sb.Length > 2 then
      sb.Length := sb.Length - 2;
    Result := sb.ToString;
  finally
    sb.Free;
  end;
end;

procedure DoIt;
var
  oddNums: TSimpleEnumerator<Integer>;
begin
  oddNums := MakeCounter(1, 2, 20);
  try
    Writeln('Odd numbers: ', ToString(oddNums));
    Writeln('And again: ', ToString(oddNums));
  finally
    oddNums.Free;
  end;
end;

begin
  try
    DoIt;
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
end.

Unfortunately, this has several drawbacks. First, the for-in syntax can't be used, because that expects a collection of some kind that has a method called GetEnumerator, whereas we only have the enumerator itself. Secondly, the enumeration can only be performed once. You can see this in the output:

Odd numbers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39
And again: 

We can fix both of these problems, however, by adding another layer of indirection. Rather than working with TFunc<T>, we'll work with TFunc<TFunc<T>>; and we'll write a TSimpleEnumerable<T> that returns our enumerator.

Now, inside one of the new units, Generics.Collections, are a couple of new base classes, TEnumerator<T> and TEnumerable<T>. I'll use those base classes too:

{$APPTYPE CONSOLE}

uses
  SysUtils, Generics.Collections;

type
  EEnumerationFinished = class(Exception)
    constructor Create;
  end;

  TSimpleEnumerable<T> = class(TEnumerable<T>)
  private
    FSourceMaker: TFunc<TFunc<T>>;
  public
    type
      TEnumerator = class(TEnumerator<T>)
      private
        FSource: TFunc<T>;
        FCurrent: T;
        FFinished: Boolean;
        constructor Create(const ASource: TFunc<T>);
        function GetCurrent: T;
      protected
        function DoGetCurrent: T; override;
        function DoMoveNext: Boolean; override;
      public
        function MoveNext: Boolean;
        property Current: T read GetCurrent;
      end;
  protected
    function DoGetEnumerator: TEnumerator<T>; override;
  public
    constructor Create(const ASourceMaker: TFunc<TFunc<T>>);
    function GetEnumerator: TEnumerator;
  end;
  
{ EEnumerationFinished }

constructor EEnumerationFinished.Create;
begin
  inherited Create('Enumeration finished');
end;
  
{ TSimpleEnumerable<T>.TEnumerator }

constructor TSimpleEnumerable<T>.TEnumerator.Create(const ASource: TFunc<T>);
begin
  FSource := ASource;
end;

function TSimpleEnumerable<T>.TEnumerator.DoGetCurrent: T;
begin
  Result := GetCurrent;
end;

function TSimpleEnumerable<T>.TEnumerator.DoMoveNext: Boolean;
begin
  Result := MoveNext;
end;

function TSimpleEnumerable<T>.TEnumerator.GetCurrent: T;
begin
  Result := FCurrent;
end;

function TSimpleEnumerable<T>.TEnumerator.MoveNext: Boolean;
begin
  if FFinished then
    Exit(False);
  try
    FCurrent := FSource;
  except
    on e: EEnumerationFinished do
    begin
      FFinished := True;
      FCurrent := Default(T); // don't keep alive if refcounted
      Exit(False);
    end;
  end;
  Result := True;
end;

{ TSimpleEnumerable<T> }

constructor TSimpleEnumerable<T>.Create(const ASourceMaker: TFunc<TFunc<T>>);
begin
  FSourceMaker := ASourceMaker;
end;

function TSimpleEnumerable<T>.DoGetEnumerator: TEnumerator<T>;
begin
  Result := GetEnumerator;
end;

function TSimpleEnumerable<T>.GetEnumerator: TEnumerator;
begin
  Result := TEnumerator.Create(FSourceMaker());
end;

function MakeCounter(Start, Increment, Count: Integer): TSimpleEnumerable<Integer>;
begin
  Result := TSimpleEnumerable<Integer>.Create(function: TFunc<Integer>
  var
    myStart, myIncrement, myCount: Integer;
  begin
    myStart := Start;
    myIncrement := Increment;
    myCount := Count;
    
    Result := function: Integer
    begin
      if myCount <= 0 then
        raise EEnumerationFinished.Create;
      Result := myStart;
      Inc(myStart, myIncrement);
      Dec(myCount);
    end;
  end);
end;

function ToString(Collection: TEnumerable<Integer>): string;
var
  sb: TStringBuilder;
  item: Integer;
begin
  sb := TStringBuilder.Create;
  try
    for item in Collection do
      sb.Append(item).Append(', ');
    if sb.Length > 2 then
      sb.Length := sb.Length - 2;
    Result := sb.ToString;
  finally
    sb.Free;
  end;
end;

procedure DoIt;
var
  oddNums: TEnumerable<Integer>;
begin
  oddNums := MakeCounter(1, 2, 20);
  try
    Writeln('Odd numbers: ', ToString(oddNums));
    Writeln('And again: ', ToString(oddNums));
  finally
    oddNums.Free;
  end;
end;

begin
  try
    DoIt;
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
end.

This did require a little extra complexity - making a copy of the state in the outer anonymous method of MakeCounter - but it also now performs as expected:

Odd numbers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39
And again: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39

So there we have it: an easy way to create enumerators and simple collection wrappers using a combination of generics and anonymous methods. There's more that can be built on this foundation, but I'll leave that until later.

Update:

Joylon in the comments prompts me to show the alternative - i.e. how to write the above without using anonymous methods. Here's a basic enumerator which does the same thing as the above - i.e. generate an integer sequence of a specified length and stride:

type
  TEnumerator = class
  private
    FCurrent: Integer;
    FCount: Integer;
    FIncrement: Integer;
    function GetCurrent: Integer;
  public
    constructor Create(Start, Increment, Count: Integer);
    function MoveNext: Boolean;
    property Current: Integer read GetCurrent;
  end;
  
{ TEnumerator }

constructor TEnumerator.Create(Start, Increment, Count: Integer);
begin
  FCurrent := Start;
  FCount := Count;
  FIncrement := Increment;
end;

function TEnumerator.GetCurrent: Integer;
begin
  Result := FCurrent;
end;

function TEnumerator.MoveNext: Boolean;
begin
  if FCount <= 0 then
    Exit(False);
  Inc(FCurrent, FIncrement);
  Dec(FCount);
  Result := True;
end;

At 35 lines, it is quite a bit (3x) more verbose than the directly equivalent MakeCounter function (11 lines; this isn't one that supports resetting or recreating the enumerator, but the translation above doesn't either):

function MakeCounter(Start, Increment, Count: Integer): TSimpleEnumerator<Integer>;
begin
  Result := TSimpleEnumerator<Integer>.Create(function: Integer
  begin
    if Count <= 0 then
      raise EEnumerationFinished.Create;
    Result := Start;
    Inc(Start, Increment);
    Dec(Count);
  end);
end;

However, even then, I wouldn't necessarily recommend using the above technique to implement all enumerators in general - the fact that exceptions are used to signal termination will cause some performance loss and, more importantly, quite a bit of annoyance when running in a debugger.

Rather, I intend, over the next few posts in this blog, to use anonymous methods in all sorts of ways so that you, the reader, might see opportunities for their creative use in your own code.

15 comments:

Jolyon Smith said...

I have to wonder whether this isn't counter-productive if your aim is to show how useful/desirable anon methods and generics are.

My first thought is: Geez, that's one heck of a lot of really quite complex code just to produce a simple, repeatable list of numbers.

My second thought is: I can do this much more simply without using anons or generics, and in a way that is going to be much easier to follow for anyone reading my code in the future.


It really doesn't make me want to use these new language features at all.

:(

Can't we find some more compelling examples?

Barry Kelly said...

@Joylon: Most of the code is in the article is reusable library material. One of the problems with trying to show how to reduce effort in implementing A with some new applied abstraction f(A'), where A' < A, is that you have to introduce f(x); and it is typically the case with such abstractions that f(x) > A. Hopefully you can understand the example enough to see which bit is f(x) and which bit is A'. Probably the biggest problem with the post is that I never presented A, so you never saw the verbosity of that.

However, your complaint that all this does is list numbers, I find hard to take seriously. The point of the post is how to create an iterator without having to write a separate class (or two) and two (or three, if you include Reset) separate methods. Some imagination is required!

Sergey Antonov aka oxffff said...

Hi, Barry Kelly.

Next(from you) code(generic skeleton) is exact C# YIELD keyword implementation sample.
Though on IL side C# compiler guys make it a bit different.

Well done, Barry.

Result := TSimpleEnumerable{Integer}.Create(function: TFunc{Integer}
var
myStart, myIncrement, myCount: Integer;
begin
myStart := Start;
myIncrement := Increment;
myCount := Count;

Result := function: Integer
begin
if myCount <= 0 then
raise EEnumerationFinished.Create;
Result := myStart;
Inc(myStart, myIncrement);
Dec(myCount);
end;
end);

Steve Trefethen said...

Nice work Bary. I look forward to follow on posts.

Jolyon Smith said...

I think you missed my point Barry - I wasn't suggesting that you only eliminate anonymous methods in the implementation but that you instead consider the simplest possible implementation that would produce the same output (i.e. the desired result).

That simplest-possible implementation may not be AS re-usable (then again, ime people are more likely to re-use something they understand) but when assessing the value of re-use you have to consider not only how re-usable something IS but how likely and how often it is going to BE re-used and how much a re-usable implementation is actually going to save/improve anything.

Things haven't changed much in the last year or so - there is still a dearth of compelling, recognisably applicable examples to showcase these additions to the language. imho.

Barry Kelly said...

@Joylon -

"that would produce the same output"

You're using weasel words when you talk about "the same output" - of course the case at hand is trivial, a listing of numbers. The case is trivial so that the mechanism is the only thing that is complex - otherwise there would be unnecessary complexity in the example.

"the simplest possible implementation" - would be a machine code listing, as that would be devoid of abstraction.

Or perhaps you have your own, biased, concept of simplicity? Maybe you use the word "simple" to mean "familiar to Joylon"? It seems that way to me, because I guess that you probably haven't internalized closures as a language construct.

I imagine that not everybody will be willing to limit their toolbox to only those tools "familiar to Joylon", however.

Svein Bringsli said...

I give my support to Joylyon. Perhaps mr Kelly has forgotten that not all people who use Delphi are capable of writing compilers of their own? I agree wholeheartedly with Joylyon that the examples are complex, and I also think that this example didn't make me want to use the new features. I'm not saying the new features are no good, but the example didn't demonstrate their benefits in a way that appealed to me.

Also, the last comment from Barry struck me as quite patronizing. Maybe some humility would be better?

Barry Kelly said...

@Svein - I was annoyed with Joylon for his trollish comment on the Vista rant post.

Re examples, I'm sorry if they're too complex for you. I'm a developer, not a teacher, or even a marketer. I'll post what I find interesting. If I posted for different reasons, I wouldn't be posting for much longer (i.e. I would stop).

Robert Giesecke said...

Hi Barry,
One thing that would be very interesting: Do the predefined Enumrator/Enumerable classes implement the interfaces IEnumerable<T>/IEnumerator<T>?

And are iterator methods on the horizon for next versions?

Barry Kelly said...

@Robert - they do not implement IEnumerable<T> / IEnumerator<T> because those interfaces descend from IEnumerable / IEnumerator, and since Delphi doesn't have a rooted type system like C#, that causes a problem implementing the Current property. What should IEnumerator.Current return on an instance you get back from IEnumerable<Integer>?

Iterators - maybe. We'll see.

Robert Giesecke said...

Hi Barry,

@IEnumeable vs IEnumerable<T>
Well, you would not *have* to provide the inheritance as in the .Net world.
Since there is no legacy code that could be using IEnumerable, you could just opt for the generic one as the only one.
Otherwise an IEnumerable with Current as a Pointer could be possible as well (only used in cases where you don't know the type and thus pointer wouldn't be too bad a type choice).
IMO forcing class inheritence on collections is a very, very bad thing.
As an experienced .Net developer you probably came to appreciate the ease in which you can make your methods/classes work with any kind of container. As long as the container implemented either IEnumerable, ICollection<T>, etc...
Requiring this to be TEnumerable<T> means that *every* container had to be change to inherit from it, as opposed to just implement an interface to make it more usuable for the new Delphi.

Don't get this wrong, I think you showed what you're made of with this first compiler of yours. :-)
Just wanted to get some ideas of my chest...

Jolyon Smith said...

I'm not sure if you are deliberately mis-spelling my name, or why you would, but please note that you are.

My words are not "weasel" at all.

You seem to be trying to show the benefits of some new language features. I am suggesting that your attempt is falling short because your chosen examples are poor.

Applying these language features and techniques in the case of this example results in more complex and harder to follow code, not less and easier to follow.

As for the suggestion that the simplest possible implementation is machine code, those certainly are "weasel words".

I suspect that you know full well that by "simplest possible" I meant easiest to follow, and a machine code listing would fail that test just as much as your anon methods and generics approach.


My "trollish comment" on your Vista rant was intended as a tongue in cheek bit of light hearted ribbing, given that us "D7 Luddites" were lambasted by *some* CodeGear insiders for not "getting" the new IDE.

If a little light hearted ribbing is "trolling" then colour me green and call me "Shrek".

:)

I rather think you over-reacted.

Barry Kelly said...

@Jolyon - Mistake in the name is unintentional. Some time ago, I had mentally pronounced your name as I misspelled it. Sorry.

My point re "simple" is that it means something different to what you appear to think it means. Machine code is indeed more simple than a high-level language, because the primitives do less and are easier to understand individually.

Similarly, what you describe as simpler than the technique I outlined has primitives which do less and are easier to understand individually, in so far as they are less abstract.

However, higher-level abstraction allows writing more "dense" code. Sometimes, the best abstraction to use for a particular situation is passing code as an argument.

The example I had here was the problem of creating an enumerable stream. Normally, that requires creating two separate classes, and writing at least 3, if not 4 different methods. I reduced it down to a single method call.

You may not see the utility of the example, because you don't see the point of creating enumerable streams (perhaps you have never worked with C# LINQ). However, that criticism is specific to you, and is not general.

daniele.teti said...

Hi Barry. Thanks for your post. I have a question:
Closure should get actual local variable value for it's "internal state". In following code I'l expect that got "Hello world" and then "Hello there"... but I'm got "Hello There" (last variable x value) twice... I'm wrong?

procedure TForm1.Button5Click(Sender: TObject);
var
x: string;
p1,p2: TProc;
begin
x := 'Hello World';
p1 := procedure begin ShowMessage(x); end;
x := 'Hello There';
p2 := procedure begin ShowMessage(x); end;
p1;
p2;
end;

Barry Kelly said...

@daniele - No. Closures capture locations, not values. The behaviour you are seeing is common across Scheme, Ruby, C# (both anonymous methods and lambdas), etc. For example, in Scheme:

(define (f x)
(cons
(lambda (new-x)
(set! x new-x))
(lambda () x)))

This defines a function that returns a pair of lambdas, one that gets and one that sets the value of x.

(define slot (f 10))

This evaluates (f 10) and assigns it to slot. Slot is now a pair of lambdas. I can evaluate it with the getter:

((cdr slot))

This prints out 10. I can change the value of the captured x with the setter:

((car slot) 42)

And re-evaluate the getter:

((cdr slot))

As expected this prints out 42.