Tuesday, May 11, 2010

Locations vs Values: using RTTI to work with value types

Delphi's Rtti unit is designed in substantial part around TValue, a kind of hold-all record that should be capable of containing almost any Delphi value, along with type information for that value. However, this means that when you're working with value types, such as static arrays and records, modifying the values when stored in a TValue is modifying that copy, stored inside the TValue. If you want to manipulate a field (F1) of a record which is itself a field (F2) of another type, you need to first copy the F2 field's value out into a TValue, then modify F1 in the TValue, and then copy it back in to the original F2 field.

As an aside: TValue.MakeWithoutCopy does not relate to this copying behaviour, but is rather for managing reference counts with strings and interfaces and other managed types. This is particularly important when marshalling parameters to and from stack frames, where logical copies sometimes should be made, and sometimes not.

However, working with values in TValue all the time is not necessarily the most efficient technique. By adding another layer of indirection, we can improve things: instead of working with values, we can work with locations.

This can be encapsulated fairly trivially using the current RTTI support. I hacked up a TLocation type which represents a typed location analogously to how TValue represents a value:

type
  TLocation = record
  private
    FLocation: Pointer;
    FType: TRttiType;
  public
    class function FromValue(C: TRttiContext; const AValue: TValue): TLocation; static;
    class function FromAddress(ALocation: Pointer; AType: TRttiType): TLocation; static;
    function GetValue: TValue;
    procedure SetValue(const AValue: TValue);
    function Follow(const APath: string): TLocation;
    function Dereference: TLocation;
    function Index(n: Integer): TLocation;
    function FieldRef(const name: string): TLocation;
  end;

For ease of use, it uses TRttiType. If it were to be fully as flexible as TValue, it would use PTypeInfo instead, like TValue does. However, using the RTTI wrapper objects makes life a lot easier.

Here it is in use:

type
  TPoint = record
    X, Y: Integer;
  end;
  TArr = array[0..9] of TPoint;

  TFoo = class
  private
    FArr: TArr;
    constructor Create;
    function ToString: string; override;
  end;

{ TFoo }

constructor TFoo.Create;
var
  i: Integer;
begin
  for i := Low(FArr) to High(FArr) do
  begin
    FArr[i].X := i;
    FArr[i].Y := -i;
  end;
end;

function TFoo.ToString: string;
var
  i: Integer;
begin
  Result := '';
  for i := Low(FArr) to High(FArr) do
    Result := Result + Format('(%d, %d) ', [FArr[i].X, FArr[i].Y]);
end;

procedure P;
var
  obj: TFoo;
  loc: TLocation;
  ctx: TRttiContext;
begin
  obj := TFoo.Create;
  Writeln(obj.ToString);
  
  ctx := TRttiContext.Create;
  
  loc := TLocation.FromValue(ctx, obj);
  Writeln(loc.Follow('.FArr[2].X').GetValue.ToString);
  Writeln(obj.FArr[2].X);
  
  loc.Follow('.FArr[2].X').SetValue(42);
  Writeln(obj.FArr[2].X); // observe value changed
  
  // alternate syntax, not using path parser
  loc.FieldRef('FArr').Index(2).FieldRef('X').SetValue(24);
  Writeln(obj.FArr[2].X); // observe value changed again
  
  Writeln(obj.ToString);
end;

Here's most of the implementation:

{ TLocation }

type
  PPByte = ^PByte;

function TLocation.Dereference: TLocation;
begin
  if not (FType is TRttiPointerType) then
    raise Exception.CreateFmt('Non-pointer type %s can''t be dereferenced', [FType.Name]);
  Result.FLocation := PPointer(FLocation)^;
  Result.FType := TRttiPointerType(FType).ReferredType;
end;

function TLocation.FieldRef(const name: string): TLocation;
var
  f: TRttiField;
begin
  if FType is TRttiRecordType then
  begin
    f := FType.GetField(name);
    Result.FLocation := PByte(FLocation) + f.Offset;
    Result.FType := f.FieldType;
  end
  else if FType is TRttiInstanceType then
  begin
    f := FType.GetField(name);
    Result.FLocation := PPByte(FLocation)^ + f.Offset;
    Result.FType := f.FieldType;
  end
  else
    raise Exception.CreateFmt('Field reference applied to type %s, which is not a record or class',
      [FType.Name]);
end;

function TLocation.Follow(const APath: string): TLocation;
begin
  Result := GetPathLocation(APath, Self);
end;

class function TLocation.FromAddress(ALocation: Pointer;
  AType: TRttiType): TLocation;
begin
  Result.FLocation := ALocation;
  Result.FType := AType;
end;

class function TLocation.FromValue(C: TRttiContext; const AValue: TValue): TLocation;
begin
  Result.FType := C.GetType(AValue.TypeInfo);
  Result.FLocation := AValue.GetReferenceToRawData;
end;

function TLocation.GetValue: TValue;
begin
  TValue.Make(FLocation, FType.Handle, Result);
end;

function TLocation.Index(n: Integer): TLocation;
var
  sa: TRttiArrayType;
  da: TRttiDynamicArrayType;
begin
  if FType is TRttiArrayType then
  begin
    // extending this to work with multi-dimensional arrays and non-zero
    // based arrays is left as an exercise for the reader ... :)
    sa := TRttiArrayType(FType);
    Result.FLocation := PByte(FLocation) + sa.ElementType.TypeSize * n;
    Result.FType := sa.ElementType;
  end
  else if FType is TRttiDynamicArrayType then
  begin
    da := TRttiDynamicArrayType(FType);
    Result.FLocation := PPByte(FLocation)^ + da.ElementType.TypeSize * n;
    Result.FType := da.ElementType;
  end
  else
    raise Exception.CreateFmt('Index applied to non-array type %s', [FType.Name]);
end;

procedure TLocation.SetValue(const AValue: TValue);
begin
  AValue.Cast(FType.Handle).ExtractRawData(FLocation);
end;

To make it slightly easier to use, and slightly more fun for me to write, I also wrote a parser - the Follow method, which is implemented in terms of GetPathLocation:

function GetPathLocation(const APath: string; ARoot: TLocation): TLocation;

  { Lexer }
  
  function SkipWhite(p: PChar): PChar;
  begin
    while IsWhiteSpace(p^) do
      Inc(p);
    Result := p;
  end;

  function ScanName(p: PChar; out s: string): PChar;
  begin
    Result := p;
    while IsLetterOrDigit(Result^) do
      Inc(Result);
    SetString(s, p, Result - p);
  end;

  function ScanNumber(p: PChar; out n: Integer): PChar;
  var
    v: Integer;
  begin
    v := 0;
    while (p >= '0') and (p <= '9') do
    begin
      v := v * 10 + Ord(p^) - Ord('0');
      Inc(p);
    end;
    n := v;
    Result := p;
  end;

const
  tkEof = #0;
  tkNumber = #1;
  tkName = #2;
  tkDot = '.';
  tkLBracket = '[';
  tkRBracket = ']';
  
var
  cp: PChar;
  currToken: Char;
  nameToken: string;
  numToken: Integer;
  
  function NextToken: Char;
    function SetToken(p: PChar): PChar;
    begin
      currToken := p^;
      Result := p + 1;
    end;
  var
    p: PChar;
  begin
    p := cp;
    p := SkipWhite(p);
    if p^ = #0 then
    begin
      cp := p;
      currToken := tkEof;
      Exit(currToken);
    end;
    
    case p^ of
      '0'..'9':
      begin
        cp := ScanNumber(p, numToken);
        currToken := tkNumber;
      end;
      
      '^', '[', ']', '.': cp := SetToken(p);
      
    else
      cp := ScanName(p, nameToken);
      if nameToken = '' then
        raise Exception.Create('Invalid path - expected a name');
      currToken := tkName;
    end;
    
    Result := currToken;
  end;
  
  function Describe(tok: Char): string;
  begin
    case tok of
      tkEof: Result := 'end of string';
      tkNumber: Result := 'number';
      tkName: Result := 'name';
    else
      Result := '''' + tok + '''';
    end;
  end;
  
  procedure Expect(tok: Char);
  begin
    if tok <> currToken then
      raise Exception.CreateFmt('Expected %s but got %s', 
        [Describe(tok), Describe(currToken)]);
  end;

  { Semantic actions are methods on TLocation }
var
  loc: TLocation;
  
  { Driver and parser }
  
begin
  cp := PChar(APath);
  NextToken;
  
  loc := ARoot;
  
  // Syntax:
  // path ::= ( '.' <name> | '[' <num> ']' | '^' )+ ;;
  
  // Semantics:
  
  // '<name>' are field names, '[]' is array indexing, '^' is pointer
  // indirection.
  
  // Parser continuously calculates the address of the value in question, 
  // starting from the root.
  
  // When we see a name, we look that up as a field on the current type,
  // then add its offset to our current location if the current location is 
  // a value type, or indirect (PPointer(x)^) the current location before 
  // adding the offset if the current location is a reference type. If not
  // a record or class type, then it's an error.
  
  // When we see an indexing, we expect the current location to be an array
  // and we update the location to the address of the element inside the array.
  // All dimensions are flattened (multiplied out) and zero-based.
  
  // When we see indirection, we expect the current location to be a pointer,
  // and dereference it.
  
  while True do
  begin
    case currToken of
      tkEof: Break;
      
      '.':
      begin
        NextToken;
        Expect(tkName);
        loc := loc.FieldRef(nameToken);
        NextToken;
      end;
      
      '[':
      begin
        NextToken;
        Expect(tkNumber);
        loc := loc.Index(numToken);
        NextToken;
        Expect(']');
        NextToken;
      end;
      
      '^':
      begin
        loc := loc.Dereference;
        NextToken;
      end;
      
    else
      raise Exception.Create('Invalid path syntax: expected ".", "[" or "^"');
    end;
  end;
  
  Result := loc;
end;

The principle can be extended to other types and Delphi expression syntax, or TLocation may be changed to understand non-flat array indexing, etc.

This post was inspired by this question on Stack Overflow, and some similar questions to it that popped up over the past few weeks.

5 comments:

Mason Wheeler said...

I like it! Any chance we might see this in Fulcrum's RTTI.pas? I can already think of a handful of places where this might be useful.

BTW why the code duplication in the implementation of TLocation.FieldRef?

Barry Kelly said...

The code is different between instances and records because instances need a dereference before adding the offset. It could be written differently, of course.

Mason Wheeler said...

Oh, you're right. I missed that ^ mark.

Maciej Izak said...

I love what you do. I always wanted to work on the Delphi compiler, but if you do it, I can do something else :).

F. Bauer said...

Good enhancement, and the code works without having to wait for next version of Delphi.

What I still miss is setting length of dynamic (generic) array by RTTI: if I have a TLocation, where FType is TRttiDynamicArrayType, then a TLocation.SetLength is missing. Any hint on how to do this?

Further I miss RTTI-Invoke for methods of records a little.