Sunday, May 21, 2006

Delphi/Win32 and COM Interface casting

Over in borland.public.delphi.language.delphi.win32, Kevin Donn asked some questions about memory allocations caused by casting Delphi object instances to interfaces (which, in Delphi/Win32, are always COM interfaces). My answer applies to Delphi/Win32 only.
Presumably this would not create a memory leak:
var
  i: IMyInterface;
  o: TMyObject; // supports IMyInterface
begin
  i := o as IMyInterface;
end;
The opposite may be true: it might free the object sooner than you think. Interfaces in Delphi/Win32 on classes that derive from TInterfacedObject follow COM rules, which means that mixing object references and interface references is dodgy. A newly created object has a reference count of zero. The first time you cast or assign the object reference to an interface reference, it gets AddRef'd and the count becomes one. When the last interface reference goes out of scope, it gets Release'd, the count drops back to zero, and the object destroys itself. If you still hold an object reference at that point, it is now a bad pointer - nasty.
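
The early destruction described above can be seen in a minimal sketch. TNoisy and UseThroughInterface are hypothetical names invented for this illustration; the only assumption is that the class derives from TInterfacedObject, as discussed.

```pascal
program EarlyFree;

{$APPTYPE CONSOLE}

type
  // Hypothetical class, used only to make the destructor call visible.
  TNoisy = class(TInterfacedObject)
  public
    destructor Destroy; override;
  end;

destructor TNoisy.Destroy;
begin
  Writeln('TNoisy.Destroy called');
  inherited Destroy;
end;

procedure UseThroughInterface(o: TNoisy);
var
  i: IInterface;
begin
  i := o; // first _AddRef: the reference count goes from 0 to 1
  Writeln('Interface reference in scope');
end; // i goes out of scope: _Release, the count reaches 0, the object is destroyed

var
  o: TNoisy;
begin
  o := TNoisy.Create;
  UseThroughInterface(o);
  // o still holds the old address, but the object behind it is gone.
  Writeln('Back in main; o is now a dangling reference and must not be used');
end.
```

The destructor message is printed before the final line, even though the main block never calls Free.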

It's best to either stick to COM rules and only access the object through interfaces (and thus get refcounted lifetime management), or else implement IInterface (aka IUnknown) yourself and use manual memory management.
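
For the second option, a minimal sketch of a base class that implements IInterface without reference counting might look like this. TManualObject is a hypothetical name; returning -1 from _AddRef and _Release is a convention used here to signal "not reference counted", not something the compiler requires.

```pascal
program ManualLifetime;

{$APPTYPE CONSOLE}

type
  // Hypothetical base class: supports interfaces, but lifetime is manual.
  TManualObject = class(TObject, IInterface)
  protected
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
  end;

function TManualObject.QueryInterface(const IID: TGUID; out Obj): HResult;
begin
  if GetInterface(IID, Obj) then
    Result := 0 // S_OK
  else
    Result := E_NOINTERFACE;
end;

function TManualObject._AddRef: Integer;
begin
  Result := -1; // no counting
end;

function TManualObject._Release: Integer;
begin
  Result := -1; // never frees the object
end;

var
  o: TManualObject;
  i: IInterface;
begin
  o := TManualObject.Create;
  try
    i := o;   // _AddRef is called but does nothing
    i := nil; // _Release is called but does not free o
    Writeln('o is still alive: ', o.ClassName);
  finally
    o.Free;   // the owner frees the object explicitly
  end;
end.
```

With this approach it becomes the programmer's job to make sure no interface reference outlives the call to Free.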

But, wisdom aside, will the following cause a memory leak?
var
  p: Pointer;
  o: TMyObject; // supports IMyInterface
begin
  p := Pointer(o as IMyInterface);
end;
This creates a temporary value (a kind of anonymous local variable) of type IMyInterface, which gets AddRef'd in the process, converts the interface address to a pointer, and then Releases the interface. That may or may not free the TMyObject instance. If it did, then both o and p point to dead memory. If it didn't, then p is still valid, but it's just a pointer into o's memory. No memory is allocated in this process, but memory might be freed if no other interface references to o were in scope.
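
One way to keep such a raw pointer valid is to hold a real interface reference for as long as the pointer is in use. The following is a sketch only; the variable name keep is an assumption of this example, and TInterfacedObject stands in for TMyObject.

```pascal
program RawInterfacePointer;

{$APPTYPE CONSOLE}

var
  o: TInterfacedObject;
  keep: IInterface;
  p: Pointer;
begin
  o := TInterfacedObject.Create;
  keep := o; // a named interface reference keeps the object alive
  // The cast still goes through a temporary interface, but whenever the
  // compiler releases that temporary, the count cannot reach zero while
  // keep is in scope.
  p := Pointer(o as IInterface);
  // p points into the object's own memory, not to a new allocation.
  Writeln('Offset of p from o: ', PAnsiChar(p) - PAnsiChar(o));
end.
```

The object is freed when keep is finalized at program exit; after that point both o and p would be dangling.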

Pointers to interfaces are pointers into the middle of the object. A picture:

--- TMyObject ---
0: TMyObject metaclass pointer --->
// ...
n: TMyObject's IMyInterface vtable --->
// ...
// object data

--- TMyObject.IMyInterface vtable ---
0: QueryInterface --->
1: _AddRef --->
2: _Release --->
// ... other IMyInterface methods
Normally, a pointer to a value of type TMyObject points to the start of the object, whose first field in turn points to the metaclass (i.e. TMyObject).

A pointer to an interface points to the slot inside the object that holds the address of a vtable (slot n in the picture). This layout is defined by COM, which is a binary standard. The vtable is a list of function pointers. (These functions adjust the 'Self' pointer that is passed in as the first argument, so that it points to the start of the object again, and then jump to the real implementation of the methods.)

I'd draw better pictures, but it's very tedious in ASCII.

Both the TMyObject metaclass and the vtables are statically allocated as part of the EXE or DLL image, and don't need to be freed.

Perhaps this program may make things clearer:

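program Test;

{$APPTYPE CONSOLE}

uses SysUtils, Classes;

// Prints Count pointer-sized values starting at Start, one per line,
// as "address: contents".
procedure Dump(Start: Pointer; Count: Integer);
var
  p: PPointer;
begin
  p := Start;
  while Count > 0 do
  begin
    Writeln(Format('%p: %p', [p, p^]));
    Inc(p);
    Dec(Count);
  end;
end;

var
  o: TInterfacedObject;
  i: IInterface;
begin
  o := TInterfacedObject.Create;
  Writeln('The Object');
  Dump(o, 4);
  Writeln('The Class');
  Dump(TInterfacedObject, 4);
  i := o; // from here on, i owns the object's lifetime
  Writeln('The Interface');
  Dump(Pointer(i), 4);
end.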
When I run it on my system, this is what I get:
The Object
00A14E60: 0040111C
00A14E64: 00000000
00A14E68: 004010A1
00A14E6C: 00A14E81
The Class
0040111C: 6E495411
00401120: 66726574
00401124: 64656361
00401128: 656A624F
The Interface
00A14E68: 004010A1
00A14E6C: 00A14E81
00A14E70: 00000000
00A14E74: 00000001
Notice that the interface pointer is (in this case) at an offset of 8 from the object pointer. You can see that the vtable for TInterfacedObject's IInterface implementation is at $4010A1, while the metaclass is located at $40111C - relatively close together. Since .EXE images in Windows are linked by default to load at $400000, you can infer from this that the metaclass and interface vtable are both part of the .EXE image.
More specifically, does the generation of an interface cause memory allocation and if so how does it get cleaned up?
The only memory used is part of the object, unless you've delegated the interface implementation to a property which returns an object derived from TAggregatedObject - which itself delegates AddRef and Release to its controller, the parent object.
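
The delegation case can be sketched as follows. IGreeter, TGreeterImpl and TOuter are hypothetical names made up for this example, and the GUID is arbitrary; the sketch assumes the usual pattern of an interface-typed property marked with the implements directive.

```pascal
program AggregatedDelegation;

{$APPTYPE CONSOLE}

type
  // All three types are hypothetical, for illustration only.
  IGreeter = interface
    ['{B5D0B7C2-6D3A-4F0E-9A55-2F6C1E7D8A41}']
    procedure Greet;
  end;

  // Inner object: TAggregatedObject forwards _AddRef and _Release
  // to the controller passed to its constructor.
  TGreeterImpl = class(TAggregatedObject, IGreeter)
  public
    procedure Greet;
  end;

  TOuter = class(TInterfacedObject, IGreeter)
  private
    FImpl: TGreeterImpl;
    function GetGreeter: IGreeter;
  public
    constructor Create;
    destructor Destroy; override;
    property Greeter: IGreeter read GetGreeter implements IGreeter;
  end;

procedure TGreeterImpl.Greet;
begin
  Writeln('Hello from the inner object');
end;

constructor TOuter.Create;
begin
  inherited Create;
  FImpl := TGreeterImpl.Create(Self); // Self is the controller
end;

destructor TOuter.Destroy;
begin
  FImpl.Free; // the inner object is a separate allocation owned by TOuter
  inherited Destroy;
end;

function TOuter.GetGreeter: IGreeter;
begin
  Result := FImpl;
end;

var
  g: IGreeter;
begin
  g := TOuter.Create; // reference counting goes through TOuter
  g.Greet;
end.
```

Here the extra memory is the TGreeterImpl instance, and it is cleaned up by the parent's destructor once the parent's reference count drops to zero.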

I hope this makes it clearer. It's not a totally trivial question. You need to know what's going on beneath the hood to understand and use (and most especially implement) COM interfaces with any level of sophistication.
