A poster on microsoft.public.dotnet.languages.csharp
(the original post is
here) wrote in the other day about a bug he had - an access violation. He wasn't
using any unmanaged code, so my curiosity was piqued. The reason? It's theoretically
impossible to AV without using unmanaged code, so he had almost certainly found a CLR
or BCL bug. I was duly motivated to get to the root of the problem.
I'm very glad I stuck with it, because I found out something I wasn't aware of, something
which will make me very careful in similar scenarios in the future. If you're not interested
in the path to discovery, the summary is at the end.
I did some investigation, and I discovered that the problem lay in the Ping class.
Here's some code which reproduces the problem:
using System;
using System.Net;
using System.Net.NetworkInformation;
class App
{
static void Main()
{
try
{
for (;;)
new Ping().Send(IPAddress.Loopback);
}
catch (Exception ex)
{
if (ex.InnerException != null)
ex = ex.InnerException;
Console.WriteLine("Error: {0} ({1})", ex.Message,
ex.GetType().Name);
}
}
}
(I've registered this as a bug on the
Microsoft Connect site.)
This wasn't enough for me. I wanted to know how it happened. I looked at the code for
Ping using
.NET Reflector, but it wasn't
obvious to me. I fired up WinDbg and ran the above code until the access violation exception
was triggered (it can take some seconds, depending on your machine):
(a94.87c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=04f20b74 ebx=00000000 ecx=00000008 edx=00000000 esi=04f20b54 edi=012e7580
eip=78144d3a esp=0012f278 ebp=0012f280 iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
MSVCR80!memcpy+0x5a:
78144d3a f3a5 rep movsd ds:04f20b54=???????? es:012e7580=00000000
So, it was failing in memcpy. Where was that called from? SOS and !dumpstack gives
the right info (excerpted):
> .load sos
> !dumpstack
OS Thread Id: 0x87c (0)
Current frame: MSVCR80!memcpy+0x5a [F:\RTM\vctools\...\intel\memcpy.asm:188]
ChildEBP RetAddr Caller,Callee
0012f280 79e808d5 mscorwks!memcpyNoGCRefs+0x11, // ...
0012f290 79f02572 mscorwks!MarshalNative::CopyToManaged+0x11b, // ...
0012f2e0 79f02491 mscorwks!MarshalNative::CopyToManaged+0x22, // ...
0012f340 7a615416 (/*...*/ System.Net.NetworkInformation.PingReply..ctor(// ...
0012f358 7a61474e (/*...*/ System.Net.NetworkInformation.Ping.InternalSend(// ...
0012f37c 79eec356 mscorwks!FastAllocateObject+0xa6, calling mscorwks!Object// ...
0012f3fc 7a6139b4 (/*...*/ System.Net.NetworkInformation.Ping.Send(System.// ...
0012f43c 7a613752 (/*...*/ System.Net.NetworkInformation.Ping.Send(System.// ...
0012f454 00ca009e (MethodDesc 0x922fd8 +0x2e App.Main()), calling 7a87f55c
// ...
I can see here that the last managed code was inside PingReply's constructor.
Here's what the appropriate constructor overload looks like, roughly:
internal PingReply(IcmpEchoReply reply)
{
this.address = new IPAddress((long) reply.address);
this.ipStatus = (IPStatus) reply.status;
if (this.ipStatus == IPStatus.Success)
{
this.rtt = reply.roundTripTime;
this.buffer = new byte[reply.dataSize];
Marshal.Copy(reply.data, this.buffer, 0, reply.dataSize);
this.options = new PingOptions(reply.options);
}
else
{
this.buffer = new byte[0];
}
}
The call to MarshalNative::CopyToManaged() is inside Marshal.Copy(), and it's copying
data from reply.data (an IntPtr) to the managed array, this.buffer. This structure
is filled by the Windows API function IcmpSendEcho(). The buffer is allocated
by the caller, as is usual for Windows API functions. The buffer is owned through
a SafeHandle descendant, SafeLocalFree, and it's owned by the Ping class.
The memcpy function compiles to an x86 string operation, with esi and edi being the
source and destination registers. From the original exception data at the top, one
can see that ESI (the source) is an invalid pointer - that's what caused the access
violation. Thus, the buffer has been freed early.
How did that happen? One can use !dso (aka !dumpstackobjects) to find all the live
objects rooted on the stack or registers:
> !dso
OS Thread Id: 0x87c (0)
ESP/REG Object Name
0012f2ac 012e7508 System.Net.NetworkInformation.PingReply
0012f2c0 012e7508 System.Net.NetworkInformation.PingReply
0012f32c 012e7578 System.Byte[]
0012f350 012e7508 System.Net.NetworkInformation.PingReply
0012f410 012e74b4 System.Byte[]
0012f438 0128142c System.Net.IPAddress
0012f448 012e74b4 System.Byte[]
Aha! There's no Ping object!
Even though several instance methods of the Ping object
(two Ping.Send overloads and Ping.InternalSend) are currently executing, Ping is in fact
eligible for garbage collection!
It appears that's what's happened. Let's see if that can be confirmed. If one runs
!dumpheap -stat, there's still lots of Pings on the heap:
> !dumpheap -stat
// ...
7a779154 2041 40820 System.Net.SafeLocalFree
7a778ec0 2041 40820 System.Net.SafeCloseHandle
7915ff38 2041 65312 System.Threading.SendOrPostCallback
79124418 2043 89864 System.Byte[]
7a7812e0 2041 179608 System.Net.NetworkInformation.Ping
// ...
Let's see if we can find the Ping that's on our call stack:
> !dumpheap -type Ping
// ...
012e7040 7a7812e0 88
012e710c 7a7812e0 88
012e71d8 7a7812e0 88
012e72a4 7a7812e0 88
012e7370 7a7812e0 88
012e743c 7a7812e0 88
012e7508 7a781580 32
total 2042 objects
Statistics:
MT Count TotalSize Class Name
7a781580 1 32 System.Net.NetworkInformation.PingReply
7a7812e0 2041 179608 System.Net.NetworkInformation.Ping
Total 2042 objects
Now, since there is only one thread allocating Pings and there's nothing keeping them
alive, I can expect that the Ping in the highest address is the most recently allocated one.
To confirm that, I'll test the last two Ping objects listed (012e743c and 012e7370):
> !gcroot 012e7370
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 87c
Scan Thread 2 OSTHread 828
> !gcroot 012e743c
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 87c
ESP:12f390:Root:0130e1cc(System.Net.NetworkInformation.Ping)->
012e7494(System.Threading.SendOrPostCallback)->
012e743c(System.Net.NetworkInformation.Ping)
ESP:12f3f8:Root:0130e1cc(System.Net.NetworkInformation.Ping)->
012e7494(System.Threading.SendOrPostCallback)
ESP:12f434:Root:0130e1cc(System.Net.NetworkInformation.Ping)->
012e7494(System.Threading.SendOrPostCallback)
ESP:12f450:Root:0130e1cc(System.Net.NetworkInformation.Ping)->
012e7494(System.Threading.SendOrPostCallback)
Scan Thread 2 OSTHread 828
Here it can be seen that of the two, only 012e743c is alive. Because of the pattern of
the loop, it would be expected that any locations holding Ping would be overwritten each
time around, so this Ping at 012e743c should be the right one. Peeking inside it with
!do (aka !dumpobject):
> !do 012e743c
Name: System.Net.NetworkInformation.Ping
MethodTable: 7a7812e0
EEClass: 7a7e9100
Size: 88(0x58) bytes
(C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll)
Fields:
MT Field Offset Type VT Attr Value Name
790f9c18 4000184 4 System.Object 0 instance 00000000 __identity
// ...
7a779154 4002c06 20 ...Net.SafeLocalFree 0 instance 012e74f4 replyBuffer
// ...
I'm really only interested in this guy, the replyBuffer - because that's the
guy who's pointing to invalid memory. Drilling into that location:
> !do 012e74f4
Name: System.Net.SafeLocalFree
MethodTable: 7a779154
EEClass: 7a7db890
Size: 20(0x14) bytes
(C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll)
Fields:
MT Field Offset Type VT Attr Value Name
790fe160 40005b4 4 System.IntPtr 0 instance 82971448 handle
790fed1c 40005b5 8 System.Int32 0 instance 3 _state
79104f64 40005b6 c System.Boolean 0 instance 1 _ownsHandle
79104f64 40005b7 d System.Boolean 0 instance 1 _fullyInitialized
7a779154 40025ec 948 ...Net.SafeLocalFree 0 static 00000000 Zero
Now, the layout of the _state field is explained in the SSCLI 2.0, in
the file clr/src/vm/safehandle.cpp:
// So the state field ends up looking like this:
//
// 31 2 1 0
// +-----------------------------------------------------------+---+---+
// | Ref count | D | C |
// +-----------------------------------------------------------+---+---+
//
// Where D = 1 means a Dispose has been performed and C = 1 means the
// underlying handle has (or will be shortly) released.
This shows that a _state value of 3 means that it's been both disposed
and released. So, even though the Ping and SafeLocalFree objects haven't
actually been garbage collected, the eager finalization of SafeHandle
objects has released the buffer early. I'll consider that definitive.
Summary
There's at least two lessons from this:
- In general, one cannot rely on the fact that an instance method is on the call stack
to keep the instance alive. One needn't be too paranoid, though: the objects
can only be collected if the current line in each instance method is near or at the
end of the method, so that 'this' is no longer on the stack or enregistered.
- Thus, one needs to be extra careful to keep SafeHandle instances alive, either
by passing them along with any buffer / handle value extracted, or by using
SafeHandle.DangerousAddRef()/SafeHandle.DangerousRelease() in a try/finally block
to be absolutely sure that an early release doesn't occur.
Finally, to make everything crystal clear - what will the following program print?
using System;
class App
{
class A
{
public B Foo()
{
return new B();
}
~A()
{
Console.WriteLine("A finalized.");
}
}
class B
{
public B()
{
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("B now returning.");
}
}
static void Main()
{
new A().Foo();
}
}
If you've followed everything, you'll know that this is what it prints (compiled in Release mode):
A finalized.
B now returning.
Maybe this was obvious to everybody - but it certainly wasn't obvious to me,
nor to the designer of the Ping class!
1 comment:
Greetings Barry,
completely unrelated to this post of yours,but nonetheless my thanks for the tip you posted on http://aspalliance.com/groups/microsoft_public_dotnet_framework_clr/ng-362954_Question_Dynamically_Raise_Event_using_Reflection_.aspx
concerning how to raise an event reflection-style! Now my generic keystroke handler should work nicely! The working dll can be downloaded at my website.
I can see that you're well on top of the game concerning dotnet and the surrounding stuff, so I shall keep an eye on your blog, thanks for the info you provide on this and that.
Post a Comment