Building a mixed-mode stack walker - Part 2

by steve 28. April 2011 20:40

(Part 1 is here)

When I left off in Part 1, I had a stack-walker based on IDebug* that could successfully unwind a mixed-mode stack and resolve the native frames to symbols, but the managed sections of the stack were still unresolved.  In this post I'll talk about how to resolve those managed frames to managed MethodDescs and turn those into names, all without using ICorDebug or the CLR profiling APIs.  This method also works on both live and dump targets.  Just a friendly warning here though: none of what follows is documented or supported by MS, and it is subject to change at any time.  Also, by using headers/idl from the SSCLI, you may be taking a dependency on that licensing (but I’m not a lawyer, so don’t take my word on that).

Starting out – Reversing SOS

When I started the project, I knew a little about how SOS works internally, but nothing substantial.  My first step was to learn as much as possible about how it worked, then use that to write my own implementation.

I started by looking at mscordacwks.dll and sos.dll.  I knew the purpose of mscordacwks.dll was to abstract away the CLR data structures for external tools, so I figured that was the best place to start.  Using dumpbin (part of the Windows Platform SDK), I looked at the export table of mscordacwks.  Only a few functions are exported (I talked about OutOfProcFunctionTableCallback last post); the most interesting one for this project is CLRDataCreateInstance.

Googling around for that function turned up two interesting hits.  The first was a link to MSDN which was fairly useless (why is it even documented?), and the second was a link to clrdata.idl on koderz (a great site) from the SSCLI.  For those unfamiliar with the SSCLI, it’s basically a dumbed-down version of the .NET 2.0 source MS released under a shared-source license.  I actually took this opportunity to download the SSCLI, which turned out to be worth it as I referred back to the source many times during this project.

The signature for CLRDataCreateInstance looks like this:

HRESULT CLRDataCreateInstance (
    [in]  REFIID           iid, 
    [in]  ICLRDataTarget  *target, 
    [out] void           **iface
);

So we need to figure out 1) the IID to create, and 2) what an ICLRDataTarget is.

Figuring out the IID we want

The implementation of CLRDataCreateInstance is at the bottom of clr\src\debug\daccess\daccess.cpp.  The function creates a ClrDataAccess object, then QIs it for the IID we passed to CLRDataCreateInstance.  The implementation of ClrDataAccess is also in daccess.cpp, and looking at its implementation of QueryInterface (and its class declaration), we can see that the only interface it supports that is useful to us is IXCLRDataProcess.  IXCLRDataProcess is defined in clr\src\inc\xclrdata.idl.  You can use midl to generate a .h file from this .idl file, or just use the one included in the SSCLI.  This will get us the IID of IXCLRDataProcess (5c552ab6-fc09-4cb3-8e36-22fa03c798b7).
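If you would rather declare the IID by hand than pull it from a generated header, the textual form has to be split into the four GUID fields. The sketch below shows that mapping in portable C++. The `Guid` struct and `parseGuid` helper are illustrative names, not part of any SDK; `Guid` only mirrors the field layout of the Windows GUID type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Portable stand-in for the Windows GUID/IID struct (same field layout).
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::array<std::uint8_t, 8> data4;
};

// Splits "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into the four GUID fields.
// This checks only the overall shape; it is not a strict validator.
inline Guid parseGuid(const std::string& text) {
    if (text.size() != 36 || text[8] != '-' || text[13] != '-' ||
        text[18] != '-' || text[23] != '-') {
        throw std::invalid_argument("expected 8-4-4-4-12 GUID text");
    }
    auto hex = [&text](std::size_t pos, std::size_t len) {
        assert(pos + len <= text.size());
        return std::stoull(text.substr(pos, len), nullptr, 16);
    };
    Guid g{};
    g.data1 = static_cast<std::uint32_t>(hex(0, 8));
    g.data2 = static_cast<std::uint16_t>(hex(9, 4));
    g.data3 = static_cast<std::uint16_t>(hex(14, 4));
    // The last two groups are stored one byte at a time, in the order written.
    g.data4[0] = static_cast<std::uint8_t>(hex(19, 2));
    g.data4[1] = static_cast<std::uint8_t>(hex(21, 2));
    for (std::size_t i = 0; i < 6; ++i) {
        g.data4[2 + i] = static_cast<std::uint8_t>(hex(24 + 2 * i, 2));
    }
    return g;
}
```

Feeding it the IID above yields data1 = 0x5c552ab6, data2 = 0xfc09, data3 = 0x4cb3, and the eight trailing bytes 8e 36 22 fa 03 c7 98 b7, which is the shape a hand-written IID initializer takes.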

Implementing ICLRDataTarget

ICLRDataTarget is defined in clrdata.idl (and clrdata.h in the platform SDK).  The interface defines a lot of methods, but very few of them actually seem to be used by the IXCLRData* implementations.  All we need is:

  • GetMachineType
    • I hard-coded mine to return IMAGE_FILE_MACHINE_AMD64.  In a cross-platform solution you'd want to return IMAGE_FILE_MACHINE_I386 on 32-bit systems as well.
  • GetPointerSize
    • This is the easiest one, return sizeof(void*).
  • GetImageBase
  • ReadVirtual

That's basically all you need to have a working ICLRDataTarget implementation.  (Side note: later on I found out that you can ask WinDBG for an IXCLRDataProcess through Ioctl and IG_GET_CLR_DATA_INTERFACE.  This has the advantage that WinDBG will try to load the "right" version of mscordacwks for you.  However, it doesn't work if you're not running inside WinDBG.  Conveniently though, the only case I can think of where you wouldn't be running inside WinDBG is working against a live process, in which case it's fine to just load mscordacwks from the framework directory.)
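To make the shape of those four methods concrete, here is a minimal portable sketch. It is not the real COM class: a real implementation would derive from ICLRDataTarget in clrdata.h and also stub out the remaining methods. `HResult`, the `k...` constants, `EngineHooks`, and `SketchDataTarget` are stand-ins invented for this example so it compiles without the Windows SDK. Stripping the path and extension in GetImageBase is an assumption about what the caller passes, and picking the machine type from sizeof(void*) assumes the tool and the target have the same bitness.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Stand-ins for Windows SDK types and constants (assumption: values mirror the SDK).
using HResult = std::int32_t;
constexpr HResult kOk = 0;                                    // S_OK
constexpr HResult kFail = static_cast<HResult>(0x80004005u);  // E_FAIL
constexpr std::uint32_t kMachineI386 = 0x014c;   // IMAGE_FILE_MACHINE_I386
constexpr std::uint32_t kMachineAmd64 = 0x8664;  // IMAGE_FILE_MACHINE_AMD64

// Hypothetical hooks standing in for the debugger engine interfaces.
struct EngineHooks {
    // Copies `size` bytes from the target; returns false on failure.
    std::function<bool(std::uint64_t addr, std::uint8_t* buf,
                       std::uint32_t size, std::uint32_t* bytesRead)> readMemory;
    // Looks up a loaded module by module name (for example L"clr").
    std::function<bool(const std::wstring& name, std::uint64_t* base)> findModule;
};

class SketchDataTarget {
public:
    explicit SketchDataTarget(EngineHooks hooks) : hooks_(std::move(hooks)) {
        assert(hooks_.readMemory && hooks_.findModule);
    }

    HResult GetMachineType(std::uint32_t* machine) const {
        *machine = (sizeof(void*) == 8) ? kMachineAmd64 : kMachineI386;
        return kOk;
    }

    HResult GetPointerSize(std::uint32_t* size) const {
        *size = static_cast<std::uint32_t>(sizeof(void*));
        return kOk;
    }

    HResult GetImageBase(const wchar_t* imagePath, std::uint64_t* base) const {
        std::wstring name(imagePath);
        // Assumption: reduce "C:\dir\clr.dll" to the bare module name "clr".
        const auto slash = name.find_last_of(L"\\/");
        if (slash != std::wstring::npos) name.erase(0, slash + 1);
        const auto dot = name.find_last_of(L'.');
        if (dot != std::wstring::npos) name.erase(dot);
        return hooks_.findModule(name, base) ? kOk : kFail;
    }

    HResult ReadVirtual(std::uint64_t addr, std::uint8_t* buf,
                        std::uint32_t requested, std::uint32_t* bytesRead) const {
        std::uint32_t got = 0;
        const bool ok = hooks_.readMemory(addr, buf, requested, &got);
        if (bytesRead != nullptr) *bytesRead = ok ? got : 0;
        return ok ? kOk : kFail;
    }

private:
    EngineHooks hooks_;
};
```

Keeping the engine behind two small hooks means the same class can be exercised against an in-memory buffer in a test and against the real debugger interfaces in the extension.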

Putting it together – Resolving a managed IP to a MethodDesc

So at this point we have a working ICLRDataTarget implementation, we have an IID, and we have a way to create that IID.  Using CLRDataCreateInstance(__uuidof(IXCLRDataProcess), myDataTarget, (PVOID*)&pDac), we get an instance of IXCLRDataProcess bound to our IDebugClient through our ICLRDataTarget implementation.

There are a few ways to resolve an IP to a method name now that we have an IXCLRDataProcess; I'll go over two of them.  The first is to use IXCLRDataProcess::GetRuntimeNameByAddress and pass an IP.  This is probably the simplest method, but it doesn't get you as much information.  In my case, however, all I wanted was the name, so this was enough.
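As a rough sketch of the first approach, the helper below calls a GetRuntimeNameByAddress-shaped method with a fixed-size buffer and treats any failure code as "not managed". The parameter order (address, flags, buffer length, name length out, name buffer, displacement out) follows my reading of xclrdata.idl, so check it against your copy. `runtimeNameForIp`, `FakeDac`, and the buffer size are made up for this example; `FakeDac` only exists so the sketch runs without mscordacwks.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Works with any object exposing a GetRuntimeNameByAddress-shaped method.
// With the real interface you would pass *pDac.
template <typename Dac>
std::optional<std::wstring> runtimeNameForIp(Dac& dac, std::uint64_t ip) {
    constexpr std::uint32_t kBufLen = 1024;  // arbitrary size chosen for the sketch
    wchar_t name[kBufLen] = {};
    std::uint32_t nameLen = 0;
    std::uint64_t displacement = 0;
    const std::int32_t hr = static_cast<std::int32_t>(
        dac.GetRuntimeNameByAddress(ip, 0, kBufLen, &nameLen, name, &displacement));
    if (hr < 0) return std::nullopt;  // same test as FAILED(hr)
    name[kBufLen - 1] = L'\0';        // defensive termination
    return std::wstring(name);
}

// Hypothetical test double: knows exactly one managed address.
struct FakeDac {
    std::int32_t GetRuntimeNameByAddress(std::uint64_t address, std::uint32_t /*flags*/,
                                         std::uint32_t bufLen, std::uint32_t* nameLen,
                                         wchar_t* nameBuf, std::uint64_t* displacement) {
        assert(nameLen != nullptr && nameBuf != nullptr && displacement != nullptr);
        if (address != 0x1000 || bufLen == 0) {
            return static_cast<std::int32_t>(0x80004005u);  // E_FAIL
        }
        const wchar_t* known = L"Demo.Program.Main()";
        std::uint32_t i = 0;
        for (; known[i] != L'\0' && i + 1 < bufLen; ++i) nameBuf[i] = known[i];
        nameBuf[i] = L'\0';
        *nameLen = i + 1;
        *displacement = 0;
        return 0;  // S_OK
    }
};
```

Returning an optional keeps the "is this frame managed?" question and the name lookup in a single call, which is all the stack walker needs per frame.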

IXCLRDataProcess::Request

The second brings us to what is, in my opinion, the most powerful feature of IXCLRDataProcess: the Request(…) method.  This is basically the Ioctl of IXCLRDataProcess; it takes a request code plus an input and an output buffer.  All the valid requests as of .NET 2.0 are defined in src\inc\dacprivate.h, and there are a lot of them.  Each of the output structs contains a Request method which sets up the input/output buffers correctly for that request.

Through experimentation, I've found that a lot of these structs have changed definitions between the Rotor snapshot and .NET 4.0.  Request returns E_INVALIDARG if the input or output buffers are mis-sized (though not only in that case).  There are two ways to figure out the correct buffer sizes:

  1. Disassemble the corresponding Request method in mscordacwks and look at what it's expecting for a buffer size.
  2. Set a breakpoint on ClrDataAccess::Request() and debug windbg+sos calling the method you want.

I usually went with #1 because it's a little faster.  However, you need to be creative figuring out how the structure changed, and then adjust the struct in dacprivate.h accordingly.

Back to resolving our managed IPs.  One DAC request struct is DacpMethodDescData.  This is an example of one that changed between Rotor and .NET 4: the output buffer changed by 8 bytes (an x64 pointer).  I removed the managedDynamicMethodObject field from my definition to get it to work.  This request struct contains a couple of helper methods, one being RequestFromIP.  Giving this a managed IP will resolve it to a MethodDesc.  We can then take the MethodDescPtr from the result and pass it to GetMethodName, also on the DacpMethodDescData request.
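The size mismatch is easy to picture with two toy layouts. Everything in this sketch is hypothetical: the structs are cut down to just the two fields named above (the real DacpMethodDescData has many more, and this field order is invented), and `mockRequest` only imitates a handler that insists on an exact output size. It is not how mscordacwks is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using ClrDataAddress = std::uint64_t;  // CLRDATA_ADDRESS is a 64-bit value

// Hypothetical, heavily trimmed layouts (field order is an assumption).
struct RotorStyleData {
    ClrDataAddress MethodDescPtr;
    ClrDataAddress managedDynamicMethodObject;  // present in the Rotor-era header
};
struct Net4StyleData {
    ClrDataAddress MethodDescPtr;               // field above removed
};

// Dropping one 64-bit field shrinks the buffer by exactly 8 bytes.
static_assert(sizeof(RotorStyleData) - sizeof(Net4StyleData) == 8,
              "one removed 64-bit field");

constexpr std::int32_t kOk = 0;                                               // S_OK
constexpr std::int32_t kInvalidArg = static_cast<std::int32_t>(0x80070057u);  // E_INVALIDARG

// Mock handler: accepts the output buffer only if its size matches exactly.
inline std::int32_t mockRequest(std::size_t expectedOutSize,
                                void* outBuffer, std::size_t outSize) {
    if (outBuffer == nullptr || outSize != expectedOutSize) return kInvalidArg;
    return kOk;
}
```

Passing a `RotorStyleData` buffer to a handler that expects `sizeof(Net4StyleData)` gets kInvalidArg back, which is the symptom you see when a struct in dacprivate.h is stale for the runtime you are inspecting.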

Conclusion

We've gone through a lot of work here, but at this point we can resolve any managed IP to a method name.  The workflow looks like this:

  • Using the IDebugClient from part 1, create our ICLRDataTarget implementation.
  • Pass said ICLRDataTarget to CLRDataCreateInstance with IID = __uuidof(IXCLRDataProcess).
  • For each frame, call GetRuntimeNameByAddress with the frame's IP; anything that succeeds is a managed method.  There may also be cases where you have both a symbol name and a name from this call, in which case the GetRuntimeNameByAddress result should override the symbol name.
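The override rule in the last step can be sketched in a few lines. `Frame`, `RuntimeNameLookup`, `nameFrames`, and the "<unknown>" placeholder are names invented for this example; the lookup callback stands in for whatever wrapper you put around GetRuntimeNameByAddress.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

struct Frame {
    std::uint64_t ip;
    std::string symbolName;  // native symbol from Part 1; empty if unresolved
};

// Stand-in for a wrapper around GetRuntimeNameByAddress (hypothetical).
using RuntimeNameLookup =
    std::function<std::optional<std::string>(std::uint64_t ip)>;

// Picks one display name per frame: a managed name wins over a native
// symbol, and a frame with neither gets a placeholder.
inline std::vector<std::string> nameFrames(const std::vector<Frame>& frames,
                                           const RuntimeNameLookup& lookup) {
    assert(lookup);
    std::vector<std::string> names;
    names.reserve(frames.size());
    for (const Frame& frame : frames) {
        if (auto managed = lookup(frame.ip)) {
            names.push_back(*managed);          // managed name overrides the symbol
        } else if (!frame.symbolName.empty()) {
            names.push_back(frame.symbolName);  // plain native frame
        } else {
            names.push_back("<unknown>");
        }
    }
    return names;
}
```

Checking the managed lookup first is what implements the override: a frame that has both a native symbol and a runtime name ends up with the runtime name.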

There's definitely some room for improvement here.  One of the biggest downsides is that there's currently no logic to find the "right" version of mscordacwks.  SOS, for example, will search around to find the best match; I currently just load it from Framework64\v4…\mscordacwks.dll.

Next up: more advanced CLR inspection with IXCLRDataProcess.

Tags:

clr | DAC | debugging | diagnostics | IDebugClient | IXCLRDataProcess | stack-walk | x64

Comments (12) -

JR
JR United States
6/14/2011 6:30:23 PM #

Good post. Any sample code? I have been trying to figure out how to map native address to managed function for more than a year now.

steve
steve United States
6/14/2011 7:53:22 PM #

I'm probably not going to be posting full out sample code, but if you have places you're stuck on specifically I'd be happy to point you in the right direction.

JR
JR United States
6/15/2011 11:44:28 AM #

Understood.
Trying to learn to fish a little and really appreciate any guidance.  
I was reversing sos also and found it calling mscordacwks functions. Googling what i found brought me here.
All my past extensions were done using the old wdbgexts api. I have a skeleton dll that compiles using the dbgeng API.  
I just have the basic

DebugExtensionInitialize
DebugExtensionNotify
DebugExtensionUninitialize

functions as well as a simple text output function.  

I also have the sscli. Are you including a bunch of files from sscli in your project to compile and be able to call "CLRDataCreateInstance(__uuidof(IXCLRDataProcess), myDataTarget, (PVOID*)&pDac)"?

In that case the code you are writing from sscli is the ICLRDataTarget implementation right?

Could you show me how you implemented GetImageBase?  

steve
steve United States
6/18/2011 2:03:36 PM #

The only two files you 100% need from the sscli are dacpriv.h and xclrdata.h.

Instead of using __uuidof(IXCLRDataProcess) I just declared the IID locally in my project:

IID IID_IXCLRDataProcess = {0x5c552ab6, 0xfc09, 0x4cb3, {0x8e,0x36,0x22,0xfa,0x03,0xc7,0x98,0xb7} };
Then you can call CLRDataCreateInstance with that.

ICLRDataTarget actually is defined in the windows SDK in clrdata.h, so you don't need anything from the sscli for that.

As for GetImageBase, the only thing it's used for is to get the base of CLR.dll, it's never called with any other module.  You can use GetModuleByModuleNameWide on IDebugSymbols3 to get the image base for clr.dll (note, the function takes the module name, not the dll name, so it's just "clr", not "clr.dll").

JR
JR United States
6/28/2011 4:54:20 PM #

Thanks for the reply. Did you mean dacprivate.h instead of just dacpriv.h? I included dacprivate.h in my project and this header includes cor.h, clrdata.h, and xclrdata.h. I defined my own myDatatarget class that inherits ICLRDataTarget. I implemented GetMachineType, GetPointerSize, GetImageBase, and ReadVirtual in my class. I defined the IID as you said. I end up with this little piece of code that actually compiles.

ICLRDataTarget* pCLR;
    
if (SUCCEEDED(pDebugClient->QueryInterface(__uuidof(ICLRDataTarget),(void**)&pCLR)))
{
  myDataTarget* mdt;
  CLRDataCreateInstance(IID_IXCLRDataProcess, mdt, (PVOID*)&pCLR);
}

Now all I need to do is link to CLRDataCreateInstance function. I couldn't find a .lib file with that function in it and when I installed perl and such to build sscli20/rotor, the build dies saying AMD64 is not supported. How are you linking to that function may I ask?

steve
steve United States
6/28/2011 5:46:56 PM #

You need to late bind to CLRDataCreateInstance via LoadLibrary/GetProcAddress; there's no lib for it.

HMODULE hCorDac = LoadLibrary(L"C:\\Windows\\Microsoft.NET\\Framework64\\v4.0.30319\\mscordacwks.dll");
if (hCorDac == NULL)
{
  return FALSE;
}
CLRDataCreateInstancePtr clrData = (CLRDataCreateInstancePtr)GetProcAddress(hCorDac, "CLRDataCreateInstance");
RemoteCLRDataTarget *t = new RemoteCLRDataTarget(di->sym, di->data);
HRESULT hr = clrData(IID_IXCLRDataProcess, t, (void**)ppDac);

steve
steve United States
6/29/2011 9:45:13 AM #

Also, I noticed a couple things in the code you posted:
1) I didn't know you could QueryInterface the IDebugClient for ICLRDataTarget, if you can do that there's no need to implement your own ICLRDataTarget, just use the result from the QueryInterface.

2) When you call CLRDataCreateInstance, you're looking for an IXCLRDataProcess back.  What you really wanted to do was this:

ICLRDataTarget* pCLR;
IXCLRDataProcess *pDac; // (don't ask me why I named it this, it's just habit now...)

if (SUCCEEDED(pDebugClient->QueryInterface(__uuidof(ICLRDataTarget),(void**)&pCLR)))
{
  CLRDataCreateInstance(IID_IXCLRDataProcess, pCLR, (PVOID*)&pDac);
}

JR
JR United States
7/29/2011 4:53:46 PM #

You're right, you can’t QueryInterface for ICLRDataTarget. I have my own implementation with the functions you said were required above, but what is this line?

RemoteCLRDataTarget *t = new RemoteCLRDataTarget(di->sym, di->data);

Seems like my ICLRDataTarget implementation needs a constructor that takes two parameters? I’m not sure what di->sym and di->data are.

steve
steve United States
8/13/2011 9:50:02 AM #

di->sym is a pointer to an IDebugSymbols3 instance for implementing GetImageBase, di->data is a pointer to an IDebugDataSpaces instance for implementing ReadVirtual.

PA
PA United States
8/12/2011 8:21:05 AM #

What is the difference between the ICorDebug interfaces and the IDebug interfaces? Why do two different sets exist for the same(?) thing?



steve
steve United States
8/12/2011 8:26:14 AM #

IDebug* is for unmanaged debugging only and basically just reads and writes the process memory space.  ICorDebug is for managed debugging only and connects to debugging services provided by the CLR inside the target process.

JR
JR United States
8/26/2011 6:28:13 PM #

I can't believe it, Steve, but after many hours of banging away at this I actually got it working....

Many thanks.
