Passing struct array in struct via P/Invoke

Tags: .net, pinvoke, native

There are times when you need to call native (C/C++) code from a .NET environment. To do so, you have several options:

  1. use C++/CLI
  2. use P/Invoke mechanism
  3. use some sort of inter-process communication like pipes, memory mapped files etc
  4. persuade yourself that it's gonna be easy to port this cool C/C++ library to C#. Shoot yourself in the head when trying to do it.

Since option 4 is probably the most questionable, let's leave it for now and focus on the other options. Option 3 is, well, probably too tricky if you just want to call some C/C++ functions. Option 1 might be tempting and I sometimes use it. It requires you to get along with the clumsy C++/CLI syntax, but it's doable, especially if you want to operate on whole objects instead of single function calls.

Let's discuss option 2 then.

Native part

Suppose we want to use some super-cool and performant C/C++ library from C# code. Let's say that this library calculates the total area of arbitrary polygons we pass to it (it simply sums up all of the polygons' areas).

The header for this library is as follows:

#ifdef NATIVELIB_EXPORTS
#define NATIVELIB_API __declspec(dllexport)
#else
#define NATIVELIB_API __declspec(dllimport)
#endif

#pragma pack(push, 1)
struct NPoint
{
    double X;
    double Y;
};

struct NPolygon
{
    int PointCount;
    NPoint* Points;
};
#pragma pack(pop)

extern "C" NATIVELIB_API double CalculateTotalArea( NPolygon* polygons, int polygonsCount );

Note the #pragma pack(push, 1) and #pragma pack(pop). These are compiler directives that tell the compiler to align struct fields on 1-byte boundaries instead of the default 4 or 8. It's a little excessive here, but it's worth remembering to avoid unpleasant surprises (SEHExceptions and the like) when P/Invoking things.
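To see what the packing directive actually changes, here is a minimal sketch that declares the same structs twice, once with default alignment and once with 1-byte packing, and checks the layout at compile time. The NPointDefault and NPolygonDefault names are hypothetical and exist only for the comparison; the sizes mentioned in the comments assume a typical 64-bit target with a 4-byte int and 8-byte pointers.

```cpp
#include <cstddef>

// Hypothetical "unpacked" twins of the library structs, for comparison only.
struct NPointDefault
{
    double X;
    double Y;
};

struct NPolygonDefault
{
    int PointCount;
    NPointDefault* Points; // usually padded to offset 8 on a 64-bit target
};

// The same declarations as in the header above, packed to 1 byte.
#pragma pack(push, 1)
struct NPoint
{
    double X;
    double Y;
};

struct NPolygon
{
    int PointCount;
    NPoint* Points; // directly follows PointCount, no padding
};
#pragma pack(pop)

// With Pack = 1 there is no padding at all, so these hold on any target.
static_assert(sizeof(NPoint) == 2 * sizeof(double),
              "NPoint is just two doubles");
static_assert(sizeof(NPolygon) == sizeof(int) + sizeof(NPoint*),
              "packed NPolygon has no padding");
static_assert(offsetof(NPolygon, Points) == sizeof(int),
              "Points starts right after PointCount");

// The default layout is never smaller than the packed one. On a typical
// 64-bit target it is 16 bytes versus 12 for the packed struct.
static_assert(sizeof(NPolygonDefault) >= sizeof(NPolygon),
              "default layout may contain padding");
```

Whatever layout the native side ends up with, the managed struct has to describe exactly the same offsets, which is why the packing value should be stated explicitly on both sides.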

So there's a Polygon which contains an array of Points, and we want to pass an array of Polygons from C# to this C/C++ function. The exact implementation of CalculateTotalArea is unimportant for us right now.

.NET part

Things would be much simpler if the total number of points were always constant. In that case we could use the MarshalAs attribute here and there and we would be done.

But when arrays are of arbitrary size, things can get messy.

First, define structs that "map" to their C/C++ equivalents:

[StructLayout( LayoutKind.Sequential, Pack = 1 )]
public struct NPoint
{
    public double X;
    public double Y;
};

[StructLayout( LayoutKind.Sequential, Pack = 1 )]
public struct NPolygon
{
    public int PointCount;
    public IntPtr Points;
};

I always like to specify LayoutKind and Pack so I know exactly how the data is going to be passed from managed to native code.

The C/C++ CalculateTotalArea function signature in .NET is as follows:

[DllImport( "NativeLib.dll", CallingConvention = CallingConvention.Cdecl )]
private static extern double CalculateTotalArea( [MarshalAs( UnmanagedType.LPArray, ArraySubType = UnmanagedType.Struct )] NPolygon[] polygons, int polygonsCount );

The tricky part is the Points field in our NPolygon struct. Since a polygon can have an arbitrary number of points, we can't just declare NPoint[] Points, because .NET won't know how to pass it to native code. Unfortunately, putting the [MarshalAs( UnmanagedType.LPArray )] attribute on an NPoint[] Points field causes a TypeLoadException: Invalid managed/unmanaged type combination (Array fields must be paired with ByValArray or SafeArray). And we can't use ByValArray because it requires a constant-size array, while we want an arbitrary number of points. I'm no expert on P/Invoke, but it looks like it's not that obvious for .NET to know that the NPoint struct has just two double fields.

So we end up getting the address of the points array and passing it along with the number of elements in the array. That address goes into the IntPtr Points field. The value we set for the Points field is crucial - more on that below.

Note that in the function declaration we could set some MarshalAs attribute values, which allowed us to avoid pointers there. I must admit that it's a little mystery to me why UnmanagedType.LPArray works in a function declaration for a struct array and why it doesn't work for arrays inside a struct... but as I said, I'm no P/Invoke expert, so I'm excused.

Preparing managed data for native use

All that's left is to somehow set the value of the IntPtr Points field.

The thing is, we can't just cast or convert an NPoint[] into an IntPtr, because the garbage collector can move managed memory at any time and native code won't have any idea about it. So if we want to pass our NPoint[] arrays to native code and use them there, we must prevent them from moving by pinning them.

There are two ways (that I know about) in .NET to pin an object: the fixed statement and the GCHandle.Alloc method. Since we have an IList of NPoint[] arrays, using fixed is not an option, so we will go with GCHandle.Alloc:

private IList<GCHandle> _createdHandles = new List<GCHandle>();
...
public double CalculatePolygonsArea( IList<NPoint[]> polygonsList )
{
    NPolygon[] nativePolygons = new NPolygon[ polygonsList.Count ];
    _createdHandles.Add( GCHandle.Alloc( nativePolygons, GCHandleType.Pinned ) );
    for( int i = 0; i < polygonsList.Count; ++i )
    {
        var pointsArray = polygonsList[ i ];
        GCHandle pointsArrayHandle = GCHandle.Alloc( pointsArray, GCHandleType.Pinned );
        var nativePolygon = new NPolygon()
        {
            PointCount = pointsArray.Length,
            Points = pointsArrayHandle.AddrOfPinnedObject() // (1)
        };
        nativePolygons[ i ] = nativePolygon;
        _createdHandles.Add( pointsArrayHandle ); // (2)
    }
    // now call native
    return CalculateTotalArea( nativePolygons, nativePolygons.Length );
}

There are two important things here: (1) the AddrOfPinnedObject method, which gives us an IntPtr with the address of each array, and (2) the fact that we should keep every GCHandle we create, so we can release them later:

if( _createdHandles != null )
{
    foreach( var handle in _createdHandles )
        handle.Free();
}

And that's how you can pass a struct array inside a struct from .NET to C/C++.

The only remaining question is: is it worth struggling with all those IntPtrs and GCHandles just to call one or two native functions? Well, I guess that depends. If you want to use mature, battle-tested libraries like OpenCV or VisiLibity, then it should be worth it.

But if you want to call some native code to calculate polygon areas and hope that it'll be faster than the managed equivalent: of that I'm not that certain. And I plan to check it in the near future.

Bonus

To calculate the actual polygon area one can use this clever formula. The funny "side effect" of this formula is that it can also be used to determine whether our polygon is oriented clockwise or counter-clockwise, depending on whether the returned value is less than zero or greater than zero. Magic.
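As an illustration, here is a minimal sketch of what the native side could look like, assuming the formula meant here is the shoelace (Gauss) formula; this is not the original library code. SignedArea and IsCounterClockwise are hypothetical helper names, and the NATIVELIB_API export macro is left out so the snippet compiles on any platform. The sign convention assumes the Y axis points up.

```cpp
#include <cassert>
#include <cmath>

#pragma pack(push, 1)
struct NPoint
{
    double X;
    double Y;
};

struct NPolygon
{
    int PointCount;
    NPoint* Points;
};
#pragma pack(pop)

// Hypothetical helper: signed area of one polygon by the shoelace formula.
// Positive for counter-clockwise vertex order, negative for clockwise.
double SignedArea(const NPoint* points, int count)
{
    if (points == nullptr || count < 3)
        return 0.0; // fewer than three vertices enclose no area

    double twiceArea = 0.0;
    for (int i = 0; i < count; ++i)
    {
        const NPoint& a = points[i];
        const NPoint& b = points[(i + 1) % count]; // wrap to the first vertex
        twiceArea += a.X * b.Y - b.X * a.Y;
    }
    return 0.5 * twiceArea;
}

// Hypothetical helper: the orientation "side effect" of the signed area.
bool IsCounterClockwise(const NPoint* points, int count)
{
    return SignedArea(points, count) > 0.0;
}

// Same signature as in the header, minus the export macro.
// Sums the absolute areas so that orientation does not affect the total.
extern "C" double CalculateTotalArea(NPolygon* polygons, int polygonsCount)
{
    assert(polygons != nullptr || polygonsCount == 0);

    double total = 0.0;
    for (int i = 0; i < polygonsCount; ++i)
        total += std::fabs(SignedArea(polygons[i].Points, polygons[i].PointCount));
    return total;
}
```

Note that the function only reads PointCount and follows the Points pointer, which is exactly why the managed side has to hand over the address of a pinned array: the native loop has no way of noticing if the memory behind that pointer moves.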
