Data export tools for .NET

Tags: data export, .net, pdf, docx, xlsx, csv

There are many times that one needs to export some data out of .NET world. I'm not talking about exporting data to JSON or XML, because probably noone would know what to do with it, if she/he is not from IT. Normal people don't even know what JSON or XML is, they just open PDFs, Word/Exel documents or CSVs, so let's focus on that right now.

1. MigraDoc/PDFSharp

This is rather old but still one of the most popular library that can generate PDFs (unfortunately it can't read them). There's a little confusion here since there are actually two libraries: MigraDoc and PDFSharp. PDFSharp is like a foundation for PDF generation and MigraDoc is higher level abstract library that internaly uses PDFSharp. If you want to generate document in an "editor-like" way (with paragraphs, headers/footers, tables etc), then you should use MigraDoc. But if you just want to draw some lines or write some text, then you can use PDFSharp directly.

To start with PDFSharp, simply:

  1. Install from Nuget: Install-Package PDFsharp-MigraDoc-GDI
  2. Then paste this:
PdfDocument doc = new PdfDocument();
PdfPage page = doc.AddPage();
XGraphics gfx = XGraphics.FromPdfPage( page );
gfx.DrawString( "Hello PDF!", new XFont( "Verdana", 20, XFontStyle.Bold ), XBrushes.Black, new XRect( 0, 0, page.Width, page.Height ), XStringFormats.Center );
double size = XUnit.FromMillimeter( 10 );
gfx.DrawBezier( XPens.Blue, new XPoint( 2 * size, 3 * size ), new XPoint( 4 * size, size / 2.0 ), new XPoint( 8 * size, 5 * size ), new XPoint( 10 * size, 3 * size ) );
gfx.DrawBezier( XPens.Red, new XPoint( 2 * size, 3 * size ), new XPoint( 4 * size, 5*size ), new XPoint( 8 * size, size/2 ), new XPoint( 10 * size, 3 * size ) );
doc.Save( "Hello.pdf" );
  1. Tada!

With MigraDoc, you use slightly different API:

Document doc = new Document();
Section section = doc.AddSection(); // always add at least one section
var para = section.AddParagraph( "Hello PDF!" );
para.Format.Font.Size = Unit.FromMillimeter( 20 );
para.Format.Font.Color = new Color( 255, 0, 0, 255 );
var table = section.AddTable(); //
table.Borders.Width = 0.75;
table.TopPadding = Unit.FromMillimeter( 2 );
table.BottomPadding = Unit.FromMillimeter( 2 );
table.Format.Alignment = ParagraphAlignment.Center; // center cell contents
table.Rows.Alignment = RowAlignment.Center; // center horizontally table in page
table.AddColumn(); // need to add at least one column before accessing Row.Cells property
var headerRow = table.AddRow();
headerRow.Shading.Color = Colors.LightGray;
headerRow.Cells[ 0 ].AddParagraph( "Row header" );
table.AddRow().Cells[ 0 ].AddParagraph( "Row data" );
PdfDocumentRenderer renderer = new PdfDocumentRenderer( true, PdfFontEmbedding.Always );
renderer.Document = doc;
renderer.RenderDocument();
renderer.Save( "Hello.pdf" );

But as you can see, it's all pretty straightforward.

The documentation could be better at some points, the API is a bit clumsy sometimes, but on the other hand MigraDoc isn't that complicated to use and there are samples to clarify things a little.

The nice thing is that it works with both desktop and ASP.NET apps, so you can learn once, run anywhere(*).

2. OpenXML from Microsoft

If you want to work with OpenXML files (DOCX, XLSX), this library is for you. The thing is: it's not that obvious to get it since it's not on Nuget officially. You can get it from Github repo: https://github.com/OfficeDev/Open-XML-SDK which has the most recent version (2.6 as the time of writing), but then you'll have to build it by yourself. Or you can get a slightly older version (2.5) directly from MS: https://www.microsoft.com/en-in/download/details.aspx?id=30425 - and I recommend the second approach.

Once installed, you just add a reference to your project (you'll also have to add reference to WindowsBase assembly but despite that it'll still work in web environment).

Now the fun part. Let's create a simple Word (DOCX) document. It's as simple as:

using( WordprocessingDocument package = WordprocessingDocument.Create( "1.docx", WordprocessingDocumentType.Document ) )
{
    MainDocumentPart mainDocumentPart1 = package.AddMainDocumentPart();

    Document document1 = new DocumentFormat.OpenXml.Wordprocessing.Document() { MCAttributes = new MarkupCompatibilityAttributes() { Ignorable = "w14 wp14" } };
    document1.AddNamespaceDeclaration( "o", "urn:schemas-microsoft-com:office:office" );
    document1.AddNamespaceDeclaration( "r", "http://schemas.openxmlformats.org/officeDocument/2006/relationships" );
    document1.AddNamespaceDeclaration( "v", "urn:schemas-microsoft-com:vml" );
    document1.AddNamespaceDeclaration( "w", "http://schemas.openxmlformats.org/wordprocessingml/2006/main" );
    document1.AddNamespaceDeclaration( "w10", "urn:schemas-microsoft-com:office:word" );
    document1.AddNamespaceDeclaration( "wp", "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" );
    document1.AddNamespaceDeclaration( "wps", "http://schemas.microsoft.com/office/word/2010/wordprocessingShape" );
    document1.AddNamespaceDeclaration( "wpg", "http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" );
    document1.AddNamespaceDeclaration( "mc", "http://schemas.openxmlformats.org/markup-compatibility/2006" );
    document1.AddNamespaceDeclaration( "wp14", "http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" );
    document1.AddNamespaceDeclaration( "w14", "http://schemas.microsoft.com/office/word/2010/wordml" );

    Body body1 = new Body();

    Paragraph paragraph1 = new Paragraph();

    ParagraphProperties paragraphProperties1 = new ParagraphProperties();
    ParagraphStyleId paragraphStyleId1 = new ParagraphStyleId() { Val = "Normal" };
    ParagraphMarkRunProperties paragraphMarkRunProperties1 = new ParagraphMarkRunProperties();

    paragraphProperties1.Append( paragraphStyleId1 );
    paragraphProperties1.Append( paragraphMarkRunProperties1 );

    Run run1 = new Run();
    RunProperties runProperties1 = new RunProperties();
    Text text1 = new Text();
    text1.Text = "Hello Word!";

    run1.Append( runProperties1 );
    run1.Append( text1 );

    paragraph1.Append( paragraphProperties1 );
    paragraph1.Append( run1 );

    SectionProperties sectionProperties1 = new SectionProperties();
    SectionType sectionType1 = new SectionType() { Val = SectionMarkValues.NextPage };
    PageSize pageSize1 = new PageSize() { Width = (UInt32Value)11906U, Height = (UInt32Value)16838U };
    PageMargin pageMargin1 = new PageMargin() { Top = 1134, Right = (UInt32Value)1134U, Bottom = 1134, Left = (UInt32Value)1134U, Header = (UInt32Value)0U, Footer = (UInt32Value)0U, Gutter = (UInt32Value)0U };
    PageNumberType pageNumberType1 = new PageNumberType() { Format = NumberFormatValues.Decimal };
    FormProtection formProtection1 = new FormProtection() { Val = false };
    TextDirection textDirection1 = new TextDirection() { Val = TextDirectionValues.LefToRightTopToBottom };

    sectionProperties1.Append( sectionType1 );
    sectionProperties1.Append( pageSize1 );
    sectionProperties1.Append( pageMargin1 );
    sectionProperties1.Append( pageNumberType1 );
    sectionProperties1.Append( formProtection1 );
    sectionProperties1.Append( textDirection1 );

    body1.Append( paragraph1 );
    body1.Append( sectionProperties1 );

    document1.Append( body1 );

    mainDocumentPart1.Document = document1;

    package.PackageProperties.Creator = "Krzysztof";
    package.PackageProperties.Title = "";
    package.PackageProperties.Revision = "1";
    package.PackageProperties.Language = "pl-PL";
}

And you end up with beautiful DOCX document. As you can see, the API is very user friendly and extremely easy to use. At least in Microsoft's twisted way of thinking about how to make things simple. It's because this library is like a thin wrapper over XML to generate OpenXML documents from C# code. So in order to make this usable at all, there's this great tool in SDK called Open XML SDK v2.5 Productivity Tool. You can load any DOCX/XLSX document in it and then it can generate C# code that creates this exact document. It's a totally life-saver and it's what makes this library usable at all.

So my workflow when using this library looked like this:

  1. take a few deep breaths
  2. create DOCX/XLSX document in Word or LibreOffice with some placeholder text to be easily spotted
  3. generate C# code for this document from Productivity Tool
  4. copy and paste generated code to my code
  5. replace placeholders with dynamic data while trying not to change anything else because even changing the order of some xml elements could make the generated file unusable

And voila - the DOCX/XLSX generator is complete.

As you might expect, there's almost no documentation on MSDN, but there are some useful resources in Eric White's blog: http://ericwhite.com/blog/ so I guess you should check it out if you want to dive deeper into this library.

3. CSVHelper

Sure you can generate CSV by yourself with StringBuilders, but why bother when you can use this great little library. You can use it much like JSON.NET with only one line or you can create mappings to better translate your .NET model to CSV. You can configure many export settings like delimiter or character encoding but it can also properly quote-encode all the values, so you don't have to worry about that. If you tried exporting manually to CSV at least once, you'll definitely appreciate this feature.

You can get it from Nuget: Install-Package CsvHelper

Say, you have some data model:

public class Data
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
}

To write IEnumerable of this data to CSV simply:

IEnumerable items = ... // get the data
var csvConf = new CsvHelper.Configuration.CsvConfiguration(); // this is optional
// csvConf.RegisterClassMap<DataCSVMapping>(); // use this when you have custom mapping (see below)
csvConf.Delimiter = ";";
csvConf.Encoding = Encoding.UTF8;
csvConf.QuoteAllFields = true;
using( var csv = new CsvHelper.CsvWriter( new StreamWriter( "1.csv" ), csvConf ) )
{
    csv.WriteRecords( items );
}

That's it! If you have some more complex scenario or want to customize the output (like column names or order), you can use mappings for that. This way, you can also have nested objects:

public class Inner
{
    public string Position { get; set; }
    public string Name { get; set; }
}

public class Data
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
    public Inner Child { get; set; }
}
// for each type define mapping:
public class InnerCSVMapping : CsvClassMap<Inner>
{
    public InnerCSVMapping()
    {
        Map( x => x.Position );
        Map( x => x.Name );
    }
}

public class DataCSVMapping : CsvClassMap<Data>
{
    public DataCSVMapping()
    {
        Map( x => x.Id );
        Map( x => x.Title );
        Map( x => x.DateCreated ).TypeConverterOption( "yyyy-MM-dd" ); // formatting sample
        References<InnerCSVMapping>( x => x.Child ); // for complex types use this method
    }
}

And when you have slight more complex model, like so:

public class Data
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
    public List<Inner> Children { get; set; }
}

Then you need to either flatten it by yourself or create reusable type converter:

public class InnerListTypeConverter : CsvHelper.TypeConversion.ITypeConverter
{
    #region ITypeConverter Members
    public bool CanConvertFrom( Type type ) { return type is IEnumerable<Inner>; }
    public bool CanConvertTo( Type type ) { return true; }
    public object ConvertFromString( CsvHelper.TypeConversion.TypeConverterOptions options, string text ) { return null; }
    public string ConvertToString( CsvHelper.TypeConversion.TypeConverterOptions options, object value )
    {
        var innerList = value as IEnumerable<Inner>;
        if( innerList == null )
            return null;
        return string.Join( ";", innerList.Select( x => x.Name ) );
    }
    #endregion
}

And use it in mapping:

public class DataCSVMapping : CsvClassMap<Data>
{
    public DataCSVMapping()
    {
        Map( x => x.Id );
        Map( x => x.Title );
        Map( x => x.DateCreated ).TypeConverterOption( "yyyy-MM-dd" );
        Map( x => x.Children ).TypeConverter<InnerListTypeConverter>();
    }
}

It may seem like it's a lot for a simple task as CSV writing, but in reality it's not that bad, it's all quite powerful and reusable. And as you can see, the API itself is quite nice and modern (fluent, no attributes).

Bonus. ASP.NET Webforms ReportViewer

Back in the day there was something like ASP.NET Webforms Reporting Services which included ReportViewer webforms control. You could fed it with your custom data or data pulled directly from SQL. To be honest, I used it a few times and it wasn't that bad. You could generate some "cool" looking plots and charts, there also was a GUI builder for reports, but it also could export data to DOC/XLS formats which was a very nice feature.

But since probably nobody uses or knows ASP.NET Webforms any more, I just wanted to mention it without going any deeper.

(*) yes, I know, it's ReactJS punchline, but it fits here as well if we are talking about MS environments.

Read also

3 essential tools/apps for devs or IT

Here's my list of 3 favorite tools or apps for every IT guy or developer, regardless of technology you are breathing with.

Comments