Thursday, June 05, 2008

LINQ to CSV

I thought it would be nice to be able to produce a CSV file by doing something like this:

string ordersCsv = orderRepository.GetAll().Select(o => new 
{ 
    OrderId = o.OrderId,
    Email = o.Email,
    OrderStatus = o.OrderStatus.Name,
    CreatedDate = o.CreatedDate,
    Total = o.Basket.Total
}).AsCsv();

So here's an extension method to do just that:

public static string AsCsv<T>(this IEnumerable<T> items)
    where T : class
{
    var csvBuilder = new StringBuilder();
    var properties = typeof (T).GetProperties();
    foreach (T item in items)
    {
        string line = properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray().Join(",");
        csvBuilder.AppendLine(line);
    }
    return csvBuilder.ToString();
}

private static string ToCsvValue<T>(this T item)
{
    if (item is string)
    {
        return "\"{0}\"".With(item.ToString().Replace("\"", "\\\""));
    }
    double dummy;
    if (double.TryParse(item.ToString(), out dummy))
    {
        return "{0}".With(item);
    }
    return "\"{0}\"".With(item);
}

It's work with anything that implements IEnumerable<T>, that includes the results of LINQ-to-SQL queries , arrays, List<T> and pretty much any kind of collection. Here's it's unit test:

[TestFixture]
public class EnumerableExtensionsTests
{
    [Test]
    public void GetCsv_ShouldRenderCorrectCsv()
    {
        IEnumerable<Thing> things = new List<Thing>()
            {
                new Thing
                    {
                        Id = 12,
                        Name = "Thing one",
                        Date = new DateTime(2008, 4, 20),
                        Child = new Child
                                    {
                                        Name = "Max"
                                    }
                    },
                new Thing
                    {
                        Id = 13,
                        Name = "Thing two",
                        Date = new DateTime(2008, 5, 20),
                        Child = new Child
                                    {
                                        Name = "Robbie"
                                    }
                    }
            };

        string csv = things.Select(t => new { Id = t.Id, Name = t.Name, Date = t.Date, Child = t.Child.Name }).AsCsv();

        Assert.That(csv, Is.EqualTo(expectedCsv));
    }

    const string expectedCsv = 
@"12,""Thing one"",""20/04/2008 00:00:00"",""Max""
13,""Thing two"",""20/05/2008 00:00:00"",""Robbie""
";

    public class Thing
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public DateTime Date { get; set; }
        public Child Child { get; set; }
    }

    public class Child
    {
        public string Name { get; set; }
    }
}

11 comments:

Anonymous said...

Cool stuff.

One worry on the date format though. i think it would be better to use a portable format (i.e - insensitive to mmdd/ddmm and to timezones)
so:
if (item is DateTime)
{
return string.Format("{0:u}", item);
}

Mike Hadlow said...

Hi Ken,

Thanks, that's a good suggestion.

There are other things that I really should do like escape commas properly and new lines. It's not really the most robust CSV implementation at the moment.

LuisFX said...

Hi Mike, when I try to compile, I get an error "string does not contain a definition for 'With'..." at "return "\"{0}\"".With(item) ... any ideas?

Luis.

Mike Hadlow said...

Hi Luisfx,

Sorry 'With' is an extension method on string in place of the string.format statement. Just change the line to:

return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\""));

And it should work fine.

Anonymous said...

Mike,

You haven't thought about the scenario where the thing you are converting has a string that has a comma in it!

Anonymous said...

Woops,

I just saw your comment!

Anonymous said...

Hi Mike
I'm also getting the error
"No overload for method 'Join' takes '1' arguments",
any idea why?

Thanks

Anonymous said...

Hi,

I get the same error:

"No overload for method 'Join' takes '1' arguments",

Mike Hadlow said...

Hi Anonymous,

I can't remember now, but I may have written a little extension method for join. It's a simple matter to use string.Join() instead.

Scott said...

if you want to had a line with the names of the linq properties do something like:

if (UseHeader)
{
csvBuilder.Append(string.Join(",", (from a in properties select a.Name).ToArray()));
}

put this above the line that iterates over the ienumerable

Anonymous said...

Changed your ToCsvValue function to handle nulls, and got rid of the custom Extension methods:

private static string ToCsvValue(this T item)
{
if (item is string)
{
return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\"")); ;
}

double dummy;
if (item == null)
return "";

if (double.TryParse(item.ToString(), out dummy))
return string.Format("{0}", item);

return string.Format("\"{0}\"", item);
}

Also, regarding Scott's post on getting the headers: Change Append() to AppendLine() to avoid losing the first row of data!