Built in .NET CSV Parser

In administrative systems, there is often a need to import and parse CSV files. .NET actually has a built-in CSV parser, although it is well hidden in a VB.NET namespace. If I had known about it, I wouldn’t have had to write all those custom (sometimes buggy) parsers.

To really test the parser, I’m going to parse a CSV file in the Swedish format.

Name; FactoryLocation; EstablishedYear; ProfitMillionSEK
Volvo; "Gothenburg, Sweden; Gent, Belgium"; 1926; 0,345463
#A comment line
Saab; Trollhättan, Sweden; 1945; -3 009

Note that there is an embedded ; in the FactoryLocation field of Volvo, which is part of the field text and not a field delimiter.

There are three special formatting rules that apply to Swedish CSV files.

  • The decimal delimiter is ,
  • The field delimiter is ; so that it is not confused with the decimal delimiter
  • The thousands separator in numbers is a space.
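
As a rough sketch of what the decimal and thousands rules mean for number parsing: the helper below builds an explicit NumberFormatInfo with a comma as decimal separator and a plain space as group separator. The SwedishNumbers class is a hypothetical name used only for illustration, and the explicit format is an assumption made so that the sketch does not depend on the sv-SE culture data installed on a given machine.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original post.
public static class SwedishNumbers
{
    // Assumption: an explicit format instead of CultureInfo("sv-SE"),
    // so the result does not vary with the platform's culture data.
    public static readonly NumberFormatInfo Format = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = " "
    };

    public static double Parse(string text)
    {
        // Float allows a leading sign and a decimal point;
        // AllowThousands accepts the group separator between digits.
        return double.Parse(
            text,
            NumberStyles.Float | NumberStyles.AllowThousands,
            Format);
    }
}
```

With this format, SwedishNumbers.Parse("0,345463") reads the comma as a decimal point, and SwedishNumbers.Parse("-3 009") reads the space as a thousands separator.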

I really did my best to come up with a format that demands flexibility from the parser, but the TextFieldParser has very flexible configuration options and just worked.

To use the TextFieldParser, a reference to the Microsoft.VisualBasic assembly has to be added to the project. Then it’s just a matter of instantiating the parser, setting the needed configuration through properties, and starting to parse.

// TextFieldParser is in the Microsoft.VisualBasic.FileIO namespace.
// swedishCulture is assumed to be defined elsewhere, e.g. CultureInfo.GetCultureInfo("sv-SE").
using (TextFieldParser parser = new TextFieldParser(path))
{
    parser.CommentTokens = new string[] { "#" };
    parser.SetDelimiters(new string[] { ";" });
    parser.HasFieldsEnclosedInQuotes = true;

    // Skip over header line.
    parser.ReadLine();

    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields();
        yield return new Brand()
        {
            Name = fields[0],
            FactoryLocation = fields[1],
            EstablishedYear = int.Parse(fields[2]),
            Profit = double.Parse(fields[3], swedishCulture)
        };
    }
}

The parser can be configured with comment tokens and delimiters, and it can handle fields enclosed in quotes. There are also multiple read functions: it can read a line as a plain string, split a line into fields, or read the remainder of a file as one huge string. To be honest, it’s way better than any of the parsers I’ve written.
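
The three read functions can be tried side by side. This is a minimal sketch, assuming the sample file from above is held in a string and fed to the parser through a StringReader. The ReadModes class and its method names are hypothetical; ReadLine, ReadFields and ReadToEnd are the TextFieldParser methods themselves.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical demo class, not part of the original post.
public static class ReadModes
{
    public const string Sample =
        "Name; FactoryLocation; EstablishedYear; ProfitMillionSEK\n" +
        "Volvo; \"Gothenburg, Sweden; Gent, Belgium\"; 1926; 0,345463\n" +
        "#A comment line\n" +
        "Saab; Trollhättan, Sweden; 1945; -3 009\n";

    private static TextFieldParser CreateParser(string csv)
    {
        // Same configuration as in the post, but reading from a string.
        var parser = new TextFieldParser(new StringReader(csv));
        parser.CommentTokens = new string[] { "#" };
        parser.SetDelimiters(";");
        parser.HasFieldsEnclosedInQuotes = true;
        return parser;
    }

    // ReadLine: returns the current line as one raw string, no splitting.
    public static string ReadHeaderLine(string csv)
    {
        using (TextFieldParser parser = CreateParser(csv))
        {
            return parser.ReadLine();
        }
    }

    // ReadFields: splits the next data line on the configured delimiters.
    public static string[] ReadFirstRecord(string csv)
    {
        using (TextFieldParser parser = CreateParser(csv))
        {
            parser.ReadLine(); // skip the header
            return parser.ReadFields();
        }
    }

    // ReadToEnd: returns everything after the current position as one string.
    public static string ReadRest(string csv)
    {
        using (TextFieldParser parser = CreateParser(csv))
        {
            parser.ReadLine();   // skip the header
            parser.ReadFields(); // consume the first record
            return parser.ReadToEnd();
        }
    }
}
```

ReadFirstRecord should hand back the Volvo line as four fields, with the quoted FactoryLocation kept together despite the embedded semicolon.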

The only thing I could possibly wish for is built-in conversion to data types other than strings, along with object materialization. It could be an interesting thing to write, so maybe I’ll come back with a materialization wrapper.

Wouldn’t it be cool to have a data annotations based CSV parser? Create a class with the proper annotations and then automatically parse data from a CSV file!
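
To make the idea a bit more concrete, here is one way such a wrapper could look. It is only a sketch under assumptions, not an existing API: CsvColumnAttribute, AnnotatedBrand and CsvMaterializer are all hypothetical names, the attribute is a custom one rather than anything from System.ComponentModel.DataAnnotations, and the conversion simply leans on Convert.ChangeType.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical attribute: maps a property to a zero-based field index.
[AttributeUsage(AttributeTargets.Property)]
public sealed class CsvColumnAttribute : Attribute
{
    public CsvColumnAttribute(int index) { Index = index; }
    public int Index { get; private set; }
}

// Hypothetical annotated class mirroring the columns of the sample file.
public class AnnotatedBrand
{
    [CsvColumn(0)] public string Name { get; set; }
    [CsvColumn(1)] public string FactoryLocation { get; set; }
    [CsvColumn(2)] public int EstablishedYear { get; set; }
    [CsvColumn(3)] public double Profit { get; set; }
}

public static class CsvMaterializer
{
    // Creates a T and fills every annotated property from the field
    // at the annotated index, converting with the given format provider.
    public static T Materialize<T>(string[] fields, IFormatProvider format)
        where T : new()
    {
        T item = new T();
        foreach (PropertyInfo property in typeof(T).GetProperties())
        {
            var column = (CsvColumnAttribute)Attribute.GetCustomAttribute(
                property, typeof(CsvColumnAttribute));
            if (column == null)
            {
                continue; // property is not mapped to a CSV field
            }

            object value = Convert.ChangeType(
                fields[column.Index], property.PropertyType, format);
            property.SetValue(item, value, null);
        }
        return item;
    }
}
```

The string array returned by ReadFields could be passed straight to Materialize, together with a Swedish format provider, to get a populated object back. Error handling for missing fields or failed conversions is left out of the sketch.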

  • miki on 2013-01-21

    This is very simple, the most important part is missing – ability to process multiline strings (in quotes).

    • lunchbeast on 2013-03-19

      ‘Most’ important part is missing? The most important stuff is there and well explained. You should be able to figure out the rest.

      • Tom on 2015-03-19

        While the quoted multiline strings may not be ‘the most’ important part, it is still required for a valid csv reader. “You should be able to figure out the rest” – the only way to do this is to NOT use TextFieldParser because it can’t do it.

      • Anders Abel on 2015-03-19

        I don’t think that handling multi lines is the “most important” part of a CSV reader. In most cases strings are not multiline. But in the case that you do need the multiline capability, the TextFieldParser is obviously not the right tool for the task.

  • Brian on 2013-07-17

    Thank you Anders – this is most helpful!

  • Danny Warren on 2014-01-10

    Life Saver! And Life Changing! ;-) I am poised to parse a couple different kinds of CSV this sprint and was just looking for tips on how to do it intelligently. Thanks for showing me the light. I will never manually parse another CSV file again!

  • Tony on 2014-06-25

    Excellent post! Thanks! Works great (converted to VB.. forgive the trespass.. still working on learning C#).

  • Antony on 2014-08-09

    Thanks Anders, you just save me writing my own parser.

  • Nidhi on 2015-03-29

    Thanks, it works. Values within quotes containing commas & new-lines also worked. What’s all the talk below about this not working?

    • Anders Abel on 2015-03-30

      Well, actually I never tested new lines within the quotes myself. I just assumed it didn’t work based on the comments. If that works too, it’s even better. Writing a good csv parser that covers all cases is obviously non-trivial.

  • NBM on 2015-06-08

    It is obviously a very good way to parse flat files, unless you need better performance. It takes almost 9 times as much time as using StreamReader. Is there any way to speed it up?

Software Development is a Job – Coding is a Passion

I'm Anders Abel, a systems architect and developer working for Kentor in Stockholm, Sweden.


The complete code for all posts is available on GitHub.
