Reading and Writing CSV Files in MFC

By Jonathan Wood on 12/2/2010 (Updated on 12/17/2010)
Language: C++
Technology: MFC
Platform: Windows
License: CPOL

Introduction

I recently had the need to import and export CSV files in an MFC application. A CSV (Comma-Separated Values) file is a plain-text file where each row contains one or more fields, separated by commas.

CSV files are probably best known for their use with Microsoft Excel. They provide a convenient format for sharing spreadsheet data between applications, particularly since having your application work directly with native Excel files would be a very complex task.

The CSV file format is not complex. Mostly, it just requires some simple parsing. The one trick is a data field that contains a comma: because commas delimit fields, such a field must be enclosed in double quotes. That in turn gives double quotes a special meaning, so a field that contains a double quote is also enclosed in double quotes, and each double quote in the data is written as a pair of double quotes.
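To make the quoting rules concrete, here is a minimal sketch of them in portable C++. It uses std::string rather than CString so it compiles without MFC, and the helper name EscapeCsvField is my own invention for illustration; it is not part of the CCSVFile class shown later.

```cpp
#include <string>

// Hypothetical helper (not part of CCSVFile): returns the field as it
// should appear in a CSV row, following the quoting rules described above.
std::string EscapeCsvField(const std::string& field)
{
    // A field with no comma and no double quote is written unchanged
    if (field.find_first_of(",\"") == std::string::npos)
        return field;

    std::string out = "\"";      // opening quote
    for (char ch : field)
    {
        if (ch == '"')
            out += '"';          // write each embedded quote as a pair
        out += ch;
    }
    out += '"';                  // closing quote
    return out;
}
```

With this sketch, the field Salt Lake City, Utah is written as "Salt Lake City, Utah", and the field say "hi" is written as "say ""hi""".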

My CCSVFile Class

The header file for the CCSVFile class is shown in Listing 1 and the source file is shown in Listing 2.

Listing 1: CSVFile.h

#pragma once
#include "afx.h"

class CCSVFile : public CStdioFile
{
public:
  enum Mode { modeRead, modeWrite };
  CCSVFile(LPCTSTR lpszFilename, Mode mode = modeRead);
  ~CCSVFile(void);
  bool ReadData(CStringArray &arr);
  void WriteData(CStringArray &arr);
#ifdef _DEBUG
  Mode m_nMode;
#endif
};

The class is very simple: it only contains two methods (besides the constructor and destructor).

Note that it is up to the caller to ensure that only the ReadData() method is called when the constructor specified read mode, and only the WriteData() method is called when the constructor specified write mode. To help catch mistakes, debug builds (where _DEBUG is defined) assert that the mode matches the method being called.

Listing 2: CSVFile.cpp

#include "StdAfx.h"
#include "CSVFile.h"

CCSVFile::CCSVFile(LPCTSTR lpszFilename, Mode mode)
  : CStdioFile(lpszFilename, (mode == modeRead) ?
    CFile::modeRead|CFile::shareDenyWrite|CFile::typeText   
    :
    CFile::modeWrite|CFile::shareDenyWrite|CFile::modeCreate|CFile::typeText)
{
#ifdef _DEBUG
  m_nMode = mode;
#endif
}

CCSVFile::~CCSVFile(void)
{
}

bool CCSVFile::ReadData(CStringArray &arr)
{
  // Verify correct mode in debug build
  ASSERT(m_nMode == modeRead);

  // Read next line
  CString sLine;
  if (!ReadString(sLine))
    return false;

  LPCTSTR p = sLine;
  int nValue = 0;

  // Parse values in this line
  while (*p != '\0')
  {
    CString s;  // String to hold this value
    if (*p == '"')
    {
      // Bump past opening quote
      p++;

      // Parse quoted value
      while (*p != '\0')
      {
        // Test for quote character
        if (*p == '"')
        {
          // Found one quote
          p++;

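          // If pair of quotes, keep one
          // Else interpret as end of value
          if (*p != '"')
          {
            // Skip the character after the closing quote (normally
            // a comma), but never step past the end of the string
            if (*p != '\0')
              p++;
            break;
          }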
        }

        // Add this character to value
        s.AppendChar(*p++);
      }
    }
    else
    {
      // Parse unquoted value
      while (*p != '\0' && *p != ',')
      {
        s.AppendChar(*p++);
      }

      // Advance to next character (if not already end of string)
      if (*p != '\0')
        p++;
    }

    // Add this string to value array
    if (nValue < arr.GetCount())
      arr[nValue] = s;
    else
      arr.Add(s);

    nValue++;
  }

  // Trim off any unused array values
  if (arr.GetCount() > nValue)
    arr.RemoveAt(nValue, arr.GetCount() - nValue);

  // We return true if ReadString() succeeded--even if no values
  return true;
}

void CCSVFile::WriteData(CStringArray &arr)
{
  static TCHAR chQuote = '"';
  static TCHAR chComma = ',';

  // Verify correct mode in debug build
  ASSERT(m_nMode == modeWrite);

  // Loop through each string in array
  for (int i = 0; i < arr.GetCount(); i++)
  {
    // Separate this value from previous
    if (i > 0)
      WriteString(_T(","));

    // We need special handling if string contains
    // comma or double quote
    bool bComma = (arr[i].Find(chComma) != -1);
    bool bQuote = (arr[i].Find(chQuote) != -1);
    if (bComma || bQuote)
    {
      Write(&chQuote, sizeof(TCHAR));
      if (bQuote)
      {
        for (int j = 0; j < arr[i].GetLength(); j++)
        {
          // Pairs of quotes interpreted as single quote
          if (arr[i][j] == chQuote)
            Write(&chQuote, sizeof(TCHAR));
          TCHAR ch = arr[i][j];
          Write(&ch, sizeof(TCHAR));
        }
      }
      else
      {
        WriteString(arr[i]);
      }
      Write(&chQuote, sizeof(TCHAR));
    }
    else
    {
      WriteString(arr[i]);
    }
  }
  WriteString(_T("\n"));
}

There are many ways to go about parsing text. I was more comfortable manually stepping through the text, character by character, and so that's what the code does.
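For readers who want to try the character-by-character approach outside of MFC, the same idea can be sketched in portable C++. This is only an illustration under my own assumptions: the function name ParseCsvLine is hypothetical, it works on std::string instead of CString, and it returns a new vector rather than reusing a caller-supplied array.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-alone sketch of the parsing loop in ReadData().
std::vector<std::string> ParseCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    const std::size_t n = line.size();
    std::size_t i = 0;

    while (i < n)
    {
        std::string value;
        if (line[i] == '"')
        {
            ++i;  // bump past opening quote
            while (i < n)
            {
                if (line[i] == '"')
                {
                    ++i;  // found one quote
                    if (i >= n || line[i] != '"')
                    {
                        // Closing quote: skip the character that follows
                        // (normally a comma) unless we are at the end
                        if (i < n)
                            ++i;
                        break;
                    }
                    // Otherwise it was a pair; fall through and keep one
                }
                value += line[i++];
            }
        }
        else
        {
            // Unquoted value runs up to the next comma or end of line
            while (i < n && line[i] != ',')
                value += line[i++];
            if (i < n)
                ++i;  // skip the comma
        }
        fields.push_back(value);
    }
    return fields;
}
```

Like the listing above, this sketch stops as soon as the input is exhausted, so a line that ends with a comma does not produce a trailing empty field.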

As the names imply, ReadData() reads from a CSV file and WriteData() writes to one. Each line of data is stored in a CStringArray, which is passed by reference to both functions. The caller must call these methods once for each line. ReadData() returns false when the end of the file is reached; an empty line still returns true, with an empty array.
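The calling pattern is a simple loop that reuses one array until the read function returns false. The sketch below shows that pattern in portable C++ so it can be tried without MFC. ReadRow and CountFields are hypothetical names of my own, ReadRow stands in for ReadData(), and to keep the example short it splits on commas only, with no quote handling.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for CCSVFile::ReadData(): reads one line into
// 'row' and returns false at end of input. Splits on commas only.
bool ReadRow(std::istream& in, std::vector<std::string>& row)
{
    std::string line;
    if (!std::getline(in, line))
        return false;

    row.clear();  // discard values left over from the previous line
    std::size_t start = 0;
    while (start < line.size())
    {
        std::size_t comma = line.find(',', start);
        if (comma == std::string::npos)
            comma = line.size();
        row.push_back(line.substr(start, comma - start));
        start = comma + 1;
    }
    return true;  // true even when the line held no values
}

// Caller side: one call per line, reusing the same array each time.
std::size_t CountFields(std::istream& in)
{
    std::vector<std::string> row;
    std::size_t total = 0;
    while (ReadRow(in, row))
        total += row.size();
    return total;
}
```

With the real class, the equivalent loop would construct a CCSVFile in read mode, declare one CStringArray, and call ReadData() on it until it returns false.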

Conclusion

As mentioned, the code is fairly simple, but I've used it on several occasions and have even ported it to C#. Perhaps you will find it useful as well.

End-User License

Use of this article and any related source code or other files is governed by the terms and conditions of The Code Project Open License.

Author Information

Jonathan Wood

I'm a software/website developer working out of the greater Salt Lake City area in Utah. I've developed many websites including Black Belt Coder, Insider Articles, and others.