Streaming csv parser inspired by binary-csv that aims to be faster than everyone else

amismailz/csv-parser

 
 

csv-parser

Streaming CSV parser that aims for maximum speed as well as compatibility with the csv-spectrum CSV acid test suite

npm install csv-parser

csv-parser can convert CSV into JSON at a rate of around 90,000 rows per second. Performance varies with the data, so try bench.js with your own data.

Usage

Instantiate csv, pipe a CSV file into it, and read the rows out as objects.

You can use csv-parser in the browser with browserify.

Let's say that you have a CSV file some-csv-file.csv like this:

NAME,AGE
Daffy Duck,24
Bugs Bunny,22

You can parse it like this:

var csv = require('csv-parser')
var fs = require('fs')

fs.createReadStream('some-csv-file.csv')
  .pipe(csv())
  .on('data', function (data) {
    console.log('Name: %s Age: %s', data.NAME, data.AGE)
  })

The data emitted is a normalized JSON object. Each header is used as the property name of the object.

The csv constructor also accepts the following options:

var stream = csv({
  raw: false,     // set to true to skip decoding to utf-8 strings
  separator: ',', // specify optional cell separator
  quote: '"',     // specify optional quote character
  escape: '"',    // specify optional escape character (defaults to quote value)
  newline: '\n',  // specify a newline character
  strict: true    // require column length match headers length
})
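
To illustrate what the strict option checks, here is a minimal sketch. It is not csv-parser's actual implementation: the helper name rowToObject and its error message are hypothetical, and it only shows the idea of mapping cells onto header names and rejecting rows whose length differs from the headers.

```javascript
// Hypothetical helper, not part of csv-parser: maps one row of cells
// onto header names and optionally enforces matching lengths.
function rowToObject (headers, cells, strict) {
  if (strict && cells.length !== headers.length) {
    throw new Error('Row length does not match headers')
  }
  var row = {}
  headers.forEach(function (header, i) {
    row[header] = cells[i]
  })
  return row
}

console.log(rowToObject(['NAME', 'AGE'], ['Daffy Duck', '24'], true))
```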

It also accepts an array that specifies the headers for the returned objects:

var stream = csv(['index', 'message'])

// origin is any readable stream of lines in the format 12312,Hello World
origin.pipe(stream)
  .on('data', function (data) {
    console.log(data) // Should output { index: '12312', message: 'Hello World' }
  })

The headers can also be passed in the options object:

var stream = csv({
  raw: false,     // set to true to skip decoding to utf-8 strings
  separator: ',', // specify optional cell separator
  quote: '"',     // specify optional quote character
  escape: '"',    // specify optional escape character (defaults to quote value)
  newline: '\n',  // specify a newline character
  headers: ['index', 'message'] // specify the headers
})

If you do not specify the headers, csv-parser uses the first line of the CSV as the headers.

The encoding of the source file can also cause problems. The source stream can be transcoded with something like iconv-lite, the Node bindings to iconv, or native iconv if it is part of a pipeline.
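
As a minimal sketch of transcoding inside a pipeline, the example below converts Latin-1 input to UTF-8 using only Node's built-in stream and Buffer modules rather than iconv-lite. The function name latin1ToUtf8 is hypothetical, and Latin-1 is an assumed source encoding. Because Latin-1 is a single-byte encoding, each chunk can be converted on its own.

```javascript
var Transform = require('stream').Transform

// Hypothetical helper: re-encodes Latin-1 bytes as UTF-8.
function latin1ToUtf8 () {
  return new Transform({
    transform: function (chunk, encoding, callback) {
      // Decode the incoming bytes as Latin-1, then re-encode as UTF-8.
      callback(null, Buffer.from(chunk.toString('latin1'), 'utf8'))
    }
  })
}
```

Such a transform would sit between the file stream and the parser, along the lines of fs.createReadStream('some-csv-file.csv').pipe(latin1ToUtf8()).pipe(csv()). Multi-byte source encodings need a decoder that handles characters split across chunks, which is where a library such as iconv-lite is the better fit.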

Events

The following events are emitted during parsing.

data

Emitted for each parsed row except the header row, as shown in the usage example above.

headers

After the header row is parsed this event is emitted. An array of header names is supplied as the payload.

fs.createReadStream('some-csv-file.csv')
  .pipe(csv())
  .on('headers', function (headerList) {
    console.log('First header: %s', headerList[0])
  })

Other Readable Stream Events

The usual Readable stream events are also emitted. Use the end event to detect the end of parsing.

fs.createReadStream('some-csv-file.csv')
  .pipe(csv())
  .on('data', function (data) {
    // Process row
  })
  .on('end', function () {
    // We are done
  })
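
When all rows are needed at once, the data and end events can be wrapped in a Promise. The sketch below is one way to do that, not something csv-parser provides: collectRows is a hypothetical helper, and Readable.from stands in for a parser stream so the example runs without a CSV file.

```javascript
var Readable = require('stream').Readable

// Hypothetical helper: gathers every 'data' payload from an
// object-mode stream and resolves once 'end' fires.
function collectRows (stream) {
  return new Promise(function (resolve, reject) {
    var rows = []
    stream
      .on('data', function (row) { rows.push(row) })
      .on('end', function () { resolve(rows) })
      .on('error', reject)
  })
}

// Stand-in for a parser stream of row objects.
var demo = Readable.from([{ NAME: 'Daffy Duck', AGE: '24' }])
collectRows(demo).then(function (rows) {
  console.log('Parsed %d row(s)', rows.length)
})
```

With a real file, the argument would be the piped parser, for example collectRows(fs.createReadStream('some-csv-file.csv').pipe(csv())). Collecting everything in memory gives up the streaming benefit, so this suits small files only.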

Command line tool

There is also a command line tool available. It will convert csv to line delimited JSON.

npm install -g csv-parser

Open a shell and run

$ csv-parser --help # prints all options
$ printf "a,b\nc,d\n" | csv-parser # parses input
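
Line delimited JSON puts one JSON object on each line. A minimal sketch of reading such output back into objects follows. The helper name parseLines is hypothetical, and the sample text is written by hand for illustration, not captured from the tool.

```javascript
// Hypothetical helper: parses line delimited JSON into an array of objects.
function parseLines (text) {
  return text
    .split('\n')
    .filter(function (line) { return line.trim() !== '' }) // skip blank lines
    .map(function (line) { return JSON.parse(line) })
}

var sample = '{"a":"c","b":"d"}\n{"a":"e","b":"f"}\n'
console.log(parseLines(sample))
```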

Options

You can specify these CLI flags to control how the input is parsed:

Usage: csv-parser [filename?] [options]

  --headers,-h        Explicitly specify csv headers as a comma separated list
  --output,-o         Set output file. Defaults to stdout
  --separator,-s      Set the separator character ("," by default)
  --quote,-q          Set the quote character ('"' by default)
  --escape,-e         Set the escape character (defaults to quote value)
  --strict            Require column length match headers length
  --version,-v        Print out the installed version
  --help              Show this help

For example, to parse a TSV file:

cat data.tsv | csv-parser -s $'\t'

Related

  • neat-csv - Promise convenience wrapper

License

MIT
