Add command line tool and support (plus split large files) #131

Open
wants to merge 41 commits into
base: gh-pages

Conversation

danamlewis

Created a command line version of the tool, including support for splitting large files.

  • Some functions were converted to work on the command line; they should be tested to ensure backwards compatibility with your web version.
  • If jsonsplit.sh is helpful, it could also be used to keep the web version from choking on large files. The command line version works on files ranging from 2 KB to 1.32 GB, with very complex schemas :)

scottleibrand and others added 30 commits February 9, 2017 18:48
stub of command-line js-only version of index.html
clean up filenames; print progress lines
@konklone
Owner

Thank you so much for doing this, @danamlewis. I want to give this proper attention and get it integrated into the repo (and at least referenced on the website), though unfortunately I am about to take off early in the morning for a couple of weeks of travel. My apologies in advance for my delay in giving this the energy it deserves!

@danamlewis
Author

Hi @konklone, I was talking with someone who's been using this on the command line and finding it helpful for their work, which reminded me to check in on this PR. I still think folks who find your web tool might find a command line version helpful, too, so please let me know if there's anything I can do to help you review it (likewise, I understand if you'd prefer to close this and not incorporate it; just let me know). Thanks again for the ongoing work on this type of tool!

@konklone
Owner

@danamlewis So, I'm really impressed with your work on complex-json2csv and jsonsplit here. I think the approach is a little inefficient and mixes some technical concerns, so let's see if we can talk through a way forward we both like.

This PR adds a package.json to this repo (which is currently published, so the complex-json2csv npm package contains this whole repo), and the Node module makes use of the existing functions, as well as jquery-csv. The Node module code also references functions that perform HTML work, like showing and hiding tables, which I think were just copied from the original site.js.

This has some downsides -- there's copied/pasted code that isn't used, there's HTML work being done that isn't relevant to command-line processing of JSON->CSV, and the Node module as published has a ton of stuff in it that isn't relevant to operation of the module.

There's also a missed opportunity, I think, to integrate in the other direction -- have the Node module contain the core processing code, and then have the website JS call into the module JS, rather than having the module JS call into the website JS.

The approach I think would work the best looks like this:

  • There's a new GitHub repo containing just a Node module that exposes a CLI and a JS API to convert JSON into CSV.
  • That Node module ideally doesn't use jquery-csv, but a standalone (pure-JS, so it can be run in-browser) CSV serializer.
  • The functionality of jsonsplit.sh is ideally written in JS rather than shell scripting and is a helper function of the Node module.
  • The complex-json2csv npm module is republished with just the contents of that GitHub repository. (I'd also suggest a rename to just json2csv.)
  • This GitHub repo contains just the website, which snapshots a version of the Node module locally (could be done with browserify, but might be simpler as a one-off) and has instructions for the command to run to update the snapshotted version.
  • The website is updated to cite you and the Node module.
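
As a rough illustration of the shape this could take, here is a minimal sketch of a browser-safe core module with no jQuery or DOM dependencies. The function names (`flatten`, `toCsv`, `splitRecords`) and the dotted-key flattening scheme are assumptions made for illustration, not code from this PR or from site.js, and `splitRecords` only approximates what jsonsplit.sh is described as doing.

```javascript
// Hypothetical sketch of a pure-JS core; every name here is illustrative.

// Flatten nested objects/arrays into one level, joining keys with dots
// (assumed scheme), e.g. {a: {b: 1}} becomes {"a.b": 1}.
function flatten(value, prefix = "", out = {}) {
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value);
    // Keep a column for empty nested objects/arrays.
    if (keys.length === 0 && prefix) out[prefix] = "";
    for (const key of keys) {
      flatten(value[key], prefix ? `${prefix}.${key}` : key, out);
    }
  } else {
    out[prefix] = value;
  }
  return out;
}

// Quote a field only when it contains a comma, quote, or line break,
// doubling any embedded quotes.
function escapeField(field) {
  const text = field === null || field === undefined ? "" : String(field);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Convert an array of (possibly nested) records to CSV text.
// Headers are the union of all flattened keys, in first-seen order.
function toCsv(records) {
  const rows = records.map((record) => flatten(record));
  const headers = [];
  const seen = new Set();
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!seen.has(key)) {
        seen.add(key);
        headers.push(key);
      }
    }
  }
  const lines = [headers.map(escapeField).join(",")];
  for (const row of rows) {
    lines.push(headers.map((h) => escapeField(row[h])).join(","));
  }
  return lines.join("\n");
}

// Assumed JS stand-in for jsonsplit.sh: break a large array of records
// into fixed-size chunks that can be converted one at a time.
function splitRecords(records, chunkSize) {
  const chunks = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    chunks.push(records.slice(i, i + chunkSize));
  }
  return chunks;
}

// Export for Node; in a browser the functions can be used directly.
if (typeof module !== "undefined") {
  module.exports = { flatten, toCsv, splitRecords };
}
```

With a core like this, the CLI would only need to read a file, call `toCsv`, and write the result, while the website could call the same functions and handle its own table rendering.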

@danamlewis What do you think of the above? I can take a first pass at a new GitHub repo with the CLI/JS API core.

@danamlewis
Author

Hey @konklone - thanks for the thoughtful review on this.

  • There are a zillion (roughly) json2csv packages and tools out there, so I would recommend sticking with a name that does emphasize the complex aspect since that is what differentiates it.
  • I am sure it is not efficient, etc. - I come from a non-traditional non-technical background, so I'm mostly just thrilled it works :) but also don't mind helping hands to clean it up, make it efficient, and do many of the things you suggested. I also don't have the technical chops to do what you've suggested, though, so happy to support whichever approach you think makes sense to move forward with - especially since it's your tool to begin with!

4 participants