Long processing times when handling very large files

Hello,

I've been here before and I'm back 😊
This gem has become a cornerstone of one of the projects I've developed. In most cases it performs very well, apart from some configuration options we need to customize for each scenario, but so it goes.

Now we are working with larger files, around 1.5 million rows. In some cases, processing seems to take hours. I've previously tested files of between 500,000 and 1,000,000 rows, and fully processing the diffs of those files with the gem took around 15 minutes or more. We can live with that even though it's not lovely, but anything longer than that is detrimental.

I am not sure whether this is an issue with how we provide `key_fields` or similar, so I am mainly writing this issue as a question: what experiences have people had comparing large files? Is this a gem constraint, our own `CSVDiff` configuration, or something else?

What timings have you recorded when working with files of one million-plus rows and up to 100 columns?
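
One way to separate the gem's cost from raw parsing and lookup cost might be to time a bare-bones keyed diff on the same files. The sketch below is only a baseline under stated assumptions: it uses Ruby's standard library alone, `index_by_key` and `keyed_diff` are hypothetical helpers that are not part of the gem, and it does not reproduce how `CSVDiff` works internally. The inline sample data stands in for the real files.

```ruby
require "csv"
require "benchmark"

# Hypothetical helper: index each row by a composite key built from key_fields.
def index_by_key(csv_text, key_fields)
  index = {}
  CSV.parse(csv_text, headers: true).each do |row|
    key = key_fields.map { |field| row[field] }
    index[key] = row.to_h
  end
  index
end

# Hypothetical helper: classify every key as an add, delete, or update.
def keyed_diff(old_text, new_text, key_fields)
  old_rows = index_by_key(old_text, key_fields)
  new_rows = index_by_key(new_text, key_fields)
  diff = { adds: [], deletes: [], updates: [] }

  new_rows.each do |key, row|
    if !old_rows.key?(key)
      diff[:adds] << key
    elsif old_rows[key] != row
      diff[:updates] << key
    end
  end
  old_rows.each_key { |key| diff[:deletes] << key unless new_rows.key?(key) }
  diff
end

# Tiny stand-in data; real runs would read the actual files instead.
old_csv = "id,name\n1,Ann\n2,Bob\n3,Cy\n"
new_csv = "id,name\n1,Ann\n2,Rob\n4,Di\n"

result = nil
elapsed = Benchmark.realtime { result = keyed_diff(old_csv, new_csv, ["id"]) }
puts result.inspect
puts format("diff took %.4f s", elapsed)
```

If a baseline like this finishes in minutes on the real files while the gem takes hours, the configuration (for example the choice of `key_fields`) would be a reasonable first place to look. For files of this size, streaming with `CSV.foreach` rather than parsing a whole string would likely be needed to keep memory in check.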

Long processing times when handling very large files #12
