I just attempted to open the referenced file in Excel, and it seems to do the job well of determining the quotes that are cell boundaries vs textual quotes.
Out of curiosity, how does this work? It seems quite complicated to make a Regex that reliably solves all such cases, and it's a bit of a chicken-and-egg to build a csv-parser that works on these types of strings if the quote: true option is enabled.
If I want to gracefully handle unescaped / unmatched quotes in the middle of a cell value, what options do I have? I really appreciate your advice!
For reference, right now I'm doing something like this:
import { Readable, Transform, TransformCallback } from 'stream';

function replaceEmbeddedQuotes(readStream: Readable): Readable {
  // Create a transform stream to process the data
  const transformer = new Transform({
    objectMode: true,
    transform(
      chunk: Buffer | string,
      encoding: BufferEncoding,
      callback: TransformCallback
    ) {
      // Convert chunk to string if it's a buffer
      const line = chunk instanceof Buffer ? chunk.toString() : chunk;
      // If a quote has whitespace on both sides, assume it is embedded and double it
      const processedLine = line.replaceAll(/(\s)"(\s)/g, '$1""$2');
      // Push the processed line to the output stream
      this.push(processedLine);
      callback();
    },
  });
  // Pipe the input stream through our transformer
  return readStream.pipe(transformer);
}
I then pass that readable to csv-parse, but this doesn't handle some cases, such as when the extra quote is not surrounded by spaces on each side. I assume there are also cases where a valid end quote could be surrounded by spaces on each side, like
"a " , "b","c"
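To make the two failure modes concrete, here is a small sketch that applies the same replacement to three sample lines. The helper name `doubleSpacedQuotes` and the sample inputs are made up for illustration; only the regular expression comes from the snippet above.

```typescript
// Same replacement as in the transform above, extracted into a standalone
// helper (the helper name is hypothetical).
function doubleSpacedQuotes(line: string): string {
  return line.replace(/(\s)"(\s)/g, '$1""$2');
}

// Embedded quote with a space on both sides: doubled, as intended.
console.log(doubleSpacedQuotes('"say " hi",b'));

// Embedded quote with no surrounding spaces: left untouched, so still broken.
console.log(doubleSpacedQuotes('"say"hi",b'));

// Valid closing quote that happens to sit between two spaces: wrongly doubled.
console.log(doubleSpacedQuotes('"a " , "b","c"'));
```

The last case is the worrying one: a line that was already valid CSV gets turned into one with an unterminated first field.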
I don't have much time to dig into this, but we are not using regular expressions. Instead, we parse bytes one by one and maintain a state of what we have seen so far. I don't have all the implementation details in memory, but whitespace is supported around quotes and delimiters; see the trim option. A quick search reveals there is a test for it, "with whitespaces around quotes", in test/options.ltrim.coffee.
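For readers wondering what "parse one unit at a time and keep a state" can look like, here is a minimal sketch of a lenient single-line parser. It is not csv-parse's actual implementation, nor a claim about how Excel decides; the function name `parseLineLenient`, the state names, and the closing-quote heuristic are all assumptions made for illustration. The heuristic: inside a quoted field, a quote only closes the field if nothing but whitespace separates it from the next delimiter or the end of the line; otherwise it is kept as literal text.

```typescript
// Hypothetical sketch: a character-by-character, state-based line parser.
// Names and heuristics are illustrative, not csv-parse internals.
type State = 'fieldStart' | 'unquoted' | 'quoted' | 'afterQuote';

function parseLineLenient(line: string, delimiter = ','): string[] {
  const fields: string[] = [];
  let state: State = 'fieldStart';
  let field = '';

  // True when only spaces/tabs separate `from` from a delimiter or end of line.
  const closesField = (from: number): boolean => {
    let j = from;
    while (j < line.length && (line[j] === ' ' || line[j] === '\t')) j++;
    return j === line.length || line[j] === delimiter;
  };

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    switch (state) {
      case 'fieldStart':
        if (ch === '"') {
          field = ''; // drop whitespace seen before the opening quote
          state = 'quoted';
        } else if (ch === delimiter) {
          fields.push(field);
          field = '';
        } else {
          field += ch;
          if (ch !== ' ' && ch !== '\t') state = 'unquoted';
        }
        break;
      case 'unquoted':
        if (ch === delimiter) {
          fields.push(field);
          field = '';
          state = 'fieldStart';
        } else {
          field += ch; // quotes inside an unquoted field stay literal
        }
        break;
      case 'quoted':
        if (ch !== '"') {
          field += ch;
        } else if (line[i + 1] === '"') {
          field += '"'; // doubled quote: standard escape
          i++;
        } else if (closesField(i + 1)) {
          state = 'afterQuote'; // real closing quote
        } else {
          field += ch; // stray quote: keep it as text
        }
        break;
      case 'afterQuote':
        if (ch === delimiter) {
          fields.push(field);
          field = '';
          state = 'fieldStart';
        }
        // whitespace between the closing quote and the delimiter is ignored
        break;
    }
  }
  fields.push(field);
  return fields;
}

console.log(parseLineLenient('"a " , "b","c"'));
console.log(parseLineLenient('"say "hi" now",b'));
```

The heuristic is still ambiguous when a stray quote happens to be followed directly by a delimiter, as in `"x", y",z`; no lookahead rule can tell that apart from a genuine field boundary, so input like that needs fixing at the source.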
(This issue references #421, which has been closed.)