Cannot handle aligned multi-space-delimited files #212

Closed
Tracked by #827
Jolanrensen opened this issue Oct 22, 2024 · 4 comments · Fixed by #219 or #220


Jolanrensen commented Oct 22, 2024

Description
Let's say we have a multi-space-delimited file like:

NAME                     STATUS   AGE      LABELS
argo-events              Active   2y77d    app.kubernetes.io/instance=argo-events,kubernetes.io/metadata.name=argo-events
argo-workflows           Active   2y77d    app.kubernetes.io/instance=argo-workflows,kubernetes.io/metadata.name=argo-workflows
argocd                   Active   5y18d    kubernetes.io/metadata.name=argocd
beta                     Active   4y235d   kubernetes.io/metadata.name=beta

This layout is common in logs etc., but I cannot seem to parse it correctly.
The delimiter can only be a single char, which I suppose should be ' ' in this case; the remaining spaces could then be trimmed with ignoreSurroundingSpaces = true.

Steps to reproduce

Parse the string above with delimiter ' ', ignoreSurroundingSpaces = true.

Expected results

I'd expect there to be a way to ignore repetition of the delimiter char.
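
To illustrate the difference in plain Python (this is only a sketch of the behaviour, not the library's API): splitting on every single space produces one empty cell per repeated space, while treating a run of spaces as one delimiter produces the four intended cells. The row is rebuilt with `ljust` so the padding matches the sample above.

```python
import re

# Rebuild the "argocd" row from the sample with the same column padding.
line = (
    "argocd".ljust(25)
    + "Active".ljust(9)
    + "5y18d".ljust(9)
    + "kubernetes.io/metadata.name=argocd"
)

# Every single space is a delimiter: repeated spaces become empty cells.
naive = line.split(" ")

# A run of one or more spaces is a single delimiter.
collapsed = re.split(r" +", line)

print(len(naive), naive.index("Active"))
print(collapsed)
```

The position of "Active" in the naive split lines up with where it lands in the actual results below.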

Actual results

After parsing, we get something like:

⌌---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------⌍
|  |           NAME| untitled| 1| 2| 3| 4| 5| 6| 7| 8| 9|     10| 11| 12|     13| 14| 15|    16|                                       17|     18|   19|                                   STATUS|    20|   21|    AGE|   22|                                 23|                               24|   25|   26| LABELS|
|--|---------------|---------|--|--|--|--|--|--|--|--|--|-------|---|---|-------|---|---|------|-----------------------------------------|-------|-----|-----------------------------------------|------|-----|-------|-----|-----------------------------------|---------------------------------|-----|-----|-------|
| 0|    argo-events|         |  |  |  |  |  |  |  |  |  |       |   |   | Active|   |   | 2y77d|                                         |       |     | app.kubernetes.io/instance=argo-event...|  null| null|   null| null|                               null|                             null| null| null|   null|
| 1| argo-workflows|         |  |  |  |  |  |  |  |  |  | Active|   |   |  2y77d|   |   |      | app.kubernetes.io/instance=argo-workf...|   null| null|                                     null|  null| null|   null| null|                               null|                             null| null| null|   null|
| 2|         argocd|         |  |  |  |  |  |  |  |  |  |       |   |   |       |   |   |      |                                         | Active|     |                                         | 5y18d|     |       |     | kubernetes.io/metadata.name=argocd|                             null| null| null|   null|
| 3|           beta|         |  |  |  |  |  |  |  |  |  |       |   |   |       |   |   |      |                                         |       |     |                                   Active|      |     | 4y235d|     |                                   | kubernetes.io/metadata.name=beta| null| null|   null|
⌎---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------⌏

Edit:

Also common: a single space can appear inside a cell, while a run of multiple spaces marks a delimiter, like:

NAME                     STATUS       AGE      LABELS
argo-events              Not Active   2y77d    app.kubernetes.io/instance=argo-events,kubernetes.io/metadata.name=argo-events
argo-workflows           Active       2y77d    app.kubernetes.io/instance=argo-workflows,kubernetes.io/metadata.name=argo-workflows
argocd                   Active       5y18d    kubernetes.io/metadata.name=argocd
beta                     Not Active   4y235d   kubernetes.io/metadata.name=beta
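
A minimal Python sketch of that rule, assuming "two or more spaces" is what separates columns (the threshold is my assumption, and this is not a feature of the library):

```python
import re

# Rebuild the "beta" row from the second sample with the same padding.
row = (
    "beta".ljust(25)
    + "Not Active".ljust(13)
    + "4y235d".ljust(9)
    + "kubernetes.io/metadata.name=beta"
)

# Two or more consecutive spaces act as the delimiter,
# so the single space inside "Not Active" is preserved.
cells = re.split(r" {2,}", row)
print(cells)
```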
Jolanrensen added the bug label Oct 22, 2024
Jolanrensen changed the title from "Cannot handle aligned space-delimited files" to "Cannot handle aligned multi-space-delimited files" Oct 22, 2024
@devinrsmith
Member

@Jolanrensen thanks for the issue, we'll look into it and report back here.

@kosak
Contributor

kosak commented Oct 26, 2024

Hi, thanks for the bug report. I'd like to suggest supporting this in a different way.

It feels more natural to me for the library to support fixed-width columns, where the column widths are either specified explicitly by the caller, or inferred from the first row of the input. In this proposal we would also allow the library to trim the spaces inside the fixed-width cells, perhaps reusing the flag ignoreSurroundingSpaces.

For example the library could read

NAME                     STATUS       AGE      LABELS

and infer starting column positions of 1, 26, 39, 48 (in a 1-based convention, and assuming I've counted characters correctly). It would assume that the rest of the file had data at these positions.
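
As a rough Python sketch of that inference (the helper names `infer_starts` and `slice_row` are made up for illustration and are not part of the library): each run of non-space characters in the header marks the start of a column, and every later row is cut at those positions and trimmed.

```python
import re

# Header from the second sample, rebuilt with explicit padding.
header = "NAME".ljust(25) + "STATUS".ljust(13) + "AGE".ljust(9) + "LABELS"


def infer_starts(header_line):
    """Return the 1-based start position of each run of non-space characters."""
    return [m.start() + 1 for m in re.finditer(r"\S+", header_line)]


def slice_row(row, starts):
    """Cut a row at the given 1-based start positions and trim each cell."""
    bounds = [s - 1 for s in starts] + [None]  # None means "to end of line"
    return [row[a:b].strip() for a, b in zip(bounds, bounds[1:])]


starts = infer_starts(header)
row = (
    "beta".ljust(25)
    + "Not Active".ljust(13)
    + "4y235d".ljust(9)
    + "kubernetes.io/metadata.name=beta"
)
print(starts)
print(slice_row(row, starts))
```

Note that this simple inference would split a header name that itself contains a space, which is one reason to also let the caller pass the widths explicitly.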

Would this work for you? I have some reluctance to support variable-length delimiters, not least because of the edge cases it introduces when there are empty cells.
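
To make the empty-cell concern concrete, here is a small Python sketch using a made-up row whose STATUS cell is blank (the row is hypothetical, not from the report). Splitting on runs of spaces silently drops the empty cell and shifts the later values left, while slicing at fixed positions keeps every value in its column.

```python
import re

# 0-based column starts corresponding to the 1-based positions 1, 26, 39, 48.
starts = [0, 25, 38, 47]

# Hypothetical row with an empty STATUS cell.
row = (
    "argocd".ljust(25)
    + "".ljust(13)
    + "5y18d".ljust(9)
    + "kubernetes.io/metadata.name=argocd"
)

# Variable-length delimiter: the blank cell merges into one long run of spaces.
by_delimiter = re.split(r" {2,}", row)

# Fixed-width columns: the blank cell survives as an empty string.
bounds = starts + [None]
by_position = [row[a:b].strip() for a, b in zip(bounds, bounds[1:])]

print(by_delimiter)
print(by_position)
```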

@Jolanrensen
Author

@kosak Yes! I think that would work great. I think in all cases, the maximum cell width is defined by the width of the column title plus its trailing spaces, minus one space that acts as the delimiter (aside from the final column, of course, which has no upper bound). So this would solve the problem correctly.
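
Spelled out in Python with the start positions from the comment above (a sketch of the arithmetic only):

```python
# 1-based start positions inferred from the header row.
starts = [1, 26, 39, 48]

# Width of each column except the last: distance to the next column start.
widths = [b - a for a, b in zip(starts, starts[1:])]

# One space is reserved as the delimiter, so a cell can be at most width - 1.
max_cell_widths = [w - 1 for w in widths]

print(widths)
print(max_cell_widths)
```

The longest cells in the sample ("argo-workflows", "Not Active", "4y235d") all fit within these limits.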

@devinrsmith
Member

This will be fixed by #220
