-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tighten trimming for ExpandUntil and ExpandBetween #53
base: master
Are you sure you want to change the base?
Conversation
Before I dive into this, would you mind giving a simple example (with code) that shows the behavior before and after? |
Behaviour beforeGithub messes up the spacing a little bit, but note that there are three spaces to the right of Bob and Sue, and two spaces to the right of the Sun Yat-Sen's names.
Behaviour afterNote that there are now only two spaces to the right of Bob and Sue, and one space to the right of Sun Yat-Sen.
ExplanationSince Sun Yat-Sen's names each consist of three wide characters they are of width 6, which is larger than our requested maximum of 5. The therefore asks code would request that each string be trimmed down to width 5. However, none of Sun Yat-Sen's names can be trimmed to 5 characters: the closest you can get is 4. Previously, it would trim them to 4 and then add 1 space of padding to make it up to 5 characters. But since none of the trimmed items are really of length 5, we can get better spacing by not adding the extra padding. This is what the code does now: it recognises that we really are only a column of width 4, and renders it that way.
|
a6df00b
to
21c925f
Compare
Here's another example with cut marks Before
After
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I am not opposed to these changes. The problem I am having is with the logic for measuring that is intertwined with the logic for building. And to me that indicates that the problem is not fully understood or not properly represented. If it cannot be represented that way, I would like to know the reasoning.
where | ||
l = visibleLength a | ||
|
||
paddingNeeded i = totalAdjustment' . buildCellViewTight . case posS of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I don't like about this is that it uses more than just functions for measuring from Cell
. This looks like a code smell to me and makes me wonder whether there is not a more direct approach. For example, could Cell
have a function that does this:
visibleLengthAfterDrop
:: Cell a
=> a
-> Int -- ^ The amount to drop left.
-> Int -- ^ The amount to drop right.
-> Int -- ^ The actual visible width of the remaining characters.
visibleLengthAfterDrop c l r = visibleLength c - (l + r) -- default implementation
Instances that support multi-width characters can then return a lesser amount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is a code smell, but I think the solution goes in the other direction. At this point we know exactly what we need to trim to, and past this point the only possibility is that we need to add more padding. We should be keeping the trimmed result around along with its length, and then padding based on the other elements in the column afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be keeping the trimmed result around along with its length, and then padding based on the other elements in the column afterwards.
This function has the purpose of getting an abstract description of what to do with columns. One example why this is required is the headers. Thus, I want to keep this part unless there is a good reason to change it, and I don't see that at the moment.
The only drawback of having something like visibleLengthAfterDrop
that I can see is the duplicate wide character scanning. Could you explain why this would not work? What am I missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not that it won't work: it will. The reason I'm reluctant to switch is that calculating the visible length after drop and building the result of dropping are almost exactly the same work. Implementing the two things separately while avoiding code duplication would result in writing a function which returns both as a tuple, and then implementing the two typeclass functions as projecting that tuple to one or the other. This forces you to calculate things twice. By returning both at the same time, you at least have the option of reusing the work.
ExpandBetween i _ | l < i -> i | ||
_ -> l | ||
where | ||
l = visibleLength a | ||
trimAdjustment n | ||
| l > n = (totalAdjustment :: CellView String -> Int) . trim' posS cutM n l |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue here.
That's completely reasonable, and I'm not sure I've found the optimal approach. I'm interested in your feedback and thoughts. The reason I've gone with this approach is that calculating the padding needed is almost all of the work of actually building. To figure out whether any padding is needed, you need to try attempt to trim the requested amount, and then check to see whether you had to trim more than requested. At this point, you build by adding the extra padding. So the keeping track of the built cell without padding, as well as how much padding is needed, makes either simple:
For all the examples we have so far, checking whether padding is needed involves dropping one character (or unit, for |
* Only one build function is now supplied to buildCellViewHelper * Trimming functions supplied to buildCellViewHelper must now also build * Introduce buildCellViewTightHelper, where trim functions can return in an Applicative context. This is useful for defining buildCellViewTight.
Add buildCellViewTight to Cell typeclass, which will track whether any padding is needed after trimming. The default implementation has no padding necessary, which is sufficient for most cases. In some cases, for example with WideString, tighter spacing can be achieved using buildCellViewTight.
When trimming wide strings, it sometimes happens that every item in a column needs extra padding to achieve the requested width. In this case, we can add less padding.
* Ensure cut marks are always placed before padding when trimming * Ensure cut marks are included in the requested width * The LPA versions of ColumnModifier functions are no longer necessary, since code uses all parts of ColSpec.
Perhaps a word on how I view this conceptually. The
I have explicitly left out padding, because padding is not something that For most instances of That said, I haven't been as principled about this philosophy as I could be: |
The thing is, you do the scanning for multi-width characters twice anyway. I think we should treat efficiency as a different issue for now. If it is necessary later there is other stuff you can do (mostly caching in To me it seems the only thing that changed with the introduction of multi-width characters is that if we drop a character, several width units are potentially dropped at once, and not just one.
The way I see it, neither
Now, I still do not see why we cannot make our existing code aware of the new dropping behavior. If it is not possible at all, the cause has to be identified and we need to figure out a new architecture. |
Okay, that is a reasonable decision. I'lll go along with it.
Yes, that is my take on it too.
By ‘our existing code’, you mean can we do this without introducing any new functionality into the typeclass? I'm sceptical that it's possible. The existing code assumes that if you ask it to drop a certain width, it will be able to drop exactly that width. Namely, it satisfies the following invariant: This is why it's able to do everything with a single call to There are two ways I can see doing this: my (Note: I'm being a bit sloppy here with length, but I think my point is clear. Let me know if it isn't.) It sounds like your call is that we go with some variation of
My vote is for the second, and that it completely ignore requested padding. |
I'm like to change my opinion to be that it should not ignore requested padding. This will enable the default implementation to be the straightforward |
I do not care too much about the name. Small details are easily changed. I'm more interested in how the implementation will change and if there is something coming about that we did not plan. |
I've attempted to redo this PR using the The advantage of As I see it, we need to introduce functions to the
The second can be naturally implemented in terms of the first. The first can be implemented in terms of the second, but it's a bit clunky. I've done a lazy job here.
I think we should stick with the implementation with |
Let me try to summarise this second padding problem in a different way. Let me know if you agree. In the current paradigm, there are three things that you can do with an instance of
These are kept separate conceptually. Note that there is no concept of padding inherent in The current architecture is able to maintain this separation because it assumes that you can drop any positive integral width from either side of the Wide characters complicate this because there exist lengths you can't drop. For example, you cannot drop length 1 from a string consisting of a single width-2 character. In order to maintain the current architecture, we need to fake it by dropping more width than necessary, and then ‘topping up’ by adding extra padding to get to the desired width. By adding this extra padding, we can maintain the above invariants. But this complicates the architecture, because it requires some sort of padding to be inherent in the definition of However, this causes a problem. It means padding happens in a new place in the code, and this doesn't interface well with existing architecture. For example, when we call a function like There are a few ways I can see out of this problem:
I continue to think that option 1 is the best approach. |
Thank you for the summary. I feel that I do not fully understand everything and first need to dive back into the code. I have some vacation soon, so I hope to get this review done at around Christmas. |
I found this issue to be complex so I am trying to structure my response into sections. General
I think neither 2. or 3. can handle the actual complexity. In general, I agree with your assessment, that adding padding or trimming in IssueIn my opinion, the crux of this issue lies in ApproachI try to describe how I imagine it. It is very likely to overlap with what you already implemented or wrote. Thus, I don't want to claim this is something completely new or fully my idea.
Details
How to move forward
If we agree on this proposal, then we can talk about who does the work or where we could collaborate. Technically, I wouldn't mind working on it and I still have a few days of vacation. |
I will probably start with minor refactorings:
|
This is a very detailed analysis and proposal. At first read-over it seems reasonable to me, but I'll need to think a bit more deeply (and probably try to implement bits of it) to know for certain. In the meantime, I'll factor out the unrelated commits and submit a separate PR. |
How should we proceed? Are you intending to implement this, or should we divide up the work in a reasonable way? |
If you are willing to help, what would probably make the most sense for you is adding the new What I did so far is introduce a type that tracks the length instead of building it instantly and adapt the code. This may or may not be the best representation but we can change that easily later on.
The following parts are still missing.
|
I'm looking into this now, but my brain is having a bit of trouble wrapping itself around this. Could you give a quick example of how you'd implement and use this?
|
I hope the following makes sense.
|
Fixes #35