Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the types of dataclass attributes according to usage #163

Merged
merged 2 commits into from
Nov 1, 2020

Conversation

kvid
Copy link
Collaborator

@kvid kvid commented Sep 1, 2020

Fixes #156

Copy link
Collaborator Author

@kvid kvid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • A couple of my comments below are connected to the wrong code line because Github didn't allow commenting code lines that are not yet changed in this PR. Use https://github.com/formatc1702/WireViz/pull/163/files to expand the hidden code lines.

  • Are we using type annotations only for dataclass attributes, or should we use it for the rest of the code as well?

src/wireviz/DataClasses.py Outdated Show resolved Hide resolved
src/wireviz/DataClasses.py Outdated Show resolved Hide resolved
src/wireviz/DataClasses.py Show resolved Hide resolved
src/wireviz/DataClasses.py Outdated Show resolved Hide resolved
src/wireviz/DataClasses.py Outdated Show resolved Hide resolved
Name = str # Case insensitive unique name of connector or cable
Pin = Union[int, str] # Pin identifier
Wire = Union[int, str] # Wire number or 's' for shield
MLstr = str # Multi-line string where any newline is properly handled
Copy link
Collaborator Author

@kvid kvid Sep 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If PR #164 is accepted, then we might need 3 different text types:

  • Text = Text that might contain HTML hyperlinks that are removed in all outputs except in HTML output.
  • TextML = Text that might contain HTML hyperlinks that are removed in all outputs except in HTML output, and newlines that are translated to <br/> in diagram output or to space otherwise.
  • str = Plain text that doesn't contain any HTML tags nor newline.

@formatc1702
Copy link
Collaborator

formatc1702 commented Oct 10, 2020

This is a very nice contribution.
I suggest changing Name to Designator since that is what has been used in the documentation and the BOM as well, and is a bit less generic.
While my initial reaction was skepticism at having too many custom datatypes, I can see the value in explicitly defining them so there are no surprises when HTML links are stripped, etc... because yes, #164 is a great new feature.
Now, my worry is just that Text and TextML are not self-explanatory enough (I know you added comments in their definitions)... Maybe something like str_html and str_html_multiline? I'm open for better suggestions

@formatc1702 formatc1702 added this to the v0.3 milestone Oct 17, 2020
@formatc1702
Copy link
Collaborator

According to your suggestion, this PR should be the next one to be merged.
If you agree with my suggested naming changes, feel free to implement them and rebase onto dev :)
If you feel there is a need for additional discussion, fire away!

@kvid
Copy link
Collaborator Author

kvid commented Oct 27, 2020

I'm sorry for this late reply.

This is a very nice contribution.

Thank's.

I suggest changing Name to Designator since that is what has been used in the documentation and the BOM as well, and is a bit less generic.

I agree.

While my initial reaction was skepticism at having too many custom datatypes, I can see the value in explicitly defining them so there are no surprises when HTML links are stripped, etc... because yes, #164 is a great new feature.
Now, my worry is just that Text and TextML are not self-explanatory enough (I know you added comments in their definitions)... Maybe something like str_html and str_html_multiline? I'm open for better suggestions

PEP 484 recommends capitalizing alias names, since they represent user-defined types, which (like user-defined classes) are typically spelled that way.

I therefore suggest these alternatives:

  • PlainText = str # Text not containing HTML tags nor newlines
  • LinkText = str # Text possibly including HTML hyperlinks that are removed in all outputs except HTML output
  • MultilineLinkText = str # LinkText possibly also including newlines to break lines in diagram output

@formatc1702
Copy link
Collaborator

formatc1702 commented Oct 29, 2020

I'm sorry I keep nagging about the exact name of the user-defined types.

IMHO, LinkText creates the expectation that this is a text that always/usually is or includes a link.
However, as I mentioned in my comment in #168:

TBH, having a separate url field will probably make the need for including URLs in the type attribute obsolete in most cases.

That's why I am uncomfortable pushing what I expect to be an edge case use for an attribute, into the very definition of that attribute.
My only counter-suggestion, however, might be RichText/RichTextMultiline or String_HTML, which on the other hand creates an expectation that it supports much more than just links.

Another option is getting rid of the custom types, marking everything as str and adding a little comment in the dataclass mentioning the fact that a certain attribute has the custom Link/Linebreak behavior. Personally, I am leaning towards this option again after all...

Since I don't want this problem delaying integration much further, I will merge it as-is tomorrow unless we find a nicer alternative (or unless you want to change something else before, let me know!). We can always rename the type later if something better pops up.

@kvid
Copy link
Collaborator Author

kvid commented Oct 29, 2020

I'm sorry I keep nagging about the exact name of the user-defined types.

Extra effort to find good identifier names should be an investment that hopefully pays off in the future by reducing the number of misunderstandings by users.

IMHO, LinkText creates the expectation that this is a text that always/usually is or includes a link.

I see your point. Is TextThatMightContainLinks or TextWithOptionalLinks any better?

However, as I mentioned in my comment in #168:

TBH, having a separate url field will probably make the need for including URLs in the type attribute obsolete in most cases.

I don't fully agree. Supporting link tags is way more flexible in the sense that the user can visualize what the link represents by selecting only a section of the text attribute as the clickable link text, and if needed, link to different URLs from different text sections.

That's why I am uncomfortable pushing what I expect to be an edge case use for an attribute, into the very definition of that attribute.

The whole point of type hints is to document what kind of values that are valid, so the type alias names should describe the differences somehow.

My only counter-suggestion, however, might be RichText/RichTextMultiline or String_HTML, which on the other hand creates an expectation that it supports much more than just links.

A better alternative is perhaps Hypertext that is defined by Wikipedia as text displayed on a computer display or other electronic devices with references (hyperlinks) to other text that the reader can immediately access.

If you like RichText better, it can be explained as text that might contain tags for the HTML output, but currently only supporting link tags, and add support for more tags later on. However, I believe the Rich Text term is more used to describe other formats and markup languages that HTML, e.g. RTF.

Another option is getting rid of the custom types, marking everything as str and adding a little comment in the dataclass mentioning the fact that a certain attribute has the custom Link/Linebreak behavior. Personally, I am leaning towards this option again after all...

If we are going to use type hints in more parts of the source code than DataClasses.py, then the different type aliases will get more useful to avoid the same comments at several locations.

Since I don't want this problem delaying integration much further, I will merge it as-is tomorrow unless we find a nicer alternative (or unless you want to change something else before, let me know!). We can always rename the type later if something better pops up.

True, but I'm willing to think a bit more before we decide.

@formatc1702
Copy link
Collaborator

formatc1702 commented Oct 29, 2020

Hypertext! That's a good one!

It almost sounds a bit retro (in a good way), but per the definition you posted, it's pretty close to what we are trying to communicate through the type hint, without over-promising (like RichText).

I wouldn't feel bad including Hypertext and HypertextMultiline, thanks for the cool suggestion!

@kvid
Copy link
Collaborator Author

kvid commented Oct 29, 2020

Hypertext! That's a good one!

Then we can agree on that one. 😃

It almost sounds a bit retro (in a good way), but per the definition you posted, it's pretty close to what we are trying to communicate through the type hint, without over-promising (like RichText).

The term is from 1963 and has been used in different projects and predecessors of HTML/HTTP. The term is also the leading and major part of both HTML and HTTP.

I wouldn't feel bad including Hypertext and HypertextMultiline, thanks for the cool suggestion!

Personally, I feel MultilineHypertext is easier to read as it has the natural order of these words in an English sentence. Do you have strong feelings or arguments for the opposite order?

@formatc1702
Copy link
Collaborator

It almost sounds a bit retro (in a good way), but per the definition you posted, it's pretty close to what we are trying to communicate through the type hint, without over-promising (like RichText).

The term is from 1963 and has been used in different projects and predecessors of HTML/HTTP. The term is also the leading and major part of both HTML and HTTP.

Of course it's still relevant; I just haven't heard anybody use the actual term "hypertext" in conversation or in writing for a veeery long time ;-)

Personally, I feel MultilineHypertext is easier to read as it has the natural order of these words in an English sentence. Do you have strong feelings or arguments for the opposite order?

My main argument would be that, in certain situations, it makes sense to assign names starting from the general, then going into the specific. In this particular case, it's not really that important, so MultilineHypertext is fine too!

@formatc1702
Copy link
Collaborator

Is this ready for merging from your perspective? I saw you force-pushed the latest changes, but I'm not sure.

I would like to try something out: Once you are done with your changes, please use the link in the top right of this PR to request a review from me. That lets me know you are finished with your changes, and if I approve the review, I can merge without having to ask again. Thanks!

Using Any or str in type annotations might increase the need for extra
comments to explain the real valid values. However, such needs can be
drastically reduced with the help of semanticly named type aliases.

Each type alias have their legal values described in comments.
Actual validation might be implemented in the future.
@kvid
Copy link
Collaborator Author

kvid commented Nov 1, 2020

Is this ready for merging from your perspective? I saw you force-pushed the latest changes, but I'm not sure.

Yes, now I think it should be ready. I squashed most commits into one, but kept the first commit separate to avoid hiding this separate issue in all the type alias changes afterwords.

I would like to try something out: Once you are done with your changes, please use the link in the top right of this PR to request a review from me. That lets me know you are finished with your changes, and if I approve the review, I can merge without having to ask again. Thanks!

I would like to test this feature, but I can't find the link you describe. I found this old blog posting about it, but I cannot find the cog wheel icon described there. This how my right column looks like:
image

Comment on lines +72 to +74
manufacturer: Optional[MultilineHypertext] = None
mpn: Optional[MultilineHypertext] = None
pn: Optional[Hypertext] = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the answer to my question is buried in the discussion on #115, but:

Why are manufacturer and mpn multiline-capable, and pn is not?

Personally, I'm not sure why any of them would need to be multiline, but at least it should be consistend, since all of them are passed through the html_line_breaks() function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the answer to my question is buried in the discussion on #115, but:

This feature is much older that #115.

Why are manufacturer and mpn multiline-capable, and pn is not?

Blame tells me the change was committed by you in 102c7d6 as suggested by me in #136 (comment): In case a user need a long manufacturer info, I suggest supporting line breaks to avoid a very wide node. And since both manufacturer and mpn was placed in the same table cell, I suggested calling html_line_breaks on the whole cell contents.

Personally, I'm not sure why any of them would need to be multiline, but at least it should be consistend, since all of them are passed through the html_line_breaks() function.

Currently, (since the commit mentioned above that is included in v0.2) manufacturer and mpn are passed through the html_line_breaks() function, but not pn, and that's why I used different type hints to reflect how it is currently implemented.

I understand your argument about consistency, but then we also need to change the implementation. I have an idea about moving html_line_breaks() in #168 that will solve your consistency request, but it's still WIP.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging through history for me :)

OK, then let's keep it as-is to reflect the current implementation, and I'll wait for #168.

@formatc1702
Copy link
Collaborator

formatc1702 commented Nov 1, 2020

According to the GitHub documentation:

Pull request authors can't request reviews unless they are either a repository owner or collaborator with write access to the repository.

This doesn't make sense to me, but I guess it's how things work. So for now, the only option I see is for the PR author to post an explicit comment stating that the code is ready for [re-]review :/

The alternative would be for me (the owner+reviewer) to set the PR back to draft status if I request changes, and for you (the PR author) to remove draft status again to signal readiness for review; but that doesn't seem like the right way to use the draft feature.

@formatc1702 formatc1702 merged commit 64bd34a into wireviz:dev Nov 1, 2020
formatc1702 added a commit that referenced this pull request Nov 1, 2020
@kvid kvid deleted the issue156-types branch March 7, 2021 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants