Translate Erlang/OTP XML into EEP 48 docs chunk #3
Comments
I'd like to propose changing: Attributes :: [{binary(), binary()}] to: Attributes :: #{binary() => binary()} I think this would be more friendly to tooling, as we could easily pattern match on a subset of attributes. I believe the order of the attributes shouldn't matter and duplicates shouldn't be allowed (e.g. xmerl does not allow duplicates). |
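To make the trade-off concrete, here is a minimal sketch of looking up one attribute in each representation. The module name, function names, and the `href` attribute are hypothetical and chosen only for illustration; nothing here is taken from the actual chunk format.

```erlang
-module(attr_match).
-export([href_from_list/1, href_from_map/1]).

%% Hypothetical helpers; <<"href">> is only an example attribute name.

%% With a list of pairs we have to search for the key explicitly.
href_from_list(Attributes) when is_list(Attributes) ->
    case lists:keyfind(<<"href">>, 1, Attributes) of
        {<<"href">>, Href} -> {ok, Href};
        false -> error
    end.

%% With a map we can match on just the subset of keys we care about,
%% directly in the function head.
href_from_map(#{<<"href">> := Href}) -> {ok, Href};
href_from_map(#{}) -> error.
```

The map version ignores any other attributes that are present, which is the "pattern match on a subset" convenience; the list version preserves attribute order, which the map cannot.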
@wojtekmach yes I agree that a map would be more efficient for tooling here and I don't think it will increase the size of the doc chunk. |
I already have a PoC for this. I'm on vacation this week, but next week
I can put together a status report and next steps.
|
To be more specific: I have a project that reads the OTP XML docs and converts
them to a tree data structure, which is then translated to HTML, Markdown and
reStructuredText.
|
I have been thinking about the format of the doc chunk for Erlang for some time, and in the process I have also written a program that translates the OTP XML into the doc chunk.
@josevalim wrote
But during my work and the comment from @wojtekmach I think I would like to do a minor change to this:
I don't really see a problem with having the Tag and the key for attributes as an atom() vs. a binary(), but this is a detail. The more important part is what tags there are. I have worked with the idea of using the well-known HTML names when possible. My idea is that we extend the OTP build to produce the doc chunks in addition to the current HTML, man and PDF, so that anyone can build OTP with doc chunks (optional). That version should also contain API functions for fetching the doc chunks and presenting them in the shell. I hope we can have this already in OTP 23 (May), at least as an experimental feature, as all the formats and APIs might not be 100% settled. |
@KennethL That's great news! Are pull requests to EDoc based on the work done so far welcome? |
Agreed. I will send a PR to update EEP 48. I propose, however, that the format must then end with
Thanks for the updates! Our current plan is to change ExDoc to work on HTML ASTs. So for Elixir, ExDoc will convert the Markdown to an HTML tree and then traverse it. For Erlang, we will work directly with the tree stored in the chunk. It may be that the tools (Markdown processor and Erlang) do not agree on the format of the tree, so we will need some pre-processing. Furthermore, we want the simplest possible tree while processing. Using a map instead of a list for attributes also means we lose ordering. Once again, this is probably not a problem for Erlang, but a Markdown processor may want to keep the user ordering, and therefore they are forced to use a list. In other words, I think it is fine to go with your proposal. Realistically speaking, it is most likely that ExDoc will have to normalize those trees. Even if they have the same AST type, I wouldn't be surprised if certain representations are sometimes different.
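As a rough illustration of what "traversing the tree" could look like, here is a minimal sketch that collects the plain text of an HTML-like tree. The node shape `{Tag, Attributes, Children}` with binaries as text leaves is an assumption made for this example (with atom tags and list attributes, as discussed in this thread), and the module and function names are hypothetical.

```erlang
-module(tree_text).
-export([text/1]).

%% Assumed node shape for this sketch: a text leaf is a binary, and an
%% element is {Tag, Attributes, Children}.
-type html_node() :: binary() | {atom(), [{atom(), binary()}], [html_node()]}.

%% Concatenate all text leaves, ignoring tags and attributes.
-spec text(html_node() | [html_node()]) -> binary().
text(Nodes) when is_list(Nodes) ->
    iolist_to_binary([text(N) || N <- Nodes]);
text(Text) when is_binary(Text) ->
    Text;
text({_Tag, _Attrs, Children}) ->
    text(Children).
```

A real renderer would dispatch on the tag instead of discarding it, but the recursion has the same shape, which is why a small and regular tree keeps renderers simple.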
Should we make the building of docs chunk enabled by default? I think this would be the simplest to have the docs accessible to everyone. |
Yes, pull requests for EDoc are welcome, but try to make them compatible so that current usage is not broken. I assume even the not-so-nice layout it produces today will have to be kept for a while. But for the output there is support for plugins, which can produce whatever format. These plugins don't even need to be part of EDoc, but in this case I think we want it to be part of EDoc. |
Thinking again about if format (in EEP-48) should be a binary or a structured term, a binary with the term in external format might have its advantages when using online help where the doc for only one function at a time is requested. It would then be less computation to fetch the whole chunk and then convert only the doc for one function with
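A minimal sketch of that idea, assuming "external format" refers to the standard `term_to_binary/1` and `binary_to_term/1` BIFs: each entry's doc is stored as its own binary and only the requested one is decoded. The module name, function names, and the `{Key, Doc}` pair layout are hypothetical simplifications, not the EEP-48 entry layout.

```erlang
-module(lazy_docs).
-export([encode/1, lookup/2]).

%% Encode each entry's doc separately so it can be decoded on demand.
%% Docs is assumed to be a list of {Key, DocTerm} pairs.
encode(Docs) ->
    [{Key, term_to_binary(Doc)} || {Key, Doc} <- Docs].

%% Decode only the doc of the requested entry; all other entries stay
%% as opaque binaries.
lookup(Key, Encoded) ->
    case lists:keyfind(Key, 1, Encoded) of
        {Key, Bin} -> {ok, binary_to_term(Bin)};
        false -> error
    end.
```

Whether this actually saves work compared to decoding one structured term for the whole module would need measuring.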
Having a map for the attributes instead of a list is not super important for me. If the order sometimes is important we can keep it as a list.
Yes, I think building docs chunks could be the default, but we have to decide if the chunk goes into the .beam file or if it goes into a separate file per module. If it goes into the .beam file, I think we have to integrate it more with the `erlc` command and the compiler, or of course it could be an extra separate pass where docs chunks are added to the .beam files. I am not sure what I think yet; we have to discuss it more in the OTP team (will probably happen this week). Do you have any input on that? |
I was the one proposing the attributes map but I agree that sticking with a list might be better after all.
My understanding is for modules that are part of erts there's no beam file so in order to eventually do: Perhaps OTP stores erts docs in
I think this is the most appealing option: when working in rebar3/mix/etc projects when using dependencies they would be compiled with docs with no extra work (maybe just a compiler flag to enable chunks generation) and thus the documentation would be immediately available. Otherwise, if it's part of edoc, then we'd need to run edoc for all deps (which could of course be automated somehow). |
I think both options are ok. Having it as a binary is definitely more consistent and it doesn't require changing EEP-48. If it has actual performance benefits in some situations, then even better!
My initial question is which docs would be integrated into |
I was also going to suggest that a separate pass might make sense when the feature is still experimental, given the pass is different for OTP and non-OTP modules. For building project releases that include OTP libs some care has to be taken not to unintentionally skip the separate |
Hello, For the last week or so I've been working on making this a reality. You can track my progress here: https://github.com/garazdawi/otp/tree/lukas/kernel/code-chunk-lookup So far I've implemented help for all functions and types in all of Erlang/OTP. Some notes:
Here is an example of what is in the chunk:
And here it is when rendered:
Remaining issues:
|
@garazdawi fantastic!
I agree. We may need to bump the version of the format.
You probably need to decide if you are going to emit HTML or a HTML-like tree. Have you had any thoughts about this? |
I think it will have to be HTML-like. Some of the things I do now could be translated. Not sure how easy it would be to do with the What I'm trying to avoid is for different renderers to have to do very complex parsing loops in order to figure out what to print. |
@garazdawi is that the tag that processes the |
No. It is the tag that either described what a type the function uses looks like (e.g. https://github.com/erlang/otp/blob/master/lib/asn1/doc/src/asn1ct.xml#L78-L89) or the name of the type if it is part of the code (e.g. https://github.com/erlang/otp/blob/master/lib/stdlib/doc/src/calendar.xml#L135-L138). The spec metadata is rendered instead of the signature when present. I decided to not put the rendered version of the spec in the signature as the spec format is more flexible and easier to create links etc from. |
It always appears at the top at the moment. Or rather just after the |
And to clarify, the
Sounds like a good call to me. |
Yes. |
Just an idea looking at the problem from a different angle - since functions in Erlang are actually uniquely identified by their |
There are 58 places (unless my grep skills fail me) in the docs where this feature is used. So it could be done. However, in my opinion, the (HTML) docs become easier to navigate by using the method, and I don't really see an alternative notation that would work as well. I'll leave that part to rest for a while and focus on other parts of the doc chunks first and then come back to it. Maybe we'll have an epiphany somewhere along the way. |
Pushed an update where types are like this:
for types that are derived from the metadata. And like this:
for inline type descriptions. Also I changed
That leaves |
I plan to open a PR for my changes tomorrow. There are still some issues that are unresolved, but they may have to be fixed later as |
Fantastic! If there is anything I can do or if you want to jump on a call to discuss the unresolved issues, please let me know! |
The main unresolved issue is still what to do with the multiple function definitions. I'm starting to lean towards actually just re-writing the docs to not use that feature any more... One thing that is left is to use a second renderer from the chunk sources. Right now I've only tested with my own |
My current thinking is that EEP 48 will become more robust if we change it to support multiple entries. For Erlang and Elixir it boils down to an implementation detail but I can imagine a language in the future that may want to document each of them individually, especially statically typed languages. In the worst case scenario, if we don't want to use the feature, it just ends up being a one item list. :) But if you think it is best to save this fight for later when the need arises, then that's ok by me too! Once the PR is available, we will start looking into giving it a try on ExDoc and give you some feedback. |
As discussed earlier we should also have a plugin for edoc to produce the
same format as we now are producing from the OTP xml. Is there anyone
working on that part?
I would like to have it as a PR to include in OTP and as a pure extension,
leaving the old functionality of edoc compatible (as much as possible).
Kenneth
|
No. Even if it is named "type-" it is not actually a documented type, it is just a link to a |
hmm, on second thought... maybe it should be... need to dig more. |
So, type links are apparently a mess... I will try to make something better... |
Probably erlang/otp#2545 (comment) Your links look good now! |
@wojtekmach in my branch I've now made it so that has to point to a documented type that is accessible through a doc chunk. I've also written a validator that makes sure that points to documented functions and points to documented types. Next is to extend it to also check that <see*> points to actual markers in the docs and not to generated header markers. Maybe I'll have to introduce a <seetitle...>. Now I have to focus on the OTP-23 rc2 though as it is about to be released. |
@garazdawi Great! If you'd like me to test your changes, which branch is it? But happy to wait for RCs to test them out, so no pressure. |
There is a link to the latest changes in the top post of this issue. |
Got it, thanks.
|
erlang/otp#2573 opened a pr with the link changes. |
So, what I did for this is that I introduced a metadata tag for each doc entry that is called
|
Hi @garazdawi! I have just given RC2 a try and all of my previous comments have been addressed. Thank you for that! I have just two tiny remarks left:
Have a good weekend! |
Edit: just remembered. I did it because I didn't want to deal with the beam file and the doc chunk becoming out of sync. |
Opened a new PR with more changes: erlang/otp#2583 |
Changes merged to master. |
I noticed a problem with some types in the stdlib app, they have
erl> {ok, {docs_v1, _, _, _, _, _, Docs}} = code:get_doc(array),
[{KindNameArity, Doc} || {KindNameArity, _, _, Doc, _} <- Docs, Doc == #{}].
[{{type,array,0},#{}},
{{type,array_indx,0},#{}},
{{type,array_opts,0},#{}},
{{type,array_opt,0},#{}},
{{type,indx_pairs,1},#{}},
{{type,indx_pair,1},#{}}]
The remaining chunks look good. I'm running OTP 23.0-rc3. |
So that is when a type is documented, but there is no text describing it. i.e. http://erlang.org/doc/man/array.html#type-array_indx I think the correct thing to do is to put |
Actually, why can't it be an empty map which would mean the same as |
When a thing should show up in docs but doesn't have docs text, that's exactly what the atom |
In the OTP implementation of EEP-48 I make a distinction between "the user has not yet written any docs for this" and "the user has decided that there should be no docs for this, but it is still a public type". For the first I emit When I'm not against changing this so that I always emit |
In Elixir, we are doing the following:
So our usage of Therefore I propose to not change Erlang/OTP. @garazdawi @wojtekmach do you think this makes sense? |
I'm fine with keeping things as is, just a note that we will have to update our tools:
(same for ExDoc) |
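For tools that need to tell these cases apart, a minimal sketch of classifying the doc field of an entry could look like this. The atoms `none` and `hidden` and the language-keyed map come from EEP 48; the module name and the result atoms are hypothetical labels, and treating an empty map as "listed but without text" is my reading of the discussion above rather than a settled rule.

```erlang
-module(doc_state).
-export([classify/1]).

%% Classify the Doc element of a docs_v1 entry.
classify(none) -> no_docs;
classify(hidden) -> hidden;
%% An empty map: the entry is listed, but there is no text in any language.
classify(Doc) when is_map(Doc), map_size(Doc) =:= 0 -> listed_without_text;
%% A non-empty map: text exists for at least one language.
classify(Doc) when is_map(Doc) -> has_text.
```

A renderer could, for example, print only the signature for `listed_without_text` and a "no documentation" notice for `no_docs`.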
When running ExDoc on EDoc generated chunks I'm getting quite a few warnings like this (also about
Initially I was thinking it's an ExDoc/EDoc integration issue, but it doesn't seem to be the case. On OTP 23 rc3:
> ht(erl_syntax, syntaxTree, 0).
{error,type_missing}
Yet, at the same time I can see the relevant
Moreover, the types are listed in the
> {ok, Doc} = code:get_doc(erl_syntax), maps:get(types, element(6, Doc)).
...
{syntaxTree,0} =>
{attribute,452,type,
{syntaxTree,{type,452,union,
[{user_type,452,tree,[]},
{user_type,452,wrapper,[]},
{user_type,452,erl_parse,[]}]},
[]}},
...
But not in
13> {ok, D} = code:get_doc(erl_syntax), lists:usort( [element(1, element(1, Doc)) || Doc <- D#docs_v1.docs] ).
[function] |
The documentation for So any module documented by Edoc in the Erlang/OTP source tree will (for now) not have any type documentation. |
Just a quick note about that, I believe we need to update EEP 48 too since it currently has:
|
I've spotted two small things. The first one is formatting with
There's no leading empty line before the module name. The second one is about the
{{type,indx_pairs,1},
[{file,"array.erl"},{location,181}],
[<<"-type indx_pairs(Arg1) :: term().">>],
#{},
#{signature =>
[{attribute,181,type,
{indx_pairs,{type,181,list,
[{user_type,181,indx_pair,[{var,181,'Type'}]}]},
[{var,181,'Type'}]}}]}},
{{type,indx_pair,1},
[{file,"array.erl"},{location,180}],
[<<"-type indx_pair(Arg1) :: term().">>],
#{},
#{signature =>
[{attribute,180,type,
{indx_pair,{type,180,tuple,
[{ann_type,180,
[{var,180,'Index'},{user_type,180,array_indx,[]}]},
{var,180,'Type'}]},
[{var,180,'Type'}]}}]}}]}}
For
I'm proposing to standardise the signature for types to the following:
Tools, on the other hand, should prefer the use of the spec metadata if they can and it's present (as |
I'm for option #2 as it would look the most as what the functions look like. It's not pretty, but the user should never have to see it anyways. |
This is a thread to track the progress and any pending tasks that may arise during this effort.
The current idea is that Erlang/OTP XML will be converted to an HTML tree data-structure. This is better than storing HTML as text as we don't have to parse it again.
The tree will have the following format:
You can track the current progress here.
TODO
<p>
h/1 and friends, see c:i/0 for details
<section> in the refman xml to only be allowed inside <description>
h(ttb,tpl)
, and . alone on new-line when after a tagged string.
application/erlang+html format and the EEP-48 format.
shell_docs:validate/1 function to follow a stricter dtd.
shell_docs:render/* by using the dtd.
Don't do
<funcs> tags in the source xml
<li> contents with a <p> (don't think I will do this as it will change the meaning of things)