How to represent Declarator Docs in plain RakuDoc #439

Open
patrickbkr opened this issue Sep 17, 2024 · 22 comments
Labels
language Changes to the Raku Programming Language

Comments

@patrickbkr
Member

Prelude:

Toolchain-wise, we've long had the problem that tools that display documentation (GitHub, GitLab, raku.land, ...) cannot safely run the Rakudo parser on source code to reach the embedded docs: that code can contain BEGIN blocks, which run at compile time and are therefore a big safety concern.

A possible way forward is to split the task of rendering RakuDoc (and Declarator Docs in particular) into two phases:

  1. A tool, the Extractor, extracts and converts all docs in a Raku source file and outputs a pure .rakudoc file. This is an unsafe procedure.
  2. A separate tool, a plain RakuDoc parser, then parses that pure .rakudoc file. It barfs on anything that is not plain RakuDoc. This is a safe process.

Platforms that wish to render RakuDoc (e.g. GitHub) can safely use the RakuDoc parser. Platforms that want to render the full documentation (e.g. raku.land) can run the Extractor in a sandbox.

All of the above is not part of the issue this ticket wants to address.


The question I want to discuss is: What should the output of the above mentioned Extractor that can convert a .rakumod file into a .rakudoc file look like?

The issue is mostly orthogonal to #438 in that it is not concerned with how Declarator Docs should look in code, but how they should be represented in RakuDoc.

@patrickbkr patrickbkr added the language Changes to the Raku Programming Language label Sep 17, 2024
@patrickbkr
Member Author

Just to get the discussion started, here is the first thing I was able to come up with.
Given this piece of code in a file called foo/bar/frobnicator.rakumod:

#| The class C<Foo::Bar::Frobnicator> is the one stop shop for all your frobnication needs.
class Foo::Bar::Frobnicator is Affector[I] does Helping {
    #| Perform a frobnication.
    method frobnicate(
        Str() $thingy = "something", #= The thing you want to frobnicate.
        :$quiet                #= Do it quietly.
    ) { ... }
}

#| User expertise levels.
enum Level
    ONE, #= Beginner.
    TWO, #= Expert.
;

#| Default config values.
my %config =
    size docs "How big should the buffer be?" => 5,
;

The output in a file foo/bar/frobnicator.rakudoc might be:

=begin dock :type("class")
=           :name("Foo::Bar::Frobnicator")
=           :code("class Foo::Bar::Frobnicator is Affector[I] does Helping")
=           :file("foo/bar/frobnicator.rakumod")
The class C<Foo::Bar::Frobnicator> is the one stop shop for all your frobnication needs.
=end dock

=begin dock :type("method")
=           :name("frobnicate")
=           :code("method frobnicate(Str() $thingy = \"something\", :$quiet)")
=           :file("foo/bar/frobnicator.rakumod")
=           :context(class => "Foo::Bar::Frobnicator")
Perform a frobnication.
=end dock

=begin dock :type("parameter")
=           :name("thingy")
=           :code("Str() $thingy = \"something\"")
=           :file("foo/bar/frobnicator.rakumod")
=           :context(class => "Foo::Bar::Frobnicator", method => "frobnicate")
The thing you want to frobnicate.
=end dock

=begin dock :type("parameter")
=           :name("quiet")
=           :code(":$quiet")
=           :file("foo/bar/frobnicator.rakumod")
=           :context(class => "Foo::Bar::Frobnicator", method => "frobnicate")
Do it quietly.
=end dock

=begin dock :type("enum")
=           :name("Level")
=           :code("enum Level")
=           :file("foo/bar/frobnicator.rakumod")
User expertise levels.
=end dock
=begin dock :type("literal")
=           :name("ONE")
=           :code("ONE")
=           :file("foo/bar/frobnicator.rakumod")
=           :context(enum => "Level")
Beginner.
=end dock
=begin dock :type("literal")
=           :name("TWO")
=           :code("TWO")
=           :file("foo/bar/frobnicator.rakumod")
=           :context(enum => "Level")
Expert.
=end dock

=begin dock :type("variable")
=           :name("config")
=           :code("my %config")
=           :file("foo/bar/frobnicator.rakumod")
Default config values.
=end dock
=begin dock :type("key")
=           :name("size")
=           :code("size")
=           :file("foo/bar/frobnicator.rakumod")
=           :context(variable => "config")
How big should the buffer be?
=end dock

Immediate questions:

  • Which syntactic elements do we allow Declarator Docs on? (Declarator Docs should be limited in scope #438 might provide the answer for that.)
  • Which components of some piece of code do we want to expose as separate metadata keys and which do we not want to expose? E.g. thingy could have :positional(0), :default("something"), :sig-type("Str()") and so on. How much is too much?
  • Would the above be enough to provide a decent rendering of the above Declarator Docs?
  • I have the feeling that :context and :file cannot be as simple as I make them in the above example. Are there conditions in which the naive approach above fails?
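To get a feel for whether flat records plus :context are enough for a renderer, here is a minimal sketch in Python (chosen only for illustration). It assumes a renderer has already read each dock's :type, :name and :context into a record; the `Dock` class and `build_tree` function are hypothetical names, not part of any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Dock:
    type: str                 # value of :type, e.g. "class" or "parameter"
    name: str                 # value of :name
    text: str                 # the documentation text itself
    context: tuple = ()       # :context as ordered (kind, name) pairs, outermost first
    children: list = field(default_factory=list)


def build_tree(docks):
    """Nest flat docks by matching each :context against its parent's path."""
    by_path = {}              # full path of a dock -> the dock
    roots = []
    for dock in docks:
        by_path[dock.context + ((dock.type, dock.name),)] = dock
        parent = by_path.get(dock.context)
        if parent is None:    # no context, or the parent carries no dock
            roots.append(dock)
        else:
            parent.children.append(dock)
    return roots


CLASS = (("class", "Foo::Bar::Frobnicator"),)
METHOD = CLASS + (("method", "frobnicate"),)

docks = [
    Dock("class", "Foo::Bar::Frobnicator", "The one stop shop."),
    Dock("method", "frobnicate", "Perform a frobnication.", CLASS),
    Dock("parameter", "thingy", "The thing you want to frobnicate.", METHOD),
    Dock("parameter", "quiet", "Do it quietly.", METHOD),
    Dock("enum", "Level", "User expertise levels."),
    Dock("literal", "ONE", "Beginner.", (("enum", "Level"),)),
]

tree = build_tree(docks)
```

In this sketch the naive scheme already shows two weak spots: two multi candidates with the same name would share one path, and a dock whose parent has no dock of its own ends up as a root.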

@finanalyst I believe you currently have the deepest insight into implementing renderers. Thus the ping.

@tbrowder
Member

tbrowder commented Sep 18, 2024

I forgot but just found that back in 2020 we did get the ability to keep the leading declarator block in its original form with use of an environment variable RAKUDO_POD_DECL_BLOCK_USER_FORMAT. That implementation was the quickest and easiest way to affect Raku parsing.

When that is true, and you run 'raku --doc ...', you will get text from the leading declarator blocks as input by the author of the code.

I used that for documenting my code. I hope that is retained and improved for RakuAST.

The test for it is in roast file 'S26-documentation/block-leading-user-format.t'.

@lizmat
Collaborator

lizmat commented Sep 18, 2024

It is retained if you use $=pod.

@patrickbkr patrickbkr changed the title How to represent Declarator Docs to plain RakuDoc How to represent Declarator Docs in plain RakuDoc Sep 18, 2024
@lizmat
Collaborator

lizmat commented Sep 18, 2024

What I implemented for --rakudoc now:

#| before subset
subset Positive of Int where * > 0; #= after subset

#| before enum
enum Frobnicate <Yes No>;  #= after enum

#| before regex
my regex zippo { z }  #= after regex

#| before block
{ … } #= after block

#| before pointy block
{ … } #= after pointy block

#| the main class
class A {  #= that we have

    #| a method before
    multi method a( #= method after
      #| parameter before
      Int:D $foo #= parameter after
    ) {
        #| a variable
        my $a = 42; #= with the answer
    }
}

produces:

=begin doc-subset :name<Positive>
  =leading before subset
  =trailing after subset
=end doc-subset

=begin doc-enum :name<Frobnicate>
  =leading before enum
  =trailing after enum
=end doc-enum

=begin doc-regex :name<zippo>
  =leading before regex
  =trailing after regex

=end doc-regex

=begin doc-block
  =leading before block
  =trailing after block

=end doc-block

=begin doc-block
  =leading before pointy block
  =trailing after pointy block

=end doc-block

=begin doc-class :name<A>
  =leading the main class
  =trailing that we have

  =begin doc-method :multi :name<a>
    =leading a method before
    =trailing method after

    =begin doc-parameter :name<Int:D $foo where { ... }>
      =leading parameter before
      =trailing parameter after
    =end doc-parameter

    =begin doc-variable :name<$a>
      =leading a variable
      =trailing with the answer
    =end doc-variable
  =end doc-method
=end doc-class
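To illustrate how a consumer might read this nested output, here is a minimal Python sketch. It assumes the provisional `=begin doc-* ... =end doc-*` layout shown above stays as is, which is explicitly not guaranteed; `parse_docs` is a hypothetical helper and not a RakuDoc parser.

```python
import re

BEGIN = re.compile(r"=begin (doc-[\w-]+)(.*)")
NAME = re.compile(r":name<(.*)>")


def parse_docs(text):
    """Turn nested '=begin doc-*' blocks into a list of dict trees."""
    root = {"children": []}
    stack = [root]                       # innermost open block is last
    for raw in text.splitlines():
        line = raw.strip()
        match = BEGIN.match(line)
        if match:
            name = NAME.search(match.group(2))
            node = {
                "kind": match.group(1),
                "name": name.group(1) if name else None,
                "leading": None,
                "trailing": None,
                "children": [],
            }
            stack[-1]["children"].append(node)
            stack.append(node)
        elif line.startswith("=end "):
            stack.pop()
        elif line.startswith("=leading "):
            stack[-1]["leading"] = line[len("=leading "):]
        elif line.startswith("=trailing "):
            stack[-1]["trailing"] = line[len("=trailing "):]
    return root["children"]


SAMPLE = """\
=begin doc-class :name<A>
  =leading the main class
  =trailing that we have

  =begin doc-method :multi :name<a>
    =leading a method before
    =trailing method after

    =begin doc-variable :name<$a>
      =leading a variable
      =trailing with the answer
    =end doc-variable
  =end doc-method
=end doc-class
"""

docs = parse_docs(SAMPLE)
```

Because the nesting is explicit in the markup, the consumer needs only a stack and no knowledge of Raku syntax.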

@lizmat
Collaborator

lizmat commented Sep 18, 2024

Note that this was just a first attempt and in no way intended to be set in stone... Very much open to improvements :-)

@patrickbkr
Member Author

patrickbkr commented Sep 18, 2024

@lizmat It's a nice surprise you actually started working on it already. I did not see that coming. You've implemented support for quite a list already. Nice!

The first elements that come to mind I still miss are: doc-sub, doc-constant, doc-has, doc-token, doc-rule, doc-grammar

Questions:

  • Is there any point in putting a Dock on a block? I can't imagine a plain block to be part of an API. It's especially strange given that blocks have no name or other identifier by which they could be referred to.
  • Same can be asked for my variables. No, it makes perfect sense - e.g. for IDE support for maintainers.
  • Nesting the hierarchies is nice. Are there cases where this might not be easily possible? E.g. when a class is spread across multiple files using augment?
  • Can we have the literal code of the respective element in there? (What I put in :code() in my attempt.)
  • Semantically, do we want to give #| and #= different meanings? If not, then we can get rid of =leading and =trailing and just have the contents directly in there.

@lizmat
Collaborator

lizmat commented Sep 22, 2024

I still miss: doc-sub, doc-constant, doc-has, doc-token, doc-rule, doc-grammar

Hmmm looks like decks on constants are currently broken.

Re: doc-sub these all fall out of generic Routine handling

#| before sub
sub foo() { #= after sub
}

produces:

=begin doc-sub :name<foo>
  =leading before sub
  =trailing after sub

=end doc-sub

But it looks like submethod is being rendered as "doc-method" instead of "doc-submethod"

Looks like has is being rendered as "doc-variable" instead of "doc-has". Hmmmm...

"doc-token" and "doc-rule" are already being rendered correctly, as they are a special case of "doc-regex".

"doc-grammar" and all other package types, are already being rendered.

@lizmat
Collaborator

lizmat commented Sep 22, 2024

Is there any point in putting a Dock on a block?

There are spectests for it.

@lizmat
Collaborator

lizmat commented Sep 22, 2024

Same can be asked for my variables. No, it makes perfect sense - e.g. for IDE support for maintainers.

And there the line between internal and external documentation blurs. So are decks for external or internal documentation?

@lizmat
Collaborator

lizmat commented Sep 22, 2024

Nesting the hierarchies is nice. Are there cases where this might not be easily possible? E.g. when a class is spread across multiple files using augment?

All RakuDoc is based on a RakuAST tree. How you build such a tree is open to debate. I guess one could join multiple ASTs from different files into a single AST. Then it all depends on the scoping of such a merge.

@lizmat
Collaborator

lizmat commented Sep 22, 2024

Can we have the literal code of the respective element in there? (What I put in :code() in my attempt.)

Yes, you could. But this could become quite a lot. And wasn't it the point to not have any code in safe RakuDoc?

@lizmat
Collaborator

lizmat commented Sep 22, 2024

Semantically, do we want to give #| and #= different meanings? If not, then we can get rid of =leading and =trailing and just have the contents directly in there.

Sure, we could do that.

@lizmat
Collaborator

lizmat commented Sep 22, 2024

rakudo/rakudo@27565cc1f7 now renders attributes as doc-attribute.

@lizmat
Collaborator

lizmat commented Sep 22, 2024

rakudo/rakudo@8162f3eb3b now renders submethods as doc-submethod.

@patrickbkr
Member Author

patrickbkr commented Sep 22, 2024 via email

@patrickbkr
Member Author

patrickbkr commented Sep 22, 2024 via email

@patrickbkr
Member Author

I've just realized that we'll probably want to generate content not only for the elements that have a deck attached, but for all elements that are part of a file's public API.
So given

#| A fighter
class Warrior {
  has $.name;
  has $.hp; #= health points
  has $.strength;
}

We should also generate entries for $.name and $.strength even though they don't have any deck attached.

@lizmat
Collaborator

lizmat commented Jan 4, 2025

We should also generate entries for $.name and $.strength even though they don't have any deck attached.

That could only work if the class specification itself has a deck. Otherwise it'd be hard to go back through the attribute to the class and get the sequence in the doc correct.

@patrickbkr
Member Author

That could only work if the class specification itself has a deck. Otherwise it'd be hard to go back through the attribute to the class and get the sequence in the doc correct.

Could we do it differently and extract all the wanted elements (packages, classes, methods, ...) directly instead of going backwards from the decks?

Going via the decks has the additional disadvantage that only elements that have some deck attached somewhere in their hierarchy are exported. Deckless elements would be ignored.
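The difference between the two strategies can be sketched in a few lines of Python. This is only a toy model: `Element` and `entries` are hypothetical names, and the tree is built by hand from the Warrior example rather than taken from a real AST.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    kind: str
    name: str
    deck: Optional[str] = None          # None means no deck attached
    children: list = field(default_factory=list)


def entries(element, require_deck):
    """Yield (kind, name, deck) top-down; optionally skip deckless elements."""
    if element.deck is not None or not require_deck:
        yield (element.kind, element.name, element.deck)
    for child in element.children:
        yield from entries(child, require_deck)


warrior = Element("class", "Warrior", "A fighter", [
    Element("attribute", "$.name"),
    Element("attribute", "$.hp", "health points"),
    Element("attribute", "$.strength"),
])

deck_driven = list(entries(warrior, require_deck=True))   # going via the decks
direct = list(entries(warrior, require_deck=False))       # walking the elements
```

Walking the elements directly keeps $.name and $.strength in source order; the deck-driven pass drops them.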

@finanalyst

At the moment, it would be useful for me at least, if we could focus on how to manage the AST that is now being generated.
A simple example with both rakudoc and a pre-deck (a declarator block leading the declaration) is:

=begin rakudoc
some stuff
=end rakudoc
#| bleh
my $var;

Running on a CLI, we have

$ raku -e 'q:b[=begin rakudoc\n some stuff\n=end rakudoc\n#| bleh\nmy $var;].AST.say'
RakuAST::StatementList.new(
  RakuAST::Doc::Block.new(
    type       => "rakudoc",
    paragraphs => (
      RakuAST::Doc::Block.new(
        margin     => " ",
        type       => "implicit-code",
        paragraphs => (
          "some stuff\n",
        )
      ),
    )
  ),
  RakuAST::Statement::Expression.new(
    expression => RakuAST::VarDeclaration::Simple.new(
      sigil       => "\$",
      desigilname => RakuAST::Name.from-identifier("var")
    ).declarator-docs(
      leading => (
        "bleh\n",
      )
    )
  )
)

If we filter out just the RakuDoc (the .rakudoc method on an AST), which is what a renderer will do, we get:

$ raku -e 'q:b[=begin rakudoc\n some stuff\n=end rakudoc\n#| bleh\nmy $var;].AST.rakudoc.say'
(RakuAST::Doc::Block.new(
  type       => "rakudoc",
  paragraphs => (
    RakuAST::Doc::Block.new(
      margin     => " ",
      type       => "implicit-code",
      paragraphs => (
        "some stuff\n",
      )
    ),
  )
) RakuAST::Doc::Declarator.new(
  WHEREFORE => RakuAST::VarDeclaration::Simple.new(
    sigil       => "\$",
    desigilname => RakuAST::Name.from-identifier("var")
  ).declarator-docs(
    leading => (
      "bleh\n",
    )
  ),
  leading   => (
    "bleh\n",
  )
))

I don't have much idea about how this might look.

Suppose that all declarations are gathered into a table, perhaps called DECTABLE, which is treated as an implicitly hidden semantic block. Then anywhere inside a RakuDoc source we could have
=place semantic:DECTABLE :caption<Declaration blocks>; by default, we would place it at the end.
Then we might get something like:

name  type    description
$var  Simple  bleh
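
As a rough illustration of the rendering step, the following Python sketch lays out gathered declarator entries as a plain-text table. The function name `render_dectable` and the column set are assumptions taken from the example above, not an existing renderer API.

```python
def render_dectable(rows, headers=("name", "type", "description")):
    """Lay out declarator entries as left-aligned, space-padded columns."""
    table = [headers, *rows]
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )


rows = [("$var", "Simple", "bleh")]
print(render_dectable(rows))
```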

As for security, here is a speculation:
In order to make this completely safe in a fairly general case, including BEGIN blocks, I think - let me know if I'm wrong - that we could generate the AST of a source (a rakumod or rakudoc file) inside a secure container, serialise the AST.rakudoc, then render from the deserialised AST.rakudoc.

@patrickbkr
Member Author

Suppose that all declarations are gathered into a table, perhaps called DECTABLE, which is treated as an implicitly hidden semantic block. Then anywhere inside a RakuDoc source we could have =place semantic:DECTABLE :caption<Declaration blocks>; by default, we would place it at the end. Then we might get something like:

name  type    description
$var  Simple  bleh

We are dealing with two separate sets of documentation that follow a different structure.

  1. The non-declarator-block RakuDoc parts follow a top-to-bottom ordering within a file and are structured as a flat list of file names. RakuDoc provides flexible tools to introduce ordering via automatic or manually created TOCs, indexes and links.

  2. The deck (declarator-block) RakuDoc parts follow the hierarchical structure of the code. The sum of all decks forms a single tree of package / namespace -> class / role -> method / sub.

I think we should not try to bend decks (2.) to fit into the RakuDoc documentation (1.) structure. I think it's not possible to do this cleanly as we are working against the preexisting structure.

My current best idea is to keep the two structures separate. So when processing the documentation we'll produce two separate sets of documentation each following their own structure.

  1. Resembles the output of typical documentation systems, e.g. the Sphinx docs
  2. Resembles an API documentation, e.g. JavaDoc or Doxygen

That should be cleanly possible. Building on that we can allow linking to and embedding parts of 2. in 1. (Linking might also make sense in reverse, embedding not so much.)

Module authors should be able to disable generating the API docs if they want to provide 1. only (still being able to embed parts of 2.).

As for security, here is a speculation: In order to make this completely safe in a fairly general case, including BEGIN blocks, I think - let me know if I'm wrong - that we could generate the AST of a source (a rakumod or rakudoc file) inside a secure container, serialise the AST.rakudoc, then render from the deserialised AST.rakudoc.

I believe this will work, and it is a quick solution that frees us from the need to define syntax for RakuDoc representations of Raku code. I still see value in having such a representation, as the intermediate format is then plain RakuDoc. This incentivizes creating tools that process RakuDoc exclusively. Also, the intermediate format is then human readable and writable.

@finanalyst

We are dealing with two separate sets of documentation that follow a different structure.

1. The non-declarator-block RakuDoc parts follow a top-to-bottom ordering within a file and are structured as a flat list of file names. RakuDoc provides flexible tools to introduce ordering via automatic or manually created TOCs, indexes and links.

2. The deck (declarator-block) RakuDoc parts follow the hierarchical structure of the code. The sum of all decks forms a single tree of package / namespace -> class / role -> method / sub.

This does not seem to be a good description. All sources (by which I mean the files with .rakumod or .rakudoc extensions) are parsed into ASTs. The AST describes the structure.

Sources are themselves gathered into directories. Hence the name 'Collection' for how to process them all. The RakuDoc specification makes no mention of the directories, and the Raku documentation suite is not a flat list of filenames.

The sum of all decks is a structure, which we can create as we run through the AST, gathering decks from branches. The same is true of the ToC, Footnotes, Semantic, and Index structures, where information is gathered from =head, N<>, X<> and =NAME_FOLLOWING_SEMANTIC_RULES.

How these data are then rendered is quite arbitrary.

What I suggested above would be easily added to the existing rakuast-rakudoc-render by adding a handler for Declarator blocks, and a template for rendering them.
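
The idea of gathering decks in the same pass that gathers the ToC can be sketched as follows. This is a Python toy model with hand-written dict nodes; the node shapes and the `gather` function are assumptions for illustration and do not reflect the actual rakuast-rakudoc-render handlers or templates.

```python
def gather(node, collected=None):
    """Walk a document tree once, filling the ToC and deck structures together."""
    if collected is None:
        collected = {"toc": [], "decks": []}
    if node["type"] == "head":
        collected["toc"].append(node["text"])
    elif node["type"] == "declarator":
        collected["decks"].append((node["name"], node["text"]))
    for child in node.get("children", []):
        gather(child, collected)
    return collected


# Hypothetical stand-in for one parsed source file.
source = {
    "type": "root",
    "children": [
        {"type": "head", "text": "Usage"},
        {"type": "para", "text": "some stuff"},
        {"type": "declarator", "name": "$var", "text": "bleh"},
    ],
}

structures = gather(source)
```

How the two collected structures are then rendered, together or separately, is left to the templates.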


4 participants