Italic text is no longer recognized in some cases #28
Comments
@triska What's a good markup language that is both little effort to type but still has a formal specification? (Many of these modern markup languages, like Markdown, seem to have this property of not being able to interpret even common strings in the language unambiguously.)
HTML satisfies exactly two of your 3 criteria. That's already one more than Markdown. If it were an option, I would simply use LaTeX or one of its newer dialects for such descriptions throughout.
HTML and LaTeX both seem fine IMO. You would type a bit more in terms of markup, but that's about it. With LaTeX we would even allow future versions of the web site to show in-line formulas. Out of curiosity, would you envision typing the paragraph tags explicitly (assuming HTML), as below, or would they be inferred?
    % <p>With this definition, you can generate a random 256-bit integer
    % <i>from</i> a list of 32 random <i>bytes</i>:</p>
The plDoc/Markdown discussion was a bit before my time / before I came to Prolog. Is it possible to summarize the argument?
As I said, I would prefer to use LaTeX or some other language that is at least minimally programmable, so I could write this for example as:
    % \par With this definition, you can generate a random 256-bit~integer
    % \i{from} a list of 32 random~\i{bytes}:\par
Please note the ~ to avoid bad line breaks.
The PlDoc discussion is: http://swi-prolog.iai.uni-bonn.narkive.com/Ko4IHNJm/pldoc-version-2-online-and-in-git
The argument that concluded the discussion at that time was: "I agree that you point out desirable properties of markup languages. I believe that, although desirable, they are not necessary properties and I think there are more useful things I can do with my time than `fixing' these issues that doesn't seem to bother that many people and do far less harm than many other issues with SWI-Prolog."
So, I am filing these issues, with at least some success thanks to Jan's improvements. However, I, likewise, prefer to do more useful things than continuously wrestle with these issues, where every other day the semantics of PlDoc change and what previously rendered as intended is now rendered differently (#26 → #27 → #28).
I remember that at the time of the discussion, I did not see the full extent of the argument Richard was making, and although I considered PlDoc far from ideal back then too (due to too many missing features), it appeared to me to be "good enough" for a system that does not strive for excellence in all respects. Now, after repeatedly being among the few people who actually encounter all these issues, I see more completely, and agree with, what Richard was already saying back then.
In my view, the ideal state of documentation for Prolog predicates would also involve a declarative description of what the predicate means, or at least state some properties such as "Which exceptions can arise?" in a way that can be parsed and analyzed automatically. We are currently still far, far away from that, not only with PlDoc but also in the ISO standard and other documents.
Still, seeing that you are interested in semantics, why not aim high?
I think it's a little sad that the Markdown grammar is still causing so many issues several years later. Since Jan is an extremely good programmer this seems to indicate that it is indeed impossible to implement Markdown. I would not be opposed to a LaTeX-inspired parser for plDoc. I would probably use it myself. In your mind, would LaTeX commands be interspersed with Markdown commands? That may lead to even more trouble... Would it be necessary to delimit LaTeX comments? If so, how would that be done? It seems reasonable to assume that each comment is either in Markdown or in LaTeX. If there is at least one
Most of the issue is not so much that Markdown cannot be parsed, but that the syntax started with TWiki and got aspects from Wikimedia, from JavaDoc, from Doxygen and finally from GitHub. Then Prolog has its own handy shortcuts like name/arity and its own term syntax that should not interfere too much with the wiki/markdown syntax. Next, where originally very few of these constructs could be nested, people started to ask for font changes inside links, nested font changes, etc. This quickly causes ambiguities. All this was added to the existing TWiki parser in an ad hoc way and, lacking a proper test suite and well-described interactions, it became a mess.
As the dust around Markdown and all its dialects has mostly settled, we should define our position in that field and see what we support: define what can be combined with what, establish a test suite, and rewrite notably the last part that recognises fonts, links and other objects in running text.
And no, LaTeX makes it even worse. For one thing, you need to convert it into HTML, and that is very hard to do correctly, in particular if you want more or less readable HTML that can be searched and indexed properly. LaTeX is simply far too open ended. HTML is too much typing and escaping. Partial HTML support like GitHub's doesn't make it much better either.
So IIUC we should (1) properly define our own Markdown + Prolog terms grammar, (2) implement the parser in such a way that it allows the vast majority of strings typed in dialect (1) to be parsed correctly, and (3) update the places where legacy syntax (Wikimedia, Doxygen) is used in the comments. Sounds like a summer vacation project?
I think yes. I think we can keep old markup as long as it is sufficiently unique that it is very unlikely to cause issues. Otherwise you need everybody to update their source and that is not very well appreciated. The main issue is with the basic font switches like italic, bold (which it used to be, but no longer on github) and their ambiguity because these symbols are quite common in Prolog descriptions. So you need to define what happens with
But who wants to do it ...
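To make the ambiguity of font switches concrete, here is a minimal sketch. It is written in Python purely for illustration and is not how PlDoc works: the `_text_` italic marker, the names `naive_italics` and `guarded_italics`, and the word-boundary rule are all assumptions that do not come from doc_wiki.pl. It only shows why a marker character that is also common in Prolog identifiers needs an explicit rule.

```python
import re

# Assumption: italics are written as _text_, as in common Markdown dialects.
# NAIVE treats any pair of underscores as an italic switch.
NAIVE = re.compile(r"_(.+?)_")

# GUARDED (hypothetical rule) only accepts an opening underscore that is not
# preceded by a word character and a closing underscore that is not followed
# by one, so underscores inside identifiers are left alone.
GUARDED = re.compile(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)")


def naive_italics(text: str) -> str:
    """Replace every _..._ pair with <i>...</i>, wherever it occurs."""
    return NAIVE.sub(r"<i>\1</i>", text)


def guarded_italics(text: str) -> str:
    """Replace _..._ with <i>...</i> only at word boundaries."""
    return GUARDED.sub(r"<i>\1</i>", text)


if __name__ == "__main__":
    for sample in ("crypto_n_random_bytes/2",
                   "_from_ a list of 32 random _bytes_:"):
        print(naive_italics(sample))
        print(guarded_italics(sample))
```

With the naive rule, the predicate indicator `crypto_n_random_bytes/2` is mangled into `crypto<i>n</i>random_bytes/2`, while the guarded rule leaves it untouched and still marks up `_from_` and `_bytes_`. A real definition would have to settle many more cases (nesting, punctuation, other marker characters), which is exactly the part that needs to be written down and tested.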
@JanWielemaker if you can point to the relevant code parts then I'm volunteering to assemble a unified plDoc grammar that allows Prolog expressions to be typed and that also implements a Markdown-inspired form of markup. TWISI the outcome should be an EBNF grammar which is only a couple of lines long and that should cover 95-98% of current plDoc usage. Once we publish the EBNF, it will be clear which kinds of nesting are supported and which are not. @triska are you joining forces with me for this?
The wiki parser consists of two layers. The first recognises the structure (headers, lists, code, etc.) and the second adds font changes and links to running text. I think the second is the one we must deal with first, as it is the most ambiguous one. The input of the second layer is a list of words (alphanumerical sequences) represented as w(Word), white space, always represented as a single space, and all remaining characters, represented by themselves. This parser is implemented using wiki_faces/3 in doc_wiki.pl. An important feature of the parser must be to never fully reject any text. So, a grammar is nice, but we also need some way to handle/describe resolution of dubious input. Given presence, I think the github markdown should be the first input but we need to take care of the fact that there is a lot of Prolog notation that should not conflict and preferably be recognised and rendered appropriately without user intervention.
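The input representation described above can be sketched as follows. This is an illustrative Python model, not the code in doc_wiki.pl: the names `tokenize` and `untokenize` are hypothetical, a word is assumed to be a run of ASCII letters and digits, and the Prolog term w(Word) is modelled as the tuple `("w", word)`.

```python
import re
from typing import List, Tuple, Union

# A token is either ("w", word), mirroring w(Word), or a one-character string:
# a single space for any run of white space, or the character itself.
Token = Union[Tuple[str, str], str]

# Assumption: "alphanumerical" means ASCII letters and digits only.
TOKEN = re.compile(r"(?P<word>[A-Za-z0-9]+)|(?P<space>\s+)|(?P<other>.)",
                   re.DOTALL)


def tokenize(text: str) -> List[Token]:
    """Split running text into words, single spaces and literal characters."""
    tokens: List[Token] = []
    for match in TOKEN.finditer(text):
        if match.lastgroup == "word":
            tokens.append(("w", match.group()))
        elif match.lastgroup == "space":
            tokens.append(" ")          # white space collapses to one space
        else:
            tokens.append(match.group())
    return tokens


def untokenize(tokens: List[Token]) -> str:
    """Turn a token list back into text (white space stays collapsed)."""
    return "".join(t[1] if isinstance(t, tuple) else t for t in tokens)


if __name__ == "__main__":
    print(tokenize("_from_  a list"))
    print(tokenize("foo/2"))
```

Every input string yields some token list, so this layer never rejects text. A face-recognition pass over such a list can then leave any marker it cannot pair up as the literal character it already is, which is one possible way to meet the "never fully reject" requirement.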
The predicate crypto_n_random_bytes/2 contains in its PlDoc description the following fragment:
Expectation: We expect the following rendering, as was also the case in earlier versions:
Result: However, as of the most recent git version, this is unexpectedly rendered as:
This means that the italics are no longer recognized in this paragraph.
The more I proceed, by trial and error, to somehow get the intended effects out of PlDoc, the more I admire Richard O'Keefe, who accurately foretold this struggle already years ago as the result of not using a formally specified markup language...