A grammar of graphics details? #7

Open
timchurches opened this issue Nov 10, 2017 · 3 comments

Comments

@timchurches
Collaborator

timchurches commented Nov 10, 2017

I was just wondering if it is worth designing and defining a formal or semi-formal grammar for these function names, so that they are readily guessable without having to look them up? In other words, a naming convention such as, based on what has been done so far, [prefix][axis-qualifier][attribute|verb].

Or, rather than having to define a zillion easy_ functions, what about one function that uses a little DSL to set the theme attributes? No need to mess with lex and yacc (which are available in R via the rly package, btw); it might be enough just to pass commands and scalars as ellipsis arguments, e.g.

ggdetails("x", "axis", "blue")

or, equivalently,

ggdetails("blue", "x", "axis")

or, using a lexer (e.g. rly) to tokenise a single string argument:

ggdetails("blue x axis")

or

ggdetails("x axis blue")

This also makes it easier for users who are non-native English speakers, whose natural word-ordering assumptions may be different - order would not matter.

The order of the ellipsis arguments or tokens wouldn't matter because the class of each argument or token can be inferred from its value in the very constrained context of the ggdetails() function. "axis", "legend" and "text" can only refer to plot elements; "blue", "orange" or a hex RGB value can only refer to colours; "2" or "5" are scalars (for font size or rotation); and "+25%" or "-33%" means increase or decrease the current size (or whatever is specified) by 25% or 33% respectively. That way argument order doesn't need to be remembered.
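
A minimal sketch of this value-based classification, in base R only. The function names `classify_token()` and `classify_command()`, the small `entity_words` vocabulary and the class labels are all hypothetical, chosen for illustration; a real vocabulary would be derived from the theme() arguments.

```r
# Hypothetical vocabulary of plot-element words (illustrative subset only).
entity_words <- c("axis", "legend", "text", "line", "title", "x", "y")

# Infer the class of a single token purely from its value.
classify_token <- function(token) {
  if (token %in% entity_words) {
    "entity"
  } else if (grepl("^[+-][0-9]+(\\.[0-9]+)?%$", token)) {
    "relative_size"                      # e.g. "+25%" or "-33%"
  } else if (grepl("^[0-9]+(\\.[0-9]+)?$", token)) {
    "scalar"                             # e.g. "2" or "5"
  } else if (token %in% grDevices::colours() ||
             grepl("^#[0-9A-Fa-f]{6}$", token)) {
    "colour"                             # named colour or hex RGB value
  } else {
    "unknown"
  }
}

# Tokenise a single string on whitespace and group the tokens by class.
classify_command <- function(command) {
  tokens <- strsplit(trimws(command), "[[:space:]]+")[[1]]
  classes <- vapply(tokens, classify_token, character(1), USE.NAMES = FALSE)
  split(tokens, classes)
}

classify_command("blue x axis")
classify_command("x axis blue")
```

Because tokens are grouped by inferred class rather than by position, the two calls above should give the same grouping, which is the order-independence described here.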

Actually, using yacc and lex via rly to build a simple DSL might be the best option, but the utility of the concept could be tested using individual ellipsis arguments to start with.

@jonocarroll
Owner

This is a seriously cool idea. I'd lean towards having the zillion helper functions and the 'ggbot' chat lexer. Looking at https://github.com/systemincloud/rly I don't think I could immediately build that, so if you're more familiar with the idea then by all means please have a crack at it. I think it would be of great benefit!

We could make a fairly simple approximation to this with a heap of if() statements... I started writing this then got carried away trying to see how it might work... now it's here: #8

@timchurches
Collaborator Author

timchurches commented Nov 12, 2017

The text expression is a command in which the subject is implicit (the subject is "ggbot") and the verb is implicit ("make"). The object of the command is one of the entities listed under Arguments here. The attribute of the named entity to be modified is inferred from the attribute value: "blue" must be a colour, "2" must be an absolute size/thickness scalar, "2cm" is a scalar with units, "+25%" is a relative scalar, "-45deg" is an angular quantity, etc. In some cases the attribute to be modified might need to be named explicitly. One command per string, but a vector of strings or multiple ellipsis string arguments could be passed to one call of ggbot(). We probably want to allow modification of multiple attributes per command string, but strictly only one object entity per command. Thus, "text blue" would make all text blue, and "text blue 15" would make all text blue and size 15. But "text line blue" would be illegal because there are two object entities (targets): "text" and "line". (Aside: this restriction of one target entity per command is just to keep it simple to start with.)
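
The one-target rule and the inference of attributes from value forms could be sketched roughly as follows. This is only an illustration: `parse_command()`, `infer_attribute()`, the `targets` subset, the attribute labels and the list of unit suffixes are assumptions made for the example, not an existing ggbot() API.

```r
# Hypothetical subset of target entities.
targets <- c("text", "line", "rect", "title")

# Infer which attribute a value token sets from the form of the value.
infer_attribute <- function(token) {
  number <- "[0-9]+(\\.[0-9]+)?"
  if (grepl(paste0("^[+-]", number, "%$"), token)) return("relative_size")
  if (grepl(paste0("^-?", number, "deg$"), token)) return("angle")
  if (grepl(paste0("^", number, "(cm|mm|pt|in)$"), token)) return("size_with_unit")
  if (grepl(paste0("^", number, "$"), token)) return("size")
  if (token %in% grDevices::colours()) return("colour")
  NA_character_
}

# Parse one command string: exactly one target, any number of attribute values.
parse_command <- function(command) {
  tokens <- strsplit(trimws(command), "[[:space:]]+")[[1]]
  is_target <- tokens %in% targets
  if (sum(is_target) != 1) {
    stop("expected exactly one target entity, found ", sum(is_target),
         call. = FALSE)
  }
  values <- tokens[!is_target]
  attrs <- vapply(values, infer_attribute, character(1), USE.NAMES = FALSE)
  if (anyNA(attrs)) {
    stop("unrecognised token(s): ",
         paste(values[is.na(attrs)], collapse = ", "), call. = FALSE)
  }
  list(target = tokens[is_target],
       attributes = stats::setNames(as.list(values), attrs))
}

parse_command("text blue 15")      # one target, colour and size
# parse_command("text line blue")  # would stop(): two target entities
```

Values are kept as strings here; converting them to numbers or grid units would be a later step.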

Now, a minor complication is how the object entities should be specified. "text" or "line" are easy, and, as per the ggplot2 theme model, these apply to all text or to all lines. But what about, say, the x-axis text? Well, adding "x" or "y" (or "z"?) implies that modification of some axis attribute is being requested; in other words, "axis" is implied. If modification of both axes is desired, then any or all of the following could be supported: "axis blue", "axes blue", "x y blue". Except we haven't specified which aspect of the axis or axes we want to modify, so we need a qualifier: for axes, the valid ones are "title", "text", "ticks" and "line" (and pluralised forms of those etc).
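
One possible way to turn such axis specifications into theme() element names is sketched below. `resolve_axis_element()` and the `qualifier_synonyms` table are hypothetical names for this example; the rule that naming both "x" and "y" (or neither) resolves to the parent element is an assumption about how "both axes" might be handled.

```r
# Hypothetical synonym table mapping qualifier words to canonical forms.
qualifier_synonyms <- c(
  title = "title", titles = "title",
  text  = "text",  texts  = "text",
  ticks = "ticks", tick   = "ticks",
  line  = "line",  lines  = "line"
)

# Resolve a vector of tokens to an axis theme element name.
resolve_axis_element <- function(tokens) {
  axes <- intersect(c("x", "y"), tokens)
  names_axis <- any(tokens %in% c("axis", "axes"))
  if (length(axes) == 0 && !names_axis) {
    stop("no axis named or implied", call. = FALSE)
  }
  matched <- tokens[tokens %in% names(qualifier_synonyms)]
  qualifier <- unique(unname(qualifier_synonyms[matched]))
  if (length(qualifier) != 1) {
    stop("expected exactly one qualifier (title, text, ticks or line)",
         call. = FALSE)
  }
  element <- paste0("axis.", qualifier)
  # A single named axis selects the child element; both or neither
  # fall back to the parent element, which covers both axes.
  if (length(axes) == 1) element <- paste0(element, ".", axes)
  element
}

resolve_axis_element(c("x", "text"))        # "axis.text.x"
resolve_axis_element(c("axes", "titles"))   # "axis.title"
resolve_axis_element(c("x", "y", "ticks"))  # "axis.ticks"
```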

What about suppressing elements? I think the solution is to recognise some special-case attributes, such as "invisible", or imperative forms such as "disappear", "begone" or just "no" or "none" or "zap" or "ditch" etc.

OK, is this language model adequate? The way to find out is to build a table of all the entity target types which theme() supports, and a separate table for each entity type, enumerating all the attributes that entity type can have set, and specify an example command string and check that it can be unambiguously parsed. Then check that there is no overlap between any of the words used as specifications in both tables. If there is any overlap (i.e. the sets of words are not disjoint) then there will be ambiguity which can only be resolved by word order, which means using a more complex language model.
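
The disjointness check itself is simple to sketch. The word lists below are small illustrative samples rather than the full tables, and `find_ambiguous_words()` is a hypothetical helper name. The sample is chosen to show one overlap that the theme() arguments do seem to contain: "top" and "right" occur both in entity names (axis.title.x.top, axis.text.y.right) and as legend.position values.

```r
# Illustrative samples only; the real tables would enumerate every
# entity specifier and every attribute or value word.
entity_words <- c("line", "rect", "text", "title", "axis", "x", "y",
                  "legend", "top", "right")
value_words  <- c("blue", "orange", "invisible", "none",
                  "left", "right", "bottom", "top")

# Words appearing in both vocabularies can only be disambiguated
# by word order or by a richer language model.
find_ambiguous_words <- function(entity_words, value_words) {
  intersect(entity_words, value_words)
}

ambiguous <- find_ambiguous_words(entity_words, value_words)
if (length(ambiguous) > 0) {
  message("ambiguous words: ", paste(ambiguous, collapse = ", "))
}
```

An empty result would suggest that single-word tokens are enough for these two tables; a non-empty one points at the words that need renaming or a positional rule.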

Creating the tables is a slightly tedious task, but if split up shouldn't take too long. Once the adequacy of the language model is confirmed, or it is tweaked until adequate, then coding should commence. Such a table can also provide the basis for the ggbot() tests, of course.

Obviously, lots of synonyms can be included in the language model. The question is whether an unambiguous model can be constructed with just single-word tokens, or not. On a quick scan, I think it can, but that needs to be thoroughly checked. If not, then a smarter tokeniser may be needed. A lemmatiser could also handle synonyms and alternative spellings etc. But I agree, the aim should be to keep it as lightweight as possible. The aim of designing the language model first, before coding it up, is to check whether a bunch of if/else statements is enough, or whether a proper lexer and lemmatiser is needed or worthwhile. Or whether it is better to build a formal domain-specific language, in which case yacc (via rly) needs to be used to build a parse tree. However, I don't think we want a formal DSL.

@timchurches
Collaborator Author

timchurches commented Nov 12, 2017

| entity | element type | specifier(s) | synonyms |
| --- | --- | --- | --- |
| line | element_line | line | lines |
| rect | element_rect | rect | rectangle, rectangles |
| text | element_text | text | |
| title | element_text | title | titles, headings |
| aspect.ratio | ? | aspect, ratio | |
| axis.title | element_text | axis, title | axes, titles |
| axis.title.x | element_text | x, title, axis (implied by x) | |

...and the rest of these, and then we need to consider the settable attributes for each element type, in a separate table. The main thing is to ensure that there is no overlap (i.e. ambiguity) between the way entities are specified and the way attributes and quantities/values are specified.

axis.title.x.top	
x axis label on top axis (element_text; inherits from axis.title.x)

axis.title.y	
y axis label (element_text; inherits from axis.title)

axis.title.y.right	
y axis label on right axis (element_text; inherits from axis.title.y)

axis.text	
tick labels along axes (element_text; inherits from text)

axis.text.x	
x axis tick labels (element_text; inherits from axis.text)

axis.text.x.top	
x axis tick labels on top axis (element_text; inherits from axis.text.x)

axis.text.y	
y axis tick labels (element_text; inherits from axis.text)

axis.text.y.right	
y axis tick labels on right axis (element_text; inherits from axis.text.y)

axis.ticks	
tick marks along axes (element_line; inherits from line)

axis.ticks.x	
x axis tick marks (element_line; inherits from axis.ticks)

axis.ticks.y	
y axis tick marks (element_line; inherits from axis.ticks)

axis.ticks.length	
length of tick marks (unit)

axis.line	
lines along axes (element_line; inherits from line)

axis.line.x	
line along x axis (element_line; inherits from axis.line)

axis.line.y	
line along y axis (element_line; inherits from axis.line)

legend.background	
background of legend (element_rect; inherits from rect)

legend.margin	
the margin around each legend (margin)

legend.spacing	
the spacing between legends (unit)

legend.spacing.x	
the horizontal spacing between legends (unit; inherits from legend.spacing)

legend.spacing.y	
the vertical spacing between legends (unit; inherits from legend.spacing)

legend.key	
background underneath legend keys (element_rect; inherits from rect)

legend.key.size	
size of legend keys (unit)

legend.key.height	
key background height (unit; inherits from legend.key.size)

legend.key.width	
key background width (unit; inherits from legend.key.size)

legend.text	
legend item labels (element_text; inherits from text)

legend.text.align	
alignment of legend labels (number from 0 (left) to 1 (right))

legend.title	
title of legend (element_text; inherits from title)

legend.title.align	
alignment of legend title (number from 0 (left) to 1 (right))

legend.position	
the position of legends ("none", "left", "right", "bottom", "top", or two-element numeric vector)

legend.direction	
layout of items in legends ("horizontal" or "vertical")

legend.justification	
anchor point for positioning legend inside plot ("center" or two-element numeric vector) or the justification according to the plot area when positioned outside the plot

legend.box	
arrangement of multiple legends ("horizontal" or "vertical")

legend.box.just	
justification of each legend within the overall bounding box, when there are multiple legends ("top", "bottom", "left", or "right")

legend.box.margin	
margins around the full legend area, as specified using margin

legend.box.background	
background of legend area (element_rect; inherits from rect)

legend.box.spacing	
The spacing between the plotting area and the legend box (unit)

panel.background	
background of plotting area, drawn underneath plot (element_rect; inherits from rect)

panel.border	
border around plotting area, drawn on top of plot so that it covers tick marks and grid lines. This should be used with fill=NA (element_rect; inherits from rect)

panel.spacing	
spacing between facet panels (unit)

panel.spacing.x	
horizontal spacing between facet panels (unit; inherits from panel.spacing)

panel.spacing.y	
vertical spacing between facet panels (unit; inherits from panel.spacing)

panel.grid	
grid lines (element_line; inherits from line)

panel.grid.major	
major grid lines (element_line; inherits from panel.grid)

panel.grid.minor	
minor grid lines (element_line; inherits from panel.grid)

panel.grid.major.x	
vertical major grid lines (element_line; inherits from panel.grid.major)

panel.grid.major.y	
horizontal major grid lines (element_line; inherits from panel.grid.major)

panel.grid.minor.x	
vertical minor grid lines (element_line; inherits from panel.grid.minor)

panel.grid.minor.y	
horizontal minor grid lines (element_line; inherits from panel.grid.minor)

panel.ontop	
option to place the panel (background, gridlines) over the data layers. Usually used with a transparent or blank panel.background. (logical)

plot.background	
background of the entire plot (element_rect; inherits from rect)

plot.title	
plot title (text appearance) (element_text; inherits from title) left-aligned by default

plot.subtitle	
plot subtitle (text appearance) (element_text; inherits from title) left-aligned by default

plot.caption	
caption below the plot (text appearance) (element_text; inherits from title) right-aligned by default

plot.margin	
margin around entire plot (unit with the sizes of the top, right, bottom, and left margins)

strip.background	
background of facet labels (element_rect; inherits from rect)

strip.placement	
placement of strip with respect to axes, either "inside" or "outside". Only important when axes and strips are on the same side of the plot.

strip.text	
facet labels (element_text; inherits from text)

strip.text.x	
facet labels along horizontal direction (element_text; inherits from strip.text)

strip.text.y	
facet labels along vertical direction (element_text; inherits from strip.text)

strip.switch.pad.grid	
space between strips and axes when strips are switched (unit)

strip.switch.pad.wrap	
space between strips and axes when strips are switched (unit)

2 participants