lvaudor · maelle · Sep 28, 2023 · Sep 28, 2023
diff --git a/README.Rmd b/README.Rmd
@@ -24,47 +24,10 @@ knitr::opts_chunk$set(
 [![Codecov test coverage](https://codecov.io/gh/lvaudor/glitter/branch/master/graph/badge.svg)](https://app.codecov.io/gh/lvaudor/glitter?branch=master)
 <!-- badges: end -->
 
-This package aims at writing and sending SPARQL queries without advanced knowledge of the SPARQL language syntax. 
-It makes the exploration and use of Linked Open Data (Wikidata in particular) easier for those who do not know SPARQL well.
 
-With glitter, compared to writing SPARQL queries by hand, your code should be easier to write, and easier to read by your peers who do not know SPARQL.
-The glitter package supports a "domain-specific language" (DSL) with function names (and syntax) closer to the tidyverse and base R than to SPARQL.
-
-For instance, to find a corpus of 5 articles with a title in English and "wikidata" in that title, instead of writing SPARQL by hand you can run:
-
-```{r}
-library("glitter")
-query <- spq_init() %>%
-  spq_add("?item wdt:P31 wd:Q13442814") %>%
-  spq_label(item) %>%
-  spq_filter(str_detect(str_to_lower(item_label), 'wikidata')) %>%
-  spq_head(n = 5)
-
-query
-```
-
-Note how we were able to use `str_detect()` and `str_to_lower()` (as in the stringr package) instead of SPARQL's functions `REGEX` and `LCASE`.
-
-To perform the query,
-
-```{r}
-spq_perform(query)
+```{r child="man/rmd-fragments/intro.Rmd"}
 ```
 
-To get a random subset of movies with the date they were released, you could use
-
-```{r}
-spq_init() %>%
-  spq_add("?film wdt:P31 wd:Q11424") %>%
-  spq_label(film) %>%
-  spq_add("?film wdt:P577 ?date") %>%
-  spq_mutate(date = year(date)) %>%
-  spq_head(10) %>%
-  spq_perform()
-```
-
-Note that we were able to "overwrite" the date variable, which is straightforward in dplyr, but not so much in SPARQL.
-
 ## Installation
 
 Install this package through R-universe:
@@ -76,13 +39,10 @@ install.packages("glitter", repos = "https://lvaudor.r-universe.dev")
 Or through GitHub:
 
 ```r
-install.packages("remotes") #if remotes is not already installed
-remotes::install_github("lvaudor/glitter")
+install.packages("pak") #if pak is not already installed
+pak::pak("lvaudor/glitter")
 ```
 
 ## Documentation
 
 You can access the documentation regarding package `glitter`  [on its pkgdown website](http://perso.ens-lyon.fr/lise.vaudor/Rpackages/glitter/).
-
-
-
diff --git a/README.md b/README.md
@@ -14,10 +14,10 @@ experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](h
 coverage](https://codecov.io/gh/lvaudor/glitter/branch/master/graph/badge.svg)](https://app.codecov.io/gh/lvaudor/glitter?branch=master)
 <!-- badges: end -->
 
-This package aims at writing and sending SPARQL queries without advanced
-knowledge of the SPARQL language syntax. It makes the exploration and
-use of Linked Open Data (Wikidata in particular) easier for those who do
-not know SPARQL well.
+The glitter package aims at writing and sending SPARQL queries without
+advanced knowledge of the SPARQL language syntax. It makes the
+exploration and use of Linked Open Data (Wikidata in particular) easier
+for those who do not know SPARQL well.
 
 With glitter, compared to writing SPARQL queries by hand, your code
 should be easier to write, and easier to read by your peers who do not
@@ -38,7 +38,7 @@ query <- spq_init() %>%
 
 query
 #> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
-#> SELECT ?item ?item_label
+#> SELECT ?item (COALESCE(?item_labell,'') AS ?item_label)
 #> WHERE {
 #> 
 #> ?item wdt:P31 wd:Q13442814.
@@ -47,8 +47,8 @@ query
 #> FILTER(lang(?item_labell) IN ('en'))
 #> }
 #> 
-#> BIND(COALESCE(?item_labell,'') AS
-#> ?item_label)FILTER(REGEX(LCASE(?item_label),"wikidata"))
+#> BIND(COALESCE(?item_labell,'') AS ?item_label)
+#> FILTER(REGEX(LCASE(?item_label),"wikidata"))
 #> }
 #> 
 #> LIMIT 5
@@ -83,23 +83,27 @@ spq_init() %>%
   spq_head(10) %>%
   spq_perform()
 #> # A tibble: 10 × 3
-#>    film                                  date film_label       
-#>    <chr>                                <dbl> <chr>            
-#>  1 http://www.wikidata.org/entity/Q372   2009 We Live in Public
-#>  2 http://www.wikidata.org/entity/Q595   2011 The Intouchables 
-#>  3 http://www.wikidata.org/entity/Q595   2011 The Intouchables 
-#>  4 http://www.wikidata.org/entity/Q595   2012 The Intouchables 
-#>  5 http://www.wikidata.org/entity/Q595   2012 The Intouchables 
-#>  6 http://www.wikidata.org/entity/Q593   2011 A Gang Story     
-#>  7 http://www.wikidata.org/entity/Q1365  1974 Swept Away       
-#>  8 http://www.wikidata.org/entity/Q1365  1974 Swept Away       
-#>  9 http://www.wikidata.org/entity/Q1365  1975 Swept Away       
-#> 10 http://www.wikidata.org/entity/Q1365  1975 Swept Away
+#>    film                                 film_label         date
+#>    <chr>                                <chr>             <dbl>
+#>  1 http://www.wikidata.org/entity/Q372  We Live in Public  2009
+#>  2 http://www.wikidata.org/entity/Q595  The Intouchables   2011
+#>  3 http://www.wikidata.org/entity/Q595  The Intouchables   2011
+#>  4 http://www.wikidata.org/entity/Q595  The Intouchables   2012
+#>  5 http://www.wikidata.org/entity/Q595  The Intouchables   2012
+#>  6 http://www.wikidata.org/entity/Q593  A Gang Story       2011
+#>  7 http://www.wikidata.org/entity/Q1365 Swept Away         1974
+#>  8 http://www.wikidata.org/entity/Q2201 Kick-Ass           2010
+#>  9 http://www.wikidata.org/entity/Q2201 Kick-Ass           2010
+#> 10 http://www.wikidata.org/entity/Q2201 Kick-Ass           2010
 ```
 
 Note that we were able to “overwrite” the date variable, which is
 straightforward in dplyr, but not so much in SPARQL.
 
+If you want to learn more about SPARQL, you could read the [Learning
+SPARQL book by Bob
+DuCharme](https://www.oreilly.com/library/view/learning-sparql-2nd/9781449371449/).
+
 ## Installation
 
 Install this package through R-universe:
@@ -111,8 +115,8 @@ install.packages("glitter", repos = "https://lvaudor.r-universe.dev")
 Or through GitHub:
 
 ``` r
-install.packages("remotes") #if remotes is not already installed
-remotes::install_github("lvaudor/glitter")
+install.packages("pak") #if pak is not already installed
+pak::pak("lvaudor/glitter")
 ```
 
 ## Documentation

diff --git a/man/rmd-fragments/equivalents.Rmd b/man/rmd-fragments/equivalents.Rmd
@@ -0,0 +1,134 @@
+In glitter functions such as `spq_mutate()` and `spq_filter()` 
+you can use functions that look like R functions, for instance
+`str_detect()` below:
+
+```{r}
+# Lexemes in English that match an expression
+# here starting with "pota"
+query <- spq_init() |>
+  spq_prefix(prefixes = c(dct = "http://purl.org/dc/terms/")) |>
+  spq_add(spq('?lexemeId dct:language wd:Q1860')) |>
+  spq_mutate(lemma = wikibase::lemma(lexemeId)) |>
+  spq_filter(str_detect(lemma, '^pota.*')) |>
+  spq_select(lexemeId, lemma)
+```
+
+Printing the query shows the generated SPARQL, in which `str_detect()` has been translated to `REGEX`.
+
+```{r}
+query
+```
+
+What functions are available?
+
+### Functions operating on sets
+
+```{r}
+set_functions
+```
+
+Note the case of `str_c()`, whose argument `sep` will be translated to the SPARQL argument `SEPARATOR`.
+
+```{r}
+spq_init() %>%
+  spq_add("?film wdt:P31 wd:Q11424") %>%
+  spq_add("?film wdt:P921 ?subject") %>%
+  spq_label(subject) %>%
+  spq_group_by(film) %>%
+  spq_summarise(subject_label_concat = str_c(subject_label, sep="; ")) %>%
+  spq_head(10)
+```
+
+### Functions operating on terms
+
+```{r}
+term_functions
+```
+
+Example with the `lang()` function:
+
+```{r}
+spq_init() %>%
+  spq_mutate(statement = wdt::P1843(wd::Q331676)) %>%
+  spq_mutate(lang = lang(statement))
+```
+
+### Miscellaneous functions
+
+```{r}
+misc_functions
+```
+
+Example with `desc()`:
+
+```{r}
+spq_init() %>%
+  spq_add("?item wdt:P31/wdt:P279* wd:Q4022") %>%
+  spq_label(item) %>%
+  spq_add("?item wdt:P2043 ?length") %>%
+  spq_add("?item wdt:P625 ?location") %>%
+  spq_arrange(desc(length), item_label) %>%
+  spq_head(50)
+```
+
+### Functions operating on strings
+
+```{r}
+string_functions
+```
+
+Example with `str_detect()`:
+
+```{r}
+# Lexemes in English that match an expression
+# here starting with "pota"
+spq_init() |>
+  spq_prefix(prefixes = c(dct = "http://purl.org/dc/terms/")) |>
+  spq_add(spq('?lexemeId dct:language wd:Q1860')) |>
+  spq_mutate(lemma = wikibase::lemma(lexemeId)) |>
+  spq_filter(str_detect(lemma, '^pota.*')) |>
+  spq_select(lexemeId, lemma)
+```
+
+### Functions operating on numbers
+
+```{r}
+numeric_functions
+```
+
+Example with `round()` (chemical elements):
+
+```{r}
+spq_init() %>%
+  spq_add("?element wdt:P31 wd:Q11344.") %>%
+  spq_mutate(density = wdt::P2054(element)) %>%
+  spq_label(element) %>%
+  spq_mutate(round_density = round(density))
+```
+
+### Functions operating on date-time objects
+
+```{r}
+datetime_functions
+```
+
+Example with `year()`:
+
+```{r}
+spq_init() %>%
+  spq_add("?film wdt:P31 wd:Q11424") %>%
+  spq_label(film) %>%
+  spq_add("?film wdt:P577 ?date") %>%
+  spq_mutate(date = year(date)) %>%
+  spq_head(10)
+```
+
+### All correspondences
+
+```{r}
+all_correspondences
+```
+
+### More correspondences?
+
+Please open an [issue](https://github.com/lvaudor/glitter/issues) if you think we should amend or add a function.
diff --git a/man/rmd-fragments/intro.Rmd b/man/rmd-fragments/intro.Rmd
@@ -0,0 +1,42 @@
+The glitter package aims at writing and sending SPARQL queries without advanced knowledge of the SPARQL language syntax.
+It makes the exploration and use of Linked Open Data (Wikidata in particular) easier for those who do not know SPARQL well.
+
+With glitter, compared to writing SPARQL queries by hand, your code should be easier to write, and easier to read by your peers who do not know SPARQL.
+The glitter package supports a "domain-specific language" (DSL) with function names (and syntax) closer to the tidyverse and base R than to SPARQL.
+
+For instance, to find a corpus of 5 articles with a title in English and "wikidata" in that title, instead of writing SPARQL by hand you can run:
+
+```{r}
+library("glitter")
+query <- spq_init() %>%
+  spq_add("?item wdt:P31 wd:Q13442814") %>%
+  spq_label(item) %>%
+  spq_filter(str_detect(str_to_lower(item_label), 'wikidata')) %>%
+  spq_head(n = 5)
+
+query
+```
+
+Note how we were able to use `str_detect()` and `str_to_lower()` (as in the stringr package) instead of SPARQL's functions `REGEX` and `LCASE`.
+
+To perform the query,
+
+```{r}
+spq_perform(query)
+```
+
+To get a random subset of movies with the date they were released, you could use
+
+```{r}
+spq_init() %>%
+  spq_add("?film wdt:P31 wd:Q11424") %>%
+  spq_label(film) %>%
+  spq_add("?film wdt:P577 ?date") %>%
+  spq_mutate(date = year(date)) %>%
+  spq_head(10) %>%
+  spq_perform()
+```
+
+Note that we were able to "overwrite" the date variable, which is straightforward in dplyr, but not so much in SPARQL.
+
+If you want to learn more about SPARQL, you could read the [Learning SPARQL book by Bob DuCharme](https://www.oreilly.com/library/view/learning-sparql-2nd/9781449371449/).
diff --git a/vignettes/articles/explore.Rmd b/vignettes/articles/explore.Rmd
@@ -26,7 +26,7 @@ When in doubt, add a `spq_head()` in your query pipeline, to ask less at a time,
 ## Asking for a subset of all triples
 
 In the code below we'll ask for 10 triples.
-Note that we use the `endpoint` argument of `spq_perform()` to indicate where to send the query, as well as the `request_type` argument.
+Note that we use the `endpoint` argument of `spq_init()` to indicate where to send the query, as well as the `request_type` argument.
 
 How can one know whether a service needs `request_type = "body-form"`?