intro in overview

tdhock · Mar 31, 2024 · 487f89b
1 parent a5aab7e
commit 487f89b

Showing 8 changed files with 18 additions and 21 deletions.
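
The intro paragraph this commit adds to `vignettes/v0-overview.Rmd` describes the `nc` syntax: several R arguments are concatenated into one pattern, and each named argument becomes a capture group (one output column). A minimal sketch of that idea follows. It assumes the `nc` package is installed; the subject strings and the column names `chrom`, `chromStart`, and `chromEnd` are made up for illustration and do not come from this commit.

```r
library(nc)

# Two hypothetical subjects to parse (illustrative data only).
subject.vec <- c(
  "chr10:213054000-213055000",
  "chrM:111000-222000")

# Arguments are concatenated, in order, into a single regex.
# Named arguments become capture groups (output columns);
# unnamed strings are matched but not captured.
# A function placed after a named pattern converts that column's type.
match.dt <- nc::capture_first_vec(
  subject.vec,
  chrom = "chr.*?",
  ":",
  chromStart = "[0-9]+", as.integer,
  "-",
  chromEnd = "[0-9]+", as.integer)

# One row per subject, one column per named capture group.
print(match.dt)
```

Splitting the pattern across arguments keeps each group's name, sub-pattern, and type conversion together, which is the design choice the new intro contrasts with defining the whole regex as a single string.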
diff --git a/vignettes/v0-overview.Rmd b/vignettes/v0-overview.Rmd
@@ -9,26 +9,37 @@ vignette: >
   \usepackage[utf8]{inputenc}
 ---
 
-# Overview of nc functionality
-
 ```{r setup, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,
   comment = "#>"
 )
 ```
 
-Here is an index of topics which are explained in the different
+`nc` is a package for named capture regular expressions (regex), which
+are useful for parsing/converting text data to tabular data (one row
+per match, one column per capture group). In the terminology of regex,
+we attempt to match a regex/pattern to a subject, which is a string of
+text data. The regex/pattern is typically defined using a single
+string (in other frameworks/packages/languages), but in `nc` we use a
+special syntax: one or more R arguments are concatenated to define a
+regex/pattern, and named arguments are used as capture groups.  For
+more info about regex in general see
+[regular-expressions.info](https://www.regular-expressions.info/reference.html)
+and/or the [Friedl book](http://regex.info/book.html), and for more
+info about the special `nc` syntax, see `help("nc",package="nc")`.
+
+Below is an index of topics which are explained in the different
 vignettes, along with an overview of functionality using simple
 examples.
 
 ## Capture first match in several subjects
 
 [Capture first](v1-capture-first.html) is for the situation when your
-input is a character vector (each element is a different subject), you
-want find the first match of a regex to each subject, and your desired
-output is a data table (one row per subject, one column per capture
-group in the regex). 
+input is a character vector (each element is a different subject to
+parse), you want to find the first match of a regex to each subject, and
+your desired output is a data table (one row per subject, one column
+per capture group in the regex).
 
 ```{r}
 subject.vec <- c(

diff --git a/vignettes/v1-capture-first.Rmd b/vignettes/v1-capture-first.Rmd
@@ -9,8 +9,6 @@ vignette: >
   \usepackage[utf8]{inputenc}
 ---
 
-# Capture first match
-
 ```{r setup, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,

diff --git a/vignettes/v2-capture-all.Rmd b/vignettes/v2-capture-all.Rmd
@@ -16,8 +16,6 @@ knitr::opts_chunk$set(
 )
 ```
 
-# Capture all matches in a single subject string
-
 The `nc::capture_all_str` function is for the common case of
 extracting each match from a multi-line text file (a single large
 subject string). In this section we demonstrate how to extract data

diff --git a/vignettes/v3-capture-melt.Rmd b/vignettes/v3-capture-melt.Rmd
@@ -9,8 +9,6 @@ vignette: >
   \usepackage[utf8]{inputenc}
 ---
 
-# Capture melt
-
 ```{r setup, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,

diff --git a/vignettes/v4-comparisons.Rmd b/vignettes/v4-comparisons.Rmd
@@ -9,8 +9,6 @@ vignette: >
   \usepackage[utf8]{inputenc}
 ---
 
-# Comparisons with other packages
-
 ```{r setup, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,

diff --git a/vignettes/v5-helpers.Rmd b/vignettes/v5-helpers.Rmd
@@ -9,8 +9,6 @@ vignette: >
   \usepackage[utf8]{inputenc}
 ---
 
-# Helper functions
-
 ```{r setup, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,

diff --git a/vignettes/v6-engines.Rmd b/vignettes/v6-engines.Rmd
@@ -9,8 +9,6 @@ vignette: >
   \usepackage[utf8]{inputenc}
 ---
 
-# Uniform interface to three regex engines
-
 ```{r setup, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,

diff --git a/vignettes/v7-capture-glob.Rmd b/vignettes/v7-capture-glob.Rmd
@@ -9,8 +9,6 @@ vignette: >
   \usepackage[utf8]{inputenc}
 ---
 
-# Reading regularly named files
-
 ```{r setup, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,