-
Notifications
You must be signed in to change notification settings - Fork 3
/
README.Rmd
112 lines (72 loc) · 3.92 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
# AWS Comprehend Client Package
[![CRAN](https://www.r-pkg.org/badges/version/aws.comprehend)](https://cran.r-project.org/package=aws.comprehend)
![Downloads](https://cranlogs.r-pkg.org/badges/aws.comprehend)
[![Travis Build Status](https://travis-ci.org/cloudyr/aws.comprehend.png?branch=master)](https://travis-ci.org/cloudyr/aws.comprehend)
[![codecov.io](https://codecov.io/github/cloudyr/aws.comprehend/coverage.svg?branch=master)](https://codecov.io/github/cloudyr/aws.comprehend?branch=master)
**aws.comprehend** is a package for natural language processing.
## Code Examples
All of the functions (except `detect_medical_*`) accept either a single character string or a character vector. Note that AWS currently limits batch queries to 25 documents, so character vectors should have 25 elements maximum.
The default language is English (`"en"`) but this is easily changed using the `language` argument.
```{r set-options, echo=FALSE, cache=FALSE}
# to prevent data.frame wrapping in the outputs below
options(width = 150)
```
### Sentiment analysis
```{r}
library("aws.comprehend")
detect_sentiment("I have never been happier. This is the best day ever.")
# Sentiment analysis in Spanish
detect_sentiment("¡Hoy estoy feliz!", language = "es")
```
### Language detection
```{r}
# simple language detection
detect_language("This is a test sentence in English")
# multi-lingual language detection
detect_language("A: ¡Hola! ¿Como está, usted? B: Bien, merci. Et toi?")
```
### Named Entity Recognition
```{r}
txt <- c("Amazon provides web services.", "Jeff is their leader.")
detect_entities(txt)
```
### Key Phrase Detection
```{r}
txt <- c("Amazon provides web services.", "Jeff is their leader.")
detect_phrases(txt)
```
### Syntax Analysis
```{r}
detect_syntax("The quick fox jumps over the lazy dog.")
```
### Medical Entity and Personal Health Information (PHI) Detection
```{r}
# medical entity detection
medical_txt <- "Pt is 40yo mother, highschool teacher. HPI : Sleeping trouble on present dosage of Clonidine."
detect_medical_entities(medical_txt)
# Protected Health Information (PHI) detection
detect_medical_phi(medical_txt)
```
## Setting up credentials
To use the package, you will need an AWS account and to enter your credentials into R. Your keypair can be generated on the [IAM Management Console](https://aws.amazon.com/) under the heading *Access Keys*. Note that you only have access to your secret key once. After it is generated, you need to save it in a secure location. New keypairs can be generated at any time if yours has been lost, stolen, or forgotten. The [**aws.iam** package](https://github.com/cloudyr/aws.iam) profiles tools for working with IAM, including creating roles, users, groups, and credentials programmatically; it is not needed to *use* IAM credentials.
A detailed description of how credentials can be specified is provided at: https://github.com/cloudyr/aws.signature/. The easiest way is to simply set environment variables on the command line prior to starting R or via an `Renviron.site` or `.Renviron` file, which are used to set environment variables in R during startup (see `? Startup`). They can be also set within R:
```R
Sys.setenv("AWS_ACCESS_KEY_ID" = "mykey",
"AWS_SECRET_ACCESS_KEY" = "mysecretkey",
"AWS_DEFAULT_REGION" = "us-east-1",
"AWS_SESSION_TOKEN" = "mytoken")
```
## Installation
You can install this package from CRAN or, to install the latest development version, from the cloudyr drat repository:
```R
# Install from CRAN
install.packages("aws.comprehend")
# Latest version passing CI tests, from drat repo
install.packages("aws.comprehend", repos = c(getOption("repos"), "http://cloudyr.github.io/drat"))
```
You can also pull a potentially unstable version directly from GitHub, using the `remotes` package:
```R
remotes::install_github("cloudyr/aws.comprehend")
```
---
[![cloudyr project logo](https://i.imgur.com/JHS98Y7.png)](https://github.com/cloudyr)