-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathRtutorial.rmd
138 lines (110 loc) · 6.07 KB
/
Rtutorial.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
1. Database
SQLITE
Installing
Selecting
Inserting
MySQL
Installing
Linux
Windows
Selecting
PostgreSQL
RODBC
Read MS Excel Files
Reading and Writing Data
Connections
Examples
Data Frames
Determining Size
Date/Time
Parse a Date
Dates and data.frames
Format a Date
Time Intervals
Time Sequences
Formulas
Graphics
Simple Plot
Plot Multiple Timeseries
Pause between Plots
Plot a filled countour
Examples
Plot Least Squares Fit
Plot ESRI Shapefile
Clustering
Package ''flexclust''
Machine Learning/Statistical Learning
Spatial data
Work with shapefiles
Timeseries
Create a Timeseries
Extending a Timeseries
Debugging
browser
debug/undebug
Random Data
Generate a Random Matrix
#### 1. Database
SQLITE
Access to sqlite is provided via the RSQLite package.
Installing
sudo R install.packages("RSQLite")
Selecting
Note: To fetch all rows of a resultset, append n=-1 to your fetch statement.
library(RSQLite) driver = dbDriver("SQLite") con <- dbConnect(driver,dbname="some_sqlite.db") dbListTables(con) # show tables rs <- dbSendQuery(con, "SELECT * from sensordata limit 10") # send a query data <- fetch(rs,n=3) # fetch 3 rows from result (use -1 to fetch all rows) dbHasCompleted(rs) # checks if other rows left and returns true/false #cleanup dbClearResult(rs) dbDisconnect(con) dbUnloadDriver(driver)
Inserting
library(RSQLite); driver = dbDriver("SQLite"); con <- dbConnect(driver, dbname="/home/soma/Desktop/testdata.db"); data(USArrests); #prepare some data frame dbWriteTable(con,"arrests", USArrests); # insert it as a table #cleanup dbClearResult(rs); dbDisconnect(con); dbUnloadDriver(driver);
MySQL
Installing
Linux
sudo aptitude install r-cran-rmysql
Windows
Set system variable MYSQL_HOME (RMYSQL directory in R-directory e.g. C:\Programme\R\R-2.8.1\library\RMySQL)
Directory defined above has to contain the client DLL for MySQL-Servers. A specific structure of subdirectories has to be created.
#### Selecting
library(RMySQL) roadid = 1234 laneid = 1 drv = dbDriver("MySQL") con = dbConnect(drv,dbname="flow_timeseries",user="123",pass="123",host="10.10.10.10") sql <- paste("SELECT * from timeseries WHERE roadid = ",roadid,"AND laneid = ",laneid,"ORDER BY day") res <- dbSendQuery(con,sql) data <- fetch(res, n = -1) dbDisconnect(con)
#### PostgreSQL
Install the r-base postgres server dependencies
sudo apt-get install r-base-dev postgresql-server-dev-8.3 sudo R
In an R shell, install the necessary packages
rS="http://www.bioconductor.org/packages/release/bioc/" install.packages("Rdbi", repos=rS, dependencies=TRUE) install.packages("RdbiPgSQL", repos=rS, dependencies=TRUE)
Talk to a postgres from within R:
library(RdbiPgSQL) drv <- PgSQL() con <- dbConnect(drv, dbname="advanced", user="1234", password="1234", host="10.10.10.10") res <- dbSendQuery(con, "SELECT * FROM ...") data <- dbGetResult(res) dbDisconnect(con)
RODBC
Package RODBC on CRAN provides an interface to database sources supporting an ODBC interface. This is very widely available, and allows the same R code to access different database systems. RODBC runs on both Unix/Linux and Windows, and almost all database systems provide support for ODBC.
Read MS Excel Files
As a simple example of using ODBC under Windows with a Excel spreadsheet, we can read from a spreadsheet by
library(RODBC) channel <- odbcConnectExcel("bdr.xls") ## list the spreadsheets > sqlTables(channel)
TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE REMARKS 1 C:\\bdr NA Sheet1$ SYSTEM TABLE NA 2 C:\\bdr NA Sheet2$ SYSTEM TABLE NA 3 C:\\bdr NA Sheet3$ SYSTEM TABLE NA 4 C:\\bdr NA Sheet1$Print_Area TABLE NA
sh1 <- sqlFetch(channel, "Sheet1") sh1 <- sqlQuery(channel, "select * from [Sheet1$]")
#### Reading and Writing Data
Connections
Connections provide a flexible way for R to read data from a variety of sources, providing more complete control over the nature of the connection than simply specifying a file name as input to functions like read.table and scan.
file: files on the local file system
pipe: output from a command
textConnection: treats text as a file
gzfile: local gzipped file
unz: local zip archive (with single file; read-only)
bzfile: local bzipped file
url: remote file read via http
socketConnection: socket for client/server programs
##### Examples
Skip last lines of a data file (e.g. last two lines):
con <- textConnection(rev(rev(readLines('data.txt'))[-(1:2)])) data <- read.table(con) close(con)
Read data from gzip-compressed file:
gz <- gzfile("datafile.csv.gz", "r") raw <- textConnection(readLines(gz)) close(gz) dataset <- read.table(raw, sep=";", as.is=TRUE, header=TRUE) close(raw)
#### Data Frames
Determining Size
df = data.frame(...) row_count = nrow(df) col_count = ncol(df) dim(df)
#### Date/Time
R and its contributed packages have a number of datetime (i.e. date or date/time) classes:
POSIX classes: POSIX classes refer to the two classes POSIXct, POSIXlt and their common super class POSIXt. These support times and dates including time zones and standard vs. daylight savings time.
Date: Date is the newest R date class, introduced in R-1.9.0. It supports dates without times. Eliminating times simplifies dates substantially since not only are times, themselves, eliminated but the potential complications of timezones and daylight savings time vs. standard time need not be considered either. Date has an interface similar to the POSIX classes discussed below making it easy to move between them.
chron: The CRAN-Package chron provides dates and times. There are no time zones or notion of daylight vs. standard time in chron which makes it simpler to use for most purposes than date/time packages that do employ time zones.
References: R News, The Newsletter of the R Project, Volume 4/1, June 2004, ISSN 1609-3631, http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf
#### Parse a Date
d1 <- as.Date("2008-05-18") class(d1) # Output: [1] "Date" d2 <- strptime("2008-01-01 14:30", "%Y-%m-%d %H:%M") class(d2) # Output: [1] "POSIXt" "POSIXlt" d3 <- as.POSIXct("2008-01-01 14:30", tz="GMT") class(d3) #Output: [1] "POSIXt" "POSIXct"
See strptime for formatting details
All functions also work for lists:
strptime(c("2008-01-01 14:30","2008-02-02 0:30"), "%Y-%m-%d %H:%M")