-
Notifications
You must be signed in to change notification settings - Fork 10
/
ch11.Rmd
115 lines (88 loc) · 1.99 KB
/
ch11.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
---
title: "使用 apply 函數家族"
author: "郭耀仁"
date: "`r Sys.Date()`"
output: slidy_presentation
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, results = 'hide', warning = FALSE)
```
## apply 函數家族是什麼
- 根基於 C 語言函數
- 效率高於迴圈
## apply 函數家族的成員
- `apply()`
- `lapply()`
- `sapply()`
- `tapply()`
## `apply()`
- 輸入:matrix,dataframe 或 list
- MARGIN:1 表示為列,2 表示為欄
- FUN:所使用的函數
```{r}
apply(cars, FUN = mean, MARGIN = 2)
#result <- rep(NA, times = 2)
#for (i in 1:length(cars)) {
# result[i] <- mean(cars[, i])
#}
#result
```
## `lapply()`
- `l` as in list,輸出 list 的意思
- 輸入:dataframe 或 list
- FUN:所使用的函數
```{r}
my_list <- list(1:10, 1:100)
lapply(my_list, FUN = mean)
unlist(lapply(my_list, FUN = mean))
#result <- rep(NA, times = 2)
#for (i in 1:length(my_list)) {
# result[i] <- mean(my_list[[i]])
#}
#result
```
## `lapply()`(2)
```{r}
lapply(mtcars, FUN = mean)
unlist(lapply(mtcars, FUN = mean))
```
## `sapply()`
- `s` as in simple,輸出向量的意思
- 輸入:dataframe 或 list
- FUN:所使用的函數
```{r}
my_list <- list(1:10, 1:100)
sapply(my_list, FUN = mean)
sapply(mtcars, FUN = mean)
```
## `tapply()`
- 輸入:vector
- INDEX:分組的類別變數
- FUN:所使用的函數
```{r}
tapply(iris$Sepal.Length, INDEX = iris$Species, FUN = mean)
```
## 使用 `system.time()` 檢視
```{r}
# 建立 data frame
col1 <- runif (12^4, min = 0, max = 2)
col2 <- rnorm (12^4, mean = 0, sd = 2)
col3 <- rpois (12^4, lambda = 3)
col4 <- rchisq (12^4, df = 2)
df <- data.frame (col1, col2, col3, col4)
# 使用 apply() 計算每列的總和
system.time(apply(df, FUN = sum, MARGIN = 1))
# 使用 for-loop 計算每列的總和
result <- c()
system.time(
for (i in 1:nrow(df)) {
result[i] <- sum(df[i, ])
}
)
```
## 其他的 apply() 函數
- vapply()
- mapply()
- rapply()
- eapply()
- [資料來源](http://blog.fens.me/r-apply/)