---
title: "Select Variables"
format: html
---
## Setup
```{r}
library(tidyverse)
satp_data <- read.csv("data/satp_clean.csv")
glimpse(satp_data)
```
## Perpetrator
Here we look at the perpetrator of the violence. We drop the "Civilians" and "Non-Maoist armed group" categories because they have very few observations (1 and 6, respectively). Then we save the remaining rows to a CSV file along with the incident summary.
```{r}
# count the number of observations in each perpetrator category
satp_data |>
  count(perpetrator) # only 1 in Civilians, only 6 in Non-Maoist armed group

# drop the Civilians and Non-Maoist armed group categories
perpetrator <- satp_data |>
  filter(perpetrator %in% c("Maoist", "Security", "Unknown")) |>
  select(perpetrator, incident_summary)

write_csv(perpetrator, "data/perpetrator.csv")
glimpse(perpetrator)
```
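As a minimal sketch of what the filter above does, here is the same pattern on a small made-up table (the rows are invented for illustration and are not from the SATP data). Note that `%in%` also excludes rows where `perpetrator` is `NA`, so the saved file can lose more rows than the two dropped categories account for.

```r
library(dplyr)

# Toy stand-in for satp_data; values are invented for illustration only
toy <- tibble(
  perpetrator = c("Maoist", "Security", "Civilians", "Unknown",
                  "Non-Maoist armed group", NA),
  incident_summary = paste("summary", 1:6)
)

kept <- toy |>
  filter(perpetrator %in% c("Maoist", "Security", "Unknown"))

# Number of rows removed: the two rare categories plus the NA row
rows_removed <- nrow(toy) - nrow(kept)
rows_removed

# Remaining categories and their counts
count(kept, perpetrator)
```

Comparing `nrow()` before and after the filter is a quick way to confirm that only the intended rows were dropped.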
## Action Type
Now we select all of the action types and save the data to a csv file along with the incident summary.
```{r}
action_type <- satp_data |>
  select(armed_assault:abduction, incident_summary)
write_csv(action_type, "data/action_type.csv")
glimpse(action_type)
```
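The range `armed_assault:abduction` selects by column position, so it picks up every column that sits between those two in the data. A small sketch on a made-up table shows the behavior. The column `bombing` is a hypothetical in-between column, and the values are invented.

```r
library(dplyr)

# Toy table; `bombing` is a hypothetical column placed between the endpoints
toy <- tibble(
  incident_summary = c("a", "b"),
  armed_assault = c(1, 0),
  bombing = c(0, 1),
  abduction = c(0, 0),
  total_fatalities = c(2, 0)
)

action_cols <- toy |>
  select(armed_assault:abduction, incident_summary)

# Everything from armed_assault through abduction, then incident_summary
names(action_cols)
```

If the column order in the source CSV ever changes, a range like this will silently select a different set of columns, so checking `names()` after the `select()` is a cheap safeguard.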
## Target Type
Select the variables related to target type and save the data to a CSV file along with the incident summary.
```{r}
target_type <- satp_data |>
  select(first_target:other_civilian, incident_summary)
write_csv(target_type, "data/target_type.csv")
glimpse(target_type)
```
## Deaths
Select the variables related to fatalities, drop any row with a missing value in one of the selected columns, and save the data to a CSV file along with the incident summary.
```{r}
deaths <- satp_data |>
  select(total_fatalities:other_armed_grp_fatalities, incident_summary) |>
  drop_na()
write_csv(deaths, "data/deaths.csv")
glimpse(deaths)
```
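Called with no arguments, `drop_na()` removes a row if any selected column is missing, including `incident_summary`. The sketch below uses a made-up table to show the difference between that and restricting the check to one column. The column `civilian_fatalities` is a hypothetical name used only for illustration.

```r
library(dplyr)
library(tidyr)

# Toy table; values and the civilian_fatalities column are invented
toy <- tibble(
  total_fatalities = c(3, NA, 0, 1),
  civilian_fatalities = c(1, 0, NA, 1),
  incident_summary = c("a", "b", "c", NA)
)

# No arguments: a row is dropped if ANY column is NA
complete_rows <- toy |> drop_na()
nrow(complete_rows)

# Restricting the check to one column keeps more rows
total_known <- toy |> drop_na(total_fatalities)
nrow(total_known)
```

Which variant is appropriate depends on how the downstream model treats missing counts. The chunk above uses the stricter no-argument form.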
## Injuries
Select the variables related to injuries, drop any row with a missing value in one of the selected columns, and save the data to a CSV file along with the incident summary.
```{r}
injuries <- satp_data |>
  select(total_injuries:non_maoist_armed_group_injuries, incident_summary) |>
  drop_na()
write_csv(injuries, "data/injuries.csv")
glimpse(injuries)
```
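Before dropping rows, it can help to see which columns carry the missing values. A minimal sketch, assuming a made-up table with a hypothetical `civilian_injuries` column:

```r
library(dplyr)

# Toy table; values and the civilian_injuries column are invented
toy <- tibble(
  total_injuries = c(2, NA, 0),
  civilian_injuries = c(NA, NA, 0),
  incident_summary = c("a", "b", "c")
)

# One row with the number of NA values in each column
na_counts <- toy |>
  summarise(across(everything(), ~ sum(is.na(.x))))

na_counts
```

If a single sparse column accounts for most of the missing values, `drop_na()` on the full selection may discard many otherwise usable rows.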
## Property Damage
Just two variables here: whether there was property damage and value of property damage (if reported). I doubt the value of property damage will be useful in a model, but I'll include it for now. Save the data to a csv file along with the incident summary.
```{r}
property_damage <- satp_data |>
  select(property_damage, value_property_damage, incident_summary)
write_csv(property_damage, "data/property_damage.csv")
glimpse(property_damage)
```
## Abductions
For kidnappings, just one variable: number of people abducted. Save the data to a csv file along with the incident summary.
```{r}
abductions <- satp_data |>
  select(total_abducted, incident_summary)
write_csv(abductions, "data/abductions.csv")
glimpse(abductions)
```
## Arrests
There are multiple arrest counts, broken down by who was arrested. Save them to a CSV file along with the incident summary.
```{r}
arrests <- satp_data |>
  select(total_arrests:unknown_arrests, incident_summary)
write_csv(arrests, "data/arrests.csv")
glimpse(arrests)
```
## Surrenders
Select the surrender counts, from total surrenders through unknown surrenders, and save them to a CSV file along with the incident summary.
```{r}
surrenders <- satp_data |>
  select(total_surrenders:unknown_surrenders, incident_summary)
write_csv(surrenders, "data/surrenders.csv")
glimpse(surrenders)
```
## Identifying Information
Save the year, date, and incident number to a separate file, along with the incident summary.
```{r}
identifying_info <- satp_data |>
  select(year, date, incident_number, incident_summary)
write_csv(identifying_info, "data/identifying_info.csv")
glimpse(identifying_info)
```
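Every file written in this document carries `incident_summary`, and only this one carries `incident_number`, so linking the files back together relies on the summary text. The sketch below checks whether summaries are unique, using a made-up table. Whether the real data contains repeated summaries is not something this document establishes.

```r
library(dplyr)

# Toy table; values are invented for illustration only
toy <- tibble(
  incident_number = 1:4,
  incident_summary = c("x", "y", "x", "z")
)

# Summaries that appear more than once
dupes <- toy |>
  count(incident_summary) |>
  filter(n > 1)

dupes
```

An empty result would mean `incident_summary` can serve as a join key. Otherwise it may be safer to add `incident_number` to each saved file.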
## Location Information
Save the location variables (the columns from `state` through `latitude`) to a CSV file along with the incident summary.
```{r}
location_info <- satp_data |>
  select(state:latitude, incident_summary)
write_csv(location_info, "data/location_info.csv")
glimpse(location_info)
```
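Most chunks in this document repeat the same three steps: select some columns plus `incident_summary`, write a CSV, and glimpse the result. One possible way to factor that out is a small helper. `save_subset()` is a hypothetical function, not part of any package, and the toy table (including its `district` column) is invented so the sketch runs on its own and writes to a temporary file.

```r
library(dplyr)
library(readr)

# Hypothetical helper: select `cols` plus incident_summary, write to `path`,
# and return the selected data invisibly
save_subset <- function(data, cols, path) {
  out <- select(data, {{ cols }}, incident_summary)
  write_csv(out, path)
  invisible(out)
}

# Toy stand-in for satp_data; values are invented for illustration only
toy <- tibble(
  state = c("A", "B"),
  district = c("d1", "d2"),
  latitude = c(20.1, 21.5),
  incident_summary = c("s1", "s2")
)

out_path <- tempfile(fileext = ".csv")
location_toy <- save_subset(toy, state:latitude, out_path)
glimpse(location_toy)
```

The `{{ cols }}` embrace passes the column selection through to `select()`, so the helper accepts the same ranges and names used in the chunks above. Steps such as `filter()` or `drop_na()` would still need to be applied before calling it.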