Home
Welcome to the GlobalTerrorism wiki!
## Steps taken to clean data so far:
- Restrict the data to the top 6 countries of interest: Iraq, Pakistan, Afghanistan, India, Philippines and United States. Together these make up ~65% of the entire dataset.
- Remove columns in which more than half the rows are NA.
- Remove columns whose names contain "_txt", since the same information is already coded numerically.
- Remove other columns holding detailed location information or character data: "eventid", "provstate", "city", "latitude", "longitude", "specificity", "location", "summary", "targsubtype1", "motive", "weapdetail", "propcomment", "scite1", "scite2", "dbsource".
- Of the columns that carry international information, remove those where more than half the rows are missing or incomplete: "INT_LOG", "INT_IDEO", "INT_ANY".
- Assign numeric values to gname (perpetrator group name), corp1 (corporate entity or government agency that was targeted) and target1 (specific person, building or installation targeted), then remove the corresponding character columns.
- Rearrange the columns so that the label "Success" is at the far right.
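
The cleaning steps above can be sketched in base R roughly as follows. This is only an illustration on a made-up toy data frame, not the code actually used for this wiki: the data frame name `gtd`, the toy values, the choice to filter on `country_txt`, and the use of `factor()` to build the numeric index are all assumptions.

```r
# Toy stand-in for the real data; every value here is made up for illustration.
gtd <- data.frame(
  eventid     = 1:6,
  country_txt = c("Iraq", "Iraq", "India", "France", "Pakistan", "India"),
  country     = c(1, 1, 2, 3, 4, 2),
  gname       = c("A", "B", "A", "C", "B", "A"),
  ransom      = c(NA, NA, NA, NA, 0, 1),  # mostly NA, so it should be dropped
  nkill       = c(1, 0, 3, 0, 2, 5),
  success     = c(1, 1, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# 1. Keep only the six countries of interest.
keep_countries <- c("Iraq", "Pakistan", "Afghanistan",
                    "India", "Philippines", "United States")
gtd <- gtd[gtd$country_txt %in% keep_countries, ]

# 2. Drop columns where more than half the rows are NA.
na_share <- colMeans(is.na(gtd))
gtd <- gtd[, na_share <= 0.5, drop = FALSE]

# 3. Give the character column a numeric index (assumed approach: factor codes).
gtd$gname.index <- as.integer(factor(gtd$gname))

# 4. Drop "_txt" columns plus the listed identifier/character columns.
drop_cols <- c(grep("_txt", names(gtd), value = TRUE), "eventid", "gname")
gtd <- gtd[, !(names(gtd) %in% drop_cols), drop = FALSE]

# 5. Move the label column to the far right.
gtd <- gtd[, c(setdiff(names(gtd), "success"), "success")]

print(names(gtd))
```

On the real data, the same pattern would apply to corp1 and target1, and `drop_cols` would carry the full list of column names given above.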
## Perform subset selection using regsubsets from the leaps package:
- Using 20 features:
Features selected by the model with the highest adjusted R-squared:
coef(reg.model, max.adjR)

(Intercept) | iyear | imonth | extended | country | region | doubtterr | multiple |
---|---|---|---|---|---|---|---|
1.45E+01 | -6.80E-03 | -1.28E-03 | -4.36E-02 | -2.12E-04 | 1.35E-02 | 3.85E-02 | 1.46E-02 |

suicide | attacktype1 | claimed | weaptype1 | weapsubtype1 | nkill | nkillus | nwound |
---|---|---|---|---|---|---|---|
-5.73E-02 | 5.95E-02 | 1.41E-02 | -2.57E-02 | -3.59E-03 | 5.94E-03 | 2.24E-02 | -3.73E-04 |

property | ishostkid | corp.index | gname.index | target1.index |
---|---|---|---|---|
-3.17E-03 | -3.21E-02 | -8.39E-06 | 1.55E-04 | -2.16E-06 |

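
For reference, a call such as `coef(reg.model, max.adjR)` is typically set up along the following lines. This is a hedged sketch on simulated data: the page does not show how `reg.model` and `max.adjR` were defined, so the definitions below, the toy data frame `toy` and its columns `x1` to `x4` are assumptions. Note that `regsubsets` fits least-squares models, so the 0/1 label is treated as a numeric response here.

```r
library(leaps)

# Simulated stand-in data; the real analysis used the cleaned GTD columns.
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
toy$success <- as.numeric(0.3 * toy$x1 - 0.2 * toy$x2 + rnorm(n, sd = 0.3) > 0)

# Best-subset search over models with up to nvmax predictors
# (the wiki mentions 20 features; the toy data only has 4).
reg.model   <- regsubsets(success ~ ., data = toy, nvmax = 4)
reg.summary <- summary(reg.model)

# Model size with the highest adjusted R-squared.
max.adjR <- which.max(reg.summary$adjr2)

# Coefficients of the selected model, intercept included.
print(coef(reg.model, max.adjR))
```

`summary()` on a `regsubsets` fit also exposes `bic` and Mallows' `cp`, which can be used with `which.min()` in the same way.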
AIC selects the same features; BIC, however, selects this smaller set:

(Intercept) | iyear | country | region | doubtterr | suicide | attacktype1 | claimed |
---|---|---|---|---|---|---|---|
1.33E+01 | -6.18E-03 | -2.14E-04 | 1.32E-02 | 3.97E-02 | -5.74E-02 | 5.79E-02 | 1.47E-02 |

weaptype1 | weapsubtype1 | nkill | property | ishostkid | corp.index | gname.index | target1.index |
---|---|---|---|---|---|---|---|
-2.38E-02 | -3.56E-03 | 5.29E-03 | -3.17E-03 | -4.59E-02 | -8.74E-06 | 1.65E-04 | -2.23E-06 |

Compared with the adjusted R-squared selection, the BIC model drops imonth, extended, multiple, nkillus and nwound, leaving 15 of the 20 features.
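
The difference between the two selections can be checked mechanically with `setdiff()`. In the sketch below the feature names are copied by hand from the two coefficient tables above; the vector names `adjr_features` and `bic_features` are hypothetical.

```r
# Features kept by the highest adjusted R-squared model (from the first table).
adjr_features <- c("iyear", "imonth", "extended", "country", "region",
                   "doubtterr", "multiple", "suicide", "attacktype1",
                   "claimed", "weaptype1", "weapsubtype1", "nkill",
                   "nkillus", "nwound", "property", "ishostkid",
                   "corp.index", "gname.index", "target1.index")

# Features kept by the BIC-selected model (from the second table).
bic_features <- c("iyear", "country", "region", "doubtterr", "suicide",
                  "attacktype1", "claimed", "weaptype1", "weapsubtype1",
                  "nkill", "property", "ishostkid", "corp.index",
                  "gname.index", "target1.index")

# Features present in the first selection but absent from the second.
dropped <- setdiff(adjr_features, bic_features)
print(dropped)
# → "imonth" "extended" "multiple" "nkillus" "nwound"
```

With a fitted `regsubsets` object at hand, the same vectors could instead be taken from `names(coef(...))` with the intercept removed, rather than typed out.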